
Open Source LLM #894

Open
0x4007 opened this issue Nov 22, 2023 · 8 comments

@0x4007
Member

0x4007 commented Nov 22, 2023

Given all the recent developments with the OpenAI team, I am beginning research on using open source LLMs instead of OpenAI.

We just need to think through which models are the best for our planned use cases, as well as figure out where to host them.

Of course, as always, if it's realistic to host on GitHub Actions, that would be great. But unfortunately I feel that we may need to set up some type of VPS (the least desirable option due to maintenance).

Just opening up the conversation here. Will share my research results soon-ish.

https://x.com/teknium1/status/1727126311578247397?s=46&t=bdMjuqzO5LYxLUsloRROxQ

@gitcoindev
Contributor

Anthropic rejected the takeover offer, so imho either Microsoft will buy them or the board will be replaced. It is very unlikely that OpenAI and the models they have developed will disappear, taking into account how much money and effort has already been spent. In the worst-case scenario, I feel that an enterprise GitHub Actions plan would do the job for an open source LLM, though it might be expensive.

@gitcoindev
Contributor

Another option would be to look into Anthropic and their Claude 2 model. Their CEO is a former head of research at OpenAI, and the majority of their funding came from Amazon. A few details on this model are available at https://www.forbes.com/sites/sunilrajaraman/2023/11/21/openai-faces-competitive-pressure-from-anthropic/?sh=42cd65ef5352

@0x4007
Member Author

0x4007 commented Nov 22, 2023

Regarding the planned AI-powered features, I think the most cognitively complex work will be around working with code. For example, reviewing finalized pull requests to check whether the changes achieve the specification requirements before requesting reviews from our human reviewers.

I don't think the other stuff requires a state-of-the-art LLM (e.g. checking comment relevance to the specification, i.e. is the comment on topic? A rough sketch follows below).

What's nice about self-hosting is that our costs should be very manageable as we onboard partners.
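
To make the comment-relevance idea concrete, here is a minimal sketch of one way it could work with embeddings and cosine similarity. The `embed` function is a hypothetical adapter over whichever embedding backend gets chosen (OpenAI, a self-hosted model, etc.), and the threshold is a tunable assumption, not a measured value:

```typescript
// Hedged sketch: score how on-topic a comment is relative to the issue spec.
// `embed` is a hypothetical adapter over whatever embedding model is chosen.
type Embed = (text: string) => Promise<number[]>;

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

async function isOnTopic(
  embed: Embed,
  spec: string,
  comment: string,
  threshold = 0.75 // assumption: would need tuning against real comments
): Promise<boolean> {
  const [specVec, commentVec] = await Promise.all([embed(spec), embed(comment)]);
  return cosineSimilarity(specVec, commentVec) >= threshold;
}
```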

@gitcoindev
Contributor

Latest news: the OpenAI board was replaced (my latter suggestion): https://twitter.com/OpenAI/status/1727206187077370115. I am wondering how it will unfold.

@whilefoo
Collaborator

Cloudflare has Workers AI, but it's still in beta: https://developers.cloudflare.com/workers-ai/
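
For reference, a minimal sketch of a Worker calling the beta Workers AI binding, assuming an `AI` binding configured in `wrangler.toml` and the `@cloudflare/ai` package from the beta docs; the model id is illustrative, taken from the beta catalog:

```typescript
// Hedged sketch of a Cloudflare Worker using the beta Workers AI binding.
import { Ai } from "@cloudflare/ai";

export interface Env {
  AI: any; // assumption: bound via [ai] binding = "AI" in wrangler.toml
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { prompt } = (await request.json()) as { prompt: string };
    const ai = new Ai(env.AI);
    // Model id from the beta catalog; swap for whatever is available.
    const result = await ai.run("@cf/meta/llama-2-7b-chat-int8", {
      messages: [{ role: "user", content: prompt }],
    });
    return Response.json(result);
  },
};
```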

@0x4007
Member Author

0x4007 commented Nov 24, 2023

I'm very interested in building on Cloudflare infra, but unfortunately they appear to only have a few models:

https://developers.cloudflare.com/workers-ai/models/text-generation/

I heard really good things about https://x.com/teknium1/status/1720188958154625296?s=46&t=bdMjuqzO5LYxLUsloRROxQ

@whilefoo
Collaborator

> Of course, as always, if it's realistic to host on GitHub Actions, that would be great. But unfortunately I feel that we may need to set up some type of VPS (the least desirable option due to maintenance).

Running on GitHub Actions won't work because they don't have GPU instances (at least not for now).

  1. Use Transformers.js. It runs on edge runtimes, so anything like Supabase Edge Functions or Cloudflare Workers, but it's pretty limited in terms of available models.
  2. Use Hugging Face's Inference API. It's basically an API endpoint that executes most of the models on Hugging Face's servers. The downside is that the model needs to support the ONNX runtime; otherwise you need to convert it. (A rough sketch of calling it follows this list.)
  3. Use Inference Endpoints, a more advanced version of the Inference API: it deploys a model on a virtual machine at a chosen cloud provider, managed by Hugging Face, so there's no maintenance.
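
For context, a minimal sketch of what option 2 can look like over plain HTTP, assuming a Hugging Face access token in an environment variable; the token variable name and the model id are illustrative, not something from this thread:

```typescript
// Hedged sketch: calling the Hugging Face Inference API directly.
// HF_TOKEN and the model id are assumptions for illustration.
async function generate(model: string, prompt: string): Promise<string> {
  const res = await fetch(`https://api-inference.huggingface.co/models/${model}`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.HF_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ inputs: prompt }),
  });
  if (!res.ok) throw new Error(`Inference API error: ${res.status}`);
  // Text-generation models typically return [{ generated_text: string }].
  const data = (await res.json()) as Array<{ generated_text: string }>;
  return data[0].generated_text;
}

// Example usage (model id is illustrative):
// const review = await generate("mistralai/Mistral-7B-Instruct-v0.1", "Summarize this diff: ...");
```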

@0x4007
Member Author

0x4007 commented Nov 28, 2023

> Running on GitHub Actions won't work because they don't have GPU instances (at least not for now).
>
> […]

Great research. All interesting stuff!

I just signed up for access to a GPU instance on GitHub Actions Runners.
