0xedB0...3484


A list of Large Language Models

Here is a list of LLMs that I'll update each time I stumble upon a new one of these Large Language Models.

One of the key metrics when describing LLMs is their number of parameters, that is, the number of weights (each one a floating-point number) that describe the connections between the neurons in the neural network.

So basically, one parameter = one neural connection.
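
To make this concrete, here is a minimal sketch that counts the connections in a tiny fully connected network. The layer sizes are made up for illustration, and real LLMs also have some other parameters (such as biases) that this simplified count ignores.

```python
# Count the weights (connections) in a small fully connected network.
# The layer sizes used below are hypothetical, chosen only for illustration.

def count_connections(layer_sizes):
    """Return the number of weights linking each layer to the next one."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # Every neuron of one layer is connected to every neuron of the next.
        total += n_in * n_out
    return total

tiny_network = [4, 8, 2]  # 4 inputs, 8 hidden neurons, 2 outputs
print(count_connections(tiny_network))  # 4*8 + 8*2 = 48
```

An LLM works on the same principle, only with billions of such weights instead of a few dozen.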

To get some perspective, a human brain has around 80 billion neurons with one thousand to ten thousand connections each. So a human brain should have at least 80 trillion, and possibly hundreds of trillions, of neural connections (also called synapses).
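
The arithmetic behind that estimate can be sketched in a few lines. It only uses the rough figures quoted in this story (80 billion neurons, one thousand to ten thousand connections each, and the 175 billion parameters of GPT-3 listed below), so treat the result as an order of magnitude, not a measurement.

```python
# Rough estimate of the number of synapses in a human brain,
# compared with the parameter count of GPT-3 (figures from this story).

NEURONS = 80e9             # around 80 billion neurons
CONNECTIONS_LOW = 1_000    # connections per neuron, low estimate
CONNECTIONS_HIGH = 10_000  # connections per neuron, high estimate
GPT3_PARAMETERS = 175e9    # 175 billion parameters

def synapse_range(neurons, low, high):
    """Return the (low, high) estimate of the total number of connections."""
    return neurons * low, neurons * high

low, high = synapse_range(NEURONS, CONNECTIONS_LOW, CONNECTIONS_HIGH)
print(f"Synapses: {low / 1e12:.0f} to {high / 1e12:.0f} trillion")
print(f"Brain vs GPT-3: {low / GPT3_PARAMETERS:.0f}x to {high / GPT3_PARAMETERS:.0f}x more connections")
```

With these numbers, a brain would have several hundred to several thousand times more connections than GPT-3 has parameters.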

GPT-3

One of the first very famous LLMs, created by OpenAI. You can use it for free via ChatGPT (the free version runs GPT-3.5, a refined version of GPT-3). 175 billion parameters. Very good. Estimated training cost: $4 million.

GPT-4

The best LLM available. Absolutely impressive. From OpenAI. You can use it via ChatGPT if you are a premium user ($20 / month). It is difficult to say how many parameters it has, since OpenAI has not published the figure; some say 100 trillion, but that is an unconfirmed rumor. Estimated training cost: several tens of millions of dollars.

LLaMa

Facebook's LLM (from Meta AI). Released for research use in February 2023. Comes in different sizes: 7b, 13b ...

You can run LLaMa on your computer using Dalai.

Alpaca

Based on LLaMa, then fine-tuned on instruction-following examples generated by an OpenAI model (text-davinci-003, from the GPT-3 family) acting as instructor, although this is against OpenAI's terms of use, which state that you cannot use their models' output to develop competing models.

The fine-tuning from LLaMa to Alpaca was done by a research team from Stanford University, for a tiny cost of less than $600.

Vicuna

Based on LLaMa, then fine-tuned on ChatGPT conversations shared by users on ShareGPT.

You can install Vicuna on your local computer with FastChat.

StableLM

Open source LLM from Stability AI (the same company that created Stable Diffusion, the Midjourney alternative). First alpha version released in April 2023. Currently 7b parameters. Currently crap.

MiniGPT-4

That one isn't an LLM on its own: it is built on top of Vicuna, to which an image analysis layer was added, allowing it to process images as input. Test it on MiniGPT-4.

Why is it important?

The big question now, in mid-2023, is whether good LLMs will only be available from a short list of big corporations, or whether some good open source LLMs will be available for free.

Will we be able to run efficient LLMs on our local computers? In two or three years, will each of our smartphones have its own built-in LLM? What would we call them then? Smart Smartphones?

Currently, GPT-4 is far better than any of its competitors.

Interfaces

An LLM is a big, multi-gigabyte file. Using it isn't really easy, and you'll surely need a web interface to make your life simpler. Here are a few web interfaces that help you interact with LLMs:

  1. oobabooga: gives you a web interface to use LLaMa, GPT-J, Pythia, ... and many other models. Will install a ton of stuff on your computer, from Python to PHP, ... :-/. I never really managed to reach the end of the install procedure.
  2. Dalai: light and easy interface written in Node.js to install and use LLaMa or Alpaca. It's simple, it works. It's just too bad that it hasn't been very active recently, and I couldn't find out how to use the latest models with it.
  3. FastChat: open platform to train, use and evaluate LLMs. I think this was used to train Vicuna.
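
To see why an LLM ends up being a multi-gigabyte file, here is a rough sketch that estimates the file size from the number of parameters. The storage formats listed (32-bit, 16-bit and 4-bit numbers) are assumptions for illustration; the actual size depends on how a given model file was saved.

```python
# Estimate the size on disk of a model from its parameter count.
# Assumption: every parameter is stored as one number of a fixed width.

BYTES_PER_PARAMETER = {
    "float32": 4,   # full precision
    "float16": 2,   # half precision
    "int4": 0.5,    # 4-bit quantization, used to shrink models
}

def model_size_gb(parameters, precision):
    """Approximate file size in gigabytes (1 GB = 1e9 bytes)."""
    return parameters * BYTES_PER_PARAMETER[precision] / 1e9

for name, parameters in [("7b", 7e9), ("13b", 13e9)]:
    for precision in BYTES_PER_PARAMETER:
        size = model_size_gb(parameters, precision)
        print(f"{name} model in {precision}: about {size:.1f} GB")
```

Under these assumptions, even the smallest 7b model takes several gigabytes, which is why the storage format matters so much when running a model on a personal computer.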


So you see that hundreds of people are already busy building the smartest Large Language Models possible. Hundreds of other people are working on interfaces so that these sophisticated models become easy for anyone to use. And in the following story (AI Agents: the dawn of a new era), you'll see that other people are also working hard to create autonomous agents that leverage these models to handle more and more complex tasks on their own.

Looks like the AI industry is creating quite a lot of new jobs.
