LLMs: Evaluating the current SOTA models

Choosing an LLM

A critical distinction to make when choosing an LLM is between models that are publicly available (open source) and models that are only available through an API. Open source development is fast paced, and popular models such as LLaMA often have many variants that are tailored to specific use cases or can be fine-tuned on your own dataset. The major drawback is that it is up to you to host these models and pay for the hardware to run inference or further training. The PaLM series by Google (notably used by the chat assistant Bard) is offered through paid inference APIs rather than as publicly downloadable weights. A great way to keep up to date with SOTA models is to check the Hugging Face Open LLM Leaderboard, which ranks models on a consistent suite of evaluation tasks.
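To make the hosting cost concrete, here is a rough back-of-the-envelope sketch of how much memory the weights of a self-hosted model need. It assumes the weights are stored at a fixed number of bytes per parameter (2 bytes for 16-bit floats) and ignores activations, the key-value cache, and any training state, so real requirements will be higher. The function name and the nominal parameter counts are illustrative assumptions, not figures from a specific model card.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for the model weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at the same precision
    (2 bytes = 16-bit floats, 4 bytes = 32-bit floats, 1 byte = 8-bit quantized).
    """
    return num_params * bytes_per_param / 1e9


# Nominal parameter counts, used purely for illustration.
for label, params in [("7B", 7e9), ("13B", 13e9), ("65B", 65e9)]:
    fp16 = weight_memory_gb(params, bytes_per_param=2)
    int8 = weight_memory_gb(params, bytes_per_param=1)
    print(f"{label}: ~{fp16:.0f} GB at 16-bit, ~{int8:.0f} GB at 8-bit")
```

Even this lower bound shows why larger variants typically need multiple GPUs or quantization before they can be served.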

In the proprietary model landscape, the dominant option is the GPT series by OpenAI, which powers ChatGPT. OpenAI charges per token, where tokens are the subword units that Transformer models typically read and generate. There is an argument to be made that these models offer more robust alignment than open source options and are less likely to produce unexpected or harmful output.
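As a minimal sketch of how per-token pricing translates into a bill, the snippet below estimates the cost of a single request from its token counts. The prices are hypothetical placeholders, not actual OpenAI rates, and the assumption that prompt and completion tokens are priced separately per 1,000 tokens should be checked against the provider's current pricing page.

```python
def estimate_cost(
    prompt_tokens: int,
    completion_tokens: int,
    price_in_per_1k: float,
    price_out_per_1k: float,
) -> float:
    """Estimate the cost of one API request under per-token pricing.

    Assumption: prompt (input) and completion (output) tokens are billed
    at separate rates, each quoted per 1,000 tokens.
    """
    return (
        prompt_tokens / 1000 * price_in_per_1k
        + completion_tokens / 1000 * price_out_per_1k
    )


# Hypothetical prices in dollars per 1,000 tokens (placeholders only).
PRICE_IN = 0.001
PRICE_OUT = 0.002

cost = estimate_cost(
    prompt_tokens=1000,
    completion_tokens=500,
    price_in_per_1k=PRICE_IN,
    price_out_per_1k=PRICE_OUT,
)
print(f"Estimated cost per request: ${cost:.4f}")
```

Multiplying the per-request figure by the expected request volume gives a rough monthly estimate to compare against the cost of hosting an open source model yourself.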

