Models Leaderboard: Comparison of AI Models & API Providers

Compare and analyze AI models (LLMs) from leading API providers on our Models Leaderboard. Evaluate key metrics such as quality, price, context window, and knowledge cutoff. Gain insight into each model’s strengths and weaknesses so you can choose the best model for your needs.

API Providers Compared: Groq, Microsoft Azure, Amazon Bedrock, Together.ai, Fireworks AI, Baseten, Lepton AI, Deepinfra, Replicate, Databricks, Novita AI, and OctoAI.

Models Compared: Gemma 2 27B, Gemma 2 9B, Gemma 7B Instruct, Llama 3.1 Instruct 405B, Llama 3.1 Instruct 70B, Llama 3 Instruct 70B, Llama 3.1 Instruct 8B, Llama 3 Instruct 8B, Mixtral 8x7B Instruct, Mistral 7B Instruct, Mixtral 8x22B Instruct, OpenChat 3.5 1210, Qwen2 Instruct 7B, Qwen2 Instruct 72B, Phi-3 Medium Instruct 14B, and Nous Capybara 7B.

LLM Leaderboard Highlights:

Highest Quality Index

Llama 3.1 405B (100)

Llama 3.1 70B (95)

Llama 3 70B (83)

Lowest Price

OpenChat 3.5 ($0.14)

Phi-3 Medium 14B ($0.14)

Gemma 7B ($0.15)

Largest Context Window

Llama 3.1 405B (128k)

Llama 3.1 70B (128k)

Llama 3.1 8B (128k)

Explore AI Models (LLMs)

Gemma 2 27B

  • Creator: Google
  • Quality: 78
  • Knowledge: Jun 2024
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| together ai | 8k | $0.80 | $0.80 | google/gemma-2-27b-it |

Gemma 2 9B

  • Creator: Google
  • Quality: 71
  • Knowledge: Jun 2024
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| fireworks ai | 8k | $0.20 | $0.20 | accounts/fireworks/models/gemma2-9b-it |
| deepinfra | 8k | $0.09 | $0.09 | google/gemma-2-9b-it |
| groq | 8k | $0.20 | $0.20 | gemma2-9b-it |
| together ai | 8k | $0.30 | $0.30 | google/gemma-2-9b-it |
| novita ai | 8k | $0.08 | $0.08 | google/gemma-2-9b-it |

Gemma 7B Instruct

  • Creator: Google
  • Quality: 45
  • Knowledge:
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| fireworks ai | 8k | $0.20 | $0.20 | accounts/fireworks/models/gemma-7b-it |
| deepinfra | 8k | $0.07 | $0.07 | google/gemma-7b-it |
| groq | 8k | $0.07 | $0.07 | gemma-7b-it |
| together ai | 8k | $0.20 | $0.20 | google/gemma-7b-it |

Llama 3.1 Instruct 405B

  • Creator: Meta
  • Quality: 100
  • Knowledge: Dec 2023
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| fireworks ai | 128k | $3.00 | $3.00 | accounts/fireworks/models/llama-v3p1-405b-instruct |
| deepinfra | 33k | $7.00 | $14.00 | databricks-meta-llama-3.1-405b-instruct |
| replicate | 128k | $9.50 | $9.50 | meta/meta-llama-3.1-405b-instruct |
| together ai | 4k | $5.00 | $5.00 | meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo |
| octoai | 128k | $3.00 | $9.00 | meta-llama-3.1-405b-instruct |
| lepton ai | 128k | $2.80 | $2.80 | lepton |
| databricks | 128k | $10.00 | $30.00 | databricks-meta-llama-3.1-405b-instruct |
| novita ai | 33k | $2.75 | $2.75 | meta-llama/llama-3.1-405b-instruct |

Llama 3.1 Instruct 70B

  • Creator: Meta
  • Quality: 95
  • Knowledge: Dec 2023
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| fireworks ai | 128k | $0.90 | $0.90 | accounts/fireworks/models/llama-v3p1-70b-instruct |
| deepinfra | 128k | $0.52 | $0.75 | meta-llama/Meta-Llama-3.1-70B-Instruct |
| together ai | 33k | $0.88 | $0.88 | meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo |
| octoai | 128k | $0.90 | $0.90 | meta-llama-3.1-70b-instruct |
| lepton ai | 128k | $0.80 | $0.80 | lepton |
| databricks | 128k | $1.00 | $3.00 | databricks-meta-llama-3-70b-instruct |
| groq | 8k | $0.59 | $0.79 | llama-3.1-70b-versatile |
| novita ai | 8k | $0.76 | $0.55 | meta-llama/llama-3.1-70b-instruct |

Llama 3 Instruct 70B

  • Creator: Meta
  • Quality: 83
  • Knowledge: Dec 2023
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| fireworks ai | 8k | $0.90 | $0.90 | accounts/fireworks/models/llama-v3-70b-instruct |
| deepinfra | 8k | $0.52 | $0.75 | meta-llama/Meta-Llama-3-70B-Instruct |
| together ai | 8k | $0.90 | $0.90 | META-LLAMA/LLAMA-3-70B-CHAT-HF |
| octoai | 8k | $0.90 | $0.90 | meta-llama-3-70b-instruct |
| lepton ai | 8k | $0.80 | $0.80 | llama3-70b |
| databricks | 8k | $1.00 | $3.00 | databricks-meta-llama-3-70b-instruct |
| groq | 8k | $0.59 | $0.79 | Llama3-70b-8192 |
| microsoft azure | 8k | $2.65 | $3.50 | Meta-Llama-3-70B-Instruct |
| replicate | 8k | $0.65 | $2.75 | meta/meta-llama-3-70b-instruct |
| novita ai | 8k | $0.51 | $0.74 | meta-llama/llama-3-70b-instruct |

Llama 3.1 Instruct 8B

  • Creator: Meta
  • Quality: 66
  • Knowledge: Dec 2023
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| fireworks ai | 128k | $0.20 | $0.20 | accounts/fireworks/models/llama-v3p1-8b-instruct |
| deepinfra | 128k | $0.09 | $0.09 | meta-llama/Meta-Llama-3.1-8B-Instruct |
| together ai | 33k | $0.18 | $0.18 | meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo |
| octoai | 128k | $0.15 | $0.15 | meta-llama-3.1-8b-instruct |
| lepton ai | 128k | $0.70 | $0.70 | lepton |
| groq | 8k | $0.05 | $0.05 | llama-3.1-8b-instant |
| microsoft azure | 128k | $0.30 | $0.30 | meta.llama3-1-8b-instruct-v1:0 |
| novita ai | 8k | $0.10 | $0.10 | meta-llama/llama-3.1-8b-instruct |

Llama 3 Instruct 8B

  • Creator: Meta
  • Quality: 64
  • Knowledge: Mar 2023
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| fireworks ai | 8k | $0.20 | $0.20 | accounts/fireworks/models/llama-v3-8b-instruct |
| deepinfra | 8k | $0.06 | $0.06 | meta-llama/Meta-Llama-3-8B-Instruct |
| together ai | 8k | $0.20 | $0.20 | META-LLAMA/LLAMA-3-8B-CHAT-HF |
| octoai | 8k | $0.15 | $0.15 | meta-llama-3-8b-instruct |
| lepton ai | 8k | $0.07 | $0.07 | llama3-8b |
| groq | 8k | $0.05 | $0.08 | Llama3-8b-8192 |
| microsoft azure | 8k | $0.37 | $1.10 | Meta-Llama-3-8B-Instruct |
| replicate | 8k | $0.05 | $0.25 | meta/meta-llama-3-8b-instruct |
| aws | 8k | $0.30 | $0.60 | meta.llama3-8b-instruct-v1:0 |
| novita ai | 8k | $0.06 | $0.06 | meta-llama/llama-3-8b-instruct |

Mixtral 8x7B Instruct

  • Creator: Mistral AI
  • Quality: 61
  • Knowledge: Dec 2023
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| fireworks ai | 33k | $0.50 | $0.50 | accounts/fireworks/models/mixtral-8x7b-instruct |
| deepinfra | 33k | $0.24 | $0.24 | mistralai/Mixtral-8x7B-Instruct-v0.1 |
| together ai | 33k | $0.60 | $0.60 | mistralai/Mixtral-8x7B-Instruct-v0.1 |
| octoai | 33k | $0.45 | $0.45 | mixtral-8x7b-instruct |
| lepton ai | 33k | $0.50 | $0.50 | mixtral-8x7b |
| databricks | 33k | $0.50 | $1.00 | databricks-mixtral-8x7b-instruct |
| aws | 33k | $0.45 | $0.70 | mistral.mixtral-8x7b-instruct-v0:1 |
| replicate | 33k | $0.70 | $0.70 | mistralai/mixtral-8x7b-instruct-v0.1 |
| groq | 33k | $0.24 | $0.24 | mixtral-8x7b-32768 |

Mistral 7B Instruct

  • Creator: Mistral AI
  • Quality: 40
  • Knowledge: Dec 2023
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| fireworks ai | 33k | $0.20 | $0.20 | accounts/fireworks/models/mistral-7b-instruct-v0p2 |
| deepinfra | 33k | $0.06 | $0.06 | mistralai/Mistral-7B-Instruct-v0.3 |
| together ai | 8k | $0.20 | $0.20 | mistralai/Mistral-7B-Instruct-v0.3 |
| octoai | 33k | $0.15 | $0.15 | mistral-7b-instruct |
| lepton ai | 33k | $0.07 | $0.07 | lepton |
| baseten | 4k | $0.20 | $0.20 | mistral-7b |
| aws | 33k | $0.15 | $0.20 | mistral.mistral-7b-instruct-v0:2 |
| replicate | 33k | $0.05 | $0.25 | mistralai/mistral-7b-instruct-v0.2 |
| novita ai | 33k | $0.06 | $0.06 | mistralai/mistral-7b-instruct |

Mixtral 8x22B Instruct

  • Creator: Mistral AI
  • Quality: 71
  • Knowledge: Sep 2021
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| deepinfra | 65k | $0.65 | $0.65 | mistralai/Mixtral-8x22B-Instruct-v0.1 |
| together ai | 65k | $1.20 | $1.20 | MISTRALAI/MIXTRAL-8X22B-INSTRUCT-V0.1 |
| fireworks ai | 65k | $1.20 | $1.20 | accounts/fireworks/models/mixtral-8x22b-instruct |
| octoai | 65k | $1.20 | $1.20 | mixtral-8x22b-instruct |

OpenChat 3.5 (1210)

  • Creator: OpenChat
  • Quality: 50
  • Knowledge:
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| deepinfra | 8k | $0.07 | $0.07 | openchat/openchat_3.5 |
| together ai | 8k | $0.20 | $0.20 | openchat/openchat-3.5-1210 |
| novita ai | 4k | $0.06 | $0.06 | openchat/openchat-7b |

Qwen2 Instruct 72B

  • Creator: Alibaba
  • Quality: 83
  • Knowledge: 2023
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| deepinfra | 33k | $0.56 | $0.77 | Qwen/Qwen2-72B-Instruct |
| together ai | 33k | $0.90 | $0.90 | Qwen/Qwen2-72B-Instruct |
| fireworks ai | 33k | $0.90 | $0.90 | accounts/fireworks/models/qwen2-72b-instruct |

Qwen2 Instruct 7B

  • Creator: Alibaba
  • Quality:
  • Knowledge: 2023
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| deepinfra | 33k | $0.56 | $0.77 | Qwen/Qwen2-7B-Instruct |
| together ai | 33k | $0.81 | $0.81 | Qwen/Qwen2-7B-Instruct |
| fireworks ai | 33k | $0.90 | $0.90 | accounts/fireworks/models/qwen2-7b-instruct |

Phi-3 Medium Instruct 14B

  • Creator: Microsoft
  • Quality:
  • Knowledge: Oct 2023
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| deepinfra | 4k | $0.14 | $0.14 | microsoft/Phi-3-medium-4k-instruct |

Nous Capybara 7B

  • Creator: Nous Research
  • Quality:
  • Knowledge: 2021
| API Provider | Context Window | Input Price $/1M | Output Price $/1M | API ID |
| --- | --- | --- | --- | --- |
| together ai | 8k | $0.18 | $0.18 | Nous-Capybara-7B-V1p9 |
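
The provider listings above show that the same model can vary widely in price and context window from one provider to another. As a rough illustration, the Python sketch below shortlists providers for one model by a minimum context window and then orders them by price. The rows are copied from the Llama 3.1 Instruct 405B listing; the `Offer` class, the `shortlist` function, and the choice to rank by input price plus output price are hypothetical and are not part of the leaderboard's own methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    provider: str
    context_k: int       # context window, in thousands of tokens
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens


# Rows copied from the Llama 3.1 Instruct 405B listing.
LLAMA_405B_OFFERS = [
    Offer("fireworks ai", 128, 3.00, 3.00),
    Offer("deepinfra", 33, 7.00, 14.00),
    Offer("replicate", 128, 9.50, 9.50),
    Offer("together ai", 4, 5.00, 5.00),
    Offer("octoai", 128, 3.00, 9.00),
    Offer("lepton ai", 128, 2.80, 2.80),
    Offer("databricks", 128, 10.00, 30.00),
    Offer("novita ai", 33, 2.75, 2.75),
]


def shortlist(offers, min_context_k):
    """Keep offers with at least min_context_k of context, cheapest first.

    Ranking by input price + output price is an illustrative assumption
    that weights both directions equally; adjust it to your own traffic.
    """
    eligible = [o for o in offers if o.context_k >= min_context_k]
    return sorted(eligible, key=lambda o: o.input_price + o.output_price)


if __name__ == "__main__":
    for offer in shortlist(LLAMA_405B_OFFERS, min_context_k=128):
        print(offer.provider, offer.input_price, offer.output_price)
```

If your workload is dominated by input tokens (for example, long documents with short answers), a weighted sum would likely be a better sort key than the equal weighting used here.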

Key Definitions

  • Quality Index: A standardized score reflecting average performance across Chatbot Arena, MMLU, and MT-Bench benchmarks.
  • Context Window: The maximum combined number of input and output tokens. (Note: Output token limits are often lower than input limits.)
  • Input Price: Price of the tokens sent to the API in the request, in USD per million tokens.
  • Output Price: Price of the tokens generated by the model (received from the API), in USD per million tokens.
  • Knowledge Cutoff: The date the model’s training data was last updated. Information or events after this date may not be reflected in the model’s responses.
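
To make the price and context window definitions concrete, here is a minimal Python sketch of how they might be applied to a single request. The function names are hypothetical, and the example assumes that "8k" means 8,192 tokens, which the listing does not state explicitly.

```python
def request_cost_usd(input_tokens, output_tokens, input_price, output_price):
    """Estimate the cost of one request.

    input_price and output_price are in USD per 1M tokens, as listed
    in the provider tables.
    """
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def fits_context(input_tokens, output_tokens, context_window):
    """Return True if input plus output tokens stay within the context window."""
    return input_tokens + output_tokens <= context_window


if __name__ == "__main__":
    # Prices from the groq row for Llama 3 Instruct 70B: $0.59 in, $0.79 out.
    # Assumption: the listed "8k" context window is 8,192 tokens.
    cost = request_cost_usd(6_000, 1_000, input_price=0.59, output_price=0.79)
    print(f"Estimated cost: ${cost:.5f}")
    print("Fits in context:", fits_context(6_000, 1_000, 8_192))
```

Because many providers cap output tokens below the full context window, passing this check does not guarantee that a given provider will accept the request.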