LLM Model Tips

Tips on which LLM to use for different actions.

AI Activity Models

| Model ID | Capability | Cost | Speed | Notes |
|---|---|---|---|---|
| azure-gpt-5 | Advanced | Expensive | Slow | |
| azure-gpt-5-mini | Standard | Moderate | Fast | |
| azure-gpt-41 | Standard | Moderate | Fast | |
| azure-gpt-4o | Standard | Moderate, more expensive than azure-gpt-41 | Very Fast | Probably just use azure-gpt-41 instead. |
| azure-gpt-4o-mini | Simple | Very Cheap | Very Fast | |
| azure-o4-mini:reasoning=high | Standard | Moderate | Slow | |
| azure-o4-mini:reasoning=medium | Simple | Cheap | Fast | |
| azure-o4-mini:reasoning=low | Simple | Very Cheap | Very Fast | |
| gemini-2.5-pro | Advanced | Expensive | Slow | |
| gemini-2.5-flash | Standard | Cheap | Very Fast | |
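
For anyone who wants to pick a model programmatically, the table above can be treated as a small lookup. The following is only a sketch: the `ModelInfo` class, the `pick_model` helper, the numeric ranks, and the `Moderate+` label (used here for azure-gpt-4o's "Moderate, more expensive than azure-gpt-41") are all hypothetical names invented for illustration, not part of any real API. It returns the cheapest model that meets a minimum capability and speed. Ties on cost go to the faster model, then to table order.

```python
from dataclasses import dataclass
from typing import Optional

# Ordinal ranks for the labels used in the table above.
# The numbers are assumptions made for this sketch; the table only gives labels.
CAPABILITY = {"Simple": 0, "Standard": 1, "Advanced": 2}
SPEED = {"Slow": 0, "Fast": 1, "Very Fast": 2}
# "Moderate+" is a hypothetical label standing in for azure-gpt-4o's
# "Moderate, more expensive than azure-gpt-41".
COST = {"Very Cheap": 0, "Cheap": 1, "Moderate": 2, "Moderate+": 2.5, "Expensive": 3}


@dataclass(frozen=True)
class ModelInfo:
    model_id: str
    capability: str
    cost: str
    speed: str


# Rows transcribed from the AI Activity Models table, in table order.
AI_ACTIVITY_MODELS = [
    ModelInfo("azure-gpt-5", "Advanced", "Expensive", "Slow"),
    ModelInfo("azure-gpt-5-mini", "Standard", "Moderate", "Fast"),
    ModelInfo("azure-gpt-41", "Standard", "Moderate", "Fast"),
    ModelInfo("azure-gpt-4o", "Standard", "Moderate+", "Very Fast"),
    ModelInfo("azure-gpt-4o-mini", "Simple", "Very Cheap", "Very Fast"),
    ModelInfo("azure-o4-mini:reasoning=high", "Standard", "Moderate", "Slow"),
    ModelInfo("azure-o4-mini:reasoning=medium", "Simple", "Cheap", "Fast"),
    ModelInfo("azure-o4-mini:reasoning=low", "Simple", "Very Cheap", "Very Fast"),
    ModelInfo("gemini-2.5-pro", "Advanced", "Expensive", "Slow"),
    ModelInfo("gemini-2.5-flash", "Standard", "Cheap", "Very Fast"),
]


def pick_model(min_capability: str, min_speed: str = "Slow") -> Optional[ModelInfo]:
    """Return the cheapest model meeting both minimums, or None if none qualifies."""
    candidates = [
        m for m in AI_ACTIVITY_MODELS
        if CAPABILITY[m.capability] >= CAPABILITY[min_capability]
        and SPEED[m.speed] >= SPEED[min_speed]
    ]
    if not candidates:
        return None
    # Lowest cost first; among equal costs prefer the faster model.
    # min() keeps the earliest row when keys are fully tied.
    return min(candidates, key=lambda m: (COST[m.cost], -SPEED[m.speed]))
```

With these ranks, `pick_model("Standard")` returns the gemini-2.5-flash row, and `pick_model("Advanced", min_speed="Fast")` returns `None` because both Advanced models are listed as Slow.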

Generative Prompt Models

| Model ID | Performance | Cost | Speed | Notes |
|---|---|---|---|---|
| azure-gpt-5 | Elite | Expensive | Slow | |
| azure-gpt-5-mini | Great | Moderate | Fast | |
| azure-gpt-4o | Good | Moderate | Fast | |
| azure-gpt-4o-mini | OK | Very Cheap | Very Fast | |
| gemini-2.5-flash | Great | Cheap | Very Fast | Recommended for basic tasks. |

LLMs in General - Written by Gemini (lol)

| Model Name | Performance | Cost | Speed | Recommended Usages |
|---|---|---|---|---|
| GPT-5 | Elite | Expensive | Slow | Autonomous agents, complex research, expert-level coding. |
| GPT-5-mini | Great | Moderate | Fast | High-quality reasoning for production-grade apps. |
| GPT-4.1 | Great | Moderate | Fast | Advanced logic, long-context (1M tokens) document analysis. |
| GPT-4o | Good | Moderate | Very Fast | Everyday chat, general vision tasks, voice interaction. |
| GPT-4o-mini | OK | Very Cheap | Very Fast | High-volume simple tasks, basic customer support bots. |
| o4-mini (High) | Great | Moderate | Slow | Scientific reasoning, complex debugging, precision math. |
| o4-mini (Medium) | Good | Cheap | Fast | Standard reasoning tasks, logical data extraction. |
| o4-mini (Low) | OK | Very Cheap | Very Fast | Fast logic-checks, structured output from simple data. |
| Gemini 2.5 Pro | Elite | Expensive | Slow | Deep multi-modal reasoning, massive context (2M tokens). |
| Gemini 2.5 Flash | Great | Cheap | Very Fast | Real-time agents, high-speed coding, large-scale data processing. |
| Gemini 2.0 Flash | Good | Ultra-Cheap | Instant | High-throughput routing, simple translation, low-latency UI. |