LLM Model Tips

AI Activity Models

Model ID	Capability	Cost	Speed	Notes
azure-gpt-5	Advanced	Expensive	Slow
azure-gpt-5-mini	Standard	Moderate	Fast
azure-gpt-41	Standard	Moderate	Fast
azure-gpt-4o	Standard	Moderate (higher than azure-gpt-41)	Very Fast	Prefer azure-gpt-41 in most cases.
azure-gpt-4o-mini	Simple	Very Cheap	Very Fast
azure-o4-mini:reasoning=high	Standard	Moderate	Slow
azure-o4-mini:reasoning=medium	Simple	Cheap	Fast
azure-o4-mini:reasoning=low	Simple	Very Cheap	Very Fast
gemini-2.5-pro	Advanced	Expensive	Slow
gemini-2.5-flash	Standard	Cheap	Very Fast
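
The `:reasoning=high` suffix on the o4-mini IDs above suggests a `base:key=value` naming convention. The following is a minimal sketch of splitting such an ID into its base model and options, assuming that convention holds. `parse_model_id` is a hypothetical helper, not part of any SDK, and support for several comma-separated options is also an assumption.

```python
def parse_model_id(model_id: str) -> tuple[str, dict[str, str]]:
    """Split 'base:key=value' into the base model name and an options dict."""
    base, _, option_text = model_id.partition(":")
    options: dict[str, str] = {}
    if option_text:
        # Assumption: multiple options, if ever used, are comma-separated.
        for pair in option_text.split(","):
            key, _, value = pair.partition("=")
            options[key.strip()] = value.strip()
    return base, options


if __name__ == "__main__":
    print(parse_model_id("azure-o4-mini:reasoning=high"))
    print(parse_model_id("azure-gpt-41"))
```

An ID without a suffix, such as `azure-gpt-41`, comes back with an empty options dict, so callers can treat both forms the same way.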

Model ID	Performance	Cost	Speed	Notes
azure-gpt-5	Elite	Expensive	Slow
azure-gpt-5-mini	Great	Moderate	Fast
azure-gpt-4o	Good	Moderate	Fast
azure-gpt-4o-mini	OK	Very Cheap	Very Fast
gemini-2.5-flash	Great	Cheap	Very Fast	Recommended for basic tasks.
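
One way to apply the table above is to pick the cheapest model that meets a minimum performance level. The sketch below copies the rows from the table. The numeric ranking of the labels and the `cheapest_model` helper are illustrative assumptions, not an existing API.

```python
# Rows copied from the table above: model ID -> (performance, cost).
MODELS = {
    "azure-gpt-5": ("Elite", "Expensive"),
    "azure-gpt-5-mini": ("Great", "Moderate"),
    "azure-gpt-4o": ("Good", "Moderate"),
    "azure-gpt-4o-mini": ("OK", "Very Cheap"),
    "gemini-2.5-flash": ("Great", "Cheap"),
}

# Assumed ordering of the labels, lowest to highest.
PERFORMANCE_RANK = {"OK": 0, "Good": 1, "Great": 2, "Elite": 3}
COST_RANK = {"Very Cheap": 0, "Cheap": 1, "Moderate": 2, "Expensive": 3}


def cheapest_model(min_performance: str) -> str:
    """Return the cheapest model ID whose performance meets the minimum.

    Ties on cost are broken in favor of the higher-performing model.
    """
    floor = PERFORMANCE_RANK[min_performance]
    candidates = [
        (COST_RANK[cost], -PERFORMANCE_RANK[perf], model_id)
        for model_id, (perf, cost) in MODELS.items()
        if PERFORMANCE_RANK[perf] >= floor
    ]
    return min(candidates)[2]


if __name__ == "__main__":
    for level in ("OK", "Good", "Great", "Elite"):
        print(level, "->", cheapest_model(level))
```

With these rows, `cheapest_model("Great")` returns `gemini-2.5-flash`, which matches the recommendation in the Notes column.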

Model Name	Performance	Cost	Speed	Recommended Uses
GPT-5	Elite	Expensive	Slow	Autonomous agents, complex research, expert-level coding.
GPT-5-mini	Great	Moderate	Fast	High-quality reasoning for production-grade apps.
GPT-4.1	Great	Moderate	Fast	Advanced logic, long-context (1M tokens) document analysis.
GPT-4o	Good	Moderate	Very Fast	Everyday chat, general vision tasks, voice interaction.
GPT-4o-mini	OK	Very Cheap	Very Fast	High-volume simple tasks, basic customer support bots.
o4-mini (High)	Great	Moderate	Slow	Scientific reasoning, complex debugging, precision math.
o4-mini (Medium)	Good	Cheap	Fast	Standard reasoning tasks, logical data extraction.
o4-mini (Low)	OK	Very Cheap	Very Fast	Fast logic-checks, structured output from simple data.
Gemini 2.5 Pro	Elite	Expensive	Slow	Deep multi-modal reasoning, massive context (1M tokens).
Gemini 2.5 Flash	Great	Cheap	Very Fast	Real-time agents, high-speed coding, large-scale data processing.
Gemini 2.0 Flash	Good	Ultra-Cheap	Instant	High-throughput routing, simple translation, low-latency UI.