Available Providers
Snowflake
OpenAI
Gemini
Bedrock
Quick Reference: Model Capabilities
This table shows which capabilities each model category supports:

| Model | Provider | Multimodal | Structured Output | Reasoning | Agents | Best Use Case |
|---|---|---|---|---|---|---|
| GPT-5 Series | OpenAI | No | Yes | Yes | Yes | Complex reasoning, premium applications |
| GPT-4.1 Series | OpenAI | No | Yes | No | Yes | Production deployments, reliable automation |
| GPT-4o Series | OpenAI | Yes | Yes | No | Yes | General-purpose, document analysis |
| o3-mini / o1-mini | OpenAI | No | No | Yes (req.) | No | Mathematical reasoning, logic problems |
| GPT-4 / GPT-3.5 | OpenAI | No | Partial | No | No | Legacy applications |
| Claude 4.5 Sonnet | Snowflake | No | No | No | Yes | Advanced reasoning, detailed analysis |
| Claude 4 Opus | Snowflake | No | No | No | Yes | Most sophisticated reasoning |
| Claude 4 Sonnet | Snowflake | No | No | No | Yes | Balanced performance |
| Claude Haiku 4.5 | Snowflake | No | No | No | Yes | Fast, efficient processing |
| Claude 3.7 / 3.5 | Snowflake | No | No | No | Yes | Cost-effective reasoning |
| Cortex GPT-5 | Snowflake | No | Yes | Yes | Yes | OpenAI through Snowflake |
| Cortex GPT-4.1 | Snowflake | No | Yes | No | Yes | Production OpenAI via Snowflake |
| Cortex o4-mini | Snowflake | No | No | Yes (req.) | Yes | Reasoning through Snowflake |
| Gemini 3 Pro | Gemini | Yes | No | No | Yes | Cutting-edge multimodal |
| Gemini 2.5 Pro/Flash | Gemini | Yes | No | No | Yes | Production multimodal |
| Gemini 2.0 Flash | Gemini | Yes | No | No | No | Cost-effective multimodal |
| Gemini 1.5 Pro | Gemini | Yes | No | No | No | Proven multimodal |
| DeepSeek R1 | Snowflake | No | No | No | No | Open-source reasoning |
| Llama 3.3 70B | Snowflake | No | No | No | No | Open-source, balanced |
| Llama 3.1 Series | Snowflake | No | Partial | No | No | Open-source, structured output |
| Llama 3 Series | Snowflake | No | No | No | No | Open-source, function calling |
| Mistral Large 2 | Snowflake | No | No | No | No | Multilingual, European focus |
| Mistral 7B / Mixtral | Snowflake | No | No | No | No | Efficient small models |
| Snowflake Arctic | Snowflake | No | No | No | No | Data cloud native |
| Arctic Embeddings | Snowflake | N/A | N/A | N/A | N/A | Semantic search |
- Multimodal: Processes text and images together
- Structured Output: Guaranteed JSON/XML format responses
- Reasoning: Advanced reasoning mode (req. = required, always on)
- Agents: Supports Elementum agent workflows
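When routing requests in application code, the matrix above can be encoded as a small lookup table. The sketch below is illustrative only — the `CAPABILITIES` dict, its model-name keys, and the `supports` helper are assumptions for this example, not an official registry or API:

```python
# Illustrative capability matrix derived from the table above.
# Model names and keys are examples, not an official registry.
CAPABILITIES = {
    "gpt-4o":         {"multimodal": True,  "structured_output": True,  "reasoning": False, "agents": True},
    "gpt-5":          {"multimodal": False, "structured_output": True,  "reasoning": True,  "agents": True},
    "o3-mini":        {"multimodal": False, "structured_output": False, "reasoning": True,  "agents": False},
    "gemini-2.5-pro": {"multimodal": True,  "structured_output": False, "reasoning": False, "agents": True},
}

def supports(model: str, capability: str) -> bool:
    """Return True if the model is known to support the capability.

    Unknown models and unknown capabilities default to False, which is
    the safe choice when gating a feature on model support."""
    return CAPABILITIES.get(model, {}).get(capability, False)
```

For example, `supports("gpt-4o", "multimodal")` returns `True`, while any unknown model falls back to `False`.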
Models by Provider
Snowflake Cortex
Snowflake Cortex provides access to multiple AI model families through your Snowflake data cloud.
Anthropic Claude (via Snowflake)
Claude 4.5 Sonnet
- Best for: Most advanced reasoning and analysis tasks
- Capabilities: Superior reasoning, nuanced understanding, extensive context windows
- Use cases: Complex research, detailed analysis, sophisticated automation, agent workflows
- Temperature range: 0.0 - 1.0 (default: 0.7)
- When to use: Premium applications requiring deepest understanding and analysis
Claude 4 Series
Claude 4 Opus
- Best for: Most sophisticated reasoning requiring highest capability
- Use cases: Strategic decisions, complex research, mission-critical analysis
- When to use: Tasks where quality matters more than cost
Claude 4 Sonnet
- Best for: Balanced performance and cost for demanding tasks
- Use cases: Business automation, production workflows, detailed analysis
- When to use: Production workloads needing strong reasoning
Claude Haiku 4.5
- Best for: Fast, efficient processing
- Use cases: High-volume operations, real-time interactions, simple automation
- When to use: Speed and cost-efficiency are priorities
Claude 3.7 & 3.5 Sonnet
Claude 3.7 Sonnet
- Best for: Daily tasks requiring strong reasoning at lower cost
- Use cases: Standard business automation, customer support, content generation
- When to use: Cost-effective production deployments
Claude 3.5 Sonnet
- Best for: Proven reliability in production
- Use cases: Established workflows, production automation
- When to use: Stability and proven performance matter
OpenAI via Cortex
Access OpenAI models through your Snowflake environment:
Cortex GPT-5 Series
- Capabilities: Structured output, reasoning mode, through Snowflake
- Use cases: Complex analysis, conversations, classification, intelligence features, agents
- When to use: Need OpenAI capabilities with Snowflake data residency
Cortex GPT-4.1 & o4-mini
Cortex GPT-4.1
- Production-ready OpenAI through Snowflake
- Structured output, reliable reasoning
Cortex o4-mini
- Reasoning model through Snowflake
- Required temperature: 1.0 (not adjustable)
- System role not supported
Open Source Models (via Snowflake)
DeepSeek R1
- Best for: Advanced reasoning with open-source flexibility
- Capabilities: Strong reasoning, open-source architecture
- Use cases: Research, academic applications, cost-conscious deployments
- When to use: Open-source requirements or research projects
Meta Llama Models
- Llama 3.3 70B (llama3.3-70b): Latest generation, balanced performance
- Llama 3.2 3B (llama3.2-3b): Efficient, compact
- Llama 3.2 1B (llama3.2-1b): Maximum efficiency for simple tasks
- Llama 3.1 405B (llama3.1-405b): Largest, most capable
- Llama 3.1 70B (llama3.1-70b): Production-ready, structured output support
- Llama 3.1 8B (llama3.1-8b): Cost-effective, structured output support
- Llama 3 70B (llama3-70b): Function calling, reliable
- Llama 3 8B (llama3-8b): Efficient operation
- Llama 2 70B Chat (llama2-70b-chat): Conversational focus
Mistral Models
- Mistral Large 2: Advanced capabilities, multilingual support, European markets
- Mistral Large: Previous generation, proven performance
- Mistral 7B: Compact, efficient, cost-effective
- Mixtral 8x7B: Mixture-of-experts architecture, balanced performance
Other Snowflake Models
- Data cloud native processing, integrated with Snowflake infrastructure
- When to use: Data-intensive workflows within Snowflake
- Lightweight Google-developed model
- When to use: Efficient processing on smaller tasks
- Instruction-following optimization
- Note: Not recommended for JSON/YAML parsing
- Advanced processing or fast operation
Snowflake Embedding Models
Arctic L V2.0
Arctic M V1.5
OpenAI Direct
Access OpenAI models directly through the OpenAI API.
GPT-5 Series - Latest Generation
- Best for: Most complex reasoning and premium applications
- Capabilities: Structured output, reasoning mode, advanced problem-solving
- Use cases: Complex analysis, strategic planning, research tasks
- Best for: Daily reasoning at lower cost
- Capabilities: Structured output, reasoning mode, balanced performance
- Use cases: Standard business logic, moderate analysis, automation
- Best for: Simple reasoning requiring efficiency
- Capabilities: Structured output, reasoning mode, cost-effective
- Use cases: Basic classification, simple analysis, high-volume operations
- Best for: Enhanced reasoning with improved accuracy
- Use cases: Business intelligence, detailed analysis, critical decisions
- Best for: Most advanced reasoning
- Use cases: Complex problem-solving, research, mission-critical applications
GPT-4.1 Series - Production Ready
GPT-4.1
- Best for: Production applications requiring consistent performance
- Capabilities: Structured output, reliable reasoning
- Use cases: Customer-facing applications, production workflows
GPT-4.1 Mini
- Best for: Cost-effective production deployments
- Use cases: High-volume automation, chatbots, content generation
GPT-4.1 Nano
- Best for: Maximum efficiency for simple tasks
- Use cases: Real-time interactions, simple classification, quick responses
GPT-4o Series - Optimized
GPT-4o
- Best for: Balanced performance and capability
- Capabilities: Structured output, multimodal support (text + images)
- Use cases: General-purpose applications, document analysis, versatile automation
- Supports: Prompts, conversations, email analysis, classification, intelligence, agents
GPT-4o Mini
- Best for: Cost-effective general-purpose tasks
- Capabilities: Structured output, multimodal support, efficient operation
- Use cases: Standard automation, customer support, content processing
- Supports: Prompts, conversations, email analysis, translation, classification, intelligence, agents
Reasoning Models - o3-mini & o1-mini
o3-mini
- Best for: Latest reasoning-focused tasks requiring deep analysis
- Capabilities: Advanced reasoning mode (required temperature: 1.0)
- Use cases: Mathematical problems, logical analysis, complex problem-solving
- Note: System role not supported; fixed temperature requirement
o1-mini
- Best for: Previous-generation reasoning tasks
- Capabilities: Reasoning mode (required temperature: 1.0)
- Use cases: Logic puzzles, analytical tasks, structured problem-solving
- Note: System role not supported; fixed temperature requirement
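Because the dedicated reasoning models (o1-mini, o3-mini, and Cortex o4-mini) require temperature 1.0 and reject the system role, a pre-flight check can surface these constraint violations before a request is sent. This is a minimal sketch — the `validate_request` helper and `REASONING_MODELS` set are illustrative, not part of any provider SDK:

```python
# Dedicated reasoning models: temperature fixed at 1.0, no system role.
# The model-name set and helper below are illustrative examples.
REASONING_MODELS = {"o1-mini", "o3-mini", "o4-mini"}

def validate_request(model: str, messages: list, temperature: float) -> list:
    """Return a list of constraint violations (empty if the request is valid)."""
    errors = []
    if model in REASONING_MODELS:
        if temperature != 1.0:
            errors.append(f"{model} requires temperature=1.0, got {temperature}")
        if any(m.get("role") == "system" for m in messages):
            errors.append(f"{model} does not support the system role")
    return errors
```

Running the check before the API call turns a provider-side rejection into a clear local error message.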
Legacy Models
- Function calling, extended context
- Recommendation: Consider upgrading to GPT-4.1 or GPT-5 series
- Structured output, reliable performance
- Supports: Prompts, conversations, summarization, email analysis, classification
- Function calling, basic capabilities
- Recommendation: Upgrade to GPT-4.1 Mini for better performance
Google Gemini
Access Google’s multimodal Gemini models directly.
Gemini 3 Series - Latest
- Best for: Cutting-edge multimodal capabilities
- Capabilities: Multimodal processing (text, images, audio), advanced reasoning
- Use cases: Document analysis with images, multimedia processing, complex automation
- Temperature range: 0.0 - 1.0 (default: 0.7)
- Supports: Prompts, translation, classification, file analysis, agents
Gemini 2.5 Series - Production Advanced
Gemini 2.5 Pro
- Best for: Complex multimodal tasks requiring high performance
- Capabilities: Multimodal, large context windows, detailed analysis
- Use cases: Document understanding, comprehensive analysis, advanced automation
Gemini 2.5 Flash
- Best for: Fast multimodal processing
- Capabilities: Multimodal, efficient operation, quick responses
- Use cases: Real-time document analysis, responsive automation
Gemini 2.0 Series - Efficient
Gemini 2.0 Flash
- Best for: Cost-effective multimodal processing
- Use cases: Standard document processing, general automation
Gemini 2.0 Flash Lite
- Best for: Lightweight multimodal tasks
- Use cases: Simple document analysis, high-volume operations
Gemini 1.5 Pro - Established
- Best for: Proven multimodal performance
- Capabilities: Multimodal processing, reliable operation
- Use cases: Production workloads, established workflows
- Supports: Prompts, translation, classification, file analysis
Model Selection Guide
By Use Case
Conversational Agents
- GPT-4o Mini - Best balance of cost and performance
- Claude 3.7 Sonnet - Superior reasoning at reasonable cost
- Gemini 2.5 Flash - Fast multimodal conversations
- GPT-5 Mini - Advanced reasoning for complex interactions
Why these models:
- Support structured output for reliable responses
- Handle context well for conversation continuity
- Cost-effective for high-volume interactions
- Proven reliability in production
Document Analysis with Images
- Gemini 2.5 Pro - Complex multimodal analysis
- Gemini 3 Pro Preview - Cutting-edge document understanding
- GPT-4o - Strong multimodal processing
- Gemini 2.5 Flash - Fast multimodal analysis
Why these models:
- Multimodal support for images and text together
- Large context windows for lengthy documents
- Strong reasoning for extracting insights
- Handle charts, diagrams, and visual elements
Data Classification & Extraction
- GPT-4.1 Nano - Fast, cost-effective
- GPT-4o Mini - Structured output for consistency
- Claude Haiku 4.5 - Quick, efficient
- GPT-4.1 Mini - Production-ready reliability
Why these models:
- Structured output ensures consistent categorization
- Cost-effective for high-volume operations
- Fast response times for real-time classification
- Reliable accuracy for business logic
Complex Reasoning & Analysis
- o3-mini - Specialized reasoning mode for logic
- Claude 4 Opus - Deepest reasoning capability
- GPT-5 - Advanced problem-solving
- Claude Sonnet 4.5 - Sophisticated analysis
Why these models:
- Advanced reasoning capabilities
- Handle multi-step logic effectively
- Understand complex relationships
- Provide detailed explanations
Content Generation
- Claude Sonnet 4.5 - Excellent writing quality
- GPT-5 - Creative and coherent content
- Gemini 2.5 Pro - Long-form content
- Claude 4 Sonnet - High-quality balanced output
Why these models:
- Natural, fluent writing style
- Good creativity control via temperature
- Handle various content types well
- Consistent quality and tone
Semantic Search
- Snowflake Arctic L V2.0 - Latest, highest quality
- Snowflake Arctic M V1.5 - Reliable production
Why these models:
- Optimized for semantic similarity
- Consistent vector representations
- Efficient processing at scale
- Note: Must use Snowflake provider for embedding models
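Once documents and queries are embedded with an Arctic model, semantic search reduces to ranking document vectors by cosine similarity against the query vector. The sketch below shows the ranking step only, assuming embedding vectors have already been obtained; the `cosine_similarity` and `rank` helpers are illustrative, not part of the platform API:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (range -1.0 to 1.0)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank(query_vec, doc_vecs):
    """Rank document vectors by similarity to the query vector.

    Returns (index, score) pairs, most similar first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In production you would typically store vectors in a database with native vector search rather than ranking in application code, but the similarity measure is the same.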
By Budget
Cost-Conscious
- GPT-4.1 Nano: Minimal cost, simple tasks
- Claude Haiku 4.5: Fast and efficient
- Gemini 2.0 Flash Lite: Lightweight multimodal
- Llama 3.2 1B/3B: Maximum efficiency
- Mistral 7B: Small but capable
By Provider Strengths
Snowflake
- Data residency in your cloud
- Wide model selection
- Claude and OpenAI access
- Native data processing
OpenAI Direct
- Latest GPT models first
- Structured output
- Mature ecosystem
- Reliable performance
Gemini
- Multimodal capabilities
- Large context windows
- Fast processing
- Cutting-edge features
Key Model Capabilities
Multimodal Processing
What it is: Process text and images together in the same request
Supported Models:
- All Gemini models (2.0+)
- GPT-4o, GPT-4o Mini
Use cases:
- Document analysis with charts/diagrams
- Image-based data extraction
- Visual content understanding
- OCR and form processing
Structured Output
What it is: Guaranteed JSON/XML format responses for reliable automation
Supported Models:
- All GPT-4o, GPT-4.1, GPT-5 series
- Cortex GPT models
- GPT-4 (partial)
- Llama 3.1 8B, 70B
Use cases:
- Data extraction to databases
- Automated classification
- API integrations
- Workflow automation
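Even with structured output enabled, it is good practice to validate the response before handing it to downstream automation such as a database write. A minimal sketch, assuming a classifier that returns JSON with `category` and `confidence` fields (the schema and `parse_classification` helper are hypothetical examples):

```python
import json

# Example schema for a hypothetical classifier response.
REQUIRED_KEYS = {"category", "confidence"}

def parse_classification(raw: str) -> dict:
    """Parse a structured-output response and verify required fields.

    Raises ValueError if the JSON is malformed or a field is missing,
    so bad responses fail fast instead of corrupting downstream data."""
    data = json.loads(raw)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"response missing fields: {sorted(missing)}")
    return data
```

The same pattern applies to any structured-output workflow: parse, validate against the expected schema, then act.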
Reasoning Mode
What it is: Extended thinking for complex problems with step-by-step reasoning
Supported Models:
- o1-mini, o3-mini (dedicated reasoning, always on)
- GPT-5 series (configurable)
- Cortex o4-mini (dedicated reasoning)
Use cases:
- Mathematical problems
- Logic puzzles
- Complex analysis
- Multi-step problem-solving
Agent Support
What it is: Optimized for Elementum agent workflows and multi-step tasks
Supported Models:
- GPT-4o, GPT-4.1, GPT-5 series
- All Claude models (via Snowflake)
- Cortex OpenAI models
- Gemini 2.5+, Gemini 3 Pro
Use cases:
- Conversational agents
- Multi-turn interactions
- Complex workflows
- Autonomous task execution
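Multi-turn agent interactions work by accumulating message history and resending it with each request. A minimal sketch of that bookkeeping, using the common role/content message shape (the `Conversation` class is an illustrative example; provider client code is omitted):

```python
from typing import Optional

class Conversation:
    """Minimal message-history container for multi-turn interactions.

    Follows the common chat-API convention of role/content dicts;
    the accumulated list is what gets sent on each model call."""

    def __init__(self, system_prompt: Optional[str] = None):
        self.messages = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def add_user(self, text: str):
        self.messages.append({"role": "user", "content": text})

    def add_assistant(self, text: str):
        self.messages.append({"role": "assistant", "content": text})
```

Note that for dedicated reasoning models, which do not support the system role, the system prompt would have to be folded into the first user message instead.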
Temperature Settings
All models except dedicated reasoning models support customizable temperature:
- 0.0 - 0.3: Deterministic, consistent (classification, data extraction)
- 0.4 - 0.7: Balanced creativity (conversation, general tasks)
- 0.8 - 1.0: Creative, diverse (content generation, brainstorming)
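The bands above can be captured in a small helper that maps a task type to a default temperature. The task names and values below are illustrative choices within each band, not fixed recommendations:

```python
def temperature_for(task: str) -> float:
    """Map a task type to a temperature, following the bands above.

    Task names and exact values are illustrative; dedicated reasoning
    models ignore this entirely and require temperature 1.0."""
    bands = {
        "classification": 0.0,       # deterministic, consistent
        "extraction": 0.2,           # deterministic, consistent
        "conversation": 0.7,         # balanced creativity
        "content_generation": 0.9,   # creative, diverse
    }
    return bands.get(task, 0.7)      # default to the balanced band
```

Centralizing the mapping keeps temperature choices consistent across an application instead of scattering magic numbers through each call site.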
Best Practices
Model Selection
Identify Your Use Case
Check Required Capabilities
Consider Your Provider
Balance Cost and Performance
Test Before Committing
Monitor and Optimize
Cost Optimization
Choose right-sized models:
- Use Nano/Mini for simple tasks
- Reserve Pro/Opus for complex analysis
- Test if smaller models meet needs
Optimize prompts:
- Write concise, clear instructions
- Remove unnecessary context
- Set appropriate max tokens
- Use structured output formats
Understand provider pricing:
- Snowflake Cortex models cost ~4.5x base rate (includes infrastructure and data residency)
- Direct provider access may be more cost-effective for high-volume, simple tasks
- Snowflake provides value through data residency and unified platform
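The ~4.5x Cortex uplift can be folded into a quick back-of-envelope estimate when comparing routing options. The helper below is a rough sketch based on the approximation in this guide, not a published price sheet; the base rate is whatever the direct provider charges:

```python
CORTEX_MULTIPLIER = 4.5  # approximate uplift over the direct-provider base rate

def cortex_cost(base_cost_per_1k: float, tokens: int) -> float:
    """Estimate Snowflake Cortex cost for a call.

    base_cost_per_1k: the direct provider's rate per 1,000 tokens.
    The multiplier is an approximation from this guide and covers
    infrastructure and data residency, not a published price."""
    return base_cost_per_1k * (tokens / 1000) * CORTEX_MULTIPLIER
```

For example, at a hypothetical base rate of $0.01 per 1,000 tokens, a 2,000-token call would cost roughly $0.02 direct versus about $0.09 through Cortex.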
Performance Optimization
For speed:
- Use Mini/Nano/Haiku models
- Lower max tokens
- Choose geographically close providers
For quality:
- Use Pro/Opus/Sonnet tier models
- Provide detailed context
- Test with real examples
For consistency:
- Use low temperature (0.0-0.2)
- Enable structured output
- Choose models with structured output support
Next Steps
Configure Providers
Create AI Services
Build Agents
AI Automations
Models are accessed through configured providers. Choose providers based on your data residency, integration, and model access needs, then select the right model for each specific task.