Available Providers
Snowflake
Claude, OpenAI (Cortex), Llama, Mistral, DeepSeek, embedding models
OpenAI
GPT models
Gemini
Multimodal Gemini models
Bedrock
AWS Bedrock agents (agent orchestration only)
Important: Models are accessed through providers you configure in Organization Settings. For example, Claude models are accessed through the Snowflake provider, not directly from Anthropic.
Quick Reference: Model Capabilities
This table shows which capabilities each model category supports:

| Model | Provider | Multimodal | Structured Output | Reasoning | Agents | Best Use Case |
|---|---|---|---|---|---|---|
| GPT-5 Series | OpenAI | No | Yes | Yes | Yes | Complex reasoning, premium applications |
| GPT-4.1 Series | OpenAI | No | Yes | No | Yes | Production deployments, reliable automation |
| GPT-4o Series | OpenAI | Yes | Yes | No | Yes | General-purpose, document analysis |
| o3-mini / o1-mini | OpenAI | No | No | Yes (req.) | No | Mathematical reasoning, logic problems |
| GPT-4 / GPT-3.5 | OpenAI | No | Partial | No | No | Legacy applications |
| Claude 4.5 Sonnet | Snowflake | No | No | No | Yes | Advanced reasoning, detailed analysis |
| Claude 4 Opus | Snowflake | No | No | No | Yes | Most sophisticated reasoning |
| Claude 4 Sonnet | Snowflake | No | No | No | Yes | Balanced performance |
| Claude Haiku 4.5 | Snowflake | No | No | No | Yes | Fast, efficient processing |
| Claude 3.7 / 3.5 | Snowflake | No | No | No | Yes | Cost-effective reasoning |
| Cortex GPT-5 | Snowflake | No | Yes | Yes | Yes | OpenAI through Snowflake |
| Cortex GPT-4.1 | Snowflake | No | Yes | No | Yes | Production OpenAI via Snowflake |
| Cortex o4-mini | Snowflake | No | No | Yes (req.) | Yes | Reasoning through Snowflake |
| Gemini 3 Pro | Gemini | Yes | No | No | Yes | Cutting-edge multimodal |
| Gemini 2.5 Pro/Flash | Gemini | Yes | No | No | Yes | Production multimodal |
| Gemini 2.0 Flash | Gemini | Yes | No | No | No | Cost-effective multimodal |
| Gemini 1.5 Pro | Gemini | Yes | No | No | No | Proven multimodal |
| DeepSeek R1 | Snowflake | No | No | No | No | Open-source reasoning |
| Llama 3.3 70B | Snowflake | No | No | No | No | Open-source, balanced |
| Llama 3.1 Series | Snowflake | No | Partial | No | No | Open-source, structured output |
| Llama 3 Series | Snowflake | No | No | No | No | Open-source, function calling |
| Mistral Large 2 | Snowflake | No | No | No | No | Multilingual, European focus |
| Mistral 7B / Mixtral | Snowflake | No | No | No | No | Efficient small models |
| Snowflake Arctic | Snowflake | No | No | No | No | Data cloud native |
| Arctic Embeddings | Snowflake | N/A | N/A | N/A | N/A | Semantic search |
- Multimodal: Processes text and images together
- Structured Output: Guaranteed JSON/XML format responses
- Reasoning: Advanced reasoning mode (req. = required, always on)
- Agents: Supports Elementum agent workflows
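The capability flags above can drive programmatic model selection. A minimal sketch in Python, using a hand-copied subset of the table (the dictionary and helper function are illustrative, not part of any Elementum API):

```python
# Capability flags copied from a few rows of the table above.
# Extend with additional rows as needed.
MODEL_CAPABILITIES = {
    "gpt-4o":            {"provider": "OpenAI",    "multimodal": True,  "structured_output": True,  "agents": True},
    "gpt-4.1":           {"provider": "OpenAI",    "multimodal": False, "structured_output": True,  "agents": True},
    "claude-sonnet-4-5": {"provider": "Snowflake", "multimodal": False, "structured_output": False, "agents": True},
    "gemini-2.5-pro":    {"provider": "Gemini",    "multimodal": True,  "structured_output": False, "agents": True},
    "o3-mini":           {"provider": "OpenAI",    "multimodal": False, "structured_output": False, "agents": False},
}

def models_with(**required):
    """Return model names whose capability flags match every requirement."""
    return sorted(
        name for name, caps in MODEL_CAPABILITIES.items()
        if all(caps.get(key) == value for key, value in required.items())
    )
```

For example, `models_with(structured_output=True, agents=True)` narrows the table to models suited for agent workflows that also need guaranteed output formats.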
Models by Provider
Snowflake Cortex
Snowflake Cortex provides access to multiple AI model families through your Snowflake data cloud.

Data Residency: All Snowflake Cortex models run within your Snowflake environment, keeping data in your cloud.
Anthropic Claude (via Snowflake)
Claude 4.5 Sonnet (claude-sonnet-4-5)
- Best for: Most advanced reasoning and analysis tasks
- Capabilities: Superior reasoning, nuanced understanding, extensive context windows
- Use cases: Complex research, detailed analysis, sophisticated automation, agent workflows
- Temperature range: 0.0 - 1.0 (default: 0.7)
- When to use: Premium applications requiring deepest understanding and analysis
Claude 4 Series
Claude 4 Opus (claude-4-opus)
- Best for: Most sophisticated reasoning requiring highest capability
- Use cases: Strategic decisions, complex research, mission-critical analysis
- When to use: Tasks where quality matters more than cost
Claude 4 Sonnet
- Best for: Balanced performance and cost for demanding tasks
- Use cases: Business automation, production workflows, detailed analysis
- When to use: Production workloads needing strong reasoning
Claude Haiku 4.5
- Best for: Fast, efficient processing
- Use cases: High-volume operations, real-time interactions, simple automation
- When to use: Speed and cost-efficiency are priorities
Claude 3.7 & 3.5 Sonnet
Claude 3.7 Sonnet (claude-3-7-sonnet)
- Best for: Daily tasks requiring strong reasoning at lower cost
- Use cases: Standard business automation, customer support, content generation
- When to use: Cost-effective production deployments
Claude 3.5 Sonnet
- Best for: Proven reliability in production
- Use cases: Established workflows, production automation
- When to use: Stability and proven performance matter
OpenAI via Cortex
Access OpenAI models through your Snowflake environment:

Cortex GPT-5 Series
openai-gpt-5 - Advanced reasoning
openai-gpt-5-mini - Efficient reasoning
openai-gpt-5-nano - Maximum efficiency
openai-gpt-5-chat - Optimized for conversations
- Capabilities: Structured output, reasoning mode, through Snowflake
- Use cases: Complex analysis, conversations, classification, intelligence features, agents
- When to use: Need OpenAI capabilities with Snowflake data residency
Cortex GPT-4.1 & o4-mini
openai-gpt-4.1
- Production-ready OpenAI through Snowflake
- Structured output, reliable reasoning
Cortex o4-mini
- Reasoning model through Snowflake
- Required temperature: 1.0 (not adjustable)
- System role not supported
Open Source Models (via Snowflake)
DeepSeek R1 (deepseek-r1)
- Best for: Advanced reasoning with open-source flexibility
- Capabilities: Strong reasoning, open-source architecture
- Use cases: Research, academic applications, cost-conscious deployments
- When to use: Open-source requirements or research projects
Meta Llama Models
Llama 3.3 & 3.2 Series
- Llama 3.3 70B (llama3.3-70b): Latest generation, balanced performance
- Llama 3.2 3B (llama3.2-3b): Efficient, compact
- Llama 3.2 1B (llama3.2-1b): Maximum efficiency for simple tasks
Llama 3.1 Series
- Llama 3.1 405B (llama3.1-405b): Largest, most capable
- Llama 3.1 70B (llama3.1-70b): Production-ready, structured output support
- Llama 3.1 8B (llama3.1-8b): Cost-effective, structured output support
Llama 3 & 2 Series
- Llama 3 70B (llama3-70b): Function calling, reliable
- Llama 3 8B (llama3-8b): Efficient operation
- Llama 2 70B Chat (llama2-70b-chat): Conversational focus
Mistral Models
Mistral Large 2 (mistral-large2)
- Advanced capabilities, multilingual support, European markets
Mistral Large
- Previous generation, proven performance
Mistral 7B
- Compact, efficient, cost-effective
Mixtral
- Mixture-of-experts architecture, balanced performance
Other Snowflake Models
Snowflake Arctic (snowflake-arctic)
- Data cloud native processing, integrated with Snowflake infrastructure
- When to use: Data-intensive workflows within Snowflake
Snowflake also offers a lightweight Google-developed model for efficient processing on smaller tasks, an instruction-following model (not recommended for JSON/YAML parsing), and variants tuned for advanced processing or fast operation.
Snowflake Embedding Models
Arctic L V2.0 (snowflake-arctic-embed-l-v2.0)
- Latest high-quality embeddings for semantic search
- Use for: New AI search implementations
Arctic M V1.5 (snowflake-arctic-embed-m-v1.5)
- Balanced performance and quality
- Use for: Production search systems
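Embedding models map text to vectors so that semantically similar texts land close together. A minimal sketch of the ranking step a semantic-search backend performs over such vectors; the three-dimensional vectors here are tiny hand-made stand-ins (real ones come from a model such as snowflake-arctic-embed-l-v2.0 and have many more dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, doc_vecs):
    """Return document ids ordered by similarity to the query, best first."""
    scored = [(cosine_similarity(query_vec, v), doc_id)
              for doc_id, v in doc_vecs.items()]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]

# Placeholder "embeddings" for three documents.
docs = {
    "invoice": [0.9, 0.1, 0.0],
    "memo":    [0.1, 0.9, 0.2],
    "report":  [0.4, 0.5, 0.6],
}
print(rank([1.0, 0.0, 0.1], docs))
```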
OpenAI Direct
Access OpenAI models directly through the OpenAI API.

GPT-5 Series - Latest Generation
GPT-5 (gpt-5)
- Best for: Most complex reasoning and premium applications
- Capabilities: Structured output, reasoning mode, advanced problem-solving
- Use cases: Complex analysis, strategic planning, research tasks
GPT-5 Mini
- Best for: Daily reasoning at lower cost
- Capabilities: Structured output, reasoning mode, balanced performance
- Use cases: Standard business logic, moderate analysis, automation
GPT-5 Nano
- Best for: Simple reasoning requiring efficiency
- Capabilities: Structured output, reasoning mode, cost-effective
- Use cases: Basic classification, simple analysis, high-volume operations
Higher-tier GPT-5 variants offer enhanced reasoning with improved accuracy (business intelligence, detailed analysis, critical decisions), up to the most advanced reasoning for complex problem-solving, research, and mission-critical applications.
GPT-4.1 Series - Production Ready
GPT-4.1 (gpt-4.1)
- Best for: Production applications requiring consistent performance
- Capabilities: Structured output, reliable reasoning
- Use cases: Customer-facing applications, production workflows
GPT-4.1 Mini (gpt-4.1-mini)
- Best for: Cost-effective production deployments
- Use cases: High-volume automation, chatbots, content generation
GPT-4.1 Nano (gpt-4.1-nano)
- Best for: Maximum efficiency for simple tasks
- Use cases: Real-time interactions, simple classification, quick responses
GPT-4o Series - Optimized
GPT-4o (gpt-4o)
- Best for: Balanced performance and capability
- Capabilities: Structured output, multimodal support (text + images)
- Use cases: General-purpose applications, document analysis, versatile automation
- Supports: Prompts, conversations, email analysis, classification, intelligence, agents
GPT-4o Mini (gpt-4o-mini)
- Best for: Cost-effective general-purpose tasks
- Capabilities: Structured output, multimodal support, efficient operation
- Use cases: Standard automation, customer support, content processing
- Supports: Prompts, conversations, email analysis, translation, classification, intelligence, agents
Reasoning Models - o3-mini & o1-mini
o3-mini (o3-mini)
- Best for: Latest reasoning-focused tasks requiring deep analysis
- Capabilities: Advanced reasoning mode (required temperature: 1.0)
- Use cases: Mathematical problems, logical analysis, complex problem-solving
- Note: System role not supported; fixed temperature requirement
o1-mini (o1-mini)
- Best for: Previous-generation reasoning tasks
- Capabilities: Reasoning mode (required temperature: 1.0)
- Use cases: Logic puzzles, analytical tasks, structured problem-solving
- Note: System role not supported; fixed temperature requirement
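The two constraints above (fixed temperature of 1.0, no system role) can be enforced before a request ever leaves your code. A sketch of a request builder following the OpenAI chat message shape; the helper itself is illustrative, not an official API:

```python
# Models with a fixed temperature and no system-role support, per the notes above.
REASONING_MODELS = {"o1-mini", "o3-mini"}

def build_request(model, messages, temperature=0.7):
    """Assemble a chat request dict, enforcing reasoning-model constraints."""
    if model in REASONING_MODELS:
        if any(m["role"] == "system" for m in messages):
            raise ValueError(f"{model} does not support the system role")
        temperature = 1.0  # required and not adjustable for these models
    return {"model": model, "messages": messages, "temperature": temperature}
```

Catching these violations locally gives a clearer error than a rejected API call.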
Legacy Models
GPT-4 Turbo Preview (gpt-4-turbo-preview)
- Function calling, extended context
- Recommendation: Consider upgrading to GPT-4.1 or GPT-5 series
GPT-4 (gpt-4)
- Structured output, reliable performance
- Supports: Prompts, conversations, summarization, email analysis, classification
GPT-3.5 Turbo (gpt-3.5-turbo)
- Function calling, basic capabilities
- Recommendation: Upgrade to GPT-4.1 Mini for better performance
Google Gemini
Access Google’s multimodal Gemini models directly.

Gemini 3 Series - Latest
Gemini 3 Pro Preview (gemini-3-pro-preview)
- Best for: Cutting-edge multimodal capabilities
- Capabilities: Multimodal processing (text, images, audio), advanced reasoning
- Use cases: Document analysis with images, multimedia processing, complex automation
- Temperature range: 0.0 - 1.0 (default: 0.7)
- Supports: Prompts, translation, classification, file analysis, agents
Gemini 2.5 Series - Production Advanced
Gemini 2.5 Pro (gemini-2.5-pro)
- Best for: Complex multimodal tasks requiring high performance
- Capabilities: Multimodal, large context windows, detailed analysis
- Use cases: Document understanding, comprehensive analysis, advanced automation
Gemini 2.5 Flash (gemini-2.5-flash)
- Best for: Fast multimodal processing
- Capabilities: Multimodal, efficient operation, quick responses
- Use cases: Real-time document analysis, responsive automation
Gemini 2.0 Series - Efficient
Gemini 2.0 Flash (gemini-2.0-flash)
- Best for: Cost-effective multimodal processing
- Use cases: Standard document processing, general automation
Gemini 2.0 Flash Lite
- Best for: Lightweight multimodal tasks
- Use cases: Simple document analysis, high-volume operations
Gemini 1.5 Pro - Established
Gemini 1.5 Pro (gemini-1.5-pro)
- Best for: Proven multimodal performance
- Capabilities: Multimodal processing, reliable operation
- Use cases: Production workloads, established workflows
- Supports: Prompts, translation, classification, file analysis
Model Selection Guide
By Use Case
Conversational Agents
Recommended Models:
- GPT-4o Mini - Best balance of cost and performance
- Claude 3.7 Sonnet - Superior reasoning at reasonable cost
- Gemini 2.5 Flash - Fast multimodal conversations
- GPT-5 Mini - Advanced reasoning for complex interactions
Why these models:
- Support structured output for reliable responses
- Handle context well for conversation continuity
- Cost-effective for high-volume interactions
- Proven reliability in production
Document Analysis with Images
Recommended Models:
- Gemini 2.5 Pro - Complex multimodal analysis
- Gemini 3 Pro Preview - Cutting-edge document understanding
- GPT-4o - Strong multimodal processing
- Gemini 2.5 Flash - Fast multimodal analysis
Why these models:
- Multimodal support for images and text together
- Large context windows for lengthy documents
- Strong reasoning for extracting insights
- Handle charts, diagrams, and visual elements
Data Classification & Extraction
Recommended Models:
- GPT-4.1 Nano - Fast, cost-effective
- GPT-4o Mini - Structured output for consistency
- Claude Haiku 4.5 - Quick, efficient
- GPT-4.1 Mini - Production-ready reliability
Why these models:
- Structured output ensures consistent categorization
- Cost-effective for high-volume operations
- Fast response times for real-time classification
- Reliable accuracy for business logic
Complex Reasoning & Analysis
Recommended Models:
- o3-mini - Specialized reasoning mode for logic
- Claude 4 Opus - Deepest reasoning capability
- GPT-5 - Advanced problem-solving
- Claude Sonnet 4.5 - Sophisticated analysis
Why these models:
- Advanced reasoning capabilities
- Handle multi-step logic effectively
- Understand complex relationships
- Provide detailed explanations
Content Generation
Recommended Models:
- Claude Sonnet 4.5 - Excellent writing quality
- GPT-5 - Creative and coherent content
- Gemini 2.5 Pro - Long-form content
- Claude 4 Sonnet - High-quality balanced output
Why these models:
- Natural, fluent writing style
- Good creativity control via temperature
- Handle various content types well
- Consistent quality and tone
Semantic Search
Required Models:
- Snowflake Arctic L V2.0 - Latest, highest quality
- Snowflake Arctic M V1.5 - Reliable production
Why these models:
- Optimized for semantic similarity
- Consistent vector representations
- Efficient processing at scale
- Note: Must use Snowflake provider for embedding models
By Budget
Lowest Cost Models:
- GPT-4.1 Nano: Minimal cost, simple tasks
- Claude Haiku 4.5: Fast and efficient
- Gemini 2.0 Flash Lite: Lightweight multimodal
- Llama 3.2 1B/3B: Maximum efficiency
- Mistral 7B: Small but capable
By Provider Strengths
Snowflake
Strengths:
- Data residency in your cloud
- Wide model selection
- Claude and OpenAI access
- Native data processing
OpenAI Direct
Strengths:
- Latest GPT models first
- Structured output
- Mature ecosystem
- Reliable performance
Gemini
Strengths:
- Multimodal capabilities
- Large context windows
- Fast processing
- Cutting-edge features
Key Model Capabilities
Multimodal Processing
What it is: Process text and images together in the same request
Supported Models:
- All Gemini models (2.0+)
- GPT-4o, GPT-4o Mini
Use cases:
- Document analysis with charts/diagrams
- Image-based data extraction
- Visual content understanding
- OCR and form processing
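Multimodal requests pair text and image parts in a single message. A sketch of composing such a message in the content-parts shape used by GPT-4o-style chat APIs; the image bytes here are a placeholder, and in practice you would read a real file:

```python
import base64

def multimodal_message(text, image_bytes, mime="image/png"):
    """Build a user message combining a text part and an inline image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url",
             "image_url": {"url": f"data:{mime};base64,{encoded}"}},
        ],
    }
```

The same message can then drive use cases like chart extraction or form OCR by varying the text instruction.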
Structured Output
What it is: Guaranteed JSON/XML format responses for reliable automation
Supported Models:
- All GPT-4o, GPT-4.1, GPT-5 series
- Cortex GPT models
- GPT-4 (partial)
- Llama 3.1 8B, 70B
Use cases:
- Data extraction to databases
- Automated classification
- API integrations
- Workflow automation
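Structured output works by attaching a schema to the request so the model must reply in that shape. A sketch of building such a request following the OpenAI `json_schema` response format; the schema fields (`category`, `confidence`) are example choices for a classification task:

```python
def classification_request(model, text):
    """Build a chat request that forces a JSON classification response."""
    schema = {
        "type": "object",
        "properties": {
            "category": {"type": "string"},
            "confidence": {"type": "number"},
        },
        "required": ["category", "confidence"],
        "additionalProperties": False,
    }
    return {
        "model": model,
        "messages": [{"role": "user", "content": f"Classify: {text}"}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "classification", "strict": True,
                            "schema": schema},
        },
    }
```

Because the response is guaranteed to match the schema, downstream automation can parse it without defensive string handling.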
Reasoning Mode
What it is: Extended thinking for complex problems with step-by-step reasoning
Supported Models:
- o1-mini, o3-mini (dedicated reasoning, always on)
- GPT-5 series (configurable)
- Cortex o4-mini (dedicated reasoning)
Use cases:
- Mathematical problems
- Logic puzzles
- Complex analysis
- Multi-step problem-solving
Agent Support
What it is: Optimized for Elementum agent workflows and multi-step tasks
Supported Models:
- GPT-4o, GPT-4.1, GPT-5 series
- All Claude models (via Snowflake)
- Cortex OpenAI models
- Gemini 2.5+, Gemini 3 Pro
Use cases:
- Conversational agents
- Multi-turn interactions
- Complex workflows
- Autonomous task execution
Temperature Settings
All models except dedicated reasoning models support customizable temperature:
- 0.0 - 0.3: Deterministic, consistent (classification, data extraction)
- 0.4 - 0.7: Balanced creativity (conversation, general tasks)
- 0.8 - 1.0: Creative, diverse (content generation, brainstorming)
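These ranges can be encoded once so every workflow picks a consistent temperature per task. A minimal sketch; the task names are example labels, and the values follow the guidance above:

```python
# Example task labels mapped into the ranges listed above.
TEMPERATURE_BY_TASK = {
    "classification": 0.0,      # deterministic
    "data_extraction": 0.2,     # deterministic
    "conversation": 0.5,        # balanced
    "content_generation": 0.9,  # creative
}

def pick_temperature(task, default=0.7):
    """Return a temperature suited to the task, defaulting to balanced."""
    return TEMPERATURE_BY_TASK.get(task, default)
```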
Best Practices
Model Selection
Identify Your Use Case
Determine if you need conversation, classification, analysis, generation, or search
Check Required Capabilities
Verify if you need multimodal, structured output, or reasoning capabilities
Cost Optimization
Choose right-sized models:
- Use Nano/Mini for simple tasks
- Reserve Pro/Opus for complex analysis
- Test if smaller models meet needs
Optimize your prompts:
- Write concise, clear instructions
- Remove unnecessary context
- Set appropriate max tokens
- Use structured output formats
Understand provider pricing:
- Snowflake Cortex models cost ~4.5x base rate (includes infrastructure and data residency)
- Direct provider access may be more cost-effective for high-volume, simple tasks
- Snowflake provides value through data residency and unified platform
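The ~4.5x multiplier makes routing-cost comparisons easy to sketch. A small illustration, where `base_rate` is a hypothetical per-1K-token price (not an actual published price for any model):

```python
# Approximate multiplier for Snowflake Cortex over direct-provider pricing,
# as noted above. Verify current rates with your providers.
CORTEX_MULTIPLIER = 4.5

def effective_cost(base_rate, tokens_thousands, via_cortex=False):
    """Estimate total cost for a volume of tokens under each access path."""
    rate = base_rate * (CORTEX_MULTIPLIER if via_cortex else 1.0)
    return rate * tokens_thousands
```

For high-volume simple tasks, comparing the two paths this way shows where direct access pays off; the Cortex premium buys data residency rather than raw throughput.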
Performance Optimization
For speed:
- Use Mini/Nano/Haiku models
- Lower max tokens
- Choose geographically close providers
For quality:
- Use Pro/Opus/Sonnet tier models
- Provide detailed context
- Test with real examples
For consistency:
- Use low temperature (0.0-0.2)
- Enable structured output
- Choose models with structured output support
Next Steps
Configure Providers
Set up your AI provider connections first
Create AI Services
Configure specific model instances for your workflows
Build Agents
Create conversational AI assistants using these models
AI Automations
Use AI models in automation workflows
Models are accessed through configured providers. Choose providers based on your data residency, integration, and model access needs, then select the right model for each specific task.