Elementum supports a wide range of AI models across multiple providers. Models are accessed through configured AI providers - you’ll select your provider first, then choose from the models available through that provider.

Available Providers

  • Snowflake: Claude, OpenAI (Cortex), Llama, Mistral, DeepSeek, and embedding models
  • OpenAI: GPT models
  • Gemini: Multimodal Gemini models
  • Bedrock: AWS Bedrock agents (agent orchestration only)
Important: Models are accessed through providers you configure in Organization Settings. For example, Claude models are accessed through the Snowflake provider, not directly from Anthropic.

Quick Reference: Model Capabilities

This table shows which capabilities each model category supports:
| Model | Provider | Multimodal | Structured Output | Reasoning | Agents | Best Use Case |
| --- | --- | --- | --- | --- | --- | --- |
| GPT-5 Series | OpenAI | No | Yes | Yes | Yes | Complex reasoning, premium applications |
| GPT-4.1 Series | OpenAI | No | Yes | No | Yes | Production deployments, reliable automation |
| GPT-4o Series | OpenAI | Yes | Yes | No | Yes | General-purpose, document analysis |
| o3-mini / o1-mini | OpenAI | No | No | Yes (req.) | No | Mathematical reasoning, logic problems |
| GPT-4 / GPT-3.5 | OpenAI | No | Partial | No | No | Legacy applications |
| Claude 4.5 Sonnet | Snowflake | No | No | No | Yes | Advanced reasoning, detailed analysis |
| Claude 4 Opus | Snowflake | No | No | No | Yes | Most sophisticated reasoning |
| Claude 4 Sonnet | Snowflake | No | No | No | Yes | Balanced performance |
| Claude Haiku 4.5 | Snowflake | No | No | No | Yes | Fast, efficient processing |
| Claude 3.7 / 3.5 | Snowflake | No | No | No | Yes | Cost-effective reasoning |
| Cortex GPT-5 | Snowflake | No | Yes | Yes | Yes | OpenAI through Snowflake |
| Cortex GPT-4.1 | Snowflake | No | Yes | No | Yes | Production OpenAI via Snowflake |
| Cortex o4-mini | Snowflake | No | No | Yes (req.) | Yes | Reasoning through Snowflake |
| Gemini 3 Pro | Gemini | Yes | No | No | Yes | Cutting-edge multimodal |
| Gemini 2.5 Pro/Flash | Gemini | Yes | No | No | Yes | Production multimodal |
| Gemini 2.0 Flash | Gemini | Yes | No | No | No | Cost-effective multimodal |
| Gemini 1.5 Pro | Gemini | Yes | No | No | No | Proven multimodal |
| DeepSeek R1 | Snowflake | No | No | No | No | Open-source reasoning |
| Llama 3.3 70B | Snowflake | No | No | No | No | Open-source, balanced |
| Llama 3.1 Series | Snowflake | No | Partial | No | No | Open-source, structured output |
| Llama 3 Series | Snowflake | No | No | No | No | Open-source, function calling |
| Mistral Large 2 | Snowflake | No | No | No | No | Multilingual, European focus |
| Mistral 7B / Mixtral | Snowflake | No | No | No | No | Efficient small models |
| Snowflake Arctic | Snowflake | No | No | No | No | Data cloud native |
| Arctic Embeddings | Snowflake | N/A | N/A | N/A | N/A | Semantic search |
Legend:
  • Multimodal: Processes text and images together
  • Structured Output: Guaranteed JSON/XML format responses
  • Reasoning: Advanced reasoning mode (req. = required, always on)
  • Agents: Supports Elementum agent workflows

Models by Provider

Snowflake Cortex

Snowflake Cortex provides access to multiple AI model families through your Snowflake data cloud.
Data Residency: All Snowflake Cortex models run within your Snowflake environment, keeping data in your cloud.

Anthropic Claude (via Snowflake)

Claude 4.5 Sonnet (claude-sonnet-4-5)
  • Best for: Most advanced reasoning and analysis tasks
  • Capabilities: Superior reasoning, nuanced understanding, extensive context windows
  • Use cases: Complex research, detailed analysis, sophisticated automation, agent workflows
  • Temperature range: 0.0 - 1.0 (default: 0.7)
  • When to use: Premium applications requiring deepest understanding and analysis
Claude 4 Opus (claude-4-opus)
  • Best for: Most sophisticated reasoning requiring highest capability
  • Use cases: Strategic decisions, complex research, mission-critical analysis
  • When to use: Tasks where quality matters more than cost
Claude 4 Sonnet (claude-4-sonnet)
  • Best for: Balanced performance and cost for demanding tasks
  • Use cases: Business automation, production workflows, detailed analysis
  • When to use: Production workloads needing strong reasoning
Claude Haiku 4.5 (claude-haiku-4-5)
  • Best for: Fast, efficient processing
  • Use cases: High-volume operations, real-time interactions, simple automation
  • When to use: Speed and cost-efficiency are priorities
Claude 3.7 Sonnet (claude-3-7-sonnet)
  • Best for: Daily tasks requiring strong reasoning at lower cost
  • Use cases: Standard business automation, customer support, content generation
  • When to use: Cost-effective production deployments
Claude 3.5 Sonnet (claude-3-5-sonnet)
  • Best for: Proven reliability in production
  • Use cases: Established workflows, production automation
  • When to use: Stability and proven performance matter

OpenAI via Cortex

Access OpenAI models through your Snowflake environment:
GPT-5 family:
  • openai-gpt-5 - Advanced reasoning
  • openai-gpt-5-mini - Efficient reasoning
  • openai-gpt-5-nano - Maximum efficiency
  • openai-gpt-5-chat - Optimized for conversations
  • Capabilities: Structured output, reasoning mode, through Snowflake
  • Use cases: Complex analysis, conversations, classification, intelligence features, agents
  • When to use: Need OpenAI capabilities with Snowflake data residency
openai-gpt-4.1
  • Production-ready OpenAI through Snowflake
  • Structured output, reliable reasoning
openai-o4-mini
  • Reasoning model through Snowflake
  • Required temperature: 1.0 (not adjustable)
  • System role not supported
When to use: Production OpenAI workloads within Snowflake environment

Open Source Models (via Snowflake)

deepseek-r1
  • Best for: Advanced reasoning with open-source flexibility
  • Capabilities: Strong reasoning, open-source architecture
  • Use cases: Research, academic applications, cost-conscious deployments
  • When to use: Open-source requirements or research projects
Llama 3.3 Series
  • Llama 3.3 70B (llama3.3-70b): Latest generation, balanced performance
Llama 3.2 Series
  • Llama 3.2 3B (llama3.2-3b): Efficient, compact
  • Llama 3.2 1B (llama3.2-1b): Maximum efficiency for simple tasks
Llama 3.1 Series
  • Llama 3.1 405B (llama3.1-405b): Largest, most capable
  • Llama 3.1 70B (llama3.1-70b): Production-ready, structured output support
  • Llama 3.1 8B (llama3.1-8b): Cost-effective, structured output support
Llama 3 Series
  • Llama 3 70B (llama3-70b): Function calling, reliable
  • Llama 3 8B (llama3-8b): Efficient operation
Llama 2 Series
  • Llama 2 70B Chat (llama2-70b-chat): Conversational focus
When to use: Open-source requirements, cost optimization, specific model sizes
Mistral Large 2 (mistral-large2)
  • Advanced capabilities, multilingual support, European markets
Mistral Large (mistral-large)
  • Previous generation, proven performance
Mistral 7B (mistral-7b)
  • Compact, efficient, cost-effective
Mixtral 8x7B (mixtral-8x7b)
  • Mixture-of-experts architecture, balanced performance
When to use: International applications, multilingual needs, small model requirements
Snowflake Arctic (snowflake-arctic)
  • Data cloud native processing, integrated with Snowflake infrastructure
  • When to use: Data-intensive workflows within Snowflake
Gemma 7B (gemma-7b)
  • Lightweight Google-developed model
  • When to use: Efficient processing on smaller tasks
Jamba Instruct (jamba-instruct)
  • Instruction-following optimization
  • Note: Not recommended for JSON/YAML parsing
Reka Core / Flash (reka-core, reka-flash)
  • Advanced processing or fast operation

Snowflake Embedding Models

Arctic L V2.0

snowflake-arctic-embed-l-v2.0
  • Latest high-quality embeddings for semantic search
  • Use for: New AI search implementations

Arctic M V1.5

snowflake-arctic-embed-m-v1.5
  • Balanced performance and quality
  • Use for: Production search systems
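For semantic search, embedding models return vectors that are compared by cosine similarity. The sketch below uses toy three-dimensional vectors in place of real Arctic embeddings (which are much higher-dimensional) purely to show the comparison step:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real Arctic embeddings
query = [0.1, 0.9, 0.2]
doc_a = [0.1, 0.8, 0.3]   # close to the query
doc_b = [0.9, 0.1, 0.0]   # far from the query

# Rank documents by similarity to the query
best = max([doc_a, doc_b], key=lambda d: cosine_similarity(query, d))
```

In a real implementation, the vectors would come from an embedding call to the provider, and ranking would typically run inside a vector index rather than in application code.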

OpenAI Direct

Access OpenAI models directly through OpenAI API.
GPT-5 (gpt-5)
  • Best for: Most complex reasoning and premium applications
  • Capabilities: Structured output, reasoning mode, advanced problem-solving
  • Use cases: Complex analysis, strategic planning, research tasks
GPT-5 Mini (gpt-5-mini)
  • Best for: Daily reasoning at lower cost
  • Capabilities: Structured output, reasoning mode, balanced performance
  • Use cases: Standard business logic, moderate analysis, automation
GPT-5 Nano (gpt-5-nano)
  • Best for: Simple reasoning requiring efficiency
  • Capabilities: Structured output, reasoning mode, cost-effective
  • Use cases: Basic classification, simple analysis, high-volume operations
GPT-5.1 (gpt-5.1)
  • Best for: Enhanced reasoning with improved accuracy
  • Use cases: Business intelligence, detailed analysis, critical decisions
GPT-5.2 (gpt-5.2)
  • Best for: Most advanced reasoning
  • Use cases: Complex problem-solving, research, mission-critical applications
All GPT-5 models support: Prompts, conversations, classification, intelligence features, agents
GPT-4.1 (gpt-4.1)
  • Best for: Production applications requiring consistent performance
  • Capabilities: Structured output, reliable reasoning
  • Use cases: Customer-facing applications, production workflows
GPT-4.1 Mini (gpt-4.1-mini)
  • Best for: Cost-effective production deployments
  • Use cases: High-volume automation, chatbots, content generation
GPT-4.1 Nano (gpt-4.1-nano)
  • Best for: Maximum efficiency for simple tasks
  • Use cases: Real-time interactions, simple classification, quick responses
All GPT-4.1 models support: Prompts, conversations, classification, intelligence features, agents
GPT-4o (gpt-4o)
  • Best for: Balanced performance and capability
  • Capabilities: Structured output, multimodal support (text + images)
  • Use cases: General-purpose applications, document analysis, versatile automation
  • Supports: Prompts, conversations, email analysis, classification, intelligence, agents
GPT-4o Mini (gpt-4o-mini-2024-07-18)
  • Best for: Cost-effective general-purpose tasks
  • Capabilities: Structured output, multimodal support, efficient operation
  • Use cases: Standard automation, customer support, content processing
  • Supports: Prompts, conversations, email analysis, translation, classification, intelligence, agents
o3-mini (o3-mini)
  • Best for: Latest reasoning-focused tasks requiring deep analysis
  • Capabilities: Advanced reasoning mode (required temperature: 1.0)
  • Use cases: Mathematical problems, logical analysis, complex problem-solving
  • Note: System role not supported; fixed temperature requirement
o1-mini (o1-mini)
  • Best for: Previous-generation reasoning tasks
  • Capabilities: Reasoning mode (required temperature: 1.0)
  • Use cases: Logic puzzles, analytical tasks, structured problem-solving
  • Note: System role not supported; fixed temperature requirement
When to use: Tasks requiring explicit step-by-step reasoning, math, logic
GPT-4 Turbo Preview (gpt-4-turbo-preview)
  • Function calling, extended context
  • Recommendation: Consider upgrading to GPT-4.1 or GPT-5 series
GPT-4 (gpt-4)
  • Structured output, reliable performance
  • Supports: Prompts, conversations, summarization, email analysis, classification
GPT-3.5 Turbo (gpt-3.5-turbo, gpt-3.5-turbo-1106)
  • Function calling, basic capabilities
  • Recommendation: Upgrade to GPT-4.1 Mini for better performance

Google Gemini

Access Google’s multimodal Gemini models directly.
Gemini 3 Pro Preview (gemini-3-pro-preview)
  • Best for: Cutting-edge multimodal capabilities
  • Capabilities: Multimodal processing (text, images, audio), advanced reasoning
  • Use cases: Document analysis with images, multimedia processing, complex automation
  • Temperature range: 0.0 - 1.0 (default: 0.7)
  • Supports: Prompts, translation, classification, file analysis, agents
Gemini 2.5 Pro (gemini-2.5-pro)
  • Best for: Complex multimodal tasks requiring high performance
  • Capabilities: Multimodal, large context windows, detailed analysis
  • Use cases: Document understanding, comprehensive analysis, advanced automation
Gemini 2.5 Flash (gemini-2.5-flash)
  • Best for: Fast multimodal processing
  • Capabilities: Multimodal, efficient operation, quick responses
  • Use cases: Real-time document analysis, responsive automation
Both support: Prompts, translation, classification, file analysis, agents
Gemini 2.0 Flash (gemini-2.0-flash)
  • Best for: Cost-effective multimodal processing
  • Use cases: Standard document processing, general automation
Gemini 2.0 Flash Lite (gemini-2.0-flash-lite)
  • Best for: Lightweight multimodal tasks
  • Use cases: Simple document analysis, high-volume operations
Both support: Prompts, translation, classification, file analysis
Gemini 1.5 Pro (gemini-1.5-pro)
  • Best for: Proven multimodal performance
  • Capabilities: Multimodal processing, reliable operation
  • Use cases: Production workloads, established workflows
  • Supports: Prompts, translation, classification, file analysis

Model Selection Guide

By Use Case

Conversational AI

Recommended Models:
  1. GPT-4o Mini - Best balance of cost and performance
  2. Claude 3.7 Sonnet - Superior reasoning at reasonable cost
  3. Gemini 2.5 Flash - Fast multimodal conversations
  4. GPT-5 Mini - Advanced reasoning for complex interactions
Why these models:
  • Support structured output for reliable responses
  • Handle context well for conversation continuity
  • Cost-effective for high-volume interactions
  • Proven reliability in production
Document Analysis

Recommended Models:
  1. Gemini 2.5 Pro - Complex multimodal analysis
  2. Gemini 3 Pro Preview - Cutting-edge document understanding
  3. GPT-4o - Strong multimodal processing
  4. Gemini 2.5 Flash - Fast multimodal analysis
Why these models:
  • Multimodal support for images and text together
  • Large context windows for lengthy documents
  • Strong reasoning for extracting insights
  • Handle charts, diagrams, and visual elements
Classification

Recommended Models:
  1. GPT-4.1 Nano - Fast, cost-effective
  2. GPT-4o Mini - Structured output for consistency
  3. Claude Haiku 4.5 - Quick, efficient
  4. GPT-4.1 Mini - Production-ready reliability
Why these models:
  • Structured output ensures consistent categorization
  • Cost-effective for high-volume operations
  • Fast response times for real-time classification
  • Reliable accuracy for business logic
Complex Reasoning

Recommended Models:
  1. o3-mini - Specialized reasoning mode for logic
  2. Claude 4 Opus - Deepest reasoning capability
  3. GPT-5 - Advanced problem-solving
  4. Claude Sonnet 4.5 - Sophisticated analysis
Why these models:
  • Advanced reasoning capabilities
  • Handle multi-step logic effectively
  • Understand complex relationships
  • Provide detailed explanations
Content Generation

Recommended Models:
  1. Claude Sonnet 4.5 - Excellent writing quality
  2. GPT-5 - Creative and coherent content
  3. Gemini 2.5 Pro - Long-form content
  4. Claude 4 Sonnet - High-quality balanced output
Why these models:
  • Natural, fluent writing style
  • Good creativity control via temperature
  • Handle various content types well
  • Consistent quality and tone

By Budget

Lowest Cost Models:
  • GPT-4.1 Nano: Minimal cost, simple tasks
  • Claude Haiku 4.5: Fast and efficient
  • Gemini 2.0 Flash Lite: Lightweight multimodal
  • Llama 3.2 1B/3B: Maximum efficiency
  • Mistral 7B: Small but capable
Best for: High-volume operations, simple automation, basic classification

By Provider Strengths

Snowflake

Strengths:
  • Data residency in your cloud
  • Wide model selection
  • Claude and OpenAI access
  • Native data processing
Choose when: Data security, Snowflake integration, diverse model needs

OpenAI Direct

Strengths:
  • Latest GPT models first
  • Structured output
  • Mature ecosystem
  • Reliable performance
Choose when: Latest OpenAI features, proven production performance

Gemini

Strengths:
  • Multimodal capabilities
  • Large context windows
  • Fast processing
  • Cutting-edge features
Choose when: Document analysis with images, large contexts, latest AI

Key Model Capabilities

Multimodal Processing

What it is: Process text and images together in the same request
Supported Models:
  • All Gemini models (2.0+)
  • GPT-4o, GPT-4o Mini
Use cases:
  • Document analysis with charts/diagrams
  • Image-based data extraction
  • Visual content understanding
  • OCR and form processing
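A multimodal request interleaves text and image parts in a single message. The helper below builds a message in the OpenAI-style content-parts format that GPT-4o-class models accept; the function name and URL are illustrative, and other providers use their own equivalent structures:

```python
def multimodal_message(text: str, image_url: str) -> dict:
    """Build one chat message combining text and an image,
    using the OpenAI-style content-parts format (illustrative)."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

msg = multimodal_message(
    "Extract the line-item totals from this invoice.",
    "https://example.com/invoice.png",  # placeholder URL
)
```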

Structured Output

What it is: Guaranteed JSON/XML format responses for reliable automation
Supported Models:
  • All GPT-4o, GPT-4.1, GPT-5 series
  • Cortex GPT models
  • GPT-4 (partial)
  • Llama 3.1 8B, 70B
Use cases:
  • Data extraction to databases
  • Automated classification
  • API integrations
  • Workflow automation
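With structured output, you attach a schema to the request and the model is constrained to return matching JSON. The sketch below builds an OpenAI-style chat payload with a `json_schema` response format; the schema fields and helper name are illustrative, not part of any Elementum API:

```python
def classification_request(model: str, text: str) -> dict:
    """Sketch of a chat-completion payload requesting guaranteed JSON
    via an OpenAI-style json_schema response format (illustrative)."""
    schema = {
        "type": "object",
        "properties": {
            "category": {"type": "string"},
            "confidence": {"type": "number"},
        },
        "required": ["category", "confidence"],
        "additionalProperties": False,
    }
    return {
        "model": model,
        "messages": [{"role": "user", "content": f"Classify: {text}"}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "classification",
                "strict": True,
                "schema": schema,
            },
        },
    }

payload = classification_request("gpt-4.1-mini", "Order arrived damaged")
```

Because the response is schema-constrained, downstream code can parse it without defensive retries.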

Reasoning Mode

What it is: Extended thinking for complex problems with step-by-step reasoning
Supported Models:
  • o1-mini, o3-mini (dedicated reasoning, always on)
  • GPT-5 series (configurable)
  • Cortex o4-mini (dedicated reasoning)
Use cases:
  • Mathematical problems
  • Logic puzzles
  • Complex analysis
  • Multi-step problem-solving
Note: Dedicated reasoning models (o1/o3/o4-mini) require temperature = 1.0 and don’t support system roles
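Those two constraints are easy to forget, so it can help to fold them into one request builder. This is a hypothetical helper, not an Elementum API: for dedicated reasoning models it forces temperature to 1.0 and merges the system prompt into the user message, since system roles are rejected:

```python
# Per the note above: dedicated reasoning models with fixed constraints
REASONING_MODELS = {"o1-mini", "o3-mini", "openai-o4-mini"}

def build_request(model: str, system: str, user: str,
                  temperature: float = 0.7) -> dict:
    """Build a chat payload, applying reasoning-model constraints:
    temperature pinned to 1.0, system prompt folded into the user turn.
    (Hypothetical helper; adapt to your client library.)"""
    if model in REASONING_MODELS:
        messages = [{"role": "user", "content": f"{system}\n\n{user}"}]
        temperature = 1.0  # required, not adjustable
    else:
        messages = [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ]
    return {"model": model, "messages": messages, "temperature": temperature}
```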

Agent Support

What it is: Optimized for Elementum agent workflows and multi-step tasks
Supported Models:
  • GPT-4o, GPT-4.1, GPT-5 series
  • All Claude models (via Snowflake)
  • Cortex OpenAI models
  • Gemini 2.5+, Gemini 3 Pro
Use cases:
  • Conversational agents
  • Multi-turn interactions
  • Complex workflows
  • Autonomous task execution

Temperature Settings

All models except dedicated reasoning models support customizable temperature:
  • 0.0 - 0.3: Deterministic, consistent (classification, data extraction)
  • 0.4 - 0.7: Balanced creativity (conversation, general tasks)
  • 0.8 - 1.0: Creative, diverse (content generation, brainstorming)
Default: 0.7 for most models
Special cases: o1-mini, o3-mini, Cortex o4-mini require temperature 1.0 (not adjustable)
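One practical pattern is to pin a starting temperature per task type rather than setting it ad hoc per request. The mapping below simply encodes the ranges above; the task names are illustrative and should be tuned against your own workloads:

```python
# Suggested starting temperatures per task type, following the
# ranges above (illustrative defaults; tune for your workloads).
TASK_TEMPERATURE = {
    "classification": 0.0,      # deterministic, repeatable labels
    "data_extraction": 0.2,     # consistent field values
    "conversation": 0.7,        # the common model default
    "content_generation": 0.9,  # creative, diverse output
}

def temperature_for(task: str) -> float:
    """Look up a starting temperature, falling back to the 0.7 default."""
    return TASK_TEMPERATURE.get(task, 0.7)
```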

Best Practices

Model Selection

  1. Identify Your Use Case - Determine if you need conversation, classification, analysis, generation, or search
  2. Check Required Capabilities - Verify if you need multimodal, structured output, or reasoning capabilities
  3. Consider Your Provider - Choose based on data residency, integration, and model access requirements
  4. Balance Cost and Performance - Select the smallest model that meets your quality requirements
  5. Test Before Committing - Compare 2-3 models with your actual use cases
  6. Monitor and Optimize - Track quality, cost, and speed metrics to refine your selection
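The capability-checking steps can be turned into a small shortlisting helper. This is a toy function mirroring the capability table earlier in this page, meant as a starting point for side-by-side testing, not a definitive mapping:

```python
def shortlist(needs_multimodal: bool, needs_structured: bool,
              needs_reasoning: bool) -> list[str]:
    """Toy shortlist based on the capability table above (illustrative)."""
    if needs_reasoning:
        return ["o3-mini", "gpt-5", "claude-4-opus"]
    if needs_multimodal and needs_structured:
        # Per the table, the GPT-4o family covers both capabilities
        return ["gpt-4o", "gpt-4o-mini"]
    if needs_multimodal:
        return ["gemini-2.5-pro", "gemini-2.5-flash", "gpt-4o"]
    if needs_structured:
        return ["gpt-4.1", "gpt-4.1-mini", "llama3.1-70b"]
    return ["gpt-4o-mini", "claude-3-7-sonnet"]  # general-purpose defaults
```

Compare two or three of the shortlisted models on real examples before committing, then monitor quality and cost in production.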

Cost Optimization

Choose right-sized models:
  • Use Nano/Mini for simple tasks
  • Reserve Pro/Opus for complex analysis
  • Test if smaller models meet needs
Optimize prompts:
  • Write concise, clear instructions
  • Remove unnecessary context
  • Set appropriate max tokens
  • Use structured output formats
Consider provider costs:
  • Snowflake Cortex models cost ~4.5x base rate (includes infrastructure and data residency)
  • Direct provider access may be more cost-effective for high-volume, simple tasks
  • Snowflake provides value through data residency and unified platform
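For budgeting, the ~4.5x Cortex multiplier can be folded into a back-of-envelope estimate. The helper below is illustrative; treat the multiplier as approximate and the base rate as a placeholder for your provider's actual pricing:

```python
def estimated_cost(base_cost_per_1k: float, tokens: int,
                   via_snowflake: bool = False) -> float:
    """Rough cost estimate. The ~4.5x multiplier reflects the note
    above about Snowflake Cortex pricing (approximate)."""
    multiplier = 4.5 if via_snowflake else 1.0
    return base_cost_per_1k * (tokens / 1000) * multiplier

direct = estimated_cost(0.002, 10_000)         # 0.02 at a $0.002/1K base rate
cortex = estimated_cost(0.002, 10_000, True)   # 0.09 with the ~4.5x multiplier
```

Whether the multiplier is worth it depends on how much you value data residency and the unified platform for the workload in question.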

Performance Optimization

For speed:
  • Use Mini/Nano/Haiku models
  • Lower max tokens
  • Choose geographically close providers
For quality:
  • Use Pro/Opus/Sonnet tier models
  • Provide detailed context
  • Test with real examples
For consistency:
  • Use low temperature (0.0-0.2)
  • Enable structured output
  • Choose models with structured output support

Next Steps


Models are accessed through configured providers. Choose providers based on your data residency, integration, and model access needs, then select the right model for each specific task.