
What Are AI Services?

AI Services are specific AI model instances that you configure for use in your workflows. While AI Providers establish connections to external AI platforms, AI Services define the actual models, settings, and configurations that power your AI features.
Prerequisites: You must have at least one AI Provider configured before creating AI Services. See the AI Providers Overview for setup instructions.

Types of AI Services

Elementum supports two primary types of AI Services:

LLM Services

Language models for text generation, conversation, and analysis.
Used for: Agents, automation actions, data analysis, and content generation
Examples: GPT-4, Claude 3.5, Gemini Pro

Embedding Services

Embedding models for semantic search and similarity analysis.
Used for: AI Search, content similarity, and semantic understanding
Examples: Snowflake Arctic L V2.0, Snowflake Arctic L V1.5

Creating AI Services

Accessing AI Services

1. Navigate to Services: In Organization Settings, go to the Services tab.
2. Create New Service: Click “+ Service” to open the service creation dialog.
3. Select Provider: Choose from your configured AI Providers:
  • OpenAI - For general-purpose AI capabilities
  • Snowflake - For data-native AI on your warehouse
  • Gemini - For advanced multimodal AI features

Creating LLM Services

LLM Services power conversational AI, text generation, and intelligent automation:
Service Name: Give your service a descriptive name (e.g., “Customer Support Bot”)
Provider: Select your configured AI Provider
Model: Choose from available models:
  • OpenAI o4-mini - Fast, efficient reasoning for daily tasks
  • OpenAI o3 - Complex reasoning and research tasks
  • Claude Sonnet 4 - Advanced reasoning and premium applications
  • Claude 3.7 Sonnet - Cost-effective reasoning for most tasks
  • Claude Opus 4 - Most complex reasoning (expensive but capable)
  • Gemini 2.5 - Balanced performance for general-purpose tasks
  • Gemini 2.5 Pro - Complex use cases and large responses
Cost Per Million Tokens: Optional cost tracking (varies by provider)
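The optional cost field feeds straightforward arithmetic. A minimal sketch of per-request cost, assuming separate input and output rates per million tokens (the rates below are placeholders, not any provider's actual pricing):

```python
def request_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Dollar cost of one request given per-million-token rates.

    The rates are illustrative -- substitute your provider's real pricing.
    """
    return (prompt_tokens / 1_000_000) * input_rate \
         + (completion_tokens / 1_000_000) * output_rate
```

For example, 500K prompt tokens at $2/M plus 500K completion tokens at $8/M comes to $5.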

Creating Embedding Services

Embedding Services enable AI Search and semantic understanding:
Service Name: Descriptive name (e.g., “Document Search Embeddings”)
Provider: Select your configured AI Provider
Model: Choose from available embedding models:
  • Snowflake Arctic L V2.0 - Latest high-quality embeddings
  • Snowflake Arctic L V1.5 - Reliable embeddings for production use
Dimensions: Embedding vector size (varies by model)
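Because vector size varies by model, a stored corpus only works if every vector matches the service's dimension. A small hypothetical guard (the function name is ours, not part of the product):

```python
def validate_dimensions(vectors, expected_dim):
    """Raise if any stored vector doesn't match the embedding model's dimension."""
    bad = [i for i, v in enumerate(vectors) if len(v) != expected_dim]
    if bad:
        raise ValueError(f"vectors at indices {bad} are not {expected_dim}-dimensional")
```

Running a check like this before a similarity search catches mixed-model corpora early, which is the usual cause of nonsense similarity scores.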

Service Management

Testing Services

Before using AI Services in production, test them thoroughly:
1. Access Testing Interface: In the Services list, click on your service name to open the testing interface.
2. Test LLM Services:
   Input: Enter sample prompts or questions
   Response: Review the AI-generated responses
   Parameters: Adjust settings and test again
   Performance: Monitor response times and quality
3. Test Embedding Services:
   Input: Enter sample text for embedding
   Vector Output: Review generated embedding vectors
   Similarity: Test similarity calculations between texts
   Performance: Monitor embedding generation speed
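Similarity between two texts is typically the cosine of the angle between their embedding vectors. A minimal version you can apply to whatever vectors your embedding service returns:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: 1.0 for identical direction, ~0.0 for unrelated vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Texts that mean the same thing should score noticeably higher than unrelated texts; if they don't, the service or model choice needs a second look.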

Service Monitoring

Usage Metrics

Token Consumption: Track token usage across services
Request Volume: Monitor API call frequency
Response Times: Track performance metrics
Error Rates: Monitor service reliability
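The four metrics above reduce to a few counters per service. An illustrative in-memory tracker (a real deployment would export these to a metrics system rather than hold them in a dict):

```python
from collections import defaultdict

class ServiceMetrics:
    """Minimal per-service tracker: tokens, requests, latencies, errors."""

    def __init__(self):
        self.tokens = defaultdict(int)
        self.requests = defaultdict(int)
        self.errors = defaultdict(int)
        self.latencies_ms = defaultdict(list)

    def record(self, service, tokens, latency_ms, error=False):
        self.tokens[service] += tokens
        self.requests[service] += 1
        self.latencies_ms[service].append(latency_ms)
        if error:
            self.errors[service] += 1

    def error_rate(self, service):
        return self.errors[service] / self.requests[service]
```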

Cost Management

Cost Tracking: Monitor spending per service
Budget Alerts: Set up spending notifications
Optimization: Identify cost-saving opportunities
Usage Reports: Generate regular usage reports

Service Usage Across Features

LLM Services Usage

Purpose: Power chatbots and customer service agents
Configuration:
  • Use models optimized for conversation (o4-mini, Claude 3.7 Sonnet, Gemini 2.5)
  • Set appropriate temperature for natural responses
  • Configure stop sequences for conversation control
Best Practices:
  • Use o4-mini for most customer support interactions
  • Use Claude 3.7 Sonnet for cost-effective daily tasks
  • Use Gemini 2.5 for balanced performance
  • Set reasonable token limits for responses
  • Use system prompts to define agent behavior
Example Use Cases:
  • Customer support chatbots
  • Internal help desk agents
  • Sales qualification bots
  • Technical support assistants
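Conversational requests generally follow the chat-message convention: a system prompt that defines the agent's behavior, prior turns, then the new user message. A sketch of assembling that structure (the helper name is ours):

```python
def build_messages(system_prompt, history, user_input):
    """Chat-style message list: system prompt first, prior turns, user message last."""
    return [
        {"role": "system", "content": system_prompt},
        *history,
        {"role": "user", "content": user_input},
    ]
```

Keeping behavior rules in the system prompt, rather than repeating them in every user message, is what makes agent behavior consistent across a conversation.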
Purpose: Enable AI-powered automation workflows
Configuration:
  • Use task-appropriate models (o4-mini for simple tasks, o3 for complex reasoning)
  • Set lower temperature for consistent results
  • Configure appropriate token limits
Best Practices:
  • Use o4-mini for most automation tasks
  • Use o3 for complex reasoning and analysis
  • Use Claude Sonnet 4 for detailed analysis
  • Use Gemini 2.5 Pro for complicated use cases
  • Use deterministic settings for predictable results
  • Monitor automation performance regularly
Example Use Cases:
  • Document classification
  • Email response generation
  • Data analysis and summarization
  • Content transformation
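For automation tasks like document classification, a constrained prompt paired with temperature 0 keeps outputs predictable. A hypothetical prompt builder as one way to phrase it:

```python
def classification_prompt(text, labels):
    """Prompt for deterministic document classification; pair with temperature 0.0."""
    return (
        "Classify the following document into exactly one of: "
        + ", ".join(labels)
        + ".\nReply with the label only.\n\nDocument:\n"
        + text
    )
```

Restricting the reply to "the label only" makes the response trivial to parse downstream in a workflow.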
Purpose: Generate written content and documentation
Configuration:
  • Use models with strong writing capabilities (Claude Sonnet 4, Gemini 2.5 Pro)
  • Adjust temperature based on creativity needs
  • Set appropriate token limits for content length
Best Practices:
  • Use Claude Sonnet 4 for premium content creation
  • Use Gemini 2.5 Pro for large, detailed responses
  • Use o3 for complex content requiring deep analysis
  • Use detailed prompts for better results
  • Implement content review processes
  • Monitor quality and consistency
Example Use Cases:
  • Report generation
  • Email drafting
  • Documentation creation
  • Content summarization

Embedding Services Usage

Purpose: Find similar content and detect duplicates
Configuration:
  • Use consistent embedding models across comparisons
  • Configure appropriate similarity thresholds
  • Set up batch processing for large datasets
Best Practices:
  • Use the same embedding model for all content
  • Implement proper similarity scoring
  • Monitor performance with large datasets
Example Use Cases:
  • Duplicate detection
  • Content recommendation
  • Similar document finding
  • Categorization assistance
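Duplicate detection is usually a similarity threshold over embedding pairs. A brute-force sketch; the 0.95 threshold is a starting point to tune, not a recommendation, and the O(n²) loop is only fine at sketch scale (large datasets need a vector index):

```python
import math

def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def find_duplicates(embeddings, threshold=0.95):
    """Index pairs whose embedding vectors exceed the similarity threshold."""
    return [
        (i, j)
        for i in range(len(embeddings))
        for j in range(i + 1, len(embeddings))
        if _cosine(embeddings[i], embeddings[j]) >= threshold
    ]
```

This is also why the "same embedding model for all content" rule matters: scores from different models are not comparable against a single threshold.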

Best Practices

Model Selection

LLM Selection

Most Complex Tasks: Use o3 or Claude Opus 4 for advanced reasoning and research
Premium Applications: Use Claude Sonnet 4 or Gemini 2.5 Pro for detailed analysis
Daily Tasks: Use o4-mini, Claude 3.7 Sonnet, or Gemini 2.5 for most applications
Cost-Sensitive: Use Claude 3.7 Sonnet or o4-mini for cost-effective operations

Embedding Selection

High Quality: Use Snowflake Arctic L V2.0 for best search results
Production Ready: Use Snowflake Arctic L V1.5 for stable, reliable performance
Data-Native: All embeddings run directly on your Snowflake data warehouse
Consistency: Use the same embedding model throughout your search system

Performance Optimization

LLM Optimization
Temperature Settings:
  • Use 0.0-0.3 for deterministic tasks
  • Use 0.4-0.7 for balanced creativity
  • Use 0.8-1.0 for creative tasks
Token Management:
  • Set appropriate max tokens for responses
  • Monitor token usage for cost control
  • Use truncation strategies for long inputs
Prompt Engineering:
  • Use clear, specific prompts
  • Provide examples for better results
  • Implement system prompts for consistency
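One common truncation strategy for long inputs uses the rough heuristic of about 4 characters per token. A crude sketch, assuming that heuristic; real token counts vary by model and tokenizer, so use an actual tokenizer when precision matters:

```python
def truncate_to_budget(text, max_tokens, chars_per_token=4):
    """Clip text to an approximate token budget using a chars-per-token estimate."""
    budget = max_tokens * chars_per_token
    return text if len(text) <= budget else text[:budget]
```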

Cost Management

Monitor Usage

Track Consumption: Monitor token usage across all services
Set Budgets: Establish spending limits for each service
Usage Patterns: Analyze usage patterns to optimize costs
Regular Review: Conduct monthly cost reviews

Optimize Costs

Right-Size Models: Use appropriate models for tasks
Batch Processing: Process multiple requests together
Caching: Cache frequent responses and embeddings
Efficient Prompts: Use concise, effective prompts
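Caching pays off quickly for embeddings, since the same text always maps to the same vector. A minimal in-memory wrapper around any embedding callable you supply (a production cache would add eviction and persistence):

```python
def make_cached_embedder(embed_fn):
    """Wrap an embedding call so repeated texts are served from memory,
    not from a second paid API call."""
    cache = {}

    def embed(text):
        if text not in cache:
            cache[text] = embed_fn(text)
        return cache[text]

    return embed
```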

Troubleshooting

Symptoms: Cannot create new AI services
Common Causes:
  • AI Provider not configured
  • Invalid model selection
  • Insufficient permissions
Solutions:
  1. Verify AI Provider is properly configured
  2. Check model availability for your provider
  3. Ensure proper permissions are granted
  4. Try creating with different model options
Symptoms: Slow response times or quality issues
Common Causes:
  • Inappropriate model selection
  • Suboptimal configuration
  • Network or provider issues
Solutions:
  1. Review model selection for your use case
  2. Optimize service configuration settings
  3. Check provider status and network connectivity
  4. Consider switching to different models
Symptoms: Unexpected high token usage or costs
Common Causes:
  • Inefficient prompts or queries
  • Inappropriate model selection
  • Excessive API calls
Solutions:
  1. Review and optimize prompts
  2. Use more cost-effective models where appropriate
  3. Implement caching and batching
  4. Monitor and analyze usage patterns

Advanced Configuration

Custom Model Settings

For specialized use cases:
  1. Fine-tuning: Some providers support custom model fine-tuning
  2. Custom Endpoints: Configure custom API endpoints for specialized deployments
  3. Advanced Parameters: Use provider-specific advanced settings
  4. Performance Tuning: Optimize for specific performance requirements

Multi-Provider Strategy

Redundancy

Failover: Configure multiple providers for reliability
Load Balancing: Distribute requests across providers
Cost Optimization: Route requests to the most cost-effective provider
Feature Specialization: Use different providers for different capabilities
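The failover idea reduces to "try providers in priority order, return the first success." A sketch where each provider is represented by a callable you define (a real system would catch narrower error types and add backoff):

```python
def call_with_failover(providers, prompt):
    """Try each provider callable in order; return the first successful response."""
    last_error = None
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as err:  # narrow this to transport/rate-limit errors in practice
            last_error = err
    raise RuntimeError("all providers failed") from last_error
```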

Hybrid Approach

LLM Diversity: Use different LLMs for different tasks
Embedding Consistency: Maintain consistent embedding models
Regional Deployment: Use region-specific providers
Compliance Requirements: Meet different regulatory needs

Next Steps

With your AI Services configured, you're ready to put them to work in agents, automations, and AI Search.
AI Services bridge the gap between AI Providers and your actual AI-powered features. Properly configured services ensure optimal performance, cost-effectiveness, and reliability for your AI workflows.