Polar Llama Documentation
A Python library for parallel LLM inference using Polars DataFrames
Polar Llama is a Python library that enables parallel inference calls to multiple Large Language Model providers through Polars dataframes. It streamlines batch processing of AI queries without serial request delays, making it ideal for data-intensive AI applications.
Concurrent Processing
Send multiple inference requests in parallel without waiting for individual completions
Polars Integration
Leverages efficient Polars dataframe operations for request management
Multi-turn Conversations
Supports context-preserving conversations across multiple message exchanges
Multiple Providers
Connects with OpenAI, Anthropic, Gemini, Groq, and AWS Bedrock models
Embeddings & Vector Operations
Generate embeddings and perform vector similarity searches with ANN, KNN, and cosine similarity
Cost Analytics
Track and calculate LLM inference costs to monitor and optimize your AI spending
llama Namespace & Performance
New namespace organization for cleaner imports, plus Link-Time Optimization for faster execution
Taxonomy-based Tagging
Classify documents with custom taxonomies including detailed reasoning, reflection, and confidence scores
Structured Output Support
Native support for structured outputs with Pydantic models and JSON schema validation
Using pip
pip install polar-llama==0.2.2
Development Installation
maturin develop
import polars as pl
from polar_llama import string_to_message, inference_async, Provider
import dotenv
# Load environment variables
dotenv.load_dotenv()
# Create a DataFrame with questions
questions = [
'What is the capital of France?',
'What is the difference between polars and pandas?',
'Explain async programming in Python'
]
df = pl.DataFrame({'Questions': questions})
# Convert questions to LLM messages
df = df.with_columns(
prompt=string_to_message("Questions", message_type='user')
)
# Run parallel inference
df = df.with_columns(
answer=inference_async('prompt', provider=Provider.OPENAI,
model='gpt-4o-mini')
)
# Display results
print(df)
Embeddings & Vector Operations
NEW in 0.2.2
Version 0.2.2 introduces powerful embedding and vector operations that enable semantic search, document similarity, and clustering. Generate embeddings from your text data and perform efficient similarity searches using industry-standard algorithms.
Embedding Generation
Generate vector embeddings from text using OpenAI, Cohere, or other embedding providers
Cosine Similarity
Calculate similarity scores between vectors for semantic matching
K-Nearest Neighbors (KNN)
Find the k most similar items using exact nearest neighbor search
Approximate NN (ANN)
Fast approximate similarity search for large-scale datasets
import polars as pl
from polar_llama import embed_async, Provider
# Create a DataFrame with text to embed
df = pl.DataFrame({
"id": [1, 2, 3],
"text": [
"Machine learning is a subset of artificial intelligence",
"Deep learning uses neural networks with many layers",
"Natural language processing analyzes human language"
]
})
# Generate embeddings
df = df.with_columns(
embedding=embed_async(
"text",
provider=Provider.OPENAI,
model="text-embedding-3-small"
)
)
print(df.select(["id", "text", "embedding"]))
Cosine Similarity
from polar_llama import cosine_similarity
# Calculate similarity between query and documents
query_embedding = [0.1, 0.2, 0.3, ...] # Your query vector
df = df.with_columns(
similarity=cosine_similarity("embedding", query_embedding)
)
# Sort by similarity to find most relevant documents
results = df.sort("similarity", descending=True).head(10)
K-Nearest Neighbors
from polar_llama import knn_search
# Find the 5 most similar documents
results = knn_search(
df,
query_embedding,
embedding_col="embedding",
k=5
)
print(results)
Approximate Nearest Neighbors (ANN)
from polar_llama import ann_search
# Fast approximate search for large datasets
results = ann_search(
df,
query_embedding,
embedding_col="embedding",
k=10,
n_probes=10 # Trade-off between speed and accuracy
)
print(results)
Semantic Search
Find documents by meaning rather than keyword matching
Document Clustering
Group similar documents together automatically
Recommendation Systems
Suggest similar items based on content similarity
Duplicate Detection
Identify near-duplicate content in large datasets
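The duplicate-detection pattern can be sketched without any provider calls. This is an illustrative, library-agnostic sketch: the toy 3-dimensional vectors stand in for real embed_async output, and the 0.99 threshold is an arbitrary example value, not a library default.

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of vector norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings keyed by document id (real ones come from an embedding model)
docs = {
    1: [0.9, 0.1, 0.0],
    2: [0.89, 0.11, 0.01],  # near-duplicate of doc 1
    3: [0.0, 0.2, 0.9],
}

THRESHOLD = 0.99  # illustrative cutoff for "near-duplicate"
duplicates = [
    (i, j)
    for i in docs for j in docs
    if i < j and cosine(docs[i], docs[j]) > THRESHOLD
]
print(duplicates)  # [(1, 2)]
```

The same comparison scales to large datasets by swapping the exhaustive pairwise loop for the ANN search shown above.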
Cost Analytics
NEW in 0.2.2
The new cost analytics feature helps you monitor and manage your AI spending by calculating the cost of each inference call. Track costs per request, aggregate spending over time, and identify opportunities to optimize your LLM usage.
Per-Request Costs
Calculate the exact cost of each inference call based on token usage
Provider Pricing
Built-in pricing data for OpenAI, Anthropic, and other providers
Cost Aggregation
Sum up costs across batches, time periods, or custom groupings
Budget Monitoring
Set cost thresholds and track spending against budgets
import polars as pl
from polar_llama import inference_async, calculate_cost, Provider
# Run inference and track costs
df = df.with_columns(
answer=inference_async(
'prompt',
provider=Provider.OPENAI,
model='gpt-4o-mini',
return_usage=True # Enable usage tracking
)
)
# Calculate cost for each request
df = df.with_columns(
cost=calculate_cost(
"answer",
provider=Provider.OPENAI,
model='gpt-4o-mini'
)
)
# Get total cost
total_cost = df.select(pl.col("cost").sum()).item()
print(f"Total inference cost: ${total_cost:.4f}")
# Cost breakdown by category
cost_by_category = df.group_by("category").agg(
pl.col("cost").sum().alias("total_cost"),
pl.col("cost").mean().alias("avg_cost"),
pl.col("cost").count().alias("request_count")
)
print(cost_by_category)
Taxonomy-based Tagging
0.2.1
Taxonomy-based tagging is a powerful feature that allows you to classify documents according to a custom taxonomy with detailed reasoning, reflection, and confidence scores. It is particularly useful for content classification, customer support routing, email triage, sentiment analysis, and multi-label classification.
Detailed Reasoning
For each possible value in each field, the model provides its reasoning
Reflection
After considering all options, the model reflects on its analysis
Confidence Scores
Each classification includes a confidence score (0.0 to 1.0)
Parallel Processing
Multiple documents and fields are processed in parallel automatically
import polars as pl
from polar_llama import tag_taxonomy, Provider
# Define your taxonomy
taxonomy = {
"sentiment": {
"description": "The emotional tone of the text",
"values": {
"positive": "Text expresses positive emotions or favorable opinions",
"negative": "Text expresses negative emotions or unfavorable opinions",
"neutral": "Text is factual and objective without clear emotional content"
}
},
"urgency": {
"description": "How urgent the content is",
"values": {
"high": "Requires immediate attention",
"medium": "Should be addressed soon",
"low": "Can be addressed at any time"
}
}
}
# Create a dataframe
df = pl.DataFrame({
"id": [1, 2],
"message": [
"URGENT: Server is down!",
"Thanks for your help yesterday."
]
})
# Apply taxonomy tagging
result = df.with_columns(
tags=tag_taxonomy(
pl.col("message"),
taxonomy,
provider=Provider.GROQ,
model="openai/gpt-oss-120b"
)
)
# Extract specific values
result.select([
"message",
pl.col("tags").struct.field("sentiment").struct.field("value").alias("sentiment"),
pl.col("tags").struct.field("sentiment").struct.field("confidence").alias("confidence"),
pl.col("tags").struct.field("urgency").struct.field("value").alias("urgency")
])
A taxonomy is defined as a dictionary with the following structure:
taxonomy = {
"field_name": {
"description": "What this field represents",
"values": {
"value1": "Definition of value1",
"value2": "Definition of value2",
# ... more values
}
},
# ... more fields
}
Design Tips
- Clear Definitions: Make value definitions specific and mutually exclusive
- Appropriate Granularity: 3-5 values per field works well; too many can confuse the model
- Balanced Options: Try to provide balanced options that cover the full range
- Domain-Specific: Tailor definitions to your specific use case
Each tagged document returns a Struct with the following nested structure:
{
"field_name": {
"thinking": {
"value1": "Reasoning about why value1 might apply...",
"value2": "Reasoning about why value2 might apply...",
# ... reasoning for each possible value
},
"reflection": "Overall reflection on the analysis of this field...",
"value": "selected_value", # The chosen value
"confidence": 0.87 # Confidence score (0.0 to 1.0)
},
# ... more fields
}
thinking: A dictionary with reasoning for each possible value in the taxonomy
reflection: The model's overall reflection after considering all options
value: The selected value (one of the values from the taxonomy)
confidence: How confident the model is in its selection (0.0 = not confident, 1.0 = very confident)
Extract Specific Fields
# Get just the selected value
sentiment = result_df.select(
pl.col("tags").struct.field("sentiment").struct.field("value")
)
# Get value and confidence together
sentiment_analysis = result_df.select([
pl.col("tags").struct.field("sentiment").struct.field("value").alias("sentiment"),
pl.col("tags").struct.field("sentiment").struct.field("confidence").alias("confidence")
])
Access Detailed Reasoning
# Get the thinking for a specific field
thinking = result_df.select(
pl.col("tags").struct.field("sentiment").struct.field("thinking")
)
# Get the reflection
reflection = result_df.select(
pl.col("tags").struct.field("sentiment").struct.field("reflection")
)
Multiple Fields at Once
# Create a clean summary view
summary = result_df.select([
"id",
"document",
pl.col("tags").struct.field("sentiment").struct.field("value").alias("sentiment"),
pl.col("tags").struct.field("urgency").struct.field("value").alias("urgency"),
pl.col("tags").struct.field("category").struct.field("value").alias("category")
])
Filtering by Confidence
# Only keep high-confidence results
high_confidence = result_df.filter(
pl.col("tags").struct.field("sentiment").struct.field("confidence") > 0.8
)
Combining Multiple Conditions
# Find negative, urgent items with high confidence
critical = result_df.filter(
(pl.col("tags").struct.field("sentiment").struct.field("value") == "negative") &
(pl.col("tags").struct.field("urgency").struct.field("value") == "high") &
(pl.col("tags").struct.field("urgency").struct.field("confidence") > 0.7)
)
Aggregating by Category
# Count documents by sentiment
sentiment_counts = result_df.group_by(
pl.col("tags").struct.field("sentiment").struct.field("value")
).len()
# Average confidence by category
avg_confidence = result_df.group_by(
pl.col("tags").struct.field("category").struct.field("value")
).agg(
pl.col("tags").struct.field("category").struct.field("confidence").mean()
)
1. Customer Support Routing
taxonomy = {
"department": {
"description": "Which department should handle this",
"values": {
"sales": "Product inquiries and purchases",
"support": "Technical issues and bugs",
"billing": "Payment and account questions"
}
},
"priority": {
"description": "How urgent this is",
"values": {
"urgent": "Service down or critical issue",
"high": "Significant problem affecting work",
"normal": "Standard request or question"
}
}
}
2. Content Classification
taxonomy = {
"category": {
"description": "Main topic area",
"values": {
"technology": "Tech, software, or digital topics",
"business": "Business, finance, or economics",
"lifestyle": "Health, wellness, or personal topics"
}
},
"content_type": {
"description": "Format and purpose",
"values": {
"tutorial": "Step-by-step instructional content",
"analysis": "In-depth examination of a topic",
"news": "Timely reporting of events"
}
}
}
3. Social Media Analysis
taxonomy = {
"sentiment": {
"description": "Emotional tone",
"values": {
"positive": "Positive emotions or opinions",
"negative": "Negative emotions or criticism",
"neutral": "Factual without clear emotion"
}
},
"topic": {
"description": "Main subject discussed",
"values": {
"product": "Discussion of product features",
"service": "Customer service experience",
"brand": "General brand perception"
}
},
"intent": {
"description": "What the author wants",
"values": {
"complaint": "Expressing dissatisfaction",
"praise": "Sharing positive experience",
"question": "Seeking information"
}
}
}
def tag_taxonomy(
expr: IntoExpr,
taxonomy: Dict[str, Dict[str, Any]],
*,
provider: Optional[Union[str, Provider]] = None,
model: Optional[str] = None,
) -> pl.Expr
expr: The document expression to analyze and tag
taxonomy: Dictionary defining the taxonomy structure
provider: The LLM provider to use (OpenAI, Anthropic, Gemini, Groq, Bedrock)
model: The specific model name to use
Returns: A Polars expression with structured tags as a Struct column
1. Start Simple: Begin with 2-3 fields and expand as needed
2. Test Definitions: Verify that your value definitions are clear and distinguishable
3. Use Confidence Scores: Filter or flag low-confidence results for review
4. Validate Results: Spot-check classifications to ensure quality
5. Iterate: Refine your taxonomy based on results
6. Handle Errors: Always check for and handle error cases
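Best practices 3 and 6 can be combined in a small routine that routes failed or low-confidence classifications to manual review instead of trusting them blindly. This is an illustrative sketch: the plain dictionaries mimic the tag Struct layout shown earlier, and the 0.8 cutoff is an arbitrary example value.

```python
# Simulated tagging results: one per document, None marks a failed inference
tags = [
    {"sentiment": {"value": "negative", "confidence": 0.95}},
    {"sentiment": {"value": "neutral", "confidence": 0.42}},  # low confidence
    None,  # inference failed for this document
]

accepted, needs_review = [], []
for tag in tags:
    field = tag.get("sentiment") if tag else None
    if field is None or field["confidence"] < 0.8:
        # Flag failures and low-confidence results for human review
        needs_review.append(tag)
    else:
        accepted.append(field["value"])

print(accepted)           # ['negative']
print(len(needs_review))  # 2
```

In a DataFrame pipeline the same logic corresponds to the confidence-filter expressions shown above, with an added null check on the tags column.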
Structured Outputs
0.2.0
Structured outputs allow you to define the exact schema you want the LLM to follow, ensuring responses are properly formatted and can be used directly in your data pipelines. This is perfect for extracting specific information, generating consistent data, or integrating LLM outputs with databases and APIs.
Type Safety
Define your output schema with Pydantic models for guaranteed type correctness
Validation
Automatic validation ensures responses match your schema before processing
Consistency
Get predictable, parseable outputs across all your inference requests
Easy Integration
Seamlessly integrate with databases, APIs, and data processing pipelines
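Before looking at the full pipeline, it can help to see the validation step in isolation. This sketch uses plain Pydantic v2 on a hand-written JSON string, so no LLM call is involved; the reply text is an invented stand-in for a raw model response.

```python
from pydantic import BaseModel

# The same schema used in the pipeline example below
class ProductInfo(BaseModel):
    name: str
    price: float
    category: str
    in_stock: bool

# A hand-written stand-in for a raw model reply
raw = '{"name": "iPhone 15 Pro", "price": 999, "category": "Electronics", "in_stock": true}'

# Parsing raises a ValidationError if any field is missing or mistyped,
# which is what guarantees type safety downstream
product = ProductInfo.model_validate_json(raw)
print(product.price)     # 999.0 (the integer is coerced to float by the schema)
print(product.in_stock)  # True
```

Passing a response_model performs this validation for every row, so malformed replies surface as errors rather than silently corrupting the column.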
import polars as pl
from polar_llama import string_to_message, inference_async, Provider
from pydantic import BaseModel
# Define your output schema
class ProductInfo(BaseModel):
name: str
price: float
category: str
in_stock: bool
# Create prompts
prompts = [
"Extract product info: iPhone 15 Pro for $999 in Electronics, available",
"Extract product info: Nike Air Max shoes for $129.99 in Footwear, sold out",
"Extract product info: Laptop Stand for $49.99 in Accessories, in stock"
]
df = pl.DataFrame({'prompt': prompts})
# Convert to messages
df = df.with_columns(
message=string_to_message("prompt", message_type='user')
)
# Run inference with structured output
df = df.with_columns(
product=inference_async(
'message',
provider=Provider.OPENAI,
model='gpt-4o-2024-08-06',
response_model=ProductInfo # Specify your Pydantic model
)
)
# Access structured fields directly
print(df.select(['product']))
Examples & Cookbooks
import polars as pl
from polar_llama import string_to_message, inference_async
# Create a DataFrame with system prompts and user questions
df = pl.DataFrame({
"system_prompt": [
"You are a helpful assistant.",
"You are a math expert.",
"You are a creative writer."
],
"user_question": [
"What's the weather like today?",
"Solve x^2 + 5x + 6 = 0",
"Write a haiku about coding"
]
})
# Convert both columns to messages
df = df.with_columns([
string_to_message("system_prompt", message_type="system").alias("system_message"),
string_to_message("user_question", message_type="user").alias("user_message")
])
# Combine messages into conversations
from polar_llama import combine_messages, inference_messages
df = df.with_columns(
combine_messages("system_message", "user_message").alias("conversation")
)
# Run inference with combined messages
df = df.with_columns(
inference_messages("conversation",
provider="openai",
model="gpt-4").alias("response")
)
print(df.select(["user_question", "response"]))
import polars as pl
from polar_llama import string_to_message, inference_async, Provider
# Load customer feedback data
feedback_df = pl.DataFrame({
'customer_id': [101, 102, 103, 104, 105],
'feedback': [
'The product is amazing but shipping was slow',
'Great quality, highly recommend!',
'Disappointed with customer service',
'Perfect for my needs, will buy again',
'Product arrived damaged, requesting refund'
]
})
# Create sentiment analysis prompts
# Note: pl.format substitutes positional {} placeholders
sentiment_prompt = """Analyze the sentiment of this customer feedback
and classify it as Positive, Negative, or Neutral.
Also provide a brief reason.
Feedback: {}"""
df = feedback_df.with_columns(
prompt=pl.format(sentiment_prompt, pl.col('feedback'))
)
# Convert to messages and run inference
df = df.with_columns(
message=string_to_message("prompt", message_type='user')
)
df = df.with_columns(
sentiment_analysis=inference_async('message',
provider=Provider.OPENAI,
model='gpt-4o-mini')
)
# Extract key insights
print(df.select(['customer_id', 'feedback', 'sentiment_analysis']))
OpenAI
df = products.with_columns(
prompt=pl.format(prompt_template,
pl.col('product_name'),
pl.col('features'),
pl.col('target_audience'))
)
Structured Outputs: Supported on gpt-4o-2024-08-06 and later models with the response_model parameter
Anthropic (Claude)
df = df.with_columns(
answer=inference_async('prompt',
provider=Provider.ANTHROPIC,
model='claude-3-haiku-20240307')
)
Structured Outputs: Supported on Claude 3.5 Sonnet and later with the response_model parameter
AWS Bedrock
# Requires AWS credentials configured
df = df.with_columns(
answer=inference_async('prompt',
provider='bedrock',
model='anthropic.claude-3-haiku-20240307-v1:0')
)
Google Gemini
df = df.with_columns(
answer=inference_async('prompt',
provider=Provider.GEMINI,
model='gemini-pro')
)
Groq
df = df.with_columns(
answer=inference_async('prompt',
provider=Provider.GROQ,
model='llama3-70b-8192')
)
Environment Configuration
Set up your API keys in a .env file:
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
GEMINI_API_KEY=your_gemini_key
GROQ_API_KEY=your_groq_key
AWS_ACCESS_KEY_ID=your_aws_key
AWS_SECRET_ACCESS_KEY=your_aws_secret
AWS_REGION=us-east-1
Testing
Run tests with configured providers:
pip install -r tests/requirements.txt
pytest tests/ -v
cargo test --test model_client_tests -- --nocapture
Data Analysis
Process large datasets with AI insights - sentiment analysis, classification, entity extraction with validated structured outputs
Content Generation
Generate product descriptions, marketing copy, or documentation at scale with consistent formatting
Research & Summarization
Summarize documents, extract key points with structured metadata, or answer questions about large text corpora
Automation
Automate repetitive AI tasks like code review, email categorization, or data enrichment with type-safe outputs
Licensed under MIT
Questions or issues? Open an issue on GitHub