---
name: lumen-ai
description: Master AI-powered natural language data exploration with Lumen AI. Use this skill when building conversational data analysis interfaces, enabling natural language queries to databases, creating custom AI agents for domain-specific analytics, implementing RAG with document context, or deploying self-service analytics with LLM-generated SQL and visualizations.
compatibility: Requires lumen >= 0.10.0 (with AI support), panel >= 1.3.0, openai or anthropic or other LLM provider libraries. Supports OpenAI, Anthropic Claude, Google Gemini, Mistral, and local models via Ollama or LlamaCPP.
---

Lumen AI Skill

Overview

Lumen AI is an open-source, agent-based framework for conversational data exploration. Users ask questions in plain English and receive visualizations, SQL queries, and insights automatically generated by large language models.

What is Lumen AI?

Lumen AI translates natural language queries into:

  • SQL queries for database exploration
  • Interactive visualizations
  • Statistical summaries
  • Custom domain-specific analyses
  • Data-driven insights

Key Features

  • Natural Language Interface: Ask questions in plain English
  • Multi-LLM Support: OpenAI, Anthropic, Google, Mistral, local models
  • Agent Architecture: Specialized agents for SQL, charts, analyses
  • Extensible: Custom agents, tools, and analyses
  • Privacy-Focused: Full local deployment option
  • No Vendor Lock-in: Switch LLM providers with configuration change

Lumen AI vs Lumen Dashboards

Feature     | Lumen AI                              | Lumen Dashboards
Interface   | Conversational, natural language      | Declarative YAML
Use Case    | Ad-hoc exploration, varying questions | Fixed dashboards, repeated views
Users       | Non-technical users, self-service     | Developers, dashboard builders
Cost        | LLM API costs                         | No LLM costs
Flexibility | High - generates any query            | Fixed - predefined views

Use Lumen AI when:

  • Users need ad-hoc data exploration
  • Questions vary and aren't predictable
  • Enabling self-service analytics
  • Reducing analyst backlog

Use Lumen Dashboards when:

  • Dashboard structure is fixed
  • Same visualizations needed repeatedly
  • No LLM costs desired
  • Full control over outputs needed

Quick Start

Installation

# Install Lumen with AI support
pip install "lumen[ai]"

# Install LLM provider (choose one or more)
pip install openai        # OpenAI
pip install anthropic     # Anthropic Claude

Launch Built-in Interface

# Set API key
export OPENAI_API_KEY="sk-..."

# Launch with dataset
lumen-ai serve data/sales.csv

# Or with database
lumen-ai serve "postgresql://user:pass@localhost/mydb"

Python API - Basic Example

import lumen.ai as lmai
import panel as pn
from lumen.sources.duckdb import DuckDBSource

pn.extension()

# Configure LLM
lmai.llm.llm_type = "anthropic"
lmai.llm.model = "claude-3-5-sonnet-20241022"

# Load data
source = DuckDBSource(
    tables=["./data/sales.csv", "./data/customers.csv"]
)

# Create UI
ui = lmai.ExplorerUI(
    source=source,
    title="Sales Analytics AI"
)

ui.servable()

Example Queries

Once running, try queries like:

  • "What tables are available?"
  • "Show me total sales by region"
  • "Create a scatter plot of price vs quantity"
  • "What were the top 10 products last month?"
  • "Calculate average order value per customer"

Core Concepts

1. Agents

Specialized components that handle specific tasks:

  • TableListAgent: Shows available tables and schemas
  • ChatAgent: General conversation and summaries
  • SQLAgent: Generates and executes SQL queries
  • hvPlotAgent: Creates interactive visualizations
  • VegaLiteAgent: Publication-quality charts
  • AnalysisAgent: Custom domain-specific analyses

See: Built-in Agents Reference for complete agent documentation.

2. LLM Providers

Lumen AI works with multiple LLM providers:

Cloud Providers:

  • OpenAI (GPT-4o, GPT-4o-mini)
  • Anthropic (Claude 3.5 Sonnet, Claude 3 Opus/Haiku)
  • Google (Gemini 1.5 Pro/Flash)
  • Mistral (Mistral Large/Medium/Small)

Local Models:

  • Ollama (Llama 3.1, Mistral, CodeLlama)
  • LlamaCPP (custom models)

See: LLM Provider Configuration for setup details and provider comparison.
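
For example, switching to a local model uses the same configuration hooks shown in the Quick Start. A minimal sketch, assuming Ollama is running locally with the model already pulled; the "ollama" and "llama3.1" identifiers here are illustrative, so check LLM Provider Configuration for the canonical names:

import lumen.ai as lmai

# Route agent calls to a locally served model instead of a cloud API.
# No API key is needed because inference never leaves the machine.
lmai.llm.llm_type = "ollama"   # assumed provider identifier
lmai.llm.model = "llama3.1"

# The rest of the setup (source, ExplorerUI) is unchanged.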

3. Memory and Context

Agents share a memory system:

  • Query results persist across interactions
  • Agents can build on previous work
  • Context maintained throughout conversation
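
A brief sketch of how an agent might participate in this shared memory, anticipating the requires/provides pattern shown in Pattern 3 below. The SummaryAgent name and the "summary_note" key are illustrative, and writing back to self.memory is assumed to work via item assignment:

import param
from lumen.ai.agents import Agent

class SummaryAgent(Agent):
    """Illustrative agent that reads from and writes to shared memory."""

    requires = param.List(default=["current_source"])  # consumed from memory
    provides = param.List(default=["summary_note"])    # written for later agents

    purpose = "Summarizes the currently loaded data. Keywords: summary, overview."

    async def respond(self, query: str):
        source = self.memory["current_source"]     # produced by an earlier agent
        note = f"Available tables: {source.get_tables()}"
        self.memory["summary_note"] = note          # later agents can build on this
        yield note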

4. Tools

Extend agent capabilities:

  • DocumentLookup: RAG for document context
  • TableLookup: Schema and metadata access
  • Custom Tools: External APIs, calculations, etc.

See: Custom Tools Guide for building tools.
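
For example, the built-in lookup tools can be enabled explicitly when constructing the UI. A short sketch, assuming TableLookup is importable from lmai.tools alongside DocumentLookup (which Pattern 2 below uses):

import lumen.ai as lmai
from lumen.sources.duckdb import DuckDBSource

source = DuckDBSource(
    tables=["./data/sales.csv"],
    documents=["./docs/data_dictionary.pdf"],   # searched by DocumentLookup
)

# DocumentLookup provides RAG over the attached documents;
# TableLookup exposes table schemas and metadata to the agents.
ui = lmai.ExplorerUI(
    source=source,
    tools=[lmai.tools.DocumentLookup, lmai.tools.TableLookup],
)
ui.servable()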

Common Patterns

Pattern 1: Basic Analytics Interface

import lumen.ai as lmai
from lumen.sources.duckdb import DuckDBSource

# Configure LLM
lmai.llm.llm_type = "openai"
lmai.llm.model = "gpt-4o"

# Load data
source = DuckDBSource(tables=["sales.csv"])

# Create UI
ui = lmai.ExplorerUI(
    source=source,
    title="Business Analytics"
)

ui.servable()

Pattern 2: With Document Context (RAG)

source = DuckDBSource(
    tables=["sales.csv", "products.parquet"],
    documents=[
        "./docs/data_dictionary.pdf",
        "./docs/business_rules.md"
    ]
)

ui = lmai.ExplorerUI(
    source=source,
    tools=[lmai.tools.DocumentLookup]
)

Agents will automatically search documents for context when needed.

Pattern 3: Custom Agent

import lumen.ai as lmai
import param
from lumen.ai.agents import Agent

class SentimentAgent(Agent):
    """Analyze sentiment in text data."""

    requires = param.List(default=["current_source"])
    provides = param.List(default=["sentiment_analysis"])

    purpose = """
    Analyzes sentiment in text columns.
    Use when user asks about sentiment, emotions, or tone.
    Keywords: sentiment, emotion, positive, negative, tone
    """

    async def respond(self, query: str):
        # Agent implementation
        source = self.memory["current_source"]
        # ... analyze sentiment ...
        yield "Sentiment analysis results..."

# Use custom agent
ui = lmai.ExplorerUI(
    source=source,
    agents=[SentimentAgent, lmai.agents.ChatAgent]
)

See: Custom Agents Guide for a detailed development walkthrough.

Pattern 4: Custom Analysis

import lumen.ai as lmai
import param
from lumen.ai.analyses import Analysis
from lumen.pipeline import Pipeline

class CohortAnalysis(Analysis):
    """Customer cohort retention analysis."""

    columns = param.List(default=[
        'customer_id', 'signup_date', 'purchase_date'
    ])

    def __call__(self, pipeline: Pipeline):
        # Cohort analysis logic
        df = pipeline.data
        # ... calculate cohorts ...
        return results

# Register analysis
ui = lmai.ExplorerUI(
    source=source,
    agents=[
        lmai.agents.AnalysisAgent(analyses=[CohortAnalysis])
    ]
)

See: Custom Analyses Guide for examples.

Pattern 5: Multi-Source Data

from lumen.sources.duckdb import DuckDBSource

source = DuckDBSource(
    tables={
        "sales": "./data/sales.parquet",
        "customers": "./data/customers.csv",
        "products": "https://data.company.com/products.csv"
    }
)

ui = lmai.ExplorerUI(source=source)

Configuration

LLM Selection

Quick reference for choosing LLM:

Use Case             | Provider  | Model             | Why
Production analytics | OpenAI    | gpt-4o            | Best balance
Complex SQL          | Anthropic | claude-3-5-sonnet | Superior reasoning
High volume          | OpenAI    | gpt-4o-mini       | Cost-effective
Sensitive data       | Ollama    | llama3.1          | Local only
Development          | OpenAI    | gpt-4o-mini       | Fast, cheap

See: LLM Provider Configuration for complete setup.

Agent Selection

# Use only specific agents
agents = [
    lmai.agents.TableListAgent,
    lmai.agents.SQLAgent,
    lmai.agents.hvPlotAgent,
    # Exclude VegaLiteAgent if not needed
]

ui = lmai.ExplorerUI(source=source, agents=agents)

Coordinator Types

DependencyResolver (default): Recursively resolves agent dependencies

ui = lmai.ExplorerUI(source=source, coordinator="dependency")

Planner: Creates execution plan upfront

ui = lmai.ExplorerUI(source=source, coordinator="planner")

UI Customization

ui = lmai.ExplorerUI(
    source=source,
    title="Custom Analytics AI",
    accent_color="#00aa41",
    suggestions=[
        "Show me revenue trends",
        "What are the top products?",
        "Create customer segmentation"
    ]
)

Best Practices

1. Provider Selection

  • Production: Use Anthropic Claude 3.5 Sonnet or GPT-4o
  • Development: Use GPT-4o-mini for cost savings
  • Sensitive data: Use Ollama for local deployment

2. Security

import os

# ✅ Good: Environment variables
lmai.llm.api_key = os.getenv("OPENAI_API_KEY")

# ❌ Bad: Hardcoded secrets
lmai.llm.api_key = "sk-..."

3. Performance

# Limit table sizes for exploration
source = DuckDBSource(
    tables=["large_table.parquet"],
    table_kwargs={"large_table": {"nrows": 100000}}
)

4. User Experience

# Provide example queries
ui = lmai.ExplorerUI(
    source=source,
    suggestions=[
        "Show me revenue trends",
        "Top 10 products by sales",
        "Customer segmentation analysis"
    ]
)

Deployment

Development

lumen-ai serve app.py --autoreload --show

Production

panel serve app.py \
  --port 80 \
  --num-procs 4 \
  --allow-websocket-origin=analytics.company.com

Docker

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY app.py ./
COPY data/ ./data/
CMD ["panel", "serve", "app.py", "--port", "5006", "--address", "0.0.0.0"]

See: Deployment Guide for production deployment, Docker, Kubernetes, and security.

Troubleshooting

LLM Not Responding

# Check API key
import os
print(os.getenv("OPENAI_API_KEY"))

# Test connection
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"

Agent Not Selected

# Debug which agent was selected
print(ui.agent_manager.last_selected_agent)

# View agent purposes
for agent in ui.agents:
    print(f"{agent.__class__.__name__}: {agent.purpose}")

SQL Generation Errors

  • Add data dictionary as document for context
  • Provide example queries in agent prompts
  • Check table schemas match query expectations

See: Troubleshooting Guide for complete troubleshooting reference.

Progressive Learning Path

Level 1: Getting Started

  1. Install and launch built-in interface
  2. Try example queries
  3. Configure LLM provider

Level 2: Python API

  1. Create basic ExplorerUI
  2. Configure agents and tools
  3. Add document context (RAG)

Level 3: Customization

  1. Build custom agents
  2. Create custom analyses
  3. Add custom tools

Level 4: Production

  1. Deploy with authentication
  2. Implement monitoring
  3. Scale horizontally

Use Cases

Business Analytics

  • Ad-hoc revenue analysis
  • Customer behavior exploration
  • Sales performance tracking
  • Market segmentation

Data Science

  • Exploratory data analysis
  • Quick statistical summaries
  • Hypothesis testing
  • Pattern discovery

Operations

  • Real-time monitoring queries
  • Anomaly investigation
  • Performance metrics
  • Incident analysis

Self-Service Analytics

  • Enabling business users
  • Reducing analyst backlog
  • Democratizing data access
  • Maintaining governance

Summary

Lumen AI transforms data exploration through natural language interfaces powered by LLMs.

Strengths:

  • No SQL or coding required for users
  • Flexible LLM support (cloud and local)
  • Extensible architecture
  • Privacy-focused options
  • Reduces analyst workload

Ideal for:

  • Ad-hoc data exploration
  • Non-technical users
  • Rapid insights
  • Self-service analytics

Consider alternatives when:

  • Dashboard structure is fixed and the same views are needed repeatedly
  • LLM API costs are undesirable
  • Full control over query outputs is required
