Using Sqlite

by tilmon-engineering

workflowdatacli

Use when working with SQLite databases in DataPeeker analysis sessions - querying data, importing CSVs, exploring schemas, formatting output, or optimizing performance. Provides task-oriented guidance for effective SQLite CLI usage in data analysis workflows.

Skill Details

Repository Files

7 files in this skill directory


name: using-sqlite description: Use when working with SQLite databases in DataPeeker analysis sessions - querying data, importing CSVs, exploring schemas, formatting output, or optimizing performance. Provides task-oriented guidance for effective SQLite CLI usage in data analysis workflows.

Using SQLite

Overview

SQLite is the primary database for DataPeeker analysis sessions. This skill provides task-oriented guidance for common SQLite operations during data analysis.

Core principle: Explore schema first, format output for readability, write efficient queries, verify all operations.

When to Use

Use this skill when you need to:

  • Explore an unfamiliar database schema
  • Query data for analysis or hypothesis testing
  • Import CSV files into SQLite tables
  • Format query output for readability
  • Diagnose slow queries or optimize performance
  • Understand which CLI invocation pattern to use

When NOT to use:

  • Database-agnostic data profiling (see understanding-data skill for patterns that work across all SQL databases)
  • Complex data cleaning logic (delegate to cleaning-data skill + sub-agents)
  • Statistical analysis (use Python/pandas for advanced statistics)
  • Large-scale transformations (use Python sqlite3 module)

DataPeeker Conventions

Database Path:    data/analytics.db (relative from project root)
Table Naming:     raw_* for imported data, clean_* for cleaned data
Single Database:  All tables in one file per analysis session

Example workflow:

CSV file → raw_sales → clean_sales → Analysis queries

Quick Reference

Task Guidance File
Understand what tables/columns exist @./exploring-schema.md
Make query results readable @./formatting-output.md
Write analytical queries @./writing-queries.md
Load CSV files @./importing-data.md
Fix slow queries @./optimizing-performance.md
Choose CLI invocation method @./invoking-cli.md

Task-Oriented Guidance

Exploring Schema

Before writing queries, understand the database structure.

See @./exploring-schema.md for:

  • Listing all tables (.tables)
  • Viewing table structure (.schema, PRAGMA table_info)
  • Understanding column types and constraints
  • Checking for indexes

When: Starting analysis, unfamiliar database, before writing joins


Formatting Output

Make query results readable for analysis.

See @./formatting-output.md for:

  • Output modes (column, csv, json, markdown)
  • Showing/hiding headers (.headers on/off)
  • Setting column widths for readability
  • Redirecting output to files

When: Query results hard to read, need specific format for export, preparing reports


Writing Queries

SQLite-specific query patterns and conventions.

See @./writing-queries.md for:

  • SQLite idioms used in DataPeeker (COUNT(*) - COUNT(col) for NULLs)
  • Date handling with STRFTIME
  • DataPeeker percentage calculation conventions
  • Common verification queries

See also: writing-queries and understanding-data skills for database-agnostic SQL patterns and data profiling approaches. This guidance focuses on SQLite-specific syntax, CLI usage, and optimizations.

When: Need SQLite-specific syntax, DataPeeker query conventions, date formatting with STRFTIME


Importing Data

Load CSV files and verify import success.

See @./importing-data.md for:

  • Using .import command
  • Verification queries (row counts, sample data)
  • When to use CLI vs Python sqlite3
  • Transaction handling

When: Loading new data, Phase 4 of importing-data skill, verifying data loaded correctly


Optimizing Performance

Diagnose and fix slow queries.

See @./optimizing-performance.md for:

  • EXPLAIN QUERY PLAN analysis
  • Creating indexes for common queries
  • Using transactions for bulk operations
  • PRAGMA optimization

When: Query takes >1 second, loading large datasets, repeated similar queries


Invoking CLI

Choose the right method to run sqlite3 commands.

See @./invoking-cli.md for:

  • Interactive mode (for exploration)
  • Heredoc pattern (for multi-command scripts)
  • File redirect (for SQL files)
  • One-liner mode (for quick checks)

When: Starting any SQLite operation, unsure which invocation to use

Common Mistakes

Mistake Fix
Writing queries without exploring schema Always run .tables and .schema first
Poor output formatting (hard to read results) Use .mode column and .headers on for readability
Ignoring NULL values in calculations Use COUNT(*) - COUNT(col) for NULL counting
Integer division losing decimals Use 100.0 (not 100) for percentage calculations
Slow queries without diagnosis Run EXPLAIN QUERY PLAN before optimizing
Assuming import succeeded Always verify with SELECT COUNT(*) after import
Using wrong CLI invocation pattern Interactive for exploration, heredoc for scripts

Verification Before Proceeding

After any import operation:

-- 1. Verify row count matches expectation
SELECT COUNT(*) FROM raw_table;

-- 2. Check sample data looks correct
SELECT * FROM raw_table LIMIT 5;

-- 3. Verify no unexpected NULLs
SELECT COUNT(*) - COUNT(critical_column) FROM raw_table;

After writing a complex query:

-- 1. Check query plan
EXPLAIN QUERY PLAN SELECT ...;

-- 2. Time the query
.timer on
SELECT ...;

Real-World Impact

Systematic approach prevents:

  • Writing queries against wrong tables (explore schema first)
  • Unreadable output (format before analyzing)
  • Slow queries (diagnose with EXPLAIN QUERY PLAN)
  • Silent data loss (verify imports)
  • Incorrect percentages (use float division)

Following this skill:

  • Reduces query debugging time by 50%+
  • Catches import issues immediately
  • Produces readable analysis artifacts
  • Ensures reproducible workflows

Related Skills

Xlsx

Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modify existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas

data

Clickhouse Io

ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.

datacli

Clickhouse Io

ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.

datacli

Analyzing Financial Statements

This skill calculates key financial ratios and metrics from financial statement data for investment analysis

data

Data Storytelling

Transform data into compelling narratives using visualization, context, and persuasive structure. Use when presenting analytics to stakeholders, creating data reports, or building executive presentations.

data

Kpi Dashboard Design

Design effective KPI dashboards with metrics selection, visualization best practices, and real-time monitoring patterns. Use when building business dashboards, selecting metrics, or designing data visualization layouts.

designdata

Dbt Transformation Patterns

Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.

testingdocumenttool

Sql Optimization Patterns

Master SQL query optimization, indexing strategies, and EXPLAIN analysis to dramatically improve database performance and eliminate slow queries. Use when debugging slow queries, designing database schemas, or optimizing application performance.

designdata

Clinical Decision Support

Generate professional clinical decision support (CDS) documents for pharmaceutical and clinical research settings, including patient cohort analyses (biomarker-stratified with outcomes) and treatment recommendation reports (evidence-based guidelines with decision algorithms). Supports GRADE evidence grading, statistical analysis (hazard ratios, survival curves, waterfall plots), biomarker integration, and regulatory compliance. Outputs publication-ready LaTeX/PDF format optimized for drug develo

developmentdocumentcli

Anndata

This skill should be used when working with annotated data matrices in Python, particularly for single-cell genomics analysis, managing experimental measurements with metadata, or handling large-scale biological datasets. Use when tasks involve AnnData objects, h5ad files, single-cell RNA-seq data, or integration with scanpy/scverse tools.

arttooldata

Skill Information

Category:Technical
Last Updated:1/1/2026