Nixtla Schema Mapper
by intent-solutions-io
Transform data sources to Nixtla schema (unique_id, ds, y) with column inference. Use when preparing data for forecasting. Trigger with 'map to Nixtla schema' or 'transform data'.
name: nixtla-schema-mapper
description: "Transform data sources to Nixtla schema (unique_id, ds, y) with column inference. Use when preparing data for forecasting. Trigger with 'map to Nixtla schema' or 'transform data'."
allowed-tools: "Read,Write,Glob,Grep,Edit"
version: "1.1.0"
author: "Jeremy Longshore <jeremy@intentsolutions.io>"
license: MIT
Nixtla Schema Mapper
Transform data sources to Nixtla-compatible schema (unique_id, ds, y).
Overview
This skill automates data transformation:
- Column inference: Detects timestamp, target, and ID columns
- Code generation: Python modules for CSV/SQL/Parquet/dbt
- Schema contracts: Documentation with validation rules
- Quality checks: Validates transformed data
Prerequisites
Required:
- Python 3.8+
- pandas
Optional:
- pyarrow: For Parquet support
- sqlalchemy: For SQL sources
- dbt-core: For dbt models
Installation:
pip install pandas pyarrow sqlalchemy
Instructions
Step 1: Identify Data Source
Supported formats:
- CSV/Parquet files
- SQL tables or queries
- dbt models
Step 2: Analyze Schema
python {baseDir}/scripts/analyze_schema.py --input data/sales.csv
Output:
Detected columns:
Timestamp: 'date' (datetime64)
Target: 'sales' (float64)
Series ID: 'store_id' (object)
Exogenous: price, promotion
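The exact heuristics used by analyze_schema.py are not shown here, but column inference of this kind can be sketched with pandas: datetime-parseable columns become timestamp candidates, numeric columns become target candidates (extras treated as exogenous), and low-cardinality object columns become series IDs. The function name and thresholds below are illustrative assumptions, not the skill's actual implementation.

```python
import pandas as pd

def infer_columns(df: pd.DataFrame) -> dict:
    """Heuristic inference: datetime-like -> timestamp, first numeric ->
    target, remaining numerics -> exogenous, low-cardinality object -> ID."""
    result = {"timestamp": None, "target": None, "id": None, "exogenous": []}
    for col in df.columns:
        s = df[col]
        if pd.api.types.is_datetime64_any_dtype(s):
            result["timestamp"] = result["timestamp"] or col
        elif s.dtype == object and pd.to_datetime(s, errors="coerce").notna().mean() > 0.9:
            # Mostly parseable strings are treated as a timestamp column
            result["timestamp"] = result["timestamp"] or col
        elif pd.api.types.is_numeric_dtype(s):
            if result["target"] is None:
                result["target"] = col
            else:
                result["exogenous"].append(col)
        elif s.nunique() <= max(1, len(s) // 2):
            # Repeating labels suggest a series identifier
            result["id"] = result["id"] or col
    return result
```

On a frame with `date`, `store_id`, `sales`, and `price` columns, this would report `date` as the timestamp, `sales` as the target, `store_id` as the series ID, and `price` as exogenous, matching the sample output above.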
Step 3: Generate Transformation
python {baseDir}/scripts/generate_transform.py \
--input data/sales.csv \
--id_col store_id \
--date_col date \
--target_col sales \
--output data/transform/to_nixtla_schema.py
Step 4: Create Schema Contract
python {baseDir}/scripts/create_contract.py \
--mapping mapping.json \
--output NIXTLA_SCHEMA_CONTRACT.md
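The layout of mapping.json is not documented in this section, so the structure below is a hypothetical example of what such a mapping and a minimal contract renderer could look like; the real create_contract.py may use different keys and produce richer output.

```python
# Hypothetical mapping structure; the real mapping.json layout may differ.
mapping = {
    "source": "data/sales.csv",
    "columns": {"store_id": "unique_id", "date": "ds", "sales": "y"},
    "date_format": "%Y-%m-%d",
}

def render_contract(m: dict) -> str:
    """Render a small Markdown contract from a column mapping."""
    lines = [
        "# Nixtla Schema Contract",
        "",
        f"Source: {m['source']}",
        "",
        "| Source column | Nixtla column |",
        "| --- | --- |",
    ]
    for src, dst in m["columns"].items():
        lines.append(f"| {src} | {dst} |")
    return "\n".join(lines)
```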
Step 5: Validate Transformation
python data/transform/to_nixtla_schema.py
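The specific checks run during validation are not listed here; a plausible sketch of the quality checks (required columns present, ds is datetime, y is numeric and non-null, no duplicate series/timestamp pairs) looks like this. The function name is an assumption for illustration.

```python
import pandas as pd

def validate_nixtla_frame(df: pd.DataFrame) -> list:
    """Return a list of problems; an empty list means the frame looks valid."""
    problems = []
    missing = {"unique_id", "ds", "y"} - set(df.columns)
    if missing:
        problems.append(f"missing columns: {sorted(missing)}")
        return problems
    if not pd.api.types.is_datetime64_any_dtype(df["ds"]):
        problems.append("ds is not datetime")
    if not pd.api.types.is_numeric_dtype(df["y"]):
        problems.append("y is not numeric")
    if df["y"].isna().any():
        problems.append("y contains nulls")
    if df.duplicated(subset=["unique_id", "ds"]).any():
        problems.append("duplicate (unique_id, ds) pairs")
    return problems
```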
Output
- data/transform/to_nixtla_schema.py: Transformation module
- NIXTLA_SCHEMA_CONTRACT.md: Schema documentation
- nixtla_data.csv: Transformed data (optional)
Error Handling
-
Error:
No timestamp column detectedSolution: Specify manually with--date_col -
Error:
Multiple target candidatesSolution: Specify manually with--target_col -
Error:
Date parsing failedSolution: Specify format with--date_format "%Y-%m-%d" -
Error:
Non-numeric target columnSolution: Check for string values, usepd.to_numeric(errors='coerce')
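The last fix above can be applied directly with pandas: coercion turns non-numeric strings into NaN, which can then be inspected or dropped before forecasting.

```python
import pandas as pd

# A target column polluted with a non-numeric string
y = pd.Series(["10", "12.5", "n/a", "7"])

# Coerce: unparseable values become NaN instead of raising
y_clean = pd.to_numeric(y, errors="coerce")

# Flag the rows that failed to parse so they can be reviewed
bad_rows = y_clean.isna()
```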
Examples
Example 1: CSV Transformation
python {baseDir}/scripts/generate_transform.py \
--input sales.csv \
--id_col product_id \
--date_col timestamp \
--target_col revenue
Generated code:
import pandas as pd

def to_nixtla_schema(path="sales.csv"):
    df = pd.read_csv(path)
    # Map source columns to the Nixtla schema
    df = df.rename(columns={
        'product_id': 'unique_id',
        'timestamp': 'ds',
        'revenue': 'y'
    })
    df['ds'] = pd.to_datetime(df['ds'])
    return df[['unique_id', 'ds', 'y']]
Example 2: SQL Source
python {baseDir}/scripts/generate_transform.py \
--sql "SELECT * FROM daily_sales" \
--connection postgresql://localhost/db \
--id_col store_id \
--date_col sale_date \
--target_col amount
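For SQL sources, the generated module would plausibly read the query through a SQLAlchemy engine and apply the same renaming; this is a sketch under that assumption, not the script's verbatim output.

```python
import pandas as pd
from sqlalchemy import create_engine

def to_nixtla_schema(connection="postgresql://localhost/db",
                     query="SELECT * FROM daily_sales"):
    # Open a connection and pull the raw rows
    engine = create_engine(connection)
    df = pd.read_sql(query, engine)
    # Map source columns to the Nixtla schema
    df = df.rename(columns={"store_id": "unique_id",
                            "sale_date": "ds",
                            "amount": "y"})
    df["ds"] = pd.to_datetime(df["ds"])
    return df[["unique_id", "ds", "y"]]
```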
Resources
- Scripts: {baseDir}/scripts/
- Templates: {baseDir}/assets/templates/
- Nixtla Schema Docs: https://nixtla.github.io/statsforecast/
Related Skills:
- nixtla-timegpt-lab: Use transformed data for forecasting
- nixtla-experiment-architect: Reference in experiments