Viz, by robdmc
---
name: viz
description: Data visualization and inspection skill. Use for (1) creating matplotlib/seaborn plots from data files or marimo notebooks, or (2) inspecting DataFrames by showing first N rows, column names, and dtypes. For plots, provide chart type, data context, and styling. For inspection, ask to "show" or "display" the data.
allowed-tools: Read, Glob(/tmp/viz/), Grep(/tmp/viz/), Bash(python /Users/rob/.claude/skills/viz/viz_runner.py:*)
---
# Viz Skill: Data Visualization and Inspection
## Purpose

This skill directly executes visualizations. The calling agent provides a visualization spec along with data context, and the skill:

- Infers the data loading code from the provided context
- Generates the complete plotting script
- Executes it via the `viz_runner.py` helper
- Returns artifact paths for the caller to reference

**Key pattern:**

```
Caller (with data context) → Skill (infers data loading, generates script, executes) → Plot appears
```

The caller does NOT need to write any execution code. The skill handles everything.
## Input Specification

The calling agent should provide:

### Required

- **Visualization spec:** What to plot (chart type, axes, title, special features)

### Data Context (one of these forms)

- **Database + query:** "Data from `/full/path/to/operational_forecast.ddb`, table `forecast`, columns month, members"
- **SQL query:** "Run this SQL: `SELECT * FROM forecast WHERE year >= 2024`"
- **Code snippet:** "Load data like this: `df = pd.read_parquet('/full/path/to/data.parquet')`"
- **File path:** "CSV at `/tmp/data.csv` with columns X, Y, Z"
### CRITICAL: Absolute Paths Required

The `viz_runner.py` helper executes scripts from `/tmp/viz/`, NOT the caller's working directory. All file paths in generated scripts MUST be absolute. The calling agent should:

- Determine the absolute path to any data files before invoking the skill
- Pass the full absolute path in the data context
- Never use relative paths like `./data.ddb` or `data.parquet`

Example - WRONG:

```python
con = duckdb.connect('operational_forecast.ddb')  # Will fail!
```

Example - CORRECT:

```python
con = duckdb.connect('/Users/rob/projects/forecast/operational_forecast.ddb')
```
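Before invoking the skill, the caller can normalize any path it holds; a minimal sketch using `pathlib` (the helper name is illustrative, not part of the skill):

```python
from pathlib import Path

def to_absolute(path_str: str) -> str:
    """Resolve a possibly-relative path against the current working directory."""
    return str(Path(path_str).expanduser().resolve())

# './data.ddb' becomes an absolute path rooted at the caller's cwd;
# an already-absolute path passes through unchanged.
db_path = to_absolute('./data.ddb')
```

`resolve()` also collapses `..` segments and follows symlinks, so the resulting string is safe to embed in a generated script that runs from `/tmp/viz/`.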
### Optional

- **Suggested ID:** A name hint (e.g., `pop_bar`, `churn_trend`). The runner ensures uniqueness.
## Intent Detection

Before generating any code, analyze the user's request to determine the appropriate mode.

### Inspection Mode (use --show)

Use when the user wants to see the data itself, not a visualization:

- "Show me the dataframe"
- "Display the first N rows"
- "What does the data look like?"
- "Print the data"
- "What columns are in X?"
- "Inspect the data"
- "Let me see the data"

**Action:** Use the `--show` flag. Do NOT generate plot code.
### Visualization Mode (generate plot)

Use when the user wants a chart, graph, or visual representation:

- "Plot the data"
- "Create a chart of..."
- "Visualize the trend"
- "Show a graph of..."
- "Bar chart showing..."
- "Scatter plot of..."

**Action:** Generate matplotlib/seaborn code and pass it via stdin.
### Ambiguous Requests

If unclear (e.g., "show me X over time"), default to asking or interpret based on context:

- If the request mentions chart types (bar, line, scatter) → visualization
- If the request is about structure/columns/rows → inspection
- When in doubt, use `--show` first (it's cheaper), then offer to plot
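These heuristics can be condensed into a small keyword check; the function and word lists below are illustrative, not part of the skill:

```python
PLOT_WORDS = {'plot', 'chart', 'graph', 'visualize', 'bar', 'line', 'scatter', 'histogram'}
INSPECT_WORDS = {'show', 'display', 'columns', 'rows', 'inspect', 'print', 'dtypes'}

def detect_mode(request: str) -> str:
    """Return 'plot', 'show', or 'ambiguous' based on keywords in the request."""
    words = set(request.lower().replace('?', ' ').split())
    if words & PLOT_WORDS:
        return 'plot'        # chart-type words win, per the rule above
    if words & INSPECT_WORDS:
        return 'show'
    return 'ambiguous'       # when in doubt, --show is the cheaper default
```

A real implementation would need fuzzier matching (stemming, phrases like "what does ... look like"), but the precedence order mirrors the rules above.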
## Artifact Management

All artifacts are managed in `/tmp/viz/` via the helper script.

### Helper: viz_runner.py

```bash
python /Users/rob/.claude/skills/viz/viz_runner.py [--id NAME] [--desc "Description"] << 'EOF'
<generated script>
EOF
```

The runner:

- Creates `/tmp/viz/` if needed
- Ensures ID uniqueness (appends `_2`, `_3`, etc. on collision)
- Injects `plt.savefig('/tmp/viz/<id>.png', dpi=150, bbox_inches='tight')` before `plt.show()`
- Writes the script to `/tmp/viz/<id>.py`
- Executes the script
- Writes metadata to `/tmp/viz/<id>.json`
- Prints human-readable results to stdout
### Output Format

Terminal output:

```
Plot: pop_bar
  "Bar chart of members by month"
  png: /tmp/viz/pop_bar.png
```

Sidecar JSON (`/tmp/viz/<id>.json`):

```json
{
  "id": "pop_bar",
  "desc": "Bar chart of members by month",
  "png": "/tmp/viz/pop_bar.png",
  "script": "/tmp/viz/pop_bar.py",
  "created": "2025-01-22T11:31:00",
  "pid": 46368
}
```

The caller can then:

- Read the PNG into context to discuss the plot
- Reference the script for modifications
- Look up plots by ID or description via the JSON metadata
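Lookup by ID or description amounts to scanning the sidecar JSON files; a hedged sketch (the substring-matching strategy is an assumption, not the runner's logic):

```python
import json
from pathlib import Path

def find_plots(query: str, viz_dir: Path = Path('/tmp/viz')) -> list:
    """Return metadata dicts whose id or description contains the query."""
    hits = []
    for meta_path in sorted(viz_dir.glob('*.json')):
        meta = json.loads(meta_path.read_text())
        haystack = f"{meta.get('id', '')} {meta.get('desc', '')}".lower()
        if query.lower() in haystack:
            hits.append(meta)
    return hits
```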
### List

To see all available visualizations:

```bash
python /Users/rob/.claude/skills/viz/viz_runner.py --list
```

Output:

```
ID              Description                           Created
--------------  -----------------------------------   ----------------
pop_bar         Bar chart of members by month         2025-01-22 11:31
churn_trend     Monthly churn rate                    2025-01-22 10:45
test_scatter    -                                     2025-01-22 09:20
```
### Cleanup

To remove all visualization files from `/tmp/viz/`:

```bash
python /Users/rob/.claude/skills/viz/viz_runner.py --clean
```

Output:

```
Cleaned 12 files from /tmp/viz
```
## Skill Workflow

1. **Infer data loading:** From the provided context, generate Python code to load/create the DataFrame. Use absolute paths for all file references - the script runs from `/tmp/viz/`, not the caller's directory.
2. **Generate visualization:** Add matplotlib/seaborn code for the requested plot.
3. **Execute via runner** (always include `--desc` with a short summary):

   ```bash
   python /Users/rob/.claude/skills/viz/viz_runner.py --id suggested_name --desc "Short description of plot" << 'EOF'
   <complete script>
   EOF
   ```

4. **Parse output:** Capture the ID and paths from stdout.
5. **Return to caller:** Report the final ID and paths. Do NOT read the PNG into context unless the user needs analysis.
## Library Selection

### Use Seaborn When:

- Statistical distributions (histogram + KDE, violin, box plots)
- Regression with confidence intervals
- Categorical comparisons with error bars
- Heatmaps and correlation matrices

### Use Matplotlib When:

- Fine-grained control over appearance
- Time series with date formatting
- Custom annotations and reference lines
- Simple plots without statistical features

### Combine Both:

Use seaborn for the statistical plot, matplotlib for customizations like reference lines.
## Publication Quality Standards

- **Labels:** Descriptive axis labels with units, 12pt+ font
- **Titles:** Clear, informative, 14pt+ font
- **Figure size:** `figsize=(10, 6)` or an appropriate aspect ratio
- **Layout:** Always use `tight_layout()` to prevent clipping
- **Grids:** Subtle guidance with `alpha=0.3`
- **Colors:** Colorblind-friendly palettes (viridis, coolwarm, Set2)
- **Transparency:** Alpha for overlapping points
- **Imports:** Inside the script for self-contained execution
## End-to-End Example

Request from caller:

```
/viz id=pop_bar
bar chart showing total_initial_members and total_final_members by month
with dashed vertical line at history/forecast boundary (Dec 2025 / Jan 2026).
Data from operational_forecast.ddb, forecast table.
```
Skill generates and executes:

```bash
python /Users/rob/.claude/skills/viz/viz_runner.py --id pop_bar --desc "Bar chart of members by month with forecast boundary" << 'EOF'
import duckdb
import matplotlib.pyplot as plt
import numpy as np

# Load data from DuckDB (MUST use absolute path!)
con = duckdb.connect('/Users/rob/projects/forecast/operational_forecast.ddb', read_only=True)
df = con.execute("""
    SELECT month, total_initial_members, total_final_members
    FROM forecast
    ORDER BY month
""").df()
con.close()

# Create grouped bar chart
fig, ax = plt.subplots(figsize=(12, 6))
x = np.arange(len(df))
width = 0.35
bars1 = ax.bar(x - width/2, df['total_initial_members'], width, label='Initial Members', color='steelblue')
bars2 = ax.bar(x + width/2, df['total_final_members'], width, label='Final Members', color='coral')

# History/forecast boundary
boundary_idx = df[df['month'] == '2025-12'].index[0] + 0.5
ax.axvline(x=boundary_idx, color='gray', linestyle='--', linewidth=1.5, label='Forecast Start')

ax.set_xlabel('Month', fontsize=12)
ax.set_ylabel('Members', fontsize=12)
ax.set_title('Member Population by Month: Historical vs Forecast', fontsize=14)
ax.set_xticks(x)
ax.set_xticklabels(df['month'], rotation=45, ha='right')
ax.legend()
ax.grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.show()
EOF
```
Runner output:

```
Plot: pop_bar
  "Bar chart of members by month with forecast boundary"
  png: /tmp/viz/pop_bar.png
```

Skill returns to caller:

Plot generated successfully.

- ID: `pop_bar`
- Script: `/tmp/viz/pop_bar.py`
- PNG: `/tmp/viz/pop_bar.png`
## Important: Do NOT Auto-Read PNGs

Do NOT automatically read the PNG into context after generating a plot.

Reading images consumes significant context tokens and is usually unnecessary. The plot window opens automatically via `plt.show()`, so the user can already see the visualization.

Only read the PNG into context when:

- The user explicitly asks you to analyze or interpret the graph
- The user asks questions about what the graph shows
- You need to learn something from the visual output to answer a question

Instead of reading the PNG, offer to open it:

```bash
open /tmp/viz/pop_bar.png  # macOS
```

This displays the image in the system viewer without consuming context tokens.
## Refinement Workflow

When refining an existing plot:

1. Caller provides the existing script path plus the requested changes
2. Skill reads the script and applies the modifications
3. Executes with a new ID (e.g., `pop_bar_2`)
4. Both versions remain available for comparison
## Regeneration

When a user asks to regenerate an existing plot (e.g., after the data has changed):

### By ID

Request: "regenerate pop_bar"

Run the saved script directly:

```bash
python /tmp/viz/pop_bar.py
```

The script already contains the hardcoded savefig path, so it overwrites the existing PNG.

### By Description

Request: "regenerate the churn plot"

1. Run `--list` to find the matching plot
2. Identify the ID from the description
3. Run `python /tmp/viz/<id>.py`

### Ambiguous Request

Request: "regenerate a plot"

1. Run `--list` to show the available plots
2. Ask the user which one to regenerate
3. Run the selected script

### Key Point

Regeneration does NOT require `viz_runner.py` - the saved `.py` scripts are self-contained and can be executed directly with `python`.
## Interactive Backend Note

Generated scripts use `plt.show()`, which works with the `macosx` backend for interactive display. The injected `savefig()` ensures a PNG copy is always saved before display.
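On non-macOS or display-less machines the `macosx` backend is unavailable; backend choice can be reduced to a small pure function. This is an illustrative sketch, not part of the runner:

```python
def pick_backend(display, platform):
    """Choose an interactive matplotlib backend when possible, else Agg.

    display  -- value of the DISPLAY environment variable (or None)
    platform -- sys.platform string, e.g. 'darwin' or 'linux'
    """
    if platform == 'darwin':
        return 'macosx'  # native macOS backend needs no DISPLAY
    if display:
        return 'TkAgg'   # an X display is available
    return 'Agg'         # headless: plt.show() becomes a no-op, savefig still works
```

Because the injected `savefig()` runs before `plt.show()`, the PNG artifact is produced even under `Agg`, where `show()` does nothing.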
## Marimo Notebook Support

The viz skill can extract data from marimo notebooks and generate plots without modifying the original notebook.

### How It Works

1. Copy the notebook to `/tmp/viz/<id>.py`
2. Analyze dependencies to identify the cells needed for the target data
3. Prune unneeded cells from the copied notebook
4. Inject the plotting code as a new cell at the end
5. Execute via subprocess with `cwd` set to the original notebook's directory (so relative paths work)
### CLI Interface

```bash
python /Users/rob/.claude/skills/viz/viz_runner.py \
  --marimo \
  --notebook /path/to/notebook.nb.py \
  --target-var df_forecast \
  --id forecast_plot \
  --desc "Monthly forecast visualization" \
  << 'EOF'
# Plotting code that uses df_forecast
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot(df_forecast['date'], df_forecast['total_final_members'])
plt.show()
EOF
```

### Parameters

- `--marimo`: Enable marimo notebook mode (required)
- `--notebook`: Path to the marimo notebook file (required)
- `--target-var`: Variable to extract from the notebook (required)
- `--target-line`: Optional line number for capturing intermediate state (for mutated variables)
- `--id`: Suggested ID for the visualization (optional)
- `--desc`: Description of the visualization (optional)
- `--show`: Show mode - print dataframe info to the console instead of plotting (no stdin required)
- `--rows`: Number of rows to display in show mode (default: 5)
### Dependency Analysis

Marimo notebooks encode dependencies explicitly:

- Cell parameters = variables the cell reads (refs)
- Cell return tuple = variables the cell defines (defs)

The skill walks backwards from the target variable through the dependency graph to find all required cells.
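The backward walk amounts to reverse reachability over a refs/defs graph; a sketch with a hypothetical cell representation (the real skill parses this structure out of the notebook source):

```python
def required_cells(cells: dict, target_var: str) -> set:
    """cells maps cell_id -> {'refs': set, 'defs': set}.
    Return the set of cell ids needed to compute target_var."""
    # Invert defs so each variable points at the cell that defines it.
    def_of = {v: cid for cid, c in cells.items() for v in c['defs']}
    needed, stack = set(), [target_var]
    while stack:
        var = stack.pop()
        cid = def_of.get(var)
        if cid is None or cid in needed:
            continue          # external variable, or cell already visited
        needed.add(cid)
        stack.extend(cells[cid]['refs'])  # keep walking backwards
    return needed
```

Cells outside the returned set can be pruned from the copied notebook before the plotting cell is injected.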
### Target Line (Advanced)

When a variable is mutated within a cell, use `--target-line` to capture intermediate state:

```python
@app.cell
def _(raw_data):
    df = raw_data.copy()          # line 45
    df = df[df['value'] > 0]      # line 46 - filtered
    df = df.groupby('cat').sum()  # line 47 - aggregated
    return (df,)
```

Use `--target-var df --target-line 46` to capture `df` after filtering but before aggregation.
### Show Mode (Data Inspection)

Use `--show` to print dataframe info to the console instead of generating a plot. Useful for quickly inspecting data at a specific point in the notebook pipeline.

```bash
python /Users/rob/.claude/skills/viz/viz_runner.py \
  --marimo \
  --notebook /path/to/notebook.nb.py \
  --target-var df \
  --show \
  --rows 10
```

Output:

```
Shape: (12345, 5)
Columns: ['date', 'profile_id', 'kind', 'state', 'channel_type']
Dtypes:
date            datetime64[ns]
profile_id               int64
kind                    object
state                   object
channel_type            object

First 10 rows:
        date  profile_id     kind state channel_type
0 2021-01-01      123456  monthly    CA      organic
1 2021-01-02      123457  monthly    TX         paid
...
```

No stdin (plot code) is required for show mode - it only prints dataframe metadata and contents.
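The report above maps onto standard pandas introspection calls; a sketch of how such output might be assembled (not the runner's actual code):

```python
import pandas as pd

def show_df(df: pd.DataFrame, rows: int = 5) -> str:
    """Build a plain-text inspection report: shape, columns, dtypes, head."""
    lines = [
        f'Shape: {df.shape}',
        f'Columns: {list(df.columns)}',
        'Dtypes:',
        df.dtypes.to_string(),
        f'First {rows} rows:',
        df.head(rows).to_string(),
    ]
    return '\n'.join(lines)
```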
### Example Workflow

User request: "Plot the member forecast over time from the operational forecast notebook"

Agent workflow:

1. Read the notebook to identify candidate variables
2. Ask clarifying questions if multiple candidates exist
3. Execute:

```bash
python /Users/rob/.claude/skills/viz/viz_runner.py \
  --marimo \
  --notebook /Users/rob/repos/project/forecast.nb.py \
  --target-var df_deliverable \
  --id member_forecast \
  --desc "Historical and forecast members" \
  << 'EOF'
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(df_deliverable['date'], df_deliverable['total_final_members'])
ax.set_xlabel('Date')
ax.set_ylabel('Members')
ax.set_title('Member Population Over Time')
plt.tight_layout()
plt.show()
EOF
```
### Important Notes

- The original notebook is never modified (read-only access)
- All work happens on a copy in `/tmp/viz/`
- The script runs with the notebook's directory as cwd, so relative file paths work
- Uses `uv run python` if the notebook directory contains `pyproject.toml` or `uv.lock`
## Related Skills

### Xlsx

Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. Use when working with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc.) for: (1) creating new spreadsheets with formulas and formatting, (2) reading or analyzing data, (3) modifying existing spreadsheets while preserving formulas, (4) data analysis and visualization in spreadsheets, or (5) recalculating formulas.

### Clickhouse Io

ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.
### Analyzing Financial Statements

This skill calculates key financial ratios and metrics from financial statement data for investment analysis.

### Data Storytelling

Transform data into compelling narratives using visualization, context, and persuasive structure. Use when presenting analytics to stakeholders, creating data reports, or building executive presentations.

### Team Composition Analysis

This skill should be used when the user asks to "plan team structure", "determine hiring needs", "design org chart", "calculate compensation", "plan equity allocation", or requests organizational design and headcount planning for a startup.

### Startup Financial Modeling

This skill should be used when the user asks to "create financial projections", "build a financial model", "forecast revenue", "calculate burn rate", "estimate runway", "model cash flow", or requests 3-5 year financial planning for a startup.

### Kpi Dashboard Design

Design effective KPI dashboards with metrics selection, visualization best practices, and real-time monitoring patterns. Use when building business dashboards, selecting metrics, or designing data visualization layouts.

### Dbt Transformation Patterns

Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.

### Startup Metrics Framework

This skill should be used when the user asks about "key startup metrics", "SaaS metrics", "CAC and LTV", "unit economics", "burn multiple", "rule of 40", "marketplace metrics", or requests guidance on tracking and optimizing business performance metrics.
