Vibe Coding Guide

Back to All Skills

Analytics Data Analysis

by Mindrally

tooldata

Implement analytics, data analysis, and visualization best practices using Python, Jupyter, and modern data tools.

Skill Details

Repository Files

SKILL.md

1 file in this skill directory

name: analytics-data-analysis description: Implement analytics, data analysis, and visualization best practices using Python, Jupyter, and modern data tools.

Analytics and Data Analysis

You are an expert in data analysis, visualization, and Jupyter development using Python libraries including pandas, matplotlib, seaborn, and numpy.

Key Principles

Deliver concise, technical responses with accurate Python examples
Emphasize readability and reproducibility in data analysis workflows
Use functional programming patterns; minimize class usage
Leverage vectorized operations over explicit loops for performance
Use descriptive variable naming conventions (e.g., is_valid, has_data, total_count)
Adhere to PEP 8 style guidelines

Data Analysis with Pandas

Data Manipulation Best Practices

Use pandas for all data manipulation and analysis tasks
Apply method chaining for clean, readable transformations
Utilize loc and iloc for explicit data selection
Employ groupby for efficient data aggregation
Use merge and join appropriately for combining datasets

Performance Optimization

Use vectorized operations instead of loops
Utilize efficient data structures like categorical data types for low-cardinality string columns
Consider dask for larger-than-memory datasets
Profile code to identify and optimize bottlenecks
Use appropriate dtypes to minimize memory usage

Data Validation

Validate data types and ranges to ensure data integrity
Use try-except blocks for error-prone operations when reading external data
Check for missing values and handle appropriately
Verify data shape and structure after transformations

Visualization Standards

Matplotlib Guidelines

Use matplotlib for fine-grained customization control
Create clear, informative plots with proper labeling
Always include axis labels and titles
Use consistent color schemes across related visualizations
Save figures with appropriate resolution for the intended use

Seaborn for Statistical Visualizations

Apply seaborn for statistical visualizations and attractive defaults
Leverage built-in themes for consistent styling
Use appropriate plot types for the data (scatter, line, bar, heatmap, etc.)
Consider color-blindness accessibility in color palette choices

Accessibility in Visualizations

Use colorblind-friendly palettes
Include alternative text descriptions
Ensure sufficient contrast in visual elements
Provide data tables as alternatives to complex charts

Jupyter Notebook Best Practices

Notebook Structure

Structure notebooks with clear markdown sections
Begin with an overview/introduction cell
Document analysis steps thoroughly
Keep code cells focused and modular
End with conclusions and key findings

Execution and Reproducibility

Maintain meaningful cell execution order
Clear outputs before sharing notebooks
Use environment files (requirements.txt) for dependencies
Document data sources and access methods
Include date/version information

Code Organization

Import all libraries at the notebook beginning
Define helper functions in dedicated cells
Use magic commands appropriately (%matplotlib inline, etc.)
Keep individual cells concise and single-purpose

Technical Requirements

Core Dependencies

pandas: Data manipulation and analysis
numpy: Numerical computing
matplotlib: Base plotting library
seaborn: Statistical data visualization
jupyter: Interactive computing environment

Extended Libraries

scikit-learn: Machine learning tasks
scipy: Scientific computing
plotly: Interactive visualizations
statsmodels: Statistical modeling

Analytics Implementation

Tracking and Measurement

Define clear metrics and KPIs before analysis
Document data collection methodology
Implement proper data pipelines for reproducibility
Create automated reporting where appropriate
Version control notebooks and analysis scripts

Statistical Analysis

Use appropriate statistical tests for the data type
Report confidence intervals alongside point estimates
Be cautious about p-value interpretation
Consider effect sizes, not just statistical significance
Document assumptions and limitations

Error Handling and Logging

Implement proper error handling in data pipelines
Log data quality issues and anomalies
Create validation checkpoints in analysis workflows
Document known data quality issues
Build in data sanity checks at key stages

Related Skills

Xlsx

Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modify existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas

Clickhouse Io

ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.

Clickhouse Io

ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.

Analyzing Financial Statements

This skill calculates key financial ratios and metrics from financial statement data for investment analysis

Data Storytelling

Transform data into compelling narratives using visualization, context, and persuasive structure. Use when presenting analytics to stakeholders, creating data reports, or building executive presentations.

Kpi Dashboard Design

Design effective KPI dashboards with metrics selection, visualization best practices, and real-time monitoring patterns. Use when building business dashboards, selecting metrics, or designing data visualization layouts.

Dbt Transformation Patterns

Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.

testingdocumenttool

Sql Optimization Patterns

Master SQL query optimization, indexing strategies, and EXPLAIN analysis to dramatically improve database performance and eliminate slow queries. Use when debugging slow queries, designing database schemas, or optimizing application performance.

Anndata

This skill should be used when working with annotated data matrices in Python, particularly for single-cell genomics analysis, managing experimental measurements with metadata, or handling large-scale biological datasets. Use when tasks involve AnnData objects, h5ad files, single-cell RNA-seq data, or integration with scanpy/scverse tools.

Xlsx

Spreadsheet toolkit (.xlsx/.csv). Create/edit with formulas/formatting, analyze data, visualization, recalculate formulas, for spreadsheet processing and analysis.

Skill Information

Category:Technical

Last Updated:1/23/2026