Xlsx
by srsubramanian
Recalculation script for verifying formula results
Skill Details
Repository Files
5 files in this skill directory
name: xlsx description: "Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization" category: data version: "1.0.0" license: Based on Anthropic's xlsx skill (https://github.com/anthropics/skills)
Tool configurations - this skill has no gated tools (uses general code execution)
tool_configs: []
Resource configurations - Level 3 content
resource_configs:
-
name: examples description: "Detailed code examples for common spreadsheet operations" file: examples.md
-
name: formatting description: "Advanced formatting guide including styles, conditional formatting, and charts" file: formatting.md
-
name: recalc description: "Recalculation script for verifying formula results" file: recalc.md
XLSX Skill
Core Purpose
Create, edit, and analyze spreadsheets (.xlsx, .xlsm, .csv, .tsv) with comprehensive support for formulas, formatting, data analysis, and visualization.
Requirements for All Excel Files
Zero Formula Errors
Every Excel model MUST be delivered with ZERO formula errors:
#REF!- Invalid cell references#DIV/0!- Division by zero#VALUE!- Wrong data type in formula#N/A- Value not available#NAME?- Unrecognized formula name
Preserve Existing Templates
When modifying files with established patterns:
- Study and EXACTLY match existing format, style, and conventions
- Never impose standardized formatting on files with established patterns
- Existing template conventions ALWAYS override these guidelines
Financial Model Standards
Color Coding Conventions
- Blue text (RGB: 0,0,255): Hardcoded inputs, numbers users will change
- Black text (RGB: 0,0,0): ALL formulas and calculations
- Green text (RGB: 0,128,0): Links from other worksheets in same workbook
- Red text (RGB: 255,0,0): External links to other files
- Yellow background (RGB: 255,255,0): Key assumptions needing attention
Number Formatting Rules
- Years: Format as text strings ("2024" not "2,024")
- Currency: Use $#,##0 format; specify units in headers ("Revenue ($mm)")
- Zeros: Format to display as "-"
- Percentages: Default to 0.0% format (one decimal)
- Multiples: Format as 0.0x for valuation multiples
- Negative numbers: Use parentheses (123) not minus -123
CRITICAL: Use Formulas, Not Hardcoded Values
Always use Excel formulas instead of calculating values in Python. This keeps spreadsheets dynamic and updateable.
Wrong - Hardcoding:
total = df['Sales'].sum()
sheet['B10'] = total # Hardcodes value
Correct - Using Formulas:
sheet['B10'] = '=SUM(B2:B9)' # Excel calculates
Tool Selection
- pandas: Best for data analysis, bulk operations, simple data export
- openpyxl: Best for complex formatting, formulas, Excel-specific features
Common Workflow
- Choose tool: pandas for data, openpyxl for formulas/formatting
- Create/Load: Create new workbook or load existing file
- Modify: Add/edit data, formulas, and formatting
- Save: Write to file
- Recalculate: Run
python recalc.py output.xlsx(MANDATORY for formulas) - Verify: Check recalc.py output for errors, fix any found
Quick Code Examples
Reading with pandas:
import pandas as pd
df = pd.read_excel('file.xlsx')
all_sheets = pd.read_excel('file.xlsx', sheet_name=None)
Creating with openpyxl:
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignment
wb = Workbook()
sheet = wb.active
sheet['A1'] = 'Header'
sheet['B2'] = '=SUM(A1:A10)' # Use formulas!
sheet['A1'].font = Font(bold=True, color='FF0000')
wb.save('output.xlsx')
Editing existing files:
from openpyxl import load_workbook
wb = load_workbook('existing.xlsx')
sheet = wb.active
sheet['A1'] = 'New Value'
wb.save('modified.xlsx')
Formula Verification Checklist
- Test 2-3 sample references before building full model
- Verify column mapping (column 64 = BL, not BK)
- Remember Excel rows are 1-indexed
- Check for NaN values with
pd.notna() - Handle division by zero in formulas
- Test formulas on 2-3 cells before applying broadly
Best Practices
- Cell indices are 1-based in openpyxl
- Use
data_only=Trueto read calculated values (WARNING: saves lose formulas) - For large files: Use
read_only=Trueorwrite_only=True - Write minimal, concise Python code
- Add comments to cells with complex formulas
Related Skills
Attack Tree Construction
Build comprehensive attack trees to visualize threat paths. Use when mapping attack scenarios, identifying defense gaps, or communicating security risks to stakeholders.
Grafana Dashboards
Create and manage production Grafana dashboards for real-time visualization of system and application metrics. Use when building monitoring dashboards, visualizing metrics, or creating operational observability interfaces.
Matplotlib
Foundational plotting library. Create line plots, scatter, bar, histograms, heatmaps, 3D, subplots, export PNG/PDF/SVG, for scientific visualization and publication figures.
Scientific Visualization
Create publication figures with matplotlib/seaborn/plotly. Multi-panel layouts, error bars, significance markers, colorblind-safe, export PDF/EPS/TIFF, for journal-ready scientific plots.
Seaborn
Statistical visualization. Scatter, box, violin, heatmaps, pair plots, regression, correlation matrices, KDE, faceted plots, for exploratory analysis and publication figures.
Shap
Model interpretability and explainability using SHAP (SHapley Additive exPlanations). Use this skill when explaining machine learning model predictions, computing feature importance, generating SHAP plots (waterfall, beeswarm, bar, scatter, force, heatmap), debugging models, analyzing model bias or fairness, comparing models, or implementing explainable AI. Works with tree-based models (XGBoost, LightGBM, Random Forest), deep learning (TensorFlow, PyTorch), linear models, and any black-box model
Pydeseq2
Differential gene expression analysis (Python DESeq2). Identify DE genes from bulk RNA-seq counts, Wald tests, FDR correction, volcano/MA plots, for RNA-seq analysis.
Query Writing
For writing and executing SQL queries - from simple single-table queries to complex multi-table JOINs and aggregations
Pydeseq2
Differential gene expression analysis (Python DESeq2). Identify DE genes from bulk RNA-seq counts, Wald tests, FDR correction, volcano/MA plots, for RNA-seq analysis.
Scientific Visualization
Meta-skill for publication-ready figures. Use when creating journal submission figures requiring multi-panel layouts, significance annotations, error bars, colorblind-safe palettes, and specific journal formatting (Nature, Science, Cell). Orchestrates matplotlib/seaborn/plotly with publication styles. For quick exploration use seaborn or plotly directly.
