Xlsx
by benchflow-ai
Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and pivot tables. When Claude needs to work with spreadsheets (.xlsx files) for: (1) Creating new spreadsheets with data and formatting, (2) Reading or analyzing Excel data with pandas, (3) Creating pivot tables programmatically with openpyxl, (4) Building multi-sheet workbooks with source data and pivot table sheets, or (5) Any Excel file operations
Skill Details
Repository Files
1 file in this skill directory
name: xlsx description: "Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and pivot tables. When Claude needs to work with spreadsheets (.xlsx files) for: (1) Creating new spreadsheets with data and formatting, (2) Reading or analyzing Excel data with pandas, (3) Creating pivot tables programmatically with openpyxl, (4) Building multi-sheet workbooks with source data and pivot table sheets, or (5) Any Excel file operations"
XLSX Creation, Editing, and Analysis
Overview
This skill covers working with Excel files using Python libraries: openpyxl for Excel-specific features (formatting, formulas, pivot tables) and pandas for data analysis.
Reading Data
With pandas (recommended for analysis)
import pandas as pd
df = pd.read_excel('file.xlsx') # First sheet
all_sheets = pd.read_excel('file.xlsx', sheet_name=None) # All sheets as dict
df.head() # Preview
df.describe() # Statistics
With openpyxl (for cell-level access)
from openpyxl import load_workbook
wb = load_workbook('file.xlsx')
ws = wb.active
value = ws['A1'].value
# Read calculated values (not formulas)
wb = load_workbook('file.xlsx', data_only=True)
Creating Excel Files
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignment
wb = Workbook()
ws = wb.active
ws.title = "Data"
# Add data
ws['A1'] = 'Header'
ws.append(['Row', 'of', 'data'])
# Formatting
ws['A1'].font = Font(bold=True)
ws['A1'].fill = PatternFill('solid', start_color='2c3e50')
wb.save('output.xlsx')
Creating Pivot Tables
Pivot tables summarize data by grouping and aggregating. Use openpyxl's pivot table API.
CRITICAL: All pivot tables MUST use cacheId=0. Using any other cacheId (1, 2, etc.) will cause openpyxl to fail when reading the file back with KeyError. This is an openpyxl limitation.
Basic Pivot Table Structure
from openpyxl import Workbook
from openpyxl.pivot.table import TableDefinition, Location, PivotField, DataField, RowColField
from openpyxl.pivot.cache import CacheDefinition, CacheField, CacheSource, WorksheetSource, SharedItems
# 1. Create workbook with source data
wb = Workbook()
data_ws = wb.active
data_ws.title = "SourceData"
# Write your data (with headers in row 1)
data = [
["CategoryName", "ProductName", "Quantity", "Revenue"],
["Beverages", "Chai", 25, 450.00],
["Seafood", "Ikura", 12, 372.00],
# ... more rows
]
for row in data:
data_ws.append(row)
num_rows = len(data)
# 2. Create pivot table sheet
pivot_ws = wb.create_sheet("PivotAnalysis")
# 3. Define the cache (source data reference)
cache = CacheDefinition(
cacheSource=CacheSource(
type="worksheet",
worksheetSource=WorksheetSource(
ref=f"A1:D{num_rows}", # Adjust to your data range
sheet="SourceData"
)
),
cacheFields=[
CacheField(name="CategoryName", sharedItems=SharedItems(count=8)),
CacheField(name="ProductName", sharedItems=SharedItems(count=40)),
CacheField(name="Quantity", sharedItems=SharedItems()),
CacheField(name="Revenue", sharedItems=SharedItems()),
]
)
# 4. Create pivot table definition
pivot = TableDefinition(
name="RevenueByCategory",
cacheId=0, # MUST be 0 for ALL pivot tables - any other value breaks openpyxl
dataCaption="Values",
location=Location(
ref="A3:B10", # Where pivot table will appear
firstHeaderRow=1,
firstDataRow=1,
firstDataCol=1
),
)
# 5. Configure pivot fields (one for each source column)
# Fields are indexed 0, 1, 2, 3 matching cache field order
# Field 0: CategoryName - use as ROW
pivot.pivotFields.append(PivotField(axis="axisRow", showAll=False))
# Field 1: ProductName - not used (just include it)
pivot.pivotFields.append(PivotField(showAll=False))
# Field 2: Quantity - not used
pivot.pivotFields.append(PivotField(showAll=False))
# Field 3: Revenue - use as DATA (for aggregation)
pivot.pivotFields.append(PivotField(dataField=True, showAll=False))
# 6. Add row field reference (index of the field to use as rows)
pivot.rowFields.append(RowColField(x=0)) # CategoryName is field 0
# 7. Add data field with aggregation
pivot.dataFields.append(DataField(
name="Total Revenue",
fld=3, # Revenue is field index 3
subtotal="sum" # Options: sum, count, average, max, min, product, stdDev, var
))
# 8. Attach cache and add to worksheet
pivot.cache = cache
pivot_ws._pivots.append(pivot)
wb.save('output_with_pivot.xlsx')
Common Pivot Table Configurations
Count by Category (Order Count)
# Row field: CategoryName (index 0)
# Data field: Count any column
pivot.rowFields.append(RowColField(x=0)) # CategoryName as rows
pivot.dataFields.append(DataField(
name="Order Count",
fld=1, # Any field works for count
subtotal="count"
))
Sum by Category (Total Revenue)
pivot.rowFields.append(RowColField(x=0)) # CategoryName as rows
pivot.dataFields.append(DataField(
name="Total Revenue",
fld=3, # Revenue field
subtotal="sum"
))
Two-Dimensional Pivot (Rows and Columns)
# CategoryName as rows, Quarter as columns, revenue as values
pivot.pivotFields[0] = PivotField(axis="axisRow", showAll=False) # Category = row
pivot.pivotFields[1] = PivotField(axis="axisCol", showAll=False) # Quarter = col
pivot.pivotFields[3] = PivotField(dataField=True, showAll=False) # Revenue = data
pivot.rowFields.append(RowColField(x=0)) # Category
pivot.colFields.append(RowColField(x=1)) # Quarter
pivot.dataFields.append(DataField(name="Revenue", fld=3, subtotal="sum"))
Multiple Data Fields
# Show both count and sum
pivot.dataFields.append(DataField(name="Order Count", fld=2, subtotal="count"))
pivot.dataFields.append(DataField(name="Total Revenue", fld=3, subtotal="sum"))
Pivot Table Field Configuration Reference
| axis value | Meaning |
|---|---|
"axisRow" |
Field appears as row labels |
"axisCol" |
Field appears as column labels |
"axisPage" |
Field is a filter/slicer |
| (none) | Field not used for grouping |
| subtotal value | Aggregation |
|---|---|
"sum" |
Sum of values |
"count" |
Count of items |
"average" |
Average/mean |
"max" |
Maximum value |
"min" |
Minimum value |
"product" |
Product of values |
"stdDev" |
Standard deviation |
"var" |
Variance |
Important Notes
- CRITICAL - cacheId must be 0: ALL pivot tables must use
cacheId=0. Using 1, 2, etc. will causeKeyErrorwhen reading the file - Field indices must match: The order of
cacheFieldsmust match your source data columns - Location ref: The
refin Location is approximate - Excel will adjust when opened - Cache field count: For categorical fields, set
countparameter to approximate number of unique values - Multiple pivots: Each pivot table needs its own sheet for clarity
- Values populate on open: Pivot table values are calculated when the file is opened in Excel/LibreOffice, not when created
Working with Existing Excel Files
from openpyxl import load_workbook
wb = load_workbook('existing.xlsx')
ws = wb['SheetName']
# Modify
ws['A1'] = 'New Value'
ws.insert_rows(2)
# Add new sheet
new_ws = wb.create_sheet('Analysis')
wb.save('modified.xlsx')
Best Practices
- Use pandas for data manipulation, openpyxl for Excel features
- Match field indices carefully when creating pivot tables
- Test with small data first before scaling up
- Name your sheets clearly (e.g., "SourceData", "RevenueByCategory")
- Document your pivot table structure with comments in code
Related Skills
Xlsx
Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modify existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas
Clickhouse Io
ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.
Clickhouse Io
ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.
Analyzing Financial Statements
This skill calculates key financial ratios and metrics from financial statement data for investment analysis
Data Storytelling
Transform data into compelling narratives using visualization, context, and persuasive structure. Use when presenting analytics to stakeholders, creating data reports, or building executive presentations.
Kpi Dashboard Design
Design effective KPI dashboards with metrics selection, visualization best practices, and real-time monitoring patterns. Use when building business dashboards, selecting metrics, or designing data visualization layouts.
Dbt Transformation Patterns
Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.
Sql Optimization Patterns
Master SQL query optimization, indexing strategies, and EXPLAIN analysis to dramatically improve database performance and eliminate slow queries. Use when debugging slow queries, designing database schemas, or optimizing application performance.
Anndata
This skill should be used when working with annotated data matrices in Python, particularly for single-cell genomics analysis, managing experimental measurements with metadata, or handling large-scale biological datasets. Use when tasks involve AnnData objects, h5ad files, single-cell RNA-seq data, or integration with scanpy/scverse tools.
Xlsx
Spreadsheet toolkit (.xlsx/.csv). Create/edit with formulas/formatting, analyze data, visualization, recalculate formulas, for spreadsheet processing and analysis.
