Pareto Analysis

by ddunnock

artdata

Conduct Pareto Analysis (80/20 Rule) to identify the vital few causes driving the majority of problems. Guides data collection, category definition, chart creation, cumulative percentage calculation, and prioritization. Generates professional Pareto charts (SVG) and HTML reports with quality scoring. Use when prioritizing defects, complaints, failures, or improvement opportunities; when user mentions "Pareto", "80/20 rule", "vital few", "trivial many", "prioritization", or needs to identify whic

Skill Details

Repository Files

14 files in this skill directory


name: pareto-analysis description: Conduct Pareto Analysis (80/20 Rule) to identify the vital few causes driving the majority of problems. Guides data collection, category definition, chart creation, cumulative percentage calculation, and prioritization. Generates professional Pareto charts (SVG) and HTML reports with quality scoring. Use when prioritizing defects, complaints, failures, or improvement opportunities; when user mentions "Pareto", "80/20 rule", "vital few", "trivial many", "prioritization", or needs to identify which factors contribute most to a problem. tools: Read, Bash model: sonnet color: green field: quality expertise: intermediate

Pareto Analysis (80/20 Rule)

Systematically identify and prioritize the "vital few" causes that contribute to the majority of problems. Based on the Pareto Principle: roughly 80% of effects come from 20% of causes.

Integration with Other RCCA Tools

Pareto Analysis provides prioritization - identifying which problems or causes deserve attention first. Typical integration:

  1. Pareto → Fishbone → 5 Whys: Prioritize with Pareto, brainstorm causes with Fishbone, drill into root causes with 5 Whys
  2. Problem Definition → Pareto → Root Cause Tools: Define scope, prioritize focus areas, investigate top contributors
  3. DMAIC Measure Phase: Pareto charts establish baseline and identify improvement targets

Workflow Overview

5 Phases (Q&A-driven):

  1. Problem Scoping → Define what you're measuring and why
  2. Data Collection → Gather frequency/cost/impact data by category
  3. Chart Construction → Build Pareto chart with cumulative line
  4. Analysis & Interpretation → Identify vital few, validate 80/20 pattern
  5. Documentation → Generate chart and report

Phase 1: Problem Scoping

Goal: Establish clear measurement objective and categories.

Ask the user:

What problem or outcome are you trying to prioritize or analyze?

Examples:

  • "Customer complaints by type"
  • "Defects by category"
  • "Downtime by cause"
  • "Errors by department"

Then clarify:

What will you measure for each category?

Common measurements:

  • Frequency: Count of occurrences
  • Cost: Dollar impact per category
  • Time: Duration or delay per category
  • Severity: Weighted score (frequency × impact)

Quality Gate: Problem scope must:

  • Define a specific, measurable outcome
  • Identify the measurement type (frequency, cost, time, or weighted)
  • Have clear business relevance

Phase 2: Data Collection

Goal: Gather accurate, representative data by category.

Ask the user to provide data or guide collection:

Please provide your data in one of these formats:

Option A - Direct entry:

Category Count/Value
Category A 45
Category B 30
... ...

Option B - Raw incident list: Provide a list of incidents with their categories, and I'll tabulate them.

Option C - Describe the data source: Tell me where the data comes from, and I'll help you structure it.

Data Quality Checks:

  • Representative time period (not too short to miss patterns)
  • Consistent category definitions (no overlaps)
  • Sufficient sample size (minimum 30-50 data points recommended)
  • Categories follow MECE principle (Mutually Exclusive, Collectively Exhaustive)

Category Guidelines (see references/category-guidelines.md):

  • Keep categories to 7-10 maximum
  • Use an "Other" category sparingly (should not exceed 10% of total)
  • Categories should be actionable (low enough in causal chain to address)

Phase 3: Chart Construction

Goal: Build the Pareto chart with calculations.

Once data is collected, calculate:

  1. Sort categories by count/value in descending order
  2. Calculate percentage for each: (Category Value / Total) × 100
  3. Calculate cumulative percentage: Running sum of percentages
  4. Identify cutoff: Categories contributing to ≥80% cumulative

Run the calculation script:

python3 scripts/calculate_pareto.py --input data.json

Or provide data directly and I'll calculate:

  • Sort descending
  • Compute percentages
  • Compute cumulative percentages
  • Mark the 80% threshold

Output Structure:

Category | Count | % | Cumulative %
---------|-------|---|-------------
Defect A |   45  | 36% |    36%
Defect B |   30  | 24% |    60%     ← Vital few boundary
Defect C |   20  | 16% |    76%
Defect D |   15  | 12% |    88%     ← 80% threshold crossed
Defect E |   10  |  8% |    96%
Other    |    5  |  4% |   100%
---------|-------|-----|------------
TOTAL    |  125  |100% |

Phase 4: Analysis & Interpretation

Goal: Extract actionable insights from the Pareto chart.

Evaluate the analysis against these criteria:

Pattern Recognition

Strong Pareto Effect (steep cumulative curve):

  • Few categories (2-3) account for ≥80% of impact
  • Clear prioritization opportunity
  • Focus improvement efforts on vital few

Weak/No Pareto Effect (gradual cumulative curve):

  • Many categories contribute similar amounts
  • May indicate:
    • Wrong categorization level (too granular or too broad)
    • Truly distributed problem (no dominant causes)
    • Need to weight by severity, not just frequency

Validation Questions

Ask the user:

Looking at this Pareto analysis:

  1. Do the top categories (vital few) align with your intuition about the biggest problems?
  2. Are there any categories that should be split or combined?
  3. Should we apply weighting (e.g., severity × frequency) for more meaningful prioritization?
  4. What's the cost/effort to address each of the vital few?

Weighted Pareto (Optional)

If categories have unequal severity, apply weights:

Weighted Score = Frequency × Severity Weight

Then recalculate Pareto on weighted scores.

Phase 5: Documentation

Goal: Generate professional outputs.

Generate the Pareto chart:

python3 scripts/generate_chart.py --input data.json --output pareto_chart.svg

Generate the HTML report:

python3 scripts/generate_report.py --input data.json --output pareto_report.html

Report Contents

  • Problem statement and scope
  • Data collection period and sources
  • Pareto chart (SVG embedded)
  • Data table with calculations
  • Vital few identification
  • Recommendations for next steps
  • Quality score

Quality Scoring

See references/quality-rubric.md for detailed scoring criteria.

6 Dimensions (100 points total):

Dimension Weight Focus
Problem Clarity 15% Clear scope, measurement type, business relevance
Data Quality 25% Representative, sufficient, consistent categories
Category Design 20% MECE, actionable, appropriate granularity
Calculation Accuracy 15% Correct sorting, percentages, cumulative line
Pattern Interpretation 15% Valid conclusions from cumulative curve
Actionability 10% Clear next steps, linked to improvement actions

Passing threshold: 70 points

Common Pitfalls

See references/common-pitfalls.md for detailed descriptions.

  1. Flat histogram - No dominant categories; may need recategorization
  2. Large "Other" category - Obscures potentially important causes
  3. Frequency-only focus - Ignoring cost, severity, or effort to fix
  4. Insufficient data - Too short a period or too few observations
  5. Overlapping categories - Violates MECE principle
  6. Assuming 80/20 is exact - The ratio varies; focus on the pattern
  7. Stopping at Pareto - Chart identifies priorities but not root causes

Examples

See references/examples.md for worked examples:

  1. Manufacturing defects prioritization
  2. Customer complaint analysis
  3. IT incident categorization
  4. Cost reduction opportunity identification

Session Conduct Guidelines

  1. Validate categories early - Poor categories doom the analysis
  2. Check for Pareto effect - Steep cumulative curve indicates prioritization opportunity
  3. Consider weighting - Frequency alone may mislead
  4. Link to root cause tools - Pareto prioritizes; Fishbone/5 Whys investigate
  5. Iterate if needed - Drill down (nested Pareto) or re-categorize
  6. Communicate visually - Pareto charts are excellent stakeholder tools

Related Skills

Xlsx

Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modify existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas

data

Clickhouse Io

ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.

datacli

Clickhouse Io

ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.

datacli

Analyzing Financial Statements

This skill calculates key financial ratios and metrics from financial statement data for investment analysis

data

Data Storytelling

Transform data into compelling narratives using visualization, context, and persuasive structure. Use when presenting analytics to stakeholders, creating data reports, or building executive presentations.

data

Team Composition Analysis

This skill should be used when the user asks to "plan team structure", "determine hiring needs", "design org chart", "calculate compensation", "plan equity allocation", or requests organizational design and headcount planning for a startup.

artdesign

Startup Financial Modeling

This skill should be used when the user asks to "create financial projections", "build a financial model", "forecast revenue", "calculate burn rate", "estimate runway", "model cash flow", or requests 3-5 year financial planning for a startup.

art

Kpi Dashboard Design

Design effective KPI dashboards with metrics selection, visualization best practices, and real-time monitoring patterns. Use when building business dashboards, selecting metrics, or designing data visualization layouts.

designdata

Dbt Transformation Patterns

Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.

testingdocumenttool

Startup Metrics Framework

This skill should be used when the user asks about "key startup metrics", "SaaS metrics", "CAC and LTV", "unit economics", "burn multiple", "rule of 40", "marketplace metrics", or requests guidance on tracking and optimizing business performance metrics.

art

Skill Information

Category:Creative
Last Updated:1/31/2026