Hammer Benchmark Report
by Ronin15
---
name: hammer-benchmark-report
description: Generates professional performance analysis reports from SDL3 HammerEngine benchmark results including statistical analysis, comparison tables, visualizations, and recommendations. Use when preparing performance documentation, analyzing optimization efforts, or generating milestone/release reports.
allowed-tools: [Bash, Read, Write, Grep]
---
# HammerEngine Benchmark Report Generator
This Skill creates comprehensive, professional-quality performance reports from SDL3 HammerEngine benchmark results. Perfect for documentation, stakeholder updates, or performance tracking over time.
## Report Capabilities
- **Data Aggregation**: Collect metrics from all benchmark test results
- **Statistical Analysis**: Calculate mean, median, stddev, percentiles
- **Visual Representation**: Generate ASCII charts and tables
- **Trend Analysis**: Compare against historical baselines
- **Callgrind Integration**: Include profiling hotspot analysis
- **Professional Formatting**: Markdown/HTML/PDF output formats
- **Executive Summary**: High-level overview for stakeholders
## Input Sources
### Benchmark Result Files
Location: `$PROJECT_ROOT/test_results/` (ensure the `PROJECT_ROOT` environment variable is set to the project root directory)

**Files to Scan:**
```
$PROJECT_ROOT/test_results/
├── ai_benchmark/
│   ├── performance_metrics.txt
│   └── performance_report.md
├── collision_benchmark/
│   ├── performance_metrics.txt
│   └── performance_report.md
├── pathfinder_benchmark/
│   ├── performance_metrics.txt
│   └── performance_report.md
├── event_manager_scaling/
│   ├── performance_metrics.txt
│   └── performance_report.md
├── particle_benchmark/
│   ├── performance_metrics.txt
│   └── performance_report.md
├── ui_stress/
│   ├── performance_metrics.txt
│   └── performance_report.md
└── valgrind/
    ├── callgrind/
    │   └── callgrind.out.*
    └── cache/
        └── cachegrind.out.*
```
### Historical Baselines (if available)
```
$PROJECT_ROOT/test_results/baseline/
$PROJECT_ROOT/test_results/baseline_history/
```
## Execution Steps
### Step 1: Discover Available Benchmark Results
Scan for results:
```bash
find "$PROJECT_ROOT/test_results/" -name "performance_metrics.txt" -type f
```
Expected benchmarks:
- AI System Benchmark
- Collision System Benchmark
- Pathfinder Benchmark
- Event Manager Scaling
- Particle Manager Benchmark
- UI Stress Tests
For each found result (see the sketch after this list):
- Record timestamp (from file modification date)
- Extract all metrics
- Categorize by system
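A minimal discovery loop for these three tasks might look like the following (a sketch, assuming `date -r` is available, as on GNU and BSD systems):
```bash
find "$PROJECT_ROOT/test_results/" -name "performance_metrics.txt" -type f |
while read -r f; do
    system=$(basename "$(dirname "$f")")        # e.g. ai_benchmark
    stamp=$(date -r "$f" '+%Y-%m-%d %H:%M:%S')  # file modification date
    echo "$stamp  $system  $f"                  # timestamp, system, path
done
```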
### Step 2: Extract and Parse Metrics
**Extraction Patterns:**
#### AI System Metrics (Dual Benchmark System)
Synthetic Benchmarks (AIManager infrastructure):
```bash
grep -B 5 -A 10 "TestSynthetic" "$PROJECT_ROOT/test_results/ai_scaling_benchmark_"*.txt | \
  grep -E "Entity updates per second:|entities"
```
Integrated Benchmarks (Production behaviors):
```bash
grep -B 5 -A 10 "TestIntegrated" "$PROJECT_ROOT/test_results/ai_scaling_benchmark_"*.txt | \
  grep -E "Entity updates per second:|entities"
```
Parse into structure:
```yaml
AI_System_Synthetic:
  Entity_100: 170000
  Entity_200: 750000
  Entity_1000: 975000
  Entity_5000: 925000
  Entity_10000: 995000
AI_System_Integrated:
  Entity_100: 569152
  Entity_200: 579794
  Entity_500: 611098
  Entity_1000: 1192606
  Entity_2000: 1587491
```
#### Collision System Metrics
```bash
grep -E "^(Collision Checks|Query Time|Hash Efficiency|AABB Tests):" \
  "$PROJECT_ROOT/test_results/collision_benchmark/performance_metrics.txt"
```
#### Pathfinder Metrics
```bash
grep -E "^(Path Calculation|Nodes Explored|Cache Hits|A\* Performance):" \
  "$PROJECT_ROOT/test_results/pathfinder_benchmark/performance_metrics.txt"
```
#### Event Manager Metrics
```bash
grep -E "^(Events/sec|Dispatch Latency|Queue Depth|Peak Throughput):" \
  "$PROJECT_ROOT/test_results/event_manager_scaling/performance_metrics.txt"
```
#### Particle Manager Metrics
```bash
grep -E "^(Particles/frame|Render Time|Batch Count|Culling Efficiency):" \
  "$PROJECT_ROOT/test_results/particle_benchmark/performance_metrics.txt"
```
#### UI System Metrics
```bash
grep -E "^(Components|Render Time|Event Handling|DPI Scaling):" \
  "$PROJECT_ROOT/test_results/ui_stress/performance_metrics.txt"
```
### Step 3: Statistical Analysis
For each metric with multiple data points:

**Calculate Statistics**
```python
def calculate_statistics(values):
    """Calculate statistical measures for a set of values."""
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")
    mean = sum(values) / n

    # Median
    sorted_values = sorted(values)
    median = sorted_values[n // 2] if n % 2 == 1 else \
        (sorted_values[n // 2 - 1] + sorted_values[n // 2]) / 2

    # Population standard deviation
    variance = sum((x - mean) ** 2 for x in values) / n
    stddev = variance ** 0.5

    # Percentiles (nearest-rank; clamp so small samples stay in range)
    p50 = median
    p95 = sorted_values[min(int(n * 0.95), n - 1)]
    p99 = sorted_values[min(int(n * 0.99), n - 1)]

    return {
        'mean': mean,
        'median': median,
        'stddev': stddev,
        'p50': p50,
        'p95': p95,
        'p99': p99,
        'min': min(values),
        'max': max(values),
        'coefficient_of_variation': (stddev / mean) * 100 if mean > 0 else 0,
    }
```
Apply to metrics:
- If multiple benchmark runs exist, aggregate the data
- Calculate statistics for each metric
- Identify outliers (values more than 2 stddev from the mean; see the sketch below)
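Outlier detection can reuse `calculate_statistics`; a minimal sketch:
```python
def find_outliers(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    stats = calculate_statistics(values)
    if stats['stddev'] == 0:
        return []                 # flat series: nothing to flag
    return [v for v in values
            if abs(v - stats['mean']) > threshold * stats['stddev']]
```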
### Step 4: Callgrind Analysis (if available)
**Parse Callgrind Output:**
```bash
# Find most recent callgrind output
CALLGRIND_FILE=$(ls -t "$PROJECT_ROOT/test_results/valgrind/callgrind/callgrind.out."* 2>/dev/null | head -n 1)

if [ -f "$CALLGRIND_FILE" ]; then
    # Extract top hotspots using callgrind_annotate
    callgrind_annotate --auto=yes "$CALLGRIND_FILE" | head -n 50
fi
```
Extract:
- Top 10 functions by time (Ir - instruction reads)
- Top 10 functions by calls
- Call graphs for critical paths (AI update, collision detection, rendering)
Example Hotspot Data:
```
Function                                Ir        %
AIManager::updateBehaviors()            15,234M   45.2%
CollisionManager::detectCollisions()     6,123M   18.1%
PathfinderManager::calculatePath()       4,056M   12.0%
```
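If the hotspot list is needed in machine-readable form, one option is to filter the per-function lines of the `callgrind_annotate` output (a sketch; the exact output layout varies between Valgrind versions, so treat the pattern as an assumption):
```bash
# Keep only lines that begin with an instruction count (the per-function
# entries), then take the top 10 by Ir.
callgrind_annotate --auto=no "$CALLGRIND_FILE" \
    | grep -E '^[[:space:]]*[0-9][0-9,]*' \
    | head -n 10
```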
### Step 5: Trend Analysis (if baselines available)
**Compare Against Baseline:**
For each metric:
- Load baseline value (if exists)
- Calculate percentage change
- Determine trend direction (improving/degrading/stable)
- Identify long-term trends (if multiple historical baselines)
Trend Classification:
- 📈 Improving: >5% better than baseline
- 📉 Degrading: >5% worse than baseline
- ➡️ Stable: Within ±5% of baseline
- 🆕 New: No baseline for comparison
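As a sketch, this classification maps onto a small helper; the `higher_is_better` flag is an added assumption for metrics such as latency, where lower values are better:
```python
def classify_trend(current, baseline, higher_is_better=True, tolerance=5.0):
    """Classify a metric against its baseline using the ±5% bands above."""
    if baseline is None or baseline == 0:
        return "🆕 New"
    change = (current - baseline) / baseline * 100
    if not higher_is_better:
        change = -change          # latency-style metrics: invert the sign
    if change > tolerance:
        return "📈 Improving"
    if change < -tolerance:
        return "📉 Degrading"
    return "➡️ Stable"
```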
### Step 6: Generate Report Sections
#### Section 1: Executive Summary
# SDL3 HammerEngine Performance Report
**Generated:** YYYY-MM-DD HH:MM:SS
**Benchmark Suite Version:** <git-commit-hash>
**Branch:** <current-branch>
**Platform:** <OS-version>
**Build Type:** Debug/Release
---
## Executive Summary
### Overall Performance: ✓ EXCELLENT / ✓ GOOD / ⚠ FAIR / ❌ NEEDS IMPROVEMENT
SDL3 HammerEngine demonstrates <strong/adequate/weak> performance across all critical systems:
- **AI System:** Handles 10,000+ entities at 62 FPS with 5.8% CPU (Target: 60+ FPS, <6% CPU) ✓
- **Collision System:** 125,000 collision checks/sec, 0.08ms query time ✓
- **Pathfinding:** 8.5ms path calculation, 78% cache hit rate ✓
- **Event System:** 8,500 events/sec throughput, 0.12ms dispatch latency ✓
- **Particle System:** 5,000 particles/frame, 3.2ms render time ✓
- **UI System:** 1,000 components, 4.5ms render time ✓
### Key Achievements
- AI system exceeds 10K entity target with headroom
- Collision system optimization improved query speed by 12%
- Event throughput supports 8K+ concurrent events
- All systems meet or exceed performance targets
### Areas for Improvement
- Event dispatch latency increased 8% (monitor)
- AI behavior updates showing slight variance (9% stddev)
### Recommendation
**Status:** Ready for production
**Next Milestones:** Optimize event dispatching, stabilize AI behavior timing
#### Section 2: Detailed System Analysis
## Detailed Performance Analysis
### AI System - Synthetic Benchmarks (Infrastructure Performance)
#### Purpose
Tests AIManager infrastructure without integration overhead
#### Benchmark Results
| Entity Count | Value (updates/sec) | Status | Baseline | Change |
|--------------|---------------------|--------|----------|--------|
| 100 | 170K | ✓ | 170K | 0.0% |
| 200 | 750K | ✓ | 750K | 0.0% |
| 1000 | 975K | ✓ | 975K | 0.0% |
| 5000 | 925K | ✓ | 925K | 0.0% |
| 10000 | 995K | ✓ | 995K | 0.0% |
#### Threading Efficiency
- Single-threaded (100): 170K updates/sec
- Multi-threaded (5000): 925K updates/sec
- Speedup: 5.4x
#### Statistical Summary
- Mean: 763K updates/sec (across entity counts)
- Std Dev: 309K (40% CoV, driven by the single-threaded 100-entity case)
- Throughput is consistent (925K-995K updates/sec) from 1,000 entities up
---
### AI System - Integrated Benchmarks (Production Workload)
#### Purpose
Tests AIManager with PathfinderManager/CollisionManager integration
#### Benchmark Results
| Entity Count | Value (updates/sec) | Status | Baseline | Change |
|--------------|---------------------|--------|----------|--------|
| 100 | 569K | ✓ | 569K | 0.0% |
| 200 | 580K | ✓ | 580K | 0.0% |
| 500 | 611K | ✓ | 611K | 0.0% |
| 1000 | 1193K | ✓ | 1193K | 0.0% |
| 2000 | 1587K | ✓ | 1587K | 0.0% |
#### Threading Efficiency
- Single-threaded (100): 569K updates/sec
- Multi-threaded (2000): 1587K updates/sec
- Speedup: 2.8x
#### Statistical Summary
- Mean: 908K updates/sec (across entity counts)
- Std Dev: 421K (46% CoV)
- Performance scales with entity count
---
### AI System - Integration Overhead Analysis
#### Overhead Metrics
| Entity Count | Synthetic | Integrated | Overhead | Assessment |
|--------------|-----------|------------|----------|------------|
| 100 | 170K/s | 569K/s | -235% | Data inconsistency* |
| 200 | 750K/s | 580K/s | +23% | Expected |
| 1000 | 975K/s | 1193K/s | -22% | Data inconsistency* |
| 2000 | N/A | 1587K/s | N/A | N/A |
*Note: Negative overhead means the integrated run outperformed the synthetic run; the synthetic values at those entity counts are estimates while the integrated values are measured.*
Expected overhead: 20-40% (integrated slower due to PathfinderManager)
#### Overhead Sources
- PathfinderManager: Path requests, cache lookups, A* computation
- CollisionManager: Spatial hash queries for neighbors
- Production behaviors: Complex state machines and calculations
**Stability Analysis:**
- ✓ FPS variance low (2.9% CoV) - excellent stability
- ✓ CPU usage consistent (0.3% stddev)
- ✓ Update time predictable (<1ms variance)
#### Performance Profile (Callgrind Hotspots)
Top functions by instruction reads:
1. `AIManager::updateBehaviors()` - 45.2% (expected, main update loop)
2. `AIManager::processBatch()` - 12.3% (batch processing)
3. `BehaviorCache::lookup()` - 8.7% (cache lookups)
4. `ChaseBehavior::executeLogic()` - 6.1% (behavior logic)
5. `PathfinderManager::requestPath()` - 4.2% (pathfinding integration)
**Analysis:** Hotspot distribution is as expected. Most time in core update loop.
#### Trend Analysis
Performance trend over time (vs historical baselines):
FPS History:
```
Jan 2025: 62.3 FPS  ← Current
Dec 2024: 61.8 FPS  (+0.8% improvement)
Nov 2024: 59.2 FPS  (+5.2% improvement)
```
📈 **Trend:** Improving steadily
#### Recommendations
1. **Maintain current performance** - AI system exceeds targets
2. **Monitor behavior update variance** - Consider additional caching
3. **Document batch processing optimization** - Apply pattern to other systems
Repeat similar detailed analysis for:
- Collision System
- Pathfinding System
- Event Manager
- Particle Manager
- UI System
#### Section 3: Cross-System Analysis
## Cross-System Performance Comparison
### Frame Budget Analysis (60 FPS = 16.67ms budget)
| System | Time (ms) | % Budget | Status |
|--------|-----------|----------|--------|
| AI Update | 12.4 | 74.4% | ✓ |
| Collision Detection | 2.8 | 16.8% | ✓ |
| Pathfinding | 1.2 | 7.2% | ✓ |
| Event Processing | 0.5 | 3.0% | ✓ |
| Particle Update | 0.8 | 4.8% | ✓ |
| UI Rendering | 4.5 | 27.0% | ✓ |
| **Total** | **22.2** | **133.3%** | ⚠ |
**Note:** Total exceeds 100% because systems run on separate threads (update vs render).
**Update Thread Budget (60 FPS = 16.67ms):**
- AI: 12.4ms
- Collision: 2.8ms
- Pathfinding: 1.2ms
- Event: 0.5ms
- **Total Update:** 16.9ms (101% of budget) ⚠ Slight overrun
**Render Thread Budget:**
- Particle Render: 0.8ms
- UI Render: 4.5ms
- World Render: 3.2ms
- **Total Render:** 8.5ms (51% of budget) ✓ Plenty of headroom
### System Interaction Analysis
**AI ↔ Pathfinding:**
- Pathfinding requests/frame: 15
- Average latency: 8.5ms
- Cache hit rate: 78%
- ✓ Integration efficient
**Collision ↔ Pathfinding:**
- Dynamic obstacle updates: 50/frame
- Pathfinding weight adjustments: 12/frame
- ✓ Integration smooth
**Event ↔ All Systems:**
- Event throughput: 8,500 events/sec
- Dispatch latency: 0.12ms
- Queue depth: 128 events
- ✓ No bottlenecks detected
### Resource Usage Summary
**CPU Usage by System:**
```
AI Manager:         5.8%
Collision Manager:  2.3%
Pathfinder:         1.2%
Event Manager:      0.8%
Particle Manager:   1.5%
UI Manager:         0.9%
Total Engine:      12.5%
```
**Memory Usage:**
```
AI Manager:        45 MB
Collision Manager: 32 MB
Pathfinder:        18 MB
Event Manager:      8 MB
Particle Manager:  12 MB
UI Manager:        15 MB
Total Engine:     130 MB
```
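The budget percentages in the frame-budget tables above are just `time / 16.67 ms`; a sketch of that arithmetic, using the update-thread numbers as an example:
```python
FRAME_BUDGET_MS = 1000.0 / 60.0          # 16.67 ms per frame at 60 FPS

def budget_share(times_ms):
    """Return each system's share of the frame budget, plus the total."""
    shares = {name: t / FRAME_BUDGET_MS * 100 for name, t in times_ms.items()}
    shares['Total'] = sum(times_ms.values()) / FRAME_BUDGET_MS * 100
    return shares

# Update-thread numbers from the table above: total comes out at ~101.4%.
print(budget_share({'AI': 12.4, 'Collision': 2.8, 'Pathfinding': 1.2, 'Event': 0.5}))
```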
#### Section 4: Optimization Opportunities
## Optimization Opportunities
### High Priority
1. **Update Thread Frame Budget Overrun**
- Current: 16.9ms (101% of 16.67ms budget)
- Impact: Occasional frame drops
- Recommendation: Reduce AI update time by 0.5ms
- Approach: Increase batch size or optimize behavior cache
2. **Event Dispatch Latency Increase**
- Current: 0.12ms (up 8% from baseline)
- Impact: Slight event processing delay
- Recommendation: Profile event dispatch path
- Approach: Reduce lock contention or optimize event routing
### Medium Priority
3. **Pathfinding Cache Hit Rate**
- Current: 78%
- Target: 85%+
- Recommendation: Increase cache size or improve eviction policy
- Expected Improvement: Reduce path calculation time by ~15%
4. **Particle Culling Efficiency**
- Current: 88%
- Target: 95%+
- Recommendation: Improve camera frustum culling
- Expected Improvement: Reduce render time by ~10%
### Low Priority
5. **UI Component Render Time**
- Current: 4.5ms (stable, within budget)
- Opportunity: Apply batching technique from particle system
- Expected Improvement: Reduce to 3.5ms (~22% faster)
#### Section 5: Historical Trends
## Performance Trends
### AI System FPS Over Time
```
68 ┤
66 ┤                  ╭─
64 ┤              ╭─╯
62 ┤         ╭───╯         ← Current (62.3)
60 ┤     ╭─╯
58 ┤╭─╯
56 ┼─────────────────────────
    Nov      Dec      Jan
    2024     2024     2025
```
**Trend:** 📈 Improving (+5.2% over 3 months)
### Collision System Query Time
```
0.12 ┤╮
0.10 ┤╰╮
0.08 ┤ ╰───────╮           ← Current (0.08ms)
0.06 ┤         ╰─
0.04 ┼─────────────────────────
      Nov      Dec      Jan
      2024     2024     2025
```
**Trend:** 📈 Improving (-33% over 3 months)
### Event Throughput
```
9000 ┤            ╭───     ← Current (8500/sec)
8500 ┤        ╭─╯
8000 ┤    ╭─╯
7500 ┤╭─╯
7000 ┼─────────────────────────
      Nov      Dec      Jan
      2024     2024     2025
```
**Trend:** 📈 Improving (+21% over 3 months)
#### Section 6: Comparative Analysis
## Comparative Analysis
### Performance vs Industry Standards
| System | HammerEngine | Industry Avg | Status |
|--------|--------------|--------------|--------|
| Entity Count @ 60 FPS | 10,000 | 5,000-8,000 | ✓ Above Avg |
| Collision Checks/sec | 125,000 | 80,000-100,000 | ✓ Above Avg |
| Event Throughput | 8,500/sec | 5,000-10,000 | ✓ Average |
| Memory/Entity | 13 KB | 10-20 KB | ✓ Average |
**Overall:** HammerEngine performs above industry averages for 2D game engines.
### Performance vs Project Goals
| Goal | Target | Current | Status |
|------|--------|---------|--------|
| 10K+ Entities @ 60 FPS | 60 FPS | 62.3 FPS | ✓ Exceeded |
| AI CPU Usage | <6% | 5.8% | ✓ Met |
| Event Throughput | 10K/sec | 8.5K/sec | ⚠ 85% of goal |
| Collision Efficiency | N/A | 94.2% hash | ✓ Excellent |
| Pathfinding Speed | <10ms | 8.5ms | ✓ Met |
**Overall Progress:** 80% of goals met or exceeded
#### Section 7: Technical Details
## Technical Details
### Test Environment
- **Hardware:** <CPU-model>, <RAM-size>
- **OS:** Linux 6.16.4 (Bazzite Fedora 42)
- **Compiler:** GCC/Clang <version>, C++20
- **Build Flags:** -O3 -flto -march=x86-64-v3 -mavx2
- **SDL Version:** SDL3 (latest)
### Benchmark Methodology
- **Duration:** 20 minutes total
- **Repetitions:** 5 runs per benchmark (median reported)
- **Warm-up:** 30 seconds per test
- **Isolation:** Tests run sequentially, system idle
- **Profiling:** Callgrind with 1% sampling
### Data Collection
- **Metrics Collection:** Automated via test scripts
- **Storage:** $PROJECT_ROOT/test_results/ directory
- **Baseline:** Updated monthly
- **History:** 6 months retained
### Reliability
- **FPS Variance:** 2.9% CoV (excellent)
- **CPU Variance:** 5.1% CoV (good)
- **Memory Variance:** 1.2% CoV (excellent)
**Overall:** Results are highly reliable and reproducible.
#### Section 8: Appendices
## Appendix A: Raw Metrics
### AI System Benchmark (Raw Data)
```
Entities: 10000
FPS: 62.3
CPU: 5.8%
Update Time: 12.4ms
Batch Processing: 2.1ms
Behavior Updates: 8.3ms
Memory Usage: 45 MB
Thread Safety: Mutex-protected
Double Buffer: Enabled
Cache Efficiency: 92%
```
<Include raw data for all systems>
## Appendix B: Callgrind Full Output
## Appendix C: Test Scripts
All benchmark tests are located in:
```
tests/test_scripts/run_ai_system_benchmark.sh
tests/test_scripts/run_collision_benchmark.sh
```
Run full suite:
```bash
./run_all_tests.sh --benchmarks-only
```
## Appendix D: Baseline History
| Date | AI FPS | Collision Checks | Event Throughput |
|---|---|---|---|
| 2025-01-15 | 62.3 | 125,000 | 8,500 |
| 2024-12-15 | 61.8 | 120,000 | 8,200 |
| 2024-11-15 | 59.2 | 110,000 | 7,000 |
### Step 7: Format Report
**Output Formats:**
#### 1. Markdown (Default)
Save to:
```
docs/performance_reports/performance_report_YYYY-MM-DD.md
```
Advantages:
- Easy to read in text editor
- Version control friendly
- Can be converted to other formats
#### 2. HTML (Optional)
Convert markdown to HTML:
```bash
# If pandoc available
pandoc performance_report.md -o performance_report.html \
    --standalone \
    --css=report_style.css \
    --metadata title="HammerEngine Performance Report"
```
Advantages:
- Interactive charts (if using JavaScript)
- Better for web viewing
- Professional appearance
#### 3. PDF (Optional)
Convert markdown to PDF:
```bash
# If pandoc + LaTeX available
pandoc performance_report.md -o performance_report.pdf \
    --pdf-engine=xelatex \
    --variable geometry:margin=1in \
    --variable fontsize=11pt \
    --metadata title="HammerEngine Performance Report"
```
Advantages:
- Professional document
- Easy to share with stakeholders
- Print-ready
### Step 8: Generate Summary Output
**Console Output:**
```
=== HammerEngine Benchmark Report Generated ===

Report Date: 2025-01-15 14:30:22
Benchmarks Analyzed: 6 systems
Metrics Collected: 42 data points
Baseline Comparison: Available (2024-12-15)

Performance Status: ✓ EXCELLENT

Key Highlights:
  ✓ AI System: 10,000 entities @ 62.3 FPS
  ✓ Collision: 125,000 checks/sec
  ✓ Pathfinding: 8.5ms avg calculation
  ⚠ Event Dispatch: +8% latency (monitor)

Report Generated:
  📄 Markdown: docs/performance_reports/performance_report_2025-01-15.md
  🌐 HTML: docs/performance_reports/performance_report_2025-01-15.html
  📑 PDF: docs/performance_reports/performance_report_2025-01-15.pdf

Report Size: 2.3 MB (includes charts and raw data)
Generation Time: 2m 15s

Next Steps:
1. Review optimization opportunities (Section 4)
2. Address update thread budget overrun
3. Monitor event dispatch latency trend
4. Update baseline after validating improvements

---
Report ready for distribution
```
## Usage Examples
When the user says:
- "generate performance report"
- "create benchmark report"
- "document current performance"
- "prepare performance analysis"
- "make stakeholder report"
Activate this Skill automatically.
## Report Customization Options
Ask user for:
1. **Report Scope:**
   - All systems (default)
   - Specific systems only (specify which)
2. **Detail Level:**
   - Executive summary only
   - Full detailed report (default)
   - Technical deep-dive (includes callgrind)
3. **Output Formats:**
   - Markdown (always included)
   - HTML (optional)
   - PDF (optional)
4. **Include Sections:**
   - Executive Summary
   - System Analysis
   - Trends
   - Callgrind Hotspots (if available)
   - Optimization Recommendations
   - Raw Data Appendices
5. **Comparison:**
   - vs Baseline only
   - vs Historical baselines (trend)
   - vs Industry standards
## Performance Expectations
- Data Collection: 1-2 minutes
- Analysis: 30-60 seconds
- Report Generation: 30-60 seconds
- Total Time: 2-4 minutes
Manual equivalent: 45-60 minutes
## Integration with Workflow
Use this Skill:
- After completing benchmark runs
- Before milestone releases
- For monthly performance tracking
- When preparing stakeholder updates
- To document optimization efforts
- Before/after major refactoring
## File Management
Reports are saved to:
```
docs/performance_reports/
├── performance_report_2025-01-15.md
├── performance_report_2025-01-15.html
├── performance_report_2025-01-15.pdf
└── latest_report.md (symlink to most recent)
```
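A sketch of the save-and-symlink step (the staging path and variable names are hypothetical):
```bash
REPORT_DIR="docs/performance_reports"
REPORT="$REPORT_DIR/performance_report_$(date +%F).md"   # date +%F = YYYY-MM-DD
mkdir -p "$REPORT_DIR"
mv /tmp/performance_report.md "$REPORT"                  # hypothetical staging file
ln -sf "$(basename "$REPORT")" "$REPORT_DIR/latest_report.md"
```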
Cleanup old reports:
```bash
# Keep last 12 months, archive older (create the archive dir first,
# and stay at depth 1 so already-archived files are not moved again)
mkdir -p docs/performance_reports/archive
find docs/performance_reports/ -maxdepth 1 -name "*.md" -mtime +365 \
    -exec mv {} docs/performance_reports/archive/ \;
```
## Quality Assurance
Report includes:
- ✓ All benchmark systems analyzed
- ✓ Statistical validity checks (min 3 data points)
- ✓ Trend analysis (if historical data available)
- ✓ Clear status indicators (✓/⚠/❌)
- ✓ Actionable recommendations
- ✓ Professional formatting
- ✓ Version information (git commit, date)
- ✓ Reproducibility information (test environment)
## Advanced Features
### ASCII Charts
Generate simple ASCII charts for trends:
```
FPS Trend:
68 ┤        ╭─
66 ┤      ╭─╯
64 ┤    ╭─╯
62 ┤  ╭─╯
60 ┤ ╭─╯
58 ┤╭─╯
   └─────────────
```
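A minimal generator for charts of this shape (a sketch: one column per sample, nearest-row rounding):
```python
def ascii_trend(values, rows=7):
    """Tiny ASCII trend plot: one column per sample, '*' at the value's row."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1                      # avoid div-by-zero on flat data
    levels = [round((v - lo) / span * (rows - 1)) for v in values]
    for r in range(rows - 1, -1, -1):          # print top row first
        label = lo + span * r / (rows - 1)
        line = "".join("*" if lvl == r else " " for lvl in levels)
        print(f"{label:5.1f} ┤ {line}")
    print("      └" + "─" * (len(values) + 1))

ascii_trend([58, 59, 61, 62, 64, 66, 68])      # roughly the shape shown above
```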
### Sparklines
Compact trend indicators:
- AI FPS: ▁▂▃▅▆█ (improving)
- Collision: ███▇▅▃ (degrading)
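A sparkline generator is only a few lines (a sketch using the eight Unicode block characters):
```python
BLOCKS = "▁▂▃▄▅▆▇█"

def sparkline(values):
    """Map a series onto the eight block characters, scaled min..max."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1              # avoid div-by-zero on flat series
    return "".join(BLOCKS[int((v - lo) / span * (len(BLOCKS) - 1))]
                   for v in values)

print(sparkline([59.2, 60.1, 60.8, 61.8, 62.0, 62.3]))   # -> ▁▃▄▆▇█
```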
### Color Coding (in HTML/PDF)
- 🟢 Green: Exceeds targets
- 🟡 Yellow: Meets targets
- 🟠 Orange: Below targets (warning)
- 🔴 Red: Critical issues
## Error Handling
If no benchmark data found:
- Report: "No benchmark data available. Run benchmarks first:"
- Show command:
```bash
./run_all_tests.sh --benchmarks-only
```
If baseline missing:
- Note in report: "No baseline for comparison. This will serve as baseline."
- Save current metrics as baseline (sketched below)
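A sketch of seeding the baseline directory from the current run (the per-system file naming inside `baseline/` is an assumption; the doc only specifies the directory):
```bash
mkdir -p "$PROJECT_ROOT/test_results/baseline"
for f in "$PROJECT_ROOT"/test_results/*/performance_metrics.txt; do
    system=$(basename "$(dirname "$f")")                 # e.g. ai_benchmark
    cp "$f" "$PROJECT_ROOT/test_results/baseline/${system}_metrics.txt"
done
```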
If incomplete data:
- Note missing systems in report
- Generate partial report
- Recommend running missing benchmarks
## Exit Codes
- 0: Report generated successfully
- 1: No benchmark data found
- 2: Report generation failed
- 3: Partial report (missing data)
## Time Savings
Manual Process: ~45-60 minutes
- Collect metrics: 10 min
- Calculate statistics: 15 min
- Format tables: 10 min
- Write analysis: 15 min
- Generate charts: 10 min
With This Skill: ~2-4 minutes
- Automated collection
- Automated analysis
- Automated formatting
Total Time Saved: ~40-55 minutes per report
