Alerting Dashboard Builder
by patricio0312rev
Creates SLO-based alerts and operational dashboards with key charts, alert thresholds, and runbook links. Use for "alerting", "dashboards", "SLO", or "monitoring".
Alerting & Dashboard Builder
Build effective alerts and dashboards based on SLOs.
SLO Definition
slos:
  - name: api_availability
    objective: 99.9%
    window: 30d
    sli: |
      sum(rate(http_requests_total{status_code!~"5.."}[5m])) /
      sum(rate(http_requests_total[5m]))
  - name: api_latency
    objective: 95% # 95% of requests under 500ms
    window: 30d
    sli: |
      histogram_quantile(0.95,
        rate(http_request_duration_seconds_bucket[5m])
      ) < 0.5
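To make the objective and window concrete, the error budget they imply can be worked out directly. The following is a minimal sketch: the `Slo` class and both helper functions are hypothetical names introduced for illustration, and it assumes the budget is spent as time spent fully unavailable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """Hypothetical in-memory mirror of one entry in the `slos` list."""
    name: str
    objective: float   # target fraction of good events, e.g. 0.999
    window_days: int   # rolling window length in days


def error_budget_fraction(slo: Slo) -> float:
    """Fraction of events (or time) allowed to be bad within the window."""
    return 1.0 - slo.objective


def allowed_bad_minutes(slo: Slo) -> float:
    """Minutes of full outage the window tolerates before the budget is spent."""
    window_minutes = slo.window_days * 24 * 60
    return window_minutes * error_budget_fraction(slo)


api_availability = Slo(name="api_availability", objective=0.999, window_days=30)
print(round(allowed_bad_minutes(api_availability), 1))  # prints 43.2
```

A 99.9% objective over 30 days therefore leaves about 43 minutes of total outage, which is the quantity the burn-rate alerts are meant to protect.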
Alert Rules
groups:
  - name: slo_alerts
    rules:
      # Fast burn: error ratio above 1% over 1h. For the 99.9% SLO this is a
      # 10x burn rate, roughly 1.4% of the 30-day budget per hour.
      - alert: AvailabilitySLOFastBurn
        expr: |
          (1 - (sum(rate(http_requests_total{status_code!~"5.."}[1h])) /
          sum(rate(http_requests_total[1h])))) > 0.01
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Error ratio above 1% over the last hour (10x burn rate)"
          runbook: "https://runbooks.example.com/availability-fast-burn"
      # Slow burn: error ratio above 0.1% over 24h. For the 99.9% SLO this is a
      # 1x burn rate, roughly 3.3% of the 30-day budget per day.
      - alert: AvailabilitySLOSlowBurn
        expr: |
          (1 - (sum(rate(http_requests_total{status_code!~"5.."}[24h])) /
          sum(rate(http_requests_total[24h])))) > 0.001
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Burning error budget slowly"
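The relationship between an error-ratio threshold and error-budget consumption is simple arithmetic, sketched below. The function names are hypothetical, and the sketch assumes a 30-day SLO window and an error ratio that stays constant for the whole alert window.

```python
def burn_rate(error_ratio: float, objective: float) -> float:
    """How many times faster than sustainable the budget is being spent."""
    return error_ratio / (1.0 - objective)


def budget_consumed(error_ratio: float, objective: float,
                    alert_window_hours: float, slo_window_days: int = 30) -> float:
    """Fraction of the SLO-window budget spent if error_ratio holds for the alert window."""
    slo_window_hours = slo_window_days * 24
    return burn_rate(error_ratio, objective) * alert_window_hours / slo_window_hours


def threshold_for_budget(budget_fraction: float, objective: float,
                         alert_window_hours: float, slo_window_days: int = 30) -> float:
    """Error-ratio threshold that spends budget_fraction of the budget in the alert window."""
    slo_window_hours = slo_window_days * 24
    required_burn_rate = budget_fraction * slo_window_hours / alert_window_hours
    return required_burn_rate * (1.0 - objective)


fast = budget_consumed(error_ratio=0.01, objective=0.999, alert_window_hours=1)
slow = budget_consumed(error_ratio=0.001, objective=0.999, alert_window_hours=24)
print(f"fast burn: {fast:.1%} of budget per 1h window")
print(f"slow burn: {slow:.1%} of budget per 24h window")
```

Under these assumptions the two thresholds work out to about 1.4% and 3.3% of the 30-day budget. `threshold_for_budget` goes the other way: spending 10% of the budget in 24h corresponds to a 3x burn rate, which is an error ratio of about 0.003 for a 99.9% objective.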
Dashboard Template
{
  "title": "Service Health Dashboard",
  "rows": [
    {
      "title": "Golden Signals",
      "panels": [
        {
          "title": "Request Rate",
          "query": "sum(rate(http_requests_total[5m]))",
          "type": "graph"
        },
        {
          "title": "Error Rate",
          "query": "sum(rate(http_requests_total{status_code=~\"5..\"}[5m]))",
          "type": "graph"
        },
        {
          "title": "Latency (p50, p95, p99)",
          "queries": [
            "histogram_quantile(0.50, rate(http_request_duration_seconds_bucket[5m]))",
            "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))",
            "histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))"
          ]
        },
        {
          "title": "Saturation (CPU, Memory)",
          "queries": [
            "rate(process_cpu_seconds_total[5m])",
            "process_resident_memory_bytes"
          ]
        }
      ]
    },
    {
      "title": "SLO Tracking",
      "panels": [
        {
          "title": "Error Budget Remaining",
          "query": "1 - ((1 - slo_availability) / (1 - 0.999))"
        }
      ]
    }
  ]
}
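The "Error Budget Remaining" panel is easiest to reason about as the usual ratio: one minus the budget spent divided by the total budget. The sketch below is a hypothetical helper, not part of any dashboard tool, and it assumes `slo_availability` is a recording rule holding the measured availability over the SLO window as a fraction between 0 and 1.

```python
def error_budget_remaining(availability: float, objective: float = 0.999) -> float:
    """Fraction of the error budget still unspent (negative once the SLO is breached)."""
    budget = 1.0 - objective      # total allowed bad fraction
    spent = 1.0 - availability    # bad fraction observed so far
    return 1.0 - spent / budget


# Walk from a perfect window down to a breached one.
for availability in (1.0, 0.9995, 0.999, 0.998):
    remaining = error_budget_remaining(availability)
    print(f"availability {availability:.4f} -> budget remaining {remaining:+.0%}")
```

With a 99.9% objective, 100% availability leaves the full budget, 99.95% leaves about half, 99.9% leaves none, and anything lower goes negative, which is a useful visual cue on the panel.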
What to Do When an Alert Fires
# Alert Response Guide
## HighErrorRate
**What it means:** More than 5% of requests are failing
**First steps:**
1. Check recent deployments (rollback if needed)
2. Review error logs for patterns
3. Check the health of dependent services
4. Verify database connectivity
**Escalation:** If not resolved in 15 min, page on-call lead
## HighLatency
**What it means:** p95 latency above 2 seconds
**First steps:**
1. Check database query performance
2. Review recent code changes
3. Check cache hit rates
4. Look for slow external API calls
**Temporary mitigation:**
- Scale up instances
- Enable aggressive caching
## LowAvailability
**What it means:** Availability below 99.5%
**First steps:**
1. Check infrastructure (AWS status page)
2. Review load balancer health checks
3. Check for DDoS activity
4. Verify that auto-scaling is functioning
Output Checklist
- SLOs defined
- Alert rules configured
- Dashboards created
- Runbooks linked
- Response guides documented
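The "Runbooks linked" item can be checked mechanically. This is a minimal sketch that assumes the alert rule file has already been parsed into plain dictionaries (for example by a YAML loader); `alerts_missing_runbook` is a hypothetical helper, and the sample data is trimmed to the fields the check reads.

```python
from typing import Any


def alerts_missing_runbook(rule_file: dict[str, Any]) -> list[str]:
    """Return the names of alerting rules that have no `runbook` annotation."""
    missing = []
    for group in rule_file.get("groups", []):
        for rule in group.get("rules", []):
            if "alert" not in rule:
                continue  # skip recording rules, which carry no runbook
            if not rule.get("annotations", {}).get("runbook"):
                missing.append(rule["alert"])
    return missing


# Parsed form of the slo_alerts group, trimmed to names and annotations.
rules = {
    "groups": [{
        "name": "slo_alerts",
        "rules": [
            {"alert": "AvailabilitySLOFastBurn",
             "annotations": {"runbook": "https://runbooks.example.com/availability-fast-burn"}},
            {"alert": "AvailabilitySLOSlowBurn",
             "annotations": {"summary": "Burning error budget slowly"}},
        ],
    }]
}

print(alerts_missing_runbook(rules))  # prints ['AvailabilitySLOSlowBurn']
```

Running a check like this in CI is one way to keep the checklist honest as rules are added.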
