Observability Analyzer
by adaptationio
Query and analyze Claude Code observability data (metrics, logs, traces). Use when analyzing performance, costs, errors, tool usage, sessions, conversations, or subagents.
name: observability-analyzer
description: Query and analyze Claude Code observability data (metrics, logs, traces). Use when analyzing performance, costs, errors, tool usage, sessions, conversations, or subagents.
Query Claude Code telemetry and generate insights from metrics, logs, and traces. Works with both default OTEL telemetry and enhanced hook-based telemetry.
Data Sources
| Source | Job Name | Contains |
|---|---|---|
| Default OTEL | claude_code | API metrics, token usage, costs |
| Enhanced Hooks | claude_code_enhanced | Sessions, conversations, tools, subagents |
Operations
query-metrics <promql>
Execute PromQL query against Prometheus.
query-metrics 'sum(increase(claude_code_token_usage[7d]))'
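As a rough sketch of what a helper like query-metrics might do, the snippet below builds a request URL for the instant-query endpoint of the Prometheus HTTP API (`/api/v1/query`). The base URL `http://localhost:9090` and the function name `build_prometheus_url` are assumptions for illustration; they are not taken from this skill's scripts.

```python
from urllib.parse import urlencode

# Assumed default; point this at wherever your Prometheus instance listens.
PROMETHEUS_URL = "http://localhost:9090"


def build_prometheus_url(promql: str, base_url: str = PROMETHEUS_URL) -> str:
    """Return the instant-query URL for a PromQL expression (hypothetical helper)."""
    # /api/v1/query is the Prometheus HTTP API endpoint for instant queries.
    # urlencode percent-encodes brackets and parentheses in the expression.
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


url = build_prometheus_url("sum(increase(claude_code_token_usage[7d]))")
print(url)
```

The resulting URL can be fetched with any HTTP client; the response is JSON with the result under `data.result`.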
query-logs <logql>
Execute LogQL query against Loki.
query-logs '{job="claude_code_enhanced", event_type="tool_call"} | json' --since 24h
analyze-errors
Detect and group error patterns from enhanced telemetry.
{job="claude_code_enhanced", event_type="tool_result", status="error"} | json
Output: Error types, frequencies, affected tools, recommendations.
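The grouping step can be sketched in a few lines, assuming the log lines returned by the query above are JSON objects. The sample lines are hypothetical, and the presence of a `tool_name` field on `tool_result` events (as well as the `"success"` status value) is an assumption.

```python
import json
from collections import Counter

# Hypothetical log lines, shaped after the tool_result fields listed in this document.
SAMPLE_LINES = [
    '{"event_type": "tool_result", "tool_name": "Bash", "status": "error"}',
    '{"event_type": "tool_result", "tool_name": "Bash", "status": "error"}',
    '{"event_type": "tool_result", "tool_name": "Read", "status": "error"}',
    '{"event_type": "tool_result", "tool_name": "Edit", "status": "success"}',
]


def count_errors_by_tool(lines):
    """Count error results per tool, most frequent first."""
    events = (json.loads(line) for line in lines)
    errors = Counter(
        event.get("tool_name", "unknown")
        for event in events
        if event.get("status") == "error"
    )
    return errors.most_common()


print(count_errors_by_tool(SAMPLE_LINES))  # [('Bash', 2), ('Read', 1)]
```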
analyze-performance
Identify slow operations and oversized responses.
{job="claude_code_enhanced", event_type="tool_result"} | json | response_length > 50000
Output: Large responses, estimated token costs, slow patterns.
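A minimal sketch of how large responses might be flagged and turned into token estimates follows. The ratio of 4 characters per token is only a common rule of thumb and an assumption here; the skill's actual estimation method is not documented in this file. The event dictionaries are hypothetical.

```python
CHARS_PER_TOKEN = 4  # Assumed rule of thumb, not the skill's documented ratio.
LARGE_RESPONSE_CHARS = 50_000  # Same threshold as the LogQL filter above.


def flag_large_responses(events, threshold=LARGE_RESPONSE_CHARS):
    """Return events whose response_length exceeds the threshold, largest first."""
    flagged = []
    for event in events:
        length = event.get("response_length", 0)
        if length > threshold:
            # Copy the event and attach a rough token estimate.
            flagged.append({**event, "estimated_tokens": length // CHARS_PER_TOKEN})
    return sorted(flagged, key=lambda e: e["response_length"], reverse=True)


# Hypothetical tool_result events.
events = [
    {"tool_name": "Read", "response_length": 120_000},
    {"tool_name": "Bash", "response_length": 800},
    {"tool_name": "Grep", "response_length": 64_000},
]
for item in flag_large_responses(events):
    print(item["tool_name"], item["response_length"], item["estimated_tokens"])
```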
analyze-costs
Calculate token usage from content size estimates.
sum by (repo) (sum_over_time({job="claude_code_enhanced", event_type="context_utilization"} | json | unwrap estimated_session_tokens [24h]))
Output: Token estimates by repo, session costs, projections.
analyze-tools
Tool usage statistics and sequences.
sum by (tool) (count_over_time({job="claude_code_enhanced", event_type="tool_call"} | json [24h]))
Output: Call frequency, success rates, tool sequences, common patterns.
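Tool sequences can be approximated by counting consecutive pairs of tool calls. The sketch below assumes the tool names have already been extracted from one session's `tool_call` events and ordered (for instance by `sequence_position`); the sample list is hypothetical.

```python
from collections import Counter


def tool_transitions(tool_names):
    """Count how often each tool is immediately followed by another."""
    # zip pairs each call with the one that follows it.
    return Counter(zip(tool_names, tool_names[1:]))


# Hypothetical ordered tool calls from a single session.
calls = ["Read", "Edit", "Bash", "Read", "Edit"]
print(tool_transitions(calls).most_common(1))  # [(('Read', 'Edit'), 2)]
```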
analyze-sessions
Session lifecycle and duration analytics.
{job="claude_code_enhanced", event_type="session_end"} | json
Output: Session durations, turn counts, tools per session, termination reasons.
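Once `session_end` events have been parsed, the duration and turn statistics reduce to a small aggregation. This is a sketch using the `duration_seconds` and `turn_count` fields named in this document; the sample sessions and the shape of the summary are assumptions.

```python
from statistics import mean, median


def summarize_sessions(sessions):
    """Summarize duration and turn counts across parsed session_end events."""
    if not sessions:
        return {"sessions": 0}
    durations = [s["duration_seconds"] for s in sessions]
    turns = [s["turn_count"] for s in sessions]
    return {
        "sessions": len(sessions),
        "mean_duration_s": mean(durations),
        "median_duration_s": median(durations),
        "mean_turns": mean(turns),
    }


# Hypothetical session_end events.
sessions = [
    {"duration_seconds": 120, "turn_count": 4},
    {"duration_seconds": 600, "turn_count": 20},
    {"duration_seconds": 300, "turn_count": 9},
]
print(summarize_sessions(sessions))
```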
analyze-conversations
Conversation and prompt analytics.
sum by (pattern) (count_over_time({job="claude_code_enhanced", event_type="user_prompt"} | json [24h]))
Output: Prompt patterns (question/debugging/creation/ultrathink), turn distribution.
analyze-subagents
Subagent/Task tool usage.
{job="claude_code_enhanced", event_type="tool_call", tool="Task"} | json
Output: Subagent types used, completion rates, parallel execution patterns.
analyze-skills
Skill invocation analytics.
sum by (skill_name) (count_over_time({job="claude_code_enhanced", event_type="skill_usage"} | json [24h]))
Output: Most used skills, skill usage by repo, trends.
analyze-context
Context window utilization.
{job="claude_code_enhanced", event_type="context_utilization"} | json | context_percentage > 50
Output: High utilization sessions, compaction events, token efficiency.
analyze-repos
Repository/project activity.
sum by (repo, tool) (count_over_time({job="claude_code_enhanced", event_type="tool_call"} | json [24h]))
Output: Activity per repo, tool usage by project, branch patterns.
generate-report
Comprehensive analysis report (all dimensions).
Output: Markdown report with errors, performance, costs, sessions, conversations, tools.
Key Queries
Enhanced Telemetry (Loki)
# All events (last hour)
{job="claude_code_enhanced"} | json
# Session analytics
{job="claude_code_enhanced", event_type="session_end"} | json | duration_seconds > 300
# Tool errors
{job="claude_code_enhanced", event_type="tool_result", status="error"} | json
# High context usage
{job="claude_code_enhanced", event_type="context_utilization"} | json | context_percentage > 75
# Subagent spawns
{job="claude_code_enhanced", event_type="tool_call", tool="Task"} | json
# Skill invocations
{job="claude_code_enhanced", event_type="skill_usage"} | json
# Prompt patterns
{job="claude_code_enhanced", event_type="user_prompt"} | json | pattern="ultrathink"
# Tool sequences
{job="claude_code_enhanced", event_type="tool_call"} | json | line_format "{{.previous_tool}} → {{.tool_name}}"
# Context compaction
{job="claude_code_enhanced", event_type="context_compact"} | json
# Permission requests
{job="claude_code_enhanced", event_type="permission_request"} | json
Default OTEL (Prometheus)
# Total token usage (7 days)
sum(increase(claude_code_token_usage[7d]))
# Error rate by tool
sum by (tool_name) (rate(claude_code_tool_result{status="failure"}[1h]))
# P95 tool latency
histogram_quantile(0.95, sum by (le) (rate(claude_code_tool_duration_bucket[1h])))
# Daily costs
sum(increase(claude_code_cost_usage[24h]))
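To make the P95 latency query easier to interpret, here is a simplified Python model of how `histogram_quantile` estimates a quantile from cumulative bucket counts using linear interpolation within the matching bucket. It is an illustration of the idea, not Prometheus's implementation, and the bucket values are made up.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate a quantile from (upper_bound, cumulative_count) pairs.

    Buckets must be sorted by upper bound, with +Inf as the last bound.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0  # The lowest bucket is assumed to start at 0.
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Quantile falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            if count == prev_count:
                return prev_bound
            # Interpolate linearly between the bucket's lower and upper bound.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Made-up latency buckets in seconds: 50 calls <= 0.5s, 90 <= 1s, 100 <= 5s.
buckets = [(0.5, 50), (1.0, 90), (5.0, 100), (math.inf, 100)]
print(histogram_quantile(0.95, buckets))
```

Because the estimate is interpolated, its accuracy depends on how finely the bucket boundaries cover the latency range of interest.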
Event Types Reference
| Event Type | Description | Key Fields |
|---|---|---|
| session_start | Session initialization | source, permission_mode |
| session_end | Session termination | duration_seconds, turn_count, tools_used |
| user_prompt | User message submitted | pattern, prompt_length, estimated_tokens |
| tool_call | Tool invocation | tool_name, tool_details, sequence_position |
| tool_result | Tool completion | status, response_length, is_error |
| skill_usage | Skill invoked | skill_name |
| context_utilization | Token estimate | estimated_session_tokens, context_percentage |
| context_compact | Compaction event | trigger (manual/auto) |
| subagent_complete | Task agent finished | total_subagents |
| permission_request | Permission dialog | notification_type |
| notification | System notification | notification_type |
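The reference above can also be expressed as a lookup table for sanity-checking parsed events. The sketch below assumes each parsed log line carries its `event_type` as a field (in the queries it appears as a stream label, so this may differ in practice); `missing_fields` is a hypothetical helper.

```python
# Key fields per event type, copied from the reference table above.
KEY_FIELDS = {
    "session_start": ("source", "permission_mode"),
    "session_end": ("duration_seconds", "turn_count", "tools_used"),
    "user_prompt": ("pattern", "prompt_length", "estimated_tokens"),
    "tool_call": ("tool_name", "tool_details", "sequence_position"),
    "tool_result": ("status", "response_length", "is_error"),
    "skill_usage": ("skill_name",),
    "context_utilization": ("estimated_session_tokens", "context_percentage"),
    "context_compact": ("trigger",),
    "subagent_complete": ("total_subagents",),
    "permission_request": ("notification_type",),
    "notification": ("notification_type",),
}


def missing_fields(event: dict) -> list:
    """Return the key fields that a parsed event lacks for its event type."""
    expected = KEY_FIELDS.get(event.get("event_type"), ())
    return [name for name in expected if name not in event]


# Hypothetical event that is missing two of its key fields.
print(missing_fields({"event_type": "tool_result", "status": "error"}))
# ['response_length', 'is_error']
```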
Grafana Dashboards
- Claude Code Overview - High-level metrics
- Tool Performance - Tool latencies and success rates
- Cost Analysis - Token usage and costs
- Error Tracking - Error patterns and trends
- Session Analytics - Session-level insights
- Enhanced Analytics - Model/skill/context/repo tracking
- Deep Analytics - Comprehensive conversation and tool analysis
Access: http://localhost:3000 (admin/admin)
Scripts
- scripts/query-prometheus.sh - PromQL query helper
- scripts/query-loki.sh - LogQL query helper
- scripts/analyze-errors.sh - Error analysis automation
- scripts/analyze-sessions.sh - Session analytics
- scripts/generate-report.sh - Full analysis report
