Observability Analyzer

by adaptationio

Query and analyze Claude Code observability data (metrics, logs, traces). Use when analyzing performance, costs, errors, tool usage, sessions, conversations, or subagents.

name: observability-analyzer
description: Query and analyze Claude Code observability data (metrics, logs, traces). Use when analyzing performance, costs, errors, tool usage, sessions, conversations, or subagents.

Observability Analyzer

Query Claude Code telemetry and generate insights from metrics, logs, and traces. Works with both default OTEL telemetry and enhanced hook-based telemetry.

Data Sources

| Source | Job Name | Contains |
| --- | --- | --- |
| Default OTEL | claude_code | API metrics, token usage, costs |
| Enhanced Hooks | claude_code_enhanced | Sessions, conversations, tools, subagents |

Operations

query-metrics <promql>

Execute PromQL query against Prometheus.

query-metrics 'sum(increase(claude_code_token_usage[7d]))'
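As a rough illustration of what a query-metrics helper might do, the sketch below builds a request URL for the Prometheus instant-query endpoint (`/api/v1/query`). The base URL `http://localhost:9090` and the function name `build_prometheus_query_url` are assumptions for illustration; they are not taken from this skill's scripts.

```python
from urllib.parse import urlencode

# Assumed address; the skill does not state where Prometheus listens.
PROMETHEUS_URL = "http://localhost:9090"


def build_prometheus_query_url(promql, base_url=PROMETHEUS_URL):
    """Return the URL for a Prometheus instant query (GET /api/v1/query)."""
    # urlencode percent-escapes brackets, braces, and quotes in the PromQL text.
    query_string = urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query_string}"
```

The resulting URL can be fetched with any HTTP client; the response is JSON with a `status` field and a `data` object.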

query-logs <logql>

Execute LogQL query against Loki.

query-logs '{job="claude_code_enhanced", event_type="tool_call"} | json' --since 24h
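A minimal sketch of how a `--since 24h` style window could be turned into a Loki range query. Loki's `/loki/api/v1/query_range` endpoint accepts `start` and `end` as Unix timestamps in nanoseconds. The base URL `http://localhost:3100`, the default `limit`, and the helper names are assumptions, not the actual behaviour of scripts/query-loki.sh.

```python
import re
from urllib.parse import urlencode

# Assumed address; the skill does not state where Loki listens.
LOKI_URL = "http://localhost:3100"
_SECONDS_PER_UNIT = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_since(since):
    """Convert a duration such as '24h' or '30m' to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", since)
    if match is None:
        raise ValueError(f"unsupported duration: {since!r}")
    return int(match.group(1)) * _SECONDS_PER_UNIT[match.group(2)]


def build_loki_query_url(logql, since, now_s, base_url=LOKI_URL, limit=1000):
    """Return a query_range URL covering the last `since` before `now_s`.

    `now_s` is the current Unix time in whole seconds, e.g. int(time.time()).
    """
    end_ns = now_s * 1_000_000_000
    start_ns = end_ns - parse_since(since) * 1_000_000_000
    params = {"query": logql, "start": start_ns, "end": end_ns, "limit": limit}
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{urlencode(params)}"
```

Passing the current time in explicitly keeps the function deterministic, which makes it easier to check the computed window.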

analyze-errors

Detect and group error patterns from enhanced telemetry.

{job="claude_code_enhanced", event_type="tool_result", status="error"} | json

Output: Error types, frequencies, affected tools, recommendations.
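The grouping step can be sketched as follows, assuming the query result arrives in Loki's standard `streams` shape (each stream has a `values` list of `[timestamp, line]` pairs) and that each line is a JSON event. The presence of `tool_name` on `tool_result` events is an assumption; the function name is hypothetical.

```python
import json
from collections import Counter


def count_errors_by_tool(loki_response):
    """Count error events per tool in a Loki query_range response."""
    counts = Counter()
    for stream in loki_response.get("data", {}).get("result", []):
        for _timestamp, line in stream.get("values", []):
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip lines that are not JSON events
            if not isinstance(event, dict):
                continue
            if event.get("status") == "error" or event.get("is_error") is True:
                # Assumed field: tool_name may not be present on every event.
                counts[event.get("tool_name", "unknown")] += 1
    return counts
```

`counts.most_common()` then gives the affected tools ordered by error frequency.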

analyze-performance

Identify slow operations and response sizes.

{job="claude_code_enhanced", event_type="tool_result"} | json | response_length > 50000

Output: Large responses, estimated token costs, slow patterns.
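One way to turn `response_length` into an estimated token cost is a characters-per-token heuristic. The sketch below assumes roughly 4 characters per token, which is a common rule of thumb and not a value stated by this skill; the 50,000-character threshold mirrors the query above. The function names are hypothetical.

```python
CHARS_PER_TOKEN = 4            # assumed heuristic, not from the skill
LARGE_RESPONSE_CHARS = 50_000  # threshold used in the query above


def estimate_tokens(response_length):
    """Estimate tokens from a character count, rounding up."""
    return -(-response_length // CHARS_PER_TOKEN)  # ceiling division


def find_large_responses(events):
    """Return events above the size threshold, largest first, with estimates."""
    large = [
        {**event, "estimated_tokens": estimate_tokens(event["response_length"])}
        for event in events
        if event.get("response_length", 0) > LARGE_RESPONSE_CHARS
    ]
    return sorted(large, key=lambda event: event["response_length"], reverse=True)
```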

analyze-costs

Calculate token usage from content size estimates.

sum by (repo) (sum_over_time({job="claude_code_enhanced", event_type="context_utilization"} | json | unwrap estimated_session_tokens [24h]))

Output: Token estimates by repo, session costs, projections.
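The projection step is simple arithmetic and might look like the sketch below. It assumes the query result has already been reduced to a mapping of repo name to 24-hour token estimate, and that usage stays flat over the projection window. No price is assumed: `usd_per_million_tokens` must be supplied by the caller. Both function names are hypothetical.

```python
def project_tokens(daily_tokens_by_repo, days=30):
    """Linearly project per-repo 24h token estimates over `days` days."""
    return {repo: tokens * days for repo, tokens in daily_tokens_by_repo.items()}


def estimate_cost(tokens, usd_per_million_tokens):
    """Convert a token count to a cost, given a caller-supplied price."""
    return tokens / 1_000_000 * usd_per_million_tokens
```

A flat projection ignores weekday and weekend variation, so it is best read as an order-of-magnitude figure.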

analyze-tools

Tool usage statistics and sequences.

sum by (tool) (count_over_time({job="claude_code_enhanced", event_type="tool_call"} | json [24h]))

Output: Call frequency, success rates, tool sequences, common patterns.
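Tool sequences can be derived by counting adjacent pairs of tool calls. The sketch below assumes the events all belong to one session and carry the `tool_name` and `sequence_position` fields listed in the Event Types Reference; the function name is hypothetical.

```python
from collections import Counter


def tool_transitions(tool_calls):
    """Count (previous_tool, next_tool) pairs within a single session."""
    ordered = sorted(tool_calls, key=lambda event: event["sequence_position"])
    names = [event["tool_name"] for event in ordered]
    # zip pairs each call with the one that follows it.
    return Counter(zip(names, names[1:]))
```

Calling `most_common()` on the result surfaces the most frequent transitions.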

analyze-sessions

Session lifecycle and duration analytics.

{job="claude_code_enhanced", event_type="session_end"} | json

Output: Session durations, turn counts, tools per session, termination reasons.
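A minimal sketch of the summary step, assuming the `session_end` events have already been parsed into dictionaries with `duration_seconds` and `turn_count`. The function name and the shape of the returned summary are illustrative only.

```python
from statistics import mean, median


def summarize_sessions(sessions):
    """Summarize duration and turn count across session_end events."""
    if not sessions:
        return {"count": 0}
    durations = [session["duration_seconds"] for session in sessions]
    turns = [session["turn_count"] for session in sessions]
    return {
        "count": len(sessions),
        "median_duration_seconds": median(durations),
        "longest_duration_seconds": max(durations),
        "mean_turn_count": mean(turns),
    }
```

The median is used for durations because a few very long sessions can skew the mean.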

analyze-conversations

Conversation and prompt analytics.

sum by (pattern) (count_over_time({job="claude_code_enhanced", event_type="user_prompt"} | json [24h]))

Output: Prompt patterns (question/debugging/creation/ultrathink), turn distribution.

analyze-subagents

Subagent/Task tool usage.

{job="claude_code_enhanced", event_type="tool_call", tool="Task"} | json

Output: Subagent types used, completion rates, parallel execution patterns.

analyze-skills

Skill invocation analytics.

sum by (skill_name) (count_over_time({job="claude_code_enhanced", event_type="skill_usage"} | json [24h]))

Output: Most used skills, skill usage by repo, trends.

analyze-context

Context window utilization.

{job="claude_code_enhanced", event_type="context_utilization"} | json | context_percentage > 50

Output: High utilization sessions, compaction events, token efficiency.

analyze-repos

Repository/project activity.

sum by (repo, tool) (count_over_time({job="claude_code_enhanced", event_type="tool_call"} | json [24h]))

Output: Activity per repo, tool usage by project, branch patterns.

generate-report

Comprehensive analysis report (all dimensions).

Output: Markdown report with errors, performance, costs, sessions, conversations, tools.

Key Queries

Enhanced Telemetry (Loki)

# All events (last hour)
{job="claude_code_enhanced"} | json

# Session analytics
{job="claude_code_enhanced", event_type="session_end"} | json | duration_seconds > 300

# Tool errors
{job="claude_code_enhanced", event_type="tool_result", status="error"} | json

# High context usage
{job="claude_code_enhanced", event_type="context_utilization"} | json | context_percentage > 75

# Subagent spawns
{job="claude_code_enhanced", event_type="tool_call", tool="Task"} | json

# Skill invocations
{job="claude_code_enhanced", event_type="skill_usage"} | json

# Prompt patterns
{job="claude_code_enhanced", event_type="user_prompt"} | json | pattern="ultrathink"

# Tool sequences
{job="claude_code_enhanced", event_type="tool_call"} | json | line_format "{{.tool_name}} → {{.previous_tool}}"

# Context compaction
{job="claude_code_enhanced", event_type="context_compact"} | json

# Permission requests
{job="claude_code_enhanced", event_type="permission_request"} | json

Default OTEL (Prometheus)

# Total token usage (7 days)
sum(increase(claude_code_token_usage[7d]))

# Error rate by tool
sum by (tool_name) (rate(claude_code_tool_result{status="failure"}[1h]))

# P95 tool latency
histogram_quantile(0.95, sum by (le) (rate(claude_code_tool_duration_bucket[5m])))

# Daily costs
sum(increase(claude_code_cost_usage[24h]))
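Queries such as these return an instant vector from the Prometheus HTTP API, where each sample has a `metric` label set and a `value` pair of `[timestamp, "string value"]`. The sketch below flattens such a response into a plain dictionary keyed by one label; the function name is hypothetical and the sample payload in the usage note is invented for illustration.

```python
def vector_to_dict(prom_response, label):
    """Map one label's value to the sample value for an instant-vector result."""
    if prom_response.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    data = prom_response["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']!r}")
    # Sample values arrive as strings, so convert them to float.
    return {
        sample["metric"].get(label, ""): float(sample["value"][1])
        for sample in data["result"]
    }
```

For example, applied to the error-rate query above with `label="tool_name"`, this would give one rate per tool. Aggregations with no labels, such as a bare `sum(...)`, produce a single entry under the empty-string key.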

Event Types Reference

| Event Type | Description | Key Fields |
| --- | --- | --- |
| session_start | Session initialization | source, permission_mode |
| session_end | Session termination | duration_seconds, turn_count, tools_used |
| user_prompt | User message submitted | pattern, prompt_length, estimated_tokens |
| tool_call | Tool invocation | tool_name, tool_details, sequence_position |
| tool_result | Tool completion | status, response_length, is_error |
| skill_usage | Skill invoked | skill_name |
| context_utilization | Token estimate | estimated_session_tokens, context_percentage |
| context_compact | Compaction event | trigger (manual/auto) |
| subagent_complete | Task agent finished | total_subagents |
| permission_request | Permission dialog | notification_type |
| notification | System notification | notification_type |
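The reference above can be encoded as a lookup table for sanity-checking parsed events. This is only a sketch: it assumes each event carries its type in an `event_type` field and treats every listed key field as expected, neither of which the skill states outright. The names `KEY_FIELDS` and `missing_key_fields` are hypothetical.

```python
# Key fields per event type, copied from the reference table above.
KEY_FIELDS = {
    "session_start": ("source", "permission_mode"),
    "session_end": ("duration_seconds", "turn_count", "tools_used"),
    "user_prompt": ("pattern", "prompt_length", "estimated_tokens"),
    "tool_call": ("tool_name", "tool_details", "sequence_position"),
    "tool_result": ("status", "response_length", "is_error"),
    "skill_usage": ("skill_name",),
    "context_utilization": ("estimated_session_tokens", "context_percentage"),
    "context_compact": ("trigger",),
    "subagent_complete": ("total_subagents",),
    "permission_request": ("notification_type",),
    "notification": ("notification_type",),
}


def missing_key_fields(event):
    """Return the key fields absent from an event, in table order."""
    event_type = event.get("event_type")
    if event_type not in KEY_FIELDS:
        raise ValueError(f"unknown event type: {event_type!r}")
    return [field for field in KEY_FIELDS[event_type] if field not in event]
```

An empty list means the event has every field the table lists for its type.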

Grafana Dashboards

  • Claude Code Overview - High-level metrics
  • Tool Performance - Tool latencies and success rates
  • Cost Analysis - Token usage and costs
  • Error Tracking - Error patterns and trends
  • Session Analytics - Session-level insights
  • Enhanced Analytics - Model/skill/context/repo tracking
  • Deep Analytics - Comprehensive conversation and tool analysis

Access: http://localhost:3000 (admin/admin)

Scripts

  • scripts/query-prometheus.sh - PromQL query helper
  • scripts/query-loki.sh - LogQL query helper
  • scripts/analyze-errors.sh - Error analysis automation
  • scripts/analyze-sessions.sh - Session analytics
  • scripts/generate-report.sh - Full analysis report

Skill Information

Category: Technical
Last Updated: 1/16/2026