Coralogix Analysis

by incidentfox


Expert guidance for analyzing Coralogix logs using partition-first methodology. Use when investigating errors, analyzing log patterns, or troubleshooting services via Coralogix.


---
name: coralogix-analysis
description: Expert guidance for analyzing Coralogix logs using partition-first methodology. Use when investigating errors, analyzing log patterns, or troubleshooting services via Coralogix.
---

Coralogix Log Analysis Methodology

Core Principle: Statistics Before Samples

NEVER start by reading raw logs. Always begin with aggregated statistics to understand the landscape.

3-Step Analysis Process

Step 1: Get the Big Picture

Start every Coralogix investigation with aggregate queries:

Use DataPrime aggregations:

# How many logs total?
source logs | stats count() as total

# What services are logging?
source logs | groupby $l.subsystemname aggregate count() as cnt | orderby cnt desc

# Error severity breakdown
source logs | groupby $m.severity aggregate count() as cnt

# Error volume over time (severity >= 4, i.e. Warning and above)
source logs | filter $m.severity >= '4' | timebucket 5m aggregate count() as errors

Questions to answer:

  • What's the total log volume in the time window?
  • Which services are most active?
  • What's the error rate (severity 4-6)?
  • Is the error rate increasing, stable, or decreasing?
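To answer the last question programmatically, the per-bucket error counts from the timebucket query can be classified client-side. This is a minimal sketch assuming the query tool returns one integer count per bucket; the actual output shape of search_coralogix_logs may differ.

```python
def classify_trend(error_counts, tolerance=0.2):
    """Classify a series of per-bucket error counts as increasing,
    decreasing, or stable by comparing the mean of the second half
    of the window against the first half. A relative tolerance keeps
    small fluctuations from being flagged as a trend.
    """
    if len(error_counts) < 2:
        return "stable"
    mid = len(error_counts) // 2
    first = sum(error_counts[:mid]) / mid
    second = sum(error_counts[mid:]) / (len(error_counts) - mid)
    if first == 0:
        return "increasing" if second > 0 else "stable"
    change = (second - first) / first
    if change > tolerance:
        return "increasing"
    if change < -tolerance:
        return "decreasing"
    return "stable"
```

The 20% tolerance is a starting point; tighten it for high-volume services where even small relative shifts matter.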

Step 2: Identify Error Patterns

Focus on error characteristics:

Most common errors:

source logs 
| filter $m.severity >= '5' 
| groupby $l.subsystemname aggregate count() as error_count 
| orderby error_count desc 
| limit 10

Error types by service:

source logs 
| filter $l.subsystemname == 'your-service' 
| filter $m.severity >= '5' 
| groupby extract($d.logRecord.body, 'Error|Exception') aggregate count()
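When the extract pattern gets too coarse, sampled messages can be bucketed into error types client-side instead. The patterns below are illustrative placeholders, not a fixed taxonomy; adapt them to the error strings your services actually emit.

```python
import re
from collections import Counter

# Illustrative patterns, checked in order; first match wins.
ERROR_PATTERNS = [
    ("timeout", re.compile(r"timed?\s?out", re.IGNORECASE)),
    ("connection", re.compile(r"connection (refused|reset)", re.IGNORECASE)),
    ("null_pointer", re.compile(r"NullPointerException")),
]

def classify(message):
    """Map one raw log message to a coarse error-type label."""
    for name, pattern in ERROR_PATTERNS:
        if pattern.search(message):
            return name
    return "other"

def error_type_counts(messages):
    """Tally error types across a sample of raw log messages."""
    return Counter(classify(m) for m in messages)
```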

Temporal clustering:

  • Did errors start at a specific time?
  • Correlation with deployments or traffic changes?
  • Is there periodicity (every N minutes)?
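A quick way to test for periodicity is to look at the gaps between consecutive spike timestamps from a timebucket query. This sketch assumes spike times are already extracted as numbers in a uniform unit (bucket indices or epoch seconds):

```python
from collections import Counter

def dominant_interval(spike_times, min_share=0.6):
    """Return the most common gap between consecutive spike timestamps
    if it accounts for at least `min_share` of all gaps, else None
    (i.e. no clear periodicity).
    """
    if len(spike_times) < 3:
        return None
    gaps = [b - a for a, b in zip(spike_times, spike_times[1:])]
    gap, count = Counter(gaps).most_common(1)[0]
    return gap if count / len(gaps) >= min_share else None
```

A non-None result (e.g. every 5 buckets) points at cron jobs, retry loops, or health-check cascades rather than organic traffic.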

Step 3: Sample Strategically

Only NOW examine actual log content:

Sample from peaks:

source logs 
| filter $l.subsystemname == 'problematic-service' 
| filter @timestamp >= '2024-01-15T14:30:00' and @timestamp <= '2024-01-15T14:35:00'
| filter $m.severity >= '5' 
| limit 10

Sample by error type:

  • Get 5-10 examples of each distinct error
  • Compare against baseline period (normal behavior)
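The per-type sampling rule above can be sketched as a small helper. It assumes logs arrive as dicts carrying an error-type field; the field name `error_type` is an assumption about your data shape, not a Coralogix field.

```python
from collections import defaultdict

def sample_per_type(logs, key="error_type", per_type=5):
    """Keep at most `per_type` examples of each distinct error type,
    preserving arrival order within each type.
    """
    buckets = defaultdict(list)
    for log in logs:
        bucket = buckets[log.get(key, "unknown")]
        if len(bucket) < per_type:
            bucket.append(log)
    return dict(buckets)
```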

Available Coralogix Tools

| Tool | When to Use |
| --- | --- |
| search_coralogix_logs | Execute any DataPrime query |
| list_coralogix_services | Discover active services |
| get_coralogix_error_logs | Get errors for a specific service |
| get_coralogix_service_health | Overall service health summary |
| get_coralogix_alerts | Check firing alerts |
| search_coralogix_traces | Distributed trace analysis |

DataPrime Syntax Quick Reference

Filters:

# Exact match (note: use == not =)
$l.subsystemname == 'api-server'

# Severity levels: 1=Debug, 2=Verbose, 3=Info, 4=Warning, 5=Error, 6=Critical
$m.severity >= '5'

# Text search (case-insensitive)
$d ~~ 'timeout'

# Combine filters
$l.subsystemname == 'api' && $m.severity >= '4'

Aggregations:

# Count
| aggregate count() as total

# Group by field
| groupby $l.subsystemname aggregate count() as cnt

# Time bucketing
| timebucket 5m aggregate count() as cnt

# Multiple aggregations
| groupby $l.subsystemname aggregate count() as cnt, avg($d.duration) as avg_duration

Common Fields:

  • $l.applicationname - Application/environment name
  • $l.subsystemname - Service name
  • $m.severity - Log level (1-6)
  • $d.logRecord.body - Log message content
  • @timestamp - Log timestamp

Anti-Patterns to Avoid

  1. Dumping raw logs first - Always start with statistics
  2. Unbounded queries - Always specify time ranges
  3. Ignoring severity - Filter by $m.severity to focus on errors
  4. Single service focus - Check dependencies and upstream services
  5. Missing temporal context - Correlate with deployments/changes
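Anti-patterns 2 and 3 can be enforced mechanically. This is a hypothetical helper (not part of any Coralogix tool) that refuses to build a query without time bounds, mirroring the filter syntax used in the examples above:

```python
def bounded_error_query(service, start, end, min_severity=5, limit=20):
    """Build a DataPrime query string that is always time-bounded,
    severity-filtered, and limited. Timestamps are ISO-8601 strings.
    """
    if not (start and end):
        raise ValueError("refusing to build an unbounded query")
    return (
        "source logs "
        f"| filter $l.subsystemname == '{service}' "
        f"| filter @timestamp >= '{start}' and @timestamp <= '{end}' "
        f"| filter $m.severity >= '{min_severity}' "
        f"| limit {limit}"
    )
```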

Investigation Template

## Coralogix Analysis Summary

### Time Window
- Start: [timestamp]
- End: [timestamp]
- Duration: X hours

### Statistics
- Total logs: X events
- Error count (severity >=5): Y events (Z%)
- Services affected: N services
- Error rate trend: [increasing/stable/decreasing]

### Top Error Services
1. [service1]: N errors
2. [service2]: M errors

### Error Pattern
- Primary error type: [description]
- First occurrence: [timestamp]
- Correlation: [deployment/traffic spike/external event]

### Sample Errors
[Quote 2-3 representative error messages with context]

### Root Cause Hypothesis
[Based on patterns observed in aggregations]

Pro Tips

Efficient querying:

  • Use list_coralogix_services to discover service names before filtering
  • Start with 1-hour windows, expand only if needed
  • Limit initial results to 20-50 entries

Pattern recognition:

  • Look for error clusters (many errors in short time)
  • Check if errors are distributed or focused on one service
  • Compare error types before/after a specific timestamp

Trace correlation:

  • Use search_coralogix_traces to connect logs to distributed traces
  • Filter traces by min_duration_ms to find slow requests
  • Correlate high-latency traces with error spikes
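The last correlation step can be sketched as an interval check: given slow-trace timestamps and error-spike windows (both assumed to be numbers in the same unit, e.g. epoch seconds), keep the traces that landed inside a spike.

```python
def traces_in_spikes(trace_times, spike_windows):
    """Return the trace timestamps that fall inside any error-spike
    window. `spike_windows` is a list of (start, end) pairs.
    """
    return [
        t for t in trace_times
        if any(start <= t <= end for start, end in spike_windows)
    ]
```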

