Newrelic
by opzero1
---
name: newrelic
description: Monitor applications, investigate performance issues, and analyze observability data in New Relic. Use when the user needs APM metrics, error tracking, infrastructure monitoring, or incident analysis.
metadata:
  short-description: New Relic observability and monitoring
---
New Relic
Overview
This skill provides a structured workflow for querying New Relic observability data. It ensures consistent integration with the New Relic MCP server for application performance monitoring (APM), error tracking, infrastructure metrics, distributed tracing, log analysis, and incident management.
Prerequisites
- New Relic MCP server must be connected and accessible via API key
- Confirm access to the relevant New Relic account and applications
- Ensure the `NEW_RELIC_API_KEY` environment variable is set
Required Workflow
Follow these steps in order. Do not skip steps.
Step 0: Set up New Relic MCP (if not already configured)
If any MCP call fails because New Relic MCP is not connected, pause and set it up:
- Add the New Relic MCP server:
  `codex mcp add newrelic --url https://mcp.newrelic.com/mcp/`
- Enable the remote MCP client:
  - Set `rmcp_client = true` under `[features]` in `config.toml`, or run `codex --enable rmcp_client`
- Configure API credentials:
  - Set the `NEW_RELIC_API_KEY` environment variable. The key should have appropriate permissions (read access at minimum; write access for incident acknowledgement)
  - Optional: set `NEW_RELIC_ACCOUNT_ID` as the default account (can be overridden per tool call)
After successful configuration, the user must restart codex. Finish your answer by telling them to restart; when they try again, they can continue with Step 1.
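The setup steps above can be consolidated into one sequence. This is a config sketch, not a script to paste blindly: the `codex mcp add` invocation and feature flag come from the steps above, while the key and account values are placeholders you must replace.

```shell
# One-time setup per machine. Values shown are placeholders.
codex mcp add newrelic --url https://mcp.newrelic.com/mcp/

# Enable the remote MCP client (alternatively, set
# [features] rmcp_client = true in config.toml)
codex --enable rmcp_client

# Credentials: a User API key with at least read access
export NEW_RELIC_API_KEY="NRAK-..."      # required
export NEW_RELIC_ACCOUNT_ID="1234567"    # optional default account
```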
Step 1
Clarify the user's goal and scope (e.g., performance investigation, error analysis, capacity planning, incident response). Confirm application names/IDs, time ranges, metric types, and alert priorities as needed.
Step 2
Select the appropriate workflow (see Practical Workflows below) and identify the New Relic MCP tools you will need. Confirm required identifiers (application name, entity GUID, incident ID) before calling tools.
Step 3
Execute New Relic MCP tool calls in logical batches:
- Query first (metrics, applications, incidents) to gather context
- Analyze patterns (errors, latency, throughput, resource usage)
- For complex investigations, explain the analysis approach before executing multiple queries
- Use NRQL for detailed analysis with proper time filters (`SINCE`, `UNTIL`)
Step 4
Summarize findings, highlight anomalies or trends, propose next actions (further investigation, configuration changes, alert acknowledgement, incident escalation), and provide actionable recommendations.
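Step 4's "highlight anomalies" can be done mechanically on the numbers a TIMESERIES query returns. A minimal sketch, using a simple z-score threshold; the function name and threshold are this example's own choices, not a New Relic API.

```python
# Flag data points in a metric series (e.g. an NRQL TIMESERIES result)
# that deviate sharply from the series mean.
from statistics import mean, stdev

def find_anomalies(series, threshold=2.0):
    """Return (index, value) pairs more than `threshold` standard
    deviations away from the mean of the series."""
    if len(series) < 2:
        return []
    mu, sigma = mean(series), stdev(series)
    if sigma == 0:
        return []
    return [(i, v) for i, v in enumerate(series) if abs(v - mu) / sigma > threshold]

# Example: steady response times with one latency spike
response_times_ms = [120, 118, 125, 122, 119, 121, 950, 123, 120]
print(find_anomalies(response_times_ms))  # → [(6, 950)]
```

A z-score check is deliberately crude; it is enough to direct attention to the right time window before drilling into NRQL.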
Available Tools
Application Performance: list_apm_applications, get_app_performance, get_app_errors, get_application_slow_transactions_details, get_application_top_database_operations_details
NRQL Queries: run_nrql_query, query_logs
Incident Management: list_open_incidents, list_open_incidents_rest, acknowledge_incident, list_alert_policies
Entity Discovery: search_entities, get_entity_details, list_related_entities
Infrastructure: get_infrastructure_hosts, get_metric_data_for_host, list_metric_names_for_host
Synthetic Monitoring: list_synthetics_monitors, create_simple_browser_monitor
Dashboards (if supported): list_dashboards, get_dashboard, create_dashboard
Practical Workflows
Performance Analysis
Goal: Identify slow endpoints, database bottlenecks, and optimize application performance.
Steps:
- List APM applications → Identify target application
- Get app performance metrics → Check response time, throughput, Apdex
- Query slow transactions → Find top 10 slowest endpoints with NRQL
- Analyze database operations → Identify slow queries
- Check infrastructure → Verify CPU/memory aren't bottlenecks
- Provide optimization recommendations
Example NRQL (FACET results are sorted by the first aggregate, so the slowest endpoints appear first):
SELECT average(duration), max(duration), count(*)
FROM Transaction
WHERE appName = 'MyApp'
FACET name
LIMIT 10
SINCE 24 hours ago
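When the same slow-transactions query is run repeatedly for different applications, assembling it programmatically keeps the time filter mandatory. A hypothetical helper (not part of the New Relic MCP server); the escaping rule reflects NRQL's doubled single quotes in string literals.

```python
# Hypothetical helper: build a slow-transactions NRQL query like the one
# above for a given application, always including SINCE and LIMIT.
def slow_transactions_nrql(app_name, since="24 hours ago", limit=10):
    # NRQL escapes a single quote inside a string literal by doubling it
    escaped = app_name.replace("'", "''")
    return (
        "SELECT average(duration), max(duration), count(*) "
        "FROM Transaction "
        f"WHERE appName = '{escaped}' "
        "FACET name "
        f"LIMIT {int(limit)} "
        f"SINCE {since}"
    )

print(slow_transactions_nrql("MyApp"))
```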
Error Investigation
Goal: Diagnose application errors, find root causes, track error rates.
Steps:
- Get app errors → Check error rate and count
- Query detailed errors → Find error messages and stack traces
- Group by error type → Identify most common errors
- Correlate with transactions → Find affected endpoints
- Check for recent deployments or changes
- Provide root cause analysis
Example NRQL (every TransactionError event is already an error, so no extra error filter is needed):
SELECT errorMessage, error.class, stackTrace, duration, transactionName
FROM TransactionError
WHERE appName = 'MyApp'
ORDER BY timestamp DESC
LIMIT 50
SINCE 30 minutes ago
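The "group by error type" step can be done on the rows the query above returns. An illustrative post-processing sketch; the `error.class` field name mirrors the NRQL attribute, and the sample rows are invented.

```python
# Group error-event rows (e.g. rows returned by run_nrql_query) by error
# class to surface the most common errors first.
from collections import Counter

def top_error_classes(events, n=3):
    counts = Counter(e.get("error.class", "unknown") for e in events)
    return counts.most_common(n)

sample = [
    {"error.class": "TimeoutError", "transactionName": "/checkout"},
    {"error.class": "TimeoutError", "transactionName": "/cart"},
    {"error.class": "ValueError", "transactionName": "/search"},
]
print(top_error_classes(sample))  # → [('TimeoutError', 2), ('ValueError', 1)]
```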
Incident Triage
Goal: Monitor alerts, acknowledge incidents, reduce MTTR.
Steps:
- List open incidents (filter by CRITICAL/WARNING)
- Get incident details and affected entities
- Acknowledge critical incidents with status update
- Query related metrics to understand impact
- Check for cascading failures or dependencies
- Provide incident summary and next steps
Example:
1. list_open_incidents(priority="CRITICAL")
2. get_entity_details(guid="ENTITY_GUID")
3. acknowledge_incident(incident_id=12345, message="Investigating payment latency")
4. run_nrql_query("SELECT * FROM Transaction WHERE appName='PaymentService' AND error IS true SINCE 30 minutes ago")
Log Analysis
Goal: Search application logs to debug issues and understand system behavior.
Steps:
- Query logs with keywords or patterns
- Filter by log level (ERROR, WARN, INFO)
- Group by application or environment
- Correlate with trace IDs or transaction IDs
- Identify patterns and anomalies
Example NRQL:
SELECT timestamp, level, message, logger, threadName
FROM Log
WHERE message LIKE '%timeout%'
AND level = 'ERROR'
SINCE 1 hour ago
ORDER BY timestamp DESC
LIMIT 100
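Correlating by trace ID, as the steps above suggest, amounts to grouping log rows on a shared identifier. A sketch under the assumption that logs carry a `trace.id` attribute; adjust the field name to however your logs are decorated.

```python
# Group log rows (e.g. from query_logs) by trace ID so every line from one
# failing request can be read together.
from collections import defaultdict

def group_by_trace(logs):
    grouped = defaultdict(list)
    for row in logs:
        grouped[row.get("trace.id", "no-trace")].append(row["message"])
    return dict(grouped)

logs = [
    {"trace.id": "abc", "message": "request received"},
    {"trace.id": "abc", "message": "db timeout after 5000ms"},
    {"trace.id": "def", "message": "request received"},
]
print(group_by_trace(logs)["abc"])  # → ['request received', 'db timeout after 5000ms']
```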
Capacity Planning
Goal: Analyze resource usage trends and forecast scaling needs.
Steps:
- Get infrastructure hosts → Check CPU, memory, disk
- Query throughput trends over time
- Analyze peak vs. average load
- Check database connection pool usage
- Identify resource constraints
- Provide scaling recommendations
Example NRQL:
SELECT average(cpuPercent), max(cpuPercent), average(memoryUsedPercent)
FROM SystemSample
SINCE 7 days ago
FACET hostname
TIMESERIES 1 day
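The peak-vs-average comparison from the steps above is simple arithmetic once the SystemSample rows are in hand. An illustrative sketch; the row shape mirrors the attributes queried above, and the sample values are invented.

```python
# Compare peak vs average CPU per host to spot hosts that need headroom.
from collections import defaultdict

def peak_vs_average(samples):
    by_host = defaultdict(list)
    for s in samples:
        by_host[s["hostname"]].append(s["cpuPercent"])
    return {
        host: {"avg": sum(v) / len(v), "peak": max(v)}
        for host, v in by_host.items()
    }

samples = [
    {"hostname": "web-1", "cpuPercent": 40.0},
    {"hostname": "web-1", "cpuPercent": 95.0},
    {"hostname": "web-2", "cpuPercent": 30.0},
]
stats = peak_vs_average(samples)
print(stats["web-1"])  # → {'avg': 67.5, 'peak': 95.0}
```

A large gap between peak and average usually argues for autoscaling rather than permanently larger hosts.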
Infrastructure Health Check
Goal: Monitor host-level metrics and identify resource constraints.
Steps:
- List infrastructure hosts → Get all hosts
- Check CPU usage → Identify high CPU hosts
- Check memory usage → Identify memory pressure
- Check disk usage → Identify storage issues
- Correlate with application performance
- Provide health summary
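The health summary in the steps above can be reduced to threshold checks over host metrics. A sketch only: the threshold values are arbitrary examples, not New Relic defaults, and the metric names mirror SystemSample attributes.

```python
# Flag hosts whose resource metrics exceed example thresholds.
THRESHOLDS = {"cpuPercent": 85.0, "memoryUsedPercent": 90.0, "diskUsedPercent": 80.0}

def unhealthy_hosts(hosts):
    flagged = {}
    for h in hosts:
        breaches = [m for m, limit in THRESHOLDS.items() if h.get(m, 0) > limit]
        if breaches:
            flagged[h["hostname"]] = breaches
    return flagged

hosts = [
    {"hostname": "web-1", "cpuPercent": 92.0, "memoryUsedPercent": 40.0, "diskUsedPercent": 55.0},
    {"hostname": "web-2", "cpuPercent": 30.0, "memoryUsedPercent": 95.0, "diskUsedPercent": 85.0},
    {"hostname": "web-3", "cpuPercent": 20.0, "memoryUsedPercent": 50.0, "diskUsedPercent": 60.0},
]
print(unhealthy_hosts(hosts))
# → {'web-1': ['cpuPercent'], 'web-2': ['memoryUsedPercent', 'diskUsedPercent']}
```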
Synthetic Monitoring
Goal: Proactively monitor availability and performance from external locations.
Steps:
- List synthetic monitors → Check status
- Identify failed monitors
- Analyze failure patterns (geographic, time-based)
- Check success rate trends
- Create new monitors for critical endpoints
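Checking success-rate trends from the steps above is a matter of counting passes per monitor against an availability target. An illustrative sketch; the input shape (monitor name mapped to pass/fail results) is this example's own, not the output format of `list_synthetics_monitors`.

```python
# Flag synthetic monitors whose recent success rate is below a target.
def failing_monitors(checks, target=0.99):
    """checks: {monitor_name: [True/False per check result]}"""
    report = {}
    for name, results in checks.items():
        rate = sum(results) / len(results)
        if rate < target:
            report[name] = round(rate, 3)
    return report

checks = {
    "homepage": [True] * 99 + [False],       # 99% — meets the 99% target
    "checkout": [True] * 90 + [False] * 10,  # 90% — flagged
}
print(failing_monitors(checks))  # → {'checkout': 0.9}
```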
Tips for Maximum Productivity
- Always use time filters: Add a `SINCE` clause to NRQL queries to limit data volume (e.g., `SINCE 1 hour ago`, `SINCE 24 hours ago`)
- Start broad, drill down: Begin with high-level metrics (app performance, error rate), then query details
- Use FACET for grouping: Group results by endpoint, error type, or host (e.g., `FACET name`, `FACET error.class`)
- Leverage TIMESERIES: Visualize trends over time (e.g., `TIMESERIES 5 minutes`, `TIMESERIES 1 day`)
- Combine data sources: Correlate `Transaction` with `TransactionError`, and `SystemSample` with `Transaction`
- Cache application IDs: Reuse application names/GUIDs across multiple queries
- Batch related queries: Execute multiple NRQL queries in parallel when investigating complex issues
- Use ORDER BY: Rank raw-event results (e.g., most recent errors) with `ORDER BY timestamp DESC`; FACET queries already sort by their first aggregate
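Outside the MCP tools, batched NRQL can also be issued against New Relic's NerdGraph GraphQL API. A sketch of the request payload only, following NerdGraph's documented `actor.account.nrql` query shape; endpoint URL, auth headers, and HTTP transport are deliberately omitted, and the account ID is a placeholder.

```python
# Build a NerdGraph request body that embeds an NRQL query as a variable.
import json

def nerdgraph_payload(account_id, nrql):
    query = """
    query($accountId: Int!, $nrql: Nrql!) {
      actor { account(id: $accountId) { nrql(query: $nrql) { results } } }
    }
    """
    return json.dumps({"query": query, "variables": {"accountId": account_id, "nrql": nrql}})

payload = nerdgraph_payload(1234567, "SELECT count(*) FROM Transaction SINCE 1 hour ago")
print(json.loads(payload)["variables"]["nrql"])
```

Passing the NRQL as a GraphQL variable, rather than string-interpolating it into the query document, avoids escaping problems with quotes inside the NRQL.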
Troubleshooting
- Authentication Errors: Verify `NEW_RELIC_API_KEY` has appropriate permissions; check the account ID is correct; re-authenticate if needed
- Query Timeouts: Reduce the time range (use shorter `SINCE` intervals); limit result sets with `LIMIT`; avoid complex aggregations without filters
- Missing Data: Confirm application instrumentation is active; check data retention policies; verify the entity is reporting
- Rate Limits: Batch queries; use specific filters to reduce data volume; implement exponential backoff for retries
- NRQL Syntax Errors: Validate metric names with `list_metric_names_for_host`; check NRQL syntax at https://docs.newrelic.com/docs/query-your-data/nrql-new-relic-query-language/get-started/introduction-nrql-new-relics-query-language/
- Incident Acknowledgement Failures: Verify the incident ID is correct; check the incident state (already-closed incidents can't be acknowledged); ensure the API key has write permissions
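The exponential backoff suggested for rate limits can be sketched as a small retry wrapper. Names here are illustrative: `run_query` stands in for any MCP or API call, and `RuntimeError` stands in for whatever rate-limit error that call raises.

```python
# Retry a rate-limited call with exponential delay plus jitter.
import random
import time

def with_backoff(run_query, max_attempts=5, base_delay=1.0):
    for attempt in range(max_attempts):
        try:
            return run_query()
        except RuntimeError:  # stand-in for a rate-limit (429) error
            if attempt == max_attempts - 1:
                raise
            # Delays grow 1x, 2x, 4x, ... of base_delay, with proportional jitter
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))

print(with_backoff(lambda: "ok"))  # succeeds on the first attempt, no delay
```

The jitter spreads retries out so that many clients hitting the same limit do not retry in lockstep.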
Best Practices
Query Optimization
- Use `LIMIT` to prevent excessive result sets (100-1000 rows)
- Apply `WHERE` filters before `FACET` for better performance
- Use `TIMESERIES` with appropriate intervals (1 minute for real-time, 1 day for trends)
- Avoid `SELECT *`; specify only the attributes you need
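These rules can be enforced with a pre-flight check before a query is sent. A sketch only: the function, its warning texts, and the substring heuristics are this example's own, not a New Relic API, and a real NRQL parser would be more robust.

```python
# Warn about NRQL queries that violate the optimization rules above.
def nrql_warnings(query):
    q = query.upper()
    warnings = []
    if "SINCE" not in q:
        warnings.append("add a SINCE clause to bound the time range")
    if "LIMIT" not in q and "TIMESERIES" not in q:
        warnings.append("add LIMIT to cap result size")
    if "SELECT *" in q:
        warnings.append("avoid SELECT *; list needed attributes")
    return warnings

print(nrql_warnings("SELECT * FROM Transaction"))  # all three rules violated
```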
Security
- Use read-only API keys for monitoring agents
- Grant write permissions only for incident management
- Rotate API keys regularly
- Never commit API keys to version control
Investigation Methodology
- Scope: Identify affected applications and time range
- Metrics: Gather high-level performance data
- Errors: Check for error spikes or patterns
- Infrastructure: Verify resources aren't constrained
- Logs: Search for error messages and stack traces
- Correlate: Connect metrics, errors, and logs
- Root Cause: Provide evidence-based diagnosis
- Recommend: Actionable next steps