Newrelic
by opzero1
---
name: newrelic
description: Monitor applications, investigate performance issues, and analyze observability data in New Relic. Use when the user needs APM metrics, error tracking, infrastructure monitoring, or incident analysis.
metadata:
  short-description: New Relic observability and monitoring
---
New Relic
Overview
This skill provides a structured workflow for querying New Relic observability data. It ensures consistent integration with the New Relic MCP server for application performance monitoring (APM), error tracking, infrastructure metrics, distributed tracing, log analysis, and incident management.
Prerequisites
- New Relic MCP server must be connected and accessible via API key
- Confirm access to the relevant New Relic account and applications
- Ensure the `NEW_RELIC_API_KEY` environment variable is set
Required Workflow
Follow these steps in order. Do not skip steps.
Step 0: Set up New Relic MCP (if not already configured)
If any MCP call fails because New Relic MCP is not connected, pause and set it up:
- Add the New Relic MCP server:
  `codex mcp add newrelic --url https://mcp.newrelic.com/mcp/`
- Enable the remote MCP client:
  - Set `rmcp_client = true` under `[features]` in `config.toml`, or run `codex --enable rmcp_client`
- Configure API credentials:
  - Set the `NEW_RELIC_API_KEY` environment variable. The key should have appropriate permissions (read access at minimum; write access for incident acknowledgement)
  - Optional: set `NEW_RELIC_ACCOUNT_ID` as the default account (can be overridden per tool call)
After successful configuration, the user must restart codex. Finish your answer by telling them to restart; when they try again, they can continue with Step 1.
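The setup steps above can be consolidated into one sequence. This is a config sketch, not a script to paste blindly: the `codex mcp add` invocation and feature flag come from the steps above, while the key and account values are placeholders you must replace.

```shell
# One-time setup per machine. Values shown are placeholders.
codex mcp add newrelic --url https://mcp.newrelic.com/mcp/

# Enable the remote MCP client (alternatively, set
# [features] rmcp_client = true in config.toml)
codex --enable rmcp_client

# Credentials: a User API key with at least read access
export NEW_RELIC_API_KEY="NRAK-..."      # required
export NEW_RELIC_ACCOUNT_ID="1234567"    # optional default account
```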
Step 1
Clarify the user's goal and scope (e.g., performance investigation, error analysis, capacity planning, incident response). Confirm application names/IDs, time ranges, metric types, and alert priorities as needed.
Step 2
Select the appropriate workflow (see Practical Workflows below) and identify the New Relic MCP tools you will need. Confirm required identifiers (application name, entity GUID, incident ID) before calling tools.
Step 3
Execute New Relic MCP tool calls in logical batches:
- Query first (metrics, applications, incidents) to gather context
- Analyze patterns (errors, latency, throughput, resource usage)
- For complex investigations, explain the analysis approach before executing multiple queries
- Use NRQL for detailed analysis with proper time filters (`SINCE`, `UNTIL`)
Step 4
Summarize findings, highlight anomalies or trends, propose next actions (further investigation, configuration changes, alert acknowledgement, incident escalation), and provide actionable recommendations.
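Step 4's "highlight anomalies" can be done mechanically on the numbers a TIMESERIES query returns. A minimal sketch, using a simple z-score threshold; the function name and threshold are this example's own choices, not a New Relic API.

```python
# Flag data points in a metric series (e.g. an NRQL TIMESERIES result)
# that deviate sharply from the series mean.
from statistics import mean, stdev

def find_anomalies(series, threshold=2.0):
    """Return (index, value) pairs more than `threshold` standard
    deviations away from the mean of the series."""
    if len(series) < 2:
        return []
    mu, sigma = mean(series), stdev(series)
    if sigma == 0:
        return []
    return [(i, v) for i, v in enumerate(series) if abs(v - mu) / sigma > threshold]

# Example: steady response times with one latency spike
response_times_ms = [120, 118, 125, 122, 119, 121, 950, 123, 120]
print(find_anomalies(response_times_ms))  # → [(6, 950)]
```

A z-score check is deliberately crude; it is enough to direct attention to the right time window before drilling into NRQL.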
Available Tools
Application Performance: list_apm_applications, get_app_performance, get_app_errors, get_application_slow_transactions_details, get_application_top_database_operations_details
NRQL Queries: run_nrql_query, query_logs
Incident Management: list_open_incidents, list_open_incidents_rest, acknowledge_incident, list_alert_policies
Entity Discovery: search_entities, get_entity_details, list_related_entities
Infrastructure: get_infrastructure_hosts, get_metric_data_for_host, list_metric_names_for_host
Synthetic Monitoring: list_synthetics_monitors, create_simple_browser_monitor
Dashboards (if supported): list_dashboards, get_dashboard, create_dashboard
Practical Workflows
Performance Analysis
Goal: Identify slow endpoints, database bottlenecks, and optimize application performance.
Steps:
- List APM applications → Identify target application
- Get app performance metrics → Check response time, throughput, Apdex
- Query slow transactions → Find top 10 slowest endpoints with NRQL
- Analyze database operations → Identify slow queries
- Check infrastructure → Verify CPU/memory aren't bottlenecks
- Provide optimization recommendations
Example NRQL (FACET results are sorted by the first aggregate, so the slowest endpoints appear first):
SELECT average(duration), max(duration), count(*)
FROM Transaction
WHERE appName = 'MyApp'
FACET name
LIMIT 10
SINCE 24 hours ago
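When the same slow-transactions query is run repeatedly for different applications, assembling it programmatically keeps the time filter mandatory. A hypothetical helper (not part of the New Relic MCP server); the escaping rule reflects NRQL's doubled single quotes in string literals.

```python
# Hypothetical helper: build a slow-transactions NRQL query like the one
# above for a given application, always including SINCE and LIMIT.
def slow_transactions_nrql(app_name, since="24 hours ago", limit=10):
    # NRQL escapes a single quote inside a string literal by doubling it
    escaped = app_name.replace("'", "''")
    return (
        "SELECT average(duration), max(duration), count(*) "
        "FROM Transaction "
        f"WHERE appName = '{escaped}' "
        "FACET name "
        f"LIMIT {int(limit)} "
        f"SINCE {since}"
    )

print(slow_transactions_nrql("MyApp"))
```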
Error Investigation
Goal: Diagnose application errors, find root causes, track error rates.
Steps:
- Get app errors → Check error rate and count
- Query detailed errors → Find error messages and stack traces
- Group by error type → Identify most common errors
- Correlate with transactions → Find affected endpoints
- Check for recent deployments or changes
- Provide root cause analysis
Example NRQL (every TransactionError event is already an error, so no extra error filter is needed):
SELECT errorMessage, error.class, stackTrace, duration, transactionName
FROM TransactionError
WHERE appName = 'MyApp'
ORDER BY timestamp DESC
LIMIT 50
SINCE 30 minutes ago
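The "group by error type" step can be done on the rows the query above returns. An illustrative post-processing sketch; the `error.class` field name mirrors the NRQL attribute, and the sample rows are invented.

```python
# Group error-event rows (e.g. rows returned by run_nrql_query) by error
# class to surface the most common errors first.
from collections import Counter

def top_error_classes(events, n=3):
    counts = Counter(e.get("error.class", "unknown") for e in events)
    return counts.most_common(n)

sample = [
    {"error.class": "TimeoutError", "transactionName": "/checkout"},
    {"error.class": "TimeoutError", "transactionName": "/cart"},
    {"error.class": "ValueError", "transactionName": "/search"},
]
print(top_error_classes(sample))  # → [('TimeoutError', 2), ('ValueError', 1)]
```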
Incident Triage
Goal: Monitor alerts, acknowledge incidents, reduce MTTR.
Steps:
- List open incidents (filter by CRITICAL/WARNING)
- Get incident details and affected entities
- Acknowledge critical incidents with status update
- Query related metrics to understand impact
- Check for cascading failures or dependencies
- Provide incident summary and next steps
Example:
1. list_open_incidents(priority="CRITICAL")
2. get_entity_details(guid="ENTITY_GUID")
3. acknowledge_incident(incident_id=12345, message="Investigating payment latency")
4. run_nrql_query("SELECT * FROM Transaction WHERE appName='PaymentService' AND error IS true SINCE 30 minutes ago")
Log Analysis
Goal: Search application logs to debug issues and understand system behavior.
Steps:
- Query logs with keywords or patterns
- Filter by log level (ERROR, WARN, INFO)
- Group by application or environment
- Correlate with trace IDs or transaction IDs
- Identify patterns and anomalies
Example NRQL:
SELECT timestamp, level, message, logger, threadName
FROM Log
WHERE message LIKE '%timeout%'
AND level = 'ERROR'
SINCE 1 hour ago
ORDER BY timestamp DESC
LIMIT 100
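Correlating by trace ID, as the steps above suggest, amounts to grouping log rows on a shared identifier. A sketch under the assumption that logs carry a `trace.id` attribute; adjust the field name to however your logs are decorated.

```python
# Group log rows (e.g. from query_logs) by trace ID so every line from one
# failing request can be read together.
from collections import defaultdict

def group_by_trace(logs):
    grouped = defaultdict(list)
    for row in logs:
        grouped[row.get("trace.id", "no-trace")].append(row["message"])
    return dict(grouped)

logs = [
    {"trace.id": "abc", "message": "request received"},
    {"trace.id": "abc", "message": "db timeout after 5000ms"},
    {"trace.id": "def", "message": "request received"},
]
print(group_by_trace(logs)["abc"])  # → ['request received', 'db timeout after 5000ms']
```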
Capacity Planning
Goal: Analyze resource usage trends and forecast scaling needs.
Steps:
- Get infrastructure hosts → Check CPU, memory, disk
- Query throughput trends over time
- Analyze peak vs. average load
- Check database connection pool usage
- Identify resource constraints
- Provide scaling recommendations
Example NRQL:
SELECT average(cpuPercent), max(cpuPercent), average(memoryUsedPercent)
FROM SystemSample
SINCE 7 days ago
FACET hostname
TIMESERIES 1 day
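The peak-vs-average comparison from the steps above is simple arithmetic once the SystemSample rows are in hand. An illustrative sketch; the row shape mirrors the attributes queried above, and the sample values are invented.

```python
# Compare peak vs average CPU per host to spot hosts that need headroom.
from collections import defaultdict

def peak_vs_average(samples):
    by_host = defaultdict(list)
    for s in samples:
        by_host[s["hostname"]].append(s["cpuPercent"])
    return {
        host: {"avg": sum(v) / len(v), "peak": max(v)}
        for host, v in by_host.items()
    }

samples = [
    {"hostname": "web-1", "cpuPercent": 40.0},
    {"hostname": "web-1", "cpuPercent": 95.0},
    {"hostname": "web-2", "cpuPercent": 30.0},
]
stats = peak_vs_average(samples)
print(stats["web-1"])  # → {'avg': 67.5, 'peak': 95.0}
```

A large gap between peak and average usually argues for autoscaling rather than permanently larger hosts.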
Infrastructure Health Check
Goal: Monitor host-level metrics and identify resource constraints.
Steps:
- List infrastructure hosts → Get all hosts
- Check CPU usage → Identify high CPU hosts
- Check memory usage → Identify memory pressure
- Check disk usage → Identify storage issues
- Correlate with application performance
- Provide health summary
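The health summary in the steps above can be reduced to threshold checks over host metrics. A sketch only: the threshold values are arbitrary examples, not New Relic defaults, and the metric names mirror SystemSample attributes.

```python
# Flag hosts whose resource metrics exceed example thresholds.
THRESHOLDS = {"cpuPercent": 85.0, "memoryUsedPercent": 90.0, "diskUsedPercent": 80.0}

def unhealthy_hosts(hosts):
    flagged = {}
    for h in hosts:
        breaches = [m for m, limit in THRESHOLDS.items() if h.get(m, 0) > limit]
        if breaches:
            flagged[h["hostname"]] = breaches
    return flagged

hosts = [
    {"hostname": "web-1", "cpuPercent": 92.0, "memoryUsedPercent": 40.0, "diskUsedPercent": 55.0},
    {"hostname": "web-2", "cpuPercent": 30.0, "memoryUsedPercent": 95.0, "diskUsedPercent": 85.0},
    {"hostname": "web-3", "cpuPercent": 20.0, "memoryUsedPercent": 50.0, "diskUsedPercent": 60.0},
]
print(unhealthy_hosts(hosts))
# → {'web-1': ['cpuPercent'], 'web-2': ['memoryUsedPercent', 'diskUsedPercent']}
```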
Synthetic Monitoring
Goal: Proactively monitor availability and performance from external locations.
Steps:
- List synthetic monitors → Check status
- Identify failed monitors
- Analyze failure patterns (geographic, time-based)
- Check success rate trends
- Create new monitors for critical endpoints
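Checking success-rate trends from the steps above is a matter of counting passes per monitor against an availability target. An illustrative sketch; the input shape (monitor name mapped to pass/fail results) is this example's own, not the output format of `list_synthetics_monitors`.

```python
# Flag synthetic monitors whose recent success rate is below a target.
def failing_monitors(checks, target=0.99):
    """checks: {monitor_name: [True/False per check result]}"""
    report = {}
    for name, results in checks.items():
        rate = sum(results) / len(results)
        if rate < target:
            report[name] = round(rate, 3)
    return report

checks = {
    "homepage": [True] * 99 + [False],       # 99% — meets the 99% target
    "checkout": [True] * 90 + [False] * 10,  # 90% — flagged
}
print(failing_monitors(checks))  # → {'checkout': 0.9}
```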
Tips for Maximum Productivity
- Always use time filters: Add a `SINCE` clause to NRQL queries to limit data volume (e.g., `SINCE 1 hour ago`, `SINCE 24 hours ago`)
- Start broad, drill down: Begin with high-level metrics (app performance, error rate), then query details
- Use FACET for grouping: Group results by endpoint, error type, or host (e.g., `FACET name`, `FACET error.class`)
- Leverage TIMESERIES: Visualize trends over time (e.g., `TIMESERIES 5 minutes`, `TIMESERIES 1 day`)
- Combine data sources: Correlate `Transaction` with `TransactionError`, and `SystemSample` with `Transaction`
- Cache application IDs: Reuse application names/GUIDs across multiple queries
- Batch related queries: Execute multiple NRQL queries in parallel when investigating complex issues
- Use ORDER BY: Rank raw-event results (e.g., most recent errors) with `ORDER BY timestamp DESC`; FACET queries already sort by their first aggregate
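Outside the MCP tools, batched NRQL can also be issued against New Relic's NerdGraph GraphQL API. A sketch of the request payload only, following NerdGraph's documented `actor.account.nrql` query shape; endpoint URL, auth headers, and HTTP transport are deliberately omitted, and the account ID is a placeholder.

```python
# Build a NerdGraph request body that embeds an NRQL query as a variable.
import json

def nerdgraph_payload(account_id, nrql):
    query = """
    query($accountId: Int!, $nrql: Nrql!) {
      actor { account(id: $accountId) { nrql(query: $nrql) { results } } }
    }
    """
    return json.dumps({"query": query, "variables": {"accountId": account_id, "nrql": nrql}})

payload = nerdgraph_payload(1234567, "SELECT count(*) FROM Transaction SINCE 1 hour ago")
print(json.loads(payload)["variables"]["nrql"])
```

Passing the NRQL as a GraphQL variable, rather than string-interpolating it into the query document, avoids escaping problems with quotes inside the NRQL.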
Troubleshooting
- Authentication Errors: Verify `NEW_RELIC_API_KEY` has appropriate permissions; check the account ID is correct; re-authenticate if needed
- Query Timeouts: Reduce the time range (use shorter `SINCE` intervals); limit result sets with `LIMIT`; avoid complex aggregations without filters
- Missing Data: Confirm application instrumentation is active; check data retention policies; verify the entity is reporting
- Rate Limits: Batch queries; use specific filters to reduce data volume; implement exponential backoff for retries
- NRQL Syntax Errors: Validate metric names with `list_metric_names_for_host`; check NRQL syntax at https://docs.newrelic.com/docs/query-your-data/nrql-new-relic-query-language/get-started/introduction-nrql-new-relics-query-language/
- Incident Acknowledgement Failures: Verify the incident ID is correct; check the incident state (already-closed incidents can't be acknowledged); ensure the API key has write permissions
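The exponential backoff suggested for rate limits can be sketched as a small retry wrapper. Names here are illustrative: `run_query` stands in for any MCP or API call, and `RuntimeError` stands in for whatever rate-limit error that call raises.

```python
# Retry a rate-limited call with exponential delay plus jitter.
import random
import time

def with_backoff(run_query, max_attempts=5, base_delay=1.0):
    for attempt in range(max_attempts):
        try:
            return run_query()
        except RuntimeError:  # stand-in for a rate-limit (429) error
            if attempt == max_attempts - 1:
                raise
            # Delays grow 1x, 2x, 4x, ... of base_delay, with proportional jitter
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))

print(with_backoff(lambda: "ok"))  # succeeds on the first attempt, no delay
```

The jitter spreads retries out so that many clients hitting the same limit do not retry in lockstep.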
Best Practices
Query Optimization
- Use `LIMIT` to prevent excessive result sets (100-1000 rows)
- Apply `WHERE` filters before `FACET` for better performance
- Use `TIMESERIES` with appropriate intervals (1 minute for real-time, 1 day for trends)
- Avoid `SELECT *`; specify only the attributes you need
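These rules can be enforced with a pre-flight check before a query is sent. A sketch only: the function, its warning texts, and the substring heuristics are this example's own, not a New Relic API, and a real NRQL parser would be more robust.

```python
# Warn about NRQL queries that violate the optimization rules above.
def nrql_warnings(query):
    q = query.upper()
    warnings = []
    if "SINCE" not in q:
        warnings.append("add a SINCE clause to bound the time range")
    if "LIMIT" not in q and "TIMESERIES" not in q:
        warnings.append("add LIMIT to cap result size")
    if "SELECT *" in q:
        warnings.append("avoid SELECT *; list needed attributes")
    return warnings

print(nrql_warnings("SELECT * FROM Transaction"))  # all three rules violated
```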
Security
- Use read-only API keys for monitoring agents
- Grant write permissions only for incident management
- Rotate API keys regularly
- Never commit API keys to version control
Investigation Methodology
- Scope: Identify affected applications and time range
- Metrics: Gather high-level performance data
- Errors: Check for error spikes or patterns
- Infrastructure: Verify resources aren't constrained
- Logs: Search for error messages and stack traces
- Correlate: Connect metrics, errors, and logs
- Root Cause: Provide evidence-based diagnosis
- Recommend: Actionable next steps