Hepdata
by fundamental-physics
Use when the user mentions 'HEPData' or asks to find experimental data, download data tables from HEP papers, get digitized plots, or retrieve cross-section measurements. HEPData contains data points behind figures in high-energy physics publications.
HEPData Search and Download Skill
Search and download experimental data tables from HEPData, the high-energy physics data repository.
Requires: requests (pip install requests)
What is HEPData?
HEPData stores the actual data points behind plots and tables in HEP publications:
- Cross-section measurements
- Exclusion limits
- Differential distributions
- Digitized figures
Data is linked to papers via INSPIRE IDs and available in multiple formats.
Basic Usage
# Get record by HEPData ID
python scripts/hepdata.py ins1234567
# Get record by INSPIRE ID
python scripts/hepdata.py 1234567 --inspire
# Get record by arXiv ID
python scripts/hepdata.py 1907.12345 --arxiv
# List tables in a record
python scripts/hepdata.py 1234567 --inspire --tables
Searching
# Search by keywords
python scripts/hepdata.py --search "Higgs cross section"
# Search by reaction
python scripts/hepdata.py --search 'reactions:"P P --> TOP TOPBAR"'
# Search by collaboration
python scripts/hepdata.py --search "collaboration:ATLAS"
# Search by observable
python scripts/hepdata.py --search "observables:SIG"
# Limit results
python scripts/hepdata.py --search "dark matter" -n 5
Search Query Syntax
- reactions:"P P --> X" - Search by reaction
- collaboration:NAME - Filter by collaboration (ATLAS, CMS, LHCb, etc.)
- observables:TYPE - Filter by observable type
- cmenergies:13000 - Center-of-mass energy in GeV
- keywords:term - Search keywords
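The field filters above can be combined with free text, as in the `collaboration:ATLAS SUSY exclusion` example elsewhere in this document. The sketch below shows one way to assemble such a string in Python. `build_query` is a hypothetical helper, not part of `scripts/hepdata.py`, and it assumes that space-separated terms are how filters are combined; the source does not state how the search backend interprets them.

```python
def build_query(free_text="", **fields):
    """Join field filters and optional free text into one search string.

    Hypothetical helper for illustration; field names are passed through
    unchanged, so callers are responsible for using valid ones.
    """
    parts = []
    for name, value in fields.items():
        value = str(value)
        # Quote values that contain spaces, e.g. reactions:"P P --> X".
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    if free_text:
        parts.append(free_text)
    return " ".join(parts)


query = build_query("SUSY exclusion", collaboration="ATLAS", cmenergies=13000)
print(query)
```

The resulting string can then be passed to `--search` as a single quoted argument.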
Downloading Data
# List available tables first
python scripts/hepdata.py 1234567 --inspire --tables
# Download specific table as CSV
python scripts/hepdata.py 1234567 --inspire --download "Table 1" --format csv
# Download all tables as YAML
python scripts/hepdata.py 1234567 --inspire --download --format yaml
# Download to specific directory
python scripts/hepdata.py 1234567 --inspire --download "Table 1" -o ./data/
Download Formats
- csv - Comma-separated values (default)
- yaml - YAML format (HEPData native)
- json - JSON format
- root - ROOT file
- yoda - YODA format (for Rivet)
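When scripting downloads, it can help to validate the format name up front and predict where the file will land. The sketch below is illustrative only: `output_path` is a hypothetical helper, and the naming rule (spaces replaced by underscores, format name used as the extension) is an assumption inferred from the `Table_1.csv` example in this document, not a documented behaviour of `scripts/hepdata.py`.

```python
from pathlib import Path

# Format names as listed in this document.
FORMATS = ("csv", "yaml", "json", "root", "yoda")


def output_path(table_name, fmt="csv", out_dir="."):
    """Return the assumed output path for a downloaded table.

    ASSUMPTION: the file is named after the table with spaces replaced
    by underscores and the format name as extension.
    """
    if fmt not in FORMATS:
        raise ValueError(f"unknown format {fmt!r}; expected one of {FORMATS}")
    stem = table_name.strip().replace(" ", "_")
    return Path(out_dir) / f"{stem}.{fmt}"


print(output_path("Table 1", "yaml", "data"))
```

Rejecting an unknown format before calling the script avoids a round trip for what is usually a typo.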
Identifier Types
HEPData accepts three types of identifiers:
| Flag | Type | Example |
|---|---|---|
| (none) | HEPData ID | ins1234567 |
| --inspire | INSPIRE recid | 1234567 |
| --arxiv | arXiv ID | 1907.12345 |
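As the examples show, a HEPData ID is the INSPIRE recid with an `ins` prefix. A minimal sketch of that mapping follows. `to_hepdata_id` is a hypothetical helper, not a function of the script. It deliberately rejects arXiv IDs, because those cannot be converted by string manipulation and need a lookup.

```python
def to_hepdata_id(value, kind="hepdata"):
    """Normalize an identifier to the ins<recid> form.

    kind is "hepdata" (already prefixed) or "inspire" (bare recid).
    Hypothetical helper for illustration only.
    """
    value = str(value).strip()
    if kind == "hepdata":
        if not (value.startswith("ins") and value[3:].isdigit()):
            raise ValueError(f"not a HEPData ID: {value!r}")
        return value
    if kind == "inspire":
        if not value.isdigit():
            raise ValueError(f"not an INSPIRE recid: {value!r}")
        return f"ins{value}"
    # arXiv IDs need a lookup, which this sketch does not attempt.
    raise ValueError(f"unsupported identifier kind: {kind!r}")


print(to_hepdata_id(1234567, "inspire"))
```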
Typical Workflow
- Find paper on INSPIRE: get the INSPIRE ID
- Check for HEPData: python scripts/hepdata.py <inspire_id> --inspire
- List tables: python scripts/hepdata.py <id> --inspire --tables
- Download data: python scripts/hepdata.py <id> --inspire --download "Table 1" -f csv
- Use in analysis
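The command-line steps of this workflow can also be driven from Python. The sketch below builds the three commands as argument lists, using only the flags shown in this document, and offers a separate function to run them. `workflow_commands` and `run_workflow` are hypothetical names, and the script path is assumed to be relative to the current working directory.

```python
import subprocess
import sys

SCRIPT = "scripts/hepdata.py"  # path as used throughout this document


def workflow_commands(inspire_id, table="Table 1", fmt="csv"):
    """Return the check, list, and download commands as argv lists."""
    base = [sys.executable, SCRIPT, str(inspire_id), "--inspire"]
    return [
        base,                                     # check the record exists
        base + ["--tables"],                      # list its tables
        base + ["--download", table, "-f", fmt],  # download one table
    ]


def run_workflow(inspire_id, **kwargs):
    """Run each step in order, stopping at the first failure."""
    for cmd in workflow_commands(inspire_id, **kwargs):
        subprocess.run(cmd, check=True)


for cmd in workflow_commands(1234567):
    print(" ".join(cmd))
```

Passing argument lists rather than a shell string means table names with spaces, such as "Table 1", need no extra quoting.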
Common Use Cases
Get Exclusion Limits
# Find ATLAS SUSY limits
python scripts/hepdata.py --search "collaboration:ATLAS SUSY exclusion"
# Download the limit data
python scripts/hepdata.py 1234567 --inspire --download "Exclusion contour" -f csv
Get Cross-Section Measurements
# Find Higgs measurements
python scripts/hepdata.py --search "Higgs cross section 13 TeV"
# Download measurement table
python scripts/hepdata.py 1234567 --inspire --download "Cross section" -f csv
Overlay Theory on Data
Download data points to compare with your theoretical predictions:
import pandas as pd
# HEPData CSV exports typically begin with "#"-prefixed metadata lines; skip them
data = pd.read_csv("Table_1.csv", comment="#")
# Plot data vs your theory
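Beyond plotting, a common first check is a chi-square between the data points and a prediction. The sketch below uses only the standard library and is a simplified illustration: the column names `x`, `y`, and `err` are placeholders (real HEPData column names vary per table), the uncertainties are assumed symmetric and uncorrelated, and the sample numbers are made up rather than taken from any record.

```python
import csv


def read_hepdata_csv(text):
    """Parse CSV text, skipping '#' comment lines and blank lines."""
    lines = [
        line for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    return list(csv.DictReader(lines))


def chi_square(rows, theory, x_col, y_col, err_col):
    """Sum of squared pulls, assuming symmetric uncorrelated errors."""
    total = 0.0
    for row in rows:
        x, y, err = float(row[x_col]), float(row[y_col]), float(row[err_col])
        total += ((y - theory(x)) / err) ** 2
    return total


# Made-up numbers for illustration only; not from any HEPData record.
SAMPLE = """#: illustrative metadata line
x,y,err
1,10,2
2,20,4
"""

rows = read_hepdata_csv(SAMPLE)
print(chi_square(rows, lambda x: 8 + 4 * x, "x", "y", "err"))  # prints 2.0
```

Tables that report separate statistical and systematic uncertainties, or asymmetric ones, need a more careful treatment than this single-column sum.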
Integration with INSPIRE Skill
Use together with the inspire skill:
- Search on INSPIRE for papers
- Get the INSPIRE recid
- Fetch data from HEPData using that ID
API Notes
- No authentication required
- Data is CC0 licensed
- Some older records may have limited format support
