Hi-C Compartments Calling
by BIsnake2001
This skill performs PCA-based A/B compartments calling on Hi-C .mcool datasets using pre-defined MCP tools from the cooler-tools, cooltools-tools, and plot-hic-tools servers.
name: hic-compartments-calling
description: This skill performs PCA-based A/B compartments calling on Hi-C .mcool datasets using pre-defined MCP tools from the cooler-tools, cooltools-tools, and plot-hic-tools servers.
Hi-C Compartments Calling (MCP-based)
Overview
This skill provides an automated workflow for compartments calling on .mcool, .cool or .hic Hi-C data.
Main steps include:
- Refer to the Inputs & Outputs section to verify required files and output structure.
- Always prompt the user for the genome assembly used.
- Always prompt the user for the resolution at which to call compartments. A resolution of roughly 50-250 kb is recommended; the default is 100 kb.
- Locate the genome FASTA file among the HOMER genome FASTA files, based on the user-provided assembly.
- Rename chromosomes in the .mcool or .cool file so that their names carry the "chr" prefix.
- Generate the chromosome-arm view file for compartment calling after the chromosome names have been harmonized.
- Perform PCA-based compartment analysis and extract the first principal component (PC1).
- Generate compartment interaction saddle plots and BigWig outputs for visualization.
When to Use This Skill
Use this skill when:
- You want to identify A/B compartments from Hi-C .mcool or .cool files.
- You need PC1 compartment scores and bigWig tracks for genome browser visualization.
- You want a reproducible, normalized, automated compartment-calling workflow.
Inputs & Outputs
Inputs
- File format: Hi-C data in .mcool, .cool, or .hic format.
- Genome assembly: Prompt the user for genome assembly used.
- Resolution: Prompt the user for resolution used to call compartments. The default resolution is 100 kb.
Outputs
${sample}_Compartments_calling/
    compartments/
        eigs.${resolution}.cis.vecs.tsv           # PC1 compartment scores
        eigs.${resolution}.bw                     # PC1 track for genome browser
        eigs.${resolution}.cis.lam.txt
        saddle.cis.${resolution}.digitized.tsv
        saddle.cis.${resolution}.saddledump.npz
    plots/
        saddle.cis.${resolution}.pdf              # Saddle plot visualization
    temp/
        expected.${resolution}.cis.tsv
        view_${genome}.tsv                        # Chromosome-arm view definition
        bins.${res}.tsv
        gc.${res}.tsv
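The layout above can be written down as a small path helper, which is handy for checking that a finished run produced every file. This is only a sketch: the function name `expected_outputs` and the dictionary keys are hypothetical, and it assumes that `${res}` in the temp filenames stands for the same value as `${resolution}`.

```python
from pathlib import PurePosixPath


def expected_outputs(sample, resolution, genome):
    """Map each documented output to its path under the project directory."""
    root = PurePosixPath(f"{sample}_Compartments_calling")
    comp, plots, temp = root / "compartments", root / "plots", root / "temp"
    return {
        "pc1_scores": comp / f"eigs.{resolution}.cis.vecs.tsv",
        "pc1_bigwig": comp / f"eigs.{resolution}.bw",
        "eigenvalues": comp / f"eigs.{resolution}.cis.lam.txt",
        "saddle_digitized": comp / f"saddle.cis.{resolution}.digitized.tsv",
        "saddle_dump": comp / f"saddle.cis.{resolution}.saddledump.npz",
        "saddle_pdf": plots / f"saddle.cis.{resolution}.pdf",
        "expected_cis": temp / f"expected.{resolution}.cis.tsv",
        "view": temp / f"view_{genome}.tsv",
        # Assumption: ${res} is the same value as ${resolution}.
        "bins": temp / f"bins.{resolution}.tsv",
        "gc": temp / f"gc.{resolution}.tsv",
    }


paths = expected_outputs("sampleA", 100000, "hg38")
print(paths["pc1_scores"])
# sampleA_Compartments_calling/compartments/eigs.100000.cis.vecs.tsv
```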
Allowed Tools
When using this skill, you should restrict yourself to the following MCP tools from the servers cooler-tools, cooltools-tools, plot-hic-tools, project-init-tools, genome-locate-tools, and HiCExplorer-tools:
- mcp__project-init-tools__project_init
- mcp__genome-locate-tools__genome_locate_fasta
- mcp__HiCExplorer-tools__hic_to_mcool
- mcp__cooler-tools__list_mcool_resolutions
- mcp__cooler-tools__harmonize_chrom_names
- mcp__cooler-tools__make_view_chromarms
- mcp__cooler-tools__dump_bins_for_gc
- mcp__cooltools-tools__run_genome_gc
- mcp__cooltools-tools__run_expected_cis
- mcp__cooltools-tools__run_eigs_cis
- mcp__cooltools-tools__run_saddle
- mcp__plot-hic-tools__plot_saddle_pdf
Do NOT fall back to:
- raw shell commands (cooler dump, cooltools eigs-cis, cooltools saddle, etc.)
- ad-hoc Python snippets (e.g. importing cooler, bioframe, matplotlib manually in the reply).
Decision Tree
Step 0 — Gather Required Information from the User
Before calling any tool, ask the user:
- Sample name (sample): used as prefix and for the output directory ${sample}_Compartments_calling.
- Genome assembly (genome): e.g. hg38, mm10, danRer11.
  - Never guess or auto-detect.
- Hi-C matrix path/URI (mcool_uri): e.g. .mcool file path or .hic file path.
  - path/to/sample.mcool::/resolutions/100000 (.mcool file with resolution specified)
  - or .cool file path
  - or .hic file path
- Resolution (resolution): default 100000 (100 kb).
  - If the user does not specify, use 100000 as default.
  - Must be the same as the resolution used for ${mcool_uri}
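The relation between the matrix path and the resolution listed above can be illustrated with a short helper that builds the cooler URI. It is a minimal sketch for orientation: `build_cooler_uri` and `DEFAULT_RESOLUTION` are hypothetical names, not part of any MCP tool.

```python
DEFAULT_RESOLUTION = 100_000  # 100 kb, the default stated above


def build_cooler_uri(path, resolution=DEFAULT_RESOLUTION):
    """Return a cooler URI for one resolution of a .mcool file.

    Single-resolution .cool files are addressed by their plain path.
    """
    if not isinstance(resolution, int) or resolution <= 0:
        raise ValueError("resolution must be a positive integer (in bp)")
    if path.endswith(".mcool"):
        return f"{path}::/resolutions/{resolution}"
    if path.endswith(".cool"):
        return path
    raise ValueError("expected a .mcool or .cool path; convert .hic files first")


print(build_cooler_uri("path/to/sample.mcool"))
# path/to/sample.mcool::/resolutions/100000
```

Because the resolution is embedded in the URI, passing the same integer as both `resolution` and the URI suffix keeps the two in agreement, which every later step requires.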
Step 1 — Initialize Project & Locate Genome FASTA
- Make a directory for this project:
Call:
mcp__project-init-tools__project_init
with:
- sample: the user-provided sample name
- task: Compartments_calling
The tool will:
- Create the ${sample}_Compartments_calling directory.
- Return the full path of the ${sample}_Compartments_calling directory, which will be used as ${proj_dir}.
- If the user provides a .hic file, convert it to a .mcool file using the mcp__HiCExplorer-tools__hic_to_mcool tool:
Call:
mcp__HiCExplorer-tools__hic_to_mcool
with:
- input_hic: the user-provided path (e.g. input.hic)
- sample: the user-provided sample name
- proj_dir: directory to save the converted .mcool file. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init.
The tool will:
- Convert the .hic file to a .mcool file.
- Return the path of the .mcool file.
If the conversion is successful, update ${mcool_uri} to the path of the .mcool file.
- Locate the genome FASTA file:
Call:
mcp__genome-locate-tools__genome_locate_fasta
with:
genome: the user-provided genome assembly
The tool will:
- Locate genome FASTA.
- Verify the FASTA exists.
Step 2: List Available Resolutions in the .mcool file & Modify the Chromosome Names if Necessary
- Check the resolutions in mcool_uri:
Call:
mcp__cooler-tools__list_mcool_resolutions
with:
- mcool_path: the user-provided path (e.g. input.mcool), without a resolution specified.
The tool will:
- List all resolutions in the .mcool file.
- Return the resolutions as a list.
If the user-defined or default ${resolution} is not in the list, ask the user to specify the resolution again.
Otherwise, use ${resolution} for the following steps.
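The membership check described above is simple enough to sketch. The function name `check_resolution` is hypothetical, the example list is made up, and returning the closest available resolution is an illustrative extra: the skill itself only requires asking the user again.

```python
from typing import List, Optional, Tuple


def check_resolution(requested: int, available: List[int]) -> Tuple[bool, Optional[int]]:
    """Return (found, closest).

    closest is the requested resolution when it is available, otherwise the
    nearest available one, which can be offered when re-prompting the user.
    """
    if requested in available:
        return True, requested
    if not available:
        return False, None
    closest = min(available, key=lambda r: abs(r - requested))
    return False, closest


available = [10000, 50000, 250000, 1000000]  # made-up example list
found, suggestion = check_resolution(100000, available)
print(found, suggestion)
# False 50000
```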
- Check whether the chromosome names in the .mcool file start with "chr"; if not, modify them so that they do:
Call:
mcp__cooler-tools__harmonize_chrom_names
with:
- sample: the user-provided sample name
- proj_dir: the project directory in which outputs are saved. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution} must be the same as the resolution used for ${mcool_uri} and must be an integer
The tool will:
- Check whether the chromosome names in the .mcool file start with "chr".
- If not, harmonize the chromosome names in the .mcool file.
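Conceptually, harmonizing names amounts to building an old-name to new-name mapping. The sketch below shows only that mapping; `harmonize_names` is a hypothetical helper, and how the MCP tool applies the renaming to the file is not specified in this document.

```python
def harmonize_names(chroms):
    """Build an old-name -> new-name mapping that adds a "chr" prefix
    to every chromosome name lacking one."""
    return {c: c if c.startswith("chr") else f"chr{c}" for c in chroms}


mapping = harmonize_names(["1", "2", "X", "chrY"])
print(mapping)
# {'1': 'chr1', '2': 'chr2', 'X': 'chrX', 'chrY': 'chrY'}
```

A plain prefix turns an Ensembl-style MT into chrMT rather than the UCSC-style chrM; special-casing of that kind is left out of the sketch.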
Step 3 — Create Chromosome-Arm View File
Use bioframe to define chromosome arms based on centromeres:
Call:
mcp__cooler-tools__make_view_chromarms
with:
- proj_dir: directory to save the view file. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution} must be the same as the resolution used for ${mcool_uri} and must be an integer
- genome: genome assembly
The tool will:
- Fetch chromsizes and centromeres via bioframe.
- Generate chromosomal arms and filter them to those present in the cooler.
- Return the path of the view file under the ${proj_dir}/temp/ directory.
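To make the arm-splitting step concrete, here is a dependency-free sketch of what a chromosome-arm view contains: each chromosome is cut at its centromere midpoint into a p arm and a q arm. The helper `make_arms`, the `_p`/`_q` naming, and the coordinates are illustrative assumptions; the actual tool obtains real chromsizes and centromeres via bioframe.

```python
def make_arms(chromsizes, centromere_mids):
    """Split each chromosome at its centromere midpoint into p and q arms.

    Returns (chrom, start, end, name) rows; chromosomes without a usable
    centromere entry are kept whole.
    """
    rows = []
    for chrom, size in chromsizes.items():
        mid = centromere_mids.get(chrom)
        if mid is None or not 0 < mid < size:
            rows.append((chrom, 0, size, chrom))
            continue
        rows.append((chrom, 0, mid, f"{chrom}_p"))
        rows.append((chrom, mid, size, f"{chrom}_q"))
    return rows


# Toy coordinates for illustration only, not real assembly values.
chromsizes = {"chr1": 1000, "chr2": 800, "chrM": 16}
centromere_mids = {"chr1": 400, "chr2": 300}

for row in make_arms(chromsizes, centromere_mids):
    print("\t".join(map(str, row)))
```

Calling compartments per arm rather than per chromosome keeps the centromere from dominating the first eigenvector, which is why the view is built before the expected-cis and eigs-cis steps.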
Step 4 — Compute GC Track for Bins
- Dump bins for GC track:
Call:
mcp__cooler-tools__dump_bins_for_gc
with:
- sample: the user-provided sample name
- proj_dir: directory to save the GC track file. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution} must be the same as the resolution used for ${mcool_uri} and must be an integer
The tool will:
- Dump bins at the specified resolution from the cooler.
- Return the path of the bins file under the ${proj_dir}/temp/ directory.
- Compute GC track:
Call:
mcp__cooltools-tools__run_genome_gc
with:
- sample: the user-provided sample name
- proj_dir: directory to save the GC track file. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution} must be the same as the resolution used for ${mcool_uri} and must be an integer
- genome: genome assembly
The tool will:
- Compute GC content for each bin.
- Return the path of the GC track file under the ${proj_dir}/temp/ directory.
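What "GC content for each bin" means can be shown on a toy sequence. The sketch below is not the tool's implementation: `gc_fraction` and `gc_track` are hypothetical helpers, and excluding ambiguous bases (N) from the denominator is an assumption of this example.

```python
def gc_fraction(seq):
    """Fraction of G/C among unambiguous bases; None if the bin has none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if gc + at else None


def gc_track(sequence, binsize):
    """Return (start, end, GC) for consecutive fixed-size bins."""
    track = []
    for start in range(0, len(sequence), binsize):
        end = min(start + binsize, len(sequence))
        track.append((start, end, gc_fraction(sequence[start:end])))
    return track


toy = "GGCCAATTNNNNGCGC"  # toy sequence, not real genome data
print(gc_track(toy, 4))
# [(0, 4, 1.0), (4, 8, 0.0), (8, 12, None), (12, 16, 1.0)]
```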
Step 5 — Run Expected-cis and Eigs-cis (PCA Compartment Calling)
- Calculate expected cis:
Call:
mcp__cooltools-tools__run_expected_cis
with:
- sample: the user-provided sample name
- proj_dir: directory to save the expected-cis and eigs-cis files. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution} must be the same as the resolution used for ${mcool_uri} and must be an integer
- view_path: the path to the view file (e.g. ${proj_dir}/temp/view_${genome}.tsv)
- clr_weight_name: the name of the weight column (default: weight)
- ignore_diags: the number of diagonals to ignore based on resolution
The tool will:
- Generate expected cis file.
- Return the path of the expected cis file under the ${proj_dir}/temp/ directory.
- Calculate eigs cis:
Call:
mcp__cooltools-tools__run_eigs_cis
with:
- sample: the user-provided sample name
- proj_dir: directory to save the expected-cis and eigs-cis files. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution} must be the same as the resolution used for ${mcool_uri} and must be an integer
- view_path: the view TSV from Step 3 (e.g. view_${genome}.tsv)
- gc_tsv: GC track TSV from Step 4
- clr_weight_name: balancing column name (default "weight", but can be set based on clr.bins().columns if the user tells you the correct name)
- n_eigs: the number of principal components to compute (default 1)
- make_bigwig: whether to make a bigWig file for the PC1 track (default True)
This tool will:
- Run cooltools expected-cis to compute expected contact frequencies.
- Run cooltools eigs-cis to perform PCA and extract PC1.
- Return the path of the eigs-cis vecs file under the ${proj_dir}/compartments/ directory.
- Return the path of the bigWig file under the ${proj_dir}/compartments/ directory.
If the user reports an error about balancing weights:
- Ask the user which weight column should be used.
- Re-run mcp__cooltools-tools__run_expected_cis and mcp__cooltools-tools__run_eigs_cis with the correct clr_weight_name.
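The sign of a principal component is arbitrary, which is why a GC track is passed to run_eigs_cis: a common convention is to orient PC1 so that it correlates positively with GC content, making positive bins read as A and negative bins as B. The sketch below illustrates that orientation step on toy numbers. The helpers `pearson` and `orient_by_gc` are hypothetical, and how the MCP tool performs the orientation internally is not shown in this document.

```python
import math


def pearson(x, y):
    """Pearson correlation over positions where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    cov = sum((a - mx) * (b - my) for a, b in pairs)
    sx = math.sqrt(sum((a - mx) ** 2 for a, _ in pairs))
    sy = math.sqrt(sum((b - my) ** 2 for _, b in pairs))
    return cov / (sx * sy)


def orient_by_gc(pc1, gc):
    """Flip PC1 if it anticorrelates with GC, so positive bins read as A."""
    if pearson(pc1, gc) < 0:
        return [None if v is None else -v for v in pc1]
    return list(pc1)


# Toy values for illustration only.
pc1 = [0.8, 0.5, None, -0.6, -0.9]
gc = [0.35, 0.38, 0.40, 0.52, 0.55]
oriented = orient_by_gc(pc1, gc)
labels = ["NA" if v is None else ("A" if v > 0 else "B") for v in oriented]
print(labels)
# ['B', 'B', 'NA', 'A', 'A']
```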
Step 6 — Run Saddle Analysis
Call:
mcp__cooltools-tools__run_saddle
with:
- sample: the user-provided sample name
- proj_dir: directory to save the saddle file. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution} must be the same as the resolution used for ${mcool_uri} and must be an integer
- view_path: the view TSV from Step 3 (e.g. view_${genome}.tsv)
- eigs_vecs_tsv: the eigs-cis vecs TSV from Step 5 (e.g. compartments/eigs.${resolution}.cis.vecs.tsv)
- expected_cis_tsv: the expected-cis TSV from Step 5 (e.g. temp/expected.${resolution}.cis.tsv)
- clr_weight_name: balancing column name (default "weight", but can be set based on clr.bins().columns if the user tells you the correct name)
- qrange_low and qrange_high: default 0.02 and 0.98
The tool will:
- Run cooltools saddle.
- Generate the saddle dump and related outputs, typically saddle.cis.${resolution}.saddledump.npz and saddle.cis.${resolution}.digitized.tsv.
- Return the path of the saddle dump file under the ${proj_dir}/compartments/ directory.
- Return the paths of the other related outputs under the ${proj_dir}/compartments/ directory.
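The idea behind a saddle analysis can be sketched without any dependencies: bins are grouped by their PC1 quantile, and observed/expected contact values are averaged for every pair of groups. The version below is deliberately simplified and is not cooltools' implementation: `quantile_groups` and `saddle` are hypothetical helpers, grouping is done by rank quantile, bins outside the quantile range are simply dropped, and all numbers are toy values.

```python
def quantile_groups(values, n_groups, q_lo=0.02, q_hi=0.98):
    """Assign each value to one of n_groups groups by its rank quantile.

    Values whose rank quantile falls outside [q_lo, q_hi], and missing
    values, are assigned None.
    """
    order = sorted((i for i, v in enumerate(values) if v is not None),
                   key=lambda i: values[i])
    groups = [None] * len(values)
    n = len(order)
    for rank, i in enumerate(order):
        q = (rank + 0.5) / n
        if q_lo <= q <= q_hi:
            g = int((q - q_lo) / (q_hi - q_lo) * n_groups)
            groups[i] = min(g, n_groups - 1)
    return groups


def saddle(obs_over_exp, groups, n_groups):
    """Average observed/expected values for each pair of PC1 groups."""
    sums = [[0.0] * n_groups for _ in range(n_groups)]
    counts = [[0] * n_groups for _ in range(n_groups)]
    for i, gi in enumerate(groups):
        for j, gj in enumerate(groups):
            if gi is None or gj is None or i == j:
                continue  # skip trimmed bins and the main diagonal
            sums[gi][gj] += obs_over_exp[i][j]
            counts[gi][gj] += 1
    return [[sums[a][b] / counts[a][b] if counts[a][b] else None
             for b in range(n_groups)] for a in range(n_groups)]


# Toy data: two B-like bins followed by two A-like bins.
pc1 = [-1.0, -0.5, 0.5, 1.0]
oe = [[1.0, 2.0, 0.5, 0.5],
      [2.0, 1.0, 0.5, 0.5],
      [0.5, 0.5, 1.0, 2.0],
      [0.5, 0.5, 2.0, 1.0]]
groups = quantile_groups(pc1, n_groups=2)
print(groups)                 # [0, 0, 1, 1]
print(saddle(oe, groups, 2))  # [[2.0, 0.5], [0.5, 2.0]]
```

In the toy result, the diagonal entries (B-B and A-A) are enriched and the off-diagonal entries (B-A) are depleted, which is the pattern a saddle plot is meant to reveal.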
Step 7 — Plot Saddle as PDF
Call:
mcp__plot-hic-tools__plot_saddle_pdf
with:
- sample: the user-provided sample name
- proj_dir: directory to save the saddle file. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- resolution: ${resolution} must be the same as the resolution used for ${mcool_uri} and must be an integer
- chr_name: the user-provided chromosome name, e.g. chr1
This tool will:
- Load the corresponding .saddledump.npz file.
- Plot the saddle matrix with LogNorm(1e-1, 1e1) and the RdBu_r colormap.
- Return the path of the compartment scores distribution PDF file under the ${proj_dir}/plots/ directory.
- Return the path of the saddle plot PDF file under the ${proj_dir}/plots/ directory.
- Return the path of the PC1 track PDF file under the ${proj_dir}/plots/ directory.
If the saddledump file is missing, inform the user to run run_saddle first.
Best Practices
- Always confirm the genome and resolution explicitly with the user.
- Always use the defined MCP tools instead of ad-hoc code.
- If the user asks “how to run this manually”, you may conceptually describe the steps but still prefer to recommend using the MCP pipeline for reproducibility.
- If multiple resolutions are required, re-run the MCP tools with different resolution values and keep outputs in the same ${proj_dir} directory, using resolution in filenames for disambiguation.
