Bio Reporting Automated QC Reports
by GPTomics
Generates standardized quality control reports by aggregating metrics from FastQC, alignment, and other tools using MultiQC. Use when summarizing QC metrics across samples, creating shareable quality reports, or building automated QC pipelines.
name: bio-reporting-automated-qc-reports
description: Generates standardized quality control reports by aggregating metrics from FastQC, alignment, and other tools using MultiQC. Use when summarizing QC metrics across samples, creating shareable quality reports, or building automated QC pipelines.
tool_type: cli
primary_tool: multiqc
Automated QC Reports with MultiQC
Basic Usage
# Aggregate all QC outputs in directory
multiqc results/ -o qc_report/
# Specify output name
multiqc results/ -n my_project_qc
# Include specific tools only
multiqc results/ --module fastqc --module star
Supported Tools
MultiQC recognizes outputs from more than 100 bioinformatics tools, including:
| Category | Tools |
|---|---|
| Read QC | FastQC, fastp, Cutadapt |
| Alignment | STAR, HISAT2, BWA, Bowtie2 |
| Quantification | featureCounts, Salmon, kallisto |
| Variant Calling | bcftools, GATK |
| Single-cell | CellRanger, STARsolo |
Configuration
Create multiqc_config.yaml:
title: "RNA-seq QC Report"
subtitle: "Project XYZ"
intro_text: "QC metrics for all samples"
# Custom sample name cleaning
extra_fn_clean_exts:
  - '.sorted'
  - '.dedup'
# Order in which module sections appear in the report
module_order:
  - fastqc
  - star
  - featurecounts
# Highlight samples by value in the pct_mapped column
table_cond_formatting_rules:
  pct_mapped:
    fail: [{lt: 50}]
    warn: [{lt: 70}]
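The `fail` and `warn` rules in this config can be read as threshold checks on a column value. The sketch below mimics that logic in plain Python for illustration only; it is not MultiQC's own implementation. It assumes rule groups are evaluated in the order pass, warn, fail with the last match winning, and it covers only a subset of the comparison keywords. The names `classify`, `RULES`, and `RULE_ORDER` are hypothetical.

```python
import operator

# Comparison keywords mapped to Python operators.
# Only a subset is sketched here.
COMPARATORS = {
    "lt": operator.lt,
    "gt": operator.gt,
    "eq": operator.eq,
}

# Mirrors the pct_mapped rules from multiqc_config.yaml.
RULES = {
    "fail": [{"lt": 50}],
    "warn": [{"lt": 70}],
}

# Assumed evaluation order: later groups override earlier ones.
RULE_ORDER = ("pass", "warn", "fail")


def classify(value, rules):
    """Return the name of the last matching rule group, or None."""
    status = None
    for name in RULE_ORDER:
        for condition in rules.get(name, []):
            # A condition matches if any of its comparisons holds.
            if any(
                COMPARATORS[key](value, threshold)
                for key, threshold in condition.items()
            ):
                status = name
    return status


if __name__ == "__main__":
    # Hypothetical mapping percentages for illustration.
    for pct in (45.0, 62.5, 91.3):
        print(pct, classify(pct, RULES))
```

Under these assumptions, a sample below 50% matches both rules and ends up as `fail`, which is why the two thresholds can overlap without conflict.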
Custom Data
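# Add custom data as a MultiQC custom-content file
# File format: sample\tmetric1\tmetric2
# Name the file with an _mqc suffix (e.g. custom_metrics_mqc.tsv) and place it
# under results/ so MultiQC picks it up during the normal scan
multiqc results/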
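A custom metrics table in the `sample\tmetric1\tmetric2` layout can be produced with a few lines of Python. This is a minimal sketch: the `write_custom_metrics` helper and the example values are hypothetical, and the `_mqc.tsv` filename suffix follows the custom-content naming convention as I understand it from the MultiQC documentation, so confirm it against your installed version.

```python
import csv
from pathlib import Path


def write_custom_metrics(metrics, out_path):
    """Write {sample: {metric: value}} as a tab-separated table.

    The first column holds sample names; remaining columns are metrics.
    """
    out_path = Path(out_path)
    # Collect metric names in first-seen order so columns are stable.
    columns = []
    for values in metrics.values():
        for name in values:
            if name not in columns:
                columns.append(name)

    with out_path.open("w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["sample", *columns])
        for sample, values in metrics.items():
            # Missing metrics are written as empty cells.
            writer.writerow([sample, *(values.get(name, "") for name in columns)])
    return out_path


if __name__ == "__main__":
    # Hypothetical metrics for illustration only.
    example = {
        "sample_A": {"metric1": 0.91, "metric2": 1200},
        "sample_B": {"metric1": 0.87, "metric2": 980},
    }
    write_custom_metrics(example, "custom_metrics_mqc.tsv")
```

Writing the file into the directory that MultiQC scans keeps the custom table alongside the tool outputs it summarizes.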
Python API
from multiqc import run as multiqc_run
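# Run programmatically
# Note: the keyword names accepted by run() may differ between MultiQC releases;
# check the signature for your installed version
multiqc_run(analysis_dir='results/', outdir='qc_report/')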
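When a pipeline needs to call MultiQC from Python without depending on the library's internal function signature, one option is to invoke the command-line tool as a subprocess. The sketch below uses only the flags shown under Basic Usage (`-o`, `-n`, `--module`); the helper names `build_multiqc_command` and `run_multiqc` are hypothetical.

```python
import shutil
import subprocess


def build_multiqc_command(analysis_dir, outdir=None, filename=None, modules=()):
    """Assemble a multiqc command line from the Basic Usage options."""
    cmd = ["multiqc", str(analysis_dir)]
    if outdir:
        cmd += ["-o", str(outdir)]
    if filename:
        cmd += ["-n", str(filename)]
    # --module may be repeated, once per tool to include.
    for module in modules:
        cmd += ["--module", module]
    return cmd


def run_multiqc(analysis_dir, **options):
    """Run multiqc as a subprocess; raises CalledProcessError on failure."""
    if shutil.which("multiqc") is None:
        raise FileNotFoundError("multiqc executable not found on PATH")
    cmd = build_multiqc_command(analysis_dir, **options)
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so this works without MultiQC.
    print(build_multiqc_command("results/", outdir="qc_report/", modules=["fastqc", "star"]))
```

Separating command construction from execution keeps the argument handling testable on machines where MultiQC is not installed.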
Related Skills
- read-qc/quality-reports - Generate input FastQC reports
- read-qc/fastp-workflow - Preprocessing QC
- workflows/rnaseq-to-de - Full workflow with QC
