Generate_Candidate_Summary_Skill
by johnsonice
Generate a markdown summary report from candidate_profile.csv with statistics and insights
name: generate_candidate_summary_skill
description: Generate a markdown summary report from candidate_profile.csv with statistics and insights
Generate Candidate Summary Report
This skill generates a comprehensive markdown summary report analyzing candidate profile data with statistics on gender distribution, URR representation, and nationality diversity.
What it does:
- Reads candidate profile CSV data
- Calculates comprehensive statistics (gender, URR, nationality)
- Generates formatted markdown report with tables and insights
- Identifies URR countries represented in the candidate pool
Usage:
Basic Usage
Run the summary generation script with default settings:
python .claude/skills/generate_candidate_summary_skill/generate_summary.py
This uses default paths:
- Input: /data/home/xiong/dev/Fund_Process_Automation/candidate_profile.csv
- Output: /data/home/xiong/dev/Fund_Process_Automation/summary.md
With Custom Paths
Specify custom input and output files:
python .claude/skills/generate_candidate_summary_skill/generate_summary.py \
--csv_file /path/to/candidate_profile.csv \
--output_file /path/to/summary.md
Command-line Arguments:
- --csv_file: Path to input CSV file (default: candidate_profile.csv in project root)
- --output_file: Path to output markdown file (default: summary.md in project root)
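The two options above could be wired up with `argparse` roughly as follows. This is a minimal sketch, not the script's actual code: the helper name `parse_args` and the constant names are hypothetical, and resolving the default file names against the project root is left out.

```python
import argparse

# Default file names taken from the argument descriptions above.
# How the real script resolves them against the project root is not shown here.
DEFAULT_CSV = "candidate_profile.csv"
DEFAULT_OUTPUT = "summary.md"


def parse_args(argv=None):
    """Parse the two documented options (hypothetical helper name)."""
    parser = argparse.ArgumentParser(
        description="Generate a markdown summary report from a candidate CSV."
    )
    parser.add_argument("--csv_file", default=DEFAULT_CSV,
                        help="Path to input CSV file")
    parser.add_argument("--output_file", default=DEFAULT_OUTPUT,
                        help="Path to output markdown file")
    return parser.parse_args(argv)


# Only --csv_file is overridden, so --output_file falls back to its default.
args = parse_args(["--csv_file", "/path/to/candidate_profile.csv"])
print(args.csv_file, args.output_file)
```

Passing `argv` explicitly keeps the parser easy to exercise without touching `sys.argv`.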
Input Requirements:
Expected Input File:
- Path: /data/home/xiong/dev/Fund_Process_Automation/candidate_profile.csv
- Format: CSV file with the following columns:
  - Name: Candidate's full name
  - Gender: Male/Female/Unknown
  - Country of Nationality: Country name
  - URR: "yes" or "no"
Note: This file is typically generated by the process_resume_skill.
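To make the expected layout concrete, here is a small sketch of a conforming file and how it might be read. The rows are invented placeholders, not real candidate data, and the sketch uses the standard library `csv` module rather than pandas so that it runs without extra dependencies.

```python
import csv
import io

# Made-up rows for illustration only; names and countries are placeholders.
SAMPLE_CSV = """Name,Gender,Country of Nationality,URR
Candidate A,Female,Country X,yes
Candidate B,Male,Country Y,no
Candidate C,Unknown,Country X,yes
"""

# DictReader uses the header row as keys, so each row is addressed by column name.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in rows:
    print(row["Name"], row["Gender"], row["Country of Nationality"], row["URR"])
```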
Output:
Generated File:
- Path: /data/home/xiong/dev/Fund_Process_Automation/summary.md
- Format: Markdown document
Report Contents:
- Overview Section
  - Total number of candidates analyzed
- Summary Statistics Tables
  - Gender distribution (Male/Female/Unknown) with counts and percentages
  - URR vs Non-URR distribution with percentages
  - Top 10 nationalities with counts and URR status
- Key Insights
  - Gender balance analysis
  - URR representation percentage
  - Geographic diversity metrics
  - Most common nationality
- URR Countries List
  - All URR countries represented in the pool
  - Candidate count per URR country
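The statistics behind these sections can be approximated in a few lines. The sketch below is an assumption about how such counts might be computed, not the script's implementation: it uses `collections.Counter` instead of pandas, the helper name `distribution` is hypothetical, and the rows are invented.

```python
from collections import Counter

# Illustrative rows shaped like candidate_profile.csv; the values are invented.
rows = [
    {"Gender": "Female", "Country of Nationality": "Country X", "URR": "yes"},
    {"Gender": "Male", "Country of Nationality": "Country Y", "URR": "no"},
    {"Gender": "Male", "Country of Nationality": "Country X", "URR": "yes"},
    {"Gender": "Unknown", "Country of Nationality": "Country Z", "URR": "no"},
]


def distribution(values):
    """Return {value: (count, percentage)}, percentages rounded to one decimal."""
    counts = Counter(values)
    total = sum(counts.values())
    return {k: (n, round(100 * n / total, 1)) for k, n in counts.most_common()}


gender = distribution(r["Gender"] for r in rows)
urr = distribution(r["URR"].strip().lower() for r in rows)

# Top 10 nationalities by candidate count.
top_nationalities = Counter(
    r["Country of Nationality"] for r in rows
).most_common(10)

# Candidate count per country among rows flagged as URR.
urr_countries = Counter(
    r["Country of Nationality"] for r in rows if r["URR"].strip().lower() == "yes"
)

print(gender, urr, top_nationalities, dict(urr_countries))
```

Normalizing the URR column with `strip().lower()` is a defensive choice in this sketch; the source only states that the values are "yes" or "no".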
Example Output Structure:
# Candidate Profile Summary
## Overview
This analysis covers X candidate resumes...
## Summary Statistics
### Gender Distribution
| Gender | Count | Percentage |
|--------|-------|------------|
| Male | X | XX.X% |
| Female | X | XX.X% |
### Under-Represented Region (URR) Distribution
| URR Status | Count | Percentage |
|------------|-------|------------|
| URR (Yes) | X | XX.X% |
### Top Nationalities Represented
| Country | Count | URR Status |
|---------|-------|------------|
...
## Key Insights
1. Gender Balance: ...
2. URR Representation: ...
3. Geographic Diversity: ...
## URR Countries Identified
- Country: X candidate(s)
...
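The tables in the structure above follow the usual pipe-delimited markdown layout. One way to render them from computed counts is sketched below; `markdown_table` is a hypothetical helper and the counts are invented to show the layout only.

```python
def markdown_table(headers, rows):
    """Render a pipe-delimited markdown table from headers and row tuples."""
    lines = ["| " + " | ".join(headers) + " |"]
    # Separator cells are sized to the header text plus one space on each side.
    lines.append("|" + "|".join("-" * (len(h) + 2) for h in headers) + "|")
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


# Invented counts, used only to demonstrate the output format.
gender_rows = [("Male", 6, "60.0%"), ("Female", 4, "40.0%")]
table = markdown_table(["Gender", "Count", "Percentage"], gender_rows)
print(table)
```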
Dependencies:
- Python 3.x
- pandas library (pip install pandas)
Configuration:
Default file paths (can be overridden with command-line arguments):
- Input: /data/home/xiong/dev/Fund_Process_Automation/candidate_profile.csv
- Output: /data/home/xiong/dev/Fund_Process_Automation/summary.md
Error Handling:
The script includes comprehensive error handling:
- Validates input CSV file exists before processing
- Checks for required columns (Gender, URR, Country of Nationality)
- Ensures CSV is not empty
- Creates output directory if it doesn't exist
- Provides clear error messages via logging
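The checks listed above might look roughly like this. It is a sketch under assumptions: the function name `load_candidates`, the log messages, and the choice to return `None` on failure are all hypothetical, and the standard library `csv` module stands in for pandas.

```python
import csv
import logging
import os
import tempfile

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Required columns as listed in the error-handling notes above.
REQUIRED_COLUMNS = ("Gender", "URR", "Country of Nationality")


def load_candidates(csv_file, output_file):
    """Run the documented checks and return the rows, or None on failure."""
    if not os.path.isfile(csv_file):
        logger.error("Input CSV not found: %s", csv_file)
        return None
    with open(csv_file, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [c for c in REQUIRED_COLUMNS if c not in header]
        if missing:
            logger.error("Missing required columns: %s", ", ".join(missing))
            return None
        rows = list(reader)
    if not rows:
        logger.error("Input CSV is empty: %s", csv_file)
        return None
    # Create the output directory if it does not exist yet.
    out_dir = os.path.dirname(os.path.abspath(output_file))
    os.makedirs(out_dir, exist_ok=True)
    return rows


# A missing input file is reported through logging and yields None.
with tempfile.TemporaryDirectory() as tmp:
    result = load_candidates(os.path.join(tmp, "missing.csv"),
                             os.path.join(tmp, "out", "summary.md"))
    print(result)
```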
Console Output:
When successful, displays:
==================================================
SUMMARY REPORT GENERATED
==================================================
Output file: /path/to/summary.md
Total candidates: X
Male: X, Female: X, Unknown: X
URR: X, Non-URR: X
==================================================
Key Features:
- Flexible paths: Use command-line arguments to specify custom input/output locations
- Robust validation: Checks file existence, column presence, and data integrity
- Automatic directory creation: Creates output directories if they don't exist
- Comprehensive logging: Provides detailed information about processing steps
- Dynamic date: Report includes current generation date
- Error handling: Graceful failure with informative error messages
