Use Defaultdict Groups
by jimmc414
For grouping and auto-initialization: adjacency lists, grouping by key, multi-maps, nested structures without KeyError handling.
name: use-defaultdict-groups
description: "For grouping and auto-initialization: adjacency lists, grouping by key, multi-maps, nested structures without KeyError handling."
use-defaultdict-groups
When to Use
- Building adjacency lists for graphs
- Grouping items by some key
- Multi-maps (key -> list of values)
- Nested dict structures
- Avoiding `if key not in dict` checks
- Counting (though Counter is often better)
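As a rough illustration of the counting case, the sketch below tallies the same made-up word list with `defaultdict(int)` and with `collections.Counter`. The `words` data is an assumption for the example only.

```python
from collections import Counter, defaultdict

words = ["a", "b", "a", "c", "a", "b"]  # illustrative data, not from the skill

# defaultdict(int): a missing key starts at 0, so += 1 needs no existence check
counts = defaultdict(int)
for word in words:
    counts[word] += 1

# Counter: the same tallies, plus helpers such as most_common()
tally = Counter(words)

print(dict(counts))          # {'a': 3, 'b': 2, 'c': 1}
print(tally.most_common(1))  # [('a', 3)]
```

Both give the same totals. Counter tends to be the better fit when you also want ranking or arithmetic on the counts.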
When NOT to Use
- Simple key-value mapping (use regular dict)
- When a missing key should raise an error
- When the default value is complex or expensive to construct
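For the "missing key should raise" case, a minimal sketch: a regular dict raises KeyError on its own, and a defaultdict can be made to do the same after it is built by setting `default_factory` to `None`. The `config` and `groups` names are hypothetical.

```python
from collections import defaultdict

# A regular dict reports a missing key immediately.
config = {"host": "localhost"}  # hypothetical settings
try:
    config["port"]
    plain_raised = False
except KeyError:
    plain_raised = True

# A defaultdict can be "frozen" once it is populated:
# with default_factory set to None, missing keys raise KeyError again.
groups = defaultdict(list)
groups["a"].append(1)
groups.default_factory = None
try:
    groups["typo"]
    frozen_raised = False
except KeyError:
    frozen_raised = True

print(plain_raised, frozen_raised)  # True True
```

Freezing this way lets you keep the convenient building phase without silently creating keys during later lookups.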
The Pattern
Use defaultdict to auto-initialize missing keys.
from collections import defaultdict
# List default - for grouping
groups = defaultdict(list)
groups['a'].append(1) # No KeyError, creates empty list first
groups['a'].append(2)
# {'a': [1, 2]}
# Set default - for unique grouping
unique_groups = defaultdict(set)
unique_groups['a'].add(1)
unique_groups['a'].add(1) # Deduped
# {'a': {1}}
# Int default - for counting
counts = defaultdict(int)
counts['a'] += 1 # No KeyError, starts at 0
# {'a': 1}
# Nested defaultdict
tree = lambda: defaultdict(tree)
d = tree()
d['a']['b']['c'] = 1 # Auto-creates nested structure
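One practical follow-up to the nested pattern: once the structure is built, it can help to convert it to plain dicts so that later reads no longer create keys. The sketch below is one way to do that. `to_plain` is a hypothetical helper, not part of the skill or of pytudes.

```python
from collections import defaultdict


def tree():
    """Recursive factory: every missing key becomes another tree."""
    return defaultdict(tree)


def to_plain(node):
    """Recursively convert nested defaultdicts into regular dicts."""
    if isinstance(node, defaultdict):
        return {key: to_plain(value) for key, value in node.items()}
    return node


d = tree()
d["a"]["b"]["c"] = 1  # intermediate levels are created automatically
d["a"]["x"] = 2

plain = to_plain(d)
print(plain)  # {'a': {'b': {'c': 1}, 'x': 2}}
```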
Example (from pytudes)
from collections import defaultdict
# Adjacency list for graphs (AdventUtils.ipynb)
class multimap(defaultdict):
    """A mapping of {key: [val1, val2, ...]}."""
    def __init__(self, pairs=(), symmetric=False):
        self.default_factory = list
        for key, val in pairs:
            self[key].append(val)
            if symmetric:
                self[val].append(key)
# Usage
edges = [('a', 'b'), ('a', 'c'), ('b', 'd')]
graph = multimap(edges, symmetric=True)
# graph['a'] = ['b', 'c']
# graph['b'] = ['a', 'd']
# Grouping by attribute
students = [('Alice', 'Math'), ('Bob', 'CS'), ('Carol', 'Math')]
by_major = defaultdict(list)
for name, major in students:
    by_major[major].append(name)
# {'Math': ['Alice', 'Carol'], 'CS': ['Bob']}
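# Precomputing relationships (Sudoku.ipynb concept)
# all_units is assumed to be defined elsewhere: an iterable of units,
# where each unit is a collection of squares
units = defaultdict(list)
for unit in all_units:
    for square in unit:
        units[square].append(unit)
# units['A1'] holds the three units containing 'A1' (its row, column, and box)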
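The Sudoku snippet above depends on an `all_units` value it does not define. A self-contained sketch, assuming a standard 9x9 grid with squares named 'A1' to 'I9' (the `rows`, `cols`, and `cross` names are assumptions for illustration), could look like this:

```python
from collections import defaultdict

rows, cols = "ABCDEFGHI", "123456789"  # assumed square naming: 'A1' .. 'I9'


def cross(A, B):
    """All concatenations of a character from A with a character from B."""
    return [a + b for a in A for b in B]


# Every unit is a list of 9 squares: 9 columns, 9 rows, 9 boxes.
all_units = (
    [cross(rows, c) for c in cols]
    + [cross(r, cols) for r in rows]
    + [cross(rs, cs) for rs in ("ABC", "DEF", "GHI") for cs in ("123", "456", "789")]
)

# Invert the relationship: square -> list of units that contain it.
units = defaultdict(list)
for unit in all_units:
    for square in unit:
        units[square].append(unit)

print(len(all_units))    # 27
print(len(units["A1"]))  # 3
```

Building the inverted index once means each later lookup of a square's units is a single dict access.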
Key Principles
- Factory, not value: Pass `list`, not `[]`
- Auto-creates on access: Even `d[key]` creates entry
- Symmetric graphs: Add both directions for undirected
- Nesting: `defaultdict(lambda: defaultdict(int))`
- Check existence carefully: `key in d` before access if you don't want creation
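The principles above can be checked with a short sketch. All variable names are illustrative.

```python
from collections import defaultdict

# Factory, not value: a non-callable default such as [] is rejected.
try:
    defaultdict([])
    rejected = False
except TypeError:
    rejected = True

d = defaultdict(list)

# Membership tests and .get() do not create entries.
has_x = "x" in d        # False
got = d.get("x")        # None
size_before = len(d)    # 0

# A plain read through d[key] does create one.
_ = d["x"]
size_after = len(d)     # 1

# Nesting: the outer factory builds an inner defaultdict(int) on demand.
nested = defaultdict(lambda: defaultdict(int))
nested["2024"]["jan"] += 1

print(rejected, has_x, got, size_before, size_after)  # True False None 0 1
print(nested["2024"]["jan"])                          # 1
```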
