Discovery.Data_Audit
by edwardmonteiro
name: discovery.data_audit
phase: discovery
roles:
  - Data Analyst
  - Analytics Engineer
description: Inventory available datasets, instrumentation gaps, and data quality considerations for the initiative.
variables:
  required:
    - name: domain
      description: Product area or journey requiring data assessment.
    - name: decision_goals
      description: Business or product decisions the data should support.
  optional:
    - name: current_sources
      description: Known data sources or dashboards already leveraged.
    - name: compliance_flags
      description: Privacy or governance issues to consider.
outputs:
  - Data catalog listing sources, owners, freshness, and accessibility.
  - Gap analysis with recommended instrumentation or ETL changes.
  - Alignment summary on how data will support upcoming decisions.
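The required/optional split above can be checked before a run. The following is a minimal Python sketch, not part of the skill itself: the variable names come from the metadata above, while `validate_vars`, the empty-string default for optional variables, and the rejection of unknown names are assumptions made for illustration.

```python
# Variable names are taken from the skill metadata; everything else is illustrative.
REQUIRED = ("domain", "decision_goals")
OPTIONAL = ("current_sources", "compliance_flags")


def validate_vars(supplied: dict[str, str]) -> dict[str, str]:
    """Return all four variables, defaulting missing optional ones to ''.

    Raises ValueError when a required variable is missing or blank,
    or when a name outside the declared set is supplied (an assumption;
    the skill does not say how unknown names are handled).
    """
    missing = [name for name in REQUIRED if not supplied.get(name, "").strip()]
    if missing:
        raise ValueError(f"missing required variables: {', '.join(missing)}")

    unknown = sorted(set(supplied) - set(REQUIRED) - set(OPTIONAL))
    if unknown:
        raise ValueError(f"unknown variables: {', '.join(unknown)}")

    return {name: supplied.get(name, "") for name in REQUIRED + OPTIONAL}
```

For example, `validate_vars({"domain": "checkout", "decision_goals": "reduce drop-off"})` returns all four keys with the two optional ones set to empty strings.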
Purpose
Give analytics partners a reusable way to surface the state of data readiness and highlight what is needed to support discovery.
Pre-run Checklist
- ✅ Access existing schema documentation or data dictionaries.
- ✅ Review outstanding data governance tickets or debt.
- ✅ Align with product on the decision timeline and required fidelity.
Invocation Guidance
codex skills run discovery.data_audit \
--vars "domain={{domain}}" \
"decision_goals={{decision_goals}}" \
"current_sources={{current_sources}}" \
"compliance_flags={{compliance_flags}}"
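The `{{...}}` placeholders in the command above have to be filled before the command is run. Here is a small Python sketch of that substitution, assuming plain string replacement; `fill_placeholders` and `build_var_args` are hypothetical helpers, and the sketch only prepares the argument strings. It does not invoke `codex` or assume anything about how the CLI parses `--vars`.

```python
import re

# Matches {{name}} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

# The four key=value templates exactly as they appear in the invocation.
VAR_TEMPLATES = (
    "domain={{domain}}",
    "decision_goals={{decision_goals}}",
    "current_sources={{current_sources}}",
    "compliance_flags={{compliance_flags}}",
)


def fill_placeholders(template: str, values: dict[str, str]) -> str:
    """Replace each {{name}} with values[name]; fail loudly if a name is absent."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder '{name}'")
        return values[name]

    return PLACEHOLDER.sub(replace, template)


def build_var_args(values: dict[str, str]) -> list[str]:
    """Return the filled key=value strings, one per template."""
    return [fill_placeholders(template, values) for template in VAR_TEMPLATES]
```

Passing the result as a list of separate arguments (rather than one joined string) avoids shell-quoting problems when a value contains spaces.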
Recommended Input Attachments
- Links to Looker/Mode dashboards or warehouse tables.
- Screenshots of tracking plans or event schemas.
Claude Workflow Outline
- Summarize the decision goals and domain context.
- Produce a data catalog table with source details, owners, freshness, and trust level.
- Identify instrumentation or modeling gaps blocking the decision goals.
- Recommend implementation steps, owners, and sequencing.
- Outline interim proxies or experiments while data gaps are addressed.
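The gap-identification step in this outline can be approximated mechanically when a tracking plan exists. The sketch below assumes each decision goal has been mapped by hand to the events or tables it needs; `find_gaps` and the example event names are hypothetical, and the real audit will usually also need judgment about data quality, not just presence.

```python
def find_gaps(
    required_by_goal: dict[str, set[str]],
    available: set[str],
) -> dict[str, list[str]]:
    """Map each decision goal to the sorted list of inputs it needs but lacks.

    Goals whose inputs are all available are omitted from the result.
    """
    gaps: dict[str, list[str]] = {}
    for goal, needed in required_by_goal.items():
        missing = sorted(needed - available)
        if missing:
            gaps[goal] = missing
    return gaps


# Hypothetical example: one goal is fully covered, one is blocked.
example = find_gaps(
    {
        "reduce checkout drop-off": {"cart_viewed", "payment_submitted"},
        "size the returning-user segment": {"session_started"},
    },
    available={"cart_viewed", "session_started"},
)
```

Each entry in the result corresponds to one candidate row in the "Gaps & Recommendations" list of the output template.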
Output Template
## Data Inventory
| Source | Owner | Freshness | Accessibility | Trust Level | Notes |
| --- | --- | --- | --- | --- | --- |
## Gaps & Recommendations
1. Gap — Impact — Suggested Fix — Owner — Timeline
## Decision Support Plan
- Immediate next step:
- Interim proxy:
- Long-term instrumentation:
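If the catalog is collected as structured records, the Data Inventory table in this template can be generated rather than typed by hand. A minimal Python sketch follows; the column names match the template, while `SourceRecord`, `render_inventory`, and the sample row are assumptions for illustration only.

```python
from dataclasses import dataclass

# Column headers copied from the Data Inventory table in the template.
HEADER = ("Source", "Owner", "Freshness", "Accessibility", "Trust Level", "Notes")


@dataclass
class SourceRecord:
    source: str
    owner: str
    freshness: str
    accessibility: str
    trust_level: str
    notes: str = ""


def render_inventory(records: list[SourceRecord]) -> str:
    """Render the '## Data Inventory' section as a markdown table."""
    lines = [
        "## Data Inventory",
        "| " + " | ".join(HEADER) + " |",
        "| " + " | ".join(["---"] * len(HEADER)) + " |",
    ]
    for r in records:
        cells = (r.source, r.owner, r.freshness, r.accessibility, r.trust_level, r.notes)
        # Escape pipes so a cell value cannot break the table layout.
        lines.append("| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |")
    return "\n".join(lines)


# Hypothetical sample row, not real data.
sample = render_inventory(
    [SourceRecord("events.checkout", "Analytics Eng", "Daily", "Warehouse", "Medium")]
)
```

Keeping the records structured also makes it easy to sort or filter the catalog, for example by trust level, before rendering.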
Follow-up Actions
- File tracking or warehouse work items with clear acceptance criteria.
- Communicate data readiness to product and engineering leadership.
- Schedule follow-up audits post-implementation.
