Workflow Management
by treasure-data
TD workflow debugging and operations. Covers tdx wf commands for monitoring (sessions, attempt, logs), retry/backfill patterns, alerting (_error with Slack/email), and data quality checks.
TD Workflow Management
Setup & Context
tdx wf use my_project # Set default project for session
tdx wf pull my_project # Pull project locally for editing
tdx wf push # Push changes with diff preview
Monitoring Commands
tdx wf sessions # List runs (uses session context)
tdx wf sessions --status error # Filter by status
tdx wf attempt <id> tasks # Show task status
tdx wf attempt <id> logs +task_name # View logs
Debugging Steps
- Check the error in the failed task's logs: tdx wf attempt <id> logs +failed_task
- Verify query syntax if td> failed
- Check time ranges: does data exist for session_date?
- Validate parameter values
- Check resource limits (memory, timeout)
Retry Operations
tdx wf attempt <id> retry # Retry from start
tdx wf attempt <id> retry --resume-from +step # Retry from task
tdx wf attempt <id> retry --params '{"key":"val"}' # Override params
tdx wf attempt <id> kill # Stop running
Alerting
+critical_task:
  td>: queries/important.sql
  _error:
    +slack_alert:
      sh>: |
        curl -X POST ${secret:slack.webhook_url} \
          -H 'Content-Type: application/json' \
          -d '{"text": "Workflow failed: ${session_id}"}'
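If the alert text ever carries quotes or backslashes (an error message, for instance), the hand-written JSON above would no longer be valid. The following is a minimal sketch of a payload builder, assuming the alert is sent from a standalone script rather than inline in the .dig file. The names json_escape, slack_payload, and SLACK_WEBHOOK_URL are hypothetical and not part of tdx.

```shell
#!/usr/bin/env bash

# json_escape STRING
# Escapes backslashes and double quotes so STRING can sit inside a JSON string.
# Newlines and other control characters are not handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# slack_payload MESSAGE
# Prints a JSON body with MESSAGE as the "text" field.
slack_payload() {
  printf '{"text": "%s"}' "$(json_escape "$1")"
}

# Hypothetical usage; SLACK_WEBHOOK_URL is an assumed environment variable.
# curl -X POST "$SLACK_WEBHOOK_URL" \
#   -H 'Content-Type: application/json' \
#   -d "$(slack_payload 'Workflow failed: job "daily"')"
```

Keeping the escaping in one function means the curl call stays unchanged when the message format does.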
Data Quality Checks
+process:
  td>: queries/process.sql
  create_table: results

+validate:
  td>:
    query: |
      SELECT COUNT(*) as cnt,
             SUM(CASE WHEN id IS NULL THEN 1 ELSE 0 END) as nulls
      FROM results
  store_last_results: true

+check:
  if>: ${td.last_results.cnt == 0}
  _do:
    +fail:
      sh>: exit 1
Wait for Data
+wait_for_data:
  sh>: |
    for i in {1..30}; do
      COUNT=$(tdx query -d analytics "SELECT COUNT(*) FROM src WHERE date='${session_date}'" --format csv | tail -1)
      if [ "$COUNT" -gt 0 ]; then exit 0; fi
      sleep 60
    done
    exit 1
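The polling loop above can be factored into a reusable helper. This is a sketch, not part of tdx: wait_until and has_rows are hypothetical names, and the script is assumed to live in its own file called from sh>, because shell expansions such as ${count:-0} would clash with the ${...} variable syntax if written inline in a .dig file.

```shell
#!/usr/bin/env bash

# wait_until MAX_ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# Returns 0 on the first success, 1 if every attempt fails.
wait_until() {
  local max_attempts=$1 interval=$2
  shift 2
  local attempt
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    if "$@"; then
      return 0
    fi
    # Skip the sleep after the final attempt so failure is reported promptly.
    if ((attempt < max_attempts)); then
      sleep "$interval"
    fi
  done
  return 1
}

# has_rows DATE
# Hypothetical check built from the query in the task above.
# An empty result is treated as zero rows.
has_rows() {
  local count
  count=$(tdx query -d analytics "SELECT COUNT(*) FROM src WHERE date='$1'" --format csv | tail -1)
  [ "${count:-0}" -gt 0 ]
}

# Example: poll once a minute, for up to 30 attempts.
# wait_until 30 60 has_rows "2024-01-01"
```

Passing the check as a command keeps the retry policy separate from the query, so the same helper can wait on other conditions.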
Idempotent Operations
+safe_insert:
  td>:
    query: |
      DELETE FROM target WHERE date = '${session_date}';
      INSERT INTO target SELECT * FROM source WHERE date = '${session_date}'
Backfill Pattern
+backfill:
  for_each>:
    dates: ["2024-01-01", "2024-01-02", "2024-01-03"]
  _do:
    +process:
      call>: main_workflow.dig
      params:
        session_date: ${dates}
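Typing a long date list by hand is error-prone. The following sketch prints a date range in the list format used above. It assumes GNU date (the -d option), and date_range and dates_as_yaml_list are hypothetical helper names.

```shell
#!/usr/bin/env bash

# date_range START END
# Prints every date from START to END inclusive, one YYYY-MM-DD per line.
# Assumes GNU date; BSD/macOS date uses different flags.
date_range() {
  local current=$1 end=$2
  # ISO dates sort correctly as strings, so a lexical comparison suffices.
  while [[ ! "$current" > "$end" ]]; do
    echo "$current"
    current=$(date -u -d "$current + 1 day" +%F)
  done
}

# dates_as_yaml_list START END
# Formats the range as a quoted, comma-separated list in square brackets.
dates_as_yaml_list() {
  date_range "$1" "$2" \
    | sed 's/.*/"&"/' \
    | paste -sd, - \
    | sed 's/,/, /g; s/^/[/; s/$/]/'
}

dates_as_yaml_list 2024-01-01 2024-01-03
# prints ["2024-01-01", "2024-01-02", "2024-01-03"]
```

The output can be pasted directly as the value of the dates key.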
Secrets Management
tdx wf secrets list # List secret keys (values hidden)
tdx wf secrets set API_KEY=xxx # Set a secret
tdx wf secrets delete API_KEY # Delete a secret
Usage in .dig files:
+task:
  sh>: curl -H "Authorization: ${secret:API_KEY}" https://api.example.com
Common Issues
| Issue | Solution |
|---|---|
| Timeout | Add timeout: 3600s, _retry: 2 |
| Intermittent failures | Add _retry with limit: 5, an interval (e.g. interval: 10), and interval_type: exponential; a bare _retry: 5 retries without backoff |
| Out of memory | Reduce data volume, use approx functions |
| Duplicate runs | Use idempotent DELETE+INSERT pattern |
