Database Maintenance
by GangWooLee
Database maintenance and health checks. Use when user needs migration safety, data integrity, index optimization, or says "check database", "optimize DB", "migration rollback", "data consistency", "database health".
Performs database maintenance tasks such as migrations, data integrity checks, and index optimization.
Quick Start
Task Progress (copy and check off):
- [ ] 1. Identify maintenance need
- [ ] 2. Run appropriate check/fix script
- [ ] 3. Review results
- [ ] 4. Apply fixes if needed
- [ ] 5. Verify database health
Use Cases
✅ When to use:
- Before deploying migrations
- After major data changes
- Regular health checks
- Performance degradation
- Data inconsistency reports
- Index optimization needed
Maintenance Tasks
1. Migration Safety Check
Check pending migrations:

```bash
rails db:migrate:status
```

Announce what a migration will do (note: this logs the intent, but the change still executes):

```ruby
# Add to the migration file
def change
  reversible do |dir|
    dir.up do
      # Log what will happen
      Rails.logger.info "Will add index on users.email"
    end
  end

  add_index :users, :email, if_not_exists: true
end
```
Safe migration patterns:

```ruby
# ✅ Good: add a column with a default
def change
  add_column :posts, :view_count, :integer, default: 0, null: false
end

# ✅ Good: add an index concurrently (PostgreSQL)
disable_ddl_transaction!

def change
  add_index :posts, :user_id, algorithm: :concurrently
end

# ❌ Bad: remove a column without a safety period
def change
  remove_column :users, :legacy_field # Dangerous!
end

# ✅ Good: remove a column with a deprecation period
# Step 1: ignore the column (deploy)
class User < ApplicationRecord
  self.ignored_columns = [:legacy_field]
end

# Step 2: remove the column after all servers are updated
def change
  remove_column :users, :legacy_field
end
```
2. Data Integrity Checks

Foreign key validation:

```ruby
# Check for orphaned records
Post.where.missing(:user).count
Comment.where.missing(:post).count

# Fix orphaned records
Post.where.missing(:user).delete_all
```

Counter cache validation:

```ruby
# Check counter cache accuracy
User.find_each do |user|
  actual_count = user.posts.count
  cached_count = user.posts_count

  if actual_count != cached_count
    puts "User #{user.id}: cached=#{cached_count}, actual=#{actual_count}"
    user.update_column(:posts_count, actual_count)
  end
end
```

Uniqueness validation:

```ruby
# Find duplicate emails
User.group(:email).having('COUNT(*) > 1').count

# Fix duplicates (keep the oldest record)
User.group(:email)
    .having('COUNT(*) > 1')
    .pluck(:email)
    .each do |email|
      users = User.where(email: email).order(:created_at)
      users.drop(1).each(&:destroy)
    end
```
3. Index Optimization

Find missing indexes:

```ruby
# Check for foreign keys without indexes
ActiveRecord::Base.connection.tables.each do |table|
  columns = ActiveRecord::Base.connection.columns(table)
  indexes = ActiveRecord::Base.connection.indexes(table)

  columns.select { |c| c.name.end_with?('_id') }.each do |column|
    unless indexes.any? { |i| i.columns.include?(column.name) }
      puts "Missing index: #{table}.#{column.name}"
    end
  end
end
```
Find unused indexes (PostgreSQL):

```sql
SELECT
  schemaname,
  tablename,
  indexname,
  idx_scan AS index_scans
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY schemaname, tablename;
```
Add composite indexes:

```ruby
# For queries like: Post.where(user_id: x, status: :published)
add_index :posts, [:user_id, :status]

# For queries with ORDER BY
add_index :posts, [:user_id, :created_at]
```
4. Database Health Check

Connection pool status:

```ruby
# Check connection pool usage
pool = ActiveRecord::Base.connection_pool
stat = pool.stat
puts "Size: #{stat[:size]}, Busy: #{stat[:busy]}, Idle: #{stat[:idle]}, Waiting: #{stat[:waiting]}"
```
Table sizes (PostgreSQL):

```sql
SELECT
  table_name,
  pg_size_pretty(pg_total_relation_size(quote_ident(table_name))) AS size
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY pg_total_relation_size(quote_ident(table_name)) DESC;
```
Slow queries log:

```ruby
# config/environments/production.rb
config.active_record.logger = ActiveSupport::Logger.new('log/db_queries.log')
config.log_level = :debug
```

To find slow queries (> 100 ms), use the logging setup from the logging-setup skill.
5. Migration Rollback

Rollback the last migration:

```bash
rails db:rollback
```

Rollback multiple steps:

```bash
rails db:rollback STEP=3
```

Rollback to a specific version:

```bash
rails db:migrate:down VERSION=20231219000000
```
Safe rollback pattern:

```ruby
class AddEmailToUsers < ActiveRecord::Migration[8.0]
  def up
    add_column :users, :email, :string
    add_index :users, :email, unique: true
  end

  def down
    remove_index :users, :email
    remove_column :users, :email
  end
end
```
6. Data Migration

Backfill data safely:

```ruby
# Use a background job for large datasets
class BackfillUserEmailsJob < ApplicationJob
  def perform(batch_size: 1000)
    User.where(email: nil).find_in_batches(batch_size: batch_size) do |users|
      users.each do |user|
        user.update(email: generate_email(user)) # generate_email is app-specific
      end

      # Sleep to avoid overloading the DB
      sleep 0.1
    end
  end
end
```
Data migration in a migration file:

```ruby
class MigrateOldDataToNewFormat < ActiveRecord::Migration[8.0]
  def up
    # Add the new column
    add_column :posts, :metadata, :jsonb, default: {}

    # Migrate data in batches
    Post.find_each do |post|
      post.update_column(:metadata, {
        old_field: post.old_field,
        legacy_data: post.legacy_data
      })
    end
  end

  def down
    remove_column :posts, :metadata
  end
end
```
Automation Scripts
Database Health Check Script

```bash
ruby .claude/skills/database-maintenance/scripts/health_check.rb
```
The script checks:
- Pending migrations
- Orphaned records
- Counter cache accuracy
- Missing indexes on foreign keys
- Connection pool status
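The bundled script isn't reproduced here, but the pattern it implies can be sketched framework-free. The check names and lambdas below are illustrative stand-ins; a real run would use the ActiveRecord queries from the sections above:

```ruby
# Minimal check-runner sketch: each check returns true (healthy) or false.
# The lambdas are placeholders for real queries.
CHECKS = {
  "pending migrations" => -> { true }, # e.g. shell out to `rails db:migrate:status`
  "orphaned posts"     => -> { true }, # e.g. Post.where.missing(:user).none?
  "connection pool"    => -> { true }  # e.g. no threads waiting on the pool
}

def run_health_checks(checks)
  results = checks.transform_values(&:call)
  results.each { |name, ok| puts format("%-20s %s", name, ok ? "OK" : "FAIL") }
  results.values.all?
end

run_health_checks(CHECKS) # prints one OK/FAIL line per check, returns true if all pass
```

Exiting nonzero when the method returns false makes a script like this usable as a CI or cron gate.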
Index Optimization Script

```bash
ruby .claude/skills/database-maintenance/scripts/optimize_indexes.rb
```
The script:
- Identifies missing indexes
- Suggests composite indexes
- Finds unused indexes
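The missing-index heuristic can be sketched without a live database. The schema hash below is illustrative input, not the script's actual data source (the real script would read tables, columns, and indexes from the connection, as shown in the Index Optimization section):

```ruby
# Sketch of the missing-index heuristic: flag *_id columns that no index
# covers as its leading column. A composite index such as [user_id, status]
# covers lookups on user_id, so only the first index column is checked.
def missing_fk_indexes(schema)
  schema.flat_map do |table, info|
    fk_columns = info[:columns].select { |c| c.end_with?("_id") }
    fk_columns.reject { |c| info[:indexes].any? { |idx| idx.first == c } }
              .map { |c| "#{table}.#{c}" }
  end
end

schema = {
  "posts"    => { columns: %w[id user_id status], indexes: [%w[user_id status]] },
  "comments" => { columns: %w[id post_id user_id], indexes: [%w[post_id]] }
}
missing_fk_indexes(schema) # => ["comments.user_id"]
```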
Best Practices
Migration Safety
- Always test migrations locally first
- Use reversible migrations (up/down methods)
- Add indexes with if_not_exists: true
- Use disable_ddl_transaction! for large tables (PostgreSQL)
- Never remove a column in the same deploy that stops using it (give it a deprecation period)
Data Integrity
- Run integrity checks before major releases
- Set up foreign key constraints where appropriate
- Use database-level validations (unique indexes)
- Regular counter cache resets
Performance
- Monitor query performance (logging-setup skill)
- Add indexes for frequently queried columns
- Use composite indexes for multi-column queries
- Remove unused indexes (they slow down writes)
Backup & Recovery
- Always backup before major migrations
- Test rollback procedures
- Keep migration files in version control
- Document destructive migrations
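As a sketch of the pre-migration backup step, assuming a PostgreSQL database reachable via `DATABASE_URL` (the `pg_dump` call is commented out because it needs a live server):

```ruby
# Build a timestamped dump filename so repeated backups never collide.
def backup_filename(prefix: "backup", time: Time.now)
  "#{prefix}_#{time.strftime('%Y%m%d_%H%M%S')}.dump"
end

backup_filename(time: Time.new(2023, 12, 19, 1, 2, 3)) # => "backup_20231219_010203.dump"

# Invocation sketch -- requires a reachable database:
# system("pg_dump", "--format=custom", "--file", backup_filename,
#        ENV.fetch("DATABASE_URL"))
```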
Common Issues & Solutions

Issue: Migration fails in production

```bash
# 1. Check migration status
rails db:migrate:status

# 2. Rollback if partially applied
rails db:rollback

# 3. Fix the migration file, then re-run it
rails db:migrate
```

Issue: Counter caches out of sync

```bash
# Reset the comments counter cache on every post
rails runner "Post.find_each { |p| Post.reset_counters(p.id, :comments) }"
```

Issue: Orphaned records causing errors

```ruby
# Find and clean orphaned records
Comment.where.missing(:post).delete_all
```

Issue: Slow queries

1. Enable query logging (logging-setup skill)
2. Identify the slow queries
3. Add appropriate indexes
4. Optimize the query with includes/joins
PostgreSQL-Specific Tips

Analyze tables after major changes:

```sql
ANALYZE table_name;
```

Vacuum to reclaim space:

```sql
VACUUM ANALYZE;
```

Reindex for performance:

```sql
REINDEX INDEX index_name;
```

SQLite-Specific Tips

Integrity check (via `rails dbconsole`):

```sql
PRAGMA integrity_check;
```

Optimize the database (via `rails dbconsole`):

```sql
VACUUM;
PRAGMA optimize;
```
Monitoring & Alerts
Integrate with logging-setup skill for:
- Migration duration tracking
- Query performance monitoring
- Data integrity alerts
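Migration duration tracking doesn't require Rails hooks; a plain timing wrapper (the `with_duration` helper name is hypothetical) can emit the log lines a monitoring pipeline alerts on:

```ruby
# Hypothetical timing wrapper: run a block and log how long it took.
def with_duration(label, io: $stdout)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  io.puts format("%s took %.2fs", label, elapsed)
  result
end

with_duration("backfill emails") { sleep 0.05 } # logs the elapsed time to stdout
```

Wrapping each maintenance task this way gives the log-based monitoring a consistent line format to match on.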
Checklist

- [ ] Migrations tested locally
- [ ] Migrations are reversible
- [ ] Foreign keys have indexes
- [ ] Counter caches are accurate
- [ ] No orphaned records
- [ ] Backup taken before major changes
- [ ] Rollback procedure documented
- [ ] Performance impact assessed
Related Skills
Xlsx
Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modify existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas
Clickhouse Io
ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.
Analyzing Financial Statements
This skill calculates key financial ratios and metrics from financial statement data for investment analysis
Data Storytelling
Transform data into compelling narratives using visualization, context, and persuasive structure. Use when presenting analytics to stakeholders, creating data reports, or building executive presentations.
Kpi Dashboard Design
Design effective KPI dashboards with metrics selection, visualization best practices, and real-time monitoring patterns. Use when building business dashboards, selecting metrics, or designing data visualization layouts.
Dbt Transformation Patterns
Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.
Sql Optimization Patterns
Master SQL query optimization, indexing strategies, and EXPLAIN analysis to dramatically improve database performance and eliminate slow queries. Use when debugging slow queries, designing database schemas, or optimizing application performance.
Anndata
This skill should be used when working with annotated data matrices in Python, particularly for single-cell genomics analysis, managing experimental measurements with metadata, or handling large-scale biological datasets. Use when tasks involve AnnData objects, h5ad files, single-cell RNA-seq data, or integration with scanpy/scverse tools.
