Database Maintenance

by GangWooLee


Database maintenance and health checks. Use when user needs migration safety, data integrity, index optimization, or says "check database", "optimize DB", "migration rollback", "data consistency", "database health".


Performs database maintenance tasks such as database migrations, data integrity checks, and index optimization.

Quick Start

Task Progress (copy and check off):
- [ ] 1. Identify maintenance need
- [ ] 2. Run appropriate check/fix script
- [ ] 3. Review results
- [ ] 4. Apply fixes if needed
- [ ] 5. Verify database health

Use Cases

When to use:

  • Before deploying migrations
  • After major data changes
  • Regular health checks
  • Performance degradation
  • Data inconsistency reports
  • Index optimization needed

Maintenance Tasks

1. Migration Safety Check

Check pending migrations:

rails db:migrate:status

Dry-run migration (check SQL without executing):

# Add to migration file
def change
  reversible do |dir|
    dir.up do
      # Log what will happen
      Rails.logger.info "Will add index on users.email"
    end
  end

  add_index :users, :email, if_not_exists: true
end

Safe migration patterns:

# ✅ Good: Add column with default
def change
  add_column :posts, :view_count, :integer, default: 0, null: false
end

# ✅ Good: Add index concurrently (PostgreSQL)
disable_ddl_transaction!
def change
  add_index :posts, :user_id, algorithm: :concurrently
end

# ❌ Bad: Remove column without safety period
def change
  remove_column :users, :legacy_field  # Dangerous!
end

# ✅ Good: Remove column with deprecation period
# Step 1: Ignore column (deploy)
class User < ApplicationRecord
  self.ignored_columns += [:legacy_field]
end

# Step 2: Remove column after all servers updated
def change
  remove_column :users, :legacy_field
end

2. Data Integrity Checks

Foreign key validation:

# Check for orphaned records
Post.where.missing(:user).count
Comment.where.missing(:post).count

# Fix orphaned records
Post.where.missing(:user).delete_all

Counter cache validation:

# Check counter cache accuracy
User.find_each do |user|
  actual_count = user.posts.count
  cached_count = user.posts_count

  if actual_count != cached_count
    puts "User #{user.id}: cached=#{cached_count}, actual=#{actual_count}"
    user.update_column(:posts_count, actual_count)
  end
end

Uniqueness validation:

# Find duplicate emails
User.group(:email).having('COUNT(*) > 1').count

# Fix duplicates (keep oldest)
User.select(:email)
    .group(:email)
    .having('COUNT(*) > 1')
    .pluck(:email)
    .each do |email|
  users = User.where(email: email).order(:created_at)
  users[1..-1].each(&:destroy)
end

3. Index Optimization

Find missing indexes:

# Check for foreign keys without indexes
ActiveRecord::Base.connection.tables.each do |table|
  columns = ActiveRecord::Base.connection.columns(table)
  indexes = ActiveRecord::Base.connection.indexes(table)

  columns.select { |c| c.name.end_with?('_id') }.each do |column|
    unless indexes.any? { |i| i.columns.include?(column.name) }
      puts "Missing index: #{table}.#{column.name}"
    end
  end
end

Find unused indexes (PostgreSQL):

SELECT
  schemaname,
  tablename,
  indexname,
  idx_scan as index_scans
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY schemaname, tablename;

Add composite indexes:

# For queries like: Post.where(user_id: x, status: :published)
add_index :posts, [:user_id, :status]

# For queries with ORDER BY
add_index :posts, [:user_id, :created_at]

4. Database Health Check

Connection pool status:

# Check connection pool usage (pool.stat is available in Rails 5.1+)
stat = ActiveRecord::Base.connection_pool.stat
puts "Size: #{stat[:size]}, Busy: #{stat[:busy]}, Idle: #{stat[:idle]}, Waiting: #{stat[:waiting]}"

Table sizes (PostgreSQL):

SELECT
  table_name,
  pg_size_pretty(pg_total_relation_size(quote_ident(table_name))) as size
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY pg_total_relation_size(quote_ident(table_name)) DESC;

Slow queries log:

# config/environments/production.rb
config.active_record.logger = ActiveSupport::Logger.new('log/db_queries.log')
config.log_level = :debug

# Find slow queries (> 100ms)
# Requires logging setup from logging-setup skill

5. Migration Rollback

Rollback last migration:

rails db:rollback

Rollback multiple steps:

rails db:rollback STEP=3

Rollback to specific version:

rails db:migrate:down VERSION=20231219000000

Safe rollback pattern:

class AddEmailToUsers < ActiveRecord::Migration[8.0]
  def up
    add_column :users, :email, :string
    add_index :users, :email, unique: true
  end

  def down
    remove_index :users, :email
    remove_column :users, :email
  end
end

6. Data Migration

Backfill data safely:

# Use background job for large datasets
class BackfillUserEmailsJob < ApplicationJob
  def perform(batch_size: 1000)
    User.where(email: nil).find_in_batches(batch_size: batch_size) do |users|
      users.each do |user|
        user.update(email: generate_email(user))
      end

      # Sleep to avoid overloading DB
      sleep 0.1
    end
  end
end

Data migration in migration file:

class MigrateOldDataToNewFormat < ActiveRecord::Migration[8.0]
  def up
    # Add new column
    add_column :posts, :metadata, :jsonb, default: {}

    # Migrate data in batches
    Post.find_each do |post|
      post.update_column(:metadata, {
        old_field: post.old_field,
        legacy_data: post.legacy_data
      })
    end
  end

  def down
    remove_column :posts, :metadata
  end
end

Automation Scripts

Database Health Check Script

# Run via: ruby .claude/skills/database-maintenance/scripts/health_check.rb

The script checks:

  • Pending migrations
  • Orphaned records
  • Counter cache accuracy
  • Missing indexes on foreign keys
  • Connection pool status
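The counter-cache comparison step such a script might perform can be sketched without a live database. `counter_cache_drift` is a hypothetical helper, and the inputs are plain hashes of record id to count rather than ActiveRecord objects:

```ruby
# Hedged sketch: compare cached counter values against actual counts.
# In the real script, `cached` would come from a column like users.posts_count
# and `actual` from a grouped COUNT query; here both are plain hashes.
def counter_cache_drift(cached, actual)
  cached.each_with_object({}) do |(id, cached_count), drift|
    actual_count = actual.fetch(id, 0)
    # Record only the ids whose cached value disagrees with reality
    drift[id] = { cached: cached_count, actual: actual_count } if cached_count != actual_count
  end
end
```

A record appears in the result only when the two counts disagree, which is exactly the set of rows the `update_column(:posts_count, ...)` fix from the Data Integrity section would need to touch.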

Index Optimization Script

# Run via: ruby .claude/skills/database-maintenance/scripts/optimize_indexes.rb

The script:

  • Identifies missing indexes
  • Suggests composite indexes
  • Finds unused indexes
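The missing-index detection such a script might use can be reduced to pure data: given a table's column names and its index definitions (each an array of column names), report the `_id` columns no index covers. `missing_fk_indexes` is a hypothetical helper; note it checks the index's *leading* column, since a composite index only serves a foreign-key lookup when that column is its prefix:

```ruby
# Hedged sketch: find foreign-key-style columns with no covering index.
# columns: array of column name strings for one table
# indexes: array of index definitions, each an array of column names
def missing_fk_indexes(columns, indexes)
  fk_columns = columns.select { |c| c.end_with?("_id") }
  # An index covers a FK column only if that column is its leading column
  fk_columns.reject { |c| indexes.any? { |index_cols| index_cols.first == c } }
end
```

Feeding this the output of `connection.columns(table)` and `connection.indexes(table)` (mapped to names) reproduces the loop shown in the Index Optimization section.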

Best Practices

Migration Safety

  1. Always test migrations locally first
  2. Use reversible migrations (up/down methods)
  3. Add indexes with if_not_exists: true
  4. Use disable_ddl_transaction! for large tables (PostgreSQL)
  5. Never remove columns in the same deploy that stops using them (use a deprecation period)

Data Integrity

  1. Run integrity checks before major releases
  2. Set up foreign key constraints where appropriate
  3. Use database-level validations (unique indexes)
  4. Regular counter cache resets

Performance

  1. Monitor query performance (logging-setup skill)
  2. Add indexes for frequently queried columns
  3. Use composite indexes for multi-column queries
  4. Remove unused indexes (they slow down writes)

Backup & Recovery

  1. Always backup before major migrations
  2. Test rollback procedures
  3. Keep migration files in version control
  4. Document destructive migrations

Common Issues & Solutions

Issue: Migration fails in production

# 1. Check migration status
rails db:migrate:status

# 2. Rollback if partially applied
rails db:rollback

# 3. Fix migration file
# 4. Re-run migration
rails db:migrate

Issue: Counter caches out of sync

# Reset all counter caches
rails runner "Post.find_each { |p| Post.reset_counters(p.id, :comments) }"

Issue: Orphaned records causing errors

# Find and clean orphaned records
Comment.where.missing(:post).delete_all

Issue: Slow queries

# 1. Enable query logging (logging-setup skill)
# 2. Identify slow queries
# 3. Add appropriate indexes
# 4. Optimize query with includes/joins

PostgreSQL-Specific Tips

Analyze tables after major changes:

ANALYZE table_name;

Vacuum to reclaim space:

VACUUM ANALYZE;

Reindex for performance:

REINDEX INDEX index_name;

SQLite-Specific Tips

Integrity check:

rails dbconsole
> PRAGMA integrity_check;

Optimize database:

rails dbconsole
> VACUUM;
> PRAGMA optimize;

Monitoring & Alerts

Integrate with logging-setup skill for:

  • Migration duration tracking
  • Query performance monitoring
  • Data integrity alerts
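The query-performance side of this monitoring can be sketched as a plain filter over query timings. `slow_queries` is a hypothetical helper, and the timings are plain hashes here; in a Rails app they could be collected by an `ActiveSupport::Notifications` subscriber on `"sql.active_record"`:

```ruby
# Hedged sketch: keep only queries over a threshold, slowest first.
# The 100 ms threshold matches the "Find slow queries (> 100ms)" note above.
SLOW_QUERY_THRESHOLD_MS = 100

def slow_queries(timings, threshold_ms = SLOW_QUERY_THRESHOLD_MS)
  timings.select { |t| t[:duration_ms] > threshold_ms }
         .sort_by { |t| -t[:duration_ms] }
end
```

The surviving entries are the candidates for the index and `includes`/`joins` fixes listed under Common Issues.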

Checklist

  • Migrations tested locally
  • Migrations are reversible
  • Foreign keys have indexes
  • Counter caches are accurate
  • No orphaned records
  • Backup taken before major changes
  • Rollback procedure documented
  • Performance impact assessed


Skill Information

Category: Data
Last Updated: 12/19/2025