name: funnel-analysis description: Conversion funnel analysis with drop-off investigation. Use when analyzing multi-step processes, identifying conversion bottlenecks, A/B testing funnel performance, or optimizing user journeys.

Funnel Analysis

Quick Start

Analyze multi-step user journeys to measure conversion rates, identify drop-off points, compare segments, and optimize funnel performance with actionable insights.

Context Requirements

Before analyzing the funnel, I need:

Funnel Steps: The sequence of actions users take
Event Data: User activity showing who completed each step
Time Window: How long users have to complete the funnel
Success Criteria: What counts as completion at each step
Segments (optional): Groups to compare (e.g., by channel, device, cohort)

Context Gathering

For Funnel Steps:

"Please define the funnel steps in order. For example:

E-commerce Purchase Funnel:

View Product Page
Add to Cart
Begin Checkout
Enter Payment Info
Complete Purchase

SaaS Onboarding Funnel:

Sign Up
Email Verified
Complete Profile
Invite Team Member
First Project Created

What are your funnel steps?"

For Event Data:

"I need data showing which users completed which steps. Provide:

Option 1 - Event Log:

user_id | event_name        | timestamp
123     | view_product      | 2024-12-15 10:00:00
123     | add_to_cart       | 2024-12-15 10:05:00
123     | begin_checkout    | 2024-12-15 10:10:00
456     | view_product      | 2024-12-15 11:00:00

Option 2 - Pre-aggregated:

user_id | reached_step_1 | reached_step_2 | reached_step_3 |...
123     | TRUE           | TRUE           | TRUE           |...
456     | TRUE           | FALSE          | FALSE          |...

Option 3 - Database Query: Share SQL to fetch relevant events

Which format works for you?"

For Time Window:

"How long do users have to complete the funnel?

Common Windows:

Session-based: Within single session (30 min)
Same-day: Within 24 hours
Multi-day: Within 7 days, 30 days
Unlimited: Any time eventually

What makes sense for your use case?"

For Success Criteria:

"For each step, what counts as completion?

Examples:

Step 1 (View Product): Page view event
Step 2 (Add to Cart): Click 'Add to Cart' button
Step 3 (Checkout): Land on checkout page
Step 4 (Payment): Submit payment form
Step 5 (Complete): Order confirmation

Any nuances? (e.g., 'view product for >10 seconds', 'add any item', etc.)"

For Segments:

"Want to compare funnel performance across groups?

Common Segments:

Acquisition channel (organic, paid, referral)
Device type (mobile, desktop, tablet)
User type (new, returning, power user)
Geographic region
Product/plan tier
Time period (weekday vs weekend)

Which segments are most important?"

Workflow

Step 1: Load and Validate Event Data

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta

# Load event data
events = pd.read_csv('user_events.csv')
events['timestamp'] = pd.to_datetime(events['timestamp'])

print(f"📊 Event Data Loaded:")
print(f"  Total Events: {len(events):,}")
print(f"  Unique Users: {events['user_id'].nunique():,}")
print(f"  Date Range: {events['timestamp'].min()} to {events['timestamp'].max()}")
print(f"  Event Types: {events['event_name'].unique()}")

Checkpoint: "Data loaded. Do the event names match your funnel steps?"

Step 2: Define Funnel Configuration

# Define funnel steps in order
funnel_steps = [
    {'step': 1, 'name': 'View Product', 'event': 'view_product'},
    {'step': 2, 'name': 'Add to Cart', 'event': 'add_to_cart'},
    {'step': 3, 'name': 'Begin Checkout', 'event': 'begin_checkout'},
    {'step': 4, 'name': 'Payment Info', 'event': 'enter_payment'},
    {'step': 5, 'name': 'Complete Purchase', 'event': 'purchase_complete'}
]

# Time window for funnel completion (in days)
TIME_WINDOW_DAYS = 7

print("🎯 Funnel Configuration:")
for step in funnel_steps:
    print(f"  Step {step['step']}: {step['name']} ({step['event']})")
print(f"\nTime Window: {TIME_WINDOW_DAYS} days")

Step 3: Build Funnel Data

def build_funnel_data(events, funnel_steps, time_window_days):
    """
    For each user, determine which funnel steps they reached
    """
    funnel_data = []
    
    # Get users who started the funnel (reached step 1)
    step1_event = funnel_steps[0]['event']
    users_started = events[events['event_name'] == step1_event]['user_id'].unique()
    
    print(f"Building funnel for {len(users_started):,} users...")
    
    for user_id in users_started:
        user_events = events[events['user_id'] == user_id].sort_values('timestamp')
        
        # Find first occurrence of step 1
        step1_events = user_events[user_events['event_name'] == step1_event]
        if len(step1_events) == 0:
            continue
            
        start_time = step1_events.iloc[0]['timestamp']
        end_time = start_time + timedelta(days=time_window_days)
        
        # Check each subsequent step
        user_funnel = {
            'user_id': user_id,
            'start_time': start_time,
            'step_1': True,
            'step_1_time': start_time
        }
        
        for i, step in enumerate(funnel_steps[1:], start=2):
            # Look for this step's event after previous step and within window
            step_events = user_events[
                (user_events['event_name'] == step['event']) &
                (user_events['timestamp'] >= start_time) &
                (user_events['timestamp'] <= end_time)
            ]
            
            if len(step_events) > 0:
                user_funnel[f'step_{i}'] = True
                user_funnel[f'step_{i}_time'] = step_events.iloc[0]['timestamp']
            else:
                user_funnel[f'step_{i}'] = False
                user_funnel[f'step_{i}_time'] = None
                # If they didn't reach this step, they didn't reach later steps
                for j in range(i+1, len(funnel_steps)+1):
                    user_funnel[f'step_{j}'] = False
                    user_funnel[f'step_{j}_time'] = None
                break
        
        funnel_data.append(user_funnel)
    
    return pd.DataFrame(funnel_data)

funnel_df = build_funnel_data(events, funnel_steps, TIME_WINDOW_DAYS)
print(f"✓ Funnel built for {len(funnel_df):,} users")

Step 4: Calculate Funnel Metrics

def calculate_funnel_metrics(funnel_df, funnel_steps):
    """Calculate conversion rates and drop-offs"""
    
    metrics = []
    total_users = len(funnel_df)
    
    for i, step in enumerate(funnel_steps, start=1):
        users_reached = funnel_df[f'step_{i}'].sum()
        conversion_from_top = (users_reached / total_users) * 100
        
        if i > 1:
            users_prev_step = funnel_df[f'step_{i-1}'].sum()
            conversion_from_prev = (users_reached / users_prev_step) * 100 if users_prev_step > 0 else 0
            drop_off = users_prev_step - users_reached
            drop_off_rate = ((users_prev_step - users_reached) / users_prev_step) * 100 if users_prev_step > 0 else 0
        else:
            conversion_from_prev = 100.0
            drop_off = 0
            drop_off_rate = 0
        
        metrics.append({
            'step': i,
            'step_name': step['name'],
            'users_reached': int(users_reached),
            'conversion_from_top': conversion_from_top,
            'conversion_from_prev': conversion_from_prev,
            'drop_off': int(drop_off),
            'drop_off_rate': drop_off_rate
        })
    
    return pd.DataFrame(metrics)

funnel_metrics = calculate_funnel_metrics(funnel_df, funnel_steps)

print("\n📊 Funnel Conversion Metrics:\n")
print(funnel_metrics.to_string(index=False))

Step 5: Visualize Funnel

def plot_funnel(metrics):
    """Create funnel visualization"""
    
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))
    
    # Funnel chart (absolute numbers)
    ax1.barh(metrics['step_name'], metrics['users_reached'], 
             color=plt.cm.Blues(np.linspace(0.4, 0.8, len(metrics))))
    
    # Add value labels
    for i, (name, users) in enumerate(zip(metrics['step_name'], metrics['users_reached'])):
        ax1.text(users, i, f' {int(users):,}', va='center')
    
    ax1.set_xlabel('Users')
    ax1.set_title('Funnel: Absolute Users per Step')
    ax1.invert_yaxis()
    
    # Conversion rate chart
    colors = ['green' if rate >= 80 else 'orange' if rate >= 60 else 'red' 
              for rate in metrics['conversion_from_prev']]
    
    ax2.barh(metrics['step_name'], metrics['conversion_from_prev'], color=colors)
    
    # Add percentage labels
    for i, (name, rate) in enumerate(zip(metrics['step_name'], metrics['conversion_from_prev'])):
        ax2.text(rate, i, f' {rate:.1f}%', va='center')
    
    ax2.set_xlabel('Conversion Rate (%)')
    ax2.set_title('Step-to-Step Conversion Rate')
    ax2.set_xlim(0, 105)
    ax2.invert_yaxis()
    
    plt.tight_layout()
    plt.savefig('funnel_analysis.png', dpi=300, bbox_inches='tight')
    plt.show()

plot_funnel(funnel_metrics)

Step 6: Analyze Drop-Off Points

def analyze_drop_offs(metrics):
    """Identify and prioritize drop-off points"""
    
    # Find biggest drop-off by absolute users
    biggest_drop = metrics.loc[metrics['drop_off'].idxmax()]
    
    # Find biggest drop-off by rate
    worst_conversion = metrics.loc[metrics['conversion_from_prev'].idxmin()]
    
    print("\n🔍 Drop-Off Analysis:")
    print(f"\n  Biggest Drop-Off (absolute):")
    print(f"    {biggest_drop['step_name']}")
    print(f"    Lost {biggest_drop['drop_off']:,} users ({biggest_drop['drop_off_rate']:.1f}%)")
    
    print(f"\n  Worst Conversion Rate:")
    print(f"    {worst_conversion['step_name']}")
    print(f"    Only {worst_conversion['conversion_from_prev']:.1f}% converted")
    
    # Categorize steps
    print(f"\n  Step Performance:")
    for _, row in metrics.iterrows():
        if row['step'] == 1:
            continue
        rate = row['conversion_from_prev']
        if rate >= 80:
            status = "✅ GOOD"
        elif rate >= 60:
            status = "⚠️  MODERATE"
        else:
            status = "🔴 POOR"
        print(f"    {status} {row['step_name']}: {rate:.1f}%")

analyze_drop_offs(funnel_metrics)

Step 7: Time-to-Convert Analysis

def analyze_time_to_convert(funnel_df, funnel_steps):
    """Analyze how long users take at each step"""
    
    print("\n⏱️  Time to Convert Analysis:")
    
    for i in range(2, len(funnel_steps) + 1):
        # Calculate time between steps
        time_col = f'step_{i}_time'
        prev_time_col = f'step_{i-1}_time'
        
        converted = funnel_df[funnel_df[f'step_{i}'] == True].copy()
        
        if len(converted) == 0:
            continue
        
        converted['time_diff'] = (converted[time_col] - converted[prev_time_col]).dt.total_seconds() / 60
        
        print(f"\n  {funnel_steps[i-2]['name']} → {funnel_steps[i-1]['name']}:")
        print(f"    Median: {converted['time_diff'].median():.1f} minutes")
        print(f"    P25: {converted['time_diff'].quantile(0.25):.1f} min")
        print(f"    P75: {converted['time_diff'].quantile(0.75):.1f} min")
        print(f"    P95: {converted['time_diff'].quantile(0.95):.1f} min")

analyze_time_to_convert(funnel_df, funnel_steps)

Step 8: Segment Comparison

def compare_segments(events, funnel_df, segment_col='channel'):
    """Compare funnel performance across segments"""
    
    # Add segment info to funnel data
    user_segments = events[['user_id', segment_col]].drop_duplicates('user_id')
    funnel_with_segment = funnel_df.merge(user_segments, on='user_id', how='left')
    
    print(f"\n📊 Funnel by {segment_col.title()}:")
    
    segment_metrics = []
    for segment in funnel_with_segment[segment_col].unique():
        segment_data = funnel_with_segment[funnel_with_segment[segment_col] == segment]
        segment_funnel = calculate_funnel_metrics(segment_data, funnel_steps)
        
        # Overall conversion rate (top to bottom)
        overall_conversion = segment_funnel.iloc[-1]['conversion_from_top']
        
        segment_metrics.append({
            'segment': segment,
            'users': len(segment_data),
            'overall_conversion': overall_conversion
        })
        
        print(f"\n  {segment}:")
        print(f"    Users: {len(segment_data):,}")
        print(f"    End-to-End Conversion: {overall_conversion:.1f}%")
        
        # Show biggest drop-off for this segment
        worst = segment_funnel.loc[segment_funnel['conversion_from_prev'].idxmin()]
        print(f"    Worst Step: {worst['step_name']} ({worst['conversion_from_prev']:.1f}%)")
    
    # Compare segments
    segment_comparison = pd.DataFrame(segment_metrics).sort_values('overall_conversion', ascending=False)
    
    print(f"\n  Segment Ranking:")
    for _, row in segment_comparison.iterrows():
        print(f"    {row['segment']}: {row['overall_conversion']:.1f}%")

# Example: Compare by channel
if 'channel' in events.columns:
    compare_segments(events, funnel_df, 'channel')

Context Validation

Before proceeding, verify:

Funnel steps are clearly defined and in correct order
Event data includes all necessary steps
Time window makes sense for the user journey
Success criteria for each step is unambiguous
Have user IDs to track individuals through funnel

Output Template

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
FUNNEL ANALYSIS REPORT
E-commerce Purchase Funnel
Period: Dec 1-31, 2024
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

📊 FUNNEL OVERVIEW
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Total Users Entered: 50,000
Overall Conversion: 12.5% (6,250 purchases)

Step                    Users      Conv%    Drop-Off
──────────────────────────────────────────────────
1. View Product        50,000     100.0%         -
2. Add to Cart         35,000      70.0%    30.0%
3. Begin Checkout      21,000      60.0%    40.0%
4. Payment Info        15,750      75.0%    25.0%
5. Complete Purchase    6,250      39.7%    60.3%

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔍 KEY FINDINGS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

🔴 CRITICAL DROP-OFF:
   Complete Purchase (Step 5)
   - Only 39.7% complete after entering payment
   - Losing 9,500 users at final step
   - Potential revenue impact: $285,000

⚠️  MODERATE DROP-OFF:
   Begin Checkout (Step 3)
   - 40% abandon cart before checkout
   - Losing 14,000 users

✅ GOOD PERFORMANCE:
   Add to Cart (Step 2): 70% conversion
   Payment Info (Step 4): 75% conversion

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⏱️  TIME TO CONVERT
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

View → Add to Cart:     Median 2.3 min
Add to Cart → Checkout: Median 8.5 min
Checkout → Payment:     Median 3.1 min
Payment → Complete:     Median 1.2 min

Total Journey: Median 15.1 minutes

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📱 SEGMENT COMPARISON
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

By Device:
  Desktop:  15.2% conversion (30,000 users)
  Mobile:   10.1% conversion (20,000 users)
  
  Gap: Mobile 33% lower conversion
  Worst Mobile Step: Complete Purchase (28% vs 45% desktop)

By Channel:
  Organic:  14.3% conversion
  Paid:     11.8% conversion  
  Email:    16.7% conversion

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
💡 RECOMMENDATIONS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

PRIORITY 1 (High Impact):
1. Investigate payment completion drop-off
   - Review error messages at payment step
   - Check mobile payment UX
   - Consider guest checkout option
   - Potential gain: +3,000 conversions/month

PRIORITY 2 (Medium Impact):
2. Reduce cart abandonment
   - Add save cart feature
   - Send abandonment emails
   - Show trust signals earlier
   - Potential gain: +2,000 conversions/month

PRIORITY 3 (Mobile Optimization):
3. Improve mobile experience
   - Simplify mobile checkout flow
   - Optimize for smaller screens
   - Potential gain: +1,000 conversions/month

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📁 FILES GENERATED
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

✓ funnel_analysis.png (visualization)
✓ funnel_metrics.csv (detailed metrics)
✓ user_journeys.csv (individual user paths)
✓ segment_comparison.csv (breakdown by segment)

Common Scenarios

Scenario 1: "Why is our signup funnel performing poorly?"

→ Build funnel from landing → signup → activation → Identify biggest drop-off step → Compare segments (source, device, etc.) → Analyze time-to-convert at each step → Provide specific recommendations

Scenario 2: "Mobile conversion is lower than desktop"

→ Run funnel analysis separately for each device → Identify which step(s) mobile underperforms → Compare time-to-convert (mobile users slower?) → Highlight specific mobile UX issues

Scenario 3: "Test if new checkout flow improved conversion"

→ Compare funnel before/after change → Calculate statistical significance of difference → Show which specific steps improved → Measure overall impact

Scenario 4: "Optimize onboarding for different user types"

→ Segment by user type (free, trial, paid) → Build separate funnels for each → Identify where each segment drops off → Create targeted interventions

Scenario 5: "Track funnel performance over time"

→ Calculate weekly/monthly funnel metrics → Show trend in conversion rates → Flag when performance degrades → Correlate with product changes

Handling Missing Context

User says "analyze our funnel" without defining steps: "I can help! First, what's the user journey you want to analyze? Example: Landing page → Signup → Onboarding → Activation. What are your steps?"

User doesn't know time window: "Let me analyze the data to see typical completion times, then we can decide on an appropriate window. Most users complete within X days."

Event data is messy: "I see multiple event names that might represent the same step. Let me map them:

'view_product', 'product_page' → Step 1?
'add_cart', 'added_to_cart' → Step 2? Does this look right?"

User wants to compare many segments: "I can analyze all segments, but let's prioritize. Which 2-3 segments matter most for decision-making?"

Advanced Options

After basic funnel analysis, offer:

Cohort-Based Funnels: "Want to see how funnel performance changes over time? I can show conversion rates by signup cohort."

Micro-Conversion Analysis: "I can break down each major step into micro-steps to find exactly where users hesitate."

Drop-Off Prediction: "Using behavior patterns, I can predict which users are likely to drop off and when."

Recovery Analysis: "I can identify users who dropped off but later returned to complete the funnel."

Funnel Optimization Calculator: "I can estimate revenue impact of improving conversion at each step by X%."

A/B Test Power Analysis: "Planning to test funnel changes? I can calculate required sample size for statistical significance."

Funnel Analysis

Skill Details

Repository Files

name: funnel-analysis description: Conversion funnel analysis with drop-off investigation. Use when analyzing multi-step processes, identifying conversion bottlenecks, A/B testing funnel performance, or optimizing user journeys.

Funnel Analysis

Quick Start

Context Requirements

Context Gathering

For Funnel Steps:

For Event Data:

For Time Window:

For Success Criteria:

For Segments:

Workflow

Step 1: Load and Validate Event Data

Step 2: Define Funnel Configuration

Step 3: Build Funnel Data

Step 4: Calculate Funnel Metrics

Step 5: Visualize Funnel

Step 6: Analyze Drop-Off Points

Step 7: Time-to-Convert Analysis

Step 8: Segment Comparison

Context Validation

Output Template

Common Scenarios

Scenario 1: "Why is our signup funnel performing poorly?"

Scenario 2: "Mobile conversion is lower than desktop"

Scenario 3: "Test if new checkout flow improved conversion"

Scenario 4: "Optimize onboarding for different user types"

Scenario 5: "Track funnel performance over time"

Handling Missing Context

Advanced Options

Related Skills

Dbt Transformation Patterns

Senior Data Scientist

Hypogenic

Ux Researcher Designer

Hypogenic

Data Engineering Data Driven Feature

Dbt Transformation Patterns

Dashboard Design

Ux Researcher Designer

Performance Testing

Skill Information