Funnel Analysis

by nimrodfisher

testing

Conversion funnel analysis with drop-off investigation. Use when analyzing multi-step processes, identifying conversion bottlenecks, A/B testing funnel performance, or optimizing user journeys.

Skill Details

Repository Files

1 file in this skill directory


name: funnel-analysis description: Conversion funnel analysis with drop-off investigation. Use when analyzing multi-step processes, identifying conversion bottlenecks, A/B testing funnel performance, or optimizing user journeys.

Funnel Analysis

Quick Start

Analyze multi-step user journeys to measure conversion rates, identify drop-off points, compare segments, and optimize funnel performance with actionable insights.

Context Requirements

Before analyzing the funnel, I need:

  1. Funnel Steps: The sequence of actions users take
  2. Event Data: User activity showing who completed each step
  3. Time Window: How long users have to complete the funnel
  4. Success Criteria: What counts as completion at each step
  5. Segments (optional): Groups to compare (e.g., by channel, device, cohort)

Context Gathering

For Funnel Steps:

"Please define the funnel steps in order. For example:

E-commerce Purchase Funnel:

  1. View Product Page
  2. Add to Cart
  3. Begin Checkout
  4. Enter Payment Info
  5. Complete Purchase

SaaS Onboarding Funnel:

  1. Sign Up
  2. Email Verified
  3. Complete Profile
  4. Invite Team Member
  5. First Project Created

What are your funnel steps?"

For Event Data:

"I need data showing which users completed which steps. Provide:

Option 1 - Event Log:

user_id | event_name        | timestamp
123     | view_product      | 2024-12-15 10:00:00
123     | add_to_cart       | 2024-12-15 10:05:00
123     | begin_checkout    | 2024-12-15 10:10:00
456     | view_product      | 2024-12-15 11:00:00

Option 2 - Pre-aggregated:

user_id | reached_step_1 | reached_step_2 | reached_step_3 |...
123     | TRUE           | TRUE           | TRUE           |...
456     | TRUE           | FALSE          | FALSE          |...

Option 3 - Database Query: Share SQL to fetch relevant events

Which format works for you?"

For Time Window:

"How long do users have to complete the funnel?

Common Windows:

  • Session-based: Within single session (30 min)
  • Same-day: Within 24 hours
  • Multi-day: Within 7 days, 30 days
  • Unlimited: Any time eventually

What makes sense for your use case?"

For Success Criteria:

"For each step, what counts as completion?

Examples:

  • Step 1 (View Product): Page view event
  • Step 2 (Add to Cart): Click 'Add to Cart' button
  • Step 3 (Checkout): Land on checkout page
  • Step 4 (Payment): Submit payment form
  • Step 5 (Complete): Order confirmation

Any nuances? (e.g., 'view product for >10 seconds', 'add any item', etc.)"

For Segments:

"Want to compare funnel performance across groups?

Common Segments:

  • Acquisition channel (organic, paid, referral)
  • Device type (mobile, desktop, tablet)
  • User type (new, returning, power user)
  • Geographic region
  • Product/plan tier
  • Time period (weekday vs weekend)

Which segments are most important?"

Workflow

Step 1: Load and Validate Event Data

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta

# Load event data
events = pd.read_csv('user_events.csv')
events['timestamp'] = pd.to_datetime(events['timestamp'])

print(f"šŸ“Š Event Data Loaded:")
print(f"  Total Events: {len(events):,}")
print(f"  Unique Users: {events['user_id'].nunique():,}")
print(f"  Date Range: {events['timestamp'].min()} to {events['timestamp'].max()}")
print(f"  Event Types: {events['event_name'].unique()}")

Checkpoint: "Data loaded. Do the event names match your funnel steps?"

Step 2: Define Funnel Configuration

# Define funnel steps in order
funnel_steps = [
    {'step': 1, 'name': 'View Product', 'event': 'view_product'},
    {'step': 2, 'name': 'Add to Cart', 'event': 'add_to_cart'},
    {'step': 3, 'name': 'Begin Checkout', 'event': 'begin_checkout'},
    {'step': 4, 'name': 'Payment Info', 'event': 'enter_payment'},
    {'step': 5, 'name': 'Complete Purchase', 'event': 'purchase_complete'}
]

# Time window for funnel completion (in days)
TIME_WINDOW_DAYS = 7

print("šŸŽÆ Funnel Configuration:")
for step in funnel_steps:
    print(f"  Step {step['step']}: {step['name']} ({step['event']})")
print(f"\nTime Window: {TIME_WINDOW_DAYS} days")

Step 3: Build Funnel Data

def build_funnel_data(events, funnel_steps, time_window_days):
    """
    For each user, determine which funnel steps they reached
    """
    funnel_data = []
    
    # Get users who started the funnel (reached step 1)
    step1_event = funnel_steps[0]['event']
    users_started = events[events['event_name'] == step1_event]['user_id'].unique()
    
    print(f"Building funnel for {len(users_started):,} users...")
    
    for user_id in users_started:
        user_events = events[events['user_id'] == user_id].sort_values('timestamp')
        
        # Find first occurrence of step 1
        step1_events = user_events[user_events['event_name'] == step1_event]
        if len(step1_events) == 0:
            continue
            
        start_time = step1_events.iloc[0]['timestamp']
        end_time = start_time + timedelta(days=time_window_days)
        
        # Check each subsequent step
        user_funnel = {
            'user_id': user_id,
            'start_time': start_time,
            'step_1': True,
            'step_1_time': start_time
        }
        
        for i, step in enumerate(funnel_steps[1:], start=2):
            # Look for this step's event after previous step and within window
            step_events = user_events[
                (user_events['event_name'] == step['event']) &
                (user_events['timestamp'] >= start_time) &
                (user_events['timestamp'] <= end_time)
            ]
            
            if len(step_events) > 0:
                user_funnel[f'step_{i}'] = True
                user_funnel[f'step_{i}_time'] = step_events.iloc[0]['timestamp']
            else:
                user_funnel[f'step_{i}'] = False
                user_funnel[f'step_{i}_time'] = None
                # If they didn't reach this step, they didn't reach later steps
                for j in range(i+1, len(funnel_steps)+1):
                    user_funnel[f'step_{j}'] = False
                    user_funnel[f'step_{j}_time'] = None
                break
        
        funnel_data.append(user_funnel)
    
    return pd.DataFrame(funnel_data)

funnel_df = build_funnel_data(events, funnel_steps, TIME_WINDOW_DAYS)
print(f"āœ“ Funnel built for {len(funnel_df):,} users")

Step 4: Calculate Funnel Metrics

def calculate_funnel_metrics(funnel_df, funnel_steps):
    """Calculate conversion rates and drop-offs"""
    
    metrics = []
    total_users = len(funnel_df)
    
    for i, step in enumerate(funnel_steps, start=1):
        users_reached = funnel_df[f'step_{i}'].sum()
        conversion_from_top = (users_reached / total_users) * 100
        
        if i > 1:
            users_prev_step = funnel_df[f'step_{i-1}'].sum()
            conversion_from_prev = (users_reached / users_prev_step) * 100 if users_prev_step > 0 else 0
            drop_off = users_prev_step - users_reached
            drop_off_rate = ((users_prev_step - users_reached) / users_prev_step) * 100 if users_prev_step > 0 else 0
        else:
            conversion_from_prev = 100.0
            drop_off = 0
            drop_off_rate = 0
        
        metrics.append({
            'step': i,
            'step_name': step['name'],
            'users_reached': int(users_reached),
            'conversion_from_top': conversion_from_top,
            'conversion_from_prev': conversion_from_prev,
            'drop_off': int(drop_off),
            'drop_off_rate': drop_off_rate
        })
    
    return pd.DataFrame(metrics)

funnel_metrics = calculate_funnel_metrics(funnel_df, funnel_steps)

print("\nšŸ“Š Funnel Conversion Metrics:\n")
print(funnel_metrics.to_string(index=False))

Step 5: Visualize Funnel

def plot_funnel(metrics):
    """Create funnel visualization"""
    
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))
    
    # Funnel chart (absolute numbers)
    ax1.barh(metrics['step_name'], metrics['users_reached'], 
             color=plt.cm.Blues(np.linspace(0.4, 0.8, len(metrics))))
    
    # Add value labels
    for i, (name, users) in enumerate(zip(metrics['step_name'], metrics['users_reached'])):
        ax1.text(users, i, f' {int(users):,}', va='center')
    
    ax1.set_xlabel('Users')
    ax1.set_title('Funnel: Absolute Users per Step')
    ax1.invert_yaxis()
    
    # Conversion rate chart
    colors = ['green' if rate >= 80 else 'orange' if rate >= 60 else 'red' 
              for rate in metrics['conversion_from_prev']]
    
    ax2.barh(metrics['step_name'], metrics['conversion_from_prev'], color=colors)
    
    # Add percentage labels
    for i, (name, rate) in enumerate(zip(metrics['step_name'], metrics['conversion_from_prev'])):
        ax2.text(rate, i, f' {rate:.1f}%', va='center')
    
    ax2.set_xlabel('Conversion Rate (%)')
    ax2.set_title('Step-to-Step Conversion Rate')
    ax2.set_xlim(0, 105)
    ax2.invert_yaxis()
    
    plt.tight_layout()
    plt.savefig('funnel_analysis.png', dpi=300, bbox_inches='tight')
    plt.show()

plot_funnel(funnel_metrics)

Step 6: Analyze Drop-Off Points

def analyze_drop_offs(metrics):
    """Identify and prioritize drop-off points"""
    
    # Find biggest drop-off by absolute users
    biggest_drop = metrics.loc[metrics['drop_off'].idxmax()]
    
    # Find biggest drop-off by rate
    worst_conversion = metrics.loc[metrics['conversion_from_prev'].idxmin()]
    
    print("\nšŸ” Drop-Off Analysis:")
    print(f"\n  Biggest Drop-Off (absolute):")
    print(f"    {biggest_drop['step_name']}")
    print(f"    Lost {biggest_drop['drop_off']:,} users ({biggest_drop['drop_off_rate']:.1f}%)")
    
    print(f"\n  Worst Conversion Rate:")
    print(f"    {worst_conversion['step_name']}")
    print(f"    Only {worst_conversion['conversion_from_prev']:.1f}% converted")
    
    # Categorize steps
    print(f"\n  Step Performance:")
    for _, row in metrics.iterrows():
        if row['step'] == 1:
            continue
        rate = row['conversion_from_prev']
        if rate >= 80:
            status = "āœ… GOOD"
        elif rate >= 60:
            status = "āš ļø  MODERATE"
        else:
            status = "šŸ”“ POOR"
        print(f"    {status} {row['step_name']}: {rate:.1f}%")

analyze_drop_offs(funnel_metrics)

Step 7: Time-to-Convert Analysis

def analyze_time_to_convert(funnel_df, funnel_steps):
    """Analyze how long users take at each step"""
    
    print("\nā±ļø  Time to Convert Analysis:")
    
    for i in range(2, len(funnel_steps) + 1):
        # Calculate time between steps
        time_col = f'step_{i}_time'
        prev_time_col = f'step_{i-1}_time'
        
        converted = funnel_df[funnel_df[f'step_{i}'] == True].copy()
        
        if len(converted) == 0:
            continue
        
        converted['time_diff'] = (converted[time_col] - converted[prev_time_col]).dt.total_seconds() / 60
        
        print(f"\n  {funnel_steps[i-2]['name']} → {funnel_steps[i-1]['name']}:")
        print(f"    Median: {converted['time_diff'].median():.1f} minutes")
        print(f"    P25: {converted['time_diff'].quantile(0.25):.1f} min")
        print(f"    P75: {converted['time_diff'].quantile(0.75):.1f} min")
        print(f"    P95: {converted['time_diff'].quantile(0.95):.1f} min")

analyze_time_to_convert(funnel_df, funnel_steps)

Step 8: Segment Comparison

def compare_segments(events, funnel_df, segment_col='channel'):
    """Compare funnel performance across segments"""
    
    # Add segment info to funnel data
    user_segments = events[['user_id', segment_col]].drop_duplicates('user_id')
    funnel_with_segment = funnel_df.merge(user_segments, on='user_id', how='left')
    
    print(f"\nšŸ“Š Funnel by {segment_col.title()}:")
    
    segment_metrics = []
    for segment in funnel_with_segment[segment_col].unique():
        segment_data = funnel_with_segment[funnel_with_segment[segment_col] == segment]
        segment_funnel = calculate_funnel_metrics(segment_data, funnel_steps)
        
        # Overall conversion rate (top to bottom)
        overall_conversion = segment_funnel.iloc[-1]['conversion_from_top']
        
        segment_metrics.append({
            'segment': segment,
            'users': len(segment_data),
            'overall_conversion': overall_conversion
        })
        
        print(f"\n  {segment}:")
        print(f"    Users: {len(segment_data):,}")
        print(f"    End-to-End Conversion: {overall_conversion:.1f}%")
        
        # Show biggest drop-off for this segment
        worst = segment_funnel.loc[segment_funnel['conversion_from_prev'].idxmin()]
        print(f"    Worst Step: {worst['step_name']} ({worst['conversion_from_prev']:.1f}%)")
    
    # Compare segments
    segment_comparison = pd.DataFrame(segment_metrics).sort_values('overall_conversion', ascending=False)
    
    print(f"\n  Segment Ranking:")
    for _, row in segment_comparison.iterrows():
        print(f"    {row['segment']}: {row['overall_conversion']:.1f}%")

# Example: Compare by channel
if 'channel' in events.columns:
    compare_segments(events, funnel_df, 'channel')

Context Validation

Before proceeding, verify:

  • Funnel steps are clearly defined and in correct order
  • Event data includes all necessary steps
  • Time window makes sense for the user journey
  • Success criteria for each step is unambiguous
  • Have user IDs to track individuals through funnel

Output Template

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
FUNNEL ANALYSIS REPORT
E-commerce Purchase Funnel
Period: Dec 1-31, 2024
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

šŸ“Š FUNNEL OVERVIEW
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Total Users Entered: 50,000
Overall Conversion: 12.5% (6,250 purchases)

Step                    Users      Conv%    Drop-Off
──────────────────────────────────────────────────
1. View Product        50,000     100.0%         -
2. Add to Cart         35,000      70.0%    30.0%
3. Begin Checkout      21,000      60.0%    40.0%
4. Payment Info        15,750      75.0%    25.0%
5. Complete Purchase    6,250      39.7%    60.3%

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
šŸ” KEY FINDINGS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

šŸ”“ CRITICAL DROP-OFF:
   Complete Purchase (Step 5)
   - Only 39.7% complete after entering payment
   - Losing 9,500 users at final step
   - Potential revenue impact: $285,000

āš ļø  MODERATE DROP-OFF:
   Begin Checkout (Step 3)
   - 40% abandon cart before checkout
   - Losing 14,000 users

āœ… GOOD PERFORMANCE:
   Add to Cart (Step 2): 70% conversion
   Payment Info (Step 4): 75% conversion

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
ā±ļø  TIME TO CONVERT
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

View → Add to Cart:     Median 2.3 min
Add to Cart → Checkout: Median 8.5 min
Checkout → Payment:     Median 3.1 min
Payment → Complete:     Median 1.2 min

Total Journey: Median 15.1 minutes

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
šŸ“± SEGMENT COMPARISON
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

By Device:
  Desktop:  15.2% conversion (30,000 users)
  Mobile:   10.1% conversion (20,000 users)
  
  Gap: Mobile 33% lower conversion
  Worst Mobile Step: Complete Purchase (28% vs 45% desktop)

By Channel:
  Organic:  14.3% conversion
  Paid:     11.8% conversion  
  Email:    16.7% conversion

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
šŸ’” RECOMMENDATIONS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

PRIORITY 1 (High Impact):
1. Investigate payment completion drop-off
   - Review error messages at payment step
   - Check mobile payment UX
   - Consider guest checkout option
   - Potential gain: +3,000 conversions/month

PRIORITY 2 (Medium Impact):
2. Reduce cart abandonment
   - Add save cart feature
   - Send abandonment emails
   - Show trust signals earlier
   - Potential gain: +2,000 conversions/month

PRIORITY 3 (Mobile Optimization):
3. Improve mobile experience
   - Simplify mobile checkout flow
   - Optimize for smaller screens
   - Potential gain: +1,000 conversions/month

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
šŸ“ FILES GENERATED
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

āœ“ funnel_analysis.png (visualization)
āœ“ funnel_metrics.csv (detailed metrics)
āœ“ user_journeys.csv (individual user paths)
āœ“ segment_comparison.csv (breakdown by segment)

Common Scenarios

Scenario 1: "Why is our signup funnel performing poorly?"

→ Build funnel from landing → signup → activation → Identify biggest drop-off step → Compare segments (source, device, etc.) → Analyze time-to-convert at each step → Provide specific recommendations

Scenario 2: "Mobile conversion is lower than desktop"

→ Run funnel analysis separately for each device → Identify which step(s) mobile underperforms → Compare time-to-convert (mobile users slower?) → Highlight specific mobile UX issues

Scenario 3: "Test if new checkout flow improved conversion"

→ Compare funnel before/after change → Calculate statistical significance of difference → Show which specific steps improved → Measure overall impact

Scenario 4: "Optimize onboarding for different user types"

→ Segment by user type (free, trial, paid) → Build separate funnels for each → Identify where each segment drops off → Create targeted interventions

Scenario 5: "Track funnel performance over time"

→ Calculate weekly/monthly funnel metrics → Show trend in conversion rates → Flag when performance degrades → Correlate with product changes

Handling Missing Context

User says "analyze our funnel" without defining steps: "I can help! First, what's the user journey you want to analyze? Example: Landing page → Signup → Onboarding → Activation. What are your steps?"

User doesn't know time window: "Let me analyze the data to see typical completion times, then we can decide on an appropriate window. Most users complete within X days."

Event data is messy: "I see multiple event names that might represent the same step. Let me map them:

  • 'view_product', 'product_page' → Step 1?
  • 'add_cart', 'added_to_cart' → Step 2? Does this look right?"

User wants to compare many segments: "I can analyze all segments, but let's prioritize. Which 2-3 segments matter most for decision-making?"

Advanced Options

After basic funnel analysis, offer:

Cohort-Based Funnels: "Want to see how funnel performance changes over time? I can show conversion rates by signup cohort."

Micro-Conversion Analysis: "I can break down each major step into micro-steps to find exactly where users hesitate."

Drop-Off Prediction: "Using behavior patterns, I can predict which users are likely to drop off and when."

Recovery Analysis: "I can identify users who dropped off but later returned to complete the funnel."

Funnel Optimization Calculator: "I can estimate revenue impact of improving conversion at each step by X%."

A/B Test Power Analysis: "Planning to test funnel changes? I can calculate required sample size for statistical significance."

Related Skills

Dbt Transformation Patterns

Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.

testingdocumenttool

Senior Data Scientist

World-class data science skill for statistical modeling, experimentation, causal inference, and advanced analytics. Expertise in Python (NumPy, Pandas, Scikit-learn), R, SQL, statistical methods, A/B testing, time series, and business intelligence. Includes experiment design, feature engineering, model evaluation, and stakeholder communication. Use when designing experiments, building predictive models, performing causal analysis, or driving data-driven decisions.

designtestingdata

Hypogenic

Automated hypothesis generation and testing using large language models. Use this skill when generating scientific hypotheses from datasets, combining literature insights with empirical data, testing hypotheses against observational data, or conducting systematic hypothesis exploration for research discovery in domains like deception detection, AI content detection, mental health analysis, or other empirical research tasks.

testingdata

Ux Researcher Designer

UX research and design toolkit for Senior UX Designer/Researcher including data-driven persona generation, journey mapping, usability testing frameworks, and research synthesis. Use for user research, persona creation, journey mapping, and design validation.

designtestingtool

Hypogenic

Automated LLM-driven hypothesis generation and testing on tabular datasets. Use when you want to systematically explore hypotheses about patterns in empirical data (e.g., deception detection, content analysis). Combines literature insights with data-driven hypothesis testing. For manual hypothesis formulation use hypothesis-generation; for creative ideation use scientific-brainstorming.

testingdata

Data Engineering Data Driven Feature

Build features guided by data insights, A/B testing, and continuous measurement using specialized agents for analysis, implementation, and experimentation.

testingdata

Dbt Transformation Patterns

Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.

testingdocumenttool

Dashboard Design

USE THIS SKILL FIRST when user wants to create and design a dashboard, ESPECIALLY Vizro dashboards. This skill enforces a 3-step workflow (requirements, layout, visualization) that must be followed before implementation. For implementation and testing, use the dashboard-build skill after completing Steps 1-3.

designtestingworkflow

Ux Researcher Designer

UX research and design toolkit for Senior UX Designer/Researcher including data-driven persona generation, journey mapping, usability testing frameworks, and research synthesis. Use for user research, persona creation, journey mapping, and design validation.

designtestingtool

Performance Testing

Benchmark indicator performance with BenchmarkDotNet. Use for Series/Buffer/Stream benchmarks, regression detection, and optimization patterns. Target 1.5x Series for StreamHub, 1.2x for BufferList.

testing

Skill Information

Category:Technical
Last Updated:1/14/2026