Funnel Analysis
by nimrodfisher
Conversion funnel analysis with drop-off investigation. Use when analyzing multi-step processes, identifying conversion bottlenecks, A/B testing funnel performance, or optimizing user journeys.
Skill Details
Repository Files
1 file in this skill directory
name: funnel-analysis description: Conversion funnel analysis with drop-off investigation. Use when analyzing multi-step processes, identifying conversion bottlenecks, A/B testing funnel performance, or optimizing user journeys.
Funnel Analysis
Quick Start
Analyze multi-step user journeys to measure conversion rates, identify drop-off points, compare segments, and optimize funnel performance with actionable insights.
Context Requirements
Before analyzing the funnel, I need:
- Funnel Steps: The sequence of actions users take
- Event Data: User activity showing who completed each step
- Time Window: How long users have to complete the funnel
- Success Criteria: What counts as completion at each step
- Segments (optional): Groups to compare (e.g., by channel, device, cohort)
Context Gathering
For Funnel Steps:
"Please define the funnel steps in order. For example:
E-commerce Purchase Funnel:
- View Product Page
- Add to Cart
- Begin Checkout
- Enter Payment Info
- Complete Purchase
SaaS Onboarding Funnel:
- Sign Up
- Email Verified
- Complete Profile
- Invite Team Member
- First Project Created
What are your funnel steps?"
For Event Data:
"I need data showing which users completed which steps. Provide:
Option 1 - Event Log:
user_id | event_name | timestamp
123 | view_product | 2024-12-15 10:00:00
123 | add_to_cart | 2024-12-15 10:05:00
123 | begin_checkout | 2024-12-15 10:10:00
456 | view_product | 2024-12-15 11:00:00
Option 2 - Pre-aggregated:
user_id | reached_step_1 | reached_step_2 | reached_step_3 |...
123 | TRUE | TRUE | TRUE |...
456 | TRUE | FALSE | FALSE |...
Option 3 - Database Query: Share SQL to fetch relevant events
Which format works for you?"
For Time Window:
"How long do users have to complete the funnel?
Common Windows:
- Session-based: Within single session (30 min)
- Same-day: Within 24 hours
- Multi-day: Within 7 days, 30 days
- Unlimited: Any time eventually
What makes sense for your use case?"
For Success Criteria:
"For each step, what counts as completion?
Examples:
- Step 1 (View Product): Page view event
- Step 2 (Add to Cart): Click 'Add to Cart' button
- Step 3 (Checkout): Land on checkout page
- Step 4 (Payment): Submit payment form
- Step 5 (Complete): Order confirmation
Any nuances? (e.g., 'view product for >10 seconds', 'add any item', etc.)"
For Segments:
"Want to compare funnel performance across groups?
Common Segments:
- Acquisition channel (organic, paid, referral)
- Device type (mobile, desktop, tablet)
- User type (new, returning, power user)
- Geographic region
- Product/plan tier
- Time period (weekday vs weekend)
Which segments are most important?"
Workflow
Step 1: Load and Validate Event Data
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
# Load event data
events = pd.read_csv('user_events.csv')
events['timestamp'] = pd.to_datetime(events['timestamp'])
print(f"š Event Data Loaded:")
print(f" Total Events: {len(events):,}")
print(f" Unique Users: {events['user_id'].nunique():,}")
print(f" Date Range: {events['timestamp'].min()} to {events['timestamp'].max()}")
print(f" Event Types: {events['event_name'].unique()}")
Checkpoint: "Data loaded. Do the event names match your funnel steps?"
Step 2: Define Funnel Configuration
# Define funnel steps in order
funnel_steps = [
{'step': 1, 'name': 'View Product', 'event': 'view_product'},
{'step': 2, 'name': 'Add to Cart', 'event': 'add_to_cart'},
{'step': 3, 'name': 'Begin Checkout', 'event': 'begin_checkout'},
{'step': 4, 'name': 'Payment Info', 'event': 'enter_payment'},
{'step': 5, 'name': 'Complete Purchase', 'event': 'purchase_complete'}
]
# Time window for funnel completion (in days)
TIME_WINDOW_DAYS = 7
print("šÆ Funnel Configuration:")
for step in funnel_steps:
print(f" Step {step['step']}: {step['name']} ({step['event']})")
print(f"\nTime Window: {TIME_WINDOW_DAYS} days")
Step 3: Build Funnel Data
def build_funnel_data(events, funnel_steps, time_window_days):
"""
For each user, determine which funnel steps they reached
"""
funnel_data = []
# Get users who started the funnel (reached step 1)
step1_event = funnel_steps[0]['event']
users_started = events[events['event_name'] == step1_event]['user_id'].unique()
print(f"Building funnel for {len(users_started):,} users...")
for user_id in users_started:
user_events = events[events['user_id'] == user_id].sort_values('timestamp')
# Find first occurrence of step 1
step1_events = user_events[user_events['event_name'] == step1_event]
if len(step1_events) == 0:
continue
start_time = step1_events.iloc[0]['timestamp']
end_time = start_time + timedelta(days=time_window_days)
# Check each subsequent step
user_funnel = {
'user_id': user_id,
'start_time': start_time,
'step_1': True,
'step_1_time': start_time
}
for i, step in enumerate(funnel_steps[1:], start=2):
# Look for this step's event after previous step and within window
step_events = user_events[
(user_events['event_name'] == step['event']) &
(user_events['timestamp'] >= start_time) &
(user_events['timestamp'] <= end_time)
]
if len(step_events) > 0:
user_funnel[f'step_{i}'] = True
user_funnel[f'step_{i}_time'] = step_events.iloc[0]['timestamp']
else:
user_funnel[f'step_{i}'] = False
user_funnel[f'step_{i}_time'] = None
# If they didn't reach this step, they didn't reach later steps
for j in range(i+1, len(funnel_steps)+1):
user_funnel[f'step_{j}'] = False
user_funnel[f'step_{j}_time'] = None
break
funnel_data.append(user_funnel)
return pd.DataFrame(funnel_data)
funnel_df = build_funnel_data(events, funnel_steps, TIME_WINDOW_DAYS)
print(f"ā Funnel built for {len(funnel_df):,} users")
Step 4: Calculate Funnel Metrics
def calculate_funnel_metrics(funnel_df, funnel_steps):
"""Calculate conversion rates and drop-offs"""
metrics = []
total_users = len(funnel_df)
for i, step in enumerate(funnel_steps, start=1):
users_reached = funnel_df[f'step_{i}'].sum()
conversion_from_top = (users_reached / total_users) * 100
if i > 1:
users_prev_step = funnel_df[f'step_{i-1}'].sum()
conversion_from_prev = (users_reached / users_prev_step) * 100 if users_prev_step > 0 else 0
drop_off = users_prev_step - users_reached
drop_off_rate = ((users_prev_step - users_reached) / users_prev_step) * 100 if users_prev_step > 0 else 0
else:
conversion_from_prev = 100.0
drop_off = 0
drop_off_rate = 0
metrics.append({
'step': i,
'step_name': step['name'],
'users_reached': int(users_reached),
'conversion_from_top': conversion_from_top,
'conversion_from_prev': conversion_from_prev,
'drop_off': int(drop_off),
'drop_off_rate': drop_off_rate
})
return pd.DataFrame(metrics)
funnel_metrics = calculate_funnel_metrics(funnel_df, funnel_steps)
print("\nš Funnel Conversion Metrics:\n")
print(funnel_metrics.to_string(index=False))
Step 5: Visualize Funnel
def plot_funnel(metrics):
"""Create funnel visualization"""
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))
# Funnel chart (absolute numbers)
ax1.barh(metrics['step_name'], metrics['users_reached'],
color=plt.cm.Blues(np.linspace(0.4, 0.8, len(metrics))))
# Add value labels
for i, (name, users) in enumerate(zip(metrics['step_name'], metrics['users_reached'])):
ax1.text(users, i, f' {int(users):,}', va='center')
ax1.set_xlabel('Users')
ax1.set_title('Funnel: Absolute Users per Step')
ax1.invert_yaxis()
# Conversion rate chart
colors = ['green' if rate >= 80 else 'orange' if rate >= 60 else 'red'
for rate in metrics['conversion_from_prev']]
ax2.barh(metrics['step_name'], metrics['conversion_from_prev'], color=colors)
# Add percentage labels
for i, (name, rate) in enumerate(zip(metrics['step_name'], metrics['conversion_from_prev'])):
ax2.text(rate, i, f' {rate:.1f}%', va='center')
ax2.set_xlabel('Conversion Rate (%)')
ax2.set_title('Step-to-Step Conversion Rate')
ax2.set_xlim(0, 105)
ax2.invert_yaxis()
plt.tight_layout()
plt.savefig('funnel_analysis.png', dpi=300, bbox_inches='tight')
plt.show()
plot_funnel(funnel_metrics)
Step 6: Analyze Drop-Off Points
def analyze_drop_offs(metrics):
"""Identify and prioritize drop-off points"""
# Find biggest drop-off by absolute users
biggest_drop = metrics.loc[metrics['drop_off'].idxmax()]
# Find biggest drop-off by rate
worst_conversion = metrics.loc[metrics['conversion_from_prev'].idxmin()]
print("\nš Drop-Off Analysis:")
print(f"\n Biggest Drop-Off (absolute):")
print(f" {biggest_drop['step_name']}")
print(f" Lost {biggest_drop['drop_off']:,} users ({biggest_drop['drop_off_rate']:.1f}%)")
print(f"\n Worst Conversion Rate:")
print(f" {worst_conversion['step_name']}")
print(f" Only {worst_conversion['conversion_from_prev']:.1f}% converted")
# Categorize steps
print(f"\n Step Performance:")
for _, row in metrics.iterrows():
if row['step'] == 1:
continue
rate = row['conversion_from_prev']
if rate >= 80:
status = "ā
GOOD"
elif rate >= 60:
status = "ā ļø MODERATE"
else:
status = "š“ POOR"
print(f" {status} {row['step_name']}: {rate:.1f}%")
analyze_drop_offs(funnel_metrics)
Step 7: Time-to-Convert Analysis
def analyze_time_to_convert(funnel_df, funnel_steps):
"""Analyze how long users take at each step"""
print("\nā±ļø Time to Convert Analysis:")
for i in range(2, len(funnel_steps) + 1):
# Calculate time between steps
time_col = f'step_{i}_time'
prev_time_col = f'step_{i-1}_time'
converted = funnel_df[funnel_df[f'step_{i}'] == True].copy()
if len(converted) == 0:
continue
converted['time_diff'] = (converted[time_col] - converted[prev_time_col]).dt.total_seconds() / 60
print(f"\n {funnel_steps[i-2]['name']} ā {funnel_steps[i-1]['name']}:")
print(f" Median: {converted['time_diff'].median():.1f} minutes")
print(f" P25: {converted['time_diff'].quantile(0.25):.1f} min")
print(f" P75: {converted['time_diff'].quantile(0.75):.1f} min")
print(f" P95: {converted['time_diff'].quantile(0.95):.1f} min")
analyze_time_to_convert(funnel_df, funnel_steps)
Step 8: Segment Comparison
def compare_segments(events, funnel_df, segment_col='channel'):
"""Compare funnel performance across segments"""
# Add segment info to funnel data
user_segments = events[['user_id', segment_col]].drop_duplicates('user_id')
funnel_with_segment = funnel_df.merge(user_segments, on='user_id', how='left')
print(f"\nš Funnel by {segment_col.title()}:")
segment_metrics = []
for segment in funnel_with_segment[segment_col].unique():
segment_data = funnel_with_segment[funnel_with_segment[segment_col] == segment]
segment_funnel = calculate_funnel_metrics(segment_data, funnel_steps)
# Overall conversion rate (top to bottom)
overall_conversion = segment_funnel.iloc[-1]['conversion_from_top']
segment_metrics.append({
'segment': segment,
'users': len(segment_data),
'overall_conversion': overall_conversion
})
print(f"\n {segment}:")
print(f" Users: {len(segment_data):,}")
print(f" End-to-End Conversion: {overall_conversion:.1f}%")
# Show biggest drop-off for this segment
worst = segment_funnel.loc[segment_funnel['conversion_from_prev'].idxmin()]
print(f" Worst Step: {worst['step_name']} ({worst['conversion_from_prev']:.1f}%)")
# Compare segments
segment_comparison = pd.DataFrame(segment_metrics).sort_values('overall_conversion', ascending=False)
print(f"\n Segment Ranking:")
for _, row in segment_comparison.iterrows():
print(f" {row['segment']}: {row['overall_conversion']:.1f}%")
# Example: Compare by channel
if 'channel' in events.columns:
compare_segments(events, funnel_df, 'channel')
Context Validation
Before proceeding, verify:
- Funnel steps are clearly defined and in correct order
- Event data includes all necessary steps
- Time window makes sense for the user journey
- Success criteria for each step is unambiguous
- Have user IDs to track individuals through funnel
Output Template
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
FUNNEL ANALYSIS REPORT
E-commerce Purchase Funnel
Period: Dec 1-31, 2024
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
š FUNNEL OVERVIEW
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
Total Users Entered: 50,000
Overall Conversion: 12.5% (6,250 purchases)
Step Users Conv% Drop-Off
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
1. View Product 50,000 100.0% -
2. Add to Cart 35,000 70.0% 30.0%
3. Begin Checkout 21,000 60.0% 40.0%
4. Payment Info 15,750 75.0% 25.0%
5. Complete Purchase 6,250 39.7% 60.3%
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
š KEY FINDINGS
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
š“ CRITICAL DROP-OFF:
Complete Purchase (Step 5)
- Only 39.7% complete after entering payment
- Losing 9,500 users at final step
- Potential revenue impact: $285,000
ā ļø MODERATE DROP-OFF:
Begin Checkout (Step 3)
- 40% abandon cart before checkout
- Losing 14,000 users
ā
GOOD PERFORMANCE:
Add to Cart (Step 2): 70% conversion
Payment Info (Step 4): 75% conversion
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
ā±ļø TIME TO CONVERT
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
View ā Add to Cart: Median 2.3 min
Add to Cart ā Checkout: Median 8.5 min
Checkout ā Payment: Median 3.1 min
Payment ā Complete: Median 1.2 min
Total Journey: Median 15.1 minutes
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
š± SEGMENT COMPARISON
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
By Device:
Desktop: 15.2% conversion (30,000 users)
Mobile: 10.1% conversion (20,000 users)
Gap: Mobile 33% lower conversion
Worst Mobile Step: Complete Purchase (28% vs 45% desktop)
By Channel:
Organic: 14.3% conversion
Paid: 11.8% conversion
Email: 16.7% conversion
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
š” RECOMMENDATIONS
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
PRIORITY 1 (High Impact):
1. Investigate payment completion drop-off
- Review error messages at payment step
- Check mobile payment UX
- Consider guest checkout option
- Potential gain: +3,000 conversions/month
PRIORITY 2 (Medium Impact):
2. Reduce cart abandonment
- Add save cart feature
- Send abandonment emails
- Show trust signals earlier
- Potential gain: +2,000 conversions/month
PRIORITY 3 (Mobile Optimization):
3. Improve mobile experience
- Simplify mobile checkout flow
- Optimize for smaller screens
- Potential gain: +1,000 conversions/month
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
š FILES GENERATED
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
ā funnel_analysis.png (visualization)
ā funnel_metrics.csv (detailed metrics)
ā user_journeys.csv (individual user paths)
ā segment_comparison.csv (breakdown by segment)
Common Scenarios
Scenario 1: "Why is our signup funnel performing poorly?"
ā Build funnel from landing ā signup ā activation ā Identify biggest drop-off step ā Compare segments (source, device, etc.) ā Analyze time-to-convert at each step ā Provide specific recommendations
Scenario 2: "Mobile conversion is lower than desktop"
ā Run funnel analysis separately for each device ā Identify which step(s) mobile underperforms ā Compare time-to-convert (mobile users slower?) ā Highlight specific mobile UX issues
Scenario 3: "Test if new checkout flow improved conversion"
ā Compare funnel before/after change ā Calculate statistical significance of difference ā Show which specific steps improved ā Measure overall impact
Scenario 4: "Optimize onboarding for different user types"
ā Segment by user type (free, trial, paid) ā Build separate funnels for each ā Identify where each segment drops off ā Create targeted interventions
Scenario 5: "Track funnel performance over time"
ā Calculate weekly/monthly funnel metrics ā Show trend in conversion rates ā Flag when performance degrades ā Correlate with product changes
Handling Missing Context
User says "analyze our funnel" without defining steps: "I can help! First, what's the user journey you want to analyze? Example: Landing page ā Signup ā Onboarding ā Activation. What are your steps?"
User doesn't know time window: "Let me analyze the data to see typical completion times, then we can decide on an appropriate window. Most users complete within X days."
Event data is messy: "I see multiple event names that might represent the same step. Let me map them:
- 'view_product', 'product_page' ā Step 1?
- 'add_cart', 'added_to_cart' ā Step 2? Does this look right?"
User wants to compare many segments: "I can analyze all segments, but let's prioritize. Which 2-3 segments matter most for decision-making?"
Advanced Options
After basic funnel analysis, offer:
Cohort-Based Funnels: "Want to see how funnel performance changes over time? I can show conversion rates by signup cohort."
Micro-Conversion Analysis: "I can break down each major step into micro-steps to find exactly where users hesitate."
Drop-Off Prediction: "Using behavior patterns, I can predict which users are likely to drop off and when."
Recovery Analysis: "I can identify users who dropped off but later returned to complete the funnel."
Funnel Optimization Calculator: "I can estimate revenue impact of improving conversion at each step by X%."
A/B Test Power Analysis: "Planning to test funnel changes? I can calculate required sample size for statistical significance."
Related Skills
Dbt Transformation Patterns
Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.
Senior Data Scientist
World-class data science skill for statistical modeling, experimentation, causal inference, and advanced analytics. Expertise in Python (NumPy, Pandas, Scikit-learn), R, SQL, statistical methods, A/B testing, time series, and business intelligence. Includes experiment design, feature engineering, model evaluation, and stakeholder communication. Use when designing experiments, building predictive models, performing causal analysis, or driving data-driven decisions.
Hypogenic
Automated hypothesis generation and testing using large language models. Use this skill when generating scientific hypotheses from datasets, combining literature insights with empirical data, testing hypotheses against observational data, or conducting systematic hypothesis exploration for research discovery in domains like deception detection, AI content detection, mental health analysis, or other empirical research tasks.
Ux Researcher Designer
UX research and design toolkit for Senior UX Designer/Researcher including data-driven persona generation, journey mapping, usability testing frameworks, and research synthesis. Use for user research, persona creation, journey mapping, and design validation.
Hypogenic
Automated LLM-driven hypothesis generation and testing on tabular datasets. Use when you want to systematically explore hypotheses about patterns in empirical data (e.g., deception detection, content analysis). Combines literature insights with data-driven hypothesis testing. For manual hypothesis formulation use hypothesis-generation; for creative ideation use scientific-brainstorming.
Data Engineering Data Driven Feature
Build features guided by data insights, A/B testing, and continuous measurement using specialized agents for analysis, implementation, and experimentation.
Dbt Transformation Patterns
Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.
Dashboard Design
USE THIS SKILL FIRST when user wants to create and design a dashboard, ESPECIALLY Vizro dashboards. This skill enforces a 3-step workflow (requirements, layout, visualization) that must be followed before implementation. For implementation and testing, use the dashboard-build skill after completing Steps 1-3.
Ux Researcher Designer
UX research and design toolkit for Senior UX Designer/Researcher including data-driven persona generation, journey mapping, usability testing frameworks, and research synthesis. Use for user research, persona creation, journey mapping, and design validation.
Performance Testing
Benchmark indicator performance with BenchmarkDotNet. Use for Series/Buffer/Stream benchmarks, regression detection, and optimization patterns. Target 1.5x Series for StreamHub, 1.2x for BufferList.
