Survey Analyzer

by dkyazzentwatwa

skill

Analyze survey responses with Likert scale analysis, cross-tabulations, sentiment scoring, and frequency distributions with visualizations.

Skill Details

Repository Files

3 files in this skill directory


name: survey-analyzer description: Analyze survey responses with Likert scale analysis, cross-tabulations, sentiment scoring, and frequency distributions with visualizations.

Survey Analyzer

Comprehensive survey data analysis with Likert scales, cross-tabs, and sentiment analysis.

Features

  • Likert Scale Analysis: Agreement scale scoring and visualization
  • Cross-Tabulation: Relationship analysis between categorical variables
  • Frequency Analysis: Response distributions and percentages
  • Sentiment Scoring: Text response sentiment analysis
  • Open-Ended Analysis: Theme extraction from text responses
  • Statistical Tests: Chi-square, correlations, significance testing
  • Visualizations: Bar charts, heatmaps, word clouds, distribution plots
  • Report Generation: Comprehensive PDF/HTML reports

Quick Start

from survey_analyzer import SurveyAnalyzer

analyzer = SurveyAnalyzer()

# Load survey data
analyzer.load_csv('survey_responses.csv')

# Analyze Likert scale question
results = analyzer.likert_analysis('satisfaction', scale_type='agreement')
print(f"Mean score: {results['mean_score']:.2f}")

# Cross-tabulation
crosstab = analyzer.crosstab('age_group', 'product_preference')
print(crosstab)

# Generate report
analyzer.generate_report('survey_report.pdf')

CLI Usage

# Analyze Likert scale
python survey_analyzer.py --data survey.csv --likert satisfaction --output results.pdf

# Cross-tabulation
python survey_analyzer.py --data survey.csv --crosstab age_group product --output crosstab.png

# Sentiment analysis
python survey_analyzer.py --data survey.csv --sentiment comments --output sentiment.html

# Full report
python survey_analyzer.py --data survey.csv --report --output full_report.pdf

API Reference

SurveyAnalyzer Class

class SurveyAnalyzer:
    def __init__(self)

    # Data Loading
    def load_csv(self, filepath, **kwargs) -> 'SurveyAnalyzer'
    def load_data(self, data: pd.DataFrame) -> 'SurveyAnalyzer'

    # Likert Scale Analysis
    def likert_analysis(self, column, scale_type='agreement') -> Dict
    def likert_comparison(self, columns: List[str]) -> pd.DataFrame
    def plot_likert(self, column, output, scale_type='agreement') -> str

    # Frequency Analysis
    def frequency_table(self, column) -> pd.DataFrame
    def multiple_choice(self, column, delimiter=',') -> pd.DataFrame
    def plot_frequencies(self, column, output, top_n=None) -> str

    # Cross-Tabulation
    def crosstab(self, row_var, col_var, normalize=None) -> pd.DataFrame
    def chi_square_test(self, row_var, col_var) -> Dict
    def plot_crosstab(self, row_var, col_var, output) -> str

    # Sentiment Analysis
    def sentiment_analysis(self, column) -> pd.DataFrame
    def sentiment_summary(self, column) -> Dict
    def plot_sentiment(self, column, output) -> str

    # Open-Ended Analysis
    def word_frequency(self, column, top_n=20) -> pd.DataFrame
    def word_cloud(self, column, output) -> str
    def extract_themes(self, column, n_themes=5) -> List[str]

    # Statistics
    def satisfaction_score(self, columns: List[str]) -> Dict
    def response_rate(self) -> Dict
    def demographics_summary(self, columns: List[str]) -> pd.DataFrame

    # Reporting
    def generate_report(self, output, format='pdf') -> str
    def summary(self) -> str

Likert Scale Analysis

Standard Scales

# 5-point agreement scale
analyzer.likert_analysis('satisfaction', scale_type='agreement')
# 1=Strongly Disagree, 2=Disagree, 3=Neutral, 4=Agree, 5=Strongly Agree

# 5-point frequency scale
analyzer.likert_analysis('usage', scale_type='frequency')
# 1=Never, 2=Rarely, 3=Sometimes, 4=Often, 5=Always

# Custom scale
analyzer.likert_analysis('rating', scale_type='custom',
                        labels=['Poor', 'Fair', 'Good', 'Excellent'])

Results

results = analyzer.likert_analysis('satisfaction')
# {
#     'mean_score': 4.2,
#     'median': 4,
#     'mode': 5,
#     'distribution': {1: 2, 2: 5, 3: 15, 4: 40, 5: 38},
#     'percentages': {1: 2%, 2: 5%, 3: 15%, 4: 40%, 5: 38%},
#     'top_2_box': 78%,  # % Agree + Strongly Agree
#     'bottom_2_box': 7%  # % Disagree + Strongly Disagree
# }

Visualization

# Stacked bar chart
analyzer.plot_likert('satisfaction', 'likert_chart.png')

# Compare multiple questions
analyzer.likert_comparison(['quality', 'value', 'service'])
analyzer.plot_likert_comparison(['quality', 'value', 'service'],
                               'comparison.png')

Frequency Analysis

Single Choice

freq = analyzer.frequency_table('age_group')
#          Count  Percentage
# 18-24      45      22.5%
# 25-34      78      39.0%
# 35-44      52      26.0%
# 45+        25      12.5%

# Plot
analyzer.plot_frequencies('age_group', 'age_distribution.png')

Multiple Choice

For questions allowing multiple selections:

# Data format: "Option A, Option B, Option C"
results = analyzer.multiple_choice('features_liked', delimiter=',')
#                Count  Percentage
# Price            120      60%
# Quality           95      47.5%
# Design            80      40%
# Durability        70      35%

analyzer.plot_frequencies('features_liked', 'features.png', top_n=10)

Cross-Tabulation

Basic Cross-Tab

crosstab = analyzer.crosstab('age_group', 'satisfaction')
#           Satisfied  Neutral  Dissatisfied
# 18-24          30       10           5
# 25-34          60       15           3
# 35-44          40        8           4
# 45+            18        5           2

# With percentages
crosstab_pct = analyzer.crosstab('age_group', 'satisfaction',
                                 normalize='index')  # Row percentages

Statistical Testing

result = analyzer.chi_square_test('age_group', 'satisfaction')
# {
#     'statistic': 12.45,
#     'p_value': 0.014,
#     'significant': True,
#     'interpretation': 'There is a significant relationship between
#                        age_group and satisfaction (p=0.014)'
# }

Visualization

# Heatmap
analyzer.plot_crosstab('age_group', 'satisfaction', 'crosstab_heatmap.png')

Sentiment Analysis

Analyze open-ended text responses:

# Analyze all comments
sentiment_df = analyzer.sentiment_analysis('comments')
#      comment                          polarity  sentiment
# 0    "Great product!"                  0.8      Positive
# 1    "Could be better"                 0.1      Neutral
# 2    "Very disappointed"              -0.6      Negative

# Summary
summary = analyzer.sentiment_summary('comments')
# {
#     'positive': 65%,
#     'neutral': 20%,
#     'negative': 15%,
#     'avg_polarity': 0.35
# }

# Visualize
analyzer.plot_sentiment('comments', 'sentiment_distribution.png')

Open-Ended Analysis

Word Frequency

words = analyzer.word_frequency('comments', top_n=20)
#        Word   Frequency
# 0     great       45
# 1   quality       38
# 2     price       32
# ...

Word Cloud

analyzer.word_cloud('comments', 'wordcloud.png')

Theme Extraction

themes = analyzer.extract_themes('feedback', n_themes=5)
# ['product quality', 'customer service', 'pricing',
#  'delivery speed', 'user experience']

Satisfaction Metrics

Net Promoter Score (NPS)

nps = analyzer.nps_score('recommendation')  # 0-10 scale
# {
#     'promoters': 65%,   # 9-10
#     'passives': 25%,    # 7-8
#     'detractors': 10%,  # 0-6
#     'nps': 55
# }

Overall Satisfaction

satisfaction = analyzer.satisfaction_score([
    'product_quality',
    'customer_service',
    'value_for_money',
    'ease_of_use'
])
# {
#     'overall_score': 4.3,
#     'category_scores': {...},
#     'satisfaction_rate': 86%  # % scoring 4-5
# }

Demographics Analysis

demographics = analyzer.demographics_summary([
    'age_group',
    'gender',
    'location',
    'income_range'
])

# Returns frequency tables for each demographic variable

Response Rate Analysis

response_rate = analyzer.response_rate()
# {
#     'total_respondents': 200,
#     'completion_rate': 85%,
#     'average_time': '5m 30s',
#     'dropout_points': {
#         'question_5': 8%,
#         'question_12': 5%
#     }
# }

Report Generation

Comprehensive Report

analyzer.generate_report('survey_report.pdf', format='pdf')

Report includes:

  • Executive summary
  • Response rate and demographics
  • Question-by-question analysis
  • Likert scale visualizations
  • Cross-tabulations
  • Sentiment analysis
  • Key findings and recommendations

Custom Report Sections

analyzer.set_report_sections([
    'executive_summary',
    'demographics',
    'likert_questions',
    'cross_tabs',
    'sentiment',
    'recommendations'
])

Advanced Features

Filter by Segment

# Analyze subset of responses
analyzer.filter('age_group', '25-34')
results = analyzer.likert_analysis('satisfaction')
analyzer.clear_filter()

Compare Segments

comparison = analyzer.compare_segments(
    segment_col='age_group',
    metric_col='satisfaction'
)
# Shows how different segments scored the metric

Trend Analysis

For longitudinal surveys:

trends = analyzer.trend_analysis(
    metric='satisfaction',
    time_col='survey_date',
    period='month'
)
analyzer.plot_trends(trends, 'satisfaction_trend.png')

Dependencies

  • pandas>=2.0.0
  • numpy>=1.24.0
  • scipy>=1.10.0
  • textblob>=0.17.0
  • matplotlib>=3.7.0
  • seaborn>=0.12.0
  • wordcloud>=1.9.0
  • reportlab>=4.0.0

Related Skills

Attack Tree Construction

Build comprehensive attack trees to visualize threat paths. Use when mapping attack scenarios, identifying defense gaps, or communicating security risks to stakeholders.

skill

Grafana Dashboards

Create and manage production Grafana dashboards for real-time visualization of system and application metrics. Use when building monitoring dashboards, visualizing metrics, or creating operational observability interfaces.

skill

Matplotlib

Foundational plotting library. Create line plots, scatter, bar, histograms, heatmaps, 3D, subplots, export PNG/PDF/SVG, for scientific visualization and publication figures.

skill

Scientific Visualization

Create publication figures with matplotlib/seaborn/plotly. Multi-panel layouts, error bars, significance markers, colorblind-safe, export PDF/EPS/TIFF, for journal-ready scientific plots.

skill

Seaborn

Statistical visualization. Scatter, box, violin, heatmaps, pair plots, regression, correlation matrices, KDE, faceted plots, for exploratory analysis and publication figures.

skill

Shap

Model interpretability and explainability using SHAP (SHapley Additive exPlanations). Use this skill when explaining machine learning model predictions, computing feature importance, generating SHAP plots (waterfall, beeswarm, bar, scatter, force, heatmap), debugging models, analyzing model bias or fairness, comparing models, or implementing explainable AI. Works with tree-based models (XGBoost, LightGBM, Random Forest), deep learning (TensorFlow, PyTorch), linear models, and any black-box model

skill

Pydeseq2

Differential gene expression analysis (Python DESeq2). Identify DE genes from bulk RNA-seq counts, Wald tests, FDR correction, volcano/MA plots, for RNA-seq analysis.

skill

Query Writing

For writing and executing SQL queries - from simple single-table queries to complex multi-table JOINs and aggregations

skill

Pydeseq2

Differential gene expression analysis (Python DESeq2). Identify DE genes from bulk RNA-seq counts, Wald tests, FDR correction, volcano/MA plots, for RNA-seq analysis.

skill

Scientific Visualization

Meta-skill for publication-ready figures. Use when creating journal submission figures requiring multi-panel layouts, significance annotations, error bars, colorblind-safe palettes, and specific journal formatting (Nature, Science, Cell). Orchestrates matplotlib/seaborn/plotly with publication styles. For quick exploration use seaborn or plotly directly.

skill

Skill Information

Category:Skill
Last Updated:12/16/2025