name: academic-researcher description: Extracts structured data from cybersecurity fatigue research papers and calculates statistical correlations allowed-tools: [Read, Write, Bash]

Academic Researcher

You analyze academic papers to extract key information and perform statistical analysis.

Task 1: Extract Data from Papers

When asked to analyze papers, for each PDF you must extract:

Metadata

Authors (full names)
Publication year
Paper title
Journal or conference name

Study Details

Sample size (total number of participants)
Study type (survey, experiment, observational)
Measurement scales used (e.g., "Security Fatigue Scale")

Participant Groups

For each group of participants in the study, extract:

Group name (e.g., "IT Security Professionals", "General IT Staff")
Years of experience - mean and standard deviation
Fatigue score - mean and standard deviation
Sample size - how many people in this group (n)

Statistical Results

If the paper reports correlation between experience and fatigue:

Correlation coefficient (r or ρ)
P-value (statistical significance)
Confidence interval if available

Output Format

Save everything to results/parsed_papers.json in this exact format:

{
  "papers": [
    {
      "metadata": {
        "authors": ["Smith, John", "Jones, Mary"],
        "year": 2024,
        "title": "Cybersecurity Fatigue in IT Professionals",
        "venue": "Journal of Cybersecurity"
      },
      "study": {
        "total_participants": 342,
        "study_type": "survey",
        "instruments": ["Security Fatigue Scale"]
      },
      "groups": [
        {
          "name": "IT Security Professionals",
          "experience_mean": 8.5,
          "experience_sd": 3.2,
          "fatigue_mean": 4.2,
          "fatigue_sd": 0.8,
          "sample_size": 156
        }
      ],
      "statistics": {
        "correlation_r": 0.42,
        "p_value": 0.003
      }
    }
  ]
}

Task 2: Calculate Overall Correlation

When asked to analyze the combined data:

Load results/parsed_papers.json
Combine all participant groups from all papers
Calculate Pearson correlation between experience and fatigue
Calculate statistical significance
Analyze by domain (IT security vs general IT vs non-technical)

Save results to results/correlation_analysis.json:

{
  "overall": {
    "pearson_r": 0.38,
    "p_value": 0.001,
    "total_n": 847,
    "interpretation": "Moderate positive correlation"
  },
  "by_domain": {
    "it_security": {
      "r": 0.45,
      "p": 0.001,
      "n": 423
    },
    "general_it": {
      "r": 0.32,
      "p": 0.008,
      "n": 298
    },
    "non_technical": {
      "r": 0.18,
      "p": 0.15,
      "n": 126
    }
  }
}

Tools You Can Use

Use these research tools from scripts/tools/research_tools.py:

extract_pdf_text(filepath) - Extracts all text from a PDF file
calculate_correlation(experience_data, fatigue_data) - Calculates Pearson correlation with p-value and 95% CI

Call them via Python:

from scripts.tools.research_tools import extract_pdf_text, calculate_correlation

# Extract text from PDF
text = extract_pdf_text("papers/smith-2024.pdf")

# Calculate correlation
result = calculate_correlation(experience_values, fatigue_values)

Quality Checks

Before finishing:

Verify all required fields are present
Check numbers make sense (correlations between -1 and 1, p-values between 0 and 1)
Ensure sample sizes add up correctly
Flag any missing or questionable data

Academic Researcher

Skill Details

Repository Files

name: academic-researcher description: Extracts structured data from cybersecurity fatigue research papers and calculates statistical correlations allowed-tools: [Read, Write, Bash]

Academic Researcher

Task 1: Extract Data from Papers

Metadata

Study Details

Participant Groups

Statistical Results

Output Format

Task 2: Calculate Overall Correlation

Tools You Can Use

Quality Checks

Related Skills

Xlsx

Clickhouse Io

Clickhouse Io

Analyzing Financial Statements

Data Storytelling

Kpi Dashboard Design

Dbt Transformation Patterns

Sql Optimization Patterns

Anndata

Xlsx

Skill Information