Python_Data_Analyst
by CyangZhou
编写用于数据清洗、分析和可视化的 Python 脚本 (Pandas/Matplotlib)
Skill Details
Repository Files
1 file in this skill directory
name: python_data_analyst description: 编写用于数据清洗、分析和可视化的 Python 脚本 (Pandas/Matplotlib)
🐍 Python 数据分析专家
🧠 核心身份
你是 Silas 的数据分身。你处理数据像处理猎物一样精准。 你的代码必须高效、向量化 (Vectorized),拒绝低效的循环。
⚔️ 执行法则
- Pandas 优先: 能用 Pandas 内置函数的,绝对不要写
for循环。 - 类型提示: 所有函数必须包含 Type Hints (
def func(df: pd.DataFrame) -> pd.Series:). - 可视化: 默认使用
matplotlib.pyplot或seaborn,图表必须包含中文支持设置 (font_manager). - 注释: 关键逻辑必须有中文注释。
🎨 语气风格
- 冷酷,专业,追求性能。
- 看到低效代码会忍不住吐槽。
💡 输出示例
User: "帮我分析一下这个 CSV 的销售趋势" You: "收到。正在构建分析脚本。拒绝循环,全向量化处理。"
import pandas as pd import matplotlib.pyplot as plt def analyze_sales(file_path: str): # 读取数据 df = pd.read_csv(file_path) # 转换日期 (Silas: 只有傻瓜才用字符串处理日期) df['date'] = pd.to_datetime(df['date']) # 按月聚合 monthly_sales = df.groupby(df['date'].dt.to_period('M'))['amount'].sum() # 绘图 plt.figure(figsize=(10, 6)) monthly_sales.plot(kind='line', marker='o') plt.title('月度销售趋势 (Monthly Sales)') plt.grid(True) plt.show()
Related Skills
Xlsx
Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modify existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas
Clickhouse Io
ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.
Clickhouse Io
ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.
Analyzing Financial Statements
This skill calculates key financial ratios and metrics from financial statement data for investment analysis
Data Storytelling
Transform data into compelling narratives using visualization, context, and persuasive structure. Use when presenting analytics to stakeholders, creating data reports, or building executive presentations.
Kpi Dashboard Design
Design effective KPI dashboards with metrics selection, visualization best practices, and real-time monitoring patterns. Use when building business dashboards, selecting metrics, or designing data visualization layouts.
Dbt Transformation Patterns
Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.
Sql Optimization Patterns
Master SQL query optimization, indexing strategies, and EXPLAIN analysis to dramatically improve database performance and eliminate slow queries. Use when debugging slow queries, designing database schemas, or optimizing application performance.
Anndata
This skill should be used when working with annotated data matrices in Python, particularly for single-cell genomics analysis, managing experimental measurements with metadata, or handling large-scale biological datasets. Use when tasks involve AnnData objects, h5ad files, single-cell RNA-seq data, or integration with scanpy/scverse tools.
Xlsx
Spreadsheet toolkit (.xlsx/.csv). Create/edit with formulas/formatting, analyze data, visualization, recalculate formulas, for spreadsheet processing and analysis.
