Data Visualization
by AmnadTaowsoam
Creating effective data visualizations using charts, graphs, and visual representations to communicate insights clearly and accurately following Tufte and Few principles.
Current Level: Intermediate
Domain: Business Analytics / Data Science
Overview
Data visualization transforms data into visual representations that make insights clear and actionable. Effective data visualization follows principles of clarity, accuracy, and simplicity to help users understand complex data quickly.
Data Visualization Principles
Edward Tufte's Principles
| Principle | Description | Example |
|---|---|---|
| Show the data | Let data speak for itself | Don't hide behind decoration |
| Maximize data-ink ratio | Remove non-data ink | Eliminate chart junk |
| Integrate text and graphics | Labels near data | Direct labeling, not legends |
| Avoid distortion | Accurate representation | Start Y-axis at zero |
| Small multiples | Compare many charts | Side-by-side comparison |
Stephen Few's Principles
| Principle | Description |
|---|---|
| Simplicity | Remove unnecessary elements |
| Clarity | Make message obvious |
| Accuracy | Represent data truthfully |
| Efficiency | Convey information quickly |
| Aesthetics | Pleasing but not distracting |
Data-Ink Ratio
Data-Ink Ratio = Data-Ink / Total Ink
Goal: Maximize ratio (close to 1)
Bad: Lots of decoration, 3D effects, shadows
Good: Clean, minimal, data-focused
Chart Types and Use Cases
Comprehensive Chart Guide
| Chart Type | Best For | Data Type | Example |
|---|---|---|---|
| Line Chart | Trends over time | Continuous | Revenue over months |
| Bar Chart | Compare categories | Categorical | Sales by region |
| Column Chart | Compare categories (vertical) | Categorical | Sales by product |
| Pie Chart | Parts of whole (max 5) | Categorical | Market share |
| Donut Chart | Parts of whole (modern) | Categorical | Budget breakdown |
| Area Chart | Cumulative over time | Continuous | Total users over time |
| Scatter Plot | Correlation | Two continuous | Price vs. quantity |
| Bubble Chart | Three dimensions | Two continuous + size | Sales vs. profit vs. volume |
| Heatmap | Two dimensions | Categorical × Categorical | Sales by region × month |
| Treemap | Hierarchical parts | Hierarchical | Budget by department |
| Histogram | Distribution | Continuous | Order value distribution |
| Box Plot | Distribution + outliers | Continuous | Salary distribution |
| Violin Plot | Distribution shape | Continuous | Response time distribution |
| Radar Chart | Multi-variable comparison | Multiple | Skills assessment |
| Gauge Chart | Single metric vs. target | Single | Progress to goal |
| Funnel Chart | Process stages | Sequential | Conversion funnel |
| Sankey Diagram | Flow between stages | Sequential | User journey |
| Waterfall Chart | Cumulative changes | Sequential | Revenue breakdown |
| Sparkline | Mini trend | Time series | Stock price trend |
Line Chart
Use for: Time series, trends
Revenue ($)
$100k ┤
$80k ┤ ●───●───●───●───●
$60k ┤
$40k ┤
$20k ┤
$0 └────────────────────────────
Jan Feb Mar Apr May
Best practices:
- Smooth curves for trends
- Don't connect unrelated points
- Use area charts for cumulative
- Max 5-7 lines
Bar Chart
Use for: Category comparison
Sales by Region
North America ████████████████████████████████████████ $50M
Europe ████████████████████████████ $35M
Asia ████████████████████ $25M
Other ████████████ $10M
0 10M 20M 30M 40M 50M
Best practices:
- Horizontal bars for many categories
- Sort by value (not alphabetically)
- Start the value axis at zero (bar length encodes the value)
- Use consistent bar widths
Pie/Donut Chart
Use for: Parts of whole (max 5 slices)
Market Share
Product A ████████████████████████████████████████ 50%
Product B ████████████████████ 25%
Product C ████████████████████ 25%
Best practices:
- Max 5 slices
- Use donut for modern look
- Consider bar chart instead
- Order slices by size
Scatter Plot
Use for: Correlation between two variables
Quantity
100 ┤ ●
│ ● ●
50 ┤ ● ● ●
│ ● ● ●
0 └────────────────────
0 50 100 150
Price
Best practices:
- Add trend line
- Color by category
- Identify outliers
- Use transparency for overlapping points
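The "add trend line" practice above can be sketched with an ordinary least-squares fit. This is a minimal illustration using only the standard library; the function name `fit_trend_line` and the price/quantity values are hypothetical, not taken from any library or dataset.

```python
def fit_trend_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes xs are not all identical
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical price vs. quantity observations.
prices = [10, 20, 30, 40, 50]
quantities = [95, 80, 62, 45, 30]

slope, intercept = fit_trend_line(prices, quantities)
# Two endpoints are enough to draw the line over the scatter plot.
line = [(x, slope * x + intercept) for x in (min(prices), max(prices))]
print(f"slope={slope:.2f}, intercept={intercept:.1f}, endpoints={line}")
```

The two endpoints can be passed to whichever plotting library draws the scatter plot. A straight line is only one possible trend model and may be a poor fit for curved relationships.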
Heatmap
Use for: Two-dimensional data
Sales by Region × Month
Jan Feb Mar Apr
US ■■■■■ ■■■■■ ■■■■■ ■■■■■
EU ■■■■ ■■■■ ■■■■ ■■■■
Asia ■■■ ■■■ ■■■ ■■■
Best practices:
- Use color scale
- Include legend
- Label both axes
- Consider diverging colors for +/- data
Histogram
Use for: Distribution of continuous data
Order Value Distribution
$0-10 ████████████████████████████████████████ 100
$10-20 ████████████████████████████ 80
$20-30 ████████████████████ 60
$30-40 ████████████ 40
$40-50 ██████████ 20
$50+ ██████ 10
Best practices:
- Choose appropriate bin size
- Show normal distribution if applicable
- Label axes clearly
- Consider density plot for smooth curves
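Choosing a bin size can start from a rule of thumb. The sketch below shows two common ones, Sturges' rule and the Freedman-Diaconis rule, using only the standard library. The function names and the order values are hypothetical, and either rule is a starting point to adjust by eye rather than a definitive answer.

```python
import math
import statistics


def sturges_bins(n: int) -> int:
    """Number of bins by Sturges' rule: ceil(log2(n)) + 1."""
    return math.ceil(math.log2(n)) + 1


def freedman_diaconis_width(values: list[float]) -> float:
    """Bin width by the Freedman-Diaconis rule: 2 * IQR / n^(1/3)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return 2 * (q3 - q1) / len(values) ** (1 / 3)


# Hypothetical order values in dollars.
order_values = [4, 6, 7, 8, 9, 11, 12, 15, 15, 18, 22, 25, 31, 36, 44, 52]

print("Sturges bin count:", sturges_bins(len(order_values)))
print("Freedman-Diaconis bin width:", round(freedman_diaconis_width(order_values), 1))
```

Sturges' rule depends only on the sample size, while Freedman-Diaconis uses the interquartile range, so it tends to be less affected by a few extreme values.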
Box Plot
Use for: Distribution with outliers
Salary Distribution
         ┌──────┬──────┐
 ├───────┤      │      ├───────┤
         └──────┴──────┘
Min      Q1    Med     Q3    Max
Best practices:
- Show outliers as points
- Compare multiple box plots
- Label quartiles
- Use for skewed distributions
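To make "show outliers as points" concrete, here is a minimal sketch of the numbers behind a box plot, assuming the common convention that whiskers extend to the most extreme values within 1.5 × IQR of the quartiles. The function name `box_plot_stats` and the salary figures are hypothetical, and plotting libraries may compute quartiles slightly differently.

```python
import statistics


def box_plot_stats(values):
    """Return quartiles, whisker ends, and outliers using the 1.5 * IQR convention."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme values that are still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical salaries in $K; the last value is far from the rest.
salaries = [40, 42, 45, 48, 50, 52, 55, 58, 60, 150]
print(box_plot_stats(salaries))
```

In this sample the 150 value falls outside the upper fence, so it would be drawn as a separate point rather than stretching the whisker.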
Color Theory
Color Palettes
Sequential (Ordered Data)
Use for continuous, ordered data.
Viridis: #440154 → #3b528b → #21918c → #5ec962 → #fde725
Blues: #f7fbff → #deebf7 → #c6dbef → #9ecae1 → #6baed6
Greens: #f7fcf5 → #e5f5e0 → #c7e9c0 → #a1d99b → #74c476
Diverging (Deviation from Center)
Use for data with meaningful midpoint.
RdYlGn: #a50026 → #d73027 → #f46d43 → #fdae61 → #fee08b
→ #d9ef8b → #a6d96a → #66bd63 → #1a9850 → #006837
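A diverging scale maps values below the midpoint toward one end color and values above it toward the other. The sketch below illustrates the idea with plain linear interpolation in RGB, which is an assumption made for brevity: it is not perceptually uniform, and real palettes use more stops. The helper names are hypothetical; the default colors are the red and green ends of RdYlGn with its light-yellow midpoint `#ffffbf`.

```python
def hex_to_rgb(color):
    """Convert '#rrggbb' to an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def blend(start, end, t):
    """Linearly interpolate between two hex colors; t=0 gives start, t=1 gives end."""
    mixed = (round(a + (b - a) * t) for a, b in zip(hex_to_rgb(start), hex_to_rgb(end)))
    return "#{:02x}{:02x}{:02x}".format(*mixed)


def diverging_color(value, low, mid, high, colors=("#d73027", "#ffffbf", "#1a9850")):
    """Map value to a color, with `mid` anchored to the neutral middle color."""
    value = min(max(value, low), high)  # clamp to the scale's range
    if value <= mid:
        return blend(colors[0], colors[1], (value - low) / (mid - low))
    return blend(colors[1], colors[2], (value - mid) / (high - mid))


# Hypothetical percentage changes around a midpoint of 0.
for change in (-10, -5, 0, 5, 10):
    print(change, diverging_color(change, low=-10, mid=0, high=10))
```

Anchoring the neutral color to the meaningful midpoint (here 0) rather than to the middle of the observed data range keeps positive and negative values visually distinct.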
Qualitative (Categorical)
Use for distinct categories.
Set1: #e41a1c, #377eb8, #4daf4a, #984ea3, #ff7f00
Set2: #66c2a5, #fc8d62, #8da0cb, #e78ac3, #a6d854
Semantic Colors
| Color | Meaning | Use Case |
|---|---|---|
| Green | Positive, good | Above target, growth |
| Red | Negative, bad | Below target, decline |
| Yellow/Orange | Warning | Near threshold |
| Blue | Neutral, information | Default state |
| Gray | Inactive, placeholder | Disabled elements |
Colorblind-Friendly Design
Tips:
- Avoid red-green only
- Use patterns + color
- Test with colorblind simulators
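- For diverging data, choose a colorblind-safe diverging palette (e.g., blue-orange) rather than red-green
Safe Palettes:
- Viridis (colorblind-safe)
- ColorBrewer (choose the schemes flagged as colorblind-safe)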
- Okabe-Ito (8 colorblind-safe colors)
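The "patterns + color" tip can be sketched by pairing each category with both a color and a marker shape, so series stay distinguishable without color. The hex values below are the eight Okabe-Ito colors; the marker codes follow Matplotlib's marker strings, and the function name `assign_styles` is a hypothetical helper.

```python
from itertools import cycle

# Okabe-Ito palette: orange, sky blue, bluish green, yellow,
# blue, vermillion, reddish purple, black.
OKABE_ITO = [
    "#E69F00", "#56B4E9", "#009E73", "#F0E442",
    "#0072B2", "#D55E00", "#CC79A7", "#000000",
]
# Marker codes as used by Matplotlib (assumed target library).
MARKERS = ["o", "s", "^", "D", "v", "P", "X", "*"]


def assign_styles(categories):
    """Give each category a color and a marker so it is not encoded by color alone."""
    styles = {}
    for category, color, marker in zip(categories, cycle(OKABE_ITO), cycle(MARKERS)):
        styles[category] = {"color": color, "marker": marker}
    return styles


for name, style in assign_styles(["North", "South", "East", "West"]).items():
    print(name, style)
```

With more than eight categories the colors repeat, which is another reason to keep the number of series small.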
Visual Perception
Preattentive Attributes
Attributes processed instantly (before conscious thought).
| Attribute | Speed | Example |
|---|---|---|
| Color | Fast | Highlighting important data |
| Size | Fast | Larger = more important |
| Orientation | Fast | Angles, lines |
| Motion | Very fast | Animations |
| Position | Fast | Top-left = primary |
Ranking of Visual Attributes
From most to least accurate for quantitative data:
- Position (most accurate)
- Length
- Angle
- Direction
- Area
- Volume
- Color saturation
- Color hue (least accurate)
Implication: Use position/length for precise values, color for categories.
Gestalt Principles
| Principle | Description | Application |
|---|---|---|
| Proximity | Near items grouped together | Group related charts |
| Similarity | Similar items grouped | Use consistent colors |
| Continuity | Eye follows lines | Use flow in layouts |
| Closure | The mind completes incomplete shapes | Drop redundant borders and boxes |
| Figure-Ground | Separate foreground/background | Use white space |
Chart Design Best Practices
1. Direct Labeling
Bad (legend):
[Chart]
Legend: Blue = A, Red = B, Green = C
Good (direct labels):
[Chart with labels on chart]
Series A ●───●───●
Series B ●───●───●
Series C ●───●───●
2. Start Y-Axis at Zero
Bad (truncated):
$100k ┤ ●
$98k ┤
$96k ┤ ●
$94k ┤
└────────
Good (starts at zero):
$100k ┤ ●
$50k ┤
$0 ┤ ●
└────────
Exception: When zero is not meaningful (e.g., temperature)
3. Sort by Value
Bad (alphabetical):
Zebra ████████
Apple ████████████████
Banana ████████████
Good (by value):
Apple ████████████████
Banana ████████████
Zebra ████████
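Sorting by value is a one-line step before plotting. A minimal sketch, where `sort_by_value` is a hypothetical helper and the numbers simply match the bar lengths in the sketch above:

```python
def sort_by_value(values: dict[str, float]) -> list[tuple[str, float]]:
    """Return (category, value) pairs ordered from largest to smallest value."""
    return sorted(values.items(), key=lambda item: item[1], reverse=True)


# Hypothetical values matching the bar lengths in the sketch above.
sales = {"Zebra": 8, "Apple": 16, "Banana": 12}

for name, value in sort_by_value(sales):
    print(f"{name:<8}{'█' * value}")
```

Categories with a natural order (months, age bands, survey scales) are usually better left in that order instead.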
4. Remove Chart Junk
Bad (cluttered):
[Chart with grid, borders, 3D effects, shadows, gradients]
Good (clean):
[Clean chart with minimal decoration]
5. Use Consistent Scales
Bad (different scales):
Chart A: 0-100
Chart B: 0-1000
Good (consistent scales):
Chart A: 0-100
Chart B: 0-100 (normalized)
Interactive Visualizations
Interactive Features
| Feature | Description | Use Case |
|---|---|---|
| Tooltips | Hover for details | Show exact values |
| Zoom | Zoom into data | Explore details |
| Pan | Move around zoomed data | Navigate |
| Filter | Filter data | Focus on subset |
| Highlight | Highlight selection | Compare |
| Brush | Select range | Time range selection |
| Click | Click for drill-down | Navigate hierarchy |
Tooltip Design
Good tooltip:
┌─────────────────────────────────────┐
│ January 2024 │
│ ─────────────────────────────── │
│ Revenue: $1,234,567 │
│ Orders: 1,234 │
│ AOV: $1,000 │
│ ▲ 12.3% vs previous month │
└─────────────────────────────────────┘
Zoom and Pan
Before zoom:
┌─────────────────────────────────────┐
│ ●───●───●───●───●───●───●───● │ ← Full view
└─────────────────────────────────────┘
After zoom:
┌─────────────────────────────────────┐
│ ●───●───● │ ← Zoomed in
└─────────────────────────────────────┘
Accessibility
WCAG Guidelines
| Guideline | Requirement |
|---|---|
| Color contrast | 4.5:1 for text, 3:1 for large text |
| Color independence | Don't rely on color alone |
| Keyboard navigation | All features accessible via keyboard |
| Screen reader support | Provide alt text, ARIA labels |
| Focus indicators | Visible focus state |
Color Contrast Checker
Good contrast (21:1):
Black text on white background
#000000 on #FFFFFF
Poor contrast (about 1.6:1):
Light gray on white background
#CCCCCC on #FFFFFF
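Contrast ratios like these can be checked in code. The sketch below follows the relative-luminance and contrast-ratio definitions from WCAG 2.x; the function names are hypothetical, and input is assumed to be a six-digit hex color.

```python
def relative_luminance(hex_color: str) -> float:
    """Relative luminance of an sRGB color, per the WCAG 2.x definition."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma encoding for each channel.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio (lighter + 0.05) / (darker + 0.05); ranges from 1 to 21."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


for fg, bg in (("#000000", "#FFFFFF"), ("#CCCCCC", "#FFFFFF")):
    ratio = contrast_ratio(fg, bg)
    verdict = "passes" if ratio >= 4.5 else "fails"
    print(f"{fg} on {bg}: {ratio:.2f}:1 ({verdict} the 4.5:1 threshold for text)")
```

The same check applies to chart text such as axis labels and annotations, not only to body copy.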
Color Independence
Bad (color only):
Red = Negative, Green = Positive
Good (color + pattern/icon):
Red ↓ = Negative, Green ↑ = Positive
Alt Text for Charts
Example:
<figure>
<img src="revenue-chart.png" alt="Line chart showing revenue
increasing from $1M in January to $1.5M in June, an average
month-over-month growth rate of about 8%.">
<figcaption>Revenue Growth Q1-Q2 2024</figcaption>
</figure>
Tools
JavaScript Libraries
| Library | Strengths | Learning Curve |
|---|---|---|
| D3.js | Most flexible, powerful | Steep |
| Plotly | Easy, interactive | Gentle |
| Chart.js | Simple, popular | Gentle |
| Recharts | React-friendly | Gentle |
| Victory | React, declarative | Gentle |
| Nivo | React, beautiful | Gentle |
Python Libraries
| Library | Strengths | Use Case |
|---|---|---|
| Matplotlib | Foundation, flexible | All-purpose |
| Seaborn | Statistical plots | Data analysis |
| Plotly | Interactive | Web dashboards |
| Altair | Declarative grammar | Statistical |
| Bokeh | Interactive | Web apps |
R Libraries
| Library | Strengths | Use Case |
|---|---|---|
| ggplot2 | Grammar of graphics | All-purpose |
| plotly | Interactive | Web dashboards |
| lattice | Trellis displays | Multi-panel |
BI Tools
| Tool | Strengths |
|---|---|
| Tableau | Powerful visualizations |
| Looker | SQL-based, embedded |
| Power BI | Microsoft ecosystem |
| Metabase | Open-source, simple |
Common Mistakes
1. 3D Charts
Problem: 3D distorts data perception.
Bad:
[3D bar chart with perspective]
Good:
[2D bar chart, flat]
2. Truncated Y-Axis
Problem: Exaggerates differences.
Bad:
$100k ┤ ●
$98k ┤
$96k ┤ ●
Good:
$100k ┤ ●
$0 ┤ ●
3. Too Many Colors
Problem: Confusing, hard to distinguish.
Bad: 10+ colors
Good: 3-5 colors max
4. Pie Charts with Many Slices
Problem: Hard to compare.
Bad: 10+ slices
Good: Max 5 slices, or use a bar chart instead
5. Missing Context
Problem: Numbers without meaning.
Bad: "Revenue: $1.2M"
Good: "Revenue: $1.2M ▲ 12% vs last month"
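Adding context usually means computing a comparison and showing it next to the number. A minimal sketch, where `format_with_context` is a hypothetical helper and the previous-month figure is invented so that the output matches the "Good" label above:

```python
def format_with_context(label, current, previous, period="last month"):
    """Format a dollar metric in millions with its percentage change vs. a prior period."""
    change = (current - previous) / previous * 100  # assumes previous is non-zero
    arrow = "▲" if change >= 0 else "▼"
    return f"{label}: ${current / 1_000_000:.1f}M {arrow} {abs(change):.0f}% vs {period}"


# Hypothetical figures: this month vs. the month before.
print(format_with_context("Revenue", 1_200_000, 1_071_429))
# prints: Revenue: $1.2M ▲ 12% vs last month
```

The arrow carries the direction without relying on color, which also helps with the color-independence guideline.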
6. Rainbow Colors
Problem: No meaning, hard to read.
Bad: Random colors
Good: Semantic colors (e.g., red = bad, green = good, paired with an icon or label so meaning does not depend on color alone)
7. Small Fonts
Problem: Hard to read.
Bad: 10px font
Good: 12px+ font
Mobile-Responsive Charts
Responsive Design
Desktop: Wide charts, side-by-side
Mobile: Narrow charts, stacked
Desktop: Mobile:
┌─────┬─────┐ ┌─────┐
│ A │ B │ │ A │
├─────┼─────┤ ├─────┤
│ C │ D │ │ B │
└─────┴─────┘ ├─────┤
│ C │
├─────┤
│ D │
└─────┘
Touch-Friendly
- Minimum tap target: 44×44px
- Large touch areas
- Swipe gestures
Performance
- Optimize images
- Lazy load charts
- Use canvas for many points
Animation and Transitions
Animation Principles
| Principle | Description |
|---|---|
| Purposeful | Animation should have purpose |
| Smooth | 60fps, no jank |
| Subtle | Don't distract |
| Fast | < 500ms for transitions |
Transition Types
| Type | Use Case |
|---|---|
| Fade | Show/hide elements |
| Slide | Move between states |
| Scale | Emphasize elements |
| Rotate | Draw attention |
Example: D3.js Animation
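```javascript
// Smooth transition to a new bar height
// (newHeight is assumed to be computed elsewhere, e.g., from a scale)
d3.select('.bar')
  .transition()
  .duration(500)
  .attr('height', newHeight);
```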
Implementation Examples
Chart.js Example
```javascript
import Chart from 'chart.js/auto';

const ctx = document.getElementById('myChart');

new Chart(ctx, {
  type: 'line',
  data: {
    labels: ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    datasets: [{
      label: 'Revenue',
      data: [100, 120, 115, 134, 168],
      borderColor: 'rgb(75, 192, 192)',
      backgroundColor: 'rgba(75, 192, 192, 0.2)',
      tension: 0.1
    }]
  },
  options: {
    responsive: true,
    plugins: {
      // Single series: the legend adds no information
      legend: {
        display: false
      },
      tooltip: {
        callbacks: {
          // Show the exact value with its unit on hover
          label: function(context) {
            return `$${context.parsed.y}K`;
          }
        }
      }
    },
    scales: {
      y: {
        beginAtZero: true,
        ticks: {
          // Format axis ticks with the same unit as the tooltip
          callback: function(value) {
            return `$${value}K`;
          }
        }
      }
    }
  }
});
```
D3.js Example
```javascript
import * as d3 from 'd3';

const data = [
  { month: 'Jan', value: 100 },
  { month: 'Feb', value: 120 },
  { month: 'Mar', value: 115 },
  { month: 'Apr', value: 134 },
  { month: 'May', value: 168 }
];

// Margin convention: the inner drawing area excludes space for the axes
const margin = {top: 20, right: 30, bottom: 40, left: 50};
const width = 800 - margin.left - margin.right;
const height = 400 - margin.top - margin.bottom;

const svg = d3.select('#chart')
  .append('svg')
  .attr('width', width + margin.left + margin.right)
  .attr('height', height + margin.top + margin.bottom)
  .append('g')
  .attr('transform', `translate(${margin.left},${margin.top})`);

// Band scale for categories, linear scale starting at zero for values
const x = d3.scaleBand()
  .domain(data.map(d => d.month))
  .range([0, width])
  .padding(0.2);

const y = d3.scaleLinear()
  .domain([0, d3.max(data, d => d.value)])
  .range([height, 0]);

svg.append('g')
  .attr('transform', `translate(0,${height})`)
  .call(d3.axisBottom(x));

svg.append('g')
  .call(d3.axisLeft(y));

svg.selectAll('.bar')
  .data(data)
  .enter()
  .append('rect')
  .attr('class', 'bar')
  .attr('x', d => x(d.month))
  .attr('width', x.bandwidth())
  // Set the fill before the transition so the color is not animated from black
  .attr('fill', '#4daf4a')
  // Start each bar collapsed on the baseline, then grow it to its value
  .attr('y', height)
  .attr('height', 0)
  .transition()
  .duration(500)
  .attr('y', d => y(d.value))
  .attr('height', d => height - y(d.value));
```
Plotly Example (Python)
```python
import plotly.express as px
import pandas as pd

data = pd.DataFrame({
    'month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'revenue': [100, 120, 115, 134, 168]
})

fig = px.line(
    data,
    x='month',
    y='revenue',
    title='Revenue Trend',
    labels={'revenue': 'Revenue ($K)', 'month': 'Month'},
    markers=True
)

# Start the y-axis at zero and show one combined tooltip per x position
fig.update_layout(
    yaxis_range=[0, 200],
    hovermode='x unified'
)

fig.update_traces(
    line=dict(width=3, color='#4daf4a'),
    marker=dict(size=8)
)

fig.show()
```
Seaborn Example (Python)
```python
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

data = pd.DataFrame({
    'region': ['North', 'South', 'East', 'West'],
    'sales': [50, 35, 25, 10]
})

plt.figure(figsize=(10, 6))

# A single color is used because bar length already encodes the value;
# passing palette= without hue= is deprecated in recent seaborn releases.
ax = sns.barplot(
    data=data,
    x='sales',
    y='region',
    color='#4daf4a'
)

ax.set_xlabel('Sales ($M)')
ax.set_ylabel('Region')
ax.set_title('Sales by Region')

# Add value labels just past the end of each bar
for i, v in enumerate(data['sales']):
    ax.text(v + 1, i, f'${v}M', va='center')

plt.tight_layout()
plt.show()
```
Summary Checklist
Before Creating Visualization
- Understand audience and goal
- Choose appropriate chart type
- Select color palette
- Plan layout
- Consider accessibility
During Creation
- Use direct labeling
- Start Y-axis at zero (if appropriate)
- Sort by value
- Remove chart junk
- Add context
---
## Quick Start
### Basic Chart with Chart.js
```javascript
// 'chart.js/auto' registers all chart types and scales automatically
import Chart from 'chart.js/auto'

const ctx = document.getElementById('myChart')
const chart = new Chart(ctx, {
  type: 'line',
  data: {
    labels: ['Jan', 'Feb', 'Mar', 'Apr'],
    datasets: [{
      label: 'Revenue',
      data: [1000, 1200, 1100, 1300],
      borderColor: 'rgb(75, 192, 192)',
      tension: 0.1
    }]
  },
  options: {
    responsive: true,
    scales: {
      y: {
        beginAtZero: true
      }
    }
  }
})
```

### D3.js Basic Bar Chart

```javascript
import * as d3 from 'd3'

const data = [10, 20, 30, 40, 50]

const svg = d3.select('body').append('svg')
  .attr('width', 400)
  .attr('height', 300)

// One 40px-wide bar every 50px; each data unit is drawn 5px tall
svg.selectAll('rect')
  .data(data)
  .enter()
  .append('rect')
  .attr('x', (d, i) => i * 50)
  .attr('y', d => 300 - d * 5)  // SVG y grows downward, so offset from the bottom
  .attr('width', 40)
  .attr('height', d => d * 5)
```
Production Checklist
- Chart Selection: Choose appropriate chart type for data
- Data Accuracy: Ensure data is accurate and up-to-date
- Accessibility: Charts accessible to screen readers
- Responsive: Charts work on all screen sizes
- Performance: Charts render efficiently with large datasets
- Color: Use color-blind friendly palettes
- Labels: Clear labels and legends
- Context: Provide context and explanations
- Interactivity: Add tooltips and interactions where helpful
- Testing: Test with real data
- Documentation: Document chart purpose and data source
- Updates: Keep charts current with data changes
Anti-patterns
❌ Don't: Misleading Y-Axis
```javascript
// ❌ Bad - Y-axis doesn't start at zero
chart.options.scales.y.min = 90 // Misleading!

// ✅ Good - Y-axis starts at zero
chart.options.scales.y.min = 0 // Accurate representation
```
❌ Don't: Chart Junk
```javascript
// ❌ Bad - Too many decorations
chart.options.plugins.legend.display = true
chart.options.plugins.title.display = true
chart.options.plugins.annotation = { /* decorations */ }
// Too much!

// ✅ Good - Clean, focused
chart.options.plugins.legend.display = true // Only if needed
// Remove unnecessary decorations
```
❌ Don't: Wrong Chart Type
```javascript
// ❌ Bad - Line chart for categories
type: 'line' // Categories don't have trends

// ✅ Good - Bar chart for categories
type: 'bar' // Better for comparing categories
```
Integration Points
- Dashboard Design (23-business-analytics/dashboard-design/) - Dashboard layouts
- KPI Metrics (23-business-analytics/kpi-metrics/) - Metric visualization
- SQL for Analytics (23-business-analytics/sql-for-analytics/) - Data queries
After Creation
- Test for accessibility
- Verify on mobile
- Check color contrast
- Get feedback
- Iterate based on feedback
