R Package
by drmowinckels
Personal R package development preferences for code style, documentation conventions, and workflow. Complements Posit r-lib skills with specific style choices for minimal comments and neuroimaging package patterns.
```yaml
name: r-package
description: >
  Personal R package development preferences for code style, documentation
  conventions, and workflow. Complements Posit r-lib skills with specific
  style choices for minimal comments and neuroimaging package patterns.
license: CC-BY-4.0
compatibility: opencode
metadata:
  language: R
  audience: package-developers
  focus: style-preferences
```
# R Package Development - Personal Preferences

Specific style preferences and workflow choices for R package development. Use alongside Posit's `testing-r-packages` skill for comprehensive testing guidance.
## Code Style Philosophy

### Self-Explanatory Code Without Comments

Functions and variables should be named clearly enough that comments are unnecessary:

```r
# Good: self-explanatory without comments
calculate_mean_cortical_thickness <- function(surface_data,
                                              region_labels,
                                              exclude_medial_wall = TRUE) {
  valid_vertices <- identify_valid_vertices(surface_data, exclude_medial_wall)
  regional_means <- compute_regional_means(surface_data, region_labels, valid_vertices)
  regional_means
}

# Bad: needs comments to explain
calc_mct <- function(sd, rl, emw = TRUE) {
  # Get valid vertices
  vv <- get_vv(sd, emw)
  # Calculate means
  rm <- calc_rm(sd, rl, vv)
  rm
}

# Exception: comments allowed for workarounds
process_freesurfer_annotation <- function(annot_file) {
  # WORKAROUND: FreeSurfer annot files have a non-standard header format
  # that readBin() misinterprets. Skip the first 4 bytes manually.
  raw_data <- readBin(annot_file, "raw", n = file.size(annot_file))
  data_without_header <- raw_data[-(1:4)]
  parse_annotation_data(data_without_header)
}
```
When comments ARE acceptable:
- Explaining workarounds for upstream bugs
- Documenting non-obvious algorithm choices with citations
- Noting technical constraints (e.g., file format quirks)
### Consistent Naming Patterns

```r
# Good: consistent verb-noun pattern
read_freesurfer_surface()
read_freesurfer_annotation()
read_freesurfer_curv()
write_freesurfer_surface()
write_freesurfer_annotation()

# Bad: inconsistent patterns
fs_read_surface()
read_annot_freesurfer()
freesurfer_curv_read()
```
Naming conventions:

- Data reading functions: `read_*`
- Data writing functions: `write_*`
- Data transformation: `transform_*`, `convert_*`
- Calculations: `calculate_*`, `compute_*`
- Checks/validation: `is_*`, `has_*`, `validate_*`
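As a concrete sketch of the check/validation family, the helpers below follow the `is_*`/`has_*`/`validate_*` pattern on the surface list structure used elsewhere in this document. All function names here are hypothetical illustrations, not a real API:

```r
# Sketch of the is_*/has_*/validate_* naming pattern (hypothetical names)
is_surface <- function(x) {
  is.list(x) && all(c("vertices", "faces") %in% names(x))
}

has_faces <- function(surface) {
  is_surface(surface) && nrow(surface$faces) > 0
}

validate_surface <- function(surface) {
  if (!is_surface(surface)) {
    stop("`surface` must be a list with `vertices` and `faces`.", call. = FALSE)
  }
  if (any(surface$faces < 1) || any(surface$faces > nrow(surface$vertices))) {
    stop("Face indices must reference existing vertices.", call. = FALSE)
  }
  invisible(surface)
}

surf <- list(
  vertices = matrix(runif(9), ncol = 3),
  faces    = matrix(c(1L, 2L, 3L), ncol = 3)
)
is_surface(surf)  # TRUE
```

The convention makes intent readable at the call site: `is_*`/`has_*` return a logical, while `validate_*` either errors or returns its input invisibly.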
## Documentation Preferences

### Minimal but Complete roxygen2

Focus on what users need, not implementation details:

```r
# Good: user-focused documentation
#' Read FreeSurfer surface file
#'
#' Reads surface geometry from FreeSurfer format files. Supports both
#' ASCII and binary formats automatically.
#'
#' @param filepath Path to FreeSurfer surface file
#'
#' @return List with two elements:
#'   * `vertices` - Nx3 matrix of vertex coordinates
#'   * `faces` - Mx3 matrix of face indices (1-indexed)
#'
#' @examples
#' surf <- read_freesurfer_surface("lh.pial")
#' n_vertices <- nrow(surf$vertices)
#'
#' @family freesurfer-io
#' @export
read_freesurfer_surface <- function(filepath) {
  # Implementation
}

# Bad: over-documented with implementation details
#' Read FreeSurfer surface file
#'
#' This function reads a FreeSurfer surface file by first checking
#' whether it is binary or ASCII format, then parsing the header to get
#' the number of vertices and faces, and finally reading the data
#' into R matrices using optimized C++ code via Rcpp.
#'
#' The function performs the following steps:
#' 1. Opens file connection
#' 2. Reads magic number
#' 3. Determines format
#' ...
#' [Users don't need implementation details in documentation]
```
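One way to keep documentation minimal across a whole `read_*` family is roxygen2's `@inheritParams`, which reuses the `@param` text from another function so shared parameters are documented once. The `read_freesurfer_curv()` stub below is an illustrative sketch, not the package's actual implementation:

```r
#' Read FreeSurfer curvature file
#'
#' @inheritParams read_freesurfer_surface
#'
#' @return Numeric vector of per-vertex curvature values.
#'
#' @family freesurfer-io
#' @export
read_freesurfer_curv <- function(filepath) {
  # Implementation
}
```

Combined with `@family`, this keeps every page in the family short while cross-linking them in the rendered documentation.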
### Example-Driven Documentation

Prioritize realistic examples over abstract descriptions:

```r
# Good: practical workflow example
#' Calculate cortical thickness statistics
#'
#' @param thickness_file Path to thickness data
#' @param parcellation_file Path to parcellation
#'
#' @examples
#' # Typical neuroimaging workflow
#' thickness <- read_freesurfer_curv("lh.thickness")
#' parcellation <- read_freesurfer_annotation("lh.aparc.annot")
#'
#' # Get mean thickness per region
#' regional_stats <- calculate_regional_thickness(
#'   thickness,
#'   parcellation,
#'   exclude_unknown = TRUE
#' )
#'
#' # Extract specific regions of interest
#' motor_thickness <- regional_stats$thickness[
#'   regional_stats$region == "precentral"
#' ]
```
## Package-Specific Patterns

### Neuroimaging Data Structures

Use a consistent structure for neuroimaging data across all functions:

```r
# Good: consistent structure across functions
structure(
  list(
    data = numeric_vector,
    metadata = list(
      n_vertices = length(numeric_vector),
      hemisphere = "left",
      structure = "pial"
    )
  ),
  class = c("fs_surface_data", "list")
)

# Bad: inconsistent ad hoc attributes
some_data <- numeric_vector
attr(some_data, "n") <- length(numeric_vector)
attr(some_data, "hemi") <- "lh"  # Inconsistent: "left" vs "lh"
```
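A small constructor can enforce that consistent structure in one place, so every `read_*` function returns the same shape. The names `new_fs_surface_data()` and the print method below are a hypothetical sketch, not the package's actual API:

```r
# Hypothetical constructor enforcing the consistent structure above
new_fs_surface_data <- function(data, hemisphere, structure_name) {
  stopifnot(is.numeric(data))
  structure(
    list(
      data = data,
      metadata = list(
        n_vertices = length(data),
        hemisphere = hemisphere,   # always "left"/"right", never "lh"/"rh"
        structure = structure_name
      )
    ),
    class = c("fs_surface_data", "list")
  )
}

# A print method keeps console output consistent too
print.fs_surface_data <- function(x, ...) {
  cat(
    "<fs_surface_data>", x$metadata$hemisphere, x$metadata$structure,
    "-", x$metadata$n_vertices, "vertices\n"
  )
  invisible(x)
}

thickness <- new_fs_surface_data(runif(5), "left", "pial")
class(thickness)  # "fs_surface_data" "list"
```

Centralizing construction means the "left" vs "lh" inconsistency shown above cannot creep in, because no function builds the structure by hand.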
### Graceful Handling of Missing Data

```r
# Good: explicit handling with informative messages
read_subject_data <- function(subjects_dir, subject_id, measure) {
  filepath <- file.path(subjects_dir, subject_id, "surf", measure)
  if (!file.exists(filepath)) {
    stop(
      "Could not find ", measure, " for subject ", subject_id, "\n",
      "Expected at: ", filepath, "\n",
      "Check that FreeSurfer has been run for this subject.",
      call. = FALSE
    )
  }
  read_freesurfer_data(filepath)
}

# Bad: cryptic errors
read_subject_data <- function(subjects_dir, subject_id, measure) {
  read_freesurfer_data(file.path(subjects_dir, subject_id, "surf", measure))
  # The low-level read error propagates with an unhelpful message
}
```
## Testing Preferences

### Test Coverage Focus

Prioritize testing of:

- Public API: all exported functions
- File I/O: reading/writing different formats
- Edge cases: empty data, missing values, malformed input
- Data transformations: coordinate conversions, index remapping

De-prioritize testing of:

- Simple getters/setters
- Internal helper functions with trivial logic
- Functions that only call other packages' functions
```r
# Good: focus on behavior and edge cases
describe("read_freesurfer_surface()", {
  it("reads binary format surfaces", {
    surf <- read_freesurfer_surface(test_path("fixtures/lh.pial"))
    expect_equal(nrow(surf$vertices), 163842)
    expect_equal(ncol(surf$vertices), 3)
  })

  it("handles missing files gracefully", {
    expect_error(
      read_freesurfer_surface("nonexistent.file"),
      "Could not find"
    )
  })

  it("validates surface structure", {
    surf <- read_freesurfer_surface(test_path("fixtures/lh.pial"))
    expect_true(all(surf$faces > 0))
    expect_true(all(surf$faces <= nrow(surf$vertices)))
  })
})

# Less important: testing trivial functions
test_that("get_n_vertices returns vertex count", {
  surf <- list(vertices = matrix(1:9, ncol = 3))
  expect_equal(get_n_vertices(surf), 3)
  # This is too simple to need a test
})
```
## Workflow Integration

### Development Cycle

```r
# Typical development workflow
devtools::load_all()   # Load package
devtools::document()   # Update documentation
devtools::test()       # Run tests
devtools::check()      # R CMD check

# Before committing
styler::style_pkg()        # Format code
lintr::lint_package()      # Check style
covr::package_coverage()   # Check coverage
```
### Pre-CRAN Checklist

Beyond standard checks, verify:

```r
# Check that examples run
devtools::run_examples()

# Verify vignettes build
devtools::build_vignettes()

# Test on multiple platforms
devtools::check_win_devel()
devtools::check_mac_release()

# Spell check
spelling::spell_check_package()

# Ensure URLs work
urlchecker::url_check()

# Check reverse dependencies if updating an existing package
revdepcheck::revdep_check()
```
## Works Well With

- `testing-r-packages` (Posit): comprehensive testthat 3+ testing guide (use this for all testing patterns)
- `release-post` (Posit): create release announcements
- `brand-yml` (Posit): pkgdown site branding
## When to Use Me

Use this skill when:

- Setting up a new R package and wanting style guidance
- Needing preferences for code organization and naming
- Working with neuroimaging or FreeSurfer data packages
- Wanting to understand the "no comments" philosophy

Do NOT use for:

- Testing patterns (use Posit `testing-r-packages` instead)
- General R package structure (covered in Posit skills)
- CRAN submission procedures (standard across packages)
## Quick Reference

- **Code style:** self-explanatory names, no comments except workarounds
- **Naming:** `verb_noun()` pattern, consistent prefixes
- **Documentation:** user-focused, example-driven
- **Testing focus:** public API, file I/O, edge cases
- **Workflow:** `load_all()` → `document()` → `test()` → `check()`
