R Package

by drmowinckels

Tags: code · development · document · workflow

Personal R package development preferences for code style, documentation conventions, and workflow. Complements Posit r-lib skills with specific style choices for minimal comments and neuroimaging package patterns.


name: r-package
description: Personal R package development preferences for code style, documentation conventions, and workflow. Complements Posit r-lib skills with specific style choices for minimal comments and neuroimaging package patterns.
license: CC-BY-4.0
compatibility: opencode
metadata:
  language: R
  audience: package-developers
  focus: style-preferences

R Package Development - Personal Preferences

Specific style preferences and workflow choices for R package development. Use alongside the Posit testing-r-packages skill for comprehensive testing guidance.

Code Style Philosophy

Self-Explanatory Code Without Comments

Functions and variables should be named clearly enough that comments are unnecessary:

# Good: Self-explanatory without comments
calculate_mean_cortical_thickness <- function(surface_data, 
                                               region_labels,
                                               exclude_medial_wall = TRUE) {
  valid_vertices <- identify_valid_vertices(surface_data, exclude_medial_wall)
  regional_means <- compute_regional_means(surface_data, region_labels, valid_vertices)
  regional_means
}

# Bad: Needs comments to explain
calc_mct <- function(sd, rl, emw = TRUE) {
  # Get valid vertices
  vv <- get_vv(sd, emw)
  # Calculate means
  rm <- calc_rm(sd, rl, vv)
  rm
}

# Exception: Comments allowed for workarounds
process_freesurfer_annotation <- function(annot_file) {
  # WORKAROUND: FreeSurfer annot files have non-standard header format
  # that readBin() misinterprets. Skip first 4 bytes manually.
  raw_data <- readBin(annot_file, "raw", n = file.size(annot_file))
  data_without_header <- raw_data[-(1:4)]
  parse_annotation_data(data_without_header)
}

When comments ARE acceptable:

  • Explaining workarounds for upstream bugs
  • Documenting non-obvious algorithm choices with citations
  • Noting technical constraints (e.g., file format quirks)

Consistent Naming Patterns

# Good: Consistent verb-noun pattern
read_freesurfer_surface()
read_freesurfer_annotation()
read_freesurfer_curv()

write_freesurfer_surface()
write_freesurfer_annotation()

# Bad: Inconsistent patterns
fs_read_surface()
read_annot_freesurfer()
freesurfer_curv_read()

Naming conventions:

  • Data reading functions: read_*
  • Data writing functions: write_*
  • Data transformation: transform_*, convert_*
  • Calculations: calculate_*, compute_*
  • Checks/validation: is_*, has_*, validate_*
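As a sketch of the check/validation convention, an is_* predicate returns a single logical while the matching validate_* function errors with an informative message. The function names here are hypothetical illustrations, not part of any published package:

```r
# Hypothetical is_* / validate_* pair:
# is_* answers TRUE/FALSE, validate_* stops with a message on failure.
is_hemisphere <- function(x) {
  is.character(x) && length(x) == 1 && x %in% c("left", "right")
}

validate_hemisphere <- function(x) {
  if (!is_hemisphere(x)) {
    stop(
      "`hemisphere` must be \"left\" or \"right\", not ", deparse(x),
      call. = FALSE
    )
  }
  invisible(x)
}

is_hemisphere("left")  # TRUE
is_hemisphere("lh")    # FALSE
```

Keeping the predicate and the validator as a pair means internal code can branch cheaply on is_* while user-facing entry points call validate_* for clear errors.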

Documentation Preferences

Minimal but Complete roxygen2

Focus on what users need, not implementation details:

# Good: User-focused documentation
#' Read FreeSurfer surface file
#'
#' Reads surface geometry from FreeSurfer format files. Supports both
#' ASCII and binary formats automatically.
#'
#' @param filepath Path to FreeSurfer surface file
#'
#' @return List with two elements:
#'   * `vertices` - Nx3 matrix of vertex coordinates
#'   * `faces` - Mx3 matrix of face indices (1-indexed)
#'
#' @examples
#' surf <- read_freesurfer_surface("lh.pial")
#' n_vertices <- nrow(surf$vertices)
#'
#' @family freesurfer-io
#' @export
read_freesurfer_surface <- function(filepath) {
  # Implementation
}

# Bad: Over-documented with implementation details
#' Read FreeSurfer surface file
#'
#' This function reads a FreeSurfer surface file by first checking
#' if it's binary or ASCII format, then parsing the header to get
#' the number of vertices and faces, and finally reading the data
#' into R matrices using optimized C++ code via Rcpp.
#'
#' The function performs the following steps:
#' 1. Opens file connection
#' 2. Reads magic number
#' 3. Determines format
#' ...
#' [Users don't need implementation details in documentation]

Example-Driven Documentation

Prioritize realistic examples over abstract descriptions:

# Good: Practical workflow example
#' Calculate cortical thickness statistics
#'
#' @param thickness_file Path to thickness data
#' @param parcellation_file Path to parcellation
#'
#' @examples
#' # Typical neuroimaging workflow
#' thickness <- read_freesurfer_curv("lh.thickness")
#' parcellation <- read_freesurfer_annotation("lh.aparc.annot")
#'
#' # Get mean thickness per region
#' regional_stats <- calculate_regional_thickness(
#'   thickness,
#'   parcellation,
#'   exclude_unknown = TRUE
#' )
#'
#' # Extract specific regions of interest
#' motor_thickness <- regional_stats$thickness[
#'   regional_stats$region == "precentral"
#' ]

Package-Specific Patterns

Neuroimaging Data Structures

Consistent structure for neuroimaging data:

# Good: Consistent structure across functions
structure(
  list(
    data = numeric_vector,
    metadata = list(
      n_vertices = length(numeric_vector),
      hemisphere = "left",
      structure = "pial"
    )
  ),
  class = c("fs_surface_data", "list")
)

# Bad: Inconsistent attributes
some_data <- numeric_vector
attr(some_data, "n") <- length(numeric_vector)
attr(some_data, "hemi") <- "lh"  # Inconsistent: "left" vs "lh"
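One way to keep that structure consistent across functions is a small constructor plus a print method. This is a sketch under the assumption of an fs_surface_data class like the one above; the constructor name follows the new_* convention but is hypothetical:

```r
# Hypothetical constructor enforcing the consistent structure above
new_fs_surface_data <- function(data, hemisphere, structure_name) {
  stopifnot(is.numeric(data), hemisphere %in% c("left", "right"))
  structure(
    list(
      data = data,
      metadata = list(
        n_vertices = length(data),
        hemisphere = hemisphere,
        structure = structure_name
      )
    ),
    class = c("fs_surface_data", "list")
  )
}

# Compact print method so objects summarise themselves at the console
print.fs_surface_data <- function(x, ...) {
  cat(
    "<fs_surface_data>", x$metadata$n_vertices, "vertices,",
    x$metadata$hemisphere, "hemisphere,", x$metadata$structure, "\n"
  )
  invisible(x)
}

thickness <- new_fs_surface_data(runif(100), "left", "pial")
print(thickness)
```

Routing all object creation through one constructor is what prevents the "left" vs "lh" drift shown in the bad example.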

Graceful Handling of Missing Data

# Good: Explicit handling with informative messages
read_subject_data <- function(subjects_dir, subject_id, measure) {
  filepath <- file.path(subjects_dir, subject_id, "surf", measure)
  
  if (!file.exists(filepath)) {
    stop(
      "Could not find ", measure, " for subject ", subject_id, "\n",
      "Expected at: ", filepath, "\n",
      "Check that FreeSurfer has been run for this subject.",
      call. = FALSE
    )
  }
  
  read_freesurfer_data(filepath)
}

# Bad: Cryptic errors
read_subject_data <- function(subjects_dir, subject_id, measure) {
  read_freesurfer_data(file.path(subjects_dir, subject_id, "surf", measure))
  # Lets the low-level reader fail with an unhelpful message
}
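The existence check above can be factored into a shared helper so every read_* function fails the same way. A minimal sketch; stop_if_missing_file() is a hypothetical helper name, and read_freesurfer_data() is assumed to exist elsewhere in the package:

```r
# Hypothetical shared helper: one informative error for all read_* functions
stop_if_missing_file <- function(filepath, description) {
  if (!file.exists(filepath)) {
    stop(
      "Could not find ", description, "\n",
      "Expected at: ", filepath,
      call. = FALSE
    )
  }
  invisible(filepath)
}

read_subject_data <- function(subjects_dir, subject_id, measure) {
  filepath <- file.path(subjects_dir, subject_id, "surf", measure)
  stop_if_missing_file(
    filepath,
    paste(measure, "for subject", subject_id)
  )
  # read_freesurfer_data() is assumed to exist elsewhere in the package
  read_freesurfer_data(filepath)
}
```

Centralising the check keeps error wording consistent across the package and gives tests a single message pattern to assert on.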

Testing Preferences

Test Coverage Focus

Prioritize testing of:

  1. Public API - All exported functions
  2. File I/O - Reading/writing different formats
  3. Edge cases - Empty data, missing values, malformed input
  4. Data transformations - Coordinate conversions, index remapping

De-prioritize testing of:

  • Simple getters/setters
  • Internal helper functions with trivial logic
  • Functions that only call other packages' functions

# Good: Focus on behavior and edge cases
describe("read_freesurfer_surface()", {
  it("reads binary format surfaces", {
    surf <- read_freesurfer_surface(test_path("fixtures/lh.pial"))
    expect_equal(nrow(surf$vertices), 163842)
    expect_equal(ncol(surf$vertices), 3)
  })
  
  it("handles missing files gracefully", {
    expect_error(
      read_freesurfer_surface("nonexistent.file"),
      "Could not find"
    )
  })
  
  it("validates surface structure", {
    surf <- read_freesurfer_surface(test_path("fixtures/lh.pial"))
    expect_true(all(surf$faces > 0))
    expect_true(all(surf$faces <= nrow(surf$vertices)))
  })
})

# Less important: Testing trivial functions
test_that("get_n_vertices returns vertex count", {
  surf <- list(vertices = matrix(1:9, ncol = 3))
  expect_equal(get_n_vertices(surf), 3)
  # This is too simple to need testing
})

Workflow Integration

Development Cycle

# Typical development workflow
devtools::load_all()              # Load package
devtools::document()              # Update documentation
devtools::test()                  # Run tests
devtools::check()                 # R CMD check

# Before committing
styler::style_pkg()               # Format code
lintr::lint_package()             # Check style
covr::package_coverage()          # Check coverage

Pre-CRAN Checklist

Beyond standard checks, verify:

# Check that examples run
devtools::run_examples()

# Verify vignettes build
devtools::build_vignettes()

# Test on multiple platforms
devtools::check_win_devel()
devtools::check_mac_release()

# Spell check
spelling::spell_check_package()

# Ensure URLs work
urlchecker::url_check()

# Check reverse dependencies if updating existing package
revdepcheck::revdep_check()

Works Well With

  • testing-r-packages (Posit) - Comprehensive testthat 3+ testing guide (use this for all testing patterns)
  • release-post (Posit) - Create release announcements
  • brand-yml (Posit) - pkgdown site branding

When to Use Me

Use this skill when:

  • Setting up a new R package and want style guidance
  • Need preferences for code organization and naming
  • Working with neuroimaging or FreeSurfer data packages
  • Want to understand the "no comments" philosophy

Do NOT use for:

  • Testing patterns (use Posit testing-r-packages instead)
  • General R package structure (covered in Posit skills)
  • CRAN submission procedures (standard across packages)

Quick Reference

Code style: Self-explanatory names, no comments except workarounds

Naming: verb_noun() pattern, consistent prefixes

Documentation: User-focused, example-driven

Testing focus: Public API, file I/O, edge cases

Workflow: load_all() → document() → test() → check()


Skill Information

Category: Technical
License: CC-BY-4.0
Last Updated: 1/16/2026