Workflow Pipeline Reference

This reference document provides complete technical details of the pyama processing workflow, including algorithm specifications, data formats, and implementation details for creating plugins or reproducing the pipeline in other systems.

Overview

The pyama workflow takes time-lapse microscopy images through seven sequential steps to extract cell traces with quantitative features. It operates on individual Fields of View (FOVs) and processes them in configurable batches for efficiency.

Processing Order:

  1. Copying - Extract frames from microscopy files (ND2/CZI) to Zarr format

  2. Segmentation - Identify cell boundaries using phase contrast images (requires PC channel)

  3. Tracking - Track cells across time points using consistent cell IDs (requires PC channel)

  4. Background Estimation - Estimate background fluorescence using tiled interpolation (requires FL channels)

  5. Cropping - Extract cell bounding box crops from tracked segmentation (requires PC channel)

  6. Extraction - Extract quantitative features and generate trace CSV files (always runs if PC configured)

  7. Caching - Generate visualization levels (/1/) for all channels after processing completes

Channel-Conditional Behavior:

  • No PC channel configured: Segmentation, tracking, cropping, and extraction are skipped

  • No FL channels configured: Background estimation is skipped automatically

  • PC channel with no features: Extraction still outputs base fields (cell, frame, position, bbox) for tracking

  • Copy-only mode: Processing stages 2-7 are skipped when config.params.copy_only is True

Input Requirements

Microscopy Data

  • Input format: ND2 or CZI files containing time-lapse microscopy images

  • Multiple FOVs: Supported and processed in parallel

  • Time-lapse: Each FOV contains multiple time frames

  • Multi-channel: One phase contrast channel and, optionally, one or more fluorescence channels

Channel Configuration

  • Phase Contrast (PC) Channel: Required for segmentation. One channel specified for cell boundary detection.

  • Fluorescence (FL) Channels: Optional, one or more channels specified for feature extraction.

Processing Context

  • Output directory paths

  • Channel configurations (Channels dataclass)

  • Processing parameters

  • Configuration saved to processing_config.yaml

  • FOV outputs discovered by naming convention (fov_XXX/)

Workflow Steps

Step 1: CopyingService

Purpose: Extract raw image frames from microscopy file formats and save to Zarr format for efficient access.

Input

  • Microscopy file path (ND2 or CZI)

  • FOV index range

  • Channel configuration

Processing Algorithm

For each FOV and specified channel:

  1. Load frames sequentially from microscopy file

  2. Create or open Zarr store: fov_{fov:03d}/images.zarr

  3. Write channel data to {pc|fl}_ch_{channel_id}/0/ where:

    • 0 = full resolution level

    • Data type: uint16 for raw pixel values

    • Dimensions: (T, H, W) where T = number of time frames, H,W = image dimensions

  4. Channels are stored as separate groups in the Zarr hierarchy

Output Format

  • Phase contrast: fov_{fov:03d}/images.zarr/pc_ch_{pc_id}/0/

  • Fluorescence: fov_{fov:03d}/images.zarr/fl_ch_{fl_id}/0/

Implementation Notes

  • Runs sequentially per batch to avoid I/O bottlenecks

  • Zarr format provides efficient chunked storage and compression

  • Existing channels are detected and skipped (supports resuming)

  • Visualization level (/1/) is generated later by CachingService
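
As an illustration of the naming convention above, the following minimal sketch builds the relative Zarr array path for a given FOV and channel. The helper name channel_array_path and its kind and level arguments are hypothetical and not part of pyama's API; only the path layout is taken from this reference.

```python
from pathlib import PurePosixPath


def channel_array_path(fov, kind, channel_id, level=0):
    """Build the relative path of one channel array (hypothetical helper).

    kind is "pc" or "fl"; level 0 is full resolution, level 1 is the
    visualization level written later by CachingService.
    """
    if kind not in ("pc", "fl"):
        raise ValueError(f"unknown channel kind: {kind!r}")
    return (
        PurePosixPath(f"fov_{fov:03d}")
        / "images.zarr"
        / f"{kind}_ch_{channel_id}"
        / str(level)
    )


print(channel_array_path(7, "pc", 0))  # fov_007/images.zarr/pc_ch_0/0
```

Checking whether such a path already exists is one plausible way to implement the resume behavior described in the notes, although the actual detection logic is not specified here.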

Step 2: SegmentationService

Purpose: Identify cell boundaries in each frame of the phase contrast images using the LOG-STD method.

Channel Requirements: Requires PC channel. Skips with warning if no PC channel configured.

Input

  • Phase contrast stack from Step 1: fov_{fov}/images.zarr/pc_ch_{pc_id}/0/ - (T, H, W) of uint16

LOG-STD Algorithm

For each time frame t:

  1. Local Standard Deviation:

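    image = image.astype(np.float64)  # avoid uint16 overflow in image**2
    local_mean = uniform_filter(image, size=window_size)
    local_var = uniform_filter(image**2, size=window_size) - local_mean**2
    local_var = np.maximum(local_var, 1e-12)  # assumed floor: keeps log() finite
    logstd = 0.5 * np.log(local_var)  # log of the local standard deviation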
    
  2. Automatic Thresholding:

    • Build histogram of log-STD values

    • Find valley threshold between background/cell modes

    • Binary mask: binary = logstd > threshold

  3. Morphological Cleanup:

    mask = binary_closing(binary, structure=disk(7), iterations=3)
    mask = remove_small_objects(mask)
    mask = binary_fill_holes(mask)
    mask = binary_opening(mask, structure=disk(7), iterations=3)
    

Output

  • Labeled segmentation: fov_{fov}/images.zarr/seg_labeled_ch_{pc_id}/0/

  • Format: (T, H, W) of uint16

  • 0 = background, 1-N = cell IDs (frame-specific)

  • Visualization level (/1/) generated later by CachingService

Algorithm Characteristics

  • Computes per-pixel local intensity variation

  • LOG-STD is effective for phase contrast where boundaries create local intensity changes

  • Parameters: window size (default from neighborhood), number of iterations (default: 3)
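
The automatic thresholding step does not spell out how the valley is found. One simple way to locate a valley between two histogram modes is sketched below, assuming a clearly bimodal distribution. The function name valley_threshold, the bin count, and the smoothing width are illustrative assumptions, and pyama's own threshold selection may differ.

```python
import numpy as np


def valley_threshold(values, bins=64, smooth=5):
    """Return a threshold at the histogram minimum between two modes.

    Heuristic sketch: the lower and upper halves of the value range are
    each assumed to contain one mode (background and cell texture).
    """
    hist, edges = np.histogram(np.ravel(values), bins=bins)
    # Light box smoothing so single noisy bins do not create false valleys
    kernel = np.ones(smooth) / smooth
    hist = np.convolve(hist, kernel, mode="same")
    centers = 0.5 * (edges[:-1] + edges[1:])

    half = bins // 2
    lo_peak = int(np.argmax(hist[:half]))
    hi_peak = half + int(np.argmax(hist[half:]))
    # Lowest smoothed bin between the two peaks
    valley = lo_peak + int(np.argmin(hist[lo_peak:hi_peak + 1]))
    return float(centers[valley])


# Synthetic log-STD values: a flat background mode and a textured cell mode
rng = np.random.default_rng(0)
logstd = np.concatenate([rng.normal(0.0, 0.5, 5000), rng.normal(5.0, 0.5, 5000)])
threshold = valley_threshold(logstd)
binary = logstd > threshold
```

On real frames the two modes can overlap heavily, in which case a more robust criterion than this half-range heuristic would be needed.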

Step 3: TrackingService

Purpose: Track cells across time frames by assigning consistent cell IDs using Intersection over Union (IoU).

Channel Requirements: Requires PC channel. Skips with warning if no PC channel configured.

Input

  • Labeled segmentation from Step 2: fov_{fov}/images.zarr/seg_labeled_ch_{pc_id}/0/ - (T, H, W) of uint16

IoU-based Hungarian Assignment Algorithm

Per-frame Processing:

  1. Extract Regions:

    from skimage.measure import regionprops_table
    
    props = regionprops_table(labeled_frame, 
                            properties=['label', 'area', 'bbox'])
    
  2. IoU Cost Matrix:

    import numpy as np
    
    # Calculate IoU for all current vs previous region pairs
    cost_matrix = np.zeros((n_current, n_previous))
    for i, current in enumerate(current_regions):
        for j, prev in enumerate(prev_regions):
            iou = calculate_iou(current.bbox, prev.bbox)
            cost_matrix[i, j] = 1.0 - iou  # Convert similarity to a cost
    
  3. Hungarian Assignment:

    from scipy.optimize import linear_sum_assignment
    
    row_ind, col_ind = linear_sum_assignment(cost_matrix)
    
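    # Apply minimum IoU threshold
    for r, c in zip(row_ind, col_ind):
        if (1.0 - cost_matrix[r, c]) < min_iou:
            # Mark as new cell
            assign_new_id(r)
        else:
            # Assign previous ID
            assign_previous_id(r, c)
    
    # Current regions left without a partner (n_current > n_previous) are new cells too
    for r in set(range(n_current)) - set(row_ind):
        assign_new_id(r)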
    
  4. Cell ID Management:

    • Frame 0: Assign new IDs 1, 2, 3, …

    • Frame n: Matched cells inherit IDs, new cells get new IDs

    • Disappeared cells: terminate trace

Output

  • Labeled tracking: fov_{fov}/images.zarr/seg_tracked_ch_{pc_id}/0/

  • Format: (T, H, W) of uint16

  • 0 = background, 1-N = cell IDs (consistent across frames)

  • Visualization level (/1/) generated later by CachingService

Implementation Details

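  • IoU is computed on bounding boxes, an approximation chosen for performance

  • A minimum IoU threshold, min_iou (default 0.1), filters out low-quality matches

  • The Hungarian algorithm guarantees a globally optimal assignment

  • Supports min_size/max_size filtering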
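
The listings above call calculate_iou without defining it. The sketch below shows one way the bounding-box IoU and the thresholded assignment could fit together. The names bbox_iou and assign_ids are hypothetical, and the box convention (y0, x0, y1, x1) with exclusive upper bounds is an assumption that matches skimage's regionprops bbox; the actual pyama implementation may differ.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def bbox_iou(a, b):
    """IoU of two boxes (y0, x0, y1, x1) with exclusive upper bounds."""
    inter_h = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def assign_ids(prev_boxes, prev_ids, curr_boxes, next_id, min_iou=0.1):
    """Give each current box a previous ID or a fresh one; return (ids, next_id)."""
    ids = [None] * len(curr_boxes)
    if prev_boxes and curr_boxes:
        cost = np.array([[1.0 - bbox_iou(c, p) for p in prev_boxes] for c in curr_boxes])
        rows, cols = linear_sum_assignment(cost)
        for r, c in zip(rows, cols):
            if 1.0 - cost[r, c] >= min_iou:
                ids[r] = prev_ids[c]
    for i, cell_id in enumerate(ids):
        if cell_id is None:  # unmatched or below threshold: treat as a new cell
            ids[i] = next_id
            next_id += 1
    return ids, next_id


prev_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
curr_boxes = [(21, 21, 31, 31), (1, 1, 11, 11), (50, 50, 60, 60)]
ids, next_id = assign_ids(prev_boxes, [1, 2], curr_boxes, next_id=3)
print(ids, next_id)  # [2, 1, 3] 4
```

In this example the two shifted boxes inherit IDs 2 and 1, while the third box overlaps nothing and receives the fresh ID 3.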

Step 4: BackgroundEstimationService

Purpose: Estimate background fluorescence using tiled interpolation for each frame.

Channel Requirements: Requires FL channels. Skips automatically if no FL channels configured.

Input

  • Labeled segmentation from Step 2: fov_{fov}/images.zarr/seg_labeled_ch_{pc_id}/0/ - (T, H, W) of uint16

  • Raw fluorescence from Step 1: fov_{fov}/images.zarr/fl_ch_{fl_id}/0/ - (T, H, W) of uint16

Tiled Interpolation Algorithm

For each fluorescence channel and frame t:

  1. Mask Foreground:

    dilated = binary_dilation(seg_labeled, disk(10))
    masked = np.where(dilated, np.nan, fluorescence_image)
    
  2. Tile Medians:

    • Divide frame into overlapping tiles (typical: 50-100 px)

    • Compute median of non-NaN pixels in each tile

    • Handle tiles with insufficient background via interpolation

  3. Interpolate Background:

    from scipy.interpolate import RectBivariateSpline
    
    # Grid of tile medians, assumed shape (len(tile_centers_y), len(tile_centers_x))
    z_grid = tile_medians
    
    # Interpolate to full resolution. RectBivariateSpline expects the 1-D,
    # strictly increasing axis coordinates rather than a raveled meshgrid.
    spline = RectBivariateSpline(tile_centers_y, tile_centers_x, z_grid)
    background = spline(np.arange(H), np.arange(W))  # shape (H, W)
    

Output

  • Background stack: fov_{fov}/images.zarr/fl_background_ch_{fl_id}/0/

  • Format: (T, H, W) of float32

  • Ready for correction during extraction

  • Visualization level (/1/) generated later by CachingService

Algorithm Notes

  • Each fluorescence channel processed independently

  • Background saved separately for flexible correction weights

  • Tile size configurable (default: 50-100 px with overlap)

  • Interpolation preserves spatial variation patterns
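
The three steps of the tiled interpolation can be combined into a short, runnable sketch. It makes several simplifying assumptions that are not taken from pyama: tiles do not overlap, tiles with no background pixels are filled with the global background median, and the spline is linear (kx=ky=1). The function name estimate_background is hypothetical.

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline


def estimate_background(image, foreground, tile=32):
    """Estimate a smooth background for one frame (illustrative sketch).

    image: (H, W) fluorescence frame; foreground: (H, W) bool mask of
    (dilated) cells. Needs at least two tiles per axis for the spline.
    """
    H, W = image.shape
    masked = np.where(foreground, np.nan, image.astype(np.float64))
    ys = np.arange(0, H, tile)
    xs = np.arange(0, W, tile)
    medians = np.full((len(ys), len(xs)), np.nan)
    for i, y in enumerate(ys):
        for j, x in enumerate(xs):
            block = masked[y:y + tile, x:x + tile]
            if np.isfinite(block).any():
                medians[i, j] = np.nanmedian(block)
    # Assumed fallback: tiles fully covered by cells get the global median
    medians[np.isnan(medians)] = np.nanmedian(masked)

    # Tile centers (strictly increasing), clipped tiles handled at the edges
    cy = ys + np.minimum(tile, H - ys) / 2.0
    cx = xs + np.minimum(tile, W - xs) / 2.0
    spline = RectBivariateSpline(cy, cx, medians, kx=1, ky=1)
    return spline(np.arange(H), np.arange(W)).astype(np.float32)


frame = np.full((128, 128), 100, dtype=np.uint16)
frame[40:60, 40:60] = 5000                      # one bright cell
cells = np.zeros((128, 128), dtype=bool)
cells[35:65, 35:65] = True                      # dilated cell mask
background = estimate_background(frame, cells)
```

Because the bright cell is masked out before the medians are taken, the estimated background stays at the flat level of 100 across the whole frame.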

Step 5: CroppingService

Purpose: Extract cell bounding box crops from tracked segmentation for efficient feature extraction.

Channel Requirements: Requires PC channel. Skips with warning if no PC channel configured.

Input

  • Tracked segmentation from Step 3: fov_{fov}/images.zarr/seg_tracked_ch_{pc_id}/0/ - (T, H, W) of uint16

  • Phase contrast from Step 1: fov_{fov}/images.zarr/pc_ch_{pc_id}/0/ - (T, H, W) of uint16

  • Fluorescence channels (optional): fov_{fov}/images.zarr/fl_ch_{fl_id}/0/ - (T, H, W) of uint16

  • Background channels (optional): fov_{fov}/images.zarr/fl_background_ch_{fl_id}/0/ - (T, H, W) of float32

Processing Algorithm

For each tracked cell:

  1. Extract bounding boxes across all frames where cell is present

  2. Crop regions from all configured channels (PC + FL)

  3. Apply background correction to FL crops if available

  4. Store crops in per-cell structure: fov_{fov}/cells.zarr/{channel_name}/{cell_id}/0/

Output Format

  • Cell crops: fov_{fov}/cells.zarr/{channel_name}/{cell_id}/0/ - (T, H, W) of float32 (normalized [0,1])

  • Visualization level: fov_{fov}/cells.zarr/{channel_name}/{cell_id}/1/ - (T, H, W) of uint8

  • Metadata: fov_{fov}/cells.zarr/metadata/ - cell_ids, bboxes, valid_frames

Implementation Notes

  • Works with PC-only data: crops PC channel only

  • With FL configured: crops both PC and FL channels, applies background if available

  • Creates cells.zarr with per-cell structure for efficient feature extraction

  • Essential for extraction step - provides cropped regions
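
As a rough illustration of the first processing step, the sketch below derives per-frame bounding boxes and a validity flag for a single tracked cell, in the same [y0, x0, y1, x1] layout used by the metadata arrays. The function name cell_bboxes and the choice of exclusive upper bounds are assumptions; how pyama sizes the stored (T, H, W) crops is not specified here.

```python
import numpy as np


def cell_bboxes(tracked, cell_id):
    """Return (bboxes, valid) for one cell in a (T, H, W) tracked label stack.

    bboxes is (T, 4) int32 as [y0, x0, y1, x1] with exclusive upper bounds;
    valid is (T,) bool and False where the cell is absent.
    """
    n_frames = tracked.shape[0]
    bboxes = np.zeros((n_frames, 4), dtype=np.int32)
    valid = np.zeros(n_frames, dtype=bool)
    for t in range(n_frames):
        ys, xs = np.nonzero(tracked[t] == cell_id)
        if ys.size:
            bboxes[t] = (ys.min(), xs.min(), ys.max() + 1, xs.max() + 1)
            valid[t] = True
    return bboxes, valid


tracked = np.zeros((3, 8, 8), dtype=np.uint16)
tracked[0, 2:5, 1:4] = 7      # cell 7 present in frame 0
tracked[2, 3:6, 2:5] = 7      # absent in frame 1, back in frame 2
bboxes, valid = cell_bboxes(tracked, 7)

y0, x0, y1, x1 = bboxes[0]
crop = tracked[0, y0:y1, x0:x1]   # crop of frame 0 for this cell
```

The same slice can be applied to the PC, FL, and background stacks so that all channels of a cell share one coordinate frame.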

Step 6: ExtractionService

Purpose: Extract quantitative features for each tracked cell at each time point and generate CSV traces.

Channel Requirements: Always runs if PC channel is configured, even with empty features. Creates empty CSV if no channels configured.

Input

  • Cell crops from Step 5: fov_{fov}/cells.zarr/{channel_name}/{cell_id}/0/ - (T, H, W) of float32

  • Metadata from Step 5: fov_{fov}/cells.zarr/metadata/ - bboxes, valid_frames

  • Feature configuration list from config

Feature Extraction Algorithm

For each FOV, cell, and time point:

  1. Load Cell Crop:

    # Load cropped region for this cell and frame
    cell_crop = cells_zarr[f"{channel_name}/{cell_id}/0"][frame_idx]
    bbox = metadata["bboxes"][cell_idx, frame_idx]  # [y0, x0, y1, x1]
    
  2. Base Features (Always Computed):

    # Base fields are always extracted, regardless of channel configs
    row['fov'] = fov_id
    row['cell'] = cell_id
    row['frame'] = frame_index
    row['good'] = metadata["valid_frames"][cell_idx, frame_idx]
    row['position_x'] = (bbox[1] + bbox[3]) / 2  # Centroid x
    row['position_y'] = (bbox[0] + bbox[2]) / 2  # Centroid y
    row['bbox_x0'] = bbox[1]  # x0
    row['bbox_y0'] = bbox[0]  # y0
    row['bbox_x1'] = bbox[3]  # x1
    row['bbox_y1'] = bbox[2]  # y1
    
  3. Channel-Specific Features:

    # Phase contrast features (if configured)
    if pc_features:
        mask = cell_crop > 0  # Binary mask from crop
        if 'area' in pc_features:
            row[f'area_ch_{pc_channel}'] = np.sum(mask)
        
        if 'aspect_ratio' in pc_features:
            ellipse = regionprops(mask.astype(int))[0]
            row[f'aspect_ratio_ch_{pc_channel}'] = ellipse.major_axis_length / ellipse.minor_axis_length
    
    # Fluorescence features with background correction (if configured)
    if fl_features:
        raw_intensity = np.sum(cell_crop)
        if background_available:
            background_crop = cells_zarr[f"fl_background_ch_{fl_id}/{cell_id}/0"][frame_idx]
            background_intensity = np.sum(background_crop * background_weight)
            corrected_intensity = raw_intensity - background_intensity
        else:
            corrected_intensity = raw_intensity
        
        row[f'intensity_total_ch_{fl_id}'] = corrected_intensity
    
  4. Background Correction:

    # Configurable weight from config.params.background_weight, clipped to [0, 1]
    background_weight = np.clip(config.params.background_weight, 0.0, 1.0)
    
    # Applied during extraction; here background_intensity is the unweighted
    # sum of the background crop, so the weight is applied exactly once
    corrected_intensity = raw_intensity - background_weight * background_intensity
    

Quality Filtering

  1. Trace Length Filter:

    min_frames = params.get('min_frames', 30)
    trace_lengths = calculate_trace_lengths(traces)
    filtered = traces[trace_lengths >= min_frames]
    
  2. Border Filter:

    border_margin = params.get('border_margin', 50)
    
    def on_border(mask):
        return np.any(mask[:border_margin, :]) or \
               np.any(mask[-border_margin:, :]) or \
               np.any(mask[:, :border_margin]) or \
               np.any(mask[:, -border_margin:])
    
    # Remove border cells entirely
    filtered = filtered[~filtered['cell'].isin(border_cells)]
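
The trace length filter relies on calculate_trace_lengths, which is not shown. A minimal sketch of the idea, assuming a trace's length is simply the number of rows recorded for its cell ID, is given below; the function name keep_long_traces is hypothetical.

```python
import numpy as np


def keep_long_traces(cell_ids, min_frames=30):
    """Return a bool mask over rows whose cell has at least min_frames rows."""
    cell_ids = np.asarray(cell_ids)
    # counts[inverse] maps each row to the total row count of its cell
    _, inverse, counts = np.unique(cell_ids, return_inverse=True, return_counts=True)
    return counts[inverse] >= min_frames


rows = [1, 1, 1, 2, 2, 3]                 # cell ID of each CSV row
keep = keep_long_traces(rows, min_frames=2)
```

If only frames flagged as good should count toward the length, the cell ID array can be restricted to those rows before calling the function.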
    

Output Format

Per-FOV CSV: fov_{fov:03d}/{basename}_fov_{fov:03d}_traces.csv

fov,cell,frame,good,position_x,position_y,bbox_x0,bbox_y0,bbox_x1,bbox_y1,area_ch_0,aspect_ratio_ch_0,intensity_total_ch_1
0,0,0,True,100.5,200.3,85,165,115,235,450,1.234,1234.5
0,0,1,True,101.2,199.8,86,166,116,236,455,1.236,1356.2

Column Naming Convention:

  • Base columns: fov, cell, frame, good, position_x/y, bbox_*

  • Feature columns: {feature}_ch_{channel_id} (e.g., intensity_total_ch_1)

  • Base fields are always included if PC channel is configured, even with empty features
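
To show how the naming convention can be consumed downstream, the following sketch separates base columns from feature columns and recovers the feature name and channel ID. The function name split_columns is hypothetical; the header is the example shown above.

```python
import re

# {feature}_ch_{channel_id}, e.g. intensity_total_ch_1
FEATURE_COLUMN = re.compile(r"^(?P<feature>.+)_ch_(?P<channel>\d+)$")


def split_columns(header):
    """Split CSV column names into base columns and (name, feature, channel)."""
    base, features = [], []
    for name in header:
        match = FEATURE_COLUMN.match(name)
        if match:
            features.append((name, match["feature"], int(match["channel"])))
        else:
            base.append(name)
    return base, features


header = (
    "fov,cell,frame,good,position_x,position_y,bbox_x0,bbox_y0,bbox_x1,bbox_y1,"
    "area_ch_0,aspect_ratio_ch_0,intensity_total_ch_1"
).split(",")
base, features = split_columns(header)
```

Here base holds the ten base columns and features holds the three channel-specific entries.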

Step 7: CachingService

Purpose: Generate visualization levels (/1/) for all channels after processing completes.

Channel Requirements: Runs for all channels that have /0/ level data.

Input

  • All channels in fov_{fov}/images.zarr/ with /0/ level

  • All channels in fov_{fov}/cells.zarr/ with /0/ level

Processing Algorithm

For each channel:

  1. Check if /0/ exists and /1/ is missing

  2. Read /0/ data: (T, H, W) array

  3. Normalize to uint8:

    • Intensity channels: Percentile normalization (1st-99th percentile) across entire stack

    • Segmentation channels: Scale proportionally to [0, 255] based on max label

  4. Downsample by 2x using generate_half_resolution()

  5. Write to /1/ level: (T, H/2, W/2) of uint8

Output Format

  • Visualization level: {channel_name}/1/ - (T, H/2, W/2) of uint8

  • Compression: LZ4 for fast access

  • Chunks: (1, min(256, H/2), min(256, W/2))

Implementation Notes

  • Unifies caching logic that was previously scattered across services

  • Runs after all processing completes for efficiency

  • Normalization ensures consistent scaling across all frames

  • Non-critical: Workflow continues even if caching fails
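
The normalization and downsampling steps for intensity channels can be sketched as follows. This is not the body of generate_half_resolution(), whose implementation is not shown here; the 2x2 block mean, the handling of odd dimensions, and the function names are assumptions made for illustration.

```python
import numpy as np


def to_uint8_percentile(stack, low=1.0, high=99.0):
    """Map a (T, H, W) intensity stack to uint8 using stack-wide percentiles."""
    lo, hi = np.percentile(stack, [low, high])
    if hi <= lo:  # constant stack: nothing to stretch
        return np.zeros(stack.shape, dtype=np.uint8)
    scaled = (stack.astype(np.float64) - lo) / (hi - lo)
    return (np.clip(scaled, 0.0, 1.0) * 255).astype(np.uint8)


def half_resolution(stack):
    """Downsample (T, H, W) by 2 with a 2x2 block mean; odd edges are dropped."""
    n_frames, height, width = stack.shape
    h, w = height // 2, width // 2
    blocks = stack[:, : 2 * h, : 2 * w].reshape(n_frames, h, 2, w, 2)
    return blocks.mean(axis=(2, 4)).astype(stack.dtype)


stack = np.arange(48, dtype=np.uint16).reshape(2, 4, 6)
full8 = to_uint8_percentile(stack)
vis = half_resolution(full8)
# Chunk shape following the formula in Output Format
chunks = (1, min(256, vis.shape[1]), min(256, vis.shape[2]))
```

A block mean is unsuitable for label images because it blends neighboring IDs; segmentation channels would need a label-preserving reduction such as plain striding.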

Output Structure

output_dir/
├── processing_config.yaml           # Metadata, channels, parameters
├── fov_000/
│   ├── images.zarr/                 # Image data (all channels)
│   │   ├── pc_ch_0/
│   │   │   ├── 0/                   # Full resolution (T, H, W) uint16
│   │   │   └── 1/                   # Visualization level (T, H/2, W/2) uint8
│   │   ├── fl_ch_1/
│   │   │   ├── 0/                   # Full resolution (T, H, W) uint16
│   │   │   └── 1/                   # Visualization level (T, H/2, W/2) uint8
│   │   ├── seg_labeled_ch_0/
│   │   │   ├── 0/                   # Labeled segmentation (T, H, W) uint16
│   │   │   └── 1/                   # Visualization level (T, H/2, W/2) uint8
│   │   ├── seg_tracked_ch_0/
│   │   │   ├── 0/                   # Tracked segmentation (T, H, W) uint16
│   │   │   └── 1/                   # Visualization level (T, H/2, W/2) uint8
│   │   └── fl_background_ch_1/
│   │       ├── 0/                   # Background estimate (T, H, W) float32
│   │       └── 1/                   # Visualization level (T, H/2, W/2) uint8
│   ├── cells.zarr/                  # Per-cell crops
│   │   ├── metadata/
│   │   │   ├── cell_ids             # (N,) int32
│   │   │   ├── bboxes               # (N, T, 4) int32 [y0, x0, y1, x1]
│   │   │   └── valid_frames         # (N, T) bool
│   │   ├── pc_ch_0/
│   │   │   └── {cell_id}/
│   │   │       ├── 0/               # Normalized crop (T, H, W) float32 [0,1]
│   │   │       └── 1/               # Visualization level (T, H/2, W/2) uint8
│   │   └── fl_ch_1/
│   │       └── {cell_id}/
│   │           ├── 0/               # Normalized crop (T, H, W) float32 [0,1]
│   │           └── 1/               # Visualization level (T, H/2, W/2) uint8
│   └── basename_fov_000_traces.csv  # Combined feature traces
├── fov_001/
│   └── ...

Batch Processing Implementation

Thread Pool Executor Pattern

from concurrent.futures import ThreadPoolExecutor, as_completed
from pyama.processing.workflow.run import run_single_worker

def run_batch(fov_batch, config, n_workers, metadata, output_dir, cancel_event):
    """Process a batch of FOVs in parallel."""
    
    # Sequential copying (I/O bound)
    copy_service = CopyingService()
    copy_service.process_all_fovs(
        metadata=metadata,
        config=config,
        output_dir=output_dir,
        fov_start=fov_batch[0],
        fov_end=fov_batch[-1],
        cancel_event=cancel_event,
    )
    
    # Copy-only mode: skip processing stages
    if config.params.copy_only:
        return
    
    # Parallel processing (CPU bound)
    worker_ranges = _split_worker_ranges(fov_batch, n_workers)
    
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        futures = {
            executor.submit(
                run_single_worker,
                fov_range,
                metadata,
                config,
                output_dir,
                cancel_event,
            ): fov_range
            for fov_range in worker_ranges
        }
        
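        # Wait for completion with progress tracking
        overall_success = True
        for future in as_completed(futures):
            fov_range, successful, failed, message = future.result()
            update_progress(successful, failed, message)
            if failed:  # any failed FOV in this worker's range
                overall_success = False
    
    # Generate visualization cache only if every worker succeeded
    # (assumed definition: no worker reported failed FOVs)
    if overall_success: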
        caching_service = CachingService()
        caching_service.process_all_fovs(
            metadata=metadata,
            config=config,
            output_dir=output_dir,
            fov_start=fov_batch[0],
            fov_end=fov_batch[-1],
            cancel_event=cancel_event,
        )
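
The helper _split_worker_ranges is used above but not shown. Since each worker addresses its FOVs as the range fovs[0] to fovs[-1], the split presumably has to produce contiguous chunks. One possible implementation under that assumption is sketched below; the real helper may balance work differently.

```python
def split_worker_ranges(fovs, n_workers):
    """Split FOV indices into at most n_workers contiguous, near-equal chunks."""
    fovs = list(fovs)
    if not fovs:
        return []
    n_workers = max(1, min(n_workers, len(fovs)))
    base, extra = divmod(len(fovs), n_workers)
    ranges, start = [], 0
    for i in range(n_workers):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more
        ranges.append(fovs[start:start + size])
        start += size
    return ranges


print(split_worker_ranges(range(10), 3))  # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Capping the number of chunks at the number of FOVs avoids submitting empty ranges to the executor.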

Memory Management

def run_single_worker(fovs, metadata, config, output_dir, cancel_event):
    """Worker function for parallel FOV processing."""
    
    try:
        # Initialize services
        segmentation = SegmentationService(method=config.params.segmentation_method)
        tracking = TrackingService(method=config.params.tracking_method)
        background_estimation = BackgroundEstimationService()
        cropping = CroppingService()
        extraction = ExtractionService()
        
        # Process each service sequentially for FOV range
        segmentation.process_all_fovs(metadata, config, output_dir, fovs[0], fovs[-1], cancel_event)
        if cancel_event and cancel_event.is_set():
            return (fovs, 0, len(fovs), "Cancelled")
        
        tracking.process_all_fovs(metadata, config, output_dir, fovs[0], fovs[-1], cancel_event)
        if cancel_event and cancel_event.is_set():
            return (fovs, 0, len(fovs), "Cancelled")
        
        background_estimation.process_all_fovs(metadata, config, output_dir, fovs[0], fovs[-1], cancel_event)
        if cancel_event and cancel_event.is_set():
            return (fovs, 0, len(fovs), "Cancelled")
        
        cropping.process_all_fovs(metadata, config, output_dir, fovs[0], fovs[-1], cancel_event)
        if cancel_event and cancel_event.is_set():
            return (fovs, 0, len(fovs), "Cancelled")
        
        extraction.process_all_fovs(metadata, config, output_dir, fovs[0], fovs[-1], cancel_event)
        
        return (fovs, len(fovs), 0, "Completed")
        
    except Exception as e:
        logger.error(f"Error processing FOVs {fovs[0]}-{fovs[-1]}: {e}")
        return (fovs, 0, len(fovs), str(e))

Cancellation Support

Cancellation is handled via threading.Event within the task runner:

# Usage in workflow
if cancel_event.is_set():
    logger.info("Cancellation requested, cleaning up")
    cleanup_partial_results()
    return False
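
A self-contained sketch of cooperative cancellation with threading.Event follows. The function process_frames and its on_frame callback are hypothetical stand-ins for a service's per-frame loop; the point is that the event is polled between units of work rather than interrupting them.

```python
import threading


def process_frames(n_frames, cancel_event, on_frame=None):
    """Process frames until done or cancelled; return (completed, frames_done)."""
    done = 0
    for t in range(n_frames):
        if cancel_event.is_set():  # checked before each frame
            return False, done
        if on_frame is not None:
            on_frame(t)
        done += 1
    return True, done


cancel_event = threading.Event()
finished = process_frames(5, cancel_event)

# Simulate a cancel request arriving while frame 1 is being processed
interrupted = process_frames(
    5, cancel_event, on_frame=lambda t: cancel_event.set() if t == 1 else None
)
```

The frame in progress when the request arrives is still completed, so the second call stops after two frames.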

Data Type Specifications

Image Arrays (Zarr Format)

Stage          Zarr Path                             Data Type  Dimensions     Notes
-------------  ------------------------------------  ---------  -------------  ------------------------
Raw Images     images.zarr/{pc|fl}_ch_{id}/0/        uint16     (T, H, W)      Chunked Zarr arrays
Segmentation   images.zarr/seg_labeled_ch_{id}/0/    uint16     (T, H, W)      Labeled mask (untracked)
Tracking       images.zarr/seg_tracked_ch_{id}/0/    uint16     (T, H, W)      Cell IDs, 0=background
Background     images.zarr/fl_background_ch_{id}/0/  float32    (T, H, W)      Estimate per channel
Visualization  images.zarr/{channel}/1/              uint8      (T, H/2, W/2)  Downsampled for display
Cell Crops     cells.zarr/{channel}/{cell_id}/0/     float32    (T, H, W)      Normalized [0,1]
Cell Viz       cells.zarr/{channel}/{cell_id}/1/     uint8      (T, H/2, W/2)  Downsampled for display

CSV Schemas

Processing Traces (per-FOV):

  • Feature columns carry a channel suffix (_ch_{channel_id}); base columns do not

  • Frame-based, time computed after loading

  • Includes quality flag (good column)

Merged Traces (per-sample):

  • Same format as processing traces

  • Multiple FOVs combined

  • Includes sample metadata in headers

Fitted Results (post-analysis):

  • One row per cell

  • Includes model type, R², parameters

  • Additional columns per model parameters

Algorithm Parameters

Segmentation Parameters

segmentation_params = {
    'logstd_window_size': 3,  # Neighborhood for std computation
    'morph_size': 7,         # Structuring element size
    'morph_iterations': 3,   # Number of opening/closing iterations
    'min_object_size': 50,   # Minimum cell size in pixels
    'max_object_size': 10000, # Maximum cell size in pixels
}

Tracking Parameters

tracking_params = {
    'min_iou': 0.1,     # Minimum IoU for cell matching
    'min_frames': 30,   # Minimum trace length
    'border_margin': 50, # Exclusion margin (pixels)
}

Extraction Parameters

extraction_params = {
    'background_weight': 1.0,  # Background correction weight [0-1]
    'frame_interval': 10.0,    # Minutes per frame (default)
    'time_mapping': None,      # Custom frame->time mapping (dict)
    'features': {
        'phase': ['area', 'aspect_ratio'],
        'fluorescence': ['intensity_total', 'intensity_mean']
    }
}
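
Traces are stored per frame, and time is computed after loading. A minimal sketch of that conversion is shown below. The function name frame_to_time is hypothetical, and the fallback to frame_interval for frames missing from time_mapping is an assumption, since the exact semantics of time_mapping are not described here.

```python
def frame_to_time(frame, frame_interval=10.0, time_mapping=None):
    """Convert a frame index to minutes.

    Uses time_mapping (frame -> minutes) when it covers the frame,
    otherwise falls back to a constant interval.
    """
    if time_mapping is not None and frame in time_mapping:
        return float(time_mapping[frame])
    return frame * frame_interval


print(frame_to_time(3))                           # 30.0
print(frame_to_time(3, time_mapping={3: 42.5}))   # 42.5
```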

Performance Characteristics

Memory Usage

Dataset Size          Approximate RAM Usage  Notes
--------------------  ---------------------  -------------------------------
10 FOVs, 50 frames    1-2 GB                 Single workstation
100 FOVs, 180 frames  8-12 GB                Requires 16 GB+ RAM
500+ FOVs             32 GB+                 Consider distributed processing

Processing Speed

Operation              Speed (per FOV)  Parallel Scaling
---------------------  ---------------  ----------------------------
Copying (sequential)   2-5 sec          No parallelization
Segmentation           10-30 sec        4-8 threads optimal
Tracking               5-15 sec         Linear up to CPU count
Background Estimation  5-20 sec         CPU-bound, parallel
Cropping               2-10 sec         CPU-bound, parallel
Extraction             2-8 sec          CPU-bound, parallel
Caching                1-5 sec          Sequential (post-processing)

Optimization Strategies

  1. Memory Mapping: Use mmap_mode='r' for large arrays

  2. Batch Size: Tune based on RAM availability

  3. Worker Count: Match to CPU cores (typically 4-8)

  4. SSD Storage: Improves I/O for large datasets

Extension Points

Custom Features

def extract_custom_feature(image, mask, context):
    """User-defined feature extraction."""
    # Implement custom logic
    return feature_value

# Register in feature system
PHASE_FEATURES['custom_feature'] = extract_custom_feature
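
For a concrete instance of the pattern, the sketch below defines a small feature with the (image, mask, context) signature and registers it in a dictionary. The PHASE_FEATURES dictionary here is a local stand-in, because the import path of the real registry is not given in this reference; the feature bbox_fill is likewise an invented example.

```python
import numpy as np

# Local stand-in for pyama's registry (assumption: it behaves like a dict
# mapping feature names to callables)
PHASE_FEATURES = {}


def extract_bbox_fill(image, mask, context):
    """Fraction of the mask's bounding box covered by the mask (0 to 1)."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return 0.0
    box_area = (ys.max() - ys.min() + 1) * (xs.max() - xs.min() + 1)
    return float(ys.size) / float(box_area)


PHASE_FEATURES["bbox_fill"] = extract_bbox_fill

mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True    # 2x2 square
mask[1, 1] = False       # remove one corner: 3 of 4 box pixels remain
value = PHASE_FEATURES["bbox_fill"](image=None, mask=mask, context=None)
```

Following the column naming convention, such a feature would presumably appear in the CSV as bbox_fill_ch_{channel_id}.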

Alternative Algorithms

Replace core algorithms while maintaining interface:

  • Segmentation: watershed, deep learning

  • Tracking: Kalman filter, graph-based

  • Feature extraction: custom metrics

Integration Hooks

class CustomPreprocessor:
    """Pre-process frames before segmentation."""
    def process(self, image):
        # Custom preprocessing
        return processed_image

# Inject into workflow
workflow.register_preprocessor(CustomPreprocessor())

Implementation Guidelines

Plugin Development

  1. Follow Interface Contracts: Maintain input/output shapes

  2. Handle Errors Gracefully: Return status codes, not exceptions

  3. Consider Performance: Use vectorized operations where possible

  4. Document Parameters: Include bounds, defaults, units

  5. Provide Tests: Visual verification for image-based operations

Quality Assurance

  1. Deterministic RNG: Use fixed seeds for reproducibility

  2. Parameter Validation: Check bounds before processing

  3. Progress Reporting: Provide meaningful status updates

  4. Cleanup on Failure: Preserve partial results for debugging

  5. Logging: Include sufficient diagnostic information

This reference provides complete technical specifications for implementing, extending, or reproducing the PyAMA processing pipeline in any environment.