Workflow Pipeline Reference

This reference document provides complete technical details of the pyama processing workflow, including algorithm specifications, data formats, and implementation details for creating plugins or reproducing the pipeline in other systems.

Overview

The pyama workflow processes time-lapse microscopy images through seven sequential steps to extract cell traces with quantitative features. It operates on individual Fields of View (FOVs) and handles them in configurable batches for efficiency.

Processing Order:

1. Copying - Extract frames from microscopy files (ND2/CZI) to Zarr format
2. Segmentation - Identify cell boundaries using phase contrast images (requires PC channel)
3. Tracking - Track cells across time points using consistent cell IDs (requires PC channel)
4. Background Estimation - Estimate background fluorescence using tiled interpolation (requires FL channels)
5. Cropping - Extract cell bounding box crops from tracked segmentation (requires PC channel)
6. Extraction - Extract quantitative features and generate trace CSV files (always runs if PC configured)
7. Caching - Generate visualization levels (/1/) for all channels after processing completes

Channel-Conditional Behavior:

No PC channel configured: Segmentation, tracking, cropping, and extraction are skipped
No FL channels configured: Background estimation is skipped automatically
PC channel with no features: Extraction still outputs base fields (cell, frame, position, bbox) for tracking
Copy-only mode: Processing stages 2-7 are skipped when config.params.copy_only is True
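
The skip rules above can be summarized as a small planning function. This is a sketch under stated assumptions: planned_steps and its arguments are hypothetical names, not part of pyama, and it encodes only the four rules listed here (it does not model that background estimation also reads the segmentation from Step 2).

```python
def planned_steps(has_pc: bool, n_fl: int, copy_only: bool = False) -> list[str]:
    """Return the step names that would run (hypothetical helper, not pyama API)."""
    steps = ["copying"]
    if copy_only:
        return steps  # stages 2-7 are skipped in copy-only mode
    if has_pc:
        steps += ["segmentation", "tracking"]
    if n_fl > 0:
        steps.append("background_estimation")
    if has_pc:
        steps += ["cropping", "extraction"]
    steps.append("caching")
    return steps


print(planned_steps(has_pc=True, n_fl=0))
```

With a PC channel and no FL channels, the plan contains every step except background estimation.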

Input Requirements

Microscopy Data

Input format: ND2 or CZI files containing time-lapse microscopy images
Multiple FOVs: Supported and processed in parallel
Time-lapse: Each FOV contains multiple time frames
Multi-channel: One phase contrast and one or more fluorescence channels

Channel Configuration

Phase Contrast (PC) Channel: Required for segmentation. One channel specified for cell boundary detection.
Fluorescence (FL) Channels: Optional, one or more channels specified for feature extraction.

Processing Context

Output directory paths
Channel configurations (Channels dataclass)
Processing parameters
Configuration saved to processing_config.yaml
FOV outputs discovered by naming convention (fov_XXX/)

Workflow Steps

Step 1: CopyingService

Purpose: Extract raw image frames from microscopy file formats and save to Zarr format for efficient access.

Input

Microscopy file path (ND2 or CZI)
FOV index range
Channel configuration

Processing Algorithm

For each FOV and specified channel:

Load frames sequentially from microscopy file
Create or open Zarr store: fov_{fov:03d}/images.zarr
Write channel data to {pc|fl}_ch_{channel_id}/0/ where:
- 0 = full resolution level
- Data type: uint16 for raw pixel values
- Dimensions: (T, H, W) where T = number of time frames, H,W = image dimensions
Channels are stored as separate groups in the Zarr hierarchy

Output Format

Phase contrast: fov_{fov:03d}/images.zarr/pc_ch_{pc_id}/0/
Fluorescence: fov_{fov:03d}/images.zarr/fl_ch_{fl_id}/0/
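
The naming convention above can be captured in a small path builder. A minimal sketch: channel_array_path is a hypothetical helper introduced for illustration, and it only formats the path; it does not open or create a Zarr store.

```python
from pathlib import PurePosixPath


def channel_array_path(fov: int, kind: str, channel_id: int, level: int = 0) -> str:
    """Build the relative path of a raw channel array (hypothetical helper)."""
    if kind not in ("pc", "fl"):
        raise ValueError(f"kind must be 'pc' or 'fl', got {kind!r}")
    path = PurePosixPath(f"fov_{fov:03d}") / "images.zarr" / f"{kind}_ch_{channel_id}" / str(level)
    return str(path)


print(channel_array_path(3, "pc", 0))
```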

Implementation Notes

Runs sequentially per batch to avoid I/O bottlenecks
Zarr format provides efficient chunked storage and compression
Existing channels are detected and skipped (supports resuming)
Visualization level (/1/) is generated later by CachingService

Step 2: SegmentationService

Purpose: Identify cell boundaries in each frame of the phase contrast stack using the LOG-STD method (log of the local standard deviation).

Channel Requirements: Requires PC channel. Skips with warning if no PC channel configured.

Input

Phase contrast stack from Step 1: fov_{fov}/images.zarr/pc_ch_{pc_id}/0/ - (T, H, W) of uint16

LOG-STD Algorithm

For each time frame t:

Local Standard Deviation:

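import numpy as np
from scipy.ndimage import uniform_filter

image = image.astype(np.float64)  # uint16 input would overflow when squared
local_mean = uniform_filter(image, size=window_size)
local_var = uniform_filter(image**2, size=window_size) - local_mean**2
logstd = 0.5 * np.log(local_var)  # log of the local standard deviation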

Automatic Thresholding:
- Build histogram of log-STD values
- Find valley threshold between background/cell modes
- Binary mask: binary = logstd > threshold
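
The valley search is not specified further here. One possible way to do it, shown as a sketch and not necessarily what pyama does: smooth the histogram, pick the two dominant modes, and take the lowest bin between them. The function name valley_threshold and the bins and smooth parameters are assumptions made for illustration.

```python
import numpy as np


def valley_threshold(values, bins=64, smooth=5):
    """Threshold at the histogram minimum between the two dominant modes (sketch)."""
    counts, edges = np.histogram(values, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Moving-average smoothing so single noisy bins do not create false valleys
    hist = np.convolve(counts, np.ones(smooth) / smooth, mode="same")
    first = int(np.argmax(hist))
    # Second mode: a high bin that is also far from the first mode
    second = int(np.argmax(hist * (np.arange(bins) - first) ** 2))
    lo, hi = sorted((first, second))
    valley = lo + int(np.argmin(hist[lo:hi + 1]))
    return float(centers[valley])


rng = np.random.default_rng(0)
background = rng.normal(-2.0, 0.3, 2000)  # synthetic low log-STD values
cells = rng.normal(3.0, 0.3, 1000)        # synthetic high log-STD values
threshold = valley_threshold(np.concatenate([background, cells]))
print(background.max() < threshold < cells.min())
```

On clearly bimodal data such as this synthetic example, the threshold falls in the gap between the two modes; on weakly bimodal data the result depends on the bin count and smoothing width.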

Morphological Cleanup:

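from scipy.ndimage import binary_closing, binary_fill_holes, binary_opening
from skimage.morphology import disk, remove_small_objects

# Close gaps, drop specks, fill interiors, then smooth the outlines
mask = binary_closing(binary, structure=disk(7), iterations=3)
mask = remove_small_objects(mask)
mask = binary_fill_holes(mask)
mask = binary_opening(mask, structure=disk(7), iterations=3)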

Output

Labeled segmentation: fov_{fov}/images.zarr/seg_labeled_ch_{pc_id}/0/
Format: (T, H, W) of uint16
0 = background, 1-N = cell IDs (frame-specific)
Visualization level (/1/) generated later by CachingService

Algorithm Characteristics

Computes per-pixel local intensity variation
LOG-STD is effective for phase contrast where boundaries create local intensity changes
Parameters: window size (logstd_window_size, default: 3), number of morphological iterations (morph_iterations, default: 3)

Step 3: TrackingService

Purpose: Track cells across time frames by assigning consistent cell IDs using Intersection over Union (IoU).

Channel Requirements: Requires PC channel. Skips with warning if no PC channel configured.

Input

Labeled segmentation from Step 2: fov_{fov}/images.zarr/seg_labeled_ch_{pc_id}/0/ - (T, H, W) of uint16

IoU-based Hungarian Assignment Algorithm

Per-frame Processing:

Extract Regions:

from skimage.measure import regionprops_table

props = regionprops_table(labeled_frame, 
                        properties=['label', 'area', 'bbox'])

IoU Cost Matrix:

import numpy as np

# Calculate IoU for all current vs previous region pairs
cost_matrix = np.zeros((n_current, n_previous))
for i, current in enumerate(current_regions):
    for j, prev in enumerate(prev_regions):
        iou = calculate_iou(current.bbox, prev.bbox)
        cost_matrix[i, j] = 1.0 - iou  # Convert IoU to a cost (lower is better)

Hungarian Assignment:

from scipy.optimize import linear_sum_assignment

row_ind, col_ind = linear_sum_assignment(cost_matrix)

# Apply minimum IoU threshold
for r, c in zip(row_ind, col_ind):
    if (1.0 - cost_matrix[r, c]) < min_iou:
        # Mark as new cell
        assign_new_id(r)
    else:
        # Assign previous ID
        assign_previous_id(r, c)

Cell ID Management:
- Frame 0: Assign new IDs 1, 2, 3, …
- Frame n: Matched cells inherit IDs, new cells get new IDs
- Disappeared cells: terminate trace
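
The per-frame steps above can be combined into one self-contained sketch. calculate_iou is not defined in this document, so bbox_iou below is an assumed implementation for boxes in (min_row, min_col, max_row, max_col) form with exclusive maxima; match_frame is likewise a hypothetical helper. It also covers a case the snippets leave implicit: current regions that the Hungarian step leaves unassigned (more current than previous regions) receive new IDs.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def bbox_iou(a, b):
    """IoU of two boxes (min_row, min_col, max_row, max_col), maxima exclusive."""
    inter_h = min(a[2], b[2]) - max(a[0], b[0])
    inter_w = min(a[3], b[3]) - max(a[1], b[1])
    if inter_h <= 0 or inter_w <= 0:
        return 0.0
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def match_frame(prev_boxes, curr_boxes, next_id, min_iou=0.1):
    """Assign IDs to curr_boxes given prev_boxes ({cell_id: bbox}); returns (ids, next_id)."""
    prev_ids = list(prev_boxes)
    ids = [None] * len(curr_boxes)
    if prev_ids and curr_boxes:
        cost = np.array([[1.0 - bbox_iou(c, prev_boxes[p]) for p in prev_ids]
                         for c in curr_boxes])
        rows, cols = linear_sum_assignment(cost)
        for r, c in zip(rows, cols):
            if 1.0 - cost[r, c] >= min_iou:  # keep only sufficiently overlapping matches
                ids[r] = prev_ids[c]
    for i, cell_id in enumerate(ids):
        if cell_id is None:  # unmatched or below threshold: start a new trace
            ids[i] = next_id
            next_id += 1
    return ids, next_id


prev = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
curr = [(21, 21, 31, 31), (1, 1, 11, 11), (50, 50, 60, 60)]
ids, next_id = match_frame(prev, curr, next_id=3)
print(ids, next_id)
```

In this example the first two current boxes inherit IDs 2 and 1, and the third, which overlaps nothing, starts a new trace with ID 3.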

Output

Labeled tracking: fov_{fov}/images.zarr/seg_tracked_ch_{pc_id}/0/
Format: (T, H, W) of uint16
0 = background, 1-N = cell IDs (consistent across frames)
Visualization level (/1/) generated later by CachingService

Implementation Details

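IoU is computed on bounding boxes as an approximation, which keeps matching fast
The min_iou threshold (default 0.1) filters out low-quality matches
The Hungarian algorithm guarantees a globally optimal assignment
Supports filtering by min_size and max_size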

Step 4: BackgroundEstimationService

Purpose: Estimate background fluorescence using tiled interpolation for each frame.

Channel Requirements: Requires FL channels. Skips automatically if no FL channels configured.

Input

Labeled segmentation from Step 2: fov_{fov}/images.zarr/seg_labeled_ch_{pc_id}/0/ - (T, H, W) of uint16
Raw fluorescence from Step 1: fov_{fov}/images.zarr/fl_ch_{fl_id}/0/ - (T, H, W) of uint16

Tiled Interpolation Algorithm

For each fluorescence channel and frame t:

Mask Foreground:

dilated = binary_dilation(seg_labeled, disk(10))
masked = np.where(dilated, np.nan, fluorescence_image)

Tile Medians:
- Divide frame into overlapping tiles (typical: 50-100 px)
- Compute median of non-NaN pixels in each tile
- Handle tiles with insufficient background via interpolation
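
The tile-median step can be illustrated with NumPy alone. This sketch simplifies two things relative to the description above, and both are assumptions: tiles do not overlap, and tiles without any background pixels are filled with the median of the valid tiles instead of being interpolated. tile_medians and its tile parameter are hypothetical names.

```python
import numpy as np


def tile_medians(masked, tile=64):
    """Median of the non-NaN pixels in each non-overlapping tile (simplified sketch)."""
    h, w = masked.shape
    n_y, n_x = -(-h // tile), -(-w // tile)  # ceiling division
    medians = np.full((n_y, n_x), np.nan)
    for iy in range(n_y):
        for ix in range(n_x):
            block = masked[iy * tile:(iy + 1) * tile, ix * tile:(ix + 1) * tile]
            if np.isfinite(block).any():
                medians[iy, ix] = np.nanmedian(block)
    # Simplification: fill tiles that had no background with the median of valid tiles
    empty = np.isnan(medians)
    if empty.any() and not empty.all():
        medians[empty] = np.nanmedian(medians)
    return medians


masked = np.full((128, 128), 10.0)
masked[:64, :64] = np.nan   # tile fully covered by (dilated) cells
masked[64:, 64:] = 20.0     # brighter background in one corner
print(tile_medians(masked, tile=64))
```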

Interpolate Background:

from scipy.interpolate import RectBivariateSpline

# tile_medians has shape (n_tiles_y, n_tiles_x); the tile centers must be
# 1-D and strictly increasing. The default cubic spline needs at least
# 4 tiles per axis; pass kx/ky to lower the degree otherwise.
spline = RectBivariateSpline(tile_centers_y, tile_centers_x, tile_medians)

# Evaluate on the full pixel grid; the result has shape (H, W)
background = spline(np.arange(H), np.arange(W))

Output

Background stack: fov_{fov}/images.zarr/fl_background_ch_{fl_id}/0/
Format: (T, H, W) of float32
Ready for correction during extraction
Visualization level (/1/) generated later by CachingService

Algorithm Notes

Each fluorescence channel processed independently
Background saved separately for flexible correction weights
Tile size configurable (default: 50-100 px with overlap)
Interpolation preserves spatial variation patterns

Step 5: CroppingService

Purpose: Extract cell bounding box crops from tracked segmentation for efficient feature extraction.

Channel Requirements: Requires PC channel. Skips with warning if no PC channel configured.

Input

Tracked segmentation from Step 3: fov_{fov}/images.zarr/seg_tracked_ch_{pc_id}/0/ - (T, H, W) of uint16
Phase contrast from Step 1: fov_{fov}/images.zarr/pc_ch_{pc_id}/0/ - (T, H, W) of uint16
Fluorescence channels (optional): fov_{fov}/images.zarr/fl_ch_{fl_id}/0/ - (T, H, W) of uint16
Background channels (optional): fov_{fov}/images.zarr/fl_background_ch_{fl_id}/0/ - (T, H, W) of float32

Processing Algorithm

For each tracked cell:

Extract bounding boxes across all frames where cell is present
Crop regions from all configured channels (PC + FL)
Apply background correction to FL crops if available
Store crops in per-cell structure: fov_{fov}/cells.zarr/{channel_name}/{cell_id}/0/
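
The first step, collecting per-frame bounding boxes for one cell, might look like the sketch below. cell_bboxes is a hypothetical helper; the [y0, x0, y1, x1] layout follows the metadata description in this document, while the exclusive upper bounds and the rule "valid means the cell is present in that frame" are assumptions.

```python
import numpy as np


def cell_bboxes(tracked, cell_id):
    """Per-frame bounding boxes [y0, x0, y1, x1] and a presence flag for one cell."""
    n_frames = tracked.shape[0]
    bboxes = np.zeros((n_frames, 4), dtype=np.int32)
    valid = np.zeros(n_frames, dtype=bool)
    for t in range(n_frames):
        rows, cols = np.nonzero(tracked[t] == cell_id)
        if rows.size:  # cell present in this frame
            bboxes[t] = (rows.min(), cols.min(), rows.max() + 1, cols.max() + 1)
            valid[t] = True
    return bboxes, valid


tracked = np.zeros((3, 8, 8), dtype=np.uint16)
tracked[0, 2:4, 3:6] = 5
tracked[2, 1:3, 1:2] = 5
bboxes, valid = cell_bboxes(tracked, cell_id=5)
print(bboxes.tolist(), valid.tolist())
```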

Output Format

Cell crops: fov_{fov}/cells.zarr/{channel_name}/{cell_id}/0/ - (T, H, W) of float32 (normalized [0,1])
Visualization level: fov_{fov}/cells.zarr/{channel_name}/{cell_id}/1/ - (T, H, W) of uint8
Metadata: fov_{fov}/cells.zarr/metadata/ - cell_ids, bboxes, valid_frames

Implementation Notes

Works with PC-only data: crops PC channel only
With FL configured: crops both PC and FL channels, applies background if available
Creates cells.zarr with per-cell structure for efficient feature extraction
Essential for extraction step - provides cropped regions

Step 6: ExtractionService

Purpose: Extract quantitative features for each tracked cell at each time point and generate CSV traces.

Channel Requirements: Always runs if PC channel is configured, even with empty features. Creates empty CSV if no channels configured.

Input

Cell crops from Step 5: fov_{fov}/cells.zarr/{channel_name}/{cell_id}/0/ - (T, H, W) of float32
Metadata from Step 5: fov_{fov}/cells.zarr/metadata/ - bboxes, valid_frames
Feature configuration list from config

Feature Extraction Algorithm

For each FOV, cell, and time point:

Load Cell Crop:

# Load cropped region for this cell and frame
cell_crop = cells_zarr[f"{channel_name}/{cell_id}/0"][frame_idx]
bbox = metadata["bboxes"][cell_idx, frame_idx]  # [y0, x0, y1, x1]

Base Features (Always Computed):

# Base fields are always extracted, regardless of channel configs
row['fov'] = fov_id
row['cell'] = cell_id
row['frame'] = frame_index
row['good'] = metadata["valid_frames"][cell_idx, frame_idx]
row['position_x'] = (bbox[1] + bbox[3]) / 2  # Bounding-box center x
row['position_y'] = (bbox[0] + bbox[2]) / 2  # Bounding-box center y
row['bbox_x0'] = bbox[1]  # x0
row['bbox_y0'] = bbox[0]  # y0
row['bbox_x1'] = bbox[3]  # x1
row['bbox_y1'] = bbox[2]  # y1

Channel-Specific Features:

# Phase contrast features (if configured)
if pc_features:
    mask = cell_crop > 0  # Binary mask from crop
    if 'area' in pc_features:
        row[f'area_ch_{pc_channel}'] = np.sum(mask)
    
    if 'aspect_ratio' in pc_features:
        ellipse = regionprops(mask.astype(int))[0]
        row[f'aspect_ratio_ch_{pc_channel}'] = ellipse.major_axis_length / ellipse.minor_axis_length

# Fluorescence features with background correction (if configured)
if fl_features:
    raw_intensity = np.sum(cell_crop)
    if background_available:
        background_crop = cells_zarr[f"fl_background_ch_{fl_id}/{cell_id}/0"][frame_idx]
        background_intensity = np.sum(background_crop)
        corrected_intensity = raw_intensity - background_weight * background_intensity
    else:
        corrected_intensity = raw_intensity
    
    row[f'intensity_total_ch_{fl_id}'] = corrected_intensity

Background Correction:

# Configurable weight from config.params.background_weight
background_weight = np.clip(config.params.background_weight, 0.0, 1.0)

# Applied during extraction from background crops
corrected_intensity = raw_intensity - background_weight * background_intensity
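
As a worked instance of the formula above: a crop summing to 40 with a background crop summing to 8 and a weight of 0.5 gives 40 - 0.5 * 8 = 36. The sketch below reproduces that arithmetic; corrected_total is a hypothetical helper name, not a pyama function.

```python
import numpy as np


def corrected_total(raw_crop, background_crop, weight):
    """Total intensity minus the weighted background total (weight clipped to [0, 1])."""
    weight = float(np.clip(weight, 0.0, 1.0))
    return float(np.sum(raw_crop) - weight * np.sum(background_crop))


raw = np.full((2, 2), 10.0)        # sums to 40
background = np.full((2, 2), 2.0)  # sums to 8
print(corrected_total(raw, background, 0.5))
```

A weight outside [0, 1] is clipped first, so a weight of 2.0 behaves like 1.0.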

Quality Filtering

Trace Length Filter:

min_frames = params.get('min_frames', 30)
trace_lengths = calculate_trace_lengths(traces)
filtered = traces[trace_lengths >= min_frames]

Border Filter:

border_margin = params.get('border_margin', 50)

def on_border(mask):
    return np.any(mask[:border_margin, :]) or \
           np.any(mask[-border_margin:, :]) or \
           np.any(mask[:, :border_margin]) or \
           np.any(mask[:, -border_margin:])

# Remove border cells entirely
filtered = filtered[~filtered['cell'].isin(border_cells)]
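
calculate_trace_lengths is not shown in this document. The trace length filter can be sketched with the standard library alone, assuming that a trace's length is the number of rows recorded for that cell; filter_short_traces and the list-of-dicts row format are illustrative choices, not the pipeline's data structures.

```python
from collections import Counter


def filter_short_traces(rows, min_frames=30):
    """Keep only rows whose cell appears in at least min_frames rows (sketch)."""
    lengths = Counter(row["cell"] for row in rows)
    return [row for row in rows if lengths[row["cell"]] >= min_frames]


rows = [{"cell": 1, "frame": t} for t in range(5)]
rows += [{"cell": 2, "frame": t} for t in range(2)]
kept = filter_short_traces(rows, min_frames=3)
print(sorted({row["cell"] for row in kept}))
```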

Output Format

Per-FOV CSV: fov_{fov:03d}/{basename}_fov_{fov:03d}_traces.csv

fov,cell,frame,good,position_x,position_y,bbox_x0,bbox_y0,bbox_x1,bbox_y1,area_ch_0,aspect_ratio_ch_0,intensity_total_ch_1
0,0,0,True,100.5,200.3,85,165,115,235,450,1.234,1234.5
0,0,1,True,101.2,199.8,86,166,116,236,455,1.236,1356.2

Column Naming Convention:

Base columns: fov, cell, frame, good, position_x/y, bbox_*
Feature columns: {feature}_ch_{channel_id} (e.g., intensity_total_ch_1)
Base fields are always included if PC channel is configured, even with empty features
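
The naming convention can be made concrete with a header builder. A sketch only: feature_column and csv_header are hypothetical helpers, and the column order (base columns, then PC features, then FL features per channel) is an assumption chosen to match the sample CSV above.

```python
BASE_COLUMNS = [
    "fov", "cell", "frame", "good", "position_x", "position_y",
    "bbox_x0", "bbox_y0", "bbox_x1", "bbox_y1",
]


def feature_column(feature: str, channel_id: int) -> str:
    """Format a feature column name as {feature}_ch_{channel_id}."""
    return f"{feature}_ch_{channel_id}"


def csv_header(pc_channel, pc_features, fl_features):
    """Base columns followed by PC features, then FL features per channel (assumed order)."""
    columns = list(BASE_COLUMNS)
    columns += [feature_column(name, pc_channel) for name in pc_features]
    for channel_id, names in fl_features.items():
        columns += [feature_column(name, channel_id) for name in names]
    return columns


header = csv_header(0, ["area", "aspect_ratio"], {1: ["intensity_total"]})
print(",".join(header))
```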

Step 7: CachingService

Purpose: Generate visualization levels (/1/) for all channels after processing completes.

Channel Requirements: Runs for all channels that have /0/ level data.

Input

All channels in fov_{fov}/images.zarr/ with /0/ level
All channels in fov_{fov}/cells.zarr/ with /0/ level

Processing Algorithm

For each channel:

Check if /0/ exists and /1/ is missing
Read /0/ data: (T, H, W) array
Normalize to uint8:
- Intensity channels: Percentile normalization (1st-99th percentile) across entire stack
- Segmentation channels: Scale proportionally to [0, 255] based on max label
Downsample by 2x using generate_half_resolution()
Write to /1/ level: (T, H/2, W/2) of uint8
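
The normalization and downsampling steps can be sketched in NumPy. The internals of generate_half_resolution() are not described here, so downsample_2x below is a stand-in that simply keeps every second pixel; the three function names are hypothetical, and the handling of constant stacks (all zeros) is an assumption.

```python
import numpy as np


def to_uint8_intensity(stack, lo=1.0, hi=99.0):
    """Percentile-normalize a whole (T, H, W) stack to uint8."""
    p_lo, p_hi = np.percentile(stack, [lo, hi])
    if p_hi <= p_lo:  # constant stack: nothing to scale
        return np.zeros(stack.shape, dtype=np.uint8)
    scaled = (stack.astype(np.float32) - p_lo) / (p_hi - p_lo)
    return (np.clip(scaled, 0.0, 1.0) * 255).astype(np.uint8)


def to_uint8_labels(stack):
    """Scale label values proportionally to [0, 255] based on the maximum label."""
    max_label = int(stack.max())
    if max_label == 0:
        return np.zeros(stack.shape, dtype=np.uint8)
    return (stack.astype(np.float32) * (255.0 / max_label)).astype(np.uint8)


def downsample_2x(stack):
    """Stand-in for generate_half_resolution(): keep every second row and column."""
    return stack[:, ::2, ::2]


raw = np.arange(2 * 4 * 4, dtype=np.uint16).reshape(2, 4, 4)
viz = downsample_2x(to_uint8_intensity(raw))
print(viz.shape, viz.dtype)
```

Note that scaling labels to 255 levels means distinct cell IDs can share a display value once there are more than 255 labels, which is acceptable for visualization but not for analysis.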

Output Format

Visualization level: {channel_name}/1/ - (T, H/2, W/2) of uint8
Compression: LZ4 for fast access
Chunks: (1, min(256, H/2), min(256, W/2))

Implementation Notes

Unifies caching logic that was previously scattered across services
Runs after all processing completes for efficiency
Normalization ensures consistent scaling across all frames
Non-critical: Workflow continues even if caching fails

Output Structure

output_dir/
├── processing_config.yaml           # Metadata, channels, parameters
├── fov_000/
│   ├── images.zarr/                 # Image data (all channels)
│   │   ├── pc_ch_0/
│   │   │   ├── 0/                   # Full resolution (T, H, W) uint16
│   │   │   └── 1/                   # Visualization level (T, H/2, W/2) uint8
│   │   ├── fl_ch_1/
│   │   │   ├── 0/                   # Full resolution (T, H, W) uint16
│   │   │   └── 1/                   # Visualization level (T, H/2, W/2) uint8
│   │   ├── seg_labeled_ch_0/
│   │   │   ├── 0/                   # Labeled segmentation (T, H, W) uint16
│   │   │   └── 1/                   # Visualization level (T, H/2, W/2) uint8
│   │   ├── seg_tracked_ch_0/
│   │   │   ├── 0/                   # Tracked segmentation (T, H, W) uint16
│   │   │   └── 1/                   # Visualization level (T, H/2, W/2) uint8
│   │   └── fl_background_ch_1/
│   │       ├── 0/                   # Background estimate (T, H, W) float32
│   │       └── 1/                   # Visualization level (T, H/2, W/2) uint8
│   ├── cells.zarr/                  # Per-cell crops
│   │   ├── metadata/
│   │   │   ├── cell_ids             # (N,) int32
│   │   │   ├── bboxes               # (N, T, 4) int32 [y0, x0, y1, x1]
│   │   │   └── valid_frames         # (N, T) bool
│   │   ├── pc_ch_0/
│   │   │   ├── {cell_id}/
│   │   │   │   ├── 0/               # Normalized crop (T, H, W) float32 [0,1]
│   │   │   │   └── 1/               # Visualization level (T, H/2, W/2) uint8
│   │   └── fl_ch_1/
│   │       └── {cell_id}/
│   │           ├── 0/               # Normalized crop (T, H, W) float32 [0,1]
│   │           └── 1/               # Visualization level (T, H/2, W/2) uint8
│   └── basename_fov_000_traces.csv  # Combined feature traces
├── fov_001/
│   └── ...

Batch Processing Implementation

Thread Pool Executor Pattern

from concurrent.futures import ThreadPoolExecutor, as_completed
from pyama.processing.workflow.run import run_single_worker

def run_batch(fov_batch, config, n_workers, metadata, output_dir, cancel_event):
    """Process a batch of FOVs in parallel."""
    
    # Sequential copying (I/O bound)
    copy_service = CopyingService()
    copy_service.process_all_fovs(
        metadata=metadata,
        config=config,
        output_dir=output_dir,
        fov_start=fov_batch[0],
        fov_end=fov_batch[-1],
        cancel_event=cancel_event,
    )
    
    # Copy-only mode: skip processing stages
    if config.params.copy_only:
        return
    
    # Parallel processing (CPU bound)
    worker_ranges = _split_worker_ranges(fov_batch, n_workers)
    
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        futures = {
            executor.submit(
                run_single_worker,
                fov_range,
                metadata,
                config,
                output_dir,
                cancel_event,
            ): fov_range
            for fov_range in worker_ranges
        }
        
        # Wait for completion with progress tracking
        overall_success = True
        for future in as_completed(futures):
            fov_range, successful, failed, message = future.result()
            update_progress(successful, failed, message)
            if failed:
                overall_success = False
    
    # Generate visualization cache only if every worker succeeded
    if overall_success:
        caching_service = CachingService()
        caching_service.process_all_fovs(
            metadata=metadata,
            config=config,
            output_dir=output_dir,
            fov_start=fov_batch[0],
            fov_end=fov_batch[-1],
            cancel_event=cancel_event,
        )
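
_split_worker_ranges is referenced above but not shown. One plausible implementation, given as a sketch under the assumption that it returns contiguous, near-equal chunks. Contiguity matters because each worker passes only the first and last FOV of its range to the services. The name split_worker_ranges is a stand-in, not the actual pyama function.

```python
def split_worker_ranges(fovs, n_workers):
    """Split a list of FOV indices into at most n_workers contiguous chunks (sketch)."""
    if not fovs:
        return []
    n_chunks = max(1, min(n_workers, len(fovs)))
    base, extra = divmod(len(fovs), n_chunks)
    ranges, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first chunks
        ranges.append(fovs[start:start + size])
        start += size
    return ranges


print(split_worker_ranges(list(range(10)), 3))
```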

Memory Management

def run_single_worker(fovs, metadata, config, output_dir, cancel_event):
    """Worker function for parallel FOV processing."""
    
    try:
        # Initialize services
        segmentation = SegmentationService(method=config.params.segmentation_method)
        tracking = TrackingService(method=config.params.tracking_method)
        background_estimation = BackgroundEstimationService()
        cropping = CroppingService()
        extraction = ExtractionService()
        
        # Process each service sequentially for FOV range
        segmentation.process_all_fovs(metadata, config, output_dir, fovs[0], fovs[-1], cancel_event)
        if cancel_event and cancel_event.is_set():
            return (fovs, 0, len(fovs), "Cancelled")
        
        tracking.process_all_fovs(metadata, config, output_dir, fovs[0], fovs[-1], cancel_event)
        if cancel_event and cancel_event.is_set():
            return (fovs, 0, len(fovs), "Cancelled")
        
        background_estimation.process_all_fovs(metadata, config, output_dir, fovs[0], fovs[-1], cancel_event)
        if cancel_event and cancel_event.is_set():
            return (fovs, 0, len(fovs), "Cancelled")
        
        cropping.process_all_fovs(metadata, config, output_dir, fovs[0], fovs[-1], cancel_event)
        if cancel_event and cancel_event.is_set():
            return (fovs, 0, len(fovs), "Cancelled")
        
        extraction.process_all_fovs(metadata, config, output_dir, fovs[0], fovs[-1], cancel_event)
        
        return (fovs, len(fovs), 0, "Completed")
        
    except Exception as e:
        logger.error(f"Error processing FOVs {fovs[0]}-{fovs[-1]}: {e}")
        return (fovs, 0, len(fovs), str(e))

Cancellation Support

Cancellation is handled via threading.Event within the task runner:

# Usage in workflow
if cancel_event.is_set():
    logger.info("Cancellation requested, cleaning up")
    cleanup_partial_results()
    return False
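
A self-contained illustration of the same pattern, using only the standard library: the event is checked before each step, so a cancellation requested during one step prevents the following steps from starting. run_steps is a hypothetical helper and omits the cleanup call shown above.

```python
import threading


def run_steps(steps, cancel_event):
    """Run (name, callable) steps in order, stopping before a step once cancelled."""
    completed = []
    for name, step in steps:
        if cancel_event.is_set():
            return False, completed
        step()
        completed.append(name)
    return True, completed


cancel_event = threading.Event()
steps = [
    ("segmentation", lambda: None),
    ("tracking", cancel_event.set),  # simulate a cancel request arriving mid-run
    ("extraction", lambda: None),
]
ok, completed = run_steps(steps, cancel_event)
print(ok, completed)
```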

Data Type Specifications

Image Arrays (Zarr Format)

Stage	Zarr Path	Data Type	Dimensions	Notes
Raw Images	`images.zarr/{pc|fl}_ch_{id}/0/`	uint16	(T, H, W)	Chunked Zarr arrays
Segmentation	`images.zarr/seg_labeled_ch_{id}/0/`	uint16	(T, H, W)	Labeled mask (untracked)
Tracking	`images.zarr/seg_tracked_ch_{id}/0/`	uint16	(T, H, W)	Cell IDs, 0=background
Background	`images.zarr/fl_background_ch_{id}/0/`	float32	(T, H, W)	Estimate per channel
Visualization	`images.zarr/{channel}/1/`	uint8	(T, H/2, W/2)	Downsampled for display
Cell Crops	`cells.zarr/{channel}/{cell_id}/0/`	float32	(T, H, W)	Normalized [0,1]
Cell Viz	`cells.zarr/{channel}/{cell_id}/1/`	uint8	(T, H/2, W/2)	Downsampled for display

CSV Schemas

Processing Traces (per-FOV):

Feature columns are suffixed with the channel ID ({feature}_ch_{channel_id}); base columns carry no channel suffix
Frame-based, time computed after loading
Includes quality flag (good column)

Merged Traces (per-sample):

Same format as processing traces
Multiple FOVs combined
Includes sample metadata in headers

Fitted Results (post-analysis):

One row per cell
Includes model type, R², parameters
Additional columns per model parameters

Algorithm Parameters

Segmentation Parameters

segmentation_params = {
    'logstd_window_size': 3,  # Neighborhood for std computation
    'morph_size': 7,         # Structuring element size
    'morph_iterations': 3,   # Number of opening/closing iterations
    'min_object_size': 50,   # Minimum cell size in pixels
    'max_object_size': 10000, # Maximum cell size in pixels
}

Tracking Parameters

tracking_params = {
    'min_iou': 0.1,     # Minimum IoU for cell matching
    'min_frames': 30,   # Minimum trace length
    'border_margin': 50, # Exclusion margin (pixels)
}

Extraction Parameters

extraction_params = {
    'background_weight': 1.0,  # Background correction weight [0-1]
    'frame_interval': 10.0,    # Minutes per frame (default)
    'time_mapping': None,      # Custom frame->time mapping (dict)
    'features': {
        'phase': ['area', 'aspect_ratio'],
        'fluorescence': ['intensity_total', 'intensity_mean']
    }
}
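
Because traces are stored per frame, frame_interval and time_mapping only matter when converting frames to time. A possible conversion is sketched below; frame_to_minutes is a hypothetical helper, and falling back to frame_interval for frames missing from time_mapping is an assumption, not documented behavior.

```python
def frame_to_minutes(frame, frame_interval=10.0, time_mapping=None):
    """Convert a frame index to minutes, preferring an explicit mapping when given."""
    if time_mapping is not None and frame in time_mapping:
        return float(time_mapping[frame])
    return frame * frame_interval  # assumed fallback: evenly spaced frames


print(frame_to_minutes(3))
print(frame_to_minutes(3, time_mapping={3: 31.5}))
```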

Performance Characteristics

Memory Usage

Dataset Size	Approximate RAM Usage	Notes
10 FOVs, 50 frames	1-2 GB	Single workstation
100 FOVs, 180 frames	8-12 GB	Requires 16GB+ RAM
500+ FOVs	32GB+	Consider distributed processing

Processing Speed

Operation	Speed (per FOV)	Parallel Scaling
Copying (sequential)	2-5 sec	No parallelization
Segmentation	10-30 sec	4-8 threads optimal
Tracking	5-15 sec	Linear up to CPU count
Background Estimation	5-20 sec	CPU-bound, parallel
Cropping	2-10 sec	CPU-bound, parallel
Extraction	2-8 sec	CPU-bound, parallel
Caching	1-5 sec	Sequential (post-processing)

Optimization Strategies

Memory Mapping: Use mmap_mode='r' for large arrays
Batch Size: Tune based on RAM availability
Worker Count: Match to CPU cores (typically 4-8)
SSD Storage: Improves I/O for large datasets

Extension Points

Custom Features

def extract_custom_feature(image, mask, context):
    """User-defined feature extraction."""
    # Implement custom logic
    return feature_value

# Register in feature system
PHASE_FEATURES['custom_feature'] = extract_custom_feature
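
A concrete feature following the same signature might look like this. It is a sketch: the PHASE_FEATURES dict below is a local stand-in for the registry named above, extract_mean_in_mask is a made-up feature, and the context argument is accepted but unused because its contents are not specified here.

```python
import numpy as np

# Local stand-in for the registry named in the text; the real one lives in pyama
PHASE_FEATURES = {}


def extract_mean_in_mask(image, mask, context=None):
    """Mean pixel value inside the mask; NaN when the mask is empty."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return float("nan")
    return float(np.asarray(image)[mask].mean())


PHASE_FEATURES["mean_in_mask"] = extract_mean_in_mask

image = np.array([[1.0, 2.0], [3.0, 4.0]])
mask = np.array([[True, False], [False, True]])
print(PHASE_FEATURES["mean_in_mask"](image, mask))
```

Returning NaN for an empty mask keeps the function from raising, in line with the plugin guideline of handling errors gracefully.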

Alternative Algorithms

Replace core algorithms while maintaining interface:

Segmentation: watershed, deep learning
Tracking: Kalman filter, graph-based
Feature extraction: custom metrics

Integration Hooks

class CustomPreprocessor:
    """Pre-process frames before segmentation."""
    def process(self, image):
        # Custom preprocessing
        return processed_image

# Inject into workflow
workflow.register_preprocessor(CustomPreprocessor())

Implementation Guidelines

Plugin Development

Follow Interface Contracts: Maintain input/output shapes
Handle Errors Gracefully: Return status codes, not exceptions
Consider Performance: Use vectorized operations where possible
Document Parameters: Include bounds, defaults, units
Provide Tests: Visual verification for image-based operations

Quality Assurance

Deterministic RNG: Use fixed seeds for reproducibility
Parameter Validation: Check bounds before processing
Progress Reporting: Provide meaningful status updates
Cleanup on Failure: Preserve partial results for debugging
Logging: Include sufficient diagnostic information

This reference provides complete technical specifications for implementing, extending, or reproducing the PyAMA processing pipeline in any environment.