Segmentation tool
The Segmentation tool splits a block's text into sentence-level segments. Segmentation determines the unit of translation and of translation-memory matching, so consistent segmentation is important for leverage and review. By default the tool segments source text using built-in SRX-style rules that handle common sentence boundaries and abbreviations; rules can also be loaded from an SRX file.
The number of segments produced is recorded on each block. Already-segmented text is left alone unless re-segmentation is requested. Target text can be segmented independently, with its own rules file.
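To illustrate how SRX-style rules decide where a segment ends, the Python sketch below tests an ordered list of break and no-break rules at each position in the text; the first rule that matches wins. The `Rule` class, the `segment` function, and the two sample rules are assumptions made for illustration. They are not the tool's built-in rule set or its implementation.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical stand-in for one SRX rule (illustration only)."""
    is_break: bool   # True = break rule, False = no-break (exception) rule
    before: str      # regex that must match the text ending at the position
    after: str       # regex that must match the text starting at the position


# Sample rules, assumed for this sketch. Order matters: exceptions come first.
RULES = [
    Rule(False, r"\b(?:Mr|Mrs|Dr|etc|e\.g|i\.e)\.", r"\s"),  # abbreviations
    Rule(True, r"[.!?]", r"\s"),                              # sentence end
]


def segment(text, rules=RULES):
    """Split text into sentences; the first matching rule decides each position."""
    compiled = [
        (r.is_break, re.compile("(?:" + r.before + r")\Z"), re.compile(r.after))
        for r in rules
    ]
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in compiled:
            # endpos=pos makes \Z anchor the "before" pattern at this position.
            if before.search(text, 0, pos) and after.match(text, pos):
                if is_break:
                    segments.append(text[start:pos])
                    start = pos
                break  # first matching rule wins, break or not
    segments.append(text[start:])
    # Surrounding whitespace is stripped here for readability.
    return [s.strip() for s in segments if s.strip()]


print(segment("Dr. Smith arrived. He was late! Why?"))
# ['Dr. Smith arrived.', 'He was late!', 'Why?']
```

The no-break rule is listed first so that "Dr." is matched as an abbreviation before the generic sentence-end rule is considered.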
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| credential | string | | Stored credential name for the llm engine |
| engine | string | srx | Segmenter backend: srx (rule-based; default), uax29 (Unicode baseline), llm (semantic chunks), or sat (ML model) |
| instruction | string | | Optional guidance for the llm engine |
| layer | string | | Segmentation overlay layer name; empty uses the engine's natural layer |
| model | string | | Model name for the llm or sat engine |
| overwriteSegmentation | boolean | false | Re-segment already-segmented blocks, replacing previous segmentation |
| provider | string | | AI provider id for the llm engine |
| renumberCodes | boolean | false | Renumber inline code IDs when materializing segments to a bilingual format |
| satModel | string | | SaT model for the sat engine (e.g. sat-3l-sm) |
| segmentSource | boolean | true | Segment the source text |
| segmentTarget | boolean | false | Segment existing target text |
| sourceSrxPath | string | | Path to an SRX 2.0 rules file for source text (srx engine) |
| targetSrxPath | string | | Path to an SRX 2.0 rules file for target text (srx engine) |
| threshold | number | | Boundary probability threshold for the sat engine (0 = model default) |
| treatIsolatedCodesAsWhitespace | boolean | false | Treat isolated inline codes as whitespace during segmentation |
| trimLeadingWhitespace | boolean | true | Exclude leading whitespace from each segment span |
| trimTrailingWhitespace | boolean | true | Exclude trailing whitespace from each segment span |
Configure these parameters interactively and copy the flow-step YAML on the Tool Reference.
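As a rough illustration of what trimLeadingWhitespace and trimTrailingWhitespace mean for a segment span, the Python sketch below narrows a span so that it excludes surrounding whitespace. The helper name `trim_span` and the use of half-open character offsets are assumptions made for this example; the tool's actual span representation may differ.

```python
def trim_span(text, start, end, trim_leading=True, trim_trailing=True):
    """Return (start, end) narrowed to exclude whitespace at either edge.

    Hypothetical helper: spans are assumed to be half-open character
    offsets [start, end) into text.
    """
    if trim_leading:
        while start < end and text[start].isspace():
            start += 1
    if trim_trailing:
        while end > start and text[end - 1].isspace():
            end -= 1
    return start, end


text = "One.  Two. "
start, end = trim_span(text, 4, 11)   # raw span covers "  Two. "
print(repr(text[start:end]))          # 'Two.'
```

With both flags set to false in this sketch, the span is returned unchanged, so the whitespace stays inside the segment.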
Examples
Segment source text with default rules
Split source into sentences using the built-in rules.
segmentSource: true
Re-segment with a custom SRX file
Replace existing segmentation using project-specific rules.
segmentSource: true
overwriteSegmentation: true
sourceSrxPath: ./rules/segmentation.srx
Processing notes
Operates on translatable blocks only; non-translatable blocks pass through unchanged.
The resulting segment count is written to a block property.
Limitations
The built-in rule set targets common Latin-script sentence boundaries; non-Latin scripts may need a custom SRX file.