Repetition Analysis tool

The Repetition Analysis tool tracks source text as blocks stream through the pipeline and tags each block according to whether its source has been seen before. The first time a given source text appears it is marked as a first occurrence; subsequent identical occurrences are marked as repetitions. Each block also records a group key linking equal segments, the running count of occurrences, and its 1-based index within the group.
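The tagging described above can be sketched in a few lines of Python. This is only an illustration: the `tag_blocks` function, the field names (`group_key`, `status`, `count`, `index`), and the use of the trimmed text itself as the group key are assumptions made for the example, not the tool's actual annotation format.

```python
from collections import defaultdict

def tag_blocks(sources):
    """Tag each source string as a first occurrence or a repetition.

    Hypothetical sketch: field names and the choice of group key are
    illustrative, not the tool's real output schema.
    """
    seen = defaultdict(int)  # group key -> occurrences seen so far
    tagged = []
    for text in sources:
        key = text.strip()  # surrounding whitespace is ignored
        seen[key] += 1
        tagged.append({
            "source": text,
            "group_key": key,
            "status": "first" if seen[key] == 1 else "repetition",
            "count": seen[key],  # running count at this point in the stream
            "index": seen[key],  # 1-based position within the group
        })
    return tagged

for block in tag_blocks(["Save", "Cancel", "Save ", "Save"]):
    print(block["status"], block["index"], repr(block["source"]))
```

Because the blocks are processed as a stream, the sketch only knows how many occurrences it has seen so far, which is why the running count and the 1-based index coincide at the moment a block is tagged.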

Source text is trimmed of surrounding whitespace before comparison. The output feeds scoping and pricing: repeated segments can be translated once and reused, so identifying them quantifies the leverage available from repetition.
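To make the idea of leverage concrete, the sketch below counts how many segments in a batch are repetitions of an earlier one. The `repetition_summary` helper is hypothetical, and counting whole segments (rather than words or characters) is an assumption; real scoping and pricing models may weight repetitions differently.

```python
from collections import Counter

def repetition_summary(sources):
    """Return (total, unique, repeated) segment counts for trimmed sources."""
    counts = Counter(text.strip() for text in sources)
    total = sum(counts.values())
    unique = len(counts)
    # Every occurrence beyond the first in a group is a repetition.
    return total, unique, total - unique

total, unique, repeated = repetition_summary(["OK", " OK", "Cancel", "OK ", "Help"])
print(f"{repeated} of {total} segments are repetitions ({unique} unique)")
```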

ID: repetition-analysis

Source: Built-in

Category: analysis

Cardinality: monolingual

Tags: analysis

Parameters

Parameter	Type	Default	Description
`caseSensitive`	boolean	true	Whether comparison is case-sensitive

These parameters can be configured interactively on the Tool Reference page, where the resulting flow-step YAML can also be copied.

Examples

Case-insensitive repetition

Treat segments that differ only in case as repetitions.

caseSensitive: false
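The effect of this setting on matching can be pictured as a change in how the comparison key is derived. In the Python sketch below, `group_key` is a hypothetical helper, and the use of `str.casefold()` for case-insensitive comparison is an assumption; the tool's actual case-folding rule is not documented here.

```python
def group_key(text, case_sensitive=True):
    """Derive the key used to decide whether two sources are equal."""
    key = text.strip()
    # Assumption: case-insensitive mode folds case before comparing.
    return key if case_sensitive else key.casefold()

a, b = "Save File", "SAVE FILE "

# Default (caseSensitive: true): the two segments fall into different groups.
print(group_key(a) == group_key(b))

# caseSensitive: false: the two segments share a group, so the second
# one would be tagged as a repetition.
print(group_key(a, case_sensitive=False) == group_key(b, case_sensitive=False))
```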

Processing notes

Operates on translatable blocks; non-translatable structure passes through unchanged.
Counts the source side only.

Limitations

Matches exact (trimmed) source text only; near-duplicates and fuzzy matches are not detected here.
Repetition state is tracked across the whole run; ordering of occurrences depends on block order in the pipeline.
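Both limitations can be seen in a small Python sketch. The `first_occurrence_positions` helper is hypothetical and only mimics exact matching on trimmed text: segments that differ by a trailing period or by internal spacing land in separate groups, and which block counts as the first occurrence depends purely on the order in which blocks arrive.

```python
def first_occurrence_positions(sources):
    """Map each trimmed source text to the position where it first appears."""
    firsts = {}
    for position, text in enumerate(sources):
        # setdefault keeps the earliest position for each distinct text.
        firsts.setdefault(text.strip(), position)
    return firsts

blocks = ["Save file", "Save file.", "Save  file", "Save file"]

# Three distinct groups: only the last block repeats the first one exactly.
print(first_occurrence_positions(blocks))

# The same blocks in a different order give different first occurrences.
print(first_occurrence_positions(list(reversed(blocks))))
```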
