Regex Extraction format (.ini, .info, .rls)
The regex format is a configurable extractor for text formats that
have no dedicated reader — Mac .strings, INI files, StringInfo, and
similar key/value or delimited text. You supply a list of rules; each
rule is a regular expression with capture groups that identify the
translatable source text and, optionally, an identifier and a note.
Rules are applied across the input left to right. Where rules would overlap, the earliest match wins. Each match produces one translatable block whose source is the configured capture group; the text on either side of the captured value is preserved verbatim so the writer can rebuild the line by assembling prefix, translated value, and suffix. Content that no rule matches is preserved as non-translatable structure. With no rules configured, nothing is extracted and the whole input is preserved.
Configuration is supplied on the collection's format: block in a .kapi project. The keys are:
rules: a list of { pattern, sourceGroup, idGroup, noteGroup } records. sourceGroup is the 1-based capture group holding the source text (required, must be at least 1). idGroup is the capture group whose text names the block (0 auto-generates IDs). noteGroup is the capture group holding a translator note (0 for none).
escapeType: how to decode escapes inside the extracted text. One of none (the default), backslash (\", \\, \n, \t, …), or doublechar (a character escaped by doubling it).
escapeChar: the character used for doublechar mode (default ").
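As a rough illustration of the two escape modes, the following Go sketch decodes a backslash-escaped string and a doubled-character string. The function names are hypothetical, and the handling of escapes other than \n and \t (keeping the escaped character as-is) is an assumption; the extractor's own decoder may cover more sequences.

```go
package main

import (
	"fmt"
	"strings"
)

// decodeBackslash handles a few common backslash escapes. Any other escaped
// character is kept as-is, which covers \" and \\ (assumed behavior).
func decodeBackslash(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		if s[i] != '\\' || i+1 == len(s) {
			b.WriteByte(s[i]) // ordinary byte, or a trailing lone backslash
			continue
		}
		i++ // skip the backslash and look at the escaped byte
		switch s[i] {
		case 'n':
			b.WriteByte('\n')
		case 't':
			b.WriteByte('\t')
		default:
			b.WriteByte(s[i])
		}
	}
	return b.String()
}

// decodeDoubleChar collapses each doubled escape character into a single one.
func decodeDoubleChar(s string, esc byte) string {
	single := string([]byte{esc})
	return strings.ReplaceAll(s, single+single, single)
}

func main() {
	fmt.Printf("%q\n", decodeBackslash(`say \"hi\"\n`))
	fmt.Printf("%q\n", decodeDoubleChar(`say ""hi""`, '"'))
}
```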
Examples
Extract Mac .strings entries
A single rule capturing the value of a "key" = "value"; line.
Group 2 is the source text; group 1 supplies the block ID.
format:
  name: regex
  config:
    rules:
      - pattern: '"([^"]*)"\s*=\s*"([^"]*)";'
        sourceGroup: 2
        idGroup: 1
    escapeType: backslash
Processing notes
One block per rule match; non-matching spans become non-translatable Data so the output round-trips.
The raw text surrounding the captured source group is stored on the block so the writer reassembles output by prefix + value + suffix.
The optional idGroup and noteGroup captures populate the block name and a translator note respectively.
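The prefix + value + suffix reassembly can be sketched in Go using the .strings pattern from the example. This is an illustration under assumptions, not the extractor's implementation: the `block` type, the `rewrite` function, and the `translate` callback are hypothetical names, and escape decoding is left out for brevity.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// block is a hypothetical stand-in for an extracted translatable unit.
type block struct {
	id, source     string
	prefix, suffix string // raw match text on either side of the source group
}

// stringsRule is the pattern from the .strings example: group 1 is the key,
// group 2 is the value.
var stringsRule = regexp.MustCompile(`"([^"]*)"\s*=\s*"([^"]*)";`)

// rewrite copies unmatched spans through unchanged and rebuilds each matched
// span as prefix + translated value + suffix.
func rewrite(input string, translate func(block) string) string {
	var out strings.Builder
	pos := 0
	for _, m := range stringsRule.FindAllStringSubmatchIndex(input, -1) {
		// m[0:2] is the whole match, m[2:4] group 1 (ID), m[4:6] group 2 (source).
		b := block{
			id:     input[m[2]:m[3]],
			source: input[m[4]:m[5]],
			prefix: input[m[0]:m[4]],
			suffix: input[m[5]:m[1]],
		}
		out.WriteString(input[pos:m[0]]) // non-translatable data between matches
		out.WriteString(b.prefix + translate(b) + b.suffix)
		pos = m[1]
	}
	out.WriteString(input[pos:]) // trailing non-translatable data
	return out.String()
}

func main() {
	input := "/* greeting */\n\"hello\" = \"Hello\";\n"

	// An identity translation reproduces the input byte for byte.
	same := rewrite(input, func(b block) string { return b.source })
	fmt.Println(same == input)

	// A real translation replaces only the captured value.
	fmt.Print(rewrite(input, func(b block) string { return strings.ToUpper(b.source) }))
}
```

Because only the captured value is swapped, the comment line, the key, the spacing around the equals sign, and the trailing semicolon all survive unchanged.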
Limitations
Patterns use Go's RE2 regular-expression syntax, which does not support backreferences or lookarounds.
Where matches overlap, only the first match is kept; later overlapping matches are discarded.
Escape decoding (escapeType) applies to the extracted source text, which can differ from how a delimiter-scanning filter would handle the same escapes.
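The RE2 restriction is easy to check before putting a pattern into a rule. The sketch below uses Go's standard regexp package; the `compiles` helper is a hypothetical name, and the sample patterns are illustrative only.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// A backreference such as \1 is rejected at compile time.
	fmt.Println(compiles(`(["'])(.*?)\1`))

	// A lookahead such as (?=;) is rejected as well.
	fmt.Println(compiles(`"[^"]*"(?=;)`))

	// An alternation with one group per quote style is accepted and can
	// often stand in for the backreference form.
	fmt.Println(compiles(`"([^"]*)"|'([^']*)'`))
}
```

Note that the alternation form puts the value in a different capture group for each quote style, so it may need to be split into two rules, each with its own sourceGroup.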