XML format (.xml)
The XML format reads generic XML documents, extracts the text of configured elements and attributes as translatable blocks, and writes the translations back while preserving the surrounding markup. Inline elements become inline codes within a block, so formatting tags survive translation.
Extraction is rule-driven. The simple lists — translatableElements,
translatableAttributes, inlineElements, excludedElements — cover most
cases. The advanced elements and attributes maps express richer,
ITS-style rules with attribute conditions, ID and translatable-attribute
mappings, and explicit include/exclude behaviour. When no translatable
elements are listed, all text content is treated as translatable; set
excludeByDefault to invert that and include only what rules opt in.
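As a rough sketch of how the four simple lists might sit together in one configuration, assuming a hypothetical document vocabulary (the element name code is invented for illustration; the other names are borrowed from the examples further down, and none are defaults):

```yaml
# Illustrative only: element and attribute names are hypothetical.
translatableElements:
  - title
  - para
translatableAttributes:
  - alt
inlineElements:
  - b
  - i
excludedElements:
  - code
```

Because translatableElements is non-empty here, only title and para text would be extracted; leaving it out would fall back to treating all text content as translatable.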
How kapi reads it
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| attributes | object | | Advanced attribute-specific processing rules with element scope constraints |
| blockTypeMap | object | | Map of element names to block type strings for semantic type annotation |
| codeFinderRules | array | | Regex patterns that match inline codes within translatable text |
| elements | object | | Advanced element-specific processing rules with conditions, inline marking, and translatable attribute mappings |
| excludeByDefault | boolean | false | Exclude all elements unless explicitly included by an element rule with INCLUDE |
| excludedElements | array | | Element names whose content is excluded from extraction |
| groupElements | array | | Element names that produce group/layer boundaries in the output |
| idAttributes | array | | Attribute names used to extract block IDs from elements |
| inlineElements | array | | Element names treated as inline (spans within text) rather than block-level |
| preserveWhitespace | boolean | false | Preserve original whitespace in text content instead of collapsing it |
| preserveWhitespaceElements | array | | Element names that preserve whitespace regardless of the global setting |
| subfilters | array | | Array of {pattern, format} mappings for embedded content. Patterns use dot-separated element paths with glob support. |
| translatableAttributes | array | | Attribute names that are translatable across all elements |
| translatableElements | array | | Element names whose text content is translatable. If empty, all text content is translatable. |
| useCodeFinder | boolean | false | Enable regex-based detection of inline codes within translatable text |
Configure these parameters interactively and copy the YAML on the Format Reference.
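Several of these parameters are not covered by the examples below. The following is a minimal sketch of how they might be combined. It assumes that each codeFinderRules entry is a plain regex string, and it uses hypothetical names (id, pre) and a hypothetical placeholder pattern:

```yaml
# Illustrative only: names and the regex are hypothetical.
idAttributes:
  - id                 # take block IDs from an "id" attribute
preserveWhitespace: false
preserveWhitespaceElements:
  - pre                # keep whitespace inside <pre> despite the global setting
useCodeFinder: true
codeFinderRules:
  - '\{\d+\}'          # assumed shape: one regex per entry; matches {0}, {1}, ...
```

The single-quoted YAML scalar keeps the backslashes literal, so the pattern reaches the reader unchanged.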
Examples
Extract specific elements
Translate only title and para text, treating b and i as inline.
translatableElements:
  - title
  - para
inlineElements:
  - b
  - i
Translatable attributes
Extract title and alt attribute values across all elements.
translatableAttributes:
  - title
  - alt
Conditional extraction with rules
Include only div elements whose translate attribute is yes.
excludeByDefault: true
elements:
  div:
    ruleTypes:
      - INCLUDE
    conditions:
      - translate
      - EQUALS
      - "yes"
Processing notes
Inline elements become inline codes within blocks; block-level elements form the surrounding structure.
Element rules with INCLUDE/EXCLUDE combine with excludeByDefault to give fine control over which content is extracted.
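The opposite direction can be sketched the same way: leave excludeByDefault at its default and carve out exceptions with an EXCLUDE rule. This mirrors the structure of the conditional example above; the element name note is hypothetical, and the exact interaction with other rules is an assumption rather than documented behaviour.

```yaml
# excludeByDefault stays false, so content is extracted unless a rule excludes it.
# The element name "note" is hypothetical.
elements:
  note:
    ruleTypes:
      - EXCLUDE
    conditions:
      - translate
      - EQUALS
      - "no"
```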
Limitations
This is a generic XML reader; for specific XML dialects (such as RESX) start from a tailored rule set rather than the bare defaults.
Element and attribute names in advanced rules wrapped in single quotes are treated as anchored regular expressions.
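A sketch of what a regex-named rule might look like, assuming the single quotes have to survive YAML parsing and are therefore wrapped in double quotes (this quoting is a guess, not confirmed by the reference):

```yaml
excludeByDefault: true
elements:
  # Assumed quoting: the outer double quotes are YAML syntax,
  # the inner single quotes mark the name as a regular expression.
  "'h[1-6]'":
    ruleTypes:
      - INCLUDE
```

Since the expression is anchored, this pattern would cover h1 through h6 as whole names and would not match names such as h10 or th1.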