HTML format (.html, .htm, .xhtml)
The HTML format reads HTML documents, extracts translatable text and
localizable attributes, and writes the translations back while preserving
the surrounding markup. Inline elements (such as b, i, a) become inline
codes within a block, so formatting and links survive translation.
The reader ships with sensible defaults for which elements hold translatable
text, which are inline, and which attributes (such as alt and title) are
localizable. The elements and attributes maps let you override or extend
those rules per element and per attribute, mirroring the okf_html bridge
configuration. Parser behaviour (whitespace handling) is grouped under
parser.
How kapi reads it
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| attributes | object | | Global attribute extraction rules: maps attribute names to their rule configuration (ruleTypes, allElementsExcept, onlyTheseElements, conditions) |
| codeFinderRules | array | | Regex patterns that match inline codes within translatable text |
| elements | object | | Element extraction rules: maps element names to their rule configuration (ruleTypes, conditions, idAttributes, translatableAttributes) |
| parser | object | | Settings that control how the HTML parser reads input |
| useCodeFinder | boolean | false | Enable regex-based detection of inline codes (placeholders, variables, tags) within translatable text |
These parameters can be configured interactively on the Format Reference, which also lets you copy the resulting YAML.
Examples
Preserve whitespace
Keep significant whitespace in text nodes instead of collapsing it.

    parser:
      preserveWhitespace: true
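
Enable the code finder
Treat placeholders such as %s and {name} as inline codes. This is a minimal sketch: it assumes each codeFinderRules entry is a plain regular-expression string, and the two patterns are illustrative, not defaults shipped with the reader.

```yaml
useCodeFinder: true
codeFinderRules:
  - '%[sd]'      # printf-style placeholders such as %s and %d (illustrative pattern)
  - '\{\w+\}'    # brace placeholders such as {name} (illustrative pattern)
```

Single-quoted YAML strings keep backslashes literal, so the regex needs no extra escaping.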
Make a custom element translatable
Extract the text of a custom <summary> element.

    elements:
      summary:
        ruleTypes:
          - TEXTUNIT

Processing notes
Inline elements become inline codes within blocks; block-level elements form the surrounding structure.
Localizable attributes (such as alt and title) are extracted as their own translatable units.
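
Extending attribute extraction to another attribute might look roughly like the sketch below. The rule type name ATTRIBUTE_TRANS is an assumption borrowed from the okf_html configuration that this format mirrors, and the placeholder attribute on input elements is only an illustration. Check the Format Reference for the values the reader actually accepts.

```yaml
attributes:
  placeholder:
    ruleTypes:
      - ATTRIBUTE_TRANS   # assumed rule type name, taken from okf_html
    onlyTheseElements:
      - input             # apply the rule to <input> elements only
```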
Limitations
The reader applies built-in element/attribute defaults; the elements and attributes maps adjust them rather than replacing the entire rule set.