Interchange: work with any TMS or CAT tool
Interchange is project-based. kapi extract and kapi merge read the
recipe's content globs and target locales and use the project's translation
memory plus the extraction bookkeeping kapi records for you, so start with
kapi init (see Day 0) and run the commands
from inside the project directory. See Project file
for the recipe and its git-style discovery. (Ad-hoc, one-way format conversion
with no round-trip needs no project — that is plain kapi run -i <file>.)
Interchange is how kapi hands content to the outside world in a standard, tool-neutral file and takes the result back. Most serious localization runs on bilingual exchange formats like XLIFF 2.x and PO (gettext): translators open them in a CAT tool (Trados, memoQ, OmegaT, Phrase, Lokalise, Crowdin, Poedit, …), translate, and send them back. kapi's job is the engineering glue on either side of that handoff — emit a clean bilingual file, accept the returned one, rebuild the source format byte-for-byte from a captured skeleton, and keep the project's translation memory and terminology in the loop so every merge makes the next extract cheaper.
The interchange landscape
"Interchange" spans three distinct boundaries. They share one seam — the same format readers and writers over kapi's content model — but differ in who they talk to and where the durable state lives:
| Boundary | What crosses | Driven by | Counterpart |
|---|---|---|---|
| Format conversion (ad-hoc, one-way) | one file, no round-trip state | kapi run -i + a format writer | none — just the pipeline |
| Bilingual round-trip (this page) | a document as a bilingual .klz (native) or XLIFF 2.x / PO (interop), worked elsewhere, merged back | kapi extract / kapi merge | a translator/reviewer, or an external TMS / CAT tool |
| Asset interchange | accumulated TM and terminology | kapi tm import/export (TMX), kapi termbase import/export (TBX) | corporate TM / term archives |
The bright line: bilingual round-trip and asset interchange both hand a file to a counterpart and bring the result back, with project state holding the way back; format conversion is one-way. This page is the bilingual round-trip — the handoff to a translator or TMS.
Two carriers, chosen by recipient. neokapi's native interchange format is the
lossless bilingual .klz (kapi extract --format klz) — for a translator or
reviewer working in kapi or the neokapi review tool; it carries inline codes, TM
matches, and term context in one file and gives integrity-verified, diffable
review. For a recipient on a third-party CAT tool, kapi extract emits XLIFF
2.x / PO (the default), the industry-interop tier this page focuses on. Both go
out and come back through the same extract / merge verbs. See
KLF vs XLIFF.
For architectural context and design decisions see AD-017: Bilingual Format Interop.
TM in the loop
The two commands that define this workflow are kapi extract and
kapi merge. The translation memory participates on both sides:
- TM pre-fill on extract: every segment is looked up against the project TM before it lands in the XLIFF. Exact matches are pre-filled with state="translated"; fuzzy matches above the recipe's tm.fuzzy_threshold are pre-filled as fuzzy. The translator's CAT tool shows leverage as a normal TM match.
- TM absorb on merge: every accepted target segment becomes a new or updated translation unit (TU) in the project TM, carrying provenance (merge batch id, source file path, block content hash, originating XLIFF filename) so a later kapi tm audit --batch <id> can trace every TU back to the merge that introduced it.
See the round-trip on a bilingual file
The diagram above is the workflow; the explorer below is the engine that
makes it safe. Pick the bilingual XLIFF sample, run it through, and compare
the source with the round-tripped output — kapi reads the file into the
content model, the leaf text is replaced, and everything else (structure,
identifiers, the skeleton kapi merge later rewrites from) comes back
unchanged. The same reader and writer back kapi extract and kapi merge,
running here in your browser via WebAssembly.
TM pre-fill and TM absorb are both on by default. --no-tm and --no-tm-update
disable them, respectively (e.g. for cold-translation workflows or dry-run review).
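The pre-fill rule can be pictured as a small classifier over match scores. The sketch below is an illustration only, not kapi's implementation: it assumes scores are percentages where 100 means an exact match, and it treats a score equal to the threshold as fuzzy (the text says only "above", so the boundary case is an assumption). The name classify_match is hypothetical.

```python
def classify_match(score: float, fuzzy_threshold: float = 75.0) -> str:
    """Bucket a TM match score (percent) into exact / fuzzy / new.

    Assumption: 100 means exact, and a score equal to the threshold
    counts as fuzzy. kapi's real boundary handling may differ.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be in 0..100, got {score}")
    if score == 100:
        return "exact"
    if score >= fuzzy_threshold:
        return "fuzzy"
    return "new"


# Example scores for three segments, using the recipe's threshold of 75.
buckets = [classify_match(s, fuzzy_threshold=75) for s in (100, 82, 40)]
print(buckets)
```

The three buckets correspond to the exact, fuzzy, and new counters that the extract summary reports per locale.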
The .kapi recipe
A minimal recipe wiring extract/merge into a React app:
# app.kapi
version: v1
name: My App

defaults:
  source_language: en-US
  target_languages: [fr-FR, de-DE, es-ES]

  merge:
    conflict_policy: translator-wins   # | existing-wins | newest-wins
  tm:
    fuzzy_threshold: 75                # percent; 0..100
  segmentation:
    source: false                      # opt-in SRX segmentation

content:
  - path: src/locales/en/*.json
    target: src/locales/{lang}         # mirror each file under the per-language dir
The merge, tm, and segmentation sections are described in detail
on the Kapi project file reference. All three are
optional — defaults apply when a section is omitted.
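To make the target: template concrete, here is a minimal sketch of how a source file could map to its per-language location. It assumes {lang} is replaced by the full locale code and that the file keeps its name under the resulting directory; both are assumptions drawn from the recipe comment, and target_path is a hypothetical helper, not part of kapi.

```python
from pathlib import PurePosixPath


def target_path(source_file: str, target_template: str, lang: str) -> str:
    """Map a source file to its mirrored location for one target locale.

    Assumption: "{lang}" expands to the locale code as written in the
    recipe (e.g. "fr-FR"), and the file name is preserved.
    """
    target_dir = target_template.replace("{lang}", lang)
    return str(PurePosixPath(target_dir) / PurePosixPath(source_file).name)


for lang in ("fr-FR", "de-DE", "es-ES"):
    print(target_path("src/locales/en/app.json", "src/locales/{lang}", lang))
```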
Day 0: initialize and seed the TM
kapi init # scaffold app.kapi + .kapi/
Edit the generated app.kapi to declare content globs, target
locales, and any optional recipe sections. Then seed the project TM
from existing memory, if you have any:
kapi tm import ./corporate-en-fr.tmx
Day 1: first extraction
From inside the project directory (no -p needed — the command
auto-discovers the .kapi recipe from cwd, git-style):
kapi extract
This writes:
- One XLIFF 2.2 file per source → target pair under out/:
  - out/src-locales-en-app.en-US-to-fr-FR.xliff
  - out/src-locales-en-app.en-US-to-de-DE.xliff
  - out/src-locales-en-app.en-US-to-es-ES.xliff
- Extraction bookkeeping (the captured skeletons and a manifest) recorded in the project's gitignored cache so kapi merge can reattach the returned translations and rebuild the source byte-for-byte, for document and markup formats and for keyed catalog formats alike (JSON, YAML, .properties, Android XML, .resx, Apple .strings / .stringsdict / .xcstrings, .arb, i18next, design tokens)
- A file-level <note category="kapi"> on each XLIFF stamping the batch id, source file, and source hash
Stdout summarizes:
Extracting batch 6f2e8a1c... (format=xliff2, targets=[fr-FR de-DE es-ES], sources=1)
fr-FR: 1 files, 412 blocks, TM exact=108 fuzzy=67 new=237
de-DE: 1 files, 412 blocks, TM exact=0 fuzzy=0 new=412
es-ES: 1 files, 412 blocks, TM exact=0 fuzzy=0 new=412
Batch 6f2e8a1c... complete. 3 files written to out/
Aggregate TM leverage: exact=108 fuzzy=67 new=1061 (total=1236)
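The aggregate line is simply the per-locale counters summed. As a quick check of the numbers in the sample output, the following sketch re-adds them; the dictionary layout and the aggregate function are illustrative and not a kapi data structure.

```python
# Per-locale counters copied from the sample summary.
per_locale = {
    "fr-FR": {"exact": 108, "fuzzy": 67, "new": 237},
    "de-DE": {"exact": 0, "fuzzy": 0, "new": 412},
    "es-ES": {"exact": 0, "fuzzy": 0, "new": 412},
}


def aggregate(counts: dict) -> dict:
    """Sum exact/fuzzy/new across locales and add a grand total."""
    totals = {"exact": 0, "fuzzy": 0, "new": 0}
    for locale_counts in counts.values():
        for key in totals:
            totals[key] += locale_counts[key]
    # The right-hand side is evaluated before "total" is added to the dict.
    totals["total"] = sum(totals.values())
    return totals


print(aggregate(per_locale))
```

Each locale contributes 412 blocks, so the total of 1236 is 3 × 412, and only fr-FR has any leverage because the seeded TM was English to French.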
Common extract options
kapi extract --target-lang fr # single target
kapi extract --target-lang fr,de # subset (comma-separated)
kapi extract --only marketing # one collection by name
kapi extract --pattern 'src/**/*.json' # extra glob
kapi extract --xliff-version 2.0 # pin older namespace
kapi extract --no-tm # skip pre-fill
kapi extract --out-dir dist/bilingual # alternate output dir
kapi extract --redact-rules .kapi/redaction.yaml # hide sensitive content
Multi-target in one pass is the default. Omit --target-lang to use
every locale in defaults.target_languages; pass a comma-separated
subset to restrict.
When extraction is redacted (--redact, --redact-rules, or
defaults.redaction in the recipe), the emitted file carries only
placeholders and the originals stay in a local vault; kapi merge
restores them automatically. See Redaction.
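The placeholder-and-vault idea can be sketched in a few lines. This is a conceptual illustration only: the email pattern, the [[REDACTED-n]] token format, and the in-memory vault are all assumptions made for the example and say nothing about kapi's actual redaction rules or storage.

```python
import re

# Hypothetical rule: treat email addresses as sensitive.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(text: str, vault: dict) -> str:
    """Replace each sensitive span with a token and remember the original."""
    def swap(match: re.Match) -> str:
        token = f"[[REDACTED-{len(vault) + 1}]]"
        vault[token] = match.group(0)
        return token
    return EMAIL.sub(swap, text)


def restore(text: str, vault: dict) -> str:
    """Put the originals back wherever their tokens survived translation."""
    for token, original in vault.items():
        text = text.replace(token, original)
    return text


vault = {}
outgoing = redact("Contact ana@example.com today", vault)
returned = "Contactez [[REDACTED-1]] aujourd'hui"  # translator kept the token
print(outgoing)
print(restore(returned, vault))
```

The point of the design is that the sensitive value never leaves the machine: only the token travels in the bilingual file.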
Day 2: translator work
Out-of-scope for kapi. The translator opens the XLIFF/PO in their CAT tool of choice, translates it, and returns it. Kapi is entirely out of the loop during this phase — which is exactly the point. Any bilingual tool that speaks XLIFF 2.x works.
Day N: merge the return
kapi merge -i vendor-return/app.en-US-to-fr-FR.xliff
Merge resolves the extraction manifest by reading the batch id from the
file's <note category="kapi" id="batch-id"> — the filename can
change during the vendor round-trip and merge still finds the right
batch.
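For readers who want to inspect a returned file by hand, the sketch below pulls the batch id out of an XLIFF 2.x document with Python's standard library. The sample XML is hypothetical: it assumes the note's text content is the batch id and that the note sits in the file's notes element. Tags are matched by local name so the namespace version does not matter.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical returned file; the batch id value is made up for the example.
SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0"
       version="2.0" srcLang="en-US" trgLang="fr-FR">
  <file id="f1">
    <notes>
      <note category="kapi" id="batch-id">example-batch-id</note>
    </notes>
    <unit id="u1">
      <segment><source>Hello</source><target>Bonjour</target></segment>
    </unit>
  </file>
</xliff>"""


def read_batch_id(xliff_text: str) -> Optional[str]:
    """Return the text of <note category="kapi" id="batch-id">, if present."""
    root = ET.fromstring(xliff_text)
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # drop "{namespace}"
        if (local_name == "note"
                and element.get("category") == "kapi"
                and element.get("id") == "batch-id"):
            return (element.text or "").strip() or None
    return None


print(read_batch_id(SAMPLE))
```

Because the id lives inside the file, a vendor can rename the file freely and the lookup still works.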
Multi-file in one pass:
kapi merge -i vendor-return/ # every .xliff in the directory
kapi merge -i 'vendor-return/*.xliff' # glob
kapi merge -i fr-FR.xliff -i de-DE.xliff # multiple -i
Each returning XLIFF is applied independently. A failure on one input (parse error, missing manifest, stale source) doesn't abort the others — per-file outcomes are reported and the exit code reflects any failure.
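The independence guarantee follows a familiar pattern: process each input in isolation, record its outcome, and derive the exit status at the end. The sketch below shows that pattern in general terms. merge_all and merge_one are hypothetical names, and the exit code value 1 is an assumption, since the text says only that the code reflects any failure.

```python
from typing import Callable, Dict, Iterable, Tuple


def merge_all(paths: Iterable[str],
              merge_one: Callable[[str], None]) -> Tuple[Dict[str, str], int]:
    """Apply merge_one to every path; one failure never stops the rest."""
    outcomes: Dict[str, str] = {}
    for path in paths:
        try:
            merge_one(path)
            outcomes[path] = "ok"
        except Exception as exc:  # isolate the failure to this input
            outcomes[path] = f"failed: {exc}"
    # Assumed convention: non-zero when at least one input failed.
    exit_code = 1 if any(v != "ok" for v in outcomes.values()) else 0
    return outcomes, exit_code


def fake_merge(path: str) -> None:
    """Stand-in for a real merge; fails on one file to show isolation."""
    if path == "de-DE.xliff":
        raise ValueError("missing manifest")


outcomes, exit_code = merge_all(["fr-FR.xliff", "de-DE.xliff", "es-ES.xliff"], fake_merge)
print(outcomes, exit_code)
```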
Stale segments
Merge compares every incoming block's <source> against the current
source file. If the source text drifted since extract (someone edited
the JSON while the XLIFF was out with the translator), that block is
reported as stale and skipped — neither applied nor TM-absorbed.
You can re-extract and re-translate the drifted portion.
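The stale check amounts to comparing what the translator saw with what the source says now. Here is a minimal sketch, assuming an exact string comparison per block id and treating blocks that no longer exist as stale; kapi's actual comparison (for example, by content hash) may differ, and partition_stale is a hypothetical name.

```python
from typing import Dict, List, Tuple


def partition_stale(
    incoming: Dict[str, Tuple[str, str]],
    current_sources: Dict[str, str],
) -> Tuple[Dict[str, str], List[str]]:
    """Split returned blocks into applicable targets and stale block ids.

    incoming maps block id -> (source text seen by translator, target text).
    current_sources maps block id -> source text as it is on disk now.
    """
    applied: Dict[str, str] = {}
    stale: List[str] = []
    for block_id, (seen_source, target) in incoming.items():
        if current_sources.get(block_id) == seen_source:
            applied[block_id] = target
        else:
            stale.append(block_id)  # drifted or removed: skip it entirely
    return applied, stale


incoming = {"b1": ("Save", "Enregistrer"), "b2": ("Cancel", "Annuler")}
current = {"b1": "Save", "b2": "Cancel changes"}  # b2 was edited after extract
applied, stale = partition_stale(incoming, current)
print(applied, stale)
```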
Conflict policy
When an on-disk target (or TM TU) already carries a translation for a
block, merge.conflict_policy picks what happens:
| Policy | Behavior |
|---|---|
| translator-wins | Default. The translator's target always replaces the existing. |
| existing-wins | Preserve the existing target; skip the translator's. |
| newest-wins | Compare timestamps (file mtime / TU UpdatedAt); pick the newer. |
The same policy applies to TM write-back when a TU for this source already has a translation — never ambiguous, never interactive.
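The three policies reduce to a small, deterministic decision function. The sketch below is one way to express them, not kapi's code: the Candidate type and resolve function are hypothetical, and keeping the existing target on a timestamp tie under newest-wins is an assumption the text does not settle.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Candidate:
    text: str
    updated_at: datetime


def resolve(policy: str, existing: Optional[Candidate],
            incoming: Candidate) -> Candidate:
    """Pick the surviving target for one block under a conflict policy."""
    if existing is None:
        return incoming  # nothing to conflict with
    if policy == "translator-wins":
        return incoming
    if policy == "existing-wins":
        return existing
    if policy == "newest-wins":
        # Assumption: on an exact tie, keep what is already there.
        return incoming if incoming.updated_at > existing.updated_at else existing
    raise ValueError(f"unknown conflict policy: {policy}")


old = Candidate("Enregistrer", datetime(2024, 1, 1))
new = Candidate("Sauvegarder", datetime(2024, 6, 1))
for policy in ("translator-wins", "existing-wins", "newest-wins"):
    print(policy, "->", resolve(policy, old, new).text)
```

Because every branch returns without asking anyone, the same function can serve both the on-disk target and the TM write-back.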
Audit: what did this merge contribute?
kapi tm audit --batch <merge-batch-id>
Lists every TM entry written or updated by a specific merge with timestamp, source file path, block content hash, and originating XLIFF filename — the answer to "what did this merge do?" without leaving the CLI.
PO (gettext) alternative
Same flow, different exchange format:
kapi extract --format po # emits .po files
kapi merge -i vendor-return/app.en-US-to-fr-FR.po
PO output carries kapi's bookkeeping as extracted comments:
- File-level (on the header entry): #. kapi-batch:, #. kapi-source-file:, #. kapi-source-hash:
- Per-entry: #. kapi-block: <block-id> for merge correlation
- #, fuzzy on entries pre-filled from fuzzy TM matches
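These comments are plain text, so they are easy to inspect without a PO library. The sketch below reads them from a hypothetical PO snippet. The values after each colon, the blank-line entry separation, and the helper name read_kapi_comments are assumptions for the example; a production parser should use a real gettext library.

```python
# Hypothetical PO content; ids and hash values are made up.
SAMPLE_PO = """
#. kapi-batch: example-batch-id
#. kapi-source-file: src/locales/en/app.json
#. kapi-source-hash: abc123
msgid ""
msgstr ""

#. kapi-block: b1
#, fuzzy
msgid "Hello"
msgstr "Bonjour"

#. kapi-block: b2
msgid "Save"
msgstr ""
"""


def read_kapi_comments(po_text: str):
    """Return (file-level metadata, [(block id, is_fuzzy), ...])."""
    file_meta, blocks = {}, []
    for chunk in po_text.strip().split("\n\n"):  # one chunk per PO entry
        block_id, fuzzy = None, False
        for line in chunk.splitlines():
            if line.startswith("#. kapi-"):
                key, _, value = line[len("#. kapi-"):].partition(":")
                if key == "block":
                    block_id = value.strip()
                else:
                    file_meta[key] = value.strip()
            elif line.startswith("#,") and "fuzzy" in line:
                fuzzy = True
        if block_id is not None:
            blocks.append((block_id, fuzzy))
    return file_meta, blocks


file_meta, blocks = read_kapi_comments(SAMPLE_PO)
print(file_meta)
print(blocks)
```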
A single kapi merge -i invocation can mix XLIFF and PO inputs from
the same batch — useful when different vendors ship in different
formats.
PO output is one entry per block (matches the segmentation-off case).
Enabling defaults.segmentation.source: true together with
--format po errors early; emit XLIFF for projects that rely on
sentence-level segmentation.
What stays in the project
XLIFF and PO carry exactly one chosen target per locale, the source runs and inline codes, and — with the captured skeleton — enough to rebuild the original file byte-for-byte. Everything else in kapi's content model is deliberately not projected into the bilingual file; it stays in project state (the translation memory, termbase, and stand-off overlays):
- Stand-off overlays — terminology, entity, and QA annotations anchored to run ranges.
- Tone / channel variants — alternate targets keyed by more than locale.
- Provenance and status — where each target came from (TM pre-fill, a specific merge batch, an MT or AI tool) and its review state.
This is by design: the bilingual file is the lean, tool-neutral handoff, and the project is the system of record. It is also why merge needs the same project the extract ran in — the skeleton and manifest that make the round-trip byte-exact live there, not in the file the translator returns.
Deeper reading
- AD-017: Bilingual Format Interop (full design rationale, six-boundary framing, conflict policy table)
- AD-009: Translation Memory (matching tiers, TM schema, LookupSegment for sentence-level leverage)
- AD-008: Project Model (the extraction-batch bookkeeping and merge.conflict_policy recipe shape)
- AD-013: Kapi CLI (command tree and auto-discovery semantics)