
Interchange: work with any TMS or CAT tool

Needs a .kapi project

Interchange is project-based. kapi extract and kapi merge read the recipe's content globs and target locales and use the project's translation memory plus the extraction bookkeeping kapi records for you, so start with kapi init (see Day 0) and run the commands from inside the project directory. See Project file for the recipe and its git-style discovery. (Ad-hoc, one-way format conversion with no round-trip needs no project — that is plain kapi run -i <file>.)

Interchange is how kapi hands content to the outside world in a standard, tool-neutral file and takes the result back. Most serious localization runs on bilingual exchange formats like XLIFF 2.x and PO (gettext): translators open them in a CAT tool (Trados, memoQ, OmegaT, Phrase, Lokalise, Crowdin, Poedit, …), translate, and send them back. kapi's job is the engineering glue on either side of that handoff — emit a clean bilingual file, accept the returned one, rebuild the source format byte-for-byte from a captured skeleton, and keep the project's translation memory and terminology in the loop so every merge makes the next extract cheaper.

The interchange landscape

"Interchange" spans three distinct boundaries. They share one seam — the same format readers and writers over kapi's content model — but differ in who they talk to and where the durable state lives:

Boundary | What crosses | Driven by | Counterpart
Format conversion (ad-hoc, one-way) | one file, no round-trip state | kapi run -i + a format writer | none, just the pipeline
Bilingual round-trip (this page) | a document as a bilingual .klz (native) or XLIFF 2.x / PO (interop), worked elsewhere, merged back | kapi extract / kapi merge | a translator/reviewer, or an external TMS / CAT tool
Asset interchange | accumulated TM and terminology | kapi tm import/export (TMX), kapi termbase import/export (TBX) | corporate TM / term archives

The bright line: bilingual round-trip and asset interchange both hand a file to a counterpart and bring the result back, with project state holding the way back; format conversion is one-way. This page is the bilingual round-trip — the handoff to a translator or TMS.

Two carriers, chosen by recipient. neokapi's native interchange format is the lossless bilingual .klz (kapi extract --format klz), meant for a translator or reviewer working in kapi or the neokapi review tool; it carries inline codes, TM matches, and term context in one file and gives integrity-verified, diffable review. For a recipient on a third-party CAT tool, kapi extract emits XLIFF 2.x (the default) or PO (--format po), the industry-interop tier this page focuses on. Both carriers go out and come back through the same extract / merge verbs. See KLF vs XLIFF.

For architectural context and design decisions see AD-017: Bilingual Format Interop.

TM in the loop

The two commands that define this workflow are kapi extract and kapi merge. The translation memory participates on both sides:

[Diagram: authored source → kapi extract → XLIFF / PO → translator → translated XLIFF / PO returned by translator → kapi merge. The project TM pre-fills on extract and absorbs on merge.]
  • TM pre-fill on extract — every segment is looked up against the project TM before it lands in the XLIFF. Exact matches are pre-filled with state="translated"; fuzzy matches above the recipe's tm.fuzzy_threshold are pre-filled as fuzzy. The translator's CAT tool shows leverage as a normal TM match.

  • TM absorb on merge — every accepted target segment becomes a new or updated translation unit (TU) in the project TM, carrying provenance (merge batch id, source file path, block content hash, originating XLIFF filename) so a later kapi tm audit --batch <id> can trace every TU back to the merge that introduced it.
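The pre-fill decision described above can be sketched in a few lines of Python. This is an illustration only: the page does not say how kapi scores fuzzy matches, so difflib's similarity ratio stands in for it, the TM is a plain dictionary, and treating the threshold as inclusive is an assumption. The function name classify is hypothetical.

```python
from difflib import SequenceMatcher


def classify(source, tm, fuzzy_threshold=75):
    """Return (kind, target): kind is 'exact', 'fuzzy', or 'new'."""
    # Exact match: the source text is already a key in the TM.
    if source in tm:
        return "exact", tm[source]

    # Fuzzy match: keep the best-scoring TM entry (score in percent).
    # difflib is a stand-in scorer, not kapi's actual algorithm.
    best_score, best_target = 0.0, None
    for tm_source, tm_target in tm.items():
        score = SequenceMatcher(None, source, tm_source).ratio() * 100
        if score > best_score:
            best_score, best_target = score, tm_target

    # Assumption: a score equal to the threshold counts as a fuzzy match.
    if best_score >= fuzzy_threshold:
        return "fuzzy", best_target
    return "new", None


tm = {"Save changes": "Enregistrer les modifications"}
for text in ("Save changes", "Save changes?", "Delete account"):
    print(text, "->", classify(text, tm)[0])
```

The three kinds correspond to the exact / fuzzy / new counters that kapi extract reports per locale.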

See the round-trip on a bilingual file

The diagram above is the workflow; the explorer below is the engine that makes it safe. Pick the bilingual XLIFF sample, run it through, and compare the source with the round-tripped output — kapi reads the file into the content model, the leaf text is replaced, and everything else (structure, identifiers, the skeleton kapi merge later rewrites from) comes back unchanged. The same reader and writer back kapi extract and kapi merge, running here in your browser via WebAssembly.

[Interactive round-trip explorer]

TM pre-fill and TM absorb are both on by default. --no-tm (skip pre-fill) and --no-tm-update disable them, for example in cold-translation workflows or dry-run review.

The .kapi recipe

A minimal recipe wiring extract/merge into a React app:

# app.kapi
version: v1
name: My App
defaults:
  source_language: en-US
  target_languages: [fr-FR, de-DE, es-ES]
  merge:
    conflict_policy: translator-wins # | existing-wins | newest-wins
  tm:
    fuzzy_threshold: 75 # percent; 0..100
  segmentation:
    source: false # opt-in SRX segmentation
content:
  - path: src/locales/en/*.json
    target: src/locales/{lang} # mirror each file under the per-language dir

The merge, tm, and segmentation sections are described in detail on the Kapi project file reference. All three are optional — defaults apply when a section is omitted.
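As a rough illustration of the value ranges noted in the recipe comments, the sketch below checks the three optional sections of an already-parsed defaults mapping. It is not kapi's validator: the function name check_defaults and the error wording are hypothetical, and the recipe is represented as a plain Python dict so that no YAML library is needed.

```python
ALLOWED_POLICIES = {"translator-wins", "existing-wins", "newest-wins"}


def check_defaults(defaults):
    """Return a list of problems found in the optional recipe sections."""
    problems = []

    # merge.conflict_policy: one of the three documented policies;
    # translator-wins is the documented default when omitted.
    policy = defaults.get("merge", {}).get("conflict_policy", "translator-wins")
    if policy not in ALLOWED_POLICIES:
        problems.append(f"merge.conflict_policy: unknown policy {policy!r}")

    # tm.fuzzy_threshold: a percentage in 0..100 when present.
    threshold = defaults.get("tm", {}).get("fuzzy_threshold")
    if threshold is not None and not 0 <= threshold <= 100:
        problems.append(f"tm.fuzzy_threshold: {threshold} is outside 0..100")

    # segmentation.source: opt-in boolean, off when omitted.
    segment = defaults.get("segmentation", {}).get("source", False)
    if not isinstance(segment, bool):
        problems.append("segmentation.source: expected true or false")

    return problems


print(check_defaults({"tm": {"fuzzy_threshold": 75}}))
print(check_defaults({"merge": {"conflict_policy": "vendor-wins"}}))
```

An empty mapping passes, which mirrors the statement that every section is optional.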

Day 0: initialize and seed the TM

kapi init # scaffold app.kapi + .kapi/

Edit the generated app.kapi to declare content globs, target locales, and any optional recipe sections. Then seed the project TM from existing memory, if you have any:

kapi tm import ./corporate-en-fr.tmx

Day 1: first extraction

From inside the project directory (no -p needed — the command auto-discovers the .kapi recipe from cwd, git-style):

kapi extract

Which writes:

  • One XLIFF 2.2 per source → target pair under out/:
    • out/src-locales-en-app.en-US-to-fr-FR.xliff
    • out/src-locales-en-app.en-US-to-de-DE.xliff
    • out/src-locales-en-app.en-US-to-es-ES.xliff
  • Extraction bookkeeping (the captured skeletons and a manifest) recorded in the project's gitignored cache so kapi merge can reattach the returned translations and rebuild the source byte-for-byte — for document and markup formats and for keyed catalog formats alike (JSON, YAML, .properties, Android XML, .resx, Apple .strings/.stringsdict/.xcstrings, .arb, i18next, design tokens)
  • A file-level <note category="kapi"> on each XLIFF stamping the batch id, source file, and source hash

Stdout summarizes:

Extracting batch 6f2e8a1c... (format=xliff2, targets=[fr-FR de-DE es-ES], sources=1)
fr-FR: 1 files, 412 blocks, TM exact=108 fuzzy=67 new=237
de-DE: 1 files, 412 blocks, TM exact=0 fuzzy=0 new=412
es-ES: 1 files, 412 blocks, TM exact=0 fuzzy=0 new=412

Batch 6f2e8a1c... complete. 3 files written to out/
Aggregate TM leverage: exact=108 fuzzy=67 new=1061 (total=1236)
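The aggregate line is just the per-locale counters summed. A quick check in Python, using the numbers from the sample output above (the dictionary layout is only for illustration):

```python
per_locale = {
    "fr-FR": {"exact": 108, "fuzzy": 67, "new": 237},
    "de-DE": {"exact": 0, "fuzzy": 0, "new": 412},
    "es-ES": {"exact": 0, "fuzzy": 0, "new": 412},
}

# Each locale covers the same 412 blocks.
for counts in per_locale.values():
    assert sum(counts.values()) == 412

# Sum each counter across locales, then add the counters together.
aggregate = {
    kind: sum(counts[kind] for counts in per_locale.values())
    for kind in ("exact", "fuzzy", "new")
}
total = sum(aggregate.values())

print(aggregate, total)  # {'exact': 108, 'fuzzy': 67, 'new': 1061} 1236
```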

Common extract options

kapi extract --target-lang fr # single target
kapi extract --target-lang fr,de # subset (comma-separated)
kapi extract --only marketing # one collection by name
kapi extract --pattern 'src/**/*.json' # extra glob
kapi extract --xliff-version 2.0 # pin older namespace
kapi extract --no-tm # skip pre-fill
kapi extract --out-dir dist/bilingual # alternate output dir
kapi extract --redact-rules .kapi/redaction.yaml # hide sensitive content

Multi-target in one pass is the default. Omit --target-lang to use every locale in defaults.target_languages; pass a comma-separated subset to restrict.

When extraction is redacted (--redact, --redact-rules, or defaults.redaction in the recipe), the emitted file carries only placeholders and the originals stay in a local vault; kapi merge restores them automatically. See Redaction.
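The placeholder-and-vault idea can be pictured with a toy Python sketch. Everything specific here is an assumption made for illustration: the [[REDACTED-n]] token syntax, the in-memory dict standing in for the vault, and the function names redact and restore are hypothetical and do not describe kapi's redaction rules or storage.

```python
import re


def redact(text, pattern, vault):
    """Replace every match of pattern with a token; keep originals in vault."""
    def swap(match):
        token = f"[[REDACTED-{len(vault)}]]"  # hypothetical token syntax
        vault[token] = match.group(0)
        return token
    return re.sub(pattern, swap, text)


def restore(text, vault):
    """Put the original values back in place of their tokens."""
    for token, original in vault.items():
        text = text.replace(token, original)
    return text


vault = {}
outgoing = redact("Contact alice@example.com today", r"\S+@\S+", vault)
print(outgoing)  # Contact [[REDACTED-0]] today

# The translator only ever sees the token and carries it through.
returned = "Contactez [[REDACTED-0]] aujourd'hui"
print(restore(returned, vault))  # Contactez alice@example.com aujourd'hui
```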

Day 2: translator work

Out of scope for kapi. The translator opens the XLIFF or PO file in their CAT tool of choice, translates it, and returns it. kapi is entirely out of the loop during this phase, which is exactly the point. Any bilingual tool that speaks XLIFF 2.x (or PO, if you extracted with --format po) works.

Day N: merge the return

kapi merge -i vendor-return/app.en-US-to-fr-FR.xliff

Merge resolves the extraction manifest by reading the batch id from the file's <note category="kapi" id="batch-id"> — the filename can change during the vendor round-trip and merge still finds the right batch.
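To see why a renamed file is harmless, here is a small Python sketch that pulls the batch id out of the note rather than the filename. The sample document is hypothetical: it uses the XLIFF 2.0 namespace, a made-up batch id, and a minimal file body; the exact layout kapi writes may differ. Matching on the element's local name keeps the lookup independent of which 2.x namespace the file uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical returned file; only the note attributes come from this page.
SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0"
       version="2.0" srcLang="en-US" trgLang="fr-FR">
  <file id="f1">
    <notes>
      <note category="kapi" id="batch-id">example-batch-id</note>
    </notes>
    <unit id="u1">
      <segment><source>Hello</source><target>Bonjour</target></segment>
    </unit>
  </file>
</xliff>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_batch_id(xliff_text):
    """Return the text of the kapi batch-id note, or None if absent."""
    root = ET.fromstring(xliff_text)
    for el in root.iter():
        if (local_name(el.tag) == "note"
                and el.get("category") == "kapi"
                and el.get("id") == "batch-id"):
            return (el.text or "").strip()
    return None


print(read_batch_id(SAMPLE))  # example-batch-id
```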

Multi-file in one pass:

kapi merge -i vendor-return/ # every .xliff in the directory
kapi merge -i 'vendor-return/*.xliff' # glob
kapi merge -i fr-FR.xliff -i de-DE.xliff # multiple -i

Each returning XLIFF is applied independently. A failure on one input (parse error, missing manifest, stale source) doesn't abort the others — per-file outcomes are reported and the exit code reflects any failure.
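The independent, per-file behavior can be sketched as a loop that records each outcome instead of stopping at the first error. The names merge_all and merge_one are hypothetical, and the exit code value of 1 is an assumption; the page only says the exit code reflects any failure.

```python
def merge_all(paths, merge_one):
    """Apply merge_one to every path; one failure never aborts the rest."""
    outcomes = {}
    for path in paths:
        try:
            merge_one(path)
            outcomes[path] = "ok"
        except Exception as err:  # parse error, missing manifest, ...
            outcomes[path] = f"failed: {err}"
    # Assumption: any failure maps to a non-zero exit code of 1.
    exit_code = 1 if any(v != "ok" for v in outcomes.values()) else 0
    return outcomes, exit_code


def fake_merge(path):
    """Stand-in for a real merge; rejects one file to show the behavior."""
    if path == "de-DE.xliff":
        raise ValueError("missing manifest")


outcomes, code = merge_all(["fr-FR.xliff", "de-DE.xliff", "es-ES.xliff"], fake_merge)
for path, result in outcomes.items():
    print(path, result)
print("exit code:", code)
```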

Stale segments

Merge compares every incoming block's <source> against the current source file. If the source text drifted since extract (someone edited the JSON while the XLIFF was out with the translator), that block is reported as stale and skipped — neither applied nor TM-absorbed. You can re-extract and re-translate the drifted portion.
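A minimal sketch of the stale check, assuming blocks are keyed by id and compared through a content hash. The hash choice (SHA-256) and the data shapes are assumptions for illustration; the page only says that the incoming source is compared against the current source file.

```python
import hashlib


def content_hash(text):
    """Hash source text so blocks can be compared cheaply."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def split_stale(incoming, current_sources):
    """Separate fresh targets from stale block ids.

    incoming:        {block_id: (source_text, target_text)} from the return
    current_sources: {block_id: source_text} as it is on disk now
    """
    fresh, stale = {}, []
    for block_id, (source, target) in incoming.items():
        current = current_sources.get(block_id)
        if current is None or content_hash(current) != content_hash(source):
            stale.append(block_id)  # skipped: not applied, not TM-absorbed
        else:
            fresh[block_id] = target
    return fresh, stale


incoming = {
    "title": ("Welcome", "Bienvenue"),
    "cta": ("Sign up", "Inscrivez-vous"),
}
current = {"title": "Welcome", "cta": "Sign up for free"}  # cta was edited
print(split_stale(incoming, current))
```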

Conflict policy

When an on-disk target (or TM TU) already carries a translation for a block, merge.conflict_policy picks what happens:

Policy | Behavior
translator-wins | Default. The translator's target always replaces the existing.
existing-wins | Preserve the existing target; skip the translator's.
newest-wins | Compare timestamps (file mtime / TU UpdatedAt); pick the newer.

The same policy applies to TM write-back when a TU for this source already has a translation — never ambiguous, never interactive.
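The three policies reduce to a small, deterministic decision function. The sketch below is illustrative: the Candidate type and resolve name are hypothetical, and letting the incoming target win a timestamp tie under newest-wins is an assumption the page does not settle.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    text: str
    updated_at: float  # seconds since epoch (file mtime or TU timestamp)


def resolve(policy: str, existing: Optional[Candidate], incoming: Candidate) -> Candidate:
    """Pick the target to keep when a block already has a translation."""
    if existing is None:
        return incoming  # no conflict to resolve
    if policy == "translator-wins":
        return incoming
    if policy == "existing-wins":
        return existing
    if policy == "newest-wins":
        # Assumption: on equal timestamps the incoming target wins.
        return incoming if incoming.updated_at >= existing.updated_at else existing
    raise ValueError(f"unknown conflict policy: {policy}")


old = Candidate("Sauvegarder", updated_at=200.0)
new = Candidate("Enregistrer", updated_at=100.0)
for policy in ("translator-wins", "existing-wins", "newest-wins"):
    print(policy, "->", resolve(policy, old, new).text)
```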

Audit: what did this merge contribute?

kapi tm audit --batch <merge-batch-id>

Lists every TM entry written or updated by a specific merge with timestamp, source file path, block content hash, and originating XLIFF filename — the answer to "what did this merge do?" without leaving the CLI.
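Conceptually, the audit is a filter over TM entries by the batch id stored in their provenance. A toy model in Python, where the TU record and its field names are hypothetical stand-ins for whatever kapi stores internally:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TU:
    source: str
    target: str
    batch_id: str     # merge batch that wrote or updated this unit
    source_file: str  # source file path
    block_hash: str   # block content hash
    origin_file: str  # originating XLIFF filename


def audit(tm, batch_id):
    """Return every translation unit contributed by the given merge batch."""
    return [tu for tu in tm if tu.batch_id == batch_id]


tm = [
    TU("Hello", "Bonjour", "batch-a", "src/locales/en/app.json", "h1", "fr-FR.xliff"),
    TU("Bye", "Au revoir", "batch-b", "src/locales/en/app.json", "h2", "fr-FR.xliff"),
]
for tu in audit(tm, "batch-a"):
    print(tu.source, "->", tu.target, f"({tu.origin_file})")
```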

PO (gettext) alternative

Same flow, different exchange format:

kapi extract --format po # emits .po files
kapi merge -i vendor-return/app.en-US-to-fr-FR.po

PO output carries kapi's bookkeeping as extracted comments:

  • File-level (on the header entry): #. kapi-batch:, #. kapi-source-file:, #. kapi-source-hash:
  • Per-entry: #. kapi-block: <block-id> for merge correlation
  • #, fuzzy on entries pre-filled from fuzzy TM matches
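Because the bookkeeping travels as ordinary extracted comments, it can be read back with plain string handling. The sample file below is hypothetical (made-up batch id, hash, and block id), and kapi_comments is an illustrative helper, not part of kapi.

```python
# Hypothetical PO return; only the comment keys come from this page.
SAMPLE_PO = '''#. kapi-batch: example-batch-id
#. kapi-source-file: src/locales/en/app.json
#. kapi-source-hash: 0123abcd
msgid ""
msgstr ""

#. kapi-block: greeting
#, fuzzy
msgid "Hello"
msgstr "Bonjour"
'''


def kapi_comments(po_text):
    """Collect (key, value) pairs from '#. kapi-...' extracted comments."""
    found = []
    for line in po_text.splitlines():
        if line.startswith("#. kapi-"):
            # Drop the '#. ' marker, then split at the first colon.
            key, _, value = line[len("#. "):].partition(":")
            found.append((key, value.strip()))
    return found


for key, value in kapi_comments(SAMPLE_PO):
    print(f"{key} = {value}")
```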

A single kapi merge -i invocation can mix XLIFF and PO inputs from the same batch — useful when different vendors ship in different formats.

Note: PO output is one entry per block (matches the segmentation-off case). Enabling defaults.segmentation.source: true together with --format po errors early; emit XLIFF for projects that rely on sentence-level segmentation.

What stays in the project

XLIFF and PO carry exactly one chosen target per locale, the source runs and inline codes, and — with the captured skeleton — enough to rebuild the original file byte-for-byte. Everything else in kapi's content model is deliberately not projected into the bilingual file; it stays in project state (the translation memory, termbase, and stand-off overlays):

  • Stand-off overlays — terminology, entity, and QA annotations anchored to run ranges.
  • Tone / channel variants — alternate targets keyed by more than locale.
  • Provenance and status — where each target came from (TM pre-fill, a specific merge batch, an MT or AI tool) and its review state.

This is by design: the bilingual file is the lean, tool-neutral handoff, and the project is the system of record. It is also why merge needs the same project the extract ran in — the skeleton and manifest that make the round-trip byte-exact live there, not in the file the translator returns.
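The skeleton idea can be illustrated with a deliberately tiny Python model: translatable text is lifted out of the raw file and replaced by markers, and the merge step substitutes targets back into the otherwise untouched bytes. This is a toy under stated assumptions, not kapi's skeleton format: it handles only simple "key": "value" pairs without escaped quotes, and the NUL-delimited markers are hypothetical.

```python
import re

# Matches the value of a simple "key": "value" pair (no escaped quotes).
VALUE = re.compile(r'(:\s*")([^"]*)(")')


def extract(raw):
    """Return (skeleton, blocks): text lifted out, everything else kept."""
    blocks = []

    def stash(match):
        blocks.append(match.group(2))
        marker = f"\x00{len(blocks) - 1}\x00"  # hypothetical marker syntax
        return match.group(1) + marker + match.group(3)

    return VALUE.sub(stash, raw), blocks


def merge(skeleton, targets):
    """Rebuild the file by dropping targets into the skeleton's markers."""
    return re.sub(r"\x00(\d+)\x00", lambda m: targets[int(m.group(1))], skeleton)


raw = '{\n  "hello":   "Hello",\n\t"bye": "Bye"\n}\n'
skeleton, blocks = extract(raw)

# Merging the untouched blocks reproduces the input exactly,
# including the odd spacing and the tab.
assert merge(skeleton, blocks) == raw
print(merge(skeleton, ["Bonjour", "Au revoir"]))
```

The skeleton never leaves the project in this model, which is the same reason merge has to run in the project the extract ran in.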

Deeper reading