.kapi Project File Format
Implementation notes for the .kapi project file format. See AD-008 for the architectural decision.
Schema
The .kapi file is a YAML document parsed by core/project.KapiProject:
type KapiProject struct {
	Version  string                     `yaml:"version"`
	Name     string                     `yaml:"name,omitempty"`
	Plugins  map[string]PluginSpec      `yaml:"plugins,omitempty"`  // name → spec (scalar = version short form)
	Defaults Defaults                   `yaml:"defaults,omitempty"` // project-wide defaults (locales live here)
	Content  []ContentCollection        `yaml:"content,omitempty"`
	Preset   string                     `yaml:"preset,omitempty"`
	Flows    map[string]*flow.StepsSpec `yaml:"flows,omitempty"`
	Requires RequiresMap                `yaml:"requires,omitempty"` // plugin name → semver constraint
	Extras   map[string]yaml.Node       `yaml:",inline"`            // unknown keys (platform extensions)
}

// Defaults holds project-wide processing defaults, including locales.
type Defaults struct {
	SourceLanguage  model.LocaleID   `yaml:"source_language,omitempty"`
	TargetLanguages []model.LocaleID `yaml:"target_languages,omitempty"`
	Concurrency     int              `yaml:"concurrency,omitempty"`
	ParallelBlocks  int              `yaml:"parallel_blocks,omitempty"`
	Encoding        string           `yaml:"encoding,omitempty"`
	// (also: locale_format, formats, exclude, merge, tm, segmentation,
	// redaction, brand_voice, termbase; see core/project/project.go)
}

// ContentCollection is either a bare entry (path/format/target) or a named
// collection (name + items), and can carry its own source/target languages.
type ContentCollection struct {
	Name            string           `yaml:"name,omitempty"`
	SourceLanguage  model.LocaleID   `yaml:"source_language,omitempty"`
	TargetLanguages []model.LocaleID `yaml:"target_languages,omitempty"`
	Items           []ContentItem    `yaml:"items,omitempty"`
	Base            string           `yaml:"base,omitempty"` // dir items' paths are made relative to; items inherit it
	// Bare-entry fields (short form):
	Path   string      `yaml:"path,omitempty"`   // doublestar glob for source files
	Format *FormatSpec `yaml:"format,omitempty"` // format ID; auto-detect per file if empty
	Target string      `yaml:"target,omitempty"` // output path template (tokens below)
}

// ContentItem additionally carries its own `base` (yaml:"base,omitempty"),
// falling back to the collection's Base when empty.
Flow definitions reuse core/flow.StepsSpec and core/flow.FlowStep (see flow-steps-format).
Content model
Content is a list of ContentCollection values. Each entry is one of two
shapes, distinguished by ContentCollection.IsBareEntry():
- Bare entry: has a path and no items. The path, format, and target fields are promoted onto the collection directly. Use this for a single glob with no grouping.
- Named collection: has a name and a non-empty items list of ContentItem, and may set its own source_language / target_languages. Use this to group related patterns and scope languages per group.
KapiProject.IterateContent walks both shapes uniformly, yielding each
ContentItem paired with its parent collection so callers can resolve
fall-through fields. Language resolution falls through item → collection →
project defaults via ContentItem.ResolvedSourceLanguage /
ResolvedTargetLanguages. A bare entry's promoted fields are wrapped as a
single-item slice by ContentCollection.EffectiveItems, carrying its Extras
through so platform per-item fields survive.
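The two shapes and the language fall-through described above can be sketched as follows. This is a minimal illustration, not the code in core/project: the types are trimmed stand-ins with plain string locales, the method bodies are assumptions inferred from the description, and resolveTargets is a hypothetical helper standing in for ContentItem.ResolvedTargetLanguages.

```go
package main

import "fmt"

// Trimmed stand-ins for the real core/project types (assumption: the real
// structs carry more fields, including Extras, Base, Format and Target).
type ContentItem struct {
	Path            string
	TargetLanguages []string
}

type ContentCollection struct {
	Name            string
	TargetLanguages []string
	Items           []ContentItem
	Path            string // bare-entry short form
}

// IsBareEntry reports whether the collection uses the short form:
// a path and no items.
func (c ContentCollection) IsBareEntry() bool {
	return c.Path != "" && len(c.Items) == 0
}

// EffectiveItems wraps a bare entry's promoted path as a single-item slice;
// a named collection returns its items unchanged.
func (c ContentCollection) EffectiveItems() []ContentItem {
	if c.IsBareEntry() {
		return []ContentItem{{Path: c.Path}}
	}
	return c.Items
}

// resolveTargets is a hypothetical helper showing the
// item → collection → project defaults fall-through.
func resolveTargets(item ContentItem, coll ContentCollection, defaults []string) []string {
	if len(item.TargetLanguages) > 0 {
		return item.TargetLanguages
	}
	if len(coll.TargetLanguages) > 0 {
		return coll.TargetLanguages
	}
	return defaults
}

func main() {
	defaults := []string{"fr-FR", "de-DE", "ja-JP"}
	content := []ContentCollection{
		{Path: "src/i18n/en/*.json"},
		{
			Name:            "Marketing",
			TargetLanguages: []string{"fr-FR", "de-DE"},
			Items:           []ContentItem{{Path: "en/docs/**/*.md"}},
		},
	}
	// Walk both shapes uniformly, as IterateContent is described to do.
	for _, coll := range content {
		for _, item := range coll.EffectiveItems() {
			fmt.Println(item.Path, resolveTargets(item, coll, defaults))
		}
	}
}
```

Because callers only ever see the slice returned by EffectiveItems, they do not need to branch on the entry shape.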
Defaults-scoped settings
Defaults holds project-wide processing settings that individual content items
can override. Beyond locales and the parallelism/encoding knobs shown above:
- merge (MergeDefaults.ConflictPolicy): how kapi merge resolves a translator's target against an existing on-disk target or TM entry (translator-wins default, existing-wins, newest-wins). See AD-017.
- tm (TMDefaults): fuzzy_threshold (TM pre-fill cutoff on kapi extract, default 75) and read (additional read-only TM files; writes always go to the project TM).
- segmentation (SegmentationDefaults): opt-in SRX sentence segmentation overlay on extract (source, optional srx rules file).
- redaction (*RedactionSpec): replace sensitive content with protected placeholders before processing and restore it afterwards. Overridable per ContentItem.Redaction.
- brand_voice (*BrandVoiceBinding): bind a brand voice profile (one of profile_file, profile, or pack) as standing project context. This is the framework binding under defaults:, distinct from a platform's top-level brand_voice extension.
- termbase (string): path to a glossary/termbase, resolved relative to the project root, used for project-scoped term enforcement with no --termbase flag.
Platform extensions and the server: block
The framework knows nothing about platform-specific keys. Unknown YAML keys at
each scope land in Extras map[string]yaml.Node (with yaml:",inline") on
KapiProject, Defaults, ContentCollection, and ContentItem. Platform
layers decode their own typed schema from these maps via GetExtra and
re-encode on SetExtra; round-tripping a recipe through the framework alone
preserves the keys verbatim.
A vendor may use this to add their own recipe keys — for example, a server:
block (and hooks, automations, assets, brand_voice policy). A recipe
with no such extension is a pure local project. The kapi CLI tolerates unknown
blocks but ignores them; the owning plugin decodes them from Extras.
requires: (a map of plugin name → semver constraint) gates loading: a recipe
declaring requires: { myplugin: "^1.0" } refuses to load in a binary that has
not registered the myplugin extension. See
AD-008 for the full extension
model and server: schema.
Validation Rules
- version is required, must be "v1"
- For each content[] entry:
  - Bare entry: path is required and items must be empty.
  - Named collection: path must be empty (use items) and items must be non-empty; each item requires a non-empty path.
- defaults.merge.conflict_policy, defaults.tm.fuzzy_threshold (0..100), defaults.redaction.detectors, and defaults.brand_voice are each shape-checked.
- Each flow must have at least one step
- Each step must have a non-empty tool field (unless it uses parallel)
- Steps with parallel can omit tool (the parallel branches provide tools)
- Each requires: entry must have a non-empty plugin name and a well-formed semver constraint (^1.0, >=1.4.0, 1.4.0, ~1.4.2, or *). Unless SkipRequiresCheck is set, every named plugin must have a registered extension group, else loading fails with an install hint.
- Extras at each scope are validated against any registered extension schema.
Note: name is optional (yaml:"name,omitempty"); the framework does not
require it.
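As a rough illustration of the content and requires rules above, the sketch below checks a collection's shape and a requires map. Everything in it is hypothetical: the Collection and Item types are trimmed stand-ins, validateCollection and validateRequires are not the framework's function names, and the regular expression only covers the constraint forms listed above, so the grammar the framework actually accepts may be wider.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// Trimmed stand-ins for ContentItem / ContentCollection (assumption).
type Item struct{ Path string }

type Collection struct {
	Name  string
	Path  string
	Items []Item
}

// constraintRe accepts only the forms named in the rules:
// ^1.0, >=1.4.0, 1.4.0, ~1.4.2, or *.
var constraintRe = regexp.MustCompile(`^(\*|(\^|~|>=)?\d+(\.\d+){0,2})$`)

// validateCollection applies the bare-entry / named-collection rules.
func validateCollection(c Collection) error {
	if len(c.Items) == 0 {
		// Bare entry: path is required.
		if c.Path == "" {
			return errors.New("content entry needs a path or a non-empty items list")
		}
		return nil
	}
	// Named collection: path must be empty and every item needs a path.
	if c.Path != "" {
		return errors.New("named collection must not set path (use items)")
	}
	for i, it := range c.Items {
		if it.Path == "" {
			return fmt.Errorf("items[%d]: path is required", i)
		}
	}
	return nil
}

// validateRequires checks plugin names and constraint syntax only; the
// registered-extension check (skipped by SkipRequiresCheck) is not modeled.
func validateRequires(requires map[string]string) error {
	for name, constraint := range requires {
		if name == "" {
			return errors.New("requires: empty plugin name")
		}
		if !constraintRe.MatchString(constraint) {
			return fmt.Errorf("requires[%s]: malformed constraint %q", name, constraint)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateCollection(Collection{Path: "src/**/*.json"}))
	fmt.Println(validateCollection(Collection{Name: "Docs", Path: "x", Items: []Item{{Path: "y"}}}))
	fmt.Println(validateRequires(map[string]string{"okapi-bridge": ">=1.47.0"}))
	fmt.Println(validateRequires(map[string]string{"myplugin": "latest"}))
}
```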
File Paths
- Content patterns are expanded via core/project.ExpandGlob, backed by github.com/bmatcuk/doublestar/v4; recursive ** directory matching is supported (e.g. src/**/*.json). ExpandGlob filters out any match that matches one of the defaults.exclude glob patterns (matched with doublestar.Match)
- Patterns are resolved relative to the project root (the recipe's parent directory)
- target is expanded per source file and target language by core/project.ResolveTargetPath(itemPath, base, target, source, lang):
  - base is the directory the source path is made relative to. When empty it defaults to GlobFixedPrefix(path), the literal prefix of the glob before the first * / ? / [ / { (so input/docs/*.md mirrors just filenames while input/**/*.md, or an explicit base, mirrors the subtree). On a named collection, an item inherits the collection's base when it sets none.
  - Tokens: {lang}, {relpath} (rel path with extension), {path} (rel path without extension), {dir}, {filename}, {name} (alias {basename}), {ext}; a bare * is legacy shorthand for {name}. {lang} is handled by ResolvePathPattern; the rest by ExpandTemplate.
  - Directory-mirror form: when the target (after {lang} expansion) ends with /, is empty, or its final segment has no extension and no wildcard/token, it denotes a directory, and the source's {relpath} (under base) is appended. So target: output/{lang} mirrors the source tree under each per-language root with no token and no doubled extension. See isDirectoryTarget in core/project/path.go.
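The base default and the directory-mirror rule can be sketched with the standard library alone. The functions below are hypothetical re-implementations written from the description above, not the code in core/project/path.go; in particular, cutting the fixed prefix back to the last "/" is an assumption, and only the {lang} token is expanded.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// globFixedPrefix returns the literal part of a glob before the first
// metacharacter, cut back to the last "/" (assumption).
func globFixedPrefix(pattern string) string {
	if i := strings.IndexAny(pattern, "*?[{"); i >= 0 {
		pattern = pattern[:i]
	}
	return pattern[:strings.LastIndex(pattern, "/")+1]
}

// isDirectoryTarget applies the three conditions from the text: empty,
// trailing slash, or a final segment with no extension and no wildcard/token.
func isDirectoryTarget(target string) bool {
	if target == "" || strings.HasSuffix(target, "/") {
		return true
	}
	last := path.Base(target)
	return path.Ext(last) == "" && !strings.ContainsAny(last, "*{")
}

// mirrorTarget expands {lang} and, for a directory target, appends the
// source path relative to base. Other tokens are out of scope here.
func mirrorTarget(source, base, target, lang string) string {
	rel := strings.TrimPrefix(strings.TrimPrefix(source, base), "/")
	out := strings.ReplaceAll(target, "{lang}", lang)
	if isDirectoryTarget(out) {
		return path.Join(out, rel)
	}
	return out
}

func main() {
	pattern := "src/i18n/en/*.json"
	base := globFixedPrefix(pattern)
	fmt.Println(base)
	fmt.Println(mirrorTarget("src/i18n/en/app.json", base, "src/i18n/{lang}", "fr-FR"))
}
```

With the bare entry from the full example (path src/i18n/en/*.json, target src/i18n/{lang}), this maps src/i18n/en/app.json to src/i18n/fr-FR/app.json.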
Credential Resolution
The .kapi file references AI providers by type (e.g., provider: anthropic), not by key. API keys are resolved at runtime:
- OS keychain via cli/credentials.Store (non-secret config at ~/.config/kapi/providers.json; keys under the keychain service "kapi")
- Environment variables (ANTHROPIC_API_KEY, OPENAI_API_KEY) or the --api-key flag
- The --provider and --model CLI flags override project defaults
CLI Integration
# One-shot (no project)
kapi ai-translate -i file.json --target-lang fr
# With project file: run a built-in flow with project defaults
kapi run ai-translate-qa -p translation.kapi --target-lang de
# Or run a flow defined in the recipe's flows: map (here named "translate")
kapi run translate -p translation.kapi
Built-in flows are ai-translate, ai-translate-qa, pseudo-translate,
qa-check, tm-leverage, and secure-translate (see
core/flow.BuiltInFlows). A recipe's flows: map can add new flows and
override the single-tool built-ins (ai-translate, pseudo-translate,
qa-check, tm-leverage). It cannot override the composed built-ins
(ai-translate-qa, secure-translate) when invoked via -p: runWithProject
(cli/run.go) dispatches those to the built-in pipeline before consulting
proj.GetFlow.
With -p:
- The flow name is matched against the built-in composed flows first (currently ai-translate-qa and secure-translate, the BuiltInFlows entries with 2+ tool nodes); if it is not one of those, it is looked up in the project's flows map (and finally the plugin fallback)
- defaults.source_language and defaults.target_languages[0] provide defaults (CLI flags override)
- For single-file flows, --input selects the file. The project's content collections describe which files kapi extract / kapi merge operate on across the project
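The lookup order for a flow name under -p can be summarized in a small sketch. resolveFlow is a hypothetical helper, not runWithProject itself; it models only the three stages named above and represents the project and plugin flow sets as plain maps, so it does not cover how a single-tool built-in that the project does not override gets resolved.

```go
package main

import "fmt"

// composedBuiltIns are the built-in flows dispatched before the project's
// flows map is consulted, per the description above.
var composedBuiltIns = map[string]bool{
	"ai-translate-qa":  true,
	"secure-translate": true,
}

// resolveFlow reports which source would supply the named flow.
func resolveFlow(name string, projectFlows, pluginFlows map[string]bool) string {
	switch {
	case composedBuiltIns[name]:
		return "built-in"
	case projectFlows[name]:
		return "project"
	case pluginFlows[name]:
		return "plugin"
	default:
		return "unknown"
	}
}

func main() {
	project := map[string]bool{"translate": true, "ai-translate-qa": true}
	plugins := map[string]bool{"vendor-review": true}

	for _, name := range []string{"ai-translate-qa", "translate", "vendor-review", "missing"} {
		fmt.Printf("%s: %s\n", name, resolveFlow(name, project, plugins))
	}
}
```

Note that a project entry named ai-translate-qa is never reached in this ordering, which matches the statement that the composed built-ins cannot be overridden via -p.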
Desktop Integration
Kapi Desktop at apps/kapi-desktop/:
- Opens .kapi files as documents (File > Open, drag-and-drop, OS file association)
- Edits flows inline (steps editor)
- Resolves content patterns against the filesystem via App.MatchContent(tabID), using the same core/project glob expansion the CLI relies on for extract/merge; pattern resolution is shared framework code, not a desktop-only feature
- Stores recent files at ~/.config/kapi-desktop/recent.json
- Stores settings at ~/.config/kapi-desktop/settings.json
Example Files
Minimal
version: v1
name: Quick Translate
Full
version: v1
name: Acme App Localization
defaults:
  source_language: en-US
  target_languages: [fr-FR, de-DE, ja-JP]
  concurrency: 4
  parallel_blocks: 3
  encoding: utf-8
  exclude:
    - "**/*.generated.json"
  merge:
    conflict_policy: translator-wins
  tm:
    fuzzy_threshold: 75
  segmentation:
    source: true
  termbase: glossary/terms.db
content:
  # Bare entry: single glob, languages inherited from defaults.
  # Directory-mirror target: src/i18n/en/app.json → src/i18n/{lang}/app.json.
  - path: "src/i18n/en/*.json"
    target: "src/i18n/{lang}"
  # Named collection: groups patterns, scopes languages, and shares a base.
  - name: Marketing
    target_languages: [fr-FR, de-DE]
    base: en
    items:
      - path: "en/docs/**/*.md"
        target: "{lang}/docs"
      - path: "en/site/**/*.html"
        target: "{lang}/site"
preset: nextjs
requires:
  okapi-bridge: ">=1.47.0"
flows:
  translate:
    steps:
      - tool: ai-translate
        config:
          provider: anthropic
          model: claude-sonnet-4-20250514
  full-pipeline:
    steps:
      - tool: tm-leverage
        config:
          fuzzyThreshold: 75
      - tool: ai-translate
        config:
          provider: anthropic
      - tool: qa-check
  pseudo:
    steps:
      - tool: pseudo-translate
        config:
          expansionPercent: 30