
AD-007: Plugin System and Okapi Bridge

Summary

Plugins are manifest-driven, signed, out-of-process executables. Every plugin ships a manifest.json declaring everything it provides — commands, MCP tools, format readers/writers, flow tools, source connectors, and recipe schema extensions. kapi reads all manifests at startup and builds dispatch tables from them; there is no name fall-through. Plugins are discovered structurally by location ($KAPI_PLUGINS_DIR > $XDG_DATA_HOME/kapi/plugins/ > system roots), not by $PATH. Each capability picks its transport:

  • Mode A — one-shot subprocess (commands)
  • Mode B — long-lived stdio subprocess (MCP tools)
  • Mode C — long-lived daemon over Unix socket + gRPC (formats, tools, source connectors)

Plugin tarballs are cosign-signed via Sigstore keyless OIDC; kapi plugin install verifies SHA-256 + Sigstore JSON bundle against a registry-pinned cert identity before unpacking. The Okapi bridge and any third-party plugin all use the same model. The default kapi binary is Apache-2.0 and ships zero vendor-plugin code.

Context

Plugins enable third-party formats, tools, connectors, and providers to evolve independently of the framework. Key requirements:

  • License clarity. kapi is Apache-2.0. Bundling a plugin under a more restrictive (e.g. copyleft) license would force the combined binary distribution onto those terms. The plugin model must let vendors ship their own binaries on their own license terms without re-licensing kapi.
  • Discoverability and consent. A teammate's recipe declaring requires: { myplugin: "^1.0" } should produce a clear, one-step path to install — not a cryptic "extension group not registered" error.
  • Security. Plugins run with full user privileges; signature verification raises the bar against tampering and supply-chain attacks. This is supply-chain signing (cosign / Sigstore), distinct from OS code signing / notarization — see Plugin signing vs. OS notarization.
  • Performance for format-heavy workloads. Okapi bridge processes large IDML / TMX / Word files at high throughput. JVM startup is hundreds of ms; the model must support long-lived daemons with multiplexed concurrent requests so the JVM only starts once per kapi session.
  • Polyglot from day one. kapi publishes a language-neutral protocol spec; plugin authors implement against it in any language. A minimal Go reference plugin ships in examples/plugins/hello/.

Decision

Manifest

Every plugin's directory contains a manifest.json declaring its identity (plugin, version, binary, license, author, homepage, min_kapi_version, group) and the capabilities it provides under one or more sections:

{
  "manifest_version": "1",
  "plugin": "myplugin",
  "version": "1.4.0",
  "binary": "kapi-myplugin",
  "license": "Apache-2.0",
  "min_kapi_version": "1.0.0",
  "capabilities": {
    "commands": [...],
    "mcp_tools": [...],
    "formats": [...],
    "tools": [...],
    "source_connectors": [...],
    "schema_extensions": [...]
  },
  "daemon": {
    "idle_timeout_seconds": 300,
    "handshake": { "type": "stdio-handshake", "fields": ["socket", "version"] }
  }
}

The daemon block is present only for plugins that declare any formats, tools, or source connectors (Mode C). The full schema is embedded at core/plugin/manifest/schema.json; canonical Go types live in core/plugin/manifest/manifest.go. The protocol is described in detail at docs/internals/plugin-protocol-v1.md.
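As a rough illustration of how a host might consume the manifest above, here is a minimal Go sketch. The type and function names (Manifest, Capabilities, Daemon, ParseManifest, NeedsDaemon) are hypothetical and are not the canonical types in core/plugin/manifest/manifest.go. Capability entries are kept as raw JSON because their inner shape is not shown here, and the required-field check is an assumption.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Manifest is a hypothetical, trimmed-down mirror of manifest.json.
type Manifest struct {
	ManifestVersion string       `json:"manifest_version"`
	Plugin          string       `json:"plugin"`
	Version         string       `json:"version"`
	Binary          string       `json:"binary"`
	License         string       `json:"license"`
	MinKapiVersion  string       `json:"min_kapi_version"`
	Capabilities    Capabilities `json:"capabilities"`
	Daemon          *Daemon      `json:"daemon,omitempty"`
}

// Capabilities keeps each entry as raw JSON; the per-entry schema is not modelled.
type Capabilities struct {
	Commands         []json.RawMessage `json:"commands"`
	MCPTools         []json.RawMessage `json:"mcp_tools"`
	Formats          []json.RawMessage `json:"formats"`
	Tools            []json.RawMessage `json:"tools"`
	SourceConnectors []json.RawMessage `json:"source_connectors"`
	SchemaExtensions []json.RawMessage `json:"schema_extensions"`
}

// Daemon mirrors only the idle timeout from the daemon block.
type Daemon struct {
	IdleTimeoutSeconds int `json:"idle_timeout_seconds"`
}

// NeedsDaemon reports whether the plugin declares any Mode C capability.
func (m Manifest) NeedsDaemon() bool {
	c := m.Capabilities
	return len(c.Formats)+len(c.Tools)+len(c.SourceConnectors) > 0
}

// ParseManifest decodes a manifest and checks two identity fields.
// Treating plugin and binary as required is an assumption of this sketch.
func ParseManifest(data []byte) (Manifest, error) {
	var m Manifest
	if err := json.Unmarshal(data, &m); err != nil {
		return Manifest{}, fmt.Errorf("parse manifest: %w", err)
	}
	if m.Plugin == "" || m.Binary == "" {
		return Manifest{}, errors.New("manifest missing plugin or binary")
	}
	return m, nil
}

func main() {
	raw := []byte(`{"plugin":"myplugin","binary":"kapi-myplugin",
		"capabilities":{"commands":[{"name":"hello"}]}}`)
	m, err := ParseManifest(raw)
	fmt.Println(m.Plugin, m.NeedsDaemon(), err)
}
```

A commands-only manifest like the one in main needs no daemon block, which matches the Mode C rule above.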

Discovery

kapi scans this fixed list of locations in precedence order:

| Order | Location | Purpose |
| --- | --- | --- |
| 1 (highest) | $KAPI_PLUGINS_DIR (:-separated; ; on Windows) | Dev / CI / sandbox |
| 2 | $XDG_DATA_HOME/kapi/plugins/ (default ~/.local/share/kapi/plugins/) | kapi plugin install target |
| 3 | /opt/homebrew/share/kapi/plugins/ (macOS Homebrew) | OS package manager |
| 3 | /usr/local/share/kapi/plugins/ (Linux /usr/local) | OS package manager |
| 3 | /usr/share/kapi/plugins/ (distro) | OS package manager |

Within each location, every direct subdirectory containing a manifest.json is a plugin. First-match-wins on plugin name. Conflicting capabilities between two different plugins are an error — kapi prints both manifests and refuses to dispatch the conflicting capability.
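The precedence order and the first-match-wins rule can be sketched in Go as follows. This is an illustration under stated assumptions, not the code in cli/pluginhost/: discoveryRoots and scanPlugins are hypothetical names, the plugin name is taken from the directory name rather than from the manifest's plugin field, and unreadable roots are simply skipped.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// discoveryRoots returns the plugin roots in precedence order.
// getenv is injected so the function can be exercised without touching the real environment.
func discoveryRoots(getenv func(string) string, home string) []string {
	var roots []string
	// 1. $KAPI_PLUGINS_DIR; SplitList uses ":" on POSIX and ";" on Windows.
	for _, d := range filepath.SplitList(getenv("KAPI_PLUGINS_DIR")) {
		if d != "" {
			roots = append(roots, d)
		}
	}
	// 2. $XDG_DATA_HOME/kapi/plugins, defaulting to ~/.local/share.
	dataHome := getenv("XDG_DATA_HOME")
	if dataHome == "" {
		dataHome = filepath.Join(home, ".local", "share")
	}
	roots = append(roots, filepath.Join(dataHome, "kapi", "plugins"))
	// 3. System roots managed by OS package managers.
	return append(roots,
		"/opt/homebrew/share/kapi/plugins",
		"/usr/local/share/kapi/plugins",
		"/usr/share/kapi/plugins",
	)
}

// scanPlugins maps plugin name to plugin directory, first match wins.
// Assumption: the directory name stands in for the plugin name.
func scanPlugins(roots []string) map[string]string {
	found := map[string]string{}
	for _, root := range roots {
		entries, err := os.ReadDir(root)
		if err != nil {
			continue // a missing or unreadable root is skipped
		}
		for _, e := range entries {
			dir := filepath.Join(root, e.Name())
			if _, err := os.Stat(filepath.Join(dir, "manifest.json")); err != nil {
				continue // not a plugin directory
			}
			if _, seen := found[e.Name()]; !seen {
				found[e.Name()] = dir
			}
		}
	}
	return found
}

func main() {
	home, _ := os.UserHomeDir()
	roots := discoveryRoots(os.Getenv, home)
	fmt.Println(roots)
	fmt.Println(scanPlugins(roots))
}
```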

Precedence over built-ins. A plugin capability that collides with a built-in one (e.g. a plugin format reader for pdf when the framework ships a pure-Go one) overrides the built-in — installing a plugin for a format is an explicit signal to prefer it. Built-ins remain the fallback when the plugin is absent, so behaviour degrades gracefully. Plugin-vs-plugin collisions still error (above); plugin-vs-built-in is resolved in the plugin's favour via the format registry's source/priority (SetFormatSource assigns DefaultPluginPriority > DefaultBuiltInPriority).
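The collision rules read roughly like the following Go sketch. It is not the real format registry: formatRegistry, formatSource, register, and owner are hypothetical names, and the numeric priorities are placeholders chosen only so that the plugin value exceeds the built-in one, as the text describes for DefaultPluginPriority and DefaultBuiltInPriority.

```go
package main

import "fmt"

// Placeholder priorities; only their ordering matters in this sketch.
const (
	builtInPriority = 0
	pluginPriority  = 100
)

// formatSource records who provides a format and at what priority.
type formatSource struct {
	Owner    string
	Priority int
}

// formatRegistry maps a format name to its winning source.
type formatRegistry struct {
	sources map[string]formatSource
}

func newFormatRegistry() *formatRegistry {
	return &formatRegistry{sources: map[string]formatSource{}}
}

// register applies the collision rules:
//   - a higher priority replaces a lower one (plugin over built-in),
//   - a lower priority is ignored, so the built-in stays a fallback only,
//   - equal priority from different owners (plugin vs. plugin) is an error.
func (r *formatRegistry) register(format string, src formatSource) error {
	cur, exists := r.sources[format]
	switch {
	case !exists, src.Priority > cur.Priority:
		r.sources[format] = src
	case src.Priority == cur.Priority && src.Owner != cur.Owner:
		return fmt.Errorf("format %q claimed by both %q and %q", format, cur.Owner, src.Owner)
	}
	return nil
}

// owner returns the winning provider for a format, or "" if none is registered.
func (r *formatRegistry) owner(format string) string {
	return r.sources[format].Owner
}

func main() {
	r := newFormatRegistry()
	_ = r.register("pdf", formatSource{Owner: "builtin", Priority: builtInPriority})
	_ = r.register("pdf", formatSource{Owner: "pdfium", Priority: pluginPriority})
	fmt.Println(r.owner("pdf"))
	fmt.Println(r.register("pdf", formatSource{Owner: "otherpdf", Priority: pluginPriority}))
}
```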

A consolidated dispatch cache at $XDG_CACHE_HOME/kapi/plugins-cache.json holds parsed manifests + pre-compiled JSON Schema validators. The cache is invalidated by an mtime check on each discovery root: if none of the roots changed since the last write, kapi loads the cache and skips manifest parsing entirely.
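A minimal Go sketch of the mtime check described above, assuming the comparison is "no root directory is newer than the cache file". The function name cacheIsFresh is hypothetical, and treating a root that cannot be stat-ed as unchanged is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"os"
)

// cacheIsFresh reports whether the cache file is at least as new as every
// discovery root. When it returns false, manifests have to be re-parsed.
func cacheIsFresh(cachePath string, roots []string) bool {
	cacheInfo, err := os.Stat(cachePath)
	if err != nil {
		return false // no cache yet (or unreadable): rebuild
	}
	for _, root := range roots {
		rootInfo, err := os.Stat(root)
		if err != nil {
			continue // assumption: a missing root counts as unchanged
		}
		if rootInfo.ModTime().After(cacheInfo.ModTime()) {
			return false // something was added to or removed from this root
		}
	}
	return true
}

func main() {
	cacheDir, _ := os.UserCacheDir()
	fmt.Println(cacheIsFresh(cacheDir+"/kapi/plugins-cache.json",
		[]string{"/usr/local/share/kapi/plugins", "/usr/share/kapi/plugins"}))
}
```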

Three transport modes

A plugin declares one or more capability sections in its manifest. kapi picks the right transport per capability type.

Mode A — one-shot subprocess

Used for commands. kapi forks and execs the plugin once per invocation:

<binary> command <name> [extra args/flags]

stdin / stdout / stderr inherited; env block carries KAPI_PLUGIN_DIR, KAPI_PLUGIN_NAME, KAPI_PLUGIN_VERSION. Exit code propagated. The plugin doesn't keep state across calls.
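The Mode A invocation could look roughly like this in Go. It is a sketch, not kapi's dispatcher: commandArgv, pluginEnv, and runCommand are hypothetical names, and the caller is assumed to pass the full path to the plugin binary.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// commandArgv builds the Mode A argument list: command <name> [extra args/flags].
func commandArgv(name string, extra []string) []string {
	return append([]string{"command", name}, extra...)
}

// pluginEnv returns a copy of base with the three documented variables appended.
func pluginEnv(base []string, dir, name, version string) []string {
	env := append([]string{}, base...) // copy so the caller's slice is untouched
	return append(env,
		"KAPI_PLUGIN_DIR="+dir,
		"KAPI_PLUGIN_NAME="+name,
		"KAPI_PLUGIN_VERSION="+version,
	)
}

// runCommand execs the plugin once, inheriting stdio, and returns its exit code.
func runCommand(binary, dir, plugin, version, command string, extra []string) (int, error) {
	cmd := exec.Command(binary, commandArgv(command, extra)...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.Env = pluginEnv(os.Environ(), dir, plugin, version)

	err := cmd.Run()
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode(), nil // the plugin ran and chose a non-zero code
	}
	if err != nil {
		return -1, fmt.Errorf("start plugin %q: %w", plugin, err)
	}
	return 0, nil
}

func main() {
	fmt.Println(commandArgv("hello", []string{"--loud"}))
	fmt.Println(pluginEnv(nil, "/plugins/myplugin", "myplugin", "1.4.0"))
}
```

A caller would typically pass the returned code to os.Exit so the plugin's exit status becomes kapi's own.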

Mode B — session subprocess

Used for mcp_tools. kapi spawns one plugin process per kapi mcp session and proxies tool calls over MCP-over-stdio:

<binary> mcp-server

Mode C — daemon over Unix socket

Used for formats, tools, source_connectors. kapi spawns a long-lived plugin process; the plugin binds a Unix-domain socket, prints one JSON line on stdout (the canonical handshake), then serves gRPC on the socket:

<binary> daemon

{"socket":"/tmp/kapi-daemon-myplugin-12345.sock","version":"1.4.0"}
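Reading that handshake line on the host side might look like the Go sketch below. The names Handshake, parseHandshake, and readHandshake are hypothetical, and requiring both fields to be non-empty is an assumption. The sketch also assumes nothing else on the daemon's stdout needs to be read afterwards, since bufio.Scanner may buffer past the first line.

```go
package main

import (
	"bufio"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// Handshake is the single JSON line a Mode C daemon prints on stdout.
type Handshake struct {
	Socket  string `json:"socket"`
	Version string `json:"version"`
}

// parseHandshake decodes one handshake line and checks both fields are set.
func parseHandshake(line []byte) (Handshake, error) {
	var h Handshake
	if err := json.Unmarshal(line, &h); err != nil {
		return Handshake{}, fmt.Errorf("decode handshake: %w", err)
	}
	if h.Socket == "" || h.Version == "" {
		return Handshake{}, errors.New("handshake missing socket or version")
	}
	return h, nil
}

// readHandshake reads the first line from the daemon's stdout and parses it.
func readHandshake(stdout io.Reader) (Handshake, error) {
	sc := bufio.NewScanner(stdout)
	if !sc.Scan() {
		if err := sc.Err(); err != nil {
			return Handshake{}, err
		}
		return Handshake{}, io.ErrUnexpectedEOF // daemon exited before printing anything
	}
	return parseHandshake(sc.Bytes())
}

func main() {
	out := strings.NewReader(`{"socket":"/tmp/kapi-daemon-myplugin-12345.sock","version":"1.4.0"}` + "\n")
	h, err := readHandshake(out)
	fmt.Println(h.Socket, h.Version, err)
}
```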

kapi opens a gRPC client on that socket and dispatches concurrent requests over it. The daemon stays alive until kapi exits or the daemon reaches its idle timeout (per-manifest, default 5 min). Concurrent daemons are capped via KAPI_MAX_DAEMONS (default 8) with LRU eviction. Each plugin supplies its own socket server: the reference Okapi bridge serves it with Netty's native transports (kqueue on macOS, epoll on Linux) and is POSIX-only today.
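The daemon cap with LRU eviction can be sketched as a small Go structure. This is illustrative only: daemonPool, touch, and maxDaemons are hypothetical names, the pool tracks plugin names rather than real processes, and falling back to 8 on an invalid KAPI_MAX_DAEMONS value is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// maxDaemons reads KAPI_MAX_DAEMONS, falling back to the documented default of 8.
func maxDaemons(getenv func(string) string) int {
	n, err := strconv.Atoi(getenv("KAPI_MAX_DAEMONS"))
	if err != nil || n < 1 {
		return 8
	}
	return n
}

// daemonPool tracks running daemons by plugin name, least recently used first.
type daemonPool struct {
	max   int
	order []string
}

func newDaemonPool(max int) *daemonPool {
	return &daemonPool{max: max}
}

// touch marks a daemon as just used. If the plugin is new and the pool is
// full, the least recently used daemon is evicted and its name returned so
// the caller can shut that process down.
func (p *daemonPool) touch(name string) (evicted string) {
	for i, n := range p.order {
		if n == name {
			// Already running: move it to the most-recently-used end.
			p.order = append(p.order[:i], p.order[i+1:]...)
			p.order = append(p.order, name)
			return ""
		}
	}
	if len(p.order) >= p.max {
		evicted = p.order[0]
		p.order = p.order[1:]
	}
	p.order = append(p.order, name)
	return evicted
}

func main() {
	pool := newDaemonPool(2)
	for _, name := range []string{"okapi-bridge", "pdfium", "okapi-bridge", "myplugin"} {
		if gone := pool.touch(name); gone != "" {
			fmt.Println("evict", gone)
		}
	}
}
```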

Lifecycle commands

kapi plugin list # show installed plugins
kapi plugin install <name> # download + verify + register
kapi plugin install <name>@<version> # pin a specific version
kapi plugin install <name> --channel beta # pick a channel; persists for updates
kapi plugin update <name> # upgrade to latest matching constraint
kapi plugin update-index # explicit registry-index refresh
kapi plugin remove <name> # uninstall
kapi plugin info <name> # show manifest details
kapi plugin search <query> # list registry candidates
kapi plugin verify <name> # re-check sha256 + signature
kapi plugin rebuild-cache # force regenerate the dispatch cache

Recipe requires: syntax

A .kapi recipe declares plugin dependencies as a map of plugin name to semver constraint:

version: v1
name: my-app
requires:
  myplugin: "^1.0"
  okapi-bridge: ">=1.47.0"

Validation fails if any named plugin is not registered. On a TTY, kapi prompts to install the missing plugin and retries the command; in CI it prints an actionable error pointing at kapi plugin install. The bare-list form (requires: [myplugin]) is rejected with an actionable migration hint.
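The registration check can be pictured with the following Go sketch. It only tests presence and does not evaluate the semver constraints. The names missingPlugins and installHint are hypothetical, and the message wording is invented for illustration rather than taken from kapi's actual output.

```go
package main

import (
	"fmt"
	"sort"
)

// missingPlugins returns the required plugin names that are not installed,
// sorted for stable output. Both maps are keyed by plugin name.
func missingPlugins(requires, installed map[string]string) []string {
	var missing []string
	for name := range requires {
		if _, ok := installed[name]; !ok {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

// installHint picks between an interactive prompt and a CI-style error.
// The wording is illustrative only.
func installHint(name, constraint string, tty bool) string {
	if tty {
		return fmt.Sprintf("plugin %q (%s) is not installed. Install it now? [y/N]", name, constraint)
	}
	return fmt.Sprintf("plugin %q (%s) is not installed; run: kapi plugin install %s", name, constraint, name)
}

func main() {
	requires := map[string]string{"myplugin": "^1.0", "okapi-bridge": ">=1.47.0"}
	installed := map[string]string{"okapi-bridge": "1.47.0"}
	for _, name := range missingPlugins(requires, installed) {
		fmt.Println(installHint(name, requires[name], false))
	}
}
```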

Registry and signing

A registry is a JSON index served over HTTPS. The default registry is https://neokapi.github.io/registry/manifest-plugins.json. The schema maps plugin name → versions → per-platform tarball URL + SHA-256 + cosign cert identity:

{
  "plugins": {
    "okapi-bridge": {
      "versions": {
        "1.47.0": {
          "channel": "stable",
          "min_kapi_version": "0.1.0",
          "platforms": {
            "darwin/arm64": {
              "url": "https://github.com/.../kapi-okapi-bridge_1.47.0_darwin_arm64.tar.gz",
              "sha256": "...",
              "signature": "https://.../kapi-okapi-bridge_1.47.0_darwin_arm64.tar.gz.sigstore.json",
              "cert_identity": "https://github.com/neokapi/okapi-bridge/.github/workflows/release.yml@refs/tags/v2.46.0",
              "cert_oidc_issuer": "https://token.actions.githubusercontent.com"
            }
          }
        }
      }
    }
  }
}

kapi plugin install downloads the tarball + Sigstore JSON bundle, verifies SHA-256 against the registry-pinned hash, then verifies the bundle's signing cert against the pinned identity + OIDC issuer using sigstore-go. Unsigned plugins refuse to install unless --unsafe is passed.
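The SHA-256 half of that check is simple enough to sketch with the Go standard library. The function names verifySHA256 and verifySHA256Bytes are hypothetical. The Sigstore bundle verification done with sigstore-go is deliberately left out, so this covers only the hash comparison against the registry-pinned value.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// verifySHA256 streams r through SHA-256 and compares the digest with the
// hex string pinned in the registry index (case-insensitive).
func verifySHA256(r io.Reader, wantHex string) error {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return fmt.Errorf("hash tarball: %w", err)
	}
	got := hex.EncodeToString(h.Sum(nil))
	if !strings.EqualFold(got, strings.TrimSpace(wantHex)) {
		return fmt.Errorf("sha256 mismatch: got %s, want %s", got, wantHex)
	}
	return nil
}

// verifySHA256Bytes is a convenience wrapper for in-memory data.
func verifySHA256Bytes(data []byte, wantHex string) error {
	return verifySHA256(bytes.NewReader(data), wantHex)
}

func main() {
	// SHA-256 of the empty input, used here only as a known-good digest.
	const emptyDigest = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
	fmt.Println(verifySHA256Bytes(nil, emptyDigest))
	fmt.Println(verifySHA256Bytes([]byte("tampered"), emptyDigest))
}
```

Streaming through io.Copy avoids loading a large tarball into memory, which matters for app-image bundles that ship a JRE.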

The 1-hour cache at $XDG_CACHE_HOME/kapi/registry-index.json keeps auto-install prompts cheap; the explicit kapi plugin install / search / update-index commands always fetch a fresh index.

Plugin signing vs. OS notarization

Plugin signing sits on a different trust layer than the OS code signing applied to the kapi CLI and desktop apps. The two are independent and answer different questions:

| Layer | Question | Mechanism | Triggered by |
| --- | --- | --- | --- |
| Supply chain | Is this the genuine, untampered plugin? | cosign / Sigstore bundle + SHA-256, verified at install (above) | every kapi plugin install |
| OS Gatekeeper / SmartScreen | Will the OS let the binary run without a warning? | Apple Developer ID + notarization (macOS); Authenticode (Windows) | the com.apple.quarantine xattr, set only by browser / mail downloads |

Plugins rely on the supply-chain layer only. kapi plugin install fetches tarballs over HTTPS using Go's HTTP client, which does not set the quarantine attribute, and unpacks them under the data dir. The extracted binary is therefore never quarantined, so macOS Gatekeeper and Windows SmartScreen never engage on it — no Apple notarization or Authenticode signature is required for a plugin to run. The cosign signature + SHA-256 check is the meaningful integrity guarantee, and it is enforced on every platform.

This is the inverse of the kapi CLI and desktop apps, which are Developer-ID-signed + notarized (macOS) and Authenticode-signed (Windows): users fetch those through a browser (DMG, release archive), so they arrive quarantined and must clear Gatekeeper / SmartScreen on first launch.

The okapi-bridge is a jpackage app-image — a native launcher plus a bundled JRE, i.e. genuine native code — yet the same reasoning holds: installed unquarantined via kapi plugin install and verified by cosign + SHA-256, it runs without OS-level signing. Deep-signing and notarizing it (the launcher plus every bundled-JRE dylib, across each Okapi version × OS × arch) is deliberately not done, because it buys nothing for the programmatic install path. A plugin would need OS code signing only if it were also distributed as a direct browser download — out of scope for the registry-driven model.

JSON Schema validation for schema_extensions

A plugin can declare recipe schema keys it owns:

{
  "schema_extensions": [
    { "name": "server", "scope": "project", "json_schema": "schemas/server.json" }
  ]
}

At plugin-register time, kapi loads <plugin-dir>/schemas/server.json, compiles it via github.com/google/jsonschema-go, and registers an extension decoder with core/project. When a recipe is loaded, the decoder validates the YAML payload against the compiled schema. Failures render with the recipe path prefix and the JSON Schema constraint that failed.

Standard plugins

  • A platform plugin — cloud-server sync (push/pull/auth), distributed separately on its own license terms. It demonstrates how a separately-licensed plugin attaches over the manifest model without re-licensing kapi: installed via its own brew formula (depends on kapi, drops its binary into share/kapi/plugins/<plugin>/).
  • okapi-bridge — Java bridge exposing 57+ Okapi Framework filters. Built with jpackage (no Go shim): produces a native launcher + bundled JRE per platform. Cosign-signed via GitHub Actions keyless OIDC.
  • kapi-pdfium (plugins/pdfium/) — first-party Mode-C format plugin providing a high-fidelity pdf reader backed by Google's PDFium (go-pdfium, cgo). It extracts correct text (including CID/Type0 fonts and CJK), per-block and per-glyph geometry, and document structure, and runs as an isolated daemon so a malformed-PDF crash dies with the subprocess, not kapi. There is no in-core PDF reader on native builds, so the plugin supplies the pdf format outright; the browser uses PDFium compiled to WebAssembly instead. Bundled with both the kapi-cli distribution and the kapi-desktop app: the CLI installs it into the shared share/kapi/plugins/pdfium/ root, and the desktop installs it on demand the first time a PDF is opened, both hosting it over the same cli/pluginhost discovery + daemon pool — one engine, not one per host. PDFium ships as a bundled shared library beside the binary (found via rpath), not statically linked. The full PDF subsystem — extraction modes, the geometry model, and the tagged/geometric structure tiers — is described in AD-028.

A minimal Go reference plugin in examples/plugins/hello/ covers Mode A + B with no third-party deps.

Status

Implemented and merged in #438 (phases 1-9). The legacy v1 plugin runtime — core/plugin/{loader,host,server,shared,registry,cache}/ plus the kapi plugins (plural) command tree — has been deleted. core/plugin/{manifest,proto,protoconvert}/ are kept: manifest for the manifest types and embedded JSON Schema, proto for the gRPC service definitions consumed by Mode-C daemons, and protoconvert for Part↔proto translation. The host-side runtime — discovery, dispatch, the daemon pool, the registry client, cosign verification, and the bridge format client — lives in cli/pluginhost/.

Native binaries ship for linux/amd64, linux/arm64, darwin/arm64, and windows/amd64. darwin/amd64 (Intel Mac) is intentionally not in the release matrix: Apple has dropped Intel from new product lines and macos-13 runners on GitHub Actions are scarce. Intel users can run the JAR directly with their own JRE 17+. Rosetta does not help here, since it translates x86-64 binaries for Apple silicon rather than arm64 binaries for Intel.

References