AI Entity Extract tool

The AI Entity Extract tool analyses a block's source text with a large language model and records two kinds of stand-off annotation: named entities (people, organizations, products, and locations, as well as dates, times, currencies, and measurements) and terminology candidates (domain-specific terms that would benefit from a termbase entry). Each entity carries a suggested do-not-translate flag; each term candidate carries a category and a translatability classification (do-not-translate, consistent, or free). The tool is read-only: it writes annotations and never changes the source or target.
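To make the two annotation kinds concrete, here is a minimal sketch of how they could be modelled. The class names, field names, and the use of character offsets are hypothetical and chosen for illustration only; the tool's actual annotation schema is not described here.

```python
from dataclasses import dataclass
from enum import Enum


class Translatability(Enum):
    """The three translatability classes named in the description."""
    DO_NOT_TRANSLATE = "do-not-translate"
    CONSISTENT = "consistent"
    FREE = "free"


@dataclass(frozen=True)
class EntityAnnotation:
    # Hypothetical fields: stand-off annotations point at the source by
    # offset instead of modifying it.
    start: int               # inclusive character offset
    end: int                 # exclusive character offset
    text: str
    entity_type: str         # e.g. "person", "organization", "date"
    do_not_translate: bool   # a suggestion to review, not a decision


@dataclass(frozen=True)
class TermCandidate:
    text: str
    category: str
    translatability: Translatability


source = "Acme Corp ships the FluxDrive on 3 March."
entity = EntityAnnotation(0, 9, source[0:9], "organization", True)
term = TermCandidate("FluxDrive", "product", Translatability.DO_NOT_TRANSLATE)
```

Because the annotations only reference the text, `source` is left exactly as it was, which mirrors the read-only behaviour described above.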

Extraction can optionally combine the LLM with a NER provider for fast entity detection; the LLM classification is preferred where the two overlap. Blocks can be analysed one at a time or grouped into batches sent in a single structured call, and batches can run concurrently. Known terms already in the termbase can be supplied so they are not re-proposed. A provider and, for hosted providers, credentials are required.

ID: ai-entity-extract
Source: Built-in
Category: analysis
Cardinality: monolingual
Requires: credentials
Tags: ai-powered

Parameters

Parameter	Type	Default	Description
`apiKey`	string		API key for the AI provider
`batchConcurrency`	integer	1	Number of concurrent batch calls (0 or 1 = sequential)
`batchSize`	integer	1	Number of blocks per LLM call (0 or 1 = one block per call)
`engine`	string	llm	Extraction engine: llm (AI provider; the default), ner (local on-device model; nothing leaves the machine), or hybrid (both)
`knownTerms`	string[]		Terms to exclude from extraction (already in termbase)
`locale`	string		Locale of the source content
`model`	string		AI model name
`provider`	string	anthropic	AI provider

These parameters can be configured interactively on the Tool Reference page, which also provides the flow-step YAML to copy.

Examples

Extract entities and terms with Anthropic

Analyse source blocks one at a time with an Anthropic model.

provider: anthropic
locale: en-US

Batched extraction

Analyse blocks in batches of 20, four batches at a time.

provider: openai
batchSize: 20
batchConcurrency: 4
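The following sketch illustrates how `batchSize` and `batchConcurrency` could interact, assuming the semantics given in the parameter table (0 or 1 means one block per call, or sequential execution). The function names and the stand-in `fake_llm_call` are hypothetical; this is not the tool's implementation, and a thread pool is only one possible way to run batches concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def make_batches(blocks: Sequence[str], batch_size: int) -> List[List[str]]:
    """Group blocks into batches; a size of 0 or 1 yields one block per batch."""
    size = max(batch_size, 1)
    return [list(blocks[i:i + size]) for i in range(0, len(blocks), size)]


def run_batches(
    batches: List[List[str]],
    call: Callable[[List[str]], List[str]],
    batch_concurrency: int,
) -> List[List[str]]:
    """Run one call per batch; a concurrency of 0 or 1 runs them in sequence."""
    workers = max(batch_concurrency, 1)
    if workers == 1:
        return [call(batch) for batch in batches]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, so results stay
        # aligned with their batches even when calls finish out of order.
        return list(pool.map(call, batches))


def fake_llm_call(batch: List[str]) -> List[str]:
    """Hypothetical stand-in for a single structured LLM call."""
    return [f"annotated:{block}" for block in batch]


blocks = [f"block {i}" for i in range(45)]
batches = make_batches(blocks, 20)   # batch sizes: 20, 20, 5
results = run_batches(batches, fake_llm_call, 4)
```

With 45 blocks and the settings from the example above, this produces three batches, all of which fit within the four concurrent slots.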

Processing notes

Operates on translatable blocks with non-empty source; other parts pass through unchanged.
Read-only — writes entity and term-candidate annotations and never modifies source or target.
When both an LLM and a NER provider produce an entity at the same span, the LLM classification is kept.
Dates, times, currencies, and measurements are not defaulted to do-not-translate, since they need locale-specific formatting.
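The precedence rule for entities found by both engines can be sketched as follows. This is a minimal illustration that assumes spans are compared by exact start and end offsets; the `Entity` type and `merge_entities` function are hypothetical, and partially overlapping spans are not handled.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Entity:
    start: int        # inclusive character offset
    end: int          # exclusive character offset
    entity_type: str
    origin: str       # "llm" or "ner"


def merge_entities(llm: Iterable[Entity], ner: Iterable[Entity]) -> List[Entity]:
    """Merge two entity lists, keeping the LLM entity when spans are identical."""
    by_span: Dict[Tuple[int, int], Entity] = {}
    for entity in ner:
        by_span[(entity.start, entity.end)] = entity
    for entity in llm:
        # Inserted second, so an LLM entity replaces a NER entity
        # recorded at the same span.
        by_span[(entity.start, entity.end)] = entity
    return sorted(by_span.values(), key=lambda e: (e.start, e.end))


# Source text: "Apple opened a store in Berlin."
ner_entities = [Entity(0, 5, "product", "ner"), Entity(24, 30, "location", "ner")]
llm_entities = [Entity(0, 5, "organization", "llm")]
merged = merge_entities(llm_entities, ner_entities)
```

Here both engines tag "Apple" at span 0 to 5, so the LLM's "organization" classification is kept, while the NER-only "Berlin" entity passes through unchanged.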

Limitations

Requires a provider and, for hosted providers, valid credentials; hosted providers make billed, rate-limited network calls.
The NER provider is optional and supplied programmatically; with no NER provider, extraction is LLM-only.
Entity and term suggestions (including the do-not-translate flag and translatability) are model proposals and should be reviewed before acting on them.
