AI Entity Extract tool
The AI Entity Extract tool analyses a block's source text with a large language model and records two kinds of stand-off annotation: named entities (people, organizations, products, locations, and also dates, times, currencies, and measurements) and terminology candidates (domain-specific terms that would benefit from a termbase entry). Each entity carries a suggested do-not-translate flag; each term candidate carries a category and a translatability classification (do-not-translate, consistent, or free). It is read-only — it writes annotations only and never changes the source or target.
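As a rough illustration of what "stand-off" means here, the sketch below keeps annotations as character offsets alongside an untouched source string. The class names, field names, and sample sentence are hypothetical and chosen only for this example; the tool's actual annotation schema may look different.

```python
from dataclasses import dataclass

# Hypothetical shapes for illustration only; not the tool's real schema.
@dataclass(frozen=True)
class EntityAnnotation:
    start: int               # offset of the first character in the source
    end: int                 # offset one past the last character
    entity_type: str         # e.g. "organization", "date"
    do_not_translate: bool   # suggested flag, to be reviewed

@dataclass(frozen=True)
class TermCandidate:
    start: int
    end: int
    category: str
    translatability: str     # "do-not-translate", "consistent", or "free"

def surface(text, annotation):
    """Return the text an annotation points at, without modifying the source."""
    return text[annotation.start:annotation.end]

source = "Acme Corp ships the FluxDrive on 3 March."

entities = [
    EntityAnnotation(0, 9, "organization", True),
    EntityAnnotation(33, 40, "date", False),
]
terms = [TermCandidate(20, 29, "product", "do-not-translate")]

for annotation in entities + terms:
    print(surface(source, annotation), annotation)
```

Because the annotations only reference offsets, the source string is the same before and after extraction, which is the sense in which the tool is read-only.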
Extraction can optionally combine the LLM with a NER provider for fast entity detection; the LLM classification is preferred where the two overlap. Blocks can be analysed one at a time or grouped into batches sent in a single structured call, and batches can run concurrently. Known terms already in the termbase can be supplied so they are not re-proposed. A provider and, for hosted providers, credentials are required.
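The effect of supplying known terms can be pictured as a filter over the proposed candidates. The sketch below is an assumption about the observable behaviour, not the tool's implementation: the tool may equally pass the known terms to the model in the prompt, and the case-insensitive exact match used here is a choice made for the example. The function name `drop_known_terms` is hypothetical.

```python
def drop_known_terms(candidates, known_terms):
    """Remove candidates that already exist in the termbase.

    Matching is case-insensitive and exact; this is an assumption for
    the sketch, not documented behaviour of the tool.
    """
    known = {term.casefold() for term in known_terms}
    return [c for c in candidates if c.casefold() not in known]

proposed = ["FluxDrive", "torque limiter", "Firmware Update"]
known_terms = ["fluxdrive", "firmware update"]

remaining = drop_known_terms(proposed, known_terms)
print(remaining)
```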
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| apiKey | string | | API key for the AI provider |
| batchConcurrency | integer | 1 | Number of concurrent batch calls (0 or 1 = sequential) |
| batchSize | integer | 1 | Number of blocks per LLM call (0 or 1 = one block per call) |
| engine | string | llm | llm (AI provider; default) / ner (local on-device model; nothing leaves the machine) / hybrid (both) |
| knownTerms | string[] | | Terms to exclude from extraction (already in termbase) |
| locale | string | | Locale of the source content |
| model | string | | AI model name |
| provider | string | anthropic | AI provider |
Configure these parameters interactively and copy the flow-step YAML on the Tool Reference.
Examples
Extract entities and terms with Anthropic
Analyse source blocks one at a time with an Anthropic model.
    provider: anthropic
    locale: en-US
Batched extraction
Analyse blocks in batches of 20, four batches at a time.
    provider: openai
    batchSize: 20
    batchConcurrency: 4
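To make the interaction between batchSize and batchConcurrency concrete, here is a minimal sketch of how blocks might be grouped and dispatched. It assumes a thread pool and uses a stand-in `fake_call` in place of the real LLM request; the helper names are hypothetical and the tool's actual scheduling is not documented here.

```python
from concurrent.futures import ThreadPoolExecutor

def make_batches(blocks, batch_size):
    """Group blocks into batches; 0 or 1 means one block per call."""
    size = max(1, batch_size)
    return [blocks[i:i + size] for i in range(0, len(blocks), size)]

def run_batches(batches, call, batch_concurrency):
    """Run one call per batch; 0 or 1 means sequential execution."""
    workers = max(1, batch_concurrency)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input batches.
        return list(pool.map(call, batches))

def fake_call(batch):
    # Stand-in for a single structured LLM call covering the whole batch.
    return [f"annotated:{block}" for block in batch]

blocks = [f"block-{i}" for i in range(50)]
batches = make_batches(blocks, 20)           # batchSize: 20
results = run_batches(batches, fake_call, 4) # batchConcurrency: 4

print([len(b) for b in batches])
```

With 50 blocks and a batch size of 20, this produces three calls (20, 20, and 10 blocks), all of which fit within the concurrency limit of four.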
Processing notes
Operates on translatable blocks with non-empty source; other parts pass through unchanged.
Read-only — writes entity and term-candidate annotations and never modifies source or target.
When both an LLM and a NER provider produce an entity at the same span, the LLM classification is kept.
Dates, times, currencies, and measurements are not defaulted to do-not-translate, since they need locale-specific formatting.
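The overlap rule in the notes above can be sketched as a merge keyed by span. This is an illustrative assumption about how the rule could be applied: entities are plain dictionaries with hypothetical `start`, `end`, `type`, and `source` keys, and only identical spans are treated as overlapping; how the tool handles partially overlapping spans is not stated in this page.

```python
def merge_entities(llm_entities, ner_entities):
    """Combine both result sets; the LLM entry wins at an identical span."""
    merged = {(e["start"], e["end"]): e for e in ner_entities}
    # Later updates overwrite earlier keys, so LLM entries replace NER ones.
    merged.update({(e["start"], e["end"]): e for e in llm_entities})
    return sorted(merged.values(), key=lambda e: (e["start"], e["end"]))

ner_entities = [
    {"start": 0, "end": 9, "type": "person", "source": "ner"},
    {"start": 20, "end": 29, "type": "product", "source": "ner"},
]
llm_entities = [
    {"start": 0, "end": 9, "type": "organization", "source": "llm"},
]

merged = merge_entities(llm_entities, ner_entities)
for entity in merged:
    print(entity)
```

In this example the span 0 to 9 keeps the LLM classification, while the span 20 to 29, found only by the NER side, is carried over unchanged.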
Limitations
Requires a provider and, for hosted providers, valid credentials; hosted providers make billed, rate-limited network calls.
The NER provider is optional and supplied programmatically; with no NER provider, extraction is LLM-only.
Entity and term suggestions (including the do-not-translate flag and translatability) are model proposals and should be reviewed before acting on them.