Enrichment Strategies - Entity Enricher Documentation

Enrichment Strategies

Entity Enricher offers two enrichment strategies that control how LLM calls are orchestrated. Choosing the right strategy affects accuracy, speed, and cost.

Pipeline Diagrams

The configurations below run from the simplest to the most powerful; each builds on the previous one.

Simple

Single Pass — 1 Model

One model, one call. The entire schema is sent in a single prompt. Fast and cheap — ideal for simple schemas or quick iteration.

Entity — Aspirin (any entity: company, drug, legal case, research paper...)
  → Anthropic (use any LLM provider with your own API key)
      Full schema in one call — auto-retries on validation failure.
  → Enriched Result — Aspirin
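The single-pass flow above can be sketched in a few lines of Python. `call_llm` and the validation helper are hypothetical stand-ins, not Entity Enricher's actual API; the point is the shape of the loop: one prompt carrying the full schema, retried when the response fails validation.

```python
import json

MAX_RETRIES = 3

def validate(result, schema):
    # Hypothetical validator: every schema property must be present.
    return all(key in result for key in schema["properties"])

def single_pass(entity, schema, call_llm):
    """One model, one call: send the full schema, retry on bad output."""
    prompt = (
        f"Fill in every field of this JSON schema for the entity "
        f"'{entity}':\n{json.dumps(schema)}"
    )
    for _ in range(MAX_RETRIES):
        result = json.loads(call_llm(prompt))
        if validate(result, schema):
            return result
    raise RuntimeError(f"Validation failed after {MAX_RETRIES} attempts")

# Usage with a fake model that answers correctly on the second try:
schema = {"properties": {"name": {}, "class": {}}}
answers = iter(['{"name": "Aspirin"}',
                '{"name": "Aspirin", "class": "NSAID"}'])
result = single_pass("Aspirin", schema, lambda prompt: next(answers))
```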

Multi-Model

Single Pass — 3 Models

Same strategy, but run across multiple models in parallel. Results are compared and arbitrated field-by-field to produce a single high-confidence output.

Entity — Aspirin (any entity: company, drug, legal case, research paper...)
  → Pre-flight Classification: Match — Pharmaceutical Compound
      Catches type mismatches before wasting LLM credits.
  → Anthropic / OpenAI / Google Gemini (bring your own API keys — works with any LLM provider)
      Each model receives the full schema in one call — auto-retries on validation failure.
  → Final Enriched Result — Aspirin (Arbitrated)
      Reasoned field-level conflict resolution produces the final trusted result.
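The field-by-field arbitration step can be illustrated with a minimal sketch. A majority vote stands in here for the product's reasoned, LLM-driven arbitration; `arbitrate` and the model result dicts are illustrative, not Entity Enricher's implementation.

```python
from collections import Counter

def arbitrate(results):
    """Merge per-model results field by field.
    Majority vote stands in for reasoned arbitration; on a tie,
    the first model's answer wins (Counter keeps insertion order)."""
    merged = {}
    fields = {field for result in results for field in result}
    for field in sorted(fields):
        votes = [result[field] for result in results if field in result]
        merged[field] = Counter(votes).most_common(1)[0][0]
    return merged

# Three models agree on the name but split on the drug class:
anthropic = {"name": "Aspirin", "class": "NSAID"}
openai    = {"name": "Aspirin", "class": "NSAID"}
gemini    = {"name": "Aspirin", "class": "Analgesic"}
final = arbitrate([anthropic, openai, gemini])
# final["class"] is "NSAID" (two votes to one)
```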

Advanced

Multi-Expertise — 3 Models

The schema is split by expertise domain. Each model receives focused sub-prompts for each domain. Results are deep-merged per model, then arbitrated across models. Maximum accuracy for complex, multi-domain schemas.

Entity — Aspirin (any entity: company, drug, legal case, research paper...)
  → Pre-flight Classification: Match — Pharmaceutical Compound
      Catches type mismatches before wasting LLM credits.
  → Anthropic / OpenAI / Google Gemini (bring your own API keys — works with any LLM provider)
      Each model receives one focused LLM prompt per expertise domain (e.g., Pharmacology, Regulatory).
      Schema split by domain — self-correcting prompts retry on validation failure.
  → Anthropic Result / OpenAI Result / Gemini Result
      Deep merge of expertise responses per model.
  → Final Enriched Result — Aspirin (Arbitrated)
      Reasoned field-level conflict resolution produces the final trusted result.
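The "deep merge of expertise responses per model" step can be illustrated with a small recursive merge. `deep_merge` is a hypothetical helper, not the product's implementation: nested objects combine key by key, so the Pharmacology and Regulatory sub-results fold into one result per model.

```python
def deep_merge(base, update):
    """Recursively merge `update` into a copy of `base`.
    Nested dicts combine; scalar values from `update` win."""
    merged = dict(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# One model's two expertise responses for Aspirin:
pharmacology = {"name": "Aspirin",
                "details": {"mechanism": "COX inhibition"}}
regulatory   = {"details": {"gmp_status": "approved"}}
model_result = deep_merge(pharmacology, regulatory)
# model_result["details"] now holds both mechanism and gmp_status
```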

Detailed Comparison

Aspect          | Single Pass                        | Multi-Expertise
LLM Calls       | 1 per model                        | N per model (1 per expertise domain)
Schema Sent     | Full schema in one prompt          | Subset per expertise domain
Execution       | Sequential (one call)              | Parallel (all expertises run simultaneously)
Accuracy        | Good for simple schemas            | Higher — focused prompts yield better results
Speed           | Faster for small schemas           | Parallel execution can be faster for large schemas
Cost            | Lower (single call overhead)       | Higher (multiple calls with per-call overhead)
Streaming       | All-or-nothing result              | Progressive — results stream as each expertise completes
Partial Success | No — entire call succeeds or fails | Yes — successful expertises are preserved, failed ones can be retried

When to Use Each Strategy

Use Single Pass When:

  • Your schema has fewer than 15–20 properties
  • All properties belong to a single domain (e.g., all financial data)
  • You want the fastest, cheapest result and accuracy is less critical
  • You are testing a new schema and iterating quickly

Use Multi-Expertise When:

  • Your schema spans multiple expertise domains (pharmaceutical, financial, geographic, etc.)
  • You have a complex schema with 20+ properties
  • Accuracy is critical and you want focused, specialized prompts
  • You want real-time progress as each domain completes
  • You need partial success handling — retry only what failed
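The guidance above can be condensed into a simple heuristic. The thresholds mirror the bullets (roughly 20 properties, more than one domain); the function name and signature are illustrative, not part of Entity Enricher.

```python
def choose_strategy(num_properties, num_domains, accuracy_critical=False):
    """Pick an enrichment strategy from the rules of thumb above."""
    if num_domains > 1 or num_properties >= 20 or accuracy_critical:
        return "multi-expertise"
    return "single-pass"

# A small single-domain schema vs. a large multi-domain one:
small = choose_strategy(8, 1)
large = choose_strategy(30, 4)
```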

How Multi-Expertise Works in Detail

The multi-expertise strategy follows a four-step process for each model:

1
Group Properties by Expertise

The schema is traversed recursively. Each property with an expertise domain tag is grouped with others sharing the same domain. For example, revenue and market_cap go to the “financial” group, while gmp_status goes to “regulatory”.

2
Create Focused Sub-Schemas

Each expertise group becomes a minimal sub-schema containing only its properties. This means the LLM receives a smaller, more focused prompt and only needs to fill in fields it specializes in.
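Steps 1 and 2 can be sketched together: walk the schema's properties, bucket each by its expertise tag, and emit one minimal sub-schema per domain. The `expertise` tag name and the flat (non-recursive) traversal are simplifying assumptions for illustration; the real traversal is recursive.

```python
from collections import defaultdict

def split_by_expertise(schema):
    """Group properties by their 'expertise' tag and build one
    focused sub-schema per expertise domain."""
    groups = defaultdict(dict)
    for name, prop in schema["properties"].items():
        groups[prop.get("expertise", "general")][name] = prop
    return {domain: {"properties": props}
            for domain, props in groups.items()}

schema = {"properties": {
    "revenue":    {"type": "number", "expertise": "financial"},
    "market_cap": {"type": "number", "expertise": "financial"},
    "gmp_status": {"type": "string", "expertise": "regulatory"},
}}
subs = split_by_expertise(schema)
# subs["financial"] contains revenue and market_cap;
# subs["regulatory"] contains only gmp_status.
```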

3
Run in Parallel

All expertise calls run concurrently. A schema with 5 expertise domains will launch 5 LLM calls at the same time. As each one completes, its results are deep-merged into the accumulated output and streamed to the UI in real-time.
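The concurrency in step 3 can be sketched with a thread pool: every expertise call is submitted at once, and each result is merged into the accumulated output as soon as it completes. `enrich_domain` is a stand-in for an LLM call scoped to one sub-schema.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def enrich_domain(domain):
    # Stand-in for an LLM call against one expertise sub-schema.
    return {domain: f"{domain} fields filled"}

def run_expertises(domains):
    """Launch all expertise calls concurrently; merge as they finish."""
    output = {}
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(enrich_domain, d) for d in domains]
        for future in as_completed(futures):
            output.update(future.result())  # merge-and-stream point
    return output

result = run_expertises(["pharmacology", "regulatory", "financial"])
```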

4
Handle Partial Failures

If some expertises fail, the system returns the merged output from successful ones with a “Partial” status. You can retry only the failed expertises, and the new results will be merged into the existing output without repeating the work that already succeeded.
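Step 4 can be modeled as: collect successes, record failures, report a "Partial" status, then retry only the failed domains and merge the new results into the existing output. The names and the sequential loop are illustrative simplifications.

```python
def enrich_with_retry(domains, call):
    """Run each domain once; report which ones failed."""
    output, failed = {}, []
    for domain in domains:
        try:
            output.update(call(domain))
        except Exception:
            failed.append(domain)
    status = "Partial" if failed else "Complete"
    return output, failed, status

# Fake enricher where 'regulatory' fails on the first attempt only:
flaky = {"regulatory": True}
def call(domain):
    if flaky.pop(domain, False):
        raise RuntimeError("validation failed")
    return {domain: "ok"}

output, failed, status = enrich_with_retry(
    ["pharmacology", "regulatory"], call)
# status == "Partial", failed == ["regulatory"]
retry_out, _, _ = enrich_with_retry(failed, call)
output.update(retry_out)   # merged without redoing 'pharmacology'
```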

Combining with Multi-Model Enrichment

Both strategies can be combined with multi-model enrichment. When you select multiple models, each model runs the chosen strategy independently. The results can then be merged using multi-model fusion to produce a single high-confidence output.

Example: Using multi-expertise with 3 models and a schema that has 4 expertise domains will launch 12 LLM calls in total (3 models x 4 expertises). Models from different providers run in parallel, while models from the same provider are queued to respect rate limits.
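The call-count arithmetic and the queueing rule can be sketched as follows: total calls = models x expertises, and calls that share a provider go into one serial queue while queues for different providers run side by side. The model and provider names below are illustrative.

```python
from collections import defaultdict

# (model, provider) pairs and expertise domains for the example above:
models = [("claude", "anthropic"), ("gpt", "openai"),
          ("gemini-pro", "google")]
expertises = ["pharmacology", "regulatory", "financial", "clinical"]

total_calls = len(models) * len(expertises)   # 3 models x 4 expertises

# One serial queue per provider to respect rate limits;
# queues for different providers can run in parallel.
queues = defaultdict(list)
for model, provider in models:
    for expertise in expertises:
        queues[provider].append((model, expertise))
```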