Enrichment Strategies - Entity Enricher Documentation

Enrichment Strategies

Entity Enricher offers two enrichment strategies that control how LLM calls are orchestrated. Choosing the right strategy affects accuracy, speed, and cost.

Pipeline Diagrams

The configurations below run from the simplest to the most powerful; each builds on the previous one.

Simple

Single Pass — 1 Model

One model, one call. The entire schema is sent in a single prompt. Fast and cheap — ideal for simple schemas or quick iteration.

Entity — Aspirin (any entity: company, drug, legal case, research paper...)
  → Anthropic (use any LLM provider with your own API key)
      Full schema in one call — auto-retries on validation failure.
  → Enriched Result — Aspirin
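The single-pass flow above can be sketched in a few lines of Python. `call_llm` and the validation helper are hypothetical stand-ins, not Entity Enricher's actual API; the point is the shape of the loop: one prompt carrying the full schema, retried when the response fails validation.

```python
import json

MAX_RETRIES = 3

def validate(result, schema):
    # Hypothetical validator: every schema property must be present.
    return all(key in result for key in schema["properties"])

def single_pass(entity, schema, call_llm):
    """One model, one call: send the full schema, retry on bad output."""
    prompt = (
        f"Fill in every field of this JSON schema for the entity "
        f"'{entity}':\n{json.dumps(schema)}"
    )
    for _ in range(MAX_RETRIES):
        result = json.loads(call_llm(prompt))
        if validate(result, schema):
            return result
    raise RuntimeError(f"Validation failed after {MAX_RETRIES} attempts")

# Usage with a fake model that answers correctly on the second try:
schema = {"properties": {"name": {}, "class": {}}}
answers = iter(['{"name": "Aspirin"}',
                '{"name": "Aspirin", "class": "NSAID"}'])
result = single_pass("Aspirin", schema, lambda prompt: next(answers))
```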

Multi-Model

Single Pass — 3 Models

Same strategy, but run across multiple models in parallel. Results are compared and arbitrated field-by-field to produce a single high-confidence output.

Entity — Aspirin (any entity: company, drug, legal case, research paper...)
  → Pre-flight Classification: Match — Pharmaceutical Compound
      Catches type mismatches before wasting LLM credits.
  → Anthropic / OpenAI / Google Gemini (bring your own API keys — works with any LLM provider)
      Each model receives the full schema in one call — auto-retries on validation failure.
  → Final Enriched Result — Aspirin (Arbitrated)
      Reasoned field-level conflict resolution produces the final trusted result.
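The field-by-field arbitration step can be illustrated with a minimal sketch. A majority vote stands in here for the product's reasoned, LLM-driven arbitration; `arbitrate` and the model result dicts are illustrative, not Entity Enricher's implementation.

```python
from collections import Counter

def arbitrate(results):
    """Merge per-model results field by field.
    Majority vote stands in for reasoned arbitration; on a tie,
    the first model's answer wins (Counter keeps insertion order)."""
    merged = {}
    fields = {field for result in results for field in result}
    for field in sorted(fields):
        votes = [result[field] for result in results if field in result]
        merged[field] = Counter(votes).most_common(1)[0][0]
    return merged

# Three models agree on the name but split on the drug class:
anthropic = {"name": "Aspirin", "class": "NSAID"}
openai    = {"name": "Aspirin", "class": "NSAID"}
gemini    = {"name": "Aspirin", "class": "Analgesic"}
final = arbitrate([anthropic, openai, gemini])
# final["class"] is "NSAID" (two votes to one)
```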

Advanced

Multi-Expertise — 3 Models

The schema is split by expertise domain. Each model receives focused sub-prompts for each domain. Results are deep-merged per model, then arbitrated across models. Maximum accuracy for complex, multi-domain schemas.

Entity — Aspirin (any entity: company, drug, legal case, research paper...)
  → Pre-flight Classification: Match — Pharmaceutical Compound
      Catches type mismatches before wasting LLM credits.
  → Anthropic / OpenAI / Google Gemini (bring your own API keys — works with any LLM provider)
      Each model receives one focused LLM prompt per expertise domain (e.g., Pharmacology, Regulatory).
      Schema split by domain — self-correcting prompts retry on validation failure.
  → Anthropic Result / OpenAI Result / Gemini Result
      Deep merge of expertise responses per model.
  → Final Enriched Result — Aspirin (Arbitrated)
      Reasoned field-level conflict resolution produces the final trusted result.
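The "deep merge of expertise responses per model" step can be illustrated with a small recursive merge. `deep_merge` is a hypothetical helper, not the product's implementation: nested objects combine key by key, so the Pharmacology and Regulatory sub-results fold into one result per model.

```python
def deep_merge(base, update):
    """Recursively merge `update` into a copy of `base`.
    Nested dicts combine; scalar values from `update` win."""
    merged = dict(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# One model's two expertise responses for Aspirin:
pharmacology = {"name": "Aspirin",
                "details": {"mechanism": "COX inhibition"}}
regulatory   = {"details": {"gmp_status": "approved"}}
model_result = deep_merge(pharmacology, regulatory)
# model_result["details"] now holds both mechanism and gmp_status
```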

Detailed Comparison

Aspect          | Single Pass                        | Multi-Expertise
LLM Calls       | 1 per model                        | N per model (1 per expertise domain)
Schema Sent     | Full schema in one prompt          | Subset per expertise domain
Execution       | Sequential (one call)              | Parallel (all expertises run simultaneously)
Accuracy        | Good for simple schemas            | Higher — focused prompts yield better results
Speed           | Faster for small schemas           | Parallel execution can be faster for large schemas
Cost            | Lower (single call overhead)       | Higher (multiple calls with per-call overhead)
Streaming       | All-or-nothing result              | Progressive — results stream as each expertise completes
Partial Success | No — entire call succeeds or fails | Yes — successful expertises are preserved, failed ones can be retried

When to Use Each Strategy

Use Single Pass When:

  • Your schema has fewer than 15–20 properties
  • All properties belong to a single domain (e.g., all financial data)
  • You want the fastest, cheapest result and accuracy is less critical
  • You are testing a new schema and iterating quickly

Use Multi-Expertise When:

  • Your schema spans multiple expertise domains (pharmaceutical, financial, geographic, etc.)
  • You have a complex schema with 20+ properties
  • Accuracy is critical and you want focused, specialized prompts
  • You want real-time progress as each domain completes
  • You need partial success handling — retry only what failed
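The guidance above can be condensed into a simple heuristic. The thresholds mirror the bullets (roughly 20 properties, more than one domain); the function name and signature are illustrative, not part of Entity Enricher.

```python
def choose_strategy(num_properties, num_domains, accuracy_critical=False):
    """Pick an enrichment strategy from the rules of thumb above."""
    if num_domains > 1 or num_properties >= 20 or accuracy_critical:
        return "multi-expertise"
    return "single-pass"

# A small single-domain schema vs. a large multi-domain one:
small = choose_strategy(8, 1)
large = choose_strategy(30, 4)
```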

How Multi-Expertise Works in Detail

The multi-expertise strategy follows a four-step process for each model:

1
Group Properties by Expertise

The schema is traversed recursively. Each property with an expertise domain tag is grouped with others sharing the same domain. For example, revenue and market_cap go to the “financial” group, while gmp_status goes to “regulatory”.

2
Create Focused Sub-Schemas

Each expertise group becomes a minimal sub-schema containing only its properties. This means the LLM receives a smaller, more focused prompt and only needs to fill in fields it specializes in.
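Steps 1 and 2 can be sketched together: walk the schema's properties, bucket each by its expertise tag, and emit one minimal sub-schema per domain. The `expertise` tag name and the flat (non-recursive) traversal are simplifying assumptions for illustration; the real traversal is recursive.

```python
from collections import defaultdict

def split_by_expertise(schema):
    """Group properties by their 'expertise' tag and build one
    focused sub-schema per expertise domain."""
    groups = defaultdict(dict)
    for name, prop in schema["properties"].items():
        groups[prop.get("expertise", "general")][name] = prop
    return {domain: {"properties": props}
            for domain, props in groups.items()}

schema = {"properties": {
    "revenue":    {"type": "number", "expertise": "financial"},
    "market_cap": {"type": "number", "expertise": "financial"},
    "gmp_status": {"type": "string", "expertise": "regulatory"},
}}
subs = split_by_expertise(schema)
# subs["financial"] contains revenue and market_cap;
# subs["regulatory"] contains only gmp_status.
```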

3
Run in Parallel

All expertise calls run concurrently. A schema with 5 expertise domains will launch 5 LLM calls at the same time. As each one completes, its results are deep-merged into the accumulated output and streamed to the UI in real-time.
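The concurrency in step 3 can be sketched with a thread pool: every expertise call is submitted at once, and each result is merged into the accumulated output as soon as it completes. `enrich_domain` is a stand-in for an LLM call scoped to one sub-schema.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def enrich_domain(domain):
    # Stand-in for an LLM call against one expertise sub-schema.
    return {domain: f"{domain} fields filled"}

def run_expertises(domains):
    """Launch all expertise calls concurrently; merge as they finish."""
    output = {}
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(enrich_domain, d) for d in domains]
        for future in as_completed(futures):
            output.update(future.result())  # merge-and-stream point
    return output

result = run_expertises(["pharmacology", "regulatory", "financial"])
```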

4
Handle Partial Failures

If some expertises fail, the system returns the merged output from successful ones with a “Partial” status. You can retry only the failed expertises, and the new results will be merged into the existing output without repeating the work that already succeeded.
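Step 4 can be modeled as: collect successes, record failures, report a "Partial" status, then retry only the failed domains and merge the new results into the existing output. The names and the sequential loop are illustrative simplifications.

```python
def enrich_with_retry(domains, call):
    """Run each domain once; report which ones failed."""
    output, failed = {}, []
    for domain in domains:
        try:
            output.update(call(domain))
        except Exception:
            failed.append(domain)
    status = "Partial" if failed else "Complete"
    return output, failed, status

# Fake enricher where 'regulatory' fails on the first attempt only:
flaky = {"regulatory": True}
def call(domain):
    if flaky.pop(domain, False):
        raise RuntimeError("validation failed")
    return {domain: "ok"}

output, failed, status = enrich_with_retry(
    ["pharmacology", "regulatory"], call)
# status == "Partial", failed == ["regulatory"]
retry_out, _, _ = enrich_with_retry(failed, call)
output.update(retry_out)   # merged without redoing 'pharmacology'
```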

Combining with Multi-Model Enrichment

Both strategies can be combined with multi-model enrichment. When you select multiple models, each model runs the chosen strategy independently. The results can then be merged using multi-model fusion to produce a single high-confidence output.

Example: Using multi-expertise with 3 models and a schema that has 4 expertise domains will launch 12 LLM calls in total (3 models x 4 expertises). Models from different providers run in parallel, while models from the same provider are queued to respect rate limits.
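The call-count arithmetic and the queueing rule can be sketched as follows: total calls = models x expertises, and calls that share a provider go into one serial queue while queues for different providers run side by side. The model and provider names below are illustrative.

```python
from collections import defaultdict

# (model, provider) pairs and expertise domains for the example above:
models = [("claude", "anthropic"), ("gpt", "openai"),
          ("gemini-pro", "google")]
expertises = ["pharmacology", "regulatory", "financial", "clinical"]

total_calls = len(models) * len(expertises)   # 3 models x 4 expertises

# One serial queue per provider to respect rate limits;
# queues for different providers can run in parallel.
queues = defaultdict(list)
for model, provider in models:
    for expertise in expertises:
        queues[provider].append((model, expertise))
```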