Multi-Model Fusion - Entity Enricher Documentation

Multi-Model Fusion

When you run the same enrichment across multiple AI models, Entity Enricher can fuse the results into a single, high-confidence output. Fusion detects conflicts between model outputs and resolves them using deterministic rules or LLM-powered arbitration.

Fusion Pipeline

Model Outputs: Claude Result, GPT-4 Result, Gemini Result
→ Conflict Detection: compare every field across all models
→ Resolution: Rule-Based Merge or LLM Arbitration
→ Merged Result: single output with conflict audit trail

Step 1: Conflict Detection

The conflict detector compares every field across all model outputs. Fields where all models agree pass through unchanged. Fields where models disagree are flagged as conflicts that need resolution.

Comparison Rules by Field Type
Type | How Compared | Agreement Means
Scalar | Normalized exact match (trimmed, lowercased, rounded) | All values equal after normalization
Multilingual | Per-language comparison | Each language key matches across models
Array | Set comparison (order-independent) | Same items regardless of order
Object | Recursive per-property | All nested properties match
Null | Null equals missing | Treated as equivalent
Example: Enriching “Sanofi” with 2 Models
Claude Output
  revenue: 42.2
  gmp_status: true
  description: “Sanofi is a global...”
GPT-4 Output
  revenue: 44.1
  gmp_status: true
  description: “Sanofi SA is a...”
Result: gmp_status = agreed | revenue = conflict (42.2 vs 44.1) | description = conflict (different text)
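
The comparison rules above can be sketched in a few lines of Python. This is a minimal illustration, not Entity Enricher's actual implementation: the function names (normalize, flatten, detect_conflicts), the two-decimal rounding precision, and the dotted field paths are all assumptions made for the example. Arrays are assumed to hold scalar items.

```python
from typing import Any


def normalize(value: Any, ndigits: int = 2) -> Any:
    """Normalize a value for comparison (rounding precision is an assumption)."""
    if value is None or isinstance(value, bool):
        return value
    if isinstance(value, str):
        return value.strip().lower()
    if isinstance(value, (int, float)):
        return round(float(value), ndigits)
    if isinstance(value, list):
        # Order-independent set comparison; assumes scalar (hashable) items.
        return frozenset(normalize(v, ndigits) for v in value)
    return value


def flatten(obj: dict, prefix: str = "") -> dict[str, Any]:
    """Turn nested objects into dotted paths so they compare per property."""
    flat: dict[str, Any] = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def detect_conflicts(outputs: dict[str, dict]) -> tuple[dict, dict]:
    """Return (agreed, conflicts), both keyed by field path."""
    flat = {model: flatten(out) for model, out in outputs.items()}
    paths = sorted({p for fields in flat.values() for p in fields})
    agreed, conflicts = {}, {}
    for path in paths:
        # A missing key reads as None, so null and missing are equivalent.
        values = {model: fields.get(path) for model, fields in flat.items()}
        if len({normalize(v) for v in values.values()}) == 1:
            agreed[path] = next(iter(values.values()))
        else:
            conflicts[path] = values
    return agreed, conflicts


# The Sanofi example from above, with hypothetical model keys.
outputs = {
    "claude": {"revenue": 42.2, "gmp_status": True,
               "description": "Sanofi is a global..."},
    "gpt-4": {"revenue": 44.1, "gmp_status": True,
              "description": "Sanofi SA is a..."},
}
agreed, conflicts = detect_conflicts(outputs)
print(agreed)             # only gmp_status is agreed
print(sorted(conflicts))  # description and revenue are conflicts
```

Because a multilingual field is an object keyed by language, flattening it gives one path per language, which matches the per-language comparison in the table.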

Step 2: Conflict Resolution

Conflicts are resolved using one of two methods, depending on whether you selected an arbitration model in the sidebar.

Option A

Rule-Based Merge

Deterministic rules are applied based on each field's data type. No additional LLM calls are needed, so resolution is instant and free.

Field Type | Rule | Rationale
String | Majority vote; tie goes to the longest value | More detail is usually better
Number | Median value | Robust to outliers
Boolean | Majority; true wins ties | Conservative default
Multilingual | Per-language majority vote | Each language resolved independently
Array | Union of all items | Preserve all information
Object | Recursive per-field | Apply rules to nested fields
Null vs Value | Prefer non-null | Missing data is worse than any value

Tie-breaker: When votes are tied, the value from the higher-priced model wins, with price serving as a proxy for capability. If prices are also equal, the model whose name comes first alphabetically wins.
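
As a rough sketch of how these rules might fit together, the function below resolves one conflicted field from a mapping of model name to value. Everything here is illustrative: the name resolve, the rule labels, the prices mapping, and the order in which the string tie rules are applied (longest value first, then price, then model name) are assumptions, since the documentation does not specify how the tie rules interact.

```python
from collections import Counter
from statistics import median
from typing import Any


def resolve(values: dict[str, Any], prices: dict[str, float]) -> tuple[Any, str]:
    """Resolve one conflicted field; returns (chosen_value, rule_used)."""
    present = {m: v for m, v in values.items() if v is not None}
    if not present:
        return None, "all_null"
    if len(present) == 1:  # null vs value: prefer non-null
        return next(iter(present.values())), "prefer_non_null"
    sample = list(present.values())

    if all(isinstance(v, bool) for v in sample):
        counts = Counter(sample)
        return counts[True] >= counts[False], "majority"  # true wins ties
    if all(isinstance(v, (int, float)) for v in sample):
        # With an even number of values this is the mean of the middle two.
        return median(sample), "median"
    if all(isinstance(v, list) for v in sample):
        union: list[Any] = []
        for items in sample:
            for item in items:
                if item not in union:
                    union.append(item)
        return union, "union"
    if all(isinstance(v, dict) for v in sample):
        # Objects (including multilingual fields) are resolved key by key.
        keys = sorted({k for v in sample for k in v})
        merged = {
            k: resolve({m: v.get(k) for m, v in present.items()}, prices)[0]
            for k in keys
        }
        return merged, "recursive"
    if all(isinstance(v, str) for v in sample):
        counts = Counter(sample)
        top = max(counts.values())
        tied = [v for v, c in counts.items() if c == top]
        if len(tied) == 1:
            return tied[0], "majority"
        longest = max(len(v) for v in tied)
        tied = [v for v in tied if len(v) == longest]
        if len(tied) == 1:
            return tied[0], "longest"
        # Final tie-breaker: higher price first, then alphabetical model name.
        ranked = sorted(present, key=lambda m: (-prices.get(m, 0.0), m))
        winner = next(m for m in ranked if present[m] in tied)
        return present[winner], "tie_breaker"
    raise TypeError("mixed or unsupported value types")


prices = {"claude": 15.0, "gpt-4": 10.0}  # hypothetical prices
print(resolve({"claude": 42.2, "gpt-4": 44.1}, prices))  # median, about 43.15
print(resolve({"claude": "Sanofi", "gpt-4": "Sanofi SA"}, prices))
```

Note that with only two models, a numeric conflict resolves to the midpoint of the two values, which neither model actually reported.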

Option B

LLM Arbitration

When you select an arbitration model in the sidebar, conflicts are sent to an LLM for intelligent resolution. The arbitrator receives the entity context, schema field descriptions, and all conflicting values, then makes reasoned decisions.

What the Arbitrator Returns
Chosen Value | The value it considers most accurate
Source Model | Which model the chosen value came from
Reasoning | Why it chose that value over alternatives
Confidence | How confident it is in the decision (high, medium, low)

Fallback: If the arbitration model fails (timeout, error), the system automatically falls back to rule-based merge so you always get a result.
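
The fallback behavior can be pictured as a try/except around the arbitration call. The sketch below is hypothetical: the Decision dataclass mirrors the four fields listed above plus an assumed method field, and both the arbitrate and rule_based callables are stand-ins supplied by the caller rather than real Entity Enricher functions. For brevity the arbitrator here receives only the conflicting values, not the entity context or schema descriptions.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Decision:
    chosen_value: Any
    source_model: Optional[str]
    reasoning: str
    confidence: Optional[str]  # "high", "medium", or "low"; None after fallback
    method: str                # "llm" or "rule_based" (assumed field)


def resolve_conflict(
    values: dict[str, Any],
    arbitrate: Callable[[dict[str, Any]], Decision],
    rule_based: Callable[[dict[str, Any]], Any],
) -> Decision:
    """Try LLM arbitration first; fall back to rules on any failure."""
    try:
        return arbitrate(values)
    except Exception as exc:  # timeout, provider error, and so on
        chosen = rule_based(values)
        source = next((m for m, v in values.items() if v == chosen), None)
        return Decision(
            chosen_value=chosen,
            source_model=source,
            reasoning=f"fallback after {type(exc).__name__}",
            confidence=None,
            method="rule_based",
        )


def failing_arbitrator(values: dict[str, Any]) -> Decision:
    """Stand-in that simulates an arbitration timeout."""
    raise TimeoutError("arbitrator timed out")


decision = resolve_conflict(
    {"claude": 42.2, "gpt-4": 44.1},
    failing_arbitrator,
    rule_based=lambda vals: max(vals.values()),  # toy rule for the demo
)
print(decision.method, decision.chosen_value)
```

Catching a broad exception is a deliberate choice in this sketch, since the stated goal is that a result is always produced.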

Step 3: The Merged Result

After conflict resolution, the system builds a single merged result and stores it as an “arbitration” record in the database. Every merged result includes an audit trail so you can trace how each conflict was resolved.

Audit Trail (Arbitration Metadata)

Every merged result includes metadata that documents the fusion process:

"method": "rule_based" | "llm"
"source_record_ids": ["uuid-1", "uuid-2"]
"total_fields": 23
"agreed_fields": 18
"conflicted_fields": 5
"decisions": [{ path, chosen_value, rule_used, ... }]
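
To make the relationship between the counters concrete, here is a small sketch that assembles this metadata from a set of agreed fields and a list of decisions. The helper names build_metadata and summary_line are hypothetical, and the assumption that total_fields equals agreed_fields plus conflicted_fields is inferred from the sample numbers (18 + 5 = 23). Decision entries carry only the three keys named above; any further keys are not documented here.

```python
from typing import Any


def build_metadata(
    method: str,
    source_record_ids: list[str],
    agreed: dict[str, Any],
    decisions: list[dict[str, Any]],
) -> dict[str, Any]:
    """Assemble arbitration metadata (hypothetical helper)."""
    return {
        "method": method,  # "rule_based" or "llm"
        "source_record_ids": list(source_record_ids),
        "total_fields": len(agreed) + len(decisions),
        "agreed_fields": len(agreed),
        "conflicted_fields": len(decisions),
        "decisions": decisions,
    }


def summary_line(meta: dict[str, Any]) -> str:
    """Format the counters the way the summary header displays them."""
    return (
        f"{meta['agreed_fields']} agreed / "
        f"{meta['conflicted_fields']} resolved / "
        f"{meta['total_fields']} total fields"
    )


# Placeholder data sized to match the sample numbers above.
agreed_fields = {f"field_{i}": i for i in range(18)}
decisions = [
    {"path": f"conflict_{i}", "chosen_value": i, "rule_used": "median"}
    for i in range(5)
]
meta = build_metadata("rule_based", ["uuid-1", "uuid-2"], agreed_fields, decisions)
print(summary_line(meta))
```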

What You See in the UI

After fusion completes, the “Merged” tab in the results panel shows:

1. Summary Header
Shows the resolution method (Rule-Based or LLM) and a count such as “18 agreed / 5 resolved / 23 total fields”.
2. Merged JSON
The complete structured output combining agreed values and resolved conflicts into a single JSON document.
3. Conflict Report
Expandable cards for each conflict showing the field path, the resolution method badge (Majority Vote, Median, Union, etc.), all model values with the chosen one highlighted, and reasoning text if LLM arbitration was used.

Automatic Fusion in Batch Processing

In batch enrichment, fusion happens automatically when you select two or more models. You do not need to click “Merge Results” manually: as soon as all models complete for an entity, fusion runs and the merged result appears alongside the individual model outputs.

Streaming fusion: During both single-entity and batch enrichment, fusion progress is streamed via Server-Sent Events. You see the fusion_started, conflicts_detected, and fusion_completed events in real time.
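
For readers consuming the stream programmatically, the sketch below parses Server-Sent Events text into (event, data) pairs using the standard "event:" and "data:" line format. Only the three event names come from this page; the JSON payloads in the sample stream are invented for illustration, and the real payload shape may differ.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (event_name, data) pairs from Server-Sent Events lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)


# Hypothetical stream; the payload fields are assumptions.
sample_stream = """\
event: fusion_started
data: {"models": 2}

event: conflicts_detected
data: {"conflicted_fields": 5}

event: fusion_completed
data: {"method": "rule_based"}

"""

events = list(parse_sse(sample_stream.splitlines()))
for name, payload in events:
    print(name, json.loads(payload))
```

In a real client the lines would come from a streaming HTTP response rather than a string, but the parsing logic stays the same.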

Rule-Based vs LLM Arbitration: When to Use Each

Rule-Based (Free, Instant)
  • Mostly factual/numeric data where voting logic works well
  • High volume or batch processing where cost matters
  • Simple schemas with few expected conflicts
  • When you want deterministic, reproducible results
LLM Arbitration (Additional Cost)
  • Complex schemas where context matters for resolution
  • Textual data (descriptions, summaries) where voting is insufficient
  • When you need explainable decisions with reasoning
  • High-stakes enrichments where accuracy is worth the extra cost