Batch Processing - Entity Enricher Documentation

Batch Processing

Enrich up to 100 entities in parallel with real-time progress tracking, automatic multi-model fusion, and export to JSON or Excel.

Input Methods

Batch enrichment supports two ways to provide entity data:

JSON Editor

Paste or type a JSON array of entities directly. The editor provides syntax highlighting and validation markers, and it persists your data across sessions in local storage.

[
  { "name": "Sanofi", "country": "France" },
  { "name": "Pfizer", "country": "USA" },
  { "name": "Novartis", "country": "CH" }
]

URL Fetch

Fetch entities from any REST API endpoint. The system automatically extracts arrays from common response wrappers.

Supported authentication:

None, Bearer Token, API Key Header, Basic Auth
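As an illustration of what each authentication mode typically sends, here is a minimal sketch that builds request headers for the four options. The function name, the mode strings, and the parameter names are assumptions made for this example and are not part of the product's API.

```python
import base64


def build_auth_headers(mode, *, token=None, header_name=None,
                       api_key=None, username=None, password=None):
    """Return HTTP headers for one of the four auth modes (hypothetical helper)."""
    if mode == "none":
        return {}
    if mode == "bearer":
        return {"Authorization": f"Bearer {token}"}
    if mode == "api_key":
        # The header name is configurable, e.g. "X-API-Key".
        return {header_name: api_key}
    if mode == "basic":
        # Basic auth is base64("username:password") per RFC 7617.
        raw = f"{username}:{password}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    raise ValueError(f"unknown auth mode: {mode}")
```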

If the API returns an object instead of an array, the system checks keys such as data, results, and items for an embedded array.
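The wrapper lookup can be sketched as follows. This is an approximation: only the three keys named above are checked, and the function name and error behavior are assumptions. The real system may check more keys or nested wrappers.

```python
from typing import Any

# Wrapper keys named in the text; the actual list may be longer.
WRAPPER_KEYS = ("data", "results", "items")


def extract_entities(payload: Any) -> list:
    """Return the entity array from a decoded JSON response (hypothetical helper)."""
    if isinstance(payload, list):
        return payload  # the response is already an array
    if isinstance(payload, dict):
        for key in WRAPPER_KEYS:
            value = payload.get(key)
            if isinstance(value, list):
                return value  # first wrapper key that holds an array wins
    raise ValueError("no entity array found in response")
```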

Entity Selection & Validation

After loading entities, they appear in a selectable list with validation status. You can choose which entities to include in the batch:

Multi-select: Click to select individual entities. Shift+click for ranges. Ctrl+A to select all, Ctrl+D to deselect all.

Inline editing: Click search key fields (name, country, etc.) to edit them directly in the list before enrichment.

Validation: Each entity is validated against the schema's search keys. At least one search key must be filled. Invalid entities show warnings but can still be selected.

Selective processing: Only selected entities are sent for enrichment. Deselect entities you don't want to process.
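The validation rule (at least one search key filled) can be expressed in a few lines. In this sketch, the function name is hypothetical and treating whitespace-only values as empty is an assumption.

```python
def has_search_key(entity: dict, search_keys: list) -> bool:
    """True when at least one search key holds a non-blank value."""
    for key in search_keys:
        value = entity.get(key)
        # Assumption: None and whitespace-only strings count as "not filled".
        if value is not None and str(value).strip():
            return True
    return False
```

With search keys name and country, an entity such as { "name": "Sanofi", "country": "" } passes, while one whose keys are all blank would show a warning.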

Configuration

The sidebar mirrors the single enrichment configuration options:

Option	Description
Schema	Target schema that defines the enrichment output structure
Strategy	Single pass, expert domains, or multi-expertise (parallel calls per domain)
Models	One or more AI models to run per entity. Multiple models enable automatic fusion.
Languages	Languages for multilingual field enrichment (e.g., English + French)
Classification	Optional fast model for entity type verification before enrichment
Arbitration	Model for LLM-based conflict resolution during fusion. If unset, rule-based merge is used.

Cost Estimation

Before starting a batch, a confirmation dialog shows a cost estimate and summary. The estimate is calculated based on property count, model pricing, and the number of entities and models selected. A warning appears when the total LLM call count exceeds 100.

The dialog summarizes four values: Entities, Models, Total Calls (for example, ~40), and Est. Cost (for example, ~$1.50).
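To make the arithmetic concrete, here is a rough sketch of how a call count and cost estimate could be derived. The exact formula is not documented here, so treat everything in it as an assumption: a flat per-call price per model, one call per entity per model by default, and illustrative model names and prices that are not real pricing. Only the warning threshold of 100 calls comes from the text.

```python
WARN_CALL_THRESHOLD = 100  # from the text: warn when total calls exceed 100


def estimate_batch(entity_count, price_per_call_by_model, calls_per_entity=1):
    """Estimate total LLM calls and cost (hypothetical flat per-call pricing)."""
    calls_per_model = entity_count * calls_per_entity
    total_calls = calls_per_model * len(price_per_call_by_model)
    estimated_cost = sum(
        calls_per_model * price for price in price_per_call_by_model.values()
    )
    return {
        "total_calls": total_calls,
        "estimated_cost": estimated_cost,
        "warn": total_calls > WARN_CALL_THRESHOLD,
    }


# Illustrative prices only: 20 entities x 2 models gives 40 calls.
summary = estimate_batch(20, {"model-a": 0.05, "model-b": 0.025})
```

A multi-expertise strategy would raise calls_per_entity, which is how a modest batch can cross the warning threshold.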

Parallel Execution

All selected entities are processed simultaneously. Each entity goes through the full enrichment pipeline independently:

Per-Entity Pipeline

Classification (optional) — A fast model verifies the entity type. In batch mode, mismatches do not pause the job; context is passed through.
Multi-model enrichment — Each selected model enriches the entity in parallel, with per-provider rate limiting.
Auto-fusion (when 2+ models succeed) — Results are automatically merged using conflict detection and resolution.

Rate Limiting

A global rate limiter prevents overwhelming AI providers. All entities share the same per-provider concurrency limits (typically 5 concurrent calls per provider). With 20 entities and 2 models, up to 5 calls run simultaneously per provider — the rest wait for availability. This ensures reliable execution without hitting API rate limits.
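A per-provider concurrency cap of this kind is commonly built with one semaphore per provider. The following asyncio sketch shows the idea. It is not the product's implementation: the function names and the (provider, call) pair format are assumptions, and the limit of 5 is the "typical" value quoted above.

```python
import asyncio

PER_PROVIDER_LIMIT = 5  # "typically 5" per the text


async def run_batch(calls, limit=PER_PROVIDER_LIMIT):
    """Run (provider, coroutine_function) pairs with a per-provider cap."""
    semaphores = {}

    async def guarded(provider, call):
        # One shared semaphore per provider, created on first use.
        sem = semaphores.setdefault(provider, asyncio.Semaphore(limit))
        async with sem:  # waits here while `limit` calls are already running
            return await call()

    return await asyncio.gather(*(guarded(p, c) for p, c in calls))
```

Because all entities share the same semaphores, adding more entities lengthens the queue but does not increase the load on any single provider.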

Real-Time Progress

The results panel shows live progress using Server-Sent Events (SSE). Each entity has a collapsible card that updates in real time:

Pending: Waiting to start processing.

Running: Currently being enriched, with expertise progress badges showing completion per domain.

Completed: All models finished successfully. Card auto-collapses.

Partial: Some models or expertises failed. Partial results available.

Failed: All models failed for this entity. Error details shown.
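The three terminal states follow directly from the per-model outcomes. A minimal sketch, assuming one boolean per model (the function name is hypothetical, and expertise-level failures, which can also produce a Partial state, are not modeled):

```python
def final_status(model_succeeded):
    """Map per-model success flags to the card's terminal status."""
    if not model_succeeded:
        raise ValueError("no model outcomes recorded")
    if all(model_succeeded):
        return "completed"
    if not any(model_succeeded):
        return "failed"
    return "partial"  # at least one success and at least one failure
```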

Cancellation & Error Handling

You can cancel a running batch at any time. Cancellation is cooperative — entities already in-flight complete their current LLM call, but no new calls start. Partial results from completed entities are preserved.
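Cooperative cancellation usually means checking a shared flag before starting each new unit of work rather than interrupting work in progress. A simplified, sequential sketch using asyncio.Event (the function name and the list-of-callables shape are assumptions):

```python
import asyncio


async def process_calls(calls, cancel_event):
    """Run calls in order, stopping before the next one once cancelled."""
    completed = []
    for call in calls:
        if cancel_event.is_set():
            break  # no new calls start after cancellation
        # A call that has already started runs to completion.
        completed.append(await call())
    return completed  # results gathered before cancellation are preserved
```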

Error Resilience

Batch processing is designed to be resilient. Individual failures do not stop the batch:

If classification fails for an entity, enrichment proceeds without context
If one model fails, other models for that entity continue
If all models fail for an entity, it is marked as failed while others continue
Models that return “not found” errors are automatically deactivated
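The "one failure does not stop the others" behavior maps naturally onto asyncio.gather with return_exceptions=True. The sketch below illustrates it for a single entity. The function signature, the call_model and fuse callables, and the result dictionary shape are all assumptions for the example.

```python
import asyncio


async def enrich_entity(entity, models, call_model, fuse):
    """Run every model for one entity; one failure never cancels the others."""
    outcomes = await asyncio.gather(
        *(call_model(model, entity) for model in models),
        return_exceptions=True,  # collect exceptions instead of raising
    )
    results, errors = {}, {}
    for model, outcome in zip(models, outcomes):
        if isinstance(outcome, BaseException):
            errors[model] = str(outcome)
        else:
            results[model] = outcome
    # Fusion only runs when at least two models succeeded.
    fusion = fuse(list(results.values())) if len(results) >= 2 else None
    return {"results": results, "errors": errors, "fusion": fusion}
```

An entity whose results dictionary comes back empty would be marked as failed, while the rest of the batch continues.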

Export Formats

After batch completion, export results in one of three ways: as a JSON file, as JSON copied to the clipboard, or as an Excel workbook. For each entity, the fusion result is preferred if available; otherwise, the best model result is used.
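The selection rule can be sketched as a simple fallback. How the "best" model result is ranked is not specified here, so this example assumes the caller passes model results already ordered best-first. The function name is hypothetical.

```python
from typing import Optional


def select_export_result(fusion: Optional[dict], model_results: list) -> Optional[dict]:
    """Prefer the fusion result; otherwise take the first (best-ranked) model result."""
    if fusion is not None:
        return fusion
    # Assumption: model_results is ordered best-first by the caller.
    return model_results[0] if model_results else None
```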

JSON File

Download the full results as a structured JSON file with all entity data, model outputs, and fusion metadata.

Clipboard

Copy the JSON results directly to your clipboard for pasting into other tools or scripts.

Excel

A three-sheet workbook: Results (one row per entity with flattened properties), Summary (batch metadata, models, costs), and Conflicts (per-entity conflict details with resolution reasoning).
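Flattening nested properties into a single row is the key step for the Results sheet. One common approach is shown below. The dot-separated column names and the "; " separator for lists are assumptions, and the actual export may use different conventions.

```python
def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into one row with dot-separated column names."""
    row = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            row.update(flatten(value, path))  # recurse into nested objects
        elif isinstance(value, list):
            row[path] = "; ".join(str(v) for v in value)  # lists share one cell
        else:
            row[path] = value
    return row
```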

Limits

Limit	Value
Max entities per batch	100
Max entity data size	50,000 characters
Max prompt length	100,000 characters
URL fetch timeout	30 seconds
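If you generate batches programmatically, a pre-flight check against these limits can catch oversized input early. In this sketch, measuring entity size as the length of its serialized JSON is an assumption, and the function name is hypothetical. The two constants come from the table above.

```python
import json

MAX_ENTITIES = 100          # max entities per batch
MAX_ENTITY_CHARS = 50_000   # max entity data size, in characters


def check_limits(entities: list) -> list:
    """Return a list of human-readable limit violations (empty when valid)."""
    problems = []
    if len(entities) > MAX_ENTITIES:
        problems.append(f"{len(entities)} entities exceeds the limit of {MAX_ENTITIES}")
    for index, entity in enumerate(entities):
        # Assumption: size is measured on the serialized JSON text.
        size = len(json.dumps(entity, ensure_ascii=False))
        if size > MAX_ENTITY_CHARS:
            problems.append(f"entity {index} is {size} characters (limit {MAX_ENTITY_CHARS})")
    return problems
```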