Batch Processing

Enrich up to 100 entities in parallel with real-time progress tracking, per-provider rate limiting, automatic multi-model fusion, and export to JSON or Excel. Batch processing turns Entity Enricher from a single-entity tool into a production-grade data pipeline.

Batch Processing Pipeline

1. Input -- choose one:
   • Input A: Paste JSON Array
   • Input B: Fetch from URL
   • Input C: Drag & Drop File
2. Validate & Select -- entity list: select entities, validate against schema, edit inline.
3. Parallel Execution -- enrich all entities simultaneously, with per-provider rate limiting, per-entity SSE progress, and cancel/retry support.
4. Auto-Fusion (if 2+ models) -- merge results per entity; conflict detection and resolution runs automatically after each entity completes.
5. Export -- JSON (structured results array) or Excel (3-sheet workbook with conflicts).

Flexible Input Methods

Paste JSON

Paste a JSON array of entity objects directly into the editor. The system auto-detects the array structure and extracts individual entities. The JSON editor provides syntax highlighting, validation markers, and line numbers.

Fetch from URL

Enter a REST API URL to fetch entities remotely. Supports bearer token, API key, and basic authentication. The system auto-extracts arrays from nested response wrappers (e.g., { results: [...] }).
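The unwrapping step can be sketched as a small helper: accept a bare array, or scan a wrapper object for its first list-of-objects field. This is a sketch of the documented behaviour; the key name "results" is just one example.

```python
def extract_entities(payload):
    """Find the entity array in an API response.

    Accepts a bare JSON array, or a wrapper object whose first
    list-of-objects field (e.g. {"results": [...]}) holds the
    entities. Key names here are illustrative.
    """
    if isinstance(payload, list):
        return payload
    if isinstance(payload, dict):
        for value in payload.values():
            if isinstance(value, list) and all(isinstance(e, dict) for e in value):
                return value
    raise ValueError("no entity array found in response")
```

Either shape yields the same entity list, so the rest of the pipeline never sees the wrapper.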

Drag & Drop

Drag a JSON file directly onto the page. The paste overlay detects JSON content from clipboard or file drops and loads entities automatically.

Real-Time Progress Tracking

Every batch job streams progress events via Server-Sent Events (SSE), so the UI reflects each entity's status in real time as events arrive.

Each entity result card is collapsible, showing per-model tabs with raw output and a merged result tab when fusion is enabled. Failed entities can be retried individually without re-running the entire batch.
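Consuming the progress stream amounts to parsing the `text/event-stream` format into (event, payload) pairs. A minimal parser sketch follows; the event name and payload fields in the usage example are assumptions, not the product's actual schema.

```python
import json

def parse_sse(stream_lines):
    """Parse Server-Sent Events lines into (event, data) pairs.

    Follows the text/event-stream framing: "event:" names the event,
    "data:" lines carry the payload, and a blank line ends the event.
    """
    event, data_lines = None, []
    for line in stream_lines:
        line = line.rstrip("\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and data_lines:
            yield event or "message", json.loads("\n".join(data_lines))
            event, data_lines = None, []
```

Fed a stream of lines, it yields one tuple per completed event, which a UI loop can dispatch on.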

Per-Provider Rate Limiting

Batch processing uses a concurrency-limiting semaphore per provider to stay within API rate limits. If you're enriching 50 entities with 3 models, the system doesn't fire 150 API calls at once. Instead, it respects each provider's configured rate limit -- for example, 5 concurrent calls to Anthropic, 10 to OpenAI, and 3 to a self-hosted Ollama instance.

Rate limits are configurable per provider in the model management settings. The system maximizes throughput within your limits while preventing 429 errors.
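The per-provider semaphore pattern can be sketched with `asyncio`. The limits below mirror the example in the text; `call_fn` stands in for whatever actually invokes a model.

```python
import asyncio

# Illustrative per-provider concurrency limits, matching the example
# above; real limits come from the model management settings.
LIMITS = {"anthropic": 5, "openai": 10, "ollama": 3}

async def enrich_batch(entities, providers, call_fn):
    """Fan out every (entity, provider) pair at once; a semaphore per
    provider caps how many calls run concurrently against it."""
    semaphores = {p: asyncio.Semaphore(n) for p, n in LIMITS.items()}

    async def bounded(provider, entity):
        async with semaphores[provider]:
            return await call_fn(provider, entity)

    tasks = [bounded(p, e) for e in entities for p in providers]
    return await asyncio.gather(*tasks)
```

All tasks are created up front, but each provider's semaphore ensures only its configured number are in flight at any moment -- the rest queue until a slot frees.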

Export Formats

JSON Export

A structured JSON array with one object per entity. Includes the full enriched output, metadata, and fusion results. Ideal for programmatic consumption and downstream data pipelines.

[
  {
    "entity": { "name": "..." },
    "enriched": { ... },
    "metadata": {
      "models": [...],
      "cost_usd": 0.012
    }
  }
]
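Because the export is plain JSON, downstream consumption is straightforward -- for instance, totaling cost across a batch. The entity names and field values below are made up for illustration.

```python
import json

# A two-entity export in the documented shape (values are invented).
export = json.loads("""
[
  {"entity": {"name": "Acme"}, "enriched": {"industry": "robotics"},
   "metadata": {"models": ["model-a"], "cost_usd": 0.012}},
  {"entity": {"name": "Globex"}, "enriched": {"industry": "energy"},
   "metadata": {"models": ["model-a"], "cost_usd": 0.009}}
]
""")

total_cost = sum(item["metadata"]["cost_usd"] for item in export)
names = [item["entity"]["name"] for item in export]
```

One object per entity means a pipeline can filter, join, or re-aggregate results without any format-specific tooling.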

Excel Export

A three-sheet workbook designed for analysts and stakeholders:

  • Results sheet: One row per entity with flattened enrichment fields as columns.
  • Summary sheet: Batch metadata, model configuration, total cost, and processing time.
  • Conflicts sheet: Field-level conflict details with arbitration reasoning and confidence scores.
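Building the Results sheet implies flattening each entity's nested enrichment object into one row of columns. A sketch of that step, assuming dotted column names (the actual naming convention may differ):

```python
def flatten(record, prefix=""):
    """Flatten nested enrichment fields into dotted column names,
    one flat dict per Results-sheet row. Naming is an assumption."""
    row = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, name + "."))  # recurse into nesting
        else:
            row[name] = value
    return row
```

Every entity then contributes one row, and the union of keys across rows gives the sheet's column set.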

Cancel & Retry

Running batch jobs can be cancelled at any time. Cancellation is graceful -- in-flight LLM calls complete (you still get their results), but no new calls are started. Already-completed entities keep their results.
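Graceful cancellation boils down to checking a flag before launching each new call, while letting calls already in flight run to completion. A minimal sketch (sequential for clarity; the real pipeline runs entities in parallel):

```python
import asyncio

async def run_batch(entities, enrich, cancel_event):
    """Check the cancel flag before starting each entity, so in-flight
    work finishes but no new work begins after cancellation."""
    results = {}
    for entity in entities:
        if cancel_event.is_set():
            break                               # stop launching new calls
        results[entity] = await enrich(entity)  # in-flight call completes
    return results
```

Results gathered before the flag was set are kept, matching the documented behaviour that completed entities survive a cancel.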

Cost Estimation

Before starting a batch, the system provides a cost estimate based on the selected models, entity count, and schema complexity. This lets you validate the expected cost before committing to the run.
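The estimate is simple arithmetic over the run's shape: entity count times expected tokens per call, priced per model. The prices and token count below are made-up illustrations, not the product's actual figures.

```python
def estimate_cost(n_entities, price_per_token_by_model, tokens_per_call):
    """Rough pre-run estimate: each model calls each entity once,
    at an assumed tokens-per-call derived from schema complexity."""
    return sum(
        n_entities * tokens_per_call * price
        for price in price_per_token_by_model.values()
    )
```

For example, 50 entities with two models at hypothetical per-token prices of $3e-6 and $1e-6, assuming ~2,000 tokens per call, estimates to $0.40 for the batch.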

Start Batch Enrichment

Upload your entity list, select models, and enrich up to 100 entities in parallel. Export results as JSON or Excel with full conflict reports.

Get Started Free