Document Attachments - Entity Enricher Documentation

Document Attachments

Attach PDFs, images, Office documents, spreadsheets, slides, and text files to any enrichment, schema generation, sample generation, AI schema edit, or playground request. Files reach the model either as native bytes (for PDF- and vision-capable models) or as server-extracted text inlined into the prompt — no manual OCR, conversion, or chunking required.

Where You Can Attach Documents

Single enrichment: Per-record attachments alongside JSON input
Batch enrichment: Shared attachments applied to every entity in the batch
Schema generation (guided): Generate a schema from an example document
Sample JSON generation: Extract a sample entity from a source file
AI schema editing: Refine a schema with natural language plus a reference document
Playground: Free-form custom prompts with attachments

Two Delivery Modes

Each supported MIME type has an admin-configured delivery mode. The mode determines how the file reaches the model.

binary: Native bytes

The original bytes are passed to the model as BinaryContent. The model reads the file directly — no server-side preprocessing.

Requires a model with the matching capability flag (supports_pdf_input for PDFs, supports_vision for images). The model picker is automatically filtered to show only compatible models.

inline_text: Extracted text

A server-side extractor runs once at upload time and caches the resulting text. On every subsequent LLM call the cached text is inlined into the user prompt.

No model capability required — works with every model. Plain text and Markdown skip the extractor and decode the raw bytes directly.
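The two modes can be pictured as a single branch when the user content is assembled. The following is a minimal sketch, not the product's implementation: the `Attachment` fields, the `BinaryPart` stand-in for BinaryContent, the MIME strings, and the `[Attachment: ...]` framing are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Attachment:
    """Hypothetical attachment row; field names are illustrative only."""
    filename: str
    media_type: str
    data: bytes
    delivery_mode: str                    # "binary" or "inline_text"
    extracted_text: Optional[str] = None  # text cached at upload time, if any


@dataclass
class BinaryPart:
    """Stand-in for the BinaryContent object named in the text (assumed shape)."""
    data: bytes
    media_type: str


# Assumption: plain text and Markdown are identified by these MIME types.
RAW_DECODE_TYPES = {"text/plain", "text/markdown"}


def to_user_content(att: Attachment) -> Union[BinaryPart, str]:
    """Return what would be added to the model's user content for one file."""
    if att.delivery_mode == "binary":
        # Native bytes: no server-side preprocessing.
        return BinaryPart(data=att.data, media_type=att.media_type)
    if att.delivery_mode == "inline_text":
        if att.media_type in RAW_DECODE_TYPES:
            # Plain text and Markdown skip the extractor entirely.
            text = att.data.decode("utf-8", errors="replace")
        else:
            # Other formats reuse the text cached at upload time.
            text = att.extracted_text or ""
        return f"[Attachment: {att.filename}]\n{text}"
    raise ValueError(f"unknown delivery mode: {att.delivery_mode!r}")
```

For example, a Markdown file in inline_text mode becomes a string that starts with its filename, while a PDF in binary mode is returned as a `BinaryPart` carrying the untouched bytes.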

Supported Formats

14 formats ship enabled by default. System administrators can switch any format between binary and inline_text mode, change its label, or disable it entirely from Model Management → Document policies.

Format | Extensions | Default mode | Capability / extractor
PDF document | .pdf | binary | supports_pdf_input
PNG image | .png | binary | supports_vision
JPEG image | .jpg, .jpeg | binary | supports_vision
Plain text | .txt | inline_text | raw decode
Markdown | .md, .markdown | inline_text | raw decode
Word (legacy .doc) | .doc | binary | docx2txt
Word (.docx) | .docx | binary | python-docx
OpenDocument text | .odt | binary | odfpy
Rich Text Format | .rtf | binary | striprtf
EPUB ebook | .epub | binary | ebooklib
HTML | .html, .htm | binary | beautifulsoup
CSV | .csv | binary | csv (stdlib)
Spreadsheet (.xlsx) | .xlsx | binary | openpyxl
Presentation (.pptx) | .pptx | binary | python-pptx

Limits

Per file: 10 MB. Uploads above this cap are rejected.
Per request (total size): 50 MB. Sum of all files in a single upload.
Per request (file count): 5 files. Maximum file count per upload.
Extracted text cap: 500 KB per attachment. Longer source documents are truncated when extracted server-side.
Extractor timeout: 10 seconds of wall-clock time per attachment. An upload that exceeds the timeout still succeeds: the file is stored, but its extracted text is empty.
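The limits above can be checked client-side before an upload is attempted. This is a rough sketch under stated assumptions: it treats 1 MB as 1024 × 1024 bytes and 1 KB as 1024 bytes (the documentation does not say which convention the server uses), and the function names are hypothetical.

```python
from typing import List

# Assumption: binary units; the server may use decimal MB/KB instead.
MAX_FILE_BYTES = 10 * 1024 * 1024      # 10 MB per file
MAX_REQUEST_BYTES = 50 * 1024 * 1024   # 50 MB per request
MAX_FILES = 5                          # files per request
MAX_TEXT_BYTES = 500 * 1024            # 500 KB of extracted text


def validate_upload(sizes: List[int]) -> None:
    """Raise ValueError if a set of file sizes (in bytes) breaks a limit."""
    if len(sizes) > MAX_FILES:
        raise ValueError(f"too many files: {len(sizes)} > {MAX_FILES}")
    for size in sizes:
        if size > MAX_FILE_BYTES:
            raise ValueError(f"file too large: {size} bytes")
    if sum(sizes) > MAX_REQUEST_BYTES:
        raise ValueError(f"request too large: {sum(sizes)} bytes")


def truncate_text(text: str, limit: int = MAX_TEXT_BYTES) -> str:
    """Cut text to at most `limit` UTF-8 bytes without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= limit:
        return text
    # errors="ignore" drops a trailing partial multi-byte sequence, if any.
    return encoded[:limit].decode("utf-8", errors="ignore")
```

Truncating on encoded bytes rather than on characters keeps the result within the cap even for text that is mostly non-ASCII.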

Lifecycle

1. Upload
Drag and drop or pick files in the attachment panel of any supported page. The browser-supplied content type is not trusted: the server sniffs magic bytes and rejects anything outside the allow-list. Each file is hashed (SHA-256) and stored on encrypted block storage.
2. Dedup by content
Identical bytes uploaded twice within the same organization deduplicate to a single stored file. Two different organizations uploading the same file produce two independent rows, so there is no cross-tenant leakage. The dedup key is (organization_id, sha256).
3. Extract once (inline_text mode)
For inline_text formats, the extractor runs at upload time and the resulting text is cached on the attachment row. Subsequent LLM calls reuse the cached text, so there is no re-extraction cost. Formats in binary mode skip this step.
4. Reference by ID in any job
Once uploaded, attachments are passed by ID in subsequent enrichment, schema-generation, or playground requests. Each attachment is added to the model's user content as either native bytes (binary mode) or inlined text (inline_text mode), preserving the original filename.
5. Persisted on the record
When an enrichment record is saved, the attachment IDs are linked to it. The record detail page lists all attachments with a download button. Records can be re-merged or retried without re-uploading.
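The first two lifecycle steps (sniff, hash, dedup) can be sketched in a few lines. This is an illustration only: `AttachmentStore`, the `att_N` ID format, and the in-memory dictionary are hypothetical, and the sniffing table covers just PDF, PNG, and JPEG, whose file signatures are well known. Text formats have no magic bytes and would need a different check.

```python
import hashlib
from typing import Dict, Optional, Tuple

# Well-known file signatures for three of the supported formats.
MAGIC_PREFIXES = {
    b"%PDF-": "application/pdf",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
}


def sniff_mime(data: bytes) -> Optional[str]:
    """Guess the MIME type from leading bytes, ignoring any client claim."""
    for prefix, mime in MAGIC_PREFIXES.items():
        if data.startswith(prefix):
            return mime
    return None


class AttachmentStore:
    """Hypothetical in-memory store keyed by (organization_id, sha256)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], str] = {}
        self._next_id = 1

    def upload(self, organization_id: str, data: bytes) -> str:
        if sniff_mime(data) is None:
            raise ValueError("file type is outside the allow-list")
        digest = hashlib.sha256(data).hexdigest()
        key = (organization_id, digest)
        if key not in self._rows:
            # First time this organization uploads these bytes: new row.
            self._rows[key] = f"att_{self._next_id}"
            self._next_id += 1
        # Same bytes within the same organization reuse the existing row.
        return self._rows[key]
```

Because the organization ID is part of the key, the same bytes uploaded by two organizations yield two distinct attachment IDs, which matches the tenancy behavior described in step 2.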

Automatic Model Filtering

When you attach a binary file with a capability requirement (PDF or image), the model picker is filtered to only show models that declare that capability. If you attach multiple files with different requirements, only models satisfying all requirements appear.

Attached files | Eligible models
1 PDF | supports_pdf_input
1 PNG | supports_vision
1 PDF + 1 PNG | supports_pdf_input AND supports_vision
1 DOCX (binary mode, no capability) | All models (native byte support is assumed when no capability flag is set)
1 TXT or 1 MD (inline_text mode) | All models (text is inlined into the prompt)
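The filtering rule in the table amounts to a subset check: collect the capability flags required by the attached binary files, then keep the models that declare all of them. The sketch below assumes a hypothetical `Model` type, MIME strings, and model names; only the two capability flag names come from the documentation.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# Capability flag required per MIME type when delivered in binary mode.
CAPABILITY_BY_MIME = {
    "application/pdf": "supports_pdf_input",
    "image/png": "supports_vision",
    "image/jpeg": "supports_vision",
}


@dataclass
class Model:
    """Hypothetical model entry with its declared capability flags."""
    name: str
    capabilities: Set[str] = field(default_factory=set)


def required_capabilities(attachments: List[Tuple[str, str]]) -> Set[str]:
    """attachments is a list of (mime_type, delivery_mode) pairs."""
    return {
        CAPABILITY_BY_MIME[mime]
        for mime, mode in attachments
        if mode == "binary" and mime in CAPABILITY_BY_MIME
    }


def eligible_models(models: List[Model], attachments: List[Tuple[str, str]]) -> List[Model]:
    """Keep only models that declare every required capability."""
    needed = required_capabilities(attachments)
    return [m for m in models if needed <= m.capabilities]
```

With no capability-bearing attachment the required set is empty, so every model passes the subset check, which mirrors the "All models" rows in the table.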

Pricing & Token Usage

Attachments are billed as input tokens reported by the model provider — Entity Enricher does not charge a separate per-document fee. The cost depends on the file type and the selected model.

PDFs & images (binary mode)

Consume model-specific input tokens. Anthropic charges around 1700 tokens per PDF page; OpenAI prices vision inputs by tile count. Check your model's pricing card in Models & Pricing.

Office docs & spreadsheets (extracted text)

The extracted text consumes input tokens at the standard text rate. Large documents are capped at 500 KB of extracted text — longer content is truncated.
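For a rough budget, the figures above can be turned into a back-of-the-envelope estimate. This sketch makes several assumptions: the 1700 tokens per page figure is the approximate Anthropic number quoted above, the 4 characters per token ratio is a common rule of thumb for English text rather than anything the provider guarantees, and the price passed in is whatever the model's pricing card shows. Actual billing follows the token counts reported by the provider.

```python
import math

# Approximate figure quoted in the text for Anthropic PDF input.
TOKENS_PER_PDF_PAGE = 1700
# Assumption: rule-of-thumb ratio for English text, not a provider guarantee.
CHARS_PER_TOKEN = 4


def estimate_pdf_tokens(pages: int) -> int:
    """Approximate input tokens for a PDF sent in binary mode."""
    return pages * TOKENS_PER_PDF_PAGE


def estimate_text_tokens(text: str) -> int:
    """Approximate input tokens for extracted text inlined into the prompt."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_cost_usd(tokens: int, usd_per_million_input_tokens: float) -> float:
    """Convert a token estimate into dollars at a given input-token price."""
    return tokens / 1_000_000 * usd_per_million_input_tokens
```

Under these assumptions a 12-page PDF comes to about 12 × 1700 = 20,400 input tokens before the prompt itself is counted.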

Security & Tenancy

MIME allow-list with magic-byte sniffing
The browser-supplied content type is ignored. The server inspects file headers and rejects anything outside the configured allow-list.
Organization-scoped storage
Each file is stored under its owning organization. The download endpoint enforces org membership — there is no path through the API to reach another tenant’s files.
Sandboxed extractors
Each extractor runs with a 10-second wall-clock timeout inside a try/except boundary. A misbehaving file cannot stall or crash the API process.
Encrypted at rest
Attachment bytes live on encrypted block storage, mounted into the application container with restricted permissions.
Admin-controlled per-MIME policies
System administrators can disable any format globally, change a format from binary to inline_text (or vice versa), or relabel it. Changes take effect on the next upload of that MIME type.
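The extractor boundary described under "Sandboxed extractors" can be approximated with a worker thread and a result timeout. This is a simplified sketch, not the product's code: `extract_with_timeout` is a hypothetical name, returning an empty string on extractor errors is an assumption (the documentation only states that behavior for timeouts), and a Python thread cannot be forcibly stopped, so a production setup would more likely isolate the extractor in a separate process.

```python
import concurrent.futures
from typing import Callable

EXTRACTOR_TIMEOUT_SECONDS = 10.0  # wall-clock limit per attachment


def extract_with_timeout(
    extractor: Callable[[bytes], str],
    data: bytes,
    timeout: float = EXTRACTOR_TIMEOUT_SECONDS,
) -> str:
    """Run an extractor; return "" if it times out or raises."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(extractor, data)
    try:
        return future.result(timeout=timeout)
    except Exception:
        # Covers both the timeout and any error raised by the extractor.
        # The upload still succeeds; only the extracted text is empty.
        return ""
    finally:
        # Do not block the caller waiting for a stuck worker thread.
        executor.shutdown(wait=False)
```

Calling it with a well-behaved extractor returns the extracted text unchanged, while a slow or failing extractor yields an empty string and the caller carries on.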