Attach PDFs, images, Office documents, spreadsheets, slides, and text files to any enrichment, schema generation, sample generation, AI schema edit, or playground request. Files reach the model either as native bytes (for PDF- and vision-capable models) or as server-extracted text inlined into the prompt — no manual OCR, conversion, or chunking required.
Each supported MIME type has an admin-configured delivery mode. The mode determines how the file reaches the model.
In binary mode, the original bytes are passed to the model as BinaryContent. The model reads the file directly, with no server-side preprocessing.
This mode requires a model with the matching capability flag (supports_pdf_input for PDFs, supports_vision for images). The model picker is automatically filtered to show only compatible models.
In inline_text mode, a server-side extractor runs once at upload time and caches the resulting text. On every subsequent LLM call, the cached text is inlined into the user prompt.
No model capability is required, so this mode works with every model. Plain text and Markdown skip the extractor; their raw bytes are decoded directly.
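The extract-once, reuse-many behaviour described above can be sketched as a small cache. This is an illustration only, not Entity Enricher's implementation: the `ExtractionCache` class, its method names, and the in-memory dictionary are hypothetical, and the choice of `(organization_id, sha256)` as the cache key is an assumption based on the tuple mentioned elsewhere on this page.

```python
import hashlib
from typing import Callable, Dict, Tuple

CacheKey = Tuple[str, str]  # (organization_id, sha256 hex digest); assumed key shape


class ExtractionCache:
    """Hypothetical sketch: run the extractor once per unique upload."""

    def __init__(self, extractor: Callable[[bytes], str]) -> None:
        self._extractor = extractor
        self._texts: Dict[CacheKey, str] = {}
        self.extractor_runs = 0  # counts how often extraction actually happened

    def upload(self, organization_id: str, data: bytes) -> CacheKey:
        # Identical bytes within one organization map to the same key.
        key = (organization_id, hashlib.sha256(data).hexdigest())
        if key not in self._texts:
            self._texts[key] = self._extractor(data)
            self.extractor_runs += 1
        return key

    def text_for(self, key: CacheKey) -> str:
        # Called on every LLM request; never triggers re-extraction.
        return self._texts[key]
```

In this sketch, uploading the same bytes twice under one organization returns the same key and leaves `extractor_runs` at 1, while the same bytes under a different organization are extracted separately.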
14 formats ship enabled by default. System administrators can switch any format between binary and inline_text mode, change its label, or disable it entirely from Model Management → Document policies.
| Format | Extensions | Default mode | Capability / extractor |
|---|---|---|---|
| PDF document | .pdf | binary | supports_pdf_input |
| PNG image | .png | binary | supports_vision |
| JPEG image | .jpg, .jpeg | binary | supports_vision |
| Plain text | .txt | inline_text | raw decode |
| Markdown | .md, .markdown | inline_text | raw decode |
| Word (legacy .doc) | .doc | binary | docx2txt |
| Word (.docx) | .docx | binary | python-docx |
| OpenDocument text | .odt | binary | odfpy |
| Rich Text Format | .rtf | binary | striprtf |
| EPUB ebook | .epub | binary | ebooklib |
| HTML | .html, .htm | binary | beautifulsoup |
| CSV | .csv | binary | csv (stdlib) |
| Spreadsheet (.xlsx) | .xlsx | binary | openpyxl |
| Presentation (.pptx) | .pptx | binary | python-pptx |
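One way to picture the policy table is as a lookup from format to delivery mode that an administrator can edit. The sketch below is hypothetical: `FormatPolicy`, `POLICIES`, and `delivery_mode` are illustrative names, only a few rows are reproduced, and the lookup is keyed by file extension for brevity even though the product configures policies per MIME type.

```python
from pathlib import PurePath
from typing import Dict, NamedTuple, Optional


class FormatPolicy(NamedTuple):
    label: str
    mode: str            # "binary" or "inline_text"
    enabled: bool = True


# Hypothetical subset of the defaults listed in the table above.
POLICIES: Dict[str, FormatPolicy] = {
    ".pdf": FormatPolicy("PDF document", "binary"),
    ".png": FormatPolicy("PNG image", "binary"),
    ".txt": FormatPolicy("Plain text", "inline_text"),
    ".md": FormatPolicy("Markdown", "inline_text"),
    ".markdown": FormatPolicy("Markdown", "inline_text"),
}


def delivery_mode(filename: str) -> Optional[str]:
    """Return the delivery mode for a file, or None if it is unsupported or disabled."""
    policy = POLICIES.get(PurePath(filename).suffix.lower())
    if policy is None or not policy.enabled:
        return None
    return policy.mode
```

Disabling a format then amounts to replacing its entry, for example `POLICIES[".png"] = POLICIES[".png"]._replace(enabled=False)`, after which `delivery_mode("scan.png")` returns `None`.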
Attachments are keyed by (organization_id, sha256).
For inline_text formats, the extractor runs at upload time and the resulting text is cached on the attachment row. Subsequent LLM calls reuse the cached text, so there is no re-extraction cost. Formats in binary mode skip this step.
When you attach a binary file with a capability requirement (PDF or image), the model picker is filtered to show only models that declare that capability. If you attach multiple files with different requirements, only models satisfying all requirements appear.
| Attached files | Eligible models |
|---|---|
| 1 PDF | supports_pdf_input |
| 1 PNG | supports_vision |
| 1 PDF + 1 PNG | supports_pdf_input AND supports_vision |
| 1 DOCX (binary mode, no capability) | All models — native byte support is assumed when no capability flag is set |
| 1 TXT or 1 MD (inline_text mode) | All models — text is inlined into the prompt |
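The filtering rule in the table above amounts to a subset check: collect the capability flags required by the attached files, then keep the models that declare all of them. The sketch below is a minimal illustration under that reading; the `Model` class, the `REQUIRED_CAPABILITY` mapping, and the model names in the usage note are hypothetical, and it assumes PDFs and images are in their default binary mode.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Model:
    name: str
    capabilities: FrozenSet[str]


# Hypothetical mapping from MIME type to the flag it requires in binary mode.
REQUIRED_CAPABILITY: Dict[str, str] = {
    "application/pdf": "supports_pdf_input",
    "image/png": "supports_vision",
    "image/jpeg": "supports_vision",
}


def eligible_models(models: Iterable[Model], attached_mime_types: Iterable[str]) -> List[Model]:
    """Keep models that declare every capability required by the attachments."""
    required = {
        REQUIRED_CAPABILITY[mime]
        for mime in attached_mime_types
        if mime in REQUIRED_CAPABILITY  # types without a flag add no requirement
    }
    return [m for m in models if required <= m.capabilities]
```

With one PDF and one PNG attached, only a model declaring both supports_pdf_input and supports_vision passes the check; with a single text file attached, the required set is empty and every model is returned.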
Attachments are billed as input tokens reported by the model provider — Entity Enricher does not charge a separate per-document fee. The cost depends on the file type and the selected model.
Binary attachments consume model-specific input tokens. Anthropic charges around 1700 tokens per PDF page; OpenAI prices vision inputs by tile count. Check your model's pricing card in Models & Pricing.
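For rough budgeting, the per-page figure quoted above can be turned into a back-of-the-envelope estimate. This is only a sketch: the function names are hypothetical, the 1700 tokens-per-page default is the approximate figure from this page rather than an exact rate, and the price per million tokens is a placeholder you would replace with the value from your model's pricing card.

```python
def estimate_pdf_input_tokens(page_count: int, tokens_per_page: int = 1700) -> int:
    """Approximate input tokens for a PDF sent in binary mode."""
    return page_count * tokens_per_page


def estimate_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Convert a token count into a cost at a given (placeholder) input rate."""
    return tokens / 1_000_000 * usd_per_million_tokens
```

Under these assumptions a 10-page PDF comes to about 17,000 input tokens; the actual count is whatever the model provider reports.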
For inline_text attachments, the extracted text consumes input tokens at the standard text rate. Extracted text is capped at 500 KB per document; longer content is truncated.
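The truncation rule can be illustrated with a byte-limited cut that never splits a multi-byte character. This sketch makes two assumptions not stated on this page: that the cap is measured in UTF-8 bytes, and that 500 KB means 500 × 1024 bytes. The function and constant names are hypothetical.

```python
MAX_EXTRACTED_BYTES = 500 * 1024  # assumption: 1 KB = 1024 bytes


def truncate_extracted_text(text: str, limit: int = MAX_EXTRACTED_BYTES) -> str:
    """Cut text to at most `limit` UTF-8 bytes without leaving a broken character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= limit:
        return text
    # errors="ignore" drops a trailing partial multi-byte sequence, if any.
    return encoded[:limit].decode("utf-8", errors="ignore")
```

Text under the limit is returned unchanged, and the result of a cut always encodes to no more than `limit` bytes.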