Plug your laptop's local Ollama into Entity Enricher with one command. Use the models you already have, keep certain inputs on your machine, and revoke access instantly — all over a single outbound HTTPS connection.
Production (Hetzner) Your Laptop / your server
┌─────────────────────────┐ ┌────────────────────┐
│ Org enrichment │ │ ee-tunnel CLI │
│ │ │ │ │ │
│ ▼ │ │ ▼ │
│ Ollama provider URL: │ │ Ollama :11434 │
│ http://ollama.<org> │ │ ┌──────────────┐ │
│ .tunnel/v1/chat │ │ │ qwen3:36b │ │
│ │ │ │ │ llama3:70b │ │
│ ▼ │ │ │ ... │ │
│ WebSocket bridge │◀───────▶│ └──────────────┘ │
└─────────────────────────┘ └────────────────────┘
one outbound :443/wss connection
The synthetic hostname ollama.<org-slug>.tunnel is never resolved by DNS. A custom transport in the Entity Enricher process intercepts requests for *.tunnel and routes them through the matching active WebSocket.
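The hostname interception described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the platform's actual transport: `tunnel_org_slug` and `TunnelRegistry` are hypothetical names, and the real implementation plugs into the HTTP client's transport layer rather than being called by hand.

```python
from typing import Optional
from urllib.parse import urlsplit

TUNNEL_SUFFIX = ".tunnel"


def tunnel_org_slug(url: str) -> Optional[str]:
    """Return the org slug for an ollama.<org-slug>.tunnel URL, else None."""
    host = (urlsplit(url).hostname or "").lower()
    if not host.endswith(TUNNEL_SUFFIX):
        return None
    labels = host[: -len(TUNNEL_SUFFIX)].split(".")
    # Only the exact shape ollama.<org-slug>.tunnel is treated as a tunnel host.
    if len(labels) != 2 or labels[0] != "ollama" or not labels[1]:
        return None
    return labels[1]


class TunnelRegistry:
    """Hypothetical in-process map from org slug to its active tunnel session."""

    def __init__(self):
        self._sessions = {}

    def register(self, org_slug, session):
        self._sessions[org_slug] = session

    def unregister(self, org_slug):
        self._sessions.pop(org_slug, None)

    def resolve(self, url):
        """Return the session for a tunnel URL, or None for ordinary hosts."""
        slug = tunnel_org_slug(url)
        if slug is None:
            return None  # not a *.tunnel host: use the normal network path
        session = self._sessions.get(slug)
        if session is None:
            # No DNS fallback exists for synthetic hosts, so fail fast.
            raise ConnectionError(f"no active tunnel for org {slug!r}")
        return session
```

Because the synthetic host never reaches a resolver, a missing session has to surface as a connection error rather than a DNS failure.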
Run enrichments against open-weight models already pulled on your laptop (Llama 3, Qwen, DeepSeek, Mistral, …) without re-downloading them on the server.
For sensitive prompts, route specific enrichments through your own GPU. The platform still records cost, tokens, and metadata — only the inference itself stays local.
Pull a model with `ollama pull` on your laptop, click Discover Models, and it appears in the Entity Enricher selector with real context length and capabilities.
When a hosted provider rate-limits you, fail over to a tunneled local model for the rest of a batch.
Two commands on your laptop, one click in the browser. The whole flow takes about a minute.
Open API Keys → Ollama Tunnels and click New Ollama Tunnel. Type a label so you'll recognize this device later (e.g. "Anthony's MacBook").
Paste this in a terminal. The script verifies a cosign signature before installing.
curl -fsSL https://entityenricher.ai/install.sh | sh
The installer prints what it's about to do (download URL, signature, install path) and pauses 5 seconds so you have a Ctrl+C window. Source code lives at TOT-Concept/ee-tunnel (MIT). Windows users use install.ps1 instead.
Run ee-tunnel pair. A browser tab opens at /tunnel/connect with a 6-character code. Confirm the device label and click Connect.
ee-tunnel pair --server https://entityenricher.ai
Open this URL in your browser to confirm pairing:
https://entityenricher.ai/tunnel/connect?code=9KV-DU6
Code: 9KV-DU6
Waiting for confirmation...
You never copy-paste a token. The browser only sees tunnel metadata; the CLI receives its credentials over its own polling channel.
Run ee-tunnel. The status pill in the Tunnels tab flips to Connected within seconds. Then click Discover Models on the Ollama provider in Models & Pricing to auto-import every locally installed chat model.
Pairing uses an OAuth-style device-code flow modeled on RFC 8628 (OAuth 2.0 Device Authorization Grant). The user confirms a short code in a browser instead of copy-pasting a token. The browser only sees tunnel metadata; the CLI receives its refresh token directly over its own polling channel, and the token can be redeemed only once.
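To make the one-shot redemption concrete, here is a minimal in-memory sketch of the server side of such a flow. It is an illustration under stated assumptions, not the platform's code: `PairingStore`, the code alphabet, and the `pending` and `expired` status strings are hypothetical. Only the `ok` status, the 6-character code shape, and the 10-minute pairing TTL come from this page.

```python
import secrets
import time

PAIRING_TTL_SECONDS = 600  # 10-minute pairing TTL
CODE_ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"  # assumption: no 0/O/1/I


class PairingStore:
    """Hypothetical in-memory store for pending device-code pairings."""

    def __init__(self, now=time.monotonic):
        self._now = now
        self._by_device = {}
        self._by_user = {}

    def start(self):
        """Called by the CLI: issue a secret device code and a short user code."""
        device_code = secrets.token_urlsafe(32)
        raw = "".join(secrets.choice(CODE_ALPHABET) for _ in range(6))
        user_code = f"{raw[:3]}-{raw[3:]}"  # collisions ignored in this sketch
        self._by_device[device_code] = {
            "user_code": user_code,
            "expires": self._now() + PAIRING_TTL_SECONDS,
            "refresh_token": None,
        }
        self._by_user[user_code] = device_code
        return {"device_code": device_code, "user_code": user_code}

    def confirm(self, user_code, label):
        """Called by the browser: returns tunnel metadata, never the token."""
        entry = self._by_device.get(self._by_user.get(user_code))
        if entry is None or self._now() > entry["expires"]:
            return None
        entry["refresh_token"] = secrets.token_urlsafe(32)
        return {"label": label}

    def poll(self, device_code):
        """Called by the CLI: hands out the refresh token exactly once."""
        entry = self._by_device.get(device_code)
        if entry is None or self._now() > entry["expires"]:
            return {"status": "expired"}
        if entry["refresh_token"] is None:
            return {"status": "pending"}
        # One-shot redemption: forget the pairing as soon as it is collected.
        del self._by_device[device_code]
        self._by_user.pop(entry["user_code"], None)
        return {"status": "ok", "refresh_token": entry["refresh_token"]}
```

The key property is that `confirm` and `poll` return disjoint data: the browser path carries only the label, and the token path is reachable only with the secret device code held by the CLI.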
laptop entity-enricher browser
│ │ │
│── ee-tunnel pair ──────▶ │ │
│ --server URL │ │
│ │ │
│ POST /api/tunnel/ │ │
│ device-code │ │
│ ───────────────────────▶ │ │
│ ◀── { device_code, │ │
│ user_code='9KV-DU6', │ │
│ verification_uri } │ │
│ │ │
│ prints URL, opens │ │
│ browser at /tunnel/ │ │
│ connect?code=… ──────────────────────────────────────▶│
│ │ │
│ │ POST /api/tunnel/confirm/ │
│ │ {user_code} │
│ │ body: { label } │
│ │ Authorization: Bearer │
│ │ <user JWT, role=owner> │
│ │ ◀──────────────────────────── │
│ │ • create tunnel_credentials │
│ │ • mark device_code complete │
│ │ with refresh_token │
│ │ ────────────────────────────▶│
│ │ 200 { tunnel metadata } │
│ │ │
│ POST /api/tunnel/poll │ │
│ ───────────────────────▶ │ │
│ ◀── { status:'ok', │ │
│ refresh_token, │ │
│ server_url, │ │
│ tunnel_id } │ │
│ │ │
│ persist token mode 0600 │ │
│ ✓ Paired │ │
Once connected, your tunneled Ollama is just another LLM provider as far as the rest of the platform is concerned. Multi-expertise, fusion, batch: everything works unchanged.
enrichment job
│
▼
agent_factory → Ollama provider, base_url=http://ollama.acme.tunnel
│
▼
custom httpx transport ← intercepts *.tunnel hosts
│
▼
TunnelSession (one per laptop)
│ frames request: { id, method, path, headers, body_b64 }
▼
WebSocket ──────────────────────▶ ee-tunnel CLI
│
▼
localhost:11434
(your local Ollama)
│
response_start, response_chunk, ...
◀──────────────────────────────────┘
│
▼
pydantic_ai consumes streaming body → enrichment record
Multiple in-flight enrichments share one tunnel WebSocket. Each request gets a UUID; frames are demultiplexed by id on both ends so concurrent calls (e.g. multi-expertise) work naturally.
Streaming responses (Ollama's stream=true) travel as a sequence of response_chunk frames, so token-by-token UI updates stay incremental even over the tunnel.
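The multiplexing described above can be sketched as follows. The request fields (`id`, `method`, `path`, `headers`, `body_b64`) and the `response_start` and `response_chunk` frame names come from the diagram; the `type` and `status` fields, the `response_end` frame, and the `Demultiplexer` class are assumptions made for illustration.

```python
import base64
import uuid


def frame_request(method, path, headers, body):
    """Build a request frame with the fields named in the diagram above."""
    return {
        "id": str(uuid.uuid4()),  # unique per request, used for demultiplexing
        "method": method,
        "path": path,
        "headers": dict(headers),
        "body_b64": base64.b64encode(body).decode("ascii"),
    }


class Demultiplexer:
    """Route interleaved response frames to per-request state by id."""

    def __init__(self):
        self._pending = {}

    def open(self, request_id):
        self._pending[request_id] = {"status": None, "chunks": [], "done": False}

    def feed(self, frame):
        """Apply one frame; return False if the id is not in flight."""
        state = self._pending.get(frame["id"])
        if state is None:
            return False  # e.g. a late frame for a cancelled request
        kind = frame["type"]  # assumption: frames carry a "type" field
        if kind == "response_start":
            state["status"] = frame["status"]
        elif kind == "response_chunk":
            state["chunks"].append(base64.b64decode(frame["body_b64"]))
        elif kind == "response_end":  # hypothetical terminator frame
            state["done"] = True
        return True

    def body(self, request_id):
        return b"".join(self._pending[request_id]["chunks"])

    def is_done(self, request_id):
        return self._pending[request_id]["done"]
```

A real implementation would hand each chunk to the waiting caller as it arrives, for instance through a per-request queue, so that streaming stays incremental. Buffering in a list just keeps the sketch short.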
An organization owner clicks "New Ollama Tunnel", types a label, and confirms. The platform allocates the credentials.
On your laptop, run the one-line installer and `ee-tunnel pair`. A browser tab opens — confirm with the same Entity Enricher account.
Run `ee-tunnel`. The CLI opens a single outbound HTTPS WebSocket. The new provider appears in your model list within seconds.
Click Discover Models — every model in your local Ollama is auto-imported with real context length and capabilities. Enrich as usual.
One click in the UI tears down the WebSocket and invalidates the credentials immediately.
The tunnel was designed to add as little attack surface as possible. An earlier draft of the feature used a public SSH server on a new port; we replaced it with the in-process WebSocket multiplexer, which removes the SSH attack surface entirely.
Everything goes through the existing :443 ingress — the same TLS endpoint as the web app. No SSH server, no exposed Ollama port.
The CLI initiates the WebSocket. Your laptop never accepts inbound connections — your home network firewall doesn't need to change.
A tunnel is bound to a single organization. Other tenants on the platform cannot see, select, or invoke it. System admins have cross-org visibility by design.
Revoke removes the credentials from the database. The next request fails with 401. The active WebSocket is evicted within ~1 second.
The browser-OAuth flow handles credentials end-to-end. The browser never sees your refresh token; the CLI receives it directly through its polling channel.
5 active tunnels per org by default; 16 MB inbound WebSocket frame cap; structured 402 errors when over quota.
SSRF guard. The tunnel feature also closes a pre-existing gap: the specific_endpoint field on Ollama providers is now validated against an allowlist regex, so private IPs, loopback, and Docker compose service names can never be used as a provider URL.
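As an illustration of what such a guard checks, here is a minimal sketch. It is not the platform's actual allowlist regex: `is_allowed_endpoint`, the hostname pattern, and the blocked suffixes are assumptions chosen to reject the three categories named above (private IPs, loopback, and single-label names such as Docker compose services).

```python
import ipaddress
import re
from urllib.parse import urlsplit

# Hypothetical allowlist: DNS names with at least two labels and an
# alphabetic top-level label. Single-label names such as "ollama" or
# "localhost" never match.
_PUBLIC_HOST = re.compile(
    r"(?=.{1,253}$)([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}"
)
# Assumption: suffixes that conventionally point at internal hosts.
_BLOCKED_SUFFIXES = (".local", ".internal", ".localhost")


def is_allowed_endpoint(url: str) -> bool:
    """Return True only for http(s) URLs whose host looks publicly routable."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    if not host:
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        pass  # not an IP literal: fall through to the hostname allowlist
    else:
        # Rejects private, loopback, link-local and other non-global ranges.
        return ip.is_global
    if host.endswith(_BLOCKED_SUFFIXES):
        return False
    return _PUBLIC_HOST.fullmatch(host) is not None
```

A string-level check like this does not stop a public hostname from resolving to a private address, so it complements rather than replaces checks made at connection time.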
| Command | What it does |
|---|---|
| ee-tunnel pair --server URL | Browser-OAuth pairing. Opens the verification URL automatically. |
| ee-tunnel pair --server URL <token> | Manual pairing with a refresh token copied from the UI (fallback for headless environments). |
| ee-tunnel | Connect and serve. Reconnects with exponential backoff (1s → 30s) on transient drops. |
| ee-tunnel status | Show pairing state, server URL, configured Ollama URL, label. |
| ee-tunnel disconnect | Forget local credentials. Does not revoke server-side; use the UI for that. |
| ee-tunnel version | Print version. |
If your Ollama listens on a different port, set EE_TUNNEL_OLLAMA_URL or pass --ollama URL at pair time.
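The reconnect behaviour listed for `ee-tunnel` in the table above (exponential backoff from 1s to 30s) can be sketched like this. The doubling factor, the attempt limit, and the function names are assumptions; the page only states the 1s and 30s bounds.

```python
import time


def backoff_delays(initial=1.0, maximum=30.0, factor=2.0):
    """Yield reconnect delays that grow geometrically and cap at `maximum`."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * factor, maximum)


def connect_with_retry(connect, sleep=time.sleep, max_attempts=5):
    """Call `connect` until it succeeds, sleeping between failed attempts."""
    delays = backoff_delays()
    for _ in range(max_attempts):
        try:
            return connect()
        except ConnectionError:
            sleep(next(delays))
    raise ConnectionError(f"gave up after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the retry loop testable without real waiting. Production clients commonly add random jitter to each delay so that many clients do not reconnect in lockstep.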
macOS · ~/Library/Application Support/ee-tunnel/
Linux · ~/.config/ee-tunnel/
Windows · %APPDATA%\ee-tunnel\
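For readers curious how a CLI might store credentials under these directories with owner-only permissions (the "mode 0600" step in the pairing diagram), here is a minimal sketch. The file name `credentials.json` and both function names are hypothetical, and on Windows the POSIX mode bits are largely ignored.

```python
import json
import os
import sys
from pathlib import Path


def config_dir(platform=sys.platform, env=os.environ, home=None):
    """Return the per-platform ee-tunnel directory listed above."""
    home = Path(home) if home else Path.home()
    if platform == "darwin":
        return home / "Library" / "Application Support" / "ee-tunnel"
    if platform.startswith("win"):
        appdata = env.get("APPDATA") or str(home / "AppData" / "Roaming")
        return Path(appdata) / "ee-tunnel"
    return home / ".config" / "ee-tunnel"


def save_credentials(directory, credentials, filename="credentials.json"):
    """Write credentials as JSON, readable and writable by the owner only."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / filename
    # Create the file with 0600 from the start so it is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        json.dump(credentials, handle)
    # Tighten the mode in case the file already existed with looser bits.
    os.chmod(path, 0o600)
    return path
```

Opening with an explicit mode, rather than writing first and calling `chmod` afterwards, avoids a short window in which the token file is readable by other local users.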
| Limit | Value |
|---|---|
| Active tunnels per organization | 5 (default) |
| Refresh-token TTL | 365 days |
| Access-token TTL | 15 minutes |
| Pairing TTL | 10 minutes |
| Inbound WebSocket frame max | 16 MB |
| Outbound chunk size (CLI) | 16 KB |
| Heartbeat interval | 30 seconds |
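To show how the two size limits interact, here is a small sketch that splits a response body into 16 KB chunks and checks a raw frame against the 16 MB cap. It assumes binary units (16 × 1024 bytes and 16 × 1024 × 1024 bytes); the `type` field and both function names are hypothetical.

```python
import base64

MAX_INBOUND_FRAME_BYTES = 16 * 1024 * 1024  # assumption: 16 MB means 16 MiB
OUTBOUND_CHUNK_BYTES = 16 * 1024  # assumption: 16 KB means 16 KiB


def chunk_frames(request_id, body, chunk_size=OUTBOUND_CHUNK_BYTES):
    """Split a response body into response_chunk frames of bounded size."""
    for offset in range(0, len(body), chunk_size):
        piece = body[offset : offset + chunk_size]
        yield {
            "id": request_id,
            "type": "response_chunk",
            "body_b64": base64.b64encode(piece).decode("ascii"),
        }


def within_frame_cap(raw_frame, limit=MAX_INBOUND_FRAME_BYTES):
    """Return True if a raw inbound frame fits under the size cap."""
    return len(raw_frame) <= limit
```

Base64 inflates each chunk by about a third, so a 16 KB payload chunk stays far below the inbound frame cap.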
No. The CLI initiates an outbound WebSocket. Your laptop never accepts inbound connections, and your Ollama port stays bound to localhost.
The CLI exits gracefully on disconnect. Resuming the laptop and re-running ee-tunnel reconnects with the existing credentials. Enrichments running while the tunnel is down fail with a clean ConnectError.
No. Each tunnel is bound to one organization. Other tenants on the platform cannot see, select, or invoke it. System admins can see all tunnels by design (same as global API keys).
Almost always. Only outbound :443 is needed, the same path the web app uses.
Click "Rotate token" on the Tunnels tab. The old token is invalidated immediately; the modal shows a fresh pair command for your laptop.
Yes — the CLI forwards arbitrary HTTP. Set EE_TUNNEL_OLLAMA_URL to any local OpenAI-compatible endpoint. Anything that speaks Ollama's /v1/chat/completions or /api/tags will work.
The CLI and installer are MIT-licensed and live in a public repository so anyone can audit what runs on their machine.
Source: github.com/TOT-Concept/ee-tunnel
Releases: github.com/TOT-Concept/ee-tunnel/releases — each binary is signed with cosign before publication.
Audit the installer: curl -fsSL https://entityenricher.ai/install.sh | less