Ollama Tunnels - Entity Enricher Documentation

Ollama Tunnels

Plug your laptop's local Ollama into Entity Enricher with one command. Use the models you already have, keep certain inputs on your machine, and revoke access instantly — all over a single outbound HTTPS connection.

            Production (Hetzner)          Your Laptop / your server
        ┌─────────────────────────┐         ┌────────────────────┐
        │  Org enrichment         │         │  ee-tunnel CLI     │
        │     │                   │         │     │              │
        │     ▼                   │         │     ▼              │
        │  Ollama provider URL:   │         │  Ollama :11434     │
        │  http://ollama.<org>    │         │  ┌──────────────┐  │
        │       .tunnel/v1/chat   │         │  │ qwen3:32b    │  │
        │     │                   │         │  │ llama3:70b   │  │
        │     ▼                   │         │  │ ...          │  │
        │  WebSocket bridge       │◀───────▶│  └──────────────┘  │
        └─────────────────────────┘         └────────────────────┘
                                  one outbound :443/wss connection

The synthetic hostname ollama.<org-slug>.tunnel is never resolved by DNS. A custom transport in the Entity Enricher process intercepts requests for *.tunnel and routes them through the matching active WebSocket.
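
The host matching described above can be sketched in a few lines. This is an illustration only, not the platform's transport: the function names and the registry keyed by org slug are assumptions, and the real code may select among several tunnels per organization differently.

```python
from typing import Optional
from urllib.parse import urlsplit

TUNNEL_SUFFIX = ".tunnel"


def tunnel_slug(url: str) -> Optional[str]:
    """Return the org slug for an ollama.<org-slug>.tunnel URL, else None."""
    host = (urlsplit(url).hostname or "").lower()
    if not host.endswith(TUNNEL_SUFFIX):
        return None
    labels = host[: -len(TUNNEL_SUFFIX)].split(".")
    # Expect exactly two labels before the suffix: "ollama" and the org slug.
    if len(labels) != 2 or labels[0] != "ollama" or not labels[1]:
        return None
    return labels[1]


def route(url: str, sessions: dict) -> object:
    """Pick the active session for a tunnel URL (hypothetical registry)."""
    slug = tunnel_slug(url)
    if slug is None:
        raise ValueError(f"not a tunnel URL: {url}")
    try:
        return sessions[slug]
    except KeyError:
        # No WebSocket is currently connected for this organization.
        raise ConnectionError(f"no active tunnel for org {slug!r}") from None
```

Because the lookup happens before any socket is opened, a `*.tunnel` name never reaches the system resolver in this sketch.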

Why use a tunnel?

Use models you already have

Run enrichments against open-weight models already pulled on your laptop (Llama 3, Qwen, DeepSeek, Mistral, …) without re-downloading them on the server.

Keep certain inputs on your machine

For sensitive prompts, route specific enrichments through your own GPU. The platform still records cost, tokens, and metadata — only the inference itself stays local.

Try a new model in 30 seconds

Pull a model with `ollama pull` on your laptop, click Discover Models, and it appears in the Entity Enricher selector with real context length and capabilities.

Offload bursty work

When a hosted provider rate-limits you, fail over to a tunneled local model for the rest of a batch.

Quick start

A few commands on your laptop and a few clicks in the browser. The whole flow takes about a minute.

  1. Create a tunnel in the UI

    Open API Keys → Ollama Tunnels and click New Ollama Tunnel. Type a label so you'll recognize this device later (e.g. "Anthony's MacBook").

    [Screenshot placeholder — Tunnels tab with "New Ollama Tunnel" button highlighted]
  2. Install the CLI on your laptop

    Paste this in a terminal. The script verifies a cosign signature before installing.

    curl -fsSL https://entityenricher.ai/install.sh | sh

    The installer prints what it's about to do (download URL, signature, install path) and pauses 5 seconds so you have a Ctrl+C window. Source code lives at TOT-Concept/ee-tunnel (MIT). Windows users use install.ps1 instead.

  3. Pair via your browser

    Run ee-tunnel pair. A browser tab opens at /tunnel/connect with a 6-character code. Confirm the device label and click Connect.

    ee-tunnel pair --server https://entityenricher.ai
    
    Open this URL in your browser to confirm pairing:
       https://entityenricher.ai/tunnel/connect?code=9KV-DU6
    
      Code: 9KV-DU6
    
    Waiting for confirmation...

    [Screenshot placeholder: /tunnel/connect confirmation page with device code]

    You never copy-paste a token. The browser only sees tunnel metadata; the CLI receives its credentials over its own polling channel.

  4. Connect and discover models

    Run ee-tunnel. The status pill in the Tunnels tab flips to Connected within seconds. Then click Discover Models on the Ollama provider in Models & Pricing to auto-import every locally-installed chat model.

    [Video placeholder — Discover Models pulling 4 local Ollama models into the platform]

How pairing works

Pairing uses an OAuth-style device-code flow modeled on RFC 8628. Instead of copy-pasting a token, the user confirms a short code in a browser. The browser only sees tunnel metadata; the CLI receives its refresh token directly over its own polling channel, and that token can be redeemed only once.

 laptop                  entity-enricher                  browser
   │                          │                              │
   │── ee-tunnel pair ──────▶ │                              │
   │   --server URL           │                              │
   │                          │                              │
   │  POST /api/tunnel/       │                              │
   │       device-code        │                              │
   │ ───────────────────────▶ │                              │
   │ ◀── { device_code,       │                              │
   │     user_code='9KV-DU6', │                              │
   │     verification_uri }   │                              │
   │                          │                              │
   │  prints URL, opens       │                              │
   │  browser at /tunnel/     │                              │
   │  connect?code=… ──────────────────────────────────────▶│
   │                          │                              │
   │                          │  POST /api/tunnel/confirm/   │
   │                          │       {user_code}            │
   │                          │  body: { label }             │
   │                          │  Authorization: Bearer       │
   │                          │   <user JWT, role=owner>     │
   │                          │ ◀──────────────────────────── │
   │                          │  • create tunnel_credentials │
   │                          │  • mark device_code complete │
   │                          │    with refresh_token        │
   │                          │ ────────────────────────────▶│
   │                          │  200 { tunnel metadata }     │
   │                          │                              │
   │  POST /api/tunnel/poll   │                              │
   │ ───────────────────────▶ │                              │
   │ ◀── { status:'ok',       │                              │
   │     refresh_token,       │                              │
   │     server_url,          │                              │
   │     tunnel_id }          │                              │
   │                          │                              │
   │  persist token mode 0600 │                              │
   │  ✓ Paired                │                              │
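
The CLI side of the diagram above can be sketched as a request followed by a polling loop. This is a minimal illustration, not the ee-tunnel source: the `post` callable, the poll request body, the 2-second interval, and the idea that any non-`ok` status means "keep waiting" are all assumptions. The 600-second timeout mirrors the 10-minute pairing TTL.

```python
import os
import time
from typing import Callable

Post = Callable[[str, dict], dict]  # (path, json_body) -> json_response


def pair(post: Post, interval: float = 2.0, timeout: float = 600.0,
         sleep: Callable[[float], None] = time.sleep) -> dict:
    """Request a device code, then poll until the browser confirms pairing."""
    grant = post("/api/tunnel/device-code", {})
    print("Open this URL in your browser to confirm pairing:")
    print("  ", grant["verification_uri"])
    print("Code:", grant["user_code"])

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Assumed request shape: the secret device_code identifies this CLI.
        reply = post("/api/tunnel/poll", {"device_code": grant["device_code"]})
        if reply.get("status") == "ok":
            return reply  # carries refresh_token, server_url, tunnel_id
        sleep(interval)
    raise TimeoutError("pairing code expired before confirmation")


def save_token(path: str, token: str) -> None:
    """Create the token file with mode 0600 so only the owner can read it."""
    # The mode applies when the file is created; an existing file keeps its own.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(token)
```

Passing the HTTP call in as `post` keeps the loop testable without a server; a real client would wrap an HTTPS request to the `--server` URL.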

How a request flows through the tunnel

Once connected, your tunneled Ollama is just another LLM provider as far as the rest of the platform is concerned. Multi-expertise, fusion, batch — everything works unchanged.

  enrichment job
       │
       ▼
  agent_factory  →  Ollama provider, base_url=http://ollama.acme.tunnel
       │
       ▼
  custom httpx transport  ← intercepts *.tunnel hosts
       │
       ▼
  TunnelSession (one per laptop)
       │ frames request: { id, method, path, headers, body_b64 }
       ▼
  WebSocket  ──────────────────────▶  ee-tunnel CLI
                                          │
                                          ▼
                                     localhost:11434
                                     (your local Ollama)
                                          │
                          response_start, response_chunk, ...
       ◀──────────────────────────────────┘
       │
       ▼
  pydantic_ai consumes streaming body  →  enrichment record
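
The request frame in the diagram above carries `id`, `method`, `path`, `headers`, and `body_b64`. A minimal sketch of building and reading such a frame follows; JSON as the wire encoding and the helper names are assumptions, since the actual serialization is not documented here.

```python
import base64
import json
import uuid


def frame_request(method: str, path: str, headers: dict, body: bytes) -> str:
    """Serialize one HTTP request as a tunnel frame (JSON is assumed)."""
    return json.dumps({
        "id": str(uuid.uuid4()),  # lets both ends match responses to requests
        "method": method,
        "path": path,
        "headers": headers,
        # Base64 keeps arbitrary bytes safe inside a text frame.
        "body_b64": base64.b64encode(body).decode("ascii"),
    })


def unframe_request(frame: str) -> dict:
    """Decode a frame back into its fields, with the body as raw bytes."""
    fields = json.loads(frame)
    fields["body"] = base64.b64decode(fields.pop("body_b64"))
    return fields
```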

Multiplexing

Multiple in-flight enrichments share one tunnel WebSocket. Each request gets a UUID; frames are demultiplexed by id on both ends so concurrent calls (e.g. multi-expertise) work naturally.

Streaming responses (Ollama's stream=true) travel as a sequence of response_chunk frames, so token-by-token UI updates stay incremental even over the tunnel.
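
Demultiplexing by request id can be sketched with one queue per in-flight request. The `type` and `data` field names and the terminating `response_end` frame are hypothetical; the document only names `response_start` and `response_chunk`.

```python
import asyncio


class Demux:
    """Route incoming response frames to the caller waiting on that id."""

    def __init__(self) -> None:
        self._streams = {}  # request id -> asyncio.Queue

    def open(self, request_id: str) -> asyncio.Queue:
        """Register a new in-flight request and return its frame queue."""
        queue = asyncio.Queue()
        self._streams[request_id] = queue
        return queue

    def dispatch(self, frame: dict) -> None:
        """Deliver one frame read from the shared WebSocket."""
        queue = self._streams.get(frame["id"])
        if queue is None:
            return  # unknown or already finished request: drop the frame
        queue.put_nowait(frame)
        if frame["type"] == "response_end":  # hypothetical final frame
            del self._streams[frame["id"]]


async def read_body(queue: asyncio.Queue) -> bytes:
    """Collect chunk payloads for one request until its final frame."""
    chunks = []
    while True:
        frame = await queue.get()
        if frame["type"] == "response_chunk":
            chunks.append(frame["data"])
        elif frame["type"] == "response_end":
            return b"".join(chunks)
```

Interleaved frames from two requests end up in separate queues, so each caller sees only its own stream in order.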

Lifecycle

01

Create

An organization owner clicks "New Ollama Tunnel", types a label, and confirms. The platform allocates the credentials.

02

Pair

On your laptop, run the one-line installer and `ee-tunnel pair`. A browser tab opens — confirm with the same Entity Enricher account.

03

Connect

Run `ee-tunnel`. The CLI opens a single outbound WebSocket over TLS (wss on :443). The new provider appears in your model list within seconds.

04

Use

Click Discover Models — every model in your local Ollama is auto-imported with real context length and capabilities. Enrich as usual.

05

Revoke

One click in the UI tears down the WebSocket and invalidates the credentials immediately.

Security

The tunnel was designed to add as little attack surface as possible. An earlier draft of the feature used a public SSH server on a new port; we replaced that with the in-process WebSocket multiplexer to eliminate the SSH attack surface entirely.

No new public ports

Everything goes through the existing :443 ingress — the same TLS endpoint as the web app. No SSH server, no exposed Ollama port.

Outbound-only

The CLI initiates the WebSocket. Your laptop never accepts inbound connections — your home network firewall doesn't need to change.

Org-scoped

A tunnel is bound to a single organization. Other tenants on the platform cannot see, select, or invoke it. System admins have cross-org visibility by design.

Revocable instantly

Revoke removes the credentials from the database. The next request fails with 401. The active WebSocket is evicted within ~1 second.

No copy-pasted secrets

The browser-OAuth flow handles credentials end-to-end. The browser never sees your refresh token; the CLI receives it directly through its polling channel.

Quota & frame caps

5 active tunnels per org by default; 16 MB inbound WebSocket frame cap; structured 402 errors when over quota.

SSRF guard. The tunnel feature also closes a pre-existing gap: the specific_endpoint field on Ollama providers is now validated against an allowlist regex, so private IPs, loopback addresses, and Docker Compose service names can never be used as a provider URL.
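
The platform's actual allowlist regex is not shown here. As a rough illustration of the same goal, the sketch below combines two hypothetical patterns with Python's `ipaddress` module; the patterns and the function name are assumptions, not the shipped validator.

```python
import ipaddress
import re
from urllib.parse import urlsplit

# Hypothetical allowlist: tunnel hostnames, or dotted public-looking DNS names.
TUNNEL_HOST = re.compile(r"^ollama\.[a-z0-9-]+\.tunnel$")
PUBLIC_HOST = re.compile(r"^(?:[a-z0-9-]+\.)+[a-z]{2,}$")


def is_allowed_endpoint(url: str) -> bool:
    """Accept tunnel hosts and public endpoints; reject internal targets."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme not in ("http", "https") or not host:
        return False
    if TUNNEL_HOST.match(host):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: require a dotted name, which rejects "localhost"
        # and single-label Docker Compose service names such as "redis".
        return bool(PUBLIC_HOST.match(host)) and not host.endswith(".tunnel")
    # IP literal: only globally routable addresses pass.
    return ip.is_global
```

A check like this only inspects the URL text. A public DNS name that resolves to a private address would still pass, so a production guard would also need to validate the resolved address.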

CLI reference

Command                                What it does
ee-tunnel pair --server URL            Browser-OAuth pairing. Opens the verification URL automatically.
ee-tunnel pair --server URL <token>    Manual pairing with a refresh token copied from the UI (fallback for headless environments).
ee-tunnel                              Connect and serve. Reconnects with exponential backoff (1s → 30s) on transient drops.
ee-tunnel status                       Show pairing state, server URL, configured Ollama URL, label.
ee-tunnel disconnect                   Forget local credentials. Does not revoke server-side; use the UI for that.
ee-tunnel version                      Print version.
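
The reconnect schedule (1s → 30s) can be pictured as a capped exponential sequence. The doubling factor below is an assumption; the reference only states the start and the ceiling, and the real CLI may add jitter.

```python
from typing import Iterator


def backoff_delays(base: float = 1.0, cap: float = 30.0) -> Iterator[float]:
    """Yield reconnect delays that double from `base` until they hit `cap`."""
    delay = base
    while True:
        yield delay
        delay = min(delay * 2, cap)
```

With these defaults the first delays are 1, 2, 4, 8, 16, and then 30 seconds for every later attempt.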

Custom Ollama URL

If your Ollama listens on a different port, set EE_TUNNEL_OLLAMA_URL or pass --ollama URL at pair time.
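
One plausible way to resolve the Ollama URL is sketched below. The precedence (flag first, then environment variable, then the default) and the function name are assumptions; only the `EE_TUNNEL_OLLAMA_URL` variable, the `--ollama` flag, and the default port 11434 come from this page.

```python
import os
from typing import Mapping, Optional

DEFAULT_OLLAMA_URL = "http://localhost:11434"


def resolve_ollama_url(flag: Optional[str] = None,
                       env: Mapping[str, str] = os.environ) -> str:
    """Pick the Ollama base URL: flag, then environment, then the default."""
    url = flag or env.get("EE_TUNNEL_OLLAMA_URL") or DEFAULT_OLLAMA_URL
    # Strip a trailing slash so paths such as /api/tags can be appended.
    return url.rstrip("/")
```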

Where files live

macOS · ~/Library/Application Support/ee-tunnel/
Linux · ~/.config/ee-tunnel/
Windows · %APPDATA%\ee-tunnel\
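
For scripts that need to locate these files, the three paths above map to a small lookup. This helper is hypothetical and only mirrors the list; whether the CLI also honors overrides such as XDG_CONFIG_HOME is not stated here, so the sketch ignores them.

```python
import os
import sys
from pathlib import Path


def config_dir(platform=sys.platform, env=os.environ, home=None) -> Path:
    """Return the ee-tunnel config directory for the given platform."""
    home = home or Path.home()
    if platform == "darwin":
        return home / "Library" / "Application Support" / "ee-tunnel"
    if platform.startswith("win"):
        return Path(env["APPDATA"]) / "ee-tunnel"
    # Linux and other Unix-like systems.
    return home / ".config" / "ee-tunnel"
```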

Limits and timing

Active tunnels per organization    5 (default)
Refresh-token TTL                  365 days
Access-token TTL                   15 minutes
Pairing TTL                        10 minutes
Inbound WebSocket frame max        16 MB
Outbound chunk size (CLI)          16 KB
Heartbeat interval                 30 seconds
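
To make the two size limits concrete, the sketch below splits a response body into CLI-sized chunks and rejects an oversized inbound frame. It assumes binary units (16 KB = 16 × 1024 bytes, 16 MB = 16 × 1024 × 1024 bytes); the function names are illustrative.

```python
from typing import Iterator

CHUNK_SIZE = 16 * 1024         # outbound chunk size (CLI), assumed binary KB
MAX_FRAME = 16 * 1024 * 1024   # inbound WebSocket frame cap, assumed binary MB


def chunk_body(body: bytes, size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Split a response body into pieces of at most `size` bytes."""
    for offset in range(0, len(body), size):
        yield body[offset:offset + size]


def check_frame(frame: bytes, limit: int = MAX_FRAME) -> None:
    """Raise if an inbound frame exceeds the configured cap."""
    if len(frame) > limit:
        raise ValueError(f"frame of {len(frame)} bytes exceeds cap of {limit}")
```

A 40,000-byte body, for example, becomes three chunks of 16,384, 16,384, and 7,232 bytes.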

Common questions

Is my Ollama exposed to the internet?

No. The CLI initiates an outbound WebSocket. Your laptop never accepts inbound connections, and your Ollama port stays bound to localhost.

What happens when my laptop sleeps?

The CLI exits gracefully on disconnect. Resuming the laptop and re-running ee-tunnel reconnects with the existing credentials. Enrichments running while the tunnel is down fail with a clean ConnectError.

Can other organizations use my tunnel?

No. Each tunnel is bound to one organization. Other tenants on the platform cannot see, select, or invoke it. System admins can see all tunnels by design (same as global API keys).

Does this work behind a corporate firewall?

Almost always. Only outbound :443 is needed, which is the same path the web app uses.

How do I rotate credentials?

Click "Rotate token" on the Tunnels tab. The old token is invalidated immediately; the modal shows a fresh pair command for your laptop.

Can I use this without Ollama?

Yes — the CLI forwards arbitrary HTTP. Set EE_TUNNEL_OLLAMA_URL to any local OpenAI-compatible endpoint. Anything that speaks Ollama's /v1/chat/completions or /api/tags will work.

Open source

The CLI and installer are MIT-licensed and live in a public repository so anyone can audit what runs on their machine.

Source: github.com/TOT-Concept/ee-tunnel

Releases: github.com/TOT-Concept/ee-tunnel/releases — each binary is signed with cosign before publication.

Audit the installer: curl -fsSL https://entityenricher.ai/install.sh | less