Karakor
PRIVATE AI · LEGAL

On-premises model deployment is no longer exotic. Three constraints decide whether it is right for your firm.

A schematic showing a firm's privilege boundary as a brass-rimmed circle containing a server rack, a small neural-network node, retrieval connections, source documents, and an operator workstation. Outside the boundary, faint dashed lines indicate data that does not cross.
Shahed Daoud · 6 min read

Most firms we speak to have already decided that sending privileged material to a third-party model API is a non-starter. The question is what to do instead.

The short answer: run the model on hardware your firm controls, indexed against documents that never leave your boundary. The longer answer is that three constraints decide whether on-premises deployment is the right call, and they are best worked through in order: privilege first, retrieval second, operations third.

1. Privilege boundary

The first constraint is also the hardest to argue with. If your firm cannot defend, under cross-examination, the claim that no privileged material left the firm's control, then you have a privilege problem before you have a technology problem.

A privately hosted model running on a server in the firm's data centre, or on a workstation under a partner's desk, moves the boundary in the right direction. Prompts, completions, document context, and embeddings stay inside infrastructure the firm already controls and already insures.
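One way to make that claim checkable is to audit the deployment's configured endpoints against the network ranges the firm treats as its own. The sketch below is illustrative only: the `FIRM_NETWORKS` ranges, the endpoint names, and the rule of rejecting hostnames outright are all assumptions, not a description of any particular product.

```python
from __future__ import annotations

import ipaddress
from urllib.parse import urlparse

# Hypothetical ranges the firm considers inside its privilege boundary.
FIRM_NETWORKS = [
    ipaddress.ip_network("10.20.0.0/16"),
    ipaddress.ip_network("127.0.0.0/8"),
]


def inside_boundary(url: str) -> bool:
    """Return True only if the URL's host is a literal IP inside a firm network."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Assumption: hostnames are rejected, because accepting them would
        # mean trusting DNS to keep pointing inside the boundary.
        return False
    return any(addr in net for net in FIRM_NETWORKS)


def audit_endpoints(endpoints: dict[str, str]) -> list[str]:
    """Return the names of configured endpoints that fall outside the boundary."""
    return sorted(name for name, url in endpoints.items() if not inside_boundary(url))


if __name__ == "__main__":
    # Hypothetical configuration: model, embedding service, and vector store.
    configured = {
        "model": "http://10.20.4.17:8080/v1",
        "embeddings": "http://10.20.4.18:8081",
        "vector_store": "https://api.example.com/v1",
    }
    print(audit_endpoints(configured))
```

A check like this does not prove nothing crossed the boundary; firewall rules and egress logs do that work. It does give the operator a short, repeatable answer to the question of where prompts and embeddings are sent.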

This is a posture argument before it is a feature argument. A firm that can describe its model deployment in two sentences to opposing counsel is a firm that has done the work.

2. Retrieval architecture

The model itself is the easier half of the problem. The harder half is what the model is allowed to see.

A practitioner asking the model to summarise discovery for one matter must not be shown material from a different matter, a different client, or the far side of an ethical wall. Retrieval, the system that picks which documents the model receives along with the prompt, has to be matter-aware, ethical-wall-aware, and audit-trail-aware from the first day.
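A minimal sketch of those three properties, assuming a hypothetical in-memory document store: access is decided before any ranking happens, so out-of-matter documents are never scored, and every request is logged whether or not it was allowed. The class names, the `staffing` and `walls` mappings, and the keyword-overlap ranking are illustrative stand-ins, not a real retrieval stack.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Document:
    doc_id: str
    matter_id: str
    text: str


@dataclass
class Retriever:
    documents: list                 # all indexed Document objects
    staffing: dict                  # user -> set of matter ids the user works on
    walls: dict = field(default_factory=dict)      # user -> matter ids walled off
    audit_log: list = field(default_factory=list)  # one entry per request

    def retrieve(self, user, matter_id, query, k=5):
        # Matter-aware and wall-aware: decide access before touching any text.
        staffed = matter_id in self.staffing.get(user, set())
        walled = matter_id in self.walls.get(user, set())
        allowed = staffed and not walled

        # Only documents from the requested matter are ever candidates.
        candidates = (
            [d for d in self.documents if d.matter_id == matter_id] if allowed else []
        )

        # Naive keyword overlap stands in for a real ranker (an assumption).
        terms = set(query.lower().split())
        ranked = sorted(
            candidates,
            key=lambda d: len(terms & set(d.text.lower().split())),
            reverse=True,
        )
        hits = ranked[:k]

        # Audit-trail-aware: record denied requests as well as served ones.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "matter_id": matter_id,
            "allowed": allowed,
            "doc_ids": [d.doc_id for d in hits],
        })
        return hits
```

Filtering before ranking is the design choice that matters here. A system that ranks the whole corpus and filters afterwards has already let the wrong documents influence the result, and is harder to defend when someone asks what the model could have seen.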

Most off-the-shelf "private AI" products solve the model deployment problem and leave retrieval as an exercise for the reader. That exercise is the engagement.

3. Operator viability

A model deployment your firm cannot maintain past the consulting engagement is worse than no deployment. The third constraint is whether a senior IT person at the firm — not a vendor, not a managed service — can install, configure, and update the system without an outside engineer on standby.

We design private model deployments to be runnable by one disciplined operator with a written runbook. Where that is not realistic, we say so before the engagement begins.

Private model deployment is no longer the exotic option. It is the default for firms whose threat model takes privilege seriously. The three constraints above decide whether your firm should run the deployment itself or outsource the obligation — and either answer is defensible if it is made deliberately.