The technology behind nu:legal.
Agentic AI, custom legal workflows, and lawyers in the loop. Here's how the system actually works.
Why a general-purpose model isn't enough.
Most legal AI is a thin wrapper: a prompt template, a base model, a UI. That works for casual queries. It falls apart on real legal work — work where one missed clause, one wrong statute reference, one hallucinated citation has consequences.
nu:legal is built differently. The system is agentic: instead of one prompt-and-response, the platform runs multi-step legal workflows that plan, retrieve, draft, self-check, and escalate. Each step is constrained by structure built by practicing lawyers. Each output is auditable.
The model layer is part of the system, not the system itself.
The pieces.
Lawyer-built workflow graphs.
Every workflow is a structured graph of legal decisions — designed by a practicing employment law or data privacy specialist. The graph defines what to ask, what to check, what to flag, and where to escalate. The model executes the graph; it doesn't invent it.
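As a rough illustration only (not nu:legal's actual code), a workflow graph of this kind could be sketched in Python as follows. The `Node` class, the `run_graph` function, and the example questions are all hypothetical. The point is that the transitions are fixed by the graph's author, while the model only supplies an answer at each node.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Node:
    """One lawyer-defined decision point (hypothetical shape)."""
    name: str
    question: str
    # Maps each allowed answer to the next node name; None ends the workflow.
    next_by_answer: dict[str, Optional[str]] = field(default_factory=dict)
    escalate: bool = False  # True: stop here and hand over to a specialist.


def run_graph(nodes: dict[str, Node], start: str,
              answer: Callable[[Node], str], max_steps: int = 50) -> list[str]:
    """Walk the graph from `start`; `answer` stands in for the model."""
    path: list[str] = []
    current: Optional[str] = start
    while current is not None:
        if len(path) >= max_steps:
            raise RuntimeError("workflow exceeded max_steps; possible cycle")
        node = nodes[current]
        path.append(node.name)
        if node.escalate:
            break
        # An answer the graph does not list raises KeyError: the model
        # cannot invent a transition the author never defined.
        current = node.next_by_answer[answer(node)]
    return path


# Illustrative graph; the questions are placeholders, not legal advice.
GRAPH = {
    "is_written": Node("is_written", "Is the agreement in writing?",
                       {"yes": "has_severance", "no": "specialist"}),
    "has_severance": Node("has_severance", "Is severance agreed?",
                          {"yes": None, "no": None}),
    "specialist": Node("specialist", "Route to a specialist.", escalate=True),
}
```

Calling `run_graph(GRAPH, "is_written", lambda node: "no")` walks from the first question straight to the `specialist` node and stops there.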
German legal corpus and retrieval.
The system has direct access to German statutes, case law summaries, and BAG/BVerfG guidance. These are retrieved at the step that needs them rather than recalled from training memory, which means fewer hallucinations and more current answers.
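To make the idea of retrieval at the relevant step concrete, here is a minimal sketch under stated assumptions. The `Passage` type, the keyword-overlap scoring, and the prompt wording are all hypothetical; the page does not say how retrieval is implemented, and a production system would more likely use a search index or embeddings. The corpus entries are placeholders, not real statute text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    source_id: str  # placeholder identifier, not a real citation
    text: str


def retrieve(query: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    """Toy retrieval: rank passages by how many query words they share."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(p.text.lower().split())), p) for p in corpus]
    hits = [(score, p) for score, p in scored if score > 0]
    # Stable sort: ties keep their corpus order.
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in hits[:k]]


def build_prompt(question: str, passages: list[Passage]) -> str:
    """Put retrieved text in front of the model so it can cite by source id."""
    sources = "\n".join(f"[{p.source_id}] {p.text}" for p in passages)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"Sources:\n{sources}\n"
        f"Question: {question}"
    )


CORPUS = [
    Passage("placeholder-1", "notice period for termination of employment"),
    Passage("placeholder-2", "data subject access request deadline"),
    Passage("placeholder-3", "termination agreement written form"),
]
```

Because the model is asked to cite source ids that were actually retrieved, a later step can check each citation against the retrieved set.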
Multi-step agentic execution.
A single user request triggers multiple coordinated steps: classification, retrieval, draft generation, self-check, edge-case detection, and human-review routing. Each step has its own constraints and its own quality bar.
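The coordinated steps described above might look roughly like the following sketch. The `Step` class, `run_pipeline`, and the stub step bodies are hypothetical and cover only three of the listed steps. The sketch shows each step carrying its own check, with a failed check stopping the run and marking it for human review.

```python
from dataclasses import dataclass
from typing import Any, Callable

State = dict[str, Any]


@dataclass
class Step:
    name: str
    run: Callable[[State], State]      # does the work, returns updated state
    passes: Callable[[State], bool]    # this step's own quality bar


def run_pipeline(steps: list[Step], request: str) -> State:
    """Run steps in order; stop and flag for review at the first failed check."""
    state: State = {"request": request, "trace": []}
    for step in steps:
        state = step.run(state)
        state["trace"].append(step.name)
        if not step.passes(state):
            return {**state, "needs_review": True, "failed_step": step.name}
    return {**state, "needs_review": False, "failed_step": None}


# Stub steps standing in for model and retrieval calls (assumptions).
STEPS = [
    Step("classify",
         lambda s: {**s, "topic": "employment"},
         lambda s: bool(s.get("topic"))),
    Step("retrieve",
         lambda s: {**s, "passages": ["placeholder passage"]},
         lambda s: len(s["passages"]) > 0),
    Step("draft",
         lambda s: {**s, "draft": f"Draft on {s['topic']}"},
         lambda s: len(s["draft"]) > 0),
]
```

The `trace` list records which steps ran, which is one simple way to keep an output auditable.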
Self-evaluation at every step.
The system grades its own output before returning it — against the workflow's defined success criteria, against retrieval evidence, and against benchmarked specialist output. If confidence is low, the work is flagged for specialist review rather than handed back as final.
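One way to picture this grading step is the sketch below. It is an assumption-laden illustration, not the production logic: the three scores, the choice to take the lowest of them as the overall confidence, and the 0.8 threshold are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evaluation:
    criteria: float   # share of the workflow's success criteria met, 0..1
    evidence: float   # share of citations backed by retrieved sources, 0..1
    benchmark: float  # similarity to benchmarked specialist output, 0..1

    @property
    def confidence(self) -> float:
        # Conservative choice (an assumption): the weakest signal decides.
        return min(self.criteria, self.evidence, self.benchmark)


def evidence_support(cited_ids: list[str], retrieved_ids: set[str]) -> float:
    """Fraction of cited sources that were actually retrieved."""
    if not cited_ids:
        return 0.0  # a draft with no citations is treated as unsupported
    supported = sum(1 for cid in cited_ids if cid in retrieved_ids)
    return supported / len(cited_ids)


def route(evaluation: Evaluation, threshold: float = 0.8) -> str:
    """Return the draft only if confidence clears the (hypothetical) threshold."""
    if evaluation.confidence >= threshold:
        return "return_to_user"
    return "specialist_review"
```

For example, a draft that cites two sources of which only one was retrieved gets an evidence score of 0.5, so `route` sends it to specialist review even if the other two scores are high.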
Human-in-the-loop by design.
Specialist review isn't an upsell bolted on top. It's a native step in the architecture — the system can route to a practicing lawyer mid-workflow when the situation requires it.
Continuous benchmarking.
Every release is tested against expert legal work and against general-purpose models like ChatGPT and Claude. We track win rates by workflow, surface regressions, and ship improvements weekly.
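Tracking win rates by workflow and surfacing regressions can be sketched in a few lines. The data shape (one `(workflow, won)` pair per head-to-head comparison), the workflow names, and the 0.05 tolerance are assumptions for illustration; the page does not describe the real benchmark format.

```python
from collections import defaultdict
from typing import Iterable


def win_rates(results: Iterable[tuple[str, bool]]) -> dict[str, float]:
    """Compute the win rate per workflow from (workflow, won) pairs."""
    wins: dict[str, int] = defaultdict(int)
    totals: dict[str, int] = defaultdict(int)
    for workflow, won in results:
        totals[workflow] += 1
        wins[workflow] += int(won)
    return {w: wins[w] / totals[w] for w in totals}


def regressions(previous: dict[str, float], current: dict[str, float],
                tolerance: float = 0.05) -> list[str]:
    """Workflows whose win rate dropped by more than `tolerance` since last release."""
    return sorted(
        w for w in current
        if w in previous and previous[w] - current[w] > tolerance
    )


# Hypothetical comparison results for one release.
RESULTS = [("dismissal", True), ("dismissal", False), ("dsar", True)]
```

With a previous release at `{"dismissal": 0.7, "dsar": 1.0}`, the results above would surface `dismissal` as a regression.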
The trade-offs we made on purpose.
Generic AI optimises for breadth. We optimise for legal correctness in the German system: a much narrower target with a much higher bar.
That means we accept some constraints: Counsel won't help you write a screenplay or summarise your inbox. It will draft an Aufhebungsvertrag that holds up under KSchG and flag the Sperrzeit risk before you sign it.
We think that trade is worth making. So do the lawyers we've built it with.
How we know it's working.
We benchmark Counsel against expert legal work on every release — and against general-purpose models. We track accuracy, completeness, and how often the system correctly flags a situation that needs human attention.
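"How often the system correctly flags a situation that needs human attention" can be read as a recall measure, paired with precision so that over-flagging is also visible. That reading is an assumption, as is the data shape below: one `(needs_human, flagged)` pair per benchmark case, with `needs_human` coming from expert labels.

```python
from typing import Optional


def flag_metrics(cases: list[tuple[bool, bool]]) -> dict[str, Optional[float]]:
    """Recall and precision of escalation flags over labelled benchmark cases."""
    tp = sum(1 for needs, flagged in cases if needs and flagged)
    fn = sum(1 for needs, flagged in cases if needs and not flagged)
    fp = sum(1 for needs, flagged in cases if not needs and flagged)
    # None signals "undefined" when there is nothing to divide by.
    recall = tp / (tp + fn) if (tp + fn) else None
    precision = tp / (tp + fp) if (tp + fp) else None
    return {"recall": recall, "precision": precision}
```

Recall answers "of the cases that needed a lawyer, how many did the system flag?", while precision answers "of the cases it flagged, how many really needed one?".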
Where we fall short, we publish what we're working on. Where we lead, we publish that too.

Want to dig deeper?
Read our technical blog, build with us, or reach out for press inquiries.