Entitybits
Perspective · 2026

How we think about AI.

The AI tooling landscape shifted substantially between 2023 and 2026, and we have shipped through every phase of it. Here are five things we believe today, and how they shape what we build.

01

Models are not the moat. The system around them is.

Anyone can call Claude Opus 4.7, GPT-5, Gemini 3, or Llama 4. The moat is the system around the model: retrieval that actually retrieves the right thing, evals that catch regressions before users do, prompts that are versioned and tested like code, and infrastructure that survives a production load test.
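One concrete piece of that system is treating prompts as versioned artifacts. A minimal sketch, with illustrative template names (nothing here is a specific framework's API): pin each template to a content hash and log that hash with every model call, so a silent prompt edit shows up as a version change in your traces.

```python
import hashlib

# Illustrative prompt registry; the template name and text are assumptions.
PROMPT_TEMPLATES = {
    "support_answer": (
        "Answer the user's question using only the context:\n"
        "{context}\n\nQ: {question}"
    ),
}

def prompt_version(name: str) -> str:
    """Short content hash of a template, suitable for logging alongside
    every model call so regressions can be traced to a prompt change."""
    text = PROMPT_TEMPLATES[name]
    return hashlib.sha256(text.encode()).hexdigest()[:12]

version = prompt_version("support_answer")
```

The same hash can gate deploys: if the hash changed, the eval suite for that template has to pass before the new version ships.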

Where we've shipped this — Enterprise RAG with neural reranking
02

Agents are real, but most "agentic" demos are not.

A working agent has tool access, durable memory, structured outputs, and recovery logic for when a tool call fails. Most of the production work we do now involves agentic patterns.
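The recovery-logic part is the piece demos usually skip. A minimal sketch, with hypothetical tool names: retry transient failures with backoff, then hand the agent loop a structured error it can reason about instead of crashing the run.

```python
import time

def search_tool(query: str) -> dict:
    # Stand-in for a real tool call; name and shape are illustrative.
    return {"results": [f"doc about {query}"]}

def broken_tool(**kwargs):
    # Stand-in for a tool whose upstream dependency is down.
    raise TimeoutError("upstream timeout")

def call_tool_with_recovery(tool, args, retries=2, backoff=0.5):
    """Retry transient failures with exponential backoff, then return a
    structured error the agent loop can act on instead of raising."""
    for attempt in range(retries + 1):
        try:
            return {"ok": True, "data": tool(**args)}
        except Exception as exc:
            if attempt == retries:
                return {"ok": False, "error": str(exc)}
            time.sleep(backoff * (2 ** attempt))

result = call_tool_with_recovery(search_tool, {"query": "pricing"})
failure = call_tool_with_recovery(broken_tool, {}, retries=1, backoff=0)
```

The point of the `{"ok": False, ...}` shape is that the failure re-enters the agent's context as data, so the model can choose a different tool or escalate rather than the whole run dying on an exception.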

The Model Context Protocol (MCP) has become a useful standard for exposing internal APIs to LLMs, and we build MCP servers as part of integration work where it makes sense.

Where we've shipped this — AI agents across 6+ programmatic platforms
03

Hallucination reduction is a stack, not a setting.

A real hallucination stack layers grounding through retrieval, citation requirements, structural constraints in the prompt, output validation against schemas, confidence-aware fallbacks, and a human review path for the cases the model still gets wrong. Each layer catches what the others miss. Nothing about this is solved by switching models.
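The post-generation layers can be sketched in a few lines. This is illustrative, assuming a hypothetical output contract (the field names and threshold are ours, not a standard): parse the raw model output, validate it against a schema, require citations, and route low-confidence answers to human review.

```python
import json

# Assumed output contract for the model's JSON response.
REQUIRED = {"answer": str, "citations": list, "confidence": float}

def validate_and_gate(raw: str, min_confidence: float = 0.7) -> dict:
    """Layered post-generation checks; each return is a different layer
    catching what the previous ones missed."""
    try:
        out = json.loads(raw)                       # layer: structural parse
    except json.JSONDecodeError:
        return {"route": "human_review", "reason": "unparseable output"}
    for field, typ in REQUIRED.items():             # layer: schema validation
        if not isinstance(out.get(field), typ):
            return {"route": "human_review", "reason": f"bad field: {field}"}
    if not out["citations"]:                        # layer: citation requirement
        return {"route": "human_review", "reason": "no citations"}
    if out["confidence"] < min_confidence:          # layer: confidence fallback
        return {"route": "human_review", "reason": "low confidence"}
    return {"route": "serve", "payload": out}       # all layers passed

good = validate_and_gate('{"answer": "42", "citations": ["doc-3"], "confidence": 0.9}')
bad = validate_and_gate('{"answer": "42", "citations": [], "confidence": 0.9}')
```

A fluent but uncited answer gets routed to review here even though it parsed and validated, which is exactly the failure mode a single "use a better model" fix never addresses.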

Where we've shipped this — Multi-layer RAG with citations and validation
04

Evals are the single biggest gap in most teams' AI work.

Teams ship a prompt, watch it work in three test cases, and call it done. Six weeks later quality has drifted, traffic has shifted, the underlying model has been updated, and nobody has a number for how good or bad the output is.

We treat eval design as a deliverable on every AI engagement.
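The deliverable can start very small. A minimal sketch, where `model` is a stub standing in for a real LLM call and the cases and scorer are illustrative assumptions: run every case, compute a pass rate, and gate deploys on a threshold so "how good is it" always has a number.

```python
def model(prompt: str) -> str:
    # Stub standing in for a real LLM call, so the harness shape is visible.
    return "Paris" if "France" in prompt else "unknown"

# Illustrative eval cases; real suites pin these in version control
# and grow them from production failures.
EVAL_CASES = [
    {"prompt": "Capital of France?", "expect": "Paris"},
    {"prompt": "Capital of Atlantis?", "expect": "unknown"},
]

def run_evals(cases, threshold=0.9):
    """Score every case with an exact-match check and return a pass rate
    that a CI job can gate a deploy on."""
    passed = sum(model(c["prompt"]) == c["expect"] for c in cases)
    rate = passed / len(cases)
    return {"pass_rate": rate, "ship": rate >= threshold}

report = run_evals(EVAL_CASES)
```

Exact match is the crudest possible scorer; real suites layer in rubric grading or model-graded checks, but even this shape catches the "quality drifted six weeks later" failure because the number gets recomputed on every change.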

Where we've shipped this — Eval-driven analytics with explainable outputs
05

The buying question has changed.

Two years ago, the question was "Can we use AI for this?" Today the question is "Should we, and at what cost?" Compute is not free, latency is not free, and a wrong AI answer in a customer-facing surface is more expensive than no AI answer at all.

We help clients answer the second question honestly.

This is the operating context we work in, and it shapes how we scope, build, and hand off every engagement.

Have a real AI system to build?

30-minute scoping call. We'll tell you if it's worth doing — and if not, what to do instead.

Start the conversation