Technology
August 3, 2025

Your AI Agent Needs Minimal Relevant Context at the Right Time

Written by
Fawad Khaliq
Estimated reading time: 3 min

What is context engineering? It’s the practice of assembling all the facts, instructions, and tools an LLM needs to complete a task—delivered at the right time.

Large context windows feel liberating. You paste the logs, the docs, the schema dumps—the entire system footprint—and it feels like you’re setting your AI agent up for success. The catch? Models slog through that noise, you pay more tokens, and they still miss the point. At Chkk, we’ve learned that less but sharper beats more but messy every single time. Let’s call this Minimal Relevant Context (MRC).

Humans, Agents, and the Cost of Vagueness

Think about your high-agency teammate. When you hand them clear instructions, they sprint. When you mumble, they stumble. Agents are no different. Dumping the entire company wiki into a prompt is like forwarding your intern every email thread since 2018: more overwhelming than empowering.

We’ve seen this firsthand. Whether you're managing humans or agents, context quality determines outcome quality. The more intentional you are about what context gets passed in, the better the result.

The Early Days at Chkk

We’ve been obsessed with context since day one. Long before LLMs left the lab, our customers were drowning in information. A routine infrastructure upgrade typically involved hundreds of services, dozens of open-source projects, multiple deployment systems, and a jungle of configs, parameters, and dependencies.

Expecting an engineer to read every repo’s CHANGELOG and figure out what mattered to their live infrastructure was painful, to say the least.

So we built a system that collects running state from infrastructure, classifies every object, contextualizes operational insights from our Knowledge Graph, and generates tailored workflows. The end result? Contextualized breaking changes, contextualized diffs, contextualized preflight/postflight checks, and more. This freed our customers from labor-intensive, error-prone lifecycle management and saved them months of effort.

Think of this system as a pipeline: Collect → Analyze → Reason → Act. It worked for humans. That same foundation is exactly what agents need too.
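
As a rough sketch of that mental model (the types and function names below are illustrative, not our actual code), the four stages hand typed data to one another:

```python
# Illustrative skeleton of the Collect -> Analyze -> Reason -> Act pipeline.
# All names are hypothetical; the point is the shape of the handoffs.
from dataclasses import dataclass, field


@dataclass
class EnvironmentSnapshot:
    """Raw configuration and runtime metadata from one environment."""
    objects: list[dict] = field(default_factory=list)


@dataclass
class FactGraph:
    """Facts scoped to the exact objects and versions running right now."""
    facts: list[dict] = field(default_factory=list)


@dataclass
class UpgradePlan:
    """Structured, environment-specific plan produced by the reasoning stage."""
    steps: list[str] = field(default_factory=list)


def collect(environment: str) -> EnvironmentSnapshot: ...    # gather running state
def analyze(snapshot: EnvironmentSnapshot) -> FactGraph: ...  # classify and contextualize
def reason(graph: FactGraph) -> UpgradePlan: ...              # decide what matters
def act(plan: UpgradePlan) -> None: ...                       # execute or hand to a human
```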

Collect

We start by gathering raw configuration and runtime metadata from customer environments: everything from Helm values and deployment manifests to AMI digests and kernel params.
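
To make that concrete, here is a hypothetical, heavily simplified shape of one collected object (field names and values are illustrative, not our actual schema):

```python
# Hypothetical collected object; field names and values are illustrative only.
collected_object = {
    "kind": "Deployment",
    "name": "istiod",
    "namespace": "istio-system",
    "image": "docker.io/istio/pilot:1.22.4",
    "helm_values": {"pilot": {"resources": {"requests": {"memory": "2048Mi"}}}},
    "node": {
        "ami_digest": "sha256:<elided>",
        "kernel_params": {"net.core.somaxconn": "1024"},
    },
}
```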

Meanwhile, our Collective Learning system continuously ingests upstream changelogs, release notes, documentation, and risk advisories across hundreds of open-source projects. It filters, normalizes, and distills this firehose into machine-actionable knowledge.
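
What “machine-actionable knowledge” might look like after that distillation, with hypothetical fields for illustration:

```python
# Hypothetical normalized record distilled from an upstream release note.
# Fields are illustrative, not the actual Knowledge Graph schema.
advisory = {
    "project": "istio",
    "introduced_in": "1.23.0",
    "change_type": "breaking",
    "summary": "Telemetry defaults changed; external service metrics may stop being emitted",
    "recommended_action": "Review Telemetry resources before upgrading to 1.23",
    "source": "<upstream release notes URL>",
}
```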

Analyze

We classify every running object down to the exact project, package, component, and deployment system. Then we contextualize. We don’t just link to upstream docs; we extract only the context that actually touches "this version" of "this object" in "this environment". The result isn’t a “prompt.” It’s a fact graph tailored to your live stack.
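
For illustration (names are hypothetical), a slice of that fact graph for a single running object might reduce to something like:

```python
# Hypothetical fact-graph slice for one running object. Only facts that touch
# this version of this object in this environment survive contextualization.
fact_slice = {
    "object": {"kind": "Deployment", "name": "istiod", "namespace": "istio-system"},
    "classified_as": {"project": "istio", "component": "istiod",
                      "version": "1.22.4", "deployed_via": "helm"},
    "relevant_facts": [
        {"type": "breaking_change", "introduced_in": "1.23.0",
         "summary": "Telemetry defaults changed; external service metrics may stop being emitted"},
    ],
    # Everything upstream that does not touch istiod 1.22.4 in this environment is dropped.
}
```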

Reason and Act

Our decision cortex turns that graph into structured upgrade plans. What’s breaking? What needs testing? What post-upgrade checks should run? Instead of weeks of manual yak-shaving, customers get environment-specific plans in minutes.
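
A hedged sketch of what such a plan could look like for the example above (the structure and field names are assumptions, not the product’s output format):

```python
# Hypothetical environment-specific upgrade plan derived from the fact graph.
upgrade_plan = {
    "target": "istio 1.22.4 -> 1.23.x",
    "whats_breaking": [
        "Telemetry defaults changed; external service metrics may stop being emitted",
    ],
    "preflight_checks": [
        "Confirm Telemetry resources cover external services before the upgrade",
    ],
    "test_plan": [
        "Upgrade a staging mesh first and verify external service metrics still appear",
    ],
    "postflight_checks": [
        "Compare external service metric volume before and after the rollout",
    ],
}
```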

All because the right context shows up at the right time.

Fast-Forward to Agents: Same Need, New Consumer

When we started operationalizing agents and handing off this context, we already had the hard parts: clean inventory, enriched knowledge, and environment-aware slices of context. But agents consume context differently. A typical agent loop looks like this:

1. Observe user input and scratchpad

2. Choose a tool

3. Execute and observe output

4. Append the result to context

5. Repeat

The context inflates every turn, while the output remains just a few lines. A million-token window sounds huge. Until your agent burns through it in 20 tool calls.
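
A minimal sketch of that loop makes the inflation obvious; the `llm` client and `tools` registry here are generic stand-ins, not a specific SDK:

```python
# Minimal sketch of a tool-calling agent loop. `llm` and `tools` are stand-ins
# for whatever model client and tool registry you use.
def run_agent(llm, tools, user_input: str, max_turns: int = 20) -> str:
    context = [{"role": "user", "content": user_input}]       # scratchpad starts small
    for _ in range(max_turns):
        decision = llm(context)                               # observe context, choose a tool
        if decision.tool is None:
            return decision.text                              # the model decided it is done
        output = tools[decision.tool](**decision.args)        # execute the tool
        context.append({"role": "tool", "content": output})   # append result: context inflates
        # repeat: the prompt grows every turn, while the final answer stays a few lines
    return "gave up after max_turns"
```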

That’s when we started seeing failure patterns.

Common Failures with Large Context

Models Pay Attention to Everything You Say

Just because it’s in the context doesn’t mean it should be. But models don’t know that. Feed them stale configs, half-relevant logs, or outdated notes, and they’ll still try to be helpful. We’ve seen agents hallucinate from an irrelevant changelog buried in the prompt, simply because it was there.

Larger context windows don’t fix this. They enable it. Agents slog through noise, slow down, cost more, and sometimes take actions based on red herrings. The smarter the model, the harder it is to catch, because the output sounds plausible.

Agents Anchor on Historical Context

As agents gather observations, they build a sense of history. That history should help, but often, it anchors them. We’ve watched agents repeat tool calls even after conditions changed, stuck in a loop because an old success path stayed too prominent in the buffer.

This isn’t a token limit issue. It’s a relevance decay issue. When everything feels equally weighted, the agent loses its ability to focus on what matters now.
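
One way to fight that decay (an illustrative heuristic, not a description of our internals) is to weight each observation by goal-relevance, let that weight decay with age, and prune whatever falls below a threshold:

```python
import math


def decayed_relevance(age_turns: int, relevance: float, half_life: float = 3.0) -> float:
    """Goal-relevance of an observation, halved every `half_life` turns since it arrived.

    `relevance` is assumed to come from some scorer (embedding similarity, rules, etc.).
    """
    return relevance * math.exp(-math.log(2) * age_turns / half_life)


def prune_observations(observations: list[dict], threshold: float = 0.2) -> list[dict]:
    """Keep only observations whose decayed relevance still clears the threshold."""
    return [o for o in observations
            if decayed_relevance(o["age_turns"], o["relevance"]) >= threshold]
```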

Models Second-Guess Themselves

This kind of failure usually shows up late. The agent drafts a plan, then uncovers new information that partially invalidates it. Without memory hygiene, the draft lingers in the buffer. The model starts to hedge. Then rewrite. Then overwrite. We’ve seen agents produce mutually contradictory outputs in a single run.

It happens because the model is trying to be helpful. It retains prior thoughts, blends in new ones, and gives you “options.” But what you really want is confidence, not contradiction.
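
One simple form of that hygiene (a sketch of the general idea, not any particular framework’s API) is to replace a superseded draft in the buffer instead of appending a new one beside it:

```python
# Hypothetical memory-hygiene helper: a newer draft replaces the older one instead of
# accumulating next to it, so the model never sees two competing plans at once.
def upsert_draft(context: list[dict], new_draft: str) -> list[dict]:
    kept = [m for m in context if m.get("kind") != "draft_plan"]   # drop the stale draft
    kept.append({"kind": "draft_plan", "content": new_draft})      # keep only the latest
    return kept
```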

Design for Minimal Relevant Context

Minimal Relevant Context was the linchpin. Our system treats context as a living contract: continuously pruned, shaped, and ranked for relevance. We don’t just pipe data into prompts. We build environment-aware context slices, scoped to the goal, the latest evidence, and the smallest surface area required to act.
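
In code terms, the idea looks roughly like this (a sketch with illustrative names): before each step, rebuild the slice from scratch, ranked against the current goal and capped at the smallest budget that still lets the agent act.

```python
# Illustrative sketch of assembling a Minimal Relevant Context slice per step.
# Names are hypothetical; the point is the shape: rank against the goal, then cap.
def build_mrc_slice(goal: str, facts: list[dict], evidence: list[dict],
                    relevance_fn, budget_tokens: int = 4000) -> list[dict]:
    candidates = facts + evidence
    # Rank every candidate against the *current* goal, not against accumulated history.
    ranked = sorted(candidates, key=lambda c: relevance_fn(goal, c), reverse=True)
    mrc_slice, used = [], 0
    for candidate in ranked:
        cost = candidate.get("token_count", 0)
        if used + cost > budget_tokens:
            break                      # smallest surface area required to act
        mrc_slice.append(candidate)
        used += cost
    return mrc_slice
```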

Where We’re Headed

Context engineering is still young, but for agent systems, it’s already mission-critical. Models are getting faster, cheaper, and smarter. But raw power doesn’t replace memory hygiene, relevance, or judgment.

At Chkk, we treat Minimal Relevant Context as bedrock. It lives in how we structure inventory, normalize knowledge, track environment diffs, and scope inputs to the task at hand.

Because in the end, your agent is only as good as the context it’s handed.

Tags
AI-Coding-Agents
Context Engineering
