# The Days of Prompt Engineering Are Over

When large language models first arrived, **prompt engineering** became the new craft. If you wanted a model to reason step by step, you told it explicitly. If you wanted structured output, you begged in natural language. Everything depended on phrasing: *“Let’s think step by step.”* *“Return only JSON.”* *“Answer in exactly three bullets.”* It worked—but it was brittle. A single word change could break the output. The model wasn’t really following rules; it was pattern-matching against instructions.

As models matured, their capabilities expanded. Context windows grew from a few thousand tokens to **hundreds of thousands—up to around a million, depending on the model**. Tool use and structured outputs became native. With that, a subtle but profound shift began for me: I no longer tell the model *how* to perform a task—I show it *what to work with* and give it clear rails. That’s context engineering.

### Why the Shift Happened

The journey from prompt engineering to context engineering is really a story of **model evolution**.

![Timeline: Model evolution → Workflow evolution](/images/blog/prompt-engineering-to-context-engineering.svg)

As models gained longer context, tool use, and schema-constrained outputs, the bottleneck stopped being clever phrasing. It became **curating the right context**—the smallest set of files, artifacts, and constraints that makes the task obvious to the model.

### What Changed in My Daily Flow

I default to modern, long-context, tool-using models (often in Cursor) because they balance fast edit-run loops with reliability on multi-file refactors. They also tolerate larger, messier inputs without derailing, which means I can feed real code instead of toy snippets.

Most of my day fits a few patterns—refactors, writing tests, generating small docs, asking for code explanations, and skimming PRs. All of them benefit from the same approach: **point the model at the exact surfaces involved and ask for a concrete change or summary**. Let the repo be the source of truth; avoid chasing phrasing tricks; get consistent diffs.

Instructions are intentionally short—one sentence that states the goal and guardrails. The “work” lives in the context (files, examples, acceptance criteria). That shift makes results more deterministic and easier to review because the model is constrained by the artifacts I choose.

**Micro-example (how this actually plays out):** I needed to rename a React hook across modules and update imports. My instruction was a single line: “Rename `useX` to `useY` across these files; keep behavior identical; update imports only where needed.” I attached the affected modules plus the nearest tests and the ticket’s acceptance criteria. The first diff was correct—no retries—because the constraints lived in the files and tests, not the prompt prose.
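To make that concrete, here is a minimal sketch of the shape of that edit. The hook body, the file paths, and the `List` call site are invented for illustration (the post only names `useX` and `useY`); the point is that behavior stays identical and call sites only swap the import and the identifier.

```tsx
// hooks/useY.ts (formerly useX): behavior identical, only the exported name changed.
// The hook body is invented for this sketch; the real hook did more work.
import { useMemo } from "react";

export function useY(items: string[]): string[] {
  // Same logic the old useX had: return a sorted copy without mutating the input.
  return useMemo(() => [...items].sort(), [items]);
}
```

```tsx
// components/List.tsx: a call site after the rename. Only the import specifier
// and the identifier change; props, rendering, and behavior stay the same.
import { useY } from "../hooks/useY";

export function List({ items }: { items: string[] }) {
  const sorted = useY(items);
  return (
    <ul>
      {sorted.map((item) => (
        <li key={item}>{item}</li>
      ))}
    </ul>
  );
}
```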
### The Context I Attach Most Often

* **Files and Folders.** The module under change, adjacent types/helpers, and nearby tests. This confines edits to real interfaces and reduces guesswork about names, imports, and side effects.
* **Screenshots and OG (Open Graph) Previews.** When copy or layout matters, a picture anchors intent—what text sits where, which state is visible—so the model doesn’t invent UI that isn’t on screen.
* **Tickets and Notes.** Acceptance criteria, edge cases, and links to prior commits. With those attached, the model optimizes for the outcome I care about instead of a generic improvement.
* **Context Packs (defined).** A lightweight bundle for the task: code under change + nearby tests + relevant docs/screenshots + the ticket. I assemble this per task rather than relying on a massive, static system prompt.

### How the Workflows Differ

Here’s the practical difference I see every day:

![Prompt engineering vs Context engineering](/images/blog/prompt-engineering-vs-context-engineering.svg)

Instead of cramming prompts with examples, I hand the model the exact materials it needs. In Cursor that means selecting the folder or files that define the behavior, pasting in the ticket, and—when relevant—attaching screenshots or preview outputs. The instruction stays short because the context does the heavy lifting.

Instead of building giant system prompts to cover every edge case, I keep a thin, durable policy—just enough to define goals and tone—and then assemble **context packs** for each task. The instruction stays short because the context constrains the edit.

### The Bigger Picture

We started with prompts because models were too limited to understand much else. Today, they can reason, validate, and plan. **When the ticket is clear and I attach the exact files/tests, the first diff is usually right; ambiguous tickets still benefit from one clarifying pass.** The shift from telling the model *how* to think toward curating *what it sees* makes it reliable and practical for day-to-day engineering work.

So it raises the question: if prompt engineering defined the past and context engineering defines today—**what comes next?**

Try this in your own setup: keep the instruction to one sentence, attach the modules, tests, and ticket, and see how often your first diff lands.
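If it helps to make that habit concrete, here is one hypothetical way to write a context pack down as data. The post assembles packs by hand in the editor rather than as a formal artifact, so the `ContextPack` shape, field names, paths, and ticket ID below are all illustrative, not a prescribed format.

```ts
// A hypothetical shape for a "context pack": the instruction stays one sentence,
// and everything else travels as attached context. Nothing here is a standard format.
export interface ContextPack {
  instruction: string;   // one sentence stating the goal and guardrails
  files: string[];       // code under change plus adjacent types/helpers
  tests: string[];       // nearby tests that pin current behavior
  artifacts?: string[];  // screenshots, OG previews, generated docs
  ticket?: {
    id: string;
    acceptanceCriteria: string[];
    notes?: string;
  };
}

// The earlier rename task written out as a pack. Paths and the ticket ID are invented.
export const renameUseX: ContextPack = {
  instruction:
    "Rename useX to useY across these files; keep behavior identical; update imports only where needed.",
  files: ["src/hooks/useX.ts", "src/components/List.tsx"],
  tests: ["src/hooks/useX.test.ts"],
  ticket: {
    id: "TICKET-123",
    acceptanceCriteria: ["All imports updated", "No behavior changes"],
  },
};
```

Whether or not you ever serialize it like this, the discipline is the same: a one-sentence instruction, with the real constraints carried by the attached files, tests, and ticket.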
Tags: AI, LLMs, ML, AI engineering, Prompt engineering, Context engineering