Building Software Faster with LLMs: Part 1 - The Pain Points
Note: This article series was written by me, with LLMs helping to refine the style and structure.
Working with 10 parallel LLM coding sessions exposes problems that don’t appear at smaller scale. Managing multiple conversations, maintaining context across sessions, and ensuring quality all require different approaches than single-session work.
This series documents those problems and the solutions that emerged. The tools shown use Claude Code and Emacs, but the patterns apply broadly to any LLM workflow.
Series Navigation: Part 2: Ergonomics →
The Pain Points
The problems:
- Managing Multiple Conversations - 10 terminal windows, no visibility into which sessions need attention
- Lost Context - No audit trail of past sessions or decisions made
- Quality & Regressions - LLMs fix one thing, break another
- Language-Specific Edit Challenges - Parenthesis balance issues in Lisp
- Project Exploration Speed - 10+ minutes to load a 20-file project
- Context Switching Between Sessions - No shared knowledge between parallel sessions
- Review Without Full IDE Context - Reviewing diffs without syntax highlighting and jump-to-def
- No Long-Term Memory - Every session starts from scratch
- Parallelization Challenge - Coordinating multiple LLMs working simultaneously
- Safety and Access Control - Too easy to grant access to private data
Let’s dive into each of these.
Problem 1: Managing Multiple Conversations
Picture this: 10 terminal windows, each running a different LLM session. One is refactoring your note system, another is debugging a home automation script, a third is implementing a new feature. Zero visibility into which needs your attention.
The problem becomes clear when context switching:
- Which session is waiting for input?
- Which is still processing?
- Which finished 10 minutes ago and has been idle?
Without state tracking across sessions, every context switch means manually checking each window. You switch to a session only to find the LLM finished 10 minutes ago while you were focused elsewhere.
Problem 2: Lost Context
Open a project you worked on last week with an LLM. The code looks unfamiliar. You don’t remember writing it. Questions arise:
- What was the original prompt?
- Did I review this properly?
- What architectural decisions were made?
- Why this approach instead of alternatives?
Without an audit trail of past sessions, there’s no way to reconstruct the reasoning behind the code. You’re essentially trusting that past-you made good decisions—but you have no record of what those decisions were.
Automatic context compaction makes this worse. LLMs will drop older messages to fit within token limits, but I want explicit control over what gets retained from session to session, not an algorithm deciding what’s “important.”
Problem 3: Quality and Regressions
Whack-a-mole development: LLMs fix one issue and silently break another. The problem wasn’t the LLM’s capabilities—it was my process. I was treating LLM sessions like conversations with a developer I trusted to test their own code.
The first solution: treat every change like a pull request. Tests must pass.
# After every LLM change
make test # Must pass before continuing
This catches regressions but doesn't address architectural consistency. Code generated across dozens of separate sessions felt scattered, as if it had been designed by a committee whose members never talked to each other.
The second solution: persona-based prompts. Instead of “Refactor this code”:
You are Robert C. Martin (Uncle Bob). Review this code and refactor
it according to clean code principles.
The difference was striking. Suddenly: smaller functions, better separation of concerns, consistent naming conventions across the codebase.
You can use different personas for different needs. Want paranoid security review? “You are a security-minded, paranoid QA engineer who trusts nothing.” Need simplicity? “You are obsessed with reducing complexity and eliminating unnecessary abstractions.” The persona focuses the LLM’s attention on specific concerns.
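One way to keep personas consistent across many sessions is to store them as reusable prompt prefixes instead of retyping them. Here is a minimal sketch; the persona wording comes from the examples above, but the helper function and the example task are hypothetical, not my actual tooling.
```python
# Reusable persona prefixes. The point is that the prefix, not the task
# text, carries the review stance; swap the persona to change what the
# LLM pays attention to.
PERSONAS = {
    "clean-code": "You are Robert C. Martin (Uncle Bob). Review this code "
                  "and refactor it according to clean code principles.",
    "paranoid-qa": "You are a security-minded, paranoid QA engineer who "
                   "trusts nothing.",
    "simplifier": "You are obsessed with reducing complexity and "
                  "eliminating unnecessary abstractions.",
}

def build_prompt(persona: str, task: str) -> str:
    """Prepend the chosen persona to the actual task description."""
    return f"{PERSONAS[persona]}\n\n{task}"

print(build_prompt("clean-code", "Refactor the session manager module."))
```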
Problem 4: Language-Specific Edit Challenges
Lisp-based languages (Elisp, Clojure, Scheme) are harder for LLMs to edit because every change must keep the parentheses balanced.
The problem: Remove one closing paren and get “end-of-file during parsing” with no location. The error could be 200 lines away from the actual edit.
The feedback loop:
- LLM edits code
- Compile fails
- Hunt for unbalanced paren manually
- Fix and retry
This affects any language with nested structure spanning many lines: deeply nested JSON, XML, etc.
The solution: validation tooling that gives precise error locations. Without that, you’re debugging blind.
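Even a crude checker that tracks nesting depth and reports the first offending location beats hunting by hand. Here is a minimal Python sketch, not my actual tooling; it deliberately ignores strings, comments, and character literals, which real validation has to handle:
```python
# Minimal balance checker for Lisp-like source. Reports the line and
# column of the first unmatched ')' or of the '(' left open at EOF.
import sys

def check_balance(text):
    stack = []          # positions of '(' still waiting for a match
    line, col = 1, 0
    for ch in text:
        col += 1
        if ch == "\n":
            line, col = line + 1, 0
        elif ch == "(":
            stack.append((line, col))
        elif ch == ")":
            if not stack:
                return f"unmatched ')' at line {line}, column {col}"
            stack.pop()
    if stack:
        l, c = stack[-1]
        return f"'(' opened at line {l}, column {c} is never closed"
    return None

if __name__ == "__main__":
    error = check_balance(open(sys.argv[1]).read())
    print(error or "balanced")
    sys.exit(1 if error else 0)
```
The reported line number can go straight back into the LLM's next prompt, which closes the feedback loop instead of leaving you to hunt.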
Problem 5: Project Exploration Speed
New codebase? Get ready to spend 10+ minutes on initial exploration. A 20-file project means feeding files one by one to the LLM, waiting for API calls, managing context windows.
This creates a cold-start problem: every new project, and every switch between projects, means a lengthy ramp-up before the LLM has enough context to be productive.
The solution: a way to efficiently snapshot and load project context—not just individual files, but the structure, key patterns, and architectural decisions all at once.
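As a rough illustration of the idea (the real exploration tools show up in Part 4), a snapshot can be as simple as one pass over the tree that collects a header plus the first few dozen lines of each source file, so a new session can be primed with a single paste instead of twenty file reads. The extensions and line cutoff below are arbitrary choices:
```python
# Hypothetical project snapshot: dump the tree and the head of each
# source file into one string that can be fed to a fresh LLM session.
from pathlib import Path

SOURCE_EXTS = {".py", ".el", ".clj", ".md"}
HEAD_LINES = 40

def snapshot(root="."):
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if path.suffix not in SOURCE_EXTS or not path.is_file():
            continue
        head = path.read_text(errors="replace").splitlines()[:HEAD_LINES]
        parts.append(f"=== {path} ===\n" + "\n".join(head))
    return "\n\n".join(parts)

if __name__ == "__main__":
    print(snapshot())
```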
Problem 6: Context Switching Between Sessions
I’d discover a great pattern in session A. Session B, working on a related problem, had no idea it existed.
Each LLM conversation was an island. Problems with this isolation:
- Can’t share knowledge between sessions
- Contradictory decisions across different LLM instances
- Manual copy-paste required to propagate learnings
- If I made an architectural decision in conversation A, conversation B would make a different one
The solution: a shared context system where different LLM sessions can coordinate and learn from each other.
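Even something as simple as an append-only decisions log, read by every session at startup and appended to whenever a session commits to a choice, goes a long way. A hypothetical sketch; the path, schema, and example decision are made up:
```python
# Hypothetical shared decisions log: an append-only JSONL file that every
# session loads at startup and appends to when it makes an architectural
# choice. Concurrent appends of short lines are usually safe, but real
# tooling would want file locking.
import json, time
from pathlib import Path

LOG = Path("~/.llm-sessions/decisions.jsonl").expanduser()

def record_decision(session_id: str, decision: str) -> None:
    LOG.parent.mkdir(parents=True, exist_ok=True)
    entry = {"ts": time.time(), "session": session_id, "decision": decision}
    with LOG.open("a") as f:
        f.write(json.dumps(entry) + "\n")

def load_decisions() -> list[dict]:
    if not LOG.exists():
        return []
    return [json.loads(line) for line in LOG.read_text().splitlines() if line]

record_decision("session-a", "Store notes as org files, not a database")
for d in load_decisions():
    print(d["session"], "->", d["decision"])
```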
Problem 7: Review Without Full IDE Context
Code review without your IDE is code review on hard mode.
The LLM generates a diff. You’re looking at it in a terminal or web interface. You’re missing:
- Syntax highlighting
- Jump-to-definition
- Project-wide search
- Static analysis
- Your configured linters
Example: The LLM renames process() to process_data(). Questions you can't answer:
- What calls this function?
- Is this part of a larger refactoring?
- Did it affect other functions that depend on it?
Tools like Cursor solve this with deep editor integration—the LLM changes happen natively in your IDE. But if you’re using terminal-based LLM tools or trying to integrate with Emacs/Vim, you need a workflow to bring LLM-generated changes into your full development environment.
Problem 8: No Long-Term Memory
Sessions had amnesia. Yesterday’s architectural decisions? Gone. Last week’s patterns? Forgotten.
Sure, I had a global CLAUDE.md file with preferences, but that was static. I couldn’t easily capture evolving patterns like:
- “When working on MCP servers, always check the umcp wrapper patterns”
- “The smoke test paradigm works better than unit tests for these projects”
- “Remember that the memento CLI should never be called directly—use MCP”
These insights lived in my head, not in a form the LLM could access and build upon. Each new session started from zero, unable to leverage the accumulated knowledge from previous sessions.
Problem 9: Parallelization Challenge
I wanted parallel LLM sessions building different parts of the same project. Chaos ensued.
The ideal workflow:
- Session A: implements a feature
- Session B: writes tests for that feature
- Session C: updates documentation
- Session D: reviews the changes from A, B, and C
But coordinating multiple LLM sessions is harder than coordinating humans. Problems:
- Sessions can’t see each other’s progress
- No natural communication channel between sessions
- They’ll happily work on the same file and create conflicts
- No way to express dependencies (Session B needs Session A to finish first)
The solution: orchestration patterns to divide tasks, prevent conflicts, and merge results without manual intervention.
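The core of such orchestration can be sketched as a task manifest where every task declares the files it will touch and the tasks it depends on, and a coordinator only hands out work that is unblocked and conflict-free. The task names and files below are illustrative, not a real scheduler:
```python
# Hypothetical task manifest for parallel sessions. A coordinator hands
# out only tasks whose dependencies are done and whose files aren't
# already claimed by a running session.
TASKS = {
    "feature": {"files": {"src/sync.py"},        "depends_on": set()},
    "tests":   {"files": {"tests/test_sync.py"}, "depends_on": {"feature"}},
    "docs":    {"files": {"README.md"},          "depends_on": {"feature"}},
    "review":  {"files": set(),                  "depends_on": {"feature", "tests", "docs"}},
}

def ready_tasks(done: set[str], in_progress: set[str]) -> list[str]:
    claimed = set()
    for t in in_progress:
        claimed |= TASKS[t]["files"]
    ready = []
    for name, task in TASKS.items():
        if name in done or name in in_progress:
            continue
        if not task["depends_on"] <= done:
            continue  # wait for dependencies (tests need the feature first)
        if task["files"] & claimed:
            continue  # another session already owns one of these files
        ready.append(name)
    return ready

print(ready_tasks(done=set(), in_progress=set()))            # ['feature']
print(ready_tasks(done={"feature"}, in_progress={"tests"}))  # ['docs']
```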
Problem 10: Safety and Access Control
When you’re in flow, you say ‘yes’ to everything. That’s how the LLM reads your private notes.
Claude Code prompts have become like cookie consent banners or Terms of Service pages. You’ve seen the prompt 50 times today. “Do you want to let Claude read this file?” Yes. “Run this command?” Yes. “Search this directory?” Yes. Decision fatigue sets in. You stop reading carefully. You just click yes to make the prompt go away and get back to work.
This is exactly how website designers exploit users with cookie banners—they know after the 10th website, you’ll just click “Accept All” without reading. The same psychological pattern applies to LLM tool use.
I discovered a serious problem when building my note management system. Despite explicit prompts telling the LLM “do NOT access private notes,” I’d occasionally review logs and find it had read private files anyway. This wasn’t malicious—the LLM was trying to be helpful, pattern-matched similar file paths, and I’d reflexively approved the request without carefully reading which specific file it wanted.
Risk areas where this becomes dangerous:
- Personal notes or journals
- Configuration files with API keys or tokens
- Any sensitive data mixed with development work
The fundamental tension:
- Speed vs Safety: Careful review of every action slows you down
- Context vs Control: The LLM needs broad context to be useful, but that increases risk
- Automation vs Oversight: You want automated workflows, but automation can bypass safety checks
The real solution isn’t better logging—it’s making the wrong thing impossible by design. Don’t rely on prompts or careful review. Build systems where sensitive data simply can’t be accessed.
For my note system, I mark notes as PUBLIC in org-mode by setting a property. Only PUBLIC notes are accessible to the LLM via MCP. The system enforces this at the API level—no amount of prompt engineering or reflexive approval can expose private notes.
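In code, that enforcement is nothing more than a filter the MCP server applies before returning anything. A simplified sketch; the exact org property syntax and the regex parsing are assumptions about my setup, not a spec:
```python
# Sketch of API-level enforcement: the note-reading tool refuses anything
# not explicitly marked PUBLIC. The ':PUBLIC: t' property syntax is an
# assumption; a real implementation would use a proper org parser.
import re
from pathlib import Path

PUBLIC_RE = re.compile(r"^\s*:PUBLIC:\s*t\s*$", re.IGNORECASE | re.MULTILINE)

def is_public(path: Path) -> bool:
    return PUBLIC_RE.search(path.read_text(errors="replace")) is not None

def read_note(path: Path) -> str:
    """Only PUBLIC notes ever reach the LLM, no matter what it asks for."""
    if not is_public(path):
        raise PermissionError(f"{path} is not marked PUBLIC")
    return path.read_text()

def list_notes(notes_dir: str) -> list[Path]:
    return [p for p in Path(notes_dir).glob("*.org") if is_public(p)]
```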
But this pattern doesn’t scale well to code. You can’t mark every file in a codebase as PUBLIC or PRIVATE.
A more scalable approach: leverage Unix file permissions. Make LLM tools run as a specific user or group with restricted permissions:
- Private files: chmod 600 (owner-only)
- Public files: chmod 644 (world-readable)
- LLM runs as different user/group: physically cannot read private files
This enforces access control at the OS level. The LLM tool literally can't open the file, regardless of prompts or approval. You could even use chattr +i on Linux to make sensitive files immutable.
The challenge: this requires discipline in setting permissions and may conflict with normal development workflows. But it’s the right direction—making violations impossible, not just logged.
Other needed patterns:
- Directory-level access control (allow ~/projects/blog, block ~/.ssh)
- Pattern-based restrictions (block *.env, *credentials*, *secrets*)
- API-level enforcement that tools can't bypass
- Audit trails that make violations obvious
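A hypothetical sketch of the first two items, a path gate a tool could call before every file read; the allowed roots and block patterns are illustrative defaults, not a real configuration:
```python
# Hypothetical path gate run before any file read: the path must sit
# under an allowed root AND must not match a blocked pattern.
import fnmatch
from pathlib import Path

ALLOWED_ROOTS = [Path("~/projects").expanduser().resolve()]
BLOCKED_PATTERNS = ["*.env", "*credentials*", "*secrets*", "*/.ssh/*"]

def can_read(path: str) -> bool:
    p = Path(path).expanduser().resolve()
    if not any(p.is_relative_to(root) for root in ALLOWED_ROOTS):
        return False  # outside every allowed directory (e.g. ~/.ssh)
    return not any(fnmatch.fnmatch(str(p), pat) for pat in BLOCKED_PATTERNS)

print(can_read("~/projects/blog/post.md"))  # True
print(can_read("~/.ssh/id_rsa"))            # False: outside allowed roots
print(can_read("~/projects/blog/.env"))     # False: matches *.env
```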
Until we solve this systematically, the onus is on us to be vigilant—and that’s exhausting when you’re trying to move fast.
The Solutions
- Ergonomics (Part 2): Terminal integration showing LLM state, telemetry tracking all sessions, logging every command
- Abstractions (Part 3): Shared context between sessions, smoke test paradigm, coordinating parallel LLMs
- Experiments (Part 4): Project exploration tools, diff review workflows, lessons from failures
- Learning (Part 5): Flashcard generation, annotated code worksheets, spaced repetition
The next articles show how each works.
What’s Next
Part 2: Ergonomics and Observability - Terminal integration for managing multiple LLM sessions, telemetry and logging infrastructure that makes everything auditable.
Part 3: Higher-Level Abstractions - Shared context systems for long-term memory, smoke tests as the foundation of quality, patterns for coordinating multiple LLM sessions.
Part 4: The Way We Build Software Is Rapidly Evolving - Tools that became obsolete, workflows that work, and the broader implications of AI-augmented development.
Part 5: Learning & Knowledge - Using LLMs to generate flashcards, worksheets, and heavily-annotated code for studying complex topics.
Continue Reading: Part 2: Ergonomics and Observability →