
The Build-Test-Fix Loop: Automating Development with Agent Pipelines

Human defines requirements. Agents build, test, and fix. You review the result.

12 min read

The Vision

Traditional development:

You → write code → run tests → read errors → fix code → repeat

Agent-driven development:

You → define what you want
         ↓
   Builder Agent → writes code
         ↓
   Tester Agent → runs tests
         ↓
   Builder Agent → fixes failures
         ↓
   Tester Agent → confirms fix
         ↓
You → review final result

The loop runs automatically. You step in only for approval.


The Components

1. Builder Agent

Role: Write and modify code based on requirements and feedback.

Capabilities:

  • Understand specifications
  • Access codebase context
  • Generate implementation
  • Interpret test failures
  • Apply targeted fixes

2. Tester Agent

Role: Validate that code meets requirements. The tester runs and analyzes existing tests; it never writes or fixes code itself.

Capabilities:

  • Run test suites
  • Analyze output
  • Identify failure causes
  • Generate clear failure reports
  • Verify fixes resolve issues

3. Knowledge Base

Role: Maintain the shared context that both agents access.

Contains:

  • Coding standards and patterns
  • Project structure
  • Historical decisions
  • Current requirements
  • Test expectations
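
Taken together, the three components reduce to a small contract. Here is a minimal sketch of that contract as plain JavaScript objects — every name, method signature, and field here is an assumption for illustration, not a fixed API:

```javascript
// Hypothetical shapes for the three components.
// The knowledge base is just shared, structured data both agents read.
const knowledgeBase = {
  patterns: ["prefer async/await over raw promises"], // coding standards
  projectStructure: { src: ["validators/", "routes/"] },
  decisions: ["validation returns { valid, error } objects"],
  requirements: "Add email validation to registration",
  testExpectations: ["invalid format rejected", "duplicates rejected"],
};

// Builder: turns requirements + context into code, and feedback into fixes.
// These bodies are stand-ins for real LLM calls.
const builderAgent = {
  async generate({ requirements, context }) {
    return `// code implementing: ${requirements}`;
  },
  async fix({ currentCode, testFeedback, requirements }) {
    return `${currentCode}\n// patched per: ${testFeedback}`;
  },
};

// Tester: runs tests and reports, but never edits code.
const testerAgent = {
  async validate({ code, requirements }) {
    return { passed: false, output: "1 failing test" }; // stand-in for a test run
  },
  async generateFeedback(testResult) {
    return `Fix the failure: ${testResult.output}`;
  },
};
```

The orchestrator described in the next sections only needs these four methods; swapping the stubs for real model-backed implementations doesn't change the loop's shape.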

The Loop Protocol

┌─────────────────────────────────────────────────────────────┐
│                                                             │
│  1. Requirements In                                         │
│     ↓                                                       │
│  ┌─────────────┐                                           │
│  │   BUILDER   │ ← Context from Knowledge Base             │
│  └─────────────┘                                           │
│     ↓ Code                                                  │
│  ┌─────────────┐                                           │
│  │   TESTER    │ → Run tests, analyze results              │
│  └─────────────┘                                           │
│     ↓                                                       │
│  ┌─────────────────────────────────────────────┐           │
│  │ Tests Pass?                                  │           │
│  │   YES → Done, return result                  │           │
│  │   NO  → Generate feedback for Builder        │           │
│  └─────────────────────────────────────────────┘           │
│     ↓ (if NO)                                               │
│  ┌─────────────┐                                           │
│  │   BUILDER   │ ← Receives failure feedback               │
│  └─────────────┘                                           │
│     ↓ Fixed code                                            │
│  [Loop back to TESTER, max 3 iterations]                    │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Implementation

The Orchestrator

async function buildTestFixLoop(requirements, maxIterations = 3) {
  let code = null;
  let iteration = 0;

  // Initial build
  code = await builderAgent.generate({
    requirements,
    context: await getProjectContext(),
  });

  while (iteration < maxIterations) {
    iteration++;

    // Test
    const testResult = await testerAgent.validate({
      code,
      requirements,
    });

    if (testResult.passed) {
      return {
        success: true,
        code,
        iterations: iteration,
        testOutput: testResult.output,
      };
    }

    // Generate feedback for builder
    const feedback = await testerAgent.generateFeedback(testResult);

    // Fix
    code = await builderAgent.fix({
      currentCode: code,
      testFeedback: feedback,
      requirements,
    });
  }

  // Max iterations reached
  return {
    success: false,
    code,
    iterations: iteration,
    reason: "Max iterations reached",
  };
}

Builder Agent

const builderSystemPrompt = `
You are a code generation agent. Given requirements, you write clean,
tested, production-quality code.

You have access to:
- Project coding standards
- Existing codebase patterns
- Test expectations

When generating code:
1. Follow existing patterns in the codebase
2. Include necessary imports
3. Handle edge cases
4. Write code that tests can verify

When fixing code:
1. Read the test failure carefully
2. Identify the root cause
3. Make minimal changes to fix
4. Don't break existing functionality
`;

Tester Agent

const testerSystemPrompt = `
You are a test validation agent. You run tests and provide clear
feedback on failures.

Your responsibilities:
1. Execute the test suite
2. Parse test output
3. Identify failure causes
4. Generate actionable feedback for the builder

When reporting failures:
- Be specific about what failed
- Include relevant error messages
- Suggest likely causes
- Don't fix the code yourself
`;
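
These prompts only matter once they're wired into actual model calls. A minimal sketch of that wiring, assuming a generic `callModel(systemPrompt, userMessage)` helper — the helper, its signature, and the abbreviated prompts are assumptions; substitute your provider's SDK:

```javascript
// Hypothetical wiring of system prompts into agent objects.
// callModel is a stand-in for any chat-completion API call.
async function callModel(systemPrompt, userMessage) {
  // A real implementation would call your model provider here; stubbed for the sketch.
  return `[model response to: ${userMessage.slice(0, 40)}]`;
}

// Abbreviated versions of the full prompts shown above.
const builderSystemPrompt = "You are a code generation agent...";
const testerSystemPrompt = "You are a test validation agent...";

const builderAgent = {
  generate: ({ requirements, context }) =>
    callModel(
      builderSystemPrompt,
      `Requirements:\n${requirements}\n\nContext:\n${JSON.stringify(context)}`
    ),
  fix: ({ currentCode, testFeedback }) =>
    callModel(
      builderSystemPrompt,
      `Current code:\n${currentCode}\n\nTest feedback:\n${testFeedback}`
    ),
};

const testerAgent = {
  generateFeedback: (testResult) =>
    callModel(testerSystemPrompt, `Test output:\n${testResult.output}`),
};
```

Note that each agent is nothing more than a system prompt plus message formatting — the loop's intelligence lives in the prompts and the context, not the plumbing.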

Real Example: Adding Validation

Input Requirements

Add email validation to the user registration form.
- Email must be valid format
- Email must be unique (not already registered)
- Show clear error messages
- Update tests to cover new validation

Iteration 1: Builder

// Generated by Builder
function validateEmail(email) {
  const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  if (!emailRegex.test(email)) {
    return { valid: false, error: "Invalid email format" };
  }
  return { valid: true };
}

Iteration 1: Tester

FAIL: validateEmail should check uniqueness
  Expected: { valid: false, error: "Email already registered" }
  Received: { valid: true }

Feedback: The implementation only validates email format.
The requirement specifies uniqueness check against existing
registrations. The function needs to accept a database/store
parameter and check if email exists.

Iteration 2: Builder

// Fixed by Builder
async function validateEmail(email, userStore) {
  const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

  if (!emailRegex.test(email)) {
    return { valid: false, error: "Invalid email format" };
  }

  const exists = await userStore.emailExists(email);
  if (exists) {
    return { valid: false, error: "Email already registered" };
  }

  return { valid: true };
}

Iteration 2: Tester

All tests passed.

✓ validateEmail rejects invalid format
✓ validateEmail rejects duplicate email
✓ validateEmail accepts valid unique email

Result

Two iterations. Human reviews final code. Done.


Preventing Infinite Loops

Hard Limits

const MAX_ITERATIONS = 3;
const MAX_TIME = 5 * 60 * 1000; // 5 minutes

async function safeLoop(requirements) {
  const startTime = Date.now();
  let iteration = 0;

  while (iteration < MAX_ITERATIONS) {
    if (Date.now() - startTime > MAX_TIME) {
      return { success: false, reason: "Timeout" };
    }
    // ... loop logic
    iteration++;
  }

  return { success: false, reason: "Max iterations reached" };
}

Change Detection

function detectOscillation(codeHistory) {
  if (codeHistory.length < 3) return false;

  // Check if we're flip-flopping between similar states
  const last = codeHistory[codeHistory.length - 1];
  const thirdLast = codeHistory[codeHistory.length - 3];

  const similarity = calculateSimilarity(last, thirdLast);
  return similarity > 0.95; // Nearly identical = oscillating
}
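
The `calculateSimilarity` helper is left abstract above. One cheap stand-in is line-level Jaccard overlap — purely an assumption; any diff-based or embedding-based metric works:

```javascript
// Jaccard similarity over trimmed, non-empty lines. Near-identical line
// sets across iterations suggest the builder is undoing its own changes.
function calculateSimilarity(a, b) {
  const linesA = new Set(a.split("\n").map((l) => l.trim()).filter(Boolean));
  const linesB = new Set(b.split("\n").map((l) => l.trim()).filter(Boolean));
  const intersection = [...linesA].filter((l) => linesB.has(l)).length;
  const union = new Set([...linesA, ...linesB]).size;
  return union === 0 ? 1 : intersection / union;
}
```

Comparing the last snapshot against the third-to-last (rather than the previous one) is what catches the A→B→A flip-flop pattern.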

Escalation Triggers

const escalationTriggers = {
  sameErrorTwice: ({ errors }) =>
    errors.length >= 2 &&
    errors[errors.length - 1] === errors[errors.length - 2],

  noProgress: ({ testResults }) => {
    if (testResults.length < 3) return false;
    const failingTests = testResults.slice(-3).map((r) => r.failingTests);
    return failingTests.every((f) => f === failingTests[0]);
  },
};

// history = { errors: [...], testResults: [...] }
if (Object.values(escalationTriggers).some((trigger) => trigger(history))) {
  return escalateToHuman(context);
}

Context Is Everything

The quality of the loop depends on the quality of its context:

Good Context

const context = {
  // Coding standards
  patterns: await getPatterns(),

  // Similar implementations
  examples: await findSimilarCode(requirements),

  // Test patterns
  testExamples: await getTestPatterns(),

  // Known pitfalls
  warnings: await getRelevantWarnings(),
};
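
None of these context builders need to be sophisticated on day one. `findSimilarCode`, for instance, can start as keyword scoring over the repo — a naive sketch; the synchronous signature and scoring scheme here are assumptions, and a real system would likely use embeddings:

```javascript
// Naive similarity search: score files by how many requirement keywords
// they contain, and return the most relevant ones.
function findSimilarCode(requirements, files) {
  const keywords = requirements
    .toLowerCase()
    .split(/\W+/)
    .filter((w) => w.length > 3); // skip short stopwords

  return files
    .map((file) => ({
      path: file.path,
      score: keywords.filter((k) => file.content.toLowerCase().includes(k)).length,
    }))
    .filter((f) => f.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, 5); // top 5 most relevant files
}
```

Even this crude ranking gives the builder concrete, in-repo examples to imitate, which matters more than the sophistication of the retrieval.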

Bad Context

const context = {}; // Agent guesses everything

The richer your knowledge base, the better your agents perform.


When to Use This Pattern

Good fits:

  • Well-defined, testable requirements
  • Existing test suite
  • Standard code patterns
  • Routine feature additions

Poor fits:

  • Exploratory development
  • Architectural decisions
  • Performance optimization
  • Security-critical code

The Human Role

You're not eliminated. You're elevated:

Before: Write code, run tests, fix bugs, repeat

After: Define requirements, review solutions, approve changes

Your judgment is still essential. It's now applied at higher leverage.


Getting Started

Week 1: Manual Loop
Run builder, run tests, provide feedback manually. Learn the dynamics.

Week 2: Semi-Automated
Automate test running and feedback generation. Review builder output.

Week 3: Full Loop
Let the loop run. Set iteration limits. Review final results.

Week 4+: Refinement
Improve prompts based on failure patterns. Expand context. Increase autonomy.


The build-test-fix loop isn't the future. It's available today.

The question is whether you'll adopt it before or after your competitors do.

Automate Your Loop

Xtended provides the context infrastructure your agents need. Rich project knowledge leads to better generated code and fewer iterations.

Get Started Free