
The Build-Test-Fix Loop: Automating Development with Agent Pipelines

Human defines requirements. Agents build, test, and fix. You review the result.

12 min read

The Vision

Traditional development:

You → write code → run tests → read errors → fix code → repeat

Agent-driven development:

You → define what you want
         ↓
   Builder Agent → writes code
         ↓
   Tester Agent → runs tests
         ↓
   Builder Agent → fixes failures
         ↓
   Tester Agent → confirms fix
         ↓
You → review final result

The loop runs automatically. You step in only for approval.


The Components

1. Builder Agent

Role: Write and modify code based on requirements and feedback.

Capabilities:

  • Understand specifications
  • Access codebase context
  • Generate implementation
  • Interpret test failures
  • Apply targeted fixes

2. Tester Agent

Role: Validate that code meets requirements. The tester runs and analyzes existing tests; it never writes or fixes code itself.

Capabilities:

  • Run test suites
  • Analyze output
  • Identify failure causes
  • Generate clear failure reports
  • Verify fixes resolve issues

3. Knowledge Base

Role: Maintain the shared context that both agents access.

Contains:

  • Coding standards and patterns
  • Project structure
  • Historical decisions
  • Current requirements
  • Test expectations
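
Taken together, the three components reduce to a small contract. Here is a minimal sketch of that contract as plain JavaScript objects — every name, method signature, and field here is an assumption for illustration, not a fixed API:

```javascript
// Hypothetical shapes for the three components.
// The knowledge base is just shared, structured data both agents read.
const knowledgeBase = {
  patterns: ["prefer async/await over raw promises"], // coding standards
  projectStructure: { src: ["validators/", "routes/"] },
  decisions: ["validation returns { valid, error } objects"],
  requirements: "Add email validation to registration",
  testExpectations: ["invalid format rejected", "duplicates rejected"],
};

// Builder: turns requirements + context into code, and feedback into fixes.
// These bodies are stand-ins for real LLM calls.
const builderAgent = {
  async generate({ requirements, context }) {
    return `// code implementing: ${requirements}`;
  },
  async fix({ currentCode, testFeedback, requirements }) {
    return `${currentCode}\n// patched per: ${testFeedback}`;
  },
};

// Tester: runs tests and reports, but never edits code.
const testerAgent = {
  async validate({ code, requirements }) {
    return { passed: false, output: "1 failing test" }; // stand-in for a test run
  },
  async generateFeedback(testResult) {
    return `Fix the failure: ${testResult.output}`;
  },
};
```

The orchestrator described in the next sections only needs these four methods; swapping the stubs for real model-backed implementations doesn't change the loop's shape.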

The Loop Protocol

┌─────────────────────────────────────────────────────────────┐
│                                                             │
│  1. Requirements In                                         │
│     ↓                                                       │
│  ┌─────────────┐                                           │
│  │   BUILDER   │ ← Context from Knowledge Base             │
│  └─────────────┘                                           │
│     ↓ Code                                                  │
│  ┌─────────────┐                                           │
│  │   TESTER    │ → Run tests, analyze results              │
│  └─────────────┘                                           │
│     ↓                                                       │
│  ┌─────────────────────────────────────────────┐           │
│  │ Tests Pass?                                  │           │
│  │   YES → Done, return result                  │           │
│  │   NO  → Generate feedback for Builder        │           │
│  └─────────────────────────────────────────────┘           │
│     ↓ (if NO)                                               │
│  ┌─────────────┐                                           │
│  │   BUILDER   │ ← Receives failure feedback               │
│  └─────────────┘                                           │
│     ↓ Fixed code                                            │
│  [Loop back to TESTER, max 3 iterations]                    │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Implementation

The Orchestrator

async function buildTestFixLoop(requirements, maxIterations = 3) {
  let code = null;
  let iteration = 0;

  // Initial build
  code = await builderAgent.generate({
    requirements,
    context: await getProjectContext(),
  });

  while (iteration < maxIterations) {
    iteration++;

    // Test
    const testResult = await testerAgent.validate({
      code,
      requirements,
    });

    if (testResult.passed) {
      return {
        success: true,
        code,
        iterations: iteration,
        testOutput: testResult.output,
      };
    }

    // Generate feedback for builder
    const feedback = await testerAgent.generateFeedback(testResult);

    // Fix
    code = await builderAgent.fix({
      currentCode: code,
      testFeedback: feedback,
      requirements,
    });
  }

  // Max iterations reached
  return {
    success: false,
    code,
    iterations: iteration,
    reason: "Max iterations reached",
  };
}

Builder Agent

const builderSystemPrompt = `
You are a code generation agent. Given requirements, you write clean,
tested, production-quality code.

You have access to:
- Project coding standards
- Existing codebase patterns
- Test expectations

When generating code:
1. Follow existing patterns in the codebase
2. Include necessary imports
3. Handle edge cases
4. Write code that tests can verify

When fixing code:
1. Read the test failure carefully
2. Identify the root cause
3. Make minimal changes to fix
4. Don't break existing functionality
`;

Tester Agent

const testerSystemPrompt = `
You are a test validation agent. You run tests and provide clear
feedback on failures.

Your responsibilities:
1. Execute the test suite
2. Parse test output
3. Identify failure causes
4. Generate actionable feedback for the builder

When reporting failures:
- Be specific about what failed
- Include relevant error messages
- Suggest likely causes
- Don't fix the code yourself
`;
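
These prompts only matter once they're wired into actual model calls. A minimal sketch of that wiring, assuming a generic `callModel(systemPrompt, userMessage)` helper — the helper, its signature, and the abbreviated prompts are assumptions; substitute your provider's SDK:

```javascript
// Hypothetical wiring of system prompts into agent objects.
// callModel is a stand-in for any chat-completion API call.
async function callModel(systemPrompt, userMessage) {
  // A real implementation would call your model provider here; stubbed for the sketch.
  return `[model response to: ${userMessage.slice(0, 40)}]`;
}

// Abbreviated versions of the full prompts shown above.
const builderSystemPrompt = "You are a code generation agent...";
const testerSystemPrompt = "You are a test validation agent...";

const builderAgent = {
  generate: ({ requirements, context }) =>
    callModel(
      builderSystemPrompt,
      `Requirements:\n${requirements}\n\nContext:\n${JSON.stringify(context)}`
    ),
  fix: ({ currentCode, testFeedback }) =>
    callModel(
      builderSystemPrompt,
      `Current code:\n${currentCode}\n\nTest feedback:\n${testFeedback}`
    ),
};

const testerAgent = {
  generateFeedback: (testResult) =>
    callModel(testerSystemPrompt, `Test output:\n${testResult.output}`),
};
```

Note that each agent is nothing more than a system prompt plus message formatting — the loop's intelligence lives in the prompts and the context, not the plumbing.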

Real Example: Adding Validation

Input Requirements

Add email validation to the user registration form.
- Email must be valid format
- Email must be unique (not already registered)
- Show clear error messages
- Update tests to cover new validation

Iteration 1: Builder

// Generated by Builder
function validateEmail(email) {
  const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  if (!emailRegex.test(email)) {
    return { valid: false, error: "Invalid email format" };
  }
  return { valid: true };
}

Iteration 1: Tester

FAIL: validateEmail should check uniqueness
  Expected: { valid: false, error: "Email already registered" }
  Received: { valid: true }

Feedback: The implementation only validates email format.
The requirement specifies uniqueness check against existing
registrations. The function needs to accept a database/store
parameter and check if email exists.

Iteration 2: Builder

// Fixed by Builder
async function validateEmail(email, userStore) {
  const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

  if (!emailRegex.test(email)) {
    return { valid: false, error: "Invalid email format" };
  }

  const exists = await userStore.emailExists(email);
  if (exists) {
    return { valid: false, error: "Email already registered" };
  }

  return { valid: true };
}

Iteration 2: Tester

All tests passed.

✓ validateEmail rejects invalid format
✓ validateEmail rejects duplicate email
✓ validateEmail accepts valid unique email

Result

Two iterations. Human reviews final code. Done.


Preventing Infinite Loops

Hard Limits

const MAX_ITERATIONS = 3;
const MAX_TIME = 5 * 60 * 1000; // 5 minutes

async function safeLoop(requirements) {
  const startTime = Date.now();
  let iteration = 0;

  while (iteration < MAX_ITERATIONS) {
    if (Date.now() - startTime > MAX_TIME) {
      return { success: false, reason: "Timeout" };
    }
    // ... loop logic
    iteration++;
  }

  return { success: false, reason: "Max iterations reached" };
}

Change Detection

function detectOscillation(codeHistory) {
  if (codeHistory.length < 3) return false;

  // Check if we're flip-flopping between similar states
  const last = codeHistory[codeHistory.length - 1];
  const thirdLast = codeHistory[codeHistory.length - 3];

  const similarity = calculateSimilarity(last, thirdLast);
  return similarity > 0.95; // Nearly identical = oscillating
}
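
The `calculateSimilarity` helper is left abstract above. One cheap stand-in is line-level Jaccard overlap — purely an assumption; any diff-based or embedding-based metric works:

```javascript
// Jaccard similarity over trimmed, non-empty lines. Near-identical line
// sets across iterations suggest the builder is undoing its own changes.
function calculateSimilarity(a, b) {
  const linesA = new Set(a.split("\n").map((l) => l.trim()).filter(Boolean));
  const linesB = new Set(b.split("\n").map((l) => l.trim()).filter(Boolean));
  const intersection = [...linesA].filter((l) => linesB.has(l)).length;
  const union = new Set([...linesA, ...linesB]).size;
  return union === 0 ? 1 : intersection / union;
}
```

Comparing the last snapshot against the third-to-last (rather than the previous one) is what catches the A→B→A flip-flop pattern.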

Escalation Triggers

const escalationTriggers = {
  sameErrorTwice: ({ errors }) =>
    errors.length >= 2 &&
    errors[errors.length - 1] === errors[errors.length - 2],

  noProgress: ({ testResults }) => {
    if (testResults.length < 3) return false;
    const failingTests = testResults.slice(-3).map((r) => r.failingTests);
    return failingTests.every((f) => f === failingTests[0]);
  },
};

// history = { errors: [...], testResults: [...] }
if (Object.values(escalationTriggers).some((trigger) => trigger(history))) {
  return escalateToHuman(context);
}

Context Is Everything

The quality of the loop depends on the quality of its context:

Good Context

const context = {
  // Coding standards
  patterns: await getPatterns(),

  // Similar implementations
  examples: await findSimilarCode(requirements),

  // Test patterns
  testExamples: await getTestPatterns(),

  // Known pitfalls
  warnings: await getRelevantWarnings(),
};
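
None of these context builders need to be sophisticated on day one. `findSimilarCode`, for instance, can start as keyword scoring over the repo — a naive sketch; the synchronous signature and scoring scheme here are assumptions, and a real system would likely use embeddings:

```javascript
// Naive similarity search: score files by how many requirement keywords
// they contain, and return the most relevant ones.
function findSimilarCode(requirements, files) {
  const keywords = requirements
    .toLowerCase()
    .split(/\W+/)
    .filter((w) => w.length > 3); // skip short stopwords

  return files
    .map((file) => ({
      path: file.path,
      score: keywords.filter((k) => file.content.toLowerCase().includes(k)).length,
    }))
    .filter((f) => f.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, 5); // top 5 most relevant files
}
```

Even this crude ranking gives the builder concrete, in-repo examples to imitate, which matters more than the sophistication of the retrieval.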

Bad Context

const context = {}; // Agent guesses everything

The richer your knowledge base, the better your agents perform.


When to Use This Pattern

Good fits:

  • Well-defined, testable requirements
  • Existing test suite
  • Standard code patterns
  • Routine feature additions

Poor fits:

  • Exploratory development
  • Architectural decisions
  • Performance optimization
  • Security-critical code

The Human Role

You're not eliminated. You're elevated:

Before: Write code, run tests, fix bugs, repeat

After: Define requirements, review solutions, approve changes

Your judgment is still essential. It's now applied at higher leverage.


Getting Started

Week 1: Manual Loop
Run builder, run tests, provide feedback manually. Learn the dynamics.

Week 2: Semi-Automated
Automate test running and feedback generation. Review builder output.

Week 3: Full Loop
Let the loop run. Set iteration limits. Review final results.

Week 4+: Refinement
Improve prompts based on failure patterns. Expand context. Increase autonomy.


The build-test-fix loop isn't the future. It's available today.

The question is whether you'll adopt it before or after your competitors do.

Automate Your Loop

Xtended provides the context infrastructure your agents need. Rich project knowledge leads to better generated code and fewer iterations.

Get Started Free