Technical Insight

We Open-Sourced Our AI Development Agent — How a 3-Day Feature Became a 4-Hour Sprint

April 16, 2026

BasKorea shares Flow: AGENTS.md rules, F/L/N lanes, explicit approvals, and A-B-C-D stages that lock scope for AI-assisted development.

Hey, I'm dongkyu — lead developer at BasKorea, a marine equipment trading company based in Busan. For the past five months, our team has been running an in-house SaaS product called FlowMate entirely on an AI development agent we call Flow.

Flow isn't a separate tool or framework. It's a setup where Claude Code reads a single rules document — AGENTS.md — and uses it to classify every request, choose a path, and gate each step before moving forward.

This post is for you if any of these sound familiar:

You've handed a feature to an AI agent and gotten back three unrelated changes bundled in
You've tried AI-written PRs + AI code review, but something feels off — "did this actually get validated?"
Your team is small, the backlog is huge, and you need AI to genuinely move the needle, not just generate more code to review
You're a PM, tech lead, or CTO watching diffs go by thinking "wait, did I approve this?"

The short version: we went from 3 days per feature cycle to 4 hours, and I haven't written a line of code myself since February.

A year ago I was copy-pasting code into ChatGPT. This is what the other side looks like.

Part 1. Why We Built This

1-1. The "While We're At It" Problem

There's a specific failure mode that hits every team working with AI agents. It goes like this:

Monday: ticket says "add feature A"
Tuesday: agent decides to refactor B "since it's related"
Wednesday: schema C gets touched "while we're in there"
Thursday: nobody remembers what the original ticket was

AI moves faster than your review cycle. That gap is where "looks good enough, let's merge" creeps in.

A developer survey found that 66% of engineers spend more time fixing AI-generated code than writing from scratch. We hit that wall too.

After three separate incidents of runaway scope, we stopped trying to fix it with more rules and asked a different question: what if we treated every change as a contract?

1-2. What We Tried First (and Why It Didn't Work)

We went through the usual playbook:

PR templates: AI writes the PR, AI reviews the PR → the checklist becomes a rubber stamp
Coding guidelines in a .md file: the agent claims to have read it, then ignores it at the exact moment it matters
Reviewer assignment rules: agents reviewing their own work is just confirmation bias at scale

According to CodeRabbit's analysis, AI-written code reviewed by AI results in 1.7× more issues per PR and 8× more duplicate code.

Stack too many instructions and you hit what we started calling the "instruction overload" effect — the agent stops following any of them properly.

The root problem wasn't the agent's capability. It was that the entity executing the rules had no stake in understanding them.

1-3. The Fix: A Contract, Not More Rules

We put three principles at the top of AGENTS.md — non-negotiable, read before every single task:

Every change requires explicit approval. Only two phrases count: Approved or LGTM.
Stage transitions happen only on explicit human Y/N. Flow can't advance itself.
Scope changes void the current approval. Continuing without re-approval is a contract violation.

Flow reads these three lines before touching anything. Every session, every task.
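
To make the approval contract concrete, here is a minimal Python sketch of the first and third principles. It is an illustration only: Flow itself is a rules document read by the agent, not code. The `Contract` class, its method names, and the case-insensitive matching are assumptions of this sketch, not part of AGENTS.md.

```python
from dataclasses import dataclass
from typing import FrozenSet

# The only two phrases the post says count as approval.
APPROVAL_PHRASES = {"approved", "lgtm"}


def is_explicit_approval(reply: str) -> bool:
    """True only when the whole reply is one of the two approval phrases.

    Assumption: matching ignores case, surrounding whitespace and a
    trailing '.' or '!'. The post does not say how strict matching is.
    """
    return reply.strip().rstrip(".!").lower() in APPROVAL_PHRASES


@dataclass
class Contract:
    """Hypothetical record of what a human has approved."""
    scope: FrozenSet[str]
    approved: bool = False

    def record_reply(self, reply: str) -> None:
        # Anything other than an explicit approval leaves the contract unapproved.
        self.approved = is_explicit_approval(reply)

    def change_scope(self, new_scope: FrozenSet[str]) -> None:
        # Principle 3: a scope change voids the current approval.
        if new_scope != self.scope:
            self.scope = new_scope
            self.approved = False

    def may_execute(self) -> bool:
        return self.approved
```

In this sketch, a reply like "looks good enough, let's merge" never counts as approval, and adding "refactor B" to an approved scope sends the contract back to unapproved until a human signs off again.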

Part 2. How Flow Makes Decisions

2-1. The Full Picture — AGENTS.md Decision Tree

Everything Flow does fits on one diagram.

[Figure: AGENTS.md decision tree, full structure of the Route Selection Gate]

Three things to keep in mind up front:

Read-only requests skip the gate entirely. A gate that blocks everything is a gate nobody uses.
The lane is determined by the nature of the change — not by Flow, not by the developer. You can't talk your way into Fast-Track.
Only Standard Lane runs the full A-B-C-D loop. Fast-Track and Lightweight don't even have a Stage D.

For any change-executing request, Flow always opens with exactly this:

Select a route: Fast-Track (F) / Lightweight (L) / Standard (N)

No variation. Every time.

2-2. What Goes Through the Gate?

Gate-free (Read-only)

Q&A, research, analysis
Log and query lookups
Explanation, read-only code review

Gate required (Change-executing)

Any code or document modification
Implementation-linked tests or builds
DB / query logic changes, commits

Without this split, the gate becomes friction and the team routes around it.
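
The read-only versus change-executing split can be sketched as a small classifier. The action labels below are hypothetical names for the items in the two lists above, and treating unrecognized actions as change-executing is an assumption of this sketch. Only the route prompt string is taken verbatim from the post.

```python
from enum import Enum
from typing import Optional, Set


class RequestKind(Enum):
    READ_ONLY = "read-only"
    CHANGE_EXECUTING = "change-executing"


# Hypothetical action labels, grouped the way the two lists above group them.
READ_ONLY_ACTIONS = {
    "answer_question", "research", "analyze",
    "read_logs", "run_lookup_query",
    "explain", "review_code_read_only",
}
CHANGE_ACTIONS = {
    "modify_code", "modify_doc",
    "run_linked_test", "run_build",
    "change_db_logic", "commit",
}

# Quoted from the post: the fixed opening line for change-executing requests.
ROUTE_PROMPT = "Select a route: Fast-Track (F) / Lightweight (L) / Standard (N)"


def classify(actions: Set[str]) -> RequestKind:
    """Classify a request by the actions it would perform."""
    unknown = actions - CHANGE_ACTIONS - READ_ONLY_ACTIONS
    # Assumption: fail closed, so an unrecognized action goes through the gate.
    if unknown or (actions & CHANGE_ACTIONS):
        return RequestKind.CHANGE_EXECUTING
    return RequestKind.READ_ONLY


def opening_line(actions: Set[str]) -> Optional[str]:
    """Return the fixed route prompt for gated requests, or None if gate-free."""
    if classify(actions) is RequestKind.CHANGE_EXECUTING:
        return ROUTE_PROMPT
    return None
```

A request that mixes explanation with a commit still hits the gate, because one change-executing action is enough to require a route.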

2-3. F / L / N — The Change Determines the Lane

| Lane | When | Required Docs |
|---|---|---|
| F Fast-Track | No logic touched: style, copy, config tweaks | None |
| L Lightweight | New feature, existing API & DB untouched | 1 README |
| N Standard | API contract or DB schema changes | Spec Pack (5 files) |

Every lane has a gate Flow can't bypass. If you try to manually downgrade a Standard change to Lightweight, Flow pushes back.
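
One way to picture "the change determines the lane" is a function that derives the minimum lane from properties of the change and rejects any attempt to go below it. This is a sketch under assumptions: the `Change` fields are hypothetical, mapping "logic touched, API and DB untouched" to Lightweight simplifies the table's "new feature" row, and allowing a stricter lane than required is a choice the post does not state.

```python
from dataclasses import dataclass
from enum import Enum


class Lane(Enum):
    FAST_TRACK = "F"
    LIGHTWEIGHT = "L"
    STANDARD = "N"


# Higher number = stricter lane.
STRICTNESS = {Lane.FAST_TRACK: 0, Lane.LIGHTWEIGHT: 1, Lane.STANDARD: 2}

# Required docs per lane, as listed in the lane table.
REQUIRED_DOCS = {
    Lane.FAST_TRACK: "None",
    Lane.LIGHTWEIGHT: "1 README",
    Lane.STANDARD: "Spec Pack (5 files)",
}


@dataclass(frozen=True)
class Change:
    """Hypothetical description of what a change touches."""
    touches_logic: bool
    touches_api_contract: bool
    touches_db_schema: bool


def minimum_lane(change: Change) -> Lane:
    """Derive the least strict lane the change is allowed to use."""
    if change.touches_api_contract or change.touches_db_schema:
        return Lane.STANDARD
    if change.touches_logic:
        return Lane.LIGHTWEIGHT
    return Lane.FAST_TRACK


def confirm_lane(change: Change, requested: Lane) -> Lane:
    """Accept the requested lane unless it downgrades the change."""
    required = minimum_lane(change)
    if STRICTNESS[requested] < STRICTNESS[required]:
        raise ValueError(
            f"Lane {requested.value} is too light; this change requires {required.value}"
        )
    return requested
```

Requesting Lightweight for a change that touches the API contract raises an error here, which mirrors the pushback described above.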

Part 3. Standard Lane — The A-B-C-D Loop

Standard Lane starts before a single line of code is written. Flow first generates a 5-file Spec Pack, then steps through four stages in order.

| Stage | What Gets Locked | Core Rule |
|---|---|---|
| A Scope Lock | "What are we building?" | In/Out fixed, acceptance criteria frozen |
| B Build & Test | "What did we build?" | Approved scope only, test evidence required |
| C Adversarial Review | "What could break?" | Verdict only; no code changes allowed |
| D Ship Decision | "How do we control this in prod?" | Go/No-Go, rollback plan, commit gate |

Every stage transition is gated on a human Y/N. Flow cannot advance itself.

Stage C is the critical one.

The moment a reviewer says "I found an issue — I'll just fix it while I'm here," the reviewer becomes the implementer. The review is over. You've lost your independent check.

Stage C issues get a verdict, nothing else. Fixes go back to Stage B.
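
The stage rules can be modeled as a tiny state machine. This is a hedged sketch, not Flow's implementation: the `StandardLane` class and its methods are hypothetical, and restricting code edits to Stage B alone is an assumption (the post only states the no-edit rule for Stage C).

```python
from enum import Enum
from typing import List


class Stage(Enum):
    A = "Scope Lock"
    B = "Build & Test"
    C = "Adversarial Review"
    D = "Ship Decision"


ORDER = [Stage.A, Stage.B, Stage.C, Stage.D]


class StandardLane:
    """Hypothetical model of the A-B-C-D loop."""

    def __init__(self) -> None:
        self.stage = Stage.A
        self.edits: List[str] = []
        self.issue_found = False

    def next_stage(self) -> Stage:
        # A Stage C verdict with issues routes the work back to Stage B.
        if self.stage is Stage.C and self.issue_found:
            return Stage.B
        idx = ORDER.index(self.stage)
        return ORDER[min(idx + 1, len(ORDER) - 1)]

    def advance(self, human_answer: str) -> Stage:
        """Move only on an explicit human 'Y'; anything else stays put."""
        if human_answer.strip().upper() == "Y":
            self.stage = self.next_stage()
            self.issue_found = False
        return self.stage

    def edit_code(self, description: str) -> None:
        # Assumption: code changes are legal in Stage B only.
        if self.stage is not Stage.B:
            raise PermissionError(f"No code changes in stage {self.stage.name}")
        self.edits.append(description)

    def record_verdict(self, found_issue: bool) -> None:
        """Stage C produces a verdict and nothing else."""
        if self.stage is not Stage.C:
            raise RuntimeError("Verdicts are recorded in Stage C only")
        self.issue_found = found_issue
```

In this model a reviewer in Stage C cannot call `edit_code` at all. The only way to fix a finding is to record the verdict, get a human Y, and land back in Stage B.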

On top of the stage gates, there are two more checks: a scoring gate (85 → 90 → 92 → 95 points to advance) and a hard gate (a single Critical finding blocks everything). The full scoring rubric will be published with the upcoming AGENTS.md release.
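
As a rough illustration of how a scoring gate and a hard gate could combine, here is a short Python sketch. The post gives the four thresholds but not the rubric, so the one-threshold-per-stage mapping, the `>=` comparison, and the `Finding` type are all assumptions here.

```python
from dataclasses import dataclass
from typing import Iterable

# Assumption: the four thresholds apply to stages A, B, C, D in that order.
THRESHOLDS = {"A": 85, "B": 90, "C": 92, "D": 95}


@dataclass(frozen=True)
class Finding:
    """Hypothetical review finding; only 'Critical' is named in the post."""
    severity: str
    note: str


def passes_gates(stage: str, score: int, findings: Iterable[Finding]) -> bool:
    """Return True when neither the hard gate nor the scoring gate blocks."""
    # Hard gate: one Critical finding blocks everything, whatever the score.
    if any(f.severity == "Critical" for f in findings):
        return False
    # Scoring gate: assumed to pass when the score meets the stage threshold.
    return score >= THRESHOLDS[stage]
```

Passing both gates would still not advance the stage by itself in this model, since every transition also waits for the human Y/N.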

Part 4. What This Looks Like in Practice

4-1. The 4-Hour Sprint — Workflow & Stage Mapping

Here's how a Standard Lane feature actually runs. The "What Happens" column is the real-world workflow. The "Stage Gate" column is what's happening inside the stage gates.

| # | What Happens | Stage Gate |
|---|---|---|
| 1 | Record the meeting or user request | None (requirements gathering) |
| 2 | Flow converts recording → structured MD | None (requirements writing) |
| 3 | Flow generates Spec Pack (5 files) | Route Gate → N confirmed / Stage A enters |
| 4 | Backend sends api-spec.md to frontend | Stage A contract exchange |
| 5 | Frontend agent flags issues and questions | Stage A verification |
| 6 | Backend agent reviews questions → updates spec | Stage A re-confirmation |
| 7 | Frontend signs off → both agents build & push | Approved/LGTM → Stage B enters |
| 8 | Push → staging auto-deploys → manual QA | Stage C verdict |
| 9 | QA passes → production deploy | Stage D Go/No-Go → commit |
| 10 | User feedback → next cycle | Post feature-unit commit |

Why 3 days became 4 hours.

The speed gain didn't come from faster implementation (step 7). It came from steps 3–6 — the contract phase.

Before this system, spec ambiguity would surface during implementation. We'd have to loop back, and every loop back burned the session context we'd built up.

Lock the contract at the front, and the back half of the workflow stops blowing up.

4-2. How We Got Here

This wasn't designed upfront. I spent a Lunar New Year holiday doing what I've started calling a "solo hackathon" — pulling apart well-known open-source agent architectures on GitHub, rebuilding from scratch, and not stopping until it was production-ready.

| When | What Changed |
|---|---|
| Nov 2025 | Introduced Codex (no rules, no structure) |
| Dec 2025 | Started reverse-engineering open-source agent architectures |
| Jan 2026 | Beta version deployed to production codebase |
| Feb 2026 (holiday) | Solo hackathon: beta rebuilt to production quality |
| Feb 2026 (post-holiday) | Switched to 100% agent-driven development |
| Today | Flow running in production |

"A contract matters more than the code. Scripts enforce rules. Contracts make rules understood." — dongkyu

Over the coming posts, I'll walk through how each stage actually runs in practice and how the system evolved to where it is today.

Key Takeaways

The risk with AI development isn't speed — it's approval and scope becoming invisible
An overly aggressive gate gets bypassed — the Read-only exemption is why Flow actually gets used
Reviewer and implementer can't occupy the same stage — Stage C's no-edit rule enforces this hard

AGENTS.md v1.4 — Full Release Coming Soon

The complete AGENTS.md file will be published in a follow-up post once I've cleaned up the annotations.

BasKorea dev team motto

"The first user is always the salesperson at the next desk." — dongkyu
