Case studies·Feb 20, 2026·10 min

Shipping a B2B SaaS MVP in 7 weeks: a teardown.

A full breakdown of the Kestrel AI build — sprints, decisions, what we'd do differently.

In September 2024, Aman came to us with a working prototype and a problem.

The prototype was a document intelligence tool — a RAG pipeline that let ops teams ask questions of their contracts, SOPs, and vendor agreements in plain English. It worked. He'd been demoing it to potential customers for three months. Every demo got a "yes, we'd pay for this." None of it was production-ready.

He'd been quoted six months by two agencies. He had eight weeks and a firm belief that if it took longer than that, the window would close.

We shipped it in seven weeks. Here's how.

Week 1: Scope before code

The first week was not engineering. It was a 90-minute call, a lot of reading, and a one-page document.

The call revealed three things. First, the prototype's average query latency was 8 seconds. That was not viable. Second, previous agencies had made promises that inflated Aman's expectations of what was feasible in the timeline. Third, the two pilot customers who were waiting had specific requirements the prototype didn't meet: multi-tenancy (their data had to be isolated from each other's) and document processing that didn't require manual upload by the operator.

The one-page scope we produced at the end of week one defined the MVP as:

- Multi-tenant auth with org-level data isolation
- Automated document ingestion (PDF, DOCX, CSV)
- Query interface with source citation
- Usage metering for billing
- Admin dashboard for Aman to manage orgs

That was it. No API docs. No team management. No custom branding. No settings page beyond the essentials. Every feature request that came up in week one that wasn't on that list went to a backlog that we explicitly said we would not look at until after launch.

Aman pushed back on two things. He wanted SSO from day one. He wanted a Slack integration.

We told him that SSO would add two weeks and the pilot customers hadn't asked for it. The Slack integration was a "nice to have" on his side, not a requirement from the customers. He agreed to move them to post-launch. Good decision.

Week 2–3: Design and architecture

Aanya ran the design sprint in parallel with Vikram's architecture decisions. By the time the screens were approved, the infrastructure choices were made and the data model was written.

The architecture decisions that mattered:

The latency problem was the first thing Vikram addressed. The prototype was using OpenAI's text-embedding-ada-002 model and doing cosine similarity on a flat array in Python. Not wrong, but not fast.

We rebuilt the pipeline on pgvector with the text-embedding-3-small model. We pre-processed and chunked documents at ingestion time rather than at query time. Average query latency dropped to 1.2 seconds in week two's testing. That number held in production.
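The pattern is worth spelling out: chunk and embed once at ingestion, store the vectors next to the tenant's rows, and let Postgres do the similarity search at query time. Here's a minimal sketch of that shape, assuming a `chunks` table with a pgvector `embedding` column — the schema, chunk sizes, and helper names are illustrative, not the production code:

```python
from openai import OpenAI  # assumes OPENAI_API_KEY is set in the environment

client = OpenAI()

def chunk(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    # Fixed-size chunking with overlap, done once at ingestion time
    # rather than at query time.
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def embed(texts: list[str]) -> list[list[float]]:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [d.embedding for d in resp.data]

def vec_literal(vec: list[float]) -> str:
    # pgvector accepts the '[x,y,...]' text form with a ::vector cast.
    return "[" + ",".join(str(x) for x in vec) + "]"

def ingest(conn, org_id: str, doc_id: str, text: str) -> None:
    # conn: an open psycopg connection to a database with the pgvector
    # extension and a chunks(org_id, doc_id, body, embedding vector) table.
    bodies = chunk(text)
    vectors = embed(bodies)
    with conn.cursor() as cur:
        for body, vec in zip(bodies, vectors):
            cur.execute(
                "INSERT INTO chunks (org_id, doc_id, body, embedding) "
                "VALUES (%s, %s, %s, %s::vector)",
                (org_id, doc_id, body, vec_literal(vec)),
            )
    conn.commit()

def top_chunks(conn, org_id: str, question: str, k: int = 5):
    # <=> is pgvector's cosine-distance operator; an HNSW or IVFFlat index
    # on the embedding column keeps this fast as the corpus grows.
    qvec = vec_literal(embed([question])[0])
    with conn.cursor() as cur:
        cur.execute(
            "SELECT doc_id, body FROM chunks WHERE org_id = %s "
            "ORDER BY embedding <=> %s::vector LIMIT %s",
            (org_id, qvec, k),
        )
        return cur.fetchall()
```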

Multi-tenancy was handled at the database level using row-level security in Postgres. Every query is automatically scoped to the authenticated organisation. There's no application-level tenant filtering that can be forgotten or misconfigured. If a developer accidentally queries without the right context, they get no results, not another tenant's results.
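As a concrete illustration (not the project's actual schema), an org-isolation policy plus a helper that scopes one transaction to one tenant could look like this — `documents`, `org_isolation`, and `app.current_org` are placeholder names:

```python
import psycopg

# One-time migration: turn on RLS and add the isolation policy.
SETUP_SQL = """
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
ALTER TABLE documents FORCE ROW LEVEL SECURITY;

CREATE POLICY org_isolation ON documents
    USING (org_id = current_setting('app.current_org', true)::uuid);
"""

def query_as_org(conn: psycopg.Connection, org_id: str, sql: str, params=()):
    # set_config(..., true) scopes the setting to this transaction only.
    # If it was never set, current_setting(..., true) returns NULL and the
    # policy matches nothing: zero rows, never another tenant's rows.
    with conn.transaction():
        with conn.cursor() as cur:
            cur.execute("SELECT set_config('app.current_org', %s, true)", (org_id,))
            cur.execute(sql, params)
            return cur.fetchall()
```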

Document processing used a simple Postgres table as a queue (rather than SQS or RabbitMQ) because the volume was predictable and the retry logic was simple. This was the right call. Adding a message queue to the MVP would have been over-engineering.
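For the curious, the usual way to make a Postgres-backed queue safe under concurrent workers is `FOR UPDATE SKIP LOCKED`. A sketch of that worker loop — the `doc_jobs` table, its columns, and the `process` stub are assumptions, not the shipped code:

```python
import time
import psycopg

CLAIM_SQL = """
UPDATE doc_jobs
   SET status = 'processing', started_at = now()
 WHERE id = (
     SELECT id FROM doc_jobs
      WHERE status = 'pending'
      ORDER BY created_at
      FOR UPDATE SKIP LOCKED  -- concurrent workers skip rows already claimed
      LIMIT 1
 )
RETURNING id, payload;
"""

def process(payload) -> None:
    ...  # the actual document-processing step goes here

def worker(conninfo: str) -> None:
    with psycopg.connect(conninfo, autocommit=True) as conn:
        while True:
            with conn.cursor() as cur:
                cur.execute(CLAIM_SQL)
                job = cur.fetchone()
            if job is None:
                time.sleep(2)  # queue drained; poll again shortly
                continue
            job_id, payload = job
            try:
                process(payload)
                status = "done"
            except Exception:
                status = "failed"  # retry logic can flip this back to 'pending'
            with conn.cursor() as cur:
                cur.execute(
                    "UPDATE doc_jobs SET status = %s WHERE id = %s",
                    (status, job_id),
                )
```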

The design decisions that mattered:

Aanya's principle on this project was: if a user has to read a tooltip to understand what a button does, we failed.

The query interface has three elements: a text input, a submit button, and the results. The results show the answer and the source documents it came from, highlighted. That's it. Users in the pilot were making queries within four minutes of being onboarded without any training.

Week 4–7: The build

We ran weekly sprints: start on Monday, demo on Friday.

Aman started inviting pilot prospects to the Friday calls at week four. This was not something we planned for. He'd mentioned the project to a potential customer who asked if they could "see what was being built." That customer became paying pilot number three before we launched.

The build decisions, in order of how important they turned out to be:

The ingestion pipeline took the most time. PDF processing sounds trivial until you encounter the variety of PDFs that clients actually upload: scanned documents, password-protected files, PDFs that are actually images, PDFs with form fields, PDFs with tables that don't have semantic structure.

We used a combination of PyMuPDF for text extraction and a custom classification step that routes documents based on their structure before chunking. The classifier adds about 200ms per document but reduced the number of "why did it answer that way" queries by 60% in testing.
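To give a flavour of the routing step (the real classifier was more involved; the threshold and labels below are made up for the sketch):

```python
import fitz  # PyMuPDF

def classify_and_extract(path: str) -> tuple[str, str | None]:
    doc = fitz.open(path)
    if doc.needs_pass:
        # Password-protected: route to a "request credentials" flow.
        return "password_protected", None
    text = "\n".join(page.get_text() for page in doc)
    # Pages with almost no extractable text are usually scans or image-only
    # PDFs, so route them to OCR instead of chunking garbage.
    if len(text.strip()) < 50 * doc.page_count:
        return "needs_ocr", None
    return "text", text
```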

Usage metering was wired to PostHog events rather than database counters. This was a minor architectural choice that had outsized benefits: Aman could see exactly what queries were being run, at what frequency, by which orgs, in the PostHog dashboard without needing a custom analytics system. When one org was querying at 10x the rate of others, he knew before they hit any limit.
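The wiring for this is small. One way it can look with the PostHog Python client — the event and property names are illustrative, and the exact client API varies by version:

```python
from posthog import Posthog

posthog = Posthog(project_api_key="phc_...", host="https://us.i.posthog.com")

def record_query(org_id: str, user_id: str, latency_ms: int, source_count: int):
    # One event per query; org_id as a property makes per-org frequency
    # and rate anomalies visible straight from the PostHog dashboard.
    posthog.capture(
        distinct_id=user_id,
        event="document_query",
        properties={
            "org_id": org_id,
            "latency_ms": latency_ms,
            "source_count": source_count,
        },
    )
```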

Billing used Stripe's Usage Records API to let charges catch up to actual usage at the end of each billing period, rather than requiring prepayment or credit packs. That was the right model for B2B customers, who don't want to manage credits and expect to be invoiced in arrears.
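With that endpoint, reporting metered usage is one call per event, and Stripe invoices the period's total in arrears. A sketch with placeholder IDs and keys (Stripe has since introduced newer metering APIs, so treat this as the classic shape rather than current best practice):

```python
import time
import stripe

stripe.api_key = "sk_test_..."  # placeholder key

def report_query_usage(subscription_item_id: str, quantity: int = 1) -> None:
    # "increment" adds to the running total for the current billing period;
    # Stripe invoices the total at period end (billing in arrears).
    stripe.SubscriptionItem.create_usage_record(
        subscription_item_id,  # e.g. "si_..." for the metered price
        quantity=quantity,
        timestamp=int(time.time()),
        action="increment",
    )
```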

What we'd do differently

Ship the admin dashboard later. We built Aman a dashboard for managing organisations in week five. He used it once in the first month. A Retool app or even direct Postgres access would have freed up three or four days of engineering time that could have gone into the ingestion pipeline.

Write the onboarding flow earlier. We treated onboarding as a week-six problem. It should have been a week-two design problem. The onboarding flow is the first thing pilot customers see, and the version we shipped was functional but not smooth. Users got through it, but it required a setup call from Aman for the first two customers. That's two hours of his time that better onboarding would have saved.

Involve the pilot customers from week one. Aman invited them to demos from week four. If he'd looped them in from week one — even just sent them the scope document for review — we would have caught two assumption mismatches before they became build decisions.

The numbers

- Weeks to production: 7
- First paying pilot: closed 2 weeks before launch, via a Friday demo
- ARR at 90 days post-launch: $180K
- Average query latency: 1.2 seconds (down from 8 seconds in the prototype)
- Document processing: fully automated (manual in the prototype)
- Downtime in first 60 days: 0

The things that made the difference: the one-page scope that said no to SSO and Slack. The Friday demos that turned into a sales channel. The decision to use pgvector instead of a separate vector database. And building the simplest possible admin interface instead of the comprehensive one.

The simplest version of the thing is almost always the right first version.

Written by
Microsive Studio