Lucas Smith

Software Engineer

A Year's Work in Two Weeks

December 20th, 2025. Another malformed PDF. Another parsing error. Another hour of my life I won't get back.

At Documenso, we process a lot of documents. Hundreds of thousands of signatures, all flowing through our platform (you can see our live metrics). At that volume, edge cases aren't edge cases anymore. They're Tuesday.

The JavaScript PDF ecosystem has some great libraries, each excellent at what they do.

  • PDF.js is brilliant for rendering.
  • pdf-lib has a clean, intuitive API.
  • pdfkit generates beautiful documents.

But none of them do everything, and the ones that parse often struggle with the messy, malformed PDFs that exist in the real world. We'd swapped to a semi-maintained fork of pdf-lib, but we knew the core parsing layer wasn't robust enough for what we needed.

We'd always talked about building our own library, but the mental math never worked out. Six months of full-time effort, minimum. Time better spent improving Documenso itself.

Then I thought: what if the math has changed?

The Port That Worked (Sort Of)

I fired up Claude Opus 4.5 with a simple idea: port Apache PDFBox from Java to TypeScript. PDFBox is battle-tested. It handles the weird stuff. If I could just translate it, I'd have a robust foundation.

Two weeks of churning through tokens later, I had... something. The library was technically ported. Tests passed. But the code felt dicey.

Java-isms don't translate cleanly to TypeScript. Circular dependencies that Java handles gracefully become clumsy await import(...) calls. Function overloading based on argument types is idiomatic in Java but awkward in TypeScript. The patterns that make PDFBox robust made the port feel brittle.
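
For a sense of what that looks like in practice, here's a minimal sketch (module and class names are hypothetical, not the port's real ones) of the dynamic-import workaround a Java-style circular dependency forces on an ESM TypeScript codebase:

```typescript
// dictionary.ts (hypothetical module names, illustrating the pattern only)
import type { PDFStream } from './stream';

export class PDFDictionary {
  private entries = new Map<string, unknown>();

  set(key: string, value: unknown): void {
    this.entries.set(key, value);
  }

  // In Java, PDFDictionary and PDFStream can reference each other freely.
  // In an ESM TypeScript port the same cycle forces a lazy dynamic import,
  // which turns a plain constructor call into an async hop.
  async toStream(data: Uint8Array): Promise<PDFStream> {
    const { PDFStream } = await import('./stream');
    return new PDFStream(this, data);
  }
}
```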

The neat part is that the port actually worked: tests passed, the library was functional. But I didn't trust it. That gut sense that something is off even when the tests are green, that's the part AI can't provide. That's taste.

The Pivot

So I scrapped it. Two weeks of work, gone. When you're less attached to the labour that produced the code, throwing it away gets easier. The sunk cost fallacy loses its grip.

But the work wasn't wasted. The insight was this: I didn't need PDFBox's code. I needed its knowledge. PDFBox and PDF.js have years of accumulated wisdom about malformed documents, weird edge cases, and spec violations that exist in the wild. That wisdom is encoded in their test fixtures.

The new approach: build from scratch, but validate against their fixtures. Write our own parser with a clean TypeScript-first architecture, then prove it handles the same edge cases the battle-tested libraries handle. Use their test suites as our acceptance criteria.
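
In practice that validation can be as blunt as a test that loops over the borrowed fixtures and asserts the parser opens each one. A rough sketch, assuming a fixtures directory and a parseDocument entry point that are illustrative rather than LibPDF's actual API:

```typescript
import { test } from 'node:test';
import assert from 'node:assert/strict';
import { readFile, readdir } from 'node:fs/promises';
import { join } from 'node:path';

// Hypothetical entry point for the parser under test.
import { parseDocument } from '../src/parser';

// Malformed and edge-case PDFs borrowed from the PDFBox and PDF.js test suites.
const FIXTURE_DIR = 'fixtures/reference';

test('opens every reference fixture without throwing', async () => {
  const files = (await readdir(FIXTURE_DIR)).filter((f) => f.endsWith('.pdf'));
  for (const file of files) {
    const bytes = await readFile(join(FIXTURE_DIR, file));
    const doc = await parseDocument(new Uint8Array(bytes));
    assert.ok(doc.pageCount > 0, `${file} parsed but reported no pages`);
  }
});
```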

This required judgment AI couldn't provide. The model could port code all day, but deciding not to port, to extract the validation strategy instead of the implementation, that was a human call.

The Workflow

With the new direction set, I broke everything into plans:

├── GOALS.md
├── ARCHITECTURE.md
├── plans/
│   ├── 001-scanner.md
│   ├── 002-pdf-objects.md
│   ├── 003-token-reader.md
│   ├── ...
│   ├── 035-text-extraction.md
│   ├── 036-documentation.md
│   └── 037-annotation-support.md

Thirty-seven incremental plans. Each one a discrete chunk of functionality: the lexer, the object model, xref parsing, encryption, digital signatures, form handling, font embedding.

The loop looked like this:

1. Spec the plan together. I'd sit with the agent and discuss what the API should feel like. What are the edge cases? What patterns from PDFBox or PDF.js should we follow? We'd run exploration tasks against the reference libraries to understand how they handled specific scenarios.

2. Interview for details. Once I was happy with the direction, I'd use the interview pattern to capture details I wouldn't have otherwise considered. This might be the most important development in coding agent workflows. So often people get skill-diffed trying to write code with AI simply because they can't express themselves clearly enough to get a reasonable result. Interviews solve this by coercing context from the user. The model asks you questions until it has what it needs.

3. Implement autonomously. With a solid spec, I'd kick off implementation using a structured prompt and let the agent run with it.

4. Review everything. This is where taste comes back in. I'd review every implementation personally, checking that the code was correct and that the APIs felt right. "Correct" meaning functional and not off the rails. Models have a tendency to be overly clever. Smart solutions that are hard to maintain. Abstractions that don't earn their complexity.

We repeated this for all thirty-seven plans, occasionally tearing out implementations in favor of new patterns that emerged during review.

The Taste Problem

This workflow differs from the currently trending "Ralph loop" where an agent runs autonomously until it's completed a significant chunk of work. I like the concept, but I think it falls apart for anything non-trivial.

The problem is taste.

An agent can run tests. It can fix lint errors. It can even refactor code that doesn't meet style guidelines. But it can't tell you when an abstraction is wrong. It can't feel that a function is doing too much, or that an API will be confusing to future users, or that a "clever" solution is actually a maintenance burden.

During LibPDF's development, I caught countless instances of the model being too smart. Overly generic solutions where specific ones would be clearer. Premature abstractions. Code that worked perfectly but would be inscrutable in six months.

That's not a failure of the model. It's doing exactly what it's designed to do: produce working code that satisfies the spec. The spec just can't capture everything. Taste fills the gap.

The Result

LibPDF is now a real library. MIT licensed, already in use at Documenso. It parses documents pdf-lib would reject. It supports encryption (RC4, AES-128, AES-256), digital signatures (PAdES B-B through B-LTA), form filling and flattening, font embedding with subsetting, incremental saves that preserve existing signatures.

The philosophy that emerged: be lenient. Real-world PDFs are messy. Export a document through three different tools and you'll get three slightly different interpretations of the spec. LibPDF prioritizes opening your document over strict compliance. When standard parsing fails, it falls back to brute-force recovery, scanning the entire file to rebuild the structure.
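
That recovery technique is a well-established one rather than anything novel; a simplified sketch of the general idea (not LibPDF's actual implementation) is to scan the raw bytes for object headers and rebuild the index from whatever turns up:

```typescript
interface RecoveredObject {
  objectNumber: number;
  generation: number;
  offset: number; // byte offset of the "N G obj" header
}

// Simplified recovery pass: walk the whole file looking for object headers.
// A real parser also has to skip stream contents, prefer the latest copy of
// duplicated objects, and re-derive the trailer, but the core idea is this.
function rebuildXref(bytes: Uint8Array): RecoveredObject[] {
  // latin1 maps one byte to one character, so string indices equal byte offsets.
  const text = new TextDecoder('latin1').decode(bytes);
  const header = /(\d+)\s+(\d+)\s+obj\b/g;
  const objects: RecoveredObject[] = [];

  for (const match of text.matchAll(header)) {
    objects.push({
      objectNumber: Number(match[1]),
      generation: Number(match[2]),
      offset: match.index ?? 0,
    });
  }
  return objects;
}
```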

Would this have been possible without AI? Eventually. But we'd estimated six months of full-time work. I did it over Christmas break, in the margins between family time and everything else.

I must admit I was consumed by it. Lost track of time writing specs, reviewing code, feeling out the library as it took shape. That flow state, the one where you look up and hours have passed, I hadn't felt it in a while. AI didn't replace that feeling. It amplified it. Removed the friction so I could stay in the zone.

What This Means

The math has changed. Projects that lived in the "someday" pile because the effort didn't justify the outcome, they're worth revisiting. The force multiplier is real.

But so is the bottleneck. The constraint isn't token throughput or model capability. It's taste. Your judgment about what's good, what's right, what will hold up over time. That's the part that doesn't scale.

AI gives you leverage. What you do with it is still up to you.
