Maybe I'm one of an increasingly small group of holdouts in AI-assisted coding, since I insist on reviewing most if not all generated code. I don't think I'll ever go full robot factory, because I've seen too many dumb mistakes. But reviewing really does get overwhelming. Recently I switched from a plan-based workflow to beads, and it feels like it might be a game-changer.
Beads is like a mini command-line issue tracker built for agents. Instead of writing giant markdown plans that inevitably go off the rails, I have the agent open beads (issues) for tasks, each with dependencies and blockers. I can then tell it to pick work from bd ready, which kicks off a full test/develop/review/close cycle complete with review comments, evidence, and verification stages. At the end, I'm left with a much more manageable review artifact that the agent has already checked for obvious footguns, rather than a few sections in the middle of a plan that may or may not be done. Then I can just ask it to pick something else from bd ready and repeat.
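To make the "ready" idea concrete, here is a minimal sketch of how a tracker could decide which tasks are unblocked. It is an assumption about the general concept, not how beads actually implements bd ready; the Issue class, its fields, and the ready function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    """Hypothetical stand-in for a tracked task; not the beads data model."""
    id: str
    title: str
    status: str = "open"  # either "open" or "closed"
    blocked_by: list[str] = field(default_factory=list)  # ids of blocking issues


def ready(issues: dict[str, Issue]) -> list[Issue]:
    """Return open issues whose blockers are all closed."""
    return [
        issue
        for issue in issues.values()
        if issue.status == "open"
        and all(issues[dep].status == "closed" for dep in issue.blocked_by)
    ]


issues = {
    "a": Issue("a", "Add live region support"),
    "b": Issue("b", "Test announcements in Orca", blocked_by=["a"]),
}

print([i.id for i in ready(issues)])  # only "a" is unblocked at first

issues["a"].status = "closed"
print([i.id for i in ready(issues)])  # closing "a" makes "b" ready
```

Closing one task is what surfaces the next, which is why the loop of "pick something ready, finish it, repeat" works without a big up-front plan.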
Neatly, the agent also seems to open issues when it finds bugs or things that might be wrong. And none of this pollutes any human-facing issue trackers, so plans can be broken down into very granular tasks. Paired with jj, I even have it retroactively editing linked changes, as long as those changes haven't been pushed and become immutable.
I also appreciate that it seems to do handoff well. I'm working on porting Paperback to Linux. The agent did a live region implementation, including researching two implementation paths and documenting its research in a ticket. When done, it assigned me another ticket to test the flow in Orca, complete with very specific steps on which hotkeys to test and which announcements to expect.
I hear so many stats on how developers are XX% less effective with AI than they believe they are. I'd be interested in how those stats translate to folks with disabilities. Even with all the workflow ceremony I've created, which basically mimics a ticket implement/review cycle, I feel like I'm working far faster than I could before, and after my review feedback, my work is of comparable or better quality. Maybe it'd be different if I could just skim a syntax-highlighted screen of code for errors, or quickly research D-Bus APIs and libraries. But I can't, so here we are.