[HN Gopher] Automating myself out of development
___________________________________________________________________
Automating myself out of development
Author : nisabek
Score : 91 points
Date : 2026-06-09 18:59 UTC (4 days ago)
HTML web link (www.thoughtfultechnologist.com)
TEXT w3m dump (www.thoughtfultechnologist.com)
| noelwelsh wrote:
| I wish people would describe in more detail the tasks they use
| LLMs to code. My experience is that simple components in an
| existing architecture are fine, but anything requiring
| architectural considerations quickly becomes a mess. On my
| projects (e.g. a ui framework), running multiple agents in
| parallel would just increase the speed at which it can stuff up
| the project.
| nullbio wrote:
| It's great for people who are just maintaining something. Less
| so for someone building something from scratch, in the earlier
| phases.
| properbrew wrote:
| I used LLMs to develop Whistle Enterprise
| (https://whistle-enterprise.com) from the ground up, from scratch.
|
| It's taken _a lot_ of time and effort, but this is an example
| of what can be developed using LLMs alone.
|
| You have to have dedication and a goal to reach, but you can
| absolutely build anything if you're building with the right
| foundations in mind.
| ryanackley wrote:
| I think the relevant question isn't what can be built but the
| amount of effort in comparison to doing this the old
| fashioned way.
|
| What do you think the productivity gain was from using an
| LLM? This question assumes you're already an experienced
| developer.
| andai wrote:
| n=1 but, a friend of mine spent the last few months working
| on an experimental music software with Claude. What he
| built is amazing and far beyond my abilities (I have been
| programming for 20 years). He doesn't know any programming.
|
| In fact, it's far beyond what I would even _attempt,_
| because I've just spent two decades building up a data
| bank of how hard things are supposed to be.
|
| He doesn't know it's supposed to be hard, so he just does
| it.
| dmortin wrote:
| Is his code maintainable, though? Or is it just a pile of
| code which happens to work? What if he wants to change
| something? Does he generate again the whole thing from
| scratch? Or does he tell Claude to make the changes and
| doesn't even know when something breaks when a new thing
| is added? (Assuming the software is complex, having
| multiple non trivial features.)
| motoroco wrote:
| There's no free lunch, it takes time and effort still. And
| expertise if you need it to be robust.
|
| In terms of velocity, let me offer some numbers. In 6
| months I generated >150k lines of code and merged 10k PRs
| to ship and iterate on https://plotalong.app
|
| I follow best practices and isolate agents to continuously
| deployed dev environments, semi-manually review PRs and
| gate the release process between multiple protected envs.
| The project is getting close to 500 end-to-end tests in
| Playwright.
|
| That's just working nights and weekends. Before AI, it took
| my team at the office 4 years to produce this much work.
| There are some qualitative differences but the speed and
| results are real
| properbrew wrote:
| Thank you for the assumption, I'm actually not a developer
| at all.
|
| I'm from a hardware / networking / infrastructure
| background. I've had extensive exposure to (web)
| application development as I'm working closely with
| development teams and I do have the bash/powershell
| scripting knowledge.
|
| But honestly, if I tried this "the old fashioned way" it
| probably would have taken me about 6 to 7 years to develop
| that application, that's an optimistic estimate. You really
| do have to have a passion for what you're building, I
| didn't know that voice transcription and local LLMs would
| be such a driving force for me, but it's all I think about,
| so much that I find it hard to go to sleep sometimes.
| leguy wrote:
| neat. I saw the "no bot joins the call". Is it obvious to
| others in the virtual meeting that you are using this tool?
| properbrew wrote:
| Thank you! No they cannot tell. It is your requirement as
| per the laws of your country to notify the other party if
| you're going to use it.
| Npovview wrote:
| There are hour long youtube videos where people explain the
| process by using a complex toy project. Search for them.
| davidcann wrote:
| I built this with 94% written by coding agents:
| https://buildermark.dev/
|
| The complete log of all prompts and commits is here:
| https://demo.buildermark.dev/projects/u020uhEFtuWwPei6z6nbN
| MonstraG wrote:
| It seems that pages 2-5 on
|
| https://demo.buildermark.dev/projects/u020uhEFtuWwPei6z6nbN/...
|
| still show content of page 1
| davidcann wrote:
| Thanks for the report. I messed up the CDN settings. It
| looks fixed now.
| germanptr wrote:
| I get this question a lot, and I found it hard to answer
| briefly, so I ended up writing a longer post about how I work:
|
| https://www.trigosec.com/insights/mob-programming-for-one/
|
| The short version is that I don't let AI agents work
| unsupervised on my code. I treat them like participants in a
| mob programming session instead of autonomous developers.
| Different agents get different roles (implementer, reviewer,
| architect, security reviewer, etc.), and I stay involved
| throughout the process.
|
| I also agree with your point about architecture. Generating
| isolated components is relatively easy; preserving and evolving
| the architectural boundaries across a larger codebase is much
| harder.
|
| We're still missing a good way to express and measure
| architectural quality. Until then, architecture heavy work
| requires much closer supervision than implementation heavy work
| Swizec wrote:
| > We're still missing a good way to express and measure
| architectural quality
|
| Architectural complexity[1]! There's several really good
| papers on this.
|
| Unfortunately it never caught on and we don't have great
| automated tools to spit out a number. Also the majority of
| people just don't care enough. Research in this field kinda
| died out when we invented microservices and started treating
| those as a silver bullet to The Architecture Problem (it's
| not [2])
|
| [1] https://swizec.com/blog/why-taming-architectural-complexity-...
|
| [2] https://youtu.be/y8OnoxKotPQ
| fbrchps wrote:
| Didn't even need to click the YouTube link, I knew it would
| be Krazaam.
| iot_devs wrote:
| > Also the majority of people just don't care enough.
|
| Yet! It is the next frontier and we will need it for agents
| as described in the post to really work
| Swizec wrote:
| > Yet! It is the next frontier
|
| While researching my book I read papers from the 80's
| saying this. If you get a good enough spec and define the
| contracts and architecture, you then just hand off
| implementation to juniors/offshore/etc
|
| So far has not worked. Maybe this time!
| vslira wrote:
| > The short version is that I don't let AI agents work
| unsupervised on my code. I treat them like participants in a
| mob programming session instead of autonomous developers.
|
| I wonder if OS maintainers would have a leg up in defining
| workflows to better leverage this. Of course, OS contributors
| are autonomous developers, but maybe a trick or two might
| transfer across
| TheBigSalad wrote:
| You have to make those architectural decisions and feed them to
| the agents. Be very specific. That's been my experience.
| pipes wrote:
| I found that this guy's stuff has really helped me:
|
| https://youtu.be/-QFHIoCo-Ko?is=FYYdukWluYX3vdQL
|
| Worth a watch.
| pjmlp wrote:
| Me when not trying to meet management expectations, only as
| smarter code completion, formatting code, basic code analysis,
| and helping copy pasting code examples between languages.
|
| Me when meeting management expectations, agent orchestration
| tools like Boomi and Workato calling into tools, doing with AI
| what a few years ago would be done with BPEL.
| amelius wrote:
| I personally limit LLMs to single files only at the moment.
| Self-contained components.
|
| Using LLMs in a larger scope can sometimes work, but it has the
| real risk of turning a project into a mess after which you will
| have to undo the work and lose a lot of time.
|
| Also, using LLMs this way with less clear boundaries will make
| reading and maintaining the code more cumbersome.
| rootusrootus wrote:
| I use this strategy, too. I liken it to limiting the blast
| radius. If the LLM truly fouls things up it's easier to pick
| up the pieces if you keep the scope limited.
| warumdarum wrote:
| The true test challenges should be how far an AI can minimize a
| given fucked up codebase and keep full functionality.
|
| I also think that writing large codebases into a sort of
| functional transformer tree as information compression stage
| would allow them to easier reason about large code bases by
| having a large lossless overview with minimal token usage.
| zem wrote:
| i've been running claude in what the blog calls phase 0 for the
| last 6-7 months. i'm perfectly happy with it, my development
| velocity has increased while i still have a good grasp of the
| entire app, and i've actually been making decent progress with
| web development for a personal project, which is something i've
| bounced off several times in the past. also i do not get stuck
| as often on stuff like "how do i get django to statically serve
| up a js bundle with relative imports" which is more about
| knowing specific APIs of specific frameworks than any feature
| of my code or architecture.
|
| i would not want to go down the "take myself out of the loop"
| path because yes, i do have to micromanage the claude session,
| often course-correcting every commit and then doing large scale
| refactoring every so often. but i'm perfectly happy doing that
| - i see claude as more of a tool than a coder i can hand work
| off to.
| girvo wrote:
| I'm currently using it to do a large migration from one Relay
| environment to another, but this is possible because
|
| 1. We've done it by hand for another route already, which the
| LLM uses as reference
|
| 2. There's a strong validation setup/harness I've set up for it
| with storybooks, and component tests
|
| 3. It's a _mostly_ mechanical transform. Not entirely, as the
| two environments/APIs are not 1:1, but it's close enough
|
| But! I and my team are still reviewing everything _shrug_ it is
| "faster" because I get to have this running while I'm in
| meetings planning other more interesting projects
|
| And this isn't really that many agents in parallel. Yeah,
| plenty of fan-out subagents, but that IMO doesn't count/isn't
| really the same as what others are talking about
| mattmanser wrote:
| I think a problem here is you're overestimating how hard it
| is to rewrite something when you have one example of how to
| do it right. Even in the 2000s, I remember a junior
| essentially rewriting our entire codebase from old school asp
| vbscript to .Net in a few months. A 100 or so pages back
| then.
|
| Your team could have done it pre-AI, but you just thought it
| was hard so you didn't try.
|
| I remember migrating a code base from MySQL to SQL Server in
| the 2010s. I thought it would take me weeks, if not months.
| It took me a couple of days.
|
| Immediately made me sour on the "hot" idea in the 2010s that
| your data layer should be provider agnostic so you could
| switch if you needed to. That was never a real thing, it was
| a made up justification for unnecessary over-engineering, by
| people who had clearly never tried to port an app from one
| data source to another. There are other reasons for a clear
| separation, but switching a few hundred SQL statements is not
| it.
|
| In reality, mechanical ports are not that hard, you can sit
| down, put some music on and blitz it in a few days.
| Programmers just over-estimate how hard they will be.
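To make "mechanical port" concrete, here is a minimal Python sketch of the kind of rewrite involved when moving statements from MySQL to SQL Server. The function name `mysql_to_tsql` is hypothetical and not from the thread. It handles only three dialect differences (backtick identifiers, `NOW()`, and a trailing `LIMIT n` on a plain `SELECT`), and it assumes no backticks inside string literals and no `SELECT DISTINCT`. A real port would need many more cases and a review of every statement.

```python
import re

def mysql_to_tsql(sql):
    """Rewrite a few MySQL-specific constructs into SQL Server (T-SQL) form."""
    # MySQL quotes identifiers with backticks; T-SQL uses square brackets.
    sql = re.sub(r"`([^`]+)`", r"[\1]", sql)
    # MySQL NOW() corresponds to GETDATE() in T-SQL.
    sql = re.sub(r"\bNOW\(\)", "GETDATE()", sql, flags=re.IGNORECASE)
    # A trailing "LIMIT n" becomes "SELECT TOP n" at the start of the statement.
    limit = re.search(r"\s+LIMIT\s+(\d+)\s*;?\s*$", sql, flags=re.IGNORECASE)
    if limit and re.match(r"\s*SELECT\b", sql, flags=re.IGNORECASE):
        sql = sql[:limit.start()]
        sql = re.sub(r"^\s*SELECT\b", "SELECT TOP " + limit.group(1),
                     sql, count=1, flags=re.IGNORECASE)
    return sql

example = "SELECT `id`, `name` FROM `users` WHERE created < NOW() LIMIT 10"
print(mysql_to_tsql(example))
```

Statements that match none of the patterns pass through unchanged, so the output still has to be checked against the target database.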
| LtWorf wrote:
| Architectural considerations are easy. Figuring out what to
| actually do from the super vague requirements is even worse I
| think.
| yieldcrv wrote:
| I don't know if I'm overly critical but there's gotta be a middle
| ground between totally AI pilled people that otherwise have no
| talents, and control freak veteran developers who can't let go
|
| My current process is also using Github projects in a normal
| scrum style way, with many tickets written or fleshed out and
| state managed by the LLM, and it doubling as the memory system
|
| Completely leapfrogging all these other open and closed source
| concoctions and being more effective
|
| But it's effective enough that I don't need OP's final form state
| of still approving everything
|
| Auto-mode is fine. Worktrees are built into Claude Code now. I
| just tell it to classify tickets as sequential or parallel
| possible and spawn subagents to tackle all of the tickets in the
| todo list
|
| They all get their own context window; it's pretty perfect now
|
| in the meantime I work in a couple tabs of Claude Design for
| different flows of any client side app. My philosophy has been
| that devs could pick up graphic and UI/UX design easily, it's just
| still a full time job to make variations of layouts and portray
| their states.
|
| UI/UX is not a full time job anymore.
|
| And I use Claude chat to flesh out aspects of the overall idea
|
| I think you may be overcomplicating your workflow in the
| concluding state.
|
| Overall I agree that planning and intention is now most of the
| time, before a 10 subagent precision strike is initiated
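The "classify tickets as sequential or parallel" step described above can be sketched as a dependency grouping: tickets whose prerequisites are all finished form one wave and can go to subagents at the same time, while later waves wait. This is a minimal Python sketch under assumptions; the function name `plan_waves`, the ticket IDs, and the dependency map are hypothetical, and this is not how Claude Code or any particular tool does it.

```python
def plan_waves(deps):
    """Group tickets into ordered waves.

    deps maps each ticket ID to the ticket IDs it depends on.
    Tickets within one wave have no unmet dependencies, so they can
    run in parallel; the waves themselves run one after another.
    """
    remaining = {ticket: set(needs) for ticket, needs in deps.items()}
    done = set()
    waves = []
    while remaining:
        # A ticket is ready once everything it depends on is done.
        ready = sorted(t for t, needs in remaining.items() if needs <= done)
        if not ready:
            # Either a cycle or a dependency on a ticket not in the map.
            raise ValueError("unresolvable dependencies: " + ", ".join(sorted(remaining)))
        waves.append(ready)
        done.update(ready)
        for ticket in ready:
            del remaining[ticket]
    return waves

# Hypothetical ticket IDs, for illustration only.
tickets = {"T1": [], "T2": [], "T3": ["T1"], "T4": ["T2", "T3"]}
print(plan_waves(tickets))
```

Here T1 and T2 land in the first wave, T3 in the second, and T4 in the third.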
| thi2 wrote:
| There are tons of people, those are just not as vocal.
| nisabek wrote:
| Could be (the overcomplicating part), I'm just not yet
| comfortable losing the mental model of the final application.
| At least not in all types of tickets. Are you not seeing
| that?..
| yieldcrv wrote:
| I focus on one side project at a time, alongside work
| applications
|
| Both are giving me skillsets to excel in the other domain
|
| I watch the subagents, push back on some choices, look at
| commits and glance at pull requests
| ai_fry_ur_brain wrote:
| All these people saying UI/UX is dead, then I see their designs
| and they're absolutely the worst (but they're always swearing
| by how incredible it is).
|
| Sorry, access to an LLM (even if it could center a div reliably
| and make responsive designs, which it can't) does not give you
| taste, intuition or make you good at building user interfaces.
| You people/sloppers have no idea the amount of sweat that gets
| poured into great UX.
|
| It's insulting when you people say these things and I'm not even
| a designer or frontend dev.
|
| I actually think UI/UX designers and devs will be the last to
| fall. I will want beautiful products that were built by
| beautiful minds, that's how you will set yourself apart from the
| slop. And fortunately it will be even easier when 80% of
| everything is half assed cranked out UI by llm design tools.
| The contrast is already glaring.
| yieldcrv wrote:
| I've seen that slop but
|
| Claude Design has barely been out for a month
|
| And it's fulfilled my needs better than v0, lovable,
| playwright via LLM or just iterating in the coding LLM. I've
| worked with graphic designers my whole career and have also
| contracted design agencies to do style guides and collaborate
| on branding and layouts. I've gotten the output that I'm
| looking for with Claude Design
|
| eventually you'll see examples but it's not in my purview to
| publicly link any of my projects as being vibe coded
| ai_fry_ur_brain wrote:
| Lmao claude design sucks ass. You have low standards.
| bluefirebrand wrote:
| > control freak veteran developers who can't let go
|
| It is not control freak behavior to want to be in control when
| you are the one accountable for it if it breaks.
| brcmthrowaway wrote:
| More Yegge tier psychosis.
| gnunicorn wrote:
| Interestingly, despite it being much more detailed and a lot more
| process and procedure than what I currently do - which is more
| akin to the version 0 described, but in parallel - we end up at
| the same final problem: reviews and quality assurance.
|
| I sign off the code I merged, part of company policy but also
| just to be sure it is actually decent. But reviewing has become
| the real draining bottleneck: even with stacked PRs, reviewing a
| total of 5-6k lines is not a 5min job. Even if I brainstormed and set the
| plan, that's really the part that doesn't scale right now for me
| in this. But the author is very shy about that: either the
| changes aren't that big in the end or they trust the process
| enough to review in a more casual manner. Being equally
| untrusting I can't do that ...
| strogonoff wrote:
| Proper review should take longer than writing it yourself,
| because you need to know the correct solution, understand the
| proposed solution, and evaluate the difference between the two.
| When designing it yourself, you just need to know the correct
| solution and write it, and with modern high-level languages and
| IDEs with autocomplete writing it is hardly a bottleneck.
| minihat wrote:
| It is harder to solve a sudoku than verify a solution's
| correctness. I find similar benefits occasionally when coding
| with LLMs.
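The asymmetry mentioned above can be made concrete: checking a completed Sudoku grid takes a few lines, while producing one requires search. This is a minimal Python sketch of the checking half only. The function name `is_valid_solution` and the sample grid are illustrative and not from the thread.

```python
def is_valid_solution(grid):
    """Return True if grid is a correctly completed 9x9 Sudoku."""
    digits = set(range(1, 10))
    if len(grid) != 9 or any(len(row) != 9 for row in grid):
        return False
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in range(0, 9, 3)
        for bc in range(0, 9, 3)
    ]
    # Every row, column, and 3x3 box must contain 1..9 exactly once.
    return all(unit == digits for unit in rows + cols + boxes)

# A valid grid generated from a shifting pattern, used as sample input.
solved = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
print(is_valid_solution(solved))
```

The check works because the rules are fixed and fully known in advance, so the whole specification fits in one function.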
| skydhash wrote:
| Sudoku's constraints are known and easy to build a
| harness for. Software has a more malleable structure. A
| harness is hard to build and the test cases for the
| constraints can be a lot.
| layer8 wrote:
| I disagree under the following circumstances, which in my
| experience is the common case: You don't know from the
| outset all relevant considerations that go into
| implementing something. Coding yourself is an exploration
| process of those considerations. Being shown a finished
| solution doesn't let you see and understand all the
| considerations and the possible options that you'd have
| contemplated when implementing it yourself. When reviewing,
| you still have to do that exploratory thinking to weigh the
| possible options. And the fact that you have to do that
| exploration purely mentally rather than in a process of
| working with code arguably makes it harder (similar to
| contemplating alternative solutions to a Sudoku purely
| mentally, actually).
|
| There rarely is a single correct way of implementing some
| requirement or feature. It's a trade-off between
| compromises, not binary correct or incorrect like a Sudoku
| puzzle. The insights that the exploration give you may even
| lead you to implement something significantly different
| from what you originally set out to.
| strogonoff wrote:
| Imagine sudoku with hundreds of subtle, sometimes mutually
| exclusive rules, and no single valid solution.
|
| This is not about LLMs, by the way. It's about reviewing
| any code, including by a fellow human. It's just that many
| people mistakenly feel like with LLMs they can lower their
| guard and accept even if they have not gone through the
| steps of themselves coming up with their solution and
| comparing it to the one suggested by the LLM.
|
| The reason is that many correctly see proper review as
| _duplicate work_, and while it is justified with another
| human (because it is (A) instructive and (B) reducing bus
| factor) with LLMs most people simply can't be bothered. If
| you personally can, you are a minority.
| nisabek wrote:
| If I'm attentive during spec/plan creation I sort of build this
| "expectation" of what the actual PR will look like, the mental
| model of it. Then it's somewhat easier to review. But the
| mental load is brutal tbh, and still not sure if it's "worth
| it"
| philbo wrote:
| For decades, engineers understood that large code reviews are
| harder than small ones. Out of both politeness and a desire to
| receive better code reviews, we learned to break our large
| changes into smaller chunks. Some engineers took things even
| further and replaced code reviews with pair programming. But
| then LLMs showed up and everyone seems to have forgotten those
| lessons.
|
| They can still be applied now using coding agents, if you're
| willing to push back against the default setup and change your
| mode of thinking a little bit. Of course it doesn't help that
| an entire industry is dedicated to persuading us that
| maximizing token spend is the only way to get shit done.
|
| I appreciate this probably seems like an extremist take, but I
| wrote some more about it here in case there's anybody out there
| who identifies with it:
|
| https://philbooth.me/blog/agentic-coding-and-mental-models
| aocallaghan17 wrote:
| Agree with this completely. This push for more autonomy I
| think is the complete wrong direction for how to use LLMs.
|
| I want less code to maintain not more that I don't even fully
| understand.
|
| I think research and very supervised coding with lots of
| guardrails is the way to actually gain productivity from
| these tools.
| firegodjr wrote:
| I think that's reasonable. My only gripe is that making small
| sets of changes is often faster to do by hand than waiting on
| llm reasoning, so I've found it amounts to very little
| speedup.
| girvo wrote:
| > They can still be applied now using coding agents, if
| you're willing to push back against the default setup and
| change your mode of thinking a little bit. Of course it
| doesn't help that an entire industry is dedicated to
| persuading us that maximizing token spend is the only way to
| get shit done.
|
| Yeah the problem is the executives and managers around us are
| demanding we ship massive features as quickly as possible,
| and I like having a job and _dread_ having to find a new one
| in this market...
| oblio wrote:
| Give them some time to be slapped with the real AI invoices
| once Anthropic, OpenAI IPO.
| pydry wrote:
| >Automating myself out of development
|
| >I want to start by saying that I'm neither an AI-fanatic
|
| Kind of like saying you are a fanatic before saying you aren't.
|
| I don't think there's too much here (e.g. "spec driven
| development") I haven't seen elsewhere.
| general1465 wrote:
| I am completely calm regarding AI and development.
|
| First, nobody sane wants to give their domain IP to
| OpenAI/Anthropic. That's why local AI will eventually prevail and
| flourish, because people who actually have some IP will have no
| problem buying a 10k+ EUR machine to run some pretty good models on
| it. However if your main job is just doing CRUD stuff, then you
| are screwed.
|
| Secondly, hallucination is really the Achilles' heel of every LLM. Sure,
| you can recreate an application which exists in thousands of
| variations on the internet, but the moment you try to go
| more into domain knowledge you will start struggling more and
| more.
|
| Try to make a CAN driver for ESP32: easy, it is probably going to
| work. Try to make a CAN driver for STM32F7xx: now the AI will start
| having problems, but will probably be able to produce something
| that works after a lot of debugging. Now let's make a CAN
| driver for MPC5555. The AI will start writing fairy tales about
| registers which do not exist. All of the processors above have
| reference manuals and sometimes example git repositories
| available on the open internet.
| abletonlive wrote:
| > All of the processors above have reference manuals and sometimes
| example git repositories available on the open internet.
|
| okay? then give it those reference manuals and git repositories? I
| haven't heard of something LLMs can't get around and figure
| out.
| bonoboTP wrote:
| Did you try this by giving it access to the materials? Human
| programmers also don't memorize all this stuff. If this is the
| reason for your calmness it's quite shortsighted.
|
| There are problems when you rely too much on AI generated code,
| but these shallow dismissals are quite annoying.
| duggan wrote:
| > First, nobody sane wants to give their domain IP to
| OpenAI/Anthropic. That's why local AI will eventually prevail
| and flourish, because people who actually have some IP will have
| no problem buying a 10k+ EUR machine to run some pretty good
| models on it. However if your main job is just doing CRUD
| stuff, then you are screwed
|
| Replace OpenAI/Anthropic with AWS and this is not too
| dissimilar to the arguments in 2009 about cloud providers.
|
| It's not that there's nobody for whom this is true, it's just
| that there's enough of everyone else to build an empire with.
| 2001zhaozhao wrote:
| Good writeup. I think the main difference in my workflow is that
| I skipped the sandboxing part and accepted the coding agent
| having access to the entire 24/7 dev machine, so I'm still
| running on worktrees. Also, the "idea enrich" steps in my
| workflow are less formal - I tend to write most details in a
| feature spec myself. I also do my workflow on my own self-hosted
| custom interface which comes with a kanban board for project
| tracking, so I don't need Github. The rest of the workflow looks
| pretty similar.
___________________________________________________________________
(page generated 2026-06-13 23:01 UTC)