hngopher.com/1/live/items/48465890

       [HN Gopher] Automating myself out of development
       ___________________________________________________________________
        
       Automating myself out of development
        
       Author : nisabek
       Score  : 91 points
       Date   : 2026-06-09 18:59 UTC (4 days ago)
        
  HTML web link (www.thoughtfultechnologist.com)
        
       | noelwelsh wrote:
       | I wish people would describe in more detail the tasks they use
       | LLMs to code. My experience is that simple components in an
       | existing architecture are fine, but anything requiring
       | architectural considerations quickly becomes a mess. On my
       | projects (e.g. a ui framework), running multiple agents in
       | parallel would just increase the speed at which it can stuff up
       | the project.
        
         | nullbio wrote:
         | It's great for people who are just maintaining something. Less
         | so for someone building something from scratch, in the earlier
         | phases.
        
         | properbrew wrote:
         | I used LLMs to develop Whistle Enterprise
         | (https://whistle-enterprise.com) from the ground up, from scratch.
         | 
         | It's taken _a lot_ of time and effort, but this is an example
         | of what can be developed using LLMs alone.
         | 
         | You have to have dedication and a goal to reach, but you can
         | absolutely build anything if you're building with the right
         | foundations in mind.
        
           | ryanackley wrote:
           | I think the relevant question isn't what can be built but the
           | amount of effort in comparison to doing this the old
           | fashioned way.
           | 
           | What do you think the productivity gain was from using an
           | LLM? This question assumes you're already an experienced
           | developer.
        
             | andai wrote:
             | n=1 but, a friend of mine spent the last few months working
             | on an experimental music software with Claude. What he
             | built is amazing and far beyond my abilities (I have been
             | programming for 20 years). He doesn't know any programming.
             | 
             | In fact, it's far beyond what I would even _attempt,_
             | because I've just spent two decades building up a data
             | bank of how hard things are supposed to be.
             | 
             | He doesn't know it's supposed to be hard, so he just does
             | it.
        
               | dmortin wrote:
               | Is his code maintainable, though? Or is it just a pile of
               | code which happens to work? What if he wants to change
               | something? Does he regenerate the whole thing from
               | scratch? Or does he tell Claude to make the changes
               | without even knowing when adding a new thing breaks
               | something else? (Assuming the software is complex, with
               | multiple non-trivial features.)
        
             | motoroco wrote:
             | There's no free lunch, it takes time and effort still. And
             | expertise if you need it to be robust.
             | 
             | In terms of velocity, let me offer some numbers. In 6
             | months I generated >150k lines of code and merged 10k PRs
             | to ship and iterate on https://plotalong.app
             | 
             | I follow best practices and isolate agents to continuously
             | deployed dev environments, semi-manually review PRs and
             | gate the release process between multiple protected envs.
             | The project is getting close to 500 end-to-end tests in
             | Playwright.
             | 
             | That's just working nights and weekends. Before AI, it took
             | my team at the office 4 years to produce this much work.
             | There are some qualitative differences but the speed and
             | results are real
        
             | properbrew wrote:
             | Thank you for the assumption, I'm actually not a developer
             | at all.
             | 
             | I'm from a hardware / networking / infrastructure
             | background. I've had extensive exposure to (web)
             | application development as I'm working closely with
             | development teams and I do have the bash/powershell
             | scripting knowledge.
             | 
             | But honestly, if I tried this "the old fashioned way" it
             | probably would have taken me about 6 to 7 years to develop
             | that application, and that's an optimistic estimate. You
             | really do have to have a passion for what you're building.
             | I didn't know that voice transcription and local LLMs would
             | be such a driving force for me, but it's all I think about,
             | so much that I find it hard to go to sleep sometimes.
        
           | leguy wrote:
           | neat. I saw the "no bot joins the call". Is it obvious to
           | others in the virtual meeting that you are using this tool?
        
             | properbrew wrote:
             | Thank you! No, they cannot tell. It is your responsibility,
             | as per the laws of your country, to notify the other party
             | if you're going to use it.
        
         | Npovview wrote:
         | There are hour-long YouTube videos where people explain the
         | process by using a complex toy project. Search for them.
        
         | davidcann wrote:
         | I built this with 94% written by coding agents:
         | https://buildermark.dev/
         | 
         | The complete log of all prompts and commits is here:
         | https://demo.buildermark.dev/projects/u020uhEFtuWwPei6z6nbN
        
           | MonstraG wrote:
           | It seems that pages 2-5 on
           | 
           | https://demo.buildermark.dev/projects/u020uhEFtuWwPei6z6nbN/...
           | 
           | still show content of page 1
        
             | davidcann wrote:
             | Thanks for the report. I messed up the CDN settings. It
             | looks fixed now.
        
         | germanptr wrote:
         | I get this question a lot, and I found it hard to answer
         | briefly, so I ended up writing a longer post about how I work:
         | 
         | https://www.trigosec.com/insights/mob-programming-for-one/
         | 
         | The short version is that I don't let AI agents work
         | unsupervised on my code. I treat them like participants in a
         | mob programming session instead of autonomous developers.
         | Different agents get different roles (implementer, reviewer,
         | architect, security reviewer, etc.), and I stay involved
         | throughout the process.
         | 
         | I also agree with your point about architecture. Generating
         | isolated components is relatively easy; preserving and evolving
         | the architectural boundaries across a larger codebase is much
         | harder.
         | 
         | We're still missing a good way to express and measure
         | architectural quality. Until then, architecture-heavy work
         | requires much closer supervision than implementation-heavy work.
        
           | Swizec wrote:
           | > We're still missing a good way to express and measure
           | architectural quality
           | 
           | Architectural complexity[1]! There are several really good
           | papers on this.
           | 
           | Unfortunately it never caught on and we don't have great
           | automated tools to spit out a number. Also the majority of
           | people just don't care enough. Research in this field kinda
           | died out when we invented microservices and started treating
           | those as a silver bullet to The Architecture Problem (it's
           | not [2])
           | 
           | [1] https://swizec.com/blog/why-taming-architectural-complexity-...
           | 
           | [2] https://youtu.be/y8OnoxKotPQ
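
           A rough sketch of what "spitting out a number" could look like.
           It assumes propagation cost, one measure from the architectural
           complexity literature: the fraction of module pairs linked by a
           direct or indirect dependency. Whether this is the measure the
           linked post has in mind is an assumption, and the module names
           and graphs below are hypothetical.

```python
def propagation_cost(deps):
    """Fraction of (module, module) pairs linked by a direct or indirect dependency."""
    modules = sorted(set(deps) | {d for targets in deps.values() for d in targets})
    if not modules:
        return 0.0
    reachable_pairs = 0
    for start in modules:
        seen = {start}  # a module is counted as reaching itself
        stack = [start]
        while stack:
            for nxt in deps.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        reachable_pairs += len(seen)
    return reachable_pairs / len(modules) ** 2


# Hypothetical module graphs: a layered design and a tangled one.
layered = {"ui": ["service"], "service": ["storage"], "storage": []}
tangled = {"ui": ["service", "storage"], "service": ["ui"], "storage": ["ui"]}

print(propagation_cost(layered), propagation_cost(tangled))
```

           A lower value means a change in one module can reach fewer
           others; the tangled graph scores higher than the layered one.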
        
             | fbrchps wrote:
             | Didn't even need to click the YouTube link, I knew it would
             | be KRAZAM.
        
             | iot_devs wrote:
             | > Also the majority of people just don't care enough.
             | 
             | Yet! It is the next frontier, and we will need it for agents
             | like the ones described in the post to really work
        
               | Swizec wrote:
               | > Yet! It is the next frontier
               | 
               | While researching my book I read papers from the 1980s
               | saying this. If you get a good enough spec and define the
               | contracts and architecture, you then just hand off
               | implementation to juniors/offshore/etc
               | 
               | So far it has not worked. Maybe this time!
        
           | vslira wrote:
           | > The short version is that I don't let AI agents work
           | unsupervised on my code. I treat them like participants in a
           | mob programming session instead of autonomous developers.
           | 
           | I wonder if OS maintainers would have a leg up in defining
           | workflows to better leverage this. Of course, OS contributors
           | are autonomous developers, but maybe a trick or two might
           | transfer across
        
         | TheBigSalad wrote:
         | You have to make those architectural decisions and feed them to
         | the agents. Be very specific. That's been my experience.
        
         | pipes wrote:
         | I found that this guy's stuff has really helped me:
         | 
         | https://youtu.be/-QFHIoCo-Ko?is=FYYdukWluYX3vdQL
         | 
         | Worth a watch.
        
         | pjmlp wrote:
         | Me when not trying to meet management expectations: only as
         | smarter code completion, for formatting code, basic code
         | analysis, and help with copy-pasting code examples between
         | languages.
         | 
         | Me when meeting management expectations: agent orchestration
         | tools like Boomi and Workato calling into tools, doing with AI
         | what a few years ago would have been done with BPEL.
        
         | amelius wrote:
         | I personally limit LLMs to single files only at the moment.
         | Self-contained components.
         | 
         | Using LLMs in a larger scope can sometimes work, but it has the
         | real risk of turning a project into a mess after which you will
         | have to undo the work and lose a lot of time.
         | 
         | Also, using LLMs this way with less clear boundaries will make
         | reading and maintaining the code more cumbersome.
        
           | rootusrootus wrote:
           | I use this strategy, too. I liken it to limiting the blast
           | radius. If the LLM truly fouls things up it's easier to pick
           | up the pieces if you keep the scope limited.
        
         | warumdarum wrote:
         | The true test challenges should be how far an AI can minimize a
         | given fucked up codebase and keep full functionality.
         | 
         | I also think that writing large codebases into a sort of
         | functional transformer tree as an information compression
         | stage would let them reason more easily about large codebases
         | by having a large lossless overview with minimal token usage.
        
         | zem wrote:
         | i've been running claude in what the blog calls phase 0 for the
         | last 6-7 months. i'm perfectly happy with it, my development
         | velocity has increased while i still have a good grasp of the
         | entire app, and i've actually been making decent progress with
         | web development for a personal project, which is something i've
         | bounced off several times in the past. also i do not get stuck
         | as often on stuff like "how do i get django to statically serve
         | up a js bundle with relative imports" which is more about
         | knowing specific APIs of specific frameworks than any feature
         | of my code or architecture.
         | 
         | i would not want to go down the "take myself out of the loop"
         | path because yes, i do have to micromanage the claude session,
         | often course-correcting every commit and then doing large scale
         | refactoring every so often. but i'm perfectly happy doing that
         | - i see claude as more of a tool than a coder i can hand work
         | off to.
        
         | girvo wrote:
         | I'm currently using it to do a large migration from one Relay
         | environment to another, but this is possible because
         | 
         | 1. We've done it by hand for another route already, which the
         | LLM uses as reference
         | 
         | 2. There's a strong validation setup/harness I've set up for it
         | with storybooks and component tests
         | 
         | 3. It's a _mostly_ mechanical transform. Not entirely, as the
         | two environments/APIs are not 1:1, but it's close enough
         | 
         | But! I and my team are still reviewing everything _shrug_ it is
         | "faster" because I get to have this running while I'm in
         | meetings planning other more interesting projects
         | 
         | And this isn't really that many agents in parallel. Yeah,
         | plenty of fan-out subagents, but that IMO doesn't count/isn't
         | really the same as what others are talking about
        
           | mattmanser wrote:
           | I think a problem here is you're overestimating how hard it
           | is to rewrite something when you have one example of how to
           | do it right. Even in the 2000s, I remember a junior
           | essentially rewriting our entire codebase from old school asp
           | vbscript to .Net in a few months. A hundred or so pages back
           | then.
           | 
           | Your team could have done it pre-AI, but you just thought it
           | was hard so you didn't try.
           | 
           | I remember migrating a code base from MySQL to SQL Server in
           | the 2010s. I thought it would take me weeks, if not months.
           | It took me a couple of days.
           | 
           | Immediately made me sour on the "hot" idea in the 2010s that
           | your data layer should be provider agnostic so you could
           | switch if you needed to. That was never a real thing, it was
           | a made up justification for unnecessary over-engineering, by
           | people who had clearly never tried to port an app from one
           | data source to another. There are other reasons for a clear
           | separation, but switching a few hundred SQL statements is not
           | one of them.
           | 
           | In reality, mechanical ports are not that hard, you can sit
           | down, put some music on and blitz it in a few days.
           | Programmers just over-estimate how hard they will be.
        
         | LtWorf wrote:
         | Architectural considerations are easy. Figuring out what to
         | actually do from the super vague requirements is even worse I
         | think.
        
       | yieldcrv wrote:
       | I don't know if I'm overly critical but there's gotta be a middle
       | ground between totally AI pilled people that otherwise have no
       | talents, and control freak veteran developers who can't let go
       | 
       | My current process is also using Github projects in a normal
       | scrum style way, with many tickets written or fleshed out and
       | state managed by the LLM, and it doubling as the memory system
       | 
       | Completely leapfrogging all these other open and closed source
       | concoctions and being more effective
       | 
       | But it's effective enough that I don't need OP's final form state
       | of still approving everything
       | 
       | Auto-mode is fine. Worktrees are built into Claude Code now. I
       | just tell it to classify tickets as sequential or parallelizable
       | and spawn subagents to tackle all of the tickets in the todo
       | list
       | 
       | They all get their own context window; it's pretty perfect now
       | 
       | in the meantime I work in a couple tabs of Claude Design for
       | different flows of any client side app. My philosophy has been
       | that devs could pick up graphic and UI/UX design easily, it's just
       | still a full time job to make variations of layouts and portray
       | their states.
       | 
       | UI/UX is not a full time job anymore.
       | 
       | And I use Claude chat to flesh out aspects of the overall idea
       | 
       | I think you may be overcomplicating your workflow in the
       | concluding state.
       | 
       | Overall I agree that planning and intention is now most of the
       | time, before a 10 subagent precision strike is initiated
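
       One way to picture the "sequential or parallel" classification
       from the comment above is to group tickets into waves by their
       dependencies. This is a minimal sketch, not the commenter's
       actual setup: the ticket names and the dependency map are
       hypothetical, and it uses Python's standard graphlib module.

```python
from graphlib import TopologicalSorter


def plan_waves(dependencies):
    """Group tickets into waves; tickets in one wave do not depend on each other."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if tickets depend on each other in a loop
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything unblocked right now
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished to unblock the next one
    return waves


# Hypothetical tickets: each key maps to the tickets it depends on.
tickets = {
    "api-endpoint": {"db-schema"},
    "ui-form": {"api-endpoint"},
    "docs": set(),
    "db-schema": set(),
}

waves = plan_waves(tickets)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

       Tickets inside a wave could go to separate subagents at once,
       while the waves themselves have to run in order.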
        
         | thi2 wrote:
         | There are tons of people, those are just not as vocal.
        
         | nisabek wrote:
         | Could be (the overcomplicating part), I'm just not yet
         | comfortable losing the mental model of the final application.
         | At least not in all types of tickets. Are you not seeing
         | that?..
        
           | yieldcrv wrote:
           | I focus on one side project at a time, alongside work
           | applications
           | 
           | Both are giving me skillsets to excel in the other domain
           | 
           | I watch the subagents, push back on some choices, look at
           | commits and glance at pull requests
        
         | ai_fry_ur_brain wrote:
         | All these people saying UI/UX is dead, then I see their designs
         | and they're absolutely the worst (but they're always swearing
         | by how incredible it is).
         | 
         | Sorry, access to an LLM (even if it could center a div reliably
         | and make responsive designs, which it can't) does not give you
         | taste or intuition, or make you good at building user interfaces.
         | You people/sloppers have no idea the amount of sweat that gets
         | poured into great UX.
         | 
         | It's insulting when you people say these things, and I'm not even
         | a designer or frontend dev.
         | 
         | I actually think UI/UX designers and devs will be the last to
         | fall. I will want beautiful products that were built by
         | beautiful minds; that's how you will set yourself apart from
         | the slop. And fortunately it will be even easier when 80% of
         | everything is half-assed UI cranked out by LLM design tools.
         | The contrast is already glaring.
        
           | yieldcrv wrote:
           | I've seen that slop but
           | 
           | Claude Design has barely been out for a month
           | 
           | And it's fulfilled my needs better than v0, lovable,
           | playwright via LLM or just iterating in the coding LLM. I've
           | worked with graphic designers my whole career and have also
           | contracted design agencies to do style guides and collaborate
           | on branding and layouts. I've gotten the output that I'm
           | looking for with Claude Design
           | 
           | eventually you'll see examples but it's not in my purview to
           | publicly link any of my projects as being vibe coded
        
             | ai_fry_ur_brain wrote:
             | Lmao claude design sucks ass. You have low standards.
        
         | bluefirebrand wrote:
         | > control freak veteran developers who can't let go
         | 
         | It is not control freak behavior to want to be in control when
         | you are the one accountable for it if it breaks.
        
       | brcmthrowaway wrote:
       | More Yegge tier psychosis.
        
       | gnunicorn wrote:
       | Interestingly, despite it being much more detailed and a lot more
       | process and procedure than what I currently do - which is more
       | akin to the version 0 described, but in parallel - we end up at
       | the same final problem: reviews and quality assurance.
       | 
       | I sign off on the code I merge, partly because of company policy
       | but also just to be sure it is actually decent. But reviewing has
       | become the real draining bottleneck: even with stacked PRs,
       | reviewing a total of 5-6k lines is not a 5-minute job. Even if I
       | brainstormed and set the plan, that's really the part that
       | doesn't scale right now for me in this. But the author is very
       | shy about that: either the changes aren't that big in the end or
       | they trust the process enough to review in a more casual manner.
       | Being equally untrusting, I can't do that ...
        
         | strogonoff wrote:
         | Proper review should take longer than writing it yourself,
         | because you need to know the correct solution, understand the
         | proposed solution, and evaluate the difference between the two.
         | When designing it yourself, you just need to know the correct
         | solution and write it, and with modern high-level languages and
         | IDEs with autocomplete writing it is hardly a bottleneck.
        
           | minihat wrote:
           | It is harder to solve a sudoku than verify a solution's
           | correctness. I find similar benefits occasionally when coding
           | with LLMs.
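
           The asymmetry in the Sudoku comparison can be sketched as code.
           This is only an illustration, not something from the thread:
           checking a filled grid is a few set comparisons, while finding
           one requires search. The function name and the sample grid are
           hypothetical.

```python
def is_valid_solution(grid):
    """Return True if a filled 9x9 grid satisfies every Sudoku constraint."""
    digits = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r + dr][c + dc] for dr in range(3) for dc in range(3)}
        for r in range(0, 9, 3)
        for c in range(0, 9, 3)
    ]
    # Every row, column, and 3x3 box must contain each digit exactly once.
    return all(unit == digits for unit in rows + cols + boxes)


# A valid grid built from a shifting pattern (illustrative data).
solved = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]

# Swapping two cells in a row keeps the row valid but breaks two columns.
broken = [row[:] for row in solved]
broken[0][0], broken[0][1] = broken[0][1], broken[0][0]

print(is_valid_solution(solved), is_valid_solution(broken))
```

           The replies below argue that real code review rarely has
           constraints this crisp, which is why the check is so much
           harder to write for software.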
        
             | skydhash wrote:
             | Sudoku's constraints are known and easy to build a
             | harness for. Software has a more malleable structure. A
             | harness is hard to build and the test cases for the
             | constraints can be numerous.
        
             | layer8 wrote:
             | I disagree under the following circumstances, which in my
             | experience is the common case: You don't know from the
             | outset all relevant considerations that go into
             | implementing something. Coding yourself is an exploration
             | process of those considerations. Being shown a finished
             | solution doesn't let you see and understand all the
             | considerations and the possible options that you'd have
             | contemplated when implementing it yourself. When reviewing,
             | you still have to do that exploratory thinking to weigh the
             | possible options. And the fact that you have to do that
             | exploration purely mentally rather than in a process of
             | working with code arguably makes it harder (similar to
             | contemplating alternative solutions to a Sudoku purely
             | mentally, actually).
             | 
             | There rarely is a single correct way of implementing some
             | requirement or feature. It's a trade-off between
             | compromises, not binary correct or incorrect like a Sudoku
             | puzzle. The insights that the exploration gives you may even
             | lead you to implement something significantly different
             | from what you originally set out to.
        
             | strogonoff wrote:
             | Imagine sudoku with hundreds of subtle, sometimes mutually
             | exclusive rules, and no single valid solution.
             | 
             | This is not about LLMs, by the way. It's about reviewing
             | any code, including by a fellow human. It's just that many
             | people mistakenly feel like with LLMs they can lower their
             | guard and accept even if they have not gone through the
             | steps of themselves coming up with their solution and
             | comparing it to the one suggested by the LLM.
             | 
             | The reason is that many correctly see proper review as
             | _duplicate work_ , and while it is justified with another
             | human (because it is (A) instructive and (B) reducing bus
             | factor) with LLMs most people simply can't be bothered. If
             | you personally can, you are a minority.
        
         | nisabek wrote:
         | If I'm attentive during spec/plan creation I sort of build this
         | "expectation" of what the actual PR will look like, the mental
         | model of it. Then it's somewhat easier to review. But the
         | mental load is brutal tbh, and still not sure if it's "worth
         | it"
        
         | philbo wrote:
         | For decades, engineers understood that large code reviews are
         | harder than small ones. Out of both politeness and a desire to
         | receive better code reviews, we learned to break our large
         | changes into smaller chunks. Some engineers took things even
         | further and replaced code reviews with pair programming. But
         | then LLMs showed up and everyone seems to have forgotten those
         | lessons.
         | 
         | They can still be applied now using coding agents, if you're
         | willing to push back against the default setup and change your
         | mode of thinking a little bit. Of course it doesn't help that
         | an entire industry is dedicated to persuading us that
         | maximizing token spend is the only way to get shit done.
         | 
         | I appreciate this probably seems like an extremist take, but I
         | wrote some more about it here in case there's anybody out there
         | who identifies with it:
         | 
         | https://philbooth.me/blog/agentic-coding-and-mental-models
        
           | aocallaghan17 wrote:
           | Agree with this completely. This push for more autonomy I
           | think is the completely wrong direction for how to use LLMs.
           | 
           | I want less code to maintain not more that I don't even fully
           | understand.
           | 
           | I think research and very supervised coding with lots of
           | guardrails is the way to actually gain productivity from
           | these tools.
        
           | firegodjr wrote:
           | I think that's reasonable. My only gripe is that making small
           | sets of changes is often faster to do by hand than waiting on
           | llm reasoning, so I've found it amounts to very little
           | speedup.
        
           | girvo wrote:
           | > They can still be applied now using coding agents, if
           | you're willing to push back against the default setup and
           | change your mode of thinking a little bit. Of course it
           | doesn't help that an entire industry is dedicated to
           | persuading us that maximizing token spend is the only way to
           | get shit done.
           | 
           | Yeah the problem is the executives and managers around us are
           | demanding we ship massive features as quickly as possible,
           | and I like having a job and _dread_ having to find a new one
           | in this market...
        
             | oblio wrote:
             | Give them some time to be slapped with the real AI invoices
             | once Anthropic and OpenAI IPO.
        
       | pydry wrote:
       | >Automating myself out of development
       | 
       | >I want to start by saying that I'm neither an AI-fanatic
       | 
       | Kind of like saying you are a fanatic before saying you aren't.
       | 
       | I don't think there's too much here (e.g. "spec driven
       | development") I haven't seen elsewhere.
        
       | general1465 wrote:
       | I am completely calm regarding AI and development.
       | 
       | First, nobody sane wants to give their domain IP to
       | OpenAI/Anthropic. That's why local AI will eventually prevail and
       | flourish, because people who actually have some IP will have no
       | problem buying a 10k+ EUR machine to run some pretty good models
       | on it. However, if your main job is just doing CRUD stuff, then
       | you are screwed.
       | 
       | Secondly, hallucination is really the Achilles' heel of every
       | LLM. Sure, you can recreate an application which exists in
       | thousands of variations on the internet, but the moment you try
       | to go deeper into domain knowledge you will start struggling more
       | and more.
       | 
       | Try to make a CAN driver for the ESP32: easy, it is probably
       | going to work. Try to make a CAN driver for the STM32F7xx: now
       | the AI will start having problems, but will probably be able to
       | produce something that works after a lot of debugging. Now let's
       | make a CAN driver for the MPC5555: the AI will start writing
       | fairy tales about registers which do not exist. All of the
       | processors above have reference manuals and sometimes example git
       | repositories available on the open internet.
        
         | abletonlive wrote:
         | > All of the processors above have reference manuals and
         | sometimes example git repositories available on the open
         | internet.
         | 
         | Okay? Then give it those reference manuals and git
         | repositories. I haven't heard of something LLMs can't get
         | around and figure out.
        
         | bonoboTP wrote:
         | Did you try this by giving it access to the materials? Human
         | programmers also don't memorize all this stuff. If this is the
         | reason for your calmness it's quite shortsighted.
         | 
         | There are problems when you rely too much on AI generated code,
         | but these shallow dismissals are quite annoying.
        
         | duggan wrote:
         | > First, nobody sane wants to give their domain IP to
         | OpenAI/Anthropic. That's why local AI will eventually prevail
         | and flourish, because people who actually have some IP will
         | have no problem buying a 10k+ EUR machine to run some pretty
         | good models on it. However, if your main job is just doing CRUD
         | stuff, then you are screwed
         | 
         | Replace OpenAI/Anthropic with AWS and this is not too
         | dissimilar to the arguments in 2009 about cloud providers.
         | 
         | It's not that there's nobody for whom this is true, it's just
         | that there's enough of everyone else to build an empire with.
        
       | 2001zhaozhao wrote:
       | Good writeup. I think the main difference in my workflow is that
       | I skipped the sandboxing part and accepted the coding agent
       | having access to the entire 24/7 dev machine, so I'm still
       | running on worktrees. Also, the "idea enrich" steps in my
       | workflow are less formal - I tend to write most details in a
       | feature spec myself. I also do my workflow on my own self-hosted
       | custom interface which comes with a kanban board for project
       | tracking, so I don't need Github. The rest of the workflow looks
       | pretty similar.
        
       ___________________________________________________________________
       (page generated 2026-06-13 23:01 UTC)