commented: It is not as bad as this at my workplace, but directionally the same. Never money for new hires or raises, but always money for business consultants, sprawling enterprise COTS software licenses, and AI subscriptions. commented: You work for a US-based governmental organization, don't you? (Nobody else calls it "COTS".) That tiny bit of snark aside, it's still common for "buying things" money to be easier to get than "paying employees" money, even in less-ossified sectors than government. commented: Can neither confirm nor deny. commented: Makes sense. (More seriously, I've worked in banking, retail, and US government, which all have their own very thick jargon and are all unfriendly, in various measures, to their employees talking about their work online. So it always makes me chuckle when their jargon leaks into a more general forum. It was a friendly chuckle, not an attempt to out you.) commented: True, buying things is an expense that is much less binding in the future than giving raises. Which is, unfortunately, a reasonable consideration in risk-management! This reasoning is often extrapolated incorrectly, though, e.g. to SaaS that locks in a function one cannot just drop on the floor for a few months. I once worked for a while at a place where explicitly-not-predictable kinds of bonuses were treated as non-binding one-time expenses (and thus went through a different decision-making process than binding raises). commented: The author describes a situation in which their management is nearly-fatally incompetent at running their business. The author should find a new job with less incompetent management. If that's not possible, take other steps towards survival and sanity. commented: I had friends at FAANG describe similar enough stories a few months ago (when tokenmaxxing was policy). 
This isn’t to say FAANG is immune from being fatally incompetent, but more to say that this “ooh it can summarize my emails and the lunch menu?!” vibe is more widespread than one might think. commented: One of the things about giant companies is that they can survive an awful lot of nearly-fatally incompetent management... by making so much money, because they are already entrenched, that the losses just make them less insanely profitable. One of the things about small companies is that they can survive an awful lot of nearly-fatally incompetent management... but when the owner runs out of money, dies, or gets into a legal fight with their family, it all goes crunch. I'm lucky enough to be working for a small company that is making money and has not been prone to nearly-fatal management problems, so far. That can change, though. commented: My experience is a bit more positive overall, I think? Used irresponsibly, coding agents trash your code base. This is fine for small throwaway prototypes or if it's replacing utter garbage SaaS, but it won't do for serious systems. Claude Fable just makes a much larger mess. Used responsibly—which requires self-discipline and the right people—eh, coding agents can be helpful. They're not as transformative as people think, because the bottleneck is still mostly human understanding of the code, and getting stakeholders on the same page. But if they all disappeared, I'd miss them a bit. If only because I don't want to write my own first-draft test code against weird vendor APIs. A lot of document data-mining tasks definitely benefit from cheap, high-volume LLMs, if you can accept the error rate. Management likes a nice AI use case, as long as it's reasonably cheap and good. But then I walk through an airport and I see the AI-related ads aimed at management, and yeah. It's really bad out there. commented: In a way, this kind of reminds me of the whole static typing vs dynamic typing debate. 
Like you can be extremely productive in a language like Clojure when you have the right team and everybody has enough experience to use it correctly, or you can end up with a horrible unmaintainable mess when people abuse the language and start getting clever with it. Using LLMs feels very similar in that sense. When you understand what the tool can do, and how to apply it effectively, it really can save you time, but if you just churn out code as fast as you can, it ends up being an impenetrable mess. commented: except that if the productivity improvement from using language models is in the same order of magnitude as using static typing instead of dynamic, this is apocalyptic news for the inference providers, who need this technology to be absolutely nothing less than revolutionary to justify the incredible debt they have taken on in order to build out the physical infrastructure to support these tools. Remember Sam Altman said they would achieve AGI and then ask ChatGPT how to make a profit? THAT is the behavior their (Anthropic, OpenAI, and SpaceX) valuations demand, in order to be justified. That's not "well, it's like inventing static typing," and I love static typing! interesting times! commented: Oh absolutely, that's why the current AI bonanza is a huge bubble that's inevitably going to pop. The problem is that the human is still the bottleneck because any code that AI generates needs to be understood by a human. Even if you don't actually read the code, you still have to test at least as a black box to see that it behaves roughly the way you want. What a lot of people in management don't understand is that writing code isn't what takes time. It's understanding the problem, designing the architecture, and figuring out how to formally express business requirements. The LLM crapping out a ton of code really fast doesn't help with any of that. 
That's why every company that doubled down on AI is now reporting that they're simply not seeing the gains they expected. I'm convinced that the whole business model is going to prove unsustainable. On top of that, as models get cheaper to run locally there's less incentive to buy subscriptions. If you have your own model you get to tune exactly how it works, and you get to keep your data local. If models get efficient enough to make selling access to them as a service profitable, they're also going to be cheap enough to run on your own hardware. And I suspect that's what most devs will end up doing. We're basically in the mainframe age of this technology, but as has happened many times in the past, it's almost certainly going to move to edge devices in the coming years. Apple seems to be the only company to actually understand this and they're actively banking on making commodity hardware geared toward running local models. commented: "The LLM crapping out a ton of code really fast doesn't help with any of that." It's a classic "Theory of Constraints" bottleneck. Most systems have one "bottleneck" that limits throughput. Optimizing the bottleneck increases throughput almost 1-for-1. Optimizing literally anything else barely makes a difference. For typical coding tasks, the bottleneck isn't writing code. Depending on the project, the bottleneck might be understanding the problem. It might be getting all the stakeholders onto the same page. My Claude Code "memories" at work are full of notes saying things like, "Writing code faster accomplishes nothing if the human doesn't understand it. Human learning speed is the bottleneck. The human is the only part of the system that remembers." And so on. commented: It's always been that way: Amdahl's, Conway's, and Brooks's laws are all variations on the same theme. The bottleneck (communication, understanding, parallelization of work) is always the fulcrum. commented: I think this depends a lot on the kind of programming you're doing. 
For certain tasks, the ability to create a prototype quickly is incredibly useful — the bottleneck there is the "human repl" of being able to adjust, rebuild, and evaluate in rapid succession. For that, anything that makes the "adjust, rebuild" steps quicker can be very useful. It's the same principle that made dynamic scripting languages so powerful for many years. commented: Yes, this is fair. And it's also sort of the entire point of the Theory of Constraints. Your bottleneck might be in any number of different places. In some cases your bottleneck absolutely will be "How quickly can I spew dodgy greenfield code to see if something works and/or if people like it?" And when that's your bottleneck, you want to take a very different approach. But for any kind of code that has multiple people working on it regularly for more than 6 months, and actual production users, then the bottleneck often shifts to somewhere else entirely. And you will usually need humans who understand how everything fits together. commented: The model I've come to apply most to LLM code is the 3D printer. I can produce a shitty plastic version of any part I need in a matter of hours instead of days/weeks. Shitty plastic versions of things are not production quality stuff. I don't want to be a plastic vendor at the local flea market. But shitty plastic versions of stuff can be quite useful when I don't need something durable. With the right choices of filaments, feeds, and speeds (models, development/review pipeline, and testing tool choices), I can produce something more than sufficient to leave me focused on the bits of the project I want to focus on. It can also be useful to build 'molds' that I backfill later with 'real' code. I've taken to a habit of having the LLM produce typescript stubs for services/tools I want but don't want to write in the moment, and by doing so can drive out exactly what problem I want the tool to solve. 
I am then left with a comprehensive suite of tests that I can use to drive the 'real' implementation. It seems management at most places is still enamored of the plastic tchotchkes being peddled by *gestures wildly around*, though. I hope that dies down soon. commented: well put commented: "If models get efficient enough to make selling access to them as a service profitable, they're also going to be cheap enough to run on your own hardware." This would be great, but there are at least two reasons why the economics of hosted LLMs are very different from self-hosting: (1) A datacenter GPU running submitted queries can run at close to 100% utilization; even if I spent 8 hours straight doing local queries that's only 33%, and in reality less because a lot of that time will be spent waiting for me to type commands or review plans. (2) LLMs benefit from batching; architecturally it's much more efficient to run 100 queries on the same set of matrices, but a single user can't generate enough parallelism to take advantage of that efficiency. The price list on OpenRouter has kimi-k2.7-code (a 1.1T param model, several tens of thousands of USD to run locally) at $4 per 1M output tokens, deepseek-v4-pro (1.6T) at $0.87/1M, claude-opus-4.8 at $25/1M, and claude-fable-5 at $50/1M. If we assume the various independent open-source model hosts aren't operating at a loss, then the big proprietary models (Claude, ChatGPT, Gemini) are likely already profitable. "We're basically in the mainframe age of this technology, but as has happened many times in the past, it's almost certainly going to move to edge devices in the coming years." The advent of widespread wireless internet access and the adoption of battery-powered devices as most people's primary computer have caused a lot of the actual compute power to move into datacenters. There are few reasons to crunch numbers on a phone when a rack-mounted server can get the answer faster even after network latency. 
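A rough sketch of the utilization arithmetic above, in Python. The hosted prices are the ones quoted from the OpenRouter list; the hardware price, lifetime, and tokens-per-second figures are placeholder assumptions rather than measurements, and electricity, cooling, and batching gains are all ignored.

```python
"""Back-of-the-envelope comparison of local vs. hosted LLM cost per token."""

HOURS_PER_DAY = 24

# Output-token prices (USD per 1M tokens) as quoted in the comment above.
HOSTED_PRICE_PER_MILLION = {
    "kimi-k2.7-code": 4.00,
    "deepseek-v4-pro": 0.87,
    "claude-opus-4.8": 25.00,
    "claude-fable-5": 50.00,
}


def utilization(active_hours_per_day: float) -> float:
    """Fraction of each day the hardware is actually generating tokens."""
    return active_hours_per_day / HOURS_PER_DAY


def amortized_cost_per_million(
    hardware_usd: float,
    lifetime_years: float,
    tokens_per_second: float,
    active_hours_per_day: float,
) -> float:
    """Hardware cost spread over every token produced during its lifetime.

    Only the purchase price is counted, so this is a lower bound on the
    real cost of running the hardware.
    """
    active_seconds = lifetime_years * 365 * active_hours_per_day * 3600
    total_tokens = tokens_per_second * active_seconds
    return hardware_usd / total_tokens * 1_000_000


if __name__ == "__main__":
    # ASSUMPTIONS (hypothetical, for illustration only): a $30,000 machine,
    # a 3-year useful life, and 20 output tokens/s for a single user.
    hardware_usd, lifetime_years, tokens_per_second = 30_000, 3, 20

    print(f"utilization at 8 h/day: {utilization(8):.0%}")
    for hours in (8, 24):
        cost = amortized_cost_per_million(
            hardware_usd, lifetime_years, tokens_per_second, hours
        )
        print(f"local, {hours:>2} h/day: ${cost:.2f} per 1M tokens")
    for model, price in HOSTED_PRICE_PER_MILLION.items():
        print(f"hosted {model}: ${price:.2f} per 1M tokens")
```

The point is the shape of the formula rather than the specific numbers: cost per token is inversely proportional to both utilization and throughput, and those are the two factors where a shared, batched datacenter GPU has the edge over a single-user machine.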
commented: Sure, there are advantages to data centres, but local models are already getting quite capable. The question isn't whether you can match the efficiency of a data centre locally. It's whether a model you can run on your machine can do the work you need it to. For example, Qwen 3.6 is already quite capable, and can competently do a lot of coding tasks. But there's another aspect to this, which is the tooling. A lot of focus tends to be on raw model capability, but how well the coding harness meets its expectations is equally important. For example, ATLAS (https://github.com/itigges22/ATLAS) can make small models punch way above their weight. So, I expect that within a year or two we'll see models you can comfortably run on your machine performing as well as current frontier models. And at that point it doesn't even matter how good commercial models get because local ones will be good enough for what most people are doing. commented: It will take a while until we can really talk with some nuance and call things by their proper names. Then again... we still argue about what unit testing actually is/means, so maybe I shouldn't hold my breath. There are some companies that don't know what they're doing, some that do, and some that really should at least test the waters before someone else takes over and runs circles around their business. But they're still often lumped into the same group and ridiculed by either side of the conversation for different reasons. commented: "This is my second Covid" Wow... Such a simple and powerful sentence. The author managed to perfectly articulate my feelings about the AI bubble in 5 words... I didn't even realize these were my feelings: "AI is here to stay" = "We won't go back to normal." Two years later, everybody forgot about wearing masks to fight influenza. People will cough on you, mouth wide open, in the U-Bahn/underground during winter. 
"$MODEL_FROM_ANTHROPIC is a gamechanger" = Hydroxychloroquine, Ivermectin and Vitamin D. "We don't need people to write code anymore" = "We won't return to office" / "The future of office work is remote-only." ... (and I could keep going) commented: whether or not you believe AI can possibly help, i think it makes sense to look at this like the time FedEx's CEO took the company's last $5000 down to the casino and played blackjack to try to keep the company going for another week. the company sounds like it's sinking, and the AI consultants sound like a hail mary to increase productivity or lower costs. commented: I get great results as an AI consultant, but my approach is to rely as little on LLMs as possible. Chase low-hanging fruit, like the HR person who still doesn’t know how to export his LinkedIn navigation history, instead of, say, developing a new language or server or database or whatnot. I also kill any discussion about which model is best and so on. They are all the same if you use them properly.