commented:
It is not as bad as this at my workplace, but directionally the same.
Never money for new hires or raises, but always money for business
consultants, sprawling enterprise COTS software licenses, and AI
subscriptions.

  commented:
You work for a US-based governmental organization, don't you? (Nobody
else calls it "COTS".)
That tiny bit of snark aside, it's still common for "buying things"
money to be easier to get than "paying employees" money, even in
less-ossified sectors than government.

  commented:
Can neither confirm nor deny.

  commented:
Makes sense.
(More seriously, I've worked in banking, retail, and US government,
which all have their own very thick jargon and are all unfriendly, in
various measures, to their employees talking about their work online.
So it always makes me chuckle when their jargon leaks into a more
general forum. It was a friendly chuckle, not an attempt to out you.)

  commented:
True, buying things is an expense that is much less binding in the
future than giving raises. Which is, unfortunately, a reasonable
consideration in risk management! The reasoning is often extrapolated
incorrectly, though, e.g. to SaaS that locks in a function one cannot
just drop on the floor for a few months.
I actually worked for a while at a place where
explicitly-not-predictable kinds of bonuses were treated as
non-binding one-time expenses (and thus went through a different
decision-making process than binding raises).

  commented:
The author describes a situation in which their management is
nearly-fatally incompetent at running their business.
The author should find a new job with less incompetent management. If
that's not possible, take other steps towards survival and sanity.

  commented:
I had friends at FAANG describe similar enough stories a few months
ago (when tokenmaxxing was policy). This isn’t to say FAANG is immune
from being fatally incompetent, but more to say that this “ooh it can
summarize my emails and the lunch menu?!” vibe is more widespread than
one might think.

  commented:
One of the things about giant companies is that they can survive an
awful lot of nearly-fatally incompetent management... by making so
much money, because they are already entrenched, that the losses just
make them less insanely profitable.
One of the things about small companies is that they can survive an
awful lot of nearly-fatally incompetent management... but when the
owner runs out of money, dies, or gets into a legal fight with their
family, it all goes crunch.
I'm lucky enough to be working for a small company that is making
money and has not been prone to nearly-fatal management problems, so
far. That can change, though.

  commented:
My experience is a bit more positive overall, I think?

Used irresponsibly, coding agents trash your code base. This is fine
for small throwaway prototypes or if it's replacing utter garbage
SaaS, but it won't do for serious systems. Claude Fable just makes a
much larger mess.
Used responsibly—which requires self-discipline and the right
people—eh, coding agents can be helpful. They're not as transformative
as people think, because the bottleneck is still mostly human
understanding of the code, and getting stakeholders on the same page.
But if they all disappeared, I'd miss them a bit. If only because I
don't want to write my own first-draft test code against weird vendor
APIs.
A lot of document data-mining tasks definitely benefit from cheap,
high-volume LLMs, if you can accept the error rate.
Management likes a nice AI use case, as long as it's reasonably cheap
and good.

But then I walk through an airport and I see the AI-related ads aimed
at management, and yeah. It's really bad out there.

  commented:
In a way, this kind of reminds me of the whole static typing vs
dynamic typing debate. Like you can be extremely productive in a
language like Clojure when you have the right team and everybody has
enough experience to use it correctly, or you can end up with a
horrible unmaintainable mess when people abuse the language and start
getting clever with it. Using LLMs feels very similar in that sense.
When you understand what the tool can do, and how to apply it
effectively, it really can save you time, but if you just churn out
code as fast as you can, it ends up being an impenetrable mess.

  commented:
except that if the productivity improvement from using language models
is in the same order of magnitude as using static typing instead of
dynamic, this is apocalyptic news for the inference providers, who
need this technology to be absolutely nothing less than revolutionary
to justify the incredible debt they have taken on in order to build
out the physical infrastructure to support these tools.
Remember Sam Altman said they would achieve AGI and then ask ChatGPT
how to make a profit? THAT is the behavior their (Anthropic, OpenAI,
and SpaceX) valuations demand, in order to be justified. That's not
"well, it's like inventing static typing," and I love static typing!
interesting times!

  commented:
Oh absolutely, that's why the current AI bonanza is a huge bubble
that's inevitably going to pop. The problem is that the human is still
the bottleneck because any code that AI generates needs to be
understood by a human. Even if you don't actually read the code, you
still have to test it, at least as a black box, to see that it behaves
roughly the way you want. What a lot of people in management don't
understand is that writing code isn't what takes time. It's
understanding the problem, designing the architecture, and figuring
out how to formally express business requirements. The LLM crapping
out a ton of code really fast doesn't help with any of that. That's
why every company that doubled down on AI is now reporting that
they're simply not seeing the gains they expected.
I'm convinced that the whole business model is going to prove
unsustainable. On top of that, as models get cheaper to run locally
there's less incentive to buy subscriptions. If you have your own
model you get to tune exactly how it works, and you get to keep your
data local. If models get efficient enough to make selling access to
them as a service profitable, they're also going to be cheap enough to
run on your own hardware. And I suspect that's what most devs will end
up doing. We're basically in the mainframe age of this technology, but
as has happened many times in the past, it's almost certainly going to
move to edge devices in the coming years.
Apple seems to be the only company to actually understand this and
they're actively banking on making commodity hardware geared toward
running local models.

  commented:

"The LLM crapping out a ton of code really fast doesn't help with any
of that."

It's a classic "Theory of Constraints" bottleneck. Most systems have
one "bottleneck" that limits throughput. Optimizing the bottleneck
increases throughput almost 1-for-1. Optimizing literally anything
else barely makes a difference.
For typical coding tasks, the bottleneck isn't writing code. Depending
on the project, the bottleneck might be understanding the problem. It
might be getting all the stakeholders onto the same page.
My Claude Code "memories" at work are full of notes saying things
like, "Writing code faster accomplishes nothing if the human doesn't
understand it. Human learning speed is the bottleneck. The human is
the only part of the system that remembers." And so on.
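A minimal sketch of the bottleneck argument, assuming a purely serial pipeline whose throughput is set by its slowest stage. The stage names and rates are made-up illustrative numbers, not measurements:

```python
# Hypothetical stage capacities in "changes per week"; the numbers are
# illustrative assumptions, not measurements.
def throughput(stages: dict[str, float]) -> float:
    """A serial pipeline moves only as fast as its slowest stage."""
    return min(stages.values())

baseline = {"understand": 2.0, "write_code": 10.0, "review": 4.0}

# Make code-writing 5x faster: this is NOT the bottleneck stage.
faster_coding = {**baseline, "write_code": 50.0}

# Make understanding 1.5x faster: this IS the bottleneck stage.
faster_understanding = {**baseline, "understand": 3.0}

print(throughput(baseline))              # 2.0
print(throughput(faster_coding))         # 2.0
print(throughput(faster_understanding))  # 3.0
```

In this toy model a 5x gain in writing speed changes nothing, while a modest gain at the bottleneck shows up 1-for-1.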

  commented:
It's always been that way: Amdahl's, Conway's, and Brooks's laws are
all variations on the same theme. The bottleneck (communication,
understanding, parallelization of work) is always the fulcrum.
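For the Amdahl part, a rough sketch of the formula itself. The 20% figure for "time spent typing code" is an assumption picked for illustration, not a number from this thread:

```python
def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the work gets `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)

# Assumption for illustration: typing code is 20% of total project time.
print(round(amdahl_speedup(0.2, 10.0), 2))  # 1.22
print(round(amdahl_speedup(0.2, 1e9), 2))   # 1.25
```

Under that assumption, even infinitely fast code generation caps out at a 1.25x overall speedup, because the other 80% is untouched.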

  commented:
I think this depends a lot on the kind of programming you're doing.
For certain tasks, the ability to create a prototype quickly is
incredibly useful — the bottleneck there is the "human repl" of being
able to adjust, rebuild, and evaluate in rapid succession.  For that,
anything that makes the "adjust, rebuild" steps quicker can be very
useful.  It's the same principle that made dynamic scripting languages
so powerful for many years.

  commented:
Yes, this is fair. And it's also sort of the entire point of the
Theory of Constraints. Your bottleneck might be in any number of
different places.
In some cases your bottleneck absolutely will be "How quickly can I
spew dodgy greenfield code to see if something works and/or if people
like it?" And when that's your bottleneck, you want to take a very
different approach.
But for any kind of code that has multiple people working on it
regularly for more than 6 months, and actual production users, then
the bottleneck often shifts to somewhere else entirely. And you will
usually need humans who understand how everything fits together.

  commented:
The model I've come to apply most to LLM code is the 3D printer.
I can produce a shitty plastic version of any part I need in a matter
of hours instead of days/weeks. Shitty plastic versions of things are
not production quality stuff. I don't want to be a plastic vendor at
the local flea market. But shitty plastic versions of stuff can be
quite useful when I don't need something durable. With the right
choices of filaments, feeds, and speeds (models, development/review
pipeline, and testing tool choices), I can produce something more than
sufficient to leave me focused on the bits of the project I want to
focus on.
It can also be useful to build 'molds' that I backfill later with
'real' code. I've taken to a habit of having the LLM produce
typescript stubs for services/tools I want but don't want to write in
the moment, and by doing so can drive out exactly what problem I want
the tool to solve. I then am left with a comprehensive suite of tests
that I can use to drive the 'real' implementation.
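A small sketch of that "mold" workflow, written here in Python rather than TypeScript. The `RateLimiter` interface, the stub, and the checks are all hypothetical names invented for illustration:

```python
from typing import Protocol

class RateLimiter(Protocol):
    """The 'mold': the interface the real tool must eventually satisfy."""
    def allow(self, client_id: str) -> bool: ...

class StubRateLimiter:
    """Throwaway stand-in: allows a fixed number of calls per client."""
    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.calls: dict[str, int] = {}

    def allow(self, client_id: str) -> bool:
        self.calls[client_id] = self.calls.get(client_id, 0) + 1
        return self.calls[client_id] <= self.limit

def check_limiter(limiter: RateLimiter) -> None:
    """Tests written against the interface, reusable for the real version."""
    assert limiter.allow("a")
    assert limiter.allow("a")
    assert not limiter.allow("a")  # third call exceeds a limit of 2
    assert limiter.allow("b")      # clients are counted separately

check_limiter(StubRateLimiter(limit=2))
```

The stub gets thrown away later; `check_limiter` is the part that survives and gets pointed at the 'real' implementation.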
It seems management at most places is still enamored of the plastic
tchotchkes being peddled by *gestures wildly around*, though. I hope
that dies down soon.

  commented:
well put

  commented:

"If models get efficient enough to make selling access to them as a
service profitable, they're also going to be cheap enough to run on
your own hardware."

This would be great, but there are at least two reasons why the
economics of hosted LLMs are very different from self-hosting:

1. A datacenter GPU running submitted queries can run at close to 100%
   utilization; even if I spent 8 hours straight doing local queries
   that's only 33%, and in reality less because a lot of that time
   will be spent waiting for me to type commands or review plans.
2. LLMs benefit from batching; architecturally it's much more
   efficient to run 100 queries on the same set of matrices, but a
   single user can't generate enough parallelism to take advantage of
   that efficiency.
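Back-of-the-envelope version of those two points. The 50% idle share and the linear batching model are simplifying assumptions; real hardware only approximates linear scaling with batch size:

```python
def utilization(active_hours: float, busy_fraction: float = 1.0) -> float:
    """Share of a 24-hour day the hardware spends computing."""
    return active_hours * busy_fraction / 24.0

def relative_cost_per_token(util: float, batch_size: int = 1) -> float:
    """Hardware cost per token relative to a fully used, unbatched GPU.
    Idealised: assumes batching scales throughput linearly."""
    return 1.0 / (util * batch_size)

local = utilization(8)           # 8 hours of back-to-back queries
realistic = utilization(8, 0.5)  # assumed: half of that time is idle
print(f"{local:.0%}, {realistic:.0%}")  # 33%, 17%

# Local user at the assumed realistic utilization, no batching.
print(relative_cost_per_token(realistic))
# Datacenter at full utilization with batches of 100 queries.
print(relative_cost_per_token(1.0, batch_size=100))
```

In this idealised model the gap between the two printed costs is a few hundred times, which is the scale of advantage the hosted side gets from utilization and batching combined.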

The price list on OpenRouter has kimi-k2.7-code (a 1.1T param model,
several tens of thousands of USD to run locally) at $4 per 1M output
tokens, deepseek-v4-pro (1.6T) at $0.87/1M, claude-opus-4.8 at $25/1M,
and claude-fable-5 at $50/1M. If we assume the various independent
open-source model hosts aren't operating at a loss then the big
proprietary models (Claude, ChatGPT, Gemini) are likely already
profitable.
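To put the prices above next to the cost of local hardware, here is a rough break-even sketch. The $30,000 hardware figure is my own stand-in for "several tens of thousands of USD", and the calculation ignores power, input tokens, and upkeep:

```python
# Output-token prices in USD per 1M tokens, as quoted above.
PRICE_PER_MILLION = {
    "kimi-k2.7-code": 4.00,
    "deepseek-v4-pro": 0.87,
    "claude-opus-4.8": 25.00,
    "claude-fable-5": 50.00,
}

def break_even_tokens(hardware_usd: float, price_per_million: float) -> float:
    """Output tokens you would need to generate before local hardware
    costs less than the hosted price (ignores power, input tokens, upkeep)."""
    return hardware_usd / price_per_million * 1_000_000

# Assumed hardware cost: "several tens of thousands" is taken as $30,000.
HARDWARE_USD = 30_000
tokens = break_even_tokens(HARDWARE_USD, PRICE_PER_MILLION["kimi-k2.7-code"])
print(f"{tokens / 1e9:.1f}B output tokens")  # 7.5B output tokens
```

Under those assumptions a single user would need billions of output tokens before the local box pays for itself.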

"We're basically in the mainframe age of this technology, but as has
happened many times in the past, it's almost certainly going to move
to edge devices in the coming years."

The advent of widespread wireless internet access and adoption of
battery-powered devices as most people's primary computer have caused
a lot of the actual compute power to move into datacenters. There are
few reasons to crunch numbers on a phone when a rack-mounted server
can get the answer faster, even after network latency.

  commented:
Sure, there are advantages to data centres, but local models are
already getting quite capable. The question isn't whether you can
match the efficiency of a data centre locally. It's whether a model
you can run on your machine can do the work you need it to.
For example, Qwen 3.6 is already quite capable, and can competently do
a lot of coding tasks. But there's another aspect to this which is the
tooling. A lot of focus tends to be on raw model capability, but how
well the coding harness fits what the model expects is equally
important. For example, ATLAS can make small models punch way above
their weight: https://github.com/itigges22/ATLAS
So, I expect that within a year or two we'll see models you can
comfortably run on your machine performing as well as current frontier
models. And at that point it doesn't even matter how good commercial
models get because local ones will be good enough for what most people
are doing.

  commented:
It will take a while until we can really talk with some nuance and
call things properly. Then again... we still argue about what unit
testing actually is/means, so maybe I shouldn't hold my breath.
There are some companies that don't know what they're doing, some that
do, and some that really should at least test the waters before
someone else takes over and runs circles around their business. But
they're still often lumped into the same group and ridiculed by either
side of the conversation for different reasons.

  commented:

"This is my second Covid"

Wow... Such a simple and powerful sentence. The author managed to
perfectly articulate my feelings about the AI bubble in 5 words... I
didn't even realize these were my feelings:

"AI is here to stay" = "We won't go back to normal." Two years later
everybody forgot about wearing masks to fight influenza. People will
cough on you, mouth wide open, in the U-Bahn/underground during winter.
"$MODEL_FROM_ANTHROPIC is a gamechanger" = Hydroxychloroquine,
Ivermectin and Vitamin D
"We don't need people to write code anymore" = "We won't return to
office" / "The future of office work is remote-only"
... (and I could keep going)


  commented:
whether or not you believe AI can possibly help, i think it makes
sense to look at this like the time FedEx's CEO took the company's
last $5000 down to the casino and played blackjack to try to keep the
company going for another week
the company sounds like it's sinking, and the AI consultants sound
like a hail mary to increase productivity or lower costs

  commented:
I get great results as an AI consultant, but my approach is to rely as
little on LLMs as possible. Chase low-hanging fruit, like the HR person
who still doesn’t know how to export his LinkedIn navigation history,
instead of, say, developing a new language or server or database or
whatnot. I also kill any discussion about which model is best and so
on. They are all the same if you use them properly.