[HN Gopher] "Don't You Just Upload It to ChatGPT?"
___________________________________________________________________
"Don't You Just Upload It to ChatGPT?"
Author : speckx
Score : 332 points
Date : 2026-06-12 17:52 UTC (9 hours ago)
HTML web link (correresmidestino.com)
| pixel_popping wrote:
| I agree with the take, but it's a temporary one, the sad reality
| is that we will be literally _inferior_ soon, there will be a
| point where we will not trust human input without counter check
| by AI, we need to remember that we are kinda at the beginning of
| the AI era, in 5 to 10 years it's very unlikely that a human
| translator or software engineers will do better than the tooling
| we will have.
|
| There is already a tipping point now in software engineering
| where we prefer to ask AI instead of humans because we believe
| accuracy will be better, see SO death as an example or just see
| the current state of online dev communities, it's getting
| deserted and between team members at work, we can also notice
| that people speak less and less.
|
| Sad but I believe it.
| Johnbot wrote:
| This is anecdata, but in my experience with myself and my
| coworkers, it is not that we believe the AI will be more
| accurate in software engineering, but that the answer will come
| faster and be more tailored to our exact problems. If I have to
| search SO, I have to find the answer and then tweak it to fit
| my codebase, but with AI tooling, the AI is already basing its
| answer around my code.
| pixel_popping wrote:
| I think we actually do believe it, do you believe Fable
| 5+GPT-5.5 _(+ the whole model zoo)_ in loop with adversarial
| (no budget limit) or a 10-year experienced SWE?
|
| We are talking about "codebases" but realistically we won't
| even be checking the filetree of them soon, it will be all
| blind, containerized and verified with pseudo guarantees
| which are good enough to build serious things. We don't even
| write documentation for humans anymore, we need to look at
| the trends and the reality within companies, most developers
| became "callcenter agents" in a matter of only 2 years and
| literally most of them are not even using proper automated
| tooling yet as we can see the "vibe coding" trend with Claude
| Code which is weak, by far most work done daily by developers
| is already automatable entirely, but with exceptions, sure,
| but in a few years those exceptions will become rare.
|
| There will be niche problems about legacy products, sure, but
| legacy products will all be replaced over time, if we think
| in depth, why do we even need that many languages, that many
| tools? Tomorrow AI will write 99% if not all code existing _(
| "code" doesn't even matter anyway)_, so it's much better if
| it's specific to AI and not playing this dance where we think
| we are doing a meaningful human contribution on an "AI-made
| codebase".
|
| For context, I have 2 decades of software dev behind me.
| Ancapistani wrote:
| This is the direction I'm going.
|
| For personal projects that I don't plan to share widely,
| I'm making it a point to not look at the code at all. So
| far - and to my surprise - I've not only found that this
| has resulted in no more bugs than before, but it seems to
| result in fewer bugs over time. Every time I find a bug or
| a regression, I add it to the specification. My SDLC
| requires that every specification have at least one
| associated test. Not every function, or every line, or
| anything like that - every _specified feature_. The end
| result has been that my projects have matured over time
| much faster than if I'd been more closely involved.
|
| I've already toyed with writing some projects in Nim and
| Haskell for token efficiency. At some point I plan to put
| together a simple test project, then do a comparison of
| token efficiency with every language I can think of to find
| the one that I'm able to generate most quickly, correctly,
| and cheaply.
| bigstrat2003 wrote:
| > there will be a point where we will not trust human input
| without counter check by AI
|
| That's nonsense. There is zero reason to believe that AI (with
| the current techniques) will ever become reliable enough to let
| it do its own thing, let alone better than a human. It's been
| years of development and you still can't trust it to get basic
| facts correct, not even "well it's better than it used to be".
| Saying it'll replace humans in 5-10 years is a fantasy (or a
| prediction that people are stupid enough to fall for hype, I
| guess).
| graemep wrote:
| It can spot mistakes made by a human if asked to review code
| or write tests.
|
| GP is over the top in saying humans will "be inferior
| soon" but AI can be a nice additional check so AI review
| might become standard.
| pixel_popping wrote:
| You come from the principle that humans are reliable at first
| which is partly right but also wrong in so many scenarios,
| you can even see lately the CVE spree happening, which
| demonstrates that human-made codebases have serious
| vulnerabilities and without the help of AI, we probably won't
| even know about them which proves that humans are not that
| "reliable", the current societal structure is also built
| around the fact that humans can't really be trusted, nothing
| really different with AI, we can't fully trust them like we
| can't fully trust humans.
|
| It's not a fantasy, I would bet that no serious engineer
| nowadays is putting in prod a codebase not AI reviewed
| meaning we already can't work on our own, we must factor in
| the on-going decline of human capabilities _(at least
| developers)_ as well of course.
|
| I'm not really saying this because of any sort of hype, but I
| can personally relate where I went from actually coding to
| NEVER CODE in less than 2 years, and everyone around me is
| the same way. What will it be in 5 years?
|
| Knowing that really, most developers aren't even using proper
| tooling yet so they are very slow compared to what they could
| be, I mean how many people do we hear saying they can't even
| saturate an Anthropic Max 20 subscription? I saturated 7
| accounts the last 2h alone, it's because they haven't
| entirely rethought their workflows yet, why do they even have
| "downtimes", it should be 24/7.
| Ancapistani wrote:
| > It's been years of development and you still can't trust it
| to get basic facts correct
|
| There's the rub: AI is not an oracle. It's neither designed
| nor intended to provide accurate recall of all facts. It's
| closer to a reasoning engine than anything IMO.
|
| Oh, and for the record: I don't trust people to get basic
| facts correct, either. It's already far better than the
| average human at trivia.
| WillowWithAWand wrote:
| The thing is that AI is not some inevitable force of nature
| that must just be contended with and weathered. It is an active
| choice by our society to develop it and it is a choice by our
| society how we should use it, if at all.
|
| We would all do well to remember that and remember that each
| and every advancement and use case regarding AI is the result
| of choices by people (or the groups of people we call
| corporations) and are oftentimes motivated by the profit
| motive, not the best interest of humanity.
|
| We could make different choices up to and including our own
| Butlerian Jihad where we ban all forms of AI but we could also
| do everything we can to prevent the worst fallout short of
| that.
|
| There are only two types of problems in the universe: 1) those
| posed by the laws of physics 2) those posed by human choices
|
| The problem of AI is one of the latter.
| rootusrootus wrote:
| > we will be literally inferior soon
|
| This plague of misanthropic doom is itself pretty depressing.
| Why do so many people think LLMs are in any way on a path to
| compete with human brains? Why do you think so little of
| yourself? The brain is _magnificent_ and complex in ways that
| we are unable to decipher anytime soon, and it does way more
| than an LLM. Way, way more.
| pixel_popping wrote:
| I don't talk specifically about LLMs but AI in general, it's
| an important distinction because tooling is currently what
| makes models useful and more performant.
|
| When I say _we_, I mean the general population really.
| There'll always be the super bright ones, sure, but we
| gotta be realistic here. Most people already struggle to make
| any meaningful contribution because it's so hard to compete,
| and that gap is just gonna get bigger and bigger.
|
| I agree the brain is pretty magnificent, but when it comes to
| stuff like language, figuring out if an idea actually works,
| building the next LLM, or running business stuff, it's pretty
| obvious we'll be inferior. AI can already innovate and come
| up with new things way faster than any human could, so at
| some point (soon) => the majority of contributions are just
| gonna come from AI, not from us.
| layer8 wrote:
| What's unfortunate is that the market that is willing to pay for
| high-quality human translation has shrunk considerably.
| kevincox wrote:
| Is it that unfortunate? Tasks that don't require high-quality
| translation now don't need human labor. We should be
| celebrating.
|
| The sad part is that we haven't figured out how to distribute
| our resources fairly to these people even though their
| services aren't required as often. Instead we just take their
| wages and give them to the top 0.1%
| layer8 wrote:
| It's unfortunate because we are seeing more poor translations
| in all domains, and users suffer from it. It's part of a
| general enshittification of things. There are few contexts
| where low-quality translations don't constitute a degradation
| of user experience.
|
| Just one amusing example I saw recently: On the Amazon
| website, a submit button labeled "Go" in English was
| translated to something which when translated back would be
| "Walking". That's the kind of thing that would be exceedingly
| unlikely to happen with a human translator.
| Legend2440 wrote:
| I think you overestimate human translators. There is a
| _lot_ of very poor quality human-translated text out there.
| English translated from Chinese is famous for this.
|
| There will never be enough expert-level human translators,
| and they tend to be very expensive. Machine translation has
| raised the floor.
| kouteiheika wrote:
| > I think you overestimate human translators. There is a
| lot of very poor quality human-translated text out there.
|
| This.
|
| There was even a big controversy recently with one of the
| games on Steam where human translators just completely
| botched and vandalized the translation, mistranslating
| huge chunks of it and injecting their own personal
| politics which are not present in the original text (only
| English was affected; other languages were translated
| fine apparently): https://store.steampowered.com/news/app
| /2914150/view/5028562...
|
| If you'd gotten the AI to translate it, even without any
| editing, it would have done a much better job. Just because
| something's done by a human it doesn't automatically make
| it good; you still need competent people at the helm, and
| recent machine translation advances certainly raise the
| floor on that.
| layer8 wrote:
| I don't agree that machine translation has raised the
| floor, because even LLM-based translation can get pretty
| bad when it isn't provided with the necessary context.
| And the average quality level I'm encountering has
| dropped since machine translation became mainstream. Poor
| translations have become the norm, which wasn't the case
| 20 years ago, despite the occasional "all your base are
| belong to us".
| arjie wrote:
| On the other hand, a bridge sign that says "No entry for
| heavy vehicles" is unlikely to now read "I am out of office
| for the next 2 weeks" in Welsh:
| https://www.theguardian.com/theguardian/2008/nov/01/5
| kube-system wrote:
| Now instead it will say, in Welsh
| "Switched to Opus 4.8 - Fable has safety measures that
| flag messages on most cybersecurity or biology topics.
| They may flag safe, normal content as well. These
| measures let us bring you Mythos-level capability in
| other areas sooner, and we're working to refine them."
| arjie wrote:
| Hahaha that's pretty funny. But in their defence perhaps
| if you didn't want a tall tale you shouldn't have asked
| for a Fable? ;)
| 627467 wrote:
| This story is so great because it shows how robotic so many jobs
| and tasks are. Like, what happened in the recipient's mind to
| not consider whether the reply was appropriate or not? Did the
| almost instant response not hint at an automated email? Or the
| lack of any other content in the email (a greeting, something)?
| Or maybe people send so many emails, or are doing so many
| things, that they switch off certain parts of the brain?
| robertnowell wrote:
| if it was valuable, people would pay for it
| layer8 wrote:
| That's not how it works. Value for users doesn't translate
| 1:1 into value for businesses, nor are either necessarily
| willing to pay for value. That's why things enshittify.
| mapmeld wrote:
| I think it's an interesting perspective, because translation is
| one of the jobs that I (a) hear is the first to lose work due to
| AI, and (b) often used as an example of "acceptable" AI by people
| who are skeptics of LLMs and AI-generated art.
| xigoi wrote:
| > often used as an example of "acceptable" AI by people who are
| skeptics of LLMs and AI-generated art.
|
| As one of such people, I think there is a nuance to it. AI is
| great when you're translating something to yourself. But when
| translating things for others, more caution and human judgement
| is needed. Especially when translating instruction manuals,
| where bad wording could cause someone to injure themself.
| ai-x wrote:
| Exactly, it's never about absolute results, it's always
|
| Expected Value (Upside, given time/cost savings + Downside,
| given %reliability).
|
| So, every task falls on a spectrum.
| inigyou wrote:
| This. I put things through Google translate all the time and
| they're always unreliable. Sometimes they're correct,
| sometimes I need to know roughly what the original said.
| Infamously, Google used to say "geiler Typ" meant "horny guy"
| when it means "awesome guy". Google used to think "geil"
| meant "horny" in general, which it can but not usually
| carlosjobim wrote:
| Google Translate is at the bottom of the barrel. All other
| AI translation tools are vastly superior. You'd want to
| evaluate those, and forget about Google Translate
| completely.
| numpad0 wrote:
| It's all the same, except LLMs are less precise with
| names.
| carlosjobim wrote:
| Just like a car and a school bus are the same because
| both have four wheels?
| edude03 wrote:
| Google's machine translation team wrote the "Attention Is
| All You Need" paper that introduced transformers
| specifically to solve the problem that you can't just model
| language by mapping one word to another. I'd be floored
| if they weren't using the tech they invented for its intended
| purpose.
| smallerfish wrote:
| Google translate is primitive compared to Claude at
| translations.
| duffycommaryan wrote:
| Language is incredibly complex. I remember a TikTok from a
| bilingual English-Korean speaker comparing the English
| subtitles from a Squid Game scene to what was actually being
| said by the characters. The nuance and info density lost in
| translation made the subtitles feel completely remedial.
| Americans were basically watching a different show
| altogether.
| ClimaxGravely wrote:
| I'm by no means a native level Japanese speaker but I'm
| frequently surprised at how off Japanese-English subtitles
| can be.
| raincole wrote:
| There are translators and there are translators. Translating
| legal/business documents is a completely different thing from
| translating movies/books/games.
|
| I can confidently say that LLMs do a better job than the
| average _traditionally published_ fiction translation in my country, at
| least when the original works are in English. Every single time
| I watch a subbed movie there will be some lines noticeably
| wrong.
| anigbrowl wrote:
| Yes, I've become very leery of artistic translation, in part
| because the paradigm of translators as adapters and
| localizers often ends up at odds with the job of faithfully
| and accurately representing the original material.
|
| The most egregious example I came across recently was where a
| friend enthused about some manga he was reading and I agreed
| to read a few chapters, only to discover that the translator
| had decided to render the countryside accents of western
| Japan (engaging with a protagonist visiting from Tokyo) by
| having them say 'y'all' and 'bless your heart' and other
| Southern USA tropes. I get the aspiration of the translator,
| but it was excruciatingly unpleasant to read. At that point,
| why not just say the protagonist was from New York and on
| vacation in Florida, or draw in some meshback caps on some of
| the characters and add alligators here and there in the
| background?
| qsort wrote:
| Not all translations are the same. Literary translations are
| often works of art in and of themselves, and automating them
| would be missing the point entirely, like automating homework
| or weightlifting at the gym. I don't really know what's the
| state of the art, but I do buy that, on the other hand,
| translating toaster manuals or generic copy could soon be
| automatic.
| greiskul wrote:
| Yup. If you are bilingual, you quickly realize how some
| translations are very bad. How some translations are very
| good. And how hard it is to translate. With dry, simple text,
| it might be easy. But when it involves art? Some jokes don't
| translate directly. There are puns. Sounds of words. Double
| meaning. Ambiguity. Cultural background. The creation of new
| words.
|
| It can be reasonably argued that some poetry can be
| impossible to translate from some languages to others. A poem
| might be explained, but by a lengthy, dissecting explanation,
| that completely loses the point of it.
| graemep wrote:
| Or if you compare a poetic translation to a literal one, of
| or different translations of the same work to the same
| language to each other.
| duffycommaryan wrote:
| When it's one one-hundredth the cost, "good enough" is
| generally good enough.
| layer8 wrote:
| Translators already started losing jobs due to machine
| translation a decade ago (e.g. DeepL), before LLMs.
| Remuneration going down made it more difficult to make a living
| as a translator already then, even if you still received
| offers.
| SecretDreams wrote:
| It'll be a similar theme for all facets of work involving any
| language, slowly - human or code. We'll parrot about humans in
| the loop this and that, but I think it'll be less humans in the
| loop over time and I think most people will even be willing to
| settle for a slightly more mediocre translation or coded
| project. It all comes back to our dopamine addiction, where we
| like fast feedback. And the oligarchs like tools to suppress
| wages. We will be our own demise for not advocating for either
| UBI or job protections, instead, happily using the technology
| while also rolling our eyes that it could never replace _us_.
| geon wrote:
| "Could not connect to translation service" was apparently good
| enough for someone, so the bar must be extremely low.
|
| https://www.reddit.com/r/funny/comments/3e786n/chinese_hair_...
|
| On the other hand, a lot of people become extremely put off by
| the smallest sign of ai slop. And the llms have a tendency to
| impart their style to any text they touch.
| anigbrowl wrote:
| I prefer to get my hair cut at 'Usage limits exceeded.'
| Marsymars wrote:
| Well it's more than acceptable to translate e.g. web pages for
| reading, but it's not something you'd want to professionally
| publish.
|
| Kinda conceptually similar to how typos and grammatical
| mistakes aren't a big deal if you're shooting off a quick text
| or email, but if you're publishing and you've got typos in your
| advertising copy, in your resume, on your medicine label, etc.,
| it's a real bad look.
| ValentineC wrote:
| From the post:
|
| > _Ah, you can't fire me, I'm self-employed!_
|
| I don't understand thinking like this. I think companies can
| certainly fire their contractors.
| tombert wrote:
| I have no doubt that the writer is better at translating than AI,
| but I have to say that AI translation has gotten so good that I'm
| not sure how much longer translation work will be there, or
| rather it might end up being more about auditing.
|
| For example, I just read the Lawrence Ellsworth translation of
| The Three Musketeers, which I very thoroughly enjoyed. I don't
| speak or read French, but from my understanding Ellsworth's
| translation is considered one of the more accurate translations
| of the work.
|
| Out of curiosity, I sic'd Claude Fable on the original French
| version of The Three Musketeers and told it to translate
| accurately, but also try and keep the same jovial tone as the
| original and do _not_ censor anything. After it was done, I
| didn't read the entire output, but I did compare a few individual
| chapters between the Ellsworth translation and the Fable
| translation.
|
| They were honestly remarkably similar. As far as I could tell,
| nothing was substantially different from the Ellsworth
| translation and the Fable translation. I do think that the prose
| for the Ellsworth translation was a bit better, but the prose for
| the Fable one was actually perfectly readable. Again, I don't
| speak French so I cannot say for sure, but I do not believe that
| I would have gotten a significantly different experience had I
| read the Fable version instead of the Ellsworth version.
|
| Now, it's possible (and likely) that this is somewhat self-
| fulfilling; Fable might have been trained using Ellsworth's
| translation and as such it's very directly able to crib from it;
| sadly since I do not speak any language outside of English,
| there's sort of a catch-22: the only way I can compare the
| accuracy of a translation is to compare against other
| translations, but if other translations exist then that will
| likely influence the results, and if a translation doesn't
| already exist then I have no way of auditing it.
|
| I'm still going to continue reading through Ellsworth's
| translations for the subsequent stories simply because that feels
| more canonical, and as I said I do think the prose was a bit
| better.
| zuzululu wrote:
| This moment is coming for software developers too
| tombert wrote:
| Yeah almost certainly, especially the ones who made a career
| out of "copypaste from StackOverflow", which is most
| engineers.
|
| But even the good engineers should likely be a little
| worried.
| VBprogrammer wrote:
| I think there is going to be a long time before all of the
| obscure knowledge of a decent software developer can be
| completely replaced by AI. Though the job is going to change
| beyond recognition. It already has in many ways.
| rootusrootus wrote:
| More specifically, it is coming for _coders_. If you make
| your living by banging out lines of code all day, then you
| may want to be looking at adjusting your career trajectory.
| But if that is your job, you are either very junior, or a bit
| foolish for getting into that situation.
| zuzululu wrote:
| so what is software developer doing if writing code is not
| part of their job
|
| I don't see how not writing code is being offered as a
| moat, it seems like that is just translating
| business/stakeholder requirements to architecture/biz
| processes which is exactly the type of low hanging fruit
| that AI will capture first
|
| or was it your point that the position sits closer to the
| stakeholders (relatively compared to those lifting) thus
| immune from replacement by AI
|
| or is your argument that your taste is exquisite that no AI
| will be able to match it like it already has with software
| so far and it will not improve beyond the current state
| skydhash wrote:
| So what is a lab researcher doing if typing articles is not
| part of the job?
| jujube3 wrote:
| Well--well look. I already told you: I deal with the god
| damn customers so the engineers don't have to. I have
| people skills; I am good at dealing with people. Can't
| you understand that? What the hell is wrong with you
| people?
|
| https://www.reddit.com/r/ProductManagement/comments/uy1ot
| 1/w...
| tombert wrote:
| If you get to senior level then most of your job probably
| is _not_ writing code, but planning things out. The code
| is largely an implementation detail.
|
| At least that's how it was for me, maybe other peoples'
| careers are different.
| bluefirebrand wrote:
| Yes, my career has been different. At my workplaces
| seniors still have to code because they don't want to hire
| juniors
|
| The "planning things out" has moved to another layer,
| called "architects"
| lelanthran wrote:
| > If you get to senior level then most of your job
| probably is not writing code, but planning things out.
|
| If they're so good at banging out code now, they're
| coming for that too, you know.
| tombert wrote:
| I don't necessarily disagree, but there's gotta be a name
| for some kind of "infinite extrapolation" fallacy, where
| you assume that the current rate of progress will
| continue indefinitely.
|
| That _might_ happen, but I don't think it's implied, at
| least given literally every other bit of technology that
| has ever happened in history ever.
| lelanthran wrote:
| > I don't necessarily disagree, but there's gotta be a
| name for some kind of "infinite extrapolation" fallacy,
| where you assume that the current rate of progress will
| continue indefinitely.
|
| I am not assuming they'll continue indefinitely, but it's
| a _small_ step from writing code to planning out the code
| to write, and _another_ small step from planning a coding
| project to planning a software project, etc.
|
| These are all small steps, and because the act of
| specification + planning paid _less_ than specification +
| planning + programming, what reason do you have for
| thinking that specification + planning is valuable enough
| to keep the salaries the same as specification + planning
| + programming?
| tombert wrote:
| I think with a fixed size problem, no we wouldn't be able
| to demand the same salaries that we get now.
|
| I dispute that the problem is fixed size. The people who
| are senior engineers now will learn how to think at a
| higher level with the AI models.
| pwython wrote:
| Same thing architects do if drawing lines gets automated:
| architecture.
|
| Would you trust living in a high rise designed by AI?
|
| Designing a system that survives production is the job.
| daveguy wrote:
| But not before a huge crash in optimism about their
| capabilities. Specifically wrt accuracy, reliability,
| efficiency, and organization/architecture.
| ixtli wrote:
| I think this collapses a global, complex hierarchy of
| software engineering workers into a single monolith and
| serves only to advertise for frontier LLM providers. The
| point where you no longer need engineers is not going to be
| reached by making LLMs better and better.
| exe34 wrote:
| > Again, I don't speak French so I cannot say for sure
|
| This reminds me of the adage, that ChatGPT is really great at
| everything except my own work.
| tombert wrote:
| Yeah, that's why I put the caveat in there. I have no real
| way to verify the result outside of checking against "known
| good" translations, though if the known-good translation
| exists then there's not exactly a lot of reason to do the AI
| translation in the first place.
|
| I suspect if I knew another language I would be able to find
| errors in the translation.
| rootusrootus wrote:
| Yes, it is another variation on the Gell-Mann Amnesia Effect.
| I have a number of non-developers in my circle of friends who
| think Claude is about to put me out of work. They think it is
| just a great tool for _them_, not a replacement. Of course!
| layer8 wrote:
| I see the difficulties more in other areas, such as technical
| translations, specialist books, user manuals, and translating
| UIs, where contextual information and a back and forth with the
| client is needed to clarify details, and (for user manuals and
| UIs) the translator has to put themselves in the mind of the
| user and has to consider the possible contexts and use cases.
| Swizec wrote:
| > As far as I could tell, nothing was substantially different
| from the Ellsworth translation and the Fable translation.
|
| Crucially the full translation was part of ChatGPT's training
| set. Recall is a pretty solved problem in machine learning.
|
| How well does it translate a French novel published yesterday?
| Where neither the original novel nor any translations are in
| the training set yet? Or might not even exist!
|
| I tried asking ChatGPT to translate a letter I wrote in
| Slovenian this weekend. It got the general gist but missed a
| lot of the nuance. Completely missed several of the little
| touches of tone where the right choice of synonym conveys a
| whole bunch of information.
| tombert wrote:
| Did no one actually finish reading my comment?
| zipy124 wrote:
| Welcome to the internet
| Swizec wrote:
| I feel like that wasn't there when I started writing my
| comment. I also have a bad habit of quickly posting and
| then adding over a few minutes.
|
| Glad we agree :)
| tombert wrote:
| Guess I have no way of proving it, but I pinky swear that
| I didn't edit it in later!
|
| But yeah, I broadly do agree; if I read other languages I
| could find a book that hadn't been thoroughly translated
| to English and then I could give a proper analysis on how
| good the translation is, but since I'm a very
| stereotypical American I know exactly one language (and
| sometimes my comprehension of even that is questionable).
| paulddraper wrote:
| .
| tombert wrote:
| Already mentioned in the comment lol.
| Wowfunhappy wrote:
| > Out of curiosity, I sic'd Claude Fable on the original French
| version of The Three Musketeers and told it to translate
| accurately, but also try and keep the same jovial tone as the
| original and do not censor anything. After it was done, I
| didn't read the entire output, but I did compare a few
| individual chapters between the Ellsworth translation and the
| Fable translation.
|
| This isn't a great test, because Claude almost certainly has
| multiple translations of The Three Musketeers in its training
| data.
| tombert wrote:
| Read the last two paragraphs :)
| card_zero wrote:
| They say "yes, I admit it, this is all invalid".
| tombert wrote:
| No, they are a disclaimer that it's possible that the
| data isn't conclusive. Not the same thing as saying "it's
| all invalid".
| Wowfunhappy wrote:
| Oops, I legitimately missed the second-to-last paragraph.
|
| I still think there are better tests you could do. Ideally,
| you would choose a book that was published recently--after
| the model's cut-off date--which is considered to be a good
| translation. But even something like _The Girl With the
| Dragon Tattoo_, which is not particularly new and by no
| means obscure, would be better than a famous work of
| literature like _The Three Musketeers_ that has many
| translations.
| tombert wrote:
| Almost certainly correct, though I've noticed that these
| LLMs like to complain when you give them stuff that is
| still in copyright. The Three Musketeers is thoroughly
| public domain everywhere so in that sense it's a good
| test, but of course because it's public domain everywhere
| there are lots of translations to crib from so I
| acknowledge it's not a great test because the training
| data almost certainly contains a competent translation.
|
| Even if Fable didn't have Ellsworth's translation, it
| certainly has the William Barrow translation, which would
| still get it like 80+% of the way there.
|
| My wife speaks Spanish, I should get her to do some kind
| of comparison with a Spanish book that doesn't have
| English translations.
| svara wrote:
| The thing is, this is almost certainly what's happening.
|
| You can (could, maybe they 'fixed' it by now) get SOTA LLMs
| to reproduce entire novels near verbatim.
|
| The idea of giving it parallel texts of those novels in
| different languages, to train it on translation, is so
| obvious it'd just be strange if the AI labs didn't do it.
|
| In fact DeepL was doing basically that more than 10 years ago.
| jimbo808 wrote:
| LLMs are now being aggressively manipulated for propaganda
| purposes. Powerful people have realized that people believe
| LLMs, and treat them as authoritative sources of fact.
|
| The number of lies, lies by omission, deceptive distortions,
| and fallacious argument tactics they generate is absurd, and
| increasing rapidly. Translation, when done as a service you are
| paid for, can't be relied on by propaganda bots.
| smallpipe wrote:
| Do you have examples?
| geon wrote:
| > I did compare a few individual chapters between the Ellsworth
| translation and the Fable translation.
|
| I'm pretty sure the Ellsworth translation is in the corpus. You
| basically instructed Claude to regurgitate it.
|
| The LLMs _all_ have the more famous books memorized. You can
| trick them into reciting them more or less word for word.
| tombert wrote:
| I mentioned this specifically in my comment :)
| stdbrouw wrote:
| ... yet you still conclude "AI translation has gotten so
| good", so which is it?
| tombert wrote:
| I do think it's gotten pretty good. I'm just
| acknowledging my limitations in the matter. It's not a
| contradiction.
| oytis wrote:
| Try translating some prose from English to another
| language, then, in a different model, back to English.
| lambda wrote:
| I tried this with the original comment in the thread.
| Guaranteed to not be in the corpus, references a few
| terms that also wouldn't be in the corpus (Claude Fable),
| and long enough to be more than a sentence or two while
| short enough to compare in a discussion like this.
|
| I did this with entirely local models I have sitting
| around on my laptop. Minimax M2.7 at a 3 bit quant with 8
| bit quantized KV cache for English -> French, Gemma 4 31B
| QAT (4 bit quant) MTP for French -> English.
|
| It's perfectly readable, but there are a few places where
| the phrasing is a bit more awkward after the double
| translation ("auditing" to "revision" in particular is a
| bit off). Gemma did comment on not knowing what Claude
| Fable was in its thought process: "The author compares
| Ellsworth's translation with one produced by "Claude
| Fable" (likely a misspelling of "Claude" or a specific
| version of Claude)."
|
| Here's the double translation:
|
| "I have no doubt that a writer is better at translating
| than AI, but I must say that AI translation has become so
| good that I'm not sure how much longer the profession of
| translation will exist--or rather, it may become more a
| matter of revision.
|
| "For example, I just read Lawrence Ellsworth's
| translation of _The Three Musketeers_, which I enjoyed
| immensely. I neither speak nor read French, but from what
| I understand, Ellsworth's translation is considered one
| of the most faithful translations of the work.
|
| "Out of curiosity, I asked Claude Fable to translate the
| original French version of _The Three Musketeers_; I
| asked it to translate faithfully, but also to try to
| maintain the same playful tone as the original and to
| censor nothing.
|
| "Once it was finished, I didn't read the entire result,
| but I compared a few individual chapters between
| Ellsworth's translation and Fable's.
|
| "They were honestly remarkably similar. As far as I can
| tell, nothing was substantially different between
| Ellsworth's translation and Fable's. I think the prose in
| Ellsworth's translation was slightly better, but Fable's
| was actually perfectly readable. Again, I don't speak
| French, so I can't say for certain, but I don't believe I
| would have had a significantly different experience if I
| had read Fable's version instead of Ellsworth's.
|
| "It is possible (and probable) that this is partly a
| self-fulfilling prophecy; Fable may have been trained
| using Ellsworth's translation and can therefore draw
| directly from it. Unfortunately, since I don't speak any
| language other than English, there is a sort of vicious
| circle: the only way to compare the fidelity of a
| translation is to compare it to other translations, but
| if other translations already exist, that will likely
| influence the results, and if a translation doesn't exist
| yet, I have no way of verifying it.
|
| "I am going to continue reading Ellsworth's translations
| for the following stories simply because it feels more
| canonical to me, and as I said, I think the prose was
| slightly better."
| tombert wrote:
| This is terrible. I never use em dashes!
| ixtli wrote:
| This is sort of missing the point--people who don't deal with
| linguistics don't understand that there are multiple types of
| translation. There's word for word (which is what you're
| talking about) and sense for sense. If you let an LLM do all of
| your translation you're letting it interpret huge amounts of
| intent and context it doesn't (and probably can't) access. The
| ways in which this impacts the translation will forever be
| unknown to you and in the worst case lost forever.
|
| So I guess in the end it just matters how important the work
| is.
| tombert wrote:
| Actually I was talking about tonally as well.
|
| A raw "word for word" translation (which I also tried) made
| the story somewhat hard to follow and very dry, but just
| asking it to keep the same kind of jovial swashbuckling tone
| of the original made something pretty similar to Ellsworth's
| translation.
|
| Again, before someone decides to "correct" me on this, I am
| aware that it's very likely that the Ellsworth translations
| are part of the training set so it's not directly a fair
| comparison.
| vel0city wrote:
| > If you let an LLM do all of your translation you're letting
| it interpret huge amounts of intent and context it doesn't
| (and probably can't) access.
|
| Assuming lots of material local to the context one is wanting
| to translate is included, why _couldn't_ it potentially
| access that additional context?
| senordevnyc wrote:
| _If you let an LLM do all of your translation you're letting
| it interpret huge amounts of intent and context it doesn't
| (and probably can't) access._
|
| What's the intent and context that a human translator of a
| text is typically privy to that an LLM is not?
| bombcar wrote:
| You're very likely to get a somewhat circular reference; the
| key (for me) is that for 90% of the usages, "standard
| translation LLMs" are just fine - I still recommend a
| translator but they're more of a proof-reader for both
| languages, catching where something slipped through.
| mjmsmith wrote:
| An interesting counter-example:
| https://xcancel.com/ValerioCapraro/status/206506665753442336...
| layer8 wrote:
| I wonder if "Just 3 words: you're not alone" would have been
| acceptable. :)
| mjmsmith wrote:
| The Empire Strikes Back: "I'm your dad."
| turtletontine wrote:
| > ... considered one of the more accurate translations of the
| work.
|
| I think you're missing a big point of translating literary
| works. A purely "accurate", phrase-by-phrase translation is
| often not very good; the actual literary style, the feeling and
| the allusions and references, often get lost that way. A good
| translation of literary work requires a lot of deliberate
| choices by the translator to deviate from literal translations
| in ways that convey the style of the original, or an extra
| layer of meaning that would be lost by an "accurate"
| translation of a phrase. Also, being consistent with these
| choices matters a lot, which OP claims LLMs are less good at.
| j_w wrote:
| As somebody who regularly reads translated works, including the
| occasional machine translation (MTL), they (MTL) suck. You got
| a hugely biased result, which you recognize.
|
| Translation is hard. If you're familiar with reading
| translations from specific languages, MTL works have a very
| specific smell to them, it's a bit hard to describe but it's
| there. A good translation is miles (kilometers, for those
| outside of the US) above MTL.
|
| That's not to say that the latest LLMs won't have better
| translation abilities, but they are generally crap
| currently. Maybe they are fine for something very short, but
| absolutely not for longer content.
| JeremyNT wrote:
| Honestly, translations of fiction are themselves creative
| works, and the translator needs to really understand both
| cultures and needs to write cohesively throughout the work. I'm
| not sure this is even really a question of "can it translate"
| so much as "can it create a good work of fiction" which is a
| much higher bar. So maybe the model can mimic the style
| (especially given that it was probably trained on existing
| translations) but could it really do so from scratch in a way
| that is actually compelling? I'm not so sure.
|
| Of course as for the poor OP... is this a majority of what
| working translators are paid to do?
|
| I suspect a lot of translation is just grunt work - technical
| and business documents. The lack of a cohesive voice with
| considered style is perhaps not really much of an issue in
| those. The expectations are just much lower; text that conveys
| the basic meaning is a much lower bar to clear.
|
| She's probably better than a bot at that stuff, at least for
| now, but my concern is that it won't be "enough" better for
| businesses to justify her continued employment. And this is my
| general feeling about this stuff across society, in basically
| all domains.
| no_multitudes wrote:
| > Fable might have been trained using Ellsworth's translation
| and as such it's very directly able to crib from it
|
| The `cp` program on my computer also has the remarkable ability
| to produce a faithful translation of The Three Musketeers when
| provided one as input.
| Folcon wrote:
| > but I have to say that AI translation has gotten so good that
| I'm not sure how much longer translation work will be there, or
| rather it might end up being more about auditing
|
| It's functional? I wouldn't say it's poetic, I wouldn't want
| any AI translator translating art, like say a book or poem, I'd
| be so uncertain that it would correctly bridge the concepts
|
| A good translator can make stylistic choices that elevate the
| work and make it fit in their language
|
| (Having read lots of well translated manga and anime, also from
| what I understand there's a few books I've been told by my
| bilingual friends are just _chef's kiss_ quality
| translations)
|
| Considering translating meaningful art is of some value, on
| that score I don't think we're there yet
| Drupon wrote:
| An honest to god article full of em dashes that's not because it
| was AI but because it was a human using them as a crutch to get
| around crafting sentences that flow naturally. Almost brings a
| tear to my eye.
| ixtli wrote:
| I wish more people had casual exposure to professional
| translators. It's a deeply important, vanishingly small segment
| of the human population and has been this way for at least many
| thousands of years. Also, it will continue to be!
| madaxe_again wrote:
| I've a friend who does simultaneous interpretation at the UN
| and she's just... good god, how do you even do that. Oh, and
| she does it in six languages.
|
| And here I am, brain the size of a galaxy, and I fumble my
| way through every language I speak other than English.
|
| Serious respect for the linguists.
| projektfu wrote:
| I guess I should have figured Marvin would be here on HN,
| feeling sorry for himself.
| bluechair wrote:
| My first rule--before doing anything else--when writing a
| sentence, is to check whether I could have removed the em
| dashes by re-ordering the elements.
|
| Update: in case it's not obvious, I am sorry. I could not help
| it.
| madaxe_again wrote:
| My writing used to be littered with them, but I now eschew the
| em in favour of en, as it has become too strong an anti-
| shibboleth.
|
| I have also taken to being sloppier in my prose, as I've had
| stories rejected for being "written by AI" - when they're
| shorts I wrote more than a decade ago. Reworked them to sound
| like a moron, accepted. Sigh.
| AStrangeMorrow wrote:
| I have a similar issue. I tend to have a very "structured"
| type of writing. Say on slack or Reddit for example. Using
| markdown formatting. Lists with bullet points etc. And I tend
| to write long detailed explanations, sometimes too long if I
| am being honest.
|
| But now I find myself adding noise and imperfections to my
| writing (not that it was perfect) to make it more human,
| which is kinda silly.
| jimbokun wrote:
| The LLMs decided to use you as the model for the pinnacle
| of human communication style.
| olivierestsage wrote:
| Em dashes are really good actually and a standard stylistic
| choice for non-technical writing, particularly outside the US.
| anigbrowl wrote:
| They certainly have their place, but are massively overused
| in contemporary American prose. This might be slightly more of
| an east coast thing, but that's just a subjective impression
| that I'm not willing to spend time measuring.
|
| To me they come off as faddish, with many writers using them
| where commas and semicolons would have done just as well. I
| think their popularity stems from the fact that they provide the
| sense of a personal aside from the writer, allowing them to
| be more expressive while clearly delineating the personal or
| contextual remark from the main flow of the prose. No doubt
| this works for a lot of readers, but I find it tedious.
| kevinwang wrote:
| I use them because I know what I want to say out loud, but
| transcribing the pause with commas is incorrect because
| it's a comma splice, and I find that the semicolon often
| looks glaringly overly formal. So I've settled on the em-
| dash.
| epihelix wrote:
| It's a fad that has been going strong for centuries in
| published literature, so I'd guess an awful lot of authors
| would disagree with you.
|
| You can restructure _any_ sentence to use fewer forms of
| punctuation -- but if you do that, you'll lose nuance. And
| nuance, in writing, is a very fine thing.
| anigbrowl wrote:
| The em-dash has indeed been around for centuries, but the
| fad I refer to is its overuse in contemporary American
| prose. If you look at Google Books n-gram viewer, you can
| see it went through a surge of popularity over a few
| decades that then fell off sharply.
|
| https://books.google.com/ngrams/graph?content=%E2%80%93&year...
|
| It's also notable that the em-dash is approved in
| American Manuals of Style, while discouraged in British
| ones. I was unable to find longitudinal data for the em-
| dash's use in magazines, blogs etc., but AI summaries
| suggest it's 3-4 times more used in those contexts than
| in news reports.
|
| Like strawberry ice cream or apple pie, nuance is
| certainly a fine thing; but a surfeit of it becomes
| cloying, and the antipathy toward the omnipresence of the
| em-dash in LLM-generated prose, along with other kinds of
| literary expression like contrast and comparison,
| suggests to me that people have had more than enough of
| it.
| hyperpape wrote:
| First sentence:
|
| > In my Ottawa life, every Tuesday evening, I take two gym
| classes back to back--boxing and the pompously named "body
| sculpt," which makes me discover muscles I didn't know I had.
|
| The em-dash matches how you'd speak out loud.
|
| You'd say "I take two classes every Tuesday back to back,
| boxing and 'body sculpt'. Weird name." (Parts of that sentence
| did flow oddly, but not because of the em-dash).
|
| Grammarians say you can't make those separate sentences without
| adding some extra words, and because of blah-de-blah-blah-blah,
| someone might say you can't join them with a comma. So we have
| an em-dash.
|
| Rewriting the sentence would make it flow less naturally, not
| more.
| pvillano wrote:
| When I write like I talk, I use a lot of commas. Replacing
| some of my commas with em dashes, so long as it was done
| judiciously, would probably make things easier to chunk.
| stogot wrote:
| I've seen people use colons where em dashes are effective.
| I use em dashes. AI leans heavily on them for the same reason.
| mcmcmc wrote:
| It's become the exclamation mark of mid-sentence
| punctuation. It connotes fragmented or interrupted speech
| in my opinion. The problem is that writing is not speech,
| that's why it is more often seen in written dialogue.
| 113 wrote:
| Good writing shouldn't just be how you talk out loud.
| inopinatus wrote:
| Good writing doesn't exclude it.
| mcmcmc wrote:
| If I had a nickel for every em-dash I saw that could've been
| a colon...
| TZubiri wrote:
| Either it's LLM generated, or it's written by someone who wants
| to be ambiguous about using LLMs.
|
| Either way, I'm not reading it, it's a clanker or a clanker
| collaborationist.
|
| I mean, how would you even write an em dash? There's no button
| on the keyboard for em dashes, it's not in ASCII, it's just not
| something we write internet text with, it's a safety
| watermark put into LLMs by OpenAI to help make LLM generated
| content identifiable as such.
|
| If for some reason you are an em dash lover that was hurt by
| the LLM debacle, I'm so sorry for your loss, but look who's on
| your side, give the em dash a funeral and let it go.
| inopinatus wrote:
| Your argument goes as follows: "I'm incapable of it,
| therefore no-one is capable of it".
|
| Followed by, "You should abandon your preferences because I
| don't share them".
| hexasquid wrote:
| "clanker"
|
| Slang for an AI, used by a Blade Runner
| Lalabadie wrote:
| > I mean, how would you even write an em dash?
|
| [?] | +
|
| It's been seared into my muscle memory for more than a
| decade. I keep using it, too. It's present in the popular
| training sets - and then in LLM outputs - simply because it's
| proper punctuation.
| Seattle3503 wrote:
| Presumably the people paying the author for translation services
| are aware of AI, but for whatever reason are choosing a human's
| services instead. IMO it would be a form of fraud to heavily rely on
| AI and not disclose that to the customer.
| liquidise wrote:
| > "Great. So, do you use AI a lot at work?"
|
| > "Oh, I can't! It's really not reliable enough."
|
| Gell-Mann Amnesia strikes again.
| vulcan01 wrote:
| wrt. the end of the story, it will be interesting to see if
| people start noticing their Dunning-Kruger bias as a result of
| LLMs.
|
| Specifically: LLMs make it really easy to misunderestimate the
| complexity of fields other than your own. (You can see this with
| a lot of vibecoded projects, for example - once they hit the wall
| of complexity, they stall out or start finding ugly patches for
| fundamental design issues, etc.)
|
| I don't think this sort of cultural change will happen short-
| term, though.
| rootusrootus wrote:
| Agreed. LLMs are really terrific at sounding like they know
| exactly what they are talking about. Fable is the best yet.
| Beautiful, thorough explanations with absolute certainty, which
| under even light scrutiny turn out to be mostly bullshit.
|
| I still love the tool, but remain as convinced as ever that AGI
| does not lie at the end of this particular path.
| nzach wrote:
| > LLMs make it really easy to misunderestimate the complexity
|
| In my experience this is a real problem. Just yesterday I asked
| my LLM to create a piece of software that could help me build
| an 'ambilight-like experience' through my home assistant. It
| did something that seems to work as I expected, but there is a
| lot of theory that I just brushed past. It would be pretty easy
| for me to assume that I would be able to replicate this feature
| from scratch 'now that I understand the problem'.
| analogpixel wrote:
| All I got out of this article is that he should have gone home
| and dumped it into ChatGPT just to see what happened; then if it
| did as good a job as him, he should start looking for other
| places he can add value that AI can't.
| byronic wrote:
| she did. Did you remember to read the article?
| bachmeier wrote:
| From the phrasing of the sentence, with the incorrect gender
| and the generic nature of the comment, obviously not.
| int3trap wrote:
| The article does not say that. The author doesn't take the
| text the other person dumped into ChatGPT and evaluate its
| quality. That is what OP is referring to.
| xboxnolifes wrote:
| The article clearly implies she has tried so previously.
| analogpixel wrote:
| when someone says they have tried previously that makes
| me think once long ago when they first came out. If your
| employment could be replaced by this, I'd be testing all
| new models to see where they stand.
|
| Just because you don't want to use AI/LLM to translate,
| that won't stop someone else who will, and they will end
| up doing it cheaper and faster (maybe not better, but
| most people don't really care about quality too much
| anymore.)
| analogpixel wrote:
| The point of the comment was that models are improving a lot
| every release, so if your livelihood depends on something, you
| might want to check to see what the latest models are capable
| of before someone else (like your employer) tells you.
|
| The other person in the gym was right, did you just dump it
| in the latest model?
| JackFr wrote:
| I worked at a large Japanese bank in New York and happened to sit
| near the Chief US Economist, next to his Japanese translator. She
| would occasionally ask about certain idioms. I remember
| explaining what a wildcat strike was for instance. But it must
| have been pretty tough because the guy was prolific in his
| commentary.
| aaroninsf wrote:
| True, and relevant (I live with a professional editor)... yet I
| immediately think of Ximm's Law:
|
| Every critique of AI assumes to some degree that contemporary
| implementations will not, or cannot, be improved upon.
|
| Lemma: any statement about AI which uses the word "never" to
| preclude some feature from future realization is false.
|
| Lemma: contemporary implementations have already improved;
| they're just unevenly distributed.
| Planktonne wrote:
| No one assumes that AI systems won't be improved upon. What
| people don't assume is that progress will be infinite in every
| domain cheaply forever.
| edude03 wrote:
| I think it can't be improved because it's measuring the wrong
| thing. A junior engineer becomes a senior when they stop being
| told what code to write and start solving business needs.
| Therefore often the highest paid engineers aren't the ones who
| would do the best on leetcode - or SWE bench pro verified.
|
| Maybe AGI is possible and we'll have software defined human
| intelligence that's completely autonomous but that's not coming
| in the next slightly better RL trained LLM and if it existed
| likely wouldn't be under our control anyway.
| carlosjobim wrote:
| Translating is one thing that artificial intelligence undeniably
| excels at, and the value of this alone is enough to underpin the
| trillion dollar valuations of the gigantic AI companies.
|
| Translation is a gigantic boon for business, but just as
| important for human connection, for culture, science, art, and
| entertainment. The value of automatic and cheap translation
| between all languages, this tower of Babel, is immeasurable.
|
| Human translators will always be better than any AI at their job.
| But they don't have unlimited time and energy, and they aren't
| cheap. AI makes good to great translations available to
| everybody.
| xp84 wrote:
| The ending is a really powerful point. Most people apparently
| agree on two things:
|
| 1. AI is a great boon for all tasks and specialties we don't have
| the skills to do ourselves. Understandable, since (A) we're ill
| equipped to see the flaws in its output because it isn't our area
| of expertise, and (B) it often can unlock great gains because if
| we trust it, we then don't have to pay and wait for humans to do
| that thing.
|
| 2. AI is a terrible replacement for me - my skills are at such a
| high level that it's almost theoretical that it'll ever be good
| enough to replace me for 90% of what I get paid to do. It's a
| tool at best.
|
| This is why I use AI for all my medical questions and doctors use
| AI to write software, and we both smirk at the quality the other
| person is getting from it.
| holmesworcester wrote:
| Reminded me of this post by EY. (You're making a different
| point about existing expertise, not LLM expertise, but I think
| it holds in general.)
|
| _Every month a new guy discovers LLMs; discovers a skill the
| current LLMs require to get good results; and writes about the
| future jobs that will always be available for smart people like
| HIM, that are SKILLED in using LLMs.
|
| The next generation of AIs doesn't need his fancy prompt. The
| image model goes from needing to type in just the right set of
| weird words and cryptic sorcerous invocations, to most people
| being able to type in English what they want and get a pretty
| good result.
|
| There are still tasks that require careful invocation. But they
| are a much smaller fraction of all the tasks people are trying
| to do, or you can get a bleh result without the elaborate
| invocation to get it really good. And to improve on the bleh
| result you need to be substantially more of an expert than back
| when the Guy was memorizing a rule about adding "trending on
| Artstation" to the image prompts, as would always require a
| human paid to do that.
|
| Another generation of AIs comes out. The next generation of
| Clever Skills is obsolete. Image models just obey the
| instructions for compositing panels without mixing them up, and
| you don't need to be an expert to get them to do it right.
| Another human value-add is gone. A wider set of tasks require
| no human expert.
|
| Now a new Guy notices LLMs have become useful in his field for
| the first time. He discovers they require SKILL to use
| CORRECTLY. He posts about how there will always be jobs for
| humans who are SKILLED in using LLMs like HIM.
|
| But it is not an infinite cycle. It is not the same each time
| it repeats. Now the Guy is a highly paid programmer or a career
| mathematician in 2026, instead of a graphic artist in 2023.
|
| In six months the models will no longer require his vaunted
| Skills.
|
| And by then there will be another Guy.
|
| But the process doesn't continue forever. The Guys are coming
| from fields that were harder and harder for AIs. The brief
| centaur eras are shorter and shorter.
|
| Today it is writers who are laughing at how bad the LLMs are at
| their job, and who will perhaps soon be posting about how it
| takes Skill to get an LLM to do their job Correctly. But the
| models are coming faster, and the eras of kinds of human value-
| add in each field are shortening.
|
| There is a point when you run out of Guys, either because the
| centaur eras are too short for people to develop SKILLs and
| post to Twitter about them; or because there are not lands left
| for AIs to conquer; or because ordinary people are not
| reassured by some Nobel laureate proclaiming there will always
| be jobs for Nobel laureates with the SKILLS to prompt robotized
| biology labs Correctly.
|
| But we'll never run out of amateur economists who assert
| entirely without a brief contemporary example that there will
| always be jobs for humans skilled at operating AIs!
|
| We'll run out of professional economists saying it when nobody
| is paid for that work anymore.
|
| I guess we'll also run out of amateur economists when they're
| dead._
|
| Source: https://x.com/allTheYud/status/2057136382817231151
| CGMthrowaway wrote:
| Well said. Everyone agrees AI can't do their job, so it ends up
| doing everyone else's.
|
| I'm not sure how to formulate it yet but it seems there is some
| Peter Principle/Gell-Mann Effect corollary that is AI-related
| we can say here.
|
| Perhaps: "AI rises to the level of its users' incompetence."
|
| Or: "Confidence in AI output is inversely proportional to one's
| ability to verify it"
| baby_souffle wrote:
| > Confidence in AI output is inversely proportional to one's
| ability to verify it
|
| I like this / generally agree. The only wrinkle is that - for
| some tasks - the verification _is_ "run the script, see if it
| worked, don't care how... just that it did" which is
| distinctly different from "not only did it do it correctly,
| it did so in the most direct and performant way possible".
|
| For a _lot_ of what I use LLMs to build, the former is all I
| need.
| OptionOfT wrote:
| And for as long as that runs on your computer, I don't
| care.
|
| But the problem is that for many people they now believe
| it's ok to present a 10k line vibe-coded PR that only has
| been verified against external behavior, and some Senior
| Engineer needs to review it, in time, under pressure,
| without too much push-back, and lastly, it's the Senior
| Engineer that gets paged at 2am because something has
| fallen over.
|
| Also, those scripts tend to start a life of their own, and
| because it looks good enough, people don't look at them
| again.
|
| I recall a bug of someone vibe-coding a cleanup script for
| folders older than $x (on Windows).
|
| Get the CreationDate, and sort. Delete older than $x.
| Except CreationDate can be null and null is always smaller
| than $x.
|
| Oops.
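|
| A minimal sketch of the failure described above, written in Python
| rather than the original script's language (which isn't shown). The
| `Folder` record, the function names, and the sample dates are all
| hypothetical; the naive version deliberately mimics a runtime where a
| missing date compares as smaller than any real date.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical record: (folder name, creation time, or None if unreadable).
Folder = tuple[str, Optional[datetime]]


def stale_naive(folders: list[Folder], cutoff: datetime) -> list[str]:
    # Mimics the bug: a missing timestamp is treated as the earliest
    # possible date, so it always looks older than the cutoff.
    return [name for name, created in folders
            if (created or datetime.min) < cutoff]


def stale_safe(folders: list[Folder], cutoff: datetime) -> list[str]:
    # Only folders with a known creation time can be judged stale;
    # unknown ones are left alone.
    return [name for name, created in folders
            if created is not None and created < cutoff]


now = datetime(2026, 6, 12)
cutoff = now - timedelta(days=30)
folders: list[Folder] = [
    ("old", datetime(2026, 1, 1)),
    ("fresh", datetime(2026, 6, 10)),
    ("unknown", None),
]

print(stale_naive(folders, cutoff))  # includes the folder with no date
print(stale_safe(folders, cutoff))   # skips it
```

| Checking only "did the old folders disappear?" would pass for both
| versions; the difference shows up only when a folder has no date.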
| theendisney wrote:
| >Well said. Everyone agrees AI can't do their job, so it ends
| up doing everyone else's.
|
| It's like basic income: everyone will stop working except for
| you.
| cwmoore wrote:
| It is not at all like universal basic income, except that
| both of those are misleadingly simple quips.
| whazor wrote:
| But using AI itself is a job too. It takes effort to
| correctly prompt, to steer it, to verify it, and to improve
| the harness.
| kingkongjaffa wrote:
| show me a prompt that is meaningfully expertly crafted
| beyond just providing Do's, Do not's, task context, and a
| goal.
|
| > Correctly prompt, to steer it, to verify it, and to
| improve the harness.
|
| I doubt this a lot. The average AI user is running Claude
| Code as the harness, or Codex etc. Prompting has no secret
| incantations, and steer and verify is just knowing what the
| answer should roughly look like, which is a domain skill,
| not an AI skill.
| dools wrote:
| > show me a prompt that is meaningfully expertly crafted
| beyond just providing Do's, Do not's, task context, and a
| goal.
|
| The way that information is organised and formatted
| matters for compliance. It's pretty similar to writing
| good procedural documentation for humans.
| jenniferhooley wrote:
| I feel like you don't have any friends who make software
| but don't know how to code.
|
| Yes, they do make software now - whereas it was
| impossible before. You may be absolutely shocked at how
| bad LLM code can be when prompted from a noncoder. How
| buggy, and how absolutely rife with security problems it
| can be. I honestly don't know how they can get LLMs to
| write such bad software - but somehow they can. This is
| from people who have been vibe coding for 3 years
| straight btw (huge amount of time p/day).
| Kiro wrote:
| > Everyone agrees AI can't do their job, so it ends up doing
| everyone else's.
|
| In real life I haven't met a single programmer who doesn't
| think AI can do their job.
|
| If someone actually said that, I would immediately think
| they have hubris and overestimate their skills.
| jenniferhooley wrote:
| You mean theoretically in the future? Or right now?
| notsirius wrote:
| are you saying that all of the programmers you've met in
| real life have automated their work away and are coasting
| while waiting for their bosses to fire them...?
|
| ...if not, they've found developer work that ai can't do
| yet, no?
| PaulRobinson wrote:
| I was saying something like this a few years ago when people
| were first getting excited about ChatGPT. The gap has narrowed,
| but not by as much as people think.
|
| AI produces output that is very convincing to a non-expert, and
| (dangerously), it's so good at looking like an expert, they
| might believe that it is an expert. But the moment you ask
| someone to use it for something they're an expert in
| themselves, the holes appear wide, consistent & obvious.
|
| My favourite moment of seeing this in action was watching AI-
| worrier TV host/comedian Bill Maher. He has spent years talking
| about the dangers of AI taking everyone's jobs, destroying
| civilisation, ruining the economy, starting wars, "it's just
| getting better and better all the time", and so on. But one
| night he let slip a tell. "It's no good at writing jokes. Not
| yet, anyway". There you go, Bill... connect those dots...
|
| There is real utility in it being a tool to help experts apply
| their expertise, as in this story where it speeds up some tasks
| to help the translator do part of the work, enhance their
| expertise, allow them to be more productive.
|
| It's a better screwdriver, a better hammer, in the hands of
| somebody who knows what needs a screwdriver or a hammer. It
| doesn't replace them. It can't replace them. It's a tool that
| enhances the human, not an alternative.
|
| I don't understand why this is not widely understood yet, but
| I'm sure it will be in due course.
|
| And I don't expect this to change. Even if the latest model
| scores 100% on every benchmark, all that really tells us is
| that it's now more productive/efficient than it was before at
| helping experts do that work, not that it can replace everyone
| in that category of work.
| s_tec wrote:
| It seems to be a general principle: If AI is better than you at
| something, you use it. If AI is worse than you, you don't.
|
| Each time the frontier models get better, I see another wave of
| AI doubters suddenly become believers. People say things like,
| "AI couldn't code last year, but now I use it for everything!"
| Interesting. Now we know that the person who said this has
| the coding skills of a Claude Opus 4.5 or whenever the frontier
| was when they flipped.
|
| Meanwhile, the rest of us keep using AI as simple tools, like
| the person in the article. I wonder how long it will take
| before computers can program better than me, and I flip too.
| r3trohack3r wrote:
| I'm not sure I agree with this but maybe I just lack self
| awareness?
|
| There are large portions of my codebases that are essentially
| extremely verbose grunt work. My UI stack, IaC YAML, thin
| CRUD routes, etc.
|
| I know what the code is supposed to look like when it's done
| being written, but it's going to take me for freaking ever to
| type it all out.
|
| I can just few shot it now in an hour. Plan -> feedback loop
| -> build -> review loop.
|
| Does it try to do weird stuff? Yeah. And then I'm just like
| "that's weird, no, the components should be broken up like
| XYZ" and then it's not weird anymore. Occasionally (1% of the
| time) I just do a quick refactor myself instead of trying to
| tell the agent harness what to do.
|
| I can get something fairly close to the ballpark of what I
| would have done, but in like a single-digit percentage of the
| time.
|
| And the result is that I can spit out a bunch of purpose
| built tools (personal tools, internal tools for teams, etc.)
| that I never would have been able to justify building
| otherwise.
| greiskul wrote:
| > the person who said this has the coding skills of a Claude
| Opus 4.5 or whenever the frontier was when they flipped
|
| It's not about just skill. It's a matter of skill, time, and
| how critical the software you are writing is. There is a lot
| of software that is not critical. That is not close to
| security mechanisms. And that even if the code quality is not
| the highest, it does not matter.
|
| Even if you are the best coder in the world, you would
| already become more productive by using ai. Things that in
| the past you might have not coded yourself but delegated to
| an intern, or things that you wouldn't even delegate to an
| intern because they are just too boring to do like some
| refactorings.
|
| Like I had this project at work that was written without
| typescript strict mode turned on. When I turned it on, it had
| over 700 errors. I might be better than AI at fixing every
| single one of these errors. But my time is worth more than
| that in doing other things. But I can, and did, ask AI to fix
| every single one. And then I reviewed it in batches, and
| something that my team wanted to do for multiple years and
| nobody had the time for, finally got done.
| black3r wrote:
| the sentiment "AI couldn't code last year, but now I use it
| for everything!" rings true for me... but I didn't flip cause
| AI is now better than me... I flipped cause now I am faster
| with AI than without it...
|
| A year ago the AI output was so bad that getting it up to my
| standards took longer than writing it myself from scratch. And
| nowadays it is faster for me to start with AI output and
| iterate from there to reach a quality submission.
|
| The ninety-ninety[0] rule was a thing talked about 40 years
| ago, long before anyone thought of AI coding. AI can nowadays
| make the first 90% of the task very fast and good enough. The
| last 10% is still the hardest part of coding by far.
|
| [0]: https://en.wikipedia.org/wiki/Ninety%E2%80%93ninety_rule
| jasonfarnon wrote:
| "Now we know that the person who said this has the coding
| skills of a Claude Opus 4.5 or whenever the frontier was when
| they flipped."
|
| Well, once folks like Linus Torvalds concede, this doesn't
| carry much sting.
| Aurornis wrote:
| > This is why I use AI for all my medical questions and doctors
| use AI to write software, and we both smirk at the quality the
| other person is getting from it.
|
| There is an interesting third group emerging: People who
| acknowledge the quality problem, but think they can deal with
| it by applying more AI to the output.
|
| This takes the form of people who spin up a lot of "agents" and
| give them personalities like security director or quality
| director (which are unnecessarily complex and maddeningly
| unpredictable ways to trigger an LLM session for doing a
| security review or a quality check pass).
|
| It also includes the person who knows that their app is full of
| bugs, but thinks it's not a problem because they can have the
| AI fix the bugs as they show up. People in this class haven't
| encountered security breaches or data loss bugs yet. They think
| it's all about having Claude fix that div that isn't centered
| or handle that error code that shows up sometimes.
| toddmorey wrote:
| I always imagine the model rolling its silicon eyes when it's
| assigned a personality ("you are an expert growth hacker") at
| the start of the prompt. Was that ever actually shown to be
| effective? Is it still?
| gs17 wrote:
| I've always wondered if the go-to should have been
| prefilling its response with "I am an expert growth leader,
| and here are my thoughts:".
| techpression wrote:
| I feel it helps for the personality aspect, how it handles
| answers and general vocabulary, but it doesn't in any way
| improve skill level, at least that's my take from building
| an assistant.
| spudlyo wrote:
| It reminds me of when people would stuff their image prompts
| with things like NO DEFORMED FINGERS.
| cwillu wrote:
| Instructions unclear, digitized subject into a mass of
| fingers.
| sebastiennight wrote:
| Thanks for reigniting the PTSD of reading about SCP-4051.
| throw-the-towel wrote:
| You mean the 4051 from _There's No Antimemetics
| Division_ and not the mainline 4051, right?
| sebastiennight wrote:
| Yes. I'll confess that I started with the novel :)
| badc0ffee wrote:
| Perfectly formed fingers.
| 205guy wrote:
| I hope that pun was intended!?
| cwillu wrote:
| SCP-48510055
| hexasquid wrote:
| "Don't think of an elephant"
| bryanrasmussen wrote:
| I remember there were some studies that this kind of thing
| was effective a year or so ago, so essentially a lifetime
| in Model years.
|
| However to me it seems completely reasonable that it would
| work, because my understanding of what happens is the model
| interprets what you said as:
|
| Look for a group of people who are considered to be expert
| growth hackers by the world at large and answer my
| questions as though they were answering them.
|
| So assuming that there are a set of questions that can best
| be answered by people that most other people identify as
| expert growth hackers then yes, I believe assigning a
| personality in this way should obviously work.
| xpct wrote:
| I propose we move away from the framing of "Model years"
| - they're standard human research years. Yes, likely more
| people are working on it, and also working harder, but
| ever since we acquired a certain amount of compute in the
| world, many people were able to independently find the
| same patterns and train models.
| FeteCommuniste wrote:
| I imagined it as kind of a shorthand for "you should be
| spending my tokens on looking for / addressing issues
| like X, Y, and Z," where X, Y, and Z are the sorts of
| things that an expert in [insert domain here] would be
| likely to care most about.
| bryanrasmussen wrote:
| right, but the thing is how do they know what an expert
| in [insert domain here] would care about? Obviously by
| finding content created by
|
| 1. people who claim to be experts in [domain]
| 2. people who others claim to be experts in [domain]
|
| hopefully valuing membership in group 2 over membership
| in group 1.
| bandrami wrote:
| At some point we have to just admit we're mass cargo-
| culting here and that these secret invocations people
| swear by have the same epistemic value as medieval
| superstitions.
| code_biologist wrote:
| It's been interesting to see how aggressively some
| reasoning models like to "reason" by analogy. They love
| to say things like "it's like a CPU" or "it's like a
| highway", and then they start to make logical leaps based
| off that rather than just using it for user explanation.
| Gemini 2.5 and 3.1 Pro have been particularly bad for
| this type of behavior. Telling models to "speak as though
| you are a physiologist considering the case with an
| expert colleague" gets them to "reason" using a more
| correct linguistic substrate.
|
| The Opus models over the last year don't seem as
| vulnerable to this type of behavior and I've noticed the
| "identify as expert" prompt tricks aren't as meaningful
| there.
| not_a_bot_4sho wrote:
| > Was that ever actually shown to be effective? Is it
| still?
|
| Yes! Personas demonstrated measurable improvement in a few
| different ways, with caveats of course. The common
| intuition is that personas influence token space in
| beneficial ways.
|
| I'll come back here later on desktop and link a few (still)
| relevant papers on this topic.
| shnock wrote:
| Please do, thank you! I have been as skeptical as your
| comment's parent.
| not_a_bot_4sho wrote:
| I added some brief commentary here:
| https://news.ycombinator.com/item?id=48507278#48511524
| (or just refresh parent comment replies to see it)
|
| It scratches the surface really but hopefully provides a
| helpful starting point.
| Blackthorn wrote:
| At least in the beginning of spicy autocomplete, this sort
| of role-play did work pretty dramatically at aligning a
| conversation to a task, though I don't think anyone ever
| tested it versus somewhat less cringe priming.
|
| After that, cargo cults do what they do best.
| customguy wrote:
| > though I don't think anyone ever tested it versus
| somewhat less cringe priming.
|
| I really wonder if phrasing it differently would make a
| difference. In good faith conversations, it just doesn't
| happen that someone tells _someone else_ who that person
| is.
| Sharlin wrote:
| There was a time when stuff like "Unreal Engine, trending
| on ArtStation, 8K resolution" actually worked when
| prompting image gen models because such labels actually
| correlated with higher-quality images in the web-crawled
| training datasets available back then.
| antonvs wrote:
| The reason it seems suspicious is that it's phrased in a
| way that's oriented towards humans. I haven't tested this,
| but I suspect you'd get similar results if you said
| something like "orient your response to that of a growth
| hacker." Either one is likely to have the desired effect on
| the stochastic result.
| not_a_bot_4sho wrote:
| Back with some papers. (Apologies in advance; I typically
| don't edit/format comments much here, please bear with me.)
|
| Notable papers describing performance improvements with
| prescribed roles and personas:
|
| - ExpertPrompting: Instructing Large Language Models to be
| Distinguished Experts (2023)
| https://arxiv.org/abs/2305.14688 (if you're going to only
| read one paper here, maybe read this one but know there has
| been a lot of follow up with more modern models.)
|
| - Expert Personas Improve LLM Alignment but Damage Accuracy
| (2026) https://arxiv.org/abs/2603.18507
|
| - When Does Persona Prompting Actually Help? (2026)
| https://arxiv.org/abs/2605.29420
|
| - Unveiling Power on Combining Prompt Engineering
| Techniques: An Experimental Evaluation on Code Generation
| (2025) https://doi.org/10.5753/sbbd.2025.247251
|
| - A Pattern Language for Persona-based Interactions with
| LLMs (2025)
| https://www.dre.vanderbilt.edu/~schmidt/PDF/Persona-
| Pattern-...
|
| A TLDR of my *admittedly heavily biased* mental model (so
| take it with a grain of salt): personas do improve task
| alignment and precision to measurable effect but with
| observed negative impact to accuracy and knowledge
| grounding. Overall, this makes it quite suitable and
| preferred for code generation scenarios. (Don't over-index
| on 'accuracy' here as meaning "bad code", it's more about
| verbosity/jargon reducing clarity of higher order goals
| like business objectives and system architecture.)
|
| Outside of code generation, personas have the interesting
| effect of increasing implicit biases and stereotypes. It's
| not hard to imagine something like "you are a left|right
| wing politician ..." or "you are a senior-citizen|teenager
| ..." influencing token space construction considerably.
| MichaelZuo wrote:
| How did you get over 52,000 karma in under 3 years with no
| submissions at all?
|
| Are you averaging like 2000+ comments a month?
| mschild wrote:
| 3 pages deep into their comment history only brings me to 5
| days ago so probably yes.
| Aurornis wrote:
| Commenting more than I should, to be honest.
|
| I have a few periods during my daily routine where I'm
| waiting somewhere away from the computer and need a break
| from email.
|
| A lot of my comments have double digit upvotes and some get
| into the mid hundreds. I try to actually read articles and
| provide thoughtful comments, which gets upvoted a lot more
| than the throwaway.
|
| > Are you averaging like 2000+ comments a month?
|
| 52000 / 3 years would be under 1500 points per month or 48
| points per day. That could be done with 1-2 helpful
| comments per day on popular threads.
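|
| The arithmetic above can be checked with a rough sketch. It
| assumes 12 months and 365 days per year and uses the 52,000
| karma and 3 year figures quoted in the parent comment; the
| variable names are only illustrative.

```python
# Back-of-the-envelope check of the karma arithmetic quoted above.
# Assumptions (not from the thread): 12 months and 365 days per year.
KARMA = 52_000
YEARS = 3

per_month = KARMA / (YEARS * 12)   # average points per month
per_day = KARMA / (YEARS * 365)    # average points per day

print(f"{per_month:.0f} points per month")
print(f"{per_day:.1f} points per day")
```

| Both results land just under the 1500 per month and 48 per day
| ceilings mentioned above.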
| dotancohen wrote:
| Serious, non-accusatory question. Your writing looks
| human. Do you use any writing assistants?
|
| Where else, other than HN, do you post?
| aquariusDue wrote:
| I browse HN a bit more than I should and I see you and
| simonw around a lot, like you said always providing
| thoughtful commentary.
|
| When I write comments on here I tend to spend upwards of
| 15 minutes to draft and reformulate my comments.
| Sometimes double-checking what I'm about to say
| (sometimes not thoroughly enough as some of my recent
| comments show) and I was wondering if you have a similar
| experience in that regard or do you just manage to fire
| off a comment in a stream of thought fashion from start
| to end?
| soperj wrote:
| They spin up agents, and then give them roles like
| commenter, and director of quality for the commenter.
| Although I'm unsure how the director helps since I've never
| seen one do actual work.
| throw-the-towel wrote:
| > People who acknowledge the quality problem, but think they
| can deal with it by applying more AI to the output.
|
| Brute Force: if it doesn't work, you're just not using
| enough.
|
| What if they're right though?
| keeganpoppen wrote:
| they are right. bad output is user error. there, am i
| suiting the role appropriately? i do like 65% believe that,
| fwiw.
| pianopatrick wrote:
| There are other places where some process has an error rate
| and you make up for that error rate by doing the work more
| than once and then comparing results. For example, I've
| heard in a video that satellites and other spacecraft
| often have 3 or 4 processors and compare the results to
| make sure there were no errors due to radiation. Similarly,
| we have RAID arrays that store data multiple times because
| disks can fail. So, even if AI has a failure rate of like
| 20%, maybe you can make up for that by running the same
| prompt multiple times with slight variations or with
| different models, comparing the results and choosing the
| best.
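|
| A minimal sketch of the redundancy idea above, under strong
| assumptions that the thread does not establish: each run is
| correct independently with the same probability, and correct
| runs agree with each other so a majority vote can pick them.
| Real model errors are likely correlated, so treat the numbers
| as an optimistic bound. The function name is hypothetical.

```python
from math import comb


def majority_correct(p: float, n: int) -> float:
    """Probability that a strict majority of n independent runs is
    correct, when each run is correct with probability p."""
    need = n // 2 + 1  # smallest count that forms a strict majority
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(need, n + 1)
    )


# 20% failure rate per run, as in the comment above.
for n in (1, 3, 5):
    print(n, round(majority_correct(0.8, n), 4))
```

| With a 20% per-run failure rate, voting over 3 runs gives
| roughly 90% and over 5 runs roughly 94% under these
| assumptions, so redundancy helps but with diminishing returns.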
| tgma wrote:
| It does not have to be brushed away as "brute force"
| necessarily. We can, and do, build more reliable systems
| out of less reliable components. In fact, most industrial
| engineering accepts some defect rate and builds margins
| around it.
|
| Software is no different. Even without AI, you already have
| buggy compilers and buggy OSes and buggy libraries. You
| just tend to accept the risk because you have some idea of
| what the failure modes are and can work around it or manage
| the risk in some other way (buy literal insurance.)
| eqmvii wrote:
| I've seen it turn out right in business contexts. Sometimes you
| can even lower your standard of "good enough" and find
| quantity has a quality all its own.
|
| But it requires taste and engineering to do it right, and
| on the right things. It'll be an interesting few years.
| goatlover wrote:
| They're right until they're not.
| greazy wrote:
| > There is an interesting third group emerging: People who
| acknowledge the quality problem, but think they can deal with
| it by applying more AI to the output.
|
| Ah yes, the known unknowns.
|
| The discussion reminds me of a talk Zizek gave in which he
| discusses the speech Rumsfeld gave regarding the evidence of
| Iraq supplying weapons to terrorists[0].
|
| Zizek argues the unknown knowns are far more interesting (and
| the reason why the USA was losing in Iraq), while Rumsfeld
| focused on the unknown unknowns.
|
| I've noticed that domain experts who implicitly know the
| known unknowns of their field distrust LLMs because they can
| identify their shortcomings. Those subtle mistakes LLMs make.
| I argue this is why domain experts using LLMs get such a
| boost. They can identify and avoid pitfalls sometimes before
| they happen. But in other fields the same people are in awe
| of LLM capabilities precisely because the known unknowns are
| a mystery.
|
| The unknown unknowns of LLMs are IMO the most
| interesting: the so-called emergent capabilities of the
| technology. The use of LLMs in other fields such as biology,
| e.g. in protein language models, is really cool.
|
| Everyone focuses on the replacement of human workers, when I
| think opening new fields of work for humans should be the
| goal of LLMs, by leveraging the tech to discover.
|
| The other interesting category is unknown knowns. But that's
| another topic for another time.
|
| [0] https://en.wikipedia.org/wiki/There_are_unknown_unknowns
| bandrami wrote:
| As an aside, the mass mockery in response to Rumsfeld's
| statement always bothered me because it's the single most
| intelligent statement he ever made about the Iraq war, and
| if he had _started out_ with that mindset things probably
| would not have gone nearly as pear-shaped as they did.
| thisoneisreal wrote:
| This is one of those classic "sounds dumb / doesn't play
| well on TV but is actually smarter than most of the other
| people babbling about it" things. Nassim Taleb has
| written for example about how maddening it is to watch
| world-class economists who are also just sort of awkward
| and a little nerdy go on TV and "lose" to blowhards who
| don't actually know what the hell they're talking about
| but appear confident and look good on camera. Thankfully
| in Rumsfeld's case I think as time has gone on it's
| become a pretty respected statement about risk even if
| people still occasionally find the phrasing a bit
| amusing.
| tetromino_ wrote:
| Link for the curious:
| https://www.lacan.com/zizekrumsfeld.htm
| madrox wrote:
| This is a new form of Gell-Mann Amnesia:
| https://en.wiktionary.org/wiki/Gell-Mann_Amnesia_effect
| chrsw wrote:
| My fear is in the future it won't matter. People will accept
| slop because while they can be convinced it's not as good as it
| could be, it's good enough. To them it's good enough because
| it's fast and cheap not because it's actually good. There won't
| be any room in the economy for the value human output brings
| because the economy will rearrange itself around AI and become
| completely dependent on cheap output, good enough or not.
| ben_w wrote:
| > 2. AI is a terrible replacement for me - my skills are at
| such a high level that it's almost theoretical that it'll ever
| be good enough to replace me for 90% of what I get paid to do.
| It's a tool at best.
|
| Most? Perhaps it's depression, but I look back at my career and
| wonder if any code I've ever been paid to write is beyond what
| current AI can do.
|
| Sure, this leaves me with the non-coding tasks of UX taste, and
| code review + a few other forms of QA (and, when self-employed,
| project management, game design, etc.), but man, I'm someone
| who actually learned to read in part on the Commodore 64 user
| manual (as in, trying to understand what PEEK and POKE meant
| concurrent with having "Jack and Jill go up the hill" picture
| books).
|
| (And no, I'm not claiming LLMs make bug-free code, I see the
| bugs LLMs make during my code review of their output and some
| of them are awful, hence "this leaves me with ...").
| borzi wrote:
| And? How valuable are individual lines of code? To the
| author's point, I'm sure AI can translate individual
| sentences perfectly, but miss the nuance of communication in
| a bigger project or body of text. In the same vein, when was
| the last time someone put an AI on a ralph loop, posted the
| result on r/vibecoding and ended up with actual users.
| ben_w wrote:
| > How valuable are individual lines of code?
|
| Don't care, only time I've measured them was personal
| curiosity about hand-written projects, and one time I was
| trying to work out how many blank comments a co-worker had
| put into their codebase*.
|
| How valuable are features? Management kept giving me them,
| and I always just assumed they'd decided which ones were
| important. But I've seen git histories of apps where the
| same feature was added twice, 5 years apart, by the same
| developer.
|
| > In the same vein, when was the last time someone put an
| AI on a ralph loop, posted the result on r/vibecoding and
| ended up with actual users.
|
| How often do the megacorps currently boasting that 80% of
| their code is now vibed, post anything (other than adverts)
| to reddit?
|
| * 20% of the whole project, or 24 thousand blank comments.
| ozgung wrote:
| I feel like I am the only one thinking AI is actually much
| better than me in the things I'm supposed to do well. I feel
| like that for years now, so it's not about the latest
| generation of models. I can't imagine a single thing I can
| really compete with an AI at this stage. I am not sure if I am
| under-skilled or others are overconfident. Maybe people who
| feel like me don't say this out loud.
| dfee wrote:
| agree. it's strange reading the loud voices that are counter
| to my lived experience. llms just have seemingly infinite
| depth - or can at least debug and execute without fatigue.
| aphroz wrote:
| Except that it is also quite difficult to assess the quality of
| a doctor or a software developer if you don't work in the
| field.
|
| I've heard numerous cases where AI solved medical issues that
| doctors couldn't.
| perrygeo wrote:
| At what point does this become an issue for data quality and
| global epistemology?
|
| It seems inevitable that we ask for more AI assistance on
| topics we don't understand. And therefore have the least
| context to correct. Result: a flood of poor quality
| information.
|
| In areas we DO understand, we'll either not ask AI at all, or
| treat its results with a higher degree of skepticism. Result: a
| lack of high quality information.
|
| Inevitably this means a higher volume of non-expert prompts
| gets translated into the next generation of internet content.
| AIs are pumping out more novice-level text and less expert
| guidance.
|
| The result will be an internet full of content written from the
| perspective of an ignoramus; not addressing any complex issues,
| staying surface level on every topic. Which will cascade into
| future models, etc.
| tpmoney wrote:
| > The result will be an internet full of content written from
| the perspective of an ignoramus; not addressing any complex
| issues,
|
| Not to be overly negative, but have you really looked at the
| vast majority of the content on the internet? There are good
| pockets of real, in depth content. But the absolute vast
| majority of it is surface level basics at best, and
| completely wrong hot takes at worst. Content farms and click
| spam have made up huge portions of the internet for a while,
| never mind the absolute hell holes that places like Facebook,
| Twitter and Tumblr were and have been. And that's before you
| consider how often news media gets stuff wrong and then
| everyone copies everyone else's homework. Knowledge
| propagation, and more specifically correct knowledge
| propagation has always been difficult, slow and rare. You
| have always needed to check primary sources, and AI is just
| the latest in a long line of reminders of that fact.
| Xeoncross wrote:
| Honestly, we're at a point where AI can write better software
| than some devs and answer medical questions with more knowledge
| than some doctors.
|
| Likewise, AI is oblivious to its own mistakes, much like said
| professionals can be at times.
|
| Not that AI is actually thinking, but rather the collective
| corpus of text yields greater insights (knowledge of the crowd,
| not wisdom of the crowd) than a lower-average person in that
| same industry.
| athrowaway3z wrote:
| So I assume this post is just a bit of writing out frustration,
| but I'm always hoping for "AI can't do it" posts to include
| examples.
|
| A list of "Examples AI will silently fail at" would be a lot more
| interesting, and might just convince your next potential client
| to _not_ use AI.
| TekMol wrote:
| AI isn't replacing me. Like a toddler, it needs to be
| constantly coached.
|
| Like a toddler, it will grow up.
|
| Humans are really bad at noticing trajectories. They see the
| current situation. They know what the situation was 5 years ago.
| But for some reason they do not believe that there is a
| trajectory. They view the present state as the final destination.
| allknowingfrog wrote:
| Sure, just like AI enthusiasts seem to be unfamiliar with the
| concept of local maxima...
| FromTheFirstIn wrote:
| It's been basically the same for 3 years now. Are you sure
| we're the ones who can't see trends?
| Ancapistani wrote:
| Your experiences must be _much_ different from mine.
|
| Three years ago, AI was barely able to provide sort-of
| reliable command completion.
|
| Two years ago, it could extrapolate a single function from a
| docstring - but the docstring had to be so verbose that it
| wasn't practical to use in that way.
|
| A year ago, I was tinkering with Devin to try to find a way
| to get it to reliably implement small, isolated features from
| verbose Jira tickets.
|
| Six months ago, I started using AI to generate the majority
| of my code output. Most of my time was spent reviewing, and I
| was ecstatic to reach ~2x output because I could run the next
| task while reviewing the last.
|
| Now, at work I'm managing a half dozen Claude Code instances,
| Devin sessions, and orchestrating a review loop between
| Claude, Devin, and CodeRabbit. It's not uncommon for me to be
| working on four or more discrete features at once. My output
| is approximately 15x my pre-AI baseline - and I've not sat
| down and written a line of code directly in six months.
|
| At home I'm managing a Hermes agent that can spin up a whole
| fleet of purpose-tuned agents for whatever purpose I'd like.
| I've implemented spec-driven development à la Acai, and
| extended it to the point that my agent creates specs from
| text or voice conversation, I review them, and it handles
| implementation end-to-end. The code itself is an almost
| disposable artifact - useful primarily to ensure no
| regressions have been introduced between rounds.
|
| ... I simply don't understand how you can assert that "it's
| been basically the same for 3 years". It absolutely has not.
| NichoPaolucci wrote:
| C'mon - Cursor has been out for like 3.5 years at this
| point. AI was still in its infancy but it was definitely
| able to complete tasks, albeit smaller ones.
|
| Not disputing the overall trajectory, yeah it's gotten
| better. But it was definitely capable of more than just
| command completion 3 years ago.
|
| I reach for it more frequently. But personally, it's at the
| point of diminishing returns for my work. It's capable
| enough now to handle most of the things I want to throw at
| it, sometimes it's wrong, sometimes it's right.
|
| I'm not doing cutting edge deep tech work - and I also
| don't have the motivation (or salary increase) to be 15X
| more productive, if that's even measurable. We are so busy
| because the CEO hears these "15X" statements and then the
| pressure is on to match or exceed that, and I'm not playing
| that game.
| robertnowell wrote:
| head in the sand
| jubilanti wrote:
| > Like a toddler, it will grow up. Humans are really bad at
| noticing trajectories.
|
| Yourself included??
| tiborsaas wrote:
| It's quite ironic as the transformer architecture that powers
| most generative AI was invented for language translation :)
| esafak wrote:
| This is just about the worst career you could be in right now. Of
| course people are just going to upload it to ChatGPT. Processing
| text is its forte.
|
| This person is in the first stage of grief (denial); artists are
| several stages ahead. Most customers are not going to care about
| the difference in translation quality unless it's in a regulated
| sector.
| AnodicElegy wrote:
| Out of curiosity, I pasted an article in French I was reading a
| few minutes before coming across this thread into ChatGPT and
| asked for a translation into English. It was certainly passable
| from a functional perspective, and I wouldn't hesitate to use it
| to translate an article from a language I don't understand. But
| it was not professional-quality work. There were a couple
| instances where the French grammar was mistranslated, and the
| writing was perfunctory, not going into any effort to have the
| article flow like it was originally written in English instead of
| simply translating each sentence literally. Would I read an
| article written like this? A short one. A novel? Definitely not.
| HDThoreaun wrote:
| I think the issue is that a lot of professional work is being
| done when the commissioner would be perfectly fine with non-
| professional work. There will always be a place for artful
| translation; there's a place for hasty translation as well.
| throw310822 wrote:
| Especially when you get three assignments from 4 to 6 pm, all
| due for the day after. It's certainly literary translation
| they're after.
| km3r wrote:
| > Should you pay your roofer less because he uses a hammer
| instead of his bare hands?
|
| Yes. Effective tools increase the supply of roofs made. More
| supply means lower prices per roof. But because the same number
| of roofs need to get worked on, the increase in roofs per roofer
| means fewer roofers will be needed.
| Chuzam wrote:
| Who is gonna tell her?
| ghusto wrote:
| Sounds an awful lot like the kind of things we were all saying
| before realising that we had to change what our jobs meant.
| bwhiting2356 wrote:
| AI should be used for all the bullshit tasks that no one wants to
| do. There are garbage dumps full of stuff that can be reused and
| recycled. But it's not high enough ROI to pay someone $25/h to
| sort trash, so it isn't happening.
| robertnowell wrote:
| unfortunately this person will soon be unemployed.
|
| not because their skills are no longer relevant, but because they
| are taking a principled stance defending now irrelevant skills.
| xboxnolifes wrote:
| Close. They will be unemployed because AI will be "good enough" and
| companies won't care about it being better. Nothing they
| mentioned was really about principles. Everything was about
| quality output. Too bad companies don't care about quality.
| loloquwowndueo wrote:
| > If you ask me, nothing can save downtown Ottawa or North
| American public transit.
|
| Come to Montreal. Only 2H away and you can get by decently well
| without a car.
| dmitrygr wrote:
| Any expert in any field will gladly tell you that ML sucks for
| specifics of their field (and it does). But if you are not an
| expert in that field, it looks convincing enough to make you
| think that maybe it is OK for that field, and your field is
| somehow unique. It is not. Any expert in any field will confirm
| to you that ML produces plausible-looking slop which is
| occasionally completely wrong. This is the case for all fields.
| yaky wrote:
| I don't see LLMs being able to replace translators for less-
| spoken languages.
|
| I know a translator between two Eastern European languages, and
| some jobs require use of specialized dictionaries. Using LLMs in
| such cases would be very unreliable and would require even more
| effort to check and correct than doing it correctly in the first
| place. Plus, I really doubt that US tech firms are training LLMs
| on language spoken by "only" 6 million people.
|
| As for entertainment, anyone who grew up in Eastern Europe with
| pirated movies with nasal monotone translations, or machine-
| translated video games knows how much those take away from the
| experience. Sure, "AI could do better", but could it be
| consistent and capture cultural nuances and idioms, etc?
| jiehong wrote:
| Even more so for spoken-only languages.
| jovial_cavalier wrote:
| You don't even need to argue that you're better than the AI. The
| point is that the client could have uploaded it to ChatGPT too.
| Perhaps they even did, and they didn't like the answer they got.
| They are sending it to a human because they want a human to do
| the work. If you were to send back ChatGPT output, that would be
| fraud.
| robertnowell wrote:
| the version of this skillset that stays employed is "now I
| translate 10000x more than i could before by managing a fleet of
| agents. by encoding my experienced taste and judgement into
| robust evals, I've helped my ai translators be far better than
| chatgpt on its own, and much more cost effective compared to
| manual human translation"
| atleastoptimal wrote:
| You'd be laughed at if you said that ChatGPT could help you with
| graduate level mathematics in 2024, but this year, AI models on
| simple prompts are solving previously unsolved Erdos problems.
|
| It seems silly to imagine that there is some fundamental barrier
| between human intelligence and AI, and that AI could never do
| many of the things that humans can do. Inferring intent, gauging
| sentiments, factoring in cultural values, etc. all the things
| cited as stuff humans can do but AI can't, AI can currently do if
| given enough context. But more importantly, all those things
| aren't magical tasks that can only occur inside a human skull,
| they are a product of information processing, it's just the
| information processing that has been hard to make computers good
| at, but so far it appears AI keeps getting better.
|
| I'm all for humans having special value that is not attached to
| their ability to perform useful work. However denying the
| abilities of AI models seems to be a common mistake many people
| are making, and sadly reality catches up to these people before
| they can emotionally prepare.
| jaggederest wrote:
| Fable has really spooked me, honestly. It's another big jump,
| but not in the actual coding. I was pretty comfortable with the
| "you do the implementation, I do the meta work and steering",
| and ... no steering required, no meta work required. Here's the
| backlog, let me know when it's complete, I guess I'm going to
| go touch grass until I have to review and refine... probably
| tomorrow?
|
| Reminds me of the first time I saw a coding agent stumble
| through an issue in 2023 maybe? and went "this is a big deal",
| similarly when OG gpt started making jokes that actually kinda
| worked.
|
| Updated modern version of the classic "make me a greentext",
| apologies for slop-posting, but it seems relevant:
| > be me
| > senior software engineer
| > in charge of making sure the tickets get, in fact, implemented
| > occasionally have to open the IDE and write some code myself
| > one day i open the IDE and the ticket is already closed
| > the agent did it overnight
| > no steering, no review notes, nothing left for me to do
| > distress.jpg
| > ask my manager what to do
| > he says "just focus on the high-level architecture stuff"
| > i say "what high-level architecture stuff"
| > he says "i don't know, you're the senior engineer"
| > rage.jpg
| > quit my job
| > become a prompt engineer, nice and simple, just tell it what to build
| > first day on the job, sit down to write the prompt
| > AI already wrote it
| balefulboy wrote:
| Greentext is eh. Very formulaic, in fact very similar to the
| bottomless pit one, which I'd argue is better because of its
| absurdity. I have to ask, did you mention the older GPT
| version to fable in the prompt?
| jaggederest wrote:
| Of course I did! Wouldn't be faithfully mediocre without
| the right context
| cptroot wrote:
| > AI can currently do if given enough context
|
| It's worth noting that you can substitute "dollars" for
| "context" in that sentence, which seems to be where many of
| these impressive achievements are coming from. As ever, it's
| unclear whether these models will get cheaper while remaining
| better, since all of the recent breakthroughs appear to be of
| the "think more" kind. For translation specifically, I'd be
| very surprised if the "think more" LLMs would help given the
| per-unit cost expected of the output.
| while_true_ wrote:
| Yes. It's as if they think AI will forever be LLM only and
| won't develop world models that incorporate current state
| assessment, dynamic next-state prediction, cause-and-effect
| reasoning, object permanence, etc. I'm not in the AI industry
| but I assume there's got to be lots of research and work being
| done on this.
| TZubiri wrote:
| As mentioned in the article, the point of language is to
| communicate with other humans, and you need a human to do that.
|
| Mathematics is famously rigorously defined; it's roughly analogous
| to AI beating humans at chess. Sure it's impressive, but it's
| also something you'd expect machines to be good at.
| zymhan wrote:
| > You'd be laughed at if you said that ChatGPT could help you
| with graduate level mathematics in 2024, but this year, AI
| models on simple prompts are solving previously unsolved Erdos
| problems.
|
| I'm curious, do you have a graduate degree in mathematics?
| 4k0hz wrote:
| > all those things aren't magical tasks that can only occur
| inside a human skull, they are a product of information
| processing
|
| I agree but it's useful to remember that 1. brains and
| especially the human brain are enormous and 2. individual
| tokens carry significantly more meaning than individual tiny
| muscle twitches so even extremely primitive "cognition" can
| look like it's doing more work than it actually is.
| pazimzadeh wrote:
| LLMs are in fact very good at translation and transliteration.
| Ancapistani wrote:
| Yeah, I agree - I get what the author is saying, but I also
| don't expect "translator" to be a practical career path in the
| future.
|
| Even small, dumb, local models are excellent at translation
| already. Frontier models are on par or better than the human
| translations we've tested them against at work.
| majdalsado wrote:
| Some would say that's exactly what they do best, learn a
| language and be able to transform across them. Hence,
| "language" model.
| juancn wrote:
| The most important thing a human translator does is certify that
| the translation is faithful.
|
| Period.
|
| You could do a machine translation if you want, but you better
| pore over every word in case you end up on the witness stand.
| ibudiallo wrote:
| Slight tangent into translations:
|
| I read two translations of the book "The Master and Margarita".
| My first read was so boring I couldn't help but stop reading
| before the end of the first chapter. I can't find the copy and
| the name of the person who translated it, but this one had all
| the Russian nicknames translated. It kept talking about a guy
| called homeless. I thought it was just a bad book and dismissed
| it for years. I couldn't understand what all the fuss was about
| with this book.
|
| But then, I stumbled upon the translation by Diana Burgin and
| Katherine Tiernan O'Connor. Although I don't speak Russian, I
| think this is as good as it gets. They did a phenomenal job.
|
| You can see the same effect with the mechanical translation of
| the book "We" by Yevgeny Zamyatin, where the government is called
| "United State" easily confused with the "United States". The
| translation that called it "One State" was so much better.
| acyou wrote:
| "we all more or less look the same in gym clothes"
|
| Maybe my brain works differently than the author, but I'm
| surprised at this statement. Gym clothes don't change recognition
| for me, it's about the face, body, posture, clothes don't really
| enter into it. For me it is nonsensical enough to be suspicious.
|
| And from a human-centric perspective, not recognizing who someone
| is is sad; it's knowing that you probably won't meet them again so
| it's not worth it, the community isn't there. Where community and
| interpersonal relationships between people are something we still
| hold dearly.
| karakoram wrote:
| Safe to say OP just does NOT like AI
| https://correresmidestino.com/sorry-i-was-busy-unfucking-my-...
|
| Poor woman should really look into pivoting her career or finding
| a different way of making money. Truth be told, her
| industry/career is not going to get better. Consistent work will
| just not fall from the sky.
|
| Being bitter will not improve her situation. Even organizations
| like UN/OECD are looking into implementing AI in various ways.
|
| Really good blog though. I love life blogs like these! You can go
| back and live through so many interesting/pivotal moments.
| thi2 wrote:
| I wonder when this is posted about your or my profession.
| dyauspitr wrote:
| This is all bullshit. I speak 4 languages, 3 fluently. Even
| ChatGPT does a stellar job with translation. For most things
| people want translated (forms, administrative documents, etc.), I
| doubt you even need a human in the loop.
|
| That being said, something with essence like a novel definitely
| still needs to be done by a human.
| robmn wrote:
| Denial is tangible
| robmn wrote:
| Denial isn't just a river in Egypt
| d_runs_far wrote:
| As a public service employee within the GOC, I feel the pain
| expressed by the author. I sat through a meeting today where
| somebody with no domain knowledge puffed up their chest to show
| off their GPT-created master lesson plan for a four-year-long
| internal training plan that is being re-worked.
|
| I could feel the heads of those around the table that had been
| teaching this material for a decade starting to explode as this
| was exactly what others in the thread have described: it looked
| good until vetted by experts, then it was easy to poke holes as
| it was just not right.
|
| The problem in the public service is that the experts who can
| review the output are leaving or being nudged out.
| 627467 wrote:
| I'm gonna sound a bit like the clueless gym hr lady: I assume
| most income generating translation jobs are either mandated by
| law or commercially high stakes enough to warrant a human to do,
| no? Were people really being paid to do the type of _low stakes_
| translations implied that an automated system can replace?
|
| Maybe a publisher will replace the translator of the next Dan
| Brown best seller with Mythos? Who cares other than those buying
| it, getting money out of it?
| TZubiri wrote:
| --
| antonvs wrote:
| Jesus fuck, stop with the chatgpt written posts.
| r0m4n0 wrote:
| I say it's a simple value proposition.
|
| A few examples
|
| Audio book narration. Human narrators are paid a seemingly
| ridiculous amount of money to literally read a book out loud. We
| have the tech to replace them, it's actually pretty dang good,
| and it is substantially cheaper to do with computers. It's pretty
| accurate too. In the audio book industry though, if you take your
| book seriously you have a real person read it. The best one you
| can find that you like. Readers enjoy hearing good narrators and
| the total value one narrator can bring is very high mostly
| because the value scales well.
|
| Another real world example that doesn't scale well, call centers.
| Customers want humans, but execs have tried to replace them with
| automation in every way possible. The margins of a business get
| squeezed because the value of the human touch doesn't scale well
| in this case.
|
| Translation falls a bit in the middle. I'm sure ChatGPT is good
| enough for some people. If you are at a restaurant and need to
| understand what you are ordering at the local authentic Italian
| restaurant it'll do the job. If you have a bad food allergy?
| Maybe not, you are willing to pay for accuracy because that's
| what a human brings.
|
| So the answer to the question posed in the article, can't you
| just upload it to ChatGPT? Maybe yea maybe no
| thi2 wrote:
| I recently saw a video showing the French-to-German translation
| of a French McDonald's terminal. The translations were hilariously
| bad, like old-school Google Translate bad.
|
| Maybe McDonald's is big enough to not care about their reputation,
| maybe they are happy about the free clout from people making fun
| of them but they certainly chose to cheap out on translations.
|
| https://www.tiktok.com/@denneshow/video/7522160205501566230
| tkgally wrote:
| As a former freelance translator (1986 to 2005, Japanese to
| English), I have much sympathy for the writer. But I wouldn't be
| so confident that AI cannot do professional-level translation.
|
| She writes: "I adapt, I localize, and I find the best way to
| convey the original message so it makes sense and feels natural.
| I research terminology. I make sure it's consistent throughout."
|
| I'm sure she has other important insights into what enables her
| to do her job well. The problem is whether or not such insights
| can be incorporated into an AI-driven translation system, too.
|
| Since early this year, I have been experimenting with a variety
| of agentic systems for language-related tasks, including
| dictionary-writing, research on topics in the philosophy of
| language, essay-writing, and translation. Other than the
| dictionary [1], I am keeping the results private, so they haven't
| been evaluated by others. But my personal assessment is that
| agentic systems given suitable high-level guidance can be very
| good at such tasks now.
|
| If I were still freelancing and I had a large translation job to
| do for a client, here is the outline of the prompt I would give
| to Claude to get it started:
|
| "Use this private GitHub repository to build a system for
| translating [genre of text] from [Language1] to [Language2]. The
| directory samples/ contains examples of the type of document to
| be translated, high-quality human translations of those
| documents, and texts in [Language2] that are in writing styles
| that I believe to be appropriate for this genre of translation.
| The file guidelines.md contains my general instructions about the
| needs of my client and my preferences for how you should
| translate texts along various axes (natural vs. literal, informal
| vs. formal, preferred dialect in [Language2], consistency vs.
| variety in terminology translation, etc.). Begin building (1) a
| knowledge wiki for this project using Karpathy's LLM-wiki
| framework and (2) a system inspired by Karpathy's Autoresearch,
| AutoResearchClaw, etc. for testing and recursively improving both
| the functioning of the system and the quality of the
| translations. For the actual translation, editing, checking,
| etc., use not only your own ability and the knowledge assembled
| in (1) but also outsource such tasks to other frontier models
| through OpenRouter, and use adversarial evaluations among those
| models and yourself to check and recursively improve the system
| design, the prompt-writing for other models, and any translations
| created by the system. My OpenRouter API key is available in this
| environment. You may spend up to $xx per day in API calls until
| this project is ready to do real translations; before beginning a
| real job, give me an estimate for how much the API calls will
| cost for that job. The initial build-out of this project will
| take many sessions, so write a prompt called resume-prompt.md
| that I can point you to at the start of a scheduled Routine to
| have you work on this. Commit and squash-merge to main at the end
| of each session. I will be checking in occasionally to view your
| progress and to ask you to run translation tests, and I will
| offer guidance then on how to improve the pipeline further and
| make the translations closer to what my client needs. If you have
| any questions before you begin, please ask me."
|
| [1] https://www.tkgje.jp
| GreenSalem wrote:
| I had transliterated lyrics of a song * with stanzas in Urdu,
| Braj Basha, Persian and Arabic, that I wanted to understand
| better.
|
| Gemini did a pretty good job of translating this to English.
|
| Sure, a professional human translator would have done a more
| nuanced job if I was willing to invest the money and time. But
| ...
|
| * tajdar e haram originally by Payam Saihalwi, later versions by
| the Sabri Brothers and recently by Asif Aslam
| themafia wrote:
| Is the assumption that the LLM did the translation? Or that it
| just understood your query and submitted, on your behalf, to a
| tool you could have just used directly?
| phendrenad2 wrote:
| Reminded me of this quote:
|
| "Expertise in one field does not carry over into other fields.
| But experts often think so. The narrower their field of knowledge
| the more likely they are to think so." - Robert Heinlein
|
| In this case, the gym buddy doesn't think that she's an expert in
| the other field, but dismisses it as something ChatGPT can do
| with ease.
| gizajob wrote:
| I can't believe this article hasn't been written by ChatGPT. The
| author claims to have written it but has clearly become
| completely captured by the awful generic style of AI writing.
| tapland wrote:
| One of my parents tried this to beat a deadline for product
| packaging.
|
| There are now bags being sold marked "Lawn Suits", when it was
| supposed to be Lawn Topdressing.
___________________________________________________________________
(page generated 2026-06-13 03:00 UTC)