codevoid.de/1/hn/comments_48509133.gph

        _______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (inofficial)
       
       
       COMMENT PAGE FOR:
  HTML   /architect: Reduce Fable tokens by 80%, Fable orchestrates/reviews, Codex builds
       
       
        hmokiguess wrote 10 hours 20 min ago:
        I guess that didn't age well
       
        Teknomadix wrote 11 hours 58 min ago:
        US Govt reduces Fable Tokens by 100%.
       
        Retr0id wrote 12 hours 6 min ago:
        > freezes the gates
        
        LLM-written readmes love to use inscrutable jargon that means nothing
        outside of the context window that birthed it.
       
          nostrebored wrote 9 hours 39 min ago:
          LLMs are obsessed with "gates". Freezing the gates here is
          intuitive to me at this point: don't let validation drift.
       
            Retr0id wrote 3 hours 49 min ago:
            "drift" is another one!
       
        corvad wrote 12 hours 47 min ago:
        Who's gonna tell them...
       
        DanMcInerney wrote 12 hours 59 min ago:
        ANNNNNND it's gone. Guys, I found a way to reduce Fable token usage
        100%. You can find it here: github.com/USGov/idiotic-overreach.
       
        cohix wrote 13 hours 12 min ago:
        I do exactly this with awman workflows: [1] You can use any agent
        and/or model for each step and share context between them.
        
  HTML  [1]: https://github.com/prettysmartdev/awman/blob/main/docs/05-work...
       
        analogpixel wrote 13 hours 33 min ago:
        I know how to reduce Fable tokens by 100% ;
        
  HTML  [1]: https://www.anthropic.com/news/fable-mythos-access
       
          testfrequency wrote 10 hours 6 min ago:
          I ran this and seem to have good results with a 100% reduction also:
          curl -fsSL [1] | sh
          
  HTML    [1]: https://chatgpt.com/codex/install.sh
       
        rockwotj wrote 14 hours 19 min ago:
        I actually just started doing this by having Fable roleplay as Jeff
        Dean and to use Codex as Sanjay driving the implementation and have
        them go back and forth. Works really well and it's cool to see AI
        pair program
       
        avaer wrote 14 hours 42 min ago:
        Reducing token usage is this year's "one weird trick". It doesn't make
        sense on the face of it.
        
        Even if one discovered something that millions (billions?) of dollars
        of AI compute and the best statisticians in the world were not able to
        find via exhaustive research, domain search and training... what do you
        think are the chances this won't be folded into the next update of
        every model, making the rigmarole moot?
        
        Extraordinary claims require extraordinary evidence and
        technology-shattering innovations in AI are not known to come from a
        markdown.
       
          apsurd wrote 14 hours 18 min ago:
          incentives aren't aligned
       
        aetherspawn wrote 14 hours 49 min ago:
        Fool me once. Fool me twice. Fool me thirty three times and here we are
        trying lucky number 34.
       
        diavelguru wrote 14 hours 53 min ago:
        yes I'm using Fable to inspect, generate plan and architectural docs
        then using Gemini to implement then have Fable review, find bugs. 
        saving lots of usage.
       
        Denvercoder9 wrote 14 hours 57 min ago:
        DESIGN.md:
        
        > Each rule below is enforced mechanically by the skill, not left to
        vibes.
        
        > R1. Repo docs are the memory; not in HANDOFF.md = didn't happen
        
        SKILL.md:
        
        > Not in docs/HANDOFF.md = didn't happen. Refuse to judge results that
        exist only in conversation or builder chat output.
        
        "Mechanical enforcement" just means "prompting the LLM a bit extra"
        these days? It (still) amazes me how much effort and tokens we expend
        on what could and should be a two line script...
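        
        A minimal sketch of the kind of short script meant here, assuming
        the rule is simply "a claimed result counts only if its text appears
        in docs/HANDOFF.md". The function name and the verbatim-substring
        check are illustrative assumptions, not taken from the project.

```python
from pathlib import Path


def recorded_in_handoff(claim: str, handoff: Path = Path("docs/HANDOFF.md")) -> bool:
    """Return True only if the claim text appears verbatim in the handoff file."""
    # A missing file means nothing has been recorded yet.
    if not handoff.is_file():
        return False
    return claim in handoff.read_text(encoding="utf-8")
```

        A wrapper could call this before any review step and exit non-zero
        when it returns False, so the check does not depend on the model
        following a prompt.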
       
          everforward wrote 13 hours 54 min ago:
          Agents are in a wacky state, which makes projects like this fall into
          a weird spot. Eg I vaguely expect my agent to do two disparate
          things: manage dependency injection for tools, prompt modifications,
          etc, but also be the sort of "brain trust" that controls the flow
          of execution (can we stop now, do we keep going, etc).
          
          This project is meant to be the latter, but there's not a clean way
          to integrate that into Claude Code or Codex because they expect to do
          both.
          
          Pi can do it, but then your users can't use their Claude
          subscriptions, so you have to kludgily try to do the same thing via
          LLM prompts.
       
            nostrebored wrote 9 hours 40 min ago:
            But why does your agent control doneness? It seems to me the most
            odd part to delegate. All LLMs are terrible at it. Most LLM tasks
            can be expressed as a DAG or DAG of DAGs. Why delegate that to a
            random point in context instead of enforcing the flow?
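            
            One way to picture "enforcing the flow" is a small runner where
            ordinary code, not the model, decides ordering and doneness. This
            is a sketch under assumptions: step names and the boolean
            success convention are hypothetical, and each step would wrap
            whatever agent call is actually used.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def run_flow(steps: Dict[str, Callable[[], bool]], deps: Dict[str, Set[str]]) -> List[str]:
    """Run steps in dependency order; stop at the first step that reports failure."""
    completed: List[str] = []
    # deps maps each step to the steps it depends on.
    # static_order() raises graphlib.CycleError if deps is not a DAG.
    for name in TopologicalSorter(deps).static_order():
        if not steps[name]():
            raise RuntimeError(f"step {name!r} failed; completed: {completed}")
        completed.append(name)
    return completed
```

            A "DAG of DAGs" fits the same shape if a step's callable itself
            calls run_flow on a nested graph.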
       
        Uptrenda wrote 15 hours 27 min ago:
        Reduce fable token usage even more by not using it. What a clever idea,
        op! Wow.
       
        felixgallo wrote 15 hours 35 min ago:
        Fable will do this itself, by spawning Opus/Sonnet subagents to do easy
        work.
       
          apsurd wrote 15 hours 15 min ago:
          /advisor has been really good experience for me especially with
          having only a Pro plan.
          
          I exclusively use sonnet and advisor is basically "hey opus chime
          in on my approach". been working great as far as i can tell.
       
          RazerWazer wrote 15 hours 33 min ago:
          GPT 5.5 xhigh is better than Opus and Sonnet.
       
            sosodev wrote 15 hours 14 min ago:
            I don't know why you're getting downvoted. It's true.
            Averaged across a wide variety of benchmarks Fable is the only
            Anthropic model that performs better than GPT 5.5 xhigh.
       
              Eridrus wrote 15 hours 1 min ago:
              The problem is that there are a bunch of benchmarks, the model
              providers often don't even use the same benchmarks, a bunch of
              them have known problems, and it's expensive to do your own
              benchmarks.
              
              I am a GPT 5.x booster since to me it just feels smarter, and I
              generally felt like the benchmarks backed me up, but it's not
              every benchmark, so sadly we're mostly arguing about vibes.
              
              SWEBench-Pro was a big one, though apparently Claude was reading
              solutions out of the .git folder it wasn't meant to have access
              to among other problems.
       
                smoe wrote 14 hours 49 min ago:
                I find it fascinating that every time this kind of discussion
                comes up, people talk about night and day experiences between
                Claude and Codex, in both directions. I'm really wondering
                what people are doing to get such different outcomes.
                
                I'm currently working on two projects/clients, one using
                Claude, one using Codex. I have a strong preference for the
                latter, but not because I think it is much more intelligent or
                writes much better code. It is simply because I find the way of
                interacting with it more pleasant: more literal, mechanical,
                makes fewer assumptions and/or double-checks, and is less
                proactive in my experience. At least until some updates over
                the last few weeks.
       
                  AlphaSite wrote 12 hours 4 min ago:
                  It probably means they're close enough that there's no
                  observable difference. Or better at very different things.
       
                  Eridrus wrote 13 hours 12 min ago:
                  I think I like Codex for the same reason tbh. I think it's
                  just general misanthropy or autism or something lol. Most
                  people seem to prefer Claude.
                  
                  For me, I think Codex was visibly smarter than Claude until
                  4.8 came out, it would regularly do better debugging and IMO
                  write better code. 4.8 I think is close.
                  
                  I think Claude is widely regarded to have a big lead in
                  front-end, which I do not work on.
                  
                  Claude's Ultrathink is pretty cool, though it eats up tokens
                  like nothing else obviously.
       
            timcobb wrote 15 hours 28 min ago:
            Not in my subjective experience sadly
       
        mpalmer wrote 15 hours 37 min ago:
        Reduce Fable tokens by 80%, simply by not using it!
        
        > I am fairly convinced this is the shape serious agent work keeps
        converging toward.
        
        "this" being "plan with expensive model, implement with cheap model".
        
        Anyone who follows HN would be hard-pressed to disagree; this
        architecture is re-invented twice monthly. [1] [2] [3]
        
        > Not because it is aesthetically pleasing. Because every other shape
        eventually runs into the same boring failures: context rot,
        self-grading, goalpost drift, and merge chaos.
        
        Actual failure isn't boring. But struggling through a generated
        software project that celebrates its own genius and doesn't have a
        single self-critical or genuinely reflective thing to say...at least
        watching paint dry I might get giddy off the fumes.
        
        I'm not interested in critiquing the project itself, either, you'll
        just run that through a model, too.
        
  HTML  [1]: https://www.facebook.com/groups/vibecodinglife/posts/194620756...
  HTML  [2]: https://github.com/openai/codex/discussions/10628
  HTML  [3]: https://build5nines.com/stop-burning-premium-requests-how-to-c...
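        
        For readers unfamiliar with the pattern, "plan with expensive model,
        implement with cheap model" can be sketched roughly as below. This
        is an illustration only: the Model type, the prompt strings, and the
        APPROVE convention are all hypothetical, and no real vendor API is
        called.

```python
from typing import Callable

Model = Callable[[str], str]  # hypothetical interface: prompt in, text out


def plan_build_review(task: str, planner: Model, builder: Model, max_rounds: int = 3) -> str:
    """The expensive model plans and reviews; the cheap model implements."""
    plan = planner(f"Write a plan for: {task}")
    feedback = ""
    for _ in range(max_rounds):
        result = builder(f"Implement this plan:\n{plan}\n{feedback}")
        verdict = planner(f"Review against the plan. Reply APPROVE or list fixes.\n{result}")
        if verdict.strip().startswith("APPROVE"):
            return result
        # Feed the reviewer's notes into the next build attempt.
        feedback = f"Reviewer feedback:\n{verdict}"
    raise RuntimeError("no approved result within max_rounds")
```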
       
          DanMcInerney wrote 14 hours 45 min ago:
          I don't disagree with any of this. It is generated software, and it's
          not a novel idea. I didn't mean for it to come off like that. It's
          just solving an itch that I couldn't find a solution to and I'm
          getting a lot of personal utility out of it. I do have a lot of
          experience with agentic memory, multi-agent systems and harnesses and
          wasn't super impressed by the workflow of Fable calling opus
          subagents so I figured I'd apply best practices to what already
          exists to make it a teensy bit better and easier to use.
       
          seaal wrote 15 hours 8 min ago:
          > [1]
          
          wow linking a facebook groups post might actually be worse than
          x, is there an xcancel alternative for facebook?
          
  HTML    [1]: https://www.facebook.com/groups/vibecodinglife/posts/1946207...
       
        colechristensen wrote 15 hours 37 min ago:
        Last night I switched back to Codex for a minute having burned through
        my tokens for the week with Fable and oh boy I had a terrible
        experience.  Running in circles over simple problems (which I ended up
        solving myself, like a peasant) and running "terraform apply" several
        times despite several instructions all over the place to never do that.
        The performance difference was stark.
       
          nsingh2 wrote 15 hours 20 min ago:
          Could you provide some details, if possible, like what model &
          thinking effort, what kinds of tasks? I used to swap between Claude
          Code and Codex often, and these days use Codex more because of the
          usage limits. Wondering if I should go to Claude for a month, I get a
          strange FOMO when I read vague comments like this.
          
          The one major difference I noticed is that the GPT models are more
          analytical (e.g. better at mathematical analysis, code review) vs
          Claude models tend to write more straightforward code. Besides that
          I don't really see any significant differences.
          
          There are a few gotchas with swapping, like being careful with
          AGENTS.md/CLAUDE.md naming (Claude Code only recognizes CLAUDE.md,
          and I think Codex only works with AGENTS.md), and updating skill
          files to match the tool.
       
            colechristensen wrote 15 hours 1 min ago:
            I just symlink AGENTS.md and CLAUDE.md
            
            I was using gpt-5.5 high. Writing terraform code for GCP, debugging
            app launch and Dockerfile issues, that sort of thing.  It was going
            in loops hallucinating features of GCP, looking things up in
            strange ways, running terraform apply after being explicitly told
            in the last interaction not to, and overall not solving problems. 
            These were very straightforward tasks and it couldn't be trusted
            for five minutes.  It's the difference in what I would trust an
            early senior engineer to do vs what I would trust an unreliable
            high school intern to do.
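            
            The symlink approach mentioned at the top of this comment could
            look like the sketch below. Which file is the real one and which
            is the link is an assumption here (AGENTS.md as the source), and
            the helper name is hypothetical.

```python
import os
from pathlib import Path


def link_agent_docs(repo: Path, source: str = "AGENTS.md", alias: str = "CLAUDE.md") -> Path:
    """Create repo/alias as a relative symlink to repo/source, if not already present."""
    link = repo / alias
    if link.is_symlink() or link.exists():
        return link  # leave any existing file or link untouched
    os.symlink(source, link)  # relative target keeps the link valid if the repo moves
    return link
```

            On Windows, creating symlinks may require extra privileges, so
            copying the file may be the simpler fallback there.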
       
          malshe wrote 15 hours 25 min ago:
          I had a similar experience. So far Fable has been a game changer, at
          least for the work I used it for. Having said that, I think its
          writing is definitely worse than GPT 5.5. Ethan Mollick also observed
          the same. He called it more "Claudy." It generates worse academic
          prose than other frontier models.
       
            colechristensen wrote 10 hours 19 min ago:
            I think the claude code harness made up a significant part of the
            improvements co-released with Fable, the nested agent capabilities
            seem to be much better even with opus (which I guess we're stuck
            with for a while).
       
       