_______ __ _______
| | |.---.-..----.| |--..-----..----. | | |.-----..--.--.--..-----.
| || _ || __|| < | -__|| _| | || -__|| | | ||__ --|
|___|___||___._||____||__|__||_____||__| |__|____||_____||________||_____|
on Gopher (inofficial)
HTML Visit Hacker News on the Web
COMMENT PAGE FOR:
HTML Pi â A minimal terminal coding harness
Vegenoid wrote 1 hour 6 min ago:
What models are you all using with Pi? How much are you paying on
monthly or weekly basis for your usage? I'm very interested in it, but
my budget is constrained and the usage granted by a $20/month Claude
plan seems much more affordable than when I've tried API-based access
with other agents. Unfortunately, this locks me in to Claude Code.
neop1x wrote 36 min ago:
Use OpenRouter, lots of great open-weights models like MiniMax, Kimi
K2, Mistral, Qwen, ...
twsted wrote 3 hours 15 min ago:
I do not understand some of the comments here: openclaw uses pi.
It seems stange also that even Steinberger in his interviews is not
giving pi the proper attribution.
nacozarina wrote 3 hours 44 min ago:
too many things named âpiâ, kmn
Veen wrote 2 hours 26 min ago:
Its original name was more distinctive but less "enterprise
friendly".
HTML [1]: https://shittycodingagent.ai
squeefers wrote 7 hours 39 min ago:
minimal indeed. why are we regressing back to terminals now? ive seen
this in the rust world mainly
chriswarbo wrote 7 hours 11 min ago:
Doesn't need a terminal: run it in RPC mode to send/receive JSON over
stdio. That's how the pi-coding-agent Emacs package works, which is
the only way I've ever used Pi.
It seems pretty well done: when I added permission requests to the
`bash` tool, the "Are you sure y/N" requests started appearing just
like they were native to Emacs.
jacobgorm wrote 7 hours 36 min ago:
Some of us never left the terminal. Welcome to the future.
bankombinator wrote 8 hours 56 min ago:
Mario mentioned in hackernews.
alabhyajindal wrote 11 hours 51 min ago:
I am also going to now implement an existing project and invent a
different name for it. Look out for Waterfox, a minimal web consumer.
_pdp_ wrote 12 hours 6 min ago:
I still don't get why would you want to use a terminal app to code when
you can do all of this through IDE extension which does the same except
it is better integrated.
You can open a grid of windows inside vscode too and it comes back up
exactly as it was on reload.
rubenflamshep wrote 5 hours 4 min ago:
I've found VSCode _ok_ to work with across across different
workspaces/projects. The window memory is hit and miss. There's a
secondary side bar I've been trying to NOT have open on startup but
always seem to stick around. I'd prefer to programmatically manage
the windows so I can tinker with an automated setup but the VSCode
API/Plugins for managing this are terrible and tend to fail silently.
CLI within VSCode is workable but most of my VSCode envs are within a
docker container. This is a pattern that I'm moving more and more
away from as agents within a container kind of suck.
BeetleB wrote 6 hours 38 min ago:
> I still don't get why would you want to use a terminal app to code
when you can do all of this through IDE extension which does the same
except it is better integrated.
Because then you need to make an extension for every IDE. Isn't it
better to make a CLI tool with a server, and let people make IDE
extensions to communicate with it?
Claude Code has an update every few days. Imagine now propagating
those changes to 20+ IDEs.
chriswarbo wrote 7 hours 21 min ago:
> I still don't get why would you want to use a terminal app to code
when you can do all of this through IDE extension which does the same
except it is better integrated.
I agree. I tried Gemini CLI for a while, and didn't like how separate
I felt from the underlying files: rather than doing minor cleanup
myself, the activation energy of switching to a separate editor and
opening the same files was too high, so I'd prompt the LLM to it
instead. Which was often an exercise in frustration, as it would take
many rounds of explanation for such tiny payoffs; maybe even fiddling
with system prompts and markdown files, to try and avoid wasting so
much time in the future...
I've been using Pi for a few weeks now, and have managed to integrate
it quite deeply into Emacs. I run it entirely via RPC mode (JSON over
stdio), so I don't really know (or care) about its terminal UI :-)
theshrike79 wrote 11 hours 56 min ago:
When I use a CLI agent to code, I don't need the IDE for anything.
Think of it more like directing a coworker or subcontractor via text
chat. You tell them what you want and get a result, then you test it
if it's what you want and give more instructions if needed.
I literally just fixed a maintenance program on my own server while
working my $dayjob. ssh to server, start up claude and tell it what's
wrong, tab away. Then I came back some time later, read what it had
done, tested the script and immediately got a few improvement ideas.
Gave them to Claude, tabbed out, etc.
Took me maybe 15 minutes of active work while chatting on Slack and
managing my other tasks. I never needed to look at the code at any
point. If it works and tests pass, why do I care what it looks like?
_pdp_ wrote 11 hours 43 min ago:
I suppose we are working on different problems.
In my own experience I cannot blindly accept code without even
looking at it even for a few moments because I've had many
situations where the code was simply doing the wrong things...
including tests are completely wrong and testing the wrong
assumptions.
So yah, even when I review trivial changes I still look at the diff
view to see if it makes sense. And IDEs make code review a lot
easier than diff.
Btw, this experience is not from lack of trying. We use coding
agent extensively (I would assume more than the typical org looking
at our bill) and while they are certainly very, very helpful and I
cannot describe how much effort they are really saving us, there is
absolutely zero chance of pushing something out without reviewing
it first - same applies for code written by AI agent or a coworker.
carderne wrote 12 hours 43 min ago:
The people pushing oh-my-pi seem to have missed the point of pi...
Downloading 200k+ lines of additional code seems completely against the
philosophy of building up your harness, letting your agent
self-improve, relying on code that you control.
If you want bags of features, rather clone oh-my-pi somewhere, and get
your agent to bring in bits of it a time, checking, reviewing,
customising as you go.
manojlds wrote 12 hours 27 min ago:
Yeah ohmypi is garbage. The point is you have a thing shell and add
your own on top by just talking to pi itself or pick in selective
extensions.
solarkraft wrote 12 hours 51 min ago:
I happen to be somewhat familiar with OpenCode and am considering using
it as a personal AI workspace (some chat & agentic behavior,
not worrying about initiative behavior just yet, Iâd try to DIY
memory with local files and access to my notes) because it seems to
have a decent ecosystem.
Pi appears to have a smaller, less âpre-madeâ ecosystem, but with
more flexibility, enthusiasm and extensibility.
Is this correct? Should I look towards Pi over OpenCode? What are the
UI options?
mikodin wrote 6 hours 20 min ago:
I've been using PI for this - just switch to "oh my pi" and am liking
it!
Honestly, it's been a dream, I have it running in a docker-sandbox
with access to a single git repo (not hosted) that I am using for
varied things with my business.
Try it out, it's super easy to setup. If you use docker sandbox, you
can just follow what is necessary for claude, spin up the sandbox,
exit out, exec into it with bash and switch to Pi.
mritchie712 wrote 11 hours 44 min ago:
I think you're reading it exactly right
amunozo wrote 12 hours 12 min ago:
I have the same question as you, but I want to add that I used
OpenCode for general tasks like writing, organization and such but
with a context of .md files and it works wonders. And like you, I am
considering trying a better suited harness for this task.
solarkraft wrote 11 hours 35 min ago:
What issues are you facing with OpenCode?
I looked a bit into the reasoning for Piâs design ( [1] ) and,
while it does seem to do a lot of things very well around
extensibility, I do miss support for permissions, MCP and perhaps
Todos and a server mode. OpenCode seems a lot more complete in that
regard, but I can well imagine that people have adapted Pi for
these use cases (OpenClaw seems to have all of these). So itâs
definitely not out of the race yet, but I still appreciate
OpenCodes relative seeming completeness in comparison.
HTML [1]: https://mariozechner.at/posts/2025-11-30-pi-coding-agent/#...
amunozo wrote 7 hours 50 min ago:
To be honest, none for what I am using for (organizing documents,
cross-referencing information, writing summaries of documents).
Howeverm it feels wrong using OpenCode for this. I somehow think
there must be a better way of doing this.
miroljub wrote 10 hours 55 min ago:
> while it does seem to do a lot of things very well around
extensibility, [1] > I do miss support for permissions,
As soon as your agent can write and execute code, your
permissions are just a security theater. If you care, just do
proper sandboxing. If not, there are extensions for that.
> MCP
Again, Pi is extensible.
pi install pi-mcp-adapter
Now, you can connect to any mcp.
> and perhaps Todos
At least 10 different todo extensions. Pick which one you like.
If you don't like any of them, ask Pi to write one for you.
> and a server mode.
Pi has rpc mode, which is a kind of server. If that's not enough,
you could extend it.
> OpenCode seems a lot more complete in that regard,
Yes, but good luck working with Opencode if you don't like their
plan-mode. Or todo support. And MCP. You pay their cost in
complexity and tokens even if you don't use them or you don't
like how they work.
> but I can well imagine that people have adapted Pi for these
use cases (OpenClaw seems to have all of these). So itâs
definitely not out of the race yet, but I still appreciate
OpenCodes relative seeming completeness in comparison.
There's also an oh-my-pi fork if you want an out-of-the-box
experience. Still, in my experience, nothing beats Pi in terms of
customizability. It's the first piece of software that I can
easily make completely to my liking. And I say that as a decade
old Emacs user.
HTML [1]: https://pi.dev/packages
sheerun wrote 13 hours 24 min ago:
This domain must have costed $$$$
thomascountz wrote 13 hours 6 min ago:
pi.dev domain graciously donated by exe.dev
Notwithstanding the donation, this domain must have costed $$$$
raffkede wrote 13 hours 43 min ago:
it even runs inside a browser I'll publish my browserpi if someone is
interested I did not dare to add a pull request with my slop but i
would love to show the fork and create a pull request if there is
broader interest
vinibrito wrote 7 hours 19 min ago:
No need to push to normal repo, just publish your fork, I'd like to
see it
mr_o47 wrote 14 hours 10 min ago:
I recently discovered this via a YouTube video a few days ago
I really like the customization aspect of it and you can build tools on
fly and even switch model mid session
Thereâs another project here called oh my pi has anyone here tried it
thepasch wrote 14 hours 48 min ago:
Stop advertising pi, people. It _somehow_ continued to fly somewhat
under the radar after that whole OpenClaw nonsense. Donât make
Anthropicâs sic their bloodhounds on them like they did on OpenCode.
kristianpaul wrote 7 hours 13 min ago:
People deserve to know it exists, I got tired of even OpenCode
workflows/agents, installed OpenSpec but all this wrapped todos still
not how i wanted
I needed more control but dint wanted to write my own tool, then i
ended knowing about pi, this got me interested at first read:
No plan mode. Write plans to files, or build it with extensions, or
install a package.
No built-in to-dos. Use a TODO.md file, or build your own with
extensions.
No background bash. Use tmux. Full observability, direct interaction.
This is very important to have control and ownership.
Pi is not for everyone, but the ones eventually want to have tools
like (read, bash, edit, write, grep, find, ls) as building blocks.
raincole wrote 14 hours 19 min ago:
Interestingly, since OpenClaw, there has been ~one post about Pi
every week. But practically no one voted any of them except this one.
tietjens wrote 14 hours 38 min ago:
pi is an officially accepted harness of either Anthropic or OpenAI. I
forgot which.
reacharavindh wrote 15 hours 58 min ago:
I began with pi, and have been using oh-my-pi the last two weeks. [1]
More of a batteries included version of pi.
HTML [1]: https://github.com/can1357/oh-my-pi
mr_o47 wrote 14 hours 8 min ago:
Howâs your experience so far with oh my pi
reacharavindh wrote 11 hours 59 min ago:
A few things. I intentionally clone the repo and build it locally
for my use and use it as my-omp.. this way, I can make oh-my-pi
make customisations like skills, tools, anything and yet retain the
ability to do a git pull from upstream with cherry picking if
necessary.
I have this in my shell rc.
# bun
export BUN_INSTALL="$HOME/.bun"
export PATH="$BUN_INSTALL/bin:$PATH"
alias my-omp=
"bun/Users/aravindhsampathkumar/ai_playground/oh-my-pi/packages/cod
ing-agent/src/cli.ts"
and do
1. git pull origin main
2. bun install
3. bun run build:native
every time I pull changes from upstream.
Until yesterday, this process was purely bliss - my own minimal
custom system prompt, minimal AGENTS.md, and self curated
skills.md. One thing I was wary of switching from pi to oh-my-pi
was the use of Rust tools pi-native using NAPI. The last couple of
days whatever changes I pulled from upstream is causing the models
to get confused about which tool to use and how while
editing/patching files. They are getting extremely annoyed - I see
11 iterations of a tool call to edit a damn file and the model then
resorted to rewriting the whole file from memory, and we al know
how that goes. This may not be a bug in oh-my-pi per se. My guess
is that the agent developed its memory based on prior usage of the
tools and my updating oh-my-pi brought changes in their usage. It
might be okay if I could lose all agent memory and begin again, but
I dont want to.
I'm going to be more diligent about pulling upstream changes from
now on, and only do when I can afford a full session memory wipe.
Otherwise, the integrations with exa for search, LSP servers on
local machine, syntax highlighting, steering prompts, custom tools
(using trafilatura to fetch contents of any url as markdown, use
calculator instead of making LLM do arithmetic) etc work like a
charm. I haven't used the IPython integration nor do I plan to.
self_awareness wrote 15 hours 24 min ago:
Are you running it in some kind of sandbox? Does it have sandboxing
features?
neop1x wrote 28 min ago:
I use a sandbox example extension with comes with Pi, it uses the
anthropic sandbox runtime (bubblewrap on linux). The runtime has
one bug and needs one improvement (I've made PRs, no response yet).
Pi's sandbox example extension does not block internal tools
(read/write) according to rules, I've created a PR but can't submit
because of Pi's OSS vacation BS... [1] I am quite happy with my
patched forks for now
HTML [1]: https://github.com/badlogic/pi-mono/compare/main...k3a:pi-...
reacharavindh wrote 11 hours 52 min ago:
I dont. I use this as my coding harness (replacement of
gemini-cli/claudecode etc). I dont want to sandbox it because I
expect it to be used only for coding on projects. I dont want to
over complicate it.
I am building my own assistant as an AI harness - that is
definitely getting sandboxed to run only as a VM on my Mac.
buremba wrote 17 hours 36 min ago:
I spent 3 months adopting Codex and Claude Code SDKs only to realize
they're just vendor lock-in and brittle. They're intended to be used as
CLI so it's not programmable enough as a library. After digging into
OpenClaw codebase, I can safely say that the most of its success comes
from the underlying harness, pi agent.
pi plugins support adding hooks at every stage, from tool calls to
compaction and let you customize the TUI UI as well. I use it for my
multi-tenant Openclaw alternative [1] If you're building an agent,
please don't use proprietary SDKs from model providers. Just stick to
ai-sdk or pi agent.
HTML [1]: https://github.com/lobu-ai/lobu
burgerquizz wrote 12 hours 20 min ago:
how do you replicate the claude code system prompts in pi?
i have tried using claude agebt sdk without the claude code preset,
and it is quite bad
Munksgaard wrote 10 hours 46 min ago:
Pretty easy, the prompts can be seen here[0] and pi supports
setting SYSTEM.md.
0:
HTML [1]: https://cchistory.mariozechner.at/
Majromax wrote 8 hours 38 min ago:
For all of the recent talk about how Anthropic relies on heavy
cache optimization for claude-code, it certainly seems like
session-specific information (the exact datestamp, the
pid-specific temporary directory for memory storage) enters
awfully early in the system prompt.
kzahel wrote 13 hours 57 min ago:
I left some notes about this. I agree with you directionally but
practically/economically you want to let users leverage what they're
already paying for. [1] Captures the ai-sdk and pi-mono.
In an ideal world we would have a pi-cli-mono or similar, like
something that is not as powerful as pi but gives a least common
denominator sort of interface to access at least claude/codex.
ACP is also something interesting in this space, though I don't
honestly know how that fits into this story.
HTML [1]: https://yepanywhere.com/subscription-access-approaches/
buremba wrote 11 hours 5 min ago:
Page returns 404. ACP is great, indeed better to give pi-mono ACP
than claude or codex directly.
HTML [1]: https://x.com/bu7emba/status/2026364497527513440
vanillameow wrote 16 hours 15 min ago:
Unfortunately it's currently very utopian for (I would assume) most
devs to use something like this when API cost is so prohibitively
expensive compared to e.g. Claude Code. I would love to use a lighter
and better harness, but I wouldn't love to quintuple my monthly
costs. For now the pricing advantage is just too big for me compared
to the inconvenience of using CC.
badlogic wrote 13 hours 11 min ago:
OpenAI officially supports using your subscription with pi. Same
for OpenCode and other 3rd party harnesses.
buremba wrote 15 hours 57 min ago:
You technically still use CC, it's not via SDK but via CLI
programmatically triggered via pi.
vanillameow wrote 14 hours 17 min ago:
Is this in line with Anthropic ToS? They cracked down hard on
Clawdbot and the like from what I gathered. I guess if you are
still invoking CC it might be fine, but isn't that gonna lead to
weird behavior from basically doubling up on harnesses?
buremba wrote 11 hours 36 min ago:
Nobody knows, including Anthropic itself I suppose
bjackman wrote 16 hours 17 min ago:
IIUC to reliably use 3P tools you need to use API billing, right?
Based on my limited experimentation this is an order of magnitude
more expensive than consumer subscriptions like Claude Pro, do I have
that right?
("Limited experimentation" = a few months ago I threw $10 into the
Anthropic console and did a bit of vibe coding and found my $10
disappeared within a couple of hours).
If so, that would support your concern, it does kinda sound like
they're selling marginal Claude Code / Gemini CLI tokens at a loss.
Which definitely smells like an aggressive lockin strategy.
buremba wrote 15 hours 59 min ago:
Technically you're still using claude CLI with this pattern so it's
not 3P app calling Anthropic APIs via your OAuth token. Even if you
would use Claude Code SDK, your app is 3P so it's in a gray area.
Anthropic docs is intentionally not clear about how 3P tools are
defined, is it calling Claude app or the Anthropic API with the
OAuth tokens?
siva7 wrote 16 hours 59 min ago:
I also wondered for months why it feels so difficult to use Openai or
Anthropic SDKs until i came to a similar conclusion.
ianlpaterson wrote 19 hours 55 min ago:
Coming from OpenClaw, it's pretty amazing how fast pi is, particularly
paired with Qwen3 that dropped today. It's a magical time.
jasonjmcghee wrote 18 hours 8 min ago:
What dropped today? Wasn't Qwen3 Coder Next released beginning of the
month?
Qwen3.5 released a couple of days ago but I'm not that RAM rich
breisa wrote 17 hours 24 min ago:
Alibaba released a whole set of new Qwen 3.5 models including a
~120B and a ~35B MoE.
jasonjmcghee wrote 8 hours 26 min ago:
Nice. 27B looks reasonable too.
kristianpaul wrote 19 hours 53 min ago:
Indeed, it seems to just works with a self hosted Qwen3 coder next.
TacticalCoder wrote 20 hours 14 min ago:
Naming skills though...
20022026 wrote 20 hours 23 min ago:
Anyone tried pi with 5.3-codex vs codex cli?
rcarmo wrote 16 hours 23 min ago:
I run it almost exclusively with codex models. Zero issues.
CGamesPlay wrote 20 hours 31 min ago:
To me, the most interesting thing about Pi and the "claw" phenomenon is
what it means for open source. It's becoming passé to ask for feature
requests and even to submit PRs to open source repos. Instead of
extensions you install, you download a skill file that tells a coding
agent how to add a feature. The software stops being an artifact and
starts being a living tool that isn't the same as anyone else's copy.
I'm curious to see what tooling will emerge for collaborating with this
new paradigm.
brandensilva wrote 6 hours 23 min ago:
I totally feel this. Prior I never had time for doing this but now I
just do it without even thinking about contributing.
giancarlostoro wrote 7 hours 5 min ago:
> Instead of extensions you install, you download a skill file that
tells a coding agent how to add a feature. The software stops being
an artifact and starts being a living tool that isn't the same as
anyone else's copy. I'm curious to see what tooling will emerge for
collaborating with this new paradigm.
I build my own inspired by Beads, not quite as you're describing, but
I store todo's in a SQLite database (beads used SQLite AND git hooks,
I didn't want to be married to git), and I let them sync to and from
GitHub Issues, so in theory I can fork a GitHub repo, and have my
tool pull down issues from the original repo (havent tried it when
its a fork, so that's a new task for the task pile). [1] You can see
me dogfeeding my tool to my tools codebase and having my issues on
the github for anyone to see, you can see the closed ones. I do think
we will see an increase in local dev tooling that is tried and tested
by its own creators, which will yield better purpose driven tooling
that is generic enough to be useful to others.
I used to use Beads for all my Claude Code projects, now I just use
GuardRails because it has safety nets and works without git which is
what I wanted.
I could have forked Beads, but the other thing is Beads is a behemoth
of code, it was much easier to start from nothing but a very detailed
spec and Claude Code ;)
HTML [1]: https://github.com/Giancarlos/guardrails/issues
davej wrote 7 hours 12 min ago:
Patrick Collison said this yesterday on TBPN, "Software is becoming
like pizza [â¦] It should be cooked right then and there at the
moment of use"
GTP wrote 7 hours 43 min ago:
> It's becoming passé to ask for feature requests and even to
submit PRs to open source repos.
Yet, the first impact on FOSS seems to be quite the opposite:
maintainers complaining about PRs and vulnerability disclosures that
turn out to be AI hallucinations, wasting their time. It seems to be
so bad that now GitHub is offering the possibility of turning off
pull requests for repositories. What you present here is an
optimistic view, and I would be happy for it to be correct, but what
we've seen so far unfortunately seems to point in a different
direction.
brandensilva wrote 6 hours 11 min ago:
We might be witnessing some survivor bias here based on our own
human conditioning. Successful PRs aren't going to make the news
like the bad ones do.
With that said, we are all dealing with AI still convincingly
writing code that doesn't work despite passing tests or introducing
hard to find bugs. It will be some time until we iron that out
fully for more reliable output I suspect.
Unfortunately we won't be able to stop humans thinking they are
software engineers when they are not now that the abstraction
language is the human language so guarding from spam will be more
important than ever.
rbren wrote 8 hours 39 min ago:
Funny, I just released my dev setup as âOpen promptâ
HTML [1]: https://github.com/rbren/personal-ai-devbox
thierrydamiba wrote 11 hours 20 min ago:
I actually look at this another way. I think weâre going to see a
lot more open source. Before you had to get your pr merged into main.
Now people will just ask ai to build the tool they need and then open
source it.
Maintainers wonât have to deal with an endless stream of PRs. Now
people will just clone your library the second it has traction and
make it perfect for their specific use case.
Cherry pick the best features and build something perfect for them.
Theyâll be able to do things your product canât, and individual
users will probably find a better fit in these spinoffs than in the
original app.
hebejebelus wrote 12 hours 19 min ago:
I've been thinking about this lately too. I think we're going to see
the rise of Extremely Personal Software, software that barely makes
any sense outside of someone's personal context. I think there is
going to be _so_ much software written for an audience of 1-10 people
in the next year. I've had Claude create so much tooling for me and a
small number of others in the last few months. A DnD schedule app; a
spoiler-free formula e news checker; a single-use voting site for a
climbing co-op; tools to access other tools that I don't like using
by hand; just absolutely tons of stuff that would never have made any
sense to spend time on before. It's a new world.
HTML [1]: https://redfloatplane.lol/blog/14-releasing-software-now/
boh wrote 10 hours 11 min ago:
I think people overestimate the general population's ability and
interest in vibe coding. Open source tools are still a small niche.
Vibe code customized apps are an even bigger niche.
tagami wrote 7 hours 31 min ago:
even smaller?
hebejebelus wrote 9 hours 34 min ago:
Maybe so. I guess I feel that in a couple of years it may not be
called vibe coding, or even coding, I think it might be called
'using a computer'. I suppose it's very hard to correctly
estimate or reason about such a big change.
lugao wrote 13 hours 26 min ago:
Why would this new paradigm create interesting tooling? From your
description I expect wrose not better tools.
vidarh wrote 12 hours 29 min ago:
Worse it better for you when it meets your needs better.
I use a lot of my own software. Most of it is strictly worse both
in terms of features and bugs than more intentional, planned
projects. The reason I do it is because each of those tools solve
my specific pain points in ways that makes my life better.
A concrete example: I have a personal dashboard. It was written by
Claude in its entirety. I've skimmed the code, but no more than
that. I don't review individual changes. It works for me. It pulls
in my calendar, my fitbit data, my TODO list, various custom
reminders to work around my tendency to procrastinate, it surfaces
data from my coding agents, it provides a nice interface for me to
browse various documentation I keep to hand, and a lot more.
I could write a "proper" dashboard system with cleanly pluggable
modules. If I were to write it manually I probably would because
I'd want something I could easily dip in and out of working on. But
when I've started doing stuff like that in the past I quickly put
it aside because it cost more effort than I got out of it. The
benefit it provides is low enough that even a team effort would be
difficult to make pay off.
Now that equation has fundamentally changed. If there's something I
don't like, I tell Claude, and a few minutes - or more - later, I
reload the dashboard and 90% of the time it's improved.
I have no illusions that code is generic enough to be usable for
others, and that's fine, because the cost of maintaining it in my
time is so low that I have no need to share that burden with
others.
I think this will change how a lot of software is written. A
"dashboard toolkit" for example would still have value to my
"project". But for my agent to pull in and use to put together my
dashboard faster.
A lot of "finished products" will be a lot less valuable because
it'll become easier to get exactly what you want by having your
agent assemble what is out there, and write what isn't out there
from scratch.
lugao wrote 12 hours 13 min ago:
To be clear I never said custom vibe coded personal software is
bad. But clearly that's not the point from OP. Quoting directly:
> you download a skill file that tells a coding agent how to add
a feature
This is suggesting a my_feature.md would be a way of sharing and
improving software in the future, which I think is mostly a bad
thing.
vidarh wrote 11 hours 55 min ago:
It is a way of sharing and improving software already today.
Not a major way, yet, but I don't agree with you it would be a
bad thing for that to become more common, in as much as - to go
back to my dashboard example - sharing a skill that contains
some of the lessons learned, and packages small parts would
seem far more flexible and viable as a path for me to help make
it easier for others to do the same, than packaging up
something in a way that'd give the expectation that it was
something finished.
But also, note that skills can carry scripts with them, so they
are definitely also more than a my_feature.md.
bandrami wrote 15 hours 22 min ago:
> a living tool that isn't the same as anyone else's copy
Yes, which is why this model of development is basically
dead-in-the-water in terms of institutional adoption. No large firm
or government is going to allow that.
raincole wrote 14 hours 12 min ago:
Large institutions and governments had been pushing back against
open source too until it became obviously inevitable.
bandrami wrote 13 hours 54 min ago:
It wasn't "inevitable", it took Red Hat and some other key
players addressing the concerns the businesses and governments
had, which took the better part of a decade. If LLMs as an
ecosystem don't implode in the next year or so I imagine you'll
start to see some big consultancies starting that same process
for them.
embedding-shape wrote 12 hours 52 min ago:
> it took Red Hat and some other key players addressing the
concerns the businesses and governments had
Red Hat? I don't think they are involved in the moves to FOSS
for government agencies, mostly because they're American, and
the ones who are currently moving quickly (in the government
world at least) are the ones who aren't American and what to
get rid of their reliance on American infrastructure and
software.
bandrami wrote 12 hours 41 min ago:
Visit Washington DC some time and ride the metro. Red Hat
puts out ads about all their public sector offerings.
embedding-shape wrote 11 hours 52 min ago:
> Visit Washington DC some time and ride the metro. Red Hat
puts out ads about all their public sector offerings.
I haven't had a single need to visit the US, and I still
have zero needs for it. If I need to read subway ads to
understand how a company is connected to FOSS, I think I'll
skip that and continue using and working with companies who
make that clear up front :) Thanks for the offer though!
ambicapter wrote 6 hours 49 min ago:
An unnecessarily snarky response to someone offering you
clear information.
navigate8310 wrote 10 hours 16 min ago:
RHEL is quite ubiquitous in the States, not everything is
Microsoft Windows Server
embedding-shape wrote 10 hours 6 min ago:
Right, but is "the States" currently trying to migrate
away from US infrastructure and choosing FOSS to do so?
That was the context I was entering this thread with,
since most of the organizations moving to FOSS right
now are doing so to move away from US infrastructure.
hrimfaxi wrote 7 hours 57 min ago:
The whole context was how Red Hat was historically
involved in addressing the concerns that were
hindering government adoption. Are you just being
intentionally obtuse to denigrate the US for some
reason?
petcat wrote 11 hours 19 min ago:
Most American government infrastructure runs on Red Hat.
Almost all of Amazon's internal operations runs on Amazon
Linux, which is a rebranded Red Hat, and it powers Gov
Cloud.
IBM didn't acquire Red Hat for no reason.
CuriouslyC wrote 19 hours 27 min ago:
[flagged]
theshrike79 wrote 17 hours 0 min ago:
Think of skills more like Excel macros (or any other software with
robust macro support). It doesn't make sense for Microsoft to
provide the specific workflow you need, but your own sheet needs
it.
navigate8310 wrote 10 hours 14 min ago:
Except "skills" being worked upon by a deterministic model will
result in inconsistent results than a heuristic VB macro written
for Excel
throwaway13337 wrote 19 hours 30 min ago:
I see this happening, too.
We know that a lack of control over their environment makes animals,
including humans, depressed.
The software we use has so much of this lack of control. It's their
way, their branding, their ads, their app. You're the guest on your
own device.
It's no wonder everyone hates technology.
A world with software that is malleable, personal, and cheap - this
could do a lot of good. Real ownership.
The nerds could always make a home with their linux desktop. Now
everyone can. It'll change the equation.
I'm quite optimistic for this future.
GTP wrote 7 hours 41 min ago:
> The nerds could always make a home with their linux desktop. Now
everyone can. It'll change the equation.
Probelm is, to be able to do what you're describing, you still need
the source code and the permission to modify it. So you will need
to switch to the FOSS tools the nerds are using.
throwaway13337 wrote 7 hours 14 min ago:
That's a feature, not a bug.
It means normies will finally see value in open source beyond
just being free. They'll choose it over closed source
alternatives.
This, too, makes a brighter future.
blubber wrote 3 hours 13 min ago:
Obligatory post: open source != free software.
There is OSS you are not allowed to modify etc.
cedws wrote 8 hours 38 min ago:
Weâre off to a great start then with Anthropic banning users who
use alternative clients with their Claude subscription.
yowlingcat wrote 4 hours 56 min ago:
I'm actually relieved they're doing it now because it's going to
be a forcing function for the local LLM ecosystem. Same thing
with their "distillation attack" smear piece -- the more of a
spotlight people get on true alternatives + competition to the
900 lb gorillas, the better for all users of LLMs.
cedws wrote 4 hours 27 min ago:
I really hope so. I moved to Codex, only to get my account
flagged and my requests downgraded to 5.2 because of some
"safety" thing. Now OpenAI demands I hand my ID over to
Persona, the incredibly dodgy US surveillance company Discord
just parted ways with, to get back what I paid for.
This timeline sucks, I don't want to live in a future where
Anthropic and OpenAI are the arbiters of what we can and cannot
do.
h14h wrote 14 hours 52 min ago:
I'm presently in the process of building (read: directing
claude/codex to build) my own AI agent from the ground up, and it's
been an absolute blast.
Building it exactly to my design specs, giving it only the tool
calls I need, owning all the data it stores about me for RAG,
integrating it to the exact services/pipelines I care about... It's
nothing short of invigorating to have this degree of control over
something so powerful.
In a couple of days work, I have a discord bot that's about as
useful as chatgpt, using open models, running on a VPS I manage,
for less than $20/mo (including inference). And I have full control
over what capabilities I add to it in the future. Truly wild.
discreteevent wrote 7 hours 59 min ago:
> It's nothing short of invigorating to have this degree of
control over something so powerful
Is this really that different to programming? (Maybe you haven't
programmed before?)
h14h wrote 3 hours 28 min ago:
Fair point.
> It's nothing short of invigorating to have this degree of
control over something so powerful
I'm a SWE w/ >10 years, and you're right, this part has always
been invigorating.
I suppose what's "new" here is the drastically reduced amount
of cognitive energy I need build complex projects in my spare
time. As someone who was originally drawn to software because
of how much it lowered the barrier to entry of birthing an idea
into existence (when compared to hardware), I am genuinely
thrilled to see said barrier lowered so much further.
Sharing my own anecdotal experience:
My current day job is leading development of a React Native
mobile app in Typescript with a backend PaaS, and the bulk of
my working memory is filled up by information in that domain.
Given this is currently what pays the bills, it's hard to
justify devoting all that much of my brain deep-diving into
other technologies or stacks merely for fun or to satisfy my
curiosity.
But today, despite those limitations, I find myself having
built a bespoke AI agent written from scratch in Go, using a
janky beta AI Inference API with weird bugs and sub-par
documentation, on a VPS sandbox with a custom Tmux & Neovim
config I can "mosh" into from anywhere using finely-tuned
Tailscale access rules.
I have enough experience and high-level knowledge that it's
pretty easy for me to develop a clear idea of what exactly I
want to build from a tooling/architecture standpoint, but prior
to Claude, Codex, etc., the "how" of building it tended to be a
big stumbling block. I'd excitedly start building, only to run
into the random barriers of "my laptop has an ancient version
of Go from the last project I abandoned" or "neovim is having
trouble starting the lsp/linter/formatter" and eventually go
"ugh, not worth it" and give up.
Frankly, as my career progressed and the increasingly complex
problems at work left me with vanishingly less brain-space for
passion projects, I was beginning to feel this crushing sense
of apathy & borderline despair. I felt I'd never be able make
good on my younger self's desire to bring these exciting ideas
of mine into existence. I even got to the point where I
convinced myself it was "my fault" because I lacked the metal
to stomach the challenges of day-to-day software development.
Now I can just decide "Hmm.. I want an lightweight agent in a
portable binary. Makes sense to use Go." or "this beta API
offers super cheap inference, so it's worth dealing with some
jank" and then let an LLM work out all the details and do all
the troubleshooting for me. Feels like a complete 180 from
where I was even just a year or two ago.
At the risk of sounding hyperbolic, I don't think it's
overstating things to say that the advent of "agentic
engineering" has saved my career.
afro88 wrote 12 hours 43 min ago:
What models and inference provider?
h14h wrote 8 hours 8 min ago:
I'm using kimi-k2-instruct as the primary model and building
out tool calls that use gpt-oss-120b to allow it to opt-in to
reasoning capabilities.
Using Vultr for the VPS hosting, as well as their inference
product which AFAIK is by far the cheapest option for hosting
models of these class ($10/mo for 50M tokens, and $0.20/M
tokens after that). They also offer Vector Storage as part of
their inference subscription which makes it very convenient to
get inference + durable memory & RAG w/ a single API key.
Their inference product is currently in beta, so not sure
whether the price will stay this low for the long haul.
ac29 wrote 1 hour 31 min ago:
You can definitely get gpt-oss-120b for much less than
$0.20/M on openrouter (cheapest is currently 3.9c/M in 14c/M
out). Kimi K2 is an order of magnitude larger and more
expensive though.
What other models do they offer? The web page is very light
on details
hdjrudni wrote 18 hours 52 min ago:
That's just because corporations got greedy and made their apps
suck.
Strip away the ads, the data harvesting, add back the power
features, and we'll be happy again. I'm more willing than ever to
pay a one-time fee good software. I've started donating to all the
free apps I use on a regular basis.
I don't want to own my own slop. That doesn't help me. Use your AI
tools to build out the software if you want, but make sure it does
a good job. Don't make me fiddle with indeterministic
flavor-of-the-month AI gents.
moring wrote 15 hours 42 min ago:
> That's just because corporations got greedy and made their apps
suck.
It is true for me with Linux. I code for a living and I can't
change anything because I can't even build most software -- the
usual configure/make/make install runs into tons of compiler
errors most of the time.
Loss of control is an issue. I'm curious if AI tools will change
that though.
peepee1982 wrote 16 hours 45 min ago:
What you're describing is the expected and correct outcome inside
a profit-oriented, capitalist system. So the only way I see out
of this situation would be changing policy to a more socialist
one, which doesn't seem to be so popular among the tech elite,
who often think they deserve their financial status because of
the 'value' they provide, without specifying what that value is
(or its second-order consequences). Whether that's abusing a
monopolistic market position they lucked into, making apps as
addictive as possible, or building drones that throw bombs on
newborns in hospitals.
throwaway13337 wrote 5 hours 10 min ago:
I think we're after the same goal but have a different view of
mechanism.
Regulation enforcement against the anti-market behaviors would
bring a lot of good.
Putting too much power in any centralized authority - company
or government - seems to lead to oppression and unhealthy
culture.
Fair markets are the neatest trick we have. They put the
freedom of choice in the hands of the individual and allow
organic collaboration.
The framing should not be government vs company. But
distributed vs centralized power. For both governance and
commerce.
The entire world right now suffers from too much centralized
power. That comes in the form of both corporate and government.
Power tends to consolidate until the bureaucracy of the
approach becomes too inefficient and collapses under its own
weight. That process is painful, and it's not something I enjoy
living through.
If you see through that lens, it has explaining power for the
problems of both the EU countries and the US.
safety1st wrote 18 hours 21 min ago:
I think there's room for both visions. Big Tech is generating
more toxic sludge than ever, and yeah sure this is because
they're greedy, but more precisely the root cause is how they
lobbied Washington and our elected officials agreed to all kinds
of pro-corporate, anti-human legislation. Like destroying our
right to repair, like criminalizing "circumvention" measures in
devices we own, like insane life-destroying penalties for
copyright infringement, like looking the other way when Big Tech
broke anti-trust laws, etc.
The Big Tech slop can only be fixed in one way, and actually it's
really predictable and will work - we need to fix the laws so
that they put the rights and flourishing of human beings first,
not the rights and flourishing of Big Tech. We need to fix
enforcement because there are so many times that these companies
just break the law and they get convicted but they get off with a
slap on the wrist. We need to legislate a dismantling of barriers
to new entrants in the sectors they dominate. Competition for the
consumer dollar is the only thing that can force them to be more
honest. They need to see that their customers are leaving for
something better, otherwise they'll never improve.
But our elected officials have crafted laws and an enforcement
system which make 'something better' impossible (or at least
highly uneconomical).
Parallel to this if open source projects can develop software
which is easier for the user to change via a PR, they totally
should. We can and should have the best of both worlds. We should
have the big companies producing better "boxed" software. Plus we
should have more flexibility to build, tweak and run whatever we
want.
LancelotLac wrote 6 hours 15 min ago:
and being able to fire employees for profit gain when they
already make a profit, thats illegal in other countries
mentalgear wrote 14 hours 50 min ago:
Very good points, I agree and would add : "Interoperability" is
the key to bring back competition and open the ecosystem again.
bergfest wrote 17 hours 0 min ago:
And then they will take away your right to boot whatever you
want. For national security reasons and the children, of
course.
axelthegerman wrote 20 hours 27 min ago:
And how great it will be to troubleshoot any issues because everyone
is basically running a distinct piece of software
theshrike79 wrote 17 hours 1 min ago:
It's like the dude who monkey-patches their car and goes to the
dealer to complain why the suspension is stiff.
It's because you put 2by4's in place of the shocks, you absolute
muppet. And then they either give them a massive bill to fix it
properly or politely show them out.
Same will happen in self-modifying software. Some people are
self-aware enough to know that "I made this, it's my problem to
fix", some will complain to the maker of the harness they used and
will be summarily shown the door.
wrxd wrote 17 hours 8 min ago:
I donât want to be the one who has to upgrade this software +
vibe coded patches.
Itâs going to be very likely that once something is patched is to
be considered as diverged and very hard to upgrade
sshine wrote 20 hours 20 min ago:
... made minutes ago.
krickelkrackel wrote 17 hours 35 min ago:
So everybody will be using (sometimes slightly, sometimes
entirely) different software. Like mutations, these adapt to the
specific problems in the situation they were prompted to be
programmed.
fnord77 wrote 20 hours 38 min ago:
I mean using the captive agents is much cheaper than supplying your api
key to a 3rd party agent.
TZubiri wrote 21 hours 24 min ago:
Wtf is that example gif?
The prompt shown is
"Who's your daddy and what does he do?"
Is this a joke or tech? Is the author a dev or a clown?
enneff wrote 20 hours 12 min ago:
Itâs a quote from the movie Kindergarten Cop.
NamlchakKhandro wrote 21 hours 5 min ago:
No one cares about your opinions.
This coding agent certainly couldn't give a fuck.
rglover wrote 21 hours 26 min ago:
Excited to give this a try, looks really well done.
mobrienv wrote 21 hours 29 min ago:
Another batteries included pi setup. Built a lightweight mobile webui
to run it on termux and code on my phone.
HTML [1]: https://github.com/mikeyobrien/rho
moonlion_eth wrote 21 hours 32 min ago:
ive been using pi for about a week as daily driver and so far im happy
with it. I really like the modular concept and also that its rather
minimal
qazplm17 wrote 21 hours 36 min ago:
Pi treats you like an adult and shows whatever the fuck LLM is doing
rather than actively hiding shit from the user. And just for that, once
you tasted the freedom and transparency, thereâs no way to go back to
CC.
WXLCKNO wrote 14 hours 37 min ago:
After 2.20.0 of Claude code where they started not showing what files
are read / searches are made by default .. I fucking love how easy it
was to ditch Claude code for pi.
TZubiri wrote 21 hours 23 min ago:
I think OpenCode is the same.
They are all open source though so you can just find out whats going
on if you want right?
gtirloni wrote 22 hours 16 min ago:
What's a coding harness? Claude Code is a "harness" and not a TUI?
jasonjmcghee wrote 18 hours 3 min ago:
The fact that it's a tui isn't particularly relevant. It could be a
gui or cli and provide very similar value.
Nearly all of its value is facilitating your interaction with the
LLM, the tools it can use, and how it uses them.
gtirloni wrote 3 hours 41 min ago:
We used to call these "libraries".
jasonjmcghee wrote 11 min ago:
Harness is an appropriate name. It comes from reinforcement
learning world where you need to build the proper scaffolding for
it to optimize for the goal you want it to.
This is very similar to what the agent is doing. You are building
the appropriate environment for it to be able to complete the
task most reliably etc
Not just functions/tools and documentation available (which is
similar to a library), also context and critically, enforcement
of behavior.
This is probably the key thing that makes it a "harness". If the
agent can do whatever it wants, it's not in a harness.
ErikBjare wrote 22 hours 11 min ago:
If you run Claude Code with `-p --output-format json` it's no longer
a TUI, but it's still a harness.
indigodaddy wrote 22 hours 16 min ago:
too bad I cannot star this..
HTML [1]: https://github.com/badlogic/pi-mono/tree/main/packages/coding-...
type4 wrote 22 hours 25 min ago:
What are people using to cost efficiently use this? I was using a
Google Ultra sub which gave enough but thatâs gone now.
ChatGPT $20/month is alright but I got locked out for a day after a
couple hours. Considering the GitHub pro plus plan.
raffkede wrote 13 hours 6 min ago:
Kimi code with the .99 Cent plan is not to bad if you're savy
UncleOxidant wrote 19 hours 55 min ago:
Run Qwen3-coder-next locally. That's what I'm doing (using LMstudio).
It's actually a surprisingly capable model. I've had it working on
some LLVM-IR manipulation and microcode generation for a kind of VLIW
custom processor. I've been pleasantly surprised that it can handle
this (LLVM is not easy) - there are also verilog code that define the
processor's behavior that it reads to determine the microcode format
and expected processor behavior. When I do hit something that it
seems to struggle with I can go over to antigravity and get some free
Gemini 3 flash usage.
zirror wrote 18 hours 29 min ago:
What kind of hardware do you run it on?
UncleOxidant wrote 4 hours 25 min ago:
Framework Desktop (AMD Strix Halo with 128GB). Runs it around 27
tok/sec which is quite acceptable.
kristianpaul wrote 19 hours 52 min ago:
Same here
rahimnathwani wrote 20 hours 50 min ago:
You could try minimax 2.5 via openrouter.
ursuscamp wrote 20 hours 38 min ago:
MiniMax has an incredibly affordable coding plan for $10/month. It
has a rolling five hour limit of 100 prompts. 100 prompts doesn't
sound like much, but in typical AI company accounting fashion, 1
prompt is not really 1 prompt. I have yet to come even close to
hitting the limit with heavy use.
lambda wrote 21 hours 16 min ago:
Qwen3 Coder Next in llama.cpp on my own machine. I'm an AI hater, but
I need to experiment with it occasionally, I'm not going to pay
someone rent for something they trained on my own GitHub, Stack
overflow, and Reddit posts.
beacon294 wrote 22 hours 12 min ago:
FWIW the lockout probably wasn't related... maybe the content you
were working on or your context window management somehow triggered
something?
chriswarbo wrote 22 hours 38 min ago:
I've been using pi via the pi-coding-agent Emacs package, which uses
its RPC mode to populate a pair of Markdown buffers (one for input, one
for chat), which I find much nicer than the awful TUIs used by
harnesses like gemini-cli (Emacs works perfectly well as a TUI too!).
The extensibility is really nice. It was easy to get it using my
preferred issue tracker; and I've recently overridden the built-in
`read` and `write` commands to use Emacs buffers instead. I'd like to
override `edit` next, but haven't figured out an approach that would
play to the strengths of LLMs (i.e. not matching exact text) and Emacs
(maybe using tree-sitter queries for matches?). I also gave it a
general-purpose `emacs_eval`, which it has used to browse documentation
with EWW.
dnouri wrote 21 hours 36 min ago:
Nice! I'm curious to hear how you're mapping `read` and `write` to
Emacs buffers. Does that mean those commands open those files in
Emacs and read and write them there?
Let me also drop a link to the Pi Emacs mode here for anyone who
wants to check it out: [1] -- or use: M-x package-install
pi-coding-agent
We've been building some fun integrations in there like having RET on
the output of `read`, `write`, `edit` tool calls open the
corresponding file and location at point in an Emacs buffer. Parity
with Pi's fantastic session and tree browsing is hopefully landing
soon, too. Also: Magit :-)
HTML [1]: https://github.com/dnouri/pi-coding-agent
chriswarbo wrote 11 hours 19 min ago:
I've pushed the extension to GitHub at [1] The implementation is
pretty terrible: a giant string of vibe-coded Emacs Lisp is sent to
emacsclient, which performs the actions and sends back a string of
JSON.
It's been interesting to iterate on the approach: watching the LLM
(in my case Claude) attempting to use the tools; noticing when it
struggles or makes incorrect assumptions; and updating the tool,
documentation and defaults to better match those expectations.
I've also written some Emacs Lisp which opens Pi and tells it to
"Action the request/issue/problem at point in buffer ''" [2] It
feels similar to the file-watching provided by Aider (which uses
inotify to spot files containing `# AI!` or `# AI?`), which I've
previously used with FIXME and TODO comments in code; but it also
works well in non-file things, e.g. error messages and test
failures in `shell-mode`, and issues listed in the Emacs UI I wrote
for the Artemis bug tracker (Claude just gets the issue number from
the current line, and plugs that into a Pi extension I made for
Artemis :-) )
HTML [1]: https://github.com/Warbo/pi-extensions/tree/master/extensi...
HTML [2]: https://github.com/Warbo/warbo-emacs-d/blob/a13a1e02f52034...
dnouri wrote 4 hours 48 min ago:
Oh that sounds neat. I'll need to check out your extension!
isagawa-co wrote 22 hours 46 min ago:
Interesting approach to planning via extensions. I took a similar
direction with enforcement. A governance loop that hooks into the
agent's tool calls and blocks
execution until protocol is followed. Every 10 actions (configurable),
the agent re-centers. No permission popups, but the agent literally
can't skip steps.
Open source:
HTML [1]: https://github.com/isagawa-co/isagawa-kernel
thevinter wrote 23 hours 16 min ago:
Pi was probably the best ad for Claude Code I ever saw.
After my max sub expired I decided to try Kimi on a more open harness,
and it ended up being one of the worst (and eye opening experiences) I
had with the agentic world so far.
It was completely alienating and so much 'not for me', that afterwards
I went back and immediately renewed my claude sub.
HTML [1]: https://www.thevinter.com/blog/bad-vibes-from-pi
raincole wrote 14 hours 26 min ago:
Technically you're not allowed to use Claude subscription account
with Pi (according to Anthropic's policy). So yeah, Pi is the best
anti-ad against Anthropic.
a96 wrote 15 hours 7 min ago:
> As it turns out, the opinions in question are that bash should be
enabled by default with no restrictions, that the agent should have
access to every file on your machine from the start, and that npm is
the only package manager worth supporting.
Yep. This is why I've been going "Hell, no!" and will probably keep
doing so.
rcarmo wrote 16 hours 24 min ago:
Paraphrasing The Dude, thatâs like, just your opinion, man.
tern wrote 17 hours 45 min ago:
I had a very similar experience. I have different preferences, but
ultimately, my takeaway was that if I want to follow my own version
of their philosophy, I should just create my own thing.
In the meantime, the codex/cc defaults are better for me.
CGamesPlay wrote 20 hours 36 min ago:
> if I start the agent in ./folder then anything outside of ./folder
should be off limits unless I explicitly allow it, and the same goes
for bash where everything not on an allowlist should be blocked by
default.
Here's the problem with Claude Code: it acts like it's got security,
but it's the equivalent of a "do not walk on grass" sign. There's no
technical restrictions at play, and the agent can (maliciously or
accidentally) bypass the "restrictions".
That's why Pi doesn't have restrictions by default. The logic is: no
matter what agent you are using, you should be using it in a real
sandbox (container, VM, whatever).
esafak wrote 18 hours 29 min ago:
But the agent has to interact with the world; fetch docs, push
code, fetch comments, etc. You can't sandbox everything. So you
push that configuration to your sandbox, which is a worse UX that
the harness just asking you at the right time what you'd like to
do.
the_mitsuhiko wrote 16 hours 29 min ago:
I too would like to know what a good UX looks like here but I
have doubts that the permission prompts of Claude are the way to
go right now.
Within days people become used to just hitting accept and
allowlisting pretty much everything. The agents write length
scripts into shell scripts or test runners that themselves can be
destructive but they immediately allowlisted.
CGamesPlay wrote 16 hours 47 min ago:
Well, you are imagining a worse UX, but it doesn't have to be. Pi
doesn't include a sandboxing story at all (Claude provides an
advisory but not mandatory one), but the sandbox doesn't have to
be a simple static list of allowed domains/files. It's totally
valid to make the "push code" tool in the sandbox send a trigger
to code running outside of the sandbox, which then surfaces an
interactive prompt to you as a user. That would give you the
interactivity you want and be secure against accidentally or
deliberately bypassing the sandbox.
esafak wrote 10 hours 0 min ago:
So you have to set up that integration instead of letting the
agent do it. I suppose the sandbox is more configurable, but do
you need that? I thought the draw of pi was that you didn't do
all that and let it fly, wheeee!
edit: You're not making it sound easy at all. I don't have to
build anything with the other agents.
CGamesPlay wrote 8 hours 40 min ago:
Certainly not. Pi is "minimalist", so the draw is that it's
"easy" to set it up yourself. You can not do that and run it
in yolo mode, and you can do that with Claude Code too. Heck
you can even use this hypothetical
real-sandbox-with-interactive-prompts with Claude Code
instead, once you build it.
Back to my original point: Claude Code gives you a false
feeling of security, Pi gives you the accurate feeling of not
having security.
NamlchakKhandro wrote 21 hours 3 min ago:
hypegrift
mccoyb wrote 23 hours 1 min ago:
> I would say that the project actively expects you to be downloading
them to fill any missing gaps you might have.
Where did you get this perspective from?
> I thought pi and its tools were supposed to be minimal and
extensible. So why is a subagent extension bundling six agents I
never asked for that I canât disable or remove?
Why do you think a random subagents extension is under the same
philosophy as pi?
Your blog post says little about pi proper, it's essentially
concerned with issues you had with the ecosystem of extensions, often
made by random people who either do or do not get the philosophy? Why
would that be up to pi to enforce?
the_mitsuhiko wrote 16 hours 25 min ago:
Sharing extensions is very much the philosophy. Using them however
is less so.
Pi ships with docs that include extensions and the agent looks
there for inspiration if you ask it to build a custom extension.
Looking at what others publish is useful!
tmustier wrote 23 hours 21 min ago:
I havenât met a single person who has tried pi for a few days and not
made it their daily driver. Once you taste the freedom of being able to
set up your tool exactly how you like, thereâs really no going back.
and you can build cool stuff on top of it too!
johanyc wrote 10 hours 49 min ago:
I've been using codex for about 2 months now and am pretty happy with
it. What does pi do better than codex?
jsumrall wrote 5 hours 46 min ago:
If you ever want to use other models, pi can do that. In the middle
of a session I might switch from gpt-5.2 to opus and get it to do
something or review something and then switch back to gpt. Since
models are being released every few weeks this is interesting to
compare models without having to switch to a different harness.
And if thereâs any feature codex has that you want, just have pi
run codex in a tmux session and interrogate it how said feature
works, and recreate it in pi.
ngrilly wrote 11 hours 57 min ago:
It sounds like it is the neovim or Emacs of coding agents.
PessimalDecimal wrote 10 hours 26 min ago:
I came here to say the same thing. It's basically _is_ Emacs.
Heavily configurable tool, text-focused UI, primary interaction
with a minibuffer ..er.. box to prompt at the bottom of the screen,
package distribution mechanism, etc etc.
With Emacs modes like agent-shell.el available and growing, why not
invest in learning a tool that is likely to survive and have
mindshare beyond the next few months?
sshine wrote 20 hours 18 min ago:
> I havenât met a single person who has tried pi for a few days and
not made it their daily driver.
Pleased to meet you!
For me, it just didnât compare in quality with Claude CLI and
OpenCode. It didnât finish the job. Interesting for extending,
certainly, but not where my productivity gains lie.
esafak wrote 18 hours 46 min ago:
People seem to be really enjoying rolling everything themselves
these days...
raincole wrote 8 hours 59 min ago:
Seriously? The most common complains on HN is how every software
is built upon Electron and React.
insin wrote 12 hours 27 min ago:
That seems to be what a significant chunk of the "insane
productivity" is actually going into
theshrike79 wrote 15 hours 7 min ago:
I've spent way too long working around the jank and extra
features in Other People's Software.
Now I can just make my own that does exactly what I want and
need, nothing more and nothing less. It's just for me, it's not a
SaaS or a "start-up" I'm the CEO of.
ixsploit wrote 17 hours 2 min ago:
Because itâs very easy todo nowadays. Why making compromises in
your workflow anymore?
ck_one wrote 22 hours 15 min ago:
What self-built capabilities do you like the most that claude code
doesn't offer?
cudgy wrote 8 hours 8 min ago:
Claude code includes a large system prompt with every request while
tool like pi does not. This could save tokens resulting in lower
costs.
theshrike79 wrote 11 hours 55 min ago:
"hey, build a connector for z.ai GLM-5"
Can't do that with Claude =)
tomashubelbauer wrote 14 hours 46 min ago:
Not the person you replied to, but I'll stress the point that it is
not just what you can add that Claude Code doesn't offer, but also
what you don't need to add that Claude Code does offer that you
don't want.
I dislike many things about Claude Code, but I'll pick subagents as
one example. Don't want to use them? Tough luck. (AFAIK, it's been
a while since I used CC, maybe it is configurable now or was always
and I never discovered that.)
With Pi, I just didn't install an extension for that, which I
suspect exists, but I have a choice of never finding out.
extr wrote 6 hours 55 min ago:
This is and has always been trivially configurable. Just put
`Task` as a disallowed tool.
prettyblocks wrote 9 hours 46 min ago:
You can just put "Never use subagents" in your CLAUDE.md and it
will honor it, no?
tomashubelbauer wrote 9 hours 40 min ago:
IME CLAUDE.md rarely gets fully honored. I've left HN comments
before about how I had to convert some CLAUDE.md instructions
to pre-commit deterministic checks due to how often they were
ignored. My guesstimate is that it is about 70 % reliable.
That's with Opus 4.5. I've since switched to GPT-5.2 and now
GPT-5.3 Codex and use Codex CLI, Pi and OpenCode, not CC, so
maybe things have changed with a new system prompt or with the
introduction of Opus 4.6.
elyase wrote 23 hours 37 min ago:
there is also pz a drop-in replacement for pi rewritten in Zig. 1.7MB
static binary, 3ms startup, 1.4MB RAM idle. Find more at:
HTML [1]: https://github.com/elyase/awesome-personal-ai-assistants?tab=r...
mccoyb wrote 12 hours 23 min ago:
Written by a person who is infamously annoying open source
maintainers with AI slop PRs (see the DWARF debacle in OCaml) ⦠and
missing much of piâs philosophy
Pass for me.
snthpy wrote 17 hours 41 min ago:
Cool, thanks for this. What about the extensions though? For me the
point about pi is minimal base plus configurable extensions you
choose.
_neil wrote 21 hours 32 min ago:
Direct link to pz for those on mobile:
HTML [1]: https://github.com/joelreymont/pz
muratsu wrote 1 day ago:
Iâm working with a friend to build an ui around Pi to make it more
user friendly for people who prefer to work with a gui (ala conductor).
You can check out the repo:
HTML [1]: https://github.com/philipp-spiess/modern
ramoz wrote 1 day ago:
In the same spirit, I also ported a planning UI extension for Pi.
HTML [1]: https://plannotator.ai/blog/plannotator-meets-pi/
suralind wrote 1 day ago:
Iâve been testing it for a few days on pretty much clean install (no
customizations/extensions) and itâs ok. Not sure if I like it yet.
lukasb wrote 1 day ago:
But I can't use my Codex plan with it, right? I have to use an API key?
theshrike79 wrote 23 hours 47 min ago:
Pi makes GPT-5.3-Codex act about on par with Claude easily.
There's something in the default Codex harness that makes it fight
with both arms behind its back, maybe the sandboxing is overly
paranoid or something.
With Pi I can one-shot many features faster and more accurately than
with Codex-cli.
mccoyb wrote 1 day ago:
You can use your Codex plan with it. OpenAI endorsed it several weeks
ago, as far as I remember. That could change, however, but now seems
safe.
ac29 wrote 22 hours 44 min ago:
You can use your Claude or Gemini plan with it too for now, though
Anthropic and Google have made it clear this is a ToS violation.
rahimnathwani wrote 1 day ago:
Hugging Face now provides instructions for using local models in Pi:
HTML [1]: https://x.com/victormustar/status/2026380984866710002
mccoyb wrote 1 day ago:
Pi has made all the right design choices. Shout out to Mario (and Armin
the OG stan) â great taste shows itself.
semiinfinitely wrote 1 day ago:
I do not understand why in the age of ai coding we would implement
this in javascript
solarkraft wrote 14 hours 4 min ago:
Itâs one of the most productive languages and ecosystems (IMO top
1 over all).
raincole wrote 14 hours 20 min ago:
Thank god it's written in JavaScript. I might have skipped it if it
were zig or something.
KeplerBoy wrote 15 hours 45 min ago:
This confused me about openclaw for quite some time. The whole
lobster/crustacean theme is just firmly associated with rust in my
head. Guess it's just a claude/claw wordplay.
thomasfromcdnjs wrote 17 hours 16 min ago:
I am building an entire GPT model framework from the ground up in
Typescript + small amounts of c bindings for gpu stuff. [1] (using
claude)
Don't hate me aha and no, there is no reason other than I can
HTML [1]: https://github.com/thomasdavis/alpha2
andai wrote 19 hours 57 min ago:
See also: pz: pi coding-agent in Zig
HTML [1]: https://news.ycombinator.com/item?id=47120784
moonlion_eth wrote 21 hours 31 min ago:
i wrote an agent in zig, it kinda sucks tho. the language is just
words
sean_pedersen wrote 22 hours 36 min ago:
There is a Rust port:
HTML [1]: https://github.com/Dicklesworthstone/pi_agent_rust
mr_mitm wrote 10 hours 20 min ago:
This looked interesting because I prefer rust over npm.
The first issue I had was to figure out the schema of the
models.json, as someone who hadn't used the original pi before.
Then I noticed the documented `/skill:` command doesn't exist.
That's also hard to see because the slash menu is rendered off
screen if the prompt is at the bottom of the terminal. And when I
see it, the selected menu items always jumps back to the first
line, but looks like he fixed that yesterday.
The tool output appears to mangle the transcript, and I can't
even see the exact command it ran, only the output of the
command. The README is overwhelmingly long and I don't understand
what's important for me as a first time user and what isn't.
Benchmarks and code internals aren't too terribly relevant to me
at this point.
I looked at the original pi next and realized the config schema
is subtly different (snake_case instead of camelCase). Since it
was advertised as a port, I expected it to be a drop-in
replacement, which is clearly not the case.
All in all it doesn't inspire confidence. Unfortunate.
Edit: The original pi also says that there is a `/skill` command,
but then it is missing in the following table: [1] The `/skill`
command also doesn't seem registered when I use pi. What is going
on? How are people using this?
Edit2: Ah, they have to be placed in `~/.pi/agent/skills`, not
`~/.pi/skills`, even though according to the docs, both should
work: [1] This is exhausting.
HTML [1]: https://github.com/badlogic/pi-mono/tree/main/packages/c...
HTML [2]: https://github.com/badlogic/pi-mono/tree/main/packages/c...
saberience wrote 14 hours 19 min ago:
If you look at that code itâs possibly the worst rust code
Iâve seen in my life. There are several files with 5000 to
10000 lines of code in a single file.
It looks 100% vibe coded by someone whoâs a complete neophyte.
jauntywundrkind wrote 20 hours 25 min ago:
Fwiw @dicklesworthstone / jeff Emanuel is definitely my favorite
dragon rider right now, doing the most with AI, to the most
effect.
Their agent mail was great & very early in agent orchestration.
Code agent search is amazing & will tell you what's happening in
every harness. Their Franktui is a ridiculously good rust tui.
They have project after project after project after project and
they are all so good.
Didn't know they had a rust Pi. Nice.
saberience wrote 14 hours 10 min ago:
You should look at the code in that project. Itâs terrible, I
mean, really, really terrible.
Itâs clear it was 100% written by Claude using sub-agents
which explains the many classes with 5000 lines of rust in a
single file.
Itâs a huge buggy mess which doesnât run on my Mac.
If youâre a rust engineer and want a good laugh, go take a
look at the agent.rs, auth.rs, or any of the core components.
orangecoffee wrote 10 hours 57 min ago:
This matters less and less in the new world. that fact that a
fully compatible 10x faster clone came up, and is
continuously working and adapting/improving, tells you that
this is hugely valuable. It has users and it's thriving.
Caring about taste in coding is past now. It's sad :( but
also something to accept.
mr_mitm wrote 10 hours 19 min ago:
Unmaintainable messes of code are also hard to maintain for
AI agents. This isn't solely about taste.
orangecoffee wrote 9 hours 16 min ago:
This projects huge commit list proves this wrong :(
mr_mitm wrote 9 hours 6 min ago:
The project also doesn't work. See my other comment.
Looks like a lot of nonsensical commits.
saberience wrote 8 hours 22 min ago:
Yeah, I tried to use this clone of pi for a while and
its very, very broken.
First of all it wouldn't build, I have to mess around
with git sub-modules to get it building.
Then trying to use it. First of all the scrolling
behavior is broken. You cannot scroll properly when
there are lots of tool outputs, the window freezes. I
also ended up with lots of weird UI bugs when trying
to use slash commands. Sometimes they stop the window
scrolling, sometimes the slash commands don't even
show at all.
The general text output is flaky, how it shows
results of tools, the formatting, the colors, whether
it auto-scrolls or gets stuck is all very weird and
broken.
You can easily force it into a broken state by just
running lots of tool calls, then the UI just freezes
up.
But just try it and see for yourself...
mccoyb wrote 1 day ago:
Itâs straightforward: JavaScript is a dynamic language, which
allows code (for instance, code implementing an extension to the
harness) to be executed and loaded while the harness is running.
This is quite nice â I do think thereâs a version of piâs
design choices which could live in a static harness, but fully
covering the same capabilities as pi without a dynamic language
would be difficult. (You could imagine specifying a programmable
UI, etc â various ways to extend the behavior of the system, and
youâd like end up with an interpreter in the harness)
At least, youâd like to have a way to hot reload code (Elixir /
Erlang could be interesting)
This is my intuition, at least.
sergiomattei wrote 23 hours 4 min ago:
I built my own harness on Elixir/Erlang[0]. It's very nice, but I
see why TypeScript is a popular choice.
No serialization/JSON-RPC layer between a TS CLI and Elixir
server. TS TUI libraries utilities are really nice (I rewrote the
Elixir-based CLI prototype as it was slowing me down). Easy to
extend with custom tools without having to write them in Elixir,
which can be intimidating.
But you're right that Erlang's computing vision lends itself
super well to this problem space.
[1]
HTML [1]: https://github.com/matteing/opal
jatari wrote 23 hours 29 min ago:
Code hotloading isn't a particularly difficult feature to
implement in any language.
jauntywundrkind wrote 20 hours 28 min ago:
Rust can't even dynamically link!
I'm super on board the rust train right now & super loving it.
But no, code hot loading is not common.
Most code in the world is dead code. Most languages are for
dead code. It's sad. Stop writing dead code (2022) was no where
near the first, is decades and decades late in calling this
out, but still a good one.
HTML [1]: https://jackrusher.com/strange-loop-2022/
jasonjmcghee wrote 18 hours 14 min ago:
Incredible talk and I agree with all the things and I've
worked on this problem a bunch.
But Rust can dynamically link with dylib but I believe it's
still unstable.
It can also dynamically load with libloading.
mccoyb wrote 23 hours 21 min ago:
Sure, but why implement a novel language with said feature if
your concern is a harness ... not on implementing a brand new
language with this feature?
Blackarea wrote 1 day ago:
yes! I just don't understand that as well. Up until some time ago
claud code's preferred install was a npm i, wasn't it? Please
serious answers for why anyone would use a web language for a
terminal app
fragmede wrote 19 hours 51 min ago:
Because it's what the person writing it's preferred language.
So it can share code with the web app.
Because writing it in javascript is easier than writing it in raw
brute forced assembly.
mongrelion wrote 1 day ago:
Pi ships with powerful defaults but skips features like sub-agents and
plan mode
Does anyone have an idea as to why this would be a feature? don't you
want to have a discussion with your agent to iron out the details
before moving onto the implementation (build) phase?
In any case, looks cool :)
EDIT 1: Formatting
EDIT 2: Thanks everyone for your input. I was not aware of the
extensibility model that pi had in mind or that you can also iterate
your plan on a PLAN.md file. Very interesting approach. I'll have a
look and give it a go.
miroljub wrote 1 day ago:
Check [1] There are already multiple implementations of everything.
With a powerful and extensible core, you don't need everything
prepackaged.
HTML [1]: https://pi.dev/packages
alvivar wrote 1 day ago:
I plan all the time. I just tell Pi to create a Plan.md file, and we
iterate on it until we are ready to implement.
jauntywundrkind wrote 20 hours 19 min ago:
Agreed. I rarely find the guardrails of plan to be necessary; I
basically never use it on opencode. I have some custom commands I
use to ask for plan making, discussion.
As for subagents, Pi has sessions. And it has a full session tree &
forking. This is one of my favorite things, in all harnesses: build
the thing with half the context, then keep using that as a
checkpoint, doing new work, from that same branch point. It means
still having a very usable lengthy context window but having good
fundamental project knowledge loaded.
ramoz wrote 1 day ago:
See my comment in the thread but there is an intuitive extension
architecture that makes integrating these type of things feel native.
HTML [1]: https://github.com/badlogic/pi-mono/tree/main/packages/codin...
ramoz wrote 1 day ago:
The way youâre able to extend the harness through extension/hook
architecture is really cool.
Eg some form of comprehensive planning/spec workflow is best modeled as
an extension vs natively built in. And the extension still ends up
feeling ânativeâ in use
fred_tandemai wrote 1 day ago:
Anyone managed to run pi in a completely sandboxed environment? It can
only access the cwd and subdirectories
carderne wrote 12 hours 14 min ago:
I got pi to write me a very basic sandbox based on an example from
the pi github. Added hooks for read/write/edit/bash, some prompts to
temp/perm override. Have a look, copy-paste what you like.
HTML [1]: https://github.com/carderne/pi-sandbox
ac29 wrote 22 hours 52 min ago:
Yeah I wrote a small landlock wrapper using go-landlock to sandbox pi
that works well (not public, similar projects are landrun and nono).
Note that if you sandbox to literally just the working directly, pi
itself wont run since pretty much every linux application needs to be
able to read from /usr and /etc
monkey26 wrote 1 day ago:
I do this with an extension. I run all bash tools with bwrap and ACLs
for the write and edit tools. Serves my purposes. Opens up access to
other required directories, at least for git and rust.
I think I published it. Check the pi package page.
rcarmo wrote 1 day ago:
I run mine inside [1] (with [2] )
HTML [1]: https://github.com/rcarmo/agentbox
HTML [2]: https://github.com/rcarmo/webterm
fjk wrote 1 day ago:
Iâve been tinkering with Gondolin, a micro-vm agent sandbox.
Hereâs an example config:
HTML [1]: https://github.com/earendil-works/gondolin/blob/main/host/ex...
ge96 wrote 1 day ago:
Is that an official term "coding harness"
Wondering if you wanted a similar interface (though a GUI not just CLI)
where it's not for coding what would you call that?
Same idea cycle through models, ask question, drag-drop images, etc...
unshavedyak wrote 7 hours 0 min ago:
Honestly, i'm not interested in this if it can't use my subscription,
but now i really want to understand this idea of coding harness. I've
been exploring ideas that might be quite similar, though more inline
with the scope of IDE, and it sounds like "coding harness" fits my
mental model better.
ge96 wrote 5 hours 43 min ago:
I'm not interested in coding myself as I like to write code still,
I'm interested in that idea of task delegation eg. "research about
this topic" or "do this". Having a bunch of agents doing things,
that could be cool.
For me I'm looking to stick with Python so will whip something up
with Tkinter later for the desktop GUI aspect although I still like
Electron/JS primarily.
arcanemachiner wrote 1 day ago:
Yes. It seems to be the term that stands out the most, as terms like
"AI coding assistant", "agentic coding framework", etc. are too vague
to really differentiate these tools.
"harness" fits pretty nicely IMO. It can be used as a single word,
and it's not too semantically overloaded to be useful in this
context.
rcarmo wrote 1 day ago:
LLM harness has been in vogue for a year nowâ¦
outofpaper wrote 1 day ago:
A harness is a collection of stubs and drivers configured to assist
with automation or testing. It's a standard term often used in QA
as they've been automating things for ages before Gen Ai came on to
the scene.
squeefers wrote 7 hours 35 min ago:
its technically an IDE, but harness makes it sound new and fancy.
arcanemachiner wrote 1 day ago:
Yes, it is also a device used to control the movement of work
animals, which farmers have been using for ages before QA came on
to the scene.
himata4113 wrote 1 day ago:
Preconfigured PI:
HTML [1]: https://github.com/can1357/oh-my-pi
thepasch wrote 14 hours 50 min ago:
I feel like this misses the point of pi somewhat. The allure of pi is
that it allows you to start from scratch and make it entirely your
own; that itâs lightweight and uses only what you need. I go
through the list of features in this and I think, okay, cool, but why
should I use this over OpenCode if I just want a feature-packed (and
honestly -bloated) ready-made harness?
himata4113 wrote 6 hours 8 min ago:
It's just better opencode while still being lightweight I don't
know what else to say.
It's just an opinionated fork, either you like it or you don't. I
personally really like it.
amin2 wrote 14 hours 59 min ago:
This looks great but It feels really risky to add more and more tools
to the harness from random repos. Nothing against this repo in
particular but I wish we had better security and isolation so I that
I knew nothing could go wrong and I could just test a bunch of these
every day the same way I can install an app on my phone and feel
confident it's not going to steal my data.
himata4113 wrote 6 hours 9 min ago:
HTML [1]: https://github.com/containers/bubblewrap
e1g wrote 12 hours 37 min ago:
I test a bunch of these every day too, so I made a local sandbox to
jail all TUI clunkers to $CWD and run all of them in â-yolo mode
HTML [1]: https://agent-safehouse.dev/
jannniii wrote 15 hours 20 min ago:
It is an awesome fork! Tried to contribute also, but community seems
quite close knit.
esafak wrote 18 hours 45 min ago:
Why not OpenCode?
jannniii wrote 15 hours 19 min ago:
Oh-my-bloat.
I am still an avid user of opencode, my own fork though with async
tools etc, but it is cumbersome and tries to do too many things.
tietjens wrote 14 hours 24 min ago:
very interesting, i tried it at the start but haven't come back.
could you expand on what you mean?
virtuallynathan wrote 20 hours 8 min ago:
Big fan of this fork, been using it for everything for the last
couple of weeks.
Went from codex/claude code -> opencode -> pi -> oh-my-pi
mijoharas wrote 23 hours 33 min ago:
I'd quite like the web tools from oh-my-pi, but able to be extracted
to a normal pi tool or plugin... Maybe I should look into that
sometime...
infruset wrote 1 day ago:
Note there is a fork oh-my-pi: [1] of [2] fame. I use it as a daily
driver but I also love pi.
HTML [1]: https://github.com/can1357/oh-my-pi
HTML [2]: https://blog.can.ac/2026/02/12/the-harness-problem/
rcarmo wrote 1 day ago:
My current fave harness. I've been using it to great effect, since it
is self-extensible, and added support for it to [1] because it is so
much faster than ACP.
HTML [1]: https://github.com/rcarmo/vibes
solarkraft wrote 14 hours 9 min ago:
Can you shed some light on the speed difference of the direct
integration vs. ACP?
Iâm still looking for a generic agent interaction protocol (to make
it worth building around) and thought ACP might be it. But (and this
is from a cursory look) it seems that even OpenCode, which does
support ACP, doesnât use it for its own UI. So whatâs wrong with
it and are there better options to hopefully take its place?
ljm wrote 7 hours 19 min ago:
I've used ACP extensively because agent-shell in emacs uses it,
although the Anthropic license change means I'm not sure if I can
continue to use Claude through it without getting banned. I kind of
wish it integrated more tightly but also you can't really expect
someone to have magit involved such that agent-shell (or the like)
starts interacting with emacs directly. I'd love it if it did
though.
I've started using OpenCode for some things in a big window because
its side-by-side diff is great.
rcarmo wrote 10 hours 28 min ago:
Yeah, ACP adds another layer of marshaling/unmarshaling (or two-one
on each side) and can be slower than API calls on occasion. Like
MCP, it adds JSON overhead that doesnât really need to be there.
The best option will always be in-memory exchanges. Right now I am
still using the pi RPC, and that also involves a bit of conversion,
but itâs much lighter.
baby wrote 19 hours 37 min ago:
Wdym harness? Its a coding agent
furryrain wrote 19 hours 7 min ago:
I think the thesis of Pi is that there isn't much special about
agents.
Model + prompt + function calls.
There are many such wrappers, and they differ largely on UI
deployment/integration. Harness feels like a decent term, though
"coding harness" feels a bit vague.
baby wrote 9 hours 13 min ago:
We all call that a coding agent already
furryrain wrote 33 min ago:
Yes, and sometimes new terms are introduced. This is expected
in a new field.
gusmally wrote 22 hours 18 min ago:
Which ones have you compared it against?
rcarmo wrote 16 hours 2 min ago:
Literally all of them:
HTML [1]: https://github.com/rcarmo/agentbox
embedding-shape wrote 13 hours 29 min ago:
Very interesting definition of "all of them" :)
HTML [1]: https://github.com/search?q=repo%3Arcarmo%2Fagentbox%20c...
rcarmo wrote 10 hours 27 min ago:
No, literally. Mistral, Gemini, opencode, everything supported
by Toad, etc. Iâve tried them all. I just donât like using
either Claude Code or Codex, so I didnât add them to agentbox
and stuck with Copilot because it gives me both OpenAI and
Anthropic models.
Before Pi, I actually preferred Mistral Vibeâs UX
embedding-shape wrote 10 hours 4 min ago:
Ok, maybe we need to establish what "literally" means before
we try to figure out "all of them" it seems...
I was curious about your project, but the sloppy usage of
even the most basic terms kind of makes me not to want to
dive deeper, how could I even trust it does what it says on
the tin, if apparently we don't even have a shared
vocabulary?
rcarmo wrote 6 hours 12 min ago:
You're an AI, right? Because a human would come across as
crass with that statement...
badlogic wrote 1 day ago:
wow, i love this! was about to build this myself, but this looks
exactly what i want.
rcarmo wrote 1 day ago:
The better web UI is now part of [1] (which is essentially the
same, but with more polish and a claw-like memory system). So you
can pick if you want TS or Python as the back-end :)
HTML [1]: https://github.com/rcarmo/piclaw
badlogic wrote 1 day ago:
if i ever want a claw, i'd obv. go with this :)
rcarmo wrote 1 day ago:
The claw versionâs web UI essentially has better thinking
output, more visibility of tool calls, and slightly better SSE
streaming. Iâve backported some of it to vibes, but if you
want to borrow UI stuff, the better bits are in piclaw. I use
both constantly on my phone/desktop.
jmorgan wrote 1 day ago:
I've been using Pi day to day recently for simple, smaller tasks. It's
a great harness for use with smaller parameter size models given the
system prompt is quite a bit shorter vs Claude or Codex (and it uses a
nice small set of tools by default).
hrmtst93837 wrote 9 hours 37 min ago:
That's interesting; I've found Pi really shines for rapid
prototyping. Balancing minimalism and functionality is tricky, but it
sounds like they're nailing it with these constraints.
rpastuszak wrote 1 day ago:
Which models do you use and what for? I'm looking for ideas to play
with.
jmorgan wrote 17 hours 8 min ago:
For local models I've been trying it with GLM-4.7-Flash and the new
LFM2 24B model. I'm excited to try it with the new Qwen3.5 models
that came out today as well.
arjie wrote 1 day ago:
Has anyone used an open coding agent in headless mode? I have a system
cobbled together with exceptions going to a centralized system where I
can then have each one pulled out and `claude -p`'d but I'd rather just
integrate an open coding agent into the loop because it's less janky
and then I'll have it try to fix the problem and propose a PR for me to
review. If anyone else has used pi.dev or opencode or aider in this
mode (completely non-interactive until the PR) I'd be curious to hear.
EDIT: Thank you to both responders. I'll just try the two options out
then.
mihneadevries wrote 12 hours 37 min ago:
Aider's `--yes` flag combined with a git-based loop works honestly
better than I expected for this, like it'll just commit and you
review the diff.
Pi I've tried headless and it's fine but you kinda have to wire up
the exit conditions yourself since it's so minimal by design.
Fwiw the janky `claude -p` approach you described is actually pretty
solid once you stop fighting it, the simplicity is the feature I
think.
alchemist1e9 wrote 9 hours 40 min ago:
Finally found another Aider user to ask. How does Pi compare to it?
chriswarbo wrote 23 hours 16 min ago:
pi has an RPC mode which just sends/receives JSON lines over stdio
(including progress updates, and "UI" things like asking for
confirmation, if it's configured for that).
That's how the pi-coding-agent Emacs package interacts with pi; and
it's how I write automated tests for my own pi extensions (along with
a dummy LLM that emits canned responses).
fred_tandemai wrote 1 day ago:
Been using pi exactly for this and it's working great!
evalstate wrote 1 day ago:
fast-agent lets you do this as well (and has a skill in its default
skills repo to help with automation/running in container/hf job).
dosinga wrote 1 day ago:
you can run [1] in headless mode (I work on goose)
HTML [1]: https://block.github.io/goose/
rcarmo wrote 1 day ago:
You probably want to look into pi then - it's extremely extensible.
cermicelli wrote 1 day ago:
Just how expensive was that domain?
jotaen wrote 1 day ago:
README on Github says âpi.dev domain graciously donated by
exe.devâ (though that doesnât say anything about the original
price of course).
schpet wrote 1 day ago:
oh that's kind. i hope they keep the old domain up too though:
HTML [1]: https://shittycodingagent.ai/
squeefers wrote 7 hours 31 min ago:
looooooool
DIR <- back to front page