OpenAI still leads in agentic terminal coding, but by less.
Claude can plan the work and then run hundreds of parallel subagents in a single session (and with Opus 4.8, the agents can run for even longer).
That’s one way to turn profitable before the IPO, I guess. Goodbye tokens.


I know no one here wants to hear it, but the newest models from Anthropic and OpenAI are not bad coders with proper direction. If used correctly they can be positive force multipliers for developers, and used incorrectly they can do a lot of damage.
Note that this goes for developers with some experience. If you try to use an LLM in place of experience, or use it as a shortcut to try to gain experience, it turns into a negative multiplier really quickly, and you probably build bad habits that are hard to kick.
I’m not sure what the future of coding looks like, but I’ll be very surprised if AI in its current or a future incarnation is not involved somehow. How to learn coding correctly for that, I don’t know, but looking at the junior devs I know, I am sure they will figure it out and grow into AI-native senior devs in due time.
The future of AI coding is dead until you can give Opus a single prompt without burning $20 in three minutes. I don’t know if you people are just rich, delusional, or both.
GPT 5.4 xhigh isn’t too bad for automated reviews and the like, and 5.5 is fairly efficient for interactive coding. I prefer those to Claude and Opus; the Anthropic models feel like they’re trying too hard to be human to me, but that’s personal preference I guess.
Yeah, it’s not free (or the free models aren’t good enough), but the consensus at work is that this is a potential game changer, and we need to experiment to see what works and what doesn’t. So, the budget is there until things settle, and afterwards if things work out.
The Opus models are the first models I can work with day to day and not rage at them.
But, at their true cost, it’s not worth it 90% of the time. Next month with the GitHub Copilot changes is going to be a bloodbath.
Pricing is an issue, yes - the open-weight models aren’t on par with Claude and Codex yet. I have hopes that six months to a year can bring them to the level of current frontier models, and if so I think that’s probably good enough for most users, including me. How Anthropic and OpenAI intend to make money at that point, I couldn’t tell you, but I don’t see an actual downside there :)
I’ve been using DeepSeek v4 flash on opencode’s infra for a couple of weeks and it’s pretty solid for something so low-cost. Honestly, I’m more satisfied with it than with Claude, given the premium Anthropic charges. Have you tried it at all?
Limited to approved tools at work.
Need to set up local models on my server for personal stuff.
Ahhhh fair play. I have a lot of freedom since I’m paying out of pocket for my own use. I have a pretty beefy rig for running locally, but it’s not beefy enough to run DeepSeek pro and the like 😬 So I have a bunch of subscriptions to try out different models and see what works best in my workflow. I also have a problem with making alts in games, which seems like it rhymes 🤔
Been pretty impressed with glm5.1 too, before deepseek-v4 came out, but you’d be amazed what even a smaller, older coding model can do with the right config and a little proactive context management. I really hope this trend of smaller, better models for local agentic use continues.