OpenAI still leads in agentic terminal coding, but by less.

Claude can plan the work and then run hundreds of parallel subagents in a single session (and with Opus 4.8, the agents can run for even longer).

That’s one way to turn profitable before the IPO, I guess. Goodbye tokens.

  • vinyl@lemmy.world
    English
    arrow-up
    7
    ·
    18 hours ago

    I want to be able to buy computer parts, not some overpriced autocomplete that won’t benefit the majority of humanity. The cons of this tech outweigh the pros by orders of magnitude.

    • Nighed@feddit.uk
      English
      arrow-up
      2
      ·
      17 hours ago

      The tech is good; it’s how business has been built around it that’s the problem, I think.

      More accurate prices of open weight models would have softened the rush.

      The copyright issues would be less if everything was open weight.

    • unpossum@sh.itjust.worksOP
      English
      arrow-up
      1
      arrow-down
      1
      ·
      18 hours ago

      That remains to be seen. If open weight models get good enough and efficient enough, I don’t see what moat Anthropic, OpenAI, et al have. Maybe they can fade away so we can buy hardware again, and still reap the benefits of Turing-certified “autocomplete”.

  • Echo Dot@feddit.uk
    English
    arrow-up
    2
    ·
    23 hours ago

    It’s not that people hate AI, it’s just not a priority in most people’s lives. It would be really nice if you enthusiasts would understand that there are entire industries that aren’t coding. I don’t work in programming, so I don’t care how good this AI is at programming. Every time someone comes along and sings the praises of AI, it’s always through the narrow lens of “it can code good”. I have no idea whether or not it can code good, but regardless, what does that mean to a logistics agent or a spec writer?

    For everyone who isn’t terminally online AI is just an interesting toy with limited practical applications, especially with how expensive it is.

    I can 10x my performance right now by just installing a text snippet editor and learning how to use mail merge. I have tried using AI in my job role and, outside of essentially using it as a fancy Google search, it’s useless to me.

    • unpossum@sh.itjust.worksOP
      English
      arrow-up
      3
      arrow-down
      1
      ·
      19 hours ago

      To begin with, I wouldn’t say I’m an enthusiast, but I do find the breakthroughs in LLM tech in recent years to be interesting. I sometimes wonder how we got so blasé that a computer acing the Turing test is passed off as “spicy autocomplete, ho hum”.

      I also think you’ll find that many people on Lemmy do hate AI to a worrying degree. Just look at the reception this and other posts about it get here, in a technology community, where you’d expect news about one of the most sci-fi-like (to me, at least) technologies to be welcome.

      To the rest of your comment, I must say I find it strange to come to this community and complain about news of LLMs (a technology) being useful for coding (also a technology), arguing that it’s not interesting to you. To each their own, I suppose.

      • Echo Dot@feddit.uk
        English
        arrow-up
        2
        ·
        4 hours ago

        Yeah, it’s interesting as long as you can completely disregard all of the negative impacts, but if you disregard all of the negative impacts then I would argue you’re not assessing the technology in a fair manner.

        The Turing test was also designed back in the day when a computer was just a big box in a room. An AI passing the Turing test is just something to throw at the media; it’s not a meaningful experiment. The Apple II was able to pass the Turing test.

  • abbadon420@sh.itjust.works
    English
    arrow-up
    7
    ·
    2 days ago

    We have an “AI coworker”. It’s actually a human, but the built-in brain has gone. They have used over 100 euros in tokens this month. In comparison, every other coworker has used less than 5 euros.

    • Echo Dot@feddit.uk
      English
      arrow-up
      8
      ·
      23 hours ago

      I really should start charging my employer per individual thought, regardless of the quality or accuracy of said thought. I’m sure I’d be onto a winner with that one.

  • unpossum@sh.itjust.worksOP
    English
    arrow-up
    11
    arrow-down
    18
    ·
    2 days ago

    I know no one here wants to hear it, but the newest models from Anthropic and OpenAI are not bad coders with proper direction. If used correctly they can be positive force multipliers for developers, and used incorrectly they can do a lot of damage.

    Note that this goes for developers with some experience. If you try to use an LLM in place of experience, or use it as a shortcut to try to gain experience, it turns into a negative multiplier really quickly, and you probably build bad habits that are hard to kick.

    I’m not sure what the future of coding looks like, but I’ll be very surprised if AI in its current or a future incarnation is not involved somehow. How to learn coding correctly for that I don’t know, but looking at the junior devs I know, I am sure they will figure it out and grow into AI-native senior devs in due time.

    • ubergeek77@lemmy.ubergeek77.chat
      English
      arrow-up
      14
      arrow-down
      1
      ·
      2 days ago

      The future of AI coding is dead until you don’t burn $20 in three minutes just by giving Opus a single prompt. I don’t know if you people are just rich, delusional, or both.

      • unpossum@sh.itjust.worksOP
        English
        arrow-up
        5
        arrow-down
        3
        ·
        2 days ago

        GPT 5.4 xhigh isn’t too bad for automated reviews and the like, and 5.5 is fairly efficient for interactive coding. I prefer those to Claude and Opus; the Anthropic models feel like they’re trying too hard to be human to me, but that’s personal preference I guess.

        Yeah, it’s not free (or the free models aren’t good enough), but the consensus at work is that this is a potential game changer, and we need to experiment to see what works and what doesn’t. So, the budget is there until things settle, and afterwards if things work out.

    • Nighed@feddit.uk
      English
      arrow-up
      12
      arrow-down
      2
      ·
      2 days ago

      The Opus models are the first models I can work with day to day and not rage at them.

      But, at their true cost, it’s not worth it 90% of the time. Next month with the GitHub Copilot changes is going to be a bloodbath.

      • unpossum@sh.itjust.worksOP
        English
        arrow-up
        3
        ·
        19 hours ago

        Pricing is an issue, yes - the open-weight models aren’t on par with Claude and Codex yet. I have hopes that six months to a year can bring them to the level of current frontier models, and if so I think that’s probably good enough for most users, including me. How Anthropic and OpenAI intend to make money at that point, I couldn’t tell you, but I don’t see an actual downside there :)

      • obelisk_complex@piefed.ca
        English
        arrow-up
        3
        arrow-down
        3
        ·
        1 day ago

        I’ve been using deepseek v4 flash on opencode’s infra for a couple of weeks and it’s pretty solid for something so low-cost. Honestly satisfied with it over Claude, for the premium Anthropic charges. Have you tried it at all?

        • Nighed@feddit.uk
          English
          arrow-up
          4
          arrow-down
          2
          ·
          1 day ago

          Limited to approved tools at work.

          Need to set up local models on my server for personal stuff.

          • obelisk_complex@piefed.ca
            English
            arrow-up
            2
            arrow-down
            2
            ·
            1 day ago

            Ahhhh fair play. I have a lot of freedom since I’m paying out of pocket for my own use. I have a pretty beefy rig for running local, but it’s not beefy enough to run deepseek pro and the like 😬 so, I have a bunch of subscriptions to try out a bunch of different models and see what works best in my workflow. I also have a problem with making alts in games, which seems like it rhymes 🤔

            Been pretty impressed with glm5.1 too, before deepseek-v4 came out, but you’d be amazed what even a smaller older coding model can do with the right config and a little proactive context management. I really hope this trend of smaller, better models for local agentic use continues.