• RedstoneValley@sh.itjust.works
    link
    fedilink
    English
    arrow-up
    9
    ·
    21 hours ago

    What’s your setup, if I may ask? I’m using llama.cpp router with vscode kilo.ai and qwen3.6-35B-MoE-MTP as a model mostly. It’s surprisingly good as a coding assistant, but I think you have to know what you are doing and know your stuff(aka be an experienced developer) to make it useful. just letting it vibe leads to crap code

    • MalReynolds@slrpnk.net
      link
      fedilink
      English
      arrow-up
      6
      ·
      edit-2
      20 hours ago

      just letting it vibe leads to crap code

      Yup, vibe is occasionally useful for proof of concept stuff, but disastrous for maintainability, security, readability, or large codebases. Without experience it’s still a foot gun for anything even slightly serious.

      Best approaches for a learner are to consider it autocomplete that needs research. Look up what it’s suggesting, see if it’s hallucinating, with luck it’ll point you in a useful direction where you can learn a good solution, as it has no idea what that is. Also makes a pretty good rubber duck for hashing out architectural decisions, finding alternative approaches etc, though you’ll have to point it at a web search for that. Spin up an e.g. vane instance for this, as small models don’t have enough world knowledge. Use it to write (or preferably copy from its system prompt examples) boilerplate and unit tests, perhaps descriptive comments (doublecheck).

      One thing to do is put everything you learn about coding style into your system prompt as they’re dogshit at consistent style without significant beatings around the head. Finding your own comfortable, consistent style is super useful for future readability. The joke about when I wrote this only God and I understood it, now only God does, will come clear in a month or two. Learn to work around it. Simple beats fancy unless you truly need the speed.

      While I do use agent iterative approaches, probably best to approach that organically as you grow, monsters lurk there. If you must, containerize / vm / isolate the hell out of something like opencode to muck around with.

      FWIW I still write most of my code by hand, it’s simpler and more consistent, but I’m keeping an eye on the development of LLMs, and I will let it write scut code (that I edit later). Code and Mathematics are super structured languages, pretty much ideal for large language models, so I can see them maybe, eventually getting good. More general thought, not so much without significant architectural upgrades.

      • TechLich@lemmy.world
        link
        fedilink
        English
        arrow-up
        3
        ·
        6 hours ago

        While this advice is true for all models, when it comes to agentic tasks (add this small feature/write this test harness/find bugs/suggest improvements), open source models are still way behind, vibe code or not.

        Claude Fable or even Opus in an editor like Zed have a 1 million token context window and will “think” through the goals of the application, test their changes, work through debugging processes the way a programmer would, stop to ask for clarification, check diagnostic tools and linters, prompt to run test code, etc.

        Llama, Gemma and Qwen etc. Do lack a lot of the world knowledge to get the goals of the application, but they also just don’t have the debugging skills, won’t test their code, don’t always tool call correctly, get confused as the context increases and nobody has enough vram to run on large context sizes locally.

        They can do autocomplete on small functions but aren’t really there for more complex tasks.

        On top of that, the biggest problem is that the best open source models are trained and released by the same giant tech conglomerates that have an interest in not competing with their own products. Qwen is Alibaba, Llama is Meta, gpt-oss is OpenAI. Even the more “independent” ones, kimi (Moonshot) and GLM (z.ai) are mostly funded by Alibaba and Tencent. They’re released for research and marketing purposes and to please their corporate backers with inflated stock. Almost nobody has the resources to train new models from scratch. People make lots of merges and fine tunes but AI is not democratised the way that traditional programming tools have been.

        Maybe some day there will be enough cheap compute for open source communities to pool together resources to build competing models but they’re not really there yet :(

        • MalReynolds@slrpnk.net
          link
          fedilink
          English
          arrow-up
          3
          ·
          5 hours ago

          Context management is a huge part of making smaller models viable (and likely a big part of making frontier models better). Tricks like structured context libraries for thinking improve things a lot, I like approaches that output things like an Obsidian vault that let you dig in and correct bad assumptions easily, even if it’s a bit slower. It’s a useful deliverable that can (mostly) be reused with updated models.

          Things like ‘the debugging skills, won’t test their code, don’t always tool call correctly’ are tangibly improving model to model, framework to framework, and are problems that will be solved in time, but yes they need handholding ATM.

          Things like

          test their changes, work through debugging processes the way a programmer would, stop to ask for clarification, check diagnostic tools and linters, prompt to run test code

          are mostly down to framework, not model (except for failing to tool call, which is improving), and falling at a respectable rate.

          That said, sure, frontier models get more in one go, personally I’m fine with only a 3-4x force multiplier instead of 10 to keep it local, but YMMV. For a business with resources for a bigger server it’ll be more like 8 times. Remember that some businesses handle sensitive data and can’t (or damn well shouldn’t) use frontier models, so the market is there.

          Maybe some day there will be enough cheap compute for open source communities to pool together resources to build competing models but they’re not really there yet :(

          Not wrong, decentralized inference is mostly solved (with latency penalties), but without decentralized training true democratization will remain out of reach. Hopefully a breakthrough will ensue, but until then we are dependent on the kindness of corporations (or them rugpulling competitors).

          This could also be a part of the RAMpocalypse thing, ‘if there’s not a moat I’ll fucking dig one, damn everyone else’ (and damn SamA). I doubt that’s sustainable long term, but it might get them through to IPO, more’s the pity.