• ArmchairAce1944@discuss.online
    link
    fedilink
    English
    arrow-up
    1
    ·
    edit-2
    2 hours ago

    Please tell me how to do it over an API. I really want to know.

    Edit. I have LM studio and downloaded AI chatbots. But they give far more bullshit than chatgpt (which i rarely use much anyway), which is why talking to it feels even more pointless than chatgpt at times.

    • brucethemoose@lemmy.world
      link
      fedilink
      English
      arrow-up
      2
      ·
      edit-2
      2 hours ago

      I am on mobile and can be more detailed later, if you want but the jist is to sign up (with a payment method) to some API service. There are many. Some neat ones include:

      • Openrouter (a gateway to many, many models from many providers, I’d recommend this first)
      • Cerebras API (which is faster than anything and has a generous free tier)
      • Google Gemini, which is free to just try this out on with no credit card.

      Some great models to look out for, that you may not know of:

      • GLM 4.5 (my all-around favorite)

      • Deepseek (and its uncensored finetunes)

      • Kimi

      • Jamba Large

      • Minimax

      • InternLM for image input

      • Qwen Coder for coding

      • Hermes 405B (which is particularly ‘uncensored’)

      • Gemini Pro/Flash, which is less private but free to try.

      Most (in exchanges for charging pennies for each request) do not log your prompts. If you are really, really concerned, you can even rent your own GPU instance on demand.

      Anyway, they will give you a key, which is basically a password.

      Paste that key into the LLM frontend of your choice, like Open Web UI, LM Studio, or even web apps like:

      Or even the Openrouter web interface.

        • brucethemoose@lemmy.world
          link
          fedilink
          English
          arrow-up
          1
          ·
          edit-2
          8 minutes ago

          Yep!

          Also, I’m going to plug the AI Horde, which is basically the Fediverse for AI self hosting: https://aihorde.net/

          It’s awesome! Though a bit sparsely populated, like Lemmy, heh.

          Ping me, and I can host a medium-sized model to try for a few hours (via those linked web UIs), if you want. The options are limitless, from something STEM-focused like Nemotron 49B, to a long context model like Bytedance’s new 36B, to, dungeonmaster finetunes, to horny as heck roleplaying models, lol. But they should be significantly better than whatever 8B ollama downloads by default.