• melpomenesclevage@lemmy.dbzer0.com
    28 upvotes · edited 2 days ago

    I hear there’s a tool called (I think) ‘Nepenthe’ that traps an LLM crawler in a loop. If you use that in combination with a fairly tight blacklist of IPs you’re certain are LLM crawlers, I bet you could do a lot of damage, and maybe make them slow their shit down or crawl in a more reasonable way.
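
For what it's worth, the "tight blacklist" part can be sketched in a few lines. This is only an illustration of the idea, not how any particular tool does it: the ranges are documentation-only placeholders (RFC 5737), and `BLOCKLIST`, `is_listed_crawler`, and `route` are made-up names.

```python
import ipaddress

# Hypothetical blocklist. These are documentation-only ranges (RFC 5737),
# NOT real crawler addresses; substitute ranges you have verified yourself.
BLOCKLIST = [
    ipaddress.ip_network(net)
    for net in ("192.0.2.0/24", "198.51.100.17/32")
]


def is_listed_crawler(addr: str) -> bool:
    """Return True if addr falls inside any blocklisted network."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in BLOCKLIST)


def route(addr: str) -> str:
    """Send listed crawlers to the tarpit and everyone else to the real site."""
    return "tarpit" if is_listed_crawler(addr) else "site"
```

Keeping the list tight matters here: anything not explicitly listed falls through to the real site, so a mistake costs you a missed crawler rather than a trapped human.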

    • PrivacyDingus@lemmy.world
      6 upvotes · 1 day ago

      nepenthe

      It’s a Markov-chain-based text generator, which could be difficult for people to put in front of their repos, depending on how they’re hosting them. Regardless, any sensibly built crawler will have rate limits. This means that although Nepenthe is an interesting thought exercise, it’s only going to affect crawlers knocked together by people who haven’t thought about it, not the Big Big companies with the real resources, who are likely having the biggest impact.
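
For anyone unfamiliar with the term, here is a rough sketch of what a Markov-chain text generator does in general. It is not the tool's actual code; `build_chain`, `generate`, and the sample corpus are invented for illustration.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain


def generate(chain, length=20, seed=None):
    """Walk the chain to produce `length` words of plausible-looking filler."""
    rng = random.Random(seed)
    key = rng.choice(list(chain))
    order = len(key)
    out = list(key)
    for _ in range(length - order):
        followers = chain.get(key)
        if not followers:
            # Dead end: jump to a random known state and keep going.
            key = rng.choice(list(chain))
            followers = chain[key]
        out.append(rng.choice(followers))
        key = tuple(out[-order:])
    return " ".join(out)


# Illustrative corpus; a real deployment would train on much more text.
corpus = "the crawler follows the link and the link leads to the next page"
chain = build_chain(corpus)
print(generate(chain, length=15, seed=1))
```

Each word is picked only from what followed the previous word in the training text, so the output looks locally like language but carries no meaning, and it is cheap to produce indefinitely.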