Ugh, apparently yesterday a bot visited my Forgejo instance and queried everything, which caused Forgejo to create repo archives for everything. The Git data on the instance is 2.1 GB in size, but the repo archive directory grew to 120 GB. I really didn’t expect such a spike.

That filled up the whole hard drive, so the server and all the services and websites on it went down while I was sleeping.

Luckily it seems that just deleting that directory fixes the problem temporarily. I also disabled the option to download archives from the UI, but I’m not sure if this will prevent bots from generating those archives again. I also can’t just make the directory read-only, because Forgejo uses it for other things too, like mirroring.
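As a stopgap between manual deletions, something like the following could be run from a scheduled job. It is only a minimal sketch: the function name `prune_old_archives` and its parameters are made up for illustration, and it assumes you point it at a directory that contains nothing but generated archives (not at the wider directory that is also used for mirroring).

```python
import time
from pathlib import Path
from typing import Optional


def prune_old_archives(
    archive_dir: Path,
    max_age_seconds: float,
    now: Optional[float] = None,
) -> int:
    """Delete regular files under archive_dir that are older than
    max_age_seconds (by modification time) and return the bytes freed.

    archive_dir is an assumption: it must hold only generated archives.
    """
    now = time.time() if now is None else now
    freed = 0
    # Materialise the listing first so deletions do not disturb the walk.
    for path in list(archive_dir.rglob("*")):
        # Skip directories and symlinks; only plain files are removed.
        if path.is_symlink() or not path.is_file():
            continue
        stat = path.stat()
        if now - stat.st_mtime > max_age_seconds:
            freed += stat.st_size
            path.unlink()
    return freed
```

Calling `prune_old_archives(Path("/path/to/archives"), 3600)` (the path is a placeholder) would remove archive files untouched for more than an hour. It only limits how long the archives stay around; it does not stop a bot from generating them in the first place.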

For small instances like mine those archives are quite a headache.

  • Jeena@piefed.jeena.net (OP) · 2 months ago

    But then how do people who search for code like yours find your open source code, if not through a search engine that uses an indexing bot?

    • SteveTech@programming.dev · 2 months ago

      Cloudflare usually blocks ‘unknown’ bots, which are basically bots that aren’t search crawlers. Also I’ve got Cloudflare set up to challenge requests for .zip, .tar.gz, or .bundle files, so that it doesn’t affect anyone unless they download from their browser.

      There’s also probably a way to configure something similar in Anubis, if you don’t like a middleman snooping on your requests.
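The file-extension rule described in the comment above boils down to a small predicate on the request path. The sketch below is only an illustration of that logic in Python; it is not Cloudflare's rule syntax or an Anubis configuration, and the name `should_challenge` and the example paths are hypothetical.

```python
from urllib.parse import urlsplit

# Suffixes taken from the comment above: archive and bundle downloads.
ARCHIVE_SUFFIXES = (".zip", ".tar.gz", ".bundle")


def should_challenge(url: str) -> bool:
    """Return True if the request path ends in one of the archive suffixes.

    Only the path is inspected, so a query string cannot hide the suffix.
    """
    path = urlsplit(url).path.lower()
    return path.endswith(ARCHIVE_SUFFIXES)
```

With this rule, a request such as `/someuser/somerepo/archive/main.tar.gz?ref=x` would be challenged, while an ordinary repository page or a `git clone` over HTTP would pass through untouched, since neither requests a path ending in those suffixes.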