I really hope they die soon, this is unbearable…

  • early_riser@lemmy.world
    English
    arrow-up
    58
    ·
    1 month ago

    It’s already hard enough for self-hosters and small online communities to deal with spam from fleshbags, and now we’re being swarmed by clankers. I have a little MediaWiki to document my deranged maladaptive daydreams (worldbuilding and conlanging projects), and the only traffic besides me is likely AI crawlers.

    I hate this so much. It’s not enough that huge centralized platforms have the network effect on their side, they have to drown our quiet little corners of the web under a whelming flood of soulless automata.

  • Thorry@feddit.org
    English
    arrow-up
    32
    ·
    1 month ago

    Yeah, I had the same thing. All of a sudden the load on my server was super high and I thought there was a huge issue. So I looked at the logs and saw an AI crawler absolutely slamming my server. I blocked it, so it only got 403 responses, but it kept on slamming. So I blocked the IPs it was coming from in iptables, which helped a lot. My little server got about 10,000 times the normal traffic.

    I sorta get they want to index stuff, but why absolutely slam my server to death? Fucking assholes.

    • Ephera@lemmy.ml
      English
      arrow-up
      14
      ·
      1 month ago

      My best guess is that they don’t just index things, but rather download straight from the internet when they need fresh training data. They can’t really cache the whole internet after all…

  • sudoer777@lemmy.ml
    English
    arrow-up
    13
    ·
    1 month ago

    I’m okay with a few crawlers, but not what’s effectively a DDoS attack by AI companies who abuse my resources generating terabytes of traffic and crashing my server while costing me money. I use Anubis now, which sucks from an accessibility standpoint but I’m not dealing with their malicious traffic anymore.

  • eli@lemmy.world
    English
    arrow-up
    7
    ·
    1 month ago

    I ended up just pushing everything behind my tailnet and only leaving my game server ports open (which are non-standard ports).

  • ohshit604@sh.itjust.works
    English
    arrow-up
    3
    ·
    edit-2
    1 month ago

    For a while my GoAccess instance wasn’t working properly, so I couldn’t visualize my access logs from Traefik. I got lazy trying to fix it and left it as is. In the meantime, though, I wasn’t too lazy to set up Synapse and begin federating on my home network.

    I finally fixed my GoAccess today and was surprised to see Synapse hits labelled as crawlers, well over a million of them.

  • A_A@lemmy.world
    English
    arrow-up
    5
    arrow-down
    3
    ·
    1 month ago

    Friday at 4:30 pm … curiously, nobody is trying to answer your question 😋

  • x00z@lemmy.world
    English
    arrow-up
    2
    arrow-down
    3
    ·
    1 month ago

    50% of my traffic is scrapers now. I really want to block them, but I also want my content to be indexed and used for LLMs. At the moment there isn’t really an in-between way of doing that. :(

    (I say this knowing they fuck up the electricity grids and memory chips; I’m just hoping that gets better soon.)