How to combat large amounts of AI scrapers
Drunk & Root@sh.itjust.works to Selfhosted@lemmy.world · English · 2 days ago · 48 comments
Every time I check my nginx logs, there are more scrapers than I can count, and I could not find any good open source solutions.
daniskarma@lemmy.dbzer0.com · 4 hours ago
How do you know it's "AI" scrapers? I've had my server up since before AI was a thing. It's totally normal to get thousands of bot hits and to get scraped. I use CrowdSec to mitigate it, but you will always get bot hits.

Sheldan@lemmy.world · 3 hours ago
Some of them are at least honest and identify themselves in the user agent.

krakenfury@lemmy.sdf.org · 2 hours ago
Is ignoring robots.txt considered "honest"?
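
For anyone who wants to check how many of their hits come from crawlers that do announce themselves in the user agent, here is a minimal Python sketch. It assumes the default nginx "combined" log format, where the user agent is the last quoted field on each line. The token list is illustrative and not exhaustive, and the function name `count_ai_agents` is made up for this example. It will not catch anything that spoofs a browser user agent, which is where behavior-based tools like CrowdSec come in.

```python
import re
from collections import Counter

# Illustrative, non-exhaustive list of crawler names that self-identify
# in the User-Agent header. Adjust it to whatever shows up in your own logs.
AI_AGENT_TOKENS = ("GPTBot", "ClaudeBot", "CCBot", "Bytespider", "PerplexityBot")

# Assumption: nginx "combined" log format, where the User-Agent is the
# last double-quoted field on the line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_ai_agents(lines):
    """Count log lines whose User-Agent contains a known crawler token."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue  # line does not look like the combined format
        user_agent = match.group(1).lower()
        for token in AI_AGENT_TOKENS:
            if token.lower() in user_agent:
                counts[token] += 1
                break  # count each request once
    return counts


if __name__ == "__main__":
    # Made-up sample lines using documentation IP ranges.
    sample = [
        '203.0.113.5 - - [01/Jan/2025:00:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
        '198.51.100.7 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0"',
    ]
    for name, hits in count_ai_agents(sample).most_common():
        print(name, hits)
```

To run it against a real log, pass an open file object instead of the sample list, for example `count_ai_agents(open("/var/log/nginx/access.log"))`. That path is the usual default but may differ on your setup.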