How to block AI Crawler Bots using robots.txt file (www.cyberciti.biz)
Posted by Cynicus Rex@lemmy.ml to Privacy@lemmy.ml · English · 1 month ago · 37 comments
Dave.@aussie.zone · 1 month ago
I’m guessing something like:
Robots.txt: Do not index this particular area.
Main page: invisible link to particular area at top of page, with alt text of “don’t follow this, it’s just a bot trap” for screen readers and such.
Result: any access to said particular area equals insta-ban for that IP. Maybe just for 24 hours so nosy humans can get back to enjoying your site.
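For anyone wanting to try this, here is a rough sketch of the ban logic in Python. It is only an illustration under assumptions: the `/bot-trap/` path, the `TrapBanList` class, and the in-memory dict are all made up for the example, and a real site would more likely do this in the web server or firewall.

```python
import time

TRAP_PATH = "/bot-trap/"        # hypothetical path, also listed under Disallow in robots.txt
BAN_SECONDS = 24 * 60 * 60      # the 24-hour ban suggested in the comment


class TrapBanList:
    """Tracks IPs that requested the trap path and when their ban expires."""

    def __init__(self, ban_seconds=BAN_SECONDS, clock=time.time):
        self.ban_seconds = ban_seconds
        self.clock = clock          # injectable so the expiry logic can be tested
        self._banned_until = {}     # ip -> expiry timestamp

    def is_banned(self, ip):
        expiry = self._banned_until.get(ip)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self._banned_until[ip]   # ban has lapsed, forget the IP
            return False
        return True

    def handle(self, ip, path):
        """Return an HTTP status code for a request from `ip` to `path`."""
        if self.is_banned(ip):
            return 403
        if path.startswith(TRAP_PATH):
            # Anything touching the trap ignored robots.txt (or is a nosy human).
            self._banned_until[ip] = self.clock() + self.ban_seconds
            return 403
        return 200
```

Calling `handle(ip, path)` once per request is enough: the first hit on the trap path bans the IP, and later requests from it get 403 until the ban expires.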
doodledup@lemmy.world · 1 month ago
Problem is that you’re also blocking search engines from indexing your site, no?
ɐɥO@lemmy.ohaa.xyz · 1 month ago
Nope. Search engines should follow the robots.txt
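A quick way to see this is Python's standard `urllib.robotparser`, which answers the same question a compliant crawler asks before fetching a URL. The robots.txt content, the `/bot-trap/` path, and the `allowed` helper below are assumptions for illustration. Keep in mind robots.txt is advisory, so this only describes crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: only the trap area is off limits.
ROBOTS_TXT = """\
User-agent: *
Disallow: /bot-trap/
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if a robots.txt-respecting crawler may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Normal pages stay crawlable, only the trap is excluded.
print(allowed("Googlebot", "https://example.com/articles/1"))
print(allowed("Googlebot", "https://example.com/bot-trap/"))
```

So a search engine that follows the rules never requests the trap URL and never gets banned, while the rest of the site is still indexed.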