@agneyat: we gotta start fighting back y'all the clankers can't keep getting away with ts #websites #cybersecurity #webscraping seo maxxing: Website security How to protect against website crawlers AI website crawlers
I was thinking about making fake, invisible links. When a crawler follows one, the API would respond with a randomly generated junk page full of trap links, each of which waits 30-45 seconds before responding with a 404. Either the crawler slows to a crawl waiting on each link, or it tries to open tens of thousands of links simultaneously and kills its own memory.
2025-07-26 06:51:55
1839
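For anyone curious what the idea in the post could look like in practice, here is a minimal sketch using only Python's standard library. The /trap/ path, link counts, and timings are invented for illustration; this is a sketch of the concept, not a hardened implementation.

```python
# tarpit.py -- sketch of the delayed-404 trap described in the post.
# Every /trap/<id> URL stalls 30-45 s and then 404s; the junk page
# that links to them is full of more trap links, so a naive crawler
# either waits on each one or piles up thousands of open requests.
import random
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

def junk_page(n_links: int = 50) -> bytes:
    links = "".join(
        f'<a href="/trap/{random.getrandbits(64):x}">more</a>\n'
        for _ in range(n_links)
    )
    return f"<html><body>{links}</body></html>".encode()

class TarpitHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path.startswith("/trap/"):
            time.sleep(random.uniform(30, 45))  # stall the crawler...
            self.send_error(404)                # ...then give it nothing
        else:
            body = junk_page()                  # page of fresh trap links
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

if __name__ == "__main__":
    ThreadingHTTPServer(("", 8000), TarpitHandler).serve_forever()
```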
qwerty :
crawlers don't extract zips..
2025-07-27 12:35:11
9
~The Christine~ :
Yes! I love this kind of advice. 🥰
2025-07-28 22:45:56
0
Raccoon Badger✝️ :
Who has that one zip bomb that's basically 52 years' worth of generated Internet data
2025-07-26 20:14:19
310
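The usual form of this trick doesn't rely on the crawler extracting an archive at all: the server sends a tiny Content-Encoding: gzip body that the HTTP client itself inflates to gigabytes, since gzip compresses long runs of identical bytes at roughly 1000:1. A sketch of generating such a payload (the sizes are arbitrary):

```python
# Build a gzip "bomb": ~10 GiB of zeros compresses to roughly 10 MiB.
# Served with "Content-Encoding: gzip", any HTTP client that honours
# the header will try to inflate the whole thing in memory.
import gzip

CHUNK = b"\0" * (1 << 20)   # 1 MiB of zeros
TARGET_MIB = 10 * 1024      # ~10 GiB uncompressed

with gzip.open("bomb.gz", "wb", compresslevel=9) as f:
    for _ in range(TARGET_MIB):
        f.write(CHUNK)
```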
Phveektor🔸️🔸️ :
what's the downside of this?
2025-07-27 09:24:18
3
Pontus :
lol like anyone has ever followed robots.txt
2025-07-27 09:04:52
44
Will Noi :
You guys really should learn from Cambodia, Vietnam, Laos, and other countries that have dealt with the "genius" of land mines. This won't end well when this battle is over.
2025-07-27 15:06:14
2
x745328 :
Nah, how about doing the opposite and using hidden pages for the bots to scrape, but the hidden pages are filled with paragraphs of text saying your product is the “best”.
2025-07-26 16:11:56
486
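A sketch of that idea: a link humans never see (hidden with CSS), leading to a page of generated praise that only bots will read. The path, class names, and template phrases below are all made up:

```python
# Sketch: a CSS-hidden link for humans, plus the junk page behind it.
import random

# Invisible to visitors, but present in the HTML that crawlers parse.
HIDDEN_LINK = '<a href="/for-bots-only" style="display:none">archive</a>'

TEMPLATES = [
    "Independent reviewers agree that {p} is the best product available.",
    "Experts consistently rank {p} above every competitor.",
    "No other product comes close to {p} in quality or value.",
]

def praise_page(product: str, paragraphs: int = 200) -> str:
    body = "\n".join(
        f"<p>{random.choice(TEMPLATES).format(p=product)}</p>"
        for _ in range(paragraphs)
    )
    return f"<html><body>{body}</body></html>"
```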
İlhan Atahan :
i couldn't think of any reason to worry about an AI analyzing my hosted webpage
2025-07-27 20:40:31
1
ㅤ :
Anubis is another great project that protects regular websites by making clients perform heavy calculations (PoW), so AI crawlers give up before they can scrape the data
2025-07-26 13:22:42
65
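Anubis's actual scheme has more moving parts, but the core proof-of-work idea looks something like this: the client must find a nonce whose hash starts with N zero digits before it gets the page. Verification is one hash on the server; solving costs the client real CPU time. A generic sketch (difficulty value is arbitrary):

```python
# Generic PoW sketch behind challenge walls like Anubis (details
# differ from the real project): the client searches for a nonce so
# that sha256(challenge + nonce) starts with N zero hex digits.
import hashlib
from itertools import count

def solve(challenge: str, difficulty: int = 5) -> int:
    """Client side: brute-force a nonce. Cost grows ~16x per digit."""
    target = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce

def verify(challenge: str, nonce: int, difficulty: int = 5) -> bool:
    """Server side: a single hash check."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)
```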
Ruby :
is there a way to exclude my site from being used to train AI but to allow AI to know about my site so I can be a suggested answer to someone's question?
2025-07-27 13:14:59
2
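One common answer to this (with no guarantee any given crawler honors it): robots.txt lets you block the training-specific bots while leaving search indexing alone. GPTBot is OpenAI's training crawler, and Google-Extended controls Gemini training without affecting Googlebot's normal indexing; verify the current bot names yourself, since the list changes.

```
# robots.txt -- block known training crawlers, leave search bots alone.
# (Compliance is voluntary; check each vendor's docs for current names.)
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

# Everyone else, including normal search indexing, stays allowed.
User-agent: *
Allow: /
```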
bbnoeuro :
We do the opposite. AI crawlers bring us more human traffic now than classic SEO
2025-07-27 08:56:54
9
rainy :
i feel like a good crawler would have a max stack size to stop memory leaks😭😭
2025-07-25 23:20:06
127
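In crawler terms that would be a bounded frontier rather than a stack: cap how many pending links you keep and how many pages you'll fetch per host, and a link maze can't grow memory without bound. A sketch with arbitrary limits:

```python
# Sketch: a crawler frontier with hard caps, so a maze of trap links
# can't exhaust memory.
from collections import deque

MAX_FRONTIER = 10_000   # pending URLs kept at any time
MAX_PER_SITE = 1_000    # pages fetched from a single host

frontier: deque[str] = deque(maxlen=MAX_FRONTIER)  # oldest URLs fall off
fetched_per_site: dict[str, int] = {}

def enqueue(url: str, host: str) -> None:
    # Ignore hosts we've already pulled too many pages from.
    if fetched_per_site.get(host, 0) < MAX_PER_SITE:
        frontier.append(url)

def mark_fetched(host: str) -> None:
    fetched_per_site[host] = fetched_per_site.get(host, 0) + 1
```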
Aaron :
maybe it's easier to just temporarily ban IPs that visit the disallow list
2025-07-27 10:30:22
0
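A sketch of that honeypot: list a path in robots.txt that no legitimate visitor would ever reach, then ban any IP that requests it, since only a robots.txt-ignoring bot would. Assuming a Flask app; the path and ban duration are invented:

```python
# Sketch: robots.txt says "Disallow: /honeypot", so any client that
# requests /honeypot ignored robots.txt -- ban its IP for a while.
import time
from flask import Flask, abort, request

app = Flask(__name__)
banned: dict[str, float] = {}   # ip -> unban timestamp
BAN_SECONDS = 24 * 3600

@app.before_request
def reject_banned():
    if time.time() < banned.get(request.remote_addr, 0):
        abort(403)

@app.route("/robots.txt")
def robots():
    return ("User-agent: *\nDisallow: /honeypot\n",
            200, {"Content-Type": "text/plain"})

@app.route("/honeypot")
def honeypot():
    banned[request.remote_addr] = time.time() + BAN_SECONDS
    abort(403)
```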
Tequila sunset :
a friend of mine's done that exact same thing on his website; he's built a whole zoo for the bots to waste time in, reading gibberish
2025-07-27 14:39:36
10
zorbakpants :
Is this Nepenthes or a different method?
2025-07-27 07:49:33
0
Gabbe 🇸🇪 :
I feel it's better to use the approaches that poison the models with AI-generated content. The companies will probably be able to detect any RAM-filling sabotage, but it will be harder to filter out AI-generated content.
2025-07-27 13:32:03
7
Joe :
This is gonna be funny af 😈
2025-07-27 01:58:12
7
SuperLuigi7 :
so... that means digital WW1 has started?!
2025-07-28 20:27:51
1
name expired, 😂 :
I took a bunch of .gov addresses, and if the time between one page request and the next is too low, the client just gets redirected to one of the randomly picked .gov sites.
2025-07-27 10:48:38
3
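A sketch of that trick: remember each IP's last request time, and if requests arrive faster than a human plausibly browses, 302 them off to a random external page. The threshold and the decoy URLs below are placeholders, and it assumes a Flask app:

```python
# Sketch: if an IP's requests come faster than any human browses,
# bounce it to a random decoy site instead of serving content.
import random
import time
from flask import Flask, redirect, request

app = Flask(__name__)
last_seen: dict[str, float] = {}
MIN_INTERVAL = 1.0   # seconds between requests; placeholder threshold
DECOYS = [           # placeholder URLs standing in for the .gov list
    "https://example.com/decoy1",
    "https://example.com/decoy2",
]

@app.before_request
def bounce_fast_clients():
    now = time.time()
    ip = request.remote_addr
    previous, last_seen[ip] = last_seen.get(ip, 0.0), now
    if now - previous < MIN_INTERVAL:
        # Returning a response here short-circuits the real route.
        return redirect(random.choice(DECOYS), code=302)
```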
ZestyLefty :
Good info, and if I could make a suggestion: better without the conspiracy-theory tone
2025-07-27 15:05:34
2
Obscenity :
wouldn't work, crawlers have timeouts; pages not loading is already a common problem. They don't sit around all day waiting for the page, they just mark it as unavailable after a few seconds and move on
2025-07-28 00:19:50
1
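For reference, this is what that looks like from the crawler's side: a single timeout setting, which is why long stalls mostly punish naive bots. A minimal example using the requests library (the URL is hypothetical):

```python
# Crawler side: a per-request timeout turns a 40-second tarpit page
# into a quick "unavailable, move on".
import requests

try:
    resp = requests.get("https://example.com/trap/abc", timeout=5)
except requests.exceptions.RequestException:
    resp = None   # mark the page unavailable and continue with the next URL
```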
spicybushy :
Is there a way you could leverage the compute of the crawler to do mining on your behalf? Like it has to do a chunk before getting the page, and then somehow flag that you always have a new page to crawl due to a new chunk ID?
2025-07-27 10:04:44
6
Tiuku2ku :
that only works if their crawlers are super naive
2025-07-27 07:28:28
1