Comment by jsheard 6 months ago

For the "good" bots, which at least respect robots.txt, you can use this list to get ahead of them before they pummel your site.

https://github.com/ai-robots-txt/ai.robots.txt
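
A rough sketch of how a list like that gets used, assuming you maintain your own robots.txt. The three user agents below are only an illustrative subset (the repository above has the full list), and `build_robots_txt` and `is_allowed` are made-up helper names. It builds a blocking group and then checks it with Python's standard `urllib.robotparser`, which mimics how a well-behaved crawler would read the file:

```python
from urllib.robotparser import RobotFileParser

# Illustrative subset only (assumption); use the full list from the repository.
AI_USER_AGENTS = ["GPTBot", "CCBot", "ClaudeBot"]


def build_robots_txt(agents):
    """Return robots.txt text with one group disallowing every path for `agents`."""
    lines = [f"User-agent: {agent}" for agent in agents]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, user_agent, url):
    """Check whether a compliant crawler with `user_agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    robots_txt = build_robots_txt(AI_USER_AGENTS)
    print(robots_txt)
    for agent in ["GPTBot", "Mozilla/5.0"]:
        print(agent, is_allowed(robots_txt, agent, "https://example.com/page"))
```

This only helps against crawlers that actually read and honor robots.txt; it does nothing against a bot that ignores the file or sends a browser-like user agent.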

There's no easy solution for bad bots which ignore robots.txt and spoof their UA though.

breakingcups 6 months ago

Such as OpenAI, who will apparently ignore robots.txt and change their user agent to evade blocks.[1]

1: https://www.reddit.com/r/selfhosted/comments/1i154h7/openai_...

zcase 6 months ago

For those looking, this is the best I've found: https://blog.cloudflare.com/declaring-your-aindependence-blo...

  • maeil 6 months ago

    This seemed to work for some time when it came out, but IME it no longer does.

taikahessu 6 months ago

Thanks, will look into that!
