Items tagged with: robotstxt
New scraper just dropped (well, an old scraper was renamed):
Facebook/Meta updated its robots.txt entry for opting out of GenAI data scraping. If you blocked FacebookBot before, you should block meta-externalagent now:
User-Agent: meta-externalagent
Disallow: /
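If you want to sanity-check the rule before deploying it, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The `example.com` URL, the `SomeOtherBot` name, and the `is_allowed` helper are made up for illustration. This only shows how a spec-following parser reads the rule; it says nothing about whether Meta's crawler actually honors it.

```python
from urllib.robotparser import RobotFileParser

# The rule from the post, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-Agent: meta-externalagent
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under `robots_txt`.

    Hypothetical helper for illustration, not part of any library.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URL; substitute a page from your own site.
url = "https://example.com/post/1"

meta_allowed = is_allowed(ROBOTS_TXT, "meta-externalagent", url)
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", url)

print("meta-externalagent allowed:", meta_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

Crawlers not named in the file stay unaffected, so this rule can sit alongside any existing entries.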
Official references:
- Facebook developer documentation for FacebookBot no longer mentions GenAI.
- Facebook developer documentation for web crawlers, including Meta-ExternalAgent, mentions “AI”.
Meta Web Crawlers - Sharing - Documentation - Meta for Developers
This page lists the User Agent (UA) strings that identify Meta’s most common web crawlers and what each of those crawlers are used for.
developers.facebook.com