Should ActivityPub software check robots.txt when delivering activities or reading data?

#EvanPoll #poll

  • Strong yes (44%, 15 votes)
  • Qualified yes (32%, 11 votes)
  • Qualified no (17%, 6 votes)
  • Strong no (5%, 2 votes)
34 voters. Poll end: 1 month ago

in reply to Evan Prodromou

The abstract of RFC 9309 defines the scope for robots.txt and an AP server is outside it:

This document specifies and extends the "Robots Exclusion Protocol" method originally defined by Martijn Koster in 1994 for service owners to control how content served by their services may be accessed, if at all, by automatic clients known as crawlers.
in reply to modulux

@modulux "automatic clients" seems like a pretty broad category.
in reply to Evan Prodromou

Sure, but the restriction seems aimed at crawlers recursively fetching resources and the like, rather than at servers performing ActivityPub delivery on behalf of (presumably human) users. I might agree that botsin.space should honour robots.txt, but it doesn't seem otherwise relevant. Only my interpretation, but that's what I get from:

Crawlers are automated clients. Search engines, for instance, have crawlers to recursively traverse links for indexing as defined in [RFC8288].

It may be inconvenient for service owners if crawlers visit the entirety of their URI space. This document specifies the rules originally defined by the "Robots Exclusion Protocol" [ROBOTSTXT] that crawlers are requested to honor when accessing URIs.


From the same RFC.
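
For implementers who take the "qualified yes" position, the distinction drawn above suggests consulting robots.txt only for crawler-like fetches (link previews, full-text indexing), not for ordinary activity delivery. A minimal sketch of such a check using Python's standard-library `urllib.robotparser`; the user-agent string and URLs are hypothetical examples, not anything defined by ActivityPub or RFC 9309:

```python
from urllib.robotparser import RobotFileParser

def crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url.

    In practice the robots.txt body would be fetched from the remote
    host's /robots.txt; here it is passed in directly so the check is
    self-contained.
    """
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(user_agent, url)

# Hypothetical remote instance that disallows crawling of user profiles:
robots = "User-agent: *\nDisallow: /users/\n"

crawl_allowed(robots, "example-ap-indexer", "https://example.social/about")        # allowed
crawl_allowed(robots, "example-ap-indexer", "https://example.social/users/alice")  # disallowed
```

Per the interpretation in the thread, a server would apply this gate before indexing or previewing, while delivering an activity to an inbox on behalf of a user would bypass it entirely.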