gh-138907: Support RFC 9309 in robotparser #138908

Open
serhiy-storchaka wants to merge 7 commits into python:main from serhiy-storchaka:robotparser-rfc9309


serhiy-storchaka (Member) commented Sep 15, 2025

  • empty lines are always ignored instead of separating groups
  • the "user-agent" line after a rule starts a new group
  • groups matching the same user agent are now merged
  • the rule with the longest match wins instead of the first matching rule
  • in case of equal matches, the “Allow” rule wins over “Disallow”
  • special characters “$” and “*” are now supported in rules
  • a full match of the user agent is preferred
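
The rule-selection behavior listed above (longest match wins, "Allow" wins ties, "$" and "*" are special) can be illustrated with a minimal sketch. This is not the code from this PR: the helper names `pattern_to_regex` and `is_allowed` are hypothetical, specificity is approximated by the length of the pattern text, and empty patterns are simply skipped, which is an assumption made for this example.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Hypothetical helper for illustration: '*' matches any run of
    characters and a trailing '$' anchors the match at the end of the path.
    """
    anchored = pattern.endswith('$')
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' where '*' appeared.
    body = '.*'.join(re.escape(part) for part in pattern.split('*'))
    return re.compile(body + (r'\Z' if anchored else ''))

def is_allowed(rules, path):
    """Decide whether path may be fetched.

    rules is a list of (allow, pattern) pairs for one merged group.
    The longest matching pattern wins; on equal length, Allow wins.
    """
    best = None  # (pattern length, allow) of the best match so far
    for allow, pattern in rules:
        if not pattern:
            continue  # assumption: empty patterns are ignored in this sketch
        if pattern_to_regex(pattern).match(path):
            # Tuples compare by length first, then True > False,
            # so an Allow rule beats a Disallow rule of equal length.
            key = (len(pattern), allow)
            if best is None or key > best:
                best = key
    # No matching rule means the path is allowed.
    return True if best is None else best[1]

rules = [
    (False, '/private/'),
    (True, '/private/public/'),
    (False, '/*.php$'),
    (True, '/tie'),
    (False, '/tie'),
]
print(is_allowed(rules, '/private/secret'))       # False
print(is_allowed(rules, '/private/public/page'))  # True, longer Allow wins
print(is_allowed(rules, '/index.php'))            # False
print(is_allowed(rules, '/index.php?x=1'))        # True, '$' blocks the match
print(is_allowed(rules, '/tie/page'))             # True, Allow wins the tie
```

Under the previous first-match behavior, `/private/public/page` would have been rejected by the earlier `/private/` rule; with longest-match selection, the order of rules within a group no longer matters.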

Comment thread Lib/urllib/robotparser.py Outdated
path += '?' + query
return path

def translite_pattern(path):
translate* I think

serhiy-storchaka marked this pull request as ready for review April 25, 2026 11:31
serhiy-storchaka added the "needs backport to 3.13" and "needs backport to 3.14" labels (bugs and security fixes) Apr 25, 2026