""" robotparser.py

    Copyright (C) 2000  Bastian Kleineidam

    You can choose between two licenses when using this package:
    1) GNU GPLv2
    2) PSF license for Python 2.2

    The robots.txt Exclusion Protocol is implemented as specified in
    http://www.robotstxt.org/norobots-rfc.txt
"""

import collections
import urllib.error
import urllib.parse
import urllib.request

__all__ = ["RobotFileParser"]

RequestRate = collections.namedtuple("RequestRate", "requests seconds")


class RobotFileParser:
    """ This class provides a set of methods to read, parse and answer
    questions about a single robots.txt file.

    """

    def __init__(self, url=''):
        self.entries = []
        self.sitemaps = []
        self.default_entry = None
        self.disallow_all = False
        self.allow_all = False
        self.set_url(url)
        self.last_checked = 0

    def mtime(self):
        """Returns the time the robots.txt file was last fetched.

        This is useful for long-running web spiders that need to
        check for new robots.txt files periodically.

        """
        return self.last_checked

    def modified(self):
        """Sets the time the robots.txt file was last fetched to the
        current time.

        """
        import time
        self.last_checked = time.time()

    def set_url(self, url):
        """Sets the URL referring to a robots.txt file."""
        self.url = url
        self.host, self.path = urllib.parse.urlparse(url)[1:3]

    def read(self):
        """Reads the robots.txt URL and feeds it to the parser."""
        try:
            f = urllib.request.urlopen(self.url)
        except urllib.error.HTTPError as err:
            if err.code in (401, 403):
                self.disallow_all = True
            elif err.code >= 400 and err.code < 500:
                self.allow_all = True
        else:
            raw = f.read()
            self.parse(raw.decode("utf-8").splitlines())

    def _add_entry(self, entry):
        if "*" in entry.useragents:
            # the default entry is considered last
            if self.default_entry is None:
                # the first default entry wins
                self.default_entry = entry
        else:
            self.entries.append(entry)

    def parse(self, lines):
        """Parse the input lines from a robots.txt file.

        We allow that a user-agent: line is not preceded by
        one or more blank lines.
        """
        # states:
        #   0: start state
        #   1: saw user-agent line
        #   2: saw an allow or disallow line
        state = 0
        entry = Entry()

        self.modified()
        for line in lines:
            if not line:
                if state == 1:
                    entry = Entry()
                    state = 0
                elif state == 2:
                    self._add_entry(entry)
                    entry = Entry()
                    state = 0
            # remove optional comment and strip line
            i = line.find('#')
            if i >= 0:
                line = line[:i]
            line = line.strip()
            if not line:
                continue
            line = line.split(':', 1)
            if len(line) == 2:
                line[0] = line[0].strip().lower()