How do I handle exceptions with the Python requests library?
For example, how do I check whether the PC is connected to the internet? When I try
try:
    requests.get('http://www.google.com')
except ConnectionError:
    # handle the exception
it gives me the error: name 'ConnectionError' is not defined
Kevin Burke
asked Jan 29, 2012 at 16:46
Assuming you did import requests, you want requests.ConnectionError. ConnectionError is an exception defined by requests. See the API documentation here.
Thus the code should be:
try:
    requests.get('http://www.google.com')
except requests.ConnectionError:
    # handle the exception
The original link to the Python v2 API documentation from the original answer no longer works.
tripleee
answered Jan 29, 2012 at 16:52
kindall
As per the documentation, I have added the points below:
-
In the event of a network problem (e.g. refused connection, no internet), Requests will raise a ConnectionError exception.
try:
    requests.get('http://www.google.com')
except requests.ConnectionError:
    # handle the ConnectionError exception
-
In the event of the rare invalid HTTP response, Requests will raise an HTTPError exception.
Response.raise_for_status() will raise an HTTPError if the HTTP request returned an unsuccessful status code.
try:
    r = requests.get('http://www.google.com/nowhere')
    r.raise_for_status()
except requests.exceptions.HTTPError as err:
    # handle the HTTPError here
-
If a request times out, a Timeout exception is raised.
You can tell Requests to stop waiting for a response after a given number of seconds, with a timeout arg.
requests.get('https://github.com/', timeout=0.001)
# timeout is not a time limit on the entire response download; rather,
# an exception is raised if the server has not issued a response for
# timeout seconds
-
All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException. So a base handler can look like:
try:
    r = requests.get(url)
except requests.exceptions.RequestException as e:
    # handle all the errors here
The original link to the Python v2 documentation no longer works, and now points to the new documentation.
answered Jul 28, 2019 at 9:47
Actually, there are many more exceptions that requests.get() can generate than just ConnectionError. Here are some I’ve seen in production:
import requests
from requests import ReadTimeout, ConnectTimeout, HTTPError, Timeout, ConnectionError

try:
    r = requests.get(url, timeout=6.0)
except (ConnectTimeout, HTTPError, ReadTimeout, Timeout, ConnectionError):
    continue  # skip this URL (this snippet sits inside a loop over URLs)
Falko
answered Sep 6, 2017 at 9:57
kravietz
Include the requests module using import requests.
It is always good to implement exception handling. It not only helps avoid an unexpected exit of the script, but can also help to log errors and info notifications. When using Python requests I prefer to catch exceptions like this (the continue statements assume the code sits inside a loop over addresses):
import requests

for address in addresses:
    try:
        res = requests.get(address, timeout=30)
    except requests.ConnectionError as e:
        print("OOPS!! Connection Error. Make sure you are connected to the Internet. Technical details given below.\n")
        print(str(e))
        continue
    except requests.Timeout as e:
        print("OOPS!! Timeout Error")
        print(str(e))
        continue
    except requests.RequestException as e:
        print("OOPS!! General Error")
        print(str(e))
        continue
    except KeyboardInterrupt:
        print("Someone closed the program")
answered May 23, 2018 at 20:19
Tanmoy Datta
For clarity, that is
except requests.ConnectionError:
NOT
import requests.ConnectionError
You can also catch a general exception (although this isn’t recommended) with
except Exception:
answered Feb 22, 2015 at 10:50
StackG
The Python requests module is a simple and elegant HTTP library. It provides methods for accessing web resources via HTTP. In the following article, we will use the HTTP GET method of the requests module. This method requests data from the server, and exception handling comes in handy when the response is not successful. Here, we will go through such situations, using Python’s try and except functionality to explore the exceptions that arise from the requests module.
- url: Returns the URL of the response
- raise_for_status(): If an error occurs, this method raises an HTTPError
- request: Returns the request object that requested this response
- status_code: Returns a number that indicates the status (200 is OK, 404 is Not Found)
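The attributes above can be exercised without touching the network by building a Response object by hand (constructing requests.models.Response() directly is purely for illustration; real code gets one back from requests.get):

```python
import requests

# Hand-built Response, only to illustrate the attributes listed above.
resp = requests.models.Response()
resp.status_code = 404
resp.url = "https://www.example.com/nothing_here"

print(resp.url)          # the URL of the response
print(resp.status_code)  # 404 means Not Found

try:
    resp.raise_for_status()  # raises HTTPError for 4XX/5XX codes
except requests.exceptions.HTTPError as err:
    print("HTTPError raised:", err)
```

A 2XX status code would make raise_for_status() return None instead of raising.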
Successful Connection Request
The first thing to know is that the response code is 200 if the request is successful.
Python3

import requests

# any reachable URL works here; google.com is used for illustration
r = requests.get("https://www.google.com")
print(r.status_code)

Output:
200
Exception Handling for HTTP Errors
Here, we request a URL that doesn’t exist and call raise_for_status() on the result. If the try part succeeds, we get response code 200; if the requested page doesn’t exist, that is an HTTP error, which is handled by the requests exception HTTPError, and you will probably get error 404.
Python3
import requests

# url taken from the output below
url = "https://www.amazon.com/nothing_here"
try:
    r = requests.get(url, timeout=1)
    r.raise_for_status()
except requests.exceptions.HTTPError as errh:
    print("HTTP Error")
    print(errh.args[0])
print(r)
Output:
HTTP Error
404 Client Error: Not Found for url: https://www.amazon.com/nothing_here
<Response [404]>
General Exception Handling
You could also use a general exception from the Request module. That is requests.exceptions.RequestException.
Python3
try:
    r = requests.get(url, timeout=1)
    r.raise_for_status()
except requests.exceptions.RequestException as errex:
    print("Exception request")
Output:
Exception request
Now, you may have noticed that there is a ‘timeout’ argument passed to requests.get. We can prescribe a time limit for the requested connection to respond. If it does not respond in time, we can catch that using the exception requests.exceptions.ReadTimeout. To demonstrate this, let us find a website that responds successfully.
Python3
import requests

try:
    r = requests.get(url, timeout=1)
    r.raise_for_status()
except requests.exceptions.ReadTimeout as errrt:
    print("Time out")
print(r)
Output:
<Response [200]>
If we change timeout to 0.01, the same code prints Time out, because the request cannot possibly complete that fast (here r still holds the response from the earlier successful run):
Time out
<Response [200]>
Exception Handling for Missing Schema
Another common error is forgetting to specify https:// or http:// in the URL. We can use requests.exceptions.MissingSchema to catch this exception.
Python3
url = "www.google.com"
try:
    r = requests.get(url, timeout=1)
    r.raise_for_status()
except requests.exceptions.MissingSchema as errmiss:
    print("Missing schema: include http or https")
except requests.exceptions.ReadTimeout as errrt:
    print("Time out")
Output:
Missing schema: include http or https
Exception Handling for Connection Error
Let us say that the site doesn’t exist, or that no connection can be made because there is no internet access. In either case requests raises a ConnectionError.
Python3
try:
    r = requests.get(url, timeout=1, verify=True)
    r.raise_for_status()
except requests.exceptions.HTTPError as errh:
    print("HTTP Error")
    print(errh.args[0])
except requests.exceptions.ReadTimeout as errrt:
    print("Time out")
except requests.exceptions.ConnectionError as conerr:
    print("Connection error")
Output:
Connection error
Putting Everything Together
Here, we put together everything we have tried so far; the idea is that the exceptions are handled in order of specificity.
For example, for url = “https://www.gle.com”, a failed connection is caught by requests.exceptions.ConnectionError and prints Connection error, while anything not matched by a more specific handler falls through to the general requests.exceptions.RequestException.
Python3
try:
    r = requests.get(url, timeout=1, verify=True)
    r.raise_for_status()
except requests.exceptions.HTTPError as errh:
    print("HTTP Error")
    print(errh.args[0])
except requests.exceptions.ReadTimeout as errrt:
    print("Time out")
except requests.exceptions.ConnectionError as conerr:
    print("Connection error")
except requests.exceptions.RequestException as errex:
    print("Exception request")
Output:
Time out
Note: The output may change according to the request.
Last Updated: 23 Jan, 2023
Eager to get started? This page gives a good introduction in how to get started
with Requests.
First, make sure that:
-
Requests is installed
-
Requests is up-to-date
Let’s get started with some simple examples.
Make a Request
Making a request with Requests is very simple.
Begin by importing the Requests module:
>>> import requests
Now, let’s try to get a webpage. For this example, let’s get GitHub’s public
timeline:
>>> r = requests.get('https://api.github.com/events')
Now, we have a Response
object called r
. We can
get all the information we need from this object.
Requests’ simple API means that all forms of HTTP request are as obvious. For
example, this is how you make an HTTP POST request:
>>> r = requests.post('https://httpbin.org/post', data={'key': 'value'})
Nice, right? What about the other HTTP request types: PUT, DELETE, HEAD and
OPTIONS? These are all just as simple:
>>> r = requests.put('https://httpbin.org/put', data={'key': 'value'})
>>> r = requests.delete('https://httpbin.org/delete')
>>> r = requests.head('https://httpbin.org/get')
>>> r = requests.options('https://httpbin.org/get')
That’s all well and good, but it’s also only the start of what Requests can
do.
Passing Parameters In URLs
You often want to send some sort of data in the URL’s query string. If
you were constructing the URL by hand, this data would be given as key/value
pairs in the URL after a question mark, e.g. httpbin.org/get?key=val
.
Requests allows you to provide these arguments as a dictionary of strings,
using the params
keyword argument. As an example, if you wanted to pass
key1=value1
and key2=value2
to httpbin.org/get
, you would use the
following code:
>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.get('https://httpbin.org/get', params=payload)
You can see that the URL has been correctly encoded by printing the URL:
>>> print(r.url)
https://httpbin.org/get?key2=value2&key1=value1
Note that any dictionary key whose value is None
will not be added to the
URL’s query string.
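This None-dropping behavior can be checked without touching the network, using requests’ own URL preparation machinery (PreparedRequest lives in requests.models; driving it directly here is just for demonstration):

```python
import requests

# Prepare a URL offline: the key whose value is None is silently dropped.
p = requests.models.PreparedRequest()
p.prepare_url("https://httpbin.org/get", {"key1": "value1", "key2": None})
print(p.url)  # → https://httpbin.org/get?key1=value1
```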
You can also pass a list of items as a value:
>>> payload = {'key1': 'value1', 'key2': ['value2', 'value3']}
>>> r = requests.get('https://httpbin.org/get', params=payload)
>>> print(r.url)
https://httpbin.org/get?key1=value1&key2=value2&key2=value3
Response Content
We can read the content of the server’s response. Consider the GitHub timeline
again:
>>> import requests
>>> r = requests.get('https://api.github.com/events')
>>> r.text
'[{"repository":{"open_issues":0,"url":"https://github.com/...
Requests will automatically decode content from the server. Most unicode
charsets are seamlessly decoded.
When you make a request, Requests makes educated guesses about the encoding of
the response based on the HTTP headers. The text encoding guessed by Requests
is used when you access r.text
. You can find out what encoding Requests is
using, and change it, using the r.encoding
property:
>>> r.encoding
'utf-8'
>>> r.encoding = 'ISO-8859-1'
If you change the encoding, Requests will use the new value of r.encoding
whenever you call r.text
. You might want to do this in any situation where
you can apply special logic to work out what the encoding of the content will
be. For example, HTML and XML have the ability to specify their encoding in
their body. In situations like this, you should use r.content
to find the
encoding, and then set r.encoding
. This will let you use r.text
with
the correct encoding.
Requests will also use custom encodings in the event that you need them. If
you have created your own encoding and registered it with the codecs
module, you can simply use the codec name as the value of r.encoding
and
Requests will handle the decoding for you.
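The content → encoding → text workflow can be sketched offline on a synthetic Response (the hand-built Response, its private _content attribute, and the Latin-1 payload are all illustrative assumptions):

```python
import requests

# Simulate a response whose body is Latin-1 encoded.
resp = requests.models.Response()
resp._content = "héllo wörld".encode("iso-8859-1")  # what the "server" sent

resp.encoding = "utf-8"       # wrong guess: the text comes out mangled
mangled = resp.text

resp.encoding = "iso-8859-1"  # after inspecting r.content, set the real encoding
print(resp.text)              # → héllo wörld
```

Because r.text is recomputed on each access, setting r.encoding is enough to fix the decoded result.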
Binary Response Content
You can also access the response body as bytes, for non-text requests:
>>> r.content
b'[{"repository":{"open_issues":0,"url":"https://github.com/...
The gzip
and deflate
transfer-encodings are automatically decoded for you.
The br
transfer-encoding is automatically decoded for you if a Brotli library
like brotli or brotlicffi is installed.
For example, to create an image from binary data returned by a request, you can
use the following code:
>>> from PIL import Image
>>> from io import BytesIO
>>> i = Image.open(BytesIO(r.content))
JSON Response Content
There’s also a builtin JSON decoder, in case you’re dealing with JSON data:
>>> import requests
>>> r = requests.get('https://api.github.com/events')
>>> r.json()
[{'repository': {'open_issues': 0, 'url': 'https://github.com/...
In case the JSON decoding fails, r.json()
raises an exception. For example, if
the response gets a 204 (No Content), or if the response contains invalid JSON,
attempting r.json()
raises requests.exceptions.JSONDecodeError
. This wrapper exception
provides interoperability for multiple exceptions that may be thrown by different
python versions and json serialization libraries.
It should be noted that the success of the call to r.json()
does not
indicate the success of the response. Some servers may return a JSON object in a
failed response (e.g. error details with HTTP 500). Such JSON will be decoded
and returned. To check that a request is successful, use
r.raise_for_status()
or check r.status_code
is what you expect.
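The distinction can be demonstrated offline with a synthetic Response (hand-built only for illustration) whose body is valid JSON but whose status code is 500:

```python
import requests

# A failed (HTTP 500) response that nevertheless carries a JSON body.
resp = requests.models.Response()
resp.status_code = 500
resp.encoding = "utf-8"
resp._content = b'{"error": "internal failure"}'

print(resp.json())  # decodes fine: {'error': 'internal failure'}

try:
    resp.raise_for_status()  # but the response itself was not successful
except requests.exceptions.HTTPError as err:
    print("request failed:", err)
```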
Raw Response Content
In the rare case that you’d like to get the raw socket response from the
server, you can access r.raw
. If you want to do this, make sure you set
stream=True
in your initial request. Once you do, you can do this:
>>> r = requests.get('https://api.github.com/events', stream=True)
>>> r.raw
<urllib3.response.HTTPResponse object at 0x101194810>
>>> r.raw.read(10)
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'
In general, however, you should use a pattern like this to save what is being
streamed to a file:
with open(filename, 'wb') as fd:
    for chunk in r.iter_content(chunk_size=128):
        fd.write(chunk)
Using Response.iter_content
will handle a lot of what you would otherwise
have to handle when using Response.raw
directly. When streaming a
download, the above is the preferred and recommended way to retrieve the
content. Note that chunk_size
can be freely adjusted to a number that
may better fit your use cases.
Note
An important note about using Response.iter_content
versus Response.raw
.
Response.iter_content
will automatically decode the gzip
and deflate
transfer-encodings. Response.raw
is a raw stream of bytes – it does not
transform the response content. If you really need access to the bytes as they
were returned, use Response.raw
.
More complicated POST requests
Typically, you want to send some form-encoded data — much like an HTML form.
To do this, simply pass a dictionary to the data
argument. Your
dictionary of data will automatically be form-encoded when the request is made:
>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.post('https://httpbin.org/post', data=payload)
>>> print(r.text)
{
  ...
  "form": {
    "key2": "value2",
    "key1": "value1"
  },
  ...
}
The data
argument can also have multiple values for each key. This can be
done by making data
either a list of tuples or a dictionary with lists
as values. This is particularly useful when the form has multiple elements that
use the same key:
>>> payload_tuples = [('key1', 'value1'), ('key1', 'value2')]
>>> r1 = requests.post('https://httpbin.org/post', data=payload_tuples)
>>> payload_dict = {'key1': ['value1', 'value2']}
>>> r2 = requests.post('https://httpbin.org/post', data=payload_dict)
>>> print(r1.text)
{
  ...
  "form": {
    "key1": [
      "value1",
      "value2"
    ]
  },
  ...
}
>>> r1.text == r2.text
True
There are times that you may want to send data that is not form-encoded. If
you pass in a string
instead of a dict
, that data will be posted directly.
For example, the GitHub API v3 accepts JSON-Encoded POST/PATCH data:
>>> import json
>>> url = 'https://api.github.com/some/endpoint'
>>> payload = {'some': 'data'}
>>> r = requests.post(url, data=json.dumps(payload))
Please note that the above code will NOT add the Content-Type
header
(so in particular it will NOT set it to application/json
).
If you need that header set and you don’t want to encode the dict
yourself,
you can also pass it directly using the json
parameter (added in version 2.4.2)
and it will be encoded automatically:
>>> url = 'https://api.github.com/some/endpoint'
>>> payload = {'some': 'data'}
>>> r = requests.post(url, json=payload)
Note, the json
parameter is ignored if either data
or files
is passed.
POST a Multipart-Encoded File
Requests makes it simple to upload Multipart-encoded files:
>>> url = 'https://httpbin.org/post'
>>> files = {'file': open('report.xls', 'rb')}
>>> r = requests.post(url, files=files)
>>> r.text
{
  ...
  "files": {
    "file": "<censored...binary...data>"
  },
  ...
}
You can set the filename, content_type and headers explicitly:
>>> url = 'https://httpbin.org/post'
>>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
>>> r = requests.post(url, files=files)
>>> r.text
{
  ...
  "files": {
    "file": "<censored...binary...data>"
  },
  ...
}
If you want, you can send strings to be received as files:
>>> url = 'https://httpbin.org/post'
>>> files = {'file': ('report.csv', 'some,data,to,send\nanother,row,to,send\n')}
>>> r = requests.post(url, files=files)
>>> r.text
{
  ...
  "files": {
    "file": "some,data,to,send\nanother,row,to,send\n"
  },
  ...
}
In the event you are posting a very large file as a multipart/form-data
request, you may want to stream the request. By default, requests
does not
support this, but there is a separate package which does —
requests-toolbelt
. You should read the toolbelt’s documentation for more details about how to use it.
For sending multiple files in one request refer to the advanced
section.
Warning
It is strongly recommended that you open files in binary
mode. This is because Requests may attempt to provide
the Content-Length
header for you, and if it does this value
will be set to the number of bytes in the file. Errors may occur
if you open the file in text mode.
Response Status Codes
We can check the response status code:
>>> r = requests.get('https://httpbin.org/get')
>>> r.status_code
200
Requests also comes with a built-in status code lookup object for easy
reference:
>>> r.status_code == requests.codes.ok
True
If we made a bad request (a 4XX client error or 5XX server error response), we
can raise it with
Response.raise_for_status()
:
>>> bad_r = requests.get('https://httpbin.org/status/404')
>>> bad_r.status_code
404
>>> bad_r.raise_for_status()
Traceback (most recent call last):
  File "requests/models.py", line 832, in raise_for_status
    raise http_error
requests.exceptions.HTTPError: 404 Client Error
But, since our status_code
for r
was 200
, when we call
raise_for_status()
we get:
>>> r.raise_for_status()
None
All is well.
Cookies
If a response contains some Cookies, you can quickly access them:
>>> url = 'http://example.com/some/cookie/setting/url'
>>> r = requests.get(url)
>>> r.cookies['example_cookie_name']
'example_cookie_value'
To send your own cookies to the server, you can use the cookies
parameter:
>>> url = 'https://httpbin.org/cookies'
>>> cookies = dict(cookies_are='working')
>>> r = requests.get(url, cookies=cookies)
>>> r.text
'{"cookies": {"cookies_are": "working"}}'
Cookies are returned in a RequestsCookieJar
,
which acts like a dict
but also offers a more complete interface,
suitable for use over multiple domains or paths. Cookie jars can
also be passed in to requests:
>>> jar = requests.cookies.RequestsCookieJar()
>>> jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')
>>> jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere')
>>> url = 'https://httpbin.org/cookies'
>>> r = requests.get(url, cookies=jar)
>>> r.text
'{"cookies": {"tasty_cookie": "yum"}}'
Redirection and History
By default Requests will perform location redirection for all verbs except
HEAD.
We can use the history
property of the Response object to track redirection.
The Response.history
list contains the
Response
objects that were created in order to
complete the request. The list is sorted from the oldest to the most recent
response.
For example, GitHub redirects all HTTP requests to HTTPS:
>>> r = requests.get('http://github.com/')
>>> r.url
'https://github.com/'
>>> r.status_code
200
>>> r.history
[<Response [301]>]
If you’re using GET, OPTIONS, POST, PUT, PATCH or DELETE, you can disable
redirection handling with the allow_redirects
parameter:
>>> r = requests.get('http://github.com/', allow_redirects=False)
>>> r.status_code
301
>>> r.history
[]
If you’re using HEAD, you can enable redirection as well:
>>> r = requests.head('http://github.com/', allow_redirects=True)
>>> r.url
'https://github.com/'
>>> r.history
[<Response [301]>]
Timeouts
You can tell Requests to stop waiting for a response after a given number of
seconds with the timeout
parameter. Nearly all production code should use
this parameter in nearly all requests. Failure to do so can cause your program
to hang indefinitely:
>>> requests.get('https://github.com/', timeout=0.001)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
requests.exceptions.Timeout: HTTPConnectionPool(host='github.com', port=80): Request timed out. (timeout=0.001)
Note
timeout
is not a time limit on the entire response download;
rather, an exception is raised if the server has not issued a
response for timeout
seconds (more precisely, if no bytes have been
received on the underlying socket for timeout
seconds). If no timeout is specified explicitly, requests do
not time out.
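Beyond a single number, timeout also accepts a (connect, read) tuple. The sketch below (URL and values are illustrative) handles both timeout flavors and degrades gracefully when there is no network access at all:

```python
import requests

# timeout=(connect, read): 3.05 s to establish the connection,
# then 0.5 s allowed while waiting for response bytes.
outcome = "response received"
try:
    r = requests.get("https://httpbin.org/delay/10", timeout=(3.05, 0.5))
except requests.exceptions.ConnectTimeout:
    outcome = "connect timeout"
except requests.exceptions.ReadTimeout:
    outcome = "read timeout"    # expected here: the server stalls for 10 s
except requests.exceptions.RequestException:
    outcome = "request failed"  # e.g. no network access at all
print(outcome)
```

Note the handler order: ConnectTimeout and ReadTimeout are listed before the catch-all RequestException they both inherit from.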
Errors and Exceptions
In the event of a network problem (e.g. DNS failure, refused connection, etc),
Requests will raise a ConnectionError
exception.
Response.raise_for_status()
will
raise an HTTPError
if the HTTP request
returned an unsuccessful status code.
If a request times out, a Timeout
exception is
raised.
If a request exceeds the configured number of maximum redirections, a
TooManyRedirects
exception is raised.
All exceptions that Requests explicitly raises inherit from
requests.exceptions.RequestException
.
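That inheritance can be verified directly; the sketch below checks a few relationships in the requests.exceptions hierarchy:

```python
import requests.exceptions as exc

# Every requests exception descends from RequestException...
for e in (exc.ConnectionError, exc.HTTPError, exc.Timeout, exc.TooManyRedirects):
    assert issubclass(e, exc.RequestException)

# ...and the timeout family is further specialized: ConnectTimeout is both
# a ConnectionError (documented as safe to retry) and a Timeout, while
# ReadTimeout is only a Timeout.
assert issubclass(exc.ConnectTimeout, exc.ConnectionError)
assert issubclass(exc.ConnectTimeout, exc.Timeout)
assert issubclass(exc.ReadTimeout, exc.Timeout)
print("hierarchy checks passed")
```

This is why `except requests.exceptions.RequestException` works as a last-resort handler.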
Ready for more? Check out the advanced section.
Dec 24, 2015, Python, 345046 views
The Python standard library ships several modules for working with HTTP:
- urllib
- httplib
If you really want to go hardcore, you can work with socket directly. But all of these modules share one big drawback: they are inconvenient to use.
First, there is an abundance of classes and functions. Second, the resulting code is not at all pythonic. Many programmers love Python for its elegance and simplicity, which is why a module was created to solve this problem, named requests, or HTTP For Humans. At the time of writing this note, the latest version of the library is 2.9.1. Since the release of Python 3.5 I have kept a tacit promise to write new code only for Py >= 3.5; it is time to move fully to the 3.x branch, so in my examples print is from now on a function, not a statement.
So what can requests do?
To begin, let's see what working with HTTP looks like using modules from the Python standard library versus using requests. As a target for our HTTP requests we will use the very convenient service httpbin.org
>>> import urllib.request
>>> response = urllib.request.urlopen('https://httpbin.org/get')
>>> print(response.read())
b'{\n  "args": {}, \n  "headers": {\n    "Accept-Encoding": "identity", \n    "Host": "httpbin.org", \n    "User-Agent": "Python-urllib/3.5"\n  }, \n  "origin": "95.56.82.136", \n  "url": "https://httpbin.org/get"\n}\n'
>>> print(response.getheader('Server'))
nginx
>>> print(response.getcode())
200
>>>
By the way, urllib.request is a layer on top of the "low-level" httplib library that I mentioned above.
>>> import requests
>>> response = requests.get('https://httpbin.org/get')
>>> print(response.content)
b'{\n  "args": {}, \n  "headers": {\n    "Accept": "*/*", \n    "Accept-Encoding": "gzip, deflate", \n    "Host": "httpbin.org", \n    "User-Agent": "python-requests/2.9.1"\n  }, \n  "origin": "95.56.82.136", \n  "url": "https://httpbin.org/get"\n}\n'
>>> response.json()
{'headers': {'Accept-Encoding': 'gzip, deflate', 'User-Agent': 'python-requests/2.9.1', 'Host': 'httpbin.org', 'Accept': '*/*'}, 'args': {}, 'origin': '95.56.82.136', 'url': 'https://httpbin.org/get'}
>>> response.headers
{'Connection': 'keep-alive', 'Content-Type': 'application/json', 'Server': 'nginx', 'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Origin': '*', 'Content-Length': '237', 'Date': 'Wed, 23 Dec 2015 17:56:46 GMT'}
>>> response.headers.get('Server')
'nginx'
For simple request methods there is no significant difference between them. But let's take a look at working with Basic Auth:
>>> import urllib.request
>>> password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
>>> top_level_url = 'https://httpbin.org/basic-auth/user/passwd'
>>> password_mgr.add_password(None, top_level_url, 'user', 'passwd')
>>> handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
>>> opener = urllib.request.build_opener(handler)
>>> response = opener.open(top_level_url)
>>> response.getcode()
200
>>> response.read()
b'{\n  "authenticated": true, \n  "user": "user"\n}\n'
>>> import requests
>>> response = requests.get('https://httpbin.org/basic-auth/user/passwd', auth=('user', 'passwd'))
>>> print(response.content)
b'{\n  "authenticated": true, \n  "user": "user"\n}\n'
>>> print(response.json())
{'user': 'user', 'authenticated': True}
Now do you feel the difference between pythonic and non-pythonic? I think the difference is obvious. And despite the fact that requests is nothing more than a wrapper over urllib3, which in turn is built on top of the standard Python facilities, convenience of writing code is in most cases priority number one.
requests provides:
- Many HTTP authentication methods
- Sessions with cookies
- Full SSL support
- Various convenience methods such as .json(), which return data in the desired format
- Proxy support
- Sensible, logical handling of exceptions
I would like to discuss the last point in a bit more detail.
Exception handling in requests
When working with external services you should never rely on their fault tolerance. Everything fails sooner or later, so we programmers must always be prepared for it, preferably in advance and in a calm setting.
So how does requests deal with the various failures that occur during network connections? First, let's define the kinds of problems that can arise:
- The host is unavailable. Usually this kind of error is caused by DNS configuration problems. (DNS lookup failure)
- The connection times out
- HTTP errors. More about HTTP status codes can be found here.
- SSL connection errors (usually when there is a problem with the SSL certificate: expired, not trusted, etc.)
The base exception class in requests is RequestException. All the others inherit from it:
- HTTPError
- ConnectionError
- Timeout
- SSLError
- ProxyError
And so on. The full list of exceptions can be found in requests.exceptions.
Timeout
requests has two kinds of timeout exceptions:
- ConnectTimeout — a timeout on connecting
- ReadTimeout — a timeout on reading
>>> import requests
>>> try:
...     response = requests.get('https://httpbin.org/user-agent', timeout=(0.00001, 10))
... except requests.exceptions.ConnectTimeout:
...     print('Oops. Connection timeout occurred!')
...
Oops. Connection timeout occurred!
>>> try:
...     response = requests.get('https://httpbin.org/user-agent', timeout=(10, 0.0001))
... except requests.exceptions.ReadTimeout:
...     print('Oops. Read timeout occurred')
... except requests.exceptions.ConnectTimeout:
...     print('Oops. Connection timeout occurred!')
...
Oops. Read timeout occurred
ConnectionError
>>> import requests
>>> try:
...     response = requests.get('http://urldoesnotexistforsure.bom')
... except requests.exceptions.ConnectionError:
...     print('Seems like dns lookup failed..')
...
Seems like dns lookup failed..
HTTPError
>>> import requests
>>> try:
...     response = requests.get('https://httpbin.org/status/500')
...     response.raise_for_status()
... except requests.exceptions.HTTPError as err:
...     print('Oops. HTTP Error occurred')
...     print('Response is: {content}'.format(content=err.response.content))
...
Oops. HTTP Error occurred
Response is: b''
I have listed the main kinds of exceptions, which cover perhaps 90% of all the problems that arise when working with HTTP. The main thing to remember: if we really intend to catch and handle something, it must be explicitly programmed; if the specific exception type does not matter, we can catch the common base class RequestException and act on the concrete case, for example log the exception and re-raise it further up. By the way, I will write a separate detailed post about logging.
Useful extras
- httpbin.org is a very useful service for testing HTTP clients, in particular handy for testing non-standard service behavior
- httpie is a console HTTP client (a curl replacement) written in Python
- responses is a mock library for working with requests
- HTTPretty is a mock library for working with HTTP modules
ec == const.IETaskNotFound or  # 36016; sc == 404, task was not found
# the following was found by xslidian, but I have never encountered it before
ec == 31390):  # sc == 404  # {"error_code":31390,"error_msg":"Illegal File"}  # r.url.find('http://bcscdn.baidu.com/bcs-cdn/wenxintishi') == 0
    result = ec
    # TODO: Move this out to cdl_cancel() ?
    #if ec == const.IETaskNotFound:
    #    pr(r.json())
    if dumpex:
        self.__dump_exception(None, url, pars, r, act)
else:
    # gate for child classes to customize behaviors
    # the function should return ERequestFailed if it doesn't handle the case
    result = self.__handle_more_response_error(r, sc, ec, act, actargs)
    if result == const.ERequestFailed and dumpex:
        self.__dump_exception(None, url, pars, r, act)
except (requests.exceptions.RequestException,
        socket.error,
        ReadTimeoutError) as ex:
    # If the certificate check failed, there is no need to continue;
    # prompt the user for a work-around and quit.
    # Why so kludgy? Because requests' SSLError doesn't set
    # errno and strerror (due to using **kwargs),
    # so we are forced to use string matching.
    if isinstance(ex, requests.exceptions.SSLError) \
       and re.match(r'^\[Errno 1\].*error:14090086.*:certificate verify failed$', str(ex), re.I):
        # [Errno 1] _ssl.c:504: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
        result = const.EFatal
        self.__dump_exception(ex, url, pars, r, act)
        perr("\n\n== Baidu's Certificate Verification Failure ==\n"
             "We couldn't verify Baidu's SSL Certificate.\n"
             "It's most likely that the system doesn't have "
             "the corresponding CA certificate installed.\n"