Python requests error handling

How do I handle exceptions with the Python requests library? For example, how do I check whether the PC is connected to the internet?

When I try

import requests

try:
    requests.get('http://www.google.com')
except ConnectionError:
    pass  # handle the exception

it gives me the error NameError: name 'ConnectionError' is not defined.

asked Jan 29, 2012 at 16:46

Assuming you did import requests, you want requests.ConnectionError. ConnectionError is an exception defined by requests. See the API documentation here.

Thus the code should be:

try:
    requests.get('http://www.google.com')
except requests.ConnectionError:
    pass  # handle the exception

The original link to the Python v2 API documentation from the original answer no longer works.

answered Jan 29, 2012 at 16:52 by kindall

As per the documentation, here are the main cases:

  1. In the event of a network problem (e.g. a refused connection or no internet access), Requests will raise a ConnectionError exception.

    try:
        requests.get('http://www.google.com')
    except requests.ConnectionError:
        pass  # handle the ConnectionError here
    
  2. In the event of the rare invalid HTTP response, Requests will raise an HTTPError exception.
    Response.raise_for_status() will raise an HTTPError if the HTTP request returned an unsuccessful status code.

    try:
        r = requests.get('http://www.google.com/nowhere')
        r.raise_for_status()
    except requests.exceptions.HTTPError as err:
        pass  # handle the HTTPError here
    
  3. If a request times out, a Timeout exception is raised.

You can tell Requests to stop waiting for a response after a given number of seconds with the timeout argument; a sketch that catches the Timeout is shown after this list.

    requests.get('https://github.com/', timeout=0.001)
    # timeout is not a time limit on the entire response download; rather, 
    # an exception is raised if the server has not issued a response for
    # timeout seconds
  4. All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException, so a catch-all handler can look like:

    try:
        r = requests.get(url)
    except requests.exceptions.RequestException as e:
        pass  # handle all the errors here
    
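Complementing point 3, a minimal sketch that catches the timeout explicitly; the URL and the tiny timeout value are only for illustration:

    import requests

    try:
        requests.get('https://github.com/', timeout=0.001)
    except requests.Timeout:
        pass  # the server did not respond within the timeout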

The original link to the Python v2 documentation no longer works, and now points to the new documentation.

answered Jul 28, 2019 at 9:47 by abhishek kasana

Actually, there are many more exceptions that requests.get() can raise than just ConnectionError. Here are some I’ve seen in production:

import requests
from requests import ReadTimeout, ConnectTimeout, HTTPError, Timeout, ConnectionError

for url in urls:  # 'urls' is an iterable of addresses to fetch; a failed one is skipped
    try:
        r = requests.get(url, timeout=6.0)
    except (ConnectTimeout, HTTPError, ReadTimeout, Timeout, ConnectionError):
        continue

answered Sep 6, 2017 at 9:57 by kravietz

Include the requests module using import requests.

It is always good to implement exception handling. It not only helps avoid an unexpected exit of the script, but can also help log errors and notifications. When using Python requests I prefer to catch exceptions like this:

import requests

# url holds the address you want to fetch
try:
    res = requests.get(url, timeout=30)
except requests.ConnectionError as e:
    print("OOPS!! Connection Error. Make sure you are connected to the Internet. Technical details given below.\n")
    print(str(e))
except requests.Timeout as e:
    print("OOPS!! Timeout Error")
    print(str(e))
except requests.RequestException as e:
    print("OOPS!! General Error")
    print(str(e))
except KeyboardInterrupt:
    print("Someone closed the program")

answered May 23, 2018 at 20:19 by Tanmoy Datta

For clarity, that is

except requests.ConnectionError:

NOT

import requests.ConnectionError

You can also catch a general exception (although this isn’t recommended) with

except Exception:

answered Feb 22, 2015 at 10:50 by StackG

The Python requests module is a simple and elegant HTTP library. It provides methods for accessing web resources via HTTP. In the following article, we will use the HTTP GET method of the requests module. This method requests data from the server, and exception handling comes in handy when the response is not successful. Here, we will go through such situations. We will use Python’s try and except functionality to explore the exceptions that arise from the requests module.

  • url: returns the URL of the response
  • raise_for_status(): raises an HTTPError if the request returned an unsuccessful status code
  • request: returns the request object that produced this response
  • status_code: returns a number that indicates the status (200 is OK, 404 is Not Found)

Successful Connection Request

The first thing to know is that the response code is 200 if the request is successful.

Python3
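A minimal sketch of such a request, assuming any reachable URL (the later examples use amazon.com):

import requests

url = "https://www.amazon.com"
r = requests.get(url)
print(r.status_code)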

Output:

200

Exception Handling for HTTP Errors

Here, we request a URL and then call raise_for_status() on the response. If the try block succeeds, we get response code 200; if the page we requested doesn’t exist, that is an HTTP error, which is handled by the requests module’s HTTPError exception, and you will most likely get error 404.

Python3

import requests

url = "https://www.amazon.com/nothing_here"

try:
    r = requests.get(url, timeout=1)
    r.raise_for_status()
except requests.exceptions.HTTPError as errh:
    print("HTTP Error")
    print(errh.args[0])

print(r)

Output:

HTTP Error
404 Client Error: Not Found for url: https://www.amazon.com/nothing_here
<Response [404]>

General Exception Handling

You could also use a general exception from the requests module, namely requests.exceptions.RequestException.

Python3

import requests

url = "https://www.gle.com"  # a URL that fails to respond (assumed; reused in the final example below)

try:
    r = requests.get(url, timeout=1)
    r.raise_for_status()
except requests.exceptions.RequestException as errex:
    print("Exception request")

Output:

Exception request 

Now, you may have noticed the ‘timeout’ argument passed to requests.get(). It prescribes a time limit for the requested connection to respond. If the server does not respond in time, we can catch that using the exception requests.exceptions.ReadTimeout. To demonstrate this, let us use a website that responds successfully.

Python3

import requests

url = "https://www.amazon.com"  # assumed: any site that responds within the timeout

try:
    r = requests.get(url, timeout=1)
    r.raise_for_status()
except requests.exceptions.ReadTimeout as errrt:
    print("Time out")

print(r)

Output:

<Response [200]>

If we change the timeout to 0.01, the same code reports a timeout instead, because the request cannot possibly complete that fast:

Time out
<Response [200]>

Exception Handling for Missing Schema

Another common error is forgetting to specify https:// or http:// in the URL. We can use requests.exceptions.MissingSchema to catch this exception.

Python3

import requests

url = "www.google.com"

try:
    r = requests.get(url, timeout=1)
    r.raise_for_status()
except requests.exceptions.MissingSchema as errmiss:
    print("Missing schema: include http or https")
except requests.exceptions.ReadTimeout as errrt:
    print("Time out")

Output:

Missing schema: include http or https

Exception Handling for Connection Error

Let us say that the site doesn’t exist. The same error also occurs when you can’t make a connection at all, for example because there is no internet connection.

Python3

import requests

url = "https://www.site-that-does-not-exist.example"  # hypothetical non-existent site

try:
    r = requests.get(url, timeout=1, verify=True)
    r.raise_for_status()
except requests.exceptions.HTTPError as errh:
    print("HTTP Error")
    print(errh.args[0])
except requests.exceptions.ReadTimeout as errrt:
    print("Time out")
except requests.exceptions.ConnectionError as conerr:
    print("Connection error")

Output:

Connection error

Putting Everything Together

Here, we put together everything we have tried so far; the idea is that the exceptions are handled in order of specificity.

For example, with url = "https://www.gle.com", running this code may produce "Exception request". When no connection can be made at all, requests.exceptions.ConnectionError prints "Connection error", and anything not caught by the more specific handlers falls through to the general requests.exceptions.RequestException.

Python3

import requests

url = "https://www.gle.com"

try:
    r = requests.get(url, timeout=1, verify=True)
    r.raise_for_status()
except requests.exceptions.HTTPError as errh:
    print("HTTP Error")
    print(errh.args[0])
except requests.exceptions.ReadTimeout as errrt:
    print("Time out")
except requests.exceptions.ConnectionError as conerr:
    print("Connection error")
except requests.exceptions.RequestException as errex:
    print("Exception request")

Output:

Note: The output may vary depending on the network conditions at the time of the request.

 Time out

Last Updated: 23 Jan, 2023

Eager to get started? This page gives a good introduction to getting started
with Requests.

First, make sure that:

  • Requests is installed

  • Requests is up-to-date

Let’s get started with some simple examples.

Make a Request¶

Making a request with Requests is very simple.

Begin by importing the Requests module:
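
>>> import requests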

Now, let’s try to get a webpage. For this example, let’s get GitHub’s public
timeline:

>>> r = requests.get('https://api.github.com/events')

Now, we have a Response object called r. We can
get all the information we need from this object.

Requests’ simple API means that all forms of HTTP request are as obvious. For
example, this is how you make an HTTP POST request:

>>> r = requests.post('https://httpbin.org/post', data={'key': 'value'})

Nice, right? What about the other HTTP request types: PUT, DELETE, HEAD and
OPTIONS? These are all just as simple:

>>> r = requests.put('https://httpbin.org/put', data={'key': 'value'})
>>> r = requests.delete('https://httpbin.org/delete')
>>> r = requests.head('https://httpbin.org/get')
>>> r = requests.options('https://httpbin.org/get')

That’s all well and good, but it’s also only the start of what Requests can
do.

Passing Parameters In URLs¶

You often want to send some sort of data in the URL’s query string. If
you were constructing the URL by hand, this data would be given as key/value
pairs in the URL after a question mark, e.g. httpbin.org/get?key=val.
Requests allows you to provide these arguments as a dictionary of strings,
using the params keyword argument. As an example, if you wanted to pass
key1=value1 and key2=value2 to httpbin.org/get, you would use the
following code:

>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.get('https://httpbin.org/get', params=payload)

You can see that the URL has been correctly encoded by printing the URL:

>>> print(r.url)
https://httpbin.org/get?key2=value2&key1=value1

Note that any dictionary key whose value is None will not be added to the
URL’s query string.
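For instance (an illustrative example, not from the original page):

>>> payload = {'key1': 'value1', 'key2': None}
>>> r = requests.get('https://httpbin.org/get', params=payload)
>>> print(r.url)
https://httpbin.org/get?key1=value1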

You can also pass a list of items as a value:

>>> payload = {'key1': 'value1', 'key2': ['value2', 'value3']}

>>> r = requests.get('https://httpbin.org/get', params=payload)
>>> print(r.url)
https://httpbin.org/get?key1=value1&key2=value2&key2=value3

Response Content¶

We can read the content of the server’s response. Consider the GitHub timeline
again:

>>> import requests

>>> r = requests.get('https://api.github.com/events')
>>> r.text
'[{"repository":{"open_issues":0,"url":"https://github.com/...

Requests will automatically decode content from the server. Most unicode
charsets are seamlessly decoded.

When you make a request, Requests makes educated guesses about the encoding of
the response based on the HTTP headers. The text encoding guessed by Requests
is used when you access r.text. You can find out what encoding Requests is
using, and change it, using the r.encoding property:

>>> r.encoding
'utf-8'
>>> r.encoding = 'ISO-8859-1'

If you change the encoding, Requests will use the new value of r.encoding
whenever you call r.text. You might want to do this in any situation where
you can apply special logic to work out what the encoding of the content will
be. For example, HTML and XML have the ability to specify their encoding in
their body. In situations like this, you should use r.content to find the
encoding, and then set r.encoding. This will let you use r.text with
the correct encoding.
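A minimal sketch of that pattern, assuming an HTML page that declares its charset in the body (the regular expression is illustrative, not a robust HTML parser):

>>> import re
>>> r = requests.get('https://example.com/page.html')   # hypothetical URL
>>> m = re.search(rb'charset=["\']?([\w-]+)', r.content)
>>> if m:
...     r.encoding = m.group(1).decode('ascii')
...
>>> r.text   # now decoded with the encoding declared in the document itself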

Requests will also use custom encodings in the event that you need them. If
you have created your own encoding and registered it with the codecs
module, you can simply use the codec name as the value of r.encoding and
Requests will handle the decoding for you.

Binary Response Content¶

You can also access the response body as bytes, for non-text requests:

>>> r.content
b'[{"repository":{"open_issues":0,"url":"https://github.com/...

The gzip and deflate transfer-encodings are automatically decoded for you.

The br transfer-encoding is automatically decoded for you if a Brotli library
like brotli or brotlicffi is installed.

For example, to create an image from binary data returned by a request, you can
use the following code:

>>> from PIL import Image
>>> from io import BytesIO

>>> i = Image.open(BytesIO(r.content))

JSON Response Content¶

There’s also a builtin JSON decoder, in case you’re dealing with JSON data:

>>> import requests

>>> r = requests.get('https://api.github.com/events')
>>> r.json()
[{'repository': {'open_issues': 0, 'url': 'https://github.com/...

In case the JSON decoding fails, r.json() raises an exception. For example, if
the response gets a 204 (No Content), or if the response contains invalid JSON,
attempting r.json() raises requests.exceptions.JSONDecodeError. This wrapper exception
provides interoperability for multiple exceptions that may be thrown by different
python versions and json serialization libraries.

It should be noted that the success of the call to r.json() does not
indicate the success of the response. Some servers may return a JSON object in a
failed response (e.g. error details with HTTP 500). Such JSON will be decoded
and returned. To check that a request is successful, use
r.raise_for_status() or check r.status_code is what you expect.
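A minimal sketch of that check (illustrative):

>>> r = requests.get('https://api.github.com/events')
>>> r.raise_for_status()   # raises HTTPError for 4XX/5XX responses
>>> data = r.json()        # decode only after the status looks right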

Raw Response Content¶

In the rare case that you’d like to get the raw socket response from the
server, you can access r.raw. If you want to do this, make sure you set
stream=True in your initial request. Once you do, you can do this:

>>> r = requests.get('https://api.github.com/events', stream=True)

>>> r.raw
<urllib3.response.HTTPResponse object at 0x101194810>

>>> r.raw.read(10)
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'

In general, however, you should use a pattern like this to save what is being
streamed to a file:

with open(filename, 'wb') as fd:
    for chunk in r.iter_content(chunk_size=128):
        fd.write(chunk)

Using Response.iter_content will handle a lot of what you would otherwise
have to handle when using Response.raw directly. When streaming a
download, the above is the preferred and recommended way to retrieve the
content. Note that chunk_size can be freely adjusted to a number that
may better fit your use cases.

Note

An important note about using Response.iter_content versus Response.raw.
Response.iter_content will automatically decode the gzip and deflate
transfer-encodings. Response.raw is a raw stream of bytes – it does not
transform the response content. If you really need access to the bytes as they
were returned, use Response.raw.

More complicated POST requests¶

Typically, you want to send some form-encoded data — much like an HTML form.
To do this, simply pass a dictionary to the data argument. Your
dictionary of data will automatically be form-encoded when the request is made:

>>> payload = {'key1': 'value1', 'key2': 'value2'}

>>> r = requests.post('https://httpbin.org/post', data=payload)
>>> print(r.text)
{
  ...
  "form": {
    "key2": "value2",
    "key1": "value1"
  },
  ...
}

The data argument can also have multiple values for each key. This can be
done by making data either a list of tuples or a dictionary with lists
as values. This is particularly useful when the form has multiple elements that
use the same key:

>>> payload_tuples = [('key1', 'value1'), ('key1', 'value2')]
>>> r1 = requests.post('https://httpbin.org/post', data=payload_tuples)
>>> payload_dict = {'key1': ['value1', 'value2']}
>>> r2 = requests.post('https://httpbin.org/post', data=payload_dict)
>>> print(r1.text)
{
  ...
  "form": {
    "key1": [
      "value1",
      "value2"
    ]
  },
  ...
}
>>> r1.text == r2.text
True

There are times that you may want to send data that is not form-encoded. If
you pass in a string instead of a dict, that data will be posted directly.

For example, the GitHub API v3 accepts JSON-Encoded POST/PATCH data:

>>> import json

>>> url = 'https://api.github.com/some/endpoint'
>>> payload = {'some': 'data'}

>>> r = requests.post(url, data=json.dumps(payload))

Please note that the above code will NOT add the Content-Type header
(so in particular it will NOT set it to application/json).

If you need that header set and you don’t want to encode the dict yourself,
you can also pass it directly using the json parameter (added in version 2.4.2)
and it will be encoded automatically:

>>> url = 'https://api.github.com/some/endpoint'
>>> payload = {'some': 'data'}
>>> r = requests.post(url, json=payload)

Note, the json parameter is ignored if either data or files is passed.

POST a Multipart-Encoded File¶

Requests makes it simple to upload Multipart-encoded files:

>>> url = 'https://httpbin.org/post'
>>> files = {'file': open('report.xls', 'rb')}

>>> r = requests.post(url, files=files)
>>> r.text
{
  ...
  "files": {
    "file": "<censored...binary...data>"
  },
  ...
}

You can set the filename, content_type and headers explicitly:

>>> url = 'https://httpbin.org/post'
>>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}

>>> r = requests.post(url, files=files)
>>> r.text
{
  ...
  "files": {
    "file": "<censored...binary...data>"
  },
  ...
}

If you want, you can send strings to be received as files:

>>> url = 'https://httpbin.org/post'
>>> files = {'file': ('report.csv', 'some,data,to,send\nanother,row,to,send\n')}

>>> r = requests.post(url, files=files)
>>> r.text
{
  ...
  "files": {
    "file": "some,data,to,send\nanother,row,to,send\n"
  },
  ...
}

In the event you are posting a very large file as a multipart/form-data
request, you may want to stream the request. By default, requests does not
support this, but there is a separate package which does —
requests-toolbelt. You should read the toolbelt’s documentation for more details about how to use it.
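A rough sketch of a streamed upload using the toolbelt's MultipartEncoder (assuming requests-toolbelt is installed; see its documentation for the authoritative API):

>>> import requests
>>> from requests_toolbelt.multipart.encoder import MultipartEncoder

>>> m = MultipartEncoder(fields={'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel')})
>>> r = requests.post('https://httpbin.org/post', data=m, headers={'Content-Type': m.content_type})

Because the encoder object is passed as data, the body is streamed from the file rather than read into memory first.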

For sending multiple files in one request refer to the advanced
section.

Warning

It is strongly recommended that you open files in binary
mode
. This is because Requests may attempt to provide
the Content-Length header for you, and if it does this value
will be set to the number of bytes in the file. Errors may occur
if you open the file in text mode.

Response Status Codes¶

We can check the response status code:

>>> r = requests.get('https://httpbin.org/get')
>>> r.status_code
200

Requests also comes with a built-in status code lookup object for easy
reference:

>>> r.status_code == requests.codes.ok
True

If we made a bad request (a 4XX client error or 5XX server error response), we
can raise it with
Response.raise_for_status():

>>> bad_r = requests.get('https://httpbin.org/status/404')
>>> bad_r.status_code
404

>>> bad_r.raise_for_status()
Traceback (most recent call last):
  File "requests/models.py", line 832, in raise_for_status
    raise http_error
requests.exceptions.HTTPError: 404 Client Error

But, since our status_code for r was 200, when we call
raise_for_status() we get:

>>> r.raise_for_status()
None

All is well.

Cookies¶

If a response contains some Cookies, you can quickly access them:

>>> url = 'http://example.com/some/cookie/setting/url'
>>> r = requests.get(url)

>>> r.cookies['example_cookie_name']
'example_cookie_value'

To send your own cookies to the server, you can use the cookies
parameter:

>>> url = 'https://httpbin.org/cookies'
>>> cookies = dict(cookies_are='working')

>>> r = requests.get(url, cookies=cookies)
>>> r.text
'{"cookies": {"cookies_are": "working"}}'

Cookies are returned in a RequestsCookieJar,
which acts like a dict but also offers a more complete interface,
suitable for use over multiple domains or paths. Cookie jars can
also be passed in to requests:

>>> jar = requests.cookies.RequestsCookieJar()
>>> jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')
>>> jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere')
>>> url = 'https://httpbin.org/cookies'
>>> r = requests.get(url, cookies=jar)
>>> r.text
'{"cookies": {"tasty_cookie": "yum"}}'

Redirection and History¶

By default Requests will perform location redirection for all verbs except
HEAD.

We can use the history property of the Response object to track redirection.

The Response.history list contains the
Response objects that were created in order to
complete the request. The list is sorted from the oldest to the most recent
response.

For example, GitHub redirects all HTTP requests to HTTPS:

>>> r = requests.get('http://github.com/')

>>> r.url
'https://github.com/'

>>> r.status_code
200

>>> r.history
[<Response [301]>]

If you’re using GET, OPTIONS, POST, PUT, PATCH or DELETE, you can disable
redirection handling with the allow_redirects parameter:

>>> r = requests.get('http://github.com/', allow_redirects=False)

>>> r.status_code
301

>>> r.history
[]

If you’re using HEAD, you can enable redirection as well:

>>> r = requests.head('http://github.com/', allow_redirects=True)

>>> r.url
'https://github.com/'

>>> r.history
[<Response [301]>]

Timeouts¶

You can tell Requests to stop waiting for a response after a given number of
seconds with the timeout parameter. Nearly all production code should use
this parameter in nearly all requests. Failure to do so can cause your program
to hang indefinitely:

>>> requests.get('https://github.com/', timeout=0.001)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
requests.exceptions.Timeout: HTTPConnectionPool(host='github.com', port=80): Request timed out. (timeout=0.001)

Note

timeout is not a time limit on the entire response download;
rather, an exception is raised if the server has not issued a
response for timeout seconds (more precisely, if no bytes have been
received on the underlying socket for timeout seconds). If no timeout is specified explicitly, requests do
not time out.

Errors and Exceptions¶

In the event of a network problem (e.g. DNS failure, refused connection, etc),
Requests will raise a ConnectionError exception.

Response.raise_for_status() will
raise an HTTPError if the HTTP request
returned an unsuccessful status code.

If a request times out, a Timeout exception is
raised.

If a request exceeds the configured number of maximum redirections, a
TooManyRedirects exception is raised.
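TooManyRedirects is not demonstrated elsewhere on this page; a minimal sketch, assuming httpbin.org's /redirect/<n> endpoint and a lowered Session.max_redirects limit:

>>> s = requests.Session()
>>> s.max_redirects = 2        # the default limit is 30
>>> try:
...     s.get('https://httpbin.org/redirect/5')
... except requests.exceptions.TooManyRedirects:
...     print('Too many redirects')
...
Too many redirects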

All exceptions that Requests explicitly raises inherit from
requests.exceptions.RequestException.


Ready for more? Check out the advanced section.


24 Dec. 2015, Python, 345046 views

The Python standard library includes several ready-made modules for working with HTTP:

  • urllib
  • httplib

If you really want to go hardcore, you can work with socket directly. But all of these modules share one big drawback: they are inconvenient to work with.

First, there is an abundance of classes and functions. Second, the resulting code is not at all pythonic. Many programmers love Python for its elegance and simplicity, which is why a module was created to solve the problems of the existing ones, and its name is requests, or HTTP For Humans. At the time of writing, the latest version of the library is 2.9.1. Since the release of Python 3.5 I have made an unspoken promise to write new code only for Py >= 3.5. It is high time to move over to the 3.x branch completely, so in my examples print is now a function, not a statement :-)

So what can requests do?

To begin with, I want to show what HTTP code looks like when using modules from the Python standard library versus code written with requests. The very handy service httpbin.org will serve as the target for our HTTP requests.


>>> import urllib.request
>>> response = urllib.request.urlopen('https://httpbin.org/get')
>>> print(response.read())
b'{\n  "args": {}, \n  "headers": {\n    "Accept-Encoding": "identity", \n    "Host": "httpbin.org", \n    "User-Agent": "Python-urllib/3.5"\n  }, \n  "origin": "95.56.82.136", \n  "url": "https://httpbin.org/get"\n}\n'
>>> print(response.getheader('Server'))
nginx
>>> print(response.getcode())
200
>>> 

By the way, urllib.request is a layer on top of the "low-level" httplib library that I mentioned above.

>>> import requests
>>> response = requests.get('https://httpbin.org/get')
>>> print(response.content)
b'{\n  "args": {}, \n  "headers": {\n    "Accept": "*/*", \n    "Accept-Encoding": "gzip, deflate", \n    "Host": "httpbin.org", \n    "User-Agent": "python-requests/2.9.1"\n  }, \n  "origin": "95.56.82.136", \n  "url": "https://httpbin.org/get"\n}\n'
>>> response.json()
{'headers': {'Accept-Encoding': 'gzip, deflate', 'User-Agent': 'python-requests/2.9.1', 'Host': 'httpbin.org', 'Accept': '*/*'}, 'args': {}, 'origin': '95.56.82.136', 'url': 'https://httpbin.org/get'}
>>> response.headers
{'Connection': 'keep-alive', 'Content-Type': 'application/json', 'Server': 'nginx', 'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Origin': '*', 'Content-Length': '237', 'Date': 'Wed, 23 Dec 2015 17:56:46 GMT'}
>>> response.headers.get('Server')
'nginx'

For simple requests there is no significant difference between them. But let's take a look at working with Basic Auth:


>>> import urllib.request
>>> password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
>>> top_level_url = 'https://httpbin.org/basic-auth/user/passwd'
>>> password_mgr.add_password(None, top_level_url, 'user', 'passwd')
>>> handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
>>> opener = urllib.request.build_opener(handler)
>>> response = opener.open(top_level_url)
>>> response.getcode()
200
>>> response.read()
b'{\n  "authenticated": true, \n  "user": "user"\n}\n'

>>> import requests
>>> response = requests.get('https://httpbin.org/basic-auth/user/passwd', auth=('user', 'passwd'))
>>> print(response.content)
b'{\n  "authenticated": true, \n  "user": "user"\n}\n'
>>> print(response.json())
{'user': 'user', 'authenticated': True}

Now can you feel the difference between pythonic and non-pythonic? I think the difference is obvious. And despite the fact that requests is nothing more than a wrapper around urllib3, which in turn is built on top of Python's standard tools, convenience of writing code is priority number one in most cases.

requests offers:

  • Many HTTP authentication methods
  • Sessions with cookies
  • Full SSL support
  • Various convenience methods such as .json() that return data in the desired format
  • Proxy support
  • Sensible and logical exception handling

I would like to discuss the last point in a bit more detail.

Exception handling in requests

When working with external services, you should never rely on their fault tolerance. Everything falls over sooner or later, so we programmers need to be prepared for it, preferably in advance and in a calm setting.

So how does requests deal with the various failures that can occur during network connections? First, let's define the kinds of problems that may arise:

  • The host is unreachable. This kind of error usually stems from DNS configuration problems (DNS lookup failure).
  • The connection times out.
  • HTTP errors. You can read more about HTTP status codes here.
  • SSL connection errors (usually due to problems with the SSL certificate: expired, not trusted, etc.).

The base exception class in requests is RequestException. All the others inherit from it:

  • HTTPError
  • ConnectionError
  • Timeout
  • SSLError
  • ProxyError

And so on. The full list of exceptions can be found in requests.exceptions.

Timeout

requests has two kinds of timeout exceptions:

  • ConnectTimeout: a timeout on establishing the connection
  • ReadTimeout: a timeout on reading the response

>>> import requests
>>> try:
...     response = requests.get('https://httpbin.org/user-agent', timeout=(0.00001, 10))
... except requests.exceptions.ConnectTimeout:
...     print('Oops. Connection timeout occurred!')
...     
Oops. Connection timeout occurred!
>>> try:
...     response = requests.get('https://httpbin.org/user-agent', timeout=(10, 0.0001))
... except requests.exceptions.ReadTimeout:
...     print('Oops. Read timeout occurred')
... except requests.exceptions.ConnectTimeout:
...     print('Oops. Connection timeout occurred!')
...     
Oops. Read timeout occurred

ConnectionError


>>> import requests
>>> try:
...     response = requests.get('http://urldoesnotexistforsure.bom')
... except requests.exceptions.ConnectionError:
...     print('Seems like dns lookup failed..')
...     
Seems like dns lookup failed..

HTTPError


>>> import requests
>>> try:
...     response = requests.get('https://httpbin.org/status/500')
...     response.raise_for_status()
... except requests.exceptions.HTTPError as err:
...     print('Oops. HTTP Error occured')
...     print('Response is: {content}'.format(content=err.response.content))
...     
Oops. HTTP Error occured
Response is: b''
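
SSLError

A minimal sketch for the SSL error case from the list above, assuming the expired-certificate test host expired.badssl.com:

>>> import requests
>>> try:
...     response = requests.get('https://expired.badssl.com/')
... except requests.exceptions.SSLError:
...     print('Oops. SSL certificate verification failed')
...     
Oops. SSL certificate verification failed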

I have listed the main kinds of exceptions, which cover perhaps 90% of all the problems that arise when working with HTTP. The main thing to remember is that if we really intend to catch and handle something, it has to be programmed explicitly; if the specific exception type doesn't matter to us, we can catch the common base class RequestException and act depending on the concrete case, for example log the exception and re-raise it further up. By the way, I will write a separate detailed post about logging.


Useful extras

  • httpbin.org: a very useful service for testing HTTP clients, particularly handy for testing non-standard service behaviour
  • httpie: a console HTTP client (a curl replacement) written in Python
  • responses: a mock library for working with requests
  • HTTPretty: a mock library for working with HTTP modules


