The Cookie module, as the name suggests, is the module used to manipulate cookies.
Anyone who has done Web development knows cookies: they are small pieces of information that the server and the client exchange in order to maintain a session. The HTTP protocol itself is stateless, which means that two requests sent by the same client have no direct relationship as far as the Web server is concerned. So you may ask: if HTTP is stateless, why can some web pages only be accessed after entering a username and password and passing authentication?
The reason is that, for an authenticated user, the server quietly adds a cookie to the data it sends to the client. The cookie generally holds a unique ID that identifies the client. On its next request, the client sends that ID back to the server in the form of a cookie; the server extracts the ID from the returned cookie and binds it to the corresponding user, and that is how authentication is carried across requests. Put simply, a cookie is a string passed back and forth between the server and the client (the figure below shows the cookies of a visited site as viewed with the Firefox Firebug plug-in).
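The round trip described above can be sketched in a few lines. This is only an illustration under some assumptions: the session table `SESSIONS`, the cookie name `session_id`, and the ID `abc123` are made up for the example, and the import falls back from Python 3's `http.cookies` to the Python 2 `Cookie` module that this article covers.

```python
try:
    from http.cookies import SimpleCookie  # Python 3 name of the module
except ImportError:
    from Cookie import SimpleCookie  # Python 2 name used in this article

# Hypothetical server-side session table: session ID -> user name.
SESSIONS = {"abc123": "DarkBull"}

# 1. After a successful login the server attaches the ID to its response.
response_cookie = SimpleCookie()
response_cookie["session_id"] = "abc123"
set_cookie_header = response_cookie.output()

# 2. On the next request the client sends the pair back in a Cookie header.
request_cookie_header = "session_id=abc123"

# 3. The server parses the header and maps the ID back to a user.
request_cookie = SimpleCookie()
request_cookie.load(request_cookie_header)
session_id = request_cookie["session_id"].value
user = SESSIONS.get(session_id)  # None would mean "not authenticated"

print(set_cookie_header)  # Set-Cookie: session_id=abc123
print(user)               # DarkBull
```

A real application would generate an unguessable ID and keep the session table somewhere persistent; the sketch only shows how the string travels in both directions.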
The Cookie module defines four classes that directly manipulate cookies: BaseCookie, SimpleCookie, SerialCookie, and SmartCookie. BaseCookie is the base class that defines the common part of cookie handling; the other three classes inherit from it, and the only difference between them is the way they serialize data. The following is a brief explanation of how to use these classes.
BaseCookie base class: a BaseCookie behaves very much like a dict and can be manipulated as key/value pairs, but each key must be a string and each value is stored as a Morsel object (more on Morsel below). BaseCookie defines the common interface for encoding/decoding and for input/output:
- BaseCookie.value_encode(val), BaseCookie.value_decode(val): serialize/deserialize the data. value_encode turns a value into a string that can be transmitted over HTTP, and value_decode turns such a string back into a value.
- BaseCookie.output(): returns a string that can be sent to the client as HTTP response headers.
- BaseCookie.js_output(): returns a string containing an embedded JavaScript snippet; the browser obtains the cookie data by executing the script.
- BaseCookie.load(newdata): parses the string as cookie data.
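The methods listed above can be tried out through SimpleCookie, which inherits all of them. The following is a minimal sketch rather than a complete tour; the cookie names and values are only sample data, and the import falls back from Python 3's `http.cookies` to the Python 2 `Cookie` module.

```python
try:
    from http.cookies import SimpleCookie, Morsel  # Python 3
except ImportError:
    from Cookie import SimpleCookie, Morsel  # Python 2

jar = SimpleCookie()
jar.load("name=DarkBull; address=ChinaHangZhou")  # parse a Cookie header string

# Dict-like access: keys are strings, values are Morsel objects.
names = sorted(jar.keys())
is_morsel = isinstance(jar["name"], Morsel)
name_value = jar["name"].value

# Assigning a plain value wraps it in a Morsel automatically.
jar["lang"] = "python"

header_text = jar.output()     # one "Set-Cookie: ..." line per data item
script_text = jar.js_output()  # <script> snippet that sets document.cookie

print(names)
print(header_text)
```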
SimpleCookie, SerialCookie, and SmartCookie all inherit from BaseCookie and behave the same way. Each of them overrides BaseCookie's value_decode and value_encode to implement its own serialization/deserialization strategy:
SimpleCookie uses str() internally to serialize data;
SerialCookie serializes and deserializes data with the pickle module;
SmartCookie is a little smarter: it uses pickle for non-string data and otherwise returns the string as is.
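To see how overriding these two methods changes the serialization strategy, here is a sketch of a custom subclass. `JSONCookie` is a hypothetical class invented for this example, not part of the module, and it uses base64-encoded JSON instead of pickle because unpickling data that comes back from a client is unsafe. Both methods return a pair of (real value, coded string), which is what BaseCookie expects.

```python
import base64
import json

try:
    from http.cookies import BaseCookie  # Python 3
except ImportError:
    from Cookie import BaseCookie  # Python 2


class JSONCookie(BaseCookie):
    """Hypothetical subclass: stores values as base64-encoded JSON."""

    def value_encode(self, val):
        # Return (real value, string that goes on the wire).
        raw = json.dumps(val).encode("utf-8")
        return val, base64.urlsafe_b64encode(raw).decode("ascii")

    def value_decode(self, val):
        # Return (real value, string that came off the wire).
        raw = base64.urlsafe_b64decode(val.encode("ascii"))
        return json.loads(raw.decode("utf-8")), val


sent = JSONCookie()
sent["prefs"] = {"theme": "dark", "size": 12}
wire = sent["prefs"].coded_value  # text-safe string sent over HTTP

received = JSONCookie()
received.load("prefs=" + wire)    # what the client would send back
restored = received["prefs"].value

print(wire)
print(restored)
```

The value keeps its original type on both sides, while only the coded string ever appears in the HTTP headers.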
The following example simply illustrates how to use the Cookie module:
    import Cookie

    c = Cookie.SimpleCookie()
    c['name'] = 'DarkBull'
    c['address'] = 'ChinaHangZhou'
    c['address']['path'] = '/'  # path
    c['address']['domain'] = ''  # domain
    c['address']['expires'] = 'Fri, 01-Oct-2010 20:00:00 GMT'  # expiration date
    print c.output()
    print c.js_output()

    # Output results, compare with the figure above
    # Set-Cookie: address=ChinaHangZhou; Domain=; expires=Fri, 01-Oct-2010 20:00:00 GMT; Path=/
    # Set-Cookie: name=DarkBull

    # Output as a script
    # <script type="text/javascript">
    # document.cookie = "address=ChinaHangZhou; Domain=; expires=Fri, 01-Oct-2010 20:00:00 GMT; Path=/";
    # </script>
    # <script type="text/javascript">
    # document.cookie = "name=DarkBull";
    # </script>
Morsel class: a class that abstracts a single data item in a cookie together with its attributes. These attributes include expires, path, comment, domain, max-age, secure, version and so on (see the underlined part of the picture above). If you have done any Web work you should be familiar with them, and you can find their definitions in RFC 2109.
Morsel.key, Morsel.value: the key/value of the cookie data item (value can be binary data);
Morsel.coded_value: the string obtained after the data is encoded. HTTP is a text-based protocol, so the server cannot send binary data directly to the client; the data can only be sent after it has been serialized into a string;
Morsel.set(key, value, coded_value): sets the key, value, and coded_value of the cookie data item;
Morsel.isReservedKey(key): returns True if key is one of expires, path, comment, domain, max-age, secure, version, httponly; otherwise returns False;
Morsel.output(): returns a string of the form "Set-Cookie: ...", representing a cookie data item;
Morsel.js_output(): returns the script string of the cookie data item;
Morsel.OutputString(): returns a string representation of the Morsel, without the "Set-Cookie:" prefix;
Morsel usage example:
    import Cookie

    m = Cookie.Morsel()
    m.set('name', 'DarkBull', 'DarkBull')
    m['expires'] = 'Fri, 01-Oct-2010 20:00:00 GMT'
    m['domain'] = ''
    print m.output()

    # Results
    # Set-Cookie: name=DarkBull; Domain=; expires=Fri, 01-Oct-2010 20:00:00 GMT
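The remaining Morsel members described above can be exercised the same way. This is a small sketch with sample values; the import falls back from Python 3's `http.cookies` to the Python 2 `Cookie` module.

```python
try:
    from http.cookies import Morsel  # Python 3
except ImportError:
    from Cookie import Morsel  # Python 2

m = Morsel()
m.set("name", "DarkBull", "DarkBull")  # key, value, coded_value
m["path"] = "/"

key_and_value = (m.key, m.value, m.coded_value)
reserved = m.isReservedKey("expires")   # True: expires is a cookie attribute
not_reserved = m.isReservedKey("name")  # False: name is an ordinary data key
bare = m.OutputString()                 # no "Set-Cookie:" prefix
header = m.output()                     # with "Set-Cookie:" prefix

print(bare)    # name=DarkBull; Path=/
print(header)  # Set-Cookie: name=DarkBull; Path=/
```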
Additional notes:
Why use cookies?
Cookies are data (usually encrypted) that certain websites store on the user's local machine for the purpose of user identification and session tracking.
For example, some websites require you to log in before a certain page can be accessed, so crawling that page's content without logging in is not allowed. In that case we can use the urllib2 library to save our login cookies and then crawl the other pages.
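A rough sketch of that idea follows. It is written for Python 3, where urllib2 and cookielib have become `urllib.request` and `http.cookiejar`; the function `fetch_after_login` and its three parameters are hypothetical names chosen for the example, and the login form of a real site will differ.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Python 2 equivalents: cookielib.CookieJar, urllib.urlencode,
# urllib2.HTTPCookieProcessor and urllib2.build_opener.

cookie_jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(cookie_jar)
)


def fetch_after_login(login_url, page_url, form_fields):
    """Log in, then fetch a protected page with the same opener.

    All three arguments are placeholders supplied by the caller.
    """
    body = urllib.parse.urlencode(form_fields).encode("utf-8")
    # Cookies set by the login response are stored in cookie_jar.
    opener.open(login_url, body).close()
    # The stored cookies are sent back automatically on this request.
    with opener.open(page_url) as response:
        return response.read()
```

Because every request goes through the same opener, the cookies received at login are reused for the pages fetched afterwards.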
That concludes this detailed look at how to use the Cookie module in Python. For more information about using the Cookie module in Python, please see my other related articles!