SoFunction
Updated on 2024-11-19

Usage and description of Python's urllib.parse module

Usage of urllib.parse

Parsing

urllib.parse.urlparse(url, scheme='', allow_fragments=True)

Simple to use:

urlparse

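from urllib import parse

# Parse the URL. www.example.com is a placeholder host; the host used in the original example was lost.
print(parse.urlparse('https://www.example.com/'))
print(parse.urlparse('https://www.example.com/', scheme='http'))
print(parse.urlparse('www.example.com/', scheme='http'))
# Here are the results
ParseResult(scheme='https', netloc='www.example.com', path='/', params='', query='', fragment='')
ParseResult(scheme='https', netloc='www.example.com', path='/', params='', query='', fragment='')
ParseResult(scheme='http', netloc='', path='www.example.com/', params='', query='', fragment='')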

As you can see, the scheme parameter only changes the result when the URL itself carries no scheme: it acts as a default value.

When the URL already contains a scheme, the scheme argument is ignored.
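
The default-scheme behavior can be checked with a minimal sketch like the one below. The host www.example.com and the path are placeholders chosen for illustration, not URLs from the article.

```python
from urllib.parse import urlparse

# www.example.com is a placeholder host used only for illustration.
with_scheme = urlparse('https://www.example.com/index.html', scheme='http')
without_scheme = urlparse('//www.example.com/index.html', scheme='http')

# The URL already names a scheme, so the default 'http' is ignored.
print(with_scheme.scheme)      # https
# The URL has no scheme, so the default is used.
print(without_scheme.scheme)   # http
# The leading '//' is what marks www.example.com as the network location.
print(without_scheme.netloc)   # www.example.com
```

Note that the fields of the result can be read by name (scheme, netloc, path, params, query, fragment), which is usually clearer than indexing the tuple.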

Parsing also has a reverse operation: urlunparse() assembles the individual components back into a single URL.

from urllib import parse
# Assemble the list elements into a URL
url = ['http', 'www', 'baidu', 'com', 'dfdf', 'eddffa']  # Exactly six elements are required
print(parse.urlunparse(url))
# Here is the result
http://www/baidu;com?dfdf#eddffa

urlunparse() takes an iterable of components whose length must be exactly six (scheme, netloc, path, params, query, fragment); any other length raises a ValueError.
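
As a small sketch of the length requirement, the snippet below reuses the six-element list from the example above and then passes only five elements. In CPython the short list fails with a ValueError when the components are unpacked; the exact error message may differ between versions.

```python
from urllib.parse import urlunparse

# Exactly six components: scheme, netloc, path, params, query, fragment.
parts = ['http', 'www', 'baidu', 'com', 'dfdf', 'eddffa']
result = urlunparse(parts)
print(result)  # http://www/baidu;com?dfdf#eddffa

# Any other length fails when the components are unpacked.
try:
    urlunparse(parts[:5])
    failed = False
except ValueError as exc:
    failed = True
    print('ValueError:', exc)
```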

urljoin(): resolves the URL given as the second argument against the base URL given as the first argument, filling in whatever parts the second one is missing.

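from urllib import parse
# Join two URLs: parts missing from the second argument are taken from the first.
# If the second argument starts with '/', it replaces the whole path of the first.
# www.example.com is a placeholder host; the host used in the original example was lost.
print(parse.urljoin('https://www.example.com/', 'index'))
print(parse.urljoin('https://www.example.com/', '/login'))
# Here are the results
https://www.example.com/index
https://www.example.com/login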
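
The three typical cases of urljoin() can be sketched as follows. The base URL and the second host are placeholders invented for this illustration.

```python
from urllib.parse import urljoin

# Placeholder base URL, used only for illustration.
base = 'https://www.example.com/docs/guide.html'

# A relative path replaces the last segment of the base path.
relative = urljoin(base, 'index')
# A path starting with '/' replaces the whole base path.
rooted = urljoin(base, '/login')
# A complete URL with its own scheme and host wins outright.
absolute = urljoin(base, 'http://other.example.org/a')

print(relative)  # https://www.example.com/docs/index
print(rooted)    # https://www.example.com/login
print(absolute)  # http://other.example.org/a
```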

urlencode

The urllib library has a urlencode() function that converts key-value pairs into a query string such as a=1&b=2. In Python 3 it lives in urllib.parse. The example below passes encoding='gbk', so the non-ASCII value is percent-encoded from its GBK bytes:

>>> from urllib.parse import urlencode
>>> data = {
...     'a': 'test',
...     'name': '魔兽'
... }
>>> print(urlencode(data, encoding='gbk'))
a=test&name=%C4%A7%CA%DE

If you only want to encode a single string rather than key-value pairs, urllib.parse provides another function, quote():

>>> from urllib.parse import quote
>>> quote('魔兽', encoding='gbk')
'%C4%A7%CA%DE'

urldecode

A string that was passed through urlencode has to be decoded on the receiving side. urllib.parse has no urldecode() function; it provides unquote() instead.

>>> from urllib.parse import unquote, unquote_to_bytes
>>> unquote_to_bytes('%C4%A7%CA%DE')
b'\xc4\xa7\xca\xde'
>>> print(unquote('%C4%A7%CA%DE', encoding='gbk'))
魔兽
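
Although there is no urldecode(), urllib.parse does offer parse_qs() and parse_qsl(), which split a whole query string back into keys and values. A minimal sketch, using an ASCII-only dictionary chosen for illustration:

```python
from urllib.parse import urlencode, parse_qs, parse_qsl

data = {'a': 'test', 'name': 'The Beast'}

# urlencode() uses quote_plus() by default, so the space becomes '+'.
encoded = urlencode(data)
print(encoded)                   # a=test&name=The+Beast

# parse_qs() returns a list per key, because a key may repeat.
as_lists = parse_qs(encoded)
print(as_lists)                  # {'a': ['test'], 'name': ['The Beast']}

# parse_qsl() returns (key, value) pairs, convenient for rebuilding a dict.
as_dict = dict(parse_qsl(encoded))
print(as_dict)                   # {'a': 'test', 'name': 'The Beast'}
```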

Encoding and decoding functions

Python provides functions for encoding and decoding URLs, namely urlencode() and unquote().

Encoding with urlencode()

# Import the parse module
from urllib import parse
# Call the parse module's urlencode() to encode the query parameters
query_string = {'wd': 'Crawler'}
result = parse.urlencode(query_string)
# Use format() to splice the encoded query into the URL
url = '/s?{}'.format(result)
print(url)

Encoding operations on url addresses

Encoding quote(string)

from urllib import parse
url = "/s?wd={}"
words = input('Please enter content: ')
# quote() encodes a single string
query_string = parse.quote(words)
url = url.format(query_string)
print(url)

quote() encodes a single string (or bytes object), while urlencode() encodes a set of key-value pairs into a complete query string.
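
The difference is easiest to see side by side. The sketch below uses an arbitrary sample value, 'a b/c', to show how quote(), quote_plus() and urlencode() treat spaces and slashes.

```python
from urllib.parse import quote, quote_plus, urlencode

value = 'a b/c'  # arbitrary sample value for illustration

# quote() keeps '/' by default and encodes the space as %20.
print(quote(value))               # a%20b/c
# With safe='' the slash is encoded as well.
print(quote(value, safe=''))      # a%20b%2Fc
# quote_plus() encodes the space as '+' and the slash as %2F.
print(quote_plus(value))          # a+b%2Fc
# urlencode() applies quote_plus() to every key and value.
print(urlencode({'wd': value}))   # wd=a+b%2Fc
```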

Decode unquote(string)

from urllib import parse
string = '%E7%88%AC%E8%99%AB'
result = parse.unquote(string)
print(result)

Decoding restores an encoded URL to its original, readable form.
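
A quick round trip confirms that quote() and unquote() are inverses of each other. The sketch also shows unquote_plus(), which additionally turns '+' back into a space; the 'a+b' sample is made up for illustration.

```python
from urllib.parse import quote, unquote, unquote_plus

encoded = '%E7%88%AC%E8%99%AB'

# unquote() decodes percent-escapes as UTF-8 by default.
decoded = unquote(encoded)
print(decoded)                 # 爬虫

# Encoding the decoded text again gives back the original string.
print(quote(decoded))          # %E7%88%AC%E8%99%AB

# unquote() leaves '+' alone; unquote_plus() treats it as a space.
print(unquote('a+b'))          # a+b
print(unquote_plus('a+b'))     # a b
```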

URL splicing methods

String concatenation

query1 = '/s?'
query2 = 'wd=%E7%88%AC%E8%99%AB'
url = query1 + query2

String formatting

query2 = 'wd=%E7%88%AC%E8%99%AB'
url = '/s?%s' % query2

format()

# Import the parse module
from urllib import parse
# Call the parse module's urlencode() to encode the query parameters
query_string = {'wd': 'Crawler'}
result = parse.urlencode(query_string)
# Use format() to splice the encoded query into the URL
url = '/s?{}'.format(result)
print(url)
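
As an alternative to splicing strings by hand, the pieces can also be assembled with urlunparse(). This is only a sketch: the host www.example.com and the extra 'pn' parameter are hypothetical and were added purely for illustration.

```python
from urllib.parse import urlencode, urlunparse

# 'pn' is a hypothetical extra parameter, added only for illustration.
query = urlencode({'wd': 'Crawler', 'pn': 10})

# Six components: scheme, netloc, path, params, query, fragment.
# www.example.com is a placeholder host.
url = urlunparse(('https', 'www.example.com', '/s', '', query, ''))
print(url)  # https://www.example.com/s?wd=Crawler&pn=10
```

Letting the library insert the '?' and '&' separators avoids the off-by-one mistakes that manual concatenation tends to invite.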

Summary

The above is based on my personal experience. I hope it serves as a useful reference, and I appreciate your continued support.