I. Functions under the re module
re.match()
The re.match() function tries to match the pattern at the start of the string. If the match succeeds, it returns a match object; otherwise it returns None.
Function Syntax:
re.match(pattern, string, flags=0)
The parameters are described below:
Parameter | Description |
pattern | The regular expression to match |
string | String to match |
flags | Flags, used to control how regular expressions are matched, e.g., whether they are case-sensitive, multi-line matching, etc. |
Requirement: Match the string "hello".
import re
result = re.match("hello", "hello world")
print(result)
Output results:
<re.Match object; span=(0, 5), match='hello'>
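The flags parameter is easiest to understand with a small experiment. The sketch below assumes two standard flags from the re module, re.IGNORECASE and re.MULTILINE; the sample strings are made up for illustration.

```python
import re

# Without flags the match is case-sensitive, so "hello" does not match "Hello".
case_sensitive = re.match("hello", "Hello world")

# re.IGNORECASE makes the same pattern match regardless of case.
case_insensitive = re.match("hello", "Hello world", flags=re.IGNORECASE)

# Even with re.MULTILINE, re.match() only tries the very start of the string,
# so a "hello" at the start of the second line is not found.
second_line = re.match("hello", "first line\nhello world", flags=re.MULTILINE)

print(case_sensitive)            # None
print(case_insensitive.group())  # Hello
print(second_line)               # None
```

If a match anywhere in the string is wanted, re.search() is the function to reach for rather than a flag.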
re.search()
The re.search() function scans the entire string and returns the first match it finds, or None if nothing in the string matches.
Function Syntax:
re.search(pattern, string, flags=0)
The parameters are described below:
Parameter | Description |
pattern | The regular expression to match |
string | String to match |
flags | Flags, used to control how regular expressions are matched, e.g., whether they are case-sensitive, multi-line matching, etc. |
Requirement: Extract the number of times the article was read.
import re
ret = re.search(r"\d+", "The number of readings is 9999.")
print(ret.group())
Output results:
9999
re.fullmatch()
The re.fullmatch() function requires the entire string to match the regular expression. If it does, a corresponding match object is returned; otherwise None is returned.
Function Syntax:
re.fullmatch(pattern, string, flags=0)
The parameters are described below:
Parameter | Description |
pattern | The regular expression to match |
string | String to match |
flags | Flags, used to control how regular expressions are matched, e.g., whether they are case-sensitive, multi-line matching, etc. |
Requirement: Match the string "hello".
import re
result = re.fullmatch("hello", "hello world")
if result:
    print(result.group())
else:
    print("Match failed!")
Output results:
Match failed!
Distinction between re.match(), re.search() and re.fullmatch()
- The match() function only checks whether the RE matches at the beginning of the string.
- The search() function scans the entire string for a match.
- The fullmatch() function checks whether the entire string matches the RE exactly.
- match() returns a match object only if the pattern matches starting at position 0; otherwise it returns None. fullmatch() requires the entire string to match, from position 0 through the end of the string; otherwise it returns None.
import re
print(re.match('super', 'superstition').span())
print(re.match('super', 'insuperable'))
Output results:
(0, 5)
None
import re
print(re.search('super', 'superstition').span())
print(re.search('super', 'insuperable').span())
Output results:
(0, 5)
(2, 7)
import re
print(re.fullmatch('super', 'superstition'))
print(re.fullmatch('super', 'insuperable'))
print(re.fullmatch('super', 'super').group())
Output results:
None
None
super
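The three functions can also be compared side by side. The helper below is a small sketch written for this comparison (the name compare is hypothetical, not part of the re module); it records the span each function finds, or None.

```python
import re

def compare(pattern, text):
    """Return the span found by match(), search() and fullmatch(), or None."""
    results = {}
    for func in (re.match, re.search, re.fullmatch):
        found = func(pattern, text)
        results[func.__name__] = found.span() if found else None
    return results

print(compare("super", "superstition"))
print(compare("super", "insuperable"))
print(compare("super", "super"))
```

Only the last call succeeds for all three functions, because "super" is the only string that the pattern covers from start to end.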
re.findall()
The re.findall() function finds all substrings of a string that match the regular expression and returns them as a list, in the order found. If no match is found, an empty list is returned.
Function Syntax:
re.findall(pattern, string, flags=0)
The parameters are described below:
Parameter | Description |
pattern | The regular expression to match |
string | String to match |
flags | Flags, used to control how regular expressions are matched, e.g., whether they are case-sensitive, multi-line matching, etc. |
Requirement: Match all the numbers.
import re
print(re.findall(r"\d+", "abafa 124ddwa56"))
Output results:
['124', '56']
Requirement: Match words that start with f (\b matches the empty string at a word boundary, so the f must be at the start of a word).
import re
print(re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest'))
Output results:
['foot', 'fell', 'fastest']
If the pattern contains more than one group, findall() returns a list of tuples of strings, one tuple per match and one string per group.
import re
print(re.findall(r'(\w+)=(\d+)', 'set width=20 and height=10'))
Output results:
[('width', '20'), ('height', '10')]
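How the shape of the findall() result depends on the number of groups can be seen in one place. This is an illustrative sketch reusing the sample string from above; the variable names are made up.

```python
import re

text = "set width=20 and height=10"

# No groups: each item is the whole match.
whole = re.findall(r"\w+=\d+", text)

# One group: each item is the text of that group only.
values = re.findall(r"\w+=(\d+)", text)

# Two groups: each item is a tuple, which converts neatly to a dict.
settings = dict(re.findall(r"(\w+)=(\d+)", text))

print(whole)     # ['width=20', 'height=10']
print(values)    # ['20', '10']
print(settings)  # {'width': '20', 'height': '10'}
```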
re.finditer()
The re.finditer() function finds all substrings of the string that match the regular expression and returns them as an iterator of match objects.
Function Syntax:
re.finditer(pattern, string, flags=0)
The parameters are described below:
Parameter | Description |
pattern | The regular expression to match |
string | String to match |
flags | Flags, used to control how regular expressions are matched, e.g., whether they are case-sensitive, multi-line matching, etc. |
Requirement: Match all the numbers.
import re
it = re.finditer(r"\d+", "12a32bc43jf3")
for match in it:
    print(match.group())
Output results:
12
32
43
3
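Because finditer() yields match objects rather than plain strings, each result also carries its position. A minimal sketch, using the same sample string (the name positions is illustrative):

```python
import re

text = "12a32bc43jf3"

# Collect the matched text together with its start and end index.
positions = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\d+", text)]

for item in positions:
    print(item)
```

This is the main reason to prefer finditer() over findall() when the location of each match matters.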
re.split()
The re.split() function splits the string at each substring that matches the pattern and returns the pieces as a list.
Function Syntax:
re.split(pattern, string, maxsplit=0, flags=0)
The parameters are described below:
Parameter | Description |
pattern | The regular expression to match |
string | String to match |
maxsplit | The maximum number of splits; maxsplit=1 splits once. The default, 0, means no limit. |
flags | Flags, used to control how regular expressions are matched, e.g., whether they are case-sensitive, multi-line matching, etc. |
Split string by the occurrences of pattern.
Requirement: Split the string on non-word characters (anything other than a-z, A-Z, 0-9 and _).
import re
print(re.split(r'\W+', 'Words, words, words.'))
Output results:
['Words', 'words', 'words', '']
If maxsplit is non-zero, at most maxsplit splits are performed, and the remainder of the string is returned as the last element of the list.
Requirement: Split the string on non-word characters, but only once.
print(re.split(r'\W+', 'Words, words, words.', maxsplit=1))
Output results:
['Words', 'words, words.']
If capturing parentheses are used in pattern, the text of all groups in the pattern is also included in the resulting list.
print(re.split(r'(\W+)', 'Words, words, words.'))
Output results:
['Words', ', ', 'words', ', ', 'words', '.', '']
If the separator contains a capturing group and it matches at the start of the string, the result starts with an empty string. The same holds for the end of the string.
print(re.split(r'(\W+)', '...words, words...'))
Output results:
['', '...', 'words', ', ', 'words', '...', '']
Splitting strings with regular expressions is more flexible than splitting on a fixed character.
Ordinary splitting code:
print('a b   c'.split(' '))
Output results:
['a', 'b', '', '', 'c']
The plain split() cannot treat consecutive spaces as a single separator. Try a regular expression instead:
import re
# Splits properly no matter how many spaces there are.
print(re.split(r'\s+', 'a b   c'))
# Also split on commas.
print(re.split(r'[\s\,]+', 'a,b, c  d'))
# Also split on semicolons.
print(re.split(r'[\s\,\;]+', 'a,b;; c  d'))
Output results:
['a', 'b', 'c']
['a', 'b', 'c', 'd']
['a', 'b', 'c', 'd']
re.sub()
The re.sub() function replaces the matches of a pattern in a string.
Function Syntax:
re.sub(pattern, repl, string, count=0, flags=0)
The parameters are described below:
Parameter | Description |
pattern | The regular expression to match |
repl | The replacement string, which can also be a function. |
string | The original string to search and replace in. |
count | The maximum number of matches to replace. The default, 0, means replace all matches. |
flags | Flags, used to control how regular expressions are matched, e.g., whether they are case-sensitive, multi-line matching, etc. |
The first three parameters are required; the last two are optional.
Requirement: Delete the comment, then remove the non-numeric characters.
import re
phone = "2004-959-559 # It's a phone number"
# Remove the comment
num = re.sub(r'#.*$', "", phone)
print("Phone number : ", num)
# Remove non-numeric characters
num = re.sub(r'\D', "", phone)
print("Phone number : ", num)
Output results:
Phone number :  2004-959-559
Phone number :  2004959559
The repl argument can be a function. In the example below, the regular expression uses the (?P<name>...) syntax to name the numeric portion of the matched string value; the function receives each match object and returns the replacement string.
Requirement: Multiply the matching numbers by 2.
import re
# Multiply the matched number by 2
def double(matched):
    value = int(matched.group('value'))
    return str(value * 2)
s = 'A23G4HFD567'
print(re.sub(r'(?P<value>\d+)', double, s))
Output results:
A46G8HFD1134
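Two further features of re.sub() are worth a quick sketch: the count parameter from the table above, and referring to a named group from a replacement string with \g<name>. The replacement texts here are arbitrary examples.

```python
import re

s = "A23G4HFD567"

# count=1 replaces only the first match.
first_only = re.sub(r"\d+", "#", s, count=1)

# \g<value> in the replacement string inserts the text of the named group.
wrapped = re.sub(r"(?P<value>\d+)", r"[\g<value>]", s)

print(first_only)  # A#G4HFD567
print(wrapped)     # A[23]G[4]HFD[567]
```

A replacement string is enough when the new text is a fixed rearrangement of the match; a function, as in double() above, is needed when the replacement has to be computed.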
re.compile()
The re.compile() function compiles a regular expression pattern into a regular expression object (pattern object), which can then be used for matching through its match(), search() and other methods.
The behavior of the expression can be changed by passing a flags value as the second argument: re.compile(pattern, flags=0).
prog = re.compile(pattern)
result = prog.match(string)
is equivalent to
result = re.match(pattern, string)
If you need to use the same regular expression many times, calling re.compile() once and reusing the resulting pattern object can make the program more efficient.
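A typical use of a compiled pattern is a loop over many inputs. The following is a minimal sketch; the pattern and the sample lines are illustrative only.

```python
import re

# Compile once, outside the loop, then reuse the pattern object.
number = re.compile(r"\d+")

lines = ["The number of readings is 9999.", "no digits here", "abafa 124ddwa56"]

first_numbers = []
for line in lines:
    found = number.search(line)
    # search() returns None when the line contains no digits.
    first_numbers.append(found.group() if found else None)

print(first_numbers)  # ['9999', None, '124']
```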
Methods and attributes of regular expression objects
Regular expression objects produced by re.compile() support the following methods and attributes:
- Pattern.search(string[, pos[, endpos]]) Scans through string looking for the first location where the pattern matches and returns a corresponding match object. If there is no match, it returns None.
The optional pos parameter gives the index in the string where the search starts. The default is 0. This is not exactly equivalent to slicing the string: the '^' pattern character matches at the real beginning of the string and at positions just after a newline, but not necessarily at the index where the search starts.
The optional endpos parameter limits how far the string is searched. The string is treated as if it were endpos characters long, so only the characters from pos to endpos-1 are searched. If endpos is less than pos, no match is found. Otherwise, if rx is a compiled regular expression object, rx.search(string, 0, 50) is equivalent to rx.search(string[:50], 0).
import re
pattern = re.compile("o")
print(pattern.search("dog"))
print(pattern.search("dog", 2))
Output results:
<re.Match object; span=(1, 2), match='o'>
None
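The difference between pos and slicing only shows up with an anchored pattern. The sketch below uses "^o" as an assumed example pattern to make it visible.

```python
import re

anchored = re.compile("^o")

# pos=1 starts the search at index 1, but '^' still refers to the real
# start of the string, so nothing matches.
with_pos = anchored.search("dog", 1)

# Slicing creates a new string whose real start is 'o', so '^o' matches.
with_slice = anchored.search("dog"[1:])

print(with_pos)           # None
print(with_slice.span())  # (0, 1)
```

Note that the span reported for the sliced string is relative to the slice, not to the original string.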
- Pattern.match(string[, pos[, endpos]]) If zero or more characters at the beginning of string match this pattern, a match object is returned. If there is no match, None is returned.
The optional pos and endpos parameters have the same meaning as for search().
import re
pattern = re.compile("o")
print(pattern.match("dog"))
print(pattern.match("dog", 1))
Output results:
None
<re.Match object; span=(1, 2), match='o'>
- Pattern.fullmatch(string[, pos[, endpos]]) If the entire string matches the regular expression, a corresponding match object is returned. Otherwise, None is returned.
The optional pos and endpos parameters have the same meaning as for search().
import re
pattern = re.compile("o[gh]")
print(pattern.fullmatch("dog"))
print(pattern.fullmatch("ogre"))
print(pattern.fullmatch("doggie", 1, 3))
Output results:
None
None
<re.Match object; span=(1, 3), match='og'>
- Pattern.findall(string[, pos[, endpos]]) is similar to the findall() function, using the compiled pattern, but also accepts optional pos and endpos parameters that limit the search region, as for search().
- Pattern.finditer(string[, pos[, endpos]]) is similar to the finditer() function, using the compiled pattern, but also accepts optional pos and endpos parameters that limit the search region, as for search().
- Pattern.split(string, maxsplit=0) is equivalent to the split() function, using the compiled pattern.
- Pattern.sub(repl, string, count=0) is equivalent to the sub() function, using the compiled pattern.
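The four methods listed above can be tried on one compiled pattern. This is a small sketch with a made-up sample string; the variable names are illustrative.

```python
import re

digits = re.compile(r"\d+")
text = "12a32bc43jf3"

all_numbers = digits.findall(text)            # search the whole string
from_index_3 = digits.findall(text, 3)        # pos=3 skips the leading "12a"
between_3_and_9 = digits.findall(text, 3, 9)  # only indices 3..8 are searched
parts = digits.split(text, maxsplit=1)        # split once, at the first number
replaced = digits.sub("#", text, count=2)     # replace the first two numbers

print(all_numbers)      # ['12', '32', '43', '3']
print(from_index_3)     # ['32', '43', '3']
print(between_3_and_9)  # ['32', '43']
print(parts)            # ['', 'a32bc43jf3']
print(replaced)         # #a#bc43jf3
```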
II. Match objects returned by functions under the re module
When a regular expression function matches successfully, it returns a match object, which supports the following methods and attributes:
- group() returns one or more subgroups of the match.
If there is a single argument, the result is a single string. With no arguments, the group number defaults to 0 and the return value is the entire matched string.
m = re.match(r"(\w+) (\w+)", "I love you, Tom")
m.group()
m.group(0)
Output results:
'I love'
'I love'
If the argument is a number in the range [1..99], the result is the string matched by the corresponding parenthesized group.
m.group(1)
m.group(2)
Output results:
'I'
'love'
If there is more than one argument, the result is a tuple with one item per argument.
m.group(1, 2)
Output results:
('I', 'love')
If the regular expression uses the (?P<name>...) syntax, the arguments to group() may also be strings that identify groups by name. If a string argument is not used as a group name in the pattern, an IndexError exception is raised.
import re
m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcolm Reynolds")
print(m.group("first_name"))
print(m.group("last_name"))
Output results:
Malcolm
Reynolds
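The IndexError case can be demonstrated directly. In this sketch, "middle_name" is a deliberately undefined group name chosen for illustration.

```python
import re

m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcolm Reynolds")

try:
    m.group("middle_name")  # this name is not defined in the pattern
    error = None
except IndexError as exc:
    error = exc

print(type(error).__name__)  # IndexError
```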
Named groups can also be referenced by their index:
print(m.group(1))
print(m.group(2))
Output results:
Malcolm
Reynolds
Since version 3.6 you can also use Match.__getitem__(g) directly, that is m[g], which is equivalent to m.group(g).
print(m[1])
print(m[2])
print(m['first_name'])
print(m['last_name'])
Output results:
Malcolm
Reynolds
Malcolm
Reynolds
- start() returns the start position of the match
- end() returns the end position of the match
- span() returns a tuple containing the positions of the matches (start, end).
import re
result = re.match("hello", "hello world")
print(result.start())
print(result.end())
print(result.span())
Output results:
0
5
(0, 5)
A match object always has a Boolean value of True, so you can simply use an if statement to test whether a match was found.
import re
result = re.match("hello", "hello world")
if result:
    print(result.group())
else:
    print("Match failed!")
Output results:
hello
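The same truthiness test is handy inside a helper that must cope with input that does not match. The function first_number below is a hypothetical example, not part of the re module.

```python
import re

def first_number(text):
    """Return the first run of digits in text as an int, or None if absent."""
    found = re.search(r"\d+", text)
    if found:          # a match object is always truthy
        return int(found.group())
    return None        # re.search() returned None

print(first_number("The number of readings is 9999."))  # 9999
print(first_number("no digits here"))                   # None
```

Checking the result before calling group() avoids the AttributeError that would be raised by calling a method on None.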