Regular expressions in Python are provided by the re module. This article introduces the special characters used in regular expressions and what each one means.
Commonly used regex syntax
Syntax | Description |
---|---|
\d | Matches a digit |
\D | Matches a non-digit character |
\s | Matches any whitespace character (space, tab, newline, etc.) |
\S | Matches any non-whitespace character |
\w | Matches any word character (letter, digit, or underscore) |
\W | Matches any non-word character |
. | Matches any character except a newline (any character at all when re.DOTALL is set) |
^ | Matches at the beginning of the string, e.g. ^\d means the string starts with a digit |
$ | Matches at the end of the string, e.g. \d$ means the string ends with a digit |
* | Matches the preceding element zero or more times |
+ | Matches the preceding element one or more times |
? | Matches the preceding element zero or one time |
{m} | Matches the preceding element exactly m times |
{m,n} | Matches the preceding element at least m times and at most n times |
\ | Escape character: removes the special meaning of the next character, e.g. \. matches a literal dot |
[] | Character set, e.g. [a-z] matches any single character from a to z |
| | Alternation (or), e.g. A|B matches A or B |
() | Group: treats the enclosed pattern as a unit and captures the text it matches |
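The entries in the table can be tried out one at a time. The following is a minimal sketch; the sample text and the variable names are made up for illustration and are not part of the original article.

```python
import re

# Hypothetical sample text, chosen only to exercise the syntax in the table.
text = "Order 66 shipped on 2024-05-17 to Room_12B."

digits = re.findall(r"\d+", text)                    # \d and +: runs of digits
words = re.findall(r"\w+", "Room_12B.")              # \w covers letters, digits, underscore
date = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)   # {m} counts and () groups
starts_with_order = re.search(r"^Order", text) is not None  # ^ anchors at the start
ends_with_digit = re.search(r"\d$", text) is not None       # $ anchors at the end
capitals = re.findall(r"[A-Z]", text)                # [] character set
either = re.findall(r"shipped|Room", text)           # | alternation
no_newline = re.findall(r"a.b", "a\nb a-b")          # . skips newlines by default
with_dotall = re.findall(r"a.b", "a\nb a-b", flags=re.DOTALL)

print(digits)             # ['66', '2024', '05', '17', '12']
print(words)              # ['Room_12B']
print(date.groups())      # ('2024', '05', '17')
print(starts_with_order)  # True
print(ends_with_digit)    # False, the text ends with '.'
print(capitals)           # ['O', 'R', 'B']
print(either)             # ['shipped', 'Room']
print(no_newline)         # ['a-b']
print(with_dotall)        # ['a\nb', 'a-b']
```

Writing patterns as raw strings (the r prefix) keeps Python from interpreting the backslashes before the re module sees them.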
Commonly used RegEx functions
Function | Description |
---|---|
search | Scans the string for the first place the pattern matches and returns a match object, or None if there is no match |
match | Matches only at the beginning of the string and returns a match object, or None if there is no match |
fullmatch | Succeeds only if the entire string matches the pattern; returns a match object, or None otherwise |
split | Splits the string at each occurrence of the pattern and returns a list |
findall | Returns a list of all non-overlapping matches in the string |
finditer | Like findall, but returns an iterator of match objects |
sub | Replaces every match of the pattern with the supplied replacement and returns the new string |
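As a quick orientation, the differences between these functions show up most clearly when they are run on the same input. This is a small sketch with made-up sample strings; only functions from the standard re module are used.

```python
import re

text = "id=42; id=7"  # hypothetical sample input

at_start = re.match(r"\d+", text)             # None: the string starts with 'i'
anywhere = re.search(r"\d+", text)            # finds the first run of digits
whole_ok = re.fullmatch(r"id=\d+", "id=42")   # the entire string fits the pattern
whole_bad = re.fullmatch(r"id=\d+", text)     # None: '; id=7' is left over

numbers = re.findall(r"\d+", text)
spans = [(m.group(), m.span()) for m in re.finditer(r"\d+", text)]

# split accepts a pattern, so several delimiters can be handled at once.
fields = re.split(r"[,;\s]+", "a, b;c  d")

# sub replaces every match unless count limits it.
all_replaced = re.sub(r"\d+", "N", text)
first_replaced = re.sub(r"\d+", "N", text, count=1)

print(at_start)          # None
print(anywhere.group())  # 42
print(numbers)           # ['42', '7']
print(spans)             # [('42', (3, 5)), ('7', (10, 11))]
print(fields)            # ['a', 'b', 'c', 'd']
print(all_replaced)      # id=N; id=N
print(first_replaced)    # id=N; id=7
```

Because search, match and fullmatch return None on failure, it is safer to check the result before calling group() on it.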
Below are some examples:
The split function is not demonstrated here: for simple delimiters you can call the string's own split method directly.

    >>> import re  # import the module
    >>> a = 'xiaoming:wo jiao xiaoming,wo de dianhua shi +86-666666'
    >>> print(re.search(pattern=r'\d+\W\d+', string=a))  # prints a match object
    <re.Match object; span=(45, 54), match='86-666666'>
    >>> mp = re.search(pattern=r'\d+\W\d+', string=a)  # find the phone number
    >>> print(mp.group())  # group() returns the text that matched the pattern
    86-666666
    >>> print(mp.start())  # index where the match starts
    45
    >>> print(mp.end())  # index just past the end of the match
    54
    >>> print(mp.span())  # (start, end) index range
    (45, 54)
    >>> print(re.findall(pattern=r'\w+', string=a))
    ['xiaoming', 'wo', 'jiao', 'xiaoming', 'wo', 'de', 'dianhua', 'shi', '86', '666666']
    >>> # sub replaces every match of the pattern with repl; pass count=1 to replace only the first
    >>> m_sub = re.sub(pattern=r'\w+:', string=a, repl='xiaohong:')
    >>> print(m_sub)
    xiaohong:wo jiao xiaoming,wo de dianhua shi +86-666666
    >>> # Compiled patterns
    >>> p = re.compile(r'\d{6}')  # compile the pattern once so it can be reused
    >>> m1 = p.search(a)  # search with the compiled pattern
    >>> print(m1.group())  # get the matched string
    666666
Case-insensitive matching

    #! /usr/bin/python3
    import re

    # re.I (short for re.IGNORECASE) makes the pattern ignore letter case.
    rebocop = re.compile(r'rebocop', re.I)
    match = rebocop.search('ReboCop is part man, part machine, all cop.').group()
    print(match)  # ReboCop
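The same flag works with the other re functions as well. The sketch below uses a made-up sample sentence and shows three equivalent ways of ignoring case: the flag on a compiled pattern, the inline (?i) marker, and the flags argument of a module-level function.

```python
import re

text = "ReboCop, REBOCOP and rebocop"  # hypothetical sample text

pattern = re.compile(r"rebocop", re.IGNORECASE)

compiled_hits = pattern.findall(text)                      # flag on the compiled pattern
inline_hits = re.findall(r"(?i)rebocop", text)             # inline flag inside the pattern
keyword_hits = re.findall(r"rebocop", text, flags=re.I)    # flags argument
normalized = pattern.sub("ReboCop", text)                  # unify the spelling

print(compiled_hits)  # ['ReboCop', 'REBOCOP', 'rebocop']
print(normalized)     # ReboCop, ReboCop and ReboCop
```

Note that the matched text keeps its original case; the flag only changes what counts as a match.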
Managing complex regular expressions
Regular expressions work well when the text pattern you need to match is simple, but matching complex text patterns can require long, hard-to-read expressions. You can mitigate this by telling re.compile() to ignore whitespace and comments inside the regular expression string. This "verbose mode" is enabled by passing re.VERBOSE as the second argument to re.compile().
    #! /usr/bin/python3
    import re

    phoneRegex = re.compile(r'''(
        (\d{3}|\(\d{3}\))?              # area code
        (\s|-|\.)?                      # separator
        \d{3}                           # first 3 digits
        (\s|-|\.)                       # separator
        \d{4}                           # last 4 digits
        (\s*(ext|x|ext\.)\s*\d{2,5})?   # extension (the dot is escaped so it matches a literal '.')
        )''', re.VERBOSE)
Notice how the previous example creates a multi-line string using the triple-quote syntax (''') so that the regular expression definition can be spread over several lines for clarity. Comments inside the regular expression string follow the same rule as in regular Python code: everything from the # symbol to the end of the line is ignored. In addition, extra whitespace inside the multi-line string is not considered part of the text pattern to be matched. This lets you lay out a regular expression so that it is easier to read.
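To check that verbose mode changes only the layout and not the meaning, the sketch below compares a compact pattern with its verbose equivalent on a made-up sample string, and then combines re.VERBOSE with re.IGNORECASE using the | operator. The variable names and sample data are illustrative assumptions.

```python
import re

# The same phone pattern written compactly and in verbose mode.
compact = re.compile(r"(\d{3}|\(\d{3}\))?(\s|-|\.)?\d{3}(\s|-|\.)\d{4}")

verbose = re.compile(r"""
    (\d{3}|\(\d{3}\))?   # area code
    (\s|-|\.)?           # separator
    \d{3}                # first 3 digits
    (\s|-|\.)            # separator
    \d{4}                # last 4 digits
    """, re.VERBOSE)

sample = "Call 415-555-1011 or (415) 555-9999."  # hypothetical sample text
compact_hits = [m.group() for m in compact.finditer(sample)]
verbose_hits = [m.group() for m in verbose.finditer(sample)]

# Flags are combined with the bitwise OR operator.
extension = re.compile(r"""
    (ext\.?|x)    # extension marker
    \s*           # optional whitespace
    (\d{2,5})     # extension digits
    """, re.IGNORECASE | re.VERBOSE)
ext_match = extension.search("555-1011 EXT. 204")

print(compact_hits)        # ['415-555-1011', '(415) 555-9999']
print(verbose_hits)        # ['415-555-1011', '(415) 555-9999']
print(ext_match.group(2))  # 204
```

Since plain whitespace is ignored in verbose mode, a literal space has to be written as \s, as an escaped space, or inside a character set such as [ ].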
Extract email addresses and phone numbers from clipboard text
    #! /usr/bin/python3
    # Finds phone numbers and email addresses on the clipboard.
    import pyperclip, re  # pyperclip is a third-party package (pip install pyperclip)

    americaPhoneRegex = re.compile(r'''(
        (\d{3}|\(\d{3}\))?                # area code
        (\s|-|\.)?                        # separator
        (\d{3})                           # first 3 digits
        (\s|-|\.)                         # separator
        (\d{4})                           # last 4 digits
        (\s*(ext|x|ext\.)\s*(\d{2,5}))?   # extension
        )''', re.VERBOSE)

    chinesePhoneRegex = re.compile(r'1\d{10}')

    emailPhoneRegex = re.compile(r'''(
        [a-zA-Z0-9._%+-]+      # username
        @                      # @ symbol
        [a-zA-Z0-9.-]+         # domain name
        (\.[a-zA-Z]{2,4})      # dot-something
        )''', re.VERBOSE)

    # Find matches in clipboard text.
    text = str(pyperclip.paste())
    matches = []
    for groups in americaPhoneRegex.findall(text):
        # groups[0] is the whole match; 1, 3 and 5 are the three digit blocks.
        phoneNum = '-'.join([groups[1], groups[3], groups[5]])
        if groups[8] != '':
            phoneNum += ' x' + groups[8]
        matches.append(phoneNum)
    for number in chinesePhoneRegex.findall(text):
        # This pattern has no groups, so findall returns the matched strings themselves.
        matches.append(number)
    for groups in emailPhoneRegex.findall(text):
        matches.append(groups[0])

    # Copy the results to the clipboard.
    if len(matches) > 0:
        pyperclip.copy('\n'.join(matches))
        print('Copied to clipboard:')
        print('\n'.join(matches))
    else:
        print('No phone numbers or email addresses found.')
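The extraction logic can be tried without the clipboard or the third-party pyperclip package by running the patterns on a literal string. The following is a sketch under stated assumptions: the function name extract_contacts, the constant names, and the sample text are hypothetical, and the email pattern is written without groups so that findall returns plain strings.

```python
import re

# Patterns modelled on the script above; all names here are illustrative.
PHONE_US = re.compile(r"""(
    (\d{3}|\(\d{3}\))?              # area code
    (\s|-|\.)?                      # separator
    (\d{3})                         # first 3 digits
    (\s|-|\.)                       # separator
    (\d{4})                         # last 4 digits
    (\s*(ext\.?|x)\s*(\d{2,5}))?    # extension
    )""", re.VERBOSE)
PHONE_CN = re.compile(r"1\d{10}")
EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}")


def extract_contacts(text):
    """Return phone numbers and email addresses found in text, in that order."""
    found = []
    for groups in PHONE_US.findall(text):
        # findall returns one tuple per match, one entry per group.
        number = "-".join([groups[1], groups[3], groups[5]])
        if groups[8]:
            number += " x" + groups[8]
        found.append(number)
    found.extend(PHONE_CN.findall(text))  # no groups: plain strings
    found.extend(EMAIL.findall(text))     # no groups: plain strings
    return found


sample = "Office: 415-555-1011 ext 22, mobile 13800138000, mail a.b@example.com"
contacts = extract_contacts(sample)
print(contacts)  # ['415-555-1011 x22', '13800138000', 'a.b@example.com']
```

Keeping the matching in a function that takes a string makes it easy to test; the clipboard calls can then be a thin layer around it.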
Summary
This concludes the introduction to finding and matching strings with regular expressions in Python: the basic syntax, the main functions of the re module, case-insensitive and verbose patterns, and a script that extracts phone numbers and email addresses from clipboard text.