SoFunction
Updated on 2024-11-17

Python Regular Expression Find Match on String

Regular expressions in Python are provided by the re module. The tables below introduce the special characters and the functions you need to work with them.

Commonly used RegEx base syntax

Syntax Description
\d Match a digit character
\D Match a non-digit character
\s Match any whitespace character (space, tab, newline, etc.)
\S Match any non-whitespace character
\w Match any word character (letter, digit, or underscore)
\W Match any non-word character
. Match any single character except a newline
^ Anchor the match at the beginning of the string, e.g. ^\d means the string starts with a digit.
$ Anchor the match at the end of the string, e.g. \d$ means the string ends with a digit.
* Match the previous element zero or more times
+ Match the previous element one or more times
? Match the previous element zero or one time
{m} Match the previous element exactly m times
{m,n} Match the previous element at least m times and at most n times
\ Escape character; removes the special meaning of the character that follows, e.g. \. matches a literal dot
[] Character set, e.g. [a-z] matches any single lowercase letter from a to z
| Or, e.g. A|B matches A or B
() Group the enclosed pattern and capture the text it matches
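
As a quick check of the syntax table, the following sketch applies a few of these patterns with re.findall and re.search. The sample strings and variable names are made up for illustration and do not come from the original article.

```python
import re

# \d+ matches a run of one or more digits.
digits = re.findall(r'\d+', 'room 101, floor 3')

# ^ and $ anchor the pattern to the start and end of the string.
starts_with_digit = re.search(r'^\d', '7 days') is not None
ends_with_digit = re.search(r'\d$', 'day 7') is not None

# {m,n} bounds how many times the previous element may repeat.
two_or_three = re.findall(r'\d{2,3}', '1 22 333 4444')

# | chooses between alternatives and () groups them;
# with one group, findall returns the captured text only.
pets = re.findall(r'(cat|dog)s?', 'cats and dog')

print(digits)        # ['101', '3']
print(two_or_three)  # ['22', '333', '444']
print(pets)          # ['cat', 'dog']
```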

Commonly used RegEx functions

Function Description
search Scan the string for the first match; return a match object if one is found, or None otherwise.
match Match only at the beginning of the string; return a match object on success, or None otherwise.
fullmatch Return a match object only if the pattern matches the entire string, or None otherwise.
split Split a string at every occurrence of the pattern
findall Return all non-overlapping matches in a string as a list
finditer Like findall, but returns an iterator of match objects
sub Replace every match of the pattern with the supplied string
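
The difference between search, match, and fullmatch is easiest to see side by side. This is a minimal sketch on an invented sample string; the variable names are illustrative only.

```python
import re

text = 'tel 666666'

# search scans the whole string and stops at the first match.
found = re.search(r'\d+', text)

# match only succeeds if the pattern matches at position 0;
# this string starts with 't', so the result is None.
at_start = re.match(r'\d+', text)

# fullmatch requires the pattern to cover the entire string.
whole = re.fullmatch(r'\w+ \d+', text)

# finditer yields match objects one at a time, so the position
# of each match is available as well as its text.
spans = [(m.group(), m.span()) for m in re.finditer(r'\d', 'a1b22')]

print(found.group())  # 666666
print(at_start)       # None
print(whole.group())  # tel 666666
print(spans)          # [('1', (1, 2)), ('2', (3, 4)), ('2', (4, 5))]
```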

Below are some examples:

To split on a simple, fixed delimiter you can call the string's own split method directly, so re.split is not covered in detail here.
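
For completeness, here is a brief sketch of how the two differ: str.split handles one fixed delimiter, while re.split accepts any delimiter that a pattern can describe. The sample line and variable names are chosen for illustration.

```python
import re

line = 'xiaoming:wo jiao xiaoming,wo de dianhua shi'

# str.split handles a single fixed delimiter.
by_colon = line.split(':')

# re.split handles any delimiter described by a pattern,
# here a colon, a comma, or a run of whitespace.
tokens = re.split(r'[:,]|\s+', line)

print(by_colon)  # ['xiaoming', 'wo jiao xiaoming,wo de dianhua shi']
print(tokens)    # ['xiaoming', 'wo', 'jiao', 'xiaoming', 'wo', 'de', 'dianhua', 'shi']
```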

>>> import re  # Import the module
>>> a = 'xiaoming:wo jiao xiaoming,wo de dianhua shi +86-666666'
>>> print(re.search(pattern=r'\d+\W\d+', string=a))  # The output is a match object
<re.Match object; span=(45, 54), match='86-666666'>
 
>>> mp = re.search(pattern=r'\d+\W\d+', string=a)  # Find the phone number
>>> print(mp.group())  # group() returns the text that matched the pattern
86-666666
>>> print(mp.start())  # Index where the match starts
45
>>> print(mp.end())  # Index just past the end of the match
54
>>> print(mp.span())  # (start, end) index range
(45, 54)
 
>>> print(re.findall(pattern=r'\w+', string=a))
['xiaoming', 'wo', 'jiao', 'xiaoming', 'wo', 'de', 'dianhua', 'shi', '86', '666666']
 
>>> m_sub = re.sub(pattern=r'\w+:', string=a, repl='xiaohong:')  # Replace every match of the pattern with repl; here only 'xiaoming:' matches
>>> print(m_sub)
xiaohong:wo jiao xiaoming,wo de dianhua shi +86-666666
 
# Compiled patterns
>>> p = re.compile(r'\d{6}')  # Compile the pattern once so it can be reused
>>> m1 = p.search(a)  # Search using the precompiled pattern
>>> print(m1.group())  # Get the matched string
666666

Case insensitive matching of characters

#! /usr/bin/python3
import re
rebocop = re.compile(r'rebocop', re.I)  # re.I makes the pattern case-insensitive
match = rebocop.search('ReboCop is part man, part machine, all cop.').group()
print(match)
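
Case-insensitive matching is controlled by the re.IGNORECASE flag, for which re.I is a shorthand. The sketch below compares the same pattern with and without the flag. The sample sentence is adapted for illustration and the variable names are assumptions.

```python
import re

sentence = 'ReboCop is part man, part machine, all COP.'

# re.I is simply a shorter name for re.IGNORECASE.
same_flag = re.I == re.IGNORECASE

# Without the flag, only the exact lowercase spelling matches.
strict = re.findall(r'cop', sentence)

# With the flag, every capitalisation matches.
relaxed = re.findall(r'cop', sentence, re.IGNORECASE)

print(same_flag)  # True
print(strict)     # []
print(relaxed)    # ['Cop', 'COP']
```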

Managing complex regular expressions

Regular expressions work well when the text pattern you need to match is simple, but matching complex text patterns can require long, convoluted regular expressions. You can mitigate this by telling the re.compile() function to ignore whitespace and comments inside the regular expression string. This "verbose mode" is enabled by passing the variable re.VERBOSE as the second argument to re.compile().

#! /usr/bin/python3
import re

phoneRegex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?   # area code
    (\s|-|\.)?           # separator
    \d{3}                # first 3 digits
    (\s|-|\.)            # separator
    \d{4}                # last 4 digits
    (\s*(ext|x|ext\.)\s*\d{2,5})?   # extension
    )''', re.VERBOSE)

Notice how the previous example creates a multi-line string using the triple-quote syntax (''') so that you can spread the regular expression definition over multiple lines for clarity. The rule for comments in a verbose regular expression string is the same as in regular Python code: the # symbol and everything after it to the end of the line are ignored. In addition, extra whitespace inside the multi-line string is not considered part of the text pattern to be matched. This lets you lay out a regular expression so that it is easier to read.
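
To see a verbose pattern in use, the sketch below compiles a shorter, made-up phone pattern and searches an invented sentence with it. It also combines two flags with the | operator, which is how re.compile() accepts more than one option in its second argument. The pattern, names, and sample text are assumptions for illustration.

```python
import re

# A smaller pattern than the one above, for illustration only.
phone = re.compile(r'''
    (\d{3})                   # first 3 digits
    [\s.-]                    # separator
    (\d{4})                   # last 4 digits
    (?:\s*EXT\s*(\d{2,5}))?   # optional extension
    ''', re.VERBOSE | re.IGNORECASE)

m = phone.search('Call 555-1234 ext 99 today')

# Whitespace and comments in the pattern were ignored, and
# IGNORECASE let 'EXT' in the pattern match 'ext' in the text.
print(m.groups())  # ('555', '1234', '99')
```

Because unescaped whitespace is dropped in verbose mode, a literal space in the pattern has to be written as \s, as an escaped space, or inside a character set such as [ ].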

Extract email addresses and phone numbers from clipboard text

#! /usr/bin/python3
# Finds phone numbers and email addresses on the clipboard.

import pyperclip, re

americaPhoneRegex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?   # area code
    (\s|-|\.)?           # separator
    (\d{3})              # first 3 digits
    (\s|-|\.)            # separator
    (\d{4})              # last 4 digits
    (\s*(ext|x|ext\.)\s*(\d{2,5}))?   # extension
    )''', re.VERBOSE)

chinesePhoneRegex = re.compile(r'1\d{10}')

emailPhoneRegex   = re.compile(r'''(
        [a-zA-Z0-9._%+-]+      # username
        @                      # @ symbol
        [a-zA-Z0-9.-]+         # domain name
        (\.[a-zA-Z]{2,4})      # dot-something
        )''', re.VERBOSE)


# Find matches in clipboard text.
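text = str(pyperclip.paste())
matches = []
# Each result is a tuple of groups: index 1 is the area code, 3 the first
# three digits, 5 the last four digits, and 8 the extension digits.
for groups in americaPhoneRegex.findall(text):
    phoneNum = '-'.join([groups[1], groups[3], groups[5]])
    if groups[8] != '':
        phoneNum += ' x' + groups[8]
    matches.append(phoneNum)

# Index 0 of each tuple is the whole email address.
for groups in emailPhoneRegex.findall(text):
    matches.append(groups[0])

# This pattern has no groups, so findall returns plain strings.
for number in chinesePhoneRegex.findall(text):
    matches.append(number)

# Copy results to the clipboard.
if len(matches) > 0:
    pyperclip.copy('\n'.join(matches))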
    print('Copied to clipboard:')
    print('\n'.join(matches))
else:
    print('No phone numbers or email addresses found.')
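
The script above depends on the third-party pyperclip package and on whatever happens to be on the clipboard, which makes it awkward to try out. As a sketch of the same extraction idea without the clipboard, the hypothetical helper below (extract_contacts is not part of the original script) runs two simplified patterns over a plain string. The sample text is invented.

```python
import re

# Simplified versions of the patterns used above.
CHINESE_PHONE = re.compile(r'1\d{10}')
EMAIL = re.compile(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}')


def extract_contacts(text):
    """Return phone numbers, then email addresses, found in text.

    Hypothetical helper for illustration. Neither pattern contains
    a capturing group, so findall returns the matched strings.
    """
    return CHINESE_PHONE.findall(text) + EMAIL.findall(text)


sample = 'Reach me at 13812345678 or xiaoming@example.com.'
contacts = extract_contacts(sample)
print(contacts)  # ['13812345678', 'xiaoming@example.com']
```

Keeping the matching logic in a function that takes a string means it can be checked on fixed input before being connected to the clipboard.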

Summary

That concludes this introduction to finding matches in strings with Python regular expressions.