Preface
Everyday Python work often involves text: parsing crawler data, cleaning text in big-data jobs, and ordinary file processing all rely on strings. Python has many efficient built-in string functions that are convenient and powerful. Below is a summary of the 7 techniques I use most often; with them, most string-processing tasks become easy.
I. String concatenation and merging
Concatenating and merging
Add // two strings can be joined easily with '+'
Merge // use the join method to merge a sequence of strings
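As a minimal sketch of both approaches (the sample strings and variable names here are made up for illustration):

```python
# Concatenate two strings with '+'
first = 'hello'
second = 'world'
greeting = first + ' ' + second
print(greeting)  # hello world

# Merge a list of strings with join; the separator is the string
# that join is called on
words = ['python', 'string', 'tricks']
merged = '-'.join(words)
print(merged)  # python-string-tricks
```

For joining many pieces, join is usually preferred over repeated '+', since it builds the result in one pass.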
II. Slicing and multiplying strings
Multiply // for example, to print a separator line, which is easy to do in Python
line = '*' * 30
print(line)
>>******************************
Slice // take part of a string by index range
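Slicing takes part of a string using the [start:stop:step] index syntax. A short sketch with an illustrative sample string:

```python
text = 'abcdefghij'

head = text[:3]             # first three characters
tail = text[-3:]            # last three characters
middle = text[2:5]          # index 2 up to, but not including, index 5
every_other = text[::2]     # every second character
reversed_text = text[::-1]  # a negative step walks backwards

print(head, tail, middle, every_other, reversed_text)
# abc hij cde acegi jihgfedcba
```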
III. Splitting strings
Ordinary splitting // use split
split can only do very simple splits, and does not support multiple separators.
phone = '400-800-800-1234'
print(phone.split('-'))
>>['400', '800', '800', '1234']
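Complex splitting // use the split function of the re module
The r prefix marks a raw string, so backslashes are not treated as escapes. The separator can be a semicolon, a comma, or a whitespace character, followed by zero or more extra whitespace characters; the string is split wherever this pattern matches.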
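A possible version of such a split, assuming the pattern is `[;,\s]\s*` (the sample line is invented for illustration):

```python
import re

line = 'hello world; python, regex,   split'

# [;,\s] matches one separator character (semicolon, comma, or whitespace);
# \s* then swallows any extra whitespace that follows it
parts = re.split(r'[;,\s]\s*', line)
print(parts)  # ['hello', 'world', 'python', 'regex', 'split']
```

Note that two separators in a row with no whitespace between them (for example ' ,') would produce an empty string in the result.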
IV. Handling the beginning and end of strings
Let's say we want to find out what the name of a file begins or ends with.
filename = 'trace.h'  # any name that starts with 'trace' and ends with 'h'
print(filename.endswith('h'))
>>True
print(filename.startswith('trace'))
>>True
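Both startswith and endswith also accept a tuple of candidates, which is handy when filtering file names. A small sketch, with hypothetical file names:

```python
# Hypothetical file names, for illustration only
names = ['trace.h', 'main.c', 'notes.txt', 'trace.log']

# Passing a tuple checks several suffixes in one call
c_sources = [n for n in names if n.endswith(('.c', '.h'))]
trace_files = [n for n in names if n.startswith('trace')]

print(c_sources)    # ['trace.h', 'main.c']
print(trace_files)  # ['trace.h', 'trace.log']
```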
V. Finding and matching strings
General search // use find
find makes it easy to locate a substring inside a long string. It returns the index where the substring starts, or -1 if the substring cannot be found.
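A quick illustration, using a made-up sentence:

```python
sentence = 'Python makes string processing easy'

found_at = sentence.find('string')  # index of the first match
missing = sentence.find('java')     # -1 when there is no match

print(found_at)  # 13
print(missing)   # -1

# When only a yes/no answer is needed, the 'in' operator reads better
print('string' in sentence)  # True
```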
Complex matching // to match a pattern rather than a fixed substring, use the re module
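For matching that depends on a pattern rather than a fixed substring, the re module is the usual tool. The sketch below assumes a simple month/day/year date pattern chosen for illustration:

```python
import re

date_text = '11/27/2016'
date_pattern = re.compile(r'(\d+)/(\d+)/(\d+)')

# match only succeeds if the pattern matches at the start of the string
match = date_pattern.match(date_text)
if match:
    month, day, year = match.groups()
    print(month, day, year)  # 11 27 2016

# findall returns every non-overlapping match in a longer text
found = re.findall(r'\d+/\d+/\d+', 'Due 11/27/2016, renewed 3/13/2017')
print(found)  # ['11/27/2016', '3/13/2017']
```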
VI. String replacement
Normal replacement // replace is enough
Complex replacement // to handle complex or multiple substitutions, use the sub function of the re module.
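Both kinds of replacement might look like this (the sample text and date pattern are assumptions for illustration):

```python
import re

text = 'python is easy to learn, easy to use'
simple = text.replace('easy', 'simple')
print(simple)  # python is simple to learn, simple to use

# re.sub can reuse captured groups: \1, \2, \3 refer to month, day, year
dates = 'today is 11/27/2016'
iso = re.sub(r'(\d+)/(\d+)/(\d+)', r'\3-\1-\2', dates)
print(iso)  # today is 2016-11-27
```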
VII. Remove some characters from the string
Remove whitespace // when processing text, such as a line read from a file, you often need to remove the spaces, tabs, or line breaks at both ends of each line.
line = ' Congratulations, you guessed it. '
print(line.strip())
>>Congratulations, you guessed it.
Note: strip only removes whitespace at the two ends. Spaces inside a string are not removed; for that you need the re module.
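One way to deal with whitespace inside a string is re.sub with the `\s+` pattern. A minimal sketch, with a sample line that has extra inner spaces:

```python
import re

line = '  Congratulations,   you   guessed it.  '

# Collapse every run of whitespace into a single space
collapsed = re.sub(r'\s+', ' ', line.strip())
print(collapsed)  # Congratulations, you guessed it.

# Or remove all whitespace entirely
no_spaces = re.sub(r'\s+', '', line)
print(no_spaces)  # Congratulations,youguessedit.
```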
Complex text cleanup can be done with str.translate.
First construct a translation table that maps 't' and 'o' to uppercase 'T' and 'O'.
Then delete the characters '12345' from old_str and translate the rest of the string with the table.
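In Python 3 the two steps can be combined, since str.maketrans accepts a third argument listing characters to delete. The sketch below assumes Python 3, and the content of old_str is invented for illustration:

```python
old_str = 'hello world 12345, welcome to python'

# Map 't' -> 'T' and 'o' -> 'O'; the third argument lists characters to delete
table = str.maketrans('to', 'TO', '12345')

new_str = old_str.translate(table)
print(new_str)  # hellO wOrld , welcOme TO pyThOn
```

In Python 2 the table was built with string.maketrans and the characters to delete were passed to translate as a second argument, so older examples may look different.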
Summary
That is everything for this article. I hope it is of some help in your study or work. If you have questions, feel free to leave a comment so we can discuss them. Thank you for your support.