SoFunction
Updated on 2025-04-14

Using regular expressions in Python to accurately match IP addresses

In network programming and data processing, we often need to extract IP addresses from text or verify them. Python's regular expressions (the re module) are a powerful tool for this task. But how do you write a pattern that accurately matches every valid IP address? This article walks through the problem in detail.

Why use regular expressions for IP addresses?

Suppose you are analyzing the server log and need to extract the IP address. Or you are developing a network tool to verify whether the IP entered by the user is legal. Manually parsing IP addresses is both troublesome and prone to errors, and regular expressions can come in handy.

Basic structure of IP address

A valid IPv4 address consists of four numbers, each in the range 0-255, separated by dots. For example:

  • Valid: 192.168.1.1, 10.0.0.1
  • Invalid: 256.1.1.1 (a number exceeds 255), 192.168.1 (only three segments)

Writing a basic regular expression

Let's first look at the simplest IP matching rule:

import re
pattern = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
text = "The server IP is 192.168.1.1 and 10.0.0.1"
ips = re.findall(pattern, text)  # list of every non-overlapping match
print(ips)  # Output: ['192.168.1.1', '10.0.0.1']

This rule can match the IP, but it has an obvious problem: it cannot filter out numbers over 255. For example, "300.1.1.1" will also be matched.
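A quick check makes the limitation concrete. This is only a small sketch; the sample string is made up for illustration:

```python
import re

# Loose pattern from above: any 1-3 digits per segment
loose_pattern = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"

# Hypothetical sample text containing out-of-range segments
sample = "good 10.0.0.1, bad 300.1.1.1, bad 999.999.999.999"

loose_matches = re.findall(loose_pattern, sample)
print(loose_matches)  # ['10.0.0.1', '300.1.1.1', '999.999.999.999']
```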

Matching numbers from 0 to 255 exactly

To match exactly 0-255, we need a more complex expression. The trick is to split the range into several cases:

  • 0-199: [01]?\d?\d
  • 200-249: 2[0-4]\d
  • 250-255: 25[0-5]

Combined, this gives:

num = r"(25[0-5]|2[0-4]\d|[01]?\d?\d)"
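One way to gain confidence in this pattern is to test it against every integer from 0 to 999. The sketch below uses re.fullmatch so that the whole string must match; the variable name accepted is just for illustration:

```python
import re

num = r"(25[0-5]|2[0-4]\d|[01]?\d?\d)"

# Every plain decimal string from "0" to "999" that the pattern accepts in full
accepted = [n for n in range(1000) if re.fullmatch(num, str(n))]

print(accepted == list(range(256)))  # True
```

Note that this only checks numbers written without padding; the pattern also accepts zero-padded forms such as "01" or "007".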

Complete IP regular expressions

Combine the above number patterns and add the dot separator:

ip_pattern = r"(25[0-5]|2[0-4]\d|[01]?\d?\d)\.(25[0-5]|2[0-4]\d|[01]?\d?\d)\.(25[0-5]|2[0-4]\d|[01]?\d?\d)\.(25[0-5]|2[0-4]\d|[01]?\d?\d)"

This matches valid IPv4 addresses. The expression is rather long, though, so we can use {3} to collapse the repetition:

ip_pattern = r"((25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(25[0-5]|2[0-4]\d|[01]?\d?\d)"
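One caveat when using this form with re.findall: if a pattern contains capturing groups, findall returns the group contents rather than the whole match. A minimal sketch of the difference, using non-capturing groups (?:...) as one possible fix (the variable names are illustrative):

```python
import re

text = "The server IP is 192.168.1.1 and 10.0.0.1"

capturing = r"((25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(25[0-5]|2[0-4]\d|[01]?\d?\d)"
non_capturing = r"(?:(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d?\d)"

# With capturing groups, findall returns one tuple of group values per match
grouped = re.findall(capturing, text)

# With non-capturing groups, findall returns the full matched strings
whole = re.findall(non_capturing, text)
print(whole)  # ['192.168.1.1', '10.0.0.1']
```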

A function for validating IP addresses

We can wrap this regular expression in a function:

import re
def is_valid_ip(ip):
    pattern = r"^((25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(25[0-5]|2[0-4]\d|[01]?\d?\d)$"
    return bool(re.match(pattern, ip))  # match object is truthy, None is falsy
print(is_valid_ip("192.168.1.1"))  # True
print(is_valid_ip("256.1.1.1"))    # False

Note that ^ and $ have been added here so that the pattern must match the entire string rather than just part of it.
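One subtlety worth knowing: in Python, $ also matches just before a newline at the very end of the string, so an input with a trailing newline still passes. If that matters, re.fullmatch is one way around it. The sketch below assumes a hypothetical helper named is_valid_ip_strict:

```python
import re

OCTET = r"(?:25[0-5]|2[0-4]\d|[01]?\d?\d)"
IPV4_RE = re.compile(rf"(?:{OCTET}\.){{3}}{OCTET}")

def is_valid_ip_strict(ip):
    """Return True only if the whole string is a dotted-quad IPv4 address."""
    return IPV4_RE.fullmatch(ip) is not None

anchored = r"^((25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(25[0-5]|2[0-4]\d|[01]?\d?\d)$"
print(bool(re.match(anchored, "192.168.1.1\n")))  # True: $ matches before the final newline
print(is_valid_ip_strict("192.168.1.1\n"))        # False
print(is_valid_ip_strict("192.168.1.1"))          # True
```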

Extract IP address from text

If you want to extract the IP address in the text, you can write it like this:

import re
text = "Accesses are from 192.168.1.1 and 10.0.0.1, invalid IPs such as 300.1.1.1"
pattern = r"\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\b"
ips = re.findall(pattern, text)  # non-capturing groups, so whole matches are returned
print(ips)  # Output: ['192.168.1.1', '10.0.0.1']

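The \b added here denotes a word boundary. It prevents the pattern from matching a fragment of a longer run of digits, for example "00.1.1.1" inside "300.1.1.1" or "192.168.1.100" inside "192.168.1.1000".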
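The effect of \b is easiest to see by running the same pattern with and without it. A small sketch (the variable names are illustrative):

```python
import re

core = r"(?:(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d?\d)"
text = "invalid IPs such as 300.1.1.1"

without_boundary = re.findall(core, text)
with_boundary = re.findall(r"\b" + core + r"\b", text)

print(without_boundary)  # ['00.1.1.1']
print(with_boundary)     # []
```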

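Common problems and pitfalls

  • Forgetting boundary matching: omitting ^ and $ (or \b) can lead to partial matches
  • Leading zeros: the patterns above accept addresses such as "192.168.01.1". Whether that is acceptable depends on the consumer, since some parsers reject leading zeros or interpret them as octal
  • Performance issues: overly complex patterns may slow down matching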
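If leading zeros should be rejected, the octet pattern can be tightened. The following is a sketch of one possible variant, not part of the patterns above; the names STRICT_OCTET and is_strict_ipv4 are made up for illustration:

```python
import re

# Octet without leading zeros: 250-255, 200-249, 100-199, or 0-99 with no padding
STRICT_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
STRICT_IPV4_RE = re.compile(rf"(?:{STRICT_OCTET}\.){{3}}{STRICT_OCTET}")

def is_strict_ipv4(ip):
    """Return True for dotted-quad IPv4 addresses with no zero-padded octets."""
    return STRICT_IPV4_RE.fullmatch(ip) is not None

print(is_strict_ipv4("192.168.1.1"))   # True
print(is_strict_ipv4("192.168.01.1"))  # False
```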


IPv6 address matching

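Although IPv4 is still mainstream, IPv6 is becoming more and more important. Regular expressions for IPv6 are more complex. The pattern below covers only the full form with eight colon-separated groups; it does not handle the compressed "::" notation: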

ipv6_pattern = r"([0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}"
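For validation rather than extraction, the standard library's ipaddress module is an alternative that also understands compressed addresses. This is a sketch of a swapped-in technique, not a regular expression; is_valid_ipv6 is a hypothetical helper name:

```python
import ipaddress
import re

ipv6_pattern = r"([0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}"

def is_valid_ipv6(address):
    """Validate with the standard library instead of a hand-written regex."""
    try:
        ipaddress.IPv6Address(address)
        return True
    except ValueError:
        return False

full = "2001:0db8:85a3:0000:0000:8a2e:0370:7334"
compressed = "2001:db8::1"

print(bool(re.fullmatch(ipv6_pattern, full)))        # True
print(bool(re.fullmatch(ipv6_pattern, compressed)))  # False
print(is_valid_ipv6(compressed))                     # True
```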

Practical application cases

Suppose we want to analyze the Nginx log and extract the client IP:

import re
log_line = '127.0.0.1 - - [10/Oct/2023:13:55:36 +0800] "GET / HTTP/1.1" 200 612'
ip_pattern = r"\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\b"
ip = re.search(ip_pattern, log_line).group()  # first match anywhere in the line
print(ip)  # Output: 127.0.0.1
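With real logs, some lines may contain no address, in which case re.search returns None and calling .group() raises AttributeError. The sketch below guards against that and counts requests per client. The log lines are made up, and it assumes the client IP is the first field of each line, as in the example above:

```python
import re
from collections import Counter

ip_regex = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\b"
)

# Made-up sample lines in the same shape as the example above
log_lines = [
    '127.0.0.1 - - [10/Oct/2023:13:55:36 +0800] "GET / HTTP/1.1" 200 612',
    '10.0.0.5 - - [10/Oct/2023:13:55:37 +0800] "GET /a HTTP/1.1" 200 128',
    '127.0.0.1 - - [10/Oct/2023:13:55:38 +0800] "GET /b HTTP/1.1" 404 0',
    'malformed line without an address',
]

counts = Counter()
for line in log_lines:
    match = ip_regex.match(line)  # client IP is assumed to be the first field
    if match:                     # skip lines that do not start with an IP
        counts[match.group()] += 1

print(counts.most_common())  # [('127.0.0.1', 2), ('10.0.0.5', 1)]
```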

Performance optimization suggestions

Precompile the regular expression with re.compile:

ip_regex = re.compile(r"...Long Expression...")

Consider using a generator when matching large amounts of data.

If necessary, use plain string methods as a preliminary filter before applying the regular expression.
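The three suggestions can be combined in one small sketch. The names IP_RE, iter_ips and sample_lines are hypothetical, and the "." check is just one example of a cheap pre-filter:

```python
import re

# Compiled once at import time and reused for every line
IP_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\b"
)

def iter_ips(lines):
    """Lazily yield every IPv4 address found in an iterable of lines."""
    for line in lines:
        if "." not in line:       # cheap string pre-filter before running the regex
            continue
        for match in IP_RE.finditer(line):
            yield match.group()

sample_lines = ["no address here", "from 192.168.1.1 to 10.0.0.1", "bad 300.1.1.1"]
print(list(iter_ips(sample_lines)))  # ['192.168.1.1', '10.0.0.1']
```

Because iter_ips is a generator, it can be fed an open file object and will process one line at a time instead of loading everything into memory.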

Summary

In this article we have covered:

  • How a regular expression for IPv4 addresses works
  • How to accurately match number segments from 0 to 255
  • The importance of boundary matching
  • Techniques for practical use

Remember: regular expressions are powerful, but the level of complexity should match your actual needs. For simple IP verification, the expressions in this article are sufficient; more complex requirements may need further adjustments. I hope this article saves you some effort the next time you process IP addresses.
