Why regexp? IPv4 regexplained

pattern <- "^(25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})(\\.(25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})){3}$"

  1. ^: Anchors the regex at the beginning of the string.
  2. (25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}): This part represents a single octet in the range of 0 to 255.
    • 25[0-5]: Matches 250 to 255.
    • 2[0-4][0-9]: Matches 200 to 249.
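    • [0-1]?[0-9]{1,2}: Matches 0 to 199. [0-1]? allows for an optional leading 0 or 1, and [0-9]{1,2} matches 1 or 2 digits. Because the leading digit may be 0, zero-padded forms such as 01 or 007 also match.
  3. (\\.(25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})){3}: This part represents the remaining three octets, each preceded by a dot. \\. is how the regex \. (a literal dot) is written inside an R string, and {3} repeats the dot-plus-octet group exactly three times.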
  4. $: Anchors the regex at the end of the string.
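
The octet alternation in step 2 can be checked on its own at the range boundaries. The following is a minimal sketch: the names octet, candidates and octet_ok are illustrative and not part of the original pattern, and the sub-pattern is wrapped in ^ and $ so that each candidate must match in full.

```r
# Octet sub-pattern from step 2, anchored so the whole candidate must match.
octet <- "^(25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})$"

# Illustrative boundary values: in range, just out of range, and zero-padded.
candidates <- c("0", "9", "99", "199", "200", "249", "250", "255",
                "256", "300", "007", "1234")

# grepl returns one TRUE/FALSE per candidate.
octet_ok <- grepl(octet, candidates)
names(octet_ok) <- candidates
print(octet_ok)
```

Under these assumptions, 256, 300 and 1234 are rejected, while 007 is accepted because the third alternative permits a leading 0.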

In summary, the regular expression ensures that the IP address consists of four octets separated by dots and each octet is in the valid range of 0 to 255. The ^ and $ anchors ensure that the entire string is matched, not just a part of it.

Here’s how the regex works for an example IP like “192.168.1.1”:

  • 192: Matches the first octet through [0-1]?[0-9]{1,2} (an optional leading 1 followed by 92).
  • .: Matches the dot separator.
  • 168: Matches the second octet the same way.
  • .: Matches the dot separator.
  • 1: Matches the third octet as a single digit.
  • .: Matches the dot separator.
  • 1: Matches the fourth octet.
  • $: Ensures that nothing follows the fourth octet.
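
The walkthrough above can be mirrored in R by splitting the example address on dots and testing each piece against the octet sub-pattern. This is only an illustration of the per-octet matching, not a substitute for the full pattern (strsplit drops a trailing empty piece, so a string such as "1.2.3.4." would slip through a split-based check). The names ip, octet, parts and parts_ok are illustrative.

```r
ip <- "192.168.1.1"

# Same octet alternative as in the full pattern, anchored to a whole piece.
octet <- "^(25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})$"

# fixed = TRUE treats "." as a literal dot rather than a regex wildcard.
parts <- strsplit(ip, ".", fixed = TRUE)[[1]]

# One TRUE/FALSE per piece: "192", "168", "1", "1".
parts_ok <- grepl(octet, parts)
print(data.frame(part = parts, matches = parts_ok))
```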

If the string doesn’t match this pattern, grepl returns FALSE, indicating that the IP is not valid according to the IPv4 format.
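
As a rough usage sketch, the pattern can be passed to grepl together with a vector of candidate strings. The vector ips and its contents are made-up test inputs, and valid is an illustrative name.

```r
pattern <- "^(25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})(\\.(25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})){3}$"

# Hypothetical inputs covering valid addresses and common failure modes.
ips <- c("192.168.1.1",     # ordinary address
         "255.255.255.255", # upper bound of every octet
         "0.0.0.0",         # lower bound of every octet
         "256.1.1.1",       # first octet out of range
         "192.168.1",       # only three octets
         "192.168.1.1.1",   # five octets
         "a.b.c.d",         # not numeric
         "01.02.03.004",    # zero-padded octets
         " 192.168.1.1")    # leading space

# grepl is vectorised: one logical result per input string.
valid <- grepl(pattern, ips)
print(data.frame(ip = ips, valid = valid))
```

The first three inputs and the zero-padded one are accepted; the rest are rejected. If zero-padded octets should be refused as well, the third alternative would need tightening, which is outside the pattern discussed here.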