Character classes
- character class or set, []: match any character in a set.
- construct your own sets with []:
  - [abc] matches “a”, “b”, or “c”.
  - [^abc] matches any character except “a”, “b”, or “c”.
- two other characters that have special meaning inside of []:
  - the hyphen - defines a range, e.g., [a-z] matches any lowercase letter and [0-9] matches any digit.
  - the backslash \ escapes special characters, so [\^\-\]] matches ^, -, or ].
x <- "abcd ABCD 12345 -!@#%."
str_view(x, "[abc]+")
## [1] │ <abc>d ABCD 12345 -!@#%.
str_view(x, "[a-z]+")
## [1] │ <abcd> ABCD 12345 -!@#%.
str_view(x, "[^a-z0-9]+")
## [1] │ abcd< ABCD >12345< -!@#%.>
# You need an escape to match characters that are otherwise
# special inside of []
str_view("a-b-c", "[a-c]")
## [1] │ <a>-<b>-<c>
str_view("a-b-c", "[a\\-c]")
## [1] │ <a><->b<-><c>
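The escaped set [\^\-\]] from the list above is not exercised by the examples, so here is a minimal sketch of it. The test string `specials` is a hypothetical value chosen for illustration. It assumes stringr is installed, and that each regex backslash is doubled inside an ordinary R string literal.

```r
library(stringr)

# Hypothetical test string containing each of the three special characters.
specials <- "a^b-c]d"

# The regex [\^\-\]] is written "[\\^\\-\\]]" in an R string,
# because "\\" in the string literal is a single backslash in the regex.
found <- str_extract_all(specials, "[\\^\\-\\]]")[[1]]
found
## [1] "^" "-" "]"
```

str_extract_all() returns a list with one element per input string, so [[1]] pulls out the character vector of matches for the single input.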