Regex

learn-regex/README-cn.md at master · ziishaned/learn-regex

re --- 正则表达式操作 — Python 3.10.0 文档

Regex Vis：在线可视化正则编辑器。

以下文章摘自：

匹配单个字符

正则表达式是对文本执行搜索时所指定的“格式”。大多数字符匹配它们自己，例如，正则表达式 test 将完全匹配字符串 test。

正则表达式是大小写敏感的，所以 The 不会匹配 the。

有些字符是特殊的元字符，有特殊作用，它们不匹配自身：

. ^ $ * + ? { } [ ] \ | ( )

[ and ] are used for specifying a character class, which is a set of characters that you wish to match. Characters can be listed individually, or a range of characters can be indicated by giving two characters and separating them by a '-'.

[a-z] match only lowercase letters.

在 [] 内的元字符会被当作一般的字符。

complementing the set: [^5] will match any character except 5

在 [] 内想匹配元字符，就要加反斜杠转义。

\d 匹配任何十进制数字；这相当于 [0-9]

\D 匹配任何非数字字符；这相当于 [^0-9]

\s 匹配任何空白字符；这相当于 [ \t\n\r\f\v]

\S 匹配任何非空白字符；这相当于 [^ \t\n\r\f\v]

\w 匹配任何字母数字字符；这相当于类 [a-zA-Z0-9_]

\W 匹配任何非字母数字字符；这相当于类 [^a-za-z0-9_]

. matches anything except a newline character.

(xyz) 字符集，匹配与 xyz 完全相等的字符串

匹配多个字符

* specifies that the previous character can be matched zero or more times.

a[bcd]*b 匹配 a 开头、b 结尾、中间 bcd 任意字符重复任意次。

+ matches one or more times.

? matches either once or zero time

{m,n} means there must be at least m repetitions, and at most n

应用

找出当前目录下的所有 .h .m .mm 后缀的文件：find -E . -regex ".*\.(h|m|mm)"

全量格式化处理：

find -E . -regex ".*\.(h|m|mm|c|cc|cpp)" | xargs clang-format -i -style=file

on Mac OS X (BSD find): man find says -E uses extended regex support.

匹配单个字符​

匹配多个字符​

更多元字符​

应用​

匹配单个字符

匹配多个字符

更多元字符

应用