JavaScript Regular Expressions (Part 1)

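I have gotten rusty with regular expressions again, so I am going back over the API. From now on I will write one regex a day for practice.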

Meaning of special characters in regular expressions

Character Classes

.   //Matches any single character except line terminators: \n, \r, \u2028, \u2029
\d  //Matches any digit  ==[0-9]
\D  //Matches any character that is not a digit ==[^0-9]
\w  //Matches any alphanumeric character, including underscore ==[A-Za-z0-9_]
\W  //Matches any character that is not a word character ==[^A-Za-z0-9_]
\s  //Matches a single white space character  ==[ \f\n\r\t\v\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]
\S  //Matches a single character other than white space
\xhh    //Matches the character with code hh (two hexadecimal digits).
\uhhhh  //Matches the character with Unicode value hhhh (four hexadecimal digits).
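As a quick check of the shorthand classes above, here is a minimal sketch that runs each one against a small string. The sample text and variable names are made up for illustration.

```javascript
// Sample input invented for this illustration.
const sample = "Order_42 ships\tin 3 days";

const digits = sample.match(/\d+/g);   // runs of digits
const words = sample.match(/\w+/g);    // runs of [A-Za-z0-9_]
const nonWords = sample.match(/\W/g);  // single characters outside \w (spaces and the tab)
const hasSpace = /\s/.test("a\tb");    // \s covers tabs as well as spaces
const hexA = /\x41/.test("A");         // \x41 is the code for "A"
const uniA = /\u0041/.test("A");       // \u0041 is the same character

console.log(digits);          // [ '42', '3' ]
console.log(words);           // [ 'Order_42', 'ships', 'in', '3', 'days' ]
console.log(nonWords.length); // 4
console.log(hasSpace, hexA, uniA); // true true true
```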

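Character Sets

[xyz]  // A character set: matches any one of the enclosed characters
[^xyz] // A negated character set: matches any character that is not enclosed

Inside [], a . stands for a literal dot only, and a - is literal only when it is the first or last character of the set (or is escaped); between two characters it denotes a range. Several escapes also take on a different meaning inside a set, for example [\b] matches a backspace, so this needs extra care.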
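The set rules above can be sketched as follows. This is only an illustration on made-up strings, and the variable names are my own.

```javascript
const literalDot = /[.]/.test("abc");           // false: inside [] the dot is a literal "."
const wildcardDot = /./.test("abc");            // true: outside [] the dot matches any character
const range = "a-c b".match(/[a-c]/g);          // hyphen between two characters is a range
const literalHyphen = "a-c b".match(/[ac-]/g);  // trailing hyphen is a literal "-"
const negated = "a1b2".match(/[^0-9]/g);        // everything except digits
const backspace = /[\b]/.test("\b");            // [\b] matches the backspace character U+0008

console.log(literalDot, wildcardDot); // false true
console.log(range);                   // [ 'a', 'c', 'b' ]
console.log(literalHyphen);           // [ 'a', '-', 'c' ]
console.log(negated);                 // [ 'a', 'b' ]
console.log(backspace);               // true
```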

Boundaries

^   //Matches the beginning of input
$   //Matches the end of input
\b  //Matches a zero-width word boundary. Inside a set, [\b] matches a backspace instead
\B  //Matches a zero-width non-word boundary
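A small sketch of the boundary assertions, assuming a made-up sentence in which "cat" appears as a whole word, inside a word, and as a prefix:

```javascript
const text = "cat concat cats";

const wholeWord = text.match(/\bcat\b/g);  // only the standalone "cat"
const insideWord = /\Bcat/.exec(text);     // "cat" preceded by a word character ("concat")
const atStart = /^cat/.test(text);         // input begins with "cat"
const atEnd = /cat$/.test(text);           // input ends with "cats", not "cat"

console.log(wholeWord);        // [ 'cat' ]
console.log(insideWord.index); // 7
console.log(atStart, atEnd);   // true false
```

Boundaries are zero-width, so none of them consume characters; they only constrain where the surrounding pattern may match.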

Grouping and Back References

(x) //Matches x and remembers the match. These are called capturing groups.

\n  //A back reference to the text matched by the n-th capturing group in the regular expression (groups are counted by their left parentheses).

(?:x) //Matches x but does not remember the match. These are called non-capturing groups.
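To make the difference between capturing groups, back references, and non-capturing groups concrete, here is a minimal sketch. The input strings are invented for the example.

```javascript
// \1 must repeat exactly what group 1 captured, so this finds a doubled character.
const doubled = /(\w)\1/.exec("hello");
console.log(doubled[0], doubled[1]); // ll l

// The month is grouped with (?:...) and therefore not remembered,
// so the day ends up in group 2 rather than group 3.
const date = /(\d{4})-(?:\d{2})-(\d{2})/.exec("2024-05-17");
console.log(date.length);      // 3
console.log(date[1], date[2]); // 2024 17
```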

Quantifiers

x* //Matches the preceding item x 0 or more times. Equivalent to {0,}

x+ //Matches the preceding item x 1 or more times. Equivalent to {1,}

x? //Matches the preceding item x 0 or 1 time. If used immediately after any of the quantifiers *, +, ?, or {}, makes the quantifier non-greedy (it matches as few characters as possible). Quantifiers are greedy by default.

x{n}  //Matches exactly n occurrences of the preceding item x.
x{n,} //Matches at least n occurrences of the preceding item x.
x{n,m} //Matches at least n and at most m occurrences of the preceding item x.

x*?
x+?
x??
x{n}?
x{n,}?
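x{n,m}?

/<.+?>/.exec('<foo> <bar>') //["<foo>", index: 0, input: "<foo> <bar>"]  (non-greedy)
/<.+>/.exec('<foo> <bar>')  //["<foo> <bar>", index: 0, input: "<foo> <bar>"]  (greedy)

'cbbaandy'.match(/b+?aa/)  //Returns "bbaa", not "baa": the search starts at the leftmost b (index 1), and b+? grows from one b to two so that aa can follow. Non-greedy does not mean the shortest possible match overall.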
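The greedy and non-greedy behaviour can be checked with a short sketch like the one below (strings chosen only for illustration). The last case shows that the leftmost starting position wins before laziness is considered.

```javascript
const greedy = "aaaa".match(/a{2,3}/)[0];   // takes as many as allowed
const lazy = "aaaa".match(/a{2,3}?/)[0];    // takes as few as allowed
const leftmost = "cbbaandy".match(/b+?aa/); // starts at the first b, then extends until "aa" fits

console.log(greedy);                      // aaa
console.log(lazy);                        // aa
console.log(leftmost[0], leftmost.index); // bbaa 1
```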

Assertions

x(?=y) //Matches x only if x is followed by y.
x(?!y) //Matches x only if x is not followed by y.
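A sketch of both lookaheads on a made-up price list. In the negative case the pattern adds \b and excludes a following digit as well; without that, backtracking would let \d+ give up a digit and match "1" out of "10USD".

```javascript
const prices = "10USD 20EUR 30USD";

// Numbers followed by "USD"; the lookahead text is not part of the match.
const usd = prices.match(/\d+(?=USD)/g);

// Whole numbers NOT followed by "USD" (and not cut short in the middle of a number).
const notUsd = prices.match(/\b\d+(?!USD|\d)/g);

console.log(usd);    // [ '10', '30' ]
console.log(notUsd); // [ '20' ]
```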

Methods

test

Returns true if the string contains text that matches the RegExp object; otherwise returns false.

Calling the test() method of a RegExp object r with a string s is equivalent to the expression (r.exec(s) != null).
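The equivalence between test() and exec() can be checked directly. This sketch uses a non-global pattern and invented strings, so lastIndex plays no part.

```javascript
const r = /ken+y/i;
const s = "Hello Kenny";

const viaTest = r.test(s);          // boolean result
const viaExec = r.exec(s) !== null; // same answer derived from exec()
const noMatch = r.test("Hello Bob");

console.log(viaTest, viaExec, noMatch); // true true false
```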

exec

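Returns an array holding the match result. If no match is found, the return value is null.

Whether or not the RegExp object is global, exec() puts the full details into the array it returns. This is where exec() differs from String.match(), which returns much less information in global mode. Calling exec() repeatedly in a loop was therefore long the only way to get complete match information for a global pattern; String.prototype.matchAll() (ES2020) now provides the same details.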

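var str = "Hello Kenny vs Kenny abc Kenny";
var patt = new RegExp("Kenny", "g");
var result;

// With the g flag, each exec() call resumes from patt.lastIndex.
while ((result = patt.exec(str)) !== null) {
  console.log(result);         // ["Kenny", index: 6, ...], then index 15, then index 25
  console.log(patt.lastIndex); // 11, then 20, then 30
}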
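The same loop also exposes capturing groups for every match, which is the detail that match() drops in global mode. A minimal sketch, with an invented key=value string and hypothetical variable names:

```javascript
const pairs = "a=1, b=22, c=333";
const re = /(\w)=(\d+)/g;
const found = [];
let m;

// Each iteration yields the full match, its groups, and its position.
while ((m = re.exec(pairs)) !== null) {
  found.push({ key: m[1], value: m[2], index: m.index });
}

console.log(found.map((f) => f.key + ":" + f.value + "@" + f.index).join(" "));
// a:1@0 b:22@5 c:333@11
console.log(re.lastIndex); // 0 (reset after the final failed exec)
```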

compile

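The compile() method compiles a regular expression during script execution; it can also be used to change and recompile a regular expression.

It is deprecated in the latest standard.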

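String methods that support regular expressions

search

stringObject.search(regexp)

Return value: the starting position of the first substring of stringObject that matches regexp, or -1 if there is no match.

search() does not perform a global match: it ignores the g flag and the regexp's lastIndex property, always searches from the beginning of the string, and always returns the position of the first match.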
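A short sketch of that behaviour, using an invented string. The lastIndex value set here is deliberately past the first match to show that search() does not start from it.

```javascript
const str = "one two three";
const re = /t\w+/g;
re.lastIndex = 8; // would skip "two" if search() honoured it

const pos = str.search(re);         // position of "two"
const missing = str.search(/four/); // no match

console.log(pos);     // 4
console.log(missing); // -1
```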

match

stringObject.match(searchvalue)
stringObject.match(regexp)

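Return value: an array holding the match results; what the array contains depends on the global flag g. If no match is found, it returns null.

In global mode, match() provides neither the text matched by subexpressions nor the position of each matched substring. If you need that information for a global search, use RegExp.exec().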
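The effect of the g flag on the result of match() can be seen side by side in this sketch (input string made up for the example):

```javascript
const str = "x1 y2 z3";

const first = str.match(/([a-z])(\d)/);  // no g: full match, groups, and index, like exec()
const all = str.match(/([a-z])(\d)/g);   // g: only the full matches, no groups or positions
const none = str.match(/q/);             // no match at all

console.log(first[0], first[1], first[2], first.index); // x1 x 1 0
console.log(all);                                       // [ 'x1', 'y2', 'z3' ]
console.log(none);                                      // null
```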

replace

str.replace(regexp|substr, newSubStr|function)

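Note: the original string is not changed.

Using a string as the replacement

Pattern	Inserts
$$	Inserts a "$".
$&	Inserts the matched substring.
$`	Inserts the portion of the string that precedes the matched substring.
$'	Inserts the portion of the string that follows the matched substring.
$n	If the first argument is a RegExp object and n is a positive integer less than 100, inserts the string matched by the n-th parenthesized group.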
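Each replacement pattern in the table can be tried on a tiny input. The strings below are illustrative only.

```javascript
const name = "John Smith";

const swapped = name.replace(/(\w+)\s(\w+)/, "$2, $1"); // $1 and $2: captured groups
const wrapped = "a-b".replace("-", "[$&]");             // $&: the matched substring
const before = "abc".replace("b", "$`");                // $`: text to the left of the match
const after = "abc".replace("b", "$'");                 // $': text to the right of the match
const dollar = "cost: 5".replace(/\d/, "$$$&");         // $$: a literal dollar sign

console.log(swapped); // Smith, John
console.log(wrapped); // a[-]b
console.log(before);  // aac
console.log(after);   // acc
console.log(dollar);  // cost: $5
console.log(name);    // John Smith (the original string is untouched)
```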

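Using a function as the replacement

A function can be given as the second argument (the special replacement patterns listed above do not apply in this case). If the first argument is a regular expression in global mode, the function is invoked once for every match.

Parameter	Value
match	The matched substring (not a group). Corresponds to $& above.
p1, p2, …	If the first argument of replace() is a RegExp object, the string matched by the n-th parenthesized group. Corresponds to $1, $2, and so on above.
offset	The offset of the matched substring within the original string. (For example, if the original string is "abcd" and the matched substring is "bc", this argument is 1.)
string	The original string being examined.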
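The callback arguments can be observed with a sketch like this one. The input and the `calls` array are hypothetical, used only to record what the function receives.

```javascript
const calls = [];

// Global pattern: the function runs once per match.
const out = "abcd-abcd".replace(/b(c)/g, (match, p1, offset, string) => {
  calls.push({ match, p1, offset, string });
  return p1.toUpperCase(); // the return value becomes the replacement
});

// A function that returns nothing yields the text "undefined".
const noReturn = "abc".replace(/b/, () => {});

console.log(out);                          // aCd-aCd
console.log(calls.map((c) => c.offset));   // [ 1, 6 ]
console.log(noReturn);                     // aundefinedc
```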

split

stringObject.split(separator,howmany)
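split() also accepts a regular expression as the separator. A minimal sketch on invented inputs, showing the optional limit (called howmany in the signature above) and what happens when the separator contains a capturing group:

```javascript
const csv = "a, b ,c,d";

const parts = csv.split(/\s*,\s*/);       // commas with optional surrounding spaces
const limited = csv.split(/\s*,\s*/, 2);  // keep at most two pieces
const withSeps = "1+2-3".split(/([+-])/); // captured separators are kept in the result

console.log(parts);    // [ 'a', 'b', 'c', 'd' ]
console.log(limited);  // [ 'a', 'b' ]
console.log(withSeps); // [ '1', '+', '2', '-', '3' ]
```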

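Note on replace() with a function argument: when a match is found but the function does not return anything, the match is replaced with "undefined", because whatever the function returns is what gets inserted.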

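Where lastIndex points

Using a regular expression with the "sticky" flag

The sticky property reflects whether the search is sticky, that is, whether it matches only at the index given by the regular expression's lastIndex property.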
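One common place where lastIndex surprises people is reusing a global regex with test(). The sketch below is my own illustration of that behaviour, not something taken from the notes above: the same call alternates between true and false because each success moves lastIndex forward and each failure resets it.

```javascript
const re = /foo/g;

const a = re.test("foo");   // match found at index 0
const afterA = re.lastIndex;
const b = re.test("foo");   // search resumes at index 3, so nothing is found
const afterB = re.lastIndex;
const c = re.test("foo");   // lastIndex was reset, so it matches again

console.log(a, afterA); // true 3
console.log(b, afterB); // false 0
console.log(c);         // true
```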

var str = '#foo#';
var regex = /foo/y;

regex.test(str); // false (lastIndex defaults to 0, and "foo" does not start at index 0)
regex.lastIndex = 1;
regex.test(str); // true (matches only because lastIndex = 1; this is what sticky does)
regex.lastIndex = 5;
regex.test(str); // false (the sticky flag honors lastIndex, so the match fails)
regex.lastIndex; // 0 (reset after a failed match)

multiline

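multiline is a Boolean property: it is true if the "m" flag was used and false otherwise. The "m" flag means that a multi-line input string is treated as multiple lines. For example, with "m", "^" and "$" change from matching only the start or end of the whole string to matching the start or end of any line within it. Matching still stops at the first match; add the g flag to match globally.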
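The effect of the m flag, alone and combined with g, can be sketched on a two-line string (the text is invented for the example):

```javascript
const text = "first line\nsecond line";

const noM = text.match(/^\w+/g);        // ^ only matches at the start of the whole string
const withM = text.match(/^\w+/gm);     // ^ matches at the start of every line
const firstOnly = text.match(/^\w+/m);  // without g, matching stops at the first match
const lineEnds = text.match(/\w+$/gm);  // $ matches at the end of every line
const flag = /^\w+/m.multiline;         // reflects whether the m flag was used

console.log(noM);          // [ 'first' ]
console.log(withM);        // [ 'first', 'second' ]
console.log(firstOnly[0]); // first
console.log(lineEnds);     // [ 'line', 'line' ]
console.log(flag);         // true
```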