Python 3 Regular Expressions, Lesson 4

The re module (part 1)

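Functions shared by the re module and compiled pattern objects
obj = re.compile(pattern, flags=0)
Purpose: build a regular expression object
Parameters:
    pattern: a regular expression given as a string
    flags: flag bits; defaults to 0 and may be omitted
Returns: a regular expression object
Note: flags is optional and selects the matching mode, such as ignoring case or multi-line matching. Several flags can be combined with |. The available flags include:
      re.I
          IGNORECASE
          Ignore letter case
      re.L
          LOCALE
          Make \w, \W, \b, \B and case-insensitive matching depend on the current locale. In Python 3 this flag is only valid with bytes patterns.
      re.M
          MULTILINE
          Multi-line mode: ^ and $ also match at the start and end of each line
      re.S
          DOTALL
          Make '.' match any character, including a newline (without this flag, '.' does not match a newline)
      re.X
          VERBOSE
          For readability, ignore whitespace in the pattern and comments that follow '#'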

#!/usr/bin/env python3
# coding=utf-8

'''
Regular expressions: the re module

'''
import re

s = '''hello world
Hello kitty
nihao China
'''
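# A regular expression given as a string (raw, so \s reaches the regex engine unchanged)
pattern = r'''(?P<dog>hello) # group named dog
\s+ # whitespace
(world) # second group, matches world
'''
l = re.findall(pattern, s, re.X | re.I)
# re.I ignores letter case
# re.X ignores whitespace in the pattern and comments after '#', for readability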
print(l)
# [('hello', 'world')]

l = re.findall('.+', s, re.S)
# With re.S, '.' matches any character, including a newline
# Without re.S, '.' does not match a newline
print(l)
# ['hello world\nHello kitty\nnihao China\n']

l = re.findall('^nihao', s)
print(l)
# []

l = re.findall('^nihao', s, re.M)
# Multi-line mode: changes where ^ and $ match
print(l)
# ['nihao']

l = re.findall(r'H\w+', s, re.I)
# Ignore letter case
print(l)
# ['hello', 'Hello', 'hao', 'hina']
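
The calls above pass flags to the module-level re.findall. As a minimal sketch of the same idea with a compiled object, the snippet below fixes the flags once in re.compile and then calls obj.findall without them. The pattern r'^h\w+' and the variable names are chosen here for illustration and are not from the original lesson.

```python
import re

s = '''hello world
Hello kitty
nihao China
'''

# The flags are fixed at compile time; obj.findall() takes no flags argument.
obj = re.compile(r'^h\w+', re.M | re.I)
from_obj = obj.findall(s)

# Module-level call with the same pattern and flags.
from_module = re.findall(r'^h\w+', s, re.M | re.I)

# Inline flags (?im) at the start of the pattern select the same modes.
from_inline = re.findall(r'(?im)^h\w+', s)

print(from_obj)                                # ['hello', 'Hello']
print(from_obj == from_module == from_inline)  # True
```

Compiling once is mostly a convenience when the same pattern is reused many times; all three forms return the same list here.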

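obj.findall(string, pos, endpos)
Purpose: find all matches of the regular expression in a string
Parameters:
    string: the target string
    pos: index in the target string where matching starts
    endpos: index in the target string where matching stops
Returns: a list of everything that matched
Note: if the pattern contains groups, only the group contents are returned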

#!/usr/bin/env python3
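# coding=utf-8

import re

pattern = r'\s+'
# \s+ matches one or more whitespace characters

# Build the regular expression object
obj = re.compile(pattern, flags=0)
# flags is optional and may be omitted
l = obj.findall("abcdabcabab", 1, 9)
print(l)
# [] (the target string contains no whitespace)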
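
The example above returns an empty list, so it does not show what pos, endpos and groups actually do. The following sketch, using the same target string with illustrative patterns that are not from the original lesson, may make the effect easier to see.

```python
import re

text = "abcdabcabab"

# No group: each list item is the whole match.
plain = re.compile(r'ab')
all_matches = plain.findall(text)
print(all_matches)        # ['ab', 'ab', 'ab', 'ab']

# pos=1, endpos=9: only indexes 1 through 8 are searched.
windowed = plain.findall(text, 1, 9)
print(windowed)           # ['ab', 'ab']

# One group: each list item is that group's text, not the whole match.
one_group = re.compile(r'ab(c?)').findall(text)
print(one_group)          # ['c', 'c', '', '']

# Several groups: each list item is a tuple with one entry per group.
two_groups = re.compile(r'(a)b(c?)').findall(text)
print(two_groups)         # [('a', 'c'), ('a', 'c'), ('a', ''), ('a', '')]
```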

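obj.split(string)
Purpose: split the target string wherever the regular expression matches
Parameters: the target string
Returns: a list of the pieces

# Split the target string on matches of pattern = r'\s+'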
l = obj.split('hello world  hello kitty  nihao china')
print(l)
# ['hello', 'world', 'hello', 'kitty', 'nihao', 'china']
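
A few behaviours of split are worth checking by hand. The sketch below goes slightly beyond the signature listed above: it relies on the optional maxsplit argument of the compiled object's split method, and the sample strings are made up for illustration.

```python
import re

obj = re.compile(r'\s+')

# maxsplit limits the number of cuts; the remainder stays in the last item.
limited = obj.split('hello world  hello kitty', maxsplit=2)
print(limited)     # ['hello', 'world', 'hello kitty']

# Leading or trailing separators produce empty strings at the ends.
padded = obj.split(' a b ')
print(padded)      # ['', 'a', 'b', '']

# A capturing group around the separator keeps the separators in the result.
kept = re.compile(r'(\s+)').split('a b  c')
print(kept)        # ['a', ' ', 'b', '  ', 'c']
```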

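obj.sub(repl, string, count)
Purpose: replace the text matched by the regular expression
Parameters:
    repl: the replacement text
    string: the target string
    count: maximum number of replacements
Returns: the string after replacement

# Replace matches in the target string with '##', at most 2 times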
s = obj.sub('##', 'hello world nihao China', 2)
print(s)
# hello##world##nihao China
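
The replacement does not have to be a fixed string. As a hedged sketch that goes a little beyond the lesson text, repl may contain group references such as \1, or it may be a function that receives the match object. The patterns and the helper name shout below are illustrative only.

```python
import re

text = 'hello world nihao China'

# Group references in repl: swap each pair of words.
pairs = re.compile(r'(\w+)\s+(\w+)')
swapped = pairs.sub(r'\2 \1', text)
print(swapped)     # world hello China nihao

# repl as a function: it is called once per match and returns the new text.
def shout(match):
    return match.group(0).upper()

lower_word = re.compile(r'\b[a-z]+\b')
shouted = lower_word.sub(shout, text, count=2)
print(shouted)     # HELLO WORLD nihao China
```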

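obj.subn(repl, string, count)
Purpose: replace the text matched by the regular expression
Parameters:
    repl: the replacement text
    string: the target string
    count: maximum number of replacements
Returns: a tuple of the new string and the number of replacements actually made

# Returns the new string and the number of replacements actually made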
s = obj.subn('##', 'hello world nihao China')
print(s)
# ('hello##world##nihao##China', 3)
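
The returned count is handy for telling whether anything changed. A small sketch with illustrative variable names, unpacking the tuple and using count:

```python
import re

obj = re.compile(r'\s+')

# count caps the replacements; the second item reports how many were made.
new_text, n = obj.subn('##', 'hello world nihao China', count=2)
print(new_text, n)     # hello##world##nihao China 2

# With no match the string comes back unchanged and the count is 0.
same_text, zero = obj.subn('##', 'helloworld')
print(same_text, zero)  # helloworld 0
```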
