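Common matching rules
Common matching functions
re.match(pattern, string, flags): matches only at the start of the string; returns a match object on success and None on failure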
import re
content = 'Hello 1234567 World_This is a Regex Demo'
print(len(content)) # 40
result = re.match(r'^Hello\s(\d+)\sWorld', content)
print(result) # a match object: <re.Match object; span=(0, 19), match='Hello 1234567 World'> (shown as _sre.SRE_Match before Python 3.7)
print(result.group()) # Hello 1234567 World
print(result.group(1)) # 1234567
print(result.span()) # (0, 19)
import re
content = 'Hello 1234567 World_This is a Regex Demo'
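result = re.match(r'^He.*(\d+).*Demo$', content) # greedy: .* consumes as many characters as possible, leaving (\d+) only the last digit
print(result.group(1)) # 7
result = re.match(r'^He.*?(\d+).*Demo$', content) # non-greedy: .*? consumes as few characters as possible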
print(result.group(1)) # 1234567
import re
content = """Hello 1234567 World_This
is a Regex Demo"""
result = re.match(r'^He.*?(\d+).*?Demo$', content, re.S) # re.S lets . match newlines as well
print(result.group(1)) # 1234567
result = re.match(r'^He.*?(\d+).*?Demo$', content) # without re.S, . stops at the newline, so the match fails and returns None
print(result.group(1)) # AttributeError: 'NoneType' object has no attribute 'group'
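Because re.match returns None on failure, calling .group() on the result directly raises AttributeError, as the last line above shows. A minimal sketch of guarding that call follows; the helper name first_number is made up for illustration and is not part of the re module.

```python
import re

def first_number(text, flags=0):
    """Return the first captured run of digits, or None if the pattern does not match."""
    # first_number is a hypothetical helper used only for this example.
    m = re.match(r'^He.*?(\d+).*?Demo$', text, flags)
    return m.group(1) if m else None

content = """Hello 1234567 World_This
is a Regex Demo"""

print(first_number(content, re.S))  # 1234567
print(first_number(content))        # None
```

Checking the match object before using it keeps a failed match from crashing the script.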
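Escaping: to match characters such as ( ) . \ literally, escape each one with a backslash. Prefixing the pattern with r (a raw string) only stops Python from interpreting the backslashes itself; it does not escape regex metacharacters.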
import re
content = '(百度)www.baidu.com'
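result = re.match(r'(百度)www.baidu.com', content) # unescaped ( ) form a group, so the literal parentheses are not matched
print(result) # None
result = re.match(r'\(百度\)www\.baidu\.com', content)
print(result.group()) # (百度)www.baidu.com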
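When the text to match is a fixed literal, the escaping can also be delegated to re.escape from the standard library. The sketch below reuses the string from the example above; the second test string 'wwwXbaiduYcom' is invented here to show why an unescaped dot is too permissive.

```python
import re

content = '(百度)www.baidu.com'

# re.escape backslash-escapes the regex metacharacters in a literal string.
pattern = re.escape('(百度)www.baidu.com')
result = re.match(pattern, content)
print(result.group())  # (百度)www.baidu.com

# An unescaped '.' matches any character, so this unintended string matches too.
print(re.match(r'www.baidu.com', 'wwwXbaiduYcom') is not None)    # True
# An escaped '\.' accepts only a literal dot.
print(re.match(r'www\.baidu\.com', 'wwwXbaiduYcom') is not None)  # False
```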
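re.search(pattern, string, flags): scans the whole string and returns the first match; returns None if nothing matches
re.findall(pattern, string, flags): scans the whole string and returns all non-overlapping matches as a list; returns an empty list (not None) if nothing matches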
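The difference between the three functions is easiest to see on one string. In this sketch the sample text is a variation of the earlier content string, with a prefix and a trailing number added purely for illustration.

```python
import re

content = 'Extra text Hello 1234567 World_This is a Regex Demo 89'

# re.match only tries position 0, so a pattern that starts later fails.
print(re.match(r'Hello\s(\d+)', content))    # None

# re.search scans forward and returns the first match object.
found = re.search(r'Hello\s(\d+)', content)
print(found.group(1))                        # 1234567

# re.findall returns every non-overlapping match as a list of strings.
numbers = re.findall(r'\d+', content)
print(numbers)                               # ['1234567', '89']

# With no match, findall returns an empty list rather than None.
print(re.findall(r'\d+', 'no digits here'))  # []
```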
re.sub(pattern, replacement, string): replaces every match with the replacement and returns the resulting string
import re
content = '54aK54yr50iR54ix5L2g'
content = re.sub(r'\d+', '', content) # raw string keeps \d from being treated as a Python string escape
print(content) # aKyriRixLg
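re.sub also accepts an optional count argument and group references in the replacement. The following is a small sketch on the same input string; the angle-bracket wrapping is an arbitrary choice made for the example.

```python
import re

content = '54aK54yr50iR54ix5L2g'

# count=1 replaces only the first run of digits.
first_only = re.sub(r'\d+', '', content, count=1)
print(first_only)  # aK54yr50iR54ix5L2g

# \1 in the replacement refers to the text captured by the first group.
wrapped = re.sub(r'(\d+)', r'<\1>', content)
print(wrapped)     # <54>aK<54>yr<50>iR<54>ix<5>L<2>g
```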
re.compile(pattern, flags): compiles a regular expression into a reusable pattern object
import re
content = '2018-11-18 12:00'
pattern = re.compile(r'\s\d{2}:\d{2}') # a space followed by HH:MM
result = re.sub(pattern, '', content)
print(result) # 2018-11-18
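A compiled pattern pays off mainly when it is reused, and the pattern object has its own sub, search, and findall methods. A rough sketch, assuming a list of timestamps in the same format as above (the extra entries in stamps are invented for the example):

```python
import re

# Compile once, then reuse the pattern object on several strings.
time_part = re.compile(r'\s\d{2}:\d{2}')

stamps = ['2018-11-18 12:00', '2018-11-19 08:30', '2018-11-20 23:59']
dates = [time_part.sub('', s) for s in stamps]
print(dates)  # ['2018-11-18', '2018-11-19', '2018-11-20']

# The pattern object exposes search directly; strip() drops the leading space.
print(time_part.search(stamps[1]).group().strip())  # 08:30
```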