Python Web Scraping: Matching Digits in a URL with a Regular Expression

Category: Other | 2018-05-31 09:22:10


# Match the digits in a URL
import re

url = "https://www.baidu.com/company/13828?param=abc"
# Raw string so that \d is not treated as a string escape
com_id = re.match(r".*company/(\d+)", url)

print(com_id.group(1))
# 13828
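
Because `re.match` only matches at the start of the string, the pattern above needs the leading `.*`, and calling `.group(1)` fails if nothing matched. A minimal sketch of an alternative, using `re.search` with a check for a failed match; the helper name `find_company_id` and the second URL are hypothetical and not part of the original post:

```python
import re

def find_company_id(url):
    """Return the digits following 'company/' in url, or None if absent."""
    # re.search scans the whole string, so no leading ".*" is needed
    result = re.search(r"company/(\d+)", url)
    return result.group(1) if result else None

print(find_company_id("https://www.baidu.com/company/13828?param=abc"))
# 13828
print(find_company_id("https://www.baidu.com/about"))
# None
```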

Wrap it in a function:

# -*- coding: utf-8 -*-

# @File    : get_digit.py
# @Date    : 2018-05-25

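# Match the digits in a URL
import re

def get_digit(url, reg_exp=r".*/(\d+)"):
    """Match the digits in a URL.
    :param
        url{str}: URL string
        reg_exp{str}: regular expression pattern with one capturing group
    :return
        digit {None}/{str}: None, or the matched digit string
    """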
    digit = None

    pattern = re.compile(reg_exp)
    result = pattern.match(url)

    if result:
        digit = result.group(1)

    return digit


if __name__ == '__main__':
    # No match
    url = ""
    ret = get_digit(url)
    print(ret)
    # None

    # A single run of digits
    url = "https://www.baidu.com/company/13828?param=abc"
    ret = get_digit(url)
    print(ret)
    # 13828

    # Two runs of digits: only the one directly after a "/" is returned
    url = "https://www.baidu.com/company/13828?param=234234"
    ret = get_digit(url)
    print(ret)
    # 13828
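
Note that the default pattern `.*/(\d+)` is greedy: if several path segments begin with digits, it returns the last one, not the first. The sketch below illustrates this on a made-up URL and shows one possible way to collect every numeric path segment. The helper `path_digits` and the example URL are assumptions for illustration only; it relies on `urllib.parse.urlsplit` from the Python 3 standard library to ignore the query string.

```python
import re
from urllib.parse import urlsplit  # Python 3 standard library

def path_digits(url):
    """Return every all-digit path segment of url, in order of appearance."""
    # Look only at the path, so digits in the query string are ignored
    path = urlsplit(url).path
    # A segment counts only if it is all digits up to the next "/" or the end
    return re.findall(r"/(\d+)(?=/|$)", path)

# Hypothetical URL with two numeric path segments
url = "https://www.baidu.com/company/13828/item/42?param=234234"

print(re.match(r".*/(\d+)", url).group(1))   # greedy: last "/digits"
# 42
print(re.match(r".*?/(\d+)", url).group(1))  # non-greedy: first "/digits"
# 13828
print(path_digits(url))
# ['13828', '42']
```

If only the first numeric segment is wanted, passing `reg_exp=r".*?/(\d+)"` to `get_digit` should be enough; the list-returning variant is useful mainly when the position of the ID in the path is not known in advance.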

Reposted from blog.csdn.net/mouday/article/details/80459158
