Copyright notice: you are welcome to repost Dayu's blog; please credit the source: https://blog.csdn.net/yanluandai1985/article/details/88654927
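from bs4 import BeautifulSoup
import requests

if __name__ == '__main__':
    # Blog home page that lists the articles
    url = 'https://blog.csdn.net/yanluandai1985'
    # Browser-like User-Agent header sent with the request
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
    }
    req = requests.get(url=url, headers=headers)
    req.encoding = 'utf-8'
    html = req.text
    bf = BeautifulSoup(html, 'html.parser')
    # Every article entry is an element carrying exactly this class attribute
    targets_url = bf.find_all(class_='article-item-box csdn-tracking-statistics')
    list_url = []
    for each in targets_url:
        # The article link sits inside the entry's <h4> heading
        list_url.append(each.h4.a.get('href'))
        # contents[2] is the third child of the link, which this script treats as the title
        print(each.h4.a.get('href'), ":", each.h4.a.contents[2])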
Output:
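
Each printed line pairs an article's URL with its title, taken from `contents[2]` of the link. The sketch below is a minimal offline illustration of that extraction step, not the script itself: `SAMPLE_HTML` is a hand-written stand-in that only assumes the general shape of the list markup (a type label inside the link, followed by the title text), the example.com URLs are placeholders, and `extract_links` is a hypothetical helper name. The real page may differ or change over time.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for the article list (an assumption, not the real page)
SAMPLE_HTML = """
<div class="article-item-box csdn-tracking-statistics">
  <h4>
    <a href="https://example.com/article/1">
      <span class="article-type">Original</span>
      First post
    </a>
  </h4>
</div>
<div class="article-item-box csdn-tracking-statistics">
  <h4>
    <a href="https://example.com/article/2">
      <span class="article-type">Original</span>
      Second post
    </a>
  </h4>
</div>
"""


def extract_links(html):
    """Return (url, title) pairs for every article entry found in html."""
    soup = BeautifulSoup(html, "html.parser")
    items = soup.find_all(class_="article-item-box csdn-tracking-statistics")
    results = []
    for item in items:
        link = item.h4.a
        # Children of the link: [whitespace, <span> label, title text],
        # so the title is the third child, at index 2.
        title = link.contents[2].strip()
        results.append((link.get("href"), title))
    return results


if __name__ == "__main__":
    for href, title in extract_links(SAMPLE_HTML):
        print(href, ":", title)
```

Calling `.strip()` drops the whitespace that the parser keeps around the title text. If the markup changes so that the link has fewer than three children, `contents[2]` raises an `IndexError`; `link.get_text(strip=True)` is more tolerant, at the cost of also including the label text.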