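import os

import requests

# Create the folder for the poster images if it does not exist yet.
if not os.path.exists('image'):
    os.mkdir('image')

def parse_html(url):
    # Send a browser-like User-Agent with the request.
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36"}
    res = requests.get(url, headers=headers)
    text = res.text
    item = []
    for i in range(25):  # each page is expected to list 25 movies
        # Cut the text just past the next 'alt' so extract() starts at the attribute value.
        text = text[text.find('alt') + 3:]
        item.append(extract(text))
    return item

def extract(text):
    # The text is expected to start with ="<title>" src="<image URL>",
    # so splitting on '"' leaves the title at index 1 and the URL at index 3.
    text = text.split('"')
    name = text[1]
    image = text[3]
    return name, image

def write_movies_file(item, stars):
    print(item)
    # Append the rank and title to the text file, then save the poster image.
    with open('douban_film.txt', 'a', encoding='utf-8') as f:
        f.write('Rank: %d\tTitle: %s\n' % (stars, item[0]))
    r = requests.get(item[1])
    with open('image/' + str(item[0]) + '.jpg', 'wb') as f:
        f.write(r.content)

def main():
    stars = 1  # running rank across all pages
    for offset in range(0, 250, 25):
        url = 'https://movie.douban.com/top250?start=' + str(offset) + '&filter='
        for item in parse_html(url):
            write_movies_file(item, stars)
            stars += 1

if __name__ == '__main__':
    main()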
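
The scraper above finds each title and poster URL by slicing the page text around the string 'alt', which breaks as soon as the attribute order or the surrounding markup changes. As a rough alternative, here is a minimal sketch that reads the alt and src attributes with the standard library's html.parser. The names PosterParser and parse_posters, and the sample markup, are made up for illustration and are not taken from the real page.

```python
from html.parser import HTMLParser


class PosterParser(HTMLParser):
    """Collect (alt, src) pairs from <img> tags that carry both attributes."""

    def __init__(self):
        super().__init__()
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag != 'img':
            return
        attr_map = dict(attrs)
        name = attr_map.get('alt')
        image = attr_map.get('src')
        # Skip images that lack a title or a URL.
        if name and image:
            self.items.append((name, image))


def parse_posters(html_text):
    """Return a list of (name, image_url) tuples found in html_text."""
    parser = PosterParser()
    parser.feed(html_text)
    return parser.items


# Hypothetical sample markup, only loosely shaped like a movie list page.
sample = (
    '<div class="pic"><img width="100" alt="Movie A" src="https://example.com/a.jpg"></div>'
    '<div class="pic"><img width="100" alt="Movie B" src="https://example.com/b.jpg"></div>'
)
print(parse_posters(sample))
```

On a real page this would also pick up any other image that has both attributes, so the result would probably need filtering before being passed to write_movies_file.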
The Zen of Python
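import requests

url4 = 'https://www.python.org/dev/peps/pep-0020/'
res = requests.get(url4)
text = res.text
# Take the contents of the <pre> block. The hard-coded offset 28 is meant to skip
# the rest of the opening <pre ...> tag, and the -1 drops the final character before </pre>.
zen = text[text.find('<pre') + 28:text.find('</pre>') - 1]
with open('zon_of_python.txt', 'w') as f:
    f.write(zen)
print(zen)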
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
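
The slice used to produce this output depends on a fixed offset of 28 characters after '<pre', so it only works while the opening tag has exactly that length. Below is a minimal sketch of a less brittle extraction using a regular expression. The function name extract_pre and the sample string are assumptions for illustration, and the markup of the live page may differ.

```python
import html
import re


def extract_pre(page):
    """Return the text of the first <pre>...</pre> block, or None if there is none."""
    match = re.search(r'<pre\b[^>]*>(.*?)</pre>', page, flags=re.DOTALL | re.IGNORECASE)
    if match is None:
        return None
    # Decode HTML entities such as &#39; and trim surrounding whitespace.
    return html.unescape(match.group(1)).strip()


# Hypothetical sample, shaped like a page with one <pre> block.
sample = (
    '<body><pre class="literal-block">\n'
    'Beautiful is better than ugly.\n'
    'Readability counts.\n'
    '</pre></body>'
)
print(extract_pre(sample))
```

Any tags nested inside the block would be left in the result, so this is only adequate when the block holds plain text.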
Youdao Translate
import requests
import json

def translate(word):
    url = "http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule"
    data = {
        'i': word,
        'f': 'auto',
        't': 'auto',
        'doctype': 'json',  # required
    }
    # The User-Agent tells the web server which tool is making the request.
    # Requests from crawlers are usually refused, while requests from a browser get a response.
    headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}
    response = requests.post(url, data=data, headers=headers)  # send the request
    json_data = response.json()  # parse the JSON response
    # print(json_data)
    return json_data

def run(word):
    result = translate(word)['translateResult'][0][0]['tgt']
    print(result)
    return result

def main():
    with open('zon_of_python.txt') as f:
        # Strip the trailing newline and skip blank lines before translating.
        zh = [run(line.strip()) for line in f if line.strip()]
    with open('zon_of_python_zh-CN.txt', 'w', encoding='utf-8') as g:
        for i in zh:
            g.write(i + '\n')

if __name__ == '__main__':
    main()
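
The run function above reads only the first segment, ['translateResult'][0][0]['tgt'], so if a line ever came back split into several segments, the rest would be dropped. Here is a minimal sketch that joins every segment instead. It assumes the nesting that the indexing in run implies (a list of lists of dicts with a 'tgt' key). The helper name collect_translation, the 'src' key, and the sample payload are made up for illustration, not captured from the service.

```python
def collect_translation(json_data):
    """Join every 'tgt' segment found in a translateResult-style payload."""
    parts = []
    for paragraph in json_data.get('translateResult', []):
        for segment in paragraph:
            tgt = segment.get('tgt')
            if tgt:
                parts.append(tgt)
    return ''.join(parts)


# Hypothetical payload with two segments in one paragraph.
sample_payload = {
    'translateResult': [
        [
            {'src': 'Hello.', 'tgt': '你好。'},
            {'src': 'Goodbye.', 'tgt': '再见。'},
        ]
    ]
}
print(collect_translation(sample_payload))
```

A payload without the 'translateResult' key yields an empty string rather than raising a KeyError, which makes a failed request easier to spot in the output file.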