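In a multithreaded crawler, a request will sometimes fail, and the exception handler needs to record which page the crawler had reached. That page number belongs to a single thread and is not shared between threads, so it cannot be stored in a global variable.
Python 3's threading module provides threading.current_thread() (spelled threading.currentThread() in older code, an alias deprecated since Python 3.10), which returns the Thread object for the thread that calls it. Each thread has its own Thread object, so an attribute stored on it is kept separately for each thread.
threading.current_thread().page is therefore the page attribute of the thread that is currently running.
The exception handler only needs to hand back this value.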
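As a minimal sketch of the idea above: each worker stores its progress as an attribute on its own Thread object and reads it back inside the exception handler. The names fetch, crawl and run_crawl, and the rule that page 3 fails, are hypothetical and exist only for illustration.

```python
import threading


def fetch(page):
    """Hypothetical stand-in for a page download; it fails on page 3."""
    if page == 3:
        raise ValueError(f"cannot fetch page {page}")
    return f"content of page {page}"


def crawl(start, stop):
    """Crawl pages [start, stop) and note progress on the current thread."""
    me = threading.current_thread()
    for page in range(start, stop):
        me.page = page  # per-thread attribute, not a global variable
        try:
            fetch(page)
        except ValueError:
            # The handler reads the page back from the thread object.
            return threading.current_thread().page
    return None  # no page failed


def run_crawl(results, key, start, stop):
    """Store the failing page (or None) under a key chosen by the caller."""
    results[key] = crawl(start, stop)


results = {}
workers = [
    threading.Thread(target=run_crawl, args=(results, "a", 1, 6)),
    threading.Thread(target=run_crawl, args=(results, "b", 10, 13)),
]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(results)
```

Worker "a" stops at page 3, while worker "b" finishes without an error and reports None. Neither worker sees the other's page value.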
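
import threading


class myThread(threading.Thread):
    def __init__(self, threadID, name, page):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.page = page  # per-thread page counter

    def run(self):
        # Each call to room() advances the page by one, even though it fails.
        self.page = room(self.page)
        self.page = room(self.page)
        print("Starting thread: " + self.name)
        # Inside run(), current_thread() is this myThread instance.
        print(threading.current_thread().page)
        print("Exiting thread: " + self.name)


def room(page):
    try:
        page = page + 1
        # Simulate a failure partway through processing the page.
        raise RuntimeError("simulated failure")
    except Exception:
        # Hand the current page back to the caller.
        return page


threadList = ["Thread-1", "Thread-2", "Thread-3"]
threads = []
threadID = 1
# Create and start the new threads
for tName in threadList:
    thread = myThread(threadID, tName, 5)
    thread.start()
    threads.append(thread)
    threadID += 1
# Wait for all threads to finish
for t in threads:
    t.join()
print("Exiting main thread")

Output of a typical run (lines from different threads may interleave):

Starting thread: Thread-1
7
Exiting thread: Thread-1
Starting thread: Thread-2
7
Exiting thread: Thread-2
Starting thread: Thread-3
7
Exiting thread: Thread-3
Exiting main thread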
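
Because the page is stored on the thread object, the main thread can also read it after join(), for example to decide where each worker should resume. The sketch below assumes a hypothetical PageWorker class with a fail_at parameter that simulates an error on a chosen page. It is an illustration, not part of the program above.

```python
import threading


class PageWorker(threading.Thread):
    """Hypothetical worker that remembers the page it stopped on."""

    def __init__(self, name, start_page, fail_at):
        super().__init__(name=name)
        self.page = start_page
        self.fail_at = fail_at  # page on which the simulated error occurs

    def run(self):
        try:
            while self.page < 10:
                if self.page == self.fail_at:
                    raise RuntimeError("simulated failure")
                self.page += 1
        except RuntimeError:
            pass  # self.page still holds the page that failed


workers = [PageWorker("worker-1", 1, 4), PageWorker("worker-2", 1, 7)]
for w in workers:
    w.start()
for w in workers:
    w.join()

# After join(), the main thread reads each worker's last page.
resume_from = {w.name: w.page for w in workers}
print(resume_from)  # {'worker-1': 4, 'worker-2': 7}
```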