Test site (航天云网, casicloud.com):
http://cas.casicloud.com/login?service=http%3A%2F%2Fin.casicloud.com%2Floginc%3Fservice%3D%252Fsso%252Flogin.jsp%253Fredirect%253Dhttp%25253A%25252F%25252Fwww.casicloud.com%25252Floginc%25253Fret%25253Dhttp%2525253A%2525252F%2525252Fwww.casicloud.com%2525252F
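The login page (screenshot):
First, about the CAPTCHA:
Fortunately, a look at the page shows that this site's CAPTCHA does not need OCR. Once the page's JS has loaded, the CAPTCHA value sits in a hidden field, <input type="hidden" id="randomString" value="...">. So all we need to do is open the login page in an automated browser, let the JS finish, and then pull the value out with a regular expression or XPath.
Now, the steps:
1. Set up the browser and open the login page: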
from selenium import webdriver
url = '***'
driver = webdriver.Chrome()
driver.get(url)
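2. I recommend setting a wait so the JS has time to load (I usually use 3-5 seconds). Note that implicitly_wait is not a fixed pause: it sets how long later find_element calls keep retrying before they give up.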
driver.implicitly_wait(5)
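An implicit wait only covers element lookup; it does not wait for the page's JS to put a value into the hidden field. One way to cover that gap is to poll until the value looks like a CAPTCHA. The sketch below uses only the standard library so it can be tried without a browser. `wait_for_value` and its parameters are hypothetical names, and with Selenium the same role is normally played by `WebDriverWait`.

```python
import re
import time


def wait_for_value(read_value, pattern=r"\d{4}", timeout=5.0, interval=0.25):
    """Call read_value() repeatedly until its result fully matches pattern.

    read_value: zero-argument callable returning the current field value
                (hypothetical; with Selenium it could wrap get_attribute).
    Returns the matching value, or raises TimeoutError after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = read_value()
        if value is not None and re.fullmatch(pattern, value):
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError("value did not match %r within %.1fs" % (pattern, timeout))
        time.sleep(interval)


# Stand-in for a page whose JS fills the field on the third read.
_reads = iter(["", "", "4821"])
print(wait_for_value(lambda: next(_reads), interval=0.01))  # 4821
```

With a live driver, `read_value` could be something like `lambda: driver.find_element('id', 'randomString').get_attribute('value')`, assuming the field id described above.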
3. Fill in the account name and password in the corresponding form fields:
# 'xpath' is the string value of By.XPATH; newer Selenium 4 releases no longer provide find_element_by_xpath
driver.find_element('xpath', '//*[@id="shortAccount"]').send_keys('your_account')
driver.find_element('xpath', '//*[@id="password"]').send_keys('your_password')
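4. Read the CAPTCHA value from the page after the JS has run:
import re
html = driver.page_source
# The hidden field holds the 4-digit CAPTCHA once the page's JS has run
check_value = re.search(r'<input type="hidden" id="randomString" value="(\d\d\d\d)"', html).group(1)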
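The regex above depends on the exact attribute order and spacing that page_source happens to produce, and `.group(1)` raises AttributeError when nothing matches. As a hedged alternative, the sketch below parses the HTML with the standard library and looks the field up by its id. The sample HTML is made up for illustration; only the `randomString` id comes from the page described above, and `find_hidden_value` is a hypothetical helper name.

```python
from html.parser import HTMLParser


class _InputFinder(HTMLParser):
    """Collects the value attribute of the <input> with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.value = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, so attribute order is irrelevant
        if tag == "input" and self.value is None:
            attr_map = dict(attrs)
            if attr_map.get("id") == self.target_id:
                self.value = attr_map.get("value")


def find_hidden_value(html, target_id="randomString"):
    """Return the field's value, or None if the field is absent."""
    finder = _InputFinder(target_id)
    finder.feed(html)
    return finder.value


# Made-up sample: attributes in a different order than the regex expects.
sample = '<form><input value="4821" id="randomString" type="hidden"></form>'
print(find_hidden_value(sample))             # 4821
print(find_hidden_value("<p>no field</p>"))  # None
```

Returning None instead of raising lets the caller decide whether to retry, for example after waiting a little longer for the JS.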
5. Enter the CAPTCHA, log in, and collect the cookies:
key = str(check_value)
driver.find_element('xpath', '//*[@id="code0"]').send_keys(key)
driver.find_element('xpath', '//*[@id="loginForm"]/div[6]/input').click()
driver.refresh()
cookies = driver.get_cookies()
# Build a "name=value; name=value; " string from the cookie dicts
ret = ''
for cookie in cookies:
    cookie_name = cookie['name']
    cookie_value = cookie['value']
    ret = ret + cookie_name + '=' + cookie_value + '; '
print(ret)
driver.quit()
The page refresh (refresh) above is just a personal habit.
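The loop above leaves a trailing '; ' on the string. Here is a small sketch of the same step using str.join, run against a made-up list shaped like the dicts get_cookies() returns. The `name` and `value` keys are the ones the code above reads; the cookie names and values are invented, and `cookies_to_header` is a hypothetical helper name.

```python
def cookies_to_header(cookies):
    """Join Selenium-style cookie dicts into a 'name=value; name=value' string."""
    return "; ".join("%s=%s" % (c["name"], c["value"]) for c in cookies)


# Invented sample data; a real list would come from driver.get_cookies().
sample_cookies = [
    {"name": "JSESSIONID", "value": "abc123", "domain": "example.com"},
    {"name": "token", "value": "xyz"},
]
print(cookies_to_header(sample_cookies))  # JSESSIONID=abc123; token=xyz
```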
Tidied up, the code looks like this:
# coding: utf-8
import re
from selenium import webdriver

def login_get_cookie(url):
    driver = webdriver.Chrome()
    driver.get(url)
    driver.implicitly_wait(5)
    # 'xpath' is the string value of By.XPATH
    driver.find_element('xpath', '//*[@id="shortAccount"]').send_keys('your_account')
    driver.find_element('xpath', '//*[@id="password"]').send_keys('your_password')
    # The hidden field holds the 4-digit CAPTCHA once the page's JS has run
    html = driver.page_source
    check_value = re.search(r'<input type="hidden" id="randomString" value="(\d\d\d\d)"', html).group(1)
    key = str(check_value)
    driver.find_element('xpath', '//*[@id="code0"]').send_keys(key)
    driver.find_element('xpath', '//*[@id="loginForm"]/div[6]/input').click()
    driver.refresh()
    cookies = driver.get_cookies()
    # Build a "name=value; name=value; " string from the cookie dicts
    ret = ''
    for cookie in cookies:
        cookie_name = cookie['name']
        cookie_value = cookie['value']
        ret = ret + cookie_name + '=' + cookie_value + '; '
    print(ret)
    driver.quit()
    return ret

url = 'http://cas.casicloud.com/login?service=http%3A%2F%2Fin.casicloud.com%2Floginc%3Fservice%3D%252Fsso%252Flogin.jsp%253Fredirect%253Dhttp%25253A%25252F%25252Fwww.casicloud.com%25252Floginc%25253Fret%25253Dhttp%2525253A%2525252F%2525252Fwww.casicloud.com%2525252F'
cookies = login_get_cookie(url)
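The returned string has the shape of a Cookie request header, so it can be handed to a plain HTTP client for later requests. A minimal sketch with the standard library follows. The target URL and cookie string are placeholders, `build_request` is a hypothetical helper, and the request is only constructed here, not sent. Whether the site accepts these cookies outside the browser is an assumption the walkthrough above does not confirm.

```python
import urllib.request


def build_request(url, cookie_string):
    """Create a Request that carries the browser session's cookies."""
    # Drop the trailing '; ' that simple string concatenation leaves behind.
    header_value = cookie_string.strip().rstrip(";")
    return urllib.request.Request(url, headers={"Cookie": header_value})


req = build_request("http://www.example.com/", "JSESSIONID=abc123; token=xyz; ")
print(req.get_header("Cookie"))  # JSESSIONID=abc123; token=xyz
# urllib.request.urlopen(req) would send it; left out so the sketch runs offline.
```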