My English is not very good, but when I searched for this topic I could not find any useful information apart from the official manual. So I am writing this post in English, in the hope that it helps more people.
Keywords: cookielib, http.cookiejar, urllib.request, login, HTTP, web page, Python 3.2, log in to a web page with Python 3.2, POST
I will use this website as an example: share.dmhy.org
To log in to this website, you need to enter your account (email), password, and a captcha code.
Step 1. Import the libraries and define some variables
import http.cookiejar, urllib.parse, urllib.request
import re, os
username = "PG@example.com"
password = "123456"
captcha = ""
Step 2. Create the opener, which uses a cookie jar to process cookies.
cj = http.cookiejar.MozillaCookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
Step 3. Open the login page through the opener, fetch the captcha through the same opener, and save the captcha as an image file.
first_page = opener.open("http://share.dmhy.org/user/login").read().decode("utf-8")
m = re.search(r"common/generate-captcha\?code=([0-9]+)", first_page)
captcha_pic = opener.open("http://share.dmhy.org/common/generate-captcha?code=" + m.group(1)).read()
with open("PG.jpg", "wb") as f:
    f.write(captcha_pic)
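The regular expression in Step 3 can be tried offline before pointing it at the real page. The following is only a small sketch: `sample_html` is a made-up fragment (the real markup of the login page may look different), and `extract_captcha_code` is a hypothetical helper name. It also returns None when nothing matches, so the script does not crash on `m.group(1)` if the page layout changes.

```python
import re

# Made-up fragment of the login page; the real markup may differ.
sample_html = '<img src="/common/generate-captcha?code=1337420" alt="captcha">'

# Raw string, so "\?" reaches the regex engine unchanged.
CAPTCHA_RE = re.compile(r"common/generate-captcha\?code=([0-9]+)")

def extract_captcha_code(html):
    """Return the numeric captcha code found in html, or None if absent."""
    match = CAPTCHA_RE.search(html)
    return match.group(1) if match else None

code = extract_captcha_code(sample_html)
missing = extract_captcha_code("<html>no captcha here</html>")
```

In the real script, you would call the helper on `first_page` and stop with an error message when it returns None.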
Step 4. Ask the user for the captcha
os.system("start PG.jpg")  # "start" is a Windows shell command; it opens the image in the default viewer
captcha_code = input("Captcha : ")
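The `start` command only exists on Windows. If you want the same step on other systems, one possible approach is to pick the command by platform. This is a rough sketch, not something I tested on every system: `viewer_command` is a hypothetical helper, and it assumes that `open` is available on macOS and that `xdg-open` is installed on Linux desktops.

```python
import subprocess
import sys

def viewer_command(path, platform=sys.platform):
    """Return the argument list that opens path in the default viewer."""
    if platform.startswith("win"):
        # "start" is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a Linux desktop with xdg-open installed.
    return ["xdg-open", path]

def show_image(path):
    """Open the image and return the exit code of the launcher command."""
    return subprocess.call(viewer_command(path))
```

Calling `show_image("PG.jpg")` would then replace the `os.system` line.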
Step 5. Prepare the POST request
login_data = urllib.parse.urlencode({
    'email': username,
    'password': password,
    'login_node': '0',
    'cookietime': '315360000',
    'captcha_code': captcha_code,
}).encode("utf-8")
url = urllib.request.Request("http://share.dmhy.org/user/login", login_data)
url.add_header("User-Agent","Chrome/18.1.2.3")
Note that User-Agent is optional.
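To see what Step 5 actually produces, you can build the request without sending it. The sketch below uses placeholder values (the captcha value "abcd" is invented) and only three of the form fields. It shows that `urlencode` percent-encodes the "@" in the email address, and that passing a data argument to `Request` is what turns it into a POST request.

```python
import urllib.parse
import urllib.request

# Placeholder values; the field names are the ones used in Step 5.
fields = {
    "email": "PG@example.com",
    "password": "123456",
    "captcha_code": "abcd",
}

# urlencode() returns a str; the request body has to be bytes.
body = urllib.parse.urlencode(fields).encode("utf-8")

# Nothing is sent here; the Request object only describes the request.
request = urllib.request.Request("http://share.dmhy.org/user/login", body)
request.add_header("User-Agent", "Chrome/18.1.2.3")

method = request.get_method()
```

Here `method` is "POST" because a body was supplied; without the second argument the same `Request` would be a GET.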
Step 6. Get the response data, print the cookies, and save them to “cookie.txt”
ResponseData = opener.open(url).read().decode("utf8", 'ignore')
print(ResponseData)
for ind, cookie in enumerate(cj):
    print("%d - %s" % (ind, cookie))
cj.save("cookie.txt")
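One thing to watch for when saving: by default `save()` skips session cookies (cookies without an expiry date). I have not checked which kind this site sets, so the sketch below uses two hand-made cookies for example.com instead. `make_cookie` and `saved_names` are hypothetical helpers, and the cookie names are invented. If your "cookie.txt" turns out to be empty, passing `ignore_discard=True` may be what you need.

```python
import http.cookiejar
import os
import tempfile

def make_cookie(name, value, expires):
    """Build a cookie for example.com; expires=None means a session cookie."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain="example.com", domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=expires, discard=expires is None,
        comment=None, comment_url=None, rest={},
    )

def saved_names(jar, path, **save_options):
    """Save jar to path, read the file back, and return the cookie names."""
    jar.save(path, **save_options)
    loaded = http.cookiejar.MozillaCookieJar()
    loaded.load(path, ignore_discard=True)
    return sorted(cookie.name for cookie in loaded)

jar = http.cookiejar.MozillaCookieJar()
jar.set_cookie(make_cookie("remember_me", "1", 4102444800))  # far-future expiry
jar.set_cookie(make_cookie("session_id", "abc", None))       # session cookie

path = os.path.join(tempfile.mkdtemp(), "cookie.txt")
default_names = saved_names(jar, path)                    # session cookie skipped
all_names = saved_names(jar, path, ignore_discard=True)   # session cookie kept
```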
Step 7. The next time you need to log in, just load “cookie.txt”
cj = http.cookiejar.MozillaCookieJar()
cj.load("cookie.txt")
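Loading the file only fills the cookie jar. To actually send those cookies, the jar still has to be attached to an opener, as in Step 2. A minimal sketch of this, assuming a hypothetical helper named `opener_from_cookie_file` that also tolerates a missing file on the very first run:

```python
import http.cookiejar
import os
import urllib.request

def opener_from_cookie_file(path):
    """Return (opener, jar); the jar is pre-filled from path if it exists."""
    jar = http.cookiejar.MozillaCookieJar(path)
    if os.path.exists(path):
        jar.load()  # uses the filename given to the constructor
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar

opener, cj = opener_from_cookie_file("cookie.txt")
```

After that, `opener.open(...)` sends whatever cookies were loaded. Whether the site still accepts them depends on how long it keeps the login valid.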