itsPG.org

PG @ NCTU SenseLab

[Programming] Using http.cookiejar and urllib.request to POST a Login Form to an HTTP Web Page in Python 3.2

My English is poor, but when I googled this topic I could not find any useful information apart from the official manual. So I am writing this down in English, in the hope that this post helps more people.

Keywords: cookielib, http.cookiejar, urllib.request, login, HTTP web page, Python 3.2, log in to a web page with Python 3.2, POST

I will use this website as an example: share.dmhy.org

On this website, you need to enter your account, password, and a captcha code to log in.

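Step 1. Import the libraries and define some variables.

# urllib.parse is imported explicitly because Step 5 calls urllib.parse.urlencode.
import http.cookiejar, urllib.request, urllib.parse
import re, os
username = "PG@example.com"
password = "123456"
captcha_code = ""  # filled in at Step 4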

Step 2. Create the opener, which uses a cookie jar to process cookies.

cj = http.cookiejar.MozillaCookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
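
To see what the cookie processor does with the jar, here is a minimal offline sketch. It puts a made-up cookie into a jar (the name `sid` and the value `abc123` are hypothetical, not something the site is known to send) and asks the jar which Cookie header it would attach to a request. The helper names `make_cookie` and `cookie_header_for` are my own. Nothing is sent over the network.

```python
import http.cookiejar
import urllib.request


def make_cookie(name, value, domain):
    """Build a simple session cookie by hand (illustrative values only)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def cookie_header_for(url, jar):
    """Return the Cookie header the jar would add to a request for url."""
    request = urllib.request.Request(url)
    # HTTPCookieProcessor makes this same call before a request is sent.
    jar.add_cookie_header(request)
    return request.get_header("Cookie")


jar = http.cookiejar.MozillaCookieJar()
jar.set_cookie(make_cookie("sid", "abc123", "share.dmhy.org"))

# The cookie is offered to its own host but not to an unrelated one.
print(cookie_header_for("http://share.dmhy.org/user/login", jar))
print(cookie_header_for("http://example.com/", jar))
```

In the real script you never call add_cookie_header yourself. Every request made through the opener goes through the same jar, which is why the captcha request and the login request in the next steps share one session.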

Step 3. Open the login page through the opener, fetch the captcha through the same opener, and save the captcha as a picture.

first_page = opener.open("http://share.dmhy.org/user/login").read().decode("utf-8")
# Raw string, so the backslash before "?" reaches the regex engine unchanged.
m = re.search(r"common/generate-captcha\?code=([0-9]+)", first_page)
captcha_pic = opener.open("http://share.dmhy.org/common/generate-captcha?code=" + m.group(1)).read()
# Write the image bytes and close the file right away.
with open("PG.jpg", "wb") as f:
    f.write(captcha_pic)
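
The regular expression in this step can be tried without touching the network. The sketch below runs it against a hypothetical fragment of the login page (the HTML and the number in it are invented for illustration) and returns None instead of raising when the captcha link is missing. The function name `extract_captcha_code` is my own.

```python
import re

# Same pattern as in Step 3, compiled once.
CAPTCHA_RE = re.compile(r"common/generate-captcha\?code=([0-9]+)")


def extract_captcha_code(html):
    """Return the digits after 'code=' or None if the link is missing."""
    match = CAPTCHA_RE.search(html)
    return match.group(1) if match else None


# Hypothetical fragment of the login page, for illustration only.
sample = '<img src="/common/generate-captcha?code=1337421" alt="captcha">'
print(extract_captcha_code(sample))
print(extract_captcha_code("<p>no captcha here</p>"))
```

Checking for None before calling group(1) gives a clearer failure than an AttributeError if the site changes its page layout.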

Step 4. Ask the user for the captcha.

os.system("start PG.jpg")  # "start" is a Windows shell command
captcha_code = input("Captcha : ")
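
The `start` command only exists on Windows. If you run the script elsewhere, one possible approach is to pick the viewer command by platform, as in this sketch. It assumes that `open` is available on macOS and that `xdg-open` is installed on Linux desktops. The helper names `viewer_command` and `show_image` are my own.

```python
import subprocess
import sys


def viewer_command(path, platform=sys.platform):
    """Return the command that opens path in the default image viewer."""
    if platform.startswith("win"):
        # "start" is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a Linux desktop with xdg-utils installed.
    return ["xdg-open", path]


def show_image(path):
    """Open the image and return the exit status of the viewer command."""
    return subprocess.call(viewer_command(path))


print(viewer_command("PG.jpg", "win32"))
print(viewer_command("PG.jpg", "darwin"))
print(viewer_command("PG.jpg", "linux"))
```

In the script above you would call show_image("PG.jpg") in place of os.system("start PG.jpg").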

Step 5. Send the POST request.

login_data = urllib.parse.urlencode({
    'email' : username,
    'password' : password,
    'login_node' : '0',
    'cookietime' : '315360000',
    'captcha_code' : captcha_code
}).encode("utf-8")
url = urllib.request.Request("http://share.dmhy.org/user/login", login_data)
url.add_header("User-Agent","Chrome/18.1.2.3")

Note that the User-Agent header is optional.
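
Before sending anything, it can help to look at what the request contains. The sketch below builds the same request inside a function (the name `build_login_request` and the captcha value "1234" are placeholders of mine) and prints its method, body, and header. It does not contact the server.

```python
import urllib.parse
import urllib.request

LOGIN_URL = "http://share.dmhy.org/user/login"


def build_login_request(username, password, captcha_code):
    """Build the login request from Step 5 without sending it."""
    form = {
        "email": username,
        "password": password,
        "login_node": "0",
        "cookietime": "315360000",
        "captcha_code": captcha_code,
    }
    # urlencode escapes special characters; the body must be bytes.
    body = urllib.parse.urlencode(form).encode("utf-8")
    request = urllib.request.Request(LOGIN_URL, body)
    request.add_header("User-Agent", "Chrome/18.1.2.3")
    return request


req = build_login_request("PG@example.com", "123456", "1234")
print(req.get_method())              # a request with a body is sent as POST
print(req.data.decode("utf-8"))      # the url-encoded form
print(req.get_header("User-agent"))  # urllib stores header names capitalized
```

Passing a data argument is what turns the request into a POST. Without it, urllib.request sends a GET.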

Step 6. Read the response data, show the cookies, and save the cookies to “cookie.txt”.

ResponseData = opener.open(url).read().decode("utf8", 'ignore')
print(ResponseData)
for ind, cookie in enumerate(cj):
    print("%d - %s" %(ind, cookie))
cj.save("cookie.txt")

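Step 7. The next time you need to log in, you just need to load “cookie.txt”.

cj = http.cookiejar.MozillaCookieJar()
cj.load("cookie.txt")
# Rebuild the opener so that later requests carry the loaded cookies.
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))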
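
One thing to watch for when saving and loading: by default, save() and load() skip session cookies, meaning cookies without an expiry date that are marked to be discarded. I have not checked which kind this site sends, so the sketch below uses a made-up session cookie and a temporary file to show the difference the ignore_discard argument makes. The helper names are my own.

```python
import http.cookiejar
import os
import tempfile


def make_session_cookie(name, value, domain):
    """Build a cookie with no expiry date (illustrative values only)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def save_and_reload(cookie_file):
    """Save one session cookie, then load it back with and without the flag."""
    jar = http.cookiejar.MozillaCookieJar()
    jar.set_cookie(make_session_cookie("sid", "abc123", "share.dmhy.org"))
    jar.save(cookie_file, ignore_discard=True)  # keep session cookies too

    default_jar = http.cookiejar.MozillaCookieJar()
    default_jar.load(cookie_file)  # default: session cookies are skipped

    keep_jar = http.cookiejar.MozillaCookieJar()
    keep_jar.load(cookie_file, ignore_discard=True)
    return len(default_jar), len(keep_jar)


with tempfile.TemporaryDirectory() as tmp:
    counts = save_and_reload(os.path.join(tmp, "cookie.txt"))
print(counts)  # (cookies loaded by default, cookies loaded with the flag)
```

If “cookie.txt” turns out to be empty after Step 6, passing ignore_discard=True to both cj.save and cj.load is worth trying.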
