itsPG.org

PG @ NCTU SenseLab

[Programming] A UTF-8 fix for Python 3.2 on Windows: solving "can't decode" errors and garbled text

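When I used Ruby to work with folders on Windows, I ran into a similar problem. I googled for ages without finding a usable fix. At first I assumed Ruby's libraries on Windows just weren't mature enough, and after many failed attempts I set Ruby aside for the time being.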

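To my surprise, when I recently started using Python I found it has a similar problem. I later found the root cause on StackOverflow: in the end it is a Windows problem. Fortunately, the person who answered attached his hack for the Windows terminal. With it, Python can display UTF-8 characters correctly on Windows.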

The following is quoted from this StackOverflow answer:

It seems like all answers so far are from Unix people who assume the Windows console is like a Unix terminal, which it is not.

The problem is that you can’t write Unicode output to the Windows console using the normal underlying file I/O functions. The Windows API WriteConsole needs to be used. Python should probably be doing this transparently, but it isn’t.

There’s a different problem if you redirect the output to a file: Windows text files are historically in the ANSI codepage, not Unicode. You can fairly safely write UTF-8 to text files in Windows these days, but Python doesn’t do that by default.

I think it should do these things, but here’s some code to make it happen. You don’t have to worry about the details if you don’t want to; just call ConsoleFile.wrap_standard_handles(). You do need PyWin installed to get access to the necessary APIs.

The answer also includes his solution. Paste it into your program and call ConsoleFile.wrap_standard_handles(), and it will fix things for you.

import os, sys, io, win32api, win32console, pywintypes

def change_file_encoding(f, encoding):
    """
    TextIOWrapper is missing a way to change the file encoding, so we have to
    do it by creating a new one.
    """

    errors = f.errors
    line_buffering = f.line_buffering
    # f.newlines is not the same as the newline parameter to TextIOWrapper.
    # newlines = f.newlines

    buf = f.detach()

    # TextIOWrapper defaults newline to \r\n on Windows, even though the underlying
    # file object is already doing that for us.  We need to explicitly say "\n" to
    # make sure we don't output \r\r\n; this is the same as the internal function
    # create_stdio.
    return io.TextIOWrapper(buf, encoding, errors, "\n", line_buffering)


class ConsoleFile:
    class FileNotConsole(Exception): pass

    def __init__(self, handle):
        handle = win32api.GetStdHandle(handle)
        self.screen = win32console.PyConsoleScreenBufferType(handle)
        try:
            self.screen.GetConsoleMode()
        except pywintypes.error as e:
            raise ConsoleFile.FileNotConsole

    def write(self, s):
        self.screen.WriteConsole(s)

    def close(self): pass
    def flush(self): pass
    def isatty(self): return True

    @staticmethod
    def wrap_standard_handles():
        sys.stdout.flush()
        try:
            # There seems to be no binding for _get_osfhandle.
            sys.stdout = ConsoleFile(win32api.STD_OUTPUT_HANDLE)
        except ConsoleFile.FileNotConsole:
            sys.stdout = change_file_encoding(sys.stdout, "utf-8")

        sys.stderr.flush()
        try:
            sys.stderr = ConsoleFile(win32api.STD_ERROR_HANDLE)
        except ConsoleFile.FileNotConsole:
            sys.stderr = change_file_encoding(sys.stderr, "utf-8")

ConsoleFile.wrap_standard_handles()
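The fallback path in change_file_encoding (detach the old wrapper, then build a new TextIOWrapper on the same buffer) can be tried without a Windows console or pywin32. The following is only a sketch of that idea: the helper name rewrap_as_utf8 and the in-memory BytesIO standing in for a redirected stdout are my own assumptions for illustration, not part of the quoted answer.

```python
import io


def rewrap_as_utf8(text_stream):
    """Return a new text wrapper that writes UTF-8 to the same binary buffer."""
    errors = text_stream.errors
    line_buffering = text_stream.line_buffering
    text_stream.flush()
    raw = text_stream.detach()  # the old wrapper is unusable after this call
    # newline="\n" disables newline translation in the new wrapper.
    return io.TextIOWrapper(raw, encoding="utf-8", errors=errors,
                            newline="\n", line_buffering=line_buffering)


# Hypothetical stand-in for a redirected stdout: a byte buffer wrapped as ASCII.
buffer = io.BytesIO()
ascii_stream = io.TextIOWrapper(buffer, encoding="ascii")

utf8_stream = rewrap_as_utf8(ascii_stream)
utf8_stream.write("中文 ok\n")  # would raise UnicodeEncodeError on the ASCII wrapper
utf8_stream.flush()

written = buffer.getvalue()
print(written)
```

On Python 3.7 and later, sys.stdout.reconfigure(encoding="utf-8") covers the same need without creating a new wrapper, but that method does not exist in Python 3.2, which is what this post targets.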

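Note that you must install pywin32 before using this hack. At the time of writing, the latest version is Build 217. Download the file that matches your Windows version and your Python version. My first install got stuck halfway, and I eventually found out that when I had installed Python earlier, I had accidentally put the 32-bit build of Python on 64-bit Windows 7.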
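Since the pywin32 installer has to match the interpreter rather than the operating system, it can help to check which Python is actually installed before downloading. A minimal sketch using only the standard library (it is not part of the original answer, and the variable name pointer_bits is just for illustration):

```python
import platform
import struct
import sys

# Size of a C pointer in the running interpreter, in bits (32 or 64).
pointer_bits = struct.calcsize("P") * 8

print("Python version:", platform.python_version())
print("Interpreter is %d-bit" % pointer_bits)
# A second, independent check: sys.maxsize exceeds 2**32 only on 64-bit builds.
print("64-bit interpreter:", sys.maxsize > 2**32)
```

If this reports 32-bit, the 32-bit pywin32 build is the one to pick, even on 64-bit Windows.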

Also, I suggest wrapping the hack's opening import os, sys, io, win32api, win32console, pywintypes and its closing ConsoleFile.wrap_standard_handles() in a check for the operating system, which helps when writing cross-platform Python programs.

import sys  # must be imported before sys.platform can be checked

if sys.platform == "win32":
    import os, io, win32api, win32console, pywintypes

and

if sys.platform == "win32":
    ConsoleFile.wrap_standard_handles()
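The two checks above can also be folded into one helper that additionally tolerates a Windows machine where pywin32 is not installed. This is only a sketch: the function name pywin32_console_available is hypothetical, and the final call is left commented out because ConsoleFile comes from the snippet quoted earlier.

```python
import sys


def pywin32_console_available():
    """Return True only on Windows with pywin32's win32console importable."""
    if sys.platform != "win32":
        return False
    try:
        import win32console  # provided by pywin32
    except ImportError:
        return False
    return True


if pywin32_console_available():
    # ConsoleFile is the class from the quoted answer; uncomment once it is defined.
    # ConsoleFile.wrap_standard_handles()
    pass

print("console fix available:", pywin32_console_available())
```

On other platforms the helper simply returns False and the script carries on with the default sys.stdout.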

That's it. Enjoy your Python programming :)
