Garbled file names when extracting Windows zip files on a Mac

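Problems:
1. When a zip file created on Windows has file names containing Traditional Chinese characters, and it is extracted on a Mac with the default method (right-clicking the archive, or double-clicking it), the extracted names come out garbled, like this: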

\267\261\363w\327\326\316\304\231n

Names in Simplified Chinese are not affected.
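To see what the escaped bytes in that garbled name are, here is a minimal Python 3 sketch. It is not part of the original steps. It assumes the archive stored the name in GBK, which matches the encoding that later turns out to work with unar.

```python
# The octal escapes from the garbled name above, copied verbatim into a bytes literal.
raw = b"\267\261\363w\327\326\316\304\231n"

# Interpreted as GBK, the bytes form a readable Traditional Chinese name.
print(raw.decode("gbk"))

# The same bytes are not valid UTF-8, so a tool that expects UTF-8 names cannot display them.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    print("not valid UTF-8:", exc.reason)
```

In other words, the file name bytes themselves are intact. The extractor simply interprets them with the wrong encoding.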

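2. When a zip file created on Windows has file names containing Chinese characters (Simplified or Traditional), extracting it on a Mac with the "unzip" command in the terminal always produces garbled names (unzip offers no option here for specifying the file-name encoding). For example: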

// Compressed on Windows: 我的文件/你好.txt --> 我的文件.zip
// Extracting via right-click works correctly;
// extracting on the command line with unzip 我的文件.zip prints:
xxxs:ziptest xxxx$ unzip 我的文件.zip
Archive:  我的文件.zip
   creating: +��-+-+�/
 extracting: +��-+-+�/-Ҧ+.txt

In Finder, "我的文件" shows up as "+%CA%C1-+-+%A6", and "你好" shows up as "-Ҧ+".

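Solution:
Download The Unarchiver from the Mac App Store and install it.
After installation, a The Unarchiver option appears when you right-click an archive. Extracting with it no longer garbles Traditional Chinese names (you may have to choose an encoding).
For convenience, it is best to also install unar, The Unarchiver's command line tool, by running this in the terminal: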

$ brew install unar

Once unar is installed, you can extract archives from the command line:

// unzip produces garbled names
xxxx:ziptest xxxx$ unzip 我的文件.zip
Archive:  我的文件.zip
   creating: +��-+-+�/
 extracting: +��-+-+�/-Ҧ+.txt
// unar extracts the names correctly
xxxx:ziptest xxxx$ unar 我的文件.zip
我的文件.zip: Zip
  我的文件/  (dir)... OK.
  我的文件/你好.txt  (0 B)... OK.
Successfully extracted to "./我的文件".

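For Traditional Chinese names, extracting through The Unarchiver's right-click option may require choosing an encoding: Chinese (GBK or GB 18030) or Simplified Chinese (GB 2312 or Windows, DOS).
When an encoding has to be specified, extracting with unar on the command line requires the -encoding option: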

xxxx:ziptest xxxx$ unar -encoding GBK 繁體字目錄.zip
繁體字目錄.zip: Zip
  繁體字目錄/  (dir)... OK.
  繁體字目錄/繁體字文檔.txt  (0 B)... OK.
Successfully extracted to "./繁體字目錄".

xxxx:ziptest xxxx$ unar -encoding GB\ 18030 繁體字目錄.zip
繁體字目錄.zip: Zip
  繁體字目錄/  (dir)... OK.
  繁體字目錄/繁體字文檔.txt  (0 B)... OK.
Successfully extracted to "./繁體字目錄".

// Without an explicit encoding, the extracted names are garbled:
xxxx:ziptest xxxx$ unar  繁體字目錄.zip
繁體字目錄.zip: Zip
  ╥╠Сwвжд©Д⌡/  (dir)... OK.
  ╥╠Сwвжд©Д⌡/╥╠Сwвжнд≥n.txt  (0 B)... OK.
Successfully extracted to "./╥╠Сwвжд©Д⌡".
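When it is unclear which encoding to pass, one option is to test a few candidates against the names stored in the archive. The following Python 3 sketch is an illustration only. The helper names `raw_name` and `candidate_encodings` and the candidate list are assumptions, and the results are Python codec names, which may not be spelled the same way as the names unar accepts.

```python
import zipfile


def raw_name(info):
    """Return the file name bytes as they are stored in the archive."""
    if info.flag_bits & 0x800:
        # General-purpose bit 11 is set: the name was stored as UTF-8.
        return info.filename.encode("utf-8")
    # Otherwise zipfile decoded the stored bytes as cp437; undo that step.
    return info.filename.encode("cp437")


def candidate_encodings(zip_path, candidates=("utf-8", "gbk", "big5")):
    """Return the candidates that decode every stored name without an error."""
    with zipfile.ZipFile(zip_path) as zf:
        names = [raw_name(info) for info in zf.infolist()]
    usable = []
    for encoding in candidates:
        try:
            for name in names:
                name.decode(encoding)
        except UnicodeDecodeError:
            continue
        usable.append(encoding)
    return usable
```

A successful decode does not prove the result is readable, so more than one candidate may pass. Treat the returned list as a shortlist to try, not as a definitive answer.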

-------------

Another approach is to convert the file names with Python.
The Python code (written for Python 2) is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# gbk-unzip.py (Python 2 only)

import os
import sys
import zipfile

# Python 2 hack: make UTF-8 the default codec so the decoded names can be printed.
reload(sys)
sys.setdefaultencoding('utf-8')
print sys.getdefaultencoding()

print "Processing File" + sys.argv[1]

zf = zipfile.ZipFile(sys.argv[1], "r")
for name in zf.namelist():
    # The names are stored as GBK bytes; decode them to unicode.
    utf8name = name.decode("gbk")
    print "Extracting " + utf8name
    pathname = os.path.dirname(utf8name)
    if not os.path.exists(pathname) and pathname != "":
        os.makedirs(pathname)
    data = zf.read(name)
    # Directory entries already exist at this point, so they are skipped here.
    if not os.path.exists(utf8name):
        fo = open(utf8name, "wb")  # binary mode: write the bytes unchanged
        fo.write(data)
        fo.close()
zf.close()

Save the code above as gbk-unzip.py and run it in the terminal:

python gbk-unzip.py xxxx.zip

// Example:
xxxx:ziptest xxxx $ python gbk-unzip.py 我的文件.zip
utf-8
Processing File我的文件.zip
Extracting 我的文件/
Extracting 我的文件/你好.txt

xxxx:ziptest xxxx $ python gbk-unzip.py 繁體字目錄.zip
utf-8
Processing File繁體字目錄.zip
Extracting 繁體字目錄/
Extracting 繁體字目錄/繁體字文檔.txt

With this script there is no need to specify the encoding separately, because it always decodes the names as GBK.
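The script above runs only on Python 2. For Python 3, a rough equivalent might look like the sketch below. It is an assumption-laden illustration, not the original author's code. The function names `decoded_name` and `extract_legacy_zip` are made up for this example, and it relies on the zipfile module decoding names as cp437 when the archive does not flag them as UTF-8.

```python
import os
import sys
import zipfile


def decoded_name(info, encoding="gbk"):
    """Return the entry name, re-decoding legacy (non-UTF-8) names with `encoding`."""
    if info.flag_bits & 0x800:
        return info.filename  # flagged as UTF-8, already decoded correctly
    # zipfile decoded the stored bytes as cp437; recover them and decode again.
    return info.filename.encode("cp437").decode(encoding)


def extract_legacy_zip(zip_path, dest=".", encoding="gbk"):
    """Extract zip_path into dest using names decoded with `encoding`."""
    extracted = []
    dest_root = os.path.realpath(dest)
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            name = decoded_name(info, encoding)
            target = os.path.realpath(os.path.join(dest_root, name))
            # Refuse entries that would land outside dest (for example "../x").
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError("unsafe path in archive: " + name)
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
            elif not os.path.exists(target):
                # Like the script above, never overwrite an existing file.
                os.makedirs(os.path.dirname(target), exist_ok=True)
                with zf.open(info) as src, open(target, "wb") as out:
                    out.write(src.read())
            extracted.append(name)
    return extracted


if __name__ == "__main__" and len(sys.argv) > 1:
    for entry in extract_legacy_zip(sys.argv[1]):
        print("Extracting", entry)
```

Saved under a hypothetical name such as gbk-unzip3.py, it would be run as python3 gbk-unzip3.py 我的文件.zip. The encoding defaults to GBK to match the script above, and another codec can be passed through the `encoding` argument.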

Last edited: 2018.01.19 11:49:24