欢迎来到尧图网

客户服务 关于我们

您的位置:首页 > 教育 > 锐评 > 记一次python逆向:反编译pyinstaller打包生成的exe

记一次python逆向:反编译pyinstaller打包生成的exe

2024/10/25 4:17:59 来源:https://blog.csdn.net/qq_42944740/article/details/140949376  浏览:    关键词:记一次python逆向:反编译pyinstaller打包生成的exe

首先声明一点,逆向这块我是纯小白,我是通过看各种博客,反复折腾才搞出来的,写的有错误的地方,还请各位大佬多多指正

0 前置条件

确认需要逆向的exe文件是不是通过pyinstaller打包的,如果不是,那么这个方法就是无效的

我们可以通过图标来确定,pyinstaller打包的图标一般如下两种

1 需要用到的环境及工具

(1)电脑需要有python环境,我用的是python3.9

(2)pyinstxtractor.py ---- 将.exe解包

官方GitHub地址 https://github.com/countercept/python-exe-unpacker/blob/master/pyinstxtractor.py

(3)uncompyle6 ---- .pyc转.py

# 安装可以加上清华镜像源,自行搜索安装即可
pip install uncompyle6

(4)WinHex ---- 16进制编辑(用其他也可以)

2 将exe文件进行解包

将pyinstxtractor.py文件和需要解包的exe文件(我这里叫xxxx.exx)放在同一个目录下,执行如下命令

python pyinstxtractor.py xxxx.exe

解包后会生成一个xxxx.exe_extracted文件夹,我们可以通过这个文件夹看到这个exe文件是通过什么版本的python打包的

可以看到,我这个exe文件是通过python3.8打包的,但具体是哪个版本的,就无法得知

在xxxx.exe_extracted文件中,我们只需要关注该目录下没有后缀名的文件和PYZ-00.pyz_extracted这个文件夹

没有后缀名的文件,在我这个例子中有两个: 1 和 struct

其中,这个1文件其实就是程序的入口文件,PYZ-00.pyz_extracted文件夹存放主程序的一些依赖库,包括自己写的

PYZ-00.pyz_extracted这个文件夹下的文件大概如下,分两种情况

1 文件都是以pyc结尾的,说明没有加密,我们可以用uncompyle6 将其反编译成.py文件(也就是源码)

2 文件都是以.pyc.encrypted 结尾的,说明经过了加密,感兴趣的可以研究如何解密,我这里就不说明了

3 将pyc文件转为py文件

将pyc文件转为py文件,最重要的是需要补全magic head。想要快速将pyc转为py文件,而不需要补全magic head,请直接看3.3节。

3.1 反编译程序入口文件

上面说过,在我这个例子中,1这个文件就是程序的入口文件,一般来说你的exe文件如果是demo.exe,那么你这个入口文件大概率是demo。

(1)重命名程序入口文件

上面说过1这个文件是没有后缀名的,需要给它加上.pyc的后缀名,变为1.pyc;同理,将struct文件改为struct.pyc

(2)用WinHex打开上面1.pyc 和 struct.pyc,对比字节码的差异,补全magic head

这里我就有点懵,因为我看其它博客struct.pyc文件打开后,第一行并不是E3开头,而是类似下面这种

这里无法根据struct.pyc进行补全,我又搜索其它博客,发现有人和我情况一样,struct.pyc打开直接就是E3开头,可惜没有人回答这个问题

不多说,直接上解决办法:在exe解包的目录下有一个base_library.zip压缩包,打开这个压缩包,随便找一个oyc文件,使用WinHex打开,这里我打开的是base_library中的abc.pyc文件

对比两个文件,发现 55 0d 0d 0a 01这第一行的字节码就是我们需要的magic head

我们直接把abc.pyc文件中的第一行复制,添加到1.pyc文件中的第一行

保存,然后用 uncompyle6 反编译

uncompyle6 -o 1.py 1.pyc

反编译后,没有报错,并且生成了1.py文件,打开1.pyc文件,能够看到源码,说明magic head是正确的

3.2 反编译依赖库

反编译程序入口后,我们可以看到,该程序依赖了一个demo的库,这个库肯定是作者自己写的

依赖的demo库在PYZ-00.pyz_extracted文件夹中可以找到,打开demo.pyc与1.pyc进行对比

这里我们可以发现,依赖库缺失的字节是在中间位置,这里我们看到缺少的是12-15位的四个字节 00 00 00 00

补全缺失的四个字节

保存,然后用 uncompyle6 反编译

uncompyle6 -o demo.py demo.pyc

反编译后,没有报错,并且生成了demo.py文件,打开demo.pyc文件,能够看到源码,说明我们补全的字节码是正确的

可以看到,作者这里使用了PyArmor对源码进行了加密处理,这种加密方式目前无解

但如果你发编译的代码没有经过类型的加密,那么你到这一步是可以完全解密出来对应的依赖库,当然,依赖库可能不止一个,但方法都是一样的

3.3 快速补全magic head

上面的3.1和3.2节,均是在教你如何补全通过分析文件来补全magic head,这对新手来说理解起来有些困难并且在操作过程中,如果遇到想我一个struct.pyc文件直接就是E3开头,更是天崩开局,,,,

所谓的快速补全magic head,其实就是使用最新版本的pyinstxtractor.py文件,经过实测,我发现最新版本的pyinstxtractor.py能够直接对程序入口pyc文件和依赖库pyc文件自动补全magic head,着实厉害。

最新的pyinstxtractor.py文件内容,我会贴在下方,避免有些伙伴无法连接GitHub

不用补全magic head后,你就可以直接使用 uncompyle6 反编译,帮助我们节约了不少时间

uncompyle6 -o xxx.py xxx.pyc

最新的pyinstxtractor内容如下

"""
PyInstaller Extractor v2.0 (Supports pyinstaller 6.9.0, 6.8.0, 6.7.0, 6.6.0, 6.5.0, 6.4.0, 6.3.0, 6.2.0, 6.1.0, 6.0.0, 5.13.2, 5.13.1, 5.13.0, 5.12.0, 5.11.0, 5.10.1, 5.10.0, 5.9.0, 5.8.0, 5.7.0, 5.6.2, 5.6.1, 5.6, 5.5, 5.4.1, 5.4, 5.3, 5.2, 5.1, 5.0.1, 5.0, 4.10, 4.9, 4.8, 4.7, 4.6, 4.5.1, 4.5, 4.4, 4.3, 4.2, 4.1, 4.0, 3.6, 3.5, 3.4, 3.3, 3.2, 3.1, 3.0, 2.1, 2.0)
Author : Extreme Coders
E-mail : extremecoders(at)hotmail(dot)com
Web    : https://0xec.blogspot.com
Date   : 26-March-2020
Url    : https://github.com/extremecoders-re/pyinstxtractorFor any suggestions, leave a comment on
https://forum.tuts4you.com/topic/34455-pyinstaller-extractor/This script extracts a pyinstaller generated executable file.
Pyinstaller installation is not needed. The script has it all.For best results, it is recommended to run this script in the
same version of python as was used to create the executable.
This is just to prevent unmarshalling errors(if any) while
extracting the PYZ archive.Usage : Just copy this script to the directory where your exe residesand run the script with the exe file name as a parameterC:\\path\\to\\exe\\>python pyinstxtractor.py <filename>
$ /path/to/exe/python pyinstxtractor.py <filename>Licensed under GNU General Public License (GPL) v3.
You are free to modify this source.CHANGELOG
================================================Version 1.1 (Jan 28, 2014)
-------------------------------------------------
- First Release
- Supports only pyinstaller 2.0Version 1.2 (Sept 12, 2015)
-------------------------------------------------
- Added support for pyinstaller 2.1 and 3.0 dev
- Cleaned up code
- Script is now more verbose
- Executable extracted within a dedicated sub-directory(Support for pyinstaller 3.0 dev is experimental)Version 1.3 (Dec 12, 2015)
-------------------------------------------------
- Added support for pyinstaller 3.0 final
- Script is compatible with both python 2.x & 3.x (Thanks to Moritz Kroll @ Avira Operations GmbH & Co. KG)Version 1.4 (Jan 19, 2016)
-------------------------------------------------
- Fixed a bug when writing pyc files >= version 3.3 (Thanks to Daniello Alto: https://github.com/Djamana)Version 1.5 (March 1, 2016)
-------------------------------------------------
- Added support for pyinstaller 3.1 (Thanks to Berwyn Hoyt for reporting)Version 1.6 (Sept 5, 2016)
-------------------------------------------------
- Added support for pyinstaller 3.2
- Extractor will use a random name while extracting unnamed files.
- For encrypted pyz archives it will dump the contents as is. Previously, the tool would fail.Version 1.7 (March 13, 2017)
-------------------------------------------------
- Made the script compatible with python 2.6 (Thanks to Ross for reporting)Version 1.8 (April 28, 2017)
-------------------------------------------------
- Support for sub-directories in .pyz files (Thanks to Moritz Kroll @ Avira Operations GmbH & Co. KG)Version 1.9 (November 29, 2017)
-------------------------------------------------
- Added support for pyinstaller 3.3
- Display the scripts which are run at entry (Thanks to Michael Gillespie @ malwarehunterteam for the feature request)Version 2.0 (March 26, 2020)
-------------------------------------------------
- Project migrated to github
- Supports pyinstaller 3.6
- Added support for Python 3.7, 3.8
- The header of all extracted pyc's are now automatically fixed
"""from __future__ import print_function
import os
import struct
import marshal
import zlib
import sys
from uuid import uuid4 as uniquenameclass CTOCEntry:def __init__(self, position, cmprsdDataSize, uncmprsdDataSize, cmprsFlag, typeCmprsData, name):self.position = positionself.cmprsdDataSize = cmprsdDataSizeself.uncmprsdDataSize = uncmprsdDataSizeself.cmprsFlag = cmprsFlagself.typeCmprsData = typeCmprsDataself.name = nameclass PyInstArchive:PYINST20_COOKIE_SIZE = 24           # For pyinstaller 2.0PYINST21_COOKIE_SIZE = 24 + 64      # For pyinstaller 2.1+MAGIC = b'MEI\014\013\012\013\016'  # Magic number which identifies pyinstallerdef __init__(self, path):self.filePath = pathself.pycMagic = b'\0' * 4self.barePycList = [] # List of pyc's whose headers have to be fixeddef open(self):try:self.fPtr = open(self.filePath, 'rb')self.fileSize = os.stat(self.filePath).st_sizeexcept:print('[!] Error: Could not open {0}'.format(self.filePath))return Falsereturn Truedef close(self):try:self.fPtr.close()except:passdef checkFile(self):print('[+] Processing {0}'.format(self.filePath))searchChunkSize = 8192endPos = self.fileSizeself.cookiePos = -1if endPos < len(self.MAGIC):print('[!] Error : File is too short or truncated')return Falsewhile True:startPos = endPos - searchChunkSize if endPos >= searchChunkSize else 0chunkSize = endPos - startPosif chunkSize < len(self.MAGIC):breakself.fPtr.seek(startPos, os.SEEK_SET)data = self.fPtr.read(chunkSize)offs = data.rfind(self.MAGIC)if offs != -1:self.cookiePos = startPos + offsbreakendPos = startPos + len(self.MAGIC) - 1if startPos == 0:breakif self.cookiePos == -1:print('[!] Error : Missing cookie, unsupported pyinstaller version or not a pyinstaller archive')return Falseself.fPtr.seek(self.cookiePos + self.PYINST20_COOKIE_SIZE, os.SEEK_SET)if b'python' in self.fPtr.read(64).lower():print('[+] Pyinstaller version: 2.1+')self.pyinstVer = 21     # pyinstaller 2.1+else:self.pyinstVer = 20     # pyinstaller 2.0print('[+] Pyinstaller version: 2.0')return Truedef getCArchiveInfo(self):try:if self.pyinstVer == 20:self.fPtr.seek(self.cookiePos, os.SEEK_SET)# Read CArchive cookie(magic, lengthofPackage, toc, tocLen, pyver) = \struct.unpack('!8siiii', self.fPtr.read(self.PYINST20_COOKIE_SIZE))elif self.pyinstVer == 21:self.fPtr.seek(self.cookiePos, os.SEEK_SET)# Read CArchive cookie(magic, lengthofPackage, toc, tocLen, pyver, pylibname) = \struct.unpack('!8sIIii64s', self.fPtr.read(self.PYINST21_COOKIE_SIZE))except:print('[!] Error : The file is not a pyinstaller archive')return Falseself.pymaj, self.pymin = (pyver//100, pyver%100) if pyver >= 100 else (pyver//10, pyver%10)print('[+] Python version: {0}.{1}'.format(self.pymaj, self.pymin))# Additional data after the cookietailBytes = self.fileSize - self.cookiePos - (self.PYINST20_COOKIE_SIZE if self.pyinstVer == 20 else self.PYINST21_COOKIE_SIZE)# Overlay is the data appended at the end of the PEself.overlaySize = lengthofPackage + tailBytesself.overlayPos = self.fileSize - self.overlaySizeself.tableOfContentsPos = self.overlayPos + tocself.tableOfContentsSize = tocLenprint('[+] Length of package: {0} bytes'.format(lengthofPackage))return Truedef parseTOC(self):# Go to the table of contentsself.fPtr.seek(self.tableOfContentsPos, os.SEEK_SET)self.tocList = []parsedLen = 0# Parse table of contentswhile parsedLen < self.tableOfContentsSize:(entrySize, ) = struct.unpack('!i', self.fPtr.read(4))nameLen = struct.calcsize('!iIIIBc')(entryPos, cmprsdDataSize, uncmprsdDataSize, cmprsFlag, typeCmprsData, name) = \struct.unpack( \'!IIIBc{0}s'.format(entrySize - nameLen), \self.fPtr.read(entrySize - 4))try:name = name.decode("utf-8").rstrip("\0")except UnicodeDecodeError:newName = str(uniquename())print('[!] Warning: File name {0} contains invalid bytes. Using random name {1}'.format(name, newName))name = newName# Prevent writing outside the extraction directoryif name.startswith("/"):name = name.lstrip("/")if len(name) == 0:name = str(uniquename())print('[!] Warning: Found an unamed file in CArchive. Using random name {0}'.format(name))self.tocList.append( \CTOCEntry(                      \self.overlayPos + entryPos, \cmprsdDataSize,             \uncmprsdDataSize,           \cmprsFlag,                  \typeCmprsData,              \name                        \))parsedLen += entrySizeprint('[+] Found {0} files in CArchive'.format(len(self.tocList)))def _writeRawData(self, filepath, data):nm = filepath.replace('\\', os.path.sep).replace('/', os.path.sep).replace('..', '__')nmDir = os.path.dirname(nm)if nmDir != '' and not os.path.exists(nmDir): # Check if path exists, create if notos.makedirs(nmDir)with open(nm, 'wb') as f:f.write(data)def extractFiles(self):print('[+] Beginning extraction...please standby')extractionDir = os.path.join(os.getcwd(), os.path.basename(self.filePath) + '_extracted')if not os.path.exists(extractionDir):os.mkdir(extractionDir)os.chdir(extractionDir)for entry in self.tocList:self.fPtr.seek(entry.position, os.SEEK_SET)data = self.fPtr.read(entry.cmprsdDataSize)if entry.cmprsFlag == 1:try:data = zlib.decompress(data)except zlib.error:print('[!] Error : Failed to decompress {0}'.format(entry.name))continue# Malware may tamper with the uncompressed size# Comment out the assertion in such a caseassert len(data) == entry.uncmprsdDataSize # Sanity Checkif entry.typeCmprsData == b'd' or entry.typeCmprsData == b'o':# d -> ARCHIVE_ITEM_DEPENDENCY# o -> ARCHIVE_ITEM_RUNTIME_OPTION# These are runtime options, not filescontinuebasePath = os.path.dirname(entry.name)if basePath != '':# Check if path exists, create if notif not os.path.exists(basePath):os.makedirs(basePath)if entry.typeCmprsData == b's':# s -> ARCHIVE_ITEM_PYSOURCE# Entry point are expected to be python scriptsprint('[+] Possible entry point: {0}.pyc'.format(entry.name))if self.pycMagic == b'\0' * 4:# if we don't have the pyc header yet, fix them in a later passself.barePycList.append(entry.name + '.pyc')self._writePyc(entry.name + '.pyc', data)elif entry.typeCmprsData == b'M' or entry.typeCmprsData == b'm':# M -> ARCHIVE_ITEM_PYPACKAGE# m -> ARCHIVE_ITEM_PYMODULE# packages and modules are pyc files with their header intact# From PyInstaller 5.3 and above pyc headers are no longer stored# https://github.com/pyinstaller/pyinstaller/commit/a97fdfif data[2:4] == b'\r\n':# < pyinstaller 5.3if self.pycMagic == b'\0' * 4: self.pycMagic = data[0:4]self._writeRawData(entry.name + '.pyc', data)else:# >= pyinstaller 5.3if self.pycMagic == b'\0' * 4:# if we don't have the pyc header yet, fix them in a later passself.barePycList.append(entry.name + '.pyc')self._writePyc(entry.name + '.pyc', data)else:self._writeRawData(entry.name, data)if entry.typeCmprsData == b'z' or entry.typeCmprsData == b'Z':self._extractPyz(entry.name)# Fix bare pyc's if anyself._fixBarePycs()def _fixBarePycs(self):for pycFile in self.barePycList:with open(pycFile, 'r+b') as pycFile:# Overwrite the first four bytespycFile.write(self.pycMagic)def _writePyc(self, filename, data):with open(filename, 'wb') as pycFile:pycFile.write(self.pycMagic)            # pyc magicif self.pymaj >= 3 and self.pymin >= 7:                # PEP 552 -- Deterministic pycspycFile.write(b'\0' * 4)        # BitfieldpycFile.write(b'\0' * 8)        # (Timestamp + size) || hash else:pycFile.write(b'\0' * 4)      # Timestampif self.pymaj >= 3 and self.pymin >= 3:pycFile.write(b'\0' * 4)  # Size parameter added in Python 3.3pycFile.write(data)def _extractPyz(self, name):dirName =  name + '_extracted'# Create a directory for the contents of the pyzif not os.path.exists(dirName):os.mkdir(dirName)with open(name, 'rb') as f:pyzMagic = f.read(4)assert pyzMagic == b'PYZ\0' # Sanity CheckpyzPycMagic = f.read(4) # Python magic valueif self.pycMagic == b'\0' * 4:self.pycMagic = pyzPycMagicelif self.pycMagic != pyzPycMagic:self.pycMagic = pyzPycMagicprint('[!] Warning: pyc magic of files inside PYZ archive are different from those in CArchive')# Skip PYZ extraction if not running under the same python versionif self.pymaj != sys.version_info.major or self.pymin != sys.version_info.minor:print('[!] Warning: This script is running in a different Python version than the one used to build the executable.')print('[!] Please run this script in Python {0}.{1} to prevent extraction errors during unmarshalling'.format(self.pymaj, self.pymin))print('[!] Skipping pyz extraction')return(tocPosition, ) = struct.unpack('!i', f.read(4))f.seek(tocPosition, os.SEEK_SET)try:toc = marshal.load(f)except:print('[!] Unmarshalling FAILED. Cannot extract {0}. Extracting remaining files.'.format(name))returnprint('[+] Found {0} files in PYZ archive'.format(len(toc)))# From pyinstaller 3.1+ toc is a list of tuplesif type(toc) == list:toc = dict(toc)for key in toc.keys():(ispkg, pos, length) = toc[key]f.seek(pos, os.SEEK_SET)fileName = keytry:# for Python > 3.3 some keys are bytes object some are str objectfileName = fileName.decode('utf-8')except:pass# Prevent writing outside dirNamefileName = fileName.replace('..', '__').replace('.', os.path.sep)if ispkg == 1:filePath = os.path.join(dirName, fileName, '__init__.pyc')else:filePath = os.path.join(dirName, fileName + '.pyc')fileDir = os.path.dirname(filePath)if not os.path.exists(fileDir):os.makedirs(fileDir)try:data = f.read(length)data = zlib.decompress(data)except:print('[!] Error: Failed to decompress {0}, probably encrypted. Extracting as is.'.format(filePath))open(filePath + '.encrypted', 'wb').write(data)else:self._writePyc(filePath, data)def main():if len(sys.argv) < 2:print('[+] Usage: pyinstxtractor.py <filename>')else:arch = PyInstArchive(sys.argv[1])if arch.open():if arch.checkFile():if arch.getCArchiveInfo():arch.parseTOC()arch.extractFiles()arch.close()print('[+] Successfully extracted pyinstaller archive: {0}'.format(sys.argv[1]))print('')print('You can now use a python decompiler on the pyc files within the extracted directory')returnarch.close()if __name__ == '__main__':main()

版权声明:

本网仅为发布的内容提供存储空间,不对发表、转载的内容提供任何形式的保证。凡本网注明“来源:XXX网络”的作品,均转载自其它媒体,著作权归作者所有,商业转载请联系作者获得授权,非商业转载请注明出处。

我们尊重并感谢每一位作者,均已注明文章来源和作者。如因作品内容、版权或其它问题,请及时与我们联系,联系邮箱:809451989@qq.com,投稿邮箱:809451989@qq.com