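Preface
Every ipa package contains an executable file, and that file is a Mach-O file.
PE is the executable file format on Windows, ELF is the executable file format on Linux, and Mach-O is the executable file format on iOS and OS X.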
file xx
➜ ~ file tmp.arm64
tmp.arm64: Mach-O 64-bit dynamically linked shared library arm64
➜ ~ file ~/bin/kngit
/Users/devzkn/bin/kngit: POSIX shell script text executable, UTF-8 Unicode text
➜ Payload file NoName.app/NoName
NoName.app/NoName: Mach-O executable arm_v7
Mach-O
        ,---------------------------,
 Header | Mach header               |
        |   Segment 1               |
        |     Section 1 (__text)    | --,
        |---------------------------|   |
 Data   |           blob            | <-'
        '---------------------------'
*Brief description taken from Wikipedia:
Mach-O, short for Mach object file format, is a file format for executables, object code, shared libraries, dynamically-loaded code, and core dumps. A replacement for the a.out format, Mach-O offers more extensibility and faster access to information in the symbol table.
Mach-O is used by most systems based on the Mach kernel. NeXTSTEP, OS X, and iOS are examples of systems that have used this format for native executables, libraries and object code.
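- The executable file format on Apple systems (including macOS and iOS)
<!-- Mach-O is short for Mach object file format -->
1. The format consists of three parts: the header, the load commands, and the actual data (segments and sections).
2. It is a file format for executables, object code, shared libraries, dynamically loaded code, and core dumps. As a replacement for the a.out format, Mach-O offers better extensibility and faster access to information in the symbol table.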
- Header - contains general information about the binary: byte order (magic number), CPU type, number of load commands, etc.
- Load Commands - a kind of table of contents that describes the position of segments, the symbol table, the dynamic symbol table, etc. Each load command includes meta-information, such as the type of command, its name, its position in the binary, and so on. The load commands tell the operating system how to load the data in the file, and they guide both the kernel loader and the dynamic linker.
- Data - usually the biggest part of an object file. It contains code and data, such as symbol tables, dynamic symbol tables, and so on.
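The header fields described above can be read with a few lines of Python. This is a minimal sketch, not a full parser: it assumes a thin (single-architecture) file, reads only the seven 32-bit fields shared by `mach_header` and `mach_header_64`, and the function name `parse_mach_header` is made up for this example.

```python
import struct

# Magic numbers as defined in <mach-o/loader.h> and <mach-o/fat.h>.
MH_MAGIC = 0xFEEDFACE      # 32-bit Mach-O
MH_MAGIC_64 = 0xFEEDFACF   # 64-bit Mach-O
FAT_MAGIC = 0xCAFEBABE     # universal (fat) binary, stored big-endian

FIELDS = ("magic", "cputype", "cpusubtype", "filetype",
          "ncmds", "sizeofcmds", "flags")


def parse_mach_header(data: bytes) -> dict:
    """Decode the fixed-size Mach header at the start of `data`."""
    (magic_le,) = struct.unpack_from("<I", data, 0)
    (magic_be,) = struct.unpack_from(">I", data, 0)
    if magic_be == FAT_MAGIC:
        raise ValueError("fat binary: extract one architecture slice first")
    if magic_le in (MH_MAGIC, MH_MAGIC_64):
        endian = "<"   # the magic reads correctly as little-endian
    elif magic_be in (MH_MAGIC, MH_MAGIC_64):
        endian = ">"   # byte-swapped file
    else:
        raise ValueError("not a Mach-O file")
    header = dict(zip(FIELDS, struct.unpack_from(endian + "7I", data, 0)))
    header["is_64"] = header["magic"] == MH_MAGIC_64
    # mach_header_64 has one extra reserved 32-bit field.
    header["header_size"] = 32 if header["is_64"] else 28
    return header


if __name__ == "__main__":
    # Synthetic 32-bit ARM header, so the sketch runs without a real binary.
    sample = struct.pack("<7I", MH_MAGIC, 12, 9, 2, 63, 6144, 0x00218085)
    print(parse_mach_header(sample))
```

For a real file, pass the bytes read with `open(path, "rb").read()`. The load commands start right after `header_size` bytes.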
I. Analyzing Mach-O
1.1 Mach-O & Universal Binary Parser
Implemented in Python.
- macholibre: a tool that parses Mach-O file information and outputs JSON
devzkndeMacBook-Pro:~ devzkn$ pyenv global 3.7.0b2
devzkndeMacBook-Pro:~ devzkn$ pyenv global
3.7.0b2
devzkndeMacBook-Pro:~ devzkn$ pip3 install git+https://github.com/aaronst/macholibre.git
- ` /Users/devzkn/.pyenv/shims/macholibre -o output.json ~/tmp`
devzkndeMacBook-Pro:~ devzkn$ which macholibre
/Users/devzkn/.pyenv/shims/macholibre
devzkndeMacBook-Pro:~ devzkn$ ls -lrt /Users/devzkn/.pyenv/shims/
1.2 Walk the segments and sections in the load commands to get methods and method names
cli
Because objc_msgSend looks methods up by selector name at runtime, obfuscation only really counts once __TEXT.__objc_methname is obfuscated.
1.3 MachOView.app
A Mach-O Load Command deobfuscator.
* MH_MAGIC: FEEDFACE
* Architecture: i386
* Load Commands: 33
* Checking: __TEXT:__text
* Checking: __TEXT:__picsymbolstub4__TEXT
* Checking: __TEXT:__stub_helper
* Checking: __TEXT:__objc_methname
* Checking: __TEXT:__ustring
* Checking: __TEXT:__cstring
* Checking: __TEXT:__objc_classname__TEXT
* Checking: __TEXT:__objc_methtype
* Checking: __TEXT:__gcc_except_tab__TEXT
* Checking: __DATA:__nl_symbol_ptr
* Checking: __DATA:__la_symbol_ptr
* Checking: __DATA:__mod_init_func
* Checking: __DATA:__const
* Checking: __DATA:__cfstring
* Checking: __DATA:__objc_classlist__DATA
* Checking: __DATA:__objc_protolist__DATA
* Checking: __DATA:__objc_imageinfo__DATA
* Checking: __DATA:__objc_const
* Checking: __DATA:__objc_selrefs
* Checking: __DATA:__objc_classrefs__DATA
* Checking: __DATA:__objc_superrefs__DATA
* Checking: __DATA:__objc_ivar
* Checking: __DATA:__objc_data
* Checking: __DATA:__data
* Checking: __DATA:__bss
> Fix: Offset 0x0000000000000000 -> 0x000000000004f574
> Fix: Size 0x000000000000004c -> 0x0000000000000a8c
mach_override
iTunes-App-Store-Crawler/itunes_app_store_scraper_multithread.py
1.4 otool
-h -v
➜ Payload otool -h NoName.app/NoName
Mach header
      magic cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
 0xfeedface      12          9  0x00           2    63       6144 0x00218085
➜ Payload otool -hv NoName.app/NoName
Mach header
      magic cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
   MH_MAGIC     ARM         V7  0x00     EXECUTE    63       6144   NOUNDEFS DYLDLINK TWOLEVEL WEAK_DEFINES BINDS_TO_WEAK PIE
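- magic: 0xfeedface (MH_MAGIC), the Mach-O magic number. 0xfeedface, 0xcafebabe, and 0xfeedfacf correspond to 32-bit (armv7 here), fat (universal), and 64-bit binaries respectively.
- cputype: 12 (ARM), the CPU architecture.
- cpusubtype: 9 (V7), the CPU architecture sub-version (for example 64 or V7).
- filetype: 2 (EXECUTE), the file type (other values include OBJECT and DYLIB).
- ncmds: 63, the number of load commands.
- sizeofcmds: 6144, the total size of all load commands.
- flags: 0x00218085, flags that dyld needs when loading. PIE means address space layout randomization (ASLR) is enabled.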
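To see how the numeric flags value maps to the names printed by `otool -hv`, the bit mask can be decoded by hand. The sketch below covers only a subset of the `MH_*` flag bits from `<mach-o/loader.h>`, and `decode_flags` is a name made up for this example.

```python
# Subset of the MH_* flag bits from <mach-o/loader.h> (MH_ prefix dropped).
MH_FLAGS = {
    0x1: "NOUNDEFS",
    0x2: "INCRLINK",
    0x4: "DYLDLINK",
    0x8: "BINDATLOAD",
    0x10: "PREBOUND",
    0x20: "SPLIT_SEGS",
    0x80: "TWOLEVEL",
    0x100: "FORCE_FLAT",
    0x200: "NOMULTIDEFS",
    0x2000: "SUBSECTIONS_VIA_SYMBOLS",
    0x8000: "WEAK_DEFINES",
    0x10000: "BINDS_TO_WEAK",
    0x20000: "ALLOW_STACK_EXECUTION",
    0x200000: "PIE",
}


def decode_flags(flags: int) -> list:
    """Return the names of the set bits, lowest bit first."""
    names = []
    remaining = flags
    for bit in sorted(MH_FLAGS):
        if flags & bit:
            names.append(MH_FLAGS[bit])
            remaining &= ~bit
    if remaining:
        # Bits not covered by the table above are reported as raw hex.
        names.append(hex(remaining))
    return names


if __name__ == "__main__":
    # Flags value from the otool -h output above.
    print(" ".join(decode_flags(0x00218085)))
    # prints: NOUNDEFS DYLDLINK TWOLEVEL WEAK_DEFINES BINDS_TO_WEAK PIE
```

The sum 0x1 + 0x4 + 0x80 + 0x8000 + 0x10000 + 0x200000 is exactly 0x218085, which matches the six names in the `-hv` output.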
1.5 lldb
Disassemble at a memory address
(lldb) dis -a 0x18a72ecd0
libobjc.A.dylib`-[NSObject debugDescription]:
    0x18a72ecd0 <+0>: adrp x8, 152865
    0x18a72ecd4 <+4>: ldr x1, [x8, #0x40]
    0x18a72ecd8 <+8>: b 0x18a726f60 ; objc_msgSend
error: Could not find function bounds for address 0x112950e00 // only addresses that belong to a function can be disassembled
II. xcrun
- List the sections (some segments contain several sections): `xcrun size -x -l -m tmp`
- ` xcrun otool -v -s __TEXT __cstring tmp`
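`__cstring` contains the string constants of the executable (strings enclosed in double quotes in the source), such as method names.
A segment usually contains multiple sections.
1. In the `__TEXT` segment, the `__text` section contains the compiled machine code. `__stubs` and `__stub_helper` are used by the dynamic linker (dyld); these two sections make lazy linking possible for dynamically linked code. `__const` holds immutable constants, just like `__cstring`.
2. The `__DATA` segment contains readable and writable data. `__nl_symbol_ptr` and `__la_symbol_ptr` are the non-lazy and lazy symbol pointers respectively.
- `__const`: constant data that needs relocation
- `__bss` section: static variables that have not been initialized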
III. Besides Hopper, otool is another powerful tool
View a particular section of a segment
otool -t tmp
tmp (architecture armv7): Contents of (__TEXT,__text) section
- View a particular section of a segment:
otool -s __TEXT __text tmp
tmp (architecture armv7): Contents of (__TEXT,__text) section
This shows that otool's -t option is shorthand for -s __TEXT __text
xcrun otool -v -s __TEXT __const tmp
- ` xcrun otool -v -s __TEXT __cstring tmp`
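It contains the string constants of the executable (strings enclosed in double quotes in the source), such as method names.
Use otool -tv to view the disassembly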
otool -tv tmp
Usage summary
- -l print the load commands
- -h print the mach header
- -t print the text section (disassemble with -v)
In otool, -t is shorthand for -s __TEXT __text
- otool -o
otool -o MKNooJn.dec* |grep password
- otool -s
IV. Mach-O basics summary
- The Mach-O (iOS binary format) binary structure
-1) Header structure: the header contains general information about the binary: byte order (magic number), CPU type, number of load commands, etc.
-2) The load commands section is like a table of contents of the binary:
it describes the position of the segments, symbol table, dynamic symbol table, etc.
Each load command includes meta-information, such as the type of command, its name, its position in the binary, etc.
-3) Link information: the data section contains the application code and the data, such as symbol tables, dynamic symbol tables, etc.
- Official description
1) Header: Specifies the target architecture of the file, such as PPC, PPC64, IA-32, or x86-64.
2) Load commands: Specify the logical structure of the file and the layout of the file in virtual memory.
3) Raw segment data: Contains raw data for the segments defined in the load commands.
- mach header
cputype is defined in /usr/include/mach/machine.h
Further notes
- Inserting a new library involves multiple steps
- 1. Insert the library into the application container
- 2. Insert the load command into the load commands section of the binary
- 3. Increment the load command counter (ncmds) in the header
- 4. Increase the load commands size (sizeofcmds) in the header
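Steps 2 to 4 above can be sketched on an in-memory copy of the binary. This is a simplified illustration, not a replacement for a real injection tool: it assumes a thin little-endian Mach-O with zero-filled padding after the existing load commands, and it skips copying the library and re-signing. The helper names, the placeholder timestamp and version values, and the path `@executable_path/libHook.dylib` used in the usage note are all hypothetical.

```python
import struct

LC_LOAD_DYLIB = 0xC


def build_load_dylib(path: str, is_64: bool = True) -> bytes:
    """Build an LC_LOAD_DYLIB command: struct dylib_command plus the path."""
    align = 8 if is_64 else 4
    name = path.encode("utf-8") + b"\x00"
    cmdsize = 24 + len(name)
    cmdsize += -cmdsize % align           # round up to the required alignment
    # cmd, cmdsize, offset of the name, timestamp, current version,
    # compatibility version (the last three are placeholder values here).
    head = struct.pack("<6I", LC_LOAD_DYLIB, cmdsize, 24, 2, 0, 0)
    return (head + name).ljust(cmdsize, b"\x00")


def insert_load_command(data: bytes, command: bytes) -> bytes:
    """Append `command` after the existing load commands and patch the header."""
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic not in (0xFEEDFACE, 0xFEEDFACF):
        raise ValueError("expected a thin little-endian Mach-O")
    header_size = 32 if magic == 0xFEEDFACF else 28
    ncmds, sizeofcmds = struct.unpack_from("<II", data, 16)
    end = header_size + sizeofcmds        # first byte after the last command
    slot = data[end:end + len(command)]
    # Only overwrite padding; refuse if the bytes are in use or missing.
    if len(slot) < len(command) or any(slot):
        raise ValueError("not enough free padding after the load commands")
    patched = bytearray(data)
    patched[end:end + len(command)] = command                 # step 2
    struct.pack_into("<II", patched, 16,
                     ncmds + 1,                               # step 3
                     sizeofcmds + len(command))               # step 4
    return bytes(patched)
```

A call such as `insert_load_command(data, build_load_dylib("@executable_path/libHook.dylib"))` returns a patched copy of the same length as the input. The file size does not change because the new command is written into existing padding.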
Simple example of a Mach-O parser
See Also
parsing Mach-O file
other
https://github.com/tobefuturer/restore-symbol