Mach-O

mach-o文件解析

Posted by kunnan on May 29, 2018

前言

  • 对于每个 ipa 包,都会包含一个可执行文件,而这个文件就是 Mach-O 文件。

    PE 是windows的可执行文件类型
    ELF是Linux的可执行文件类型
    mach-o 是iOS、osx的可执行文库类型
    
    • file xx

      ➜  ~ file tmp.arm64
      tmp.arm64: Mach-O 64-bit dynamically linked shared library arm64
      ➜  ~ file ~/bin/kngit
      /Users/devzkn/bin/kngit: POSIX shell script text executable, UTF-8 Unicode text
      ➜  Payload file NoName.app/NoName    
      NoName.app/NoName: Mach-O executable arm_v7
      

Mach-O

            ,---------------------------,
  Header    |  Mach header              |
            |    Segment 1              |
            |      Section 1 (__text)   | --,
            |---------------------------|   |
  Data      |           blob            | <-'
            '---------------------------'

*Brief description taken from Wikipedia:

Mach-O, short for Mach object file format, is a file format for executables, object code, shared libraries, dynamically-loaded code, and core dumps. A replacement for the a.out format, Mach-O offers more extensibility and faster access to information in the symbol table.

Mach-O is used by most systems based on the Mach kernel. NeXTSTEP, OS X, and iOS are examples of systems that have used this format for native executables, libraries and object code.
  • Apple 系统上(包括 MacOS 以及 iOS)的可执行文件格式
<!-- Mach-O 是 Mach object 文件格式的缩写 -->
1、这种文件格式由文件头(Header)、加载命令(Load Commands)以及具体数据(Segment&Section)三部分组成

2、一种用于记录可执行文件、对象代码、共享库、动态加载代码和内存转储的文件格式。作为 a.out 格式的替代品,Mach-O 提供了更好的扩展性,并提升了符号表中信息的访问速度。

  • Header - contains general information about the binary: byte order (magic number), cpu type, amount of load commands etc.

  • Load Commands - it’s kind of a table of contents, that describes position of segments, symbol table, dynamic symbol table etc. Each load command includes a meta-information, such as type of command, its name, position in a binary and so on.

 告诉操作系统应当如何加载文件中的数据,对系统内核加载器和动态库链接器起指导作用。
  • Data - usually the biggest part of object file. It contains code and data, such as symbol tables, dynamic symbol tables and so on.

I、 分析 Mach-O

1.1 Mach-O & Universal Binary Parser

python 实现的

  • macholibre: 一个解析macho 文件信息的工具,输出的是json
devzkndeMacBook-Pro:~ devzkn$  pyenv global 3.7.0b2
devzkndeMacBook-Pro:~ devzkn$ pyenv global
3.7.0b2
devzkndeMacBook-Pro:~ devzkn$ pip3 install git+https://github.com/aaronst/macholibre.git
  • ` /Users/devzkn/.pyenv/shims/macholibre -o output.json ~/tmp`
devzkndeMacBook-Pro:~ devzkn$ which macholibre
/Users/devzkn/.pyenv/shims/macholibre
devzkndeMacBook-Pro:~ devzkn$ ls -lrt /Users/devzkn/.pyenv/shims/

1.2 遍历cmd中segment和section,获取方法和方法名称

cli

因为 objc_msgSend 特殊原因, 要混淆 __TEXT.__objc_methname 才算是真正的混淆

1.3 MachOView.app

A Mach-O Load Command deobfuscator.

* MH_MAGIC:      FEEDFACE
* Architecture:  i386
* Load Commands: 33
 * Checking: __TEXT:__text
 * Checking: __TEXT:__picsymbolstub4__TEXT
 * Checking: __TEXT:__stub_helper
 * Checking: __TEXT:__objc_methname
 * Checking: __TEXT:__ustring
 * Checking: __TEXT:__cstring
 * Checking: __TEXT:__objc_classname__TEXT
 * Checking: __TEXT:__objc_methtype
 * Checking: __TEXT:__gcc_except_tab__TEXT
 * Checking: __DATA:__nl_symbol_ptr
 * Checking: __DATA:__la_symbol_ptr
 * Checking: __DATA:__mod_init_func
 * Checking: __DATA:__const
 * Checking: __DATA:__cfstring
 * Checking: __DATA:__objc_classlist__DATA
 * Checking: __DATA:__objc_protolist__DATA
 * Checking: __DATA:__objc_imageinfo__DATA
 * Checking: __DATA:__objc_const
 * Checking: __DATA:__objc_selrefs
 * Checking: __DATA:__objc_classrefs__DATA
 * Checking: __DATA:__objc_superrefs__DATA
 * Checking: __DATA:__objc_ivar
 * Checking: __DATA:__objc_data
 * Checking: __DATA:__data
 * Checking: __DATA:__bss
      > Fix: Offset	 0x0000000000000000	 -> 	0x000000000004f574
      > Fix: Size	 0x000000000000004c	 -> 	0x0000000000000a8c

mach_override

iTunes-App-Store-Crawler/itunes_app_store_scraper_multithread.py

1.4 otool

  • -h -v

    ➜  Payload otool -h NoName.app/NoName
    Mach header
          magic cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
     0xfeedface      12          9  0x00           2    63       6144 0x00218085
    ➜  Payload otool -hv NoName.app/NoName
    Mach header
          magic cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
       MH_MAGIC     ARM         V7  0x00     EXECUTE    63       6144   NOUNDEFS DYLDLINK TWOLEVEL WEAK_DEFINES BINDS_TO_WEAK PIE
    
    magic:arm_v7 0xfeedface MH_MAGIC-----macho的魔法数 0xfeedface、0xcafebabe、0xfeedfacf 分别对应v7、fat、64
    cputype: 12 ARM-----cpu架构
    cpusubtype : 9  V7------cpu架构子版本 64\V7
    filetype: 2 EXECUTE------- 文件类型 OBJECT\DYLID
    ncmds: 63 -----加载命令数量
    sizeofcmds: 6144 ----所有加载命令的大小
    flags:0x00218085---dyld加载需要的一些标记。其中PIE表示启动地址空间布局随机化ARSL
    

1.5 lldb

  • 对内存地址进行反汇编

    (lldb) dis -a 0x18a72ecd0
    libobjc.A.dylib`-[NSObject debugDescription]:
        0x18a72ecd0 <+0>: adrp   x8, 152865
        0x18a72ecd4 <+4>: ldr    x1, [x8, #0x40]
        0x18a72ecd8 <+8>: b      0x18a726f60               ; objc_msgSend
    
    error: Could not find function bounds for address 0x112950e00// 只能对方法进行反汇编
    

II、 xcrun

  • 查看section段: 有些 segment 中有多个 section ` xcrun size -x -l -m tmp`
  • ` xcrun otool -v -s __TEXT __cstring tmp`
__cstring 包含了可执行文件中的字符串常量 – 在源码中被双引号包含的字符串- 方法名

在 segment中,一般都会有多个 section

1、在 __TEXT segment 中,__text section 包含了编译所得到的机器码。
  • __stubs__stub_helper 是给动态链接器 (dyld) 使用的。通过这两个 section,在动态链接代码中,可以允许延迟链接。
  • __const 是常量,不可变的,就像 __cstring (包含了可执行文件中的字符串常量 – 在源码中被双引号包含的字符串) 常量一样。
2、__DATA segment 中包含了可读写数据
  • __nl_symbol_ptr __la_symbol_ptr,它们分别是 non-lazy 和 lazy 符号指针
  • ` __const`,在这里面会包含一些需要重定向的常量数据
  • ` __bss section `没有被初始化的静态变量

III、除了hopper ,还有一个工具很强大otool

查看某段中某的节

  • otool -t tmp
    tmp (architecture armv7):
    Contents of (__TEXT,__text) section
    
  • 查看某段中某的节: otool -s __TEXT __text tmp
    tmp (architecture armv7):
    Contents of (__TEXT,__text) section
    

由此可以看出 -t 参数在otool 中是 -s __TEXT __tex的简写

  • xcrun otool -v -s __TEXT __const tmp
  • ` xcrun otool -v -s __TEXT __cstring tmp`
包含了可执行文件中的字符串常量 – 在源码中被双引号包含的字符串- 方法名

使用otool 查看反汇编代码 -tv

  • otool -tv tmp

具体用法小结

  • -l print the load commands
  • -h print the mach header
  • -t print the text section (disassemble with -v)
-t 参数在otool 中是 -s __TEXT __tex的简写
  • otool -o
 otool -o MKNooJn.dec* |grep password
  • otool -s

IV、 Mach-O 基础知识小结

image

  • The Mach-O (iOS binary format) binary structure
-1) header structure:The header contains general information about the binary: byte order (magic number), CPU type, amount of load commands, etc.
-2)The load commands section is like a table of contents of the binary: 
it describes the position of the segments, symbols table, dynamic symbols table etc. 
Each load command includes meta-information, such as type of command, its name, position in a binary, etc. 
-3) 链接信息:The data section contains the application code and the data, such as symbol tables, dynamic symbol tables, etc.
  • 官方描述
1)Header: Specifies the target architecture of the file, such as PPC, PPC64, IA-32, or x86-64.
2)Load commands: Specify the logical structure of the file and the layout of the file in virtual memory.
3)Raw segment data: Contains raw data for the segments defined in the load commands.
  • mach header

image

cputype其定义参见/usr/include/mach/machine.h

知识拓展

  • The process to insert a new library involves multiple steps
- 1. Insert the library to the application container

- 2. Insert the load command on the load commands section of the binary

- 3. Increment the load command counter on the header section

- 4. Increase the size binary number on the header section

Simple example of a Mach-O parser

See Also

parsing Mach-O file

other

https://github.com/tobefuturer/restore-symbol
/Users/devzkn/bin/knpost Mach-O mach-o文件解析 -t iosre
#原来""的参数,需要自己加上""

转载请注明: > Mach-O