Jonas' blog

IT security & forensics

APFS filesystem format

I started to reverse engineer APFS and want to share what I found out so far. You can send me feedback and ideas on this post via Twitter.

Notice: I created a test image with macOS Sierra 10.12.3 (16D32). All results are guesses and the reverse engineering is work in progress. Also newer versions of APFS might change structures. The information below is neither complete nor proven to be correct.

Update 2017-04-30: Added a section for the checksum.

Update 2017-06-16: Added the apfs.ksy repository.

Overview

APFS is structured as a single container that can hold multiple APFS volumes. A container needs to be larger than 512 MB to contain more than one volume, larger than 1024 MB to contain more than two volumes, and so on. The following image shows an overview of the APFS structure.

APFS Overview

Each element of this structure (except for the allocation file) starts with a 32-byte block header, which contains some general information about the block. The body of the structure follows the header. The following types exist:

  • 0x01: Container Superblock
  • 0x02: Node
  • 0x05: Spacemanager
  • 0x07: Allocation Info File
  • 0x11: Unknown
  • 0x0B: B-Tree
  • 0x0C: Checkpoint
  • 0x0D: Volume Superblock
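
The type ids above can be kept in a small lookup table when walking through a dump. The following is only a sketch: the Go identifiers and the fallback text are made up for this example, and the list is limited to the ids observed in the test image.

```go
package main

import "fmt"

// BlockType is the 16-bit type id stored in the block header.
type BlockType uint16

// Type ids as listed above. The constant names are this sketch's own.
const (
	TypeContainerSuperblock BlockType = 0x01
	TypeNode                BlockType = 0x02
	TypeSpacemanager        BlockType = 0x05
	TypeAllocationInfoFile  BlockType = 0x07
	TypeBTree               BlockType = 0x0B
	TypeCheckpoint          BlockType = 0x0C
	TypeVolumeSuperblock    BlockType = 0x0D
	TypeUnknown11           BlockType = 0x11
)

var typeNames = map[BlockType]string{
	TypeContainerSuperblock: "Container Superblock",
	TypeNode:                "Node",
	TypeSpacemanager:        "Spacemanager",
	TypeAllocationInfoFile:  "Allocation Info File",
	TypeBTree:               "B-Tree",
	TypeCheckpoint:          "Checkpoint",
	TypeVolumeSuperblock:    "Volume Superblock",
	TypeUnknown11:           "Unknown",
}

// String returns the name from the list above, or the hex value for
// ids that are not in the list.
func (t BlockType) String() string {
	if name, ok := typeNames[t]; ok {
		return name
	}
	return fmt.Sprintf("unrecognized (0x%02X)", uint16(t))
}
```

Because BlockType implements String, printing a value with fmt's %v or %s verb shows the readable name.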

Each of these structures is described in detail below. A more detailed version of the APFS structure is available as a Kaitai Struct file: apfs.ksy. You can use it to examine APFS dumps in the Kaitai IDE or to create parsers for various languages. This .ksy file must be considered experimental.

General information:

  • The filesystem stores values in little-endian byte order.
  • Timestamps are 64-bit counts of nanoseconds (1/1,000,000,000 of a second) since 1970-01-01 UTC (the Unix epoch). At the time of writing, the current timestamp is around 0x14b11800f375e000.
  • The standard block size seems to be 4096 bytes.
  • APFS is a copy-on-write filesystem: each block is copied before changes are applied, so a history of all files and filesystem structures that have not yet been overwritten exists. This might result in a huge number of forensic artefacts.
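
To make the timestamp format concrete, here is a minimal sketch that turns such a value into a readable date. The function names apfsTime and apfsTimeString are invented for this example; the conversion itself only relies on the nanoseconds-since-epoch interpretation described above.

```go
package main

import "time"

// apfsTime converts a timestamp given in nanoseconds since the Unix
// epoch into a time.Time in UTC. Splitting into seconds and nanoseconds
// avoids overflowing int64 for very large unsigned values.
func apfsTime(ns uint64) time.Time {
	sec := int64(ns / 1_000_000_000)
	nsec := int64(ns % 1_000_000_000)
	return time.Unix(sec, nsec).UTC()
}

// apfsTimeString formats the timestamp as RFC 3339 with nanoseconds.
func apfsTimeString(ns uint64) string {
	return apfsTime(ns).Format(time.RFC3339Nano)
}
```

Calling apfsTimeString(0x14b11800f375e000) on the example value from the list gives a date in 2017, which fits the time the test image was created.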

Structures

Block header

Each filesystem structure in APFS starts with a block header. This header starts with a checksum for the whole block. Other information in the header includes the copy-on-write version of the block, the block ID and the block type.

pos | size | type | id
--- | --- | --- | ---
0 | 8 | uint64 | checksum
8 | 8 | uint64 | block_id
16 | 8 | uint64 | version
24 | 2 | uint16 | block_type
26 | 2 | uint16 | flags
28 | 4 | uint32 | padding
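
The header layout translates fairly directly into a parser. The following is a minimal sketch, assuming the offsets from the table and little-endian fields; the struct and function names are made up for this example.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// BlockHeader mirrors the 32-byte header table above.
type BlockHeader struct {
	Checksum  uint64
	BlockID   uint64
	Version   uint64
	BlockType uint16
	Flags     uint16
	Padding   uint32
}

const blockHeaderSize = 32

// parseBlockHeader decodes the first 32 bytes of a block.
func parseBlockHeader(block []byte) (BlockHeader, error) {
	if len(block) < blockHeaderSize {
		return BlockHeader{}, errors.New("block shorter than the 32-byte header")
	}
	le := binary.LittleEndian
	return BlockHeader{
		Checksum:  le.Uint64(block[0:8]),
		BlockID:   le.Uint64(block[8:16]),
		Version:   le.Uint64(block[16:24]),
		BlockType: le.Uint16(block[24:26]),
		Flags:     le.Uint16(block[26:28]),
		Padding:   le.Uint32(block[28:32]),
	}, nil
}
```

The body of the structure then starts at block[32:], which is what the later tables appear to use as position 0.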

Checksum

According to the Apple docs, Fletcher's checksum algorithm is used. Apple uses a variant of the algorithm described in a paper by John Kodis. The following code shows the procedure. The input is the block without its first 8 bytes.

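    import "encoding/binary"

    // createChecksum computes the value stored in the first 8 bytes of a block.
    // data is the block without those 8 bytes; bytes beyond the last full
    // 32-bit word are ignored.
    func createChecksum(data []byte) uint64 {
        var sum1, sum2 uint64

        modValue := uint64(1<<32 - 1) // 0xFFFFFFFF, same value as 2<<31 - 1

        // Running Fletcher sums over little-endian 32-bit words.
        for i := 0; i < len(data)/4; i++ {
            d := binary.LittleEndian.Uint32(data[i*4 : (i+1)*4])
            sum1 = (sum1 + uint64(d)) % modValue
            sum2 = (sum2 + sum1) % modValue
        }

        // Check words chosen so that the sums cancel out during verification.
        check1 := modValue - ((sum1 + sum2) % modValue)
        check2 := modValue - ((sum1 + check1) % modValue)

        return (check2 << 32) | check1
    }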

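A nice feature of the algorithm is that verification needs no comparison: run the same sums over the block body followed by the stored checksum, and the result should be zero. The input in this case is the whole block, including the checksum. Because APFS keeps the checksum in the first 8 bytes rather than at the end, the loop below visits those two words last; visiting them first generally leaves sum2 non-zero.

    // Uses binary from "encoding/binary".
    func checkChecksum(data []byte) uint64 {
        var sum1, sum2 uint64

        modValue := uint64(1<<32 - 1) // 0xFFFFFFFF

        // Walk the words of the body (offset 8 onwards) first and the two
        // stored checksum words (offsets 0 and 4) last.
        words := len(data) / 4
        for n := 0; n < words; n++ {
            i := (n + 2) % words
            d := binary.LittleEndian.Uint32(data[i*4 : (i+1)*4])
            sum1 = (sum1 + uint64(d)) % modValue
            sum2 = (sum2 + sum1) % modValue
        }

        return (sum2 << 32) | sum1
    }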
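
An alternative way to check a block is to recompute the checksum over everything after the first 8 bytes and compare it with the stored value. The sketch below does that round trip; the helper names (fletcherSums, bodyChecksum, stampChecksum, verifyChecksum) are invented for this example, and the arithmetic is the same as in the checksum code above.

```go
package main

import "encoding/binary"

const fletcherMod = uint64(1<<32 - 1) // 0xFFFFFFFF

// fletcherSums runs both sums over data in little-endian 32-bit words.
func fletcherSums(data []byte) (sum1, sum2 uint64) {
	for i := 0; i+4 <= len(data); i += 4 {
		d := uint64(binary.LittleEndian.Uint32(data[i : i+4]))
		sum1 = (sum1 + d) % fletcherMod
		sum2 = (sum2 + sum1) % fletcherMod
	}
	return sum1, sum2
}

// bodyChecksum computes the value expected in the first 8 bytes of a
// block, given the block without those 8 bytes.
func bodyChecksum(body []byte) uint64 {
	sum1, sum2 := fletcherSums(body)
	check1 := fletcherMod - ((sum1 + sum2) % fletcherMod)
	check2 := fletcherMod - ((sum1 + check1) % fletcherMod)
	return (check2 << 32) | check1
}

// stampChecksum writes the checksum of block[8:] into block[0:8].
// It reports false if the block is too short to hold a checksum.
func stampChecksum(block []byte) bool {
	if len(block) < 8 {
		return false
	}
	binary.LittleEndian.PutUint64(block[0:8], bodyChecksum(block[8:]))
	return true
}

// verifyChecksum recomputes the checksum and compares it with the
// stored one.
func verifyChecksum(block []byte) bool {
	if len(block) < 8 {
		return false
	}
	return binary.LittleEndian.Uint64(block[0:8]) == bodyChecksum(block[8:])
}
```

Comparing against the stored value does not depend on the order in which the checksum words are fed into the sums, which makes it a useful cross-check when experimenting with raw blocks.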

Container Superblock

The container superblock is the entry point to the filesystem. Because containers hold a flexible number of volumes, allocation needs to be handled at the container level. The container superblock contains the block size, the number of blocks and pointers to the spacemanager for this task. Additionally, the block IDs of all volumes are stored in the superblock. To map block IDs to block offsets, a pointer to a block map B-tree is stored. This B-tree contains an entry for each volume with its ID and offset.

pos | size | type | id
--- | --- | --- | ---
0 | 4 | byte | magic “NXSB”
4 | 4 | uint32 | blocksize
8 | 8 | uint64 | totalblocks
40 | 16 | byte | guid
56 | 8 | uint64 | next_free_block_id
64 | 8 | uint64 | next_version
104 | 4 | uint32 | previous_containersuperblock_block
120 | 8 | uint64 | spaceman_id
128 | 8 | uint64 | block_map_block
136 | 8 | uint64 | unknown_id
144 | 4 | uint32 | padding2
148 | 4 | uint32 | apfs_count
152 | 8 | uint64 | offset_apfs (repeat apfs_count times)
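
As an illustration of how these fields could be read, here is a rough sketch of a parser for the superblock body. It assumes that the positions in the table are relative to the body, i.e. that the slice passed in starts right after the 32-byte block header. The struct, its field selection and the function name are this example's own.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// ContainerSuperblock holds a subset of the fields from the table above.
type ContainerSuperblock struct {
	BlockSize     uint32
	TotalBlocks   uint64
	SpacemanID    uint64
	BlockMapBlock uint64
	OffsetAPFS    []uint64 // one offset_apfs entry per volume
}

// parseContainerSuperblock decodes the body of a type 0x01 block.
// Assumption: body starts right after the 32-byte block header.
func parseContainerSuperblock(body []byte) (ContainerSuperblock, error) {
	const fixedSize = 152 // position of the first offset_apfs entry
	if len(body) < fixedSize {
		return ContainerSuperblock{}, errors.New("body too short")
	}
	if string(body[0:4]) != "NXSB" {
		return ContainerSuperblock{}, errors.New("magic is not NXSB")
	}
	le := binary.LittleEndian
	sb := ContainerSuperblock{
		BlockSize:     le.Uint32(body[4:8]),
		TotalBlocks:   le.Uint64(body[8:16]),
		SpacemanID:    le.Uint64(body[120:128]),
		BlockMapBlock: le.Uint64(body[128:136]),
	}
	// Reject counts that would run past the end of the body.
	count := int(le.Uint32(body[148:152]))
	if count > (len(body)-fixedSize)/8 {
		return ContainerSuperblock{}, errors.New("apfs_count exceeds body size")
	}
	for i := 0; i < count; i++ {
		off := fixedSize + i*8
		sb.OffsetAPFS = append(sb.OffsetAPFS, le.Uint64(body[off:off+8]))
	}
	return sb, nil
}
```

Checking the magic first is a cheap way to reject blocks that merely carry type 0x01 in a damaged or stale header.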

Node

Nodes are flexible containers that are used for storing different kinds of entries. They can be part of a B-tree or exist on their own. Nodes can contain either flexible or fixed-size entries. A node starts with a list of pointers to the entry keys and entry records. This way, for each entry the node contains an entry header at the beginning of the node, an entry key in the middle of the node and an entry record at the end of the node.

Node

pos | size | type | id
--- | --- | --- | ---
0 | 4 | uint32 | alignment
4 | 4 | uint32 | entry_count
10 | 2 | uint16 | head_size
16 | 8 | entry | meta_entry
24 | | entry | entries (repeat entry_count times)

Spacemanager

The spacemanager (sometimes called spaceman) is used to manage allocated blocks in the APFS container. The number of free blocks and a pointer to the allocation info file(s?) are stored here.

pos | size | type | id
--- | --- | --- | ---
0 | 4 | uint32 | blocksize
16 | 8 | uint64 | totalblocks
40 | 8 | uint64 | freeblocks
144 | 8 | uint64 | prev_allocationinfofile_block
352 | 8 | uint64 | allocationinfofile_block
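
With these fields, the space in use can be derived from the total and free block counts. The sketch below assumes, as a guess, that the table positions are relative to the body after the 32-byte block header; the type and function names are invented for this example.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// Spacemanager holds the fields from the table above.
type Spacemanager struct {
	BlockSize                   uint32
	TotalBlocks                 uint64
	FreeBlocks                  uint64
	PrevAllocationInfoFileBlock uint64
	AllocationInfoFileBlock     uint64
}

// parseSpacemanager decodes the body of a type 0x05 block.
// Assumption: body starts right after the 32-byte block header.
func parseSpacemanager(body []byte) (Spacemanager, error) {
	if len(body) < 360 { // last field ends at 352 + 8
		return Spacemanager{}, errors.New("body too short")
	}
	le := binary.LittleEndian
	return Spacemanager{
		BlockSize:                   le.Uint32(body[0:4]),
		TotalBlocks:                 le.Uint64(body[16:24]),
		FreeBlocks:                  le.Uint64(body[40:48]),
		PrevAllocationInfoFileBlock: le.Uint64(body[144:152]),
		AllocationInfoFileBlock:     le.Uint64(body[352:360]),
	}, nil
}

// UsedBytes returns the number of bytes in allocated blocks.
func (s Spacemanager) UsedBytes() (uint64, error) {
	if s.FreeBlocks > s.TotalBlocks {
		return 0, errors.New("more free blocks than total blocks")
	}
	return (s.TotalBlocks - s.FreeBlocks) * uint64(s.BlockSize), nil
}
```

A result that disagrees with what the operating system reports for the container would be a hint that one of the guessed offsets is wrong.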

Allocation Info File

The allocation info file works as a missing header for the allocation file. The allocation file's length, version and offset are stored here.

pos | size | type | id
--- | --- | --- | ---
4 | 4 | uint32 | alloc_file_length
8 | 4 | uint32 | alloc_file_version
24 | 4 | uint32 | total_blocks
28 | 4 | uint32 | free_blocks
32 | 4 | uint32 | allocationfile_block

Unknown

The structure with type 0x11 is mostly empty and seems to be related to the spacemanager, as it occurs adjacent to it. Its purpose is unknown.

B-Tree

B-trees manage multiple nodes. They contain the offset of the root node.

pos | size | type | id
--- | --- | --- | ---
16 | 8 | uint64 | root

Checkpoint

A checkpoint structure exists for every container superblock. But I have no clue what it is good for.

Volume Superblock

A volume superblock exists for each volume in the filesystem. It contains the name of the volume, an ID and a timestamp. Like the container superblock, it contains a pointer to a block map, which maps block IDs to block offsets. Additionally, the volume superblock stores a pointer to the root directory, which is stored as a node.

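pos | size | type | id
--- | --- | --- | ---
0 | 4 | byte | magic “APSB”
96 | 8 | uint64 | block_map
104 | 8 | uint64 | root_dir_id
112 | 8 | uint64 | pointer3
120 | 8 | uint64 | pointer4
208 | 16 | byte | guid
224 | 8 | uint64 | time1
272 | 8 | uint64 | time2
672 | 8 | str(ASCII) | name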
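
A volume superblock could be read along the same lines. This sketch again assumes body-relative positions (the slice starts after the 32-byte block header) and reads only the 8 name bytes given in the table, so longer volume names would come out truncated. All identifiers are made up for this example.

```go
package main

import (
	"encoding/binary"
	"errors"
	"strings"
)

// VolumeSuperblock holds a subset of the fields from the table above.
type VolumeSuperblock struct {
	BlockMap  uint64
	RootDirID uint64
	Time1     uint64 // nanoseconds since the Unix epoch
	Time2     uint64 // nanoseconds since the Unix epoch
	Name      string
}

// parseVolumeSuperblock decodes the body of a type 0x0D block.
// Assumption: body starts right after the 32-byte block header.
func parseVolumeSuperblock(body []byte) (VolumeSuperblock, error) {
	if len(body) < 680 { // name field ends at 672 + 8
		return VolumeSuperblock{}, errors.New("body too short")
	}
	if string(body[0:4]) != "APSB" {
		return VolumeSuperblock{}, errors.New("magic is not APSB")
	}
	le := binary.LittleEndian
	return VolumeSuperblock{
		BlockMap:  le.Uint64(body[96:104]),
		RootDirID: le.Uint64(body[104:112]),
		Time1:     le.Uint64(body[224:232]),
		Time2:     le.Uint64(body[272:280]),
		// Strip the NUL padding after the name.
		Name: strings.TrimRight(string(body[672:680]), "\x00"),
	}, nil
}
```

The two timestamps are left as raw integers here; their exact meaning is not yet clear from the test image.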

Allocation File

Allocation files are simple bitmaps. They do not have a block header and therefore no type id.
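
Looking up a block in such a bitmap could look like the sketch below. Two things here are assumptions and not confirmed by the test image: that a set bit means "allocated", and that bits are counted from the least significant bit of each byte. The function names are invented for this example.

```go
package main

import (
	"errors"
	"math/bits"
)

// isAllocated reports whether bit n of the bitmap is set.
// Assumptions: a set bit marks an allocated block, and bit 0 is the
// least significant bit of the first byte.
func isAllocated(bitmap []byte, n uint64) (bool, error) {
	idx := n / 8
	if idx >= uint64(len(bitmap)) {
		return false, errors.New("block number outside of bitmap")
	}
	mask := byte(1) << (n % 8)
	return bitmap[idx]&mask != 0, nil
}

// countAllocated returns the number of set bits in the bitmap.
func countAllocated(bitmap []byte) int {
	total := 0
	for _, b := range bitmap {
		total += bits.OnesCount8(b)
	}
	return total
}
```

If the bit order turns out to be the other way round, only the mask computation needs to change; the number of set bits is the same either way.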