Windows 10 x86/wow64 Userland heap
Introduction
Hi all,
Over the past few weeks, I received a number of "emergency" calls from some relatives, asking me to look at their computer because "things were broken", "things looked different" and "I think my computer got hacked". I quickly realized that their computers had been upgraded to Windows 10.
We could have a lengthy discussion about the strategy to force upgrades onto people, but in any case it is clear that Windows 10 is gaining market share. This also means that it has become a Windows version that is relevant enough to investigate.
In this post, I have gathered some notes on how the userland heap manager behaves for 32bit processes in Windows 10. The main focus of my investigation is to document the similarities and differences with Windows 7, and hopefully present some ideas on how to manipulate the heap to increase predictability of its behavior. More specifically, aside from documenting behavior, I am particularly interested in getting answers to the following questions:
- How does the back-end allocator behave?
- What does it take to activate the LFH?
- What are the differences in terms of LFH behavior between Win7 and Win10, if any?
- What options do we have to create a specific heap layout that involves certain objects to be adjacent in memory? (objects of the same size, objects of different sizes)
- Can we perform a "reliable" and precise heap spray?
As I am a terrible reverse engineer, it’s worth noting that the notes below are merely a transcript of my observations, and not backed by disassembling/decompiling/reverse engineering ntdll.dll. Furthermore, as the number of tests was limited (far below what would be needed to provide some level of statistical certainty), I am not 100% sure that my descriptions can be considered a true representation of reality. In any case, I hope the notes will inspire people to do more tests, to perform reverse engineering on some of the heap management related code, and to share more details on how things were implemented.
This post assumes that you already have experience with the Windows 7 heap and its front-end and back-end allocators, and that you understand the output of various !heap commands.
In my test applications, I’ll be using the default process heap. It would be fair to assume that the notes below apply to all Windows heaps and applications that rely on the Windows heap management system.
Throughout this post, I will use the following terminology:
- chunk: a contiguous piece of memory
- block: size unit, referring to 8 bytes of memory. (Don’t be confused when you see the word "block" in WinDBG output. WinDBG uses the term "block" to reference a chunk. I am using different terminology.)
- virtualallocdblock: a chunk that was allocated through RtlAllocateHeap, but is larger than the VirtualMemoryThreshold (and thus is not stored inside a segment, but rather as a separate chunk in memory), and managed through the VirtualAllocdBlocks list (offset 0x9c in the heap header)
- segment: the heap management unit, managed by a heap, used to allocate & manage chunks. The segment list is managed inside the heap header (offset 0xa4)
- SubSegment: the LFH management unit used to manage LFH chunks
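Since block counts and byte counts are easy to mix up when reading WinDBG output, here is a minimal Python sketch of the conversion. It assumes the 8-byte block size defined above and an 8-byte chunk header for a 32bit process. The function names are mine and purely illustrative.

```python
BLOCK_SIZE = 8    # bytes per block, as defined in the terminology above
HEADER_SIZE = 8   # assumed chunk header size in a 32bit process


def blocks_to_bytes(blocks):
    """Convert a size expressed in blocks to bytes."""
    return blocks * BLOCK_SIZE


def user_size_to_blocks(user_size):
    """Blocks occupied by a chunk of user_size bytes plus its header (rounded up)."""
    total = user_size + HEADER_SIZE
    return (total + BLOCK_SIZE - 1) // BLOCK_SIZE


# A 0x300 byte allocation plus its header takes 0x308 bytes, or 0x61 blocks.
print(hex(user_size_to_blocks(0x300)))
print(hex(blocks_to_bytes(0x61)))
```

The "Size" and "Prev" columns in the !heap -p -h output are expressed in blocks, so this kind of helper is handy when comparing them with the "UserSize" column, which is expressed in bytes.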
Test environment
My test environment consists of the following components:
- Windows 10 Enterprise x64, fully patched (running as a virtual machine inside VirtualBox) with 2 CPUs and 1.8Gb of RAM
- Visual Studio Express 2015 for desktop : https://go.microsoft.com/fwlink/?LinkId=691984&clcid=0x409
- WinDBG (for Windows 10) : https://go.microsoft.com/fwlink/p/?LinkId=536682. (Make sure to run a recent version of WinDBG to make sure all heap related structures are properly parsed and represented)
I have set up Symbol support for WinDBG by creating a system environment variable:
- Variable: _NT_SYMBOL_PATH
- Value: srv*c:\symbols*http://msdl.microsoft.com/download/symbols
The source code (Visual Studio C++ projects) for all test cases used in this post can be found here: https://github.com/corelan/win10_heap
The Heap
Similar to previous Windows versions, the Windows 10 heap sits at an ASLR-influenced ("random") address and starts with a header. As the base address of the heap management unit sits at a non-static address, you’ll see different heap base addresses throughout this post. In order to avoid confusion, I’ll use the term "address of default process heap" to refer to this base address. Similarly, any address you’ll see in the post will be different on your machine.
Dumping the header contents of a heap, we can observe the following fields:
0:003> dt _HEAP 00a40000
ntdll!_HEAP
+0x000 Segment : _HEAP_SEGMENT
+0x000 Entry : _HEAP_ENTRY
+0x008 SegmentSignature : 0xffeeffee
+0x00c SegmentFlags : 2
+0x010 SegmentListEntry : _LIST_ENTRY [ 0xa400a4 - 0xa400a4 ]
+0x018 Heap : 0x00a40000 _HEAP
+0x01c BaseAddress : 0x00a40000 Void
+0x020 NumberOfPages : 0xff
+0x024 FirstEntry : 0x00a40498 _HEAP_ENTRY
+0x028 LastValidEntry : 0x00b3f000 _HEAP_ENTRY
+0x02c NumberOfUnCommittedPages : 0xe9
+0x030 NumberOfUnCommittedRanges : 1
+0x034 SegmentAllocatorBackTraceIndex : 0
+0x036 Reserved : 0
+0x038 UCRSegmentList : _LIST_ENTRY [ 0xa55ff0 - 0xa55ff0 ]
+0x040 Flags : 2
+0x044 ForceFlags : 0
+0x048 CompatibilityFlags : 0
+0x04c EncodeFlagMask : 0x100000
+0x050 Encoding : _HEAP_ENTRY
+0x058 Interceptor : 0
+0x05c VirtualMemoryThreshold : 0xfe00
+0x060 Signature : 0xeeffeeff
+0x064 SegmentReserve : 0x100000
+0x068 SegmentCommit : 0x2000
+0x06c DeCommitFreeBlockThreshold : 0x800
+0x070 DeCommitTotalFreeThreshold : 0x2000
+0x074 TotalFreeSize : 0x462
+0x078 MaximumAllocationSize : 0x7ffdefff
+0x07c ProcessHeapsListIndex : 1
+0x07e HeaderValidateLength : 0x248
+0x080 HeaderValidateCopy : (null)
+0x084 NextAvailableTagIndex : 0
+0x086 MaximumTagIndex : 0
+0x088 TagEntries : (null)
+0x08c UCRList : _LIST_ENTRY [ 0xa55fe8 - 0xa55fe8 ]
+0x094 AlignRound : 0xf
+0x098 AlignMask : 0xfffffff8
+0x09c VirtualAllocdBlocks : _LIST_ENTRY [ 0xa4009c - 0xa4009c ]
+0x0a4 SegmentList : _LIST_ENTRY [ 0xa40010 - 0xa40010 ]
+0x0ac AllocatorBackTraceIndex : 0
+0x0b0 NonDedicatedListLength : 0
+0x0b4 BlocksIndex : 0x00a40260 Void
+0x0b8 UCRIndex : (null)
+0x0bc PseudoTagEntries : (null)
+0x0c0 FreeLists : _LIST_ENTRY [ 0xa4cd80 - 0xa53e70 ]
+0x0c8 LockVariable : 0x00a40248 _HEAP_LOCK
+0x0cc CommitRoutine : 0x4807219e long +4807219e
+0x0d0 FrontEndHeap : 0x003f0000 Void
+0x0d4 FrontHeapLockCount : 0
+0x0d6 FrontEndHeapType : 0x2 ''
+0x0d7 RequestedFrontEndHeapType : 0x2 ''
+0x0d8 FrontEndHeapUsageData : 0x00a43fe8 -> 0
+0x0dc FrontEndHeapMaximumIndex : 0x802
+0x0de FrontEndHeapStatusBitmap : [257] "???"
+0x1e0 Counters : _HEAP_COUNTERS
+0x23c TuningParameters : _HEAP_TUNING_PARAMETERS
The header starts with information about the segments associated with this heap.
Offset 0x4c (EncodeFlagMask) and 0x50 (Encoding) are used to store information about the chunk header encoding in the heap (= same offsets as in Windows 7). The actual key used to encode and decode the chunk header fields (XOR) is stored at offset 0x50. Fortunately the WinDBG !heap extension will perform all decoding for us. As you can imagine, this key is 1/ random for each process and 2/ different for each heap in the process, which means it is (still) quite effective at preventing heap header attacks.
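To illustrate what "decoding" means here, the sketch below XORs an encoded 8-byte chunk header with an 8-byte key. It is only a sketch under assumptions: I assume the whole header is XORed with the Encoding value, and that the fields follow the x86 _HEAP_ENTRY layout (Size, Flags, SmallTagIndex, PreviousSize, SegmentOffset, UnusedBytes). The key and header values are made up, not taken from a real process.

```python
import struct


def decode_chunk_header(raw_header, encoding_key):
    """XOR an 8-byte encoded chunk header with the 8-byte per-heap key."""
    decoded = bytes(h ^ k for h, k in zip(raw_header, encoding_key))
    # Assumed x86 _HEAP_ENTRY layout, little endian:
    # Size (2), Flags (1), SmallTagIndex (1), PreviousSize (2),
    # SegmentOffset (1), UnusedBytes (1)
    size, flags, tag, prev_size, seg_offset, unused = struct.unpack("<HBBHBB", decoded)
    return {
        "size_blocks": size,
        "flags": flags,
        "small_tag_index": tag,
        "prev_size_blocks": prev_size,
        "segment_offset": seg_offset,
        "unused_bytes": unused,
    }


# Hypothetical key and header, for illustration only.
key = bytes.fromhex("1122334455667788")
plain = struct.pack("<HBBHBB", 0x61, 0x01, 0x60, 0x61, 0x00, 0x08)
encoded = bytes(p ^ k for p, k in zip(plain, key))

fields = decode_chunk_header(encoded, key)
print(fields)
```

In practice you would read the 8 bytes at heap base + 0x50 and the 8 bytes at the chunk header address from the debugger, and feed both to a function like this one.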
Also, similar to Windows 7, the VirtualMemoryThreshold field (offset 0x5c) contains the value 0xfe00. As this value denotes a number of blocks, we need to multiply it by 8 to get the actual number of bytes (0x7F000 bytes). In other words, we can still trigger VirtualAllocdBlocks by causing a regular (HeapAlloc) allocation of a size that is larger than this value. Some of the heap spray techniques documented here and here were based on triggering VirtualAllocdBlock chunks. We’ll see if the current implementation still allows for a precise heap spray.
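The arithmetic is trivial, but easy to get wrong when switching between blocks and bytes. A minimal sketch, assuming the 8-byte block size and a simple "larger than the threshold" comparison (I did not test the exact boundary, for example whether the chunk header counts towards it):

```python
BLOCK_SIZE = 8  # bytes per block


def threshold_in_bytes(virtual_memory_threshold):
    """Convert the VirtualMemoryThreshold field (in blocks) to bytes."""
    return virtual_memory_threshold * BLOCK_SIZE


def expect_virtualalloc(request_size, virtual_memory_threshold=0xfe00):
    """True if a request is larger than the threshold, and is therefore
    expected to end up as a VirtualAllocdBlock (assumption: strict '>')."""
    return request_size > threshold_in_bytes(virtual_memory_threshold)


print(hex(threshold_in_bytes(0xfe00)))
print(expect_virtualalloc(0x80000), expect_virtualalloc(0x300))
```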
Offset 0x0d6 (FrontEndHeapType) indicates what front-end allocator is being used. Value 0x2 refers to LFH. The address of the FrontEndHeap "master" header structure can be found at the address referenced at offset 0xd0 (FrontEndHeap). In the example above, the LFH header is stored at 0x003f0000 and contains the following information:
0:003> dt _LFH_HEAP 0x003f0000
ntdll!_LFH_HEAP
+0x000 Lock : _RTL_SRWLOCK
+0x004 SubSegmentZones : _LIST_ENTRY [ 0xa48910 - 0xa48910 ]
+0x00c Heap : 0x00a40000 Void
+0x010 NextSegmentInfoArrayAddress : 0x003f0a08 Void
+0x014 FirstUncommittedAddress : 0x003f1000 Void
+0x018 ReservedAddressLimit : 0x003f8000 Void
+0x01c SegmentCreate : 7
+0x020 SegmentDelete : 1
+0x024 MinimumCacheDepth : 0
+0x028 CacheShiftThreshold : 0
+0x02c SizeInCache : 0
+0x030 RunInfo : _HEAP_BUCKET_RUN_INFO
+0x038 UserBlockCache : [12] _USER_MEMORY_CACHE_ENTRY
+0x1b8 MemoryPolicies : _HEAP_LFH_MEM_POLICIES
+0x1bc Buckets : [129] _HEAP_BUCKET
+0x3c0 SegmentInfoArrays : [129] (null)
+0x5c4 AffinitizedInfoArrays : [129] (null)
+0x7c8 SegmentAllocator : (null)
+0x7d0 LocalData : [1] _HEAP_LOCAL_DATA
We can see references to various LFH terms, including "Bucket", "SubSegment" and "Heap local data".
Based on some quick tests, it looks like the LFH_HEAP header is (usually/always?) stored within the first segment of the heap on Windows 7, but on Windows 10 it seems to be stored outside of the memory range used by the first segment.
Putting things in a graphical (but very high-level and abstract) manner, the heap pretty much consists of the following building blocks:
The back-end allocator
BEA_Alloc1
On Windows 7, the back-end allocator (BEA) is the default/active mechanism used to manage freed chunks. Based on my observations, this is still the case on Windows 10. It uses the chunks available inside the segment(s) and starts empty. As soon as a chunk gets freed, it will "remember" these free chunks in some kind of list. The free chunks are organized per size, and the mechanism uses a table to do so. Each entry in the table represents a size (increment of 8 bytes), and index 0 is used to manage chunks that are larger than the chunk size managed by index 127.
The FreeList table is now referenced at offset 0xc0 from the heap base. I didn’t check in detail, but I am assuming that the table still consists of 128 elements, and that each entry is basically a Flink/Blink to the doubly linked list of free chunks managed in the table entry, or null pointers (to indicate there are no free chunks of the size that is managed by the corresponding table entry). Again, I didn’t verify in detail, as I’m more interested in how it behaves at this time.
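The table lookup described above can be modelled in a few lines. This is only a sketch of the description (128 entries, one entry per size in blocks, index 0 for everything larger), which, as said, I did not verify against the Windows 10 implementation. The names are hypothetical.

```python
TABLE_ENTRIES = 128  # assumed number of entries in the FreeList table


def freelist_index(size_in_blocks):
    """Return the table index that would manage a free chunk of this size.

    Sizes that fit in the table map to their own index; anything larger
    than what index 127 manages goes to index 0."""
    if size_in_blocks < TABLE_ENTRIES:
        return size_in_blocks
    return 0


print(freelist_index(0x21))   # small chunk: dedicated entry
print(freelist_index(0x175))  # large chunk: index 0
```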
In order to evaluate its behavior on Windows 10, we’ll run some basic tests using a couple of example applications. You can find the sourcecode and binary for this first test in the "BEA_Alloc1" folder in the github repository.
This first test application starts by allocating 2 chunks of 0x300 bytes each. This should not be a common object size in my simple test application, and should be large enough to
- avoid that the back-end allocator already has chunks of that size on its freelist
- avoid that the back-end allocator already has chunks that are bigger than this size on its freelist
- avoid that the LFH is already active for the bucket that contains chunks of this size
In any case, the goal of these 2 allocations is to check where these 2 chunks will be allocated from (i.e. from a normal segment) and if they are placed next to each other. If that is the case, and if we free both chunks, we’ll check if they get coalesced (merged together) by the BEA and managed in its free list. Finally, if we then ask for a new allocation, let’s see if the BEA is going to split the chunk and use it to satisfy new requests.
Let’s go ahead & compile + run the binary. The application will first print the address of the default process heap and will then wait for you to press return.
C:\Users\corelan\Desktop\vc++\win10\BEA_Alloc1\Release>BEA_Alloc1.exe
Default process heap found at 0x01550000
Press a key to start...
Attach WinDBG and check the state of the default process heap allocations at this point with !heap -p -h.
As I am only interested in the "free" chunks at this point, I have removed all lines from the output that correspond with a "busy" chunk and/or are part of an LFH subsegment for a different bucket size. (In case you’re wondering how to recognize those: simply look for lines that are preceded by an asterisk (*), indicating a large busy/internal chunk, and followed by a list of smaller chunks of similar sizes within the address range of this larger busy/internal chunk. This larger one is the subsegment, the smaller ones are the LFH managed chunks inside that subsegment.)
In other words, I’m only listing the chunks that are managed by the BEA and could potentially be used to satisfy the allocation requests caused by my test application.
0:003> !heap -p -h 0x01550000
_HEAP @ 1550000
_LFH_HEAP @ f80000
_HEAP_SEGMENT @ 1550000
CommittedRange @ 1550498
HEAP_ENTRY Size Prev Flags UserPtr UserSize - state
01551db0 0005 0045 [00] 01551db8 00020 - (free)
01552b00 0021 0012 [00] 01552b08 00100 - (free)
01552cb0 0015 0015 [00] 01552cb8 000a0 - (free)
015573a8 0008 000b [00] 015573b0 00038 - (free)
01557578 0002 0008 [00] 01557580 00008 - (free)
01559080 0003 0041 [00] 01559088 00010 - (free)
01559098 0003 0003 [00] 015590a0 00010 - (free)
015590e0 0003 0003 [00] 015590e8 00010 - (free)
01559110 0003 0003 [00] 01559118 00010 - (free)
01559128 0003 0003 [00] 01559130 00010 - (free)
01559140 0003 0003 [00] 01559148 00010 - (free)
01559158 0003 0003 [00] 01559160 00010 - (free)
01559170 0003 0003 [00] 01559178 00010 - (free)
01559188 0003 0003 [00] 01559190 00010 - (free)
015591a0 0003 0003 [00] 015591a8 00010 - (free)
015591d0 0003 0003 [00] 015591d8 00010 - (free)
015591e8 0003 0003 [00] 015591f0 00010 - (free)
01559200 0003 0003 [00] 01559208 00010 - (free)
01559218 0003 0003 [00] 01559220 00010 - (free)
01559230 0003 0003 [00] 01559238 00010 - (free)
0155d9b0 0018 000a [00] 0155d9b8 000b8 - (free)
0155dcc8 0012 000a [00] 0155dcd0 00088 - (free)
0155e538 002e 0021 [00] 0155e540 00168 - (free)
0155e6d0 0004 0081 [00] 0155e6d8 00018 - (free)
0155e6f0 0004 0004 [00] 0155e6f8 00018 - (free)
0155e710 0004 0004 [00] 0155e718 00018 - (free)
0155e730 0004 0004 [00] 0155e738 00018 - (free)
0155e750 0004 0004 [00] 0155e758 00018 - (free)
0155e770 0004 0004 [00] 0155e778 00018 - (free)
0155e790 0004 0004 [00] 0155e798 00018 - (free)
0155e7b0 0004 0004 [00] 0155e7b8 00018 - (free)
0155e7d0 0004 0004 [00] 0155e7d8 00018 - (free)
0155e7f0 0004 0004 [00] 0155e7f8 00018 - (free)
0155e810 0004 0004 [00] 0155e818 00018 - (free)
0155e830 0004 0004 [00] 0155e838 00018 - (free)
0155e890 0004 0004 [00] 0155e898 00018 - (free)
0155e990 0004 0004 [00] 0155e998 00018 - (free)
0155e9d0 0004 0004 [00] 0155e9d8 00018 - (free)
0155e9f0 0004 0004 [00] 0155e9f8 00018 - (free)
0155ea10 0004 0004 [00] 0155ea18 00018 - (free)
0155ea30 0004 0004 [00] 0155ea38 00018 - (free)
0155ea70 0004 0004 [00] 0155ea78 00018 - (free)
01561438 0175 0201 [00] 01561440 00ba0 - (free)
As you can see, there are no free chunks of exactly 0x300 bytes, however the last line in the output shows a free chunk of 0xba0 bytes (01561440 – 01561FE0).
This is quite normal, as this is the remaining space in the segment that has not been allocated yet. It shows up as a free chunk (because that’s exactly what it is) and will be split to satisfy new allocation requests ( = that’s pretty much how the BEA works when it has to use a larger chunk to satisfy an allocation request).
Since there are no free chunks of 0x300 bytes, a larger free chunk should be split (providing that the one in the list above is not being used by LFH… but as the flag is set to "free" and as it is not part of a larger subsegment, it shouldn’t be related with LFH at all. After all, a subsegment typically shows up as a larger chunk that is marked as "busy" and "internal").
Anyway, let’s verify to be sure:
0:003> !heap -p -a 01561438
address 01561438 found in
_HEAP @ 1550000
HEAP_ENTRY Size Prev Flags UserPtr UserSize - state
01561438 0175 0000 [00] 01561440 00ba0 - (free)
0:003> !heap -x 01561438
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
01561438 01561440 01550000 01550000 ba8 1008 0 free
The "flags" indicate that this is not an LFH managed chunk. (Otherwise, the output would have mentioned the word "LFH" as well)
Continue to run the application & see what happens:
Allocated chunk of 0x300 bytes at 0x01561440
Allocated chunk of 0x300 bytes at 0x01561748
Press return to continue
Based on the addresses returned by HeapAlloc, it looks like the final free chunk of 0xba0 bytes (at 0x1561440) was indeed split into pieces:
- a chunk of 0x300 bytes, taken from the start of the larger free chunk, allocated at 0x01561440. The larger free chunk is now smaller, but still big enough to satisfy another request.
- another piece of 0x300 bytes, allocated at 0x01561748, taken from the (new) start of the larger free chunk.
Both allocations sit next to each other, because allocations are taken from the start of a free chunk when splitting. As both allocations are taken from the same larger chunk, they are expected to be adjacent. What is left of the original 0xba0 byte chunk is 0x590 bytes: each allocation consumed 0x300 bytes plus an 8-byte chunk header. As expected, the remainder shows up as a free chunk which starts at 01561a50 and ends at 01561FE0.
WinDBG: !heap -p -h
(output limited to the 3 chunks that are relevant to this exercise: the 2 "new" allocations, and the remaining space)
01561438 0061 0201 [00] 01561440 00300 - (busy)
01561740 0061 0061 [00] 01561748 00300 - (busy)
01561a48 00b3 0061 [00] 01561a50 00590 - (free)
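The numbers in this output can be reproduced with some simple arithmetic. The sketch below models "take allocations from the start of a free chunk", assuming an 8-byte chunk header per allocation and request sizes that are a multiple of 8. It is a model of the observed behavior, not of the actual ntdll code.

```python
HEADER_SIZE = 8  # assumed chunk header size (32bit process)


def split_free_chunk(user_ptr, free_user_size, request_size, count):
    """Carve `count` allocations of request_size from the start of a free chunk.

    Returns (list of user pointers, user pointer of the remainder,
    user size of the remainder)."""
    pointers = []
    for _ in range(count):
        pointers.append(user_ptr)
        consumed = request_size + HEADER_SIZE
        user_ptr += consumed
        free_user_size -= consumed
    return pointers, user_ptr, free_user_size


# Values from the BEA_Alloc1 run: a free chunk of 0xba0 bytes at 0x01561440.
ptrs, rest_ptr, rest_size = split_free_chunk(0x01561440, 0xBA0, 0x300, 2)
print([hex(p) for p in ptrs], hex(rest_ptr), hex(rest_size))
```

The computed pointers (0x01561440 and 0x01561748) and the remainder (0x590 bytes at 0x01561a50) match what the application and WinDBG report.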
Great, but we have not been really using the FreeList mechanisms of the BEA at this point. All we have been doing is allocate chunks from within a normal segment, consuming the free space that was there all the time. To trigger the BEA FreeList mechanism and see how it behaves, we’ll have to "free" some chunks ourselves first.
BEA_Alloc2
In this second example, we’ll create a series of allocations of 0x300 bytes. We’ll free the last one and then cause 2 allocations of 0x100 bytes.
The purpose of the exercise is to check where these 2 allocations will be placed.
As we intend to evaluate the BEA mechanism, we have to avoid that the LFH gets triggered. We know that the heap manager in Windows 7 will trigger the LFH when it sees 18 consecutive allocations of a size in the same bucket (i.e. the next request will be allocated from within a LFH managed subsegment).
For that reason, we’ll only cause 10 allocations of 0x300 bytes, hopefully avoiding that the LFH will be used.
After causing the 10 allocations, let’s examine the last one and see if LFH is on or off.
App:
C:\Users\corelan\Desktop\vc++\win10\BEA_Alloc2\Release>BEA_Alloc2.exe
Default process heap found at 0x009B0000
Press a key to start...
Allocated chunk of 0x300 bytes at 0x009BF1D0
Allocated chunk of 0x300 bytes at 0x009BF4D8
Allocated chunk of 0x300 bytes at 0x009BF7E0
Allocated chunk of 0x300 bytes at 0x009BFAE8
Allocated chunk of 0x300 bytes at 0x009BFDF0
Allocated chunk of 0x300 bytes at 0x009C00F8
Allocated chunk of 0x300 bytes at 0x009C0400
Allocated chunk of 0x300 bytes at 0x009C0708
Allocated chunk of 0x300 bytes at 0x009C0A10
Allocated chunk of 0x300 bytes at 0x009C0D18
Press return to continue
WinDBG:
0:003> !heap -x 0x009C0D18
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
009c0d10 009c0d18 009b0000 009b0000 308 308 8 busy
App:
Free chunk at 0x009C0D18
Press return to continue
WinDBG:
0:003> !heap -x 0x009C0D18
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
009c0d10 009c0d18 009b0000 009b0000 12d0 308 0 free
Before running the rest of the application, let’s look at the current state of the heap, looking for "free" chunks of 0x100 bytes or larger. The output below only contains the relevant lines:
WinDBG:
0:003> !heap -p -h 0x009B0000
_HEAP @ 9b0000
_LFH_HEAP @ 610000
_HEAP_SEGMENT @ 9b0000
CommittedRange @ 9b0498
HEAP_ENTRY Size Prev Flags UserPtr UserSize - state
009b2b00 0021 0012 [00] 009b2b08 00100 - (free)
009bc330 0034 0089 [00] 009bc338 00198 - (free)
009c0d10 025a 0061 [00] 009c0d18 012c8 - (free)
The output shows three candidate free chunks:
- A free chunk of exactly 0x100 bytes (entry starts at 009b2b00, userptr is 009b2b08)
- A free chunk of 0x198 bytes (entry starts at 009bc330, userptr is 009bc338)
- The remaining free space in the segment, at 009c0d10
App:
Allocated chunk of 0x100 bytes at 0x009B2B08
Allocated chunk of 0x100 bytes at 0x009BC338
Done...
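The two 0x100 byte allocations landed at 0x009B2B08 and 0x009BC338, which are the user pointers of the 0x100 byte and 0x198 byte free chunks. One simple model that is consistent with this output is "use the smallest free chunk that is big enough". The sketch below implements that model. It is an assumption based on this single observation, not something I verified in ntdll, and it ignores what happens with the leftover of a split chunk.

```python
def pick_free_chunk(free_chunks, request_size):
    """Return the smallest (user_ptr, user_size) entry that can hold
    request_size bytes, or None if nothing fits."""
    candidates = [chunk for chunk in free_chunks if chunk[1] >= request_size]
    if not candidates:
        return None
    return min(candidates, key=lambda chunk: chunk[1])


# Free chunks reported by !heap -p -h in the BEA_Alloc2 run.
free_chunks = [(0x009B2B08, 0x100), (0x009BC338, 0x198), (0x009C0D18, 0x12C8)]

results = []
for _ in range(2):
    chunk = pick_free_chunk(free_chunks, 0x100)
    free_chunks.remove(chunk)  # simplification: the leftover is not re-added
    results.append(chunk[0])

print([hex(ptr) for ptr in results])
```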
BEA_Alloc3
C:\Users\corelan\Desktop\vc++\win10\BEA_Alloc3\Release>BEA_Alloc3.exe
Default process heap found at 0x016C0000
Press a key to start...
Allocated chunk of 0x58 bytes at 0x016C96D0
Allocated chunk of 0x100 bytes at 0x016C2B08
Allocated chunk of 0x58 bytes at 0x016CDF50
Allocated chunk of 0x100 bytes at 0x016CD328
Allocated chunk of 0x58 bytes at 0x016CD430
Allocated chunk of 0x100 bytes at 0x016CFFF8
Allocated chunk of 0x58 bytes at 0x016C2CB8
Allocated chunk of 0x100 bytes at 0x016D0100
Allocated chunk of 0x58 bytes at 0x016CDC18
Allocated chunk of 0x100 bytes at 0x016D0208
Allocated chunk of 0x58 bytes at 0x016CDC78
Allocated chunk of 0x100 bytes at 0x016D0310
Allocated chunk of 0x58 bytes at 0x016D0418
Allocated chunk of 0x100 bytes at 0x016D0478
Allocated chunk of 0x58 bytes at 0x016D0580
Allocated chunk of 0x100 bytes at 0x016D05E0
Allocated chunk of 0x58 bytes at 0x016D06E8
Allocated chunk of 0x100 bytes at 0x016D0748
Allocated chunk of 0x58 bytes at 0x016D0850
Allocated chunk of 0x100 bytes at 0x016D08B0
Press return to continue
0:003> !heap -x 0x016D08B0
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
016d08a8 016d08b0 016c0000 016c0000 108 60 8 busy
0:003> !heap -x 0x016D0850
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
016d0848 016d0850 016c0000 016c0000 60 108 8 busy
Allocated chunk of 0x100 bytes at 0x016D09B8
Allocated chunk of 0x58 bytes at 0x016D0AC0, filled with 'A'
Allocated chunk of 0x100 bytes at 0x016D0B20
Press return to continue
Great! We can use WinDBG to confirm that the 3 chunks are adjacent indeed, and that the middle chunk (0x58) is filled with A’s
(from output of !heap -p -h 0x016C0000)
016d09b0 0021 0021 [00] 016d09b8 00100 - (busy)
016d0ab8 000c 0021 [00] 016d0ac0 00058 - (busy)
016d0b18 0021 000c [00] 016d0b20 00100 - (busy)
0:001> dd 0x016d0ac0 L 0x58/4
016d0ac0 41414141 41414141 41414141 41414141
016d0ad0 41414141 41414141 41414141 41414141
016d0ae0 41414141 41414141 41414141 41414141
016d0af0 41414141 41414141 41414141 41414141
016d0b00 41414141 41414141 41414141 41414141
016d0b10 41414141 41414141
Next, the 0x58 byte chunk will be freed (which means it should end up on the freelist, and not merged with an adjacent free chunk as long as we don’t free the adjacent 0x100 byte chunks ourselves). Finally, a new allocation request for 0x58 is supposed to take that position back. The example application will populate it with B’s:
App:
Free chunk of 0x58 bytes at 0x016D0AC0
Allocated chunk of 0x58 bytes at 0x016D0AC0, filled with 'B'
Done...
WinDBG:
0:001> dd 0x016d0ac0 L 0x58/4
016d0ac0 42424242 42424242 42424242 42424242
016d0ad0 42424242 42424242 42424242 42424242
016d0ae0 42424242 42424242 42424242 42424242
016d0af0 42424242 42424242 42424242 42424242
016d0b00 42424242 42424242 42424242 42424242
016d0b10 42424242 42424242
Bingo.
BEA_Alloc4
Let’s take it one step further.
This time we’ll free a 0x58 byte chunk, and we’ll try to use a 0x80 byte allocation (where we only control the last 4 dwords), to control the exact 4 first dwords in the original 0x58 byte chunk. (I know, replacing one object with another object of a different size sounds kinky… but hey, who knows this could be useful one time … or not)
C:\Users\corelan\Desktop\vc++\win10\BEA_Alloc4\Release>BEA_Alloc4.exe
Default process heap found at 0x010D0000
Press a key to start...
<...snip...>
Allocated chunk of 0x68 bytes at 0x010E1D68
Allocated chunk of 0x58 bytes at 0x010E1DD8
Allocated chunk of 0x80 bytes at 0x010E1E38
Allocated chunk of 0x68 bytes at 0x010E1EC0
Allocated chunk of 0x58 bytes at 0x010E1F30
Allocated chunk of 0x80 bytes at 0x010E1F90
Allocated chunk of 0x68 bytes at 0x010E2018
Press return to continue
Allocated start chunk (0x80 bytes) at 0x010E2088
Allocated first chunk (0x68 bytes) at 0x010E2110
Allocated second 'vulnerable' chunk (0x58 bytes) at 0x010E2180, filled with 'A'
Allocated end chunk (0x80 bytes) at 0x010E21E0
Press return to continue
WinDBG:
010e2080 0011 000e [00] 010e2088 00080 - (busy)
010e2108 000e 0011 [00] 010e2110 00068 - (busy)
010e2178 000c 000e [00] 010e2180 00058 - (busy)
010e21d8 0011 000c [00] 010e21e0 00080 - (busy)
App:
Free chunk of 0x58 bytes at 0x010E2180
Free first chunk of 0x68 bytes at 0x010E2110
Press return to continue
WinDBG:
0:003> !heap -p -a 0x010E2110
address 010e2110 found in
_HEAP @ 10d0000
HEAP_ENTRY Size Prev Flags UserPtr UserSize - state
010e2108 001a 0000 [00] 010e2110 000c8 - (free)
0:003> dd 0x010E2110 L 0xc8/4
010e2110 010e2268 010de858 00000000 00000000
010e2120 00000000 00000000 00000000 00000000
010e2130 00000000 00000000 00000000 00000000
010e2140 00000000 00000000 00000000 00000000
010e2150 00000000 00000000 00000000 00000000
010e2160 00000000 00000000 00000000 00000000
010e2170 00000000 00000000 0c00000c 00008812
010e2180 010e2268 010de858 41414141 41414141
010e2190 41414141 41414141 41414141 41414141
010e21a0 41414141 41414141 41414141 41414141
010e21b0 41414141 41414141 41414141 41414141
010e21c0 41414141 41414141 41414141 41414141
010e21d0 41414141 41414141
App:
Allocated chunk of 0x80 bytes at 0x010E2110, filled with 'B'
Done...
WinDBG:
0:001> !heap -p -a 0x010E2110
address 010e2110 found in
_HEAP @ 10d0000
HEAP_ENTRY Size Prev Flags UserPtr UserSize - state
010e2108 0011 0000 [00] 010e2110 00080 - (busy)
0:001> dd 0x010E2110 L 0xc8/4
010e2110 42424242 42424242 42424242 42424242
010e2120 42424242 42424242 42424242 42424242
010e2130 42424242 42424242 42424242 42424242
010e2140 42424242 42424242 42424242 42424242
010e2150 42424242 42424242 42424242 42424242
010e2160 42424242 42424242 42424242 42424242
010e2170 42424242 42424242 42424242 42424242
010e2180 42424242 42424242 42424242 42424242
010e2190 6a309389 0000880d 010dee10 010d1db8
010e21a0 41414141 41414141 41414141 41414141
010e21b0 41414141 41414141 41414141 41414141
010e21c0 41414141 41414141 41414141 41414141
010e21d0 41414141 41414141
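The arithmetic behind this result is worth spelling out. The sketch below recomputes the size of the coalesced free chunk, and which part of the new 0x80 byte allocation overlaps with the user area of the old 0x58 byte chunk. It uses the addresses and block sizes from the output above, and assumes 8-byte blocks and 8-byte chunk headers. The helper names are mine.

```python
BLOCK_SIZE = 8
HEADER_SIZE = 8


def coalesce(first_hdr, first_blocks, second_hdr, second_blocks):
    """Merge two adjacent free chunks.

    Returns (header address, size in blocks, user size in bytes)."""
    if first_hdr + first_blocks * BLOCK_SIZE != second_hdr:
        raise ValueError("chunks are not adjacent")
    total_blocks = first_blocks + second_blocks
    return first_hdr, total_blocks, total_blocks * BLOCK_SIZE - HEADER_SIZE


def overlap(new_user_ptr, new_size, old_user_ptr):
    """Return (offset inside the new allocation, number of bytes) of the part
    of the new allocation located at or beyond old_user_ptr."""
    end = new_user_ptr + new_size
    return old_user_ptr - new_user_ptr, max(0, end - old_user_ptr)


# 0x68 byte chunk (0x0e blocks) followed by the 0x58 byte chunk (0x0c blocks).
hdr, blocks, user_size = coalesce(0x010E2108, 0x0E, 0x010E2178, 0x0C)
print(hex(blocks), hex(user_size))

# The new 0x80 byte allocation starts at the user pointer of the merged chunk.
offset, nbytes = overlap(hdr + HEADER_SIZE, 0x80, 0x010E2180)
print(hex(offset), hex(nbytes), nbytes // 4)
```

The merged chunk is 0x1a blocks (0xc8 bytes of user data), and the last 0x10 bytes (4 dwords, at offset 0x70) of the 0x80 byte allocation cover the first 4 dwords of the former 0x58 byte chunk at 0x010E2180, which is what the dd output shows.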
The Front-End Allocator – LFH
LFH_Alloc1
In this first exercise, we’ll examine if it still takes 0x12 (18) consecutive allocations for a size in the same bucket before the LFH will start taking care of allocations and frees of those sizes.
In order to avoid any outside influence or assumptions, I’ll use chunk sizes that have not been allocated yet (0x1500 bytes, 0x2100 bytes, 0x3000 bytes, 0x800 bytes).
I’ll combine a couple of tests in this test:
- check how many allocations it takes before LFH kicks in
- see if the LFH trigger is influenced when an allocation happens of a different bucket size during the series of allocations
- see if the LFH trigger is influenced when a free happens during the allocations, of a chunk in a different bucket/of a different size
- see if the LFH trigger is influenced when a chunk of the same bucket is freed again, during the series of allocations
Step 1: how many allocations are needed to trigger LFH?
App:
C:\Users\corelan\Desktop\vc++\win10\LFH_Alloc1\Release>LFH_Alloc1.exe
Default process heap found at 0x00D30000
Press a key to start...
[1] Allocated chunk of 0x1500 bytes at 0x00D4CF68
[2] Allocated chunk of 0x1500 bytes at 0x00D4E470
[3] Allocated chunk of 0x1500 bytes at 0x00D4F978
[4] Allocated chunk of 0x1500 bytes at 0x00D50E80
[5] Allocated chunk of 0x1500 bytes at 0x00D52388
[6] Allocated chunk of 0x1500 bytes at 0x00D53890
[7] Allocated chunk of 0x1500 bytes at 0x00D54D98
[8] Allocated chunk of 0x1500 bytes at 0x00D562A0
[9] Allocated chunk of 0x1500 bytes at 0x00D577A8
[10] Allocated chunk of 0x1500 bytes at 0x00D58CB0
[11] Allocated chunk of 0x1500 bytes at 0x00D5A1B8
[12] Allocated chunk of 0x1500 bytes at 0x00D5B6C0
[13] Allocated chunk of 0x1500 bytes at 0x00D5CBC8
[14] Allocated chunk of 0x1500 bytes at 0x00D5E0D0
[15] Allocated chunk of 0x1500 bytes at 0x00D5F5D8
[16] Allocated chunk of 0x1500 bytes at 0x00D60AE0
[17] Allocated chunk of 0x1500 bytes at 0x00D61FE8
[18] Allocated chunk of 0x1500 bytes at 0x00D6B348
[19] Allocated chunk of 0x1500 bytes at 0x00D64A20
[20] Allocated chunk of 0x1500 bytes at 0x00D69E40
[21] Allocated chunk of 0x1500 bytes at 0x00D6C850
[22] Allocated chunk of 0x1500 bytes at 0x00D63518
[23] Allocated chunk of 0x1500 bytes at 0x00D6DD58
[24] Allocated chunk of 0x1500 bytes at 0x00D67430
[25] Allocated chunk of 0x1500 bytes at 0x00D6F260
[26] Allocated chunk of 0x1500 bytes at 0x00D65F28
[27] Allocated chunk of 0x1500 bytes at 0x00D70768
[28] Allocated chunk of 0x1500 bytes at 0x00D68938
[29] Allocated chunk of 0x1500 bytes at 0x00D71C70
[30] Allocated chunk of 0x1500 bytes at 0x00D77438
Press return to continue
If you pay close attention to the addresses, you can see a bigger gap between allocations 17 and 18. This is a good indication that allocations are no longer individual chunks inside a normal segment, but are now being consumed from an LFH subsegment. To be sure, let’s validate the findings in WinDBG.
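Spotting that gap by eye gets tedious with longer series, so here is a small Python sketch that finds the first allocation that breaks the regular stride. It only looks at the addresses printed by the test application (the first 18 of the 0x1500 byte series). A broken stride is merely a hint that something changed, not proof that the LFH is involved.

```python
def first_stride_break(addresses):
    """Return the 1-based number of the first allocation whose distance to
    its predecessor differs from the stride between the first two
    allocations, or None if the stride never changes."""
    stride = addresses[1] - addresses[0]
    for i in range(2, len(addresses)):
        if addresses[i] - addresses[i - 1] != stride:
            return i + 1
    return None


# Allocations 1 to 18 of the 0x1500 byte series.
addresses = [
    0x00D4CF68, 0x00D4E470, 0x00D4F978, 0x00D50E80, 0x00D52388, 0x00D53890,
    0x00D54D98, 0x00D562A0, 0x00D577A8, 0x00D58CB0, 0x00D5A1B8, 0x00D5B6C0,
    0x00D5CBC8, 0x00D5E0D0, 0x00D5F5D8, 0x00D60AE0, 0x00D61FE8, 0x00D6B348,
]

print(hex(addresses[1] - addresses[0]))
print(first_stride_break(addresses))
```

The regular stride is 0x1508 bytes (0x1500 bytes plus the 8-byte header), and the first allocation that breaks it is number 18.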
Simply run !heap -x on all addresses (starting from the first one in the list), until you find the first one that has the "LFH" marker.
Allocation 17:
0:001> !heap -x 0x00D61FE8
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
00d61fe0 00d61fe8 00d30000 00d30000 1508 1508 8 busy
Allocation 18:
0:001> !heap -x 0x00D6B348
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
00d6b340 00d6b348 00d30000 00d3ae28 1508 - 8 LFH;busy
So – it looks like the LFH takes over with allocation 18.
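Checking the "LFH" marker can be scripted as well, for example when you log the WinDBG output to a file. The sketch below parses a single data line of !heap -x output. It assumes the Flags column is the last whitespace-separated token and that flags are separated by a semicolon, as in the outputs shown in this post. Other WinDBG versions or heap flags may produce a different layout.

```python
def parse_heap_x_line(line):
    """Parse one data line of '!heap -x' output into a dictionary."""
    fields = line.split()
    flags = fields[-1].split(";")
    return {
        "entry": int(fields[0], 16),
        "user": int(fields[1], 16),
        "heap": int(fields[2], 16),
        "segment": int(fields[3], 16),
        "size": int(fields[4], 16),
        "flags": flags,
        "is_lfh": "LFH" in flags,
    }


alloc17 = parse_heap_x_line("00d61fe0 00d61fe8 00d30000 00d30000 1508 1508 8 busy")
alloc18 = parse_heap_x_line("00d6b340 00d6b348 00d30000 00d3ae28 1508 - 8 LFH;busy")

print(alloc17["is_lfh"], alloc18["is_lfh"])
```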
Step 2: will the LFH trigger be influenced if another allocation (of a size in a different bucket) occurs in the series of 18 allocations?
App:
[1] Allocated chunk of 0x2100 bytes at 0x00D93500
[2] Allocated chunk of 0x2100 bytes at 0x00D95608
[3] Allocated chunk of 0x2100 bytes at 0x00D97710
[4] Allocated chunk of 0x2100 bytes at 0x00D99818
[5] Allocated chunk of 0x2100 bytes at 0x00D9B920
[6] Allocated chunk of 0x2100 bytes at 0x00D9DA28
[7] Allocated chunk of 0x2100 bytes at 0x00D9FB30
[8] Allocated chunk of 0x2100 bytes at 0x00DA1C38
[9] Allocated chunk of 0x2100 bytes at 0x00DA3D40
[10] Allocated chunk of 0x2100 bytes at 0x00DA5E48
Allocated chunk of 0x300 bytes at 0x00D49450
[11] Allocated chunk of 0x2100 bytes at 0x00DA7F50
[12] Allocated chunk of 0x2100 bytes at 0x00DAA058
[13] Allocated chunk of 0x2100 bytes at 0x00DAC160
[14] Allocated chunk of 0x2100 bytes at 0x00DAE268
[15] Allocated chunk of 0x2100 bytes at 0x00DB0370
[16] Allocated chunk of 0x2100 bytes at 0x00DB2478
[17] Allocated chunk of 0x2100 bytes at 0x00DB4580
[18] Allocated chunk of 0x2100 bytes at 0x00DC10D8
[19] Allocated chunk of 0x2100 bytes at 0x00DBAAC0
[20] Allocated chunk of 0x2100 bytes at 0x00DC32E0
Press return to continue
Focusing on allocations 17 and 18 (of 0x2100 bytes), we can see that allocation 17 was not handled by the LFH yet, but allocation 18 was, despite the allocation of a 0x300 byte chunk in the middle of the series.
0:001> !heap -x 0x00DB4580
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
00db4578 00db4580 00d30000 00d30000 2108 2108 8 busy
0:001> !heap -x 0x00DC10D8
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
00dc10d0 00dc10d8 00d30000 00d3ae78 2208 - 3f LFH;busy
Interesting indeed :)
Step 3: will a free (of a different chunk size) influence the LFH trigger?
App:
[1] Allocated chunk of 0x3000 bytes at 0x00DD6690
[2] Allocated chunk of 0x3000 bytes at 0x00DD9698
[3] Allocated chunk of 0x3000 bytes at 0x00DDC6A0
[4] Allocated chunk of 0x3000 bytes at 0x00DDF6A8
[5] Allocated chunk of 0x3000 bytes at 0x00DE26B0
[6] Allocated chunk of 0x3000 bytes at 0x00DE56B8
[7] Allocated chunk of 0x3000 bytes at 0x00DE86C0
[8] Allocated chunk of 0x3000 bytes at 0x00DEB6C8
[9] Allocated chunk of 0x3000 bytes at 0x00DEE6D0
[10] Allocated chunk of 0x3000 bytes at 0x00DF16D8
Freed chunk at 0x00D49C80
Freed chunk at 0x00D4A188
[11] Allocated chunk of 0x3000 bytes at 0x00DF46E0
[12] Allocated chunk of 0x3000 bytes at 0x00DF76E8
[13] Allocated chunk of 0x3000 bytes at 0x00DFA6F0
[14] Allocated chunk of 0x3000 bytes at 0x00DFD6F8
[15] Allocated chunk of 0x3000 bytes at 0x00E00700
[16] Allocated chunk of 0x3000 bytes at 0x00E03708
[17] Allocated chunk of 0x3000 bytes at 0x00E06710
[18] Allocated chunk of 0x3000 bytes at 0x00E0C748
[19] Allocated chunk of 0x3000 bytes at 0x00E15760
[20] Allocated chunk of 0x3000 bytes at 0x00E1B770
Done...
WinDBG (focus on allocation 17 and 18 again):
0:001> !heap -x 0x00E06710
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
00e06708 00e06710 00d30000 00d30000 3008 3008 8 busy
0:001> !heap -x 0x00E0C748
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
00e0c740 00e0c748 00d30000 00d3aea0 3008 - 8 LFH;busy
Interesting once more:)
Step 4: free a chunk from the same bucket during the allocation series.
[1] Allocated chunk of 0x800 bytes at 0x004EAE70
[2] Allocated chunk of 0x800 bytes at 0x0053E910
[3] Allocated chunk of 0x800 bytes at 0x0053F118
[4] Allocated chunk of 0x800 bytes at 0x0053F920
[5] Allocated chunk of 0x800 bytes at 0x00540128
[6] Allocated chunk of 0x800 bytes at 0x00540930
[7] Allocated chunk of 0x800 bytes at 0x00541138
[8] Allocated chunk of 0x800 bytes at 0x00541940
[9] Allocated chunk of 0x800 bytes at 0x00542148
[10] Allocated chunk of 0x800 bytes at 0x00542950
Freed chunk at 0x00542950
[11] Allocated chunk of 0x800 bytes at 0x00542950
[12] Allocated chunk of 0x800 bytes at 0x00543158
[13] Allocated chunk of 0x800 bytes at 0x00543960
[14] Allocated chunk of 0x800 bytes at 0x00544168
[15] Allocated chunk of 0x800 bytes at 0x00544970
[16] Allocated chunk of 0x800 bytes at 0x00545178
[17] Allocated chunk of 0x800 bytes at 0x00545980
[18] Allocated chunk of 0x800 bytes at 0x009F0070
[19] Allocated chunk of 0x800 bytes at 0x009F2090
[20] Allocated chunk of 0x800 bytes at 0x009F68D8
Done...
WinDBG (looking again at allocation 17 and 18)
0:004> !heap -x 0x00545980
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
00545978 00545980 00450000 00450000 808 808 8 busy
0:004> !heap -x 0x009F0070
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
009f0068 009f0070 00450000 00458c60 808 - 8 LFH;busy
Interestingly enough, the free did not really impact the LFH trigger at all. 18 allocations still did the trick.
I guess more exhaustive testing is needed to confirm this behaviour, including checking what happens if there are more frees etc… but at least we are able to see a pattern: it may be difficult to prevent the LFH from getting enabled eventually, especially if you have no other option than to cause a certain number of allocations (more than 18) of chunks in the same bucket.
If you know of a way to prevent this from happening, please let me know :)
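To repeat this kind of check over many runs, it helps to classify the `!heap -x` rows automatically instead of reading them by eye. The following is a minimal sketch in Python, assuming the eight-column layout shown in the WinDBG output above; the function names are hypothetical and this is not part of the original test applications.

```python
def parse_heap_x(row):
    """Parse one data row of WinDbg `!heap -x` output into a dict.

    Assumed column order (taken from the output shown in this post):
    Entry User Heap Segment Size PrevSize Unused Flags
    """
    entry, user, heap, segment, size, prev_size, unused, flags = row.split()
    return {
        "entry": int(entry, 16),
        "user": int(user, 16),
        "heap": int(heap, 16),
        "size": int(size, 16),
        # PrevSize is printed as "-" for LFH chunks, so it is kept as text.
        "prev_size": prev_size,
        "lfh": "LFH" in flags.split(";"),
    }


def first_lfh_allocation(rows):
    """Return the 1-based position of the first LFH-backed row, or None."""
    for position, row in enumerate(rows, start=1):
        if parse_heap_x(row)["lfh"]:
            return position
    return None


# Rows for allocations 17 and 18 from step 4 above.
rows = [
    "00545978 00545980 00450000 00450000 808 808 8 busy",
    "009f0068 009f0070 00450000 00458c60 808 - 8 LFH;busy",
]
for row in rows:
    info = parse_heap_x(row)
    print(hex(info["user"]), "LFH" if info["lfh"] else "back-end")
```

Feeding it one row per allocation (in allocation order) makes it easy to confirm whether the switch to the LFH still happens at the same position when extra allocations or frees are mixed in.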
LFH_Alloc2
In the second exercise, we’ll see if the LFH still behaves the same way as under Windows 7 – i.e. returning freed chunks in a LIFO manner. We’ll activate the LFH using 20 allocations of 0x500 bytes. The last one will be freed, and then another allocation of 0x500 bytes will happen.
The goal is to see if the last one to be freed will be the first one to be returned again. (LIFO).
C:\Users\corelan\Desktop\vc++\win10\LFH_Alloc2\Release>LFH_Alloc2.exe
Default process heap found at 0x008D0000
Press a key to start...
[1] Allocated chunk of 0x500 bytes at 0x008E0440
[2] Allocated chunk of 0x500 bytes at 0x008E0948
[3] Allocated chunk of 0x500 bytes at 0x008E0E50
[4] Allocated chunk of 0x500 bytes at 0x008E1358
[5] Allocated chunk of 0x500 bytes at 0x008E1860
[6] Allocated chunk of 0x500 bytes at 0x008E1D68
[7] Allocated chunk of 0x500 bytes at 0x008E2270
[8] Allocated chunk of 0x500 bytes at 0x008E2778
[9] Allocated chunk of 0x500 bytes at 0x008E2C80
[10] Allocated chunk of 0x500 bytes at 0x008E3188
[11] Allocated chunk of 0x500 bytes at 0x008E3690
[12] Allocated chunk of 0x500 bytes at 0x008E3B98
[13] Allocated chunk of 0x500 bytes at 0x008E40A0
[14] Allocated chunk of 0x500 bytes at 0x008E45A8
[15] Allocated chunk of 0x500 bytes at 0x008E4AB0
[16] Allocated chunk of 0x500 bytes at 0x008E4FB8
[17] Allocated chunk of 0x500 bytes at 0x008E54C0
[18] Allocated chunk of 0x500 bytes at 0x008E7D28
[19] Allocated chunk of 0x500 bytes at 0x008E5EF8
[20] Allocated chunk of 0x500 bytes at 0x008E6E10
Press return to continue
Freed chunk at 0x008E6E10
Press return to continue
Allocated chunk of 0x500 bytes at 0x008E9148
Press return to continue
0:003> !heap -x 0x008E6E10
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
008e6e08 008e6e10 008d0000 008d8d90 508 - 0 LFH;free
0:003> !heap -x 0x008E9148
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
008e9140 008e9148 008d0000 008d8d90 508 - 8 LFH;busy
Perhaps causing a series of allocations of 0x500 bytes will allow us to take the freed chunk back. Press return to cause another 20 allocations of 0x500 bytes and see what happens:
Allocated chunk of 0x500 bytes at 0x008E59F0
Allocated chunk of 0x500 bytes at 0x008E7318
Allocated chunk of 0x500 bytes at 0x008E8738
Allocated chunk of 0x500 bytes at 0x008E6400
Allocated chunk of 0x500 bytes at 0x008E8C40
Allocated chunk of 0x500 bytes at 0x008E6908
Allocated chunk of 0x500 bytes at 0x008E7820
Allocated chunk of 0x500 bytes at 0x008E8230
Allocated chunk of 0x500 bytes at 0x008E6E10 --- got it back ---
Allocated chunk of 0x500 bytes at 0x008EC740
Allocated chunk of 0x500 bytes at 0x008EEA78
Allocated chunk of 0x500 bytes at 0x008EBD30
Allocated chunk of 0x500 bytes at 0x008EC238
Allocated chunk of 0x500 bytes at 0x008EDB60
Allocated chunk of 0x500 bytes at 0x008EF488
Allocated chunk of 0x500 bytes at 0x008ECC48
Allocated chunk of 0x500 bytes at 0x008F0DB0
Allocated chunk of 0x500 bytes at 0x008E99F8
Allocated chunk of 0x500 bytes at 0x008EEF80
Allocated chunk of 0x500 bytes at 0x008EB828
Press return to continue
In this case, it took another 9 allocations to get the freed chunk back. In fact, if you run the same application a couple of times, you’ll see that the number of allocations it takes to get the freed chunk back varies widely, between 0 (sometimes you’ll get it back LIFO style) and never (at least, not in the first 20 allocations or so)… but in most cases I got it back within the first 10 allocations. (A lot more structured testing would be needed to find the sweet spot that would provide some sort of predictability. It’ll probably never be 100% reliable, but it may not be too messy either.)
Update (7/7/2016): I added "LFH_TakeBack" to the github repository, which automates some statistics gathering. For each chunk size between 8 and 0x4000, it enables the LFH, allocates a chunk and frees it again, and then measures how many allocations are needed to take it back. The application calculates an average, a minimum and a maximum number of tries, and also keeps track of how many times the object was not taken back within the first 2000 allocations.
After running the app, it looks like the maximum number of allocations needed sits around 50.
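The statistics described above are simple to aggregate once the per-run counts have been collected. Below is a minimal sketch in Python of that aggregation step only; it is not the LFH_TakeBack source, the function name is hypothetical, and the sample counts in the demo are made up purely to show the shape of the input (`None` stands for "not taken back within the limit").

```python
def summarize_takeback(tries, limit=2000):
    """Summarize how many allocations were needed to reclaim a freed chunk.

    tries: one entry per run; an int (number of allocations needed) or
           None when the chunk was not taken back within `limit` allocations.
    """
    hits = [t for t in tries if t is not None and t <= limit]
    not_taken_back = len(tries) - len(hits)
    if not hits:
        return {"average": None, "minimum": None, "maximum": None,
                "not_taken_back": not_taken_back}
    return {
        "average": sum(hits) / len(hits),
        "minimum": min(hits),
        "maximum": max(hits),
        "not_taken_back": not_taken_back,
    }


# Hypothetical sample counts, for illustration only (not measured data).
print(summarize_takeback([1, 9, None, 50]))
```

Keeping the "not taken back" counter separate from the average matters here: folding those runs into the average would hide exactly the cases that make a layout unreliable.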
Anyway, looking at the addresses of the allocations, it also looks like the chunks are no longer adjacent (as they were in Windows 7, at least as long as the chunks are inside the same subsegment).
This will certainly make it more complex to create a specific layout/sequence of objects when the LFH is enabled.
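The loss of adjacency is easy to see by computing the distance between consecutive allocations. The sketch below (Python, hypothetical helper names) uses the 20 addresses printed by LFH_Alloc2 above and counts how many allocations from the start are exactly one block (0x500 bytes plus the 8-byte header) apart.

```python
def strides(addresses):
    """Distance in bytes between each allocation and the next one."""
    return [nxt - cur for cur, nxt in zip(addresses, addresses[1:])]


def contiguous_run(addresses, stride):
    """Count how many allocations, from the start, are exactly `stride` apart."""
    count = 1
    for delta in strides(addresses):
        if delta != stride:
            break
        count += 1
    return count


# User pointers printed by LFH_Alloc2 (allocations 1 to 20).
LFH_ALLOC2 = [
    0x008E0440, 0x008E0948, 0x008E0E50, 0x008E1358, 0x008E1860,
    0x008E1D68, 0x008E2270, 0x008E2778, 0x008E2C80, 0x008E3188,
    0x008E3690, 0x008E3B98, 0x008E40A0, 0x008E45A8, 0x008E4AB0,
    0x008E4FB8, 0x008E54C0, 0x008E7D28, 0x008E5EF8, 0x008E6E10,
]

BLOCK = 0x500 + 8  # requested size plus the 8-byte chunk header

print("contiguous allocations:", contiguous_run(LFH_ALLOC2, BLOCK))
print("strides after that:", [hex(d) for d in strides(LFH_ALLOC2[16:])])
```

In this single run, the back-end allocations are perfectly sequential, while allocations 18 to 20 (served by the LFH) jump back and forth. Their addresses are still a whole number of 0x508-byte blocks apart, which suggests the slots themselves are regular and only the order in which they are handed out is not.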
LFH_Alloc3
Is LFH still limited to 0x4000 byte chunks (max)?
App:
C:\Users\corelan\Desktop\vc++\win10\LFH_Alloc3\Release>LFH_Alloc3.exe
Default process heap found at 0x00930000
Press a key to start...
[1] Allocated chunk of 0x4000 bytes at 0x00940FF8
[2] Allocated chunk of 0x4000 bytes at 0x00945000
[3] Allocated chunk of 0x4000 bytes at 0x00949008
[4] Allocated chunk of 0x4000 bytes at 0x0094D010
[5] Allocated chunk of 0x4000 bytes at 0x00951018
[6] Allocated chunk of 0x4000 bytes at 0x00955020
[7] Allocated chunk of 0x4000 bytes at 0x00959028
[8] Allocated chunk of 0x4000 bytes at 0x0095D030
[9] Allocated chunk of 0x4000 bytes at 0x00961038
[10] Allocated chunk of 0x4000 bytes at 0x00965040
[11] Allocated chunk of 0x4000 bytes at 0x00969048
[12] Allocated chunk of 0x4000 bytes at 0x0096D050
[13] Allocated chunk of 0x4000 bytes at 0x00971058
[14] Allocated chunk of 0x4000 bytes at 0x00975060
[15] Allocated chunk of 0x4000 bytes at 0x00979068
[16] Allocated chunk of 0x4000 bytes at 0x0097D070
[17] Allocated chunk of 0x4000 bytes at 0x00981078
[18] Allocated chunk of 0x4000 bytes at 0x0098D0B8
[19] Allocated chunk of 0x4000 bytes at 0x009950C8
[20] Allocated chunk of 0x4000 bytes at 0x009910C0
Press return to continue
[1] Allocated chunk of 0x4008 bytes at 0x009C7008
[2] Allocated chunk of 0x4008 bytes at 0x009CB018
[3] Allocated chunk of 0x4008 bytes at 0x009CF028
[4] Allocated chunk of 0x4008 bytes at 0x009D3038
[5] Allocated chunk of 0x4008 bytes at 0x009D7048
[6] Allocated chunk of 0x4008 bytes at 0x009DB058
[7] Allocated chunk of 0x4008 bytes at 0x009DF068
[8] Allocated chunk of 0x4008 bytes at 0x009E3078
[9] Allocated chunk of 0x4008 bytes at 0x009E7088
[10] Allocated chunk of 0x4008 bytes at 0x009EB098
[11] Allocated chunk of 0x4008 bytes at 0x009EF0A8
[12] Allocated chunk of 0x4008 bytes at 0x009F30B8
[13] Allocated chunk of 0x4008 bytes at 0x009F70C8
[14] Allocated chunk of 0x4008 bytes at 0x009FB0D8
[15] Allocated chunk of 0x4008 bytes at 0x009FF0E8
[16] Allocated chunk of 0x4008 bytes at 0x00A030F8
[17] Allocated chunk of 0x4008 bytes at 0x00A07108
[18] Allocated chunk of 0x4008 bytes at 0x00A0B118
[19] Allocated chunk of 0x4008 bytes at 0x00A0F128
[20] Allocated chunk of 0x4008 bytes at 0x00A13138
Press return to continue
WinDBG: 0x4000 bytes
0:001> !heap -x 0x00981078
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
00981070 00981078 00930000 00930000 4008 4008 8 busy
0:001> !heap -x 0x0098D0B8
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
0098d0b0 0098d0b8 00930000 00938c48 4008 - 8 LFH;busy
WinDBG: 0x4008 bytes
0:001> !heap -x 0x00A07108
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
00a07100 00a07108 00930000 00930000 4010 4010 8 busy
0:001> !heap -x 0x00A0B118
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
00a0b110 00a0b118 00930000 00930000 4010 4010 8 busy
Answer: YES, 0x4000 seems to be the maximum size (just like on Windows 7)
LFH_TakeBack2
The LFH_Alloc* exercises demonstrate that chunks are no longer allocated in a consecutive manner inside an LFH subsegment. Of course, this complicates creating a specific layout within the subsegment. I still wanted to know if it would be possible to replace the memory space used by an LFH chunk with an LFH allocation of a size from a different bucket. As an example, can I take the space of a 0x58 byte chunk using a 0x88 byte allocation, within the LFH?
As the LFH subsegments are used to keep chunks of the same bucket size together, this would require clearing out the entire subsegment used for storing the vulnerable object, and hopefully the Heap Manager will reuse those pages for another subsegment (allocations for a different bucket size).
LFH_TakeBack2 demonstrates if it works or not. The idea is to try to place the "vulnerable" object in a subsegment where you control all the other chunks. As soon as the vulnerable object gets freed, you cause all the other chunks to be freed as well. Hopefully, this will also release the entire subsegment and its pages, so they can be reused again (even for a subsegment associated with chunksizes that fall in a different bucket).
App:
C:\Users\corelan\Desktop\vc++\win10_heap\LFH_TakeBack2\Release>LFH_TakeBack2.exe
Vulnerable object of 0x00000058 bytes at 0x01136B98, filled with 'A'
Allocations done. Press return to start free process
WinDBG:
0:003> dd 0x01136B98
01136b98 41414141 41414141 41414141 41414141
01136ba8 41414141 41414141 41414141 41414141
01136bb8 41414141 41414141 41414141 41414141
01136bc8 41414141 41414141 41414141 41414141
01136bd8 41414141 41414141 41414141 41414141
01136be8 41414141 41414141 07fe9621 88011d00
01136bf8 00000000 00000000 00000000 00000000
01136c08 00000000 00000000 00000000 00000000
0:003> !heap -p -a 0x01136B98
address 01136b98 found in
_HEAP @ e30000
HEAP_ENTRY Size Prev Flags UserPtr UserSize - state
01136b90 000c 0000 [00] 01136b98 00058 - (busy)
0:003> !heap -x 0x01136B98
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
01136b90 01136b98 00e30000 00e45638 60 - 8 LFH;busy
Continue with App:
Vulnerable object at 0x01136B98 was freed
Free done. Press return to start new allocations (size 0x00000088)
Allocations done. Check if 0x01136B98 contains 'B' now
WinDBG:
0:001> dd 0x01136B98
01136b98 42424242 42424242 42424242 42424242
01136ba8 42424242 42424242 42424242 42424242
01136bb8 42424242 42424242 42424242 42424242
01136bc8 42424242 42424242 42424242 42424242
01136bd8 42424242 42424242 42424242 42424242
01136be8 42424242 42424242 42424242 42424242
01136bf8 42424242 42424242 42424242 42424242
01136c08 42424242 42424242 07f296dd 8800bf00
0:001> !heap -x 0x01136B98
Entry User Heap Segment Size PrevSize Unused Flags
-----------------------------------------------------------------------------
01136b80 01136b88 00e30000 00e38ca0 90 - 8 LFH;busy
Size is now 0x90. So, the mechanism still works (just like in older Windows versions). Of course, because the sequence of chunks inside a segment is not fully under our control, it will be difficult to control specific bytes of the freed object when you replace it with a different sized chunk.
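To see which bytes of the replacement chunk actually land on top of the freed object, you can intersect the two address ranges. The following is a small Python sketch (the helper name is hypothetical) using the values from the WinDBG output above: the freed 0x58 byte object at 0x01136B98, and the 0x88 byte replacement whose user pointer is 0x01136B88.

```python
def overlay_range(freed_ptr, freed_size, new_ptr, new_size):
    """Return (start, end) offsets inside the new chunk that overlap the
    freed object, or None when the two ranges do not overlap at all."""
    start = max(freed_ptr, new_ptr)
    end = min(freed_ptr + freed_size, new_ptr + new_size)
    if start >= end:
        return None
    return (start - new_ptr, end - new_ptr)


# Values taken from the LFH_TakeBack2 run shown above.
overlap = overlay_range(0x01136B98, 0x58, 0x01136B88, 0x88)
print([hex(offset) for offset in overlap])
```

For this particular run, the freed object starts 0x10 bytes into the new chunk. That offset depends on where the new subsegment places its chunks, so it has to be expected to differ between runs, which is the practical meaning of "difficult to control specific bytes".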
Large chunks
Large_Alloc1
What do VirtualAllocdBlock chunks/allocations look like under Windows 10?
In order to create such chunks, we need to cause HeapAlloc/RtlAllocateHeap allocations of a size that is larger than the VirtualMemoryThreshold value in the heap header. In the example application, I am triggering 20 allocations of 0x7ffb0 bytes (which is larger than the 0x7ff00 byte threshold).
App:
C:\Users\corelan\Desktop\vc++\win10\Large_Alloc1\Release>Large_Alloc1.exe
Default process heap found at 0x00E20000
Press a key to start...
[1] Allocated chunk of 0x7ffb0 bytes at 0x00C48020
[2] Allocated chunk of 0x7ffb0 bytes at 0x01110020
[3] Allocated chunk of 0x7ffb0 bytes at 0x011A1020
[4] Allocated chunk of 0x7ffb0 bytes at 0x0123F020
[5] Allocated chunk of 0x7ffb0 bytes at 0x01326020
[6] Allocated chunk of 0x7ffb0 bytes at 0x013BA020
[7] Allocated chunk of 0x7ffb0 bytes at 0x0144C020
[8] Allocated chunk of 0x7ffb0 bytes at 0x014D2020
[9] Allocated chunk of 0x7ffb0 bytes at 0x0156D020
[10] Allocated chunk of 0x7ffb0 bytes at 0x015F4020
[11] Allocated chunk of 0x7ffb0 bytes at 0x01680020
[12] Allocated chunk of 0x7ffb0 bytes at 0x01712020
[13] Allocated chunk of 0x7ffb0 bytes at 0x017AB020
[14] Allocated chunk of 0x7ffb0 bytes at 0x0183B020
[15] Allocated chunk of 0x7ffb0 bytes at 0x018C9020
[16] Allocated chunk of 0x7ffb0 bytes at 0x01957020
[17] Allocated chunk of 0x7ffb0 bytes at 0x019EA020
[18] Allocated chunk of 0x7ffb0 bytes at 0x01A75020
[19] Allocated chunk of 0x7ffb0 bytes at 0x01B09020
[20] Allocated chunk of 0x7ffb0 bytes at 0x01B9D020
Press return to continue
WinDBG (output of !heap -p -h, limited to information related to VirtualAllocdBlocks):
VirtualAllocdBlocks @ e2009c
00c48018 10000 0004 [00] 00c48020 7ffb0 - (busy VirtualAlloc)
01110018 10000 0000 [00] 01110020 7ffb0 - (busy VirtualAlloc)
011a1018 10000 0000 [00] 011a1020 7ffb0 - (busy VirtualAlloc)
0123f018 10000 0000 [00] 0123f020 7ffb0 - (busy VirtualAlloc)
01326018 10000 0000 [00] 01326020 7ffb0 - (busy VirtualAlloc)
013ba018 10000 0000 [00] 013ba020 7ffb0 - (busy VirtualAlloc)
0144c018 10000 0000 [00] 0144c020 7ffb0 - (busy VirtualAlloc)
014d2018 10000 0000 [00] 014d2020 7ffb0 - (busy VirtualAlloc)
0156d018 10000 0000 [00] 0156d020 7ffb0 - (busy VirtualAlloc)
015f4018 10000 0000 [00] 015f4020 7ffb0 - (busy VirtualAlloc)
01680018 10000 0000 [00] 01680020 7ffb0 - (busy VirtualAlloc)
01712018 10000 0000 [00] 01712020 7ffb0 - (busy VirtualAlloc)
017ab018 10000 0000 [00] 017ab020 7ffb0 - (busy VirtualAlloc)
0183b018 10000 0000 [00] 0183b020 7ffb0 - (busy VirtualAlloc)
018c9018 10000 0000 [00] 018c9020 7ffb0 - (busy VirtualAlloc)
01957018 10000 0000 [00] 01957020 7ffb0 - (busy VirtualAlloc)
019ea018 10000 0000 [00] 019ea020 7ffb0 - (busy VirtualAlloc)
01a75018 10000 0000 [00] 01a75020 7ffb0 - (busy VirtualAlloc)
01b09018 10000 0000 [00] 01b09020 7ffb0 - (busy VirtualAlloc)
01b9d018 10000 0000 [00] 01b9d020 7ffb0 - (busy VirtualAlloc)
Yes, we can still trigger this kind of allocation. By their nature, these allocations are still placed at the start of a fresh page (which is why we’re seeing the aligned start addresses). On the other hand, the gaps between two allocations seem to be larger than under Windows 7.
This means that it will become harder to use this type of allocations to fill up a larger memory region as part of a heap spray. There will be much bigger holes in between allocations, and the locations of the holes are also non-predictable, which means we may not be able to rely on absolute heap spray addresses as much as we could in Windows 7.
Perhaps a larger series of allocations is needed, and a larger number of runs, to find "sweet spots", addresses that are allocated more often than others. Not sure what kind of percentage of predictability we may be able to obtain, but it might be worth the try.
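To put a number on those gaps, the sketch below (Python, hypothetical helper names) measures the unused space between consecutive VirtualAllocdBlocks from the Large_Alloc1 run above. It assumes each block starts 0x20 bytes before the user pointer (as the ????0020 addresses suggest) and that its footprint is the user size plus that header, rounded up to a 0x1000 byte page; both are assumptions drawn from the output, not verified against ntdll.dll.

```python
PAGE = 0x1000
HEADER = 0x20  # assumption: user pointer sits 0x20 bytes into the block


def round_up(value, alignment):
    """Round `value` up to the next multiple of `alignment`."""
    return (value + alignment - 1) // alignment * alignment


def gaps_between(user_pointers, user_size):
    """Unused bytes between the end of each block and the start of the next."""
    footprint = round_up(user_size + HEADER, PAGE)
    bases = [pointer - HEADER for pointer in user_pointers]
    return [nxt - (cur + footprint) for cur, nxt in zip(bases, bases[1:])]


# User pointers printed by Large_Alloc1 (allocations 1 to 20).
LARGE_ALLOC1 = [
    0x00C48020, 0x01110020, 0x011A1020, 0x0123F020, 0x01326020,
    0x013BA020, 0x0144C020, 0x014D2020, 0x0156D020, 0x015F4020,
    0x01680020, 0x01712020, 0x017AB020, 0x0183B020, 0x018C9020,
    0x01957020, 0x019EA020, 0x01A75020, 0x01B09020, 0x01B9D020,
]

gaps = gaps_between(LARGE_ALLOC1, 0x7FFB0)
print("gaps:", [hex(gap) for gap in gaps])
print("smallest:", hex(min(gaps)), "largest:", hex(max(gaps)))
```

Under these assumptions, every pair of neighbouring blocks in this run has a hole between them, and the hole sizes differ from one pair to the next. Running the same calculation over many runs would be one way to look for the "sweet spot" addresses mentioned above.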
Large_Alloc2
So… can we do a precise heap spray under Windows 10?
Well… yes. The key is to avoid the LFH, and to avoid VirtualAllocdBlocks as well.
Use a "sweet" size and a "sweet" number of allocs to get aligned consecutive allocations (starting at ????0048) as a normal chunk, inside a normal segment. Perhaps the first few allocations won’t start at that aligned address (because there are already some smaller allocations in the segment), but as soon as the allocations trigger the creation of another segment, and you manage to take the first spot, your allocations should be aligned.
App:
C:\Users\corelan\Desktop\vc++\win10\Large_Alloc2\Release>Large_Alloc2.exe
Default process heap found at 0x012B0000
Press a key to start...
[1] Allocated chunk at 0x012C0FF8
[2] Allocated chunk at 0x01300FF8
[3] Allocated chunk at 0x01340FF8
[4] Allocated chunk at 0x015B0048
[5] Allocated chunk at 0x015F0048
[6] Allocated chunk at 0x01630048
[7] Allocated chunk at 0x016B0048
[8] Allocated chunk at 0x016F0048
[9] Allocated chunk at 0x01730048
[10] Allocated chunk at 0x01770048
[11] Allocated chunk at 0x017B0048
[12] Allocated chunk at 0x017F0048
[13] Allocated chunk at 0x01830048
[14] Allocated chunk at 0x018B0048
[15] Allocated chunk at 0x018F0048
[16] Allocated chunk at 0x01930048
[17] Allocated chunk at 0x01970048
[18] Allocated chunk at 0x019B0048
[19] Allocated chunk at 0x019F0048
[20] Allocated chunk at 0x01A30048
[21] Allocated chunk at 0x01A70048
[22] Allocated chunk at 0x01AB0048
...
WinDBG:
0:003> d 0c0c0c0c
0c0c0c0c 41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0c0c0c1c 41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0c0c0c2c 41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0c0c0c3c 41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0c0c0c4c 41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0c0c0c5c 41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0c0c0c6c 41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
0c0c0c7c 41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
To turn this into a precise heap spray, we need to overcome the fact that we don’t know the exact start address of the first one. As we know that the start addresses will be aligned eventually to the start of a page, and we can control the size of the allocations, we simply have to repeat the same structure (junk + ROP + shellcode + junk) every 0x1000 bytes inside each allocation (as explained in the heap spray tutorials on this site). This should allow you to put/find your content at a predictable address. The same logic applies if you need to put specific values at specific places… simply repeat the layout every 0x1000 bytes and you should be fine.
In "Precise_Spray", the goal is to put marker "$$$$" (\x24\x24\x24\x24) at 0x0c0c0c0c:
C:\Users\corelan\Desktop\vc++\win10_heap\Precise_Spray\Release>Precise_Spray.exe
Default process heap found at 0x00550000
Press a key to start...
Spray done, check 0x0c0c0c0c
>> Contents at 0x0c0c0c0c: 24242424
0:003> db 0c0c0c0c
0c0c0c0c 24 24 24 24 20 20 20 20-20 20 20 20 20 20 20 20 $$$$
0c0c0c1c 20 20 20 20 20 20 20 20-20 20 20 20 20 20 20 20
0c0c0c2c 20 20 20 20 20 20 20 20-20 20 20 20 20 20 20 20
0c0c0c3c 20 20 20 20 20 20 20 20-20 20 20 20 20 20 20 20
0c0c0c4c 20 20 20 20 20 20 20 20-20 20 20 20 20 20 20 20
0c0c0c5c 20 20 20 20 20 20 20 20-20 20 20 20 20 20 20 20
0c0c0c6c 20 20 20 20 20 20 20 20-20 20 20 20 20 20 20 20
0c0c0c7c 20 20 20 20 20 20 20 20-20 20 20 20 20 20 20 20
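The "repeat the layout every 0x1000 bytes" idea boils down to a bit of offset arithmetic. The sketch below is a Python illustration of how one spray chunk could be laid out; it is not the Precise_Spray source, the helper names are hypothetical, and it assumes the chunks end up with their user data starting at page offset 0x048 (the ????0048 pattern seen in Large_Alloc2) and that a 0x40000-8 byte chunk size is used.

```python
PAGE = 0x1000


def build_spray_chunk(marker, target, chunk_page_offset, chunk_size,
                      filler=b"\x20"):
    """Build one spray chunk that repeats `marker` once per page, so that it
    lands on the page offset of `target` whatever page the chunk starts on.

    chunk_page_offset: page offset of the chunk's user data (assumed 0x048).
    """
    offset = (target - chunk_page_offset) % PAGE
    if offset + len(marker) > PAGE:
        raise ValueError("marker would cross a page boundary")
    unit = filler * offset + marker + filler * (PAGE - offset - len(marker))
    repeats = -(-chunk_size // PAGE)  # ceiling division
    return (unit * repeats)[:chunk_size]


def read_at(chunk, chunk_address, address, length):
    """Return the bytes that `address` would hold if `chunk` were placed
    with its user data at `chunk_address`."""
    start = address - chunk_address
    return chunk[start:start + length]


chunk = build_spray_chunk(b"$$$$", 0x0C0C0C0C, 0x048, 0x40000 - 8)
# 0x0C0B0048 is a hypothetical chunk address ending in 0048 that covers the target.
print(read_at(chunk, 0x0C0B0048, 0x0C0C0C0C, 4))
```

With a target of 0x0c0c0c0c and chunks starting at ????0048, the marker sits 0xbc4 bytes into every 0x1000 byte unit. Any chunk that starts at an address ending in 0x048 and covers the target address will then have the marker exactly at the target.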
Of course, in a complex/multithreaded application, there will be ‘noise’ (other allocations and frees) while your heap spray is running, which could affect the placement of your allocations within the segment. A possible approach could be to cause some large allocations first (50 allocations of 0x1ff00 bytes or so… any big size, smaller than the chunk size you’re using for the actual spray), each time followed by a small allocation (which we will keep allocated, to prevent the big ones from being coalesced), and then free the large ones. That way, the application can use those freed chunks, split them, consume them, without bothering your aligned spray at all.
I’ve had good results with spraying using chunk sizes of 0x20000-8 bytes, and 0x40000-8 bytes, but I guess any similar aligned size that is a multiple of a page size will work.
Good luck y’all. <3
Peter
Oh yeah, before I forget, please check out:
https://facebook.com/demandglobalchange // https://bit.ly/demandglobalchange_full // https://bit.ly/demandglobalchange
Read & share. Give people reasons to live, not to die for. thank you.
© 2016 – 2021, Peter Van Eeckhoutte (corelanc0d3r). All rights reserved.