DAR (DLL Address Retrieving)

Locating which DLLs are loaded into a process address space is a common task in virus analysis, game patching, debugging or system tools coding. Here are a few words about it.

Introduction

There are several techniques to retrieve the address of a DLL in memory. Of course the easiest one is to use EnumProcessModules(). The short program below displays some useful information related to the running process modules:

#include <windows.h>
#include <conio.h>
#include <psapi.h>
#include <stdio.h>

#pragma comment(lib,"psapi.lib")

int main (int argc,char* argv[])
{
  HMODULE hProcessModule[100];
  MODULEINFO miModuleInfo;
  DWORD dwReturn;
  DWORD dwModuleNb;
  char BaseName[128];
  char FileName[MAX_PATH];
  DWORD i;

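  if ( !EnumProcessModules(GetCurrentProcess(),hProcessModule,sizeof(hProcessModule),&dwReturn) )
    return 1;

  // dwReturn is the size needed for all the handles; it may exceed our buffer
  if ( dwReturn > sizeof(hProcessModule) )
    dwReturn = sizeof(hProcessModule);

  dwModuleNb = dwReturn / sizeof(HMODULE);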

  for ( i = 0; i < dwModuleNb; i++ )
  {
    GetModuleBaseName(GetCurrentProcess(),hProcessModule[i],BaseName,sizeof(BaseName));

    GetModuleFileName(hProcessModule[i],FileName,sizeof(FileName));

    GetModuleInformation(GetCurrentProcess(),hProcessModule[i],&miModuleInfo,sizeof(miModuleInfo));

    printf(" Base Name:     %s\n",BaseName);

    printf(" File Name:     %s\n",FileName);

    printf(" Load Address:  %p\n",miModuleInfo.lpBaseOfDll);

    printf(" Size of Image: %08X\n",miModuleInfo.SizeOfImage);

    printf(" Entry Point:   %p\n\n",miModuleInfo.EntryPoint);
  }

  while ( !_kbhit() ) ;

  return 0;
}
 

Figure 1. EnumModules application source code.

Keep the following output at hand, we will come back to it :-):

Figure 2. EnumModules application output.

Another cool function is LoadLibrary(). It loads a DLL into the calling process address space, and the value returned is nothing other than the base address of the DLL, that is, the location where it has been mapped.

EnumProcessModules() belongs to the psapi.dll library. In order to use it, you just need this DLL loaded. To load this library, well, just use the kernel32.dll LoadLibrary() function seen above!

The Fun Begins

Suppose now you are injecting your own wonderful debugging tool code into a remote process. Let's say you need to list the modules used by this process, or do anything else you want. You may think, "well, my injected code calls LoadLibrary() to load the DLL needed, then a few GetProcAddress() calls will do the job of retrieving the addresses of the functions to use".

Not really! Actually, the LoadLibrary() call (like the other function calls) you try to inject refers to the import section of your own code, not to that of the targeted process... Hence, the result will most likely be a crash! You must reference the remote process address space, not yours.

How to solve that? Obviously there are plenty of methods. A common one is to hardcode the DLL loading address. For instance, you may notice that Windows maps kernel32.dll to the same address every time. But it's not very portable, as this base address may change depending on the OS or the library version.

There are a lot of other techniques, but one appears to be very efficient because it does not involve any DLL function call: delving directly into the process data...

Back To Processes

Processes use a lot of data structures but, basically, a process is represented internally by one main "block" of data: the executive process block. Note that the literature refers to it as EPROCESS; do not confuse it with the PEB, a different structure described below.

Unfortunately it's an undocumented piece of data; you won't find much information about it, even when searching the DDK. Actually, it doesn't matter because some gurus, like Matt Pietrek or Sven Schreiber, have done a great "hacking" job. Furthermore, a lot of useful tools let us dig into Windows internals ourselves.

WinDbg

The main one is the Microsoft Windows Debugger (WinDbg). It's both a user-mode and a kernel-mode debugger, it uses a GUI front end and, last but not least, it's free. If you are tired of character-based console debugging tools like KD (Microsoft Kernel Debugger) you'll find this one very handy.

Broadly speaking, when you want to deal with drivers, debuggers, symbols, etc., the right place to go is the Windows Hardware and Driver Central (WHDC). Once there, just skim through the "Microsoft Debugging Tools" and "Windows Symbols" columns.

Always check that you are using the latest version of the debugging tools package, and be aware that some extensions are not available depending on whether you are using NT, 2K, XP, 2003, etc. Everything that follows was done on a Windows XP Home Edition machine. The dissected application is EnumModules, seen above.

EPROCESS Snapshot

In order to view the EPROCESS data directly, just use the following Display Type (dt) command in WinDbg: dt _EPROCESS.

Plenty of fields and offsets then show up. Here is a portion of the command output:

  0x000 Pcb                      _KPROCESS
  0x06c ProcessLock              _EX_PUSH_LOCK
  0x070 CreateTime               _LARGE_INTEGER
  0x078 ExitTime                 _LARGE_INTEGER
  0x080 RundownProtect           _EX_RUNDOWN_REF
  0x084 UniqueProcessId          Ptr32 Void
  ...   ...
  0x1b0 Peb                      Ptr32 _PEB
  ...   ...
  0x250 NextPageColor            Uint2B
  0x252 SubSystemMinorVersion    UChar
  0x253 SubSystemMajorVersion    UChar
  0x252 SubSystemVersion         Uint2B
  0x254 PriorityClass            UChar
  0x255 WorkingSetAcquiredUnsafe UChar
 

Figure 3. dt _EPROCESS command partial output.

The field at offset 0x1B0 is 4 bytes long. It's a pointer to a block of data called the Process Environment Block (PEB). While the EPROCESS data is located in the system address space, the PEB lives elsewhere, in the process address space. The reason is that some of its fields need to be updated by user-mode code. That's a good thing for us, because we can then handle it very easily...

Examining The PEB

This structure is of great interest if you want to peer into a process. It contains information about what is called its image (the copy of the executable file in memory), its heap, its Thread-Local Storage (TLS) data, etc. Actually, it's mainly a set of pointers to structures... which themselves point to other structures!

According to IMW2K [1], the PEB is always mapped at 0x7FFDF000. You can check it with WinDbg by entering the following command: !peb.

Not only do you get a formatted view of the information in the process environment block, but its location too. The figure below shows the contents of the first PEB fields for a running process. You can see the PEB location, 7FFDF000h, on the first line of the output:

PEB at 7ffdf000

  InheritedAddressSpace    No
  ReadImageFileExecOptions No
  BeingDebugged            Yes
  ImageBaseAddress         00400000
  Ldr                      00241ea0
  ...                      ...
 

Figure 4. !peb command partial output.

Relating to our subject, the most interesting field is Ldr (LoaDeR). To locate its offset, we simply use the dt command again: dt _PEB.

The following output is then displayed (extract):

  0x000 InheritedAddressSpace    UChar
  0x001 ReadImageFileExecOptions UChar
  0x002 BeingDebugged            UChar
  0x003 SpareBool                UChar
  0x004 Mutant                   Ptr32 Void
  0x008 ImageBaseAddress         Ptr32 Void
  0x00c Ldr                      Ptr32 _PEB_LDR_DATA
  0x010 ProcessParameters        Ptr32 _RTL_USER_PROCESS_PARAMETERS
  ...   ...                      ...
 

Figure 5. dt _PEB command partial output.

The offset to remember is 0x00C. We see that this field is 10h-0Ch = 4 bytes in length; yes, it's another pointer, and the structure involved is called PEB_LDR_DATA. Let's keep going: dt _PEB_LDR_DATA.

This time I give you a full output:

  0x000 Length                          Uint4B
  0x004 Initialized                     UChar
  0x008 SsHandle                        Ptr32 Void
  0x00c InLoadOrderModuleList           _LIST_ENTRY
  0x014 InMemoryOrderModuleList         _LIST_ENTRY
  0x01c InInitializationOrderModuleList _LIST_ENTRY
  0x024 EntryInProgress                 Ptr32 Void
 

Figure 6. dt _PEB_LDR_DATA command output.

All right! Field names ending with "ModuleList" should ring a bell. They very likely have something to do with the modules used by the running process... We just need to go one step further to get details about the LIST_ENTRY structure: dt _LIST_ENTRY.

Here is the confusing output:

  0x000 Flink Ptr32 _LIST_ENTRY
  0x004 Blink Ptr32 _LIST_ENTRY
 

Figure 7. The LIST_ENTRY structure.

Well, it's a classic doubly linked list. The first four bytes hold the forward link (F) whereas the last four hold the backward link (B). To make things clearer, let's take a "living" example.

Exploration of EnumModules Application PEB

First, let's locate the Ldr address: dt _PEB Ldr 7ffdf000

We get:

  0x00c Ldr 0x00241ea0
 

Figure 8. Ldr field location in memory.

Hence, we can infer the following addresses for the PEB_LDR_DATA structure fields:

  0x00241EA0 Length
  0x00241EA4 Initialized
  0x00241EA8 SsHandle
  0x00241EAC InLoadOrderModuleList
  0x00241EB4 InMemoryOrderModuleList
  0x00241EBC InInitializationOrderModuleList
  0x00241EC4 EntryInProgress
 

Figure 9. PEB_LDR_DATA fields addresses.

Let's deal with the InInitializationOrderModuleList field now:

dt 241ebc _LIST_ENTRY

  0x000 Flink 0x00241f58 [ 0x242020 - 0x241ebc ]
  0x004 Blink 0x002420e0 [ 0x241ebc - 0x242020 ]

dt 241f58 _LIST_ENTRY

  0x000 Flink 0x00242020 [ 0x2420e0 - 0x241f58 ]
  0x004 Blink 0x00241ebc [ 0x241f58 - 0x2420e0 ]

dt 242020 _LIST_ENTRY

  0x000 Flink 0x002420e0 [ 0x241ebc - 0x242020 ]
  0x004 Blink 0x00241f58 [ 0x242020 - 0x241ebc ]

dt 2420e0 _LIST_ENTRY

  0x000 Flink 0x00241ebc [ 0x241f58 - 0x2420e0 ]
  0x004 Blink 0x00242020 [ 0x2420e0 - 0x241f58 ]
 

Figure 10. InInitializationOrderModuleList doubly linked list addresses.

Here is a graphical representation of the doubly linked list addresses involved:

Figure 11. Readable version of the InInitializationOrderModuleList doubly linked list.

Exploration of EnumModules Application Memory

The trick is now to dump the memory from the addresses seen above:

dc 241ebc L18

  00241ebc 00241f58 002420e0 00000000 abababab X.$.. $.........
  00241ecc abababab 00000000 00000000 0008000d ................
  00241edc 001c0700 00241f48 00241eac 00241f50 ....H.$...$.P.$.
  00241eec 00241eb4 00000000 00000000 00400000 ..$...........@.
  00241efc 004013ac 0000a000 00360034 00020654 ..@.....4.6.T...
  00241f0c 0020001e 0002066a 00005000 0000ffff .. .j....P......

dc 241f58 L18

  00241f58 00242020 00241ebc 77f40000 00000000 $...$....w....
  00241f68 000ae000 0208003a 77fb2618 00140012 ....:....&.w....
  00241f78 77f41380 00005004 0000ffff 77fb47c8 ...w.P.......G.w
  00241f88 77fb47c8 3eb1b460 00000000 abababab .G.w`..>........
  00241f98 abababab feeefeee 00000000 00000000 ................
  00241fa8 000d000c 001e0700 003a0043 0057005c ........C.:.\.W.

dc 242020 L18

  00242020 002420e0 00241f58 77e40000 77e5ae60 . $.X.$....w`..w
  00242030 000f6000 00420040 00241fb0 001a0018 .`..@.B...$.....
  00242040 00241fd8 00085004 0000ffff 77fb47b0 ..$..P.......G.w
  00242050 77fb47b0 3d6e6b99 00000000 abababab .G.w.kn=........
  00242060 abababab feeefeee 00000000 00000000 ................
  00242070 000d000b 001c0700 003a0043 0057005c ........C.:.\.W.

dc 2420e0 L18

  002420e0 00241ebc 00242020 76ba0000 76ba10c8 ..$. $....v...v
  002420f0 0000b000 003c003a 00242078 00140012 ....:.<.x $.....
  00242100 002420a0 00001006 0000ffff 77fb47d8 . $..........G.w
  00242110 77fb47d8 3d6e6b93 00000000 abababab .G.w.kn=........
  00242120 abababab feeefeee 00000000 00000000 ................
  00242130 000d07da feee1400 00240178 00240178 ........x.$.x.$.
 

Figure 12. Memory dump of InInitializationOrderModuleList double linked list addresses.

A lot of interesting things can be located now. Go back to figure 2 and you will see that some values are simply the module load addresses. Look more carefully and you will identify other values displayed by the EnumModules application too. Let's enter the following command in our debugger: dc 241ebc L18. We get:

Figure 13. Values already seen...

Hence, we can assume that:

- 1 is the loading address of the module.

- 2 is the entry point of the module.

- 3 is the size, in bytes, occupied by the module.

Concerning the values shown in brown and blue, let's do some more memory dumps:

dc 20654 L10

  00020654 003a0043 0053005c 006d0079 006f0062 C.:.\.S.y.m.b.o.
  00020664 0073006c 0045005c 0075006e 004d006d l.s.\.E.n.u.m.M.
  00020674 0064006f 006c0075 00730065 0065002e o.d.u.l.e.s...e.
  00020684 00650078 00000000 003a0043 0053005c x.e.....C.:.\.S.

dc 2066a L10

  0002066a 006e0045 006d0075 006f004d 00750064 E.n.u.m.M.o.d.u.
  0002067a 0065006c 002e0073 00780065 00000065 l.e.s...e.x.e...
  0002068a 00430000 005c003a 00790053 0062006d ..C.:.\.S.y.m.b.
  0002069a 006c006f 005c0073 006e0045 006d0075 o.l.s.\.E.n.u.m.

dc 77fb2618 L10

  77fb2618 003a0043 0057005c 004e0049 004f0044 C.:.\.W.I.N.D.O.
  77fb2628 00530057 0053005c 00730079 00650074 W.S.\.S.y.s.t.e.
  77fb2638 0033006d 005c0032 0074006e 006c0064 m.3.2.\.n.t.d.l.
  77fb2648 002e006c 006c0064 0000006c 00000000 l...d.l.l.......

dc 77f41380 L10

  77f41380 0074006e 006c0064 002e006c 006c0064 n.t.d.l.l...d.l.
  77f41390 0000006c 00000000 ffffffff 77f47431 l...........1t.w
  77f413a0 77f47447 00000000 ffffffff 77f523d0 Gt.w.........#.w
  77f413b0 77f523de 00000000 ffffffff 00000000 .#.w............

dc 241fb0 L10

  00241fb0 003a0043 0057005c 004e0049 004f0044 C.:.\.W.I.N.D.O.
  00241fc0 00530057 0073005c 00730079 00650074 W.S.\.s.y.s.t.e.
  00241fd0 0033006d 005c0032 0065006b 006e0072 m.3.2.\.k.e.r.n.
  00241fe0 006c0065 00320033 0064002e 006c006c e.l.3.2...d.l.l.

dc 241fd8 L10

  00241fd8 0065006b 006e0072 006c0065 00320033 k.e.r.n.e.l.3.2.
  00241fe8 0064002e 006c006c abab0000 abababab ..d.l.l.........
  00241ff8 feeeabab feeefeee 00000000 00000000 ................
  00242008 000c000d 001c0700 002420d0 00241f48 ......... $.H.$.

dc 242078 L10

  00242078 003a0043 0057005c 004e0049 004f0044 C.:.\.W.I.N.D.O.
  00242088 00530057 0053005c 00730079 00650074 W.S.\.S.y.s.t.e.
  00242098 0033006d 005c0032 00530050 00500041 m.3.2.\.P.S.A.P.
  002420a8 002e0049 004c0044 0000004c abababab I...D.L.L.......

dc 2420a0 L10

  002420a0 00530050 00500041 002e0049 004c0044 P.S.A.P.I...D.L.
  002420b0 0000004c abababab abababab feeefeee L...............
  002420c0 00000000 00000000 000b000d 001c0700 ................
  002420d0 00241eac 00242010 00241eb4 00242018 ..$.. $...$.. $.
 

Figure 14. Module names coming to light...

Well, no need to scratch our heads for long: these addresses simply hold the names of the modules. As they are in Unicode form, some familiarity with the DDK headers lets us infer what the preceding bytes are for. In fact, the module name pointer and the four preceding bytes are likely the members of a UNICODE_STRING structure.

When we do: dt 241efc+8 _UNICODE_STRING, the output fits with our assumption:

"C:\Symbols\EnumModules.exe"

  0x000 Length        0x34
  0x002 MaximumLength 0x36
  0x004 Buffer        0x00020654 "C:\Symbols\EnumModules.exe"
 

Figure 15. UNICODE_STRING structure fits very well.

The string "C:\Symbols\EnumModules.exe" is 26 characters long and needs 52 bytes to be stored in Unicode form. 52 is 34h, thus the first member of the structure is correct ("not including the terminating NULL character, if any", as the MSDN UNICODE_STRING topic puts it). The second field is two bytes larger because "\0\0" ends every module name string.

Hence, the brown value is a kind of shortcut for GetModuleFileName() whereas the blue value is a GetModuleBaseName() alias. Maybe another "graphical" figure will be more explicit:

Figure 16. Bytes relating to module names.

- 4 is the length, in bytes, of the module file name (including the file path but not including the terminating "\0\0" characters).

- 5 is the size, in bytes, of the allocated memory for storing the string holding the file name.

- 6 is the pointer to the wide-character string holding the file name.

- 7 is the length, in bytes, of the module base name (not including the file path nor the terminating "\0\0" characters).

- 8 is the size, in bytes, of the allocated memory for storing the string holding the base name.

- 9 is the pointer to the wide-character string holding the base name.

Comments

Purpose

Now, hook or inject into as many processes as you want and you will be able to infer a lot of things:

- The InInitializationOrderModuleList member maintains a list of pointers to loaded modules.

- This list is "sorted", not alphabetically but, as its name implies, according to the modules' initialization order.

- The first entry always refers to the application module.

- The second one always refers to the ntdll.dll library.

- The third one always refers to the kernel32.dll library.

Even an application made of a single ret ASM instruction will load ntdll.dll and kernel32.dll. Well, at least according to the WinDbg outputs...

Shifting

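As figure 12 shows, the data relating to the first module are shifted: several bytes are inserted between the backward link and the module load address. Surprisingly, 52 (34h) bytes are inserted while the process is being debugged and 36 (24h) otherwise! A likely explanation is that the first item is the list head stored inside PEB_LDR_DATA itself, so the application data that follows belongs to a neighbouring heap block; the abababab filler visible in figure 12 is typical of the extra heap checking enabled when a process is started under a debugger.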

Interest

If we only care about library modules, a useful conclusion emerges: as the ntdll.dll and kernel32.dll entries are always located at the same positions in the list, retrieving their base addresses is child's play; from the appropriate LIST_ENTRY, just read the value at offset+8!

Once you have the base address of a DLL, you then need to find the actual addresses of its functions. This can be done by parsing the PE format and looking for the Export Directory Table (EDT). An upcoming article will discuss this point in detail.

Conclusion

Actually, everything worth remembering can be summarized in two figures. First, the general scheme:

Figure 17. Looking for InInitializationOrderModuleList...

Then, the meaning of each byte of the InInitializationOrderModuleList doubly linked list items:

  0x000 Flink                   Ptr32 _LIST_ENTRY
  0x004 Blink                   Ptr32 _LIST_ENTRY
  0x008 DllBase                 Ptr32
  0x00C EntryPoint              Ptr32
  0x010 SizeOfImage             Uint4B
  0x014 FullDllName             UNICODE_STRING
    0x014 (0x000) Length        Uint2B
    0x016 (0x002) MaximumLength Uint2B
    0x018 (0x004) Buffer        Ptr32
  0x01C BaseDllName             UNICODE_STRING
    0x01C (0x000) Length        Uint2B
    0x01E (0x002) MaximumLength Uint2B
    0x020 (0x004) Buffer        Ptr32
 

Figure 18. InInitializationOrderModuleList items unscrambled.

Well, in fact, the figure above is just a partial decoding. We can go deeper and find some other useful fields, like a time date stamp for instance. But it's of no interest for our purpose.

Small Example

From within your own code, just access the PEB directly at 0x7FFDF000; from remote code you can reach the PEB through the Thread Environment Block (TEB), as we'll see in a future article. For now, here is a "PEB" version of the EnumModules application seen at the beginning of this text; a funny mixed C/ASM piece :-):

#include <windows.h>
#include <conio.h>
#include <stdio.h>

int main (int argc,char* argv[])
{
  void* Flink = NULL;
  void* p = NULL;
  void* EntryPoint = NULL;
  void* FullDllName = NULL;
  void* BaseDllName = NULL;
  void* DllBase = NULL;
  DWORD SizeOfImage = 0;

  // Jump to the first InInitializationOrderModuleList item

  _asm
  {
    mov edx,0x7ffdf00C
    mov edx,DWORD PTR [edx]
    add edx,0x1C
    mov Flink,edx
    mov p,edx
  }

  // Loop through the list

  do
  {
    _asm
    {
      mov eax,p

      ; Next Flink

      mov edx,DWORD PTR [eax]
      mov p,edx

      ; DllBase

      mov edx,DWORD PTR [eax+8]
      mov DllBase,edx

      ; Not the first item?

      cmp edx,0
      jne Good

      ; First item

      mov edx,DWORD PTR [eax+0x24+8]
      mov DllBase,edx
      add eax,0x24

      Good:

      mov edx,DWORD PTR [eax+0x0C]
      mov EntryPoint,edx
      mov edx,DWORD PTR [eax+0x10]
      mov SizeOfImage,edx
      mov edx,DWORD PTR [eax+0x18]
      mov FullDllName,edx
      mov edx,DWORD PTR [eax+0x20]
      mov BaseDllName,edx
    }

    wprintf(L" Base Name:     %ls\n",(wchar_t*)BaseDllName);

    wprintf(L" File Name:     %ls\n",(wchar_t*)FullDllName);

    printf(" Load Address:  %p\n",DllBase);

    printf(" Size of Image: %08X\n",SizeOfImage);

    printf(" Entry Point:   %p\n\n",EntryPoint);
  }

  while ( Flink != p );

  while ( !_kbhit() ) ;
}
 

Figure 19. EnumModules "PEB" version source code.

Bibliography

Papers

- Windows Assembly Components, LSD Research Group - 2002.
- Escape from DLL Hell with Custom Debugging and Instrumentation Tools and Utilities, Christophe Nasarre - MSDN Magazine, 2002.
- Gaining Important Datas from PEB Under NT Boxes, Ratter/29A - 2002.

Books

- Undocumented Windows 2000 Secrets, Sven Schreiber - Addison-Wesley, 2001.
- Inside Microsoft Windows 2000 (3rd ed.), David Solomon and Mark Russinovich - Microsoft Press, 2000.

Footnotes

1. Inside Microsoft Windows 2000, see bibliography above.

Credits

Coroner - Internal conflicts (musical inspiration).

Enjoy, cya!

(Written 04/16/2004, revised 10/13/2009)

Copyright © 2003-2017 Arnold McDonald. All rights reserved.