Stack Overflows: EIP Overwrite – A different approach…

  • Post Info:
    1. # Author: Flavio do Carmo Junior aka waKKu
      # Date: April 06, 2011
      # Category: Exploiting, Programming, Security

    Today my pal tuxtrack at #dclabs (irc.freenode.net) asked me for some help understanding Stack Overflows (more precisely, the EIP overwrite part)…
    As I know him (and part of his knowledge), I decided to take a different approach from the usual “blah, blah, blah” about C, strcpy() and the stack.
    It may be useful to someone else, so I’m posting it:

    Before our journey inside gdb, lemme explain some details:

    First question:
    — Why are we seeing instructions like this:
    0x08048383 : mov %eax,0x4(%esp)
    0x0804838a : mov %eax,(%esp)
    – instead of usual:
    push %eax
    (Some instructions to set a new value for %eax)
    push %eax
    ???????

    Answer: Well, each “push” instruction really behaves like this pair of instructions:
    sub $4, %esp
    movl %eax, (%esp)

    So every “push” also has to update %esp, and a run of pushes can be SLOWER than reserving the space once and using plain “movl” instructions with a fixed offset from %esp, like this one:
    0x08048383 : mov %eax,0x4(%esp)

    The reason for this is the way GCC generates code; we could use some specific options while compiling the program to get the old “push” style…
    But compiler-tuned code is what you meet in the real world nowadays, so I thought it better to work with it.

    Second Question:
    — Why did you say “CALL pushes the current value of EIP + 5 bytes” (EIP == the CALL instruction’s own address)?

    Answer:
    – For this explanation, think of EIP at the moment CALL is being executed as pointing to the address of the CALL instruction itself. This CALL instruction is 5 bytes long when it targets another address (label) directly: 1 byte for the CALL opcode + 4 bytes for the 32-bit displacement to the target, which is relative to the next instruction. Therefore the address right after the CALL instruction is exactly “Address of CALL” + 5 bytes.

    <shinku> CALL isn’t ALWAYS 5 bytes long ;P (Translated IRC talk)
    — True, when using call with, for example, a register it has a different size:
    0: ff d0 call *%eax
    More about CALL instruction: http://home.comcast.net/~fbui/intel_c.html#call
    Obviously, in this case, the pushed address would be CURRENT_EIP + CALL_LENGTH…

    And I talk a bit more about it below.

    Finally, the code and comments trying to explain this “technique”:

    waKKu@0xcd80$ cat tux.c
    #include <stdio.h>
    
    void blah_function(char *buffer, int bleh) {
            printf("%d\n", bleh);
    }
    
    int main(void) {
            char *buf;
            int  nada;
    
            blah_function(buf, nada); // Imagine "blah_function" suffers from a stack buffer overflow
            return(1);
    }
    waKKu@0xcd80$ gcc -O0 -fno-stack-protector -z execstack -o tux tux.c
    waKKu@0xcd80$ gdb --quiet ./tux
    (gdb) disass main
    Dump of assembler code for function main:
    0x0804836f <main+0>:    lea    0x4(%esp),%ecx
    0x08048373 <main+4>:    and    $0xfffffff0,%esp
    0x08048376 <main+7>:    pushl  0xfffffffc(%ecx)
    
    
    # Begin of assembly prologue (saving stack frame)
    0x08048379 <main+10>:   push   %ebp
    # Saving the current frame
    
    0x0804837a <main+11>:   mov    %esp,%ebp
    # Copying current frame to "ebp", creating our new "local" frame
    
    0x0804837c <main+13>:   push   %ecx
    # Now we are saving the value of "ecx" that was computed before the prologue
    # This is another GCC optimization for the "main" function
    # But this last instruction used the first slot of our local frame (ebp-4). If we wrote our
    # variables right below "ebp", we would overwrite this saved value, so we need to think of the
    # base for our local variables as "ebp-4".
    # And this also finishes the assembly prologue.
    
    0x0804837d <main+14>:   sub    $0x24,%esp // Reserving stack space for local variables and outgoing arguments
    0x08048380 <main+17>:   mov    0xfffffff8(%ebp),%eax // ebp == local stack frame base
    # (gdb) print /d (int)0xfffffff8
    # $2 = -8
    # Then, (base stackframe) - 8 == address of our first local slot (VAR1). It is -8 because the slot at -4 holds that "ecx" value
    # previously saved (during prologue).
    # Our local stack frame layout, from higher to lower addresses, is: [EBP][ECX][VAR1][VAR2] (ebp - 8 = VAR1)
    # VAR1 turns out to be "int nada" (look at where it gets stored next), and we're copying its value to "eax"
    
    
    0x08048383 <main+20>:   mov    %eax,0x4(%esp)
    # Copy "eax" (VAR1, "nada") to the address ESP + 4 bytes (see "First Question"); this is the second argument
    # for "blah_function" (int bleh).
    
    0x08048387 <main+24>:   mov    0xfffffff4(%ebp),%eax
    # (gdb) print /d (int)0xfffffff4
    # $3 = -12
    # EBP - 12 => VAR2 (char *buf)
    
    0x0804838a <main+27>:   mov    %eax,(%esp)
    # Again, copy "eax" (VAR2, "buf") to the address pointed to by ESP (it stores what "push %eax" would store, without moving ESP);
    # this is the first argument for "blah_function" (char *buffer)
    
    0x0804838d <main+30>:   call   0x8048354 <blah_function>
    # Here we need to know about the "calling convention". 32-bit Linux uses the "cdecl" calling convention: the function's arguments
    # are placed on the stack in reverse order (right -> left), so at the moment of the CALL: ESP => first (leftmost) argument,
    # ESP+4 => second argument, ESP+8 => third argument, ...
    # More about cdecl: http://en.wikipedia.org/wiki/X86_calling_conventions#cdecl
    # 
    # Here also resides the WHOLE trick about EIP overwriting with Stack Overflows. Do you remember how the CALL instruction works?
    # Before CALL redirects EIP to the specified address, it pushes its own address + 5 bytes (the address of the next instruction)
    # onto the stack. That value is used later by the "ret" instruction, when the called function terminates and the flow needs
    # to jump back to the caller ("main" in this case).
    # So the "ret" instruction trusts that ESP, at the time of its execution, points to the address right after the CALL instruction,
    # and it returns to the caller just by loading this address into EIP (Next Instruction Pointer). Here is where we "hack":
    # this value is on the stack, we have a stack overflow, so we write a new value to "fool" the "ret" instruction and make
    # EIP point to wherever we want.
    
    
    ########################################################
    # Following normal execution flow, we are going to jump to "blah_function", thanks to CALL instruction
    ########################################################
    (gdb) disass blah_function
    Dump of assembler code for function blah_function:
    # Prologue begin
    0x08048354 <blah_function+0>:   push   %ebp
    0x08048355 <blah_function+1>:   mov    %esp,%ebp
    # Prologue end
    
    
    0x08048357 <blah_function+3>:   sub    $0x8,%esp // Reserving stack space (used here for printf's arguments)
    0x0804835a <blah_function+6>:   mov    0xc(%ebp),%eax
    # Here the function is retrieving its arguments from the stack (calling convention). Relative to EBP: EBP+0 is the saved EBP,
    # EBP+4 is the return address pushed by CALL, EBP+8 is ARG1 (char *buffer) and EBP+12 (0xc => 12 decimal) is ARG2 (int bleh),
    # so we copy ARG2 to "eax"
    
    0x0804835d <blah_function+9>:   mov    %eax,0x4(%esp)
    # Now we need to set up the stack (ESP) for "printf"
    # So we put this ARG2 at ESP+4, because this is where "printf" will
    # look for its SECOND argument (again, it trusts the calling convention)
    
    0x08048361 <blah_function+13>:  movl   $0x80484a8,(%esp)
    # (gdb) x/s 0x80484a8
    # 0x80484a8 <_IO_stdin_used+4>:    "%d\n"
    # This is our format string, the first argument for "printf" function
    # Again, based on the calling convention, we put it where ESP points (it stores what "push $0x80484a8" would store, without moving ESP).
    
    0x08048368 <blah_function+20>:  call   0x8048290 <printf@plt>
    # Finally, with stack ready, we can call "printf" to print our value
    
    0x0804836d <blah_function+25>:  leave
    # "leave" replaces the old epilogue sequence of instructions (it restores the stack frame saved in the prologue).
    # It is the same as:
    # movl %ebp, %esp
    # pop %ebp
    
    0x0804836e <blah_function+26>:  ret
    # Here is what we waited for... After "leave" performs its epilogue and restores the caller's frame pointer, ESP
    # points to the address that the CALL at main+30 pushed onto the stack long ago, before redirecting to "blah_function".
    # What "ret" does now is something similar to "pop %eip": in other words, it grabs the value pointed to by ESP
    # and copies it to EIP (Next Instruction Pointer). In this example program the value wasn't overwritten, so "ret" grabs
    # the address of "main+35" (0x08048392), which resumes execution back in "main", and everything works as expected.
    #
    # If we had a stack overflow inside "blah_function" (imagine it doing a strcpy() into a local buffer that is too small), we could
    # overwrite the value trusted by the "ret" instruction and steal/redirect the execution flow to wherever we want.
    
    End of assembler dump.
    (gdb)
    
    # In our example (no EIP overwrite), after the "ret" instruction EIP will point here (0x08048392), the value pushed onto
    # the stack by the CALL instruction before redirecting to "blah_function".
    0x08048392 <main+35>:   mov    $0x1,%eax
    0x08048397 <main+40>:   add    $0x24,%esp
    0x0804839a <main+43>:   pop    %ecx
    0x0804839b <main+44>:   pop    %ebp
    0x0804839c <main+45>:   lea    0xfffffffc(%ecx),%esp
    0x0804839f <main+48>:   ret
    End of assembler dump.
    (gdb)
    

    A bit more about CALL + RET thing:

    <waKKu> 0x08048380 <main+17>:   mov    0xfffffff8(%ebp),%eax
    <waKKu> 0x08048383 <main+20>:   mov    %eax,0x4(%esp)
    <waKKu> 0x08048387 <main+24>:   mov    0xfffffff4(%ebp),%eax
    <waKKu> 0x0804838a <main+27>:   mov    %eax,(%esp)
    <waKKu> 0x0804838d <main+30>:   call   0x8048354 <blah_function>
    <waKKu> 0x08048392 <main+35>:   mov    $0x1,%eax
    

    CALL will save the address of “main+35” (0x08048392) onto the stack and jump to the “blah_function” address (0x8048354).
    When blah_function is finishing, it will execute:

    0x0804836d <blah_function+25>: leave
    0x0804836e <blah_function+26>: ret

    Then “ret” will grab the value pointed to by ESP (that is, the “main+35” address == 0x08048392) and copy it to EIP, so EIP gets back to “main” and execution continues normally.

    Any doubts/suggestions/corrections about this post, as usual, please write them in the comments ;)

    Cya folks…

    waKKu


