Tuesday, August 30, 2011

NASM mode for Emacs

Quick post for those Emacs users out there.

The most common assembler on GNU/Linux nowadays is GAS, the GNU Assembler, which ships with GNU Binutils and serves as the back-end of the GNU Compiler Collection (GCC). If, like me, you get upset with the AT&T syntax and prefer working with a real assembler (and not just a compiler back-end), you might want to give NASM a try.

If, however, you are also an Emacs user, there’s a problem: the Emacs assembly mode only supports the GAS syntax, and there’s no nasm-mode. Quite disturbing...

So here’s a link to my NASM mode: nasm-mode.el
Please give it a try and don't hesitate to report any bug or missing feature!

All you have to do is put nasm-mode.el in your load path (usually ~/.emacs.d/) and add the following lines to your .emacs file:

(autoload 'nasm-mode "~/.emacs.d/nasm-mode.el" "" t)
(add-to-list 'auto-mode-alist '("\\.\\(asm\\|s\\)\\'" . nasm-mode))

See yah

Wednesday, August 24, 2011

The cdecl calling convention

Hi there!

Calling conventions describe how, at the assembly level, one routine must call another. For example, how should function parameters be passed? By putting them in registers or by pushing them on the stack? In which order? For C/C++ programmers, this is usually not a problem since the compiler takes care of it. But if you try to interface some assembly code with C routines, you must know at least one calling convention: cdecl.

Cdecl is the C calling convention for the x86 architecture and is the default for most compilers. Let’s describe it with an example. I’ll be using two routines: Foo() and Bar(). Bar() takes 3 integers as arguments. The following C code will be our road map, and I’ll show the assembly equivalent (using the NASM syntax, what else?).
int Bar(int a, int b, int c);

void Foo(void)
{
  /* Some stuff here */
  Bar(42, 21, 84);
}

int Bar(int a, int b, int c)
{
  int loc;

  /* Some stuff here */
  return 1337;
}

The return value will be stored in the eax register, so the caller first has to push its current value if it still needs it (the same goes for ecx and edx, which the callee is also free to overwrite).
Foo:
    push eax

The caller pushes parameters in reverse order.
push 84
push 21
push 42

The caller calls the routine. The call instruction pushes the return address (the current eip, which points to the instruction following the call) on the stack.
call Bar

The callee sets up a new stack frame. This is done by saving the ebp register and then setting it with the current content of the esp register.
Bar:
    push ebp
    mov ebp, esp

The callee saves every register it must preserve and will use later (ebx, esi and edi in cdecl) by pushing its value on the stack.
push ebx

The callee allocates room on the stack for local variables. This is done by decrementing the esp register.
sub esp, 4

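The callee does what it has to do. Here's what the stack looks like by now (higher addresses at the top):

    [ebp+20]  saved eax        (pushed by Foo)
    [ebp+16]  84               (parameter c)
    [ebp+12]  21               (parameter b)
    [ebp+8]   42               (parameter a)
    [ebp+4]   return address
    [ebp]     saved ebp
    [ebp-4]   saved ebx
    [ebp-8]   loc              <- esp
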
The callee stores the return value in the eax register.
mov eax, 1337

The callee releases allocated space on the stack by incrementing the esp register.
add esp, 4

The callee restores the saved registers, including the ebp register.
pop ebx
pop ebp

The callee returns (this pops the return address off the stack and back into eip).
ret

The caller must clean up the stack (i.e., remove the parameters by incrementing esp: 3 parameters of 4 bytes each make 12 bytes).
add esp, 12

By doing this correctly, you can use assembly routines in your C code, or the other way around. Here's the final assembly code:
Foo:
; Some stuff here
    push eax

    push 84
    push 21
    push 42
    call Bar
    add esp, 12
; And continue what it was doing, return value stored in eax

Bar:
    push ebp
    mov ebp, esp

    push ebx
    sub esp, 4

; Some stuff here

    mov eax, 1337
    add esp, 4

    pop ebx
    pop ebp
    ret
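
To double-check the bookkeeping, the whole sequence can be replayed in plain C on a toy stack. This is only a sketch, not real machine state: ToyCpu, push(), pop(), at_ebp() and replay_call() are names I made up, the return address is a dummy value, and the initial push eax from Foo is left out so that esp ends up exactly where it started.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a 32-bit stack that grows downwards (hypothetical names). */
enum { STACK_WORDS = 64 };

typedef struct {
    uint32_t mem[STACK_WORDS]; /* one slot per 4-byte word */
    uint32_t esp;              /* byte offset of the top of the stack */
    uint32_t ebp;              /* byte offset of the current frame base */
} ToyCpu;

static void push(ToyCpu *cpu, uint32_t value)
{
    assert(cpu->esp >= 4);
    cpu->esp -= 4;
    cpu->mem[cpu->esp / 4] = value;
}

static uint32_t pop(ToyCpu *cpu)
{
    assert(cpu->esp + 4 <= STACK_WORDS * 4);
    uint32_t value = cpu->mem[cpu->esp / 4];
    cpu->esp += 4;
    return value;
}

/* Reads the word at [ebp + offset], like `mov reg, [ebp+8]` would. */
static uint32_t at_ebp(const ToyCpu *cpu, int32_t offset)
{
    return cpu->mem[((int32_t)cpu->ebp + offset) / 4];
}

/* Replays Foo calling Bar(42, 21, 84); returns a + b + c as seen by Bar. */
uint32_t replay_call(uint32_t *esp_before, uint32_t *esp_after)
{
    ToyCpu cpu = { .esp = STACK_WORDS * 4, .ebp = STACK_WORDS * 4 };
    *esp_before = cpu.esp;

    /* Caller: push the parameters in reverse order, then "call". */
    push(&cpu, 84);
    push(&cpu, 21);
    push(&cpu, 42);
    push(&cpu, 0xdeadbeef);  /* stand-in for the return address */

    /* Callee prologue. */
    push(&cpu, cpu.ebp);     /* push ebp */
    cpu.ebp = cpu.esp;       /* mov ebp, esp */
    push(&cpu, 0);           /* push ebx (its value does not matter here) */
    cpu.esp -= 4;            /* sub esp, 4 : room for loc */

    /* Callee body: the parameters sit at fixed offsets from ebp. */
    uint32_t sum = at_ebp(&cpu, 8) + at_ebp(&cpu, 12) + at_ebp(&cpu, 16);

    /* Callee epilogue. */
    cpu.esp += 4;            /* add esp, 4 */
    (void)pop(&cpu);         /* pop ebx */
    cpu.ebp = pop(&cpu);     /* pop ebp */
    (void)pop(&cpu);         /* ret : pops the return address */

    /* Caller cleanup: three 4-byte parameters. */
    cpu.esp += 12;           /* add esp, 12 */
    *esp_after = cpu.esp;
    return sum;
}
```

Calling replay_call() gives 147 (42 + 21 + 84), read from [ebp+8], [ebp+12] and [ebp+16], and esp is back to its starting value once the caller has executed its add esp, 12.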

Other x86 conventions exist, such as stdcall (used, for example, by the Win32 API), pascal (parameters pushed left to right, stack cleaned by the callee), thiscall (in C++ when calling an object’s member function), the non-standard fastcall, ... Linux system calls are a separate case: their arguments are passed in registers rather than on the stack.

Hope this helps ;)

Saturday, August 6, 2011

malloc

Today I was asked to comment on a lecture that will be given to second-year students in October. This lecture is part of the UNIX System Programming course and supports one of the most interesting projects of the semester: make your own memory allocator. By implementing the standard C/C++ routines malloc(), free(), calloc() and realloc(), students will become familiar with the system calls brk()/sbrk() and will gain a much deeper understanding of a process execution environment.

The lecture first introduces how memory is managed by the Linux kernel, then covers the different memory sections of a process execution environment. Finally, students are invited to follow some demonstrations, meant to explain how brk() and sbrk() will have to be used in the project. Here’s a little summary:

There are basically 3 main sections mapped in memory: the code segment (or text segment), the data segment and the stack.

The code segment holds the instructions executed by the processor. It is read-only and has a fixed size.

The stack segment stores information about the current state of the process. I’d probably need a whole other bunch of lines to explain how it is used, so I’ll try to keep it simple. The stack is used every time a subroutine is called, storing data on top before the routine call (push) and restoring it on return (pop).

The data segment has read and write permissions and is divided into 3 parts:

  • The data part which holds the initialised global and static variables.
  • The BSS which holds the uninitialised global and static variables (it is filled with zeros before execution).
  • The heap which is the section managed by the memory allocator. It can grow or shrink during run-time via the system calls brk() and sbrk() (mmap() can also be used, but let’s leave that apart).
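
To see these areas for yourself, here is a small sketch that prints one address from each of them, plus the current program break as reported by sbrk(0). The names (initialised_global, uninitialised_global, print_memory_map) are mine, and I'm assuming a Linux-like system where sbrk() is available. The exact addresses change from one run to the next and from one system to another.

```c
#define _DEFAULT_SOURCE /* asks glibc to declare sbrk() */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int initialised_global = 42; /* lives in the data part */
int uninitialised_global;    /* lives in the BSS */

/* Prints one address per memory area (hypothetical helper).
   Returns 0 on success, -1 if malloc() fails. */
int print_memory_map(void)
{
    int local = 0;                          /* lives on the stack */
    int *dynamic = malloc(sizeof *dynamic); /* lives in the heap */
    if (dynamic == NULL)
        return -1;

    /* The BSS is filled with zeros before the program starts. */
    assert(uninitialised_global == 0);

    /* Converting a function pointer to an integer is not strict ISO C,
       but it works on the usual POSIX platforms. */
    printf("code  (this function)  : %p\n",
           (void *)(uintptr_t)print_memory_map);
    printf("data  (initialised)    : %p\n", (void *)&initialised_global);
    printf("BSS   (uninitialised)  : %p\n", (void *)&uninitialised_global);
    printf("heap  (malloc'ed int)  : %p\n", (void *)dynamic);
    printf("break (sbrk(0))        : %p\n", sbrk(0));
    printf("stack (local variable) : %p\n", (void *)&local);

    free(dynamic);
    return 0;
}
```

Call print_memory_map() from main() and compare the values. Keep in mind that the system allocator may also get memory through mmap(), so the malloc'ed address is not guaranteed to sit just below the break.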


So, as you can guess, every time you call malloc(), the memory allocator will find in the heap a suitable place for you and will return a pointer to that specific spot. If there is none, it will extend the heap.

A good memory allocator has to:

  • be fast: of course! Just count how many times you call malloc() in one of your programs.
  • manage memory very efficiently: free blocks can be used again to avoid endless system calls, adjacent free blocks can be merged to avoid fragmentation, …
  • check for programming mistakes: releasing already freed memory is an error, the programmer can pass an invalid pointer, …
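
These points can be sketched in a few dozen lines. The toy allocator below is my own illustration, not the one from the lecture: it works on a fixed static array instead of the real heap (so no brk()/sbrk() here), uses a first-fit search, reuses freed blocks, merges adjacent free blocks, and refuses double frees and unknown pointers. The names Block, arena, toy_malloc() and toy_free() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Header placed in front of every block; the union pads it so that
   the payload that follows is suitably aligned for any type. */
typedef union Block {
    struct {
        size_t size;       /* payload size in bytes */
        int is_free;       /* 1 when the block can be handed out */
        union Block *next; /* next block, in address order */
    } h;
    max_align_t align;
} Block;

#define ARENA_UNITS (4096 / sizeof(Block))

static Block arena[ARENA_UNITS]; /* stand-in for the real heap */
static Block *head = NULL;

void *toy_malloc(size_t size)
{
    if (size == 0 || size > (ARENA_UNITS - 1) * sizeof(Block))
        return NULL;
    /* Round up to whole header-sized units so blocks stay aligned. */
    size = (size + sizeof(Block) - 1) / sizeof(Block) * sizeof(Block);
    assert(size % sizeof(Block) == 0);

    if (head == NULL) { /* first call: the arena is one big free block */
        head = arena;
        head->h.size = (ARENA_UNITS - 1) * sizeof(Block);
        head->h.is_free = 1;
        head->h.next = NULL;
    }
    for (Block *b = head; b != NULL; b = b->h.next) {
        if (!b->h.is_free || b->h.size < size)
            continue;
        /* Split when the leftover can hold a header plus one unit. */
        if (b->h.size >= size + 2 * sizeof(Block)) {
            Block *rest = b + 1 + size / sizeof(Block);
            rest->h.size = b->h.size - size - sizeof(Block);
            rest->h.is_free = 1;
            rest->h.next = b->h.next;
            b->h.size = size;
            b->h.next = rest;
        }
        b->h.is_free = 0;
        return b + 1;
    }
    return NULL; /* a real allocator would extend the heap here */
}

/* Returns 0 on success, -1 on a double free or an unknown pointer. */
int toy_free(void *ptr)
{
    Block *b = head;
    while (b != NULL && (void *)(b + 1) != ptr)
        b = b->h.next;
    if (b == NULL || b->h.is_free)
        return -1;
    b->h.is_free = 1;

    /* Merge every run of adjacent free blocks to fight fragmentation. */
    for (Block *c = head; c != NULL; c = c->h.next) {
        while (c->h.is_free && c->h.next != NULL && c->h.next->h.is_free) {
            c->h.size += sizeof(Block) + c->h.next->h.size;
            c->h.next = c->h.next->h.next;
        }
    }
    return 0;
}
```

Walking the whole list on every toy_free() is what lets it reject pointers it never handed out, but it is slow. A real allocator would rather trade some of that checking for speed, which is exactly the kind of compromise the three points above are about.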

You can see why writing your own memory allocator can be a good idea. There’s no need for a complex one at first. You can start easy with a simple call to sbrk() every time and then improve it with a better data structure for storing your pointers to free blocks. Or maybe make sure there are no race conditions in a multi-threaded application?
And there’s also that feeling when xeyes and Bash run with your malloc().
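
For the record, the "simple call to sbrk() every time" starting point could look like the sketch below. It is a minimal illustration under my own assumptions (a Linux-like system, a single thread, the made-up names naive_malloc() and naive_free()), and certainly not something to put in production: it never gives memory back and asks the kernel for more on every call.

```c
#define _DEFAULT_SOURCE /* asks glibc to declare sbrk() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical name, chosen to avoid clashing with the real malloc(). */
void *naive_malloc(size_t size)
{
    const size_t align = _Alignof(max_align_t);

    if (size == 0 || size > (size_t)INTPTR_MAX - align)
        return NULL;

    /* Ask for a little extra so the result can be rounded up
       to an address that is aligned for any type. */
    void *raw = sbrk((intptr_t)(size + align - 1));
    if (raw == (void *)-1)
        return NULL; /* the kernel refused to move the break */

    uintptr_t addr = ((uintptr_t)raw + align - 1) / align * align;
    assert(addr % align == 0);
    return (void *)addr;
}

/* Nothing is ever released: this is the part to improve first. */
void naive_free(void *ptr)
{
    (void)ptr;
}
```

From there, the next step is to remember the blocks passed to naive_free() in some data structure and look there first before calling sbrk() again.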