8.2 Mixing Assembly and C

It is often useful to link assembly language routines with high-level programs, which provide resources that are hard to reach from assembly alone, such as C's built-in graphics library or string-processing functions. Conversely, it is often necessary to include short assembly routines in a compiled high-level program to take advantage of the speed of machine language.

All high-level languages have specific calling conventions which allow one language to communicate with another, i.e., to pass variables, values, and so on. An assembly language program written to work with a high-level language must follow the same conventions if the two are to be integrated successfully. High-level languages usually pass parameters to subroutines on the stack, and C is no exception.

8.2.1 Using Assembly Procedures in C Functions

8.2.1.1 Procedure Setup

In order to ensure that the assembly language procedure and the C program will combine and be compatible, the following steps should be followed:

  • Declare the procedure label global by using the GLOBAL directive. Also declare global any data that the C program will use.

  • Use the EXTERN directive to declare any data and procedures that are defined elsewhere (for example, in the C program) and used by the assembly code. It is best to place the EXTERN statement outside the segment definitions and to place near data inside the data segment.

  • Follow the C naming conventions, i.e., precede all names (both procedures and data) with underscores. This applies to toolchains whose C compiler prefixes symbol names with an underscore; ELF-based systems such as Linux do not add one.

8.2.1.2 Stack Setup

Whenever entering a procedure, it is necessary to set up a stack frame through which the parameters passed on the stack can be accessed. If the procedure takes no parameters and needs no local storage on the stack, the frame can be omitted. To set up the stack frame, include the following code at the start of the procedure:

        push    ebp
        mov     ebp, esp

EBP now serves as a fixed base pointer into the stack, and should not be altered during the procedure unless care is taken to restore it. Each parameter passed to the procedure can be accessed at a fixed offset from EBP. This arrangement is commonly known as a "standard stack frame."

8.2.1.3 Preserving Registers

The procedure must preserve the contents of the registers EBX, ESI, EDI, EBP, and all segment registers. If any of these registers is corrupted, the calling C program may misbehave or crash after the procedure returns.

8.2.1.4 Passing Parameters in C to the Procedure

C passes arguments to procedures on the stack. For example, consider the following statements from a C main program:

           |
extern int Sum();
           |
int a1, a2, x;
           |
x = Sum(a1, a2);

When C executes the function call to Sum, it pushes the input arguments onto the stack in reverse order (a2 first, then a1), then executes a call to Sum, which pushes the return address. Upon entering Sum, the stack contains the following:

        ESP+8   value of a2
        ESP+4   value of a1
        ESP     return address

Since a1 and a2 are declared as int variables, each takes up one doubleword (4 bytes) on the stack. Once the stack frame has been set up with PUSH EBP, the arguments are found at [EBP+8] and [EBP+12]. This method of passing input arguments is called passing by value. The code for Sum, which returns the sum of the input arguments in register EAX, might look like the following:

global  _Sum

_Sum:
        push    ebp             ; create stack frame
        mov     ebp, esp
        mov     eax, [ebp+8]    ; grab the first argument
        mov     ecx, [ebp+12]   ; grab the second argument
        add     eax, ecx        ; sum the arguments
        pop     ebp             ; restore the base pointer
        ret

Two things are worth noting. First, the assembly code returns the result to the C program implicitly through EAX. Second, a simple RET is all that is needed to return from the procedure, because the C caller takes care of removing the passed parameters from the stack.
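The stack traffic for this call can be traced in plain C. The following is a minimal sketch rather than part of the routine itself: it models the stack as a small array, and the names Machine, push32, pop32, load32, model_sum_body, and model_call_sum, the 64-byte stack size, and the 0xDEADBEEF placeholder for the return address are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 64u   /* assumed size of the model stack */

/* Hypothetical model of a 32-bit stack: each slot is one doubleword,
   and esp is a byte offset that grows downward, as on x86. */
typedef struct {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t esp;   /* byte offset of the top of the stack */
    uint32_t ebp;   /* byte offset of the current frame base */
} Machine;

static void push32(Machine *m, uint32_t value) {
    assert(m->esp >= 4);              /* guard against overflowing the model */
    m->esp -= 4;
    m->mem[m->esp / 4] = value;
}

static uint32_t pop32(Machine *m) {
    assert(m->esp + 4 <= STACK_BYTES);
    uint32_t value = m->mem[m->esp / 4];
    m->esp += 4;
    return value;
}

static uint32_t load32(const Machine *m, uint32_t addr) {
    assert(addr + 4 <= STACK_BYTES);
    return m->mem[addr / 4];
}

/* Mirrors the body of _Sum: prologue, two loads relative to EBP, epilogue. */
static uint32_t model_sum_body(Machine *m) {
    push32(m, m->ebp);                       /* push ebp           */
    m->ebp = m->esp;                         /* mov  ebp, esp      */
    uint32_t eax = load32(m, m->ebp + 8);    /* mov  eax, [ebp+8]  */
    uint32_t ecx = load32(m, m->ebp + 12);   /* mov  ecx, [ebp+12] */
    eax += ecx;                              /* add  eax, ecx      */
    m->ebp = pop32(m);                       /* pop  ebp           */
    return eax;
}

/* Mirrors what the C caller does for x = Sum(a1, a2). */
uint32_t model_call_sum(uint32_t a1, uint32_t a2) {
    Machine m = { {0}, STACK_BYTES, 0 };
    push32(&m, a2);                  /* last argument is pushed first    */
    push32(&m, a1);
    push32(&m, 0xDEADBEEFu);         /* stand-in for the return address  */
    uint32_t eax = model_sum_body(&m);
    (void)pop32(&m);                 /* ret pops the return address      */
    m.esp += 8;                      /* caller removes the two arguments */
    assert(m.esp == STACK_BYTES);    /* stack is back where it started   */
    return eax;
}
```

Under this model, model_call_sum(3, 4) yields 7, and the final assertion confirms that the caller's cleanup leaves the stack pointer where it was before the call.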

Unfortunately, passing by value has the drawback that we can only return one output value. What if Sum must output several values, or if Sum must modify one of the input variables? To accomplish this, we must pass arguments by reference. In this method of argument transmission, the addresses of the arguments are passed, not their values. The address may be just an offset, or both an offset and a segment. For example, suppose Sum wishes to modify a2 directly--perhaps storing the result in a2 such that a2 = a1 + a2. The following function call from C could be used:

Sum(a1, &a2);

The first argument is still passed by value (i.e., only its value is placed on the stack), but the second argument is passed by reference (its address is placed on the stack). The "&" prefix means "address of." We say that &a2 is a "pointer" to the variable a2. Using the above statement, the stack would contain the following upon entering Sum:

        ESP+8   address of a2
        ESP+4   value of a1
        ESP     return address

Note that the address of a2 is pushed on the stack, not its value. With this information, Sum can access the variable a2 directly. (Hint: use an index register to hold the offset, then use a memory operand through that register to access the variable.)
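For comparison, the same by-reference behavior can be written in C. This is only a sketch of the intended effect (a2 = a1 + a2), not the assembly solution; the name sum_by_reference is hypothetical and was chosen to avoid clashing with Sum.

```c
#include <assert.h>
#include <stddef.h>

/* a1 arrives by value; a2 arrives as an address, so the result can be
   stored back into the caller's variable through the pointer. */
void sum_by_reference(int a1, int *a2) {
    assert(a2 != NULL);    /* a null address cannot be dereferenced */
    *a2 = a1 + *a2;
}
```

A caller with a1 = 5 and a2 = 7 that executes sum_by_reference(a1, &a2) finds a2 equal to 12 afterward, while a1 is unchanged because only a copy of its value was passed.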

8.2.1.5 Returning a Value from the Procedure

Assembly returns integer and pointer values to the calling C program in the EAX register. If the returned value is four bytes or less, the result is returned in EAX (or in its low part, AL or AX, for the smaller types). If the item is larger than four bytes (a structure, for example), a pointer to the item is returned in EAX. Here is a short table of C variable types and the registers in which the assembly code returns them:

        Data Type                   Register
        char                        AL
        short                       AX
        int, long, pointer (*)      EAX
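AL and AX are not separate registers but the low 8 and low 16 bits of EAX, which is why one register serves all three rows of the table. The following C sketch models that aliasing with masks; the function names are hypothetical, and EAX is represented by an ordinary 32-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* AL is the low 8 bits of EAX. */
uint8_t low_byte_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }

/* AX is the low 16 bits of EAX. */
uint16_t low_word_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* Hypothetical self-check: AL is in turn the low byte of AX. */
void check_register_aliases(uint32_t eax) {
    assert(low_byte_al(eax) == (uint8_t)(low_word_ax(eax) & 0xFFu));
}
```

For an EAX value of 0x12345678, this gives AL = 0x78 and AX = 0x5678.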

8.2.1.6 Allocating Local Data Space on the Stack

Temporary storage space for local variables or data can be created by decreasing the contents of ESP just after setting up a stack frame at the beginning of the procedure. It is important to restore the stack space at the end of the procedure. The following code fragment illustrates the basic idea:

        push    ebp             ; Save caller's stack frame
        mov     ebp, esp        ; Establish new stack frame
        sub     esp, 4          ; Allocate local data space of
                                ;  4 bytes
        push    esi             ; Save critical registers
        push    edi
        ...
        pop     edi             ; Restore critical registers
        pop     esi
        mov     esp, ebp        ; Restore the stack
        pop     ebp             ; Restore the frame
        ret                     ; Return to caller
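To see where each item ends up relative to EBP, the prologue and epilogue above can be traced with simple arithmetic. The C sketch below does only that bookkeeping; FrameLayout, trace_frame, and the use of a plain long for ESP are assumptions made for illustration.

```c
#include <assert.h>

typedef struct {
    long local_offset;   /* offset of the 4-byte local from EBP */
    long esi_offset;     /* offset of the saved ESI from EBP    */
    long edi_offset;     /* offset of the saved EDI from EBP    */
    long esp_change;     /* net change in ESP when RET executes */
} FrameLayout;

/* Follows ESP through the fragment, one instruction per line. */
FrameLayout trace_frame(long esp_on_entry) {
    FrameLayout f;
    long esp = esp_on_entry;
    long ebp;

    esp -= 4;                          /* push ebp     */
    ebp = esp;                         /* mov ebp, esp */
    esp -= 4;                          /* sub esp, 4   */
    f.local_offset = esp - ebp;
    esp -= 4;                          /* push esi     */
    f.esi_offset = esp - ebp;
    esp -= 4;                          /* push edi     */
    f.edi_offset = esp - ebp;

    esp += 4;                          /* pop edi      */
    esp += 4;                          /* pop esi      */
    esp = ebp;                         /* mov esp, ebp */
    esp += 4;                          /* pop ebp      */

    f.esp_change = esp - esp_on_entry;
    assert(f.esp_change == 0);         /* ESP points at the return address again */
    return f;
}
```

The trace places the local variable at [EBP-4], the saved ESI at [EBP-8], and the saved EDI at [EBP-12], and shows ESP back at its entry value when RET executes.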

8.2.2 Using C Functions in Assembly Procedures

In most cases, calling C library routines or functions from an assembly program is more complex than calling assembly programs from C. An example of how to call the printf library function from within an assembly program is shown next, followed by comments on how it actually works.

global  _main

extern  _printf

section .data

text    db      "291 is the best!", 10, 0
strformat db    "%s", 0

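section .text

_main:
        push    dword text      ; second argument: address of the string
        push    dword strformat ; first argument: address of the format string
        call    _printf
        add     esp, 8          ; remove the two 4-byte arguments
        xor     eax, eax        ; return 0 from main
        ret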

Notice that the procedure is declared global, and that its name must be _main, the function at which every C program begins execution.

Since C expects its arguments to be pushed onto the stack in reverse order, the offset of the string is pushed first, followed by the offset of the format string. The C function can then be called, but the caller must restore the stack once the call has completed; here, ADD ESP, 8 removes the two four-byte arguments.
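For reference, the call made by the assembly program corresponds to a single C statement. The sketch below wraps it in a hypothetical function, print_message, so the result can be inspected; the data names text and strformat are taken from the listing above.

```c
#include <assert.h>
#include <stdio.h>

static const char text[] = "291 is the best!\n";
static const char strformat[] = "%s";

/* Equivalent of: push text / push strformat / call _printf / add esp, 8.
   The C compiler generates the pushes and the stack cleanup itself. */
int print_message(void) {
    int written = printf(strformat, text);
    assert(written >= 0);    /* printf returns a negative value on error */
    return written;
}
```

printf returns the number of characters written, which is 17 here (16 visible characters plus the newline). In the assembly version that count is what printf leaves in EAX after the call.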

When linking the assembly code, include the standard C library (or the library containing the functions you use) in the link. For a more detailed (and perhaps more accurate) description of the procedures involved in calling C functions, refer to another text on the subject.