Up to this point in ECE 291, the MPs have been written in real mode, with a source organization that reflects real mode assumptions. When writing code for protected mode, the source organization changes, but only slightly. The primary differences are:
In protected mode, SEGMENT would be a bit of a misnomer, as while segment registers are still used to address memory, they hold selectors instead of segments (see Section 17.2.3 for more information about this). In NASM, SEGMENT and SECTION are treated identically internally, so this is just a semantic change, not a functional one.
Unlike the real mode MPs, the DJGPP platform used for writing protected mode code in ECE 291 provides a stack, so there's no need for the assembly source to provide one.
Because the assembly program is linked with the DJGPP startup code, program execution doesn't begin at the ..start label as it did in real mode, but at the C-style function _main.
Tip: _main is called in exactly the same way as it is for C programs. Those who are already familiar with C may know about the two arguments passed (on the stack, using the C calling convention) to this function: int argc and char *argv[], which can be used to retrieve the command-line arguments passed to the program. See a C reference for information on the meanings of these two parameters and how to use them to read command-line arguments.
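As a minimal sketch of the tip above, the following _main reads argc and argv from the stack. It assumes the usual 32-bit C calling convention (arguments at [ebp+8] and [ebp+12] once a standard frame is set up, return value in EAX) and an underscore-prefixed symbol name; the choice to return argc-1 as the exit status is purely illustrative.

```nasm
        BITS 32
        GLOBAL _main            ; make _main visible to the startup code

SECTION .text

_main:
        push    ebp
        mov     ebp, esp        ; standard C stack frame

        mov     eax, [ebp+8]    ; first argument:  int argc
        mov     edx, [ebp+12]   ; second argument: char *argv[]
        mov     ecx, [edx]      ; argv[0], pointer to the program name (unused here)

        dec     eax             ; illustrative: return argc-1 to the caller
        pop     ebp
        ret                     ; return value in EAX, as a C function would
```

Assuming the file is assembled with something like `nasm -f coff` and linked with DJGPP's gcc, the value left in EAX is what the C startup code receives as the return value of main.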
As Section 18.104.22.168 shows, DS and CS actually do point to the same memory space in protected mode, just as they did in real mode, but CS and DS do not hold the same numerical value.
Caution: Because CS is set up to be read-only, if the program code does set DS=CS at the beginning of the program, the data segment becomes read-only!
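The sketch below illustrates the caution under stated assumptions: the variable name counter is hypothetical, and the real-mode prologue mentioned in the comment is the kind of DS=CS setup the text refers to, not a quotation of any particular MP. The point is only that DS is left alone, so a write through it succeeds.

```nasm
        BITS 32
        GLOBAL _main

SECTION .data
counter dd 0                    ; hypothetical variable, for illustration

SECTION .text

_main:
        ; A real-mode program might have started with something like
        ;       mov ax, cs
        ;       mov ds, ax
        ; That is deliberately omitted here: DS is left as set up by
        ; the startup code, so it still refers to writable data.

        inc     dword [counter] ; write through DS
        mov     eax, [counter]  ; read the updated value back as the return value
        ret
```

If DS were loaded from CS first, the inc would be a write through a read-only selector, which is exactly the situation the caution describes.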
This change is a conceptually major one: the addition of an uninitialized data segment. What does this mean? All data variables declared in the initialized data section take up space in the executable image on disk. This data is then copied into memory when the program is run, along with the program code. Data placed in the uninitialized section, on the other hand, does not take up space on disk. When the program is run, extra space is tacked onto the end of the data segment (accessed with DS) and initialized to 0.
There are uninitialized equivalents to the db, dw, etc. family of data declarations that start with "res" (reserve) instead of "d" (declare), e.g. resb, resw, etc. These "reserve" equivalents just take a single number: the number of data items of this size to reserve space for. Within the .bss section, these equivalents must be used instead of db and the like.
Use the .bss section instead of the .data section for variables that can be 0 at program startup. Remember that the "res" family takes the number of items, so:
    SECTION .data
    a       db 0
    b       dw 0,0,0
    c       dd 0,0

becomes:

    SECTION .bss
    a       resb 1
    b       resw 3
    c       resd 2
The code segment is now called .text and the data segment is called .data. The segments changed names to match the segment names used by DJGPP. These names are also considered standard on the UNIX platform.
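Putting the section names together, a protected mode source file might be laid out roughly as follows. This is a sketch, not the course's official template: the names greeting and buffer are hypothetical, and the underscore-prefixed _main assumes the DJGPP symbol convention described earlier.

```nasm
        BITS 32
        GLOBAL _main

SECTION .data                   ; initialized data: stored in the executable
greeting db 'hello', 0          ; hypothetical zero-terminated string

SECTION .bss                    ; uninitialized data: no space on disk
buffer  resb 64                 ; hypothetical 64-byte buffer

SECTION .text                   ; program code

_main:
        xor     ecx, ecx        ; byte index
.copy:
        mov     al, [greeting+ecx]
        mov     [buffer+ecx], al ; copy one byte from .data into .bss
        inc     ecx
        test    al, al          ; stop after the terminating 0 has been copied
        jnz     .copy

        xor     eax, eax        ; return 0
        ret
```

Note that there is no stack segment and no ..start label; only EAX and ECX are modified, both of which the C calling convention allows a function to clobber.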
Why is the uninitialized section called .bss and the code section called .text? Both names have a long history in UNIX, but the history of .bss is perhaps the more interesting of the two.