Protected Mode Basics
From an application's point of view, protected mode and real mode are not that different.
Both use segmentation and interrupts, but the segmentation works differently in each.
In real mode, segmentation is handled by a built-in internal mechanism.
The contents of a segment register in real mode form part of the physical address.
The physical address is obtained by multiplying the segment register by 16 (10h)
and adding a 16-bit offset to the result. This 16-bit offset limits
the segment size to 64 KB.
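The real-mode calculation described above can be sketched in a few lines of Python. The function name real_mode_address is made up for this illustration, and the sketch ignores the A20 wrap-around that is discussed later.

```python
def real_mode_address(segment, offset):
    """Return the physical address for a real-mode segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Segment register times 16 (10h), plus the 16-bit offset.
    return segment * 16 + offset


print(hex(real_mode_address(0x0040, 0x0067)))  # 0x467
print(hex(real_mode_address(0xFFFF, 0xFFFF)))  # 0x10ffef
```

The second call shows the highest address a real-mode pair can name, which lies 64k minus 16 bytes above 1M.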
In protected mode, segments are managed through descriptor tables,
and the segment register holds an index into such a table.
Each descriptor in the table is 8 bytes long, so, leaving aside the low three bits
(which carry the table indicator and privilege level), the value in the segment
register is a multiple of 8 (08h, 10h, 18h, etc.).
There are two types of tables: the Global Descriptor Table (GDT) and the Local Descriptor Table (LDT).
The GDT holds segment information shared by all applications.
The LDT holds segment information for a specific task.
Each time a segment register is loaded, the base address is taken from the corresponding
table entry. The physical address is then formed by adding a 16-bit or 32-bit offset
to that base address.
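As a rough illustration of the lookup just described, the sketch below splits a selector into its fields and adds an offset to the base taken from a table. The names split_selector, linear_address and example_gdt are hypothetical, and the table is modeled as a plain list of (base, limit) pairs rather than real 8-byte descriptors.

```python
def split_selector(selector):
    """Split a 16-bit selector into (index, table, rpl)."""
    index = selector >> 3                        # descriptor number in the table
    table = "LDT" if selector & 0x04 else "GDT"  # table indicator bit
    rpl = selector & 0x03                        # requested privilege level
    return index, table, rpl


def linear_address(gdt, selector, offset):
    """Look up the descriptor for `selector` and add `offset` to its base."""
    index, table, _ = split_selector(selector)
    if table != "GDT":
        raise ValueError("this sketch only models the GDT")
    base, limit = gdt[index]
    if offset > limit:
        raise ValueError("offset exceeds the segment limit")
    return base + offset


# Hypothetical table contents; entry 0 is the unused null descriptor.
example_gdt = [(0, 0), (0x012340, 0xFFFF), (0x020000, 0xFFFF), (0x100000, 0xFFFF)]

print(split_selector(0x18))                            # (3, 'GDT', 0)
print(hex(linear_address(example_gdt, 0x18, 0x0400)))  # 0x100400
```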
Interrupt handling also differs between real mode and protected mode.
In real mode, the interrupt vector table, which holds the addresses of the interrupt procedures, lies at the bottom of memory.
When an interrupt is invoked, the processor looks up the address of the Interrupt Service Routine (ISR).
The flags are then saved on the stack, and a far call to the interrupt procedure is performed.
In protected mode, the flags and segment values are saved on the stack when an interrupt is invoked.
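The real-mode lookup can be pictured with a small sketch. It assumes the standard real-mode layout, in which each vector occupies 4 bytes starting at physical address 0, with the offset word stored before the segment word. The helper name read_vector and the handler address F000:1234 are invented for the example.

```python
import struct


def read_vector(memory, vector):
    """Return (segment, offset) of the ISR for `vector` from an IVT image."""
    # Each entry is 4 bytes: 16-bit offset first, then 16-bit segment.
    offset, segment = struct.unpack_from("<HH", memory, vector * 4)
    return segment, offset


# A 1 KB image holding 256 vectors, all zero except a made-up INT 08h handler.
ivt = bytearray(1024)
struct.pack_into("<HH", ivt, 0x08 * 4, 0x1234, 0xF000)

segment, offset = read_vector(ivt, 0x08)
print(hex(segment), hex(offset))  # 0xf000 0x1234
```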
The processor generates three categories of exceptions: traps, faults, and aborts.
A trap does not push an error code onto the stack, a fault usually does,
and an abort always does.
Traps are a type of software interrupt.
Traps include division by 0,
data breakpoints, and INT 03.
Faults are generated for errors whose handling
results in the faulting instruction being restarted.
Table 1 lists the interrupts generated by the processor
in protected mode.
DESCRIPTOR CACHE REGISTERS
Whether in real or protected mode, the CPU stores the base address of each segment in hidden registers called descriptor cache registers. Each time the CPU loads a segment register, the segment base address, segment size limit, and access attributes (access rights) are loaded, or "cached," into these hidden registers. To enhance performance, the CPU makes all subsequent memory references via the descriptor cache registers instead of calculating the physical address, or looking up the base address in the descriptor table. Understanding the role of these hidden registers is paramount for exploiting highly advanced programming techniques, and for exploiting the undocumented LOADALL instruction. Figure 2(a) shows the descriptor cache layout for the 80286, and Figure 2(b) shows the layout for the 80386 and 80486.
Figure 2(a) -- 80286 Descriptor Cache Register (24-bit base address)
Figure 2(b) -- 80386/80486 Descriptor Cache Register (32-bit physical address)
At power-up, the descriptor cache registers are loaded with fixed, default values, the CPU is in real mode, and all segments are marked as read/write data segments, including the code segment (CS). According to Intel, each time the CPU loads a segment register in real mode, the base address is 16 times the segment value, while the access rights and size limit attributes are given fixed, "real-mode compatible" values. This is not true. In fact, only the CS descriptor cache access rights get loaded with fixed values each time the segment register is loaded -- and even then only when a far jump is encountered. Loading any other segment register in real mode does not change the access rights or the segment size limit attributes stored in the descriptor cache registers. For these segments, the access rights and segment size limit attributes are honored from any previous setting (see Figure 3). Thus it is possible to have a four-gigabyte, read-only data segment in real mode on the 80386, but Intel will not acknowledge or support this mode of operation.
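The behavior described for the non-CS segment registers can be modeled with a toy sketch. The class and function names are hypothetical, and the access-rights values (92h for read/write data, 90h for read-only data) are illustrative encodings, not a statement about the actual power-up contents.

```python
class DescriptorCache:
    """Toy model of one hidden descriptor cache register (not CS)."""

    def __init__(self):
        self.base = 0
        self.limit = 0xFFFF
        self.access = 0x92  # illustrative: present, read/write data segment


def load_in_real_mode(cache, segment_value):
    # Only the base changes; limit and access rights keep their old values.
    cache.base = segment_value * 16


def load_in_protected_mode(cache, base, limit, access):
    # The whole cache entry is replaced from the descriptor table.
    cache.base, cache.limit, cache.access = base, limit, access


ds = DescriptorCache()
load_in_protected_mode(ds, base=0, limit=0xFFFFFFFF, access=0x90)  # 4 GB, read-only
load_in_real_mode(ds, 0x1234)  # later, back in real mode
print(hex(ds.base), hex(ds.limit), hex(ds.access))  # 0x12340 0xffffffff 0x90
```

After the real-mode load, the base follows the new segment value while the four-gigabyte limit and read-only attribute set in protected mode are still in place.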
Protected mode differs from real mode in this respect: each time the CPU loads a segment register, it fully loads the descriptor cache register, and no previous values are honored. The CPU loads the descriptor cache directly from the descriptor table. The CPU checks the validity of the segment by testing the access rights in the descriptor table, and illegal values will generate exceptions. Any attempt to load CS with a read/write data segment will generate a protection error. Likewise, any attempt to load a data segment register as an executable segment will also generate an exception. The CPU enforces these protection rules very strictly: only if the descriptor table entry passes all the tests does the CPU load the descriptor cache register.
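A simplified version of these checks is sketched below. It assumes the usual access-rights byte layout (bit 7 present, bit 4 code/data, bit 3 executable, bit 1 readable or writable) and the commonly documented refinement that a data segment register accepts a readable code segment but rejects an execute-only one. The function name can_load is hypothetical, and privilege-level checks are left out entirely.

```python
PRESENT = 0x80      # segment is present in memory
CODE_OR_DATA = 0x10 # 1 = code/data segment, 0 = system descriptor
EXECUTABLE = 0x08   # 1 = code segment, 0 = data segment
READ_WRITE = 0x02   # code: readable; data: writable


def can_load(register, access):
    """Return True if a descriptor with `access` may be loaded into `register`."""
    if not access & PRESENT or not access & CODE_OR_DATA:
        return False
    executable = bool(access & EXECUTABLE)
    read_write = bool(access & READ_WRITE)
    if register == "CS":
        return executable                      # CS needs a code segment
    if register == "SS":
        return not executable and read_write   # SS needs writable data
    return not executable or read_write        # DS/ES: data, or readable code


print(can_load("CS", 0x92))  # False: read/write data segment
print(can_load("CS", 0x9A))  # True:  readable code segment
print(can_load("DS", 0x98))  # False: execute-only code segment
print(can_load("DS", 0x92))  # True:  read/write data segment
```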
Figure 3 -- Descriptor Cache Contents (Real Mode)
Figure 4(a) -- Interrupt service addressing in Real Mode
Figure 4(b) -- Interrupt service addressing in Protected Mode
Table 1 -- Exceptions and Interrupts
Column headings: Return address points to faulting instruction; This interrupt first appeared in this CPU
Entries: Division by 0; Invalid opcode; Device not available; Coprocessor segment overrun; Segment not present; Floating-point error
On the 386-class CPUs, debug exceptions can be either traps or faults. A trap is caused by the Trap Flag (TF) being set in the flags image, or by using the debug registers to generate data breakpoints. In this case the return address is the instruction following the trap. Faults are generated by setting the debug registers for code execution breakpoints. As with all faults, the return address points to the faulting instruction.
Removed from the 80486; now generates exception 13 on all future processors.
Model dependent. Behavior may be different or missing on future processors.
ENTERING PROTECTED MODE
Our goal is to enter protected mode, and leave protected mode and return to DOS. The '286 has no internal mechanism to exit protected mode: once you are in protected mode, you are there to stay. IBM recognized this, and implemented a hardware solution that would take the '286 out of protected mode by resetting the CPU. Since the power-on state of the '286 is real mode, simply resetting the CPU will return to real mode. But this introduces a slight problem, as the CPU won't continue executing where it left off. At reset, the CPU starts executing at the top of memory, in the BIOS. Without a protocol to tell the BIOS that we reset the CPU for the purpose of exiting protected mode, the BIOS would have no way to return control back to the user program. IBM implemented a very simple protocol by writing a code to CMOS RAM (CMOS) where the BIOS can check this code and decide what to do. Immediately after the BIOS starts executing from the reset vector, it checks this code in CMOS to determine if the CPU was reset for the purpose of exiting protected mode. Depending on the code in CMOS, the BIOS can return control back to the user program and continue executing.
Resetting the CPU isn't without its ramifications; all the CPU registers are destroyed, and the interrupt mask in the Programmable Interrupt Controller (PIC) is sometimes re-programmed by the BIOS (depending on the shutdown type). Therefore, it is the program's responsibility to save the PIC mask, stack pointer, and return address before entering protected mode. The PIC mask and stack pointer must be stored in the user's data segment, but the return address must be stored at a fixed location defined in the BIOS data segment -- at 40:67h.
Next, we set the code in CMOS that tells BIOS we will exit protected mode and return to the user's program. This is simply done by writing a value to the two CMOS I/O ports. After the CPU gets reset, and BIOS checks the CMOS code, BIOS will clear the CMOS code, so subsequent resets won't cause unexpected results. After setting the code in CMOS, the program must build the GDT. (See the appropriate Intel programmer's reference manual for a description of the GDT.) The limit, and access rights may be filled in by the compiler, as these values are static. But the base addresses of each segment aren't known until run-time; therefore the program must fill them in the GDT. Our program will build a GDT containing the code, data, and stack segments addressed by our program. One last GDT entry will point to 1M for illustrative purposes.
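To make the run-time step concrete, here is a minimal sketch of filling in a '286-format GDT entry once the base address is known. It assumes the 8-byte '286 layout (16-bit limit, 24-bit base, access-rights byte, reserved word of zero). The helper name make_286_descriptor, the segment value 1234h and the access bytes 9Ah and 92h are illustrative choices, not values taken from the article's listings.

```python
import struct


def make_286_descriptor(base, limit, access):
    """Pack one 8-byte '286 descriptor: limit, 24-bit base, access, reserved."""
    if base >= 1 << 24 or limit > 0xFFFF:
        raise ValueError("the '286 allows a 24-bit base and a 16-bit limit")
    return struct.pack("<HHBBH",
                       limit,
                       base & 0xFFFF,        # base bits 0..15
                       (base >> 16) & 0xFF,  # base bits 16..23
                       access,
                       0)                    # reserved, must be zero on the '286


# The base is only known at run time: real-mode segment value times 16.
code_segment = 0x1234
code_entry = make_286_descriptor(code_segment * 16, 0xFFFF, 0x9A)
one_meg_entry = make_286_descriptor(0x100000, 0xFFFF, 0x92)

print(code_entry.hex())     # ffff4023019a0000
print(one_meg_entry.hex())  # ffff000010920000
```

The limit and access bytes are constants, which is why only the base fields need to be patched by the program.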
Accessing memory at 1M isn't as simple as creating a GDT entry and using it. The 8086 has the potential to address 64k (minus 16 bytes) beyond the maximum addressability of 1M -- all it lacks is a 21st address line. The 8086 only has 20 address lines (A00..A19), and any attempt to address beyond 1M will wrap around to 0 because of the absence of A20. The '286 has 24 bits of addressability (A00..A23) and doesn't behave like the 8086 in this respect. Any attempt to address beyond 1M (FFFF:0010 - FFFF:FFFF) will happily assert A20, and not wrap back to 0. Any program that relies on the memory wrapping "feature" of the 8086 will fail to run properly. As a solution to this compatibility problem, IBM decided to AND the A20 output of the CPU with a programmable output pin on some chip in the computer. The output of the AND gate is connected to the address bus, thus propagating A20 or not. Based on the input from the CPU A20, ANDed with an externally programmable source, address bus A20 gets asserted. The keyboard controller was chosen as this programmable source because it contained some available pins that can be held high, low, or toggled under program control. When the output of this pin is programmed to be high, the output of the AND gate is high when the CPU asserts A20. When the output is low, A20 is always low on the address bus -- regardless of the state of the CPU A20. Thus by inhibiting A20 from being asserted on the address bus, '286-class machines can emulate the memory wrapping attributes of their 8086 predecessors.
Notice that only A20 is gated to the address bus. Therefore, without enabling the input to the A20 gate, the CPU can address every even megabyte of memory as follows: 0-1M, 2-3M, 4-5M, etc. In fact, duplicates of these memory blocks appear at 1-2M, 3-4M, 5-6M, etc. as a result of holding A20 low on the address bus. To enable the full 24-bits of addressability, a command must be sent to the keyboard controller (KBC). The KBC will enable the output on its pin to high, as input to the A20 gate. Once this is done, memory will no longer wrap, and we can address the full 16M of memory on the '286, or all 4G on 80386-class machines. All that remains in order to enter protected mode is changing the CPU state to protected mode and jumping to clear the prefetch queue (not necessary on the Pentium).
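The effect of the A20 gate on the address bus can be illustrated with a one-line mask. The function name bus_address is hypothetical; the sketch models only the AND gate, not the keyboard controller commands that drive it.

```python
A20_BIT = 1 << 20  # the 21st address line


def bus_address(cpu_address, a20_enabled):
    """Return the address seen on the bus for a given CPU address."""
    if a20_enabled:
        return cpu_address
    return cpu_address & ~A20_BIT  # A20 is forced low; all other lines pass


# FFFF:0010 is physical 100000h: with the gate off it wraps to 0, as on the 8086.
print(hex(bus_address(0x100000, a20_enabled=False)))  # 0x0
print(hex(bus_address(0x100000, a20_enabled=True)))   # 0x100000
# With the gate off, the block at 3-4M is a duplicate of the block at 2-3M.
print(hex(bus_address(0x300000, a20_enabled=False)))  # 0x200000
```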
The following table summarizes the steps required to enter (with the intention of leaving) protected mode on the '286:
- Save the 8259 PIC mask in the program data segment
- Save SS:SP in the program data segment
- Save the return address from protected mode at 40:67
- Set the shutdown code in CMOS to tell BIOS that upon reset we will be returning to our program
- Build the GDT
- Enable A20 on the address bus
- Enable protected mode in the CPU machine status word (MSW)
- JUMP to clear the prefetch queue
The first six steps can be done in any order.
Far fewer steps are required to enter protected mode on the '386 and '486, as the '386 can exit protected mode without resetting the CPU. For compatibility purposes, all '386 BIOSes will recognize the CPU shutdown protocol defined on '286-class machines, but following this protocol isn't necessary. To exit protected mode on a '386, the program simply clears a bit in a CPU control register. There is no need to save the PIC mask, SS:SP, a return address, or set a CMOS code. The requisite steps for entering protected mode on a '386 simply become:
- Build the GDT
- Enable A20 on the address bus
- Enable protected mode in the CPU control register (CR0, or MSW)
- JUMP to clear the prefetch queue
Of these requisite steps, building the GDT is the only step that may differ. In the '386 the base address is expanded to 32-bits, the limit is expanded to 20-bits, and two more control attribute bits are present. Listing 1 lists all the auxiliary subroutines to enter protected mode.
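The wider '386 fields change how a GDT entry is packed. The sketch below assumes the standard '386 descriptor layout, in which the extra limit bits and the two added attribute bits (granularity and default operand size) share byte 6 and the top base byte sits in byte 7. The helper name make_386_descriptor and its parameter names are invented for illustration.

```python
import struct


def make_386_descriptor(base, limit, access, granular=False, default_32bit=False):
    """Pack one 8-byte '386 descriptor with a 32-bit base and 20-bit limit."""
    if base >= 1 << 32 or limit >= 1 << 20:
        raise ValueError("the '386 allows a 32-bit base and a 20-bit limit")
    flags = (0x80 if granular else 0) | (0x40 if default_32bit else 0)
    return struct.pack("<HHBBBB",
                       limit & 0xFFFF,                 # limit bits 0..15
                       base & 0xFFFF,                  # base bits 0..15
                       (base >> 16) & 0xFF,            # base bits 16..23
                       access,
                       flags | ((limit >> 16) & 0x0F), # attributes + limit 16..19
                       (base >> 24) & 0xFF)            # base bits 24..31


# A 4 GB read/write data segment: limit FFFFFh counted in 4 KB pages.
flat_data = make_386_descriptor(0, 0xFFFFF, 0x92, granular=True, default_32bit=True)
print(flat_data.hex())  # ffff00000092cf00
```

With both attribute bits clear and the upper limit and base bits zero, the last two bytes are zero, which matches the reserved word of a '286 entry.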
EXITING PROTECTED MODE
Like entering protected mode, exiting it differs from the '286 to 80386-class machines. The '386 simply clears a bit in the CPU control register CR0, while the '286 must reset the CPU. Resetting the CPU isn't without its costs, as many hundreds -- if not thousands -- of clock cycles pass in the time it takes to reset the CPU and return control back to the user program. The original method employed by IBM used the keyboard controller by connecting another output pin to the CPU RESET line. By issuing the proper command, the KBC would toggle the RESET line on the CPU. This method works, but it is very slow. Many new generation '286 chip sets have a "FAST RESET" feature. These chip sets toggle the RESET line by simply writing to an I/O port. When available, FAST RESET is the preferred method. But there is a third, obscure, but efficient method for resetting the CPU without using the KBC or FAST RESET. This method is elegant, faster than using the KBC, and works on the '386 WITHOUT resetting the CPU! It is truly the most elegant, comprehensive way to exit protected mode, since it works on both the '286 and '386 -- in the most efficient way possible for each CPU. Listing 2 provides the code necessary to use the KBC and this elegant technique.
Using the KBC to reset the CPU is a straightforward technique, but in order to understand the elegant technique, some explanation is required. Recall that in our discussion of interrupts, the CPU checks the interrupt number (x8) against the limit field in the interrupt descriptor table register (IDTR). If this test passes, then the next phase of interrupt processing begins. But if the test fails, then the CPU generates a DOUBLE FAULT (INT08). For example, let us suppose the limit field in the IDTR=80h: our IDT will service 16 interrupts, 00-15. If interrupt 16 or above were generated, the CPU would DOUBLE FAULT because a fault was generated at the inception of the interrupt calling sequence. Now, suppose the limit field in the IDTR=0, thus inhibiting all interrupts from being serviced. Any interrupt generation would cause the DOUBLE FAULT. But the DOUBLE FAULT itself would cause a fault, due to the limit being less than 40h. This ultimately would cause a TRIPLE FAULT, and the CPU would enter a shutdown cycle. The shutdown cycle doesn't reset the CPU, as a shutdown cycle is considered a BUS cycle. External hardware is attached to the CPU to recognize the shutdown cycle. When a shutdown cycle is observed, the external hardware toggles the RESET input of the CPU. Therefore, all we need to do to cause the RESET is set the IDTR.LIMIT=0, then generate an interrupt. For elegance, we don't just INT the CPU, we generate an invalid opcode. Our opcode is a carefully chosen opcode that doesn't exist on the '286, but does exist on the '386. The elegance in the algorithm is in the opcode chosen for this purpose: MOV CR0,EAX. This will generate the desired invalid opcode exception on the '286, but is the first instruction in a sequence to exit protected mode on the '386. Thus the '286 gets RESET, and the '386 falls through and exits protected mode gracefully.
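The fault cascade can be traced with a small simulation that follows the simplified sequence described above rather than the exact behavior of any particular CPU. It assumes that an 8-byte IDT entry must lie entirely within the limit to be usable. The names entry_fits and deliver are hypothetical.

```python
DOUBLE_FAULT = 8  # INT 08h
INVALID_OPCODE = 6


def entry_fits(vector, idt_limit):
    """True if the 8-byte IDT entry for `vector` lies within the limit."""
    return vector * 8 + 7 <= idt_limit


def deliver(vector, idt_limit):
    """Describe what happens when `vector` is raised with the given IDT limit."""
    if entry_fits(vector, idt_limit):
        return "vector %d serviced" % vector
    if entry_fits(DOUBLE_FAULT, idt_limit):
        return "double fault serviced"
    return "triple fault: shutdown cycle"


print(deliver(INVALID_OPCODE, 0x80))  # vector 6 serviced
print(deliver(16, 0x80))              # double fault serviced
print(deliver(INVALID_OPCODE, 0))     # triple fault: shutdown cycle
```

The last call corresponds to the '286 case: with the limit set to 0, the invalid opcode exception cannot be serviced, neither can the double fault, and the CPU enters shutdown.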
Exiting protected mode on the '286 and '386 closely resembles reversing the steps for entering protected mode. On the '286, you must:
- Reset the CPU to get into real mode
- Load the segment registers with real mode compatible values
- Restore SS:SP
- Inhibit A20 from the address bus (gate A20 off)
- Restore the PIC masks
And on the '386, the steps are simply:
- Load the segment registers with real-mode compatible values
- Reset the Protection Enable (PE) bit in CR0
- Load the segment registers with real mode values
- Inhibit A20 from the address bus (gate A20 off)
(Listing 3 includes the subroutines needed to restore the machine state after exiting protected mode.)
Notice that exiting protected mode on the '386 requires loading the segment registers twice. The segment registers are loaded the first time to assure that real-mode compatible values are stored in the hidden descriptor cache registers -- as the descriptor cache registers "honor" the access attributes, and segment size limit, from protected mode, even when loaded in real mode. The segment registers are loaded the second time to define them with real-mode segment values.
Now that we have all the tools and theory necessary to enter and exit protected mode, we can apply this knowledge to write a program that enters protected mode, moves a block of data from extended memory, and exits protected mode -- returning to DOS. Listing 4 shows a program that consists of these basic steps and can be used to move a 1k block of data from 1M to our program's data segment.
Applications programming for real mode and protected mode isn't that different. Both modes use memory segmentation, interrupts, and device drivers to support the hardware. Whether in real mode or protected mode, a set of user-inaccessible registers -- called descriptor cache registers -- plays a major role in memory segmentation and memory management. The descriptor cache registers contain information defining the segment base address, segment size limit, and segment access attributes, and are used for all memory references -- regardless of the values in the segment registers.
Entering and exiting protected mode requires nothing more than following the mechanics necessary for the proper mode transition: entering protected mode requires saving the machine state that needs to be restored upon exiting protected mode. The mechanics of entering real mode depend on the type of the CPU: the '286 requires a reset to enter real mode, and the '386 can enter real mode under program control. By applying our knowledge of how the CPU internally operates, we can write source code that exits protected mode in the manner best suited, and most elegant, for the given CPU.