This chapter describes the 80386 application programming environment as seen by assembly language programmers when the processor is executing in protected mode. The chapter introduces programmers to those features of the 80386 architecture that directly affect the design and implementation of 80386 applications programs. Other chapters discuss 80386 features that relate to systems programming or to compatibility with other processors of the 8086 family.

The basic programming model consists of these aspects:

Memory organization and segmentation
Data types
Registers
Instruction format
Operand selection
Interrupts and exceptions

Note that input/output is not included as part of the basic programming model. Systems designers may choose to make I/O instructions available to applications or may choose to reserve these functions for the operating system. For this reason, the I/O features of the 80386 are discussed in Part II.

Chapter 2 -- Basic Programming Model
prev: Chapter 2 -- Basic Programming Model
next: 2.2 Data Types

2.1 Memory Organization and Segmentation

The physical memory of an 80386 system is organized as a sequence of 8-bit bytes. Each byte is assigned a unique address that ranges from zero to a maximum of 2^(32) -1 (4 gigabytes).

80386 programs, however, are independent of the physical address space. This means that programs can be written without knowledge of how much physical memory is available and without knowledge of exactly where in physical memory the instructions and data are located.

The model of memory organization seen by applications programmers is determined by systems-software designers. The architecture of the 80386 gives designers the freedom to choose a model for each task. The model of memory organization can range between the following extremes:

A "flat" address space consisting of a single array of up to 4 gigabytes.
A segmented address space consisting of a collection of up to 16,383 linear address spaces of up to 4 gigabytes each.

Both models can provide memory protection. Different tasks may employ different models of memory organization. The criteria that designers use to determine a memory organization model and the means that systems programmers use to implement that model are covered in Part -- Programming.

2.1.1 The "Flat" Model

In a "flat" model of memory organization, the applications programmer sees a single array of up to 2^(32) bytes (4 gigabytes). While the physical memory can contain up to 4 gigabytes, it is usually much smaller; the processor maps the 4 gigabyte flat space onto the physical address space by the address translation mechanisms described in Chapter 5 . Applications programmers do not need to know the details of the mapping.

A pointer into this flat address space is a 32-bit ordinal number that may range from 0 to 2^(32) -1. Relocation of separately-compiled modules in this space must be performed by systems software (e.g., linkers, locators, binders, loaders).

2.1.2 The Segmented Model

In a segmented model of memory organization, the address space as viewed by an applications program (called the logical address space) is a much larger space of up to 2^(46) bytes (64 terabytes). The processor maps the 64 terabyte logical address space onto the physical address space (up to 4 gigabytes ) by the address translation mechanisms described in Chapter 5 . Applications programmers do not need to know the details of this mapping.

Applications programmers view the logical address space of the 80386 as a collection of up to 16,383 one-dimensional subspaces, each with a specified length. Each of these linear subspaces is called a segment. A segment is a unit of contiguous address space. Segment sizes may range from one byte up to a maximum of 2^(32) bytes (4 gigabytes).

A complete pointer in this address space consists of two parts (see Figure 2-1 ):

A segment selector, which is a 16-bit field that identifies a segment.
An offset, which is a 32-bit ordinal that addresses to the byte level within a segment.

During execution of a program, the processor associates with a segment selector the physical address of the beginning of the segment. Separately compiled modules can be relocated at run time by changing the base address of their segments. The size of a segment is variable; therefore, a segment can be exactly the size of the module it contains.

2.2 Data Types

Bytes, words, and doublewords are the fundamental data types (refer to Figure 2-2 ). A byte is eight contiguous bits starting at any logical address. The bits are numbered 0 through 7; bit zero is the least significant bit.

A word is two contiguous bytes starting at any byte address. A word thus contains 16 bits. The bits of a word are numbered from 0 through 15; bit 0 is the least significant bit. The byte containing bit 0 of the word is called the low byte; the byte containing bit 15 is called the high byte.

Each byte within a word has its own address, and the smaller of the addresses is the address of the word. The byte at this lower address contains the eight least significant bits of the word, while the byte at the higher address contains the eight most significant bits.

A doubleword is two contiguous words starting at any byte address. A doubleword thus contains 32 bits. The bits of a doubleword are numbered from 0 through 31; bit 0 is the least significant bit. The word containing bit 0 of the doubleword is called the low word; the word containing bit 31 is called the high word.

Each byte within a doubleword has its own address, and the smallest of the addresses is the address of the doubleword. The byte at this lowest address contains the eight least significant bits of the doubleword, while the byte at the highest address contains the eight most significant bits. Figure 2-3 illustrates the arrangement of bytes within words anddoublewords.

Note that words need not be aligned at even-numbered addresses and doublewords need not be aligned at addresses evenly divisible by four. This allows maximum flexibility in data structures (e.g., records containing mixed byte, word, and doubleword items) and efficiency in memory utilization. When used in a configuration with a 32-bit bus, actual transfers of data between processor and memory take place in units of doublewords beginning at addresses evenly divisible by four; however, the processor converts requests for misaligned words or doublewords into the appropriate sequences of requests acceptable to the memory interface. Such misaligned data transfers reduce performance by requiring extra memory cycles. For maximum performance, data structures (including stacks) should be designed in such a way that, whenever possible, word operands are aligned at even addresses and doubleword operands are aligned at addresses evenly divisible by four. Due to instruction prefetching and queuing within the CPU, there is no requirement for instructions to be aligned on word or doubleword boundaries. (However, a slight increase in speed results if the target addresses of control transfers are evenly divisible by four.)

Although bytes, words, and doublewords are the fundamental types of operands, the processor also supports additional interpretations of these operands. Depending on the instruction referring to the operand, the following additional data types are recognized:

Integer:: A signed binary numeric value contained in a 32-bit doubleword, 16-bit word, or 8-bit byte. All operations assume a 2's complement representation. The sign bit is located in bit 7 in a byte, bit 15 in a word, and bit 31 in a doubleword. The sign bit has the value zero for positive integers and one for negative. Since the high-order bit is used for a sign, the range of an 8-bit integer is -128 through +127; 16-bit integers may range from -32,768 through +32,767; 32-bit integers may range from -2^(31) through +2^(31) -1. The value zero has a positive sign.
Ordinal:: An unsigned binary numeric value contained in a 32-bit doubleword, 16-bit word, or 8-bit byte. All bits are considered in determining magnitude of the number. The value range of an 8-bit ordinal number is 0-255; 16 bits can represent values from 0 through 65,535; 32 bits can represent values from 0 through 2^(32) -1.
Near Pointer:: A 32-bit logical address. A near pointer is an offset within a segment. Near pointers are used in either a flat or a segmented model of memory organization.
Far Pointer:: A 48-bit logical address of two components: a 16-bit segment selector component and a 32-bit offset component. Far pointers are used by applications programmers only when systems designers choose a segmented memory organization.
String:: A contiguous sequence of bytes, words, or doublewords. A string may contain from zero bytes to 2^(32) -1 bytes (4 gigabytes).
Bit field:: A contiguous sequence of bits. A bit field may begin at any bit position of any byte and may contain up to 32 bits.
Bit string:: A contiguous sequence of bits. A bit string may begin at any bit position of any byte and may contain up to 2^(32) -1 bits.
BCD:: A byte (unpacked) representation of a decimal digit in the range 0 through 9. Unpacked decimal numbers are stored as unsigned byte quantities. One digit is stored in each byte. The magnitude of the number is determined from the low-order half-byte; hexadecimal values 0-9 are valid and are interpreted as decimal numbers. The high-order half-byte must be zero for multiplication and division; it may contain any value for addition and subtraction.
Packed BCD:: A byte (packed) representation of two decimal digits, each in the range 0 through 9. One digit is stored in each half-byte. The digit in the high-order half-byte is the most significant. Values 0-9 are valid in each half-byte. The range of a packed decimal byte is 0-99.

Figure 2-4 graphically summarizes the data types supported by the 80386.

2.3 Registers

The 80386 contains a total of sixteen registers that are of interest to the applications programmer. As Figure 2-5 shows, these registers may be grouped into these basic categories:

General registers. These eight 32-bit general-purpose registers are used primarily to contain operands for arithmetic and logical operations.
Segment registers. These special-purpose registers permit systems software designers to choose either a flat or segmented model of memory organization. These six registers determine, at any given time, which segments of memory are currently addressable.
Status and instruction registers. These special-purpose registers are used to record and alter certain aspects of the 80386 processor state.

2.3.1 General Registers

The general registers of the 80386 are the 32-bit registers EAX, EBX, ECX, EDX, EBP, ESP, ESI, and EDI. These registers are used interchangeably to contain the operands of logical and arithmetic operations. They may also be used interchangeably for operands of address computations (except that ESP cannot be used as an index operand).

As Figure 2-5 shows, the low-order word of each of these eight registers has a separate name and can be treated as a unit. This feature is useful for handling 16-bit data items and for compatibility with the 8086 and 80286 processors. The word registers are named AX, BX, CX, DX, BP, SP, SI, and DI.

Figure 2-5 also illustrates that each byte of the 16-bit registers AX, BX, CX, and DX has a separate name and can be treated as a unit. This feature is useful for handling characters and other 8-bit data items. The byte registers are named AH, BH, CH, and DH (high bytes); and AL, BL, CL, and DL (low bytes).

All of the general-purpose registers are available for addressing calculations and for the results of most arithmetic and logical calculations; however, a few functions are dedicated to certain registers. By implicitly choosing registers for these functions, the 80386 architecture can encode instructions more compactly. The instructions that use specific registers include: double-precision multiply and divide, I/O, string instructions, translate, loop, variable shift and rotate, and stack operations.

2.3.2 Segment Registers

The segment registers of the 80386 give systems software designers the flexibility to choose among various models of memory organization. Implementation of memory models is the subject of Part II -- Systems Programming. Designers may choose a model in which applications programs do not need to modify segment registers, in which case applications programmers may skip this section.

Complete programs generally consist of many different modules, each consisting of instructions and data. However, at any given time during program execution, only a small subset of a program's modules are actually in use. The 80386 architecture takes advantage of this by providing mechanisms to support direct access to the instructions and data of the current module's environment, with access to additional segments on demand.

At any given instant, six segments of memory may be immediately accessible to an executing 80386 program. The segment registers CS, DS, SS, ES, FS, and GS are used to identify these six current segments. Each of these registers specifies a particular kind of segment, as characterized by the associated mnemonics ("code," "data," or "stack") shown in Figure 2-6 . Each register uniquely determines one particular segment, from among the segments that make up the program, that is to be immediately accessible at highest speed.

The segment containing the currently executing sequence of instructions is known as the current code segment; it is specified by means of the CS register. The 80386 fetches all instructions from this code segment, using as an offset the contents of the instruction pointer. CS is changed implicitly as the result of intersegment control-transfer instructions (for example, CALL and JMP), interrupts, and exceptions.

Subroutine calls, parameters, and procedure activation records usually require that a region of memory be allocated for a stack. All stack operations use the SS register to locate the stack. Unlike CS, the SS register can be loaded explicitly, thereby permitting programmers to define stacks dynamically.

The DS, ES, FS, and GS registers allow the specification of four data segments, each addressable by the currently executing program. Accessibility to four separate data areas helps programs efficiently access different types of data structures; for example, one data segment register can point to the data structures of the current module, another to the exported data of a higher-level module, another to a dynamically created data structure, and another to data shared with another task. An operand within a data segment is addressed by specifying its offset either directly in an instruction or indirectly via general registers.

Depending on the structure of data (e.g., the way data is parceled into one or more segments), a program may require access to more than four data segments. To access additional segments, the DS, ES, FS, and GS registers can be changed under program control during the course of a program's execution. This simply requires that the program execute an instruction to load the appropriate segment register prior to executing instructions that access the data.

The processor associates a base address with each segment selected by a segment register. To address an element within a segment, a 32-bit offset is added to the segment's base address. Once a segment is selected (by loading the segment selector into a segment register), a data manipulation instruction only needs to specify the offset. Simple rules define which segment register is used to form an address when only an offset is specified.

2.3.3 Stack Implementation

Stack operations are facilitated by three registers:

The stack segment (SS) register. Stacks are implemented in memory. A system may have a number of stacks that is limited only by the maximum number of segments. A stack may be up to 4 gigabytes long, the maximum length of a segment. One stack is directly addressable at a -- one located by SS. This is the current stack, often referred to simply as "the" stack. SS is used automatically by the processor for all stack operations.

The stack pointer (ESP) register. ESP points to the top of the push-down stack (TOS). It is referenced implicitly by PUSH and POP operations, subroutine calls and returns, and interrupt operations. When an item is pushed onto the stack (see Figure 2-7 ), the processor decrements ESP, then writes the item at the new TOS. When an item is popped off the stack, the processor copies it from TOS, then increments ESP. In other words, the stack grows down in memory toward lesser addresses.
The stack-frame base pointer (EBP) register. The EBP is the best choice of register for accessing data structures, variables and dynamically allocated work space within the stack. EBP is often used to access elements on the stack relative to a fixed point on the stack rather than relative to the current TOS. It typically identifies the base address of the current stack frame established for the current procedure. When EBP is used as the base register in an offset calculation, the offset is calculated automatically in the current stack segment (i.e., the segment currently selected by SS). Because SS does not have to be explicitly specified, instruction encoding in such cases is more efficient. EBP can also be used to index into segments addressable via other segment registers.

2.3.4 Flags Register

The flags register is a 32-bit register named EFLAGS. Figure 2-8 defines the bits within this register. The flags control certain operations and indicate the status of the 80386.

The low-order 16 bits of EFLAGS is named FLAGS and can be treated as a unit. This feature is useful when executing 8086 and 80286 code, because this part of EFLAGS is identical to the FLAGS register of the 8086 and the 80286.

The flags may be considered in three groups: the status flags, the control flags, and the systems flags. Discussion of the systems flags is delayed until Part II.

2.3.4.1 Status Flags

The status flags of the EFLAGS register allow the results of one instruction to influence later instructions. The arithmetic instructions use OF, SF, ZF, AF, PF, and CF. The SCAS (Scan String), CMPS (Compare String), and LOOP instructions use ZF to signal that their operations are complete. There are instructions to set, clear, and complement CF before execution of an arithmetic instruction. Refer to Appendix C for definition of each status flag.

2.3.4.2 Control Flag

The control flag DF of the EFLAGS register controls string instructions. DF (Direction Flag, bit 10) Setting DF causes string instructions to auto-decrement; that is, to process strings from high addresses to low addresses. Clearing DF causes string instructions to auto-increment, or to process strings from low addresses to high addresses.

2.3.4.3 Instruction Pointer

The instruction pointer register (EIP) contains the offset address, relative to the start of the current code segment, of the next sequential instruction to be executed. The instruction pointer is not directly visible to the programmer; it is controlled implicitly by control-transfer instructions, interrupts, and exceptions.

As Figure 2-9 shows, the low-order 16 bits of EIP is named IP and can be used by the processor as a unit. This feature is useful when executing instructions designed for the 8086 and 80286 processors.

2.4 Instruction Format

The information encoded in an 80386 instruction includes a specification of the operation to be performed, the type of the operands to be manipulated, and the location of these operands. If an operand is located in memory, the instruction must also select, explicitly or implicitly, which of the currently addressable segments contains the operand.

80386 instructions are composed of various elements and have various formats. The exact format of instructions is shown in Appendix B; the elements of instructions are described below. Of these instruction elements, only one, the opcode, is always present. The other elements may or may not be present, depending on the particular operation involved and on the location and type of the operands. The elements of an instruction, in order of occurrence are as follows:

Prefixes -- one or more bytes preceding an instruction that modify the operation of the instruction. The following types of prefixes can be used by applications programs:
1. Segment override -- explicitly specifies which segment register an instruction should use, thereby overriding the default segment-register selection used by the 80386 for that instruction.
2. Address size -- switches between 32-bit and 16-bit address generation.
3. Operand size -- switches between 32-bit and 16-bit operands.
4. Repeat -- used with a string instruction to cause the instruction to act on each element of the string.
Opcode -- specifies the operation performed by the instruction. Some operations have several different opcodes, each specifying a different variant of the operation.
Register specifier -- an instruction may specify one or two register operands. Register specifiers may occur either in the same byte as the opcode or in the same byte as the addressing-mode specifier.
Addressing-mode specifier -- when present, specifies whether an operand is a register or memory location; if in memory, specifies whether a displacement, a base register, an index register, and scaling are to be used.
SIB (scale, index, base) byte -- when the addressing-mode specifier indicates that an index register will be used to compute the address of an operand, an SIB byte is included in the instruction to encode the base register, the index register, and a scaling factor.
Displacement -- when the addressing-mode specifier indicates that a displacement will be used to compute the address of an operand, the displacement is encoded in the instruction. A displacement is a signed integer of 32, 16, or eight bits. The eight-bit form is used in the common case when the displacement is sufficiently small. The processor extends an eight-bit displacement to 16 or 32 bits, taking into account the sign.
Immediate operand -- when present, directly provides the value of an operand of the instruction. Immediate operands may be 8, 16, or 32 bits wide. In cases where an eight-bit immediate operand is combined in some way with a 16- or 32-bit operand, the processor automatically extends the size of the eight-bit operand, taking into account the sign.

2.5 Operand Selection

An instruction can act on zero or more operands, which are the data manipulated by the instruction. An example of a zero-operand instruction is NOP (no operation). An operand can be in any of these locations:

In the instruction itself (an immediate operand)
In a register (EAX, EBX, ECX, EDX, ESI, EDI, ESP, or EBP in the case of 32-bit operands; AX, BX, CX, DX, SI, DI, SP, or BP in the case of 16-bit operands; AH, AL, BH, BL, CH, CL, DH, or DL in the case of 8-bit operands; the segment registers; or the EFLAGS register for flag operations)
In memory
At an I/O port

Immediate operands and operands in registers can be accessed more rapidly than operands in memory since memory operands must be fetched from memory. Register operands are available in the CPU. Immediate operands are also available in the CPU, because they are prefetched as part of the instruction.

Of the instructions that have operands, some specify operands implicitly; others specify operands explicitly; still others use a combination of implicit and explicit specification; for example:

Implicit operand: AAM: By definition, AAM (ASCII adjust for multiplication) operates on the contents of the AX register.
Explicit operand: XCHG EAX, EBX: The operands to be exchanged are encoded in the instruction after the opcode.
Implicit and explicit operands: PUSH COUNTER: The memory variable COUNTER (the explicit operand) is copied to the top of the stack (the implicit operand).

Note that most instructions have implicit operands. All arithmetic instructions, for example, update the EFLAGS register.

An 80386 instruction can explicitly reference one or two operands. Two-operand instructions, such as MOV, ADD, XOR, etc., generally overwrite one of the two participating operands with the result. A distinction can thus be made between the source operand (the one unaffected by the operation) and the destination operand (the one overwritten by the result).

For most instructions, one of the two explicitly specified -- the source or the -- be either in a register or in memory. The other operand must be in a register or be an immediate source operand. Thus, the explicit two-operand instructions of the 80386 permit operations of the following kinds:

Register-to-register
Register-to-memory
Memory-to-register
Immediate-to-register
Immediate-to-memory

Certain string instructions and stack manipulation instructions, however, transfer data from memory to memory. Both operands of some string instructions are in memory and are implicitly specified. Push and pop stack operations allow transfer between memory operands and the memory-based stack.

2.5.1 Immediate Operands

Certain instructions use data from the instruction itself as one (and sometimes two) of the operands. Such an operand is called an immediate operand. The operand may be 32-, 16-, or 8-bits long. For example:

 SHR PATTERN, 2

One byte of the instruction holds the value 2, the number of bits by which to shift the variable PATTERN.

 TEST PATTERN, 0FFFF00FFH

A doubleword of the instruction holds the mask that is used to test the variable PATTERN.

2.5.2 Register Operands

Operands may be located in one of the 32-bit general registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, or EBP), in one of the 16-bit general registers (AX, BX, CX, DX, SI, DI, SP, or BP), or in one of the 8-bit general registers (AH, BH, CH, DH, AL, BL, CL,or DL).

The 80386 has instructions for referencing the segment registers (CS, DS, ES, SS, FS, GS). These instructions are used by applications programs only if systems designers have chosen a segmented memory model.

The 80386 also has instructions for referring to the flag register. The flags may be stored on the stack and restored from the stack. Certain instructions change the commonly modified flags directly in the EFLAGS register. Other flags that are seldom modified can be modified indirectly via the flags image in the stack.

2.5.3 Memory Operands

Data-manipulation instructions that address operands in memory must specify (either directly or indirectly) the segment that contains the operand and the offset of the operand within the segment. However, for speed and compact instruction encoding, segment selectors are stored in the high speed segment registers. Therefore, data-manipulation instructions need to specify only the desired segment register and an offset in order to address a memory operand.

An 80386 data-manipulation instruction that accesses memory uses one of the following methods for specifying the offset of a memory operand within its segment:

Most data-manipulation instructions that access memory contain a byte that explicitly specifies the addressing method for the operand. A byte, known as the modR/M byte, follows the opcode and specifies whether the operand is in a register or in memory. If the operand is in memory, the address is computed from a segment register and any of the following values: a base register, an index register, a scaling factor, a displacement. When an index register is used, the modR/M byte is also followed by another byte that identifies the index register and scaling factor. This addressing method is the most flexible.
A few data-manipulation instructions implicitly use specialized addressing methods:
- For a few short forms of MOV that implicitly use the EAX register, the offset of the operand is coded as a doubleword in the instruction. No base register, index register, or scaling factor are used.
- String operations implicitly address memory via DS:ESI, (MOVS, CMPS, OUTS, LODS, SCAS) or via ES:EDI (MOVS, CMPS, INS, STOS).
- Stack operations implicitly address operands via SS:ESP registers; e.g., PUSH, POP, PUSHA, PUSHAD, POPA, POPAD, PUSHF, PUSHFD, POPF, POPFD, CALL, RET, IRET, IRETD, exceptions, and interrupts.

2.5.3.1 Segment Selection

Data-manipulation instructions need not explicitly specify which segment register is used. For all of these instructions, specification of a segment register is optional. For all memory accesses, if a segment is not explicitly specified by the instruction, the processor automatically chooses a segment register according to the rules of Table 2-1. (If systems designers have chosen a flat model of memory organization, the segment registers and the rules that the processor uses in choosing them are not apparent to applications programs.)

There is a close connection between the kind of memory reference and the segment in which that operand resides. As a rule, a memory reference implies the current data segment (i.e., the implicit segment selector is in DS). However, ESP and EBP are used to access items on the stack; therefore, when the ESP or EBP register is used as a base register, the current stack segment is implied (i.e., SS contains the selector).

Special instruction prefix elements may be used to override the default segment selection. Segment-override prefixes allow an explicit segment selection. The 80386 has a segment-override prefix for each of the segment registers. Only in the following special cases is there an implied segment selection that a segment prefix cannot override:

The use of ES for destination strings in string instructions.
The use of SS in stack instructions.
The use of CS for instruction fetches.

 Table 2-1. Default Segment Register Selection Rules
 
 Memory Reference Needed  Segment     Implicit Segment Selection Rule
 Register
 Used
 
 Instructions             Code (CS)   Automatic with instruction prefetch
 Stack                    Stack (SS)  All stack pushes and pops. Any
 memory reference that uses ESP or
 EBP as a base register.
 Local Data               Data (DS)   All data references except when
 relative to stack or string
 destination.
 Destination Strings      Extra (ES)  Destination of string instructions.

2.5.3.2 Effective-Address Computation

The modR/M byte provides the most flexible of the addressing methods, and instructions that require a modR/M byte as the second byte of the instruction are the most common in the 80386 instruction set. For memory operands defined by modR/M, the offset within the desired segment is calculated by taking the sum of up to three components:

A displacement element in the instruction.
A base register.
An index register. The index register may be automatically multiplied by a scaling factor of 2, 4, or 8.

The offset that results from adding these components is called an effective address. Each of these components of an effective address may have either a positive or negative value. If the sum of all the components exceeds 2^(32), the effective address is truncated to 32 bits. Figure 2-10 illustrates the full set of possibilities for modR/M addressing.

The displacement component, because it is encoded in the instruction, is useful for fixed aspects of addressing; for example:

Location of simple scalar operands.
Beginning of a statically allocated array.
Offset of an item within a record.

The base and index components have similar functions. Both utilize the same set of general registers. Both can be used for aspects of addressing that are determined dynamically; for example:

Location of procedure parameters and local variables in stack.
The beginning of one record among several occurrences of the same record type or in an array of records.
The beginning of one dimension of multiple dimension array.
The beginning of a dynamically allocated array.

The uses of general registers as base or index components differ in the following respects:

ESP cannot be used as an index register.
When ESP or EBP is used as the base register, the default segment is the one selected by SS. In all other cases the default segment is DS.

The scaling factor permits efficient indexing into an array in the common cases when array elements are 2, 4, or 8 bytes wide. The shifting of the index register is done by the processor at the time the address is evaluated with no performance loss. This eliminates the need for a separate shift or multiply instruction.

The base, index, and displacement components may be used in any combination; any of these components may be null. A scale factor can be used only when an index is also used. Each possible combination is useful for data structures commonly used by programmers in high-level languages and assembly languages. Following are possible uses for some of the various combinations of address components.

DISPLACEMENT

The displacement alone indicates the offset of the operand. This combination is used to directly address a statically allocated scalar operand. An 8-bit, 16-bit, or 32-bit displacement can be used.

BASE

The offset of the operand is specified indirectly in one of the general registers, as for "based" variables.

BASE + DISPLACEMENT

A register and a displacement can be used together for two distinct purposes:

Index into static array when element size is not 2, 4, or 8 bytes. The displacement component encodes the offset of the beginning of the array. The register holds the results of a calculation to determine the offset of a specific element within the array.
Access item of a record. The displacement component locates an within record. The base register selects one of several occurrences of record, thereby providing a compact encoding for this common function.

An important special case of this combination, is to access parameters in the procedure activation record in the stack. In this case, EBP is the best choice for the base register, because when EBP is used as a base register, the processor automatically uses the stack segment register (SS) to locate the operand, thereby providing a compact encoding for this common function.

(INDEX * SCALE) + DISPLACEMENT

This combination provides efficient indexing into a static array when the element size is 2, 4, or 8 bytes. The displacement addresses the beginning of the array, the index register holds the subscript of the desired array element, and the processor automatically converts the subscript into an index by applying the scaling factor.

BASE + INDEX + DISPLACEMENT

Two registers used together support either a two-dimensional array (the displacement determining the beginning of the array) or one of several instances of an array of records (the displacement indicating an item in the record).

BASE + (INDEX * SCALE) + DISPLACEMENT

This combination provides efficient indexing of a two-dimensional array when the elements of the array are 2, 4, or 8 bytes wide.

2.6 Interrupts and Exceptions

The 80386 has two mechanisms for interrupting program execution:

Exceptions are synchronous events that are the responses of the CPU to certain conditions detected during the execution of an instruction.
Interrupts are asynchronous events typically triggered by external devices needing attention.

Interrupts and exceptions are alike in that both cause the processor to temporarily suspend its present program execution in order to execute a program of higher priority. The major distinction between these two kinds of interrupts is their origin. An exception is always reproducible by re-executing with the program and data that caused the exception, whereas an interrupt is generally independent of the currently executing program.

Application programmers are not normally concerned with servicing interrupts. More information on interrupts for systems programmers may be found in Chapter 9 . Certain exceptions , however, are of interest to applications programmers, and many operating systems give applications programs the opportunity to service these exceptions. However, the operating system itself defines the interface between the applications programs and the exception mechanism of the 80386.

Table 2-2 highlights the exceptions that may be of interest to applications programmers.

A divide error exception results when the instruction DIV or IDIV is executed with a zero denominator or when the quotient is too large for the destination operand . (Refer to Chapter 3 for a discussion of DIV and IDIV.)
The debug exception may be reflected back to an applications program if it results from the trap flag (TF).
A breakpoint exception results when the instruction INT 3 is executed. This instruction is used by some debuggers to stop program execution at specific points.
An overflow exception results when the INTO instruction is executed and the OF (overflow) flag is set (after an arithmetic operation that set the OF flag ) . (Refer to Chapter 3 for a discussion of INTO) .
A bounds check exception results when the BOUND instruction is executed and the array index it checks falls outside the bounds of the array . (Refer to Chapter 3 for a discussion of the BOUND instruction. )
Invalid opcodes may be used by some applications to extend the instruction set. In such a case, the invalid opcode exception presents an opportunity to emulate the opcode.
The "coprocessor not available" exception occurs if the program contains instructions for a coprocessor, but no coprocessor is present in the system.
A coprocessor error is generated when a coprocessor detects an illegal operation.

The instruction INT generates an interrupt whenever it is executed; the processor treats this interrupt as an exception. The effects of this interrupt (and the effects of all other exceptions) are determined by exception handler routines provided by the application program or as part of the systems software (provided by systems programmers). The INT instruction itself is discussed in Chapter 3. Refer to Chapter 9 for a more complete description of exceptions.

 Table 2-2. 80386 Reserved Exceptions and Interrupts
 
 Vector Number      Description
 
 0                  Divide Error
 1                  Debug Exceptions
 2                  NMI Interrupt
 3                  Breakpoint
 4                  INTO Detected Overflow
 5                  BOUND Range Exceeded
 6                  Invalid Opcode
 7                  Coprocessor Not Available
 8                  Double Exception
 9                  Coprocessor Segment Overrun
 10                 Invalid Task State Segment
 11                 Segment Not Present
 12                 Stack Fault
 13                 General Protection
 14                 Page Fault
 15                 (reserved)
 16                 Coprocessor Error
 17-32              (reserved)

4.1 Systems Registers

The registers designed for use by systems programmers fall into these classes:

EFLAGS
Memory-Management Registers
Control Registers
Debug Registers
Test Registers

4.1.1 Systems Flags

The systems flags of the EFLAGS register control I/O, maskable interrupts, debugging, task switching, and enabling of virtual 8086 execution in a protected, multitasking environment. These flags are highlighted in Figure 4-1 .

IF (Interrupt-Enable Flag, bit 9): Setting IF allows the CPU to recognize external (maskable) interrupt requests. Clearing IF disables these interrupts. IF has no effect on either exceptions or nonmaskable external interrupts . Refer to Chapter 9 for more details about interrupts .
NT (Nested Task, bit 14): The processor uses the nested task flag to control chaining of interrupted and called tasks. NT influences the operation of the IRET instruction . Refer to Chapter 7 and Chapter 9 for more information on nested tasks.
RF (Resume Flag, bit 16): The RF flag temporarily disables debug exceptions so that an instruction can be restarted after a debug exception without immediately causing another debug exception . Refer to Chapter 12 for details .
TF (Trap Flag, bit 8): Setting TF puts the processor into single-step mode for debugging. In this mode, the CPU automatically generates an exception after each instruction, allowing a program to be inspected as it executes each instruction. Single-stepping is just one of several debugging features of the 80386 . Refer to Chapter 12 for additional information .
VM (Virtual 8086 Mode, bit 17): When set, the VM flag indicates that the task is executing an 8086 program . Refer to Chapter 14 for a detailed discussion of how the 80386 executes 8086 tasks in a protected, multitasking environment.

4.1.2 Memory-Management Registers

Four registers of the 80386 locate the data structures that control segmented memory management:

GDTR Global Descriptor Table Register
LDTR Local Descriptor Table Register

4.1.3 Control Registers

Figure 4-2 shows the format of the 80386 control registers CR0, CR2, and CR3. These registers are accessible to systems programmers only via variants of the MOV instruction, which allow them to be loaded from or stored in general registers; for example:

 MOV EAX, CR0
 MOV CR3, EBX

CR0 contains system control flags, which control or indicate conditions that apply to the system as a whole, not to an individual task.

EM (Emulation, bit 2): EM indicates whether coprocessor functions are to be emulated. Refer to Chapter 11 for details .
ET (Extension Type, bit 4): ET indicates the type of coprocessor present in the system (80287 or 80387 ) . Refer to Chapter 11 and Chapter 10 for details.
MP (Math Present, bit 1): MP controls the function of the WAIT instruction, which is used to coordinate a coprocessor . Refer to Chapter 11 for details .
PE (Protection Enable, bit 0): Setting PE causes the processor to begin executing in protected mode. Resetting PE returns to real-address mode . Refer to Chapter 14 and Chapter 10 for more information on changing processor modes .
PG (Paging, bit 31): PG indicates whether the processor uses page tables to translate linear addresses into physical addresses . Refer to Chapter 5 for a description of page translation; refer to Chapter 10 for a discussion of how to set PG.
TS (Task Switched, bit 3): The processor sets TS with every task switch and tests TS when interpreting coprocessor instructions . Refer to Chapter 11 for details .

CR2 is used for handling page faults when PG is set. The processor stores in CR2 the linear address that triggers the fault . Refer to Chapter 9 for a description of page-fault handling.

CR3 is used when PG is set. CR3 enables the processor to locate the page table directory for the current task . Refer to Chapter 5 for a description of page tables and page translation.

4.1.4 Debug Register

The debug registers bring advanced debugging abilities to the 80386, including data breakpoints and the ability to set instruction breakpoints without modifying code segments . Refer to Chapter 12 for a complete description of formats and usage.

4.1.5 Test Registers

The test registers are not a standard part of the 80386 architecture. They are provided solely to enable confidence testing of the translation lookaside buffer (TLB), the cache used for storing information from page tables . Chapter 12 explains how to use these registers .

4.2 Systems Instructions

Systems instructions deal with such functions as:

Verification of pointer parameters (refer to Chapter 6):
Addressing descriptor tables (refer to Chapter 5):
Multitasking (refer to Chapter 7):
- LTR -- Load Task Register
- STR -- Store Task Register
Coprocessing and Multiprocessing (refer to Chapter 11):
- CLTS -- Clear Task-Switched Flag
- ESC -- Escape instructions
- WAIT -- Wait until Coprocessor not Busy
- LOCK -- Assert Bus-Lock Signal
Input and Output (refer to Chapter 8):
Interrupt control (refer to Chapter 9):
Debugging (refer to Chapter 12):
- MOV -- Move to and from debug registers
TLB testing (refer to Chapter 10):
- MOV -- Move to and from test registers
System Control:

The instructions SMSW and LMSW are provided for compatibility with the 80286 processor. 80386 programs access the MSW in CR0 via variants of the MOV instruction. HLT stops the processor until receipt of an INTR or RESET signal.

In addition to the chapters cited above, detailed information about each of these instructions can be found in the instruction reference chapter, Chapter 17

Chapter 5 Memory Management

The 80386 transforms logical addresses (i.e., addresses as viewed by programmers) into physical address (i.e., actual addresses in physical memory) in two steps:

Segment translation, in which a logical address (consisting of a segment selector and segment offset) are converted to a linear address.
Page translation, in which a linear address is converted to a physical address. This step is optional, at the discretion of systems-software designers.

These translations are performed in a way that is not visible to applications programmers. Figure 5-1 illustrates the two translations at a high level of abstraction.

Figure 5-1 and the remainder of this chapter present a simplified view of the 80386 addressing mechanism. In reality, the addressing mechanism also includes memory protection features. For the sake of simplicity, however, the subject of protection is taken up in another chapter, Chapter 6.

5.1 Segment Translation

Figure 5-2 shows in more detail how the processor converts a logical address into a linear address.

To perform this translation, the processor uses the following data structures:

Descriptors
Descriptor tables
Selectors
Segment Registers

5.1.1 Descriptors

The segment descriptor provides the processor with the data it needs to map a logical address into a linear address. Descriptors are created by compilers, linkers, loaders, or the operating system, not by applications programmers. Figure 5-3 illustrates the two general descriptor formats. All types of segment descriptors take one of these formats. Segment-descriptor fields are:

BASE: Defines the location of the segment within the 4 gigabyte linear address space. The processor concatenates the three fragments of the base address to form a single 32-bit value.

LIMIT: Defines the size of the segment. When the processor concatenates the two parts of the limit field, a 20-bit value results. The processor interprets the limit field in one of two ways, depending on the setting of the granularity bit:

In units of one byte, to define a limit of up to 1 megabyte.
In units of 4 Kilobytes, to define a limit of up to 4 gigabytes. The limit is shifted left by 12 bits when loaded, and low-order one-bits are inserted.

Granularity bit: Specifies the units with which the LIMIT field is interpreted. When thebit is clear, the limit is interpreted in units of one byte; when set, the limit is interpreted in units of 4 Kilobytes.

TYPE: Distinguishes between various kinds of descriptors.

DPL (Descriptor Privilege Level): Used by the protection mechanism (refer to Chapter 6 ) .

Segment-Present bit: If this bit is zero, the descriptor is not valid for use in address transformation; the processor will signal an exception when a selector for the descriptor is loaded into a segment register. Figure 5-4 shows the format of a descriptor when the present-bit is zero. The operating system is free to use the locations marked AVAILABLE. Operating systems that implement segment-based virtual memory clear the present bit in either of these cases:

When the linear space spanned by the segment is not mapped by the paging mechanism.
When the segment is not present in memory.

Accessed bit: The processor sets this bit when the segment is accessed; i.e., a selector for the descriptor is loaded into a segment register or used by a selector test instruction. Operating systems that implement virtual memory at the segment level may, by periodically testing and clearing this bit, monitor frequency of segment usage.

Creation and maintenance of descriptors is the responsibility of systems software, usually requiring the cooperation of compilers, program loaders or system builders, and therating system.

5.1.2 Descriptor Tables

Segment descriptors are stored in either of two kinds of descriptor table:

The global descriptor table (GDT)
A local descriptor table (LDT)

A descriptor table is simply a memory array of 8-byte entries that contain descriptors, as Figure 5-5 shows. A descriptor table is variable in length and may contain up to 8192 (2^(13)) descriptors. The first entry of the GDT (INDEX=0) is not used by the processor, however.

The processor locates the GDT and the current LDT in memory by means of the GDTR and LDTR registers. These registers store the base addresses of the tables in the linear address space and store the segment limits. The instructions LGDT and SGDT give access to the GDTR; the instructions LLDT and SLDT give access to the LDTR.

5.1.3 Selectors

The selector portion of a logical address identifies a descriptor by specifying a descriptor table and indexing a descriptor within that table. Selectors may be visible to applications programs as a field within a pointer variable, but the values of selectors are usually assigned (fixed up) by linkers or linking loaders. Figure 5-6 shows the format of a selector.

Index: Selects one of 8192 descriptors in a descriptor table. The processor simply multiplies this index value by 8 (the length of a descriptor), and adds the result to the base address of the descriptor table in order to access the appropriate segment descriptor in the table.

Table Indicator: Specifies to which descriptor table the selector refers. A zero indicates the GDT; a one indicates the current LDT.

Requested Privilege Level: Used by the protection mechanism. (Refer to Chapter 6)

Because the first entry of the GDT is not used by the processor, a selector that has an index of zero and a table indicator of zero (i.e., a selector that points to the first entry of the GDT), can be used as a null selector. The processor does not cause an exception when a segment register (other than CS or SS) is loaded with a null selector. It will, however, cause an exception when the segment register is used to access memory. This feature is useful for initializing unused segment registers so as to trap accidental references.

5.1.4 Segment Registers

The 80386 stores information from descriptors in segment registers, thereby avoiding the need to consult a descriptor table every time it accesses memory.

Every segment register has a "visible" portion and an "invisible" portion, as Figure 5-7 illustrates. The visible portions of these segment address registers are manipulated by programs as if they were simply 16-bit registers. The invisible portions are manipulated by the processor.

The operations that load these registers are normal program instructions (previously described in Chapter 3). These instructions are of two classes:

Direct load instructions; for example, MOV, POP, LDS, LSS, LGS, LFS. These instructions explicitly reference the segment registers.
Implied load instructions; for example, far CALL and JMP. These instructions implicitly reference the CS register, and load it with a new value.

Using these instructions, a program loads the visible part of the segment register with a 16-bit selector. The processor automatically fetches the base address, limit, type, and other information from a descriptor table and loads them into the invisible part of the segment register.

Because most instructions refer to data in segments whose selectors have already been loaded into segment registers, the processor can add the segment-relative offset supplied by the instruction to the segment base address with no additional overhead.

5.2 Page Translation

In the second phase of address transformation, the 80386 transforms a linear address into a physical address. This phase of address transformation implements the basic features needed for page-oriented virtual-memory systems and page-level protection.

The page-translation step is optional. Page translation is in effect only when the PG bit of CR0 is set. This bit is typically set by the operating system during software initialization. The PG bit must be set if the operating system is to implement multiple virtual 8086 tasks, page-oriented protection, or page-oriented virtual memory.

5.2.1 Page Frame

A page frame is a 4K-byte unit of contiguous addresses of physical memory. Pages begin onbyte boundaries and are fixed in size.

5.2.2 Linear Address

A linear address refers indirectly to a physical address by specifying a page table, a page within that table, and an offset within that page. Figure 5-8 shows the format of a linear address.

Figure 5-9 shows how the processor converts the DIR, PAGE, and OFFSET fields of a linear address into the physical address by consulting two levels of page tables. The addressing mechanism uses the DIR field as an index into a page directory, uses the PAGE field as an index into the page table determined by the page directory, and uses the OFFSET field to address a byte within the page determined by the page table.

5.2.3 Page Tables

A page table is simply an array of 32-bit page specifiers. A page table is itself a page, and therefore contains 4 Kilobytes of memory or at most 1K 32-bit entries.

Two levels of tables are used to address a page of memory. At the higher level is a page directory. The page directory addresses up to 1K page tables of the second level. A page table of the second level addresses up to 1K pages. All the tables addressed by one page directory, therefore, can address 1M pages (2^(20)). Because each page contains 4K bytes 2^(12) bytes), the tables of one page directory can span the entire physical address space of the 80386 (2^(20) times 2^(12) = 2^(32)).

The physical address of the current page directory is stored in the CPU register CR3, also called the page directory base register (PDBR). Memory management software has the option of using one page directory for all tasks, one page directory for each task, or some combination of the two. Refer to Chapter 10 for information on initialization of CR3 . Refer to Chapter 7 to see how CR3 can change for each task .

5.2.4 Page-Table Entries

Entries in either level of page tables have the same format. Figure 5-10 illustrates this format.

5.2.4.1 Page Frame Address

The page frame address specifies the physical starting address of a page. Because pages are located on 4K boundaries, the low-order 12 bits are always zero. In a page directory, the page frame address is the address of a page table. In a second-level page table, the page frame address is the address of the page frame that contains the desired memory operand.

5.2.4.2 Present Bit

The Present bit indicates whether a page table entry can be used in address translation. P=1 indicates that the entry can be used.

When P=0 in either level of page tables, the entry is not valid for address translation, and the rest of the entry is available for software use; none of the other bits in the entry is tested by the hardware. Figure 5-11 illustrates the format of a page-table entry when P=0.

If P=0 in either level of page tables when an attempt is made to use a page-table entry for address translation, the processor signals a page exception. In software systems that support paged virtual memory, the page-not-present exception handler can bring the required page into physical memory. The instruction that caused the exception can then be reexecuted. Refer to Chapter 9 for more information on exception handlers .

Note that there is no present bit for the page directory itself. The page directory may be not-present while the associated task is suspended, but the operating system must ensure that the page directory indicated by the CR3 image in the TSS is present in physical memory before the task is dispatched . Refer to Chapter 7 for an explanation of the TSS and task dispatching.

5.2.4.3 Accessed and Dirty Bits

These bits provide data about page usage in both levels of the page tables. With the exception of the dirty bit in a page directory entry, these bits are set by the hardware; however, the processor does not clear any of these bits.

The processor sets the corresponding accessed bits in both levels of page tables to one before a read or write operation to a page.

The processor sets the dirty bit in the second-level page table to one before a write to an address covered by that page table entry. The dirty bit in directory entries is undefined.

An operating system that supports paged virtual memory can use these bits to determine what pages to eliminate from physical memory when the demand for memory exceeds the physical memory available. The operating system is responsible for testing and clearing these bits.

Refer to Chapter 11 for how the 80386 coordinates updates to the accessed and dirty bits in multiprocessor systems.

5.2.4.4 Read/Write and User/Supervisor Bits

These bits are not used for address translation, but are used for page-level protection, which the processor performs at the same time as address translation . Refer to Chapter 6 where protection is discussed in detail.

5.2.5 Page Translation Cache

For greatest efficiency in address translation, the processor stores the most recently used page-table data in an on-chip cache. Only if the necessary paging information is not in the cache must both levels of page tables be referenced.

The existence of the page-translation cache is invisible to applications programmers but not to systems programmers; operating-system programmers must flush the cache whenever the page tables are changed. The page-translation cache can be flushed by either of two methods:

By reloading CR3 with a MOV instruction; for example:
```
 MOV CR3, EAX
 
```
By performing a task switch to a TSS that has a different CR3 image than the current TSS . (Refer to Chapter 7 for more information on task switching.)

5.3 Combining Segment and Page Translation

Figure 5-12 combines Figure 5-2 and Figure 5-9 to summarize both phases of the transformation from a logical address to a physical address when paging is enabled. By appropriate choice of options and parameters to both phases, memory-management software can implement several different styles of memory management.

5.3.1 "Flat" Architecture

When the 80386 is used to execute software designed for architectures that don't have segments, it may be expedient to effectively "turn off" the segmentation features of the 80386. The 80386 does not have a mode that disables segmentation, but the same effect can be achieved by initially loading the segment registers with selectors for descriptors that encompass the entire 32-bit linear address space. Once loaded, the segment registers don't need to be changed. The 32-bit offsets used by 80386 instructions are adequate to address the entire linear-address space.

5.3.2 Segments Spanning Several Pages

The architecture of the 80386 permits segments to be larger or smaller than the size of a page (4 Kilobytes). For example, suppose a segment is used to address and protect a large data structure that spans 132 Kilobytes. In a software system that supports paged virtual memory, it is not necessary for the entire structure to be in physical memory at once. The structure is divided into 33 pages, any number of which may not be present. The applications programmer does not need to be aware that the virtual memory subsystem is paging the structure in this manner.

5.3.3 Pages Spanning Several Segments

On the other hand, segments may be smaller than the size of a page. For example, consider a small data structure such as a semaphore. Because of the protection and sharing provided by segments (refer to Chapter 6 ) , it may be useful to create a separate segment for each semaphore. But, because a system may need many semaphores, it is not efficient to allocate a page for each. Therefore, it may be useful to cluster many related segments within a page.

5.3.4 Non-Aligned Page and Segment Boundaries

The architecture of the 80386 does not enforce any correspondence between the boundaries of pages and segments. It is perfectly permissible for a page to contain the end of one segment and the beginning of another. Likewise, a segment may contain the end of one page and the beginning of another.

5.3.5 Aligned Page and Segment Boundaries

Memory-management software may be simpler, however, if it enforces some correspondence between page and segment boundaries. For example, if segments are allocated only in units of one page, the logic for segment and page allocation can be combined. There is no need for logic to account for partially used pages.

5.3.6 Page-Table per Segment

An approach to space management that provides even further simplification of space-management software is to maintain a one-to-one correspondence between segment descriptors and page-directory entries, as Figure 5-13 illustrates. Each descriptor has a base address in which the low-order 22 bits are zero; in other words, the base address is mapped by the first entry of a page table. A segment may have any limit from 1 to 4 megabytes. Depending on the limit, the segment is contained in from 1 to 1K page frames. A task is thus limited to 1K segments (a sufficient number for many applications), each containing up to 4 Mbytes. The descriptor, the corresponding page-directory entry, and the corresponding page table can be allocated and deallocated simultaneously.

6.1 Why Protection?

The purpose of the protection features of the 80386 is to help detect and identify bugs. The 80386 supports sophisticated applications that may consist of hundreds or thousands of program modules. In such applications, the question is how bugs can be found and eliminated as quickly as possible and how their damage can be tightly confined. To help debug applications faster and make them more robust in production, the 80386 contains mechanisms to verify memory accesses and instruction execution for conformance to protection criteria. These mechanisms may be used or ignored, according to system design objectives.

6.2 Overview of 80386 Protection Mechanisms

Protection in the 80386 has five aspects:

Type checking
Limit checking
Restriction of addressable domain
Restriction of procedure entry points
Restriction of instruction set

The protection hardware of the 80386 is an integral part of the memory management hardware. Protection applies both to segment translation and to page translation.

Each reference to memory is checked by the hardware to verify that it satisfies the protection criteria. All these checks are made before the memory cycle is started; any violation prevents that cycle from starting and results in an exception. Since the checks are performed concurrently with address formation, there is no performance penalty.

Invalid attempts to access memory result in an exception. Refer to Chapter 9 for an explanation of the exception mechanism . The present chapter defines the protection violations that lead to exceptions.

The concept of "privilege" is central to several aspects of protection (numbers 3, 4, and 5 in the preceeding list). Applied to procedures, privilege is the degree to which the procedure can be trusted not to make a mistake that might affect other procedures or data. Applied to data, privilege is the degree of protection that a data structure should have from less trusted procedures.

The concept of privilege applies both to segment protection and to page protection.

6.3 Segment-Level Protection

All five aspects of protection apply to segment translation:

Type checking
Limit checking
Restriction of addressable domain
Restriction of procedure entry points
Restriction of instruction set

The segment is the unit of protection, and segment descriptors store protection parameters. Protection checks are performed automatically by the CPU when the selector of a segment descriptor is loaded into a segment register and with every segment access. Segment registers hold the protection parameters of the currently addressable segments.

6.3.1 Descriptors Store Protection Parameters

Figure 6-1 highlights the protection-related fields of segment descriptors.

The protection parameters are placed in the descriptor by systems software at the time a descriptor is created. In general, applications programmers do not need to be concerned about protection parameters.

When a program loads a selector into a segment register, the processor loads not only the base address of the segment but also protection information. Each segment register has bits in the invisible portion for storing base, limit, type, and privilege level; therefore, subsequent protection checks on the same segment do not consume additional clock cycles.

6.3.1.1 Type Checking

The TYPE field of a descriptor has two functions:

It distinguishes among different descriptor formats.
It specifies the intended usage of a segment.

Besides the descriptors for data and executable segments commonly used by applications programs, the 80386 has descriptors for special segments used by the operating system and for gates. Table 6-1 lists all the types defined for system segments and gates. Note that not all descriptors define segments; gate descriptors have a different purpose that is discussed later in this chapter.

The type fields of data and executable segment descriptors include bits which further define the purpose of the segment (refer to Figure 6-1 ):

The writable bit in a data-segment descriptor specifies whether instructions can write into the segment.
The readable bit in an executable-segment descriptor specifies whether instructions are allowed to read from the segment (for example, to access constants that are stored with instructions). A readable, executable segment may be read in two ways:
1. Via the CS register, by using a CS override prefix.
2. By loading a selector of the descriptor into a data-segment register (DS, ES, FS,or GS).

Type checking can be used to detect programming errors that would attempt to use segments in ways not intended by the programmer. The processor examines type information on two kinds of occasions:

When a selector of a descriptor is loaded into a segment register. Certain segment registers can contain only certain descriptor types; for example:
- The CS register can be loaded only with a selector of an executable segment.
- Selectors of executable segments that are not readable cannot be loaded into data-segment registers.
- Only selectors of writable data segments can be loaded into SS.
When an instruction refers (implicitly or explicitly) to a segment register. Certain segments can be used by instructions only in certain predefined ways; for example:
- No instruction may write into an executable segment.
- No instruction may write into a data segment if the writable bit is not set.
- No instruction may read an executable segment unless the readable bit is set.

 Table 6-1. System and Gate Descriptor Types
 
 Code      Type of Segment or Gate
 
 0       -reserved
 1       Available 286 TSS
 2       LDT
 3       Busy 286 TSS
 4       Call Gate
 5       Task Gate
 6       286 Interrupt Gate
 7       286 Trap Gate
 8       -reserved
 9       Available 386 TSS
 A       -reserved
 B       Busy 386 TSS
 C       386 Call Gate
 D       -reserved
 E       386 Interrupt Gate
 F       386 Trap Gate

6.3.1.2 Limit Checking

The limit field of a segment descriptor is used by the processor to prevent programs from addressing outside the segment. The processor's interpretation of the limit depends on the setting of the G (granularity) bit. For data segments, the processor's interpretation of the limit depends also on the E-bit (expansion-direction bit) and the B-bit (big bit) (refer to Table 6-2).

When G=0, the actual limit is the value of the 20-bit limit field as it appears in the descriptor. In this case, the limit may range from 0 to 0FFFFFH (2^(20) - 1 or 1 megabyte). When G=1, the processor appends 12 low-order one-bits to the value in the limit field. In this case the actual limit may range from 0FFFH (2^(12) - 1 or 4 kilobytes) to 0FFFFFFFFH(2^(32) - 1 or 4 gigabytes).

For all types of segments except expand-down data segments, the value of the limit is one less than the size (expressed in bytes) of the segment. The processor causes a general-protection exception in any of these cases:

Attempt to access a memory byte at an address > limit.
Attempt to access a memory word at an address >= limit.
Attempt to access a memory doubleword at an address >= (limit-2).

For expand-down data segments, the limit has the same function but is interpreted differently. In these cases the range of valid addresses is from limit + 1 to either 64K or 2^(32) - 1 (4 Gbytes) depending on the B-bit. An expand-down segment has maximum size when the limit is zero.

The expand-down feature makes it possible to expand the size of a stack by copying it to a larger segment without needing also to update intrastack pointers.

The limit field of descriptors for descriptor tables is used by the processor to prevent programs from selecting a table entry outside the descriptor table. The limit of a descriptor table identifies the last valid byte of the last descriptor in the table. Since each descriptor is eight bytes long, the limit value is N * 8 - 1 for a table that can contain up to N descriptors.

Limit checking catches programming errors such as runaway subscripts and invalid pointer calculations. Such errors are detected when they occur, so that identification of the cause is easier. Without limit checking, such errors could corrupt other modules; the existence of such errors would not be discovered until later, when the corrupted module behaves incorrectly, and when identification of the cause is difficult.

 Table 6-2. Useful Combinations of E, G, and B Bits
 
 Case:                    1         2         3         4
 
 Expansion Direction      U         U         D         D
 G-bit                    0         1         0         1
 B-bit                    X         X         0         1
 
 Lower bound is:
 0                        X         X
 LIMIT+1                                      X
 shl(LIMIT,12,1)+1                                      X
 
 Upper bound is:
 LIMIT                    X
 shl(LIMIT,12,1)                    X
 64K-1                                        X
 4G-1                                                   X
 
 Max seg size is:
 64K                      X
 64K-1                              X
 4G-4K                                        X
 4G                                                     X
 
 Min seg size is:
 0                        X         X
 4K                                           X         X
 
 shl (X, 12, 1) = shift X left by 12 bits inserting one-bits on the right

6.3.1.3 Privilege Levels

The concept of privilege is implemented by assigning a value from zero to three to key objects recognized by the processor. This value is called the privilege level. The value zero represents the greatest privilege, the value three represents the least privilege. The following processor-recognized objects contain privilege levels:

Descriptors contain a field called the descriptor privilege level (DPL).
Selectors contain a field called the requestor's privilege level (RPL). The RPL is intended to represent the privilege level of the procedure that originates a selector.
An internal processor register records the current privilege level (CPL). Normally the CPL is equal to the DPL of the segment that the processor is currently executing. CPL changes as control is transferred to segments with differing DPLs.

The processor automatically evaluates the right of a procedure to access another segment by comparing the CPL to one or more other privilege levels. The evaluation is performed at the time the selector of a descriptor is loaded into a segment register. The criteria used for evaluating access to data differs from that for evaluating transfers of control to executable segments; therefore, the two types of access are considered separately in the following sections.

Figure 6-2 shows how these levels of privilege can be interpreted as rings of protection. The center is for the segments containing the most critical software, usually the kernel of the operating system. Outer rings are for the segments of less critical software.

It is not necessary to use all four privilege levels. Existing software that was designed to use only one or two levels of privilege can simply ignore the other levels offered by the 80386. A one-level system should use privilege level zero; a two-level system should use privilege levels zero and three.

6.3.2 Restricting Access to Data

To address operands in memory, an 80386 program must load the selector of a data segment into a data-segment register (DS, ES, FS, GS, SS). The processor automatically evaluates access to a data segment by comparing privilege levels. The evaluation is performed at the time a selector for the descriptor of the target segment is loaded into the data-segment register. As Figure 6-3 shows, three different privilege levels enter into this type of privilege check:

The CPL (current privilege level).
The RPL (requestor's privilege level) of the selector used to specify the target segment.
The DPL of the descriptor of the target segment.

Instructions may load a data-segment register (and subsequently use the target segment) only if the DPL of the target segment is numerically greater than or equal to the maximum of the CPL and the selector's RPL. In other words, a procedure can only access data that is at the same or less privileged level.

The addressable domain of a task varies as CPL changes. When CPL is zero, data segments at all privilege levels are accessible; when CPL is one, only data segments at privilege levels one through three are accessible; when CPL is three, only data segments at privilege level three are accessible. This property of the 80386 can be used, for example, to prevent applications procedures from reading or changing tables of the operating system.

6.3.2.1 Accessing Data in Code Segments

Less common than the use of data segments is the use of code segments to store data. Code segments may legitimately hold constants; it is not possible to write to a segment described as a code segment. The following methods of accessing data in code segments are possible:

Load a data-segment register with a selector of a nonconforming, readable, executable segment.
Load a data-segment register with a selector of a conforming, readable, executable segment.
Use a CS override prefix to read a readable, executable segment whose selector is already loaded in the CS register.

The same rules as for access to data segments apply to case 1. Case 2 is always valid because the privilege level of a segment whose conforming bit is set is effectively the same as CPL regardless of its DPL. Case 3 always valid because the DPL of the code segment in CS is, by definition, equal to CPL.

6.3.3 Restricting Control Transfers

With the 80386, control transfers are accomplished by the instructions JMP, CALL, RET, INT, and IRET, as well as by the exception and interrupt mechanisms . Exceptions and interrupts are special cases that Chapter 9 covers. This chapter discusses only JMP, CALL, and RET instructions.

The "near" forms of JMP, CALL, and RET transfer within the current code segment, and therefore are subject only to limit checking. The processor ensures that the destination of the JMP, CALL, or RET instruction does not exceed the limit of the current executable segment. This limit is cached in the CS register; therefore, protection checks for near transfers require no extra clock cycles.

The operands of the "far" forms of JMP and CALL refer to other segments; therefore, the processor performs privilege checking. There are two ways a JMP or CALL can refer to another segment:

The operand selects the descriptor of another executable segment.
The operand selects a call gate descriptor. This gated form of transfer is discussed in a later section on call gates.

As Figure 6-4 shows, two different privilege levels enter into a privilege check for a control transfer that does not use a call gate:

The CPL (current privilege level).
The DPL of the descriptor of the target segment.

Normally the CPL is equal to the DPL of the segment that the processor is currently executing. CPL may, however, be greater than DPL if the conforming bit is set in the descriptor of the current executable segment. The processor keeps a record of the CPL cached in the CS register; this value can be different from the DPL in the descriptor of the code segment.

The processor permits a JMP or CALL directly to another segment only if one of the following privilege rules is satisfied:

DPL of the target is equal to CPL.
The conforming bit of the target code-segment descriptor is set, and the DPL of the target is less than or equal to CPL.

An executable segment whose descriptor has the conforming bit set is called a conforming segment. The conforming-segment mechanism permits sharing of procedures that may be called from various privilege levels but should execute at the privilege level of the calling procedure. Examples of such procedures include math libraries and some exception handlers. When control is transferred to a conforming segment, the CPL does not change. This is the only case when CPL may be unequal to the DPL of the current executable segment.

Most code segments are not conforming. The basic rules of privilege above mean that, for nonconforming segments, control can be transferred without a gate only to executable segments at the same level of privilege. There is a need, however, to transfer control to (numerically) smaller privilege levels; this need is met by the CALL instruction when used with call-gate descriptors, which are explained in the next section. The JMP instruction may never transfer control to a nonconforming segment whose DPL does not equal CPL.

6.3.4 Gate Descriptors Guard Procedure Entry Points

To provide protection for control transfers among executable segments at different privilege levels, the 80386 uses gate descriptors. There are four kinds of gate descriptors:

Call gates
Trap gates
Interrupt gates
Task gates

This chapter is concerned only with call gates. Task gates are used for task switching , and therefore are discussed in Chapter 7. Chapter 9 explains how trap gates and interrupt gates are used by exceptions and interrupts. Figure 6-5 illustrates the format of a call gate. A call gate descriptor may reside in the GDT or in an LDT, but not in the IDT. A call gate has two primary functions:

To define an entry point of a procedure.
To specify the privilege level of the entry point.

Call gate descriptors are used by call and jump instructions in the same manner as code segment descriptors. When the hardware recognizes that the destination selector refers to a gate descriptor, the operation of the instruction is expanded as determined by the contents of the call gate.

The selector and offset fields of a gate form a pointer to the entry point of a procedure. A call gate guarantees that all transitions to another segment go to a valid entry point, rather than possibly into the middle of a procedure (or worse, into the middle of an instruction). The far pointer operand of the control transfer instruction does not point to the segment and offset of the target instruction; rather, the selector part of the pointer selects a gate, and the offset is not used. Figure 6-6 illustrates this style of addressing.

As Figure 6-7 shows, four different privilege levels are used to check the validity of a control transfer via a call gate:

The CPL (current privilege level).
The RPL (requestor's privilege level) of the selector used to specify the call gate.
The DPL of the gate descriptor.
The DPL of the descriptor of the target executable segment.

The DPL field of the gate descriptor determines what privilege levels can use the gate. One code segment can have several procedures that are intended for use by different privilege levels. For example, an operating system may have some services that are intended to be used by applications, whereas others may be intended only for use by other systems software.

Gates can be used for control transfers to numerically smaller privilege levels or to the same privilege level (though they are not necessary for transfers to the same level). Only CALL instructions can use gates to transfer to smaller privilege levels. A gate may be used by a JMP instruction only to transfer to an executable segment with the same privilege level or to a conforming segment.

For a JMP instruction to a nonconforming segment, both of the following privilege rules must be satisfied; otherwise, a general protection exception results.

 MAX (CPL,RPL) <= gate DPL
 target segment DPL = CPL

For a CALL instruction (or for a JMP instruction to a conforming segment), both of the following privilege rules must be satisfied; otherwise, a general protection exception results.

 MAX (CPL,RPL) <= gate DPL
 target segment DPL <= CPL

6.3.4.1 Stack Switching

If the destination code segment of the call gate is at a different privilege level than the CPL, an interlevel transfer is being requested.

To maintain system integrity, each privilege level has a separate stack. These stacks assure sufficient stack space to process calls from less privileged levels. Without them, a trusted procedure would not work correctly if the calling procedure did not provide sufficient space on the caller's stack.

The processor locates these stacks via the task state segment (see Figure 6-8). Each task has a separate TSS, thereby permitting tasks to have separate stacks. Systems software is responsible for creating TSSs and placing correct stack pointers in them. The initial stack pointers in the TSS are strictly read-only values. The processor never changes them during the course of execution.

When a call gate is used to change privilege levels, a new stack is selected by loading a pointer value from the Task State Segment (TSS). The processor uses the DPL of the target code segment (the new CPL) to index the initial stack pointer for PL 0, PL 1, or PL 2.

The DPL of the new stack data segment must equal the new CPL; if it does not, a stack exception occurs. It is the responsibility of systems software to create stacks and stack-segment descriptors for all privilege levels that are used. Each stack must contain enough space to hold the old SS:ESP, the return address, and all parameters and local variables that may be required to process a call.

As with intralevel calls, parameters for the subroutine are placed on the stack. To make privilege transitions transparent to the called procedure, the processor copies the parameters to the new stack. The count field of a call gate tells the processor how many doublewords (up to 31) to copy from the caller's stack to the new stack. If the count is zero, no parameters are copied.

The processor performs the following stack-related steps in executing an interlevel CALL.

The new stack is checked to assure that it is large enough to hold the parameters and linkages; if it is not, a stack fault occurs with an error code of 0.
The old value of the stack registers SS:ESP is pushed onto the new stack as two doublewords.
The parameters are copied.
A pointer to the instruction after the CALL instruction (the former value of CS:EIP) is pushed onto the new stack. The final value of SS:ESP points to this return pointer on the new stack.

Figure 6-9 illustrates the stack contents after a successful interlevel call.

The TSS does not have a stack pointer for a privilege level 3 stack, because privilege level 3 cannot be called by any procedure at any other privilege level.

Procedures that may be called from another privilege level and that require more than the 31 doublewords for parameters must use the saved SS:ESP link to access all parameters beyond the last doubleword copied.

A call via a call gate does not check the values of the words copied onto the new stack. The called procedure should check each parameter for validity. A later section discusses how the ARPL, VERR, VERW, LSL, and LAR instructions can be used to check pointer values.

6.3.4.2 Returning from a Procedure

The "near" forms of the RET instruction transfer control within the current code segment and therefore are subject only to limit checking. The offset of the instruction following the corresponding CALL, is popped from the stack. The processor ensures that this offset does not exceed the limit of the current executable segment.

The "far" form of the RET instruction pops the return pointer that was pushed onto the stack by a prior far CALL instruction. Under normal conditions, the return pointer is valid, because of its relation to the prior CALL or INT. Nevertheless, the processor performs privilege checking because of the possibility that the current procedure altered the pointer or failed to properly maintain the stack. The RPL of the CS selector popped off the stack by the return instruction identifies the privilege level of the calling procedure.

An intersegment return instruction can change privilege levels, but only toward procedures of lesser privilege. When the RET instruction encounters a saved CS value whose RPL is numerically greater than the CPL, an interlevel return occurs. Such a return follows these steps:

The checks shown in Table 6-3 are made, and CS:EIP and SS:ESP are loaded with their former values that were saved on the stack.
The old SS:ESP (from the top of the current stack) value is adjusted by the number of bytes indicated in the RET instruction. The resulting ESP value is not compared to the limit of the stack segment. If ESP is beyond the limit, that fact is not recognized until the next stack operation. (The SS:ESP value of the returning procedure is not preserved; normally, this value is the same as that contained in the TSS.)
The contents of the DS, ES, FS, and GS segment registers are checked. If any of these registers refer to segments whose DPL is greater than the new CPL (excluding conforming code segments), the segment register is loaded with the null selector (INDEX = 0, TI = 0). The RET instruction itself does not signal exceptions in these cases; however, any subsequent memory reference that attempts to use a segment register that contains the null selector will cause a general protection exception. This prevents less privileged code from accessing more privileged segments using selectors left in the segment registers by the more privileged procedure.

6.3.5 Some Instructions are Reserved for Operating System

Instructions that have the power to affect the protection mechanism or to influence general system performance can only be executed by trusted procedures. The 80386 has two classes of such instructions:

Privileged instructions -- those used for system control.
Sensitive instructions -- those used for I/O and I/O related activities.

 Table 6-3. Interlevel Return Checks
 
 SF = Stack Fault
 GP = General Protection Exception
 NP = Segment-Not-Present Exception
 
 Type of Check                                  Exception   Error Code
 
 ESP is within current SS segment               SF          0
 ESP + 7 is within current SS segment           SF          0
 RPL of return CS is greater than CPL           GP          Return CS
 Return CS selector is not null                 GP          Return CS
 Return CS segment is within descriptor
   table limit                                  GP          Return CS
 Return CS descriptor is a code segment         GP          Return CS
 Return CS segment is present                   NP          Return CS
 DPL of return nonconforming code
   segment = RPL of return CS, or DPL of
   return conforming code segment <= RPL
   of return CS                                 GP          Return CS
 ESP + N + 15 is within SS segment
 N   Immediate Operand of RET N Instruction     SF          Return SS
 SS selector at ESP + N + 12 is not null        GP          Return SS
 SS selector at ESP + N + 12 is within
   descriptor table limit                       GP          Return SS
 SS descriptor is writable data segment         GP          Return SS
 SS segment is present                          SF          Return SS
 Saved SS segment DPL = RPL of saved
   CS                                           GP          Return SS
 Saved SS selector RPL = Saved SS
   segment DPL                                  GP          Return SS

6.3.5.1 Privileged Instructions

The instructions that affect system data structures can only be executed when CPL is zero. If the CPU encounters one of these instructions when CPL is greater than zero, it signals a general protection exception. These instructions include:

CLTS -- Clear Task-Switched Flag
HLT -- Halt Processor
LGDT -- Load GDL Register
LIDT -- Load IDT Register
LLDT -- Load LDT Register
LMSW -- Load Machine Status Word
LTR -- Load Task Register
MOV to/from CRn -- Move to Control Register n
MOV to /from DRn -- Move to Debug Register n
MOV to/from TRn -- Move to Test Register n

6.3.5.2 Sensitive Instructions

Instructions that deal with I/O need to be restricted but also need to be executed by procedures executing at privilege levels other than zero. The mechanisms for restriction of I/O operations are covered in detail in Chapter 8, "Input/Output".

6.3.6 Instructions for Pointer Validation

Pointer validation is an important part of locating programming errors. Pointer validation is necessary for maintaining isolation between the privilege levels. Pointer validation consists of the following steps:

Check if the supplier of the pointer is entitled to access the segment.
Check if the segment type is appropriate to its intended use.
Check if the pointer violates the segment limit.

Although the 80386 processor automatically performs checks 2 and 3 during instruction execution, software must assist in performing the first check. The unprivileged instruction ARPL is provided for this purpose. Software can also explicitly perform steps 2 and 3 to check for potential violations (rather than waiting for an exception). The unprivileged instructions LAR, LSL, VERR, and VERW are provided for this purpose.

LAR (Load Access Rights) is used to verify that a pointer refers to a segment of the proper privilege level and type. LAR has one operand selector for a descriptor whose access rights are to be examined. The descriptor must be visible at the privilege level which is the maximum of the CPL and the selector's RPL. If the descriptor is visible, LAR obtains a masked form of the second doubleword of the descriptor, masks this value with 00FxFF00H, stores the result into the specified 32-bit destination register, and sets the zero flag. (The x indicates that the corresponding four bits of the stored value are undefined.) Once loaded, the access-rights bits can be tested. All valid descriptor types can be tested by the LAR instruction. If the RPL or CPL is greater than DPL, or if the selector is outside the table limit, no access-rights value is returned, and the zero flag is cleared. Conforming code segments may be accessed from any privilege level.

LSL (Load Segment Limit) allows software to test the limit of a descriptor. If the descriptor denoted by the given selector (in memory or a register) is visible at the CPL, LSL loads the specified 32-bit register with a 32-bit, byte granular, unscrambled limit that is calculated from fragmented limit fields and the G-bit of that descriptor. This can only be done for segments (data, code, task state, and local descriptor tables); gate descriptors are inaccessible. (Table 6-4 lists in detail which types are valid and which are not.) Interpreting the limit is a function of the segment type. For example, downward expandable data segments treat the limit differently than code segments do. For both LAR and LSL, the zero flag (ZF) is set if the loading was performed; otherwise, the ZF is cleared.

 Table 6-4. Valid Descriptor Types for LSL
 
 Type   Descriptor Type             Valid?
 Code
 
 0      (invalid)                   NO
 1      Available 286 TSS           YES
 2      LDT                         YES
 3      Busy 286 TSS                YES
 4      286 Call Gate               NO
 5      Task Gate                   NO
 6      286 Trap Gate               NO
 7      286 Interrupt Gate          NO
 8      (invalid)                   NO
 9      Available 386 TSS           YES
 A      (invalid)                   NO
 B      Busy 386 TSS                YES
 C      386 Call Gate               NO
 D      (invalid)                   NO
 E      386 Trap Gate               NO
 F      386 Interrupt Gate          NO

6.3.6.1 Descriptor Validation

The 80386 has two instructions, VERR and VERW, which determine whether a selector points to a segment that can be read or written at the current privilege level. Neither instruction causes a protection fault if the result is negative.

VERR (Verify for Reading) verifies a segment for reading and loads ZF with 1 if that segment is readable from the current privilege level. VERR checks that:

The selector points to a descriptor within the bounds of the GDT or LDT.
It denotes a code or data segment descriptor.
The segment is readable and of appropriate privilege level.

The privilege check for data segments and nonconforming code segments is that the DPL must be numerically greater than or equal to both the CPL and the selector's RPL. Conforming segments are not checked for privilege level.

VERW (Verify for Writing) provides the same capability as VERR for verifying writability. Like the VERR instruction, VERW loads ZF if the result of the writability check is positive. The instruction checks that the descriptor is within bounds, is a segment descriptor, is writable, and that its DPL is numerically greater or equal to both the CPL and the selector's RPL. Code segments are never writable, conforming or not.

6.3.6.2 Pointer Integrity and RPL

The Requestor's Privilege Level (RPL) feature can prevent inappropriate use of pointers that could corrupt the operation of more privileged code or data from a less privileged level.

A common example is a file system procedure, FREAD (file_id, n_bytes, buffer_ptr). This hypothetical procedure reads data from a file into a buffer, overwriting whatever is there. Normally, FREAD would be available at the user level, supplying only pointers to the file system procedures and data located and operating at a privileged level. Normally, such a procedure prevents user-level procedures from directly changing the file tables. However, in the absence of a standard protocol for checking pointer validity, a user-level procedure could supply a pointer into the file tables in place of its buffer pointer, causing the FREAD procedure to corrupt them unwittingly.

Use of RPL can avoid such problems. The RPL field allows a privilege attribute to be assigned to a selector. This privilege attribute would normally indicate the privilege level of the code which generated the selector. The 80386 processor automatically checks the RPL of any selector loaded into a segment register to determine whether the RPL allows access.

To take advantage of the processor's checking of RPL, the called procedure need only ensure that all selectors passed to it have an RPL at least as high (numerically) as the original caller's CPL. This action guarantees that selectors are not more trusted than their supplier. If one of the selectors is used to access a segment that the caller would not be able to access directly, i.e., the RPL is numerically greater than the DPL, then a protection fault will result when that selector is loaded into a segment register.

ARPL (Adjust Requestor's Privilege Level) adjusts the RPL field of a selector to become the larger of its original value and the value of the RPL field in a specified register. The latter is normally loaded from the image of the caller's CS register which is on the stack. If the adjustment changes the selector's RPL, ZF (the zero flag) is set; otherwise, ZF is cleared.

6.4 Page-Level Protection

Two kinds of protection are related to pages:

Restriction of addressable domain.
Type checking.

6.4.1 Page-Table Entries Hold Protection Parameters

Figure 6-10 highlights the fields of PDEs and PTEs that control access to pages.

6.4.1.1 Restricting Addressable Domain

The concept of privilege for pages is implemented by assigning each page to one of two levels:

Supervisor level (U/S=0) -- for the operating system and other systems software and related data.
User level (U/S=1) -- for applications procedures and data.

The current level (U or S) is related to CPL. If CPL is 0, 1, or 2, the processor is executing at supervisor level. If CPL is 3, the processor is executing at user level.

When the processor is executing at supervisor level, all pages are addressable, but, when the processor is executing at user level, only pages that belong to the user level are addressable.

6.4.1.2 Type Checking

At the level of page addressing, two types are defined:

Read-only access (R/W=0)
Read/write access (R/W=1)

When the processor is executing at supervisor level, all pages are both readable and writable. When the processor is executing at user level, only pages that belong to user level and are marked for read/write access are writable; pages that belong to supervisor level are neither readable nor writable from user level.

6.4.2 Combining Protection of Both Levels of Page Tables

For any one page, the protection attributes of its page directory entry may differ from those of its page table entry. The 80386 computes the effective protection attributes for a page by examining the protection attributes in both the directory and the page table. Table 6-5 shows the effective protection provided by the possible combinations of protection attributes.

6.4.3 Overrides to Page Protection

Certain accesses are checked as if they are privilege-level 0 references, even if CPL = 3:

LDT, GDT, TSS, IDT references.
Access to inner stack during ring-crossing CALL / INT.

6.5 Combining Page and Segment Protection

When paging is enabled, the 80386 first evaluates segment protection, then evaluates page protection. If the processor detects a protection violation at either the segment or the page level, the requested operation cannot proceed; a protection exception occurs instead.

For example, it is possible to define a large data segment which has some subunits that are read-only and other subunits that are read-write. In this case, the page directory (or page table) entries for the read-only subunits would have the U/S and R/W bits set to x0, indicating no write rights for all the pages described by that directory entry (or for individual pages). This technique might be used, for example, in a UNIX-like system to define a large data segment, part of which is read only (for shared data or ROMmed constants). This enables UNIX-like systems to define a "flat" data space as one large segment, use "flat" pointers to address within this "flat" space, yet be able to protect shared data, shared files mapped into the virtual space, and supervisor areas.

 Table 6-5. Combining Directory and Page Protection
 
 Page Directory Entry     Page Table Entry      Combined Protection
 U/S          R/W         U/S      R/W          U/S         R/W
 
 S-0          R-0         S-0      R-0           S           x
 S-0          R-0         S-0      W-1           S           x
 S-0          R-0         U-1      R-0           S           x
 S-0          R-0         U-1      W-1           S           x
 S-0          W-1         S-0      R-0           S           x
 S-0          W-1         S-0      W-1           S           x
 S-0          W-1         U-1      R-0           S           x
 S-0          W-1         U-1      W-1           S           x
 U-1          R-0         S-0      R-0           S           x
 U-1          R-0         S-0      W-1           S           x
 U-1          R-0         U-1      R-0           U           R
 U-1          R-0         U-1      W-1           U           R
 U-1          W-1         S-0      R-0           S           x
 U-1          W-1         S-0      W-1           S           x
 U-1          W-1         U-1      R-0           U           R
 U-1          W-1         U-1      W-1           U           W

Note

S -- Supervisor R -- Read only U -- User W -- Read and Write x indicates that when the combined U/S attribute is S, the R/W attribute is not checked.

7.1 Task State Segment

All the information the processor needs in order to manage a task is stored in a special type of segment, a task state segment (TSS). Figure 7-1 shows the format of a TSS for executing 80386 tasks. (Another format is used for executing 80286 tasks; refer to Chapter 13.)

The fields of a TSS belong to two classes:

A dynamic set that the processor updates with each switch from the task. This set includes the fields that store:
- The general registers (EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI).
- The segment registers (ES, CS, SS, DS, FS, GS).
- The flags register (EFLAGS).
- The instruction pointer (EIP).
- The selector of the TSS of the previously executing task (updated only when a return is expected).
A static set that the processor reads but does not change. This set includes the fields that store:
- The selector of the task's LDT.
- The register (PDBR) that contains the base address of the task's page directory (read only when paging is enabled).
- Pointers to the stacks for privilege levels 0-2.
- The T-bit (debug trap bit) which causes the processor to raise a debug exception when a task switch occurs . (Refer to Chapter 12 for more information on debugging.)
- The I/O map base (refer to Chapter 8 for more information on the use of the I/O map).

Task state segments may reside anywhere in the linear space. The only case that requires caution is when the TSS spans a page boundary and the higher-addressed page is not present. In this case, the processor raises an exception if it encounters the not-present page while reading the TSS during a task switch. Such an exception can be avoided by either of two strategies:

By allocating the TSS so that it does not cross a page boundary.
By ensuring that both pages are either both present or both not-present at the time of a task switch. If both pages are not-present, then the page-fault handler must make both pages present before restarting the instruction that caused the task switch.

7.2 TSS Descriptor

The task state segment, like all other segments, is defined by a descriptor. Figure 7-2 shows the format of a TSS descriptor.

The B-bit in the type field indicates whether the task is busy. A type code of 9 indicates a non-busy task; a type code of 11 indicates a busy task. Tasks are not reentrant. The B-bit allows the processor to detect an attempt to switch to a task that is already busy.

The BASE, LIMIT, and DPL fields and the G-bit and P-bit have functions similar to their counterparts in data-segment descriptors. The LIMIT field, however, must have a value equal to or greater than 103. An attempt to switch to a task whose TSS descriptor has a limit less that 103 causes an exception. A larger limit is permissible, and a larger limit is required if an I/O permission map is present. A larger limit may also be convenient for systems software if additional data is stored in the same segment as the TSS.

A procedure that has access to a TSS descriptor can cause a task switch. In most systems the DPL fields of TSS descriptors should be set to zero, so that only trusted software has the right to perform task switching.

Having access to a TSS-descriptor does not give a procedure the right to read or modify a TSS. Reading and modification can be accomplished only with another descriptor that redefines the TSS as a data segment. An attempt to load a TSS descriptor into any of the segment registers (CS, SS, DS, ES, FS, GS) causes an exception.

TSS descriptors may reside only in the GDT. An attempt to identify a TSS with a selector that has TI=1 (indicating the current LDT) results in an exception.

7.3 Task Register

The task register (TR) identifies the currently executing task by pointing to the TSS. Figure 7-3 shows the path by which the processor accesses the current TSS.

The task register has both a "visible" portion (i.e., can be read and changed by instructions) and an "invisible" portion (maintained by the processor to correspond to the visible portion; cannot be read by any instruction). The selector in the visible portion selects a TSS descriptor in the GDT. The processor uses the invisible portion to cache the base and limit values from the TSS descriptor. Holding the base and limit in a register makes execution of the task more efficient, because the processor does not need to repeatedly fetch these values from memory when it references the TSS of the current task.

The instructions LTR and STR are used to modify and read the visible portion of the task register. Both instructions take one operand, a 16-bit selector located in memory or in a general register.

LTR (Load task register) loads the visible portion of the task register with the selector operand, which must select a TSS descriptor in the GDT. LTR also loads the invisible portion with information from the TSS descriptor selected by the operand. LTR is a privileged instruction; it may be executed only when CPL is zero. LTR is generally used during system initialization to give an initial value to the task register; thereafter, the contents of TR are changed by task switch operations.

STR (Store task register) stores the visible portion of the task register in a general register or memory word. STR is not privileged.

7.4 Task Gate Descriptor

A task gate descriptor provides an indirect, protected reference to a TSS. Figure 7-4 illustrates the format of a task gate.

The SELECTOR field of a task gate must refer to a TSS descriptor. The value of the RPL in this selector is not used by the processor.

The DPL field of a task gate controls the right to use the descriptor to cause a task switch. A procedure may not select a task gate descriptor unless the maximum of the selector's RPL and the CPL of the procedure is numerically less than or equal to the DPL of the descriptor. This constraint prevents untrusted procedures from causing a task switch. (Note that when a task gate is used, the DPL of the target TSS descriptor is not used for privilege checking.)

A procedure that has access to a task gate has the power to cause a task switch, just as a procedure that has access to a TSS descriptor. The 80386 has task gates in addition to TSS descriptors to satisfy three needs:

The need for a task to have a single busy bit. Because the busy-bit is stored in the TSS descriptor, each task should have only one such descriptor. There may, however, be several task gates that select the single TSS descriptor.
The need to provide selective access to tasks. Task gates fulfill this need, because they can reside in LDTs and can have a DPL that is different from the TSS descriptor's DPL. A procedure that does not have sufficient privilege to use the TSS descriptor in the GDT (which usually has a DPL of 0) can still switch to another task if it has access to a task gate for that task in its LDT. With task gates, systems software can limit the right to cause task switches to specific tasks.
The need for an interrupt or exception to cause a task switch. Task gates may also reside in the IDT, making it possible for interrupts and exceptions to cause task switching. When interrupt or exception vectors to an IDT entry that contains a task gate, the 80386 switches to the indicated task. Thus, all tasks in the system can benefit from the protection afforded by isolation from interrupt tasks.

Figure 7-5 illustrates how both a task gate in an LDT and a task gate in the IDT can identify the same task.

7.5 Task Switching

The 80386 switches execution to another task in any of four cases:

The current task executes a JMP or CALL that refers to a TSS descriptor.
The current task executes a JMP or CALL that refers to a task gate.
An interrupt or exception vectors to a task gate in the IDT.
The current task executes an IRET when the NT flag is set.

JMP, CALL, IRET, interrupts, and exceptions are all ordinary mechanisms of the 80386 that can be used in circumstances that do not require a task switch. Either the type of descriptor referenced or the NT (nested task) bit in the flag word distinguishes between the standard mechanism and the variant that causes a task switch.

To cause a task switch, a JMP or CALL instruction can refer either to a TSS descriptor or to a task gate. The effect is the same in either case: the 80386 switches to the indicated task.

An exception or interrupt causes a task switch when it vectors to a task gate in the IDT. If it vectors to an interrupt or trap gate in the IDT, a task switch does not occur . Refer to Chapter 9 for more information on the interrupt mechanism.

Whether invoked as a task or as a procedure of the interrupted task, an interrupt handler always returns control to the interrupted procedure in the interrupted task. If the NT flag is set, however, the handler is an interrupt task, and the IRET switches back to the interrupted task.

A task switching operation involves these steps:

Checking that the current task is allowed to switch to the designated task. Data-access privilege rules apply in the case of JMP or CALL instructions. The DPL of the TSS descriptor or task gate must be numerically greater (e.g., lower privilege level) than or equal to the maximum of CPL and the RPL of the gate selector. Exceptions, interrupts, and IRET are permitted to switch tasks regardless of the DPL of the target task gate or TSS descriptor.
Checking that the TSS descriptor of the new task is marked present and has a valid limit. Any errors up to this point occur in the context of the outgoing task. Errors are restartable and can be handled in a way that is transparent to applications procedures.
Saving the state of the current task. The processor finds the base address of the current TSS cached in the task register. It copies the registers into the current TSS (EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI, ES, CS, SS, DS, FS, GS, and the flag register). The EIP field of the TSS points to the instruction after the one that caused the task switch.
Loading the task register with the selector of the incoming task's TSS descriptor, marking the incoming task's TSS descriptor as busy, and setting the TS (task switched) bit of the MSW. The selector is either the operand of a control transfer instruction or is taken from a task gate.
Loading the incoming task's state from its TSS and resuming execution. The registers loaded are the LDT register; the flag register; the general registers EIP, EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI; the segment registers ES, CS, SS, DS, FS, and GS; and PDBR. Any errors detected in this step occur in the context of the incoming task. To an exception handler, it appears that the first instruction of the new task has not yet executed.

Note that the state of the outgoing task is always saved when a task switch occurs. If execution of that task is resumed, it starts after the instruction that caused the task switch. The registers are restored to the values they held when the task stopped executing.

Every task switch sets the TS (task switched) bit in the MSW (machine status word). The TS flag is useful to systems software when a coprocessor (such as a numerics coprocessor) is present. The TS bit signals that the context of the coprocessor may not correspond to the current 80386 task. Chapter 11 discusses the TS bit and coprocessors in more detail .

Exception handlers that field task-switch exceptions in the incoming task (exceptions due to tests 4 thru 16 of Table 7-1) should be cautious about taking any action that might load the selector that caused the exception. Such an action will probably cause another exception, unless the exception handler first examines the selector and fixes any potential problem.

The privilege level at which execution resumes in the incoming task is neither restricted nor affected by the privilege level at which the outgoing task was executing. Because the tasks are isolated by their separate address spaces and TSSs and because privilege rules can be used to prevent improper access to a TSS, no privilege rules are needed to constrain the relation between the CPLs of the tasks. The new task begins executing at the privilege level indicated by the RPL of the CS selector value that is loaded from the TSS.

 Table 7-1. Checks Made during a Task Switch
 
 NP = Segment-not-present exception 
 GP = General protection fault 
 TS = Invalid TSS 
 SF = Stack fault
 
 Validity tests of a selector check that the selector is in the proper
 table (e.g., the LDT selector refers to the GDT), lies within the bounds of
 the table, and refers to the proper type of descriptor (e.g., the LDT
 selector refers to an LDT descriptor).
 
 Test     Test Description                   Exception    Error Code Selects
 
   1      Incoming TSS descriptor is 
          present                            NP           Incoming TSS
   2      Incoming TSS descriptor is 
          marked not-busy                    GP           Incoming TSS
          marked not-busy
   3      Limit of incoming TSS is
          greater than or equal to 103       TS           Incoming TSS
 
              -- All register and selector values are loaded --
 
   4      LDT selector of incoming 
          task is valid                      TS           Incoming TSS
   5      LDT of incoming task is  
          present                            TS           Incoming TSS
   6      CS selector is valid               TS           Code segment
   7      Code segment is present            NP           Code segment
   8      Code segment DPL matches  
          CS RPL                             TS           Code segment
   9      Stack segment is valid             GP           Stack segment 
  10      Stack segment is present           SF           Stack segment
  11      Stack segment DPL = CPL            SF           Stack segment
  12      Stack-selector RPL = CPL           GP           Stack segment
  13      DS, ES, FS, GS selectors are
          valid                              GP           Segment
  14      DS, ES, FS, GS segments 
          are readable                       GP           Segment
  15      DS, ES, FS, GS segments 
          are present                        NP           Segment
  16      DS, ES, FS, GS segment DPL  
          >= CPL (unless these are
          conforming segments)               GP           Segment

7.6 Task Linking

The back-link field of the TSS and the NT (nested task) bit of the flag word together allow the 80386 to automatically return to a task that CALLed another task or was interrupted by another task. When a CALL instruction, an interrupt instruction, an external interrupt, or an exception causes a switch to a new task, the 80386 automatically fills the back-link of the new TSS with the selector of the outgoing task's TSS and, at the same time, sets the NT bit in the new task's flag register. The NT flag indicates whether the back-link field is valid. The new task releases control by executing an IRET instruction. When interpreting an IRET, the 80386 examines the NT flag. If NT is set, the 80386 switches back to the task selected by the back-link field. Table 7-2 summarizes the uses of these fields.

 Table 7-2. Effect of Task Switch on BUSY, NT, and Back-Link
 
 Affected Field      Effect of JMP      Effect of            Effect of
                     Instruction        CALL Instruction     IRET Instruction
 
 Busy bit of         Set, must be       Set, must be 0       Unchanged,
 incoming task       0 before           before               must be set
 
 Busy bit of         Cleared            Unchanged            Cleared
 outgoing task                          (already set)
 
 NT bit of           Cleared            Set                  Unchanged
 incoming task
 
 NT bit of           Unchanged          Unchanged            Cleared
 outgoing task
 
 Back-link of        Unchanged          Set to outgoing      Unchanged
 incoming task                          TSS selector
 
 Back-link of        Unchanged          Unchanged            Unchanged
 outgoing task

7.6.1 Busy Bit Prevents Loops

The B-bit (busy bit) of the TSS descriptor ensures the integrity of the back-link. A chain of back-links may grow to any length as interrupt tasks interrupt other interrupt tasks or as called tasks call other tasks. The busy bit ensures that the CPU can detect any attempt to create a loop. A loop would indicate an attempt to reenter a task that is already busy; however, the TSS is not a reentrable resource.

The processor uses the busy bit as follows:

When switching to a task, the processor automatically sets the busy bit of the new task.
When switching from a task, the processor automatically clears the busy bit of the old task if that task is not to be placed on the back-link chain (i.e., the instruction causing the task switch is JMP or IRET). If the task is placed on the back-link chain, its busy bit remains set.
When switching to a task, the processor signals an exception if the busy bit of the new task is already set.

By these actions, the processor prevents a task from switching to itself or to any task that is on a back-link chain, thereby preventing invalid reentry into a task.

The busy bit is effective even in multiprocessor configurations, because the processor automatically asserts a bus lock when it sets or clears the busy bit. This action ensures that two processors do not invoke the same task at the same time . (Refer to Chapter 11 for more on multiprocessing.)

7.6.2 Modifying Task Linkages

Any modification of the linkage order of tasks should be accomplished only by software that can be trusted to correctly update the back-link and the busy-bit. Such changes may be needed to resume an interrupted task before the task that interrupted it. Trusted software that removes a task from the back-link chain must follow one of the following policies:

First change the back-link field in the TSS of the interrupting task, then clear the busy-bit in the TSS descriptor of the task removed from the list.
Ensure that no interrupts occur between updating the back-link chain and the busy bit.

7.7 Task Address Space

The LDT selector and PDBR fields of the TSS give software systems designers flexibility in utilization of segment and page mapping features of the 80386. By appropriate choice of the segment and page mappings for each task, tasks may share address spaces, may have address spaces that are largely distinct from one another, or may have any degree of sharing between these two extremes.

The ability for tasks to have distinct address spaces is an important aspect of 80386 protection. A module in one task cannot interfere with a module in another task if the modules do not have access to the same address spaces. The flexible memory management features of the 80386 allow systems designers to assign areas of shared address space to those modules of different tasks that are designed to cooperate with each other.

7.7.1 Task Linear-to-Physical Space Mapping

The choices for arranging the linear-to-physical mappings of tasks fall into two general classes:

One linear-to-physical mapping shared among all tasks.
When paging is not enabled, this is the only possibility. Without page tables, all linear addresses map to the same physical addresses.
When paging is enabled, this style of linear-to-physical mapping results from using one page directory for all tasks. The linear space utilized may exceed the physical space available if the operating system also implements page-level virtual memory.
Several partially overlapping linear-to-physical mappings.
This style is implemented by using a different page directory for each task. Because the PDBR (page directory base register) is loaded from the TSS with each task switch, each task may have a different page directory.

In theory, the linear address spaces of different tasks may map to completely distinct physical addresses. If the entries of different page directories point to different page tables and the page tables point to different pages of physical memory, then the tasks do not share any physical addresses.

In practice, some portion of the linear address spaces of all tasks must map to the same physical addresses. The task state segments must lie in a common space so that the mapping of TSS addresses does not change while the processor is reading and updating the TSSs during a task switch. The linear space mapped by the GDT should also be mapped to a common physical space; otherwise, the purpose of the GDT is defeated. Figure 7-6 shows how the linear spaces of two tasks can overlap in the physical space by sharing page tables.

7.7.2 Task Logical Address Space

By itself, a common linear-to-physical space mapping does not enable sharing of data among tasks. To share data, tasks must also have a common logical-to-linear space mapping; i.e., they must also have access to descriptors that point into a shared linear address space. There are three ways to create common logical-to-physical address-space mappings:

Via the GDT. All tasks have access to the descriptors in the GDT. If those descriptors point into a linear-address space that is mapped to a common physical-address space for all tasks, then the tasks can share data and instructions.
By sharing LDTs. Two or more tasks can use the same LDT if the LDT selectors in their TSSs select the same LDT segment. Those LDT-resident descriptors that point into a linear space that is mapped to a common physical space permit the tasks to share physical memory. This method of sharing is more selective than sharing by the GDT; the sharing can be limited to specific tasks. Other tasks in the system may have different LDTs that do not give them access to the shared areas.
By descriptor aliases in LDTs. It is possible for certain descriptors of different LDTs to point to the same linear address space. If that linear address space is mapped to the same physical space by the page mapping of the tasks involved, these descriptors permit the tasks to share the common space. Such descriptors are commonly called "aliases". This method of sharing is even more selective than the prior two; other descriptors in the LDTs may point to distinct linear addresses or to linear addresses that are not shared.

8.1 I/O Addressing

The 80386 allows input/output to be performed in either of two ways:

By means of a separate I/O address space (using specific I/O instructions)
By means of memory-mapped I/O (using general-purpose operand manipulation instructions).

8.1.1 I/O Address Space

The 80386 provides a separate I/O address space, distinct from physical memory, that can be used to address the input/output ports that are used for external 16 devices. The I/O address space consists of 2^(16) (64K) individually addressable 8-bit ports; any two consecutive 8-bit ports can be treated as a 16-bit port; and four consecutive 8-bit ports can be treated as a 32-bit port. Thus, the I/O address space can accommodate up to 64K 8-bit ports, up to 32K 16-bit ports, or up to 16K 32-bit ports.

The program can specify the address of the port in two ways. Using an immediate byte constant, the program can specify:

256 8-bit ports numbered 0 through 255.
128 16-bit ports numbered 0, 2, 4, . . . , 252, 254.
64 32-bit ports numbered 0, 4, 8, . . . , 248, 252.

Using a value in DX, the program can specify:

8-bit ports numbered 0 through 65535
16-bit ports numbered 0, 2, 4, . . . , 65532, 65534
32-bit ports numbered 0, 4, 8, . . . , 65528, 65532

The 80386 can transfer 32, 16, or 8 bits at a time to a device located in the I/O space. Like doublewords in memory, 32-bit ports should be aligned at addresses evenly divisible by four so that the 32 bits can be transferred in a single bus access. Like words in memory, 16-bit ports should be aligned at even-numbered addresses so that the 16 bits can be transferred in a single bus access. An 8-bit port may be located at either an even or odd address.

The instructions IN and OUT move data between a register and a port in the I/O address space. The instructions INS and OUTS move strings of data between the memory address space and ports in the I/O address space.

8.1.2 Memory-Mapped I/O

I/O devices also may be placed in the 80386 memory address space. As long as the devices respond like memory components, they are indistinguishable to the processor.

Memory-mapped I/O provides additional programming flexibility. Any instruction that references memory may be used to access an I/O port located in the memory space. For example, the MOV instruction can transfer data between any register and a port; and the AND, OR, and TEST instructions may be used to manipulate bits in the internal registers of a device (see Figure 8-1 ). Memory-mapped I/O performed via the full instruction set maintains the full complement of addressing modes for selecting the desired I/O device (e.g., direct address, indirect address, base register, index register, scaling).

Memory-mapped I/O, like any other memory reference, is subject to access protection and control when executing in protected mode. Refer to Chapter 6 for a discussion of memory protection.

8.2 I/O Instructions

The I/O instructions of the 80386 provide access to the processor's I/O ports for the transfer of data to and from peripheral devices. These instructions have as one operand the address of a port in the I/O address space. There are two classes of I/O instruction:

Those that transfer a single item (byte, word, or doubleword) located in a register.
Those that transfer strings of items (strings of bytes, words, or doublewords) located in memory. These are known as "string I/O instructions" or "block I/O instructions".

8.2.1 Register I/O Instructions

The I/O instructions IN and OUTare provided to move data between I/O ports and the EAX (32-bit I/O), the AX (16-bit I/O), or AL (8-bit I/O) general registers. IN and OUTinstructions address I/O ports either directly, with the address of one of up to 256 port addresses coded in the instruction, or indirectly via the DX register to one of up to 64K port addresses.

IN (Input from Port) transfers a byte, word, or doubleword from an input port to AL, AX, or EAX. If a program specifies AL with the IN instruction, the processor transfers 8 bits from the selected port to AL. If a program specifies AX with the IN instruction, the processor transfers 16 bits from the port to AX. If a program specifies EAX with the IN instruction, the processor transfers 32 bits from the port to EAX.

OUT(Output to Port) transfers a byte, word, or doubleword to an output port from AL, AX, or EAX. The program can specify the number of the port using the same methods as the IN instruction.

8.2.2 Block I/O Instructions

The block (or string) I/O instructions INS and OUTS move blocks of data between I/O ports and memory space. Block I/O instructions use the DX register to specify the address of a port in the I/O address space. INS and OUTSuse DX to specify:

8-bit ports numbered 0 through 65535
16-bit ports numbered 0, 2, 4, . . . , 65532, 65534
32-bit ports numbered 0, 4, 8, . . . , 65528, 65532

Block I/O instructions use either SI or DI to designate the source or destination memory address. For each transfer, SI or DI are automatically either incremented or decremented as specified by the direction bit in the flags register.

INS and OUTS, when used with repeat prefixes, cause block input or output operations. REP, the repeat prefix, modifies INS and OUTS to provide a means of transferring blocks of data between an I/O port and memory. These block I/O instructions are string primitives (refer also to Chapter 3 for more on string primitives). They simplify programming and increase the speed of data transfer by eliminating the need to use a separate LOOP instruction or an intermediate register to hold the data.

The string I/O primitives can operate on byte strings, word strings, or doubleword strings. After each transfer, the memory address in ESI or EDI is updated by 1 for byte operands, by 2 for word operands, or by 4 for doubleword operands. The value in the direction flag (DF) determines whether the processor automatically increments ESI or EDI (DF=0) or whether it automatically decrements these registers (DF=1).

INS(Input String from Port) transfers a byte or a word string element from an input port to memory. The mnemonics INSB, INSW, and INSD are variants that explicitly specify the size of the operand. If a program specifies INSB, the processor transfers 8 bits from the selected port to the memory location indicated by ES:EDI. If a program specifies INSW, the processor transfers 16 bits from the port to the memory location indicated by ES:EDI. If a program specifies INSD, the processor transfers 32 bits from the port to the memory location indicated by ES:EDI. The destination segment register choice (ES) cannot be changed for the INS instruction. Combined with the REP prefix, INS moves a block of information from an input port to a series of consecutive memory locations.

OUTS (Output String to Port) transfers a byte, word, or doubleword string element to an output port from memory. The mnemonics OUTSB, OUTSW, and OUTSD are variants that explicitly specify the size of the operand. If a program specifies OUTSB, the processor transfers 8 bits from the memory location indicated by ES:EDI to the the selected port. If a program specifies OUTSW, the processor transfers 16 bits from the memory location indicated by ES:EDI to the the selected port. If a program specifies OUTSD, the processor transfers 32 bits from the memory location indicated by ES:EDI to the the selected port. Combined with the REP prefix, OUTS moves a block of information from a series of consecutive memory locations indicated by DS:ESI to an output port.

8.3 Protection and I/O

Two mechanisms provide protection for I/O functions:

The IOPL field in the EFLAGS register defines the right to use I/O-related instructions.
The I/O permission bit map of a 80386 TSS segment defines the right to use ports in the I/O address space.

These mechanisms operate only in protected mode, including virtual 8086 mode; they do not operate in real mode. In real mode, there is no protection of the I/O space; any procedure can execute I/O instructions, and any I/O port can be addressed by the I/O instructions.

8.3.1 I/O Privilege Level

Instructions that deal with I/O need to be restricted but also need to be executed by procedures executing at privilege levels other than zero. For this reason, the processor uses two bits of the flags register to store the I/O privilege level (IOPL). The IOPL defines the privilege level needed to execute I/O-related instructions.

The following instructions can be executed only if CPL <= IOPL:

IN -- Input
INS -- Input String
OUT -- Output
OUTS -- Output String
CLI -- Clear Interrupt-Enable Flag
STI -- Set Interrupt-Enable

These instructions are called "sensitive" instructions, because they are sensitive to IOPL.

To use sensitive instructions, a procedure must execute at a privilege level at least as privileged as that specified by the IOPL (CPL <= IOPL). Any attempt by a less privileged procedure to use a sensitive instruction results in a general protection exception.

Because each task has its own unique copy of the flags register, each task can have a different IOPL. A task whose primary function is to perform I/O (a device driver) can benefit from having an IOPL of three, thereby permitting all procedures of the task to perform I/O. Other tasks typically have IOPL set to zero or one, reserving the right to perform I/O instructions for the most privileged procedures.

A task can change IOPL only with the POPF instruction; however, such changes are privileged. No procedure may alter IOPL (the I/O privilege level in the flag register) unless the procedure is executing at privilege level 0. An attempt by a less privileged procedure to alter IOPL does not result in an exception; IOPL simply remains unaltered.

The POPF instruction may be used in addition to CLI and STI to alter the interrupt-enable flag (IF); however, changes to IF by POPF are IOPL-sensitive. A procedure may alter IF with a POPF instruction only when executing at a level that is at least as privileged as IOPL. An attempt by a less privileged procedure to alter IF in this manner does not result in an exception; IF simply remains unaltered.

8.3.2 I/O Permission Bit Map

The I/O instructions that directly refer to addresses in the processor's I/O space are IN, INS, OUT, OUTS. The 80386 has the ability to selectively trap references to specific I/O addresses. The structure that enables selective trapping is the I/O Permission Bit Map in the TSS segment (see Figure 8-2). The I/O permission map is a bit vector. The size of the map and its location in the TSS segment are variable. The processor locates the I/O permission map by means of the I/O map base field in the fixed portion of the TSS. The I/O map base field is 16 bits wide and contains the offset of the beginning of the I/O permission map. The upper limit of the I/O permission map is the same as the limit of the TSS segment.

In protected mode, when it encounters an I/O instruction (IN, INS, OUT, or OUTS), the processor first checks whether CPL <= IOPL. If this condition is true, the I/O operation may proceed. If not true, the processor checks the I/O permission map. (In virtual 8086 mode, the processor consults the map without regard for IOPL . Refer to Chapter 15.)

Each bit in the map corresponds to an I/O port byte address; for example, the bit for port 41 is found at I/O map base + 5, bit offset 1. The processor tests all the bits that correspond to the I/O addresses spanned by an I/O operation; for example, a doubleword operation tests four bits corresponding to four adjacent byte addresses. If any tested bit is set, the processor signals a general protection exception. If all the tested bits are zero, the I/O operation may proceed.

It is not necessary for the I/O permission map to represent all the I/O addresses. I/O addresses not spanned by the map are treated as if they had one bits in the map. For example, if TSS limit is equal to I/O map base + 31, the first 256 I/O ports are mapped; I/O operations on any port greater than 255 cause an exception.

If I/O map base is greater than or equal to TSS limit, the TSS segment has no I/O permission map, and all I/O instructions in the 80386 program cause exceptions when CPL > IOPL.

Because the I/O permission map is in the TSS segment, different tasks can have different maps. Thus, the operating system can allocate ports to a task by changing the I/O permission map in the task's TSS.

9.1 Identifying Interrupts

The processor associates an identifying number with each different type of interrupt or exception.

The NMI and the exceptions recognized by the processor are assigned predetermined identifiers in the range 0 through 31. Not all of these numbers are currently used by the 80386; unassigned identifiers in this range are reserved by Intel for possible future expansion.

The identifiers of the maskable interrupts are determined by external interrupt controllers (such as Intel's 8259A Programmable Interrupt Controller) and communicated to the processor during the processor's interrupt-acknowledge sequence. The numbers assigned by an 8259A PIC can be specified by software. Any numbers in the range 32 through 255 can be used. Table 9-1 shows the assignment of interrupt and exception identifiers.

Exceptions are classified as faults, traps, or aborts depending on the way they are reported and whether restart of the instruction that caused the exception is supported.

Faults: Faults are exceptions that are reported "before" the instruction causingthe exception. Faults are either detected before the instruction begins to execute, or during execution of the instruction. If detected during the instruction, the fault is reported with the machine restored to a state that permits the instruction to be restarted.
Traps: A trap is an exception that is reported at the instruction boundary immediately after the instruction in which the exception was detected.
Aborts

Intel 80386 Reference Programmer's Manual Table of Contents

Part I Applications Programming

Part II Systems Programming

Chapter 4 -- Systems Architecture

Chapter 6 -- Protection

Chapter 7 -- Multitasking

Chapter 8 -- Input/Output

Chapter 9 -- Exceptions and Interrupts

Chapter 10 -- Initialization

Chapter 14 -- 80386 Real-Address Mode

Chapter 16 -- Mixing 16-Bit and 32 Bit Code

Chapter 1 -- Introduction to the 80386