The 80386

The 80386 was the first 32 bit microprocessor in Intel's x86 family, introduced in 1985. A 32 bit microprocessor is a device that can process 32 bits of data at once. The advantages of using a 32 bit processor are now obvious. On data wider than 16 bits, it can do in one operation what a 16 bit processor running at the same clock frequency needs two or more operations for. Also, even an average computer user deals with data which needs to be represented with 32 bits. Single precision floating point numbers, which are commonly used in applications, require 32 bits for storage.

We can discuss a brief history of Intel microprocessors and their applications:

Processor	4004	8008	8080	8085	8086	8088	80286	80386	80486	Pentiums and higher
ALU size	4	8	8	8	16	16	16	32	32	32, 64
Initial Clock				5MHz			8MHz	16MHz-33MHz		1.4-3.2 GHz
Memory Capability				64K			16M	4G
Used in	Control Systems							PCs and Mainframes

As we are studying the 8085, a further comparison apart from the above differences shall develop our understanding of the 80386:

Processor	8085	80386
Address Bus	16 bits (64K), Multiplexed	32¹ bits (4G, can be raised to 64T by using virtual memory), Dedicated
Number of pins	40	132¹

¹The original 80386 model to be introduced was the 80386DX (Double word eXternal), which came with a full 32 bit address bus (4G) and 32 bit external data bus in a 132 pin PGA package. Then a low cost variant, the 80386SX (Single word eXternal), was introduced with a 24 bit address bus (16M) and a 16 bit external data bus, though internally it was still 32 bit. This can be compared to the 8088 variant of the 8086. Another variant for embedded applications, the 80386EX, was also introduced, which integrated more of the motherboard components on the chip. For laptop computers, the 80386SL was launched with power saving features.

[Figure: the 80386DX and the 80386SX]

Revolutionary features of the 80386

Using its Memory Management Unit (MMU) and Virtual Memory (VM) it can address up to 64TB of virtual memory (16,384 segments of up to 4GB each).
Pipelining, Caching and Interleaving to speed up memory access.
More efficient and protected multitasking.
Facilities for adding an 80387 math coprocessor.

The 80386 instruction set is remarkably complete, and even the Pentiums added relatively few new instructions. The 80386 compatible architecture is called the i386 architecture and the instruction set is known as IA-32. As a result, many 32 bit x86 applications written for later processors can still run on a 386.

Modes of operation of the 80386

The 80386 can be operated in two main modes: the Real mode and the Protected mode. They differ in the method of addressing memory and in the amount of memory that can be accessed. Understanding them is also the key to understanding the MMU, Virtual Memory, and the protection and multitasking features of the 80386.

Real Mode

In the real mode, the 80386 behaves like an 8086. For compatibility reasons all the x86 family CPUs start in the real mode. You may wonder why the 80386 would want to behave like an older processor like the 8086. To understand this, we must study a little about the 8086 and its unique way of addressing memory.

The 8086 architecture

The 8086 had 20 address lines capable of addressing 1MB of memory. In an 8085, with 16 bit addresses and 64K of memory, we were able to specify the physical address of a memory location directly. But the registers of the 8086 were only 16 bits wide, so a 20 bit address was formed from two registers:

Segment register: It selected a memory segment of up to 64K. Its value was the segment's starting address divided by 16, so a segment could begin on any 16 byte boundary.
Offset register: It contained a 16 bit offset into that segment to access a specific byte or word.

The physical address was calculated as: Physical address = Segment reg. x 16 + Offset reg.
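
The formula above can be tried out with a minimal sketch. The function name real_mode_address is only illustrative, and the wrap-around at 1MB assumes an 8086 with exactly 20 address lines.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Return the 20 bit physical address for a 16 bit segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16 bit values")
    # Shifting left by 4 bits is the same as multiplying by 16.
    # Masking to 20 bits models the 8086 wrapping around at 1MB.
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    print(hex(real_mode_address(0x1234, 0x0010)))  # prints 0x12350
    print(hex(real_mode_address(0x1000, 0x2350)))  # prints 0x12350
```

Both calls give the same physical address, which shows that many different segment:offset pairs can refer to one memory location.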

So we see how different this scheme is from the way we use addresses in the 8085.

The 8086 had none of the protection mechanisms of the 80286/386, but the later processors derived their addressing scheme heavily from it.

Protected Mode

In the protected mode, the role of the segment register is changed. Instead of directly giving the physical starting address of a segment, it holds an index to an entry in a table called the descriptor table, created by the Operating System. The OS creates and stores a descriptor table at a particular memory location, listing all the segments being used at different locations in memory. A typical entry in the descriptor table includes:

Physical address of the segment
Size of the segment or limits
Protection data

The advantage of this scheme is that the physical address is completely separated from the virtual address being seen by the program. So we can accomplish these very important things:

Move the segments around in memory, or even swap them out to the disk, giving us the concept of an almost infinite (64TB) Virtual Memory.
Do multitasking efficiently and keep memory segments of different processes isolated and secure.
Implement various levels of protection.
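
The descriptor lookup described above can be sketched roughly as follows. This is a simplified model and not the real descriptor format: actual 80386 descriptors are 8 byte records with more fields, and the names Descriptor, ProtectionFault and translate are hypothetical.

```python
from dataclasses import dataclass


class ProtectionFault(Exception):
    """Raised when an access violates the segment's limit or privilege."""


@dataclass
class Descriptor:
    base: int   # starting address of the segment
    limit: int  # highest valid offset inside the segment
    dpl: int    # privilege level needed to use it (0 = most privileged)


def translate(table, index, offset, cpl):
    """Map (descriptor index, offset) to an address, enforcing protection."""
    if not 0 <= index < len(table):
        raise ProtectionFault("no such descriptor")
    d = table[index]
    if offset > d.limit:
        raise ProtectionFault("offset beyond segment limit")
    if cpl > d.dpl:  # a larger number means less privilege
        raise ProtectionFault("insufficient privilege")
    return d.base + offset


if __name__ == "__main__":
    table = [
        Descriptor(base=0x00100000, limit=0xFFFF, dpl=0),  # OS segment
        Descriptor(base=0x00400000, limit=0x0FFF, dpl=3),  # user segment
    ]
    print(hex(translate(table, 1, 0x10, cpl=3)))  # prints 0x400010
```

Because the program only ever supplies an index and an offset, the OS is free to change the base field (for example after moving the segment) without the program noticing.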

Paging Unit

The paging unit is what makes the up to 64TB of Virtual Memory mentioned above practical. It works because at any particular time a program usually uses only a small portion of its memory. The simulated Virtual Memory is divided into 4KB chunks called pages. All these pages are indexed in page tables, which the OS sets up in memory and the paging unit consults during address translation. The pages which are in use are stored in the physical memory and the pages not in use are stored on a fast disk. The page tables map these pages accordingly and also mark empty pages.

The task of selecting which pages to keep in physical memory and which on the disk is performed by the OS by using certain algorithms. If a referenced page is on the disk, it is loaded into the physical memory first, and this may require replacement of an existing page in the memory. Specific algorithms are available for this task too.
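
As a rough illustration of the page lookup, the sketch below splits a 32 bit address into a 10 bit page directory index, a 10 bit page table index and a 12 bit offset within the 4KB page, which is how the 80386 divides it. Representing the tables as nested Python dictionaries, and the names split_linear_address, translate_linear and PageFault, are assumptions made only for this example.

```python
PAGE_SIZE = 4096  # 4KB pages


class PageFault(Exception):
    """Raised when the page is not in physical memory."""


def split_linear_address(addr):
    """Split a 32 bit address into (directory index, table index, offset)."""
    directory_index = (addr >> 22) & 0x3FF  # top 10 bits
    table_index = (addr >> 12) & 0x3FF      # middle 10 bits
    offset = addr & 0xFFF                   # low 12 bits
    return directory_index, table_index, offset


def translate_linear(page_directory, addr):
    """Look up the physical address, or raise PageFault for the OS to handle."""
    d, t, offset = split_linear_address(addr)
    page_table = page_directory.get(d)
    if page_table is None or t not in page_table:
        raise PageFault(hex(addr))
    frame = page_table[t]  # physical page frame number
    return frame * PAGE_SIZE + offset


if __name__ == "__main__":
    page_directory = {1: {3: 0x7A}}  # one mapped page, in frame 0x7A
    print(split_linear_address(0x00403025))                   # prints (1, 3, 37)
    print(hex(translate_linear(page_directory, 0x00403025)))  # prints 0x7a025
```

On a PageFault the OS would load the page from disk, possibly replacing another page, update the table and retry the access.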

The 80386 also provides several bits in the page directory and page table entries to help the OS maintain them:

D Dirty bit (if a page has been modified, it must be committed to disk on replacement, and otherwise it may be overwritten)

A Accessed bit (a recently accessed page is less liable to be replaced than an un-accessed page)

R/W and U/S Read/Write and User/Supervisor bit (access control and security)

P Present (take care of empty pages and invalid page requests)
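
The bits listed above can be picked out of a 32 bit page table entry with simple masks. The bit positions used here (P = bit 0, R/W = bit 1, U/S = bit 2, A = bit 5, D = bit 6, frame number in the top 20 bits) follow the 80386 entry layout. The function names are hypothetical.

```python
PTE_PRESENT = 1 << 0   # P
PTE_WRITABLE = 1 << 1  # R/W
PTE_USER = 1 << 2      # U/S
PTE_ACCESSED = 1 << 5  # A
PTE_DIRTY = 1 << 6     # D


def describe_entry(entry):
    """Decode the flag bits and the frame number of a page table entry."""
    return {
        "present": bool(entry & PTE_PRESENT),
        "writable": bool(entry & PTE_WRITABLE),
        "user": bool(entry & PTE_USER),
        "accessed": bool(entry & PTE_ACCESSED),
        "dirty": bool(entry & PTE_DIRTY),
        "frame": entry >> 12,
    }


def must_write_back(entry):
    """A present page needs saving to disk before replacement only if dirty."""
    return bool(entry & PTE_PRESENT) and bool(entry & PTE_DIRTY)


if __name__ == "__main__":
    print(describe_entry(0x0007A067))
    print(must_write_back(0x0007A067))  # prints True
    print(must_write_back(0x0007A025))  # prints False
```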

The disadvantage of paging is that every memory reference needs extra accesses to the comparatively slow main memory: the page directory and the page table must be looked up before the actual data can be read. To overcome this, the most recently used page table entries are kept in a separate high speed memory called the Translation Lookaside Buffer (TLB). It significantly reduces the number of memory reads required for looking up page translations.
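
The idea behind the TLB can be sketched as a small cache in front of the page table walk. This is only a model: the default capacity of 32 entries matches the 80386 TLB, but the least-recently-used replacement used here is an assumption chosen for simplicity, and the class and parameter names are hypothetical.

```python
from collections import OrderedDict


class TLB:
    """A tiny translation cache: page number -> frame number."""

    def __init__(self, capacity=32):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, page, walk_page_table):
        """Return the frame for a page, walking the tables only on a miss."""
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)  # mark as most recently used
            return self.entries[page]
        self.misses += 1
        frame = walk_page_table(page)       # the slow path through memory
        self.entries[page] = frame
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return frame


if __name__ == "__main__":
    tlb = TLB()
    for page in [5, 5, 5, 6, 5, 6]:
        tlb.lookup(page, lambda p: p + 100)  # stand-in for a real table walk
    print(tlb.hits, tlb.misses)  # prints 4 2
```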

Protection in the 80386

Privilege Levels (often called rings) are a protection mechanism introduced with the 80286 which prevents a user program from modifying or crashing the operating system or its data. There are 4 privilege levels, from PL0 to PL3.

Privilege level 0 means that the program can execute all of the CPU instructions. The operating system runs at this level.

A program at privilege level 3 has access to the fewest instructions. User level programs are given this level so they cannot access unauthorized data, devices and ports. Protection data is also associated with the descriptor table and page table entries.

The V86 mode

The V86 (Virtual 8086) mode allows the OS to run Real mode applications while in the Protected mode. Several Real mode machines can be simulated simultaneously, running different programs at once. But these programs are restricted to run only at Privilege Level 3.

Pipelines, caches and interleaving – reducing wait states

The 80386 was introduced with a clock speed of 16 MHz. At this speed, only memory devices with access times of less than 50 ns could be operated at full speed. At that time, there were only a few DRAM components with such low access times. In the 8085, wait states were the only option in dealing with slow memories. The 80386 supports three techniques to reduce the need for wait states: pipelining, caching and interleaved memory.

A pipeline is a special way of handling memory access so that the memory has one extra clock period to access data. A pipe is set up by the microprocessor. When an instruction is fetched from the memory, the microprocessor has extra time before the next instruction is fetched. During this extra time (one clock period), the address of the next instruction is sent out on the address bus. This increases the access time allowed to the memory from 50ns to 81ns on a 16MHz 386. Not all memory references can take advantage of the pipe. These non-pipelined accesses require one wait state. Systems running at higher speeds like 20, 25 or 33MHz cannot get by on the pipe alone, and another technique must be used to speed up memory access.

A cache is a high speed memory system that is placed between the DRAM and the microprocessor. Cache memory devices have access times of 25ns or less. The 80486 can have an internal Level 1 cache and an external Level 2 cache. The 80386 has no internal cache and supports only an external one. A 256KB cache is considered to give the optimum performance boost. If the requested data is found in the cache, it is called a cache hit; otherwise it is called a miss.

An interleaved memory system increases memory speeds much as RAID (Redundant Array of Inexpensive Disks) is used to increase disk speeds. Its only disadvantage is that it costs considerably more memory because of its structure. An interleaved memory system requires two or more complete sets of address buses and memory banks, and a controller that provides addresses for each bus. For example, the addresses 00000000H, 00000002H, 00000004H… are kept in one bank and the addresses 00000001H, 00000003H, 00000005H… are kept in the other bank. When the processor accesses 00000000H, the interleaving logic automatically generates the address for 00000001H on the second address bus. The memory gets more time because its address arrives before the microprocessor actually accesses it. This is possible because the microprocessor pipelines memory accesses. Interleaving works when the memory banks are accessed alternately. Only a small fraction (less than 7%) of the time does the microprocessor access the same bank twice in a row, so we can expect a performance boost in the remaining 93% of cases. The access time allowed to the memory is increased from 69ns to 112ns on a 16MHz 386.
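
The two bank arrangement in the example above can be modelled in a few lines. This is a sketch under the text's own assumption that consecutive addresses alternate between two banks. The function names are hypothetical, and real interleaving logic works on bus cycles rather than on a list of addresses.

```python
def bank_and_index(address, banks=2):
    """Return (bank number, location inside that bank) for an address."""
    return address % banks, address // banks


def count_overlapped(addresses, banks=2):
    """Count accesses that go to a different bank than the previous one.

    Only these accesses can be started early by the interleaving logic.
    """
    overlapped = 0
    for previous, current in zip(addresses, addresses[1:]):
        if previous % banks != current % banks:
            overlapped += 1
    return overlapped


if __name__ == "__main__":
    print(bank_and_index(0x00000004))         # prints (0, 2)
    print(bank_and_index(0x00000005))         # prints (1, 2)
    print(count_overlapped([0, 1, 2, 3, 4]))  # prints 4
    print(count_overlapped([0, 2, 4, 5]))     # prints 1
```

Sequential accesses alternate banks every time, while a run of accesses to even addresses only keeps hitting the same bank and gains nothing.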

Parting notes

Our discussion of the 80386 and its advanced features is nearly complete. Some other things that may still be discussed are listed here in arbitrary order:

The Address Data Strobe (ADS) pin: This is a new pin not present in the traditional set of pins of Intel microprocessors. It becomes active whenever the 80386 has issued a valid memory or IO address. This signal is combined with the M/IO and W/R signals to generate the separate read and write signals present in the earlier processors.
Address pins and the Byte Enable signals: Having a 32 bit non-multiplexed address bus, one would expect 32 address pins. But only 30 are present, as A31-A2. A1 and A0 are encoded into the byte enable signals BE3-BE0, which select any or all of the four bytes in a 32 bit wide memory location.
It has eight general purpose registers (32 bits each, e.g. EAX). They can still be accessed as the lower 16 bit registers (e.g. AX). The segment registers are also 16 bit. There are also two new segment registers: FS and GS. It also has Control registers and Debug & Test registers.
The 80386 instruction set is divided into 9 categories and includes high level language support and OS support.
It has 11 addressing modes, and instructions can have 0 to 3 operands.
The 80386 was not the first 32 bit processor from Intel. The iAPX 432 had been released back in 1981.
The 33MHz version of the 80386 was launched in 1989.
Number of transistors used in the 80386 was 275,000.
It can address enough memory to manage an eight-page history of every person on earth.
It can scan the Encyclopedia Britannica in 12.5 seconds.
To take full advantage of the 32 bit processor, you need a 32 bit OS like Windows 95 or higher.
The 80387 is the Intel math coprocessor for the 80386. It provides support for faster arithmetical, trigonometric, exponential, and logarithmic floating-point calculations.
I owned an 80386SX 33MHz (40MHz in Turbo mode) computer with 4MB RAM, 250MB HDD, 5¼’’ & 3½’’ FDD from 1994-2000.
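
As an illustration of the note on address pins in the list above, the sketch below shows how a byte address and an access size could be turned into the value on A31-A2 plus a four bit byte enable pattern. It handles only accesses that fit inside one 32 bit location, it uses 1 to mean "byte selected" (the actual pins are active low), and the function name byte_enables is hypothetical.

```python
def byte_enables(address, size):
    """Return (value on A31-A2, 4 bit mask for BE3-BE0) for one bus cycle.

    Bit n of the mask is 1 when byte n of the 32 bit location is selected.
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    start = address & 0x3  # what A1 and A0 would have been
    if start + size > 4:
        raise ValueError("access crosses a 32 bit boundary, needs two cycles")
    mask = ((1 << size) - 1) << start
    return address >> 2, mask


if __name__ == "__main__":
    for addr, size in [(0x1000, 4), (0x1001, 1), (0x1002, 2)]:
        upper, mask = byte_enables(addr, size)
        print(hex(upper), format(mask, "04b"))
    # prints:
    # 0x400 1111
    # 0x400 0010
    # 0x400 1100
```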

Rohit Rawat

V Semester

ECE

Roll No: 55