The 80386

The 80386 was Intel's first 32 bit microprocessor, introduced in 1985 and reaching the PC market in 1986. A 32 bit microprocessor is a device that can process 32 bits of data at once. The advantages of using a 32 bit processor are easy to see. On data wider than 16 bits, it can do in one operation what a 16 bit processor running at the same clock frequency needs two or more operations for. Also, even an average computer user deals with data which needs to be represented with 32 bits. Single precision floating point numbers, which are used in a wide range of applications, require 32 bits for storage.

 

We can discuss a brief history of Intel microprocessors and their applications:

Processor             ALU size   Initial Clock   Memory Capability
4004                  4
8008                  8
8080                  8
8085                  8          5MHz            64KB
8086                  16
8088                  16
80286                 16         8MHz            16MB
80386                 32         16MHz-33MHz     4GB
80486                 32
Pentiums and higher   32, 64     1.4-3.2 GHz

Used in: Control Systems (the earlier processors); PCs and Mainframes (the later processors).

 

Since we are studying the 8085, a direct comparison with it, beyond the differences listed above, should help our understanding of the 80386:

Processor        8085                         80386
Address Bus      16 bits (64K), Multiplexed   32 bits [1] (4G, can be raised to 64T by using virtual memory), Dedicated
Number of pins   40                           132 [1]

[1] The original 80386 model to be introduced was the 80386DX (Double word eXternal), which came with a full 32 bit address bus (4G) and a 32 bit external data bus in a 132 pin PGA package. Then a low cost variant, the 80386SX (Single word eXternal), was introduced with a 24 bit address bus (16M) and a 16 bit external data bus, though internally it was still 32 bit. This can be compared to the 8088 variant of the 8086. Another variant for embedded applications, the 80386EX, was also introduced, which integrated more of the motherboard components on the chip. For laptop computers, the 80386SL was launched with power saving features.


[Figure: pictures of the 80386DX (left) and the 80386SX (right)]

 

Revolutionary features of the 80386

  1. Using its Memory Management Unit (MMU) and Virtual Memory (VM), it can address up to 64TB of virtual memory (16K segments of up to 4GB each).
  2. Pipelining, Caching and Interleaving to speed up memory access.
  3. More efficient and protected multitasking.
  4. Facilities for adding an 80387 math coprocessor.

 

The 80386 instruction set is remarkably complete, and even the Pentiums added relatively few new instructions. The 80386 compatible architecture is called the i386 architecture and the instruction set is known as IA-32. Much 32 bit x86 software that does not depend on later extensions can still be run on the 386.

Modes of operation of the 80386

The 80386 can be operated in two modes: the Real mode and the Protected mode. They differ mainly in the method of accessing memory and in the amount of memory that can be accessed. Understanding them is also necessary for understanding the MMU, Virtual Memory, and the protection and multitasking features of the 80386.

 

Real Mode

In the real mode, the 80386 behaves like an 8086. For compatibility reasons, all the x86 family CPUs start in the real mode. You may wonder why the 80386 would need to behave like an older processor such as the 8086. To understand this, we must study a little about the 8086 and its unique way of addressing memory.

The 8086 architecture

The 8086 had 20 address lines capable of addressing 1MB of memory. In an 8085, with 16 bit addresses and 64K of memory, we were able to specify the physical address of a memory location directly. The registers of the 8086, however, were only 16 bits wide, too narrow to hold a 20 bit address, so addressing was performed with two registers:

  1. Segment register: It selected a memory segment of up to 64KB. A segment could start at any 16 byte boundary, and the register held the starting address divided by 16.
  2. Offset register: It contained a 16 bit offset into the segment to access a specific byte or word.

The physical address was calculated as: Segment reg. x 16 + Offset reg.
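
The calculation can be tried out with a short Python sketch. The function name `real_mode_address` and the sample values are illustrative assumptions, not part of any Intel documentation; the wrap-around at 1MB reflects the 20 address lines mentioned above.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Return the 20 bit physical address for a 16 bit segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16 bit values")
    # Multiplying by 16 is the same as shifting left by 4 bits.
    linear = (segment << 4) + offset
    # Only 20 address lines exist, so the result wraps around at 1MB.
    return linear & 0xFFFFF


print(hex(real_mode_address(0x1234, 0x0010)))  # 0x12350
print(hex(real_mode_address(0x1235, 0x0000)))  # 0x12350, the same location
```

The two calls show that different segment:offset pairs can name the same physical location, something that never happens with the flat 16 bit addresses of the 8085.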

So we see how different this scheme is from the way we use addresses in the 8085.

The 8086 had no protection mechanism of the kind found in the 80286/386, but these later processors borrowed heavily from its addressing scheme.

 

Protected Mode

In the protected mode, the role of the segment register is changed. Instead of directly giving the location of a physical memory segment, it holds an index to an entry in a table called the descriptor table. The Operating System creates this table and stores it at a particular memory location. The table lists all the segments being used at different locations in memory. A typical entry in the descriptor table includes:

  1. Physical address of the segment
  2. Size of the segment or limits
  3. Protection data

The advantage of this scheme is that the physical address is completely separated from the virtual address being seen by the program. So we can accomplish these very important things:

  1. Move the segments around in the memory, or even move them out to the disk, giving us the concept of an almost unlimited (64TB) Virtual Memory.
  2. Do multitasking efficiently and keep memory segments of different processes isolated and secure.
  3. Implement various levels of protection.
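
The descriptor table lookup can be sketched in Python as follows. This is a simplified model under stated assumptions: the class `Descriptor`, the function `translate_segment`, and the single `writable` flag standing in for the protection data are all hypothetical, and a real 80386 descriptor carries more fields than the three listed above.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    base: int       # physical address where the segment starts
    limit: int      # highest valid offset inside the segment
    writable: bool  # stand-in for the protection data (assumption)


def translate_segment(table, index, offset, write=False):
    """Map (segment index, offset) to a physical address, or raise on a violation."""
    descriptor = table[index]
    if offset > descriptor.limit:
        raise MemoryError("offset is outside the segment limit")
    if write and not descriptor.writable:
        raise PermissionError("segment is read-only")
    return descriptor.base + offset


# Hypothetical table built by the OS: one code segment, one data segment.
table = [
    Descriptor(base=0x00100000, limit=0xFFFF, writable=False),
    Descriptor(base=0x00200000, limit=0x0FFF, writable=True),
]

print(hex(translate_segment(table, 1, 0x0010, write=True)))  # 0x200010

# The OS can move a segment; the program keeps using the same index and offset.
table[1].base = 0x00300000
print(hex(translate_segment(table, 1, 0x0010, write=True)))  # 0x300010
```

The program never sees `base`, which is why the OS is free to relocate segments and to keep the segments of different processes apart.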

Paging Unit

The paging unit helps provide the up to 64TB of Virtual Memory mentioned above. This works because at any particular time a program usually works with only a small portion of its memory. The simulated Virtual Memory, which can be as large as 64TB, is divided into 4KB chunks called pages. All these pages are indexed in a page table, which the Paging Unit consults on memory accesses. The pages which are in use are stored in the physical memory and the pages not in use are stored on a fast disk. The page table records where each page is and also takes care of empty pages.
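
The way an address is split into a page and a position inside that page can be sketched in Python. This is a simplification: the flat dictionary `page_table` and the names `split_address` and `translate_page` are assumptions made for illustration, whereas the real 80386 splits the page number further into a page directory index and a page table index.

```python
PAGE_SIZE = 4096  # 4KB pages


def split_address(linear):
    """Split an address into (page number, offset within the page)."""
    return linear // PAGE_SIZE, linear % PAGE_SIZE


def translate_page(page_table, linear):
    """Return the physical address, or None if the page is not in memory."""
    page, offset = split_address(linear)
    frame = page_table.get(page)  # None means the page is on disk or unused
    if frame is None:
        return None
    return frame * PAGE_SIZE + offset


# Hypothetical mapping: page 0 sits in physical frame 5, page 1 is on disk.
page_table = {0: 5, 1: None}

print(split_address(0x1ABC))                     # (1, 2748)
print(hex(translate_page(page_table, 0x0ABC)))   # 0x5abc
print(translate_page(page_table, 0x1ABC))        # None
```

A result of `None` corresponds to the case where the OS must first bring the page in from the disk.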

 

 

The task of selecting which pages to keep in physical memory and which on the disk is performed by the OS using page replacement algorithms. If a referenced page is on the disk, it is first loaded into the physical memory, and this may require replacing an existing page in the memory. Specific algorithms are available for this task too.
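
As an illustration of page replacement, the sketch below simulates first-in first-out (FIFO) replacement in Python. FIFO is chosen here only because it is short; the text does not say which algorithm any particular OS uses, and the function name and reference strings are assumptions.

```python
from collections import deque


def count_page_faults(references, frame_count):
    """Simulate demand paging with FIFO replacement and return the fault count."""
    in_memory = set()  # pages currently in physical memory
    order = deque()    # load order, oldest page on the left
    faults = 0
    for page in references:
        if page in in_memory:
            continue   # page already in memory, no disk access needed
        faults += 1    # page must be loaded from the disk
        if len(in_memory) == frame_count:
            victim = order.popleft()  # replace the page that was loaded earliest
            in_memory.remove(victim)
        in_memory.add(page)
        order.append(page)
    return faults


# Hypothetical sequence of page numbers, with three physical frames available.
print(count_page_faults([1, 2, 3, 1, 4, 1, 2], frame_count=3))  # 6
```

Smarter algorithms try to keep recently used pages in memory, which is what the Accessed bit of the 80386 is meant to support.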

The 80386 also provides several bits in the page directory and page table entries to help manage pages:

D             Dirty bit (a modified page must be written back to disk when it is replaced; an unmodified page may simply be overwritten)
A             Accessed bit (a recently accessed page is less liable to be replaced than an un-accessed page)
R/W and U/S   Read/Write and User/Supervisor bits (access control and security)
P             Present bit (takes care of empty pages and invalid page requests)
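
These bits can be read out of a 32 bit page table entry with simple masks. The sketch below uses the 80386 bit positions (P in bit 0, R/W in bit 1, U/S in bit 2, A in bit 5, D in bit 6, frame address in the upper 20 bits); the function names and the sample entry are hypothetical.

```python
# Bit positions in an 80386 page table entry.
PRESENT = 1 << 0
READ_WRITE = 1 << 1
USER = 1 << 2
ACCESSED = 1 << 5
DIRTY = 1 << 6


def describe_entry(entry):
    """Decode the status bits and frame address of a 32 bit page table entry."""
    return {
        "present": bool(entry & PRESENT),
        "writable": bool(entry & READ_WRITE),
        "user": bool(entry & USER),
        "accessed": bool(entry & ACCESSED),
        "dirty": bool(entry & DIRTY),
        "frame_address": entry & 0xFFFFF000,  # upper 20 bits
    }


def must_write_back(entry):
    """A page being replaced needs saving to disk only if it was modified."""
    return bool(entry & PRESENT) and bool(entry & DIRTY)


entry = 0x00005000 | PRESENT | READ_WRITE | ACCESSED | DIRTY  # hypothetical entry
print(describe_entry(entry))
print(must_write_back(entry))  # True
```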

 

The disadvantage of paging is that each reference requires at least two accesses to the comparatively slow main memory: one to look up the page table, and the other to read the actual data. To overcome this, the most frequently used part of the page table is kept in a separate high speed memory called the Translation Lookaside Buffer (TLB). It significantly reduces the number of memory reads required for looking up page translation addresses in the page table.
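
The saving can be estimated with a small Python simulation. It is only a sketch: the least recently used replacement, the function name `count_memory_reads` and the access sequence are assumptions, and each page table lookup is counted as a single memory read.

```python
def count_memory_reads(pages, tlb_size):
    """Count memory reads for a sequence of page accesses with a small TLB."""
    tlb = []   # page numbers with cached translations, most recent last
    reads = 0
    for page in pages:
        if page in tlb:
            tlb.remove(page)  # hit: the translation comes from the TLB
        else:
            reads += 1        # miss: one extra read to look up the page table
        tlb.append(page)
        if len(tlb) > tlb_size:
            tlb.pop(0)        # drop the least recently used translation
        reads += 1            # the read of the actual data
    return reads


accesses = [0, 0, 0, 1, 1, 0, 2, 2]  # hypothetical page numbers, in order
print(count_memory_reads(accesses, tlb_size=0))  # 16, every access needs two reads
print(count_memory_reads(accesses, tlb_size=4))  # 11, only three table lookups
```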

Protection in the 80386

Privilege Levels (often called rings) are a protection mechanism, first introduced with the 80286, which prevents a user program from modifying or crashing the operating system or its data. There are 4 privilege levels, from PL0 to PL3.

Privilege level 0 means that the program can execute all of the CPU instructions. The operating system has this privilege.

A program with privilege level 3 has access to the fewest instructions. User level programs are given this level so that they cannot access unauthorized data, devices and ports. Protection data is also associated with the descriptor table and page table entries.
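
The two kinds of check described here can be sketched in Python. This is a deliberately simplified model: the function names are hypothetical, only a handful of the privileged instructions are listed, and the real data access check also takes into account a requested privilege level stored in the segment register, which is ignored here.

```python
# A few of the instructions that the processor allows only at privilege level 0.
PRIVILEGED_INSTRUCTIONS = {"HLT", "LGDT", "LIDT", "CLTS"}


def can_execute(instruction, current_level):
    """Privileged instructions need level 0; everything else runs at any level."""
    if instruction in PRIVILEGED_INSTRUCTIONS:
        return current_level == 0
    return True


def can_access_data(current_level, segment_level):
    """Simplified rule: code may touch data of equal or lower privilege only."""
    # Lower numbers mean more privilege, so PL0 can reach PL3 data but not the reverse.
    return current_level <= segment_level


print(can_execute("HLT", 3))   # False
print(can_execute("MOV", 3))   # True
print(can_access_data(3, 0))   # False
print(can_access_data(0, 3))   # True
```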

 

The V86 mode

The V86 (virtual 8086) mode allows the OS to run Real mode applications while the processor is in the Protected mode. Several Real mode machines can be simulated simultaneously, running different programs at once. These programs, however, are restricted to run only at Privilege Level 3.

 

Pipelines, caches and interleaving – reducing wait states

The 80386 was introduced with a clock speed of 16 MHz. At this speed, only memory devices with access times of less than 50 ns could be operated at full speed. At that time, there were only a few DRAM components with such low access times. In the 8085, wait states were the only option for dealing with slow memories. The 80386 provides three techniques to reduce the need for wait states: pipelining, caching and interleaved memory.

 

A pipeline is a special way of handling memory access so that the memory has one extra clock period to access data. The pipe is set up by the microprocessor. When an instruction is fetched from the memory, the microprocessor has some spare time before the next instruction is fetched. During this spare time (one clock period), the address of the next instruction is sent out on the address bus. This increases the time allowed for memory access from 50ns to 81ns on a 16MHz 386. Not all memory references can take advantage of the pipe. These non-pipelined accesses require one wait state. Systems running at higher speeds such as 20, 25 or 33MHz cannot rely on the pipe alone, so another technique must be used there to speed up memory access.
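
The 50ns and 81ns figures can be checked with a little arithmetic. The sketch below assumes that the "one clock period" gained is one period of the CLK2 input, which runs at twice the rated processor speed; that reading is an assumption, although it does reproduce the figures quoted for the 16MHz part. The function names are hypothetical.

```python
def clk2_period_ns(rated_mhz):
    """Period of the CLK2 input, which runs at twice the rated processor speed."""
    return 1000.0 / (2 * rated_mhz)


def pipelined_access_ns(base_access_ns, rated_mhz):
    """Time allowed for memory when the address arrives one CLK2 period early."""
    return base_access_ns + clk2_period_ns(rated_mhz)


print(clk2_period_ns(16))           # 31.25
print(pipelined_access_ns(50, 16))  # 81.25

# The time gained shrinks as the clock gets faster.
for mhz in (16, 20, 25, 33):
    print(mhz, "MHz gains", round(clk2_period_ns(mhz), 2), "ns")
```

The loop shows why the pipe helps less at higher clock speeds: the extra period it buys becomes shorter while DRAM access times stay the same.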

 

A cache is a high speed memory system that is placed between the DRAM and the microprocessor. Cache memory devices have access times of 25ns or less. The 80486 can have an internal Level 1 cache and an external Level 2 cache. The 80386 has no internal cache and supports only an external cache. A 256KB cache is considered to give the optimum performance boost. If the requested data is found in the cache, it is called a cache hit; otherwise it is called a cache miss, and the data must be fetched from the slower DRAM.
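
Hits and misses can be counted with a small Python simulation. The direct-mapped organisation, the 16 byte line size and the function name `simulate_cache` are assumptions chosen for brevity; the text does not say how any particular 80386 cache controller is organised.

```python
def simulate_cache(addresses, line_count, line_size=16):
    """Count (hits, misses) for a direct-mapped cache."""
    lines = [None] * line_count  # the tag currently held by each cache line
    hits = misses = 0
    for address in addresses:
        block = address // line_size  # which block of memory is wanted
        index = block % line_count    # the only line that block may occupy
        tag = block // line_count     # identifies the block within that line
        if lines[index] == tag:
            hits += 1
        else:
            misses += 1
            lines[index] = tag        # fetch the block from DRAM into the cache
    return hits, misses


# Hypothetical program loop that reads the same 64 bytes ten times over.
loop_reads = list(range(0, 64, 4)) * 10
print(simulate_cache(loop_reads, line_count=8))  # (156, 4)
```

Only the first pass through the loop goes to the DRAM; the remaining passes are served from the cache.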

 

An interleaved memory system increases memory speed in much the same way that RAID (Redundant Array of Inexpensive Disks) is used to increase disk speed. Its only disadvantage is that it costs considerably more memory hardware because of its structure. An interleaved memory system requires two or more complete sets of address buses and memory banks, and a controller that provides addresses for each bus. For example, the addresses 00000000H, 00000002H, 00000004H… are kept in one bank and the addresses 00000001H, 00000003H, 00000005H… are kept in the other bank. When the processor accesses 00000000H, the interleaving logic automatically generates the address 00000001H on the second address bus. The access time is reduced because the address is presented to the memory before the microprocessor actually accesses it, which is possible because the microprocessor pipelines memory accesses. Interleaving works if the two memory banks are accessed alternately. Only a small fraction of the time (less than 7%) does the microprocessor access the same bank twice in a row, so we can expect a performance boost in the remaining 93% of cases. The time allowed for memory access increases from 69 ns to 112ns on a 16MHz 386.
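
The bank selection in the two bank example can be sketched in Python. The function names are hypothetical, and the sketch only shows which bank each address falls in, not the timing of the buses.

```python
def bank_of(address, bank_count=2):
    """Low-order interleaving: consecutive addresses fall in different banks."""
    return address % bank_count


def alternating_fraction(addresses, bank_count=2):
    """Fraction of accesses that go to a different bank than the previous one."""
    switches = sum(
        1
        for previous, current in zip(addresses, addresses[1:])
        if bank_of(previous, bank_count) != bank_of(current, bank_count)
    )
    return switches / (len(addresses) - 1)


sequential = list(range(0x00000000, 0x00000008))
print([bank_of(a) for a in sequential])  # [0, 1, 0, 1, 0, 1, 0, 1]
print(alternating_fraction(sequential))  # 1.0
```

A run of sequential addresses alternates between the banks every time, which is the case in which interleaving gives its full benefit.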


Parting notes

Our discussion of the 80386 and its advanced features is nearly complete. Some other things that may still be discussed are listed here in arbitrary order:

 

Rohit Rawat

V Semester

ECE

Roll No: 55