Microprocessors
A microprocessor is a computation engine that is fabricated on a single chip. The first microprocessor was the Intel 4004, introduced in 1971. The 4004 was not very powerful: all it could do was add and subtract, and it could only do that 4 bits at a time. But it was amazing that everything was on one chip. Prior to the 4004, engineers built computers either from collections of chips or from discrete components. The 4004 powered one of the first portable electronic calculators.
The first microprocessor to make it into a home computer was the Intel 8080, a complete 8-bit computer on one chip, introduced in 1974. The first microprocessor to make a real splash in the market was the Intel 8088, introduced in 1979 and incorporated into the IBM PC. The PC market moved from the 8088 to the 80286, the 80386, the 80486, the Pentium, the Pentium II, the Pentium III and the Pentium 4. All of these microprocessors are made by Intel, and all of them are improvements on the basic design of the 8088. The Pentium 4 can execute any piece of code that ran on the original 8088, but it does it about 5,000 times faster!
The following table shows the differences between the different processors that Intel has introduced over the years.
Table 1.2

| Name | Date | Transistors | Microns[1] | Clock speed | Data width[2] | MIPS[3] |
|---|---|---|---|---|---|---|
| 8080 | 1974 | 6,000 | 6 | 2 MHz | 8 bits | 0.64 |
| 8088 | 1979 | 29,000 | 3 | 5 MHz | 16 bits, 8-bit bus | 0.33 |
| 80286 | 1982 | 134,000 | 1.5 | 6 MHz | 16 bits | 1 |
| 80386 | 1985 | 275,000 | 1.5 | 16 MHz | 32 bits | 5 |
| 80486 | 1989 | 1,200,000 | 1 | 25 MHz | 32 bits | 20 |
| Pentium | 1993 | 3,100,000 | 0.8 | 60 MHz | 32 bits, 64-bit bus | 100 |
| Pentium II | 1997 | 7,500,000 | 0.35 | 233 MHz | 32 bits, 64-bit bus | ~300 |
| Pentium III | 1999 | 9,500,000 | 0.25 | 450 MHz | 32 bits, 64-bit bus | ~510 |
| Pentium 4 | 2000 | 42,000,000 | 0.18 | 1.5 GHz | 32 bits, 64-bit bus | ~1,700 |
From this table you can see that, in general, there is a relationship between clock speed and MIPS. The maximum clock speed is a function of the manufacturing process and delays within the chip. There is also a relationship between the number of transistors and MIPS. For example, the 8088 clocked at 5 MHz but only executed at 0.33 MIPS (about one instruction per 15 clock cycles). Modern processors can often execute at a rate of two instructions per clock cycle. That improvement is directly related to the number of transistors on the chip.
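The 15-cycle figure for the 8088 comes from dividing the clock rate by the instruction rate. The sketch below repeats that arithmetic for a few rows of the table; the function name `cycles_per_instruction` is ours, and the numbers are copied from Table 1.2.

```python
# Clock speed (MHz) and MIPS ratings copied from Table 1.2.
PROCESSORS = {
    "8080": (2.0, 0.64),
    "8088": (5.0, 0.33),
    "80286": (6.0, 1.0),
    "80386": (16.0, 5.0),
    "80486": (25.0, 20.0),
    "Pentium": (60.0, 100.0),
}


def cycles_per_instruction(clock_mhz: float, mips: float) -> float:
    """Average clock cycles spent per instruction.

    MHz is millions of cycles per second and MIPS is millions of
    instructions per second, so their ratio is cycles per instruction.
    """
    return clock_mhz / mips


for name, (clock_mhz, mips) in PROCESSORS.items():
    cpi = cycles_per_instruction(clock_mhz, mips)
    print(f"{name:>8}: {cpi:5.2f} cycles per instruction")
```

A result below 1.0, as for the Pentium row, means that on average more than one instruction completes per clock cycle.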
Inside a Microprocessor
A microprocessor executes a collection of machine instructions that tell the processor what to do. Based on the instructions, a microprocessor does three basic things:
1. Using its ALU (Arithmetic/Logic Unit), a microprocessor can perform mathematical operations like addition, subtraction, multiplication and division. Modern microprocessors contain complete floating-point processors that can perform extremely sophisticated operations on large floating-point numbers.
2. A microprocessor can move data from one memory location to another.
3. A microprocessor can make decisions and jump to a new set of instructions based on those decisions.
A microprocessor may do very sophisticated things, but these are its three basic activities. The following diagram shows an extremely simple microprocessor capable of doing those three things:
This microprocessor has an address bus that sends an address to memory, a data bus that can send data to memory or receive data from memory, RD (read) and WR (write) lines that tell the memory whether the processor wants to set or get the addressed location, a clock line that lets a clock pulse sequence the processor, and a reset[4] line that resets the program counter to zero (or whatever) and restarts execution. Let’s assume that both the address and data buses are 8 bits wide here.
Here are the components of this simple microprocessor (Figure 1.1):
Figure 1.1
1. Registers A, B and C are simply latches made out of flip-flops.
2. The address latch is just like registers A, B and C.
3. The program counter is a latch with the extra ability to increment by 1 when told to do so, and also to reset to zero when told to do so.
4. The ALU could be as simple as an 8-bit adder, or it might be able to add, subtract, multiply and divide 8-bit values. Let’s assume the latter here.
5. The test register is a special latch that can hold values from comparisons performed in the ALU. An ALU can normally compare two numbers and determine if they are equal, if one is greater than the other, etc. The test register can also normally hold a carry bit from the last stage of the adder. It stores these values in flip-flops and then the instruction decoder can use the values to make decisions.
6. There are six boxes marked “3-State” in the diagram. These are tri-state buffers[5]. A tri-state buffer can pass a 1, pass a 0, or essentially disconnect its output. Tri-state buffers allow multiple outputs to connect to one wire while only one of them actually drives a 1 or a 0 onto the line.
7. The instruction register and instruction decoder are responsible for controlling all of the other components.
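The tri-state behaviour in item 6 can be illustrated with a small model. This is only a sketch: the class and function names are ours, and `None` is used to stand in for the disconnected output state.

```python
from typing import Optional


class TriStateBuffer:
    """Passes its input through when enabled; otherwise disconnects its output."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.value = 0        # value waiting at the buffer's input
        self.enabled = False  # enable line, driven by the instruction decoder

    def output(self) -> Optional[int]:
        # None stands in for the disconnected (high-impedance) state.
        return self.value if self.enabled else None


def read_bus(buffers: list[TriStateBuffer]) -> Optional[int]:
    """Return the value on a shared wire, or None if nothing is driving it."""
    driving = [b for b in buffers if b.enabled]
    if len(driving) > 1:
        names = ", ".join(b.name for b in driving)
        raise RuntimeError(f"bus contention: {names} are driving at once")
    return driving[0].output() if driving else None


reg_a, reg_b = TriStateBuffer("A"), TriStateBuffer("B")
reg_a.value, reg_b.value = 0x2A, 0x07
reg_a.enabled = True
print(read_bus([reg_a, reg_b]))  # 42
```

Both buffers are wired to the same bus, but only the enabled one determines what the bus carries. Enabling both at once is treated as an error here, which mirrors the rule that only one output may drive the line.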
Although they are not shown in this diagram, there would be control lines from the instruction decoder that would:
1. Tell the A register to latch the value currently on the data bus
2. Tell the B register to latch the value currently on the data bus
3. Tell the C register to latch the value currently on the data bus
4. Tell the program counter register to latch the value currently on the data bus
5. Tell the address register to latch the value currently on the data bus
6. Tell the instruction register to latch the value currently on the data bus
7. Tell the program counter to increment
8. Tell the program counter to reset to zero
9. Activate any of the six tri-state buffers (six separate lines)
10. Tell the ALU what operation to perform
11. Tell the test register to latch the ALU’s test bits
12. Activate the RD line
13. Activate the WR line
Coming into the instruction decoder are the bits from the test register and the clock line, as well as the bits from the instruction register.
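One way to picture the decoder's outputs is as a control word in which each single-bit line is one flag. The sketch below is an illustration only: the member names are ours, and the six tri-state enables and the multi-bit ALU operation select are left out to keep it short.

```python
from enum import Flag, auto


class Control(Flag):
    """One member per single-bit control line (names are illustrative)."""
    LATCH_A = auto()
    LATCH_B = auto()
    LATCH_C = auto()
    LATCH_PC = auto()
    LATCH_ADDR = auto()
    LATCH_IR = auto()
    PC_INCREMENT = auto()
    PC_RESET = auto()
    LATCH_TEST = auto()
    RD = auto()
    WR = auto()


# The decoder asserts several lines in the same cycle by OR-ing them together.
fetch = Control.RD | Control.LATCH_IR

print(Control.RD in fetch)  # True
print(Control.WR in fetch)  # False
```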
RAM and ROM
The address and data buses, as well as the RD and WR lines, connect either to RAM or ROM, and generally to both. In our sample microprocessor, we have an address bus 8 bits wide and a data bus 8 bits wide. That means that the microprocessor can address 2^8 = 256 bytes of memory, and it can read or write 8 bits of the memory at a time. Let’s assume that this simple microprocessor has 128 bytes of ROM starting at address 0 and 128 bytes of RAM starting at address 128.
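The memory map just described can be written down as a small address decoder. This is a sketch of the arithmetic only; the function name `decode` and its return format are our own choices.

```python
ADDRESS_BITS = 8
MEMORY_SIZE = 2 ** ADDRESS_BITS   # 256 addressable bytes
ROM_SIZE = 128                    # ROM occupies addresses 0..127
RAM_BASE = 128                    # RAM occupies addresses 128..255


def decode(address: int) -> tuple[str, int]:
    """Map a bus address to (chip, offset within that chip)."""
    if not 0 <= address < MEMORY_SIZE:
        raise ValueError(f"address {address} does not fit in {ADDRESS_BITS} bits")
    if address < ROM_SIZE:
        return ("ROM", address)
    return ("RAM", address - RAM_BASE)


print(decode(0))     # ('ROM', 0)
print(decode(127))   # ('ROM', 127)
print(decode(128))   # ('RAM', 0)
print(decode(255))   # ('RAM', 127)
```

With this particular layout the top address bit alone selects the chip: it is 0 for every ROM address and 1 for every RAM address.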
ROM stands for read-only memory. A ROM chip is programmed with a permanent collection of pre-set bytes. The address bus tells the ROM chip which byte to get and place on the data bus. When the RD line changes state, the ROM chip presents the selected byte onto the data bus.
RAM stands for random-access memory. RAM contains bytes of information, and the microprocessor can read or write to those bytes depending on whether the RD or WR line is signaled. One problem with today’s RAM chips is that they forget everything once the power goes off. That is why the computer needs ROM.
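The difference between the two kinds of memory can be sketched as follows, assuming the 128-byte ROM and 128-byte RAM layout used in this section. The `Memory` class and its method names are ours, not part of any real library.

```python
ROM_SIZE = 128   # addresses 0..127
RAM_SIZE = 128   # addresses 128..255


class Memory:
    """ROM and RAM sharing one 256-byte address space."""

    def __init__(self, rom_image: bytes) -> None:
        if len(rom_image) > ROM_SIZE:
            raise ValueError("ROM image must fit in 128 bytes")
        # Pad the image so every ROM address holds a defined byte.
        self._rom = bytes(rom_image).ljust(ROM_SIZE, b"\x00")
        self._ram = bytearray(RAM_SIZE)

    @staticmethod
    def _check(address: int) -> None:
        if not 0 <= address < ROM_SIZE + RAM_SIZE:
            raise ValueError(f"address {address} is outside the 8-bit range")

    def read(self, address: int) -> int:
        """RD signaled: the addressed chip places its byte on the data bus."""
        self._check(address)
        if address < ROM_SIZE:
            return self._rom[address]
        return self._ram[address - ROM_SIZE]

    def write(self, address: int, value: int) -> None:
        """WR signaled: only RAM accepts the byte."""
        self._check(address)
        if address < ROM_SIZE:
            raise PermissionError("ROM is read-only")
        self._ram[address - ROM_SIZE] = value & 0xFF

    def power_cycle(self) -> None:
        """RAM forgets its contents when power is lost; ROM keeps them."""
        self._ram = bytearray(RAM_SIZE)


mem = Memory(rom_image=bytes([6, 0, 0]))
mem.write(128, 99)
print(mem.read(0), mem.read(128))   # 6 99
mem.power_cycle()
print(mem.read(0), mem.read(128))   # 6 0
```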
By the way, nearly all computers contain some amount of ROM. It is possible to create a simple computer that contains no RAM (many microcontrollers do this by placing a handful of RAM bytes on the processor chip itself), but it is generally impossible to create one that contains no ROM. On a PC, the ROM is called the BIOS (Basic Input/Output System). When the microprocessor starts, it begins executing instructions it finds in the BIOS. The BIOS instructions do things like test the hardware in the machine, and then they go to the hard disk to fetch the boot sector. This boot sector is another small program, and the BIOS stores it in RAM after reading it off the disk. The microprocessor then begins executing the boot sector’s instructions from RAM. The boot sector program will tell the microprocessor to fetch something else from the hard disk into RAM, which the microprocessor then executes, and so on. This is how the microprocessor loads and executes the entire operating system.
Microprocessor Instructions
Even the incredibly simple microprocessor shown here will have a fairly large set of instructions that it can perform. The collection of instructions is implemented as bit patterns, each one of which has a different meaning when loaded into the instruction register. Humans are not particularly good at remembering bit patterns, so a set of short words is defined to represent the different bit patterns. This collection of words is called the assembly language of the processor. An assembler can translate the words into their bit patterns very easily, and then the output of the assembler is placed in memory for the microprocessor to execute. If you program in C, a C compiler will translate the C code into assembly language.
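A toy assembler shows how short words become bit patterns. The text does not list this processor's actual instruction set, so the mnemonics and opcode numbers below are hypothetical and chosen only for illustration.

```python
# Hypothetical mnemonics and opcode numbers, chosen only for illustration.
OPCODES = {
    "LOADA": 1,   # load register A from a memory address
    "LOADB": 2,   # load register B from a memory address
    "ADD": 3,     # C = A + B
    "SAVEC": 4,   # store register C to a memory address
    "STOP": 5,    # halt execution
}

# Mnemonics that are followed by a one-byte operand.
TAKES_OPERAND = {"LOADA", "LOADB", "SAVEC"}


def assemble(source: str) -> bytes:
    """Translate one instruction per line into opcode and operand bytes."""
    program = bytearray()
    for line_number, line in enumerate(source.splitlines(), start=1):
        text = line.split(";", 1)[0].strip()   # drop comments and whitespace
        if not text:
            continue
        mnemonic, *operands = text.split()
        if mnemonic not in OPCODES:
            raise ValueError(f"line {line_number}: unknown mnemonic {mnemonic!r}")
        expected = 1 if mnemonic in TAKES_OPERAND else 0
        if len(operands) != expected:
            raise ValueError(f"line {line_number}: {mnemonic} takes {expected} operand(s)")
        program.append(OPCODES[mnemonic])
        for operand in operands:
            value = int(operand)
            if not 0 <= value <= 255:
                raise ValueError(f"line {line_number}: operand {value} does not fit in 8 bits")
            program.append(value)
    return bytes(program)


source = """
LOADA 128   ; A = memory[128]
LOADB 129   ; B = memory[129]
ADD         ; C = A + B
SAVEC 130   ; memory[130] = C
STOP
"""
print(list(assemble(source)))   # [1, 128, 2, 129, 3, 4, 130, 5]
```

The resulting bytes are what would be placed in memory for the processor to fetch one at a time.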
So now the question is, “How do all of these instructions look in ROM?” Each of these assembly language instructions must be represented by a binary number. These numbers are known as opcodes. The instruction decoder needs to turn each of the opcodes into a set of signals that drive the different components inside the microprocessor. Let’s take the ADD instruction as an example and look at what it needs to do:
During the first clock cycle, we need to actually load the instruction. Therefore the instruction decoder needs to:
Activate the tri-state buffer for the program counter
Activate the RD line
Activate the data-in tri-state buffer
Latch the instruction into the instruction register
During the second clock cycle, the ADD instruction is decoded. It needs to do very little:
Set the operation of the ALU to addition
Latch the output of the ALU into the C register
During the third clock cycle, the program counter is incremented (in theory this could be overlapped into the second clock cycle).
Every instruction can be broken down as a set of sequenced operations like these that manipulate the components of the microprocessor in the proper order. Some instructions, like this ADD instruction, might take two or three clock cycles. Others might take five or six clock cycles.
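The three clock cycles of ADD can be traced in a few lines of code. This is a sketch under stated assumptions: the opcode value 3 is hypothetical, and the ALU is assumed to take its inputs from registers A and B, which the text does not spell out.

```python
from dataclasses import dataclass, field

ADD_OPCODE = 3  # hypothetical opcode value for ADD


@dataclass
class Cpu:
    memory: bytearray = field(default_factory=lambda: bytearray(256))
    a: int = 0
    b: int = 0
    c: int = 0
    pc: int = 0   # program counter
    ir: int = 0   # instruction register


def run_add(cpu: Cpu) -> list[str]:
    """Step through the three clock cycles described for ADD; return a trace."""
    trace = []

    # Cycle 1: fetch. The program counter drives the address bus, RD is
    # activated, and the returned byte is latched into the instruction register.
    address = cpu.pc
    cpu.ir = cpu.memory[address]
    trace.append(f"cycle 1: fetch opcode {cpu.ir} from address {address}")
    if cpu.ir != ADD_OPCODE:
        raise ValueError("this sketch only knows how to execute ADD")

    # Cycle 2: execute. The ALU is set to addition and its 8-bit result
    # is latched into register C (inputs assumed to be A and B).
    cpu.c = (cpu.a + cpu.b) & 0xFF
    trace.append(f"cycle 2: C = A + B = {cpu.c}")

    # Cycle 3: increment the program counter (it wraps at 8 bits).
    cpu.pc = (cpu.pc + 1) & 0xFF
    trace.append(f"cycle 3: PC = {cpu.pc}")
    return trace


cpu = Cpu(a=20, b=22)
cpu.memory[0] = ADD_OPCODE
for step in run_add(cpu):
    print(step)
```

A real decoder would produce control signals rather than Python statements, but the ordering of the steps is the same.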
Microprocessor Performance
The number of transistors available has a huge effect on the performance of a processor. As seen earlier, a typical instruction in a processor like an 8088 took 15 clock cycles to execute. Because of the design of the multiplier, it took approximately 80 cycles just to do one 16-bit multiplication on the 8088. With more transistors, much more powerful multipliers capable of single-cycle speeds become possible.
More transistors also allow for a technology called pipelining[6]. In a pipelined architecture, instruction execution overlaps. So even though it might take five clock cycles to execute each instruction, there can be five instructions in various stages of execution simultaneously. That way it looks like one instruction completes every clock cycle.
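The benefit of overlapping can be checked with simple counting. The sketch below assumes an ideal pipeline with no stalls, which real processors only approximate; the function names are ours.

```python
def unpipelined_cycles(instructions: int, stages: int) -> int:
    """Each instruction runs all of its stages before the next one starts."""
    return instructions * stages


def pipelined_cycles(instructions: int, stages: int) -> int:
    """Ideal pipeline: fill it once, then complete one instruction per cycle."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


for n in (1, 5, 1000):
    print(n, unpipelined_cycles(n, 5), pipelined_cycles(n, 5))
# 1 5 5
# 5 25 9
# 1000 5000 1004
```

For long instruction sequences the pipelined count approaches one cycle per instruction, even though each individual instruction still takes five cycles from start to finish.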
Many modern processors have multiple instruction decoders, each with its own pipeline. This allows for multiple instruction streams, which means that more than one instruction can complete during each clock cycle. This technique can be quite complex to implement, so it takes lots of transistors.
The trend in processor design has been toward full 32-bit ALUs with fast floating-point processors built in and pipelined execution with multiple instruction streams. There has also been a tendency toward special instructions that make certain operations particularly efficient, along with the addition of hardware virtual memory support and L1 caching on the processor chip. All of these trends push up the transistor count, leading to the multi-million transistor powerhouses available today. These processors can execute about one billion instructions per second!