Presentation: Computer structure pipeline

Category: Technology

Contents

The presentation "Computer structure pipeline" contains 64 slides.

Slides and text of this presentation


Slide 1

Computer Structure Pipeline
Lecturer: Aharon Kupershtok

Slide 2

A Basic Processor

Slide 3

Pipelined Car Assembly

Slide 4

(No text on this slide)

Slide 5

Pipelining
Pipelining does not reduce the latency of a single task; it increases the throughput of the entire workload
  Potential speedup = number of pipe stages
Pipeline rate is limited by the slowest pipeline stage
  Partition the pipe into many pipe stages
  Make the longest pipe stage as short as possible
  Balance the work across the pipe stages
Pipelining adds overhead (e.g., latches)
  Time to "fill" the pipeline and time to "drain" it reduce the speedup
  Stalls for dependencies
  With too many pipe stages, performance starts to drop
IPC of an ideal pipelined machine is 1
  Every clock, one instruction finishes
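The fill/drain effect on speedup can be illustrated with a small calculation. This is a minimal sketch under assumptions that are not in the slides: every stage takes exactly one clock, the unpipelined machine uses the same clock and needs one clock per stage for each instruction, and latch overhead is ignored. The function names are hypothetical.

```python
def pipelined_cycles(n_instructions: int, n_stages: int, stall_cycles: int = 0) -> int:
    """Clocks needed to run n_instructions through an n_stages-deep pipeline.

    The first instruction needs n_stages clocks to fill the pipe; after that
    one instruction finishes per clock, plus any stall cycles.
    """
    if n_instructions <= 0:
        return 0
    return n_stages + (n_instructions - 1) + stall_cycles


def speedup(n_instructions: int, n_stages: int, stall_cycles: int = 0) -> float:
    """Speedup over an unpipelined machine (assumed: n_stages clocks per instruction)."""
    unpipelined = n_instructions * n_stages
    return unpipelined / pipelined_cycles(n_instructions, n_stages, stall_cycles)


if __name__ == "__main__":
    for n in (5, 100, 1000):
        print(n, pipelined_cycles(n, 5), round(speedup(n, 5), 2))
```

For short runs the fill time dominates (5 instructions on 5 stages take 9 clocks, not 5); as the run grows, the speedup approaches the number of stages without reaching it.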

Slide 6

Pipelined CPU

Slide 7

Structural Hazard
Different instructions using the same resource at the same time
Register File
  Accessed in 2 stages:
    Read during stage 2 (ID)
    Write during stage 5 (WB)
  Solution: 2 read ports, 1 write port
Memory
  Accessed in 2 stages:
    Instruction fetch during stage 1 (IF)
    Data read/write during stage 4 (MEM)
  Solution: separate instruction cache and data cache
Each functional unit can be used only once per instruction
Each functional unit must be used at the same stage for all instructions

Slide 8

Pipeline Example: cycle 1

Slide 9

Pipeline Example: cycle 2

Slide 10

Pipeline Example: cycle 3

Slide 11

Pipeline Example: cycle 4

Slide 12

Pipeline Example: cycle 5

Slide 13

RAW Dependency

Slide 14

Using Bypass to Solve RAW Dependency

Slide 15

RAW Dependency

Slide 16

Forwarding Hardware

Slide 17

Forwarding Control
Forwarding from EXE (L3)
  if (L3.RegWrite and (L3.dst == L2.src1)) ALUSelA = 1
  if (L3.RegWrite and (L3.dst == L2.src2)) ALUSelB = 1
Forwarding from MEM (L4)
  if (L4.RegWrite and
      ((not L3.RegWrite) or (L3.dst != L2.src1)) and
      (L4.dst == L2.src1)) ALUSelA = 2
  if (L4.RegWrite and
      ((not L3.RegWrite) or (L3.dst != L2.src2)) and
      (L4.dst == L2.src2)) ALUSelB = 2
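The forwarding conditions can be sketched as one selector function applied once per ALU operand. This is an illustrative model, not the lecture's hardware: the `Latch` class and `alu_select` name are hypothetical, and the select value 0 (take the operand from the register file) is an assumption, since the slide only lists the cases 1 and 2. The extra term in the MEM condition is the negation of the EXE condition, so checking EXE first gives the same priority to the most recent result.

```python
from dataclasses import dataclass


@dataclass
class Latch:
    """Subset of a pipeline latch needed for forwarding (hypothetical layout)."""
    reg_write: bool  # RegWrite control bit carried in the latch
    dst: int         # destination register number


def alu_select(src: int, l3: Latch, l4: Latch) -> int:
    """Return the ALU input mux select for one source register of the instruction in L2.

    1 = forward from EXE (L3), 2 = forward from MEM (L4),
    0 = no forwarding (assumed default: value read from the register file).
    """
    if l3.reg_write and l3.dst == src:
        return 1  # the newest producer wins
    if l4.reg_write and l4.dst == src:
        return 2  # reached only when L3 does not produce src
    return 0


if __name__ == "__main__":
    # add r1,...; add r1,...; sub r5,r1,r2 -> r1 must come from L3, r2 from the RF
    l3, l4 = Latch(True, 1), Latch(True, 1)
    print(alu_select(1, l3, l4), alu_select(2, l3, l4))
```

The sketch follows the slide literally and does not special-case a hardwired zero register.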

Slide 18

Register File Split
Register file is written during the first half of the cycle
Register file is read during the second half of the cycle
  Register file is written before it is read ⇒ the read returns the correct data

Slide 19

Can't Always Forward

Slide 20

Stall If Cannot Forward

Slide 21

Software Scheduling to Avoid Load Hazards
Fast code
  LW  Rb,b
  LW  Rc,c
  LW  Re,e
  ADD Ra,Rb,Rc
  LW  Rf,f
  SW  a,Ra
  SUB Rd,Re,Rf
  SW  d,Rd
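A small checker can confirm that the fast code has no load-use hazards. This is a sketch under an assumed hazard model: with forwarding, only an instruction that reads a register loaded by the immediately preceding LW costs one stall cycle. The `SLOW` sequence is a hypothetical unscheduled ordering of the same two statements (a = b + c, d = e - f); it does not appear in the slide text.

```python
from typing import List, Tuple

# (opcode, destination register, source registers); memory labels are ignored
Instr = Tuple[str, str, Tuple[str, ...]]


def count_load_use_stalls(program: List[Instr]) -> int:
    """Count instructions that read the result of the LW directly before them."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_dst, _ = prev
        _, _, cur_srcs = cur
        if prev_op == "LW" and prev_dst in cur_srcs:
            stalls += 1
    return stalls


# Fast code from the slide
FAST: List[Instr] = [
    ("LW", "Rb", ()), ("LW", "Rc", ()), ("LW", "Re", ()),
    ("ADD", "Ra", ("Rb", "Rc")), ("LW", "Rf", ()), ("SW", "", ("Ra",)),
    ("SUB", "Rd", ("Re", "Rf")), ("SW", "", ("Rd",)),
]

# Assumed unscheduled ordering (hypothetical, for comparison only)
SLOW: List[Instr] = [
    ("LW", "Rb", ()), ("LW", "Rc", ()), ("ADD", "Ra", ("Rb", "Rc")),
    ("SW", "", ("Ra",)), ("LW", "Re", ()), ("LW", "Rf", ()),
    ("SUB", "Rd", ("Re", "Rf")), ("SW", "", ("Rd",)),
]

if __name__ == "__main__":
    print(count_load_use_stalls(FAST), count_load_use_stalls(SLOW))
```

Under this model the scheduled sequence runs without stalls, while the assumed unscheduled one stalls twice (once before ADD, once before SUB).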

Slide 22

Control Hazards

Slide 23

Control Hazard on Branches

Slide 24

Control Hazard on Branches

Slide 25

Control Hazard on Branches

Slide 26

Control Hazard on Branches

Slide 27

Control Hazard on Branches

Slide 28

Control Hazard on Branches

Slide 29

Control Hazard: Stall
Stall the pipe when a branch is encountered, until it is resolved
Stall impact: assumptions
  CPI = 1
  20% of instructions are branches
  Stall 3 cycles on every branch
⇒ CPI new = 1 + 0.2 × 3 = 1.6
  (CPI new = CPI ideal + avg. stall cycles / instr.)
Execution time grows by 60% (performance drops to 1/1.6 = 62.5% of the ideal)
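The stall arithmetic can be checked with a short helper. This is a minimal sketch; the function and parameter names are hypothetical, and the numbers are the slide's assumptions (ideal CPI of 1, 20% branches, 3 stall cycles per branch).

```python
def cpi_with_branch_stalls(branch_fraction: float, stall_cycles: float,
                           ideal_cpi: float = 1.0) -> float:
    """CPI new = CPI ideal + average stall cycles per instruction."""
    return ideal_cpi + branch_fraction * stall_cycles


if __name__ == "__main__":
    cpi = cpi_with_branch_stalls(0.2, 3)
    print(f"CPI new              = {cpi:.2f}")        # about 1.60
    print(f"relative run time    = {cpi / 1.0:.2f}")  # about 1.60x the ideal
    print(f"relative performance = {1.0 / cpi:.3f}")  # about 0.625 of the ideal
```

Run time scales with CPI, so a CPI of 1.6 means the same program takes about 1.6 times as long as on the ideal pipeline.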

Slide 30

Control Hazard: Predict Not Taken
Execute instructions from the fall-through (not-taken) path
  As if there is no branch
  If the branch is not taken (~50%), no penalty is paid
If the branch is actually taken
  Flush the fall-through path instructions before they change the machine state (memory / registers)
  Fetch the instructions from the correct (taken) path
Assuming ~50% of branches are not taken on average
  CPI new = 1 + (0.2 × 0.5) × 3 = 1.3

Slide 31

Dynamic Branch Prediction

Slide 32

BTB
Allocation
  Allocate instructions identified as branches (after decode)
    Both conditional and unconditional branches are allocated
  Not-taken branches need not be allocated
    A BTB miss implicitly predicts not-taken
Prediction
  BTB lookup is done in parallel to the IC lookup
  BTB provides
    Indication that the instruction is a branch (BTB hit)
    Branch predicted target
    Branch predicted direction
    Branch predicted type (e.g., conditional, unconditional)
Update (when the branch outcome is known)
  Branch target
  Branch history (taken / not-taken)
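The allocate / predict / update cycle can be sketched as a small table keyed by the branch address. Several simplifications here are assumptions and not part of the slides: the table is an unbounded dictionary (a real BTB has a fixed number of entries), the history is a single bit holding the last outcome, and the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BTBEntry:
    target: int        # predicted target address
    taken: bool        # 1-bit history: last observed direction (assumption)
    conditional: bool  # branch type


class BTB:
    def __init__(self) -> None:
        self.entries: Dict[int, BTBEntry] = {}

    def predict(self, pc: int) -> Optional[int]:
        """Return the predicted target, or None to predict not-taken.

        A BTB miss implicitly predicts not-taken (fall through).
        """
        entry = self.entries.get(pc)
        if entry is None or not entry.taken:
            return None
        return entry.target

    def update(self, pc: int, taken: bool, target: int,
               conditional: bool = True) -> None:
        """Called once the branch outcome is known."""
        entry = self.entries.get(pc)
        if entry is None:
            if taken:  # not-taken branches need not be allocated
                self.entries[pc] = BTBEntry(target, True, conditional)
            return
        entry.taken = taken
        if taken:
            entry.target = target


if __name__ == "__main__":
    btb = BTB()
    print(btb.predict(0x100))   # miss: predicts not-taken
    btb.update(0x100, taken=True, target=0x200)
    print(hex(btb.predict(0x100)))
```

A practical predictor would typically use a longer history than one bit; the single bit keeps the sketch close to the fields listed on the slide.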

Slide 33

BTB (cont.)
Wrong prediction
  Predict not-taken, actual taken
  Predict taken, actual not-taken, or actual taken but wrong target
In case of a wrong prediction: flush the pipeline
  Reset latches (same as turning all instructions into NOPs)
  Select the PC source to be from the correct path
    The fall-through address must be carried along with the branch
  Start fetching instructions from the correct path
Assuming a correct-prediction rate of P
  CPI new = 1 + (0.2 × (1-P)) × 3
  For example, if P = 0.7
    CPI new = 1 + (0.2 × 0.3) × 3 = 1.18
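The misprediction formula can be written in general form. The symbols f_b (fraction of instructions that are branches) and c (flush penalty in cycles) are notation introduced here for illustration; the values 0.2 and 3 are the slide's assumptions, and 1.3 is the predict-not-taken figure from the earlier slide.

```latex
\[
\mathrm{CPI}_{\mathrm{new}} = \mathrm{CPI}_{\mathrm{ideal}} + f_b\,(1 - P)\,c
\]
\[
f_b = 0.2,\quad c = 3,\quad P = 0.7:\qquad
\mathrm{CPI}_{\mathrm{new}} = 1 + 0.2 \times 0.3 \times 3 = 1.18
\]
\[
1 + 0.6\,(1 - P) < 1.3 \iff P > 0.5
\]
```

Under these assumptions a predictor pays off against static predict-not-taken as soon as it is right more than half the time, and each additional 10 points of accuracy removes 0.06 from the CPI.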

Slide 34

Adding a BTB to the Pipeline

Slide 35

Adding a BTB to the Pipeline

Slide 36

Adding a BTB to the Pipeline

Slide 37

Using The BTB

Slide 38

Using The BTB (cont.)

Slide 39

Backup

Slide 40

MIPS Instruction Formats

Slide 41

The Memory Space
Each memory location
  is 8 bits = 1 byte wide
  has an address
We assume a 32-bit address
  An address space of 2^32 bytes
Memory stores both instructions and data
  Each instruction is 32 bits wide ⇒ stored in 4 consecutive bytes in memory
  Various data types have different widths
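Reading one instruction out of a byte-addressed memory can be sketched as follows. The byte order is not stated in the slides; big-endian is assumed here, as is the requirement that instruction addresses are multiples of 4. The function name is hypothetical.

```python
ADDRESS_BITS = 32
ADDRESS_SPACE_BYTES = 2 ** ADDRESS_BITS  # 4,294,967,296 byte locations


def fetch_instruction(memory: bytes, pc: int) -> int:
    """Assemble the 32-bit instruction stored in the 4 bytes starting at pc."""
    if pc % 4 != 0:
        raise ValueError("instruction address must be word-aligned (assumption)")
    if pc < 0 or pc + 4 > len(memory):
        raise IndexError("address outside the modelled memory")
    return int.from_bytes(memory[pc:pc + 4], "big")  # big-endian is an assumption


if __name__ == "__main__":
    memory = bytes([0x8C, 0x22, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00])
    print(hex(fetch_instruction(memory, 0)))
```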

Slide 42

Register File
The Register File holds 32 registers
  Each register is 32 bits wide
  The RF supports reading any two registers and writing any register in parallel
Inputs
  Read reg 1/2: number of the register whose value will be output on Read data 1/2
  RegWrite: write enable
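A register file with two read ports and one write port can be modelled in a few lines. This is a sketch with assumptions: the write-port inputs `write_reg` and `write_data` are names chosen here (the slide text lists only the read inputs and RegWrite), and the write is applied before the reads within a cycle, following the split described on the "Register File Split" slide.

```python
from typing import Tuple


class RegisterFile:
    NUM_REGS = 32
    MASK = 0xFFFFFFFF  # each register is 32 bits wide

    def __init__(self) -> None:
        self.regs = [0] * self.NUM_REGS

    def cycle(self, read_reg1: int, read_reg2: int, reg_write: bool = False,
              write_reg: int = 0, write_data: int = 0) -> Tuple[int, int]:
        """One clock cycle: write in the first half, read in the second half."""
        if reg_write:
            self.regs[write_reg] = write_data & self.MASK
        return self.regs[read_reg1], self.regs[read_reg2]


if __name__ == "__main__":
    rf = RegisterFile()
    # Writing r1 and reading r1 in the same cycle returns the new value
    print(rf.cycle(1, 2, reg_write=True, write_reg=1, write_data=42))
```

The sketch does not model a hardwired zero register.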

Slide 43

Memory Components
Inputs
  Address: address of the memory location we wish to access
  Read: read data from the location
  Write: write data into the location
  Write data (relevant when Write=1): data to be written into the specified location
Outputs
  Read data (relevant when Read=1): data read from the specified location

Slide 44

The Program Counter (PC)
Holds the address (in memory) of the next instruction to be executed
After each instruction, it is advanced to point to the next instruction
  If the current instruction is not a taken branch, the next instruction resides right after the current instruction
    PC ← PC + 4
  If the current instruction is a taken branch, the next instruction resides at the branch target
    PC ← target (absolute jump)
    PC ← PC + 4 + offset×4 (relative jump)
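The three PC update rules can be combined into one function. This is a minimal sketch; the function signature is hypothetical, the offset is taken to be a signed count of instructions, and the result is wrapped to 32 bits as an assumption.

```python
from typing import Optional

MASK32 = 0xFFFFFFFF


def next_pc(pc: int, taken: bool = False, offset: Optional[int] = None,
            target: Optional[int] = None) -> int:
    """Return the address of the next instruction.

    not a taken branch      -> PC + 4
    taken, absolute jump    -> target
    taken, relative jump    -> PC + 4 + offset * 4
    """
    if not taken:
        return (pc + 4) & MASK32
    if target is not None:
        return target & MASK32
    if offset is None:
        raise ValueError("a taken branch needs either a target or an offset")
    return (pc + 4 + offset * 4) & MASK32


if __name__ == "__main__":
    print(hex(next_pc(0x1000)))                        # sequential
    print(hex(next_pc(0x1000, taken=True, offset=3)))  # relative
    print(hex(next_pc(0x1000, taken=True, target=0x4000)))
```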

Slide 45

Instruction Execution Stages
Fetch
  Fetch the instruction pointed to by the PC from the I-Cache
Decode
  Decode the instruction (generate control signals)
  Fetch operands from the register file
Execute
  For a memory access: calculate the effective address
  For an ALU operation: execute the operation in the ALU
  For a branch: calculate the condition and the target
Memory Access
  For a load: read data from memory
  For a store: write data into memory
Write Back
  Write the result back to the register file
  Update the program counter

Slide 46

The MIPS CPU

Slide 47

Executing an Add Instruction

Slide 48

Executing a Load Instruction

Slide 49

Executing a Store Instruction

Slide 50

Executing a BEQ Instruction

Slide 51

Control Signals

Slide 52

Pipelined CPU: Load (cycle 1 – Fetch)

Slide 53

Pipelined CPU: Load (cycle 2 – Dec)

Slide 54

Pipelined CPU: Load (cycle 3 – Exe)

Slide 55

Pipelined CPU: Load (cycle 4 – Mem)

Slide 56

Pipelined CPU: Load (cycle 5 – WB)

Slide 57

Datapath with Control

Slide 58

Multi-Cycle Control

Slide 59

Five Execution Steps
Instruction Fetch
  Use the PC to get the instruction and put it in the Instruction Register.
  Increment the PC by 4 and put the result back in the PC.
    IR = Memory[PC];
    PC = PC + 4;
Instruction Decode and Register Fetch
  Read registers rs and rt
  Compute the branch address
    A = Reg[IR[25-21]];
    B = Reg[IR[20-16]];
    ALUOut = PC + (sign-extend(IR[15-0]) << 2);

Slide 60

Five Execution Steps (cont.)
Execution
  The ALU performs one of three functions, based on the instruction type:
    Memory reference: effective address calculation
      ALUOut = A + sign-extend(IR[15-0]);
    R-type:
      ALUOut = A op B;
    Branch:
      if (A==B) PC = ALUOut;
Memory Access or R-type instruction completion
Write-back step
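The register-transfer statements of the decode and execution steps can be traced in software. This is a sketch under assumptions: the helper names are hypothetical, `pc` is the already incremented PC (PC + 4 from the fetch step), the branch offset is shifted left by 2 to convert words to bytes, and the sample instruction word uses the standard MIPS BEQ opcode (4), which the slides do not list.

```python
from typing import List, Tuple

MASK32 = 0xFFFFFFFF


def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def decode_step(ir: int, pc: int, regs: List[int]) -> Tuple[int, int, int]:
    """A = Reg[IR[25-21]]; B = Reg[IR[20-16]]; ALUOut = branch address."""
    a = regs[(ir >> 21) & 0x1F]
    b = regs[(ir >> 16) & 0x1F]
    alu_out = (pc + (sign_extend16(ir) << 2)) & MASK32
    return a, b, alu_out


def branch_step(a: int, b: int, alu_out: int, pc: int) -> int:
    """Branch execution step: if (A==B) PC = ALUOut."""
    return alu_out if a == b else pc


if __name__ == "__main__":
    regs = [0] * 32
    regs[1] = regs[2] = 7
    ir = (4 << 26) | (1 << 21) | (2 << 16) | 0xFFFE  # beq r1, r2, offset -2
    a, b, alu_out = decode_step(ir, pc=0x1004, regs=regs)
    print(hex(branch_step(a, b, alu_out, pc=0x1004)))
```

Computing the branch address during decode, before the instruction type is even known, is what lets the branch finish in the execution step.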

Slide 61

The Store Instruction

Slide 62

RAW Hazard: SW Solution

Slide 63

Delayed Branch
Define the branch to take effect AFTER the n following instructions
  HW executes the n instructions following the branch regardless of whether the branch is taken
SW puts in the n slots following the branch instructions that need to be executed regardless of the branch resolution
  Instructions that come before the branch instruction, or
  Instructions from the converged path after the branch
If no independent instructions can be found, put NOPs

Slide 64

Delayed Branch Performance
Filling 1 delay slot is easy, 2 is hard, 3 is harder
Assuming we can effectively fill a fraction d of the delay slots
  CPI new = 1 + 0.2 × (3 × (1-d))
  For example, for d = 0.5, we get CPI new = 1.3
Mixing architecture with micro-architecture
  New generations require more delay slots
  Causes compatibility issues between generations


