Question 5:
a. Consider a machine which supports the following two instruction schedules for R class and I class instructions. Assume an instruction mix of 70% R class and 30% I class instructions. Assume that IF steps take 30 nanoseconds, MEM steps of instruction execution require 50 nanoseconds, and the other steps require 40 nanoseconds.
Step:     0   1   2   3    4
R Class:  IF  ID  EX  WB
I Class:  IF  ID  EX  MEM  WB
For a multi-cycle implementation,
i. What is the minimum clock cycle time?
ii. How long does it take to execute 200 instructions in nanoseconds?
b. Given a deeply pipelined processor and a branch-target buffer for conditional branches only, assuming a misprediction penalty of 5 cycles and a buffer miss penalty of 4 cycles, 95% hit rate and 90% accuracy, and 20% branch frequency, how much faster is the processor with the BTB vs. a processor that has a fixed 4-cycle branch penalty?
Part (a):
i. Minimum Clock Cycle Time:
In a multi-cycle implementation each step takes exactly one clock cycle, so the clock period must be at least as long as the slowest single step.
- IF (Instruction Fetch): 30 nanoseconds
- ID (Instruction Decode): 40 nanoseconds
- EX (Execute): 40 nanoseconds
- MEM (Memory Access): 50 nanoseconds
- WB (Write Back): 40 nanoseconds
Since the MEM step requires the most time (50 nanoseconds), the minimum clock cycle time is 50 nanoseconds.
Minimum Clock Cycle Time = 50 nanoseconds
ii. Time to Execute 200 Instructions:
First, let's find the average number of cycles required per instruction given the instruction mix.
- R Class Instruction: 4 cycles
- I Class Instruction: 5 cycles
Given the instruction mix:
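- 70% R class instructions: 0.70 × 4 = 2.8 cycles
- 30% I class instructions: 0.30 × 5 = 1.5 cycles
The average number of cycles per instruction is: 2.8 + 1.5 = 4.3 cycles
To execute 200 instructions: 200 × 4.3 = 860 cycles
Since each cycle takes 50 nanoseconds: 860 × 50 = 43,000 nanoseconds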
Total Execution Time for 200 instructions = 43,000 nanoseconds
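The Part (a) arithmetic can be checked with a short Python sketch. The dictionary layout and the function names (min_cycle_ns, average_cpi, total_time_ns) are illustrative choices, not part of the question, and the sketch assumes every step occupies exactly one clock cycle.

```python
# Step latencies in nanoseconds, taken from the problem statement.
STEP_NS = {"IF": 30, "ID": 40, "EX": 40, "MEM": 50, "WB": 40}

# Steps (one clock cycle each) and share of the instruction mix per class.
CLASSES = {
    "R": {"steps": ["IF", "ID", "EX", "WB"], "fraction": 0.70},
    "I": {"steps": ["IF", "ID", "EX", "MEM", "WB"], "fraction": 0.30},
}


def min_cycle_ns(step_ns):
    """The clock period must cover the slowest single step."""
    return max(step_ns.values())


def average_cpi(classes):
    """Mix-weighted average of cycles per instruction."""
    return sum(c["fraction"] * len(c["steps"]) for c in classes.values())


def total_time_ns(n_instructions, classes, step_ns):
    """Total multi-cycle execution time in nanoseconds."""
    return n_instructions * average_cpi(classes) * min_cycle_ns(step_ns)


if __name__ == "__main__":
    print(min_cycle_ns(STEP_NS))                          # 50
    print(round(average_cpi(CLASSES), 2))                 # 4.3
    print(round(total_time_ns(200, CLASSES, STEP_NS)))    # 43000
```

Changing the mix fractions or step latencies in the two dictionaries is enough to rerun the calculation for a different machine.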
Part (b):
To compare the performance of the processor with and without the branch-target buffer (BTB), we calculate the average branch penalty, in stall cycles per instruction, for both scenarios and then determine the speedup.
Without BTB:
- Fixed 4-cycle branch penalty
- Branch frequency: 20%
The average penalty per instruction is: 0.20 × 4 = 0.8 cycles
With BTB:
- 95% hit rate
- 90% accuracy
- 5% miss rate (1 - 0.95)
- Branch frequency: 20%
- Misprediction penalty: 5 cycles
- Buffer miss penalty: 4 cycles
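Hit but mispredicted (10% of hits): 0.95 × 0.10 × 5 = 0.475 cycles per branch
Hit and correctly predicted (90% of hits): 0.95 × 0.90 × 0 = 0 cycles per branch
Miss penalty (5% of all branches): 0.05 × 4 = 0.2 cycles per branch
Total penalty with BTB: 0.475 + 0 + 0.2 = 0.675 cycles per branch, or 0.20 × 0.675 = 0.135 cycles per instruction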
Speedup Calculation:
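The ratio of the average branch penalty per instruction without the BTB (0.8 cycles) to that with the BTB (0.135 cycles) is 0.8 / 0.135 ≈ 5.93, so the BTB cuts branch stall cycles by a factor of about 5.93. That ratio is not the overall speedup, which compares total cycles per instruction. The problem does not state a base CPI; assuming a base CPI of 1 in the absence of branch stalls:
Speedup = (1 + 0.8) / (1 + 0.135) = 1.8 / 1.135 ≈ 1.59
So, under that assumption, the processor with the BTB is approximately 1.59 times faster than the processor with a fixed 4-cycle branch penalty.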
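The Part (b) figures can be reproduced with a small Python sketch. It reports two different quantities: the ratio of branch stall cycles per instruction, and the overall speedup in CPI terms. The base CPI of 1.0 is an assumption (the problem does not give one), and the function names are hypothetical.

```python
def btb_penalty_per_branch(hit_rate, accuracy, mispredict_cycles, miss_cycles):
    """Average stall cycles per branch with a branch-target buffer.

    Correctly predicted hits are assumed to cost 0 cycles.
    """
    hit_mispredicted = hit_rate * (1 - accuracy) * mispredict_cycles
    buffer_miss = (1 - hit_rate) * miss_cycles
    return hit_mispredicted + buffer_miss


def cpi(base_cpi, branch_freq, penalty_per_branch):
    """Total CPI = base CPI + branch stall cycles per instruction."""
    return base_cpi + branch_freq * penalty_per_branch


BRANCH_FREQ = 0.20
FIXED_PENALTY = 4      # cycles per branch without the BTB
BASE_CPI = 1.0         # ASSUMPTION: not stated in the problem

per_branch_btb = btb_penalty_per_branch(0.95, 0.90, 5, 4)
stall_btb = BRANCH_FREQ * per_branch_btb       # stall cycles per instruction
stall_fixed = BRANCH_FREQ * FIXED_PENALTY      # stall cycles per instruction

penalty_ratio = stall_fixed / stall_btb
speedup = (cpi(BASE_CPI, BRANCH_FREQ, FIXED_PENALTY)
           / cpi(BASE_CPI, BRANCH_FREQ, per_branch_btb))

if __name__ == "__main__":
    print(f"penalty per branch with BTB: {per_branch_btb:.3f}")
    print(f"stall-cycle ratio:           {penalty_ratio:.2f}")
    print(f"speedup (base CPI {BASE_CPI}):      {speedup:.2f}")
```

A larger assumed base CPI shrinks the speedup toward 1, because branch stalls then make up a smaller share of total execution time.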