Question 5:

 a. Consider a machine that supports the following two instruction schedules for R class and I class instructions. Assume an instruction mix of 70% R class and 30% I class instructions. Assume that IF steps take 30 nanoseconds, MEM steps require 50 nanoseconds, and the other steps require 40 nanoseconds.

 Step      0    1    2    3    4
 R Class   IF   ID   EX   WB
 I Class   IF   ID   EX   MEM  WB

 For a multi-cycle implementation:
   i. What is the minimum clock cycle time?
   ii. How long does it take to execute 200 instructions, in nanoseconds?

 b. Consider a deeply pipelined processor with a branch-target buffer (BTB) for conditional branches only. Assume a misprediction penalty of 5 cycles, a buffer miss penalty of 4 cycles, a 95% hit rate, 90% accuracy, and a 20% branch frequency. How much faster is the processor with the BTB than a processor that has a fixed 4-cycle branch penalty?


ANSWER:

Part (a):

i. Minimum Clock Cycle Time:

The minimum clock cycle time is determined by the longest step in the instruction execution path, as the clock cycle must be long enough to accommodate the longest single step.

  • IF (Instruction Fetch): 30 nanoseconds
  • ID (Instruction Decode): 40 nanoseconds
  • EX (Execute): 40 nanoseconds
  • MEM (Memory Access): 50 nanoseconds
  • WB (Write Back): 40 nanoseconds

Since the MEM step requires the most time (50 nanoseconds), the minimum clock cycle time is 50 nanoseconds.

Minimum Clock Cycle Time = 50 nanoseconds
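
The same reasoning can be checked with a minimal Python sketch. The names `step_latency_ns` and `min_clock_cycle_ns` are illustrative and not part of the question; the latencies are the ones given above.

```python
# Step latencies in nanoseconds, as stated in the question.
step_latency_ns = {"IF": 30, "ID": 40, "EX": 40, "MEM": 50, "WB": 40}


def min_clock_cycle_ns(latencies):
    """Return the shortest clock period that fits every step in one cycle."""
    return max(latencies.values())


# The step that sets the clock period is the slowest one.
slowest_step = max(step_latency_ns, key=step_latency_ns.get)
print(slowest_step, min_clock_cycle_ns(step_latency_ns))  # MEM 50
```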

ii. Time to Execute 200 Instructions:

First, let's find the average number of cycles required per instruction given the instruction mix.

  • R Class Instruction: 4 cycles
  • I Class Instruction: 5 cycles

Given the instruction mix:

  • 70% R class instructions: 0.7×4=2.8 cycles
  • 30% I class instructions: 0.3×5=1.5 cycles

The average number of cycles per instruction is: Average Cycles per Instruction=2.8+1.5=4.3

To execute 200 instructions: Total Cycles=200×4.3=860

Since each cycle takes 50 nanoseconds: Total Execution Time=860×50 nanoseconds=43,000 nanoseconds

Total Execution Time for 200 instructions = 43,000 nanoseconds
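
As a cross-check, the calculation above can be sketched in Python. This version works with whole instruction counts (140 R class and 60 I class out of 200) to avoid floating-point rounding; the variable names are illustrative assumptions.

```python
CLOCK_NS = 50          # minimum clock cycle time from part (i)
INSTRUCTIONS = 200

cycles_per_class = {"R": 4, "I": 5}   # steps in each schedule
mix_percent = {"R": 70, "I": 30}      # instruction mix from the question

# Number of instructions of each class in the 200-instruction run.
counts = {cls: INSTRUCTIONS * pct // 100 for cls, pct in mix_percent.items()}

total_cycles = sum(counts[cls] * cycles_per_class[cls] for cls in counts)
average_cpi = total_cycles / INSTRUCTIONS
total_time_ns = total_cycles * CLOCK_NS

print(total_cycles, total_time_ns)  # 860 43000
```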

Part (b):

To compare the performance of the processor with and without the branch-target buffer (BTB), we need to calculate the average branch penalty per instruction in both scenarios and then determine the speedup.

Without BTB:

  • Fixed 4-cycle branch penalty
  • Branch frequency: 20%

The average penalty per instruction is: 0.2×4=0.8 cycles

With BTB:

  • 95% hit rate
  • 90% accuracy
  • 5% miss rate (1 - 0.95)
  • Branch frequency: 20%
  • Misprediction penalty: 5 cycles
  • Buffer miss penalty: 4 cycles

Hit but mispredicted (10% of hits): 0.2×0.95×0.10×5=0.095 cycles

Hit and correctly predicted (90% of hits): 0.2×0.95×0.90×0=0 cycles

Miss penalty (5% of all branches): 0.2×0.05×4=0.04 cycles

Total penalty with BTB: 0.095+0.04=0.135 cycles
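
The penalty terms above can be reproduced with a short Python sketch. The function name `btb_penalty_per_instruction` and its parameter names are hypothetical; the model (correctly predicted hits cost nothing, mispredicted hits cost the misprediction penalty, buffer misses cost the miss penalty) is the one used in this answer.

```python
def btb_penalty_per_instruction(branch_freq, hit_rate, accuracy,
                                mispredict_penalty, miss_penalty):
    """Average branch stall cycles per instruction with a BTB."""
    # Branch found in the buffer but predicted wrongly.
    hit_mispredicted = hit_rate * (1.0 - accuracy) * mispredict_penalty
    # Branch not found in the buffer at all.
    buffer_miss = (1.0 - hit_rate) * miss_penalty
    # Correctly predicted hits contribute zero stall cycles.
    return branch_freq * (hit_mispredicted + buffer_miss)


with_btb = btb_penalty_per_instruction(0.20, 0.95, 0.90, 5, 4)
without_btb = 0.20 * 4  # fixed 4-cycle penalty on every branch

print(f"{without_btb:.3f} {with_btb:.3f}")  # 0.800 0.135
```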

Speedup Calculation:

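The ratio of the average branch penalty without the BTB to the average penalty with the BTB is:

Penalty ratio = 0.8 / 0.135 ≈ 5.93

This ratio measures how much the branch stall cycles shrink, not how much faster the whole processor runs. Overall speedup is the ratio of the two CPIs. The question does not give a base CPI, so assume an ideal pipelined CPI of 1:

  • CPI without BTB: 1 + 0.8 = 1.8
  • CPI with BTB: 1 + 0.135 = 1.135

Speedup = 1.8 / 1.135 ≈ 1.59

So the BTB cuts the average branch penalty by a factor of about 5.93, and under the base-CPI-of-1 assumption the processor with the BTB is approximately 1.59 times faster than the processor with a fixed 4-cycle branch penalty.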
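
To make the two possible ratios concrete, here is a small Python sketch that computes both the ratio of branch penalties and the ratio of CPIs. The base CPI of 1 is an assumption, since the question does not state one, and the variable names are illustrative.

```python
BASE_CPI = 1.0             # assumption: ideal pipelined CPI, not given in the question

penalty_without_btb = 0.8    # stall cycles per instruction, fixed 4-cycle penalty
penalty_with_btb = 0.135     # stall cycles per instruction, with the BTB

# How much the branch stall cycles shrink.
penalty_ratio = penalty_without_btb / penalty_with_btb

# How much faster the whole processor runs, given the assumed base CPI.
cpi_without_btb = BASE_CPI + penalty_without_btb
cpi_with_btb = BASE_CPI + penalty_with_btb
overall_speedup = cpi_without_btb / cpi_with_btb

print(f"{penalty_ratio:.2f} {overall_speedup:.2f}")  # 5.93 1.59
```

The overall figure depends on the assumed base CPI: a larger base CPI makes the branch penalty a smaller share of execution time and lowers the speedup.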
