Which of the following are NOT true in a pipelined processor? I. Bypassing can handle all RAW hazards II. Register renaming can eliminate all register carried WAR hazards III. Control hazard penalties can be eliminated by dynamic branch prediction
GATE CSE · Computer Organization Architecture
Master topic for Pipeline Processor. Includes Pipelining, Performance & Parallelism.
110 questions · 20 PYQs · 0 AI practice · GATE CSE 2027
Which of the following are NOT true in a pipelined processor? I. Bypassing can handle all RAW hazards II. Register renaming can eliminate all register carried WAR hazards III. Control hazard penalties can be eliminated by dynamic branch prediction
Delayed branching can help in the handling of control hazards For all delayed conditional branch instructions, irrespective of whether the condition evaluates to true or false
The performance of a pipelined processor suffers if:
Delayed branching can help in the handling of control hazards The following code is to run on a pipelined processor with one branch delay slot: I1: ADD R2 R7 +R8 I2 : SUB R4 R5 - R6 I3: ADD R1 R2 + R3 I4 : STORE Memory [R4] R1 BRANCH to Label if R1==0 Which of the instructions I1, I2, I3 or I4 can legitimately occupy the delay slot without any other program modification?
A non pipelined single cycle processor operating at 100 MHz is converted into a synchronous pipelined processor with five stages requiring 2.5 nsec, 1.5 nsec, 2 nsec, 1.5 nsec and 2.5 nsec, respectively. The delay of the latches is 0.5 nsec. The speedup of the pipeline processor for a large number of instructions is:
The floating point unit of a processor using a design D takes 2t cycles compared to t cycles taken by the fixed point unit. There are two more design suggestions and . uses 30% more cycles for fixed point unit but 30% less cycles for floating point unit as compared to design D. uses 40% less cycles for fixed point unit but 10% more cycles for floating point unit as compared to design D. For a given program which has 80% fixed point operations and 20% floating point operations, which of the following ordering reflects the relative performances of three designs? ( > denotes that is faster than )
A processor takes 12 cycles to complete an instruction . The corresponding pipelined processor uses 6 stages with the execution times of 3, 2, 5, 4, 6 and 2 cycles respectively. What is the asymptotic speedup assuming that a very large number of instructions are to be executed?
Consider a pipelined processor with the following four stages: IF: Instruction Fetch ID: Instruction Decode and Operand Fetch EX: Execute WB: Write Back The IF, ID and WB stages take one clock cycle each to complete the operation. The number of clock cycles for the EX stage depends on the instruction. The ADD and SUB instructions need 1 clock cycle and the MUL instruction needs 3 clock cycles in the EX stage. Operand forwarding is used in the pipelined processor. What is the number of clock cycles taken to complete the following sequence of instructions? ADD R2, R1, R0 (R2 R1 + R0) MUL R4, R3, R2 (R4 R3 * R2) SUB R6, R5, R4 (R6 R5 - R4)
Data forwarding techniques can be used to speed up the operation in presence of data dependencies. Consider the following replacements of LHS with RHS. i. ii. iii. iv. In which of the following options, will the result of executing the RHS be the same as executing the LHS irrespective of the instructions that follow ?
A pipelined processor uses a 4-stage instruction pipeline with the following stages: Instruction fetch (IF), Instruction decode (ID), Execute (EX) and Writeback (WB). The arithmetic operations as well as the load and store operations are carried out in the EX stage. The sequence of instructions corresponding to the statement is given below. The values of variables P, Q, R, S and T are available in the registers R0, R1, R2, R3 and R4 respectively, before the execution of the instruction sequence.
The IF, ID and WB stages take 1 clock cycle each. The EX stage takes 1 clock cycle each for the ADD, SUB and STORE operations, and 3 clock cycles each for MUL and DIV operations. Operand forwarding from the EX stage to the ID stage is used. The number of clock cycles required to complete the sequence of instructions is
A pipelined processor uses a 4-stage instruction pipeline with the following stages: Instruction fetch (IF), Instruction decode (ID), Execute (EX) and Writeback (WB). The arithmetic operations as well as the load and store operations are carried out in the EX stage. The sequence of instructions corresponding to the statement is given below. The values of variables P, Q, R, S and T are available in the registers R0, R1, R2, R3 and R4 respectively, before the execution of the instruction sequence.
The number of Read-After-Write (RAW) dependencies, Write-After-Read( WAR) dependencies, and Write-After-Write (WAW) dependencies in the sequence of instructions are, respectively,
A CPU has five-stages pipeline and runs at 1GHz frequency. Instruction fetch happens in the first stage of the pipeline. A conditional branch instruction computes the target address and evaluates the condition in the third stage of the pipeline. The processor stops fetching new instructions following a conditional branch until the branch outcome is known. A program executes instructions out of which 20% are conditional branches. If each instruction takes one cycle to complete on average, then total execution time of the program is
We have two designs D1 and D2 for a synchronous pipeline processor. D1 has 5 pipeline stages with execution times of 3 nsec, 2 nsec, 4 nsec, 2 nsec and 3 nsec while the design D2 has 8 pipeline stages each with 2 nsec execution time How much time can be saved using design D2 over design D1 for executing 100 instructions?
A 5 stage pipelined CPU has the following sequence of stages: IF - Instruction fetch from instruction memory, RD - Instruction decode and register read, EX - Execute: ALU operation for data and address computation, MA - Data memory access - for write access, the register read at RD stage is used, WB - Register write back. Consider the following sequence of instructions: I1 : L R0, 1oc1; R0 <= M[1oc1] I2 : A R0, R0; R0 <= R0 + R0 I3 : S R2, R0; R2 <= R2 - R0 Let each stage take one clock cycle. What is the number of clock cycles taken to complete the above sequence of instructions starting from the fetch of I1 ?
In an enhancement of a design of a CPU, the speed of a floating point unit has been increased by 20% and the speed of a fixed point unit has been increased by 10%. What is the overall speedup achieved if the ratio of the number of floating point operations to the number of fixed point operations is 2:3 and the floating point operation used to take twice the time taken by the fixed point operation in the original design?
A 4-stage pipeline has the stage delays as 150, 120, 160 and 140 nanoseconds respectively. Registers that are used between the stages have a delay of 5 nanoseconds each. Assuming constant clocking rate, the total time taken to process 1000 data items on this pipeline will be
Consider a pipeline processor with 4 stages S1 to S4. We want to execute the following loop: for (i = 1; i < = 1000; i++) {I1, I2, I3, I4} where the time taken (in ns) by instructions I1 to I4 for stages S1 to S4 are given below:
The output of I1 for i = 2 will be available after
For a pipelined CPU with a single ALU, consider the following situations 1. The (j + 1)-th instruction uses the result of j-th instruction as an operand 2. The execution of a conditional jump instruction 3. The j - th and j + 1 - st instructions require the ALU at the same time Which of the above can cause a hazard?
The performance of a pipelined processor suffers if
Comparing the time T1 taken for a single instruction on a pipelined CPU with time T2 taken on a non-pipelined but identical CPU, we can say that
Want unlimited AI-generated Pipeline Processor questions?
Sign up free and practice with adaptive difficulty — Easy, Medium, Hard. New questions every session.
Start practising for free →