Suppose an instruction takes 6 cycles to execute in an unpipelined CPU: one cycle to fetch the instruction, one cycle to decode the instruction, one cycle to determine the addressing mode of the operands, one cycle to fetch any operands, one cycle to perform the ALU operation, and one cycle to store the result. In a CPU with a 6-stage pipeline, that instruction still takes 6 cycles to execute, so how can we say the pipeline speeds up the execution of the program?
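
The distinction the question turns on is latency versus throughput: one instruction still needs 6 cycles from start to finish, but a full pipeline completes one instruction per cycle. The sketch below counts total cycles for a program of N instructions under idealized assumptions that are not stated in the question: every stage takes exactly one cycle, the clock is the same in both designs, and there are no stalls or hazards. The function names are hypothetical and chosen only for illustration.

```python
def unpipelined_cycles(n_instructions: int, stages: int = 6) -> int:
    """Each instruction must finish all stages before the next one starts."""
    return n_instructions * stages


def pipelined_cycles(n_instructions: int, stages: int = 6) -> int:
    """Ideal pipeline: the first instruction takes `stages` cycles to fill
    the pipe, then one further instruction completes every cycle."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1)


def speedup(n_instructions: int, stages: int = 6) -> float:
    """Ratio of unpipelined to pipelined cycle counts (assumes no stalls)."""
    return unpipelined_cycles(n_instructions, stages) / pipelined_cycles(
        n_instructions, stages
    )


if __name__ == "__main__":
    for n in (1, 6, 100, 1_000_000):
        print(
            f"N={n}: unpipelined={unpipelined_cycles(n)} cycles, "
            f"pipelined={pipelined_cycles(n)} cycles, "
            f"speedup={speedup(n):.3f}"
        )
```

For a single instruction the two counts are equal, so there is no gain. As N grows, the speedup approaches the number of stages (6 here), because the fill time of the pipeline is amortized over the whole program. Real pipelines fall short of this bound because of stalls, branches, and unequal stage delays.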

Systems Architecture

7th Edition

ISBN: 9781305080195

Author: Stephen D. Burd

Publisher: Stephen D. Burd

Chapter 4: Processor Technology and Architecture

Section: Chapter Questions

Problem 2PE: If a microprocessor has a cycle time of 0.5 nanoseconds, what’s the processor clock rate? If the...

Similar questions

Processor R is a 64-bit RISC processor with a 2 GHz clock rate. The average instruction requires one cycle to complete, assuming zero wait state memory accesses. Processor C is a CISC processor with a 1.8 GHz clock rate. The average simple instruction requires one cycle to complete, assuming zero wait state memory accesses. The average complex instruction requires two cycles to complete, assuming zero wait state memory accesses. Processor R can’t directly implement the complex processing instructions of Processor C. Executing an equivalent set of simple instructions requires an average of three cycles to complete, assuming zero wait state memory accesses. Program S contains nothing but simple instructions. Program C executes 70% simple instructions and 30% complex instructions. Which processor will execute program S more quickly? Which processor will execute program C more quickly? At what percentage of complex instructions will the performance of the two processors be equal?
How does pipelining improve CPU efficiency? What’s the potential effect on pipelining’s efficiency when executing a conditional BRANCH instruction? What techniques can be used to make pipelining more efficient when executing conditional BRANCH instructions?
Suppose an instruction takes four cycles to execute in a nonpipelined CPU: one cycle to fetch the instruction, one cycle to decode the instruction, one cycle to perform the ALU operation, and one cycle to store the result. In a CPU with a four-stage pipeline, that instruction still takes four cycles to execute, so how can we say the pipeline speeds up the execution of the program?
Instead of a single-cycle organization, we can use a multi-cycle organization where each instruction takes multiple cycles but one instruction finishes before another is fetched. In this organization, an instruction only goes through the stages it actually needs (e.g., ST only takes 4 cycles because it does not need the WB stage). Compare clock cycle times and execution times for the single-cycle, multi-cycle, and pipelined organizations.
Suppose you have a computer that does instruction processing in an atomic way, with a clock cycle of 7 ns and one instruction execution completed every cycle. You now split the processing into the five stages of the RISC pipeline, and you get required processing times of IF: 1 ns, ID: 1.5 ns, EX: 1 ns, MEM: 2 ns, and WB: 1.5 ns. You now have added 0.1 ns of delay between each of these stages. What’s the clock cycle time of this 5-stage pipelined machine?
Suppose on a non-pipelined single-processor machine, you have the following breakdown: ALU instructions make up 25% of the dynamic instruction count and take 2 cycles to execute. Load/store instructions take 10 cycles to execute and make up 30% of the mix. Jumps take 4 cycles and make up 15%. All other instructions average 1.5 cycles. a. What is the average CPI? b. Suppose the architecture above is pipelined. If there are no stalls for any reason, what is the new CPI? c. If, for the architecture described in questions a and b, load/store instructions generate 2 stalls on average, ALU instructions 0.2 stalls, and jumps 1 stall, what is the actual pipelined CPI?
Consider the following portions of three different programs running at the same time on three processors in a symmetric multicore processor (SMP). Assume that before this code is run, Total is 15, val_1 is 45, val_2 is 25, and val_3 is 5. Core 1: Total = Total + val_1; Core 2: Total = Total - val_2; Core 3: Total = Total + val_3; 1. Show the assembly code for both instructions. Assume that Total, val_1, and val_2 are stored at 0x0100, 0x0120, and 0x0130 respectively. 2. What are all the possible resulting values of Total? For each possible outcome, explain how we might arrive at those values. You will need to show all possible interleavings of instructions. 3. How could you make the execution more deterministic so that only one set of values is possible?
A complete 6-stage non-pipelined 16-bit CPU architecture includes 6 components: a register file, a decoder, an ALU, a control unit, a program counter, and RAM/memory. Brief overview: opcode is 4 bits; 14 different instructions implemented; 8 general-purpose registers; RRR-type instructions are the largest and take up 9 bits in register addresses; 1 bit is a condition bit; 2 bits unused; simulated clock runs at a 10 ns period, or 100 MHz; simulated memory is 512 bytes. Referring to the 3 components in the picture, namely the register file, decoder, and ALU, you are required to describe how the three components operate.
A hypothetical processor has a 9-stage pipeline, as shown in the table below. The first row of the table shows the pipeline stage number, the second row gives the name of each stage, and the third row gives the delay of each stage in nanoseconds. The name of each stage describes the task performed by it. Each stage takes 1 cycle to execute. This processor stores all the register contents in a compressed fashion. After the operands are fetched, they are first decompressed, and before the results are saved in the register file, they are first compressed. a) How many cycles are required to execute one instruction on this pipeline? b) How many cycles are required to execute 19 instructions on this pipeline? Assume that no stall cycles occur during the execution of all instructions. c) Assume that all necessary bypass/forwarding circuitry is implemented in this 9-stage pipeline. How many cycles will the pipeline stall during the execution of the two instructions given below? Briefly explain…
Consider a CPU that implements a single instruction fetch–decode–execute–write-back pipeline for scalar processing. The execution unit of this pipeline assumes that the execution stage requires one step. Describe, and show in diagram form, what happens when an instruction that requires one execution step follows one that requires four execution steps.
1. Consider the operation of a machine with the data path of the figure below. Let us call this machine "Teletraan-1". Teletraan-1 has no pipeline. Suppose that fetching an instruction takes 4 nsec, decoding an instruction takes 1 nsec, fetching the operands from memory or registers to ALU takes 2 nsec, running the ALU takes 2 nsec, storing the result back to the destination registers takes 1 nsec. How much time does Teletraan-1 need to execute 1 million instructions? A. 1 nsec x 1 million = 0.001 second B. 2 nsec x 1 million = 0.002 second C. 4 nsec x 1 million = 0.004 second D. 10 nsec x 1 million = 0.01 second 2. Since the above machine Teletraan-1 has no pipeline, an instruction must be completed in one CPU clock cycle. As a result, a CPU clock cycle must be longer than or equal to the execution time of one instruction. To improve it, we developed "Teletraan-2". Teletraan-2 has a five-stage pipeline. It takes 5 nsec to fetch an instruction, 2 nsec to decode an instruction, 3 nsec to fetch…
Question: Show the execution of your program on the above pipelined processor for k = 6 by drawing a diagram. Assume that the fetched and decoded instructions are stored in an instruction window IW with unlimited capacity (and so you can store any number of instructions in the IW). Explain where and why delay slots Assume that the processor can do out-of-order execution to speed up the completion of the program. Assume that there is only one bus, and that the fetching of instructions uses this bus. So the fetching of an instruction can conflict with a stage where an instruction accesses memory. A Instruction Set Architecture A.1 Instruction set We present a list of instructions typical of a RISC (reduced instruction set computer) machine. In data-movement and control instructions, the addresses may be immediate #X, direct (memory) M, indirect (memory) [M], register r, or register indirect [r] addresses. Data-processing instructions use immediate or register…