Computer Systems: A Programmer's Perspective (3rd Edition)
ISBN: 9780134092997
Author: Bryant
Publisher: PEARSON
Question
Chapter 3.5, Problem 3.8PP
Program Plan Intro
Unary and Binary Operations:
- Unary operations:
  - The single operand serves as both source and destination.
  - The operand can be either a register or a memory location.
  - For example, the instruction “incq (%rsp)” increments the 8-byte element at the top of the stack.
  - Similarly, the instruction “decq (%rsp)” decrements the 8-byte element at the top of the stack.
- Binary operations:
  - The first operand is the source.
  - The second operand serves as both source and destination.
  - The first operand can be an immediate value, a register, or a memory location.
  - The second operand can be either a register or a memory location.
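The operand roles listed above can be sketched in C. This is only an illustrative model, not how the processor is implemented: the function names simply borrow the instruction mnemonics, a pointer stands in for a destination that may be a register or a memory location, and a plain value stands in for a source that may be an immediate, a register, or a value read from memory.

```c
#include <stdint.h>

/* Hypothetical helpers that mimic operand roles; they are not real
 * instructions, only a model of which operand is read and written. */

/* Unary: the single operand is both source and destination. */
void incq(uint64_t *dst) { *dst += 1; }
void decq(uint64_t *dst) { *dst -= 1; }

/* Binary: the first operand is the source; the second operand is
 * read (as a source) and then overwritten (as the destination). */
void addq(uint64_t src, uint64_t *dst) { *dst += src; }
void subq(uint64_t src, uint64_t *dst) { *dst -= src; }
```

Under this model, calling `incq(&stack_top)` on a variable standing in for the 8-byte element at the top of the stack corresponds to the instruction operating on the location addressed by “%rsp”.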
Example:
An example of a “subq” instruction is shown below:
subq %rax, %rdx
Here, “%rax” and “%rdx” denote registers. The instruction subtracts the value in “%rax” from the value in “%rdx” and stores the result in “%rdx”; “%rax” is left unchanged.
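As a rough illustration of this example, the effect of the instruction can be modeled in C. The `struct regs` type and the function name are assumptions made for this sketch only; unsigned 64-bit arithmetic is used so that a result below zero wraps around modulo 2^64, which yields the same bit pattern as two's-complement subtraction.

```c
#include <stdint.h>

/* Hypothetical register file holding only the two registers used here. */
struct regs {
    uint64_t rax;
    uint64_t rdx;
};

/* Models `subq %rax, %rdx`: rdx = rdx - rax, with rax left unchanged. */
void subq_rax_rdx(struct regs *r) {
    r->rdx -= r->rax;
}
```

With “%rax” holding 3 and “%rdx” holding 10, the model leaves 7 in “%rdx”. With “%rax” holding 1 and “%rdx” holding 0, the result wraps to 0xFFFFFFFFFFFFFFFF.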
Students have asked these similar questions
1. We wish to compare the performance of two different machines: M1 and M2. The following measurements have been made on these machines:
| Program | Time on M1 | Time on M2 |
| 1 | 10 seconds | 5 seconds |
| 2 | 3 seconds | 4 seconds |
Which machine is faster for each program, and by how much?
2. For M1 and M2 of problem 1, the following additional measurements are made:
| Program | Instructions executed on M1 | Instructions executed on M2 |
| 1 | 200 × 10^6 | 160 × 10^6 |
Find the instruction execution rate (instructions per second) for each machine when running program 1.
3. For M1 and M2 of problem 1, if the clock rates are 200 MHz and 300 MHz, respectively, find the CPI for program 1 on both machines using the data provided in problems 1 and 2.
4. You are going to enhance a machine, and there are two possible improvements: either make multiply instructions run four times faster than before or make memory access instructions run two times faster than before. You…
1. BL = 00; after the instruction DEC BL is executed, CF = ?
2. CH = 80H; after ROL CH, 1; CH = ?
1. Develop a mathematical model for measuring performance based on overall memory
access time with a neat diagram for the following memory design and derive the
formula to calculate the Overall Memory Access Time.
Main Memory : 1
Internal Cache : 1
External Cache: 1
Register S and Register B have fastest access time:
Data Search order [ Registers – Internal Cache – External Cache – Memory]
[Hint: Register access time is considered negligible]
Similar questions
- A program has the following breakdown: 25% ld (50% of them directly followed by a dependent instruction), 25% sd, 30% r_type, 20% beq (80% of them are taken). Branches are calculated in the third cycle. What is the average CPI of the program when run on the pipelined RISC V implementation in the textbook?
- Section 1.0 cites as a pitfall the utilization of a subset of the performance equation as a performance metric. To illustrate this, consider the following two processors. P1 has a clock rate of 4GHz, average CPI of 0.9, and requires the execution of 5.0E9 instructions. P2 has a clock rate of 3GHz, an average CPI of 0.75, and requires the execution of 1.0E9 instructions.
  (1) A common fallacy is to use MIPS to compare the performance of two different processors, and consider that the processor with the largest MIPS has the largest performance. Check if this is true for P1 and P2.
  (2) Another common performance figure is MFLOPS, defined as MFLOPS = No. FP operations / (execution time x 1E6), but this figure has the same problems as MIPS. Assume that 40% of the instructions executed on both P1 and P2 are floating-point instructions. Find the MFLOPS figures for the processors.
- Consider the following code:
      li $t2, 2
  L1: add $t1, $t1, $t2
      sub $t1, $t1, $t3
      bne $t1, $t4, L1
      sub $t4, $s0, $t3
  Given the modified single-cycle processor shown below, what are the values (in binary) of instruction[31-26], instruction[25-21], instruction[20-16], instruction[15-11], instruction[5-0], Read data 1, Read data 2, ALU zero, PCSrc, and all the main control decoded output signals when the time is at 1950 ns. The below single-cycle processor diagram can be used for your reference. Note: A new decoded signal output “Tzero” is added for executing “bne” instruction. The signal definition is described below:
  | Instruction | Opcode | New Main Control Output Signal |
  | beq | 00100b (4d) | Tzero = 0 |
  | bne | 00101b (5d) | Tzero = 1 |
  At the moment of 1950 ns, the below values (0, 1 or X) are: instruction[31-26] = instruction[25-21] = instruction[20-16] = instruction[15-0] = Read data 1 output = Read data 2 output = RegDst = ALUSrc = MemtoReg = RegWrite =…
- 4.19.16: [5] <COD §4.6>. In this exercise, we examine how pipelining affects the clock cycle time of the processor. Problems in this exercise assume that individual stages of the datapath have the following latencies: Also, assume that instructions executed by the processor are broken down as follows:
  (a) What is the clock cycle time in a pipelined and non-pipelined processor?
  (b) What is the total latency of an lw instruction in a pipelined and non-pipelined processor?
  (c) If we can split one stage of the pipelined datapath into two new stages, each with half the latency of the original stage, which stage would you split and what is the new clock cycle time of the processor?
  (d) Assuming there are no stalls or hazards, what is the utilization of the data memory?
  (e) Assuming there are no stalls or hazards, what is the utilization of the write-register port of the "Registers" unit?
  No hand written and fast answer with explanation
- Consider that each instruction requires 6 steps (phases) to execute and each step (phase) takes 5 seconds to complete. Also, within each instruction there is a 3-second time gap between the completion of one step (phase) and the beginning of the next step (phase).
  b. Design and reflect on a suitable pipeline technique with which the time taken for executing all 3 programs can be further improved. Calculate the improved time using this method, considering that each instruction requires the same number of steps to complete, the time taken for each step is the same, and the time gap between steps is also the same.
- (Practice) a. Using Figure 2.14 and assuming the variable name rate is assigned to the byte at memory address 159, determine the addresses corresponding to each variable declared in the following statements. Also, fill in the correct number of bytes with the initialization data included in the declaration statements. (Use letters for the characters, not the computer codes that would actually be stored.)
      float rate;
      char ch1 = 'M', ch2 = 'E', ch3 = 'L', ch4 = 'T';
      double taxes;
      int num, count = 0;
  b. Repeat Exercise 9a, but substitute the actual byte patterns that a computer using the ASCII code would use to store characters in the variables ch1, ch2, ch3, and ch4. (Hint: Use Appendix B.)
- Problem 4: Give a block diagram for a 8M x 32 memory using 512K x 8 memory ch [Hints: Figure 5.10 in the book]
- help for the mips code. dont use AI, divi is not using in mips.
  Q1) Suppose $t1 stores the base address of word array A and $s2 is associated with h, convert to the following instruction into MIPS.
      if A[m+3] < 20:
          A[m+1] = 5
      else:
          A[m] = 1
  Q2) Assume only $s1, $s2, $a0, $v0 registers can be used. Procedure calling convention MUST be followed.
      def func(x):
          a = x/3
          if a == 20:
              return a
          else:
              return 1
- (15pt) Assume that instruction cache miss rate is 2%, data cache miss rate is 10%, CPI (clock cycle per instruction) is 2 without any memory stall, and miss penalty is 100 cycles. In addition, assume that the frequency of loads/stores is 30%.
  (a) Compute CPI with memory stall.
  (b) When CPI without any memory stall becomes 1, compute CPI with memory stall.
  (c) If the CPU clock rate is doubled with the same memory when CPI without memory stall is 2, compute CPI with memory stall.
- The importance of having a good branch predictor depends on how often conditional branches are executed. Together with branch predictor accuracy, this will determine how much time is spent stalling due to mispredicted branches. In this exercise, assume that the breakdown of dynamic instructions into various instruction categories is as follows:
  | R-Type | BEQ | JMP | LW | SW |
  | 40% | 25% | 5% | 25% | 5% |
  Also, assume the following branch predictor accuracies:
  | Always-taken | Always-not-taken | 2-bit |
  | 40% | 60% | 75% |
  1.1 Stall cycles due to mispredicted branches increase the CPI. What is the extra CPI due to mispredicted branches with the always-taken predictor? Assume that branch outcomes are determined in the EX stage, that there are no data hazards, and that no delay slots are used.
  1.2 Repeat 1.1 for the “always-not-taken” predictor.
  1.3 Repeat 1.1 for the 2-bit predictor.
  1.4 With the 2-bit predictor, what speedup would be achieved if we could convert half of the branch instructions in a way…
- What will be the contents of AX, BX, CX and DX registers after the execution of the following program?
        MOV DX, 1234
        MOV CL, 0F
        MOV BH, AB
        MOV AX, 6589
        ADD DX, AX
        MOV AH, 12
  Lab1: OR AH, CL
        MOV DL, 41
        MOV BX, F0
        MOV CL, 3A
        LOOP Lab1
        INC BX
        SHR BX,2
- A system is using segmentation to map physical memory. Current segment table is as follows. Some of the entries are stored in associative registers as given in second table. Assume that the register access time is 10 nanoseconds and memory access time is (10 x 7) nanoseconds. Find the physical memory address for each of the following logical memory addresses given by <Segment no, offset>. Calculate effective memory access time for each: (a) <0,3700> (b) <2,3780> (c) <1,200>
Recommended textbooks for you
C++ for Engineers and Scientists
Computer Science
ISBN: 9781133187844
Author: Bronson, Gary J.
Publisher: Course Technology Ptr