A.
Assembly code for Conditional jump:
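# long absSum(long *start, long count)
# start in %rdi, count in %rsi
absSum:
    irmovq $8, %r8          # Constant 8
    irmovq $1, %r9          # Constant 1
    xorq %rax, %rax         # sum = 0
    andq %rsi, %rsi         # Set condition codes from count
    jmp test                # Go to test
loop:
    mrmovq (%rdi), %r10     # x = *start
    xorq %r11, %r11         # Constant 0
    subq %r10, %r11         # -x
    jle pos                 # Skip the move if -x <= 0
    rrmovq %r11, %r10       # x = -x
pos:
    addq %r10, %rax         # Add |x| to sum
    addq %r8, %rdi          # start++
    subq %r9, %rsi          # count--, set condition codes
test:
    jne loop                # Repeat while count != 0
    ret                     # Return sum in %rax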
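For readers who prefer C, the loop above can be sketched roughly as follows. This is an illustrative translation, not part of the textbook solution: the name abs_sum_jump is hypothetical, and the sketch assumes no element equals LONG_MIN (whose negation overflows in C).

```c
#include <stddef.h>

/* Hypothetical C rendering of the conditional-jump version.
 * Assumption: no element is LONG_MIN, so -x does not overflow. */
long abs_sum_jump(const long *start, long count)
{
    long sum = 0;                 /* xorq %rax, %rax */
    while (count != 0) {          /* test: jne loop */
        long x = *start;          /* mrmovq (%rdi), %r10 */
        long neg = -x;            /* xorq %r11, %r11 ; subq %r10, %r11 */
        if (neg > 0) {            /* jle pos skips the move when -x <= 0 */
            x = neg;              /* rrmovq %r11, %r10 */
        }
        sum += x;                 /* addq %r10, %rax */
        start++;                  /* addq %r8, %rdi (8 bytes per long) */
        count--;                  /* subq %r9, %rsi */
    }
    return sum;
}
```

Calling abs_sum_jump on the array {3, -4, 0, -5} with count 4 adds 3 + 4 + 0 + 5.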
Assembly code for Conditional move:
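# long absSum(long *start, long count)
# start in %rdi, count in %rsi
absSum:
    irmovq $8, %r8          # Constant 8
    irmovq $1, %r9          # Constant 1
    xorq %rax, %rax         # sum = 0
    andq %rsi, %rsi         # Set condition codes from count
    jmp test                # Go to test
loop:
    mrmovq (%rdi), %r10     # x = *start
    xorq %r11, %r11         # Constant 0
    subq %r10, %r11         # -x
    cmovg %r11, %r10        # If -x > 0 then x = -x
    addq %r10, %rax         # Add |x| to sum
    addq %r8, %rdi          # start++
    subq %r9, %rsi          # count--, set condition codes
test:
    jne loop                # Repeat while count != 0
    ret                     # Return sum in %rax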
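A rough C counterpart of the conditional-move version replaces the if statement with a conditional expression. The name abs_sum_cmov is hypothetical, the sketch again assumes no element equals LONG_MIN, and whether a given C compiler actually emits a conditional move for this expression is not guaranteed.

```c
#include <stddef.h>

/* Hypothetical C rendering of the conditional-move version.
 * Assumption: no element is LONG_MIN, so -x does not overflow. */
long abs_sum_cmov(const long *start, long count)
{
    long sum = 0;                     /* xorq %rax, %rax */
    while (count != 0) {              /* test: jne loop */
        long x = *start;              /* mrmovq (%rdi), %r10 */
        long neg = -x;                /* xorq %r11, %r11 ; subq %r10, %r11 */
        x = (neg > 0) ? neg : x;      /* cmovg %r11, %r10 */
        sum += x;                     /* addq %r10, %rax */
        start++;                      /* addq %r8, %rdi */
        count--;                      /* subq %r9, %rsi */
    }
    return sum;
}
```

The loop body now has no data-dependent branch; the only remaining branch is the loop test itself.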
Processing stages:
- Processing an instruction involves a number of operations.
- The operations are organized into a particular sequence of stages.
- The processor attempts to follow a uniform sequence for all instructions.
- The stages are described below:
- Fetch:
  - The program counter “PC” is used as the memory address from which the instruction bytes are read.
  - The two 4-bit portions of the instruction specifier byte, “icode” and “ifun”, are extracted.
  - Where the instruction has one, an 8-byte constant “valC” is fetched.
  - “valP” is computed as the value of “PC” plus the length of the fetched instruction.
- Decode:
  - Up to two operands are read from the register file.
  - This gives the values “valA” and “valB”.
  - The registers read are typically those designated by the instruction fields “rA” and “rB”.
- Execute:
  - The ALU either performs the operation required by the instruction or increments or decrements the stack pointer.
  - The resulting value is termed “valE”.
  - For a conditional move, the condition codes are evaluated and the destination register is updated only if the condition holds.
  - For a jump instruction, this stage determines whether or not the branch should be taken.
- Memory:
  - Data is either written to memory or read from memory in this stage.
  - The value that is read is termed “valM”.
- Write back:
  - The results are written to the register file.
  - Up to two results can be written.
- PC update:
  - The program counter “PC” holds the memory address from which instruction bytes are read.
  - In this stage it is set to the address of the next instruction.
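The Fetch and Execute steps listed above can be sketched in C for the handful of instructions that absSum uses. This is a minimal illustration rather than a full simulator: the names fetch_result, fetch, and cond are hypothetical, and instructions not used by absSum (such as rmmovq, call, pushq, and popq) are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Y86-64 instruction codes used by absSum (high nibble of byte 0). */
enum { I_RRMOVQ = 0x2, I_IRMOVQ = 0x3, I_MRMOVQ = 0x5,
       I_OPQ = 0x6, I_JXX = 0x7, I_RET = 0x9 };

/* Function codes shared by jXX and cmovXX (low nibble of byte 0). */
enum { C_ALWAYS = 0, C_LE = 1, C_L = 2, C_E = 3,
       C_NE = 4, C_GE = 5, C_G = 6 };

typedef struct {
    unsigned icode;   /* instruction code */
    unsigned ifun;    /* instruction function */
    uint64_t valP;    /* address of the following instruction */
} fetch_result;

/* Fetch: split the specifier byte and compute valP = PC + length. */
fetch_result fetch(uint64_t pc, uint8_t byte0)
{
    fetch_result r;
    r.icode = byte0 >> 4;
    r.ifun = byte0 & 0xF;
    /* A register specifier byte follows for these instruction codes. */
    bool need_regids = r.icode == I_RRMOVQ || r.icode == I_IRMOVQ ||
                       r.icode == I_MRMOVQ || r.icode == I_OPQ;
    /* An 8-byte constant valC follows for these instruction codes. */
    bool need_valC = r.icode == I_IRMOVQ || r.icode == I_MRMOVQ ||
                     r.icode == I_JXX;
    r.valP = pc + 1 + (need_regids ? 1 : 0) + (need_valC ? 8 : 0);
    return r;
}

/* Execute: evaluate a jump or conditional-move condition from the
 * zero (zf), sign (sf), and overflow (of) condition codes. */
bool cond(unsigned ifun, bool zf, bool sf, bool of)
{
    switch (ifun) {
    case C_ALWAYS: return true;
    case C_LE:     return (sf != of) || zf;
    case C_L:      return sf != of;
    case C_E:      return zf;
    case C_NE:     return !zf;
    case C_GE:     return sf == of;
    case C_G:      return (sf == of) && !zf;
    default:       return false;
    }
}
```

For example, cmovg has specifier byte 0x26, so fetch reports icode 2 and ifun 6 with valP two bytes past the PC, and cond(C_G, ...) then decides whether the destination register is written.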
Chapter 4 Solutions
Computer Systems: A Programmer's Perspective (3rd Edition)