Computer Systems: A Programmer's Perspective (3rd Edition)
ISBN: 9780134092669
Author: Randal E. Bryant, David R. O'Hallaron
Publisher: PEARSON
Question
Chapter 5, Problem 5.16HW
Program Plan Intro
Given C Code:
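/* Inner product of vectors u and v; the result is stored in *dest. */
void inner4(vec_ptr u, vec_ptr v, data_t *dest)
{
    long i;
    long length = vec_length(u);
    data_t *udata = get_vec_start(u);   /* direct access to u's elements */
    data_t *vdata = get_vec_start(v);   /* direct access to v's elements */
    data_t sum = (data_t) 0;            /* accumulate in a local variable */

    for (i = 0; i < length; i++)
    {
        sum = sum + udata[i] * vdata[i];
    }
    *dest = sum;
}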
Cycles per element (CPE):
- CPE expresses the performance of a program as the number of clock cycles it needs per element processed, which helps guide code improvements.
- It helps in understanding the loop performance of an iterative program at a detailed level.
- It is appropriate for programs that perform a repetitive computation.
- The sequencing of the processor's activities is controlled by a clock that provides a regular signal of some frequency, so cycles are a natural unit of measurement.
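As a rough illustration of how a CPE figure can be obtained, the sketch below fits a straight line, cycles = overhead + CPE * n, to a set of timing measurements and returns the slope. The function name estimate_cpe and the sample numbers in the usage note are hypothetical and not taken from the textbook solution.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Estimate cycles per element (CPE) as the least-squares slope of
 * measured cycle counts against the number of elements processed.
 * n[k] is the element count of run k, cycles[k] its measured cycles.
 * Returns 0.0 if all element counts are identical (no slope defined).
 */
double estimate_cpe(const double *n, const double *cycles, size_t count)
{
    double sum_n = 0.0, sum_c = 0.0, sum_nn = 0.0, sum_nc = 0.0;
    double m = (double) count;
    double denom;
    size_t k;

    assert(count >= 2);   /* a slope needs at least two measurements */

    for (k = 0; k < count; k++) {
        sum_n  += n[k];
        sum_c  += cycles[k];
        sum_nn += n[k] * n[k];
        sum_nc += n[k] * cycles[k];
    }

    denom = m * sum_nn - sum_n * sum_n;
    if (denom == 0.0)
        return 0.0;

    return (m * sum_nc - sum_n * sum_c) / denom;
}
```

For made-up measurements of 1268, 2168 and 3068 cycles at 100, 200 and 300 elements, the fitted slope is 9.0 cycles per element; the constant part of the line (368 cycles here) is loop setup overhead and does not count toward the CPE.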
Loop unrolling:
- It is a program transformation that reduces the number of iterations of a loop.
- It increases the number of elements computed in each iteration.
- It reduces the number of operations that do not contribute directly to the program result, such as loop indexing and conditional branching.
- It exposes ways to transform the code further so that fewer operations lie on the critical path of the overall computation.
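The points above can be sketched on the given inner4 code. The version below is only an illustration, not the published solution: it assumes an unrolling factor of 6, regroups the additions so that a single addition per iteration depends on the running sum, and uses a hypothetical vec_rec struct and double as data_t in place of the book's vector package. The name inner4_unrolled is also made up.

```c
#include <assert.h>
#include <stddef.h>

typedef double data_t;   /* assumption: any numeric type could be used */

/* Hypothetical stand-in for the vector type used by inner4. */
typedef struct {
    long len;
    data_t *data;
} vec_rec, *vec_ptr;

static long vec_length(vec_ptr v) { return v->len; }
static data_t *get_vec_start(vec_ptr v) { return v->data; }

/* Inner product, six elements per iteration of the main loop. */
void inner4_unrolled(vec_ptr u, vec_ptr v, data_t *dest)
{
    long i;
    long length = vec_length(u);
    long limit = length - 5;            /* last index where i+5 is valid */
    data_t *udata = get_vec_start(u);
    data_t *vdata = get_vec_start(v);
    data_t sum = (data_t) 0;

    assert(vec_length(v) >= length);    /* v must cover every index used */

    /* Main loop: the six products are summed first, so only one
       addition per iteration waits on the previous value of sum. */
    for (i = 0; i < limit; i += 6) {
        sum = sum + (udata[i]     * vdata[i]
                   + udata[i + 1] * vdata[i + 1]
                   + udata[i + 2] * vdata[i + 2]
                   + udata[i + 3] * vdata[i + 3]
                   + udata[i + 4] * vdata[i + 4]
                   + udata[i + 5] * vdata[i + 5]);
    }

    /* Finish the remaining 0 to 5 elements one at a time. */
    for (; i < length; i++) {
        sum = sum + udata[i] * vdata[i];
    }
    *dest = sum;
}
```

The loop index is updated and tested once per six elements instead of once per element. Because floating-point addition is not associative, regrouping the sum this way can change the rounding of the result slightly compared with inner4; for integer data the result is identical.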