Computer Systems: A Programmer's Perspective (3rd Edition)
ISBN: 9780134092669
Author: Randal E. Bryant, David R. O'Hallaron
Publisher: PEARSON
Chapter 5, Problem 5.16HW
Program Plan Intro

Given C Code:

void inner4(vec_ptr u, vec_ptr v, data_t *dest)
{
    long i;
    long length = vec_length(u);
    data_t *udata = get_vec_start(u);
    data_t *vdata = get_vec_start(v);
    data_t sum = (data_t) 0;

    for (i = 0; i < length; i++)
    {
        sum = sum + udata[i] * vdata[i];
    }
    *dest = sum;
}

Cycles per element (CPE):

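  • CPE expresses a program's performance as the number of clock cycles spent per element processed, which makes it a useful guide when improving code.
  • It gives a detailed, loop-level view of the performance of an iterative program.
  • It is appropriate for programs that perform a repetitive computation, such as processing the elements of a vector.
  • The measure is stated in cycles rather than seconds because the sequencing of a processor's activities is controlled by a clock that provides a regular signal of some frequency.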
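As a rough illustration of how a CPE figure can be obtained, the sketch below fits a straight line to measured cycle counts for several vector lengths and reports the slope as the CPE. It uses an ordinary least-squares fit, which is one common choice and not something the text above prescribes. The function name `estimate_cpe` and the idea of passing in pre-measured cycle counts are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: estimate cycles per element (CPE) as the
 * least-squares slope of cycle count versus number of elements.
 * The intercept (fixed setup overhead) is not returned.
 */
double estimate_cpe(const long *n, const double *cycles, size_t count)
{
    assert(count >= 2);   /* a line needs at least two points */

    double sum_n = 0.0, sum_c = 0.0, sum_nn = 0.0, sum_nc = 0.0;
    for (size_t k = 0; k < count; k++) {
        double x = (double) n[k];
        sum_n  += x;
        sum_c  += cycles[k];
        sum_nn += x * x;
        sum_nc += x * cycles[k];
    }

    double denom = (double) count * sum_nn - sum_n * sum_n;
    assert(denom != 0.0); /* requires at least two distinct lengths */

    return ((double) count * sum_nc - sum_n * sum_c) / denom;
}
```

For example, if made-up measurements followed cycles = 100 + 3n exactly, the function would return a CPE of about 3, with the constant 100 absorbed by the intercept.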

Loop unrolling:

  • Loop unrolling is a program transformation that reduces the number of iterations of a loop.
  • It does so by increasing the number of elements computed in each iteration.
  • It reduces the number of operations that do not contribute directly to the program's result, such as loop indexing and conditional branching.
  • It can expose further transformations that reduce the number of operations on the critical path of the overall computation.
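The points above can be sketched on the inner-product loop. This is a minimal sketch, not the book's solution: it works on plain arrays instead of `vec_ptr` so that it is self-contained, assumes `data_t` is `double`, and picks an unrolling factor of 6 with the six products grouped in parentheses before being added to `sum`. The names `inner_plain` and `inner_unrolled6` are hypothetical.

```c
#include <assert.h>

typedef double data_t;   /* assumption: the element type is double */

/* Baseline: one element per iteration, same loop body as inner4. */
void inner_plain(const data_t *u, const data_t *v, long length, data_t *dest)
{
    data_t sum = (data_t) 0;
    for (long i = 0; i < length; i++)
        sum = sum + u[i] * v[i];
    *dest = sum;
}

/* Unrolled by 6: six elements per iteration, one accumulator. */
void inner_unrolled6(const data_t *u, const data_t *v, long length, data_t *dest)
{
    assert(length >= 0);

    long i;
    long limit = length - 5;   /* last index at which i..i+5 are all valid */
    data_t sum = (data_t) 0;

    for (i = 0; i < limit; i += 6) {
        /* The parenthesized products do not depend on sum, so only the
           final addition to sum is part of the loop-carried chain. */
        sum = sum + (u[i]     * v[i]     + u[i + 1] * v[i + 1] +
                     u[i + 2] * v[i + 2] + u[i + 3] * v[i + 3] +
                     u[i + 4] * v[i + 4] + u[i + 5] * v[i + 5]);
    }

    /* Finish any remaining elements one at a time. */
    for (; i < length; i++)
        sum = sum + u[i] * v[i];

    *dest = sum;
}
```

The unrolled loop performs one index update and one comparison per six elements instead of per element. Because floating-point addition is not associative, the regrouped sum may differ from the baseline in the last bits for general inputs, and any actual speedup depends on the processor and compiler.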
