Question

For each of the following two code segments, decide whether it is suitable for parallel execution and justify your answer: either add OpenMP pragmas to make the loop parallel, or briefly explain why the code segment is not suitable for parallel execution.

(A):

    for (i = 0; i < n; i++) {
        x[i] = 3 * i + 5;
        y[i] = log(x[i]);
    }

(B):

    x[0] = 1;
    x[1] = 2;
    for (i = 2; i < n; i++)
        x[i] = x[i - 1] * x[i - 2];

Answer 1

Question

Chapter 5, Problem 5.16HW

Program Plan Intro

Given C Code:

    /* Inner product of u and v; accumulates in a local temporary */
    void inner4(vec_ptr u, vec_ptr v, data_t *dest)
    {
        long i;
        long length = vec_length(u);
        data_t *udata = get_vec_start(u);
        data_t *vdata = get_vec_start(v);
        data_t sum = (data_t) 0;

        for (i = 0; i < length; i++) {
            sum = sum + udata[i] * vdata[i];
        }
        *dest = sum;
    }

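Cycles per element (CPE):

CPE expresses a program's performance as the number of clock cycles spent per element processed, a measure that helps guide code improvement.

It gives a detailed, loop-level view of performance for an iterative program.

It is appropriate for programs that perform a repetitive computation, such as processing the elements of a vector.

The sequencing of a processor's activities is controlled by a clock that provides a regular signal at some frequency, which is why performance is expressed in clock cycles rather than in seconds.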
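As an illustration of how a CPE figure can be obtained, the sketch below assumes the run time has been measured (in cycles) for several vector lengths and fits the line cycles = overhead + CPE * n by least squares; the slope of that line is the CPE. The function name `estimate_cpe` and its parameters are hypothetical, introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Estimate cycles per element (CPE) as the slope of the least-squares
 * line  cycles = overhead + CPE * n  through `count` measurements.
 * n[k] is the number of elements, cycles[k] the measured cycle count.
 * (Hypothetical helper, not part of the original problem.)
 */
double estimate_cpe(const long *n, const double *cycles, size_t count)
{
    assert(count >= 2);          /* a line needs at least two points */

    double sum_n = 0.0, sum_c = 0.0, sum_nn = 0.0, sum_nc = 0.0;
    for (size_t k = 0; k < count; k++) {
        double nk = (double) n[k];
        sum_n  += nk;
        sum_c  += cycles[k];
        sum_nn += nk * nk;
        sum_nc += nk * cycles[k];
    }

    double denom = (double) count * sum_nn - sum_n * sum_n;
    assert(denom != 0.0);        /* all n[k] equal: slope undefined */

    /* Standard least-squares slope formula */
    return ((double) count * sum_nc - sum_n * sum_c) / denom;
}
```

For made-up measurements that lie exactly on cycles = 50 + 3n, for example (100, 350), (200, 650) and (400, 1250), the function returns a slope of 3, meaning a CPE of 3 with a fixed overhead of 50 cycles.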

Loop unrolling:

Loop unrolling is a program transformation that reduces the number of iterations of a loop.

It does so by increasing the number of elements computed in each iteration.

It reduces the number of operations that do not contribute directly to the program's result, such as loop indexing and conditional branching.

It can also open the way to further transformations that reduce the number of operations on the critical path of the overall computation.
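The points above can be sketched on the inner product routine. The code below is a minimal illustration, not necessarily the textbook's solution: it unrolls the loop by six and groups the six products before adding them to the accumulator, so only one addition per iteration depends on the previous value of `sum`. The `vec_rec` type, the `double` element type, and the name `inner_unroll6` are assumptions made so the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

typedef double data_t;            /* assumption: element type */

/* Hypothetical minimal vector type standing in for vec_ptr */
typedef struct {
    long len;
    data_t *data;
} vec_rec, *vec_ptr;

long vec_length(vec_ptr v)       { return v->len; }
data_t *get_vec_start(vec_ptr v) { return v->data; }

/* Baseline: one element per iteration */
void inner4(vec_ptr u, vec_ptr v, data_t *dest)
{
    long i;
    long length = vec_length(u);
    data_t *udata = get_vec_start(u);
    data_t *vdata = get_vec_start(v);
    data_t sum = (data_t) 0;

    for (i = 0; i < length; i++) {
        sum = sum + udata[i] * vdata[i];
    }
    *dest = sum;
}

/* Unrolled by six, single accumulator, products grouped first */
void inner_unroll6(vec_ptr u, vec_ptr v, data_t *dest)
{
    long i;
    long length = vec_length(u);
    long limit = length - 5;      /* last index where i+5 is valid, plus 1 */
    data_t *udata = get_vec_start(u);
    data_t *vdata = get_vec_start(v);
    data_t sum = (data_t) 0;

    /* Main loop: six elements per iteration */
    for (i = 0; i < limit; i += 6) {
        sum = sum + (udata[i]   * vdata[i]
                   + udata[i+1] * vdata[i+1]
                   + udata[i+2] * vdata[i+2]
                   + udata[i+3] * vdata[i+3]
                   + udata[i+4] * vdata[i+4]
                   + udata[i+5] * vdata[i+5]);
    }

    /* Finish any remaining elements one at a time */
    for (; i < length; i++) {
        sum = sum + udata[i] * vdata[i];
    }
    *dest = sum;
}
```

Both functions compute the same inner product for integer data. With floating-point data, regrouping the additions can change rounding slightly, so the results may differ in the last bits.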

Students have asked these similar questions

Implement the following function using decoders: F(A,B,C,D,E) = ∑(0,1,5,15,24,25,27) + d(2,4,20,21,22,29), using only 2×4 decoder(s).

OpenMP C++: With the following code, create 3 versions:

Outer loop parallelism: use a single OpenMP pragma only at the outer loop.

Inner loop parallelism: use a single OpenMP pragma only at the inner loop (use reduction).

Nested loop parallelism: use pragmas at both the outer loop and the inner loop.

    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        #pragma omp parallel for
        for (int j = 0; j < n; j++) {
            y[i] += A[i * n + j] * x[j];
        }
    }