Problem 1.8 The following code segment, consisting of six instructions, needs to b executed 64 times for the evaluation of vector arithmetic expression: D(I) = A(1) + B(1 xC(I) for 0 ≤I≤ 63. Load R1, B(I) /R1 Load R2, C(I) Multiply R1, R2 Load R3, A(I) Add R3, R1 Store D(I), R3 /R2 /R1 /R3 /R3 Memory (8 + 1)/ (R1) × (R2)/ Memory (7 + 1)/ (R3) + (R1)/ /Memory (0 + I) - (R3)/ Memory (a + 1)/ 1 where R1, R2, and R3 are CPU registers, (R1) is the content of R1, a,6,7, and are the starting memory addresses of arrays B(1), C(I), A(I), and D(I), respectively. Assume four clock cycles for each Load or Store, two cycles for the Add, and eight cycles for the Multiply on either a uniprocessor or a single PE in an SIMD machine. (a) Calculate the total number of CPU cycles needed to execute the above code seg- ment repeatedly 64 times on an SISD uniprocessor computer sequentially, ignoring all other time delays. (b) Consider the use of an SIMD computer with 64 PEs to execute the above vector operations in six synchronized vector instructions over 64-component vector data and both driven by the same-speed clock. Calculate the total execution time on the SIMD machine, ignoring instruction broadcast and other delays. (c) What is the speedup gain of the SIMD computer over the SISD computer?

Problem 1.8 The following code segment, consisting of six instructions, needs to b executed 64 times for the evaluation of vector arithmetic expression: D(I) = A(1) + B(1 xC(I) for 0 ≤I≤ 63. Load R1, B(I) /R1 Load R2, C(I) Multiply R1, R2 Load R3, A(I) Add R3, R1 Store D(I), R3 /R2 /R1 /R3 /R3 Memory (8 + 1)/ (R1) × (R2)/ Memory (7 + 1)/ (R3) + (R1)/ /Memory (0 + I) - (R3)/ Memory (a + 1)/ 1 where R1, R2, and R3 are CPU registers, (R1) is the content of R1, a,6,7, and are the starting memory addresses of arrays B(1), C(I), A(I), and D(I), respectively. Assume four clock cycles for each Load or Store, two cycles for the Add, and eight cycles for the Multiply on either a uniprocessor or a single PE in an SIMD machine. (a) Calculate the total number of CPU cycles needed to execute the above code seg- ment repeatedly 64 times on an SISD uniprocessor computer sequentially, ignoring all other time delays. (b) Consider the use of an SIMD computer with 64 PEs to execute the above vector operations in six synchronized vector instructions over 64-component vector data and both driven by the same-speed clock. Calculate the total execution time on the SIMD machine, ignoring instruction broadcast and other delays. (c) What is the speedup gain of the SIMD computer over the SISD computer?

C++ for Engineers and Scientists

C++ for Engineers and Scientists

4th Edition

ISBN:9781133187844

Author:Bronson, Gary J.

Publisher:Bronson, Gary J.

Chapter2: Problem Solving Using C++using

Section2.3: Data Types

Problem 9E: (Practice) Although the total number of bytes varies from computer to computer, memory sizes of...

See similar textbooks

Related questions

Question

Problem 1.8 The following code segment, consisting of six instructions, needs to be
executed 64 times for the evaluation of vector arithmetic expression: D(I) = A(I) + B(I)
xC(I) for 0 ≤ I ≤ 63.
Load R1, B(I)
/R1 - Memory (a + I)/
Load R2, C(I)
Multiply R1, R2
Load R3, A(I)
Add R3, R1
Store D(I), R3
t
/R2 Memory (8 + 1)/
/R1 - (R1) × (R2)/
/R3
-
Memory (7 + I)/
-
/R3 (R3) + (R1)/
/Memory (0 + I) ← (R3)/
where R1, R2, and R3 are CPU registers, (R1) is the content of R1, a, ß,7, and are
the starting memory addresses of arrays B(1), C(I), A(I), and D(I), respectively. Assume
four clock cycles for each Load or Store, two cycles for the Add, and eight cycles for the
Multiply on either a uniprocessor or a single PE in an SIMD machine.
(a) Calculate the total
ber of CPU cycles needed to execute the above code seg-
ment repeatedly 64 times on an SISD uniprocessor computer sequentially, ignoring
all other time delays.
(b) Consider the use of an SIMD computer with 64 PEs to execute the above vector
operations in six synchronized vector instructions over 64-component vector data
and both driven by the same-speed clock. Calculate the total execution time on
the SIMD machine, ignoring instruction broadcast and other delays.
(c) What is the speedup gain of the SIMD computer over the SISD computer?

Expert Solution

Step by step

Solved in 2 steps

SEE SOLUTION Check out a sample Q&A here

Blurred answer

Knowledge Booster

Pipelining

Learn more about

Need a deep-dive on the concept behind this application? Look no further. Learn more about this topic, computer-science and related others by exploring similar questions and additional content below.

Similar questions

SEE MORE QUESTIONS

Recommended textbooks for you

C++ for Engineers and Scientists

C++ for Engineers and Scientists

Computer Science

ISBN:

9781133187844

Author:

Bronson, Gary J.

Publisher:

Course Technology Ptr

C++ for Engineers and Scientists

C++ for Engineers and Scientists

Computer Science

ISBN:

9781133187844

Author:

Bronson, Gary J.

Publisher:

Course Technology Ptr

SEE MORE TEXTBOOKS