

**High Performance Computing**

## Part 2

**Errata Corrige**

**Page 9**

Figure:

**Page 32**

Last but one paragraph: eliminate parentheses.

**Page 70**

3<sup>rd</sup> paragraph – The new version is:

In an SMP architecture, in which the memory accesses are statistically uniformly distributed over the $m_M$ macro-modules, $p$ can be estimated as

$$p = \frac{N}{B_M (m_M, N)}$$

where  $N$  is the number of PEs, and  $B_M$  is the interleaved memory bandwidth evaluated in Section 2.4.2.

For example, with $N = 64$ and $m_M = 8$ we have $p \approx 8$, while for $m_M = 64$, $p$ reduces to about 1.6.
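The two figures above can be checked numerically. The sketch below is only illustrative: the expression for $B_M$ from Section 2.4.2 is not reproduced in this errata, so the code assumes the classic model of interleaved memory bandwidth under independent, uniformly distributed requests, $B_M(m, N) = m\,[1 - (1 - 1/m)^N]$. The function names `interleaved_bandwidth` and `estimate_p` are hypothetical. Under that assumption the computed values agree with the ones quoted in the example.

```python
def interleaved_bandwidth(m: int, n: int) -> float:
    """Assumed model (not taken from this errata): expected number of
    distinct modules hit by n independent, uniformly distributed
    requests over m interleaved modules."""
    return m * (1.0 - (1.0 - 1.0 / m) ** n)


def estimate_p(n_pes: int, m_macro: int) -> float:
    """p = N / B_M(m_M, N), as in the corrected paragraph."""
    return n_pes / interleaved_bandwidth(m_macro, n_pes)


if __name__ == "__main__":
    n = 64  # number of PEs used in the example
    for m in (8, 16, 32, 64):
        print(f"m_M = {m:2d}  ->  p = {estimate_p(n, m):.2f}")
```

With $N = 64$, the estimate is close to 8 for $m_M = 8$ and close to 1.6 for $m_M = 64$, and it decreases steadily as the number of macro-modules grows.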

**Page 76**

Add the following paragraph:

In SMP architectures, since

$$p = \frac{N}{B_M (m_M, N)}$$

(see Section 4.3), *low-p* mappings are mainly achieved with a large number of interleaved macro-modules.

**Page 113**

Replace formula

$$L_{sync} = 4 R_Q$$

with

$$L_{sync} = 4 \Omega$$

**Page 38**

2<sup>nd</sup> paragraph:

However, more sophisticated design styles can be conceived that make the exploitation of reuse possible in the non-automatic approach too.

**Page 65**

In principle, ~~no~~ there is no reason for synchronous writing, because possible “delayed”

**Page 96**

1<sup>st</sup> paragraph:

(READ\_REQ, WRITE\_REQ, WT\_REQ, WB\_REQ) and through the most significant bits.