EPYC / Skylake vs Power9 STREAM Memory bandwidth comparison via Zaius / Barreleye G2

Using the STREAM benchmark to measure memory bandwidth is an industry-standard practice, and I followed it here. For the x86 systems, to stay unbiased, I took the ‘Stream Triad’ results from a reputable benchmarking organization (Anandtech).

Power9 CPU Config used for STREAM testing:

root@ubuntu:/home/ubuntu# lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 176
Thread(s) per core: 4
Core(s) per socket: 22
Socket(s): 2
NUMA node(s): 2
Model: 2.2 (pvr 004e 1202)
Model name: POWER9, altivec supported

Memory Config used for STREAM testing:

16x 16GiB RDIMM DDR4 2666 MHz (0.4ns)

Theoretical memory bandwidth calculation on Barreleye G2:

= 8 (channels per socket) * 8 (bytes per transfer) * 2.666 (GT/s) * 2 (sockets)

= 8 * 8 * 2.666 * 2 = 341.248 GB/s
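
As a quick sanity check, the same arithmetic can be written as a few lines of Python. This is only a sketch of the calculation above; the function and parameter names are made up for illustration, and the result is in decimal GB/s (10^9 bytes per second).

```python
# Sketch of the theoretical peak calculation above.
# All names here are illustrative, not part of any tool.
def peak_bandwidth_gb_s(channels_per_socket, bytes_per_transfer,
                        transfers_gt_s, sockets):
    """Peak DRAM bandwidth in GB/s (decimal) for a multi-socket system."""
    return channels_per_socket * bytes_per_transfer * transfers_gt_s * sockets

# Barreleye G2 values from the calculation above:
# 8 channels per socket, 8 bytes per transfer, 2.666 GT/s, 2 sockets.
peak = peak_bandwidth_gb_s(8, 8, 2.666, 2)
print(f"Theoretical peak: {peak:.3f} GB/s")
```

Changing a single argument (for example the transfer rate) shows how the peak scales with each factor.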

Compile and run instructions for the measurement:

wget http://www.cs.virginia.edu/stream/FTP/Code/stream.c

gcc -m64 -O3 -fopenmp -DSTREAM_ARRAY_SIZE=536895856 -DNTIMES=20 -mcmodel=large stream.c -o stream

OMP_NUM_THREADS=44 GOMP_CPU_AFFINITY=0-175:4 ./stream
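
To make the run parameters easier to read, here is a small Python sketch that works out what the two commands above imply: the total memory touched by the arrays, and which logical CPUs the 44 threads are pinned to. It assumes the stream.c defaults (three arrays of 8-byte doubles) and that logical CPUs are numbered so that each consecutive group of four belongs to one core, which is worth confirming with lscpu -e on the actual machine. The helper name expand_affinity is hypothetical and only handles a single first-last:stride range.

```python
# Sketch only: sizes and pinning implied by the compile and run commands.
STREAM_ARRAY_SIZE = 536895856  # from -DSTREAM_ARRAY_SIZE
BYTES_PER_ELEMENT = 8          # assumption: default STREAM_TYPE is double
NUM_ARRAYS = 3                 # stream.c operates on three arrays (a, b, c)
THREADS_PER_CORE = 4           # from the lscpu output

footprint_bytes = STREAM_ARRAY_SIZE * BYTES_PER_ELEMENT * NUM_ARRAYS
footprint_gib = footprint_bytes / 2**30

def expand_affinity(spec):
    """Expand one 'first-last:stride' range into a list of CPU numbers."""
    cpu_range, _, stride = spec.partition(":")
    first, last = (int(x) for x in cpu_range.split("-"))
    return list(range(first, last + 1, int(stride or 1)))

cpus = expand_affinity("0-175:4")
# Assumption: logical CPUs 4k..4k+3 are the SMT threads of core k.
cores = sorted({cpu // THREADS_PER_CORE for cpu in cpus})

print(f"Array footprint: {footprint_gib:.2f} GiB")
print(f"Pinned CPUs: {len(cpus)}, first {cpus[:3]}, last {cpus[-1]}")
print(f"Distinct cores used: {len(cores)}")
```

Under those assumptions, the stride of 4 places one OpenMP thread on each physical core rather than stacking several threads on the same core.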

Results:

Stream Application   Barreleye G2      Barreleye G2      AMD EPYC       2x Intel
                     2x 22 core        2x 22 core        32c 7601       Skylake 8176
                     (2400 MHz), gcc   (2666 MHz), gcc   (Anandtech)    (Anandtech)

Stream Copy (MB/s)   217909.8          241641.7          -              -
Stream Add (MB/s)    240561.6          253784            -              -
Stream Scale (MB/s)  245069.7          268929.6          -              -
Stream Triad (MB/s)  247078.8          270000.4          207000         165000
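
To put the Triad numbers in context, the sketch below compares the 2666 MHz Barreleye G2 result against the 341.248 GB/s theoretical peak and against the two Anandtech x86 figures. It uses only values from this post; the dictionary keys and variable names are made up for illustration, and it takes STREAM's MB/s to mean 10^6 bytes per second, so the peak is converted with a factor of 1000.

```python
# Sketch: Triad efficiency and ratios, using only numbers from this post.
PEAK_MB_S = 8 * 8 * 2.666 * 2 * 1000  # 341.248 GB/s expressed in MB/s

triad_mb_s = {
    "Barreleye G2 (2666 MHz)": 270000.4,
    "AMD EPYC 7601 (Anandtech)": 207000.0,
    "2x Intel Skylake 8176 (Anandtech)": 165000.0,
}

power9 = triad_mb_s["Barreleye G2 (2666 MHz)"]

# Fraction of the theoretical peak reached by the measured Triad result.
efficiency = power9 / PEAK_MB_S

# Ratio of the Power9 Triad result to each x86 Triad result.
ratios = {
    name: power9 / value
    for name, value in triad_mb_s.items()
    if name != "Barreleye G2 (2666 MHz)"
}

print(f"Triad vs theoretical peak: {efficiency:.1%}")
for name, ratio in ratios.items():
    print(f"Barreleye G2 vs {name}: {ratio:.2f}x")
```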

 

Pictorial representation of results:

[Image: Screen Shot 2018-07-19 at 9.41.34 AM.png]
