EPYC / Skylake vs Power9 STREAM Memory bandwidth comparison via Zaius / Barreleye G2

Using the STREAM benchmark to measure memory bandwidth is an industry-standard practice, and I followed it here. For the x86 systems, to be unbiased, I picked the ‘Stream Triad’ results from a reputable benchmarking organization (Anandtech).

Power9 CPU Config used for STREAM testing:

root@ubuntu:/home/ubuntu# lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 176
Thread(s) per core: 4
Core(s) per socket: 22
Socket(s): 2
NUMA node(s): 2
Model: 2.2 (pvr 004e 1202)
Model name: POWER9, altivec supported

Memory Config used for STREAM testing:

16x   16GiB RDIMM DDR4 2666 MHz (0.4ns)

Theoretical Memory bandwidth:

Theoretical Memory Bandwidth Calculation on Barreleye G2:

= 8 (channels per socket) * 8 (bytes per transfer) * 2.666 (GT/s) * 2 (sockets)

= 8*8*2.666*2 = 341.248 GB/s
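
The same arithmetic as a small Python sketch, in case you want to plug in a different DIMM speed or socket count. The four factors are the ones from the calculation above; the function name is only illustrative.

```python
def peak_bandwidth_gbs(channels_per_socket, bytes_per_transfer, rate_giga, sockets):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10^9 bytes)."""
    return channels_per_socket * bytes_per_transfer * rate_giga * sockets

# Barreleye G2 as tested: 8 channels per socket, 8 bytes per transfer,
# the 2.666 rate factor for DDR4-2666, and 2 sockets.
peak = peak_bandwidth_gbs(8, 8, 2.666, 2)
print(f"{peak:.3f} GB/s")
```

Calling it with 2.4 instead of 2.666 gives the corresponding figure for DIMMs running at 2400.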

Compiler and run instructions for measurement:

wget http://www.cs.virginia.edu/stream/FTP/Code/stream.c

gcc -m64 -O3 -fopenmp -DSTREAM_ARRAY_SIZE=536895856 -DNTIMES=20 -mcmodel=large stream.c -o stream

OMP_NUM_THREADS=44 GOMP_CPU_AFFINITY=0-175:4 ./stream
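
A quick sanity check on those two command lines, as a rough Python sketch. It assumes stream.c is built with its default element type (double, 8 bytes) and its three arrays a, b and c, and that the four SMT threads of each core have consecutive CPU ids, so a stride of 4 picks one thread per core.

```python
STREAM_ARRAY_SIZE = 536_895_856  # from -DSTREAM_ARRAY_SIZE above
BYTES_PER_ELEMENT = 8            # assumption: default STREAM_TYPE (double)
NUM_ARRAYS = 3                   # stream.c works on three arrays: a, b, c

total_bytes = STREAM_ARRAY_SIZE * BYTES_PER_ELEMENT * NUM_ARRAYS
print(f"array footprint: {total_bytes / 2**30:.2f} GiB")

# GOMP_CPU_AFFINITY=0-175:4 means CPUs 0 through 175 with a stride of 4.
cpus = list(range(0, 176, 4))
print(f"{len(cpus)} CPUs pinned: {cpus[:3]} ... {cpus[-1]}")
```

The footprint is far larger than the CPU caches, which is the point of STREAM, and the 44 pinned CPUs match OMP_NUM_THREADS=44 (2 sockets x 22 cores).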

Results:

Stream Application    Barreleye G2 2x 22 core   Barreleye G2 2x 22 core   AMD EPYC 32c 7601   2x Intel Skylake 8176
                      (2400 MHz), gcc           (2666 MHz), gcc           (Anandtech)         (Anandtech)
Stream Copy (MB/s)    217909.8                  241641.7                  -                   -
Stream Add (MB/s)     240561.6                  253784                    -                   -
Stream Scale (MB/s)   245069.7                  268929.6                  -                   -
Stream Triad (MB/s)   247078.8                  270000.4                  207000              165000
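
To put the Triad numbers in perspective, here is a small Python sketch that compares them against each other and against the theoretical peak computed earlier. The values are copied from the results above; the dictionary keys are just labels I made up for printing.

```python
# Theoretical peak for the 2666 MHz configuration, converted to MB/s.
PEAK_MBS = 8 * 8 * 2.666 * 2 * 1000

triad_mbs = {
    "Barreleye G2 (2400 MHz)": 247078.8,
    "Barreleye G2 (2666 MHz)": 270000.4,
    "AMD EPYC 7601 (Anandtech)": 207000.0,
    "Intel Skylake 8176 (Anandtech)": 165000.0,
}

best = triad_mbs["Barreleye G2 (2666 MHz)"]
print(f"fraction of theoretical peak: {best / PEAK_MBS:.1%}")
for name, value in triad_mbs.items():
    # Ratio of the 2666 MHz Barreleye G2 result to each system's result.
    print(f"{name:32s} {value:>10.1f} MB/s   ratio {best / value:.2f}x")
```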

 

Pictorial representation of results:

[Figure: screenshot of the results chart (Screen Shot 2018-07-19 at 9.41.34 AM.png)]

Host / OpenBMC sensors / Update info for Zaius / Barreleye G2 via REST API

Zaius / Barreleye G2 -  Power9 and AST2500 running OpenBMC


Integrating a new OpenPOWER Zaius / Barreleye G2 into a company’s environment comes with a couple of new things to deal with:

A) Not all OpenPOWER servers have APSS, which provides external power measurement for the CPU, DIMMs and other components.

B) Not all OpenPOWER servers fully support IPMI over LAN, because they leverage OpenBMC. OpenBMC does this better: it gives you API access to all the sensors and management controls.

To help with this newness, I’ve documented the information below to help you on your journey.

Power Rail Information via Host:

Commands:
apt install lm-sensors
sensors
Output:
Chip 0 :                  70.00 W
Chip 0 Vdd:               14.00 W
Chip 0 Vdn:               21.00 W
Chip 8 :                  79.00 W
Chip 8 Vdd:               19.00 W
Chip 8 Vdn:               25.00 W


Section2

REST API BMC Access:

OpenBMC and Host Code Update

After build, you will end up with a set of image files in:

tmp/deploy/images//.

The images correspond to components that can be updated on the BMC.

The whole flash image for BMC
image-bmc → flash--
The small initramfs image is used for early init and flash management
image-initramfs → core-image-minimal-initramfs-palmetto.cpio.lzma.u-boot
The OpenBMC kernel cuImage (combined kernel and device tree)
image-kernel → cuImage
The read-only OpenBMC file system
image-rofs → obmc-phosphor-image-palmetto.squashfs-xz
The read-write file system for persistent changes to the OpenBMC file system
image-rwfs → rwfs.jffs2
The OpenBMC boot-loader.
image-u-boot → u-boot.bin

Preparing for BMC code Update

The BMC normally runs with the read-write and read-only file systems mounted, which means these images may be read (and written, for the read-write file-system) at any time. Because the updates are distributed as complete file system images, these file-systems have to be unmounted to replace them with new images. To unmount these file-systems all applications must be stopped.

By default an orderly reboot will stop all applications and unmount the root file-system, and the images copied into the /run/initramfs directory will be applied at that point before restarting. This also applies to the shutdown and halt commands — they will also write the flash before stopping.

As an alternative, an option can be parsed by the init script in the initramfs to copy the required contents of these file-systems into RAM so that the images can be applied while the rest of the application stack is running and progress can be monitored over the network. The update script can then be called to write the images while the system is operational and its progress output monitored.

Update from the OpenBMC Shell

To update from the OpenBMC shell, follow the steps in this section. It is recommended that the BMC be prepared for the update first, as shown below:

fw_setenv openbmconce copy-files-to-ram copy-base-filesystem-to-ram
reboot

Copy one or more of these image-* files to the directory:

/run/initramfs/

(preserving the filename), then run the update script to apply the images:

/run/initramfs/update

then reboot to finish applying:

reboot

During the reboot process the update script will be invoked after the file systems are unmounted to complete the update process.

Some optional features are available; see the help for more details:

/run/initramfs/update --help

Update via REST

An OpenBMC system can download an update image from a TFTP server, and apply updates, controlled via REST. The general procedure is:

  1.  Prepare system for update
  2. Configure update settings
  3. Initiate update
  4. Check flash status
  5. Reboot the BMC

Prepare system for update

Perform a POST to invoke the PrepareForUpdate method of the /flash/bmc object:

curl -b cjar -k -H "Content-Type: application/json" -X POST \
 -d '{"data": []}' \
 https://bmc/org/openbmc/control/flash/bmc/action/prepareForUpdate

This will set up the u-boot environment and reboot the BMC. If no other images were pending, the BMC should return in about 2 minutes.

Configure update settings

There are a few settings available to control the update process:

preserve_network_settings: Preserve network settings; only needed if updating the whole flash
restore_application_defaults: Update (clear) the read-write file system
update_kernel_and_apps_only: Update kernel and initramfs
clear_persistent_files: Ignore the persistent file list when resetting application defaults
auto_apply: Attempt to write the images by invoking the Apply method after the images are unpacked

To configure the update settings, perform a REST PUT to /control/flash/bmc/attr/<setting>. For example:

curl -b cjar -k -H "Content-Type: application/json" -X PUT \
 -d '{"data": 1}' \
 https://bmc/org/openbmc/control/flash/bmc/attr/preserve_network_settings
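
If you need to set several of these at once, a small helper that builds the PUT requests can save some typing. This is only a sketch: it builds the method, URL and JSON body for each setting and does not send anything. The helper name and the particular choice of settings are my own illustration, and `bmc` is the same host alias used in the curl examples.

```python
import json

# Same endpoint as the curl example above.
BMC_BASE = "https://bmc/org/openbmc/control/flash/bmc"

def setting_request(setting, value):
    """Return (method, url, body) for a PUT to /attr/<setting>."""
    url = f"{BMC_BASE}/attr/{setting}"
    body = json.dumps({"data": value})
    return "PUT", url, body

# Hypothetical choice of settings for a whole-flash update.
wanted = {"preserve_network_settings": 1, "auto_apply": 1}
requests_to_send = [setting_request(name, value) for name, value in wanted.items()]
for method, url, body in requests_to_send:
    print(method, url, body)
```

Each tuple maps directly onto the `-X`, URL and `-d` arguments of the curl command shown above.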

 

Initiate update

Perform a POST to invoke the updateViaTftp method of the /flash/bmc object:

curl -b cjar -k -H "Content-Type: application/json" -X POST \
 -d '{"data": ["<TFTP server IP address>", "<filename>"]}' \
 https://bmc/org/openbmc/control/flash/bmc/action/updateViaTftp

 

Check flash status

You can query the progress of the download and image verification with a simple GET request:

curl -b cjar -k https://bmc/org/openbmc/control/flash/bmc

Or perform a POST to invoke the GetUpdateProgress method of the /flash/bmc object:

curl -b cjar -k -H "Content-Type: application/json" -X POST \
 -d '{"data": []}' \
 https://bmc/org/openbmc/control/flash/bmc/action/GetUpdateProgress

Note: the status will not advance from “Writing images to flash” without calling the GetUpdateProgress method.
If the status is “Image ready to apply”, you can either initiate a reboot or call the Apply method to start the process of writing the flash.

Reboot the BMC

To start using the new images, reboot the BMC using the warmReset method of the BMC control object:

curl -b cjar -k -H "Content-Type: application/json" -X POST \
 -d '{"data": []}' \
 https://bmc/org/openbmc/control/bmc0/action/warmReset

Host Code Update

The host firmware (or “BIOS”) can be updated in a similar way. Because the BMC does not use the host firmware, it is updated as soon as the download is completed. This assumes the host is not accessing its firmware.
Perform a POST request to call the updateViaTftp method of /control/flash/bios (instead of /control/flash/bmc used above). To initiate the update:

curl -b cjar -k -H "Content-Type: application/json" -X POST \
  -d '{"data": ["<TFTP server IP address>", "<filename>"]}' \
  https://bmc/org/openbmc/control/flash/bios/action/updateViaTftp

And to check the flash status:

curl -b cjar -k https://bmc/org/openbmc/control/flash/bios

 

Serial Over LAN (host console)

The console infrastructure allows multiple shared connections to a single host UART. UART data from the host is output to all connections, and input from any connection is sent to the host.

Remote Console Connections

To connect to an OpenBMC console session remotely, just ssh to your BMC on port 2200. Use the same login credentials as you would for a normal ssh session:

 $ ssh -p 2200 [user]@[bmc-hostname]

where bmc-hostname is the BMC’s hostname or IP address.

Local Console Connections

If you’re already logged into an OpenBMC machine, you can start a console session directly, using:

$ obmc-console-client

To exit from a console, type:

return ~ .

Note that if you’re on an ssh connection, you’ll need to ‘escape’ the ~ character, by entering it twice.

Logging

Console logs are kept in:

/var/log/obmc-console.log

This log is limited in size, and will wrap after hitting that limit (currently set at 16kB)

 

REST API

The primary management interface for OpenBMC is REST.

Host Management with OpenBMC

Accessible over REST.

Inventory

The system inventory structure is under the /xyz/openbmc_project/inventory hierarchy. Each item is a separate object under this structure, referenced by name.

In OpenBMC the inventory is represented as a path that follows the physical system topology. Items in the inventory are referred to as inventory items and are not necessarily FRUs. If the system contains one chassis, a motherboard, and a CPU on the motherboard, then the path to that inventory item would be:

inventory/system/chassis/motherboard0/cpu0

The properties associated with an inventory item are specific to that item, with the addition of these common properties:

  • version: a code version associated with this item
  • present: indicates whether this item is present in the system (True/False)

The usual list and enumerate REST queries allow the system inventory structure to be accessed. For example, to enumerate all inventory items and their properties:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/inventory/enumerate

Confirm BMC version:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/org/openbmc/inventory/system/chassis/motherboard/bmc

Check the status of the CPU0:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/inventory/system/chassis/motherboard/cpu0

Check the status of the CPU0 core<item>:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/inventory/system/chassis/motherboard/cpu0/enumerate
curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/inventory/system/chassis/motherboard/cpu0/corex

(e.g. core0 ~ core23)

Check the status of the CPU1:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/inventory/system/chassis/motherboard/cpu1

Check the status of the CPU1 core<item>:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/inventory/system/chassis/motherboard/cpu1/enumerate
curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/inventory/system/chassis/motherboard/cpu1/corex

(e.g. core0 ~ core23)

Check the status of the dimm<instance>:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/inventory/system/chassis/motherboard/dimmx

(e.g. dimm0 ~ dimm31)

Check the information of motherboard:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/inventory/system/chassis/motherboard

Check the information of FPDB:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/inventory/system/chassis/FPDB

Check the information of EXP:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/inventory/system/chassis/EXP

Check the information of BP:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/inventory/system/chassis/BP

Check the information of pcie (there are 7 PCIe slots):

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/inventory/system/chassis/pcie_card_xxx

(xxx should be pe0 ~ pe4, e2a or e2b)

 

Sensors

The sensor structure is under the /xyz/openbmc_project/sensors hierarchy.
This interface allows monitoring of system attributes like temperature or altitude. Sensors are represented similarly to the inventory, by object paths under the top-level sensors object name. The path categorizes the sensor and shows what the sensor represents, but does not necessarily represent the physical topology of the system.

The following sensor hierarchies are recognized:

  • Temperature: Unit must be “DegreesC”.
  • Voltage: Unit must be “Volts”.
  • Fan_tach: Unit must be “RPMS”.

For example, all temperature sensors are under sensors/temperature, and the ambient temperature sensors are sensors/temperature/ambient_mb, sensors/temperature/ambient_fpdb and sensors/temperature/ambient_bp.

These are the common properties associated with all sensors:

  •  Value: current value of sensor
  • Units: units of value
  • WarningHigh & WarningLow: High & Low threshold for a warning error
  • CriticalHigh & CriticalLow: High & Low threshold for a critical error

All Sensor value properties are read-only.
To enumerate all sensors in the system:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/sensors/enumerate
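
Once you have the enumerate output, the common properties listed above are enough to flag sensors that are out of range. A rough Python sketch follows. The response here is made up for illustration (hypothetical values, using the property names from the list above and the same data/message/status envelope seen in other responses); a real BMC may expose additional properties.

```python
def classify(props):
    """Compare Value against the warning and critical thresholds."""
    value = props["Value"]
    if value >= props["CriticalHigh"] or value <= props["CriticalLow"]:
        return "critical"
    if value >= props["WarningHigh"] or value <= props["WarningLow"]:
        return "warning"
    return "ok"

# Hypothetical enumerate response; the numbers are invented for the example.
response = {
    "data": {
        "/xyz/openbmc_project/sensors/temperature/ambient_mb": {
            "Value": 31, "Units": "DegreesC",
            "WarningLow": 0, "WarningHigh": 40,
            "CriticalLow": -5, "CriticalHigh": 50,
        },
        "/xyz/openbmc_project/sensors/temperature/p0_core7_temp": {
            "Value": 82, "Units": "DegreesC",
            "WarningLow": 0, "WarningHigh": 80,
            "CriticalLow": -5, "CriticalHigh": 95,
        },
    },
    "message": "200 OK",
    "status": "ok",
}

# Key the report by the last path component (the sensor name).
report = {path.rsplit("/", 1)[-1]: classify(props)
          for path, props in response["data"].items()}
print(report)
```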

 

Check the temperature of ambient sensor on Motherboard:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/sensors/temperature/ambient_mb

 

Check the temperature of ambient sensor on FPDB:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/sensors/temperature/ambient_fpdb

 

Check the temperature of ambient sensor on BP:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/sensors/temperature/ambient_bp

 

Check the temperature of cpu cores:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/sensors/temperature/p0_core7_temp

(e.g. cpu0/core7)

 

 

Check the voltage of power sequencer(UCD90160):

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/sensors/voltage/voutx

(e.g. vout1 ~ vout13)
Check the voltage of ADC:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/sensors/voltage/iio_hwmon/<item>

Check the status of the Fan_tach:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/sensors/fan_tach/tach/fanxL

(e.g. fan1L ~ fan5L)

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/sensors/fan_tach/tach/fanxH

(e.g. fan1H ~ fan5H)
Check the current speed of fans (PWM):

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/sensors/pwm/speed/fanx

 

State

The state structure is under the /xyz/openbmc_project/state hierarchy.
The OpenBMC state interfaces control and track the states of the different software entities in a system. They are the mechanism by which you determine the state of the corresponding instances, reboot the BMC and hosts, and turn power to the chassis on and off. The interfaces are designed to support many-to-many mappings of each interface.
There are three states to track and control on a BMC-based server. The names in parentheses below are the actual parameter names as found under /xyz/openbmc_project/state/ in bmcX, hostY and chassisZ,

where X, Y and Z are the instances (in most cases 0). For all three states, the software tracks a current state and a requested transition.
1. BMC: The BMC has either started all required systemd services and reached its required target (Ready) or it’s on its way there (NotReady). Users can request a (Reboot) or (None).
2. Host: The host is either (Off), (Running), or (Quiesced). Running simply implies that the processors are executing instructions. Users can request that the host be in an (Off), (On), or (Reboot) state. Quiesced means the host OS is in a quiesce state and the system should be checked for errors.

3. Chassis: The chassis is either (Off) or (On). This represents the state of power to the chassis. The chassis being on is a prerequisite for the host running. Users can request that the chassis be (Off) or (On). A transition to one or the other is implied by the requested transition not matching the current state.
To enumerate all state items:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/state/enumerate

BMC

Check the status of the BMC:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/state/bmc<instance>

Change the BMC transition states:

curl -b cjar -k -X PUT -H "Content-Type:application/json" \
 -d '{"data": "xyz.openbmc_project.State.BMC.Transition.None"}' \
 https://BMC_IP/xyz/openbmc_project/state/bmc<instance>/attr/RequestedBMCTransition

Host

Check the status of the Host:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/state/host<instance>

Change the Host transition states:

curl -b cjar -k -X PUT -H "Content-Type:application/json" \
 -d '{"data": "xyz.openbmc_project.State.Host.Transition.On"}' \
 https://BMC_IP/xyz/openbmc_project/state/host<instance>/attr/RequestedHostTransition

Chassis

Check the status of the Chassis:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/state/chassis<instance>

Change the Chassis transition states:

curl -b cjar -k -X PUT -H "Content-Type:application/json" \
 -d '{"data": "xyz.openbmc_project.State.Chassis.Transition.On"}' \
 https://BMC_IP/xyz/openbmc_project/state/chassis<instance>/attr/RequestedPowerTransition
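
The three PUT requests above differ only in the object name, the attribute and the transition prefix, so they are easy to generate. A minimal Python sketch follows. It only builds the request pieces and sends nothing; the helper name and the lookup table are my own, while the attribute and transition strings are the ones from the curl commands above.

```python
import json

# kind -> (attribute name, transition value prefix), taken from the commands above.
TRANSITIONS = {
    "bmc": ("RequestedBMCTransition", "xyz.openbmc_project.State.BMC.Transition"),
    "host": ("RequestedHostTransition", "xyz.openbmc_project.State.Host.Transition"),
    "chassis": ("RequestedPowerTransition", "xyz.openbmc_project.State.Chassis.Transition"),
}

def transition_request(bmc_ip, kind, action, instance=0):
    """Return (method, url, body) for a requested state transition."""
    attr, prefix = TRANSITIONS[kind]
    url = f"https://{bmc_ip}/xyz/openbmc_project/state/{kind}{instance}/attr/{attr}"
    body = json.dumps({"data": f"{prefix}.{action}"})
    return "PUT", url, body

print(transition_request("BMC_IP", "host", "On"))
```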

Control

The control structure is under the /xyz/openbmc_project/control hierarchy. The power restore policy specifies the power transition behavior on a BMC reset. The implementation may choose to enforce the policy only on a power loss, or on both a power loss and a BMC reboot.

To enumerate all controls in the system:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/control/enumerate

The policy to adopt after the BMC reset:
1. AlwaysOn: Perform a complete power on process.
2. AlwaysOff: Remain powered off.
3. Restore: Restore power to last requested state recorded before the BMC was reset.

Check the status of the Power Restore Policy:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/control/host0/power_restore_policy

 

Change Power Restore Policy states:

curl -b cjar -k -X PUT -H "Content-Type:application/json" \
 -d '{"data": "xyz.openbmc_project.Control.Power.RestorePolicy.Policy.AlwaysOn"}' \
 https://BMC_IP/xyz/openbmc_project/control/host0/power_restore_policy/attr/PowerRestorePolicy

 

Led

LEDs that are present in the system can be managed by the BMC, a hardware controller or the operating system. It is always recommended that external users use only the LED groups. The physical LEDs are described here only for documentation; they are strictly NOT to be used outside of the firmware code.
The physical LED structure is under the /xyz/openbmc_project/led/physical hierarchy. Each LED is a separate object under this structure, referenced by name.

To enumerate all physical LEDs in the system:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/led/physical/enumerate

There are four kinds of LEDs to track and control on a BMC-based server, as follows:

  • Attention
  • Onboard
  • Platform
  • System

The control state of a physical LED can be:

  • On: LED is in solid on state
  • Off: LED is in off state
  • Blink: LED is blinking once per second.

Check the status of the Attention LED:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/led/physical/Attention

Change the Attention LED state:

curl -b cjar -k -X PUT -H "Content-Type: application/json" \
 -d '{"data": "xyz.openbmc_project.Led.Physical.Action.On"}' \
 https://BMC_IP/xyz/openbmc_project/led/physical/Attention/attr/state

Check the status of the Onboard LED:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/led/physical/Onboard

Change the Onboard LED state:

curl -b cjar -k -X PUT -H "Content-Type: application/json" \
 -d '{"data": "xyz.openbmc_project.Led.Physical.Action.On"}' \
 https://BMC_IP/xyz/openbmc_project/led/physical/Onboard/attr/state

Check the status of the Platform LED:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/led/physical/Platform

Change the Platform LED state:

curl -b cjar -k -X PUT -H "Content-Type: application/json" \
 -d '{"data": "xyz.openbmc_project.Led.Physical.Action.On"}' \
 https://BMC_IP/xyz/openbmc_project/led/physical/Platform/attr/state

Check the status of the System LED:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/led/physical/System

Change the System LED state:

curl -b cjar -k -X PUT -H "Content-Type: application/json" \
 -d '{"data": "xyz.openbmc_project.Led.Physical.Action.On"}' \
 https://BMC_IP/xyz/openbmc_project/led/physical/System/attr/state

 

Logging

The event log structure is under the /xyz/openbmc_project/logging/entry hierarchy. Each event is a separate object under this structure, referenced by number.
BMC and host firmware on POWER-based servers can report logs to the BMC. Typically, these logs are reported in cases where host firmware cannot start the OS, or cannot reliably log to the OS.
The properties associated with an error log are as follows:

  • Id: The error event entry id number.
  • Timestamp: Commit timestamp of the error event entry in milliseconds since 1970.
  • Severity: The severity of the error event entry.
  • Message: The message description of the error event entry.
  • AdditionalData: Additional information in the form of metadata field strings VAR=val
  • Resolved: Error resolution status.

To list all reported events:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/logging/entry/
 {
 "data": [
 "/xyz/openbmc_project/logging/entry/1",
 "/xyz/openbmc_project/logging/entry/2"
 ],
 "message": "200 OK",
 "status": "ok"
 }

To read a specific event log:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/logging/entry/1
 {
 "data":{
 "AdditionalData":[
 "_PID=1389"
 ],
 "Id": 1,
 "Message": "org.open_power.Host.Boot.Error.WatchdogTimedOut",
 "Resolved": 0,
 "Severity": "xyz.openbmc_project.Logging.Entry.Level.Error",
 "Timestamp": 1510895953373,
 "associations":[]
 },
 "message": "200 OK",
 "status": "ok"
 }
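
The Timestamp and Severity fields are not very readable as returned, so here is a small Python sketch that decodes the entry shown above. It parses the same JSON (pasted in as a string), converts the millisecond timestamp to UTC and keeps only the last component of the severity name. The variable names are just for illustration.

```python
import json
from datetime import datetime, timezone

# The log entry response from above, as a JSON string.
raw = """
{
 "data": {
  "AdditionalData": ["_PID=1389"],
  "Id": 1,
  "Message": "org.open_power.Host.Boot.Error.WatchdogTimedOut",
  "Resolved": 0,
  "Severity": "xyz.openbmc_project.Logging.Entry.Level.Error",
  "Timestamp": 1510895953373,
  "associations": []
 },
 "message": "200 OK",
 "status": "ok"
}
"""

entry = json.loads(raw)["data"]
# Timestamp is in milliseconds since 1970, so divide by 1000 for seconds.
when = datetime.fromtimestamp(entry["Timestamp"] / 1000, tz=timezone.utc)
level = entry["Severity"].rsplit(".", 1)[-1]
print(f"entry {entry['Id']}: {level} at {when.isoformat()} ({entry['Message']})")
```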

Individual entries can be deleted, but there is no way to clear all logs at once.
To delete an event log (log 1 in this example), call the delete method on the event:

curl -b cjar -k -X DELETE https://BMC_IP/xyz/openbmc_project/logging/entry/1

 

Network

The network structure is under the /xyz/openbmc_project/network hierarchy.

A Network Manager is a daemon which handles network management operations. It must implement the xyz/openbmc_project/network/SystemConfiguration interface.

When the network manager daemon comes up, it should create objects implementing physical link or virtual interfaces, such as xyz/openbmc_project/network/EthernetInterface or xyz/openbmc_project/network/VLANInterface, on the system.

IP address (v4 and v6) objects must be children of the physical/virtual interface object.
These are the interfaces associated with networking:

  • SystemConfiguration: This describes the system specific parameters.
  • EthernetInterface: This describes the interface specific parameters.
  • IP: This describes the ip address specific parameters.
  • IPProtocol: This describes the IP protocol type(IPv4/IPv6).
  • VLANInterface: This describes the VLAN specific properties.
  • Bond: This describes the interface bonding parameters.

To enumerate all networks in the system:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/network/enumerate

Interface objects can be physical as well as virtual.
A physical interface object can’t be deleted, but a virtual interface object can:

curl -b cjar -k -X DELETE https://BMC_IP/xyz/openbmc_project/network/<interfacename>

There can be multiple IP address objects under an interface object. These objects can be deleted by the delete function or added by the add function.

To delete an IPv4 object:

curl -b cjar -k -X DELETE https://BMC_IP/xyz/openbmc_project/network/eth0/ipv4/<id>

To add an IPv4 object and IP address:

curl -b cjar -k -X POST -H "Content-Type:application/json" \
 -d '{"data": ["xyz.openbmc_project.Network.IP.Protocol.IPv4", "<address>", <prefix length>, "<gateway>"]}' \
 https://BMC_IP/xyz/openbmc_project/network/eth0/action/IP

To delete an IPv6 object:

curl -b cjar -k -X DELETE https://BMC_IP/xyz/openbmc_project/network/eth0/ipv6/<id>

The system configuration parameters are under /xyz/openbmc_project/network/config.

 

Software Versions

The software structure is under the /xyz/openbmc_project/software hierarchy.

All version identifiers are implementation specific strings. No format should be assumed. Some software versions are a collection of images, each with their own version identifiers.

The xyz/openbmc_project/software/ExtendedVersion interface can be added to any software.

To enumerate all software in the system:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/software/enumerate

Check the status of the aggregation version:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/software/<id>

 

Time

The time structure is under the /xyz/openbmc_project/time hierarchy.

To enumerate all time objects in the system:

curl -b cjar -k -X GET https://BMC_IP/xyz/openbmc_project/time/enumerate

Check the status of the time owner:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/time/owner

The possible owners of the time are as follows:

  • BMC: BMC alone owns system time.
  • Host: Host alone owns system time.
  • Both: BMC and host own system time.
  • Split: BMC and host maintain their own time.

Change the time owner:

curl -b cjar -k -X PUT -H "Content-Type:application/json" \
 -d '{"data": "xyz.openbmc_project.Time.Owner.Owners.BMC"}' \
 https://BMC_IP/xyz/openbmc_project/time/owner/attr/TimeOwner

Check the status of the time synchronization method:

curl -b cjar -k -X GET -H "Content-Type:application/json" \
 https://BMC_IP/xyz/openbmc_project/time/sync_method

The possible methods of time synchronization are as follows:

  • NTP: Sync by using the Network Time Protocol.
  • Manual: Sync time manually.

To change the time synchronization method:

curl -b cjar -k -X PUT -H "Content-Type:application/json" \
 -d '{"data": "xyz.openbmc_project.Time.Synchronization.Method.NTP"}' \
 https://BMC_IP/xyz/openbmc_project/time/sync_method/attr/TimeSyncMethod

Exercising Nvidia Volta V100 with POWER / Barreleye G2 – GPU burn

At Open Compute and OpenPOWER conferences, we announced that we are building GPU configurations with our POWER9 Barreleye G2 server. This includes PCIe Volta V100 support and first of a kind SXM2 Volta V100 Support.

 


Part of the support and validation process for these servers is to exercise, or “burn”, the GPUs to see the system-level power and thermal implications.

There are a variety of tools to do this: a more open one, the GPU Burn utility, and the somewhat more closed NVIDIA Validation Suite (NVVS), which is part of their Data Center GPU Manager (DCGM) suite.

We are starting our validation with the GPU Burn utility. The source we are using is here on GitHub: https://github.com/wilicc/gpu-burn

So what is the GPU Burn utility doing?

It forks one process for each GPU on the machine, one process for keeping track of the GPU temperatures if available (e.g. Fermi Teslas don’t have temp. sensors), and one process for reporting the progress. The GPU processes each allocate 90% of the free GPU memory, initialize 2 random 2048*2048 matrices, and continuously perform efficient CUBLAS matrix-matrix multiplication routines on them and store the results across the allocated memory.
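
To get a feel for the sizes involved, here is a back-of-the-envelope Python sketch of how many 2048*2048 single-precision result matrices fit in 90% of the free memory. Setting aside two matrices for the inputs is my assumption about the bookkeeping, not something taken from the tool’s source; the 15955 MB figure is the free memory reported in the sample output further down.

```python
MATRIX_DIM = 2048
BYTES_PER_FLOAT = 4
MIB = 2**20

def result_slots(available_mib):
    """Result matrices that fit in 90% of free memory, after reserving
    two matrices for the inputs (an assumption, see the text above)."""
    use_bytes = available_mib * MIB * 9 // 10          # 90% of free memory
    matrix_bytes = MATRIX_DIM * MATRIX_DIM * BYTES_PER_FLOAT  # 16 MiB each
    return use_bytes // matrix_bytes - 2

slots = result_slots(15955)
print(slots)
```

Under these assumptions the result is 895, which lines up with the proc’d counter advancing in steps of 895 in the sample output.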

So how do you compile and run GPU Burn on POWER9 / Volta V100? What is the sample output like?

a) Install the correct NVIDIA Driver for your OS (In my case GV100 / Ubuntu 16.04, version 396.26 ): (Skip if already done)

  1. wget http://us.download.nvidia.com/tesla/396.26/nvidia-driver-local-repo-ubuntu1604-396.26_1.0-1_ppc64el.deb
  2. dpkg -i nvidia-driver-local-repo-ubuntu1604-396.26_1.0-1_ppc64el.deb

b) Install the correct NVIDIA CUDA version (CUDA 9.2 for POWER9 + Volta V100) (skip if already done)

  1. wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/ppc64el/cuda-repo-ubuntu1604_9.2.88-1_ppc64el.deb
  2. sudo dpkg -i cuda-repo-ubuntu1604_9.2.88-1_ppc64el.deb
  3. sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/ppc64el/7fa2af80.pub
  4. sudo apt-get update
  5. sudo apt-get install cuda

c)  Get all the environment variables set and ready to go 

  1. export PATH=/usr/local/cuda-9.2/bin${PATH:+:${PATH}}
  2. export LD_LIBRARY_PATH=/usr/local/cuda-9.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

d) REBOOT (do NOT forget to reboot; NVIDIA’s environment setup requires a proper reboot for all to be well)

e) Check all is well by listing your GPUs (command 1 below), checking the properties of the GPUs (command 2 below), and checking that nvcc is set up (command 3 below). If you don’t get a meaningful response, most likely something is off with your environment variables or, worst case, your driver / CUDA version.

  1. nvidia-smi
  2. nvidia-smi -q
  3. nvcc -V

f) Compiling GPU Burn

  1. git clone https://github.com/wilicc/gpu-burn
  2. cd gpu-burn/
  3. make

g) Running GPU Burn (a few example commands below)

  1. ./gpu_burn (single-precision GPU burn for 10 seconds)
  2. ./gpu_burn -d (double-precision GPU burn for 10 seconds)
  3. ./gpu_burn 120 (single-precision GPU burn for 120 seconds)

 

Sample output:

root@aprilmin6:/home/ubuntu/gpu-burn/gpu-burn# ./gpu_burn
Run length not specified in the command line.  Burning for 10 secs
GPU 0: Tesla P100-SXM2-16GB (UUID: GPU-b882028f-aa4d-fcab-7d05-704f51bc5e9e)
GPU 1: Tesla P100-SXM2-16GB (UUID: GPU-ad3b392e-1ce8-72a2-8398-36939294a238)
GPU 2: Tesla P100-SXM2-16GB (UUID: GPU-6feba197-41fe-1c59-8201-8ba34411e69f)
GPU 3: Tesla P100-SXM2-16GB (UUID: GPU-6f6dc5df-481f-f711-f0f9-94fca1bade2f)
Initialized device 0 with 16280 MB of memory (15955 MB available, using 14360 MB of it), using FLOATS
Initialized device 1 with 16280 MB of memory (15955 MB available, using 14360 MB of it), using FLOATS
Initialized device 3 with 16280 MB of memory (15955 MB available, using 14360 MB of it), using FLOATS
Initialized device 2 with 16280 MB of memory (15955 MB available, using 14360 MB of it), using FLOATS
20.0%  proc'd: 895 (6482 Gflop/s) - 0 (0 Gflop/s) - 0 (0 Gflop/s) - 0 (0 Gflop/s)   errors: 0 - 0 - 0 - 0   temps: 39 C - 39 C - 39 C - 39 C
        Summary at:   Wed May 23 13:27:53 CDT 2018
30.0%  proc'd: 895 (6482 Gflop/s) - 895 (6203 Gflop/s) - 0 (0 Gflop/s) - 0 (0 Gflop/s)   errors: 0 - 0 - 0 - 0   temps: 39 C - 39 C - 39 C - 39 C
        Summary at:   Wed May 23 13:27:54 CDT 2018
30.0%  proc'd: 895 (6482 Gflop/s) - 895 (6203 Gflop/s) - 895 (5989 Gflop/s) - 0 (0 Gflop/s)   errors: 0 - 0 - 0 - 0
30.0%  proc'd: 895 (6482 Gflop/s) - 895 (6203 Gflop/s) - 895 (5989 Gflop/s) - 895 (5411 Gflop/s)   errors: 0 - 0 - 0 - 0
40.0%  proc'd: 1790 (8941 Gflop/s) - 895 (6203 Gflop/s) - 895 (5989 Gflop/s) - 895 (5411 Gflop/s)   errors: 0 - 0 - 0 - 0
40.0%  proc'd: 1790 (8941 Gflop/s) - 1790 (8928 Gflop/s) - 895 (5989 Gflop/s) - 895 (5411 Gflop/s)   errors: 0 - 0 - 0 - 0
40.0%  proc'd: 1790 (8941 Gflop/s) - 1790 (8928 Gflop/s) - 1790 (8916 Gflop/s) - 895 (5411 Gflop/s)   errors: 0 - 0 - 0 - 0
50.0%  proc'd: 1790 (8941 Gflop/s) - 1790 (8928 Gflop/s) - 1790 (8916 Gflop/s) - 1790 (8875 Gflop/s)   errors: 0 - 0 - 0 - 0   temps: 39 C - 39 C - 39 C - 39 C
        Summary at:   Wed May 23 13:27:56 CDT 2018
[the 50.0% line then repeats as the display refreshes in place, with unchanged counters and the first reported temperature rising from 39 C to 50 C]
temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C60.0%  proc’d: 2685 (8912 Gflop/s) – 1790 (8928 Gflop/s) 
– 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C – 52 C – 49 C – 52 C Summary at:   Wed May 23 13:27:57 CDT 2018
60.0%  proc’d: 2685 (8912 Gflop/s) – 2685 (8931 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C60.0%  proc’d: 2685 (8912 Gflop/s) – 2685 (8931 Gflop/s) – 2685 (8912 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C60.0%  proc’d: 2685 (8912 Gflop/s) – 2685 (8931 Gflop/s) – 2685 (8912 Gflop/s) – 2685 (8921 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C80.0%  proc’d: 3580 (8916 Gflop/s) – 2685 (8931 Gflop/s) – 2685 (8912 Gflop/s) – 2685 (8921 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C – 52 C – 49 C – 52 C Summary at:   Wed May 23 13:27:59 CDT 2018
80.0%  proc’d: 3580 (8916 Gflop/s) – 3580 (8931 Gflop/s) – 2685 (8912 Gflop/s) – 2685 (8921 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C80.0%  proc’d: 3580 (8916 Gflop/s) – 3580 (8931 Gflop/s) – 3580 (8909 Gflop/s) – 2685 (8921 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C80.0%  proc’d: 3580 (8916 Gflop/s) – 3580 (8931 Gflop/s) – 3580 (8909 Gflop/s) – 3580 (8922 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C90.0%  proc’d: 4475 (8917 Gflop/s) – 3580 (8931 Gflop/s) – 3580 (8909 Gflop/s) – 3580 (8922 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C90.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 3580 (8909 Gflop/s) – 3580 (8922 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 3580 (8922 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C – 52 C – 49 C – 52 C Summary at:   Wed May 23 13:28:01 CDT 2018
100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 
(8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   
temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) 
– 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 5370 (8909 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 C – 56 C – 52 C – 56 CKilling processes.. done
Tested 4 GPUs: GPU 0: OK GPU 1: OK GPU 2: OK GPU 3: OK30.0%  proc’d: 895 (6482 Gflop/s) – 895 (6203 Gflop/s) – 0 (0 Gflop/s) – 0 (0 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C – 39 C – 39 C – 39 C Summary at:   Wed May 23 13:27:54 CDT 2018
30.0%  proc’d: 895 (6482 Gflop/s) – 895 (6203 Gflop/s) – 895 (5989 Gflop/s) – 0 (0 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C – 39 C -30.0%  proc’d: 895 (6482 Gflop/s) – 895 (6203 Gflop/s) – 895 (5989 Gflop/s) – 895 (5411 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C – 340.0%  proc’d: 1790 (8941 Gflop/s) – 895 (6203 Gflop/s) – 895 (5989 Gflop/s) – 895 (5411 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C – 40.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 895 (5989 Gflop/s) – 895 (5411 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C -40.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 895 (5411 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C 50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C – 39 C – 39 C – 39 C Summary at:   Wed May 23 13:27:56 CDT 2018
50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 39 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 
(8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   
temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C50.0%  proc’d: 1790 (8941 Gflop/s) – 1790 (8928 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C60.0%  proc’d: 2685 (8912 Gflop/s) – 1790 (8928 Gflop/s) 
– 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C – 52 C – 49 C – 52 C Summary at:   Wed May 23 13:27:57 CDT 2018
60.0%  proc’d: 2685 (8912 Gflop/s) – 2685 (8931 Gflop/s) – 1790 (8916 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C60.0%  proc’d: 2685 (8912 Gflop/s) – 2685 (8931 Gflop/s) – 2685 (8912 Gflop/s) – 1790 (8875 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C60.0%  proc’d: 2685 (8912 Gflop/s) – 2685 (8931 Gflop/s) – 2685 (8912 Gflop/s) – 2685 (8921 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C80.0%  proc’d: 3580 (8916 Gflop/s) – 2685 (8931 Gflop/s) – 2685 (8912 Gflop/s) – 2685 (8921 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C – 52 C – 49 C – 52 C Summary at:   Wed May 23 13:27:59 CDT 2018
80.0%  proc’d: 3580 (8916 Gflop/s) – 3580 (8931 Gflop/s) – 2685 (8912 Gflop/s) – 2685 (8921 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C80.0%  proc’d: 3580 (8916 Gflop/s) – 3580 (8931 Gflop/s) – 3580 (8909 Gflop/s) – 2685 (8921 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C80.0%  proc’d: 3580 (8916 Gflop/s) – 3580 (8931 Gflop/s) – 3580 (8909 Gflop/s) – 3580 (8922 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C90.0%  proc’d: 4475 (8917 Gflop/s) – 3580 (8931 Gflop/s) – 3580 (8909 Gflop/s) – 3580 (8922 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C90.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 3580 (8909 Gflop/s) – 3580 (8922 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 3580 (8922 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 C – 52 C – 49 C – 52 C Summary at:   Wed May 23 13:28:01 CDT 2018
100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 50 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 
(8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   
temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 4475 (8917 Gflop/s) – 4475 (8923 Gflop/s) 
– 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 100.0%  proc’d: 5370 (8909 Gflop/s) – 4475 (8923 Gflop/s) – 4475 (8911 Gflop/s) – 4475 (8895 Gflop/s)   errors: 0 – 0 – 0 – 0   temps: 54 C – 56 C – 52 C – 56 CKilling processes.. done
Tested 4 GPUs: GPU 0: OK GPU 1: OK GPU 2: OK GPU 3: OK
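
If you want to track these numbers over a longer burn, the progress lines are easy to parse. Below is a minimal Python sketch, not part of gpu_burn itself: `parse_progress` is a hypothetical helper, and it assumes the line layout seen in the log above (per-GPU "(N Gflop/s)" groups, then "errors:", then "temps:"). The sample line is the last progress line from that log.

```python
import re

# Last progress line from the log above (separators written as plain "-").
SAMPLE = ("100.0%  proc'd: 5370 (8909 Gflop/s) - 4475 (8923 Gflop/s) - "
          "4475 (8911 Gflop/s) - 4475 (8895 Gflop/s)   errors: 0 - 0 - 0 - 0   "
          "temps: 54 C - 56 C - 52 C - 56 C")

def parse_progress(line):
    """Return per-GPU Gflop/s, error counts and temperatures from one line."""
    # Split the line into its three labelled parts.
    head, rest = line.split("errors:", 1)
    errors_part, temps_part = rest.split("temps:", 1)
    # The regexes ignore the separator, so "-" and en-dash both work.
    gflops = [int(x) for x in re.findall(r"\((\d+) Gflop/s\)", head)]
    errors = [int(x) for x in re.findall(r"\d+", errors_part)]
    temps = [int(x) for x in re.findall(r"(\d+) C", temps_part)]
    return {"gflops": gflops, "errors": errors, "temps": temps}

stats = parse_progress(SAMPLE)
print("total Gflop/s:", sum(stats["gflops"]))
print("max temp (C):", max(stats["temps"]))
print("any errors:", any(stats["errors"]))
```

For the sample line this adds up to 35638 Gflop/s across the four P100s (single precision, since the run was "using FLOATS"), with a peak temperature of 56 C.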

 

OpenPOWER at OpenCompute Summit 2018

I usually write about engineering topics surrounding OpenPOWER servers on this blog. This time, I thought it would be nice to also list the exciting things that will be shared at OCP Summit 2018.

If you are attending Open Compute Summit 2018 (March 20-21) in San Jose, and want to see some cool things that OpenPOWER is bringing to the Open Compute eco-system, you are in the right place.

OCP_OPF

 

Here’s a list of things you can attend or visit to see what’s in store for Power9 in the OCP form factor.

 

Engineering Talks:

Accelerator Eco-System on Google / Rackspace – Zaius / Barreleye G2 Server

Tri-mode (SAS / SATA / NVMe) Storage Solution on Rackspace OCP Barreleye G2 Server

Accelerating Flash Memory with the High-Performance, Low-Latency OpenCAPI Interface

Hardware Showcase at OpenPOWER booth # C6:

Witness OpenPOWER innovation in OCP form-factor servers

Zaius Tray, Google

Barreleye G2 Server, Rackspace

OCP Power9 Tray, Inventec

Molex Flash Storage Accelerator

OpenCAPI adapters – Alpha Data & Innova 2 (Mellanox)

Demos:

Witness Gen4 Storage / Networks (2x faster) and OpenCAPI / NVLink 2.0 (3x faster) demos:

a) Gen4 based 2x 100 GbE networking

b) Gen4 based Storage Speeds

c) Tri-mode, interchangeable SAS / SATA / NVMe storage

d) OpenCAPI buffer demo running at 25 GT/s (3x faster than PCIe Gen3)
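
The "2x" and "3x" figures in this list follow from per-lane signalling rates. A quick Python sketch of that arithmetic, assuming the standard PCIe rates of 8 GT/s (Gen3) and 16 GT/s (Gen4) and the 25 GT/s OpenCAPI figure quoted above; `speedup` is just an illustrative helper, and it compares raw signalling only, not delivered throughput:

```python
# Per-lane signalling rates in GT/s. The PCIe figures are the standard
# Gen3 / Gen4 rates; the OpenCAPI figure is the one quoted in the demo list.
PCIE_GEN3_GTS = 8.0
PCIE_GEN4_GTS = 16.0
OPENCAPI_GTS = 25.0

def speedup(rate_gts, baseline_gts=PCIE_GEN3_GTS):
    """Raw per-lane signalling ratio; ignores encoding and protocol overhead."""
    return rate_gts / baseline_gts

print("Gen4 vs Gen3:", speedup(PCIE_GEN4_GTS))
print("OpenCAPI vs Gen3:", speedup(OPENCAPI_GTS))
```

So Gen4 is exactly 2x Gen3 per lane, and 25 GT/s works out to a little over 3x.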

OpenPOWER Member Booth at OCP:

a) Mellanox: Booth # C5

b) Broadcom: Booth #

 

How to Netboot install RHEL 7.4 on POWER9 / Barreleye G2

As far as I’m aware, RHEL 7.4 is the only GA’ed enterprise OS on POWER9, and hence on Barreleye G2. I’ve seen a couple of folks pick the wrong image or boot arguments while attempting this install and fail, so I wanted to put the steps down clearly here:

If you instead want to install Ubuntu 16.04 LTS on POWER9, I’ve already written instructions for that in a previous post.

1. First grab the DVD (2.9 GB) from the Red Hat website. For the purposes of this post we’ll use the evaluation version:

> Go to RHEL downloads page:

https://access.redhat.com/downloads/content/420/ver=/rhel---7/7.4/ppc64le/product-software

> Login and Get Download Link “Red Hat Enterprise Linux for Power 9”

 

NOTE 1: Do NOT get the link for the generic “RHEL big-endian for ppc64” or “RHEL little-endian ppc64” images; get the specific download provided for POWER9.

NOTE 2: Download the full DVD (the 2.9 GB binary, as Red Hat calls it). The basic boot ISO is NOT enough for netboot, since there is no public mirror for the repo / package data that I’m aware of.

See the picture below and get the second file. In my case the file name was “rhel-alt-server-7.4-ppc64le-dvd.iso”. If the name is missing “alt”, you are getting the wrong file:

Screen Shot 2018-01-19 at 4.28.54 PM
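
If you script the download, a quick name check can catch the wrong image early. This is only a sketch of the rule described above (the name must contain “alt” and be the DVD image); `is_power9_dvd_iso` is a hypothetical helper, and the third candidate name is illustrative rather than copied from the download page.

```python
def is_power9_dvd_iso(filename):
    """True if the name looks like the POWER9 ("alt") full DVD image."""
    # wget may keep the "?_auth_=..." query string in the saved file name.
    name = filename.split("?", 1)[0]
    return (name.startswith("rhel-alt-server-")
            and name.endswith("-ppc64le-dvd.iso"))

for candidate in ("rhel-alt-server-7.4-ppc64le-dvd.iso",
                  "rhel-alt-server-7.4-ppc64le-boot.iso",
                  "rhel-server-7.4-ppc64le-dvd.iso"):
    verdict = "ok" if is_power9_dvd_iso(candidate) else "wrong image"
    print(candidate, "->", verdict)
```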

2. Once you have the download link for this image, wget it to a local (Apache) HTTP server (this is a different machine from the POWER server you are installing) and mount the ISO on a folder under the HTTP server’s document root.

Log into the HTTP server:

ssh root@<http-server>

cd /var/www/html/

mkdir rhel9

cd rhel9

wget <hyperlink to rhel-alt-server-7.4-ppc64le-boot.iso>

mount -o loop rhel-alt-server-7.4-ppc64le-dvd.iso\?_auth_\=1516572534_bc1326574be188a5e87bb59dfcd70b8e rhel9

See picture below for help:

Screenshot from 2018-01-21 20-45-36

3. Now go to the main petitboot menu and add a new boot option by pressing “n”. Fill in the new boot option with the following mock links as an example (point them at the folder where you mounted the ISO):

Kernel:  http://10.127.xx.xx/rhel/ppc/ppc64/vmlinuz

Initrd: http://10.127.xx.xx/rhel/ppc/ppc64/initrd.img

boot-arguments:    root=live:http://10.127.xx.xx/rhel/LiveOS/squashfs.img

4. Click save and execute the new boot option you just created “User item 1”:

Screen Shot 2016-05-06 at 1.39.25 PM

5. That will boot to the RHEL 7.4 install menu in about 2 minutes or so. Select “Text mode”, as I ran into some issues getting the VNC install working.

6. You will be greeted with the following text menu screen. The only tricky item here is number 3, “Installation Source”.

Select http mirror and enter the mount location of your iso as the selection:

Screenshot from 2018-01-21 20-28-26

7. After completing all the necessary options 1 through 9, press “b” to begin the installation.

POWER9 RHEL 7.4 Barreleye Installation Done

8. Installation should finish in under 7-8 minutes, since we are installing from a local mirror:

Screenshot from 2018-01-21 20-42-42
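The three petitboot fields from step 3 all hang off the same base URL, so a small helper can print them for you and save a typo or two. This is only a sketch: petitboot_fields is a made-up function name, and the base URL is whatever path your http server exposes the mounted ISO under (the mock 10.127.xx.xx address is reused here).

```shell
#!/bin/sh
# Sketch: print the three petitboot fields for a given HTTP base URL.
# petitboot_fields is a hypothetical helper, not part of petitboot or RHEL.
petitboot_fields() {
    base="${1%/}"   # strip one trailing slash, if any
    printf 'Kernel: %s/ppc/ppc64/vmlinuz\n' "$base"
    printf 'Initrd: %s/ppc/ppc64/initrd.img\n' "$base"
    printf 'Boot arguments: root=live:%s/LiveOS/squashfs.img\n' "$base"
}

# Example with the mock address from step 3:
petitboot_fields "http://10.127.xx.xx/rhel/"
```

Copy the printed values into the Kernel, Initrd and boot-arguments fields of the new boot option.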

Don’t hesitate to comment below with any questions.

How to mine Monero (XMR) on Rackspace Barreleye G2 Server (IBM POWER9)

In a previous post I documented the process of mining ETH / Ethereum on GPUs on a POWER processor based server (called Barreleye). Barreleye G2 is an open-source hardware server developed by Rackspace / Google.

In this post I explain how to mine crypto-currency on a CPU and still be profitable:

A. You pick the IBM POWER8 / 9 processors because of the vector instructions supported on these CPUs, which help the hashing algorithms (specifically the CryptoNight algorithm, the basis for Monero). (At your leisure, read the CryptoNight standard, which explains the basis of the algorithm itself.)

B. You mine #MONERO – XMR because of its relative GPU / ASIC resistance. Monero has become one of the most popular cryptocurrencies due to its excellent privacy. Its website describes it as “secure, private, untraceable currency.”

For my guide I tested mining speeds on the Rackspace server called Barreleye G2.
Example speeds: the hash rate on a Barreleye server running 2 POWER8 / POWER9 chips (10 cores each) is ~5 KH/s, roughly equivalent to the mining hash rate of 9 x GeForce 1070 (475 H/s each).

 

Currently verified operating systems are :

A. Ubuntu 16.04

B. CentOS 7.x

Here are the steps to follow for mining Monero on Barreleye G2 on Ubuntu 16.04. The only changes for CentOS will be in installing AT10 (look below for what AT10 is) and in dependency package names (usually different for Debian vs RHEL based repos):

Before starting, make sure your SMT level is at 4 and not 8 (4-way multi-threading yields the best results, as I found through experimentation)

ppc64_cpu --smt=4

1. Clone the xmr-stak-power repository (a POWER port based on xmr-stak-cpu) and cd into the source directory

git clone https://github.com/agangidi53/xmr-stak-power

cd xmr-stak-power

2. Building a binary for mining on POWER needs gcc 6.3.1, which is publicly available via Advance Toolchain 10 (AT10). Install AT10.0 (follow the instructions below)

2a. Change Source list to add AT10 path

nano /etc/apt/sources.list

add “deb ftp://ftp.unicamp.br/pub/linuxpatch/toolchain/at/ubuntu xenial at10.0” at the end of the file

apt-get update

2b. Install the 4 AT10.0 packages and add them to the PATH:

apt-get install advance-toolchain-at10.0-runtime \
advance-toolchain-at10.0-devel \
advance-toolchain-at10.0-perf \
advance-toolchain-at10.0-mcore-libs
export PATH=/opt/at10.0/bin:$PATH

3. Install dependencies before building the mining binary:

apt install libmicrohttpd-dev -y
apt install libhwloc-dev -y
apt install libssl-dev -y

4. Configure, make and install the XMR-STAK-POWER binary (run these from the source directory)

export LD_LIBRARY_PATH='/usr/lib/powerpc64le-linux-gnu/'

cmake .

make

make install

Binary will be built and saved as <source-directory>/bin/xmr-stak/power

5. Create a Monero wallet and make a note of the wallet address and private keys

Use https://moneroaddress.org/

6. Put the wallet address from above and your email into the file <source-directory>/bin/config.txt in the format below.

We are using the supportxmr pool, so use the template below for the pool address.

Only replace the email / wallet address with your own. Follow the template for everything else.

"pool_address" : "pool.supportxmr.com:5555",
"wallet_address" : "4945WAJVEC6A3ZM8hwWMrV15VSJeeAvUv3fRbwwMajToCQ2usQa2tefGyx6PFQwXqMfpk7dVdxX6BBqZfYibx3JD3UKzrFk",
"pool_password" : "MinerName:<email-address>",

7. Start mining by executing the binary you previously built. This binary refers to config.txt to determine your wallet address, the pool you are joining, the number of CPUs in the server, etc. (The default config file assumes you have 2 x 10 core POWER processors. If your configuration is different, you need to change the config file to reflect that.)

Use a screen session to run the mining binary so that it keeps running in the background (and you don’t have to babysit it)

apt install screen

screen

<hit enter for the message>

<source-directory>/bin/xmr-stak/power

8. After mining for 5 minutes, log in to supportxmr.com using these default credentials:

username:     <your-wallet-address>

password:     <your-email-address provided in above config.txt>

9. Change your password to one of your choice after getting in to the website.

10. At this point the top right corner on supportxmr.com shows your current hash rate mined via the pool. Here’s test output using 2 Barreleye G2 servers:

Screen Shot 2017-11-20 at 4.00.15 AM
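If you set up more than one server, it can be handy to generate the three pool lines of config.txt rather than hand-edit them each time. Below is a minimal sketch, assuming the same supportxmr template shown in the config step. xmr_pool_config is a made-up helper name and not part of xmr-stak-power.

```shell
#!/bin/sh
# Sketch: print the three pool-related config.txt lines from the template.
# xmr_pool_config is a hypothetical helper, not part of xmr-stak-power.
xmr_pool_config() {
    wallet="$1"
    email="$2"
    if [ -z "$wallet" ] || [ -z "$email" ]; then
        echo "usage: xmr_pool_config <wallet-address> <email-address>" >&2
        return 1
    fi
    printf '"pool_address" : "pool.supportxmr.com:5555",\n'
    printf '"wallet_address" : "%s",\n' "$wallet"
    printf '"pool_password" : "MinerName:%s",\n' "$email"
}

# Placeholder values; substitute your own wallet address and email.
xmr_pool_config "<your-wallet-address>" "<email-address>"
```

Paste the printed lines over the matching entries in <source-directory>/bin/config.txt and leave the rest of the file as it is.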

How to update openPOWER BIOS / host flash / boot firmware (without wiping boot preferences)

There are 2 ways you can update the host / boot firmware on openPOWER machines:

a) Via BMC console

b) Via BMC REST API

In this post we will see the steps for updating the host firmware via BMC console:

Step1: Get to the BMC console (via ssh)

ssh root@<openBMC IP>

Step2: Now we want to make sure host power is off

obmcutil poweroff

Step3: Get the host flash binary into  /tmp location on BMC

cd /tmp

wget <link to host firmware>

Step4: Once you know for sure that host power is off (you can check it via <obmcutil state>), start flashing the new firmware using the ‘pflash’ utility

pflash -E -f -p  <flash-binary-name>.pnor

Step5: Once you see the flash progress reach 100% and complete, power the host back on

obmcutil poweron

Step6: Check the console to see whether your host is booting successfully

obmc-console-client

 

The steps above (specifically the pflash command) completely wipe your flash chip (including boot preferences) and replace its contents with the new host firmware. If you instead want to keep your boot preferences, back up the NVRAM partition before flashing and restore it afterwards, using the flow below:

cd /tmp

pflash -P NVRAM -r nvram

pflash -E -f -p <flash-binary-name>.pnor  

pflash -e -P NVRAM -p nvram

rm -rf nvram
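The backup / flash / restore flow above is easy to wrap into one function that stops at the first failing step, so a failed backup never leads to a flash. This is a sketch only, using the same pflash flags as above. The names run, flash_keep_nvram and DRY_RUN are my own, and the sketch assumes host power is already off (check with obmcutil state first). By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the NVRAM-preserving flow as one function, meant for the BMC shell.
# run, flash_keep_nvram and DRY_RUN are hypothetical names for illustration.

# Print the command when DRY_RUN is 1 (the default here), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

flash_keep_nvram() {
    pnor="$1"
    if [ -z "$pnor" ]; then
        echo "usage: flash_keep_nvram <flash-binary-name>.pnor" >&2
        return 1
    fi
    # Back up boot preferences, flash the new image, restore, clean up.
    # Each step only runs if the previous one succeeded.
    run pflash -P NVRAM -r /tmp/nvram &&
    run pflash -E -f -p "$pnor" &&
    run pflash -e -P NVRAM -p /tmp/nvram &&
    run rm -f /tmp/nvram
}

# Dry run: prints the four commands without touching the flash.
flash_keep_nvram /tmp/example.pnor
```

Once the printed commands look right, set DRY_RUN=0 in the shell and call flash_keep_nvram with your real .pnor file.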

How to mine Ethereum on Rackspace BarreleyeG2 OpenPOWER server with Nvidia P100 GPUs

OpenPOWER CPUs and Nvidia GPUs represent best-in-class computing power per $.

To test that, I thought I’d give compiling the Ethereum mining suite a try on our Barreleye G2 server. As background information, Barreleye G2 is a server Rackspace is building in collaboration with Google, IBM and Ingrasys.

Screen Shot 2017-09-02 at 9.56.50 PM

To compile Ethereum on Power9 with Nvidia P100 GPUs you have 2 options:

A)  Compile it for CUDA

B)  Compile it for OpenCL

In this post we will go with Option A (Compile it with CUDA). To do this you need to first get Nvidia Drivers and CUDA working on your openPOWER server. The link to blog post on how to get these working is here:

Once you have CUDA working (run an example CUDA application to make sure), install the build dependencies necessary for compiling ethereum. We will use cmake to configure and build.

sudo apt-get install libleveldb-dev libmicrohttpd-dev cmake

After looking around I realized that main github source for ethereum “ethereum/cpp-ethereum” no longer supports mining for GPUs. So instead I used: https://github.com/ethereum-mining/ethminer for my mining experiments with Nvidia P100 GPUs. I found the release “0.12.0.dev2” to be stable.

Download the release zip and unzip it

wget https://github.com/ethereum-mining/ethminer/archive/v0.12.0.dev2.zip

unzip v0.12.0.dev2.zip

cd ethminer-0.12.0.dev2/

Make a build directory for your build activities and make that your current directory

mkdir build
cd build

Since I’m building for CUDA and NOT OpenCL, I made sure my cmake configuration reflects that. This step takes 3-4 minutes on the Barreleye G2 server

cmake .. -DETHASHCUDA=ON -DETHASHCL=OFF

Once cmake configure is done, we get to the fun part, building:

cmake --build .

If your build from source looks good, your screen should roughly look like this:

Screen Shot 2017-09-02 at 8.00.00 PM

Make install the ethminer binary

sudo make install

At this point you have successfully built your ”ethminer” binary and it should reside in “/usr/local/bin/ethminer”

Play around with your newly built binary with command-line options:

/usr/local/bin/ethminer --help

Before actually mining, you can benchmark / simulate to see what hash rate (MH/s) you can hit

/usr/local/bin/ethminer -M -U

Screen Shot 2017-09-02 at 9.22.26 PM

Once you have benchmarked and the MH/s match your expectations, it’s time to actually mine ethereum.

You need to create an ethereum wallet and get your own address to send your mined coins to. Go to the address below, create a wallet and save your credentials safely

https://www.myetherwallet.com/

Once you have created your <wallet address>, think of a <tag> to identify your machine (helpful if you are mining via multiple machines) and start mining:

/usr/local/bin/ethminer --farm-recheck 200 -U -S us1.ethermine.org:4444 -FS eu1.ethermine.org:4444 -O <wallet address>.<tag>

For example:

/usr/local/bin/ethminer --farm-recheck 200 -U -S us1.ethermine.org:4444 -FS eu1.ethermine.org:4444 -O 0x4ff2de61282aa5da02E5F8399DB7d47A66Be1465.barreleyeg2
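A mistyped wallet address means your shares get credited to nobody you know, so it is worth a quick format check before launching. Here is a small sketch that builds the same command line from a wallet address and a tag. ethminer_cmd is a made-up helper name, and the only check it does is the general Ethereum address shape (0x followed by 40 hex characters), not whether the address is actually yours.

```shell
#!/bin/sh
# Sketch: build the ethminer command line from a wallet address and a tag.
# ethminer_cmd is a hypothetical helper; the flags are the ones used above.
ethminer_cmd() {
    wallet="$1"
    tag="$2"
    # An Ethereum address is "0x" followed by 40 hexadecimal characters.
    if ! printf '%s\n' "$wallet" | grep -Eq '^0x[0-9a-fA-F]{40}$'; then
        echo "not a valid Ethereum address: $wallet" >&2
        return 1
    fi
    if [ -z "$tag" ]; then
        echo "usage: ethminer_cmd <wallet address> <tag>" >&2
        return 1
    fi
    printf '%s\n' "/usr/local/bin/ethminer --farm-recheck 200 -U -S us1.ethermine.org:4444 -FS eu1.ethermine.org:4444 -O $wallet.$tag"
}

# Prints the full command for the example wallet and tag used above.
ethminer_cmd 0x4ff2de61282aa5da02E5F8399DB7d47A66Be1465 barreleyeg2
```

The function only prints the command; copy it into your terminal (or a screen session) to start mining.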

Soon you will receive console output indicating completed shares and your hash rate:

Screen Shot 2017-09-02 at 9.44.26 PM

Now how do you confirm / check the number of shares / amount of ethereum you have mined? Go to the website below and look up your wallet address from above:

https://ethermine.org

You will be able to see details like below:

Screen Shot 2017-09-02 at 9.48.41 PM

You can use below command to check how much power your nvidia GPUs are consuming

nvidia-smi

Please comment below if you have questions or are interested in price of this server / setup OR profitability.

How to Install NVIDIA Drivers for P100 and CUDA on Barreleye G2 / Power9 (Work in Progress)

I’m trying to get the Nvidia Tesla P100 working with CUDA on the Barreleye G2 server. Barreleye G2 is a Power9 / OpenPOWER server that Rackspace and Google are building.

Posting the process I’m using, here, to help others trying to do the same.

Before Installing CUDA, make sure you have the right drivers for Nvidia Devices:

You have 2 Alternatives for Installing Drivers for NVIDIA devices:

1) Install the Power8 Drivers from NVIDIA website for Power9:

wget http://us.download.nvidia.com/tesla/384.59/nvidia-driver-local-repo-ubuntu1604-384.59_1.0-1_ppc64el.deb

dpkg -i nvidia-driver-local-repo-ubuntu1604-384.59_1.0-1_ppc64el.deb

OR

2) Install the APT recommended Drivers:

sudo apt-get install ubuntu-drivers-common
sudo ubuntu-drivers devices

Based  on output recommendation from above command, install recommended driver. For Example:
sudo apt-get install nvidia-384

Installing CUDA on Power9 / Barreleye G2:

Install dependency packages for CUDA

sudo apt-get install build-essential

Install the CUDA repository package specific to ppc64el and your release (in my case Ubuntu 16.04). For other releases, look up the specific deb package on the Nvidia website
wget https://developer.nvidia.com/compute/cuda/8.0/prod/local_installers/cuda-repo-ubuntu1604-8-0-local-ga2_8.0.54-1_ppc64el-deb
sudo dpkg -i ./cuda-repo-ubuntu1604-8-0-local-ga2_8.0.54-1_ppc64el-deb

Update the APT Definitions (You need to do this for above repo package to take effect ):

sudo apt-get update

Install CUDA Libraries

sudo apt-get install cuda

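Make CUDA accessible to all users (replace <profile-script> with a system-wide profile script, for example /etc/profile.d/cuda.sh):

echo 'export PATH=$PATH:/usr/local/cuda-8.0/bin' | sudo tee <profile-script>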
echo /usr/local/cuda-8.0/lib64 | sudo tee /etc/ld.so.conf.d/cuda-8-0.conf
sudo ldconfig

Check that your drivers and packages are all working together and that you see the devices, modules and details of your hardware

sudo dmesg | grep -i nvidia
sudo lsmod | grep nvidia
nvidia-smi --list-gpus

If things went okay, output for above 3 commands, should be meaningful.
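“Meaningful” here mostly means “not empty”, and that part is easy to script. A rough sketch is below. report_output and check_gpu_stack are made-up helper names, and the check only tells you whether each command printed anything, not whether what it printed is healthy.

```shell
#!/bin/sh
# Sketch: report whether each of the three checks above printed anything.
# report_output and check_gpu_stack are hypothetical helper names.

# report_output LABEL CMD...: run CMD and report whether it produced output.
report_output() {
    label="$1"
    shift
    if [ -n "$("$@" 2>/dev/null)" ]; then
        echo "OK: $label"
    else
        echo "EMPTY: $label"
        return 1
    fi
}

# Runs the same three commands as above; returns non-zero if any was empty.
check_gpu_stack() {
    status=0
    report_output "kernel messages mention nvidia" sh -c 'dmesg | grep -i nvidia' || status=1
    report_output "nvidia kernel modules loaded" sh -c 'lsmod | grep nvidia' || status=1
    report_output "GPUs listed by nvidia-smi" nvidia-smi --list-gpus || status=1
    return "$status"
}
```

Call check_gpu_stack as root (dmesg may need it) after the driver and CUDA install. Any EMPTY line points at the step to revisit.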

Now let’s check whether the CUDA install went okay by building and running a CUDA sample application:

mkdir ~/samples
cp -r /usr/local/cuda-8.0/samples/ ~/samples/
cd ~/samples/samples/7_CUDALibraries/simpleCUFFT
make
./simpleCUFFT

In my case, the CUDA application fails to run with the following errors. I’m trying to work out whether this is a Power9 / PPC OS related error or an Nvidia device error, and will edit this post with details. The install process should remain the same nevertheless.

root@ubuntu:~/samples/samples/7_CUDALibraries/simpleCUFFT# ./simpleCUFFT
[simpleCUFFT] is starting…
GPU Device 0: “Tesla P100-PCIE-16GB” with compute capability 6.0

[ 3457.484507] Severe Machine check interrupt [[Not recovered]
[ 3457.485099] Initiator: CPU
[ 3457.485332] Error type: Real address [Load/Store (foreign)]
[ 3457.485762] Effective address: 00003fff9e49208c
Bus error (core dumped)

 

How to Install Ubuntu Xenial 16.04 LTS on Power9 Machines

If you want to install the original Ubuntu 16.04 LTS (with the initial LTS kernel 4.4) on Power9, you were / are out of luck. This is because full kernel support for Power9 only landed from 4.10 onwards.

But to some jubilation, 16.04.3 LTS got released today (08/03/17) with support for the 17.04 kernel (4.10). So the workaround to install Ubuntu LTS on Power9 is to use the 16.04.3 LTS HWE kernel / initrd instead of the ones I indicated in my previous blog post:

Kernel:

http://ports.ubuntu.com/ubuntu-ports/dists/xenial-updates/main/installer-ppc64el/current/images/hwe-netboot/ubuntu-installer/ppc64el/vmlinux

Initrd:

http://ports.ubuntu.com/ubuntu-ports/dists/xenial-updates/main/installer-ppc64el/current/images/hwe-netboot/ubuntu-installer/ppc64el/initrd.gz

Again, follow the same procedure as in the previous blog post, BUT with the NEW kernel / initrd links.

Enjoy the first LTS port on Power9
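Once the install is done, you can confirm that the system really booted the HWE kernel rather than 4.4. The sketch below compares a kernel release string against the 4.10 threshold mentioned above. kernel_ok is a made-up helper name, and it only looks at the major and minor numbers of the release string.

```shell
#!/bin/sh
# Sketch: check whether a kernel release string is 4.10 or newer.
# kernel_ok is a hypothetical helper name used for illustration.

# kernel_ok VERSION: succeed if VERSION (e.g. the output of `uname -r`) >= 4.10.
kernel_ok() {
    major="${1%%.*}"            # text before the first dot
    rest="${1#*.}"              # text after the first dot
    minor="${rest%%[!0-9]*}"    # leading digits of the remainder
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "${minor:-0}" -ge 10 ]; }
}

# Check the running kernel.
if kernel_ok "$(uname -r)"; then
    echo "kernel $(uname -r): 4.10 or newer"
else
    echo "kernel $(uname -r): older than 4.10"
fi
```

If it reports an older kernel, the installer most likely used the non-HWE netboot images.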