Andy Malakov software blog

Wednesday, March 4, 2015

Connecting two CentOS computers using cheap Infiniband

This is a continuation of the previous post. This time I wanted to test a direct Infiniband connection on Linux.

Setup is the same:

  • Two retired developer desktops (built in 2008): AMD Opteron 2216 @ 2.4 GHz, 8 GB DDR2.
  • A pair of Mellanox Infinihost III adapters (MHGA28-XTC)
  • CentOS Linux 6.3 (Minimal Install in my case)
  • MLNX_OFED_LINUX-1.5.3-4.0.42-rhel6.3-x86_64.iso OFED driver (Still available on mellanox.com)

Linux setup is pretty straightforward, but in my opinion more involved than on Windows. The main problem was the age of these cards. To avoid rebuilding the OFED drivers for them, I used an old version of CentOS (6.3). I tried OFED 2.x but got an MFE_OLD_DEVICE_TYPE error. Besides, I wanted to test SDP in Java 7, and this protocol seems to be no longer available in OFED 2.x and later.

Bottom line: for these old Infinihost III-family cards, use the older OFED driver (1.5.3). If you don't want to rebuild the driver, use one of the Linux distributions/versions the driver is built for (there are quite a few).

I found the following two resources most useful for this project: A and B. There is no reason to repeat their steps here. Connection verification and testing with the OFED utilities is similar to the Windows version.

Configuring a simple Java socket application to use SDP worked like a charm. See Oracle's tutorial.
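A minimal sketch of what such an application could look like (this is not the original test code; the class name SdpEchoDemo and its helper methods are made up for illustration). The point it tries to show is that the Java side stays ordinary java.net code. Per Oracle's tutorial, SDP is switched on from outside the program through a rules file passed with -Dcom.sun.sdp.conf, with rules such as `bind 192.0.2.1 *` or `connect 192.0.2.0/24 1024-*` (the addresses here are placeholders for your own IPoIB addresses).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: a plain java.net echo pair with nothing SDP-specific in it.
// Whether a connection uses SDP or TCP is decided by the JVM from the rules file.
class SdpEchoDemo {

    /** Accepts one connection on the given server socket and echoes one line back. */
    static Thread startEchoServer(ServerSocket server) {
        Thread t = new Thread(() -> {
            try (Socket s = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
                 PrintWriter out = new PrintWriter(
                         new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
                out.println(in.readLine());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Sends one line to host:port and returns the echoed reply. */
    static String sendAndReceive(String host, int port, String message) {
        try (Socket s = new Socket(host, port);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
            out.println(message);
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Loopback round trip: a smoke test before moving to the IPoIB addresses. */
    static String roundTripOnLoopback(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            startEchoServer(server);
            return sendAndReceive(loopback.getHostAddress(), server.getLocalPort(), message);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripOnLoopback("hello over sockets"));
    }
}
```

To exercise SDP rather than loopback TCP, bind the server socket to one machine's IPoIB address, point sendAndReceive at it from the other machine, and start both JVMs with something like `java -Dcom.sun.sdp.conf=sdp.conf -Djava.net.preferIPv4Stack=true SdpEchoDemo`, where sdp.conf is an assumed file name holding rules that match those addresses.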

Connecting two Windows 7 computers with low-cost Infiniband

Previous-generation Infiniband cards are selling for a fraction of their original price on eBay. Developers are buying these setups to test and learn the technology. I followed this path and am posting my notes here. The setup wasn't easy, and I had to collect information from various sources.

Hardware

  • A pair of Mellanox Infinihost III adapters, MHGA28-XTC ($36)
  • A pair of Molex 4X Infiniband copper cables ($12.5)

Total price tag was $97 (including shipping).

These are old-generation dual-port InfiniBand adapter cards that fit into PCI Express x8 slots. Each card has two 20Gb/s ports. I used two retired Supermicro desktops (circa 2008) that match these cards in age. Each computer is running Windows 7 (x64). [Next post will explore the same hardware setup on Linux].

Infinihost allows a direct connection between two computers (in a point-to-point setup there is no need for an Infiniband switch).

Firmware Update

Multiple sources recommend upgrading the cards' firmware before trying them with Windows.

There are several revisions of MHGA28-XTC cards, mine was A3 (check the sticker attached to the back of each card). Firmware can be downloaded from Mellanox here.

For firmware upgrades and basic status checks, Mellanox provides the MFT utility set. In my case the latest MFT version (3.8) refused to work with these cards, claiming they are no longer supported. Luckily, MFT version 2.7.2 is still available and works with these old Infinihost-family cards:

C:\Program Files\Mellanox\WinMFT>mst status
MST devices:
------------
  mt25218_pciconf0
  mt25218_pci_cr0

C:\Program Files\Mellanox\WinMFT>mlxburn -dev mt25218_pci_cr0 -image fw-25218-5_3_000-MHGA28-XTC_A3.bin

    Current FW version on flash:  5.2.916
    New FW version:               5.3.0

Read and verify Invariant Sector            - OK
Read and verify PPS/SPS on flash            - OK
Burning second FW image without signatures  - OK
Restoring second signature                  - OK
-I- Image burn completed successfully.

Windows Driver

Initially these cards showed up as "Infiniband controller" in Windows Device Manager:

Drivers for these cards are available from Mellanox and OpenFabrics.org (OFED). I believe both of these sources actually provide the same driver, maintained by OFED (sponsored by Mellanox).

Here I had the same story: the latest OFED driver version (3.2) simply didn't work with these cards. Setup ended with a "Possible NetworkDirect startup failure" warning. The installed driver would identify the card properly, but a yellow triangle indicated that the device was disabled due to errors. The Windows event log showed that some driver components failed to initialize.

After some trial and error I found that OFED driver version 2.3 was what I needed. It can be downloaded from the OpenFabrics archive.

As you can see below, in addition to the Infiniband card, Device Manager showed two OpenFabrics IPoIB adapters (since each card has two ports):

Configuration

I repeated the above steps on both computers and connected the cards with cables.

The OFED software comes with a set of utilities, one of which (ibstat) can be used to check connectivity status:

C:\Windows\system32>ibstat
CA 'ibv_device0'
        CA type:
        Number of ports: 2
        Firmware version: 0x500030000
        Hardware version: 0x20
        Node GUID: 0x0002c9020023c250
        System image GUID: 0x0002c9020023c253
        Port 1:
                State: Initializing
                Physical state: LinkUp
                Rate: 20
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x90580000
                Port GUID: 0x0002c9020023c251
                Link layer: IB
        Port 2:
                State: Initializing
                Physical state: LinkUp
                Rate: 10
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x90580000
                Port GUID: 0x0002c9020023c252
                Link layer: IB
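The Rate values above (20 and 10) are link signaling rates in Gb/s. As a small worked example of the usual InfiniBand arithmetic: a 4X link has four lanes, SDR signals at 2.5 Gb/s per lane and DDR at 5 Gb/s per lane, and both use 8b/10b line coding, so only 8 of every 10 bits on the wire carry data. The class and method names below are made up for illustration. The post does not say why port 2 reported 10; one possible reading is that it negotiated SDR rather than DDR.

```java
// Hypothetical helper: signaling and data rates for SDR/DDR InfiniBand links.
class IbLinkRate {

    /** Signaling rate in Gb/s: number of lanes times the per-lane rate. */
    static double signalingGbps(int lanes, double perLaneGbps) {
        return lanes * perLaneGbps;
    }

    /** Data rate in Gb/s after 8b/10b coding (8 data bits per 10 bits on the wire). */
    static double dataGbps(int lanes, double perLaneGbps) {
        return signalingGbps(lanes, perLaneGbps) * 8 / 10;
    }

    public static void main(String[] args) {
        System.out.println("4X DDR signaling: " + signalingGbps(4, 5.0) + " Gb/s, data: "
                + dataGbps(4, 5.0) + " Gb/s");
        System.out.println("4X SDR signaling: " + signalingGbps(4, 2.5) + " Gb/s, data: "
                + dataGbps(4, 2.5) + " Gb/s");
    }
}
```

So a "20Gb/s" DDR port can carry at most 16 Gb/s of payload before any protocol overhead.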

Subnet Manager

When these cards are connected directly, we need to launch the Infiniband Subnet Manager (opensm) ourselves. OFED installs it as a Windows service (disabled by default). In my case I launched opensm from the command line. You need to run this service on both computers.

C:\Windows\system32>opensm
-------------------------------------------------
OpenSM 3.3.6 UMAD
Command Line Arguments:
 Log File: %TEMP%\osm.log
-------------------------------------------------
OpenSM 3.3.6 UMAD

Entering DISCOVERING state

Using default GUID 0x2c9020023c251
Entering MASTER state

SUBNET UP

Entering STANDBY state

Each service can be configured to serve both ports (enter the GUID of each port into the opensm configuration file).

After this step Windows should show your network status as connected:

If you plan to keep this setup running, OpenSM can be launched automatically as Windows Service (disabled by default).

Connectivity test

OFED has a special ping utility (ibping) that can be used for a quick test.

Computer 1 (Here we print GUIDs of each port and launch ping server):

C:\Windows\system32>ibstat -p
0x0002c90200231745
0x0002c90200231746

C:\Windows\system32>ibping -S

Computer 2 (here we use GUID of the first computer's port):

C:\Windows\system32>ibping -G 0x0002c90200231745
Pong from ?hostname?.?domainname? (Lid 2): time 0.230 ms
Pong from ?hostname?.?domainname? (Lid 2): time 0.160 ms
Pong from ?hostname?.?domainname? (Lid 2): time 0.231 ms
Pong from ?hostname?.?domainname? (Lid 2): time 0.159 ms
Pong from ?hostname?.?domainname? (Lid 2): time 0.163 ms
Pong from ?hostname?.?domainname? (Lid 2): time 0.174 ms
(Never mind the weird host name.)

Latency test

Computer 1 (launching test server):

C:\Windows\system32>ib_send_lat -a -c RC

Computer 2 (test client):

C:\Windows\system32>ib_send_lat -a -c RC oldfaithful
------------------------------------------------------------------
                    Send Latency Test
Inline data is used up to 400 bytes message
Connection type : RC
test
  local address:  LID 0x100, QPN 0x6040200, PSN 0x265a0000, RKey 0x2c0010 VAddr 0x00000001170040
  remote address: LID 0x200, QPN 0x6040600, PSN 0xb64e0000, RKey 0x2c0030 VAddr 0x00000000fc0040
Mtu : 2048
------------------------------------------------------------------
 #bytes #iterations    t_min[usec]    t_max[usec]  t_typical[usec]
      2        1000           4.10        2295.99             4.27
      4        1000           3.75        1305.61             4.27
      8        1000           3.75         256.86             3.93
     16        1000           3.75         266.24             3.93
     32        1000           4.44        1021.62             4.61
     64        1000           4.44         329.22             4.61
    128        1000           4.61         303.79             4.78
    256        1000           4.95         529.41             5.12
    512        1000           5.46         300.89             5.63
   1024        1000           6.83         309.25             7.00
   2048        1000           9.22         327.17             9.39
   4096        1000          11.61         280.92            11.78
   8192        1000          16.90         306.86            17.07
  16384        1000          27.65         329.56            27.82
...
This hardware is 9 years old, so the numbers are sub-optimal. They are still much better than TCP, even without any special tuning.
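For readers who want a TCP figure to set against the table, here is a rough sketch of a ping-pong test in Java. It is not the tool used for the numbers above: the class name TcpPingPong and its methods are made up for illustration, it halves the round-trip time to estimate one-way latency, and the median it prints is only a rough stand-in for the t_typical column.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Arrays;

// Hypothetical TCP baseline: fixed-size messages bounced between client and server.
class TcpPingPong {

    /** Starts a daemon thread that accepts one client and echoes fixed-size messages. */
    static void serve(ServerSocket server, int messageSize) {
        Thread t = new Thread(() -> {
            try (Socket s = server.accept()) {
                s.setTcpNoDelay(true); // send small messages immediately
                DataInputStream in = new DataInputStream(s.getInputStream());
                DataOutputStream out = new DataOutputStream(s.getOutputStream());
                byte[] buf = new byte[messageSize];
                while (true) {
                    in.readFully(buf); // throws EOFException once the client closes
                    out.write(buf);
                    out.flush();
                }
            } catch (IOException e) {
                // Client disconnected (or server socket closed): the test is over.
            }
        });
        t.setDaemon(true);
        t.start();
    }

    /** Returns sorted one-way latency estimates (half the round trip) in microseconds. */
    static double[] measure(String host, int port, int messageSize, int iterations) {
        try (Socket s = new Socket(host, port)) {
            s.setTcpNoDelay(true);
            DataInputStream in = new DataInputStream(s.getInputStream());
            DataOutputStream out = new DataOutputStream(s.getOutputStream());
            byte[] buf = new byte[messageSize];
            double[] usec = new double[iterations];
            for (int i = 0; i < iterations; i++) {
                long start = System.nanoTime();
                out.write(buf);
                out.flush();
                in.readFully(buf);
                usec[i] = (System.nanoTime() - start) / 2_000.0; // ns round trip to usec one way
            }
            Arrays.sort(usec);
            return usec;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs server and client in one JVM over loopback, as a quick self-check. */
    static double[] measureOnLoopback(int messageSize, int iterations) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            serve(server, messageSize);
            return measure(loopback.getHostAddress(), server.getLocalPort(), messageSize, iterations);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[] usec = measureOnLoopback(64, 1000);
        System.out.println("min usec:    " + usec[0]);
        System.out.println("median usec: " + usec[usec.length / 2]);
        System.out.println("max usec:    " + usec[usec.length - 1]);
    }
}
```

For a comparison between the two machines, run serve on one of them bound to its Ethernet or IPoIB address and call measure from the other. The loopback run in main only checks that the code works and says nothing about the network.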

What's next?

We now have a low-latency 20Gb/s connection between two Windows 7 machines using a pair of cheap Infiniband adapters.

In theory this setup can be used for ultra-fast file sharing and the like. My primary interest was getting my hands on Infiniband and the OFED stack (and ultimately using it from Java). Unfortunately, the Sockets Direct Protocol (SDP), available in Java since version 7, a) is deprecated in the latest version of OFED and b) seems to be unsupported by Java on Windows anyway. There are various libraries that provide RDMA to Java using JNI wrappers.