These results are from preliminary tests.
Configuration
(10/19/95) Molnar installed two Giganet ATM boards in rogue.css.ornl.gov, a 4-node (MP3/NIC B), 128 MB/node Paragon with one service node, running OSF 1.3.0.
(10/23/95) New libatm.a; OSF builds by Molnar on ccs.ornl.gov.
(10/25/95) A NIC A kernel was used.
(10/26/95) Second service node installed.
(10/27/95) Intel Paragon XPE being installed by Molnar.
Compute node ATM message-passing tests
We ran a number of message-passing tests using two compute nodes communicating through the two ATM interfaces. We ran echo tests, sink tests, and exchange tests to characterize message transfer time and bandwidth as a function of message size. (Click here for a description of the tests.) The maximum message size for an AAL5 ATM packet is 65536 bytes. We occasionally experienced node hangs on sink tests; there is a known underrun bug in the Intel NIC B mesh chip.
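As a concrete illustration of the echo test, the sketch below shows the NX form under some assumptions: the standard NX calls csend(), crecv(), and dclock() from <nx.h> are used, nodes 0 and 1 are the two compute nodes, and the node and message-type numbers are arbitrary. The ATM runs follow the same structure but use the libatm.a send/receive routines, whose interface is not shown in this report.

    /*
     * Minimal sketch of the node-to-node echo test, assuming the standard
     * Paragon NX calls csend()/crecv()/dclock() from <nx.h>.  Only the NX
     * form is shown; the ATM runs have the same structure but call the
     * libatm.a send/receive routines.
     */
    #include <stdio.h>
    #include <nx.h>

    #define MSG_TYPE  10L
    #define REPS      1000
    #define MAXLEN    65536L        /* largest AAL5 packet payload tested */

    static char buf[MAXLEN];

    int main(void)
    {
        long   me    = mynode();    /* nodes 0 and 1 are the two compute nodes */
        long   other = 1 - me;
        long   len   = 8;           /* small-message case; vary up to MAXLEN */
        double t0, t1;
        int    i;

        t0 = dclock();
        for (i = 0; i < REPS; i++) {
            if (me == 0) {          /* node 0: send, then wait for the echo */
                csend(MSG_TYPE, buf, len, other, 0L);
                crecv(MSG_TYPE, buf, len);
            } else {                /* node 1: receive, then echo it back */
                crecv(MSG_TYPE, buf, len);
                csend(MSG_TYPE, buf, len, other, 0L);
            }
        }
        t1 = dclock();

        if (me == 0)
            printf("one-way time: %g us\n", (t1 - t0) / (2.0 * REPS) * 1.0e6);
        return 0;
    }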
The following graph shows the bandwidth for ATM and NX communication between two compute nodes. Most of the tests do not copy the message from the receive buffer, but we do include data points for an echo with a bcopy() by the receiver. Note that the bandwidth for a bcopy() on the 50 MHz i860XP is about 63 MB/s. The bandwidth of OC12 is 622 Mb/s, with 599 Mb/s available to the ATM layer. Subtracting the overhead of the ATM/AAL5 headers yields an effective data rate of 67.8 MB/s. As can be seen from the graph, a single node can saturate the OC12. An MP node can generate a 240 MB/s load on memory, and NX data rates of 171 MB/s have been measured (1 MB message) (see ``Beta Testing the Intel Paragon MP''). The Giganet ATM implementation routes data from the ATM interface on the I/O node directly into the Paragon mesh, bypassing memory copies on the I/O node and OSF's IPC. OSF uses IPC to service I/O and non-ATM network requests, and the IPC has a measured peak bandwidth of only 45 MB/s (NIC B) and 35 MB/s (NIC A). At a block size of 64 KB, the NIC B IPC bandwidth is only 16 MB/s, and the NIC A bandwidth is 13 MB/s. For an 8-byte message, the IPC one-way transfer time is about 1.3 ms.
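The 67.8 MB/s figure follows from the 48-byte payload carried in each 53-byte ATM cell; the small check below (plain C, nothing Paragon-specific) reproduces the arithmetic, ignoring the per-packet AAL5 trailer, which is negligible at these packet sizes.

    /* Back-of-envelope check of the 67.8 MB/s figure: each 53-byte ATM
     * cell carries 48 bytes of payload, so roughly 48/53 of the 599 Mb/s
     * available to the ATM layer survives as AAL5 payload (the 8-byte
     * AAL5 trailer per packet is ignored here). */
    #include <stdio.h>

    int main(void)
    {
        double atm_layer_mbps = 599.0;                  /* Mb/s after SONET overhead */
        double payload_mbps   = atm_layer_mbps * 48.0 / 53.0;

        printf("%.1f Mb/s = %.1f MB/s\n", payload_mbps, payload_mbps / 8.0);
        /* prints: 542.5 Mb/s = 67.8 MB/s */
        return 0;
    }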
The following graph shows the time to transfer a message between compute nodes using ATM and NX. The minimum one-way ATM transfer time is around 125 us. For comparison, the minimum NX transfer time is about 28 us, and IP-over-Ethernet times are typically on the order of 600 us. Of course, for cross-country tests the speed of light will increase latencies to tens of milliseconds. As the graph shows, the transfer time for the sink test is roughly that of NX, 41 us, for small messages; that represents more than 24,000 packets per second.
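For reference, the sink test differs from the echo test only in that the traffic is one-way; a minimal NX sketch is below, again assuming the standard <nx.h> calls, with the receiver doing the timing. At the 41 us per small message quoted above, 1/41e-6 works out to roughly 24,000 messages per second.

    /* Minimal sketch of the one-way sink test under NX (<nx.h> assumed):
     * node 0 streams messages as fast as it can, node 1 only receives and
     * times them.  The small-message one-way rate is then elapsed time
     * divided by the number of timed messages. */
    #include <stdio.h>
    #include <nx.h>

    #define MSG_TYPE 20L
    #define REPS     10000
    #define LEN      8L             /* small-message case */

    static char buf[65536];

    int main(void)
    {
        long   me = mynode();
        double t0, t1;
        int    i;

        if (me == 0) {                              /* transmitter */
            for (i = 0; i < REPS; i++)
                csend(MSG_TYPE, buf, LEN, 1L, 0L);
        } else {                                    /* receiver does the timing */
            crecv(MSG_TYPE, buf, sizeof(buf));      /* first message starts the clock */
            t0 = dclock();
            for (i = 1; i < REPS; i++)
                crecv(MSG_TYPE, buf, sizeof(buf));
            t1 = dclock();
            printf("%.1f us/msg, %.0f msgs/s\n",
                   (t1 - t0) / (REPS - 1) * 1.0e6,
                   (REPS - 1) / (t1 - t0));
        }
        return 0;
    }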
We also ran an echo test between four compute nodes using the pattern 0--nx--1--atm--2--nx--3. This would be similar to the PVM configuration (except that nodes 1 and 2 would be service nodes). This configuration had a one-way transfer time of 231 us and a bandwidth of 19 MB/s.
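A sketch of the forwarding loop on one of the intermediate nodes may make the pattern clearer. The NX half uses the real csend()/crecv() calls, but atm_send() below is a purely hypothetical placeholder standing in for the libatm.a send routine, whose actual interface is not documented in this report; only the node 0 to node 3 direction is shown, and the return path mirrors it.

    /* Sketch of node 1 in the 0--nx--1--atm--2--nx--3 relay pattern:
     * messages arrive from node 0 over the mesh (NX) and are forwarded out
     * the ATM interface toward node 2.  Node 2 does the mirror image
     * (ATM in, NX out to node 3), and the reply path reverses the roles. */
    #include <nx.h>

    #define MSG_TYPE 30L
    #define MAXLEN   65536L

    /* HYPOTHETICAL placeholder for the libatm.a send entry point; the real
     * name and signature are not given in this report. */
    static void atm_send(char *buf, long len) { (void)buf; (void)len; }

    static char buf[MAXLEN];

    void relay_mesh_to_atm(long reps)
    {
        long i;

        for (i = 0; i < reps; i++) {
            crecv(MSG_TYPE, buf, MAXLEN);   /* from node 0 over the mesh */
            atm_send(buf, MAXLEN);          /* out the ATM interface toward node 2 */
        }
    }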
The following two graphs show the variation in round-trip times between two compute nodes using ATM. The jitter is typical of NX node-to-node communication. The duration of the test was 0.52 seconds, with an average round-trip time of 260 us (minimum of 250 us, maximum of 394 us). There is a 70 us spike every 10 ms from OSF. If both nodes experience a timer interrupt during a message transmit, then a 390 us spike may occur. Of course, if one or both nodes are time-shared service nodes, variations in round-trip times are more dramatic! Measuring ATM echo times between two service nodes, we have seen 13 ms spikes from time-sharing.
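A rough sketch of how such a jitter measurement can be made under NX is shown below, assuming the usual <nx.h> calls: each individual round trip is timed with dclock() rather than only the aggregate, and the spread is reported. Spikes of roughly 70 us above the minimum line up with the 10 ms OSF timer tick.

    /* Sketch of the jitter measurement: time every individual round trip
     * instead of only the aggregate, then report min/avg/max.  Assumes
     * the standard NX calls from <nx.h>. */
    #include <stdio.h>
    #include <nx.h>

    #define MSG_TYPE 40L
    #define REPS     2000
    #define LEN      8L

    static char   buf[LEN];
    static double rtt[REPS];        /* per-iteration round-trip times, in seconds */

    int main(void)
    {
        long   me = mynode(), other = 1 - me;
        double t0, min = 1e9, max = 0.0, sum = 0.0;
        int    i;

        for (i = 0; i < REPS; i++) {
            if (me == 0) {
                t0 = dclock();
                csend(MSG_TYPE, buf, LEN, other, 0L);
                crecv(MSG_TYPE, buf, LEN);
                rtt[i] = dclock() - t0;
            } else {
                crecv(MSG_TYPE, buf, LEN);
                csend(MSG_TYPE, buf, LEN, other, 0L);
            }
        }
        if (me == 0) {
            for (i = 0; i < REPS; i++) {
                if (rtt[i] < min) min = rtt[i];
                if (rtt[i] > max) max = rtt[i];
                sum += rtt[i];
            }
            printf("round trip: min %.0f us  avg %.0f us  max %.0f us\n",
                   min * 1e6, sum / REPS * 1e6, max * 1e6);
        }
        return 0;
    }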
The Giganet ATM interface is capable of transferring full OC12 bandwidth in both directions concurrently. However, we have been unable to demonstrate that with our tests. Setting up concurrent sink tests in both directions yielded an aggregate bandwidth of 68 MB/s.
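One straightforward way to keep traffic flowing in both directions at once under NX is the exchange pattern sketched below, using the standard asynchronous irecv()/msgwait() calls; the concurrent-sink setup used for this measurement is similar in spirit, with each node transmitting and receiving at the same time. This is a sketch, not the exact test code.

    /* Sketch of a bidirectional (exchange-style) transfer under NX: each
     * node posts an asynchronous receive with irecv(), sends its own
     * buffer, then waits with msgwait(), so data moves both ways at once.
     * Assumes the usual NX calls from <nx.h>. */
    #include <nx.h>

    #define MSG_TYPE 50L
    #define LEN      65536L

    static char sendbuf[LEN], recvbuf[LEN];

    void exchange(long other, long reps)
    {
        long i, mid;

        for (i = 0; i < reps; i++) {
            mid = irecv(MSG_TYPE, recvbuf, LEN);      /* post the receive first */
            csend(MSG_TYPE, sendbuf, LEN, other, 0L);
            msgwait(mid);                             /* wait for the incoming data */
        }
    }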
Service node ATM message-passing tests
We performed similar message-passing tests from a service node to a compute node. When the service node is idle, performance is consistent with the compute-node-to-compute-node ATM results. The sink test actually showed a transmit rate of nearly 72 MB/s, though the receiver was still only reporting 68 MB/s. We also did some tests with two tasks on one service node. Time-sharing a single node reduces the data rates, and performance was often excruciatingly slow. The best we have seen for two tasks communicating on the same service node through ATM is a latency of 800 us and a bandwidth of 21 MB/s.
Running on two service nodes (10/26/95), we measured a latency of 139 us and a bandwidth of 29 MB/s for the echo test. The exchange data rate between two service nodes reached 58 MB/s. Sink tests between service nodes invariably lost packets because of time-sharing and insufficient application receive buffers. It was also necessary to lock down the receive buffers on the service nodes so that they would not be paged. Not surprisingly, ATM communication is better from a compute node than from a service node, but the initial Paragon PVM/ATM implementation (like the normal Paragon PVM implementation) would use a service node on each Paragon for Paragon-to-Paragon ATM communication.
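Locking the receive buffers down amounts to pinning them in physical memory; a minimal sketch is below, assuming the standard Unix mlock() call is available (and permitted) under Paragon OSF, which is an assumption on our part rather than something documented in this report.

    /* Minimal sketch of pinning the application receive buffers on a
     * time-shared service node so they cannot be paged out while the ATM
     * interface is delivering data.  Assumes the standard Unix mlock()
     * call exists and is permitted under Paragon OSF (not verified here);
     * mlock() normally requires appropriate privilege. */
    #include <stdio.h>
    #include <sys/mman.h>

    #define NBUFS  10              /* matches the 10-buffer receive queue mentioned below */
    #define BUFLEN 65536

    static char recvq[NBUFS][BUFLEN];

    int main(void)
    {
        if (mlock(recvq, sizeof(recvq)) != 0) {
            perror("mlock");       /* buffers stay pageable; sink tests may drop packets */
            return 1;
        }
        /* ... hand the locked buffers to the ATM receive code here ... */
        return 0;
    }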
NIC A tests
The preceding results were run on rogue with a NIC B kernel. We re-ran the tests using a NIC A kernel. NIC A boards are found on the older GP Paragons, and NIC A's will be on the SC 95 Paragons. Under NIC A, the NX transfer time for a 0-byte message climbs to 36 us, and the data rate for a 1 MB message drops to 90 MB/s. Node-to-node ATM numbers are 140 us and 20 MB/s (echo). Sink data rates are 58 MB/s, and we observed that the transmitter would overrun the receiver for messages greater than 10 KB. (We have a receive queue of 10 buffers, which was adequate in the NIC B tests.)
The following table summarizes message-passing performance as of 10/25/95. The time is the one-way transfer time for a small message; the bandwidth is measured for a 1 MB message.
PVM/ATM tests