We have discovered several bugs in the AIX TCP stack and one in the UDP
stack.

BUG 1:  (fall 2000)
  AIX stack bos.net.tcp.client.4.3.3.27
  The behavior can be consistently reproduced.
  When configured with sack=1 and tcp_newreno=0, a data transfer
  from the AIX host (e.g., ttcp, ftp) to a non-sack host over a lossy
  link will HANG.  netstat -a shows

  tcp4       0  16060  stingray.ccs.orn.36252 cm-208-160-120-1.commp ESTABLISHED

  tcpdump traces at both the receiver and the transmitter show that
  a packet drop has occurred and the receiver is sending dup ACKs.
  The AIX transmitter doesn't retransmit after the 3rd dup ACK,
  nor does it ever time out and retransmit that packet.  It just hangs forever!
  The failure happens every time there is packet loss, and we reproduced
  it in transfers to different hosts over other lossy links.
  We also traced the connection with SO_DEBUG, and AIX reports:
  ...
  675 ESTABLISHED:input 20d9a54b@11152cb3(win=7c00)<ACK> -> ESTABLISHED
  678 ESTABLISHED:input 20d9a54b@11152cb3(win=7c00)<ACK> -> ESTABLISHED
  980 ESTABLISHED:input 20d9a54b@11152cb3(win=7c00)<ACK> -> ESTABLISHED
  175 ESTABLISHED:user SLOWTIMO<PERSIST> -> ESTABLISHED
  685 ESTABLISHED:user SLOWTIMO<PERSIST> -> ESTABLISHED
  195 ESTABLISHED:user SLOWTIMO<PERSIST> -> ESTABLISHED
  ....
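
  The three "input" lines above carry the same pair of hex values around
  the "@", which is what a run of duplicate ACKs would look like.  A minimal
  sketch for counting such repeats in a saved trace follows.  The line layout
  is inferred only from the excerpt above, and the function name
  count_repeated_acks is hypothetical, not part of any AIX tool.

```python
import re
from collections import Counter

# Assumed layout, inferred from the excerpt only:
#   <time> <state>:input <hex>@<hex>(win=<hex>)<FLAGS> -> <state>
INPUT_RE = re.compile(
    r"^\s*\d+\s+\w+:input\s+([0-9a-f]+)@([0-9a-f]+)\(win=[0-9a-f]+\)"
)


def count_repeated_acks(lines):
    """Count how often each (hex, hex) pair appears on 'input' trace lines."""
    counts = Counter()
    for line in lines:
        match = INPUT_RE.match(line)
        if match:  # 'user SLOWTIMO' and other lines are skipped
            counts[(match.group(1), match.group(2))] += 1
    return counts


trace = [
    "675 ESTABLISHED:input 20d9a54b@11152cb3(win=7c00)<ACK> -> ESTABLISHED",
    "678 ESTABLISHED:input 20d9a54b@11152cb3(win=7c00)<ACK> -> ESTABLISHED",
    "980 ESTABLISHED:input 20d9a54b@11152cb3(win=7c00)<ACK> -> ESTABLISHED",
    "175 ESTABLISHED:user SLOWTIMO<PERSIST> -> ESTABLISHED",
]

for pair, n in count_repeated_acks(trace).items():
    print(pair, n)
```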

  With sack=0 on the AIX host, the same test proceeds "normally".
  There are retransmissions after the 3rd dup ACK (and some timeouts
  if there are multiple drops within a window -- i.e., normal "reno" behavior).

  Also if the target host is sack-capable, then the transfer follows
  normal behavior: SACK acks and retransmits and no timeouts.
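
  For reference, the "retransmit after the 3rd dup ACK" rule that the AIX
  sender fails to follow can be sketched as below.  This is a simplified
  illustration, not the AIX code: it treats any repeated cumulative ACK
  number as a duplicate and ignores the other conditions a real stack checks
  (no payload, unchanged window, outstanding data).  The function name and
  the threshold parameter are hypothetical.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the indices in `acks` where a Reno-style sender would
    fast-retransmit, i.e. on the `threshold`-th duplicate of an ACK number.

    `acks` is the sequence of cumulative ACK numbers seen by the sender.
    """
    points = []
    last_ack = None
    dup_count = 0
    for i, ack in enumerate(acks):
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                points.append(i)  # third duplicate: retransmit the lost segment
        else:
            last_ack = ack        # new data acknowledged, reset the counter
            dup_count = 0
    return points


# One original ACK for 1000 followed by three duplicates, then progress.
print(fast_retransmit_points([1000, 1000, 1000, 1000, 2000]))
```

  Feeding the ACK numbers from a tcpdump trace through such a counter gives
  the points at which a retransmission should appear in the sender's trace.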

  In January 2001, IBM provided patches to fix the problem.

BUG 2:  (fall 2000)
  AIX stack bos.net.tcp.client.4.3.3.27
  When configured with tcp_newreno=1 and sack=0, a data transfer
  from the AIX host (e.g., ttcp, ftp) over a lossy link does
  not do fast retransmit.  This is a more subtle problem, resulting
  in lower throughput.

  tcpdump traces at both the receiver and the transmitter show that
  when a packet drop occurs and the AIX box receives duplicate ACKs,
  it does NOT retransmit after the 3rd dup ACK; rather, it eventually
  times out and retransmits.

  The behavior is the same if sack=1 and the receiver is not sack-capable.

  In January 2001, IBM provided patches to fix the problem.

BUG 3:  (summer 2001)
  In our TCP-over-UDP client/server, an AIX client will,
  in the middle of the UDP flows, send an "ICMP port unreachable", causing
  the remote server to fail.  The AIX client continues to run, doing
  "timeout re-transmits", and the port is still there.  So it is a 
  transient (race?) condition seen so far only on the AIX GigE interface.

  Actually, the AIX server (probesrv) will also send unexpected ICMPs.
  The client shouldn't really stop (because neither side has a "connected"
  UDP port), but Linux 2.2 gets "connection refused" -- this is fixed in the
  2.4 kernel.
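
  The distinction between "connected" and unconnected UDP ports can be
  probed with a small test like the one below.  It is a sketch, not our
  client/server code: it sends one datagram to a local port that was just
  closed and reports whether the following receive is refused or simply
  times out.  The function name probe_closed_port is hypothetical, and the
  result depends on the operating system and on whether the ICMP message is
  actually generated and delivered.

```python
import socket


def probe_closed_port(connected, timeout=0.5):
    """Send one datagram to a (very likely) closed local UDP port and
    return "refused", "timeout", or "data" depending on what recvfrom does."""
    # Pick a port that should be closed: bind to an ephemeral port, then close.
    tmp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tmp.bind(("127.0.0.1", 0))
    addr = tmp.getsockname()
    tmp.close()

    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(timeout)
    try:
        if connected:
            s.connect(addr)            # "connected" UDP socket
            s.send(b"probe")
        else:
            s.sendto(b"probe", addr)   # unconnected UDP socket
        s.recvfrom(1024)
        return "data"
    except ConnectionRefusedError:
        return "refused"               # ICMP port unreachable surfaced as an error
    except socket.timeout:
        return "timeout"               # no error reported to the application
    finally:
        s.close()


print("connected:  ", probe_closed_port(True))
print("unconnected:", probe_closed_port(False))
```

  A stack that reports "refused" for the unconnected case is showing the
  Linux 2.2 behavior described above.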