Discussion:
NFS + Infiniband problem
Andrew Vylegzhanin
2018-12-06 02:21:04 UTC
Hi,

Back to this thread after a month.
I have several FreeBSD machines connected via an InfiniBand network
(FDR switch Mellanox SW3036 + ConnectX-3 VPI cards).
One of them is a NAS server with multiple ZFS pools.
All kernels (11.2-RELEASE on the clients and 12.0-BETA1 on the server, where
11.2 was also tried) are built with InfiniBand connected mode (option
IPOIB_CM, option SDM), and world is built with OFED stack support
(WITH_OFED='yes').
File transfers via FTP or SSH between the server and the clients work almost
flawlessly (~12 Gbit/s).
But when I try to copy a significant amount of data to or from an NFS share
mounted on the clients, NFS I/O either hangs completely or slows to a couple
of kB/s after an unpredictable amount of data has been copied. For example,
on one node I can copy a 1 GB file, and after that NFS hangs on a file of
size 30 k
I also need to test the setup with InfiniBand switched from connected mode
to datagram mode.

Tests with IPoIB in datagram mode give disappointing results for NFS: just
20 - 40 MB/s on average for NFS sequential reads, and even less for writes.
The InfiniBand interface also reports a significant number of output errors:
# netstat -nI ib0 1
            input            ib0           output
   packets  errs idrops      bytes    packets  errs      bytes colls
     11267     0      0    1172852      20471   652   42046744     0
     15628     0      0    1626860      28387   994   58257440     0
     16920     0      0    1761896      30832  1065   63196256     0
     13566     0      0    1410424      24882   722   51205964     0
     17942     0      0    1867312      32652  1114   67164484     0
      9443     0      0     982104      17340   525   35610908     0
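
To put the output error counts above in proportion, here is a small Python
sketch that divides the "errs" column by the output "packets" column for each
sample. The numbers are copied from the listing; whether the packets counter
already includes the errored packets is an assumption I have not verified, so
treat the ratio as a rough indicator only (it comes out at roughly 3%).

```python
# Output-side samples copied from the netstat listing above:
# (output packets, output errs) for each reporting interval.
samples = [
    (20471, 652),
    (28387, 994),
    (30832, 1065),
    (24882, 722),
    (32652, 1114),
    (17340, 525),
]


def error_ratio(packets, errs):
    """Return errs as a fraction of packets (0.0 when there are no packets)."""
    return errs / packets if packets else 0.0


total_packets = sum(p for p, _ in samples)
total_errs = sum(e for _, e in samples)

# Per-interval ratio, then the ratio over all six intervals.
for packets, errs in samples:
    print(f"{packets:6d} packets, {errs:5d} errs -> "
          f"{error_ratio(packets, errs):.2%}")

print(f"overall: {total_errs} errs / {total_packets} packets = "
      f"{error_ratio(total_packets, total_errs):.2%}")
```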

Similar transfer speeds and errors were observed with other protocols: FTP
and scp.

I've also tried mounting the NFS share over UDP, but without success.
With UDP, NFS operations hang with MTU errors on the IB interface:
kernel: ib0: packet len 4096 (> 2044) too long to send, dropping.
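
For scale, the sketch below works out how many IPv4 fragments a packet of
that size would need if it were fragmented to fit the 2044-byte limit from
the kernel message. It assumes that the reported "packet len" is the total
IPv4 packet length and that the header is 20 bytes with no options; both are
my assumptions, and fragments_needed is a hypothetical helper, not anything
from the kernel. It does not explain why the packet was dropped instead of
being fragmented.

```python
import math

IPV4_HEADER = 20  # bytes; assumes an IPv4 header without options


def fragments_needed(packet_len, mtu, header=IPV4_HEADER):
    """Number of IPv4 fragments needed to carry packet_len bytes over mtu.

    packet_len is the total packet length including the IP header.
    Every fragment except the last must carry a multiple of 8 payload bytes.
    """
    if packet_len <= mtu:
        return 1  # fits as-is, no fragmentation needed
    payload = packet_len - header
    per_fragment = (mtu - header) // 8 * 8  # payload bytes per full fragment
    return math.ceil(payload / per_fragment)


# Values taken from the kernel message: packet len 4096, limit 2044.
print(fragments_needed(4096, 2044))
```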

So, in short, IPoIB is effectively unusable in datagram mode.

Any ideas?

Regards,
Andrew
