[Bug 233759] igb (I210) + net.inet.ipsec.async_crypto=1 + aesni kills receiving queues and traffic
b***@freebsd.org
2018-12-03 21:59:14 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=233759

Lev A. Serebryakov <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Assignee|***@FreeBSD.org |***@FreeBSD.org
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2018-12-03 21:59:31 UTC

Lev A. Serebryakov <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Keywords| |IntelNetworking
b***@freebsd.org
2018-12-03 22:00:53 UTC

Lev A. Serebryakov <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Summary|igb (I210) + |igb (I210) +
|net.inet.ipsec.async_crypto |net.inet.ipsec.async_crypto
|=1 + aesni kills receiving |=1 + aesni kill receiving
|queues and traffic |queues and traffic
b***@freebsd.org
2018-12-04 09:56:08 UTC

--- Comment #1 from Lev A. Serebryakov <***@FreeBSD.org> ---
To clarify further: OUTBOUND traffic via igb1 is encrypted by IPsec, but it is
the INBOUND queues of igb0 that get blocked.

I have "net.inet.ip.redirect" disabled (to enable fast-forward).
I have "net.isr.dispatch=direct", as it is set by default.
I have "dev.igb.N.iflib.tx_abdicate" disabled, as it is set by default.
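
The three settings above can be compared against a captured sysctl dump with a short script. This is only a minimal sketch in Python, assuming the usual "name: value" output format of sysctl(8). The sample capture, the interface index 0, and the value "0" standing for "disabled" are hypothetical and not taken from the DUT.

```python
def parse_sysctl(text):
    """Parse sysctl output ("name: value" per line) into a dict."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            values[name.strip()] = value.strip()
    return values


def mismatches(actual, expected):
    """Return {name: (expected, actual)} for every setting that differs."""
    return {
        name: (want, actual.get(name))
        for name, want in expected.items()
        if actual.get(name) != want
    }


# Hypothetical capture; on the DUT this would come from running sysctl(8).
SAMPLE = """\
net.inet.ip.redirect: 0
net.isr.dispatch: direct
dev.igb.0.iflib.tx_abdicate: 0
net.inet.ipsec.async_crypto: 1
"""

# Settings described in the comment (values assumed, see above).
EXPECTED = {
    "net.inet.ip.redirect": "0",         # disabled, to enable fast-forward
    "net.isr.dispatch": "direct",        # default
    "dev.igb.0.iflib.tx_abdicate": "0",  # default (disabled)
}

if __name__ == "__main__":
    # An empty dict means the capture matches the described settings.
    print(mismatches(parse_sysctl(SAMPLE), EXPECTED))
```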
b***@freebsd.org
2018-12-04 21:19:12 UTC

Eric Joyner <***@freebsd.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@freebsd.org

--- Comment #2 from Eric Joyner <***@freebsd.org> ---
Can you clarify what happens when the queues stop working? Do the 1-3
queues stop working immediately when you set that sysctl, or do they just
randomly stop while receiving traffic?

And to confirm, the queues on igb1 aren't affected by this setting?
b***@freebsd.org
2018-12-04 21:39:40 UTC

--- Comment #3 from Lev A. Serebryakov <***@FreeBSD.org> ---
(In reply to Eric Joyner from comment #2)
They randomly stop while receiving traffic. It can take several minutes: for
example, all traffic passes fine for 2 minutes, then 1/4 of the traffic is lost
for good, then after another 30 seconds another 1/4 is lost (1/2 in total), then
the remaining 1/2 may pass for another 3-5 minutes, but eventually 3/4 is lost
and 3 queues are effectively stopped. There are no panics or any messages
in the kernel output.

I have never seen it recover, but, to be honest, I didn't wait more than 5 minutes.

I can say that "ifconfig igb0 down && ifconfig igb0 up" doesn't help.

I can't say anything about igb1 (the outbound interface) for sure, as I checked
the inbound one with "tcpdump", and it shows that 1/4, then 1/2, then 3/4 of the
traffic is not seen even by tcpdump. The traffic is generated (with pkt-gen on the
other end of the link, NOT ON THE SAME SYSTEM!) with a distinct pattern (a regular
loop over many source IPs), so holes in the traffic are easy to see by eye (when
the packet rate is low). And igb1 only sends traffic; it is a unidirectional UDP
test (as pkt-gen implies).
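
Because the generator loops over a known set of source IPs, the holes could also be found mechanically instead of by eye. This is a rough sketch in Python, assuming the source addresses have already been extracted from a tcpdump capture into a list. The source range is the one given in comment #6, and the "observed" list is a made-up example, not real capture data.

```python
import ipaddress


def expected_sources(first, last):
    """All IPv4 addresses from first to last, inclusive."""
    lo = int(ipaddress.IPv4Address(first))
    hi = int(ipaddress.IPv4Address(last))
    return [str(ipaddress.IPv4Address(n)) for n in range(lo, hi + 1)]


def missing_sources(observed, expected):
    """Expected source addresses that never appear in the capture."""
    seen = set(observed)
    return [ip for ip in expected if ip not in seen]


if __name__ == "__main__":
    expected = expected_sources("10.1.0.2", "10.1.0.5")
    # Hypothetical capture in which one of the four sources has vanished.
    observed = ["10.1.0.2", "10.1.0.3", "10.1.0.5"] * 10
    lost = missing_sources(observed, expected)
    print("missing:", lost, "fraction:", len(lost) / len(expected))
```

This only reports which sources disappeared; how flows map to the receive queues of igb0 is not modelled here.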

One more thing: sending alone works. I can turn on
"net.inet.ipsec.async_crypto=1" and send traffic from this very system with,
say, iperf3 (using it as an IPsec endpoint and not as a router), and it makes
things faster (which is why I tried to turn it on for routing), and everything
works.

BTW, I could try to build a kernel with INVARIANTS and WITNESS...
b***@freebsd.org
2018-12-04 22:41:21 UTC

--- Comment #4 from Lev A. Serebryakov <***@FreeBSD.org> ---
(In reply to Eric Joyner from comment #2)
Oh, one more thing: this system also has igb2 (in addition to igb0 and igb1),
which is used as the management interface (to access the system under test for
configuration and diagnostics), and it continues to work without problems in both
directions for traffic to and from this host (mostly ssh, of course).
b***@freebsd.org
2018-12-06 19:12:15 UTC

Sean Bruno <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|New |In Progress

--- Comment #5 from Sean Bruno <***@FreeBSD.org> ---
Lev:

Can you document your test case here? I'm curious what the two endpoints of
the test are and what you're setting up on your machines. I'm fairly ignorant
of how to set up IPsec and I'm not sure what it is doing to cause your
problems.
b***@freebsd.org
2018-12-07 12:37:31 UTC

--- Comment #6 from Lev A. Serebryakov <***@FreeBSD.org> ---
(In reply to Sean Bruno from comment #5)

I have three systems (they are separate physical systems, not VMs).

(1) Manager.
(2) Device Under Test ("DUT")
(3) Mirror.

Each system has 3 interfaces. One interface on each system is a management one,
used to connect from outside, and these interfaces are out of scope for this
description.

Manager system has two interfaces in question: "outbound" and "inbound".
- outbound has IP 10.1.0.2/24 and it is connected with "inbound" interface of
DUT (via dedicated switch).
- inbound has IP 10.10.10.2/24 and it is connected with "outbound" interface
of "Mirror".
Manager system doesn't have any special routing entries.

DUT system has two interfaces: "outbound" (igb1 in this ticket) and "inbound"
(igb0 in this ticket).
- "outbound" (igb1) has IP 10.2.0.1/24 and it is connected with "inbound"
interface of "Mirror".
- "inbound" (igb0) has IP 10.1.0.1/24 and it is connected with "outbound"
interface of "Manager" (via dedicated switch).
DUT has routing enabled and has "route -net 10.10.10.0/24 10.2.0.1".
DUT has the following IPsec settings:
============
add 10.2.0.1 10.2.0.2 esp 0x10001 -m tunnel -E aes-gcm-16 "wxyz0123456789abcdef";
add 10.2.0.1 10.2.0.` esp 0x10002 -m tunnel -E aes-gcm-16 "wxyz0123456789abcdef";
spdadd 10.1.0.0/24 10.10.10.0/24 udp -P out ipsec esp/tunnel/10.2.0.1-10.2.0.2/require;
spdadd 10.10.10.0/24 10.1.0.0/24 udp -P in ipsec esp/tunnel/10.2.0.2-10.2.0.1/require;
============
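
A note on the key length, since it is easy to get wrong: for ESP with AES-GCM (RFC 4106) the keying material is the AES key followed by a 4-octet salt, so the 20-character quoted string above should correspond to a 128-bit AES key plus salt. This is a small sketch in Python that checks this; the function name is made up for illustration and is not part of setkey(8) or the kernel.

```python
SALT_LEN = 4                 # RFC 4106: a 4-octet salt follows the AES key
AES_KEY_LENS = (16, 24, 32)  # AES-128 / AES-192 / AES-256, in bytes


def split_gcm_keymat(keymat: bytes):
    """Split ESP AES-GCM keying material into (aes_key, salt)."""
    aes_len = len(keymat) - SALT_LEN
    if aes_len not in AES_KEY_LENS:
        raise ValueError(f"unexpected keying material length: {len(keymat)} bytes")
    return keymat[:aes_len], keymat[aes_len:]


if __name__ == "__main__":
    # The quoted key from the SA entries above, taken as raw ASCII bytes.
    key, salt = split_gcm_keymat(b"wxyz0123456789abcdef")
    print(len(key) * 8, "bit AES key,", len(salt), "byte salt")
```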

Mirror system has two interfaces in question: "outbound" and "inbound".
- outbound has IP 10.10.10.1/24 and it is connected with "inbound" interface
of Manager.
- inbound has IP 10.2.0.2/24 and it is connected with "outbound" interface
of DUT.
Mirror has routing enabled and has "route -net 10.1.0.0/24 10.2.0.2".
Mirror has static ARP entries for 10.10.10.2-10.10.10.254 pointing to the
"Manager" "inbound" interface.
Mirror has the following IPsec settings:
============
add 10.2.0.1 10.2.0.2 esp 0x10001 -m tunnel -E aes-gcm-16 "wxyz0123456789abcdef";
add 10.2.0.1 10.2.0.` esp 0x10002 -m tunnel -E aes-gcm-16 "wxyz0123456789abcdef";
spdadd 10.10.10.0/24 10.1.0.0/24 udp -P out ipsec esp/tunnel/10.2.0.2-10.2.0.1/require;
spdadd 10.1.0.0/24 10.10.10.0/24 udp -P in ipsec esp/tunnel/10.2.0.1-10.2.0.2/require;
============

OK, that is the config. Effectively, it is a loop "Manager -> DUT -> Mirror ->
Manager" where the connection between DUT and Mirror additionally uses IPsec.
Manager and Mirror are much more powerful than DUT and can pass full wire-speed
traffic without any problems, with or without encryption.

Now to the test.

Manager generates (with netmap's pkt-gen) UDP traffic with the following
characteristics:

Transmit interface: "outbound"
Dst MAC: DUT "inbound"
Src IPs: 10.1.0.2:2000-10.1.0.5:2004
Dst IPs: 10.10.10.2:2000-10.10.10.128:2006
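
To get a feel for how many distinct flows these ranges can produce, here is a back-of-the-envelope sketch in Python. It assumes that the IP range and the port range of each endpoint are inclusive and vary independently; whether pkt-gen actually visits every combination depends on its options and version, so treat the result as an upper bound rather than a measured number.

```python
import ipaddress


def parse_range(spec):
    """Parse "ip:port-ip:port" into (first_ip, last_ip, first_port, last_port)."""
    first, last = spec.split("-")
    ip_a, port_a = first.split(":")
    ip_b, port_b = last.split(":")
    return (
        int(ipaddress.IPv4Address(ip_a)),
        int(ipaddress.IPv4Address(ip_b)),
        int(port_a),
        int(port_b),
    )


def endpoint_count(spec):
    """Number of distinct (ip, port) pairs in an inclusive range spec."""
    ip_a, ip_b, port_a, port_b = parse_range(spec)
    return (ip_b - ip_a + 1) * (port_b - port_a + 1)


SRC = "10.1.0.2:2000-10.1.0.5:2004"
DST = "10.10.10.2:2000-10.10.10.128:2006"

if __name__ == "__main__":
    src, dst = endpoint_count(SRC), endpoint_count(DST)
    print(src, "source endpoints x", dst, "destination endpoints =", src * dst)
```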

Manager receives all traffic (with netmap's pkt-gen) on the "inbound" interface
and measures bandwidth.

Now, if DUT has the default setting for async IPsec (turned off), it can pass
690Mbit/s or 199Kp/s. Any traffic below that passes without any losses.
For example, if I generate traffic at 64 packets per second (no prefix, not
kilo-packets!), I see every single packet returned to Manager from Mirror via
DUT. No problems here.


If I turn on async IPsec ("sysctl net.inet.ipsec.async_crypto=1" on DUT), then no
matter how much traffic is generated (I've tested with 64 packets per second, not
kilo-packets, just plain packets!), the receive queues of the DUT inbound
interface (igb0) stop working one by one.
b***@freebsd.org
2018-12-10 14:05:18 UTC

Sean Bruno <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@FreeBSD.org,
| |***@FreeBSD.org

--- Comment #7 from Sean Bruno <***@FreeBSD.org> ---
Adding gnn@ and ***@. I'm unsure what igb(4) could be doing here to interfere.
--
You are receiving this mail because:
You are the assignee for the bug.
Loading...