Discussion:
[Bug 230996] em/igb: Intel i210/i350: ifconfig: enabling "vlanhwtag" renders VLAN on i210/i350 NICs unusable
b***@freebsd.org
2018-08-29 17:21:49 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230996

Mark Linimon <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Assignee|***@FreeBSD.org |***@FreeBSD.org
--
You are receiving this mail because:
You are the assignee for the bug.
b***@freebsd.org
2018-09-29 15:30:42 UTC

Lev A. Serebryakov <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@FreeBSD.org

--- Comment #1 from Lev A. Serebryakov <***@FreeBSD.org> ---
I have a very similar problem: igb/I210, FreeBSD 12-ALPHA8 (r339009).
When I enable "vlanhwtag" on the server, clients on this VLAN receive UDP
packets with broken checksums.
For example, a client cannot obtain an address via DHCP from a server with
"vlanhwtag" enabled: the DHCP client never sees the answers, because the kernel
drops them due to an invalid checksum. tcpdump sees the DHCPOFFER on the
client's interface, but dhclient doesn't receive anything.
--
b***@freebsd.org
2018-09-29 18:03:47 UTC

--- Comment #2 from Sean Bruno <***@FreeBSD.org> ---
I'm unclear if this is related to PR231416 or not. There needs to be a bit
more clarification in what the vlanhwtag needs to do or if setting this flag
somehow breaks what the udp stack is expecting.
--
b***@freebsd.org
2018-09-29 20:49:50 UTC

--- Comment #3 from Lev A. Serebryakov <***@FreeBSD.org> ---
(In reply to Sean Bruno from comment #2)
My case is PR231416 for sure. TCP, for example, seems to work with this
capability enabled. I'm not sure about non-bpf-originated UDP, though.
--
b***@freebsd.org
2021-04-26 20:19:48 UTC

Brock Williams <***@cottonwoodcomputer.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@cottonwoodcomputer.com

--- Comment #20 from Brock Williams <***@cottonwoodcomputer.com> ---
We're seeing the same, or at least very similar, issue on 13.0-RELEASE. In our
case, we only have the problem if the port is in gigabit mode. If we limit the
switch to 100, we don't see the issue.

Our configuration is passing a vlan over a bridge interface:

ifconfig_igb0="DHCP"
ifconfig_igb1="inet 172.28.1.1 netmask 255.255.255.0"
dhcpd_ifaces="igb1"

vlans_igb0="1501"
vlans_igb1="1501"
ifconfig_igb0_1501="up"
ifconfig_igb1_1501="up"

cloned_interfaces="bridge0"
ifconfig_bridge0="addm igb0.1501 addm igb1.1501 up"


Adding -vlanhwtag to igb0 and igb1 resolves it.
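
A minimal sketch of how this workaround might be made persistent, assuming the flag is simply appended to the existing ifconfig_ lines in rc.conf (the comment does not show the exact lines that were used). For a one-off test without a reboot, "ifconfig igb0 -vlanhwtag" run as root should have the same effect.

```shell
# /etc/rc.conf fragment (sketch, not the reporter's actual file):
# the same parent interfaces as above, with hardware VLAN tagging turned off.
ifconfig_igb0="DHCP -vlanhwtag"
ifconfig_igb1="inet 172.28.1.1 netmask 255.255.255.0 -vlanhwtag"
```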


pciconf -lv:

***@pci0:2:0:0: class=0x020000 rev=0x03 hdr=0x00 vendor=0x8086
device=0x1533 subvendor=0x15d9 subdevice=0x1533
vendor = 'Intel Corporation'
device = 'I210 Gigabit Network Connection'
class = network
subclass = ethernet
--
b***@freebsd.org
2021-04-27 02:20:38 UTC

Kevin Bowling <***@freebsd.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@freebsd.org

--- Comment #21 from Kevin Bowling <***@freebsd.org> ---
Can all of you test this https://reviews.freebsd.org/D30002 - no guarantees but
there were some obvious problems to fix.

Aaron, your issue I am not sure about. It looks completely unrelated to
networking. Can you capture a crash dump?
--
b***@freebsd.org
2021-04-27 15:51:24 UTC

--- Comment #22 from Brock Williams <***@cottonwoodcomputer.com> ---
I tested my case with this patch, and it doesn't seem to make an improvement.
I'm still seeing extremely slow throughput across the interface when vlanhwtag
is enabled.
--
b***@freebsd.org
2021-04-27 16:32:29 UTC

--- Comment #23 from Aaron <***@gmail.com> ---
(In reply to Kevin Bowling from comment #21)

So after much poking around, turns out there's something in FreeBSD bridges and
VLANs, where if you have a bridge on the untagged interface, it causes problems
with tagged traffic making it to bridges on the tagged interfaces. Once I
switched to using bridges only on VLAN tagged interfaces, everything
miraculously started working. Extremely annoying.
--
b***@freebsd.org
2021-04-28 13:31:02 UTC

Kaho Toshikazu <***@elam.kais.kyoto-u.ac.jp> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@elam.kais.kyoto-u.ac.jp

--- Comment #24 from Kaho Toshikazu <***@elam.kais.kyoto-u.ac.jp> ---
Created attachment 224497
--> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=224497&action=edit
vlan patch

How about this patch?

1. Change the initialization order:
set up VLAN inside the function em_initialize_receive_unit().

2. Correct the calculation of the receive buffer size.

3. Correct the method of writing the VFTA (VLAN filter table array).
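
For readers unfamiliar with the VFTA: in the e1000-family drivers it is generally laid out as an array of 32-bit words with one bit per VLAN ID. The sketch below only illustrates that indexing arithmetic, assuming 128 words covering IDs 0 to 4095. The function name is made up for illustration and nothing here reproduces the attached patch.

```shell
#!/bin/sh
# Hypothetical helper: print the VFTA word index and bit position for a
# VLAN ID, assuming 128 32-bit words covering VLAN IDs 0-4095.
vfta_slot() {
    vid=$1
    index=$(( (vid >> 5) & 0x7F ))   # which 32-bit word holds this ID
    bit=$(( vid & 0x1F ))            # which bit inside that word
    echo "index=$index bit=$bit"
}

# VLAN ID taken from the rc.conf in comment #20.
vfta_slot 1501
```

For VLAN 1501 this prints index=46 bit=29, since 46 * 32 + 29 = 1501.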
--
b***@freebsd.org
2021-04-30 04:28:42 UTC

--- Comment #25 from Kevin Bowling <***@freebsd.org> ---
I've incorporated Kaho Toshikazu's fixes in https://reviews.freebsd.org/D30002
if anyone can retest.

It does sound like there is special interaction with if_bridge I need to chase
down, but I would appreciate any feedback on the above patch as it should be
ready to go barring any further review feedback.
--
b***@freebsd.org
2021-04-30 14:40:56 UTC

--- Comment #26 from Kaho Toshikazu <***@elam.kais.kyoto-u.ac.jp> ---
(In reply to Kevin Bowling from comment #25)
I have not tested review D30002, but I think it contains some inappropriate
parts related to "1. Call down to the hw support routine on vlan register and
unregister events".
I think that "call hw support" would not solve the problem
but would cause other trouble. I think the vlan register/unregister parts of
the current code are not broken.

I think that using E1000_WRITE_REG_ARRAY directly is an obvious bug
that was introduced in the iflib conversion.
The driver stores a device-specific function for writing the VFTA in
the e1000_write_vfta variable, and should call through this variable
to write the VFTA.
I think correcting only this area first is a good course.
--
b***@freebsd.org
2021-04-30 15:24:29 UTC

--- Comment #27 from Brock Williams <***@cottonwoodcomputer.com> ---
I tested D30002 this morning, and in my case it didn't improve anything but
didn't seem to break anything obvious either. Still poor throughput across the
VLAN with hwtag enabled.
--
b***@freebsd.org
2021-05-01 10:19:25 UTC

--- Comment #28 from Kaho Toshikazu <***@elam.kais.kyoto-u.ac.jp> ---
Created attachment 224588
--> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=224588&action=edit
iflib-vlan patch

It contains a correction on the iflib side, with debug code,
and it does not conflict with the other patch.
--
b***@freebsd.org
2021-05-02 00:18:42 UTC

--- Comment #29 from Kevin Bowling <***@freebsd.org> ---
(In reply to Kaho Toshikazu from comment #28)
Can you explain how this helps? It looks to me that the current
iflib_vlan_register compares two softcs as expected.
--
b***@freebsd.org
2021-05-02 01:17:38 UTC

--- Comment #30 from Kaho Toshikazu <***@elam.kais.kyoto-u.ac.jp> ---
(In reply to Kevin Bowling from comment #29)

EVENTHANDLER_REGISTER seems to require a (struct ifnet *), which is
typedef'd as if_t in if_var.h. When a vlan interface is created or destroyed,
the vlan_config/unconfig handler is called, but with the current code I cannot
observe any execution of iflib_vlan_register or iflib_vlan_unregister.
Converting the argument from ctx to ifp makes the handlers execute.

These handlers are then called whenever an event occurs, whether it is for
another device or for this one. The comparison is required to pick out the
events that belong to this device driver. The handler is called with an if_t
because, with the above modification, it is registered with an if_t instead of
an if_ctx_t.
--
b***@freebsd.org
2021-05-30 02:50:16 UTC

Bert JW Regeer <***@0x58.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@0x58.com

--- Comment #31 from Bert JW Regeer <***@0x58.com> ---
Following up with a "me too" on FreeBSD 13.0-RELEASE, with the following
configuration:


igb0/igb1/igb2/igb3 in a lagg0 on top of which there are multiple vlan
interfaces that are created.

On FreeBSD 13, even with all the hardware assists disabled, I was unable to get
traffic to flow. On FreeBSD 12.2 there are no issues at all and everything
functions as designed.

Some information:

***@pci0:1:0:0: class=0x020000 card=0x00008086 chip=0x15398086 rev=0x03
hdr=0x00
vendor = 'Intel Corporation'
device = 'I211 Gigabit Network Connection'
class = network
subclass = ethernet


igb0: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0xe000-0xe01f mem
0xdf500000-0xdf51ffff,0xdf520000-0xdf523fff irq 16 at device 0.0 on pci1
igb0: Using 1024 TX descriptors and 1024 RX descriptors
igb0: Using 2 RX queues 2 TX queues
igb0: Using MSI-X interrupts with 3 vectors
igb0: Ethernet address: 40:62:31:08:92:76
igb0: netmap queues/slots: TX 2/1024, RX 2/1024

***@Breached:/usr/home/xistence # ifconfig igb0
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000

options=e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 40:62:31:08:92:76
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
***@Breached:/usr/home/xistence # ifconfig lagg0
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000

options=e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 40:62:31:08:92:76
inet6 fe80::4262:31ff:fe08:9276%lagg0 prefixlen 64 scopeid 0x8
inet6 2604:5500:c22a:7f00:4262:31ff:fe08:9276 prefixlen 64
inet 172.16.109.1 netmask 0xffffff00 broadcast 172.16.109.255
laggproto lacp lagghash l2,l3,l4
laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
laggport: igb2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
laggport: igb3 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
groups: lagg
media: Ethernet autoselect
status: active
nd6 options=61<PERFORMNUD,AUTO_LINKLOCAL,NO_RADR>
***@Breached:/usr/home/xistence # ifconfig vlan10
vlan10: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
options=600703<RXCSUM,TXCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6>
ether 40:62:31:08:92:76
inet 192.168.10.1 netmask 0xffffff00 broadcast 192.168.10.255
inet6 fe80::4262:31ff:fe08:9276%vlan10 prefixlen 64 scopeid 0x9
inet6 2604:5500:c22a:7f01:4262:31ff:fe08:9276 prefixlen 64
groups: vlan
vlan: 10 vlanpcp: 0 parent interface: lagg0
media: Ethernet autoselect
status: active
nd6 options=61<PERFORMNUD,AUTO_LINKLOCAL,NO_RADR>

This is unfortunately my primary router to the internet, so I am unable to
experiment with patches.

Things I did note:

- traffic on lagg0 did function (untagged)
- tcpdump on lagg0 did not show any 802.1q frames
- tcpdump on igb0 DID show 802.1q frames, but only rx, no tx
- rx worked on the vlan interfaces (saw packets coming in)
- tx did NOT work, no data was transmitted
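
The checks in the list above can be repeated with tcpdump on each layer of the stack. The sketch below only prints the capture commands rather than running them, since they need root and the real interfaces. The function name is made up for illustration, and the interface names and VLAN ID are the ones from this comment, so adjust them for other setups. -e prints the link-level header, which includes the 802.1Q tag, and the "vlan 10" filter matches frames tagged with VLAN ID 10.

```shell
#!/bin/sh
# Hypothetical helper: print one tcpdump command per interface for a given
# VLAN ID. The commands are printed, not executed.
vlan_capture_cmds() {
    vid=$1
    shift
    for ifn in "$@"; do
        echo "tcpdump -e -n -i $ifn vlan $vid"
    done
}

# lagg0 and its member igb0, VLAN 10, as described in this comment.
vlan_capture_cmds 10 lagg0 igb0
```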

If this is the wrong bug, please let me know and I can file a new one.
--
b***@freebsd.org
2021-05-30 03:49:18 UTC

--- Comment #32 from Jason Tubnor <***@tubnor.net> ---
Bert,

Through further investigation on how interfaces behave in a VLAN/non-VLAN stage
where bridges are concerned, I have found the following interface adjustment in
rc.conf remediated the issue (however, the driver should automagically do this
for the user):

ifconfig_igb0="-txcsum -txcsum6 -lro -tso up"

Adjust for your specific interface.

Can you try that and report back?
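
For a lagg with several member ports, the same line is needed once per interface. A small sketch that generates the rc.conf lines, assuming the four igb ports from comment #31; the function name is made up for illustration, and the flags are exactly the ones quoted above. To try the setting without a reboot, "ifconfig igb0 -txcsum -txcsum6 -lro -tso" run as root should be equivalent for a single port.

```shell
#!/bin/sh
# Hypothetical helper: emit one rc.conf line per interface, using the
# offload flags suggested in this comment.
offload_off_lines() {
    for ifn in "$@"; do
        printf 'ifconfig_%s="-txcsum -txcsum6 -lro -tso up"\n' "$ifn"
    done
}

# Member ports of lagg0 from comment #31.
offload_off_lines igb0 igb1 igb2 igb3
```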
--
b***@freebsd.org
2021-05-30 05:01:17 UTC

Jose Luis Duran <***@gmail.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@gmail.com

--- Comment #33 from Jose Luis Duran <***@gmail.com> ---
(In reply to Bert JW Regeer from comment #31)

You seem to have missed the VLAN Hardware TSO (VLAN_HWTSO) as well.

I have not experienced this bug, however my interfaces are configured in a
similar way as Jason suggests:

ifconfig_igb0="-lro -tso -vlanhwtso up"
--
b***@freebsd.org
2021-05-30 09:04:49 UTC

--- Comment #34 from Jason Tubnor <***@tubnor.net> ---
I only needed to apply -vlanhwtso in early versions of 12. We have moved our
appliance direct from 11.4 to 13 because of these weird issues in 12 that made
it too complex from a configuration point of view and chasing option ghosts.

While 13 still has bridged VLAN issues, we have been able to mitigate against
them without it being too complex (options used are in my previous post).
--
b***@freebsd.org
2021-05-30 21:18:41 UTC

--- Comment #35 from Bert JW Regeer <***@0x58.com> ---
I'm sorry for the extra noise all, I made a mistake.

On my system FreeBSD 13.0 works perfectly now. What I missed and what Allan
Jude mentioned on twitter was that I was using an out of sync kernel <-> user
land due to rebooting after freebsd-install told me to do so, but without then
continuing with the upgrade and rebooting again.
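
A rough way to spot this kind of mismatch is to compare the installed kernel and userland versions. The sketch below assumes freebsd-version(1) is available (-k for the installed kernel, -u for the userland); the helper names are made up for illustration. Patch levels are stripped before comparing, because the kernel and userland patch levels can differ legitimately.

```shell
#!/bin/sh
# Strip a trailing patch level such as "-p1" so only the release is compared.
base_release() {
    echo "${1%-p[0-9]*}"
}

# Return 0 when both version strings name the same base release.
same_release() {
    [ "$(base_release "$1")" = "$(base_release "$2")" ]
}

# Only meaningful on FreeBSD; skipped silently elsewhere.
if command -v freebsd-version >/dev/null 2>&1; then
    if same_release "$(freebsd-version -k)" "$(freebsd-version -u)"; then
        echo "kernel and userland match"
    else
        echo "kernel/userland mismatch: finish the upgrade"
    fi
fi
```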

***@Breached:/usr/home/xistence # uname -a
FreeBSD Breached.home.arpa 13.0-RELEASE-p1 FreeBSD 13.0-RELEASE-p1 #0: Wed May
26 22:15:09 UTC 2021
***@amd64-builder.daemonology.net:/usr/obj/usr/src/amd64.amd64/sys/GENERIC
amd64
***@Breached:/usr/home/xistence # ifconfig igb0
igb0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000

options=4e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
ether 40:62:31:08:92:76
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
***@Breached:/usr/home/xistence # ifconfig lagg0
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000

options=4e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
ether 40:62:31:08:92:76
inet6 fe80::4262:31ff:fe08:9276%lagg0 prefixlen 64 scopeid 0x8
inet6 2604:5500:c22a:7f00:4262:31ff:fe08:9276 prefixlen 64
inet 172.16.109.1 netmask 0xffffff00 broadcast 172.16.109.255
laggproto lacp lagghash l2,l3,l4
laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
laggport: igb1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
laggport: igb2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
laggport: igb3 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
groups: lagg
media: Ethernet autoselect
status: active
nd6 options=61<PERFORMNUD,AUTO_LINKLOCAL,NO_RADR>
***@Breached:/usr/home/xistence # ifconfig vlan10
vlan10: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500

options=4600703<RXCSUM,TXCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
ether 40:62:31:08:92:76
inet 192.168.10.1 netmask 0xffffff00 broadcast 192.168.10.255
inet6 fe80::4262:31ff:fe08:9276%vlan10 prefixlen 64 scopeid 0x9
inet6 2604:5500:c22a:7f01:4262:31ff:fe08:9276 prefixlen 64
groups: vlan
vlan: 10 vlanproto: 802.1q vlanpcp: 0 parent interface: lagg0
media: Ethernet autoselect
status: active
nd6 options=61<PERFORMNUD,AUTO_LINKLOCAL,NO_RADR>

It is fully functional and operational. I am also seeing 1 Gbps being routed
when doing tests between different VLANs.

Once again apologies for the extra noise!