Skip to content

Conversation

@CatcherX
Copy link
Member

@CatcherX CatcherX commented Feb 4, 2013

No description provided.

ausyskin and others added 28 commits October 22, 2025 08:03
Add Wildcat Lake P device id.

Cc: [email protected]
Co-developed-by: Tomas Winkler <[email protected]>
Signed-off-by: Tomas Winkler <[email protected]>
Signed-off-by: Alexander Usyskin <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
In fastrpc_map_lookup, dma_buf_get is called to obtain a reference to
the dma_buf for comparison purposes. However, this reference is never
released when the function returns, leading to a dma_buf memory leak.

Fix this by adding dma_buf_put before returning from the function,
ensuring that the temporarily acquired reference is properly released
regardless of whether a matching map is found.

Fixes: 9031626 ("misc: fastrpc: Fix fastrpc_map_lookup operation")
Cc: [email protected]
Signed-off-by: Junhao Xie <[email protected]>
Tested-by: Xilin Wu <[email protected]>
Rule: add
Link: https://lore.kernel.org/stable/48B368FB4C7007A7%2B20251017083906.3259343-1-bigfoot%40radxa.com
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
When building with -Wincompatible-function-pointer-types-strict, a
warning designed to catch kernel control flow integrity (kCFI) issues at
build time, there is an instance in the new mei late binding code
originating from the type parameter of mei_lb_push_payload():

  drivers/misc/mei/mei_lb.c:211:18: error: incompatible function pointer types initializing 'int (*)(struct device *, u32, u32, const void *, size_t)' (aka 'int (*)(struct device *, unsigned int, unsigned int, const void *, unsigned long)') with an expression of type 'int (struct device *, enum intel_lb_type, u32, const void *, size_t)' (aka 'int (struct device *, enum intel_lb_type, unsigned int, const void *, unsigned long)') [-Werror,-Wincompatible-function-pointer-types-strict]
    211 |         .push_payload = mei_lb_push_payload,
        |                         ^~~~~~~~~~~~~~~~~~~

While 'unsigned int' and 'enum intel_lb_type' are ABI compatible, hence
no regular warning from -Wincompatible-function-pointer-types, the
mismatch will trigger a kCFI violation when mei_lb_push_payload() is
called indirectly.

Update the type parameter of mei_lb_push_payload() to be 'u32' to match
the prototype in 'struct intel_lb_component_ops', clearing up the
warning and kCFI violation.

Fixes: 741eeab ("mei: late_bind: add late binding component driver")
Signed-off-by: Nathan Chancellor <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
The comedi_buf_munge() function performs a modulo operation
`async->munge_chan %= async->cmd.chanlist_len` without first
checking if chanlist_len is zero. If a user program submits a command with
chanlist_len set to zero, this causes a divide-by-zero error when the device
processes data in the interrupt handler path.

Add a check for zero chanlist_len at the beginning of the
function, similar to the existing checks for !map and
CMDF_RAWDATA flag. When chanlist_len is zero, update
munge_count and return early, indicating the data was
handled without munging.

This prevents potential kernel panics from malformed user commands.

Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=f6c3c066162d2c43a66c
Cc: [email protected]
Signed-off-by: Deepanshu Kartikey <[email protected]>
Reviewed-by: Ian Abbott <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
The mei_register() should move before the mei_start() for hook
on class device to work.
Same change was implemented in mei-me, missed from mei-txe.

Fixes: 7704e6b ("mei: hook mei_device on class device")
Signed-off-by: Alexander Usyskin <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
There are no scenarios where a weak increment is invalid on binder_node.
The only possible case where it could be invalid is if the kernel
delivers BR_DECREFS to the process that owns the node, and then
increments the weak refcount again, effectively "reviving" a dead node.

However, that is not possible: when the BR_DECREFS command is delivered,
the kernel removes and frees the binder_node. The fact that you were
able to call binder_inc_node_nilocked() implies that the node is not yet
destroyed, which implies that BR_DECREFS has not been delivered to
userspace, so incrementing the weak refcount is valid.

Note that it's currently possible to trigger this condition if the owner
calls BINDER_THREAD_EXIT while node->has_weak_ref is true. This causes
BC_INCREFS on binder_ref instances to fail when they should not.

Cc: [email protected]
Fixes: 457b9a6 ("Staging: android: add binder driver")
Reported-by: Yu-Ting Tseng <[email protected]>
Signed-off-by: Alice Ryhl <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
hdm_disconnect() calls most_deregister_interface(), which eventually
unregisters the MOST interface device with device_unregister(iface->dev).
If that drops the last reference, the device core may call release_mdev()
immediately while hdm_disconnect() is still executing.

The old code also freed several mdev-owned allocations in
hdm_disconnect() and then performed additional put_device() calls.
Depending on refcount order, this could lead to use-after-free or
double-free when release_mdev() ran (or when unregister paths also
performed puts).

Fix by moving the frees of mdev-owned allocations into release_mdev(),
so they happen exactly once when the device is truly released, and by
dropping the extra put_device() calls in hdm_disconnect() that are
redundant after device_unregister() and most_deregister_interface().

This addresses the KASAN slab-use-after-free reported by syzbot in
hdm_disconnect(). See report and stack traces in the bug link below.

Reported-by: [email protected]
Cc: stable <[email protected]>
Closes: https://syzkaller.appspot.com/bug?extid=916742d5d24f6c254761
Fixes: 97a6f77 ("drivers: most: add USB adapter driver")
Signed-off-by: Victoria Votokina <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
…zation

The early error path in hdm_probe() can jump to err_free_mdev before
&mdev->dev has been initialized with device_initialize(). Calling
put_device(&mdev->dev) there triggers a device core WARN and ends up
invoking kref_put(&kobj->kref, kobject_release) on an uninitialized
kobject.

In this path the private struct was only kmalloc'ed and the intended
release is effectively kfree(mdev) anyway, so free it directly instead
of calling put_device() on an uninitialized device.

This removes the WARNING and fixes the pre-initialization error path.

Fixes: 97a6f77 ("drivers: most: add USB adapter driver")
Cc: stable <[email protected]>
Signed-off-by: Victoria Votokina <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
…ty()

Fix incorrect use of PTR_ERR_OR_ZERO() in topology_parse_cpu_capacity()
which causes the code to proceed with NULL clock pointers. The current
logic uses !PTR_ERR_OR_ZERO(cpu_clk) which evaluates to true for both
valid pointers and NULL, leading to potential NULL pointer dereference
in clk_get_rate().

Per include/linux/err.h documentation, PTR_ERR_OR_ZERO(ptr) returns:
"The error code within @ptr if it is an error pointer; 0 otherwise."

This means PTR_ERR_OR_ZERO() returns 0 for both valid pointers AND NULL
pointers. Therefore !PTR_ERR_OR_ZERO(cpu_clk) evaluates to true (proceed)
when cpu_clk is either valid or NULL, causing clk_get_rate(NULL) to be
called when of_clk_get() returns NULL.

Replace with !IS_ERR_OR_NULL(cpu_clk) which only proceeds for valid
pointers, preventing potential NULL pointer dereference in clk_get_rate().

Cc: stable <[email protected]>
Signed-off-by: Kaushlendra Kumar <[email protected]>
Reviewed-by: Sudeep Holla <[email protected]>
Fixes: b8fe128 ("arch_topology: Adjust initial CPU capacities with current freq")
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
There are GPIO controllers such as the one present in the LX2160ARDB
QIXIS FPGA which have fixed-direction input and output GPIO lines mixed
together in a single register. This cannot be modeled using the
gpio-regmap as-is since there is no way to present the true direction of
a GPIO line.

In order to make this use case possible, add a new configuration
parameter - fixed_direction_output - into the gpio_regmap_config
structure. This will enable user drivers to provide a bitmap that
represents the fixed direction of the GPIO lines.

Signed-off-by: Ioana Ciornei <[email protected]>
Acked-by: Bartosz Golaszewski <[email protected]>
Reviewed-by: Michael Walle <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>
The direction of the IDIO-16 GPIO lines is fixed with the first 16 lines
as output and the remaining 16 lines as input. Set the gpio_config
fixed_direction_output member to represent the fixed direction of the
GPIO lines.

Fixes: db02247 ("gpio: idio-16: Migrate to the regmap API")
Reported-by: Mark Cave-Ayland <[email protected]>
Closes: https://lore.kernel.org/r/[email protected]
Suggested-by: Michael Walle <[email protected]>
Cc: [email protected] # ae49581: gpio: regmap: add the .fixed_direction_output configuration parameter
Cc: [email protected]
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bartosz Golaszewski <[email protected]>
Move the print before releasing the delayed node.

In my initial testing there was a bug that was causing delayed_nodes
to not get freed which is why I put the print after the release. This
obviously neglects the case where the delayed node is properly freed.

Add condition to make sure we only print if we have more than one
reference to the delayed_node to prevent printing when we only have
the reference taken in btrfs_kill_all_delayed_nodes().

Fixes: b767a28 ("btrfs: print leaked references in kill_all_delayed_nodes()")
Tested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Leo Martins <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
btrfs_extent_root()/btrfs_global_root() does not return error pointers,
it returns NULL on error.

Reported-by: Dan Carpenter <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Fixes : ed4e6b5 ("btrfs: ref-verify: handle damaged extent root tree")
CC: [email protected] # 6.17+
Signed-off-by: Amit Dhingra <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Drop the check on the maximum transfer length in Raw Gadget for both
control and non-control transfers.

Limiting the transfer length causes a problem with emulating USB devices
whose full configuration descriptor exceeds PAGE_SIZE in length.

Overall, there does not appear to be any reason to enforce any kind of
transfer length limit on the Raw Gadget side for either control or
non-control transfers, so let's just drop the related check.

Cc: stable <[email protected]>
Fixes: f2c2e71 ("usb: gadget: add raw-gadget interface")
Signed-off-by: Andrey Konovalov <[email protected]>
Link: https://patch.msgid.link/a6024e8eab679043e9b8a5defdb41c4bda62f02b.1761085528.git.andreyknvl@gmail.com
Signed-off-by: Greg Kroah-Hartman <[email protected]>
The list of Huawei LTE modules needing the quirk fixing spurious wakeups
was missing the IDs of the Huawei ME906S module, therefore suspend did not
work.

Cc: stable <[email protected]>
Signed-off-by: Tim Guttzeit <[email protected]>
Signed-off-by: Werner Sembach <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
When there is no port entry in the tcpci entry itself, the driver will
trigger an error message "OF: graph: no port node found in /...../typec" .

It is documented that the dts node should contain an connector entry
with ports and several port pointing to devices with usb-role-switch
property set. Only when those connector entry is missing, it should
check for port entries in the main node.

We switch the search order for looking after ports, which will avoid the
failure message while there are explicit connector entries.

Fixes: d56de8c ("usb: typec: tcpm: try to get role switch from tcpc fwnode")
Cc: stable <[email protected]>
Signed-off-by: Michael Grzeschik <[email protected]>
Reviewed-by: Heikki Krogerus <[email protected]>
Reviewed-by: Badhri Jagan Sridharan <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
xfs_daddr_t is a signed type, which means that xfs_buf_map_verify is
using a signed comparison.  This causes problems if bt_nr_sectors is
never overridden (e.g. in the case of an xfbtree for rmap btree repairs)
because even daddr 0 can't pass the verifier test in that case.

Define an explicit max constant and set the initial bt_nr_sectors to a
positive value.

Found by xfs/422.

Cc: [email protected] # v6.18-rc1
Fixes: 42852fe ("xfs: track the number of blocks in each buftarg")
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Carlos Maiolino <[email protected]>
The deprecation of the 'attr2' mount option in 6.18 wasn't entirely
successful because nobody noticed that the kernel never printed a
warning about attr2 being set in fstab if the only xfs filesystem is the
root fs; the initramfs mounts the root fs with no mount options; and the
init scripts only conveyed the fstab options by remounting the root fs.

Fix this by making it complain all the time.

Cc: [email protected] # v5.13
Fixes: 92cf7d3 ("xfs: Skip repetitive warnings about mount options")
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Carlos Maiolino <[email protected]>
Signed-off-by: Carlos Maiolino <[email protected]>
Apparently we can never deprecate mount options in this project, because
it will invariably turn out that some foolish userspace depends on some
behavior and break.  From Oleksandr Natalenko:

  In v6.18, the attr2 XFS mount option is removed. This may silently
  break system boot if the attr2 option is still present in /etc/fstab
  for rootfs.

  Consider Arch Linux that is being set up from scratch with / being
  formatted as XFS. The genfstab command that is used to generate
  /etc/fstab produces something like this by default:

  /dev/sda2 on / type xfs (rw,relatime,attr2,discard,inode64,logbufs=8,logbsize=32k,noquota)

  Once the system is set up and rebooted, there's no deprecation warning
  seen in the kernel log:

  # cat /proc/cmdline
  root=UUID=77b42de2-397e-47ee-a1ef-4dfd430e47e9 rootflags=discard rd.luks.options=discard quiet

  # dmesg | grep -i xfs
  [    2.409818] SGI XFS with ACLs, security attributes, realtime, scrub, repair, quota, no debug enabled
  [    2.415341] XFS (sda2): Mounting V5 Filesystem 77b42de2-397e-47ee-a1ef-4dfd430e47e9
  [    2.442546] XFS (sda2): Ending clean mount

  Although as per the deprecation intention, it should be there.

  Vlastimil (in Cc) suggests this is because xfs_fs_warn_deprecated()
  doesn't produce any warning by design if the XFS FS is set to be
  rootfs and gets remounted read-write during boot. This imposes two
  problems:

  1) a user doesn't see the deprecation warning; and
  2) with v6.18 kernel, the read-write remount fails because of unknown
     attr2 option rendering system unusable:

  systemd[1]: Switching root.
  systemd-remount-fs[225]: /usr/bin/mount for / exited with exit status 32.

  # mount -o rw /
  mount: /: fsconfig() failed: xfs: Unknown parameter 'attr2'.

  Thorsten (in Cc) suggested reporting this as a user-visible regression.

  From my PoV, although the deprecation is in place for 5 years already,
  it may not be visible enough as the warning is not emitted for rootfs.
  Considering the amount of systems set up with XFS on /, this may
  impose a mass problem for users.

  Vlastimil suggested making attr2 option a complete noop instead of
  removing it.

IOWs, the initrd mounts the root fs with (I assume) no mount options,
and mount -a remounts with whatever options are in fstab.  However,
XFS doesn't complain about deprecated mount options during a remount, so
technically speaking we were not warning all users in all combinations
that they were heading for a cliff.

Gotcha!!

Now, how did 'attr2' get slurped up on so many systems?  The old code
would put that in /proc/mounts if the filesystem happened to be in attr2
mode, even if user hadn't mounted with any such option.  IOWs, this is
because someone thought it would be a good idea to advertise system
state via /proc/mounts.

The easy way to fix this is to reintroduce the four mount options but
map them to a no-op option that ignores them, and hope that nobody's
depending on attr2 to appear in /proc/mounts.  (Hint: use the fsgeometry
ioctl).  But we've learned our lesson, so complain as LOUDLY as possible
about the deprecation.

Lessons learned:

 1. Don't expose system state via /proc/mounts; the only strings that
    ought to be there are options *explicitly* provided by the user.
 2. Never tidy, it's not worth the stress and irritation.

Reported-by: Vlastimil Babka <[email protected]>
Reported-by: Oleksandr Natalenko <[email protected]>
Cc: [email protected] # v6.18-rc1
Fixes: b9a176e ("xfs: remove deprecated mount options")
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Carlos Maiolino <[email protected]>
On a filesystem with parent pointers, xchk_nlinks_collect_dir walks both
the directory entries (data fork) and the parent pointers (attr fork) to
determine the correct link count.  Unfortunately I forgot to update the
lock mode logic to handle the case of a directory whose attr fork is in
btree format and has not yet been loaded *and* whose data fork doesn't
need loading.

This leads to a bunch of assertions from xfs/286 in xfs_iread_extents
because we only took ILOCK_SHARED, not ILOCK_EXCL.  You'd need the rare
happenstance of a directory with a large number of non-pptr extended
attributes set and enough memory pressure to cause the directory to be
evicted and partially reloaded from disk.

I /think/ this only started in 6.18-rc1 because I've started seeing OOM
errors with the maple tree slab using 70% of memory, and this didn't
happen in 6.17.  Yay dynamic systems!

Cc: [email protected] # v6.10
Fixes: 77ede5f ("xfs: walk directory parent pointers to determine backref count")
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Carlos Maiolino <[email protected]>
Since pm_runtime_get_active() returns 0 on success, all of the
DEFINE_GUARD_COND() macros in pm_runtime.h need the "_RET == 0"
condition at the end of the argument list or they would not work
correctly.

Fixes: 9a0abc3 ("PM: runtime: Add auto-cleanup macros for "resume and get" operations")
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/linux-pm/[email protected]/
Signed-off-by: Rafael J. Wysocki <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Tested-by: Farhan Ali <[email protected]>
Link: https://patch.msgid.link/[email protected]
The receive error handling code is shared between RSCI and all other
SCIF port types, but the RSCI overrun_reg is specified as a memory
offset, while for other SCIF types it is an enum value used to index
into the sci_port_params->regs array, as mentioned above the
sci_serial_in() function.

For RSCI, the overrun_reg is CSR (0x48), causing the sci_getreg() call
inside the sci_handle_fifo_overrun() function to index outside the
bounds of the regs array, which currently has a size of 20, as specified
by SCI_NR_REGS.

Because of this, we end up accessing memory outside of RSCI's
rsci_port_params structure, which, when interpreted as a plat_sci_reg,
happens to have a non-zero size, causing the following WARN when
sci_serial_in() is called, as the accidental size does not match the
supported register sizes.

The existence of the overrun_reg needs to be checked because
SCIx_SH3_SCIF_REGTYPE has overrun_reg set to SCLSR, but SCLSR is not
present in the regs array.

Avoid calling sci_getreg() for port types which don't use standard
register handling.

Use the ops->read_reg() and ops->write_reg() functions to properly read
and write registers for RSCI, and change the type of the status variable
to accommodate the 32-bit CSR register.

sci_getreg() and sci_serial_in() are also called with overrun_reg in the
sci_mpxed_interrupt() interrupt handler, but that code path is not used
for RSCI, as it does not have a muxed interrupt.

------------[ cut here ]------------
Invalid register access
WARNING: CPU: 0 PID: 0 at drivers/tty/serial/sh-sci.c:522 sci_serial_in+0x38/0xac
Modules linked in: renesas_usbhs at24 rzt2h_adc industrialio_adc sha256 cfg80211 bluetooth ecdh_generic ecc rfkill fuse drm backlight ipv6
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.17.0-rc1+ #30 PREEMPT
Hardware name: Renesas RZ/T2H EVK Board based on r9a09g077m44 (DT)
pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : sci_serial_in+0x38/0xac
lr : sci_serial_in+0x38/0xac
sp : ffff800080003e80
x29: ffff800080003e80 x28: ffff800082195b80 x27: 000000000000000d
x26: ffff8000821956d0 x25: 0000000000000000 x24: ffff800082195b80
x23: ffff000180e0d800 x22: 0000000000000010 x21: 0000000000000000
x20: 0000000000000010 x19: ffff000180e72000 x18: 000000000000000a
x17: ffff8002bcee7000 x16: ffff800080000000 x15: 0720072007200720
x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720
x11: 0000000000000058 x10: 0000000000000018 x9 : ffff8000821a6a48
x8 : 0000000000057fa8 x7 : 0000000000000406 x6 : ffff8000821fea48
x5 : ffff00033ef88408 x4 : ffff8002bcee7000 x3 : ffff800082195b80
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff800082195b80
Call trace:
 sci_serial_in+0x38/0xac (P)
 sci_handle_fifo_overrun.isra.0+0x70/0x134
 sci_er_interrupt+0x50/0x39c
 __handle_irq_event_percpu+0x48/0x140
 handle_irq_event+0x44/0xb0
 handle_fasteoi_irq+0xf4/0x1a0
 handle_irq_desc+0x34/0x58
 generic_handle_domain_irq+0x1c/0x28
 gic_handle_irq+0x4c/0x140
 call_on_irq_stack+0x30/0x48
 do_interrupt_handler+0x80/0x84
 el1_interrupt+0x34/0x68
 el1h_64_irq_handler+0x18/0x24
 el1h_64_irq+0x6c/0x70
 default_idle_call+0x28/0x58 (P)
 do_idle+0x1f8/0x250
 cpu_startup_entry+0x34/0x3c
 rest_init+0xd8/0xe0
 console_on_rootfs+0x0/0x6c
 __primary_switched+0x88/0x90
---[ end trace 0000000000000000 ]---

Cc: stable <[email protected]>
Fixes: 0666e3f ("serial: sh-sci: Add support for RZ/T2H SCI")
Signed-off-by: Cosmin Tanislav <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
…ID 0x0018

The Advantech 2-port serial card with PCI vendor=0x13fe and device=0x0018
has a 'XR17V35X' chip installed on the circuit board. Therefore, this
driver can be used instead of theu outdated out-of-tree driver from the
manufacturer.

Signed-off-by: Florian Eckert <[email protected]>
Cc: stable <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Commit 43c51bb ("sc16is7xx: make sure device is in suspend once
probed") permanently enabled access to the enhanced features in
sc16is7xx_probe(), and it is never disabled after that.

Therefore, remove re-enable of enhanced features in
sc16is7xx_set_baud(). This eliminates a potential useless read + write
cycle each time the baud rate is reconfigured.

Fixes: 43c51bb ("sc16is7xx: make sure device is in suspend once probed")
Cc: stable <[email protected]>
Signed-off-by: Hugo Villeneuve <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
The SCIF instances on R-Car Gen5 have a single interrupt, just like on
other R-Car SoCs.

Fixes: 6ac1d60 ("dt-bindings: serial: sh-sci: Document r8a78000 bindings")
Cc: stable <[email protected]>
Signed-off-by: Geert Uytterhoeven <[email protected]>
Acked-by: Kuninori Morimoto <[email protected]>
Acked-by: Conor Dooley <[email protected]>
Link: https://patch.msgid.link/09bc9881b31bdb948ce8b69a2b5acf633f5505a4.1759920441.git.geert+renesas@glider.be
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Check the return value of reset_control_deassert() in the probe
function to prevent continuing probe when reset deassertion fails.

Previously, reset_control_deassert() was called without checking its
return value, which could lead to probe continuing even when the
device reset wasn't properly deasserted.

The fix checks the return value and returns an error with dev_err_probe()
if reset deassertion fails, providing better error handling and
diagnostics.

Fixes: acbdad8 ("serial: 8250_dw: simplify optional reset handling")
Cc: stable <[email protected]>
Signed-off-by: Artem Shimko <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Some MediaTek SoCs got a gated UART baud clock, which currently gets
disabled as the clk subsystem believes it would be unused. This results in
the uart freezing right after "clk: Disabling unused clocks" on those
platforms.

Request the baud clock to be prepared and enabled during probe, and to
restore run-time power management capabilities to what it was before commit
e32a83c ("serial: 8250-mtk: modify mtk uart power and clock
management") disable and unprepare the baud clock when suspending the UART,
prepare and enable it again when resuming it.

Fixes: e32a83c ("serial: 8250-mtk: modify mtk uart power and clock management")
Fixes: b6c7ff2 ("serial: 8250_mtk: Simplify clock sequencing and runtime PM")
Cc: stable <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
Signed-off-by: Daniel Golle <[email protected]>
Link: https://patch.msgid.link/de5197ccc31e1dab0965cabcc11ca92e67246cf6.1758058441.git.daniel@makrotopia.org
Signed-off-by: Greg Kroah-Hartman <[email protected]>
…ottled

Matteo reported hitting the assert_list_leaf_cfs_rq() warning from
enqueue_task_fair() post commit fe8d238 ("sched/fair: Propagate
load for throttled cfs_rq") which transitioned to using
cfs_rq_pelt_clock_throttled() check for leaf cfs_rq insertions in
propagate_entity_cfs_rq().

The "cfs_rq->pelt_clock_throttled" flag is used to indicate if the
hierarchy has its PELT frozen. If a cfs_rq's PELT is marked frozen, all
its descendants should have their PELT frozen too or weird things can
happen as a result of children accumulating PELT signals when the
parents have their PELT clock stopped.

Another side effect of this is the loss of integrity of the leaf cfs_rq
list. As debugged by Aaron, consider the following hierarchy:

    root(#)
   /    \
  A(#)   B(*)
         |
         C <--- new cgroup
         |
         D <--- new cgroup

  # - Already on leaf cfs_rq list
  * - Throttled with PELT frozen

The newly created cgroups don't have their "pelt_clock_throttled" signal
synced with cgroup B. Next, the following series of events occur:

1. online_fair_sched_group() for cgroup D will call
   propagate_entity_cfs_rq(). (Same can happen if a throttled task is
   moved to cgroup C and enqueue_task_fair() returns early.)

   propagate_entity_cfs_rq() adds the cfs_rq of cgroup C to
   "rq->tmp_alone_branch" since its PELT clock is not marked throttled
   and cfs_rq of cgroup B is not on the list.

   cfs_rq of cgroup B is skipped since its PELT is throttled.

   root cfs_rq already exists on cfs_rq leading to
   list_add_leaf_cfs_rq() returning early.

   The cfs_rq of cgroup C is left dangling on the
   "rq->tmp_alone_branch".

2. A new task wakes up on cgroup A. Since the whole hierarchy is already
   on the leaf cfs_rq list, list_add_leaf_cfs_rq() keeps returning early
   without any modifications to "rq->tmp_alone_branch".

   The final assert_list_leaf_cfs_rq() in enqueue_task_fair() sees the
   dangling reference to cgroup C's cfs_rq in "rq->tmp_alone_branch".

   !!! Splat !!!

Syncing the "pelt_clock_throttled" indicator with parent cfs_rq is not
enough since the new cfs_rq is not yet enqueued on the hierarchy. A
dequeue on other subtree on the throttled hierarchy can freeze the PELT
clock for the parent hierarchy without setting the indicators for this
newly added cfs_rq which was never enqueued.

Since there are no tasks on the new hierarchy, start a cfs_rq on a
throttled hierarchy with its PELT clock throttled. The first enqueue, or
the distribution (whichever happens first) will unfreeze the PELT clock
and queue the cfs_rq on the leaf cfs_rq list.

While at it, add an assert_list_leaf_cfs_rq() in
propagate_entity_cfs_rq() to catch such cases in the future.

Closes: https://lore.kernel.org/lkml/[email protected]/
Fixes: e1fad12 ("sched/fair: Switch to task based throttle model")
Reported-by: Matteo Martelli <[email protected]>
Suggested-by: Aaron Lu <[email protected]>
Signed-off-by: K Prateek Nayak <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Aaron Lu <[email protected]>
Tested-by: Aaron Lu <[email protected]>
Tested-by: Matteo Martelli <[email protected]>
Link: https://patch.msgid.link/[email protected]
gustavold and others added 30 commits October 29, 2025 17:41
The update_userdata() function constructs the complete userdata string
in nt->extradata_complete and updates nt->userdata_length. This data
is then read by write_msg() and write_ext_msg() when sending netconsole
messages. However, update_userdata() was not holding target_list_lock
during this process, allowing concurrent message transmission to read
partially updated userdata.

This race condition could result in netconsole messages containing
incomplete or inconsistent userdata - for example, reading the old
userdata_length with new extradata_complete content, or vice versa,
leading to truncated or corrupted output.

Fix this by acquiring target_list_lock with spin_lock_irqsave() before
updating extradata_complete and userdata_length, and releasing it after
both fields are fully updated. This ensures that readers see a
consistent view of the userdata, preventing corruption during concurrent
access.

The fix aligns with the existing locking pattern used throughout the
netconsole code, where target_list_lock protects access to target
fields including buf[] and msgcounter that are accessed during message
transmission.

Also get rid of the unnecessary variable complete_idx, which makes it
easier to bail out of update_userdata().

Fixes: df03f83 ("net: netconsole: cache userdata formatted string in netconsole_target")
Signed-off-by: Gustavo Luiz Duarte <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Accessing the transmit queue without owning the msk socket lock is
inherently racy, hence __mptcp_check_push() could actually quit early
even when there is pending data.

That in turn could cause unexpected tx lock and timeout.

Dropping the early check avoids the race, implicitly relaying on later
tests under the relevant lock. With such change, all the other
mptcp_send_head() call sites are now under the msk socket lock and we
can additionally drop the now unneeded annotation on the transmit head
pointer accesses.

Fixes: 6e628cd ("mptcp: use mptcp release_cb for delayed tasks")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Geliang Tang <[email protected]>
Tested-by: Geliang Tang <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
If a MSG_PEEK | MSG_WAITALL read operation consumes all the bytes in the
receive queue and recvmsg() need to waits for more data - i.e. it's a
blocking one - upon arrival of the next packet the MPTCP protocol will
start again copying the oldest data present in the receive queue,
corrupting the data stream.

Address the issue explicitly tracking the peeked sequence number,
restarting from the last peeked byte.

Fixes: ca4fb89 ("mptcp: add MSG_PEEK support")
Cc: [email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Geliang Tang <[email protected]>
Tested-by: Geliang Tang <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Since commit 72377ab ("mptcp: more conservative check for zero
probes") the MPTCP-level zero window probe check is always disabled, as
the TCP-level write queue always contains at least the newly allocated
skb.

Refine the relevant check tacking in account that the above condition
and that such skb can have zero length.

Fixes: 72377ab ("mptcp: more conservative check for zero probes")
Cc: [email protected]
Reported-by: Geliang Tang <[email protected]>
Closes: https://lore.kernel.org/[email protected]
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
Tested-by: Geliang Tang <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Explicitly account for MPTCP-level zero windows probe, to catch
hopefully earlier issues alike the one addressed by the previous
patch.

Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
Tested-by: Geliang Tang <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Matthieu Baerts says:

====================
mptcp: various rare sending issues

Here are various fixes from Paolo, addressing very occasional issues on
the sending side:

- Patch 1: drop an optimisation that could lead to timeout in case of
  race conditions. A fix for up to v5.11.

- Patch 2: fix stream corruption under very specific conditions.
  A fix for up to v5.13.

- Patch 3: restore MPTCP-level zero window probe after a recent fix.
  A fix for up to v5.16.

- Patch 4: new MIB counter to track MPTCP-level zero windows probe to
  help catching issues similar to the one fixed by the previous patch.
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
The code did not check the return value of usbnet_get_endpoints.
Add checks and return the error if it fails to transfer the error.

Found via static anlaysis and this is similar to
commit 07161b2 ("sr9800: Add check for usbnet_get_endpoints").

Fixes: 933a27d ("USB: asix - Add AX88178 support and many other changes")
Fixes: 2e55cc7 ("[PATCH] USB: usbnet (3/9) module for ASIX Ethernet adapters")
Cc: [email protected]
Signed-off-by: Miaoqian Lin <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
esw->user_count tracks how many TC rules are added on an esw via
mlx5e_configure_flower -> mlx5_esw_get -> atomic64_inc(&esw->user_count)

esw.user_count was unconditionally set to 0 in
esw_destroy_legacy_fdb_table and esw_destroy_offloads_fdb_tables.

These two together can lead to the following sequence of events:
1. echo 1 > /sys/class/net/eth2/device/sriov_numvfs
  - mlx5_core_sriov_configure -...-> esw_create_legacy_table ->
    atomic64_set(&esw->user_count, 0)
2. tc qdisc add dev eth2 ingress && \
   tc filter replace dev eth2 pref 1 protocol ip chain 0 ingress \
       handle 1 flower action ct nat zone 64000 pipe
  - mlx5e_configure_flower -> mlx5_esw_get ->
    atomic64_inc(&esw->user_count)
3. echo 0 > /sys/class/net/eth2/device/sriov_numvfs
  - mlx5_core_sriov_configure -..-> esw_destroy_legacy_fdb_table
    -> atomic64_set(&esw->user_count, 0)
4. devlink dev eswitch set pci/0000:08:00.0 mode switchdev
  - mlx5_devlink_eswitch_mode_set -> mlx5_esw_try_lock ->
    atomic64_read(&esw->user_count) == 0
  - then proceed to a WARN_ON in:
  esw_offloads_start -> mlx5_eswitch_enable_locke -> esw_offloads_enable
  -> mlx5_esw_offloads_rep_load -> mlx5e_vport_rep_load ->
  mlx5e_netdev_change_profile -> mlx5e_detach_netdev ->
  mlx5e_cleanup_nic_rx -> mlx5e_tc_nic_cleanup ->
  mlx5e_mod_hdr_tbl_destroy

Fix this by not clearing out the user_count when destroying FDB tables,
so that the check in mlx5_esw_try_lock can prevent the mode change when
there are TC rules configured, as originally intended.

Fixes: 2318b8b ("net/mlx5: E-switch, Destroy legacy fdb table when needed")
Signed-off-by: Cosmin Ratiu <[email protected]>
Reviewed-by: Dragos Tatulea <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
The tx queue can become permanently stuck in a stopped state due to a
race condition between the URB submission path and its completion
callback.

The URB completion callback can run immediately after usb_submit_urb()
returns, before the submitting function calls netif_stop_queue(). If
this occurs, the queue state management becomes desynchronized, leading
to a stall where the queue is never woken.

Fix this by moving the netif_stop_queue() call to before submitting the
URB. This closes the race window by ensuring the network stack is aware
the queue is stopped before the URB completion can possibly run.

Fixes: 0791c03 ("net: mctp: Add MCTP USB transport driver")
Signed-off-by: Jinliang Wang <[email protected]>
Acked-by: Jeremy Kerr <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
In bareudp.sh, this script uses /bin/sh and it will load another lib.sh
BASH script at the very beginning.

But on some operating systems like Ubuntu, /bin/sh is actually pointed to
DASH, thus it will try to run BASH commands with DASH and consequently
leads to syntax issues:
  # ./bareudp.sh: 4: ./lib.sh: Bad substitution
  # ./bareudp.sh: 5: ./lib.sh: source: not found
  # ./bareudp.sh: 24: ./lib.sh: Syntax error: "(" unexpected

Fix this by explicitly using BASH for bareudp.sh. This fixes test
execution failures on systems where /bin/sh is not BASH.

Reported-by: Edoardo Canepa <[email protected]>
Link: https://bugs.launchpad.net/bugs/2129812
Signed-off-by: Po-Hsu Lin <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
According to the TI DP83869HM datasheet Revision D (June 2025), section
7.6.1.41 STRAP_STS Register, the STRAP_OPMODE bitmask is bit [11:9].
Fix this.

In case the PHY is auto-detected via PHY ID registers, or not described
in DT, or, in case the PHY is described in DT but the optional DT property
"ti,op-mode" is not present, then the driver reads out the PHY functional
mode (RGMII, SGMII, ...) from hardware straps.

Currently, all upstream users of this PHY specify both DT compatible string
"ethernet-phy-id2000.a0f1" and ti,op-mode = <DP83869_RGMII_COPPER_ETHERNET>
property, therefore it seems no upstream users are affected by this bug.

The driver currently interprets bits [2:0] of STRAP_STS register as PHY
functional mode. Those bits are controlled by ANEG_DIS, ANEGSEL_0 straps
and an always-zero reserved bit. Systems that use RGMII-to-Copper functional
mode are unlikely to disable auto-negotiation via ANEG_DIS strap, or change
auto-negotiation behavior via ANEGSEL_0 strap. Therefore, even with this bug
in place, the STRAP_STS register content is likely going to be interpreted
by the driver as RGMII-to-Copper mode.

However, for a system with PHY functional mode strapping set to other mode
than RGMII-to-Copper, the driver is likely to misinterpret the strapping
as RGMII-to-Copper and misconfigure the PHY.

For example, on a system with SGMII-to-Copper strapping, the STRAP_STS
register reads as 0x0c20, but the PHY ends up being configured for
incompatible RGMII-to-Copper mode.

Fixes: 0eaf8cc ("net: phy: dp83869: Set opmode from straps")
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: Thanh Quan <[email protected]>
Signed-off-by: Hai Pham <[email protected]>
Signed-off-by: Marek Vasut <[email protected]> # Port from U-Boot to Linux
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
…l/git/netfilter/nf

Florian Westphal says:

====================
netfilter: updates for net

1) its not possible to attach conntrack labels via ctnetlink
   unless one creates a dummy 'ct labels set' rule in nftables.
   This is an oversight, the 'ruleset tests presence, userspace
   (netlink) sets' use-case is valid and should 'just work'.
   Always broken since this got added in Linux 4.7.

2) nft_connlimit reads count value without holding the relevant
   lock, add a READ_ONCE annotation.  From Fernando Fernandez Mancera.

3) There is a long-standing bug (since 4.12) in nftables helper infra
   when NAT is in use: if the helper gets assigned after the nat binding
   was set up, we fail to initialise the 'seqadj' extension, which is
   needed in case NAT payload rewrites need to add (or remove) from the
   packet payload.  Fix from Andrii Melnychenko.

* tag 'nf-25-10-29' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nft_ct: add seqadj extension for natted connections
  netfilter: nft_connlimit: fix possible data race on connection count
  netfilter: nft_ct: enable labels for get case too
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Update tls_offload_rx_resync_async_request_start() and
tls_offload_rx_resync_async_request_end() to get a struct
tls_offload_resync_async parameter directly, rather than
extracting it from struct sock.

This change aligns the function signatures with the upcoming
tls_offload_rx_resync_async_request_cancel() helper, which
will be introduced in a subsequent patch.

Signed-off-by: Shahar Shitrit <[email protected]>
Reviewed-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
When a netdev issues a RX async resync request for a TLS connection,
the TLS module handles it by logging record headers and attempting to
match them to the tcp_sn provided by the device. If a match is found,
the TLS module approves the tcp_sn for resynchronization.

While waiting for a device response, the TLS module also increments
rcd_delta each time a new TLS record is received, tracking the distance
from the original resync request.

However, if the device response is delayed or fails (e.g due to
unstable connection and device getting out of tracking, hardware
errors, resource exhaustion etc.), the TLS module keeps logging and
incrementing, which can lead to a WARN() when rcd_delta exceeds the
threshold.

To address this, introduce tls_offload_rx_resync_async_request_cancel()
to explicitly cancel resync requests when a device response failure is
detected. Call this helper also as a final safeguard when rcd_delta
crosses its threshold, as reaching this point implies that earlier
cancellation did not occur.

Signed-off-by: Shahar Shitrit <[email protected]>
Reviewed-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
When device loses track of TLS records, it attempts to resync by
monitoring records and requests an asynchronous resynchronization
from software for this TLS connection.

The TLS module handles such device RX resync requests by logging record
headers and comparing them with the record tcp_sn when provided by the
device. It also increments rcd_delta to track how far the current
record tcp_sn is from the tcp_sn of the original resync request.
If the device later responds with a matching tcp_sn, the TLS module
approves the tcp_sn for resync.

However, the device response may be delayed or never arrive,
particularly due to traffic-related issues such as packet drops or
reordering. In such cases, the TLS module remains unaware that resync
will not complete, and continues performing unnecessary work by logging
headers and incrementing rcd_delta, which can eventually exceed the
threshold and trigger a WARN(). For example, this was observed when the
device got out of tracking, causing
mlx5e_ktls_handle_get_psv_completion() to fail and ultimately leading
to the rcd_delta warning.

To address this, call tls_offload_rx_resync_async_request_cancel()
to cancel the resync request and stop resync tracking in such error
cases. Also, increment the tls_resync_req_skip counter to track these
cancellations.

Fixes: 0419d8c ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Shahar Shitrit <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
…nction'

Tariq Toukan says:

====================
tls: Introduce and use RX async resync request cancel function

This series by Shahar introduces RX async resync request cancel function
in tls module, and uses it in mlx5e driver.

For a device-offloaded TLS RX connection, the TLS module increments
rcd_delta each time a new TLS record is received, tracking the distance
from the original resync request. In the meanwhile, the device is
queried and is expected to respond, asynchronously.

However, if the device response is delayed or fails (e.g due to unstable
connection and device getting out of tracking, hardware errors, resource
exhaustion etc.), the TLS module keeps logging and incrementing
rcd_delta, which can lead to a WARN() when rcd_delta exceeds the
threshold.

This series improves this code area by canceling the resync request when
spotting an issue with the device response.
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
The DWMAC IP's VLAN tag insertion offload does not support inserting
STAG (802.1AD) and CTAG (802.1Q) types in bytes 13 and 14 using the
same MAC_VLAN_Incl and MAC_VLAN_Inner_Incl register configurations.

Currently, MAC_VLAN_Incl is configured to offload only STAG type
insertion. However, the DWMAC IP inserts a CTAG type when the inner
VLAN ID field of the descriptor is not configured, and a STAG type
when it is configured. This behavior is not documented and leads to
inconsistent double VLAN tagging.

Additionally, an unexpected CTAG with VLAN ID 0 is inserted, resulting
in frames like:

Frame 1: 110 bytes on wire (880 bits), 110 bytes captured (880 bits)
Ethernet II, Src: <src> (<src>), Dst: <dst> (<dst>)
IEEE 802.1ad, ID: 100
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 0 (unexpected)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 200
Internet Protocol Version 4, Src: 192.168.4.10, Dst: 192.168.4.11
Internet Control Message Protocol

To avoid this undocumented and incorrect behavior, disable 802.1AD tag
insertion offload. Also, don't set CSVL bit. As per the data book,
when this bit is set, S-VLAN type (0x88A8) is inserted in the 13th and
14th bytes of transmitted packets and when this bit is reset, C-VLAN
type (0x8100) is inserted in the 13th and 14th bytes of transmitted
packets.

Fixes: 30d9322 ("net: stmmac: Add support for VLAN Insertion Offload")
Fixes: e94e3f3 ("net: stmmac: Add support for VLAN Insertion Offload in GMAC4+")
Fixes: 1d2c7a5 ("net: stmmac: Refactor VLAN implementation")
Signed-off-by: Rohan G Thomas <[email protected]>
Reviewed-by: Boon Khai Ng <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Queue maxSDU requirement of 802.1 Qbv standard requires mac to drop
packets that exceeds maxSDU length and maxSDU doesn't include
preamble, destination and source address, or FCS but includes
ethernet type and VLAN header.

On hardware with Tx VLAN offload enabled, VLAN header length is not
included in the skb->len, when Tx VLAN offload is requested. This
leads to incorrect length checks and allows transmission of
oversized packets. Add the VLAN_HLEN to the skb->len before checking
the Qbv maxSDU if Tx VLAN offload is requested for the packet.

Fixes: c5c3e1b ("net: stmmac: Offload queueMaxSDU from tc-taprio")
Signed-off-by: Rohan G Thomas <[email protected]>
Reviewed-by: Matthew Gerlach <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Fix the bounds checks for the hw supported maximum GCL entry
count and gate interval time.

Fixes: b60189e ("net: stmmac: Integrate EST with TAPRIO scheduler API")
Signed-off-by: Rohan G Thomas <[email protected]>
Reviewed-by: Matthew Gerlach <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Rohan G Thomas says:

====================
net: stmmac: Fixes for stmmac Tx VLAN insert and EST

This patchset includes following fixes for stmmac Tx VLAN insert and
EST implementations:
   1. Disable STAG insertion offloading, as DWMAC IPs doesn't support
      offload of STAG for double VLAN packets and CTAG for single VLAN
      packets when using the same register configuration. The current
      configuration in the driver is undocumented and is adding an
      additional 802.1Q tag with VLAN ID 0 for double VLAN packets.
   2. Consider Tx VLAN offload tag length for maxSDU estimation.
   3. Fix GCL bounds check
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
The zero-copy Device Memory (Devmem) transmit path
relies on the socket's route cache (`dst_entry`) to
validate that the packet is being sent via the network
device to which the DMA buffer was bound.

However, this check incorrectly fails and returns `-ENODEV`
if the socket's route cache entry (`dst`) is merely missing
or expired (`dst == NULL`). This scenario is observed during
network events, such as when flow steering rules are deleted,
leading to a temporary route cache invalidation.

This patch fixes -ENODEV error for `net_devmem_get_binding()`
by doing the following:

1.  It attempts to rebuild the route via `rebuild_header()`
if the route is initially missing (`dst == NULL`). This
allows the TCP/IP stack to recover from transient route
cache misses.
2.  It uses `rcu_read_lock()` and `dst_dev_rcu()` to safely
access the network device pointer (`dst_dev`) from the
route, preventing use-after-free conditions if the
device is concurrently removed.
3.  It maintains the critical safety check by validating
that the retrieved destination device (`dst_dev`) is
exactly the device registered in the Devmem binding
(`binding->dev`).

These changes prevent unnecessary ENODEV failures while
maintaining the critical safety requirement that the
Devmem resources are only used on the bound network device.

Reviewed-by: Bobby Eshleman <[email protected]>
Reported-by: Eric Dumazet <[email protected]>
Reported-by: Vedant Mathur <[email protected]>
Suggested-by: Eric Dumazet <[email protected]>
Fixes: bd61848 ("net: devmem: Implement TX path")
Signed-off-by: Shivaji Kant <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Fix an issue detected by syzbot:

KMSAN reported an uninitialized-value access in sctp_inq_pop
BUG: KMSAN: uninit-value in sctp_inq_pop

The issue is actually caused by skb trimming via sk_filter() in sctp_rcv().
In the reproducer, skb->len becomes 1 after sk_filter(), which bypassed the
original check:

        if (skb->len < sizeof(struct sctphdr) + sizeof(struct sctp_chunkhdr) +
                       skb_transport_offset(skb))
To handle this safely, a new check should be performed after sk_filter().

Reported-by: [email protected]
Tested-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=d101e12bccd4095460e7
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Suggested-by: Xin Long <[email protected]>
Signed-off-by: Ranganath V N <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
Merge a cpuidle fix and two fixes related to system sleep for 6.18-rc4:

 - Add an exit latency check to the menu cpuidle governor in the case
   when it considers using a real idle state instead of a polling one to
   address a performance regression (Rafael Wysocki)

 - Revert an attempted cleanup of a system suspend code path that
   introduced a regression elsewhere (Samuel Wu)

 - Allow pm_restrict_gfp_mask() to be called multiple times in a row
   and adjust pm_restore_gfp_mask() accordingly to avoid having to play
   nasty games with these calls during hibernation (Rafael Wysocki)

* pm-cpuidle:
  cpuidle: governors: menu: Select polling state in some more cases

* pm-sleep:
  PM: sleep: Allow pm_restrict_gfp_mask() stacking
  Revert "PM: sleep: Make pm_wakeup_clear() call more clear"
Merge ACPI button, ACPI backlight (video), and ACPI fan driver fixes for
6.18-rc4:

 - Call input_free_device() on failing input device registration as
   necessary (and mentioned in the input subsystem documentation) in the
   ACPI button driver (Kaushlendra Kumar)

 - Fix use-after-free in acpi_video_switch_brightness() by canceling
   a delayed work during tear-down (Yuhao Jiang)

 - Use platform device for devres-related actions in the ACPI fan driver
   to allow device-managed resources to be cleaned up properly (Armin
   Wolf)

* acpi-button:
  ACPI: button: Call input_free_device() on failing input device registration

* acpi-video:
  ACPI: video: Fix use-after-free in acpi_video_switch_brightness()

* acpi-fan:
  ACPI: fan: Use platform device for devres-related actions
  ACPI: fan: Use ACPI handle when retrieving _FST
…/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from wireless, Bluetooth and netfilter.

  Current release - regressions:

    - tcp: fix too slow tcp_rcvbuf_grow() action

    - bluetooth: fix corruption in h4_recv_buf() after cleanup

  Previous releases - regressions:

    - mptcp: restore window probe

    - bluetooth:
       - fix connection cleanup with BIG with 2 or more BIS
       - fix crash in set_mesh_sync and set_mesh_complete

    - batman-adv: release references to inactive interfaces

    - nic:
       - ice: fix usage of logical PF id
       - sfc: fix potential memory leak in efx_mae_process_mport()

  Previous releases - always broken:

    - devmem: refresh devmem TX dst in case of route invalidation

    - netfilter: add seqadj extension for natted connections

    - wifi:
       - iwlwifi: fix potential use after free in iwl_mld_remove_link()
       - brcmfmac: fix crash while sending action frames in standalone AP Mode

    - eth:
       - mlx5e: cancel tls RX async resync request in error flows
       - ixgbe: fix memory leak and use-after-free in ixgbe_recovery_probe()
       - hibmcge: fix rx buf avl irq is not re-enabled in irq_handle issue
       - cxgb4: fix potential use-after-free in ipsec callback
       - nfp: fix memory leak in nfp_net_alloc()"

* tag 'net-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (75 commits)
  net: sctp: fix KMSAN uninit-value in sctp_inq_pop
  net: devmem: refresh devmem TX dst in case of route invalidation
  net: stmmac: est: Fix GCL bounds checks
  net: stmmac: Consider Tx VLAN offload tag length for maxSDU
  net: stmmac: vlan: Disable 802.1AD tag insertion offload
  net/mlx5e: kTLS, Cancel RX async resync request in error flows
  net: tls: Cancel RX async resync request on rcd_delta overflow
  net: tls: Change async resync helpers argument
  net: phy: dp83869: fix STRAP_OPMODE bitmask
  selftests: net: use BASH for bareudp testing
  net: mctp: Fix tx queue stall
  net/mlx5: Don't zero user_count when destroying FDB tables
  net: usb: asix_devices: Check return value of usbnet_get_endpoints
  mptcp: zero window probe mib
  mptcp: restore window probe
  mptcp: fix MSG_PEEK stream corruption
  mptcp: drop bogus optimization in __mptcp_check_push()
  netconsole: Fix race condition in between reader and writer of userdata
  Documentation: netconsole: Remove obsolete contact people
  nfp: xsk: fix memory leak in nfp_net_alloc()
  ...
…kernel/git/deller/linux-fbdev

Pull fbdev fixes from Helge Deller:

 - atyfb: Avoid hard lock up when PLL not initialized (Daniel Palmer)

 - pvr2fb: Fix build error when CONFIG_PVR2_DMA enabled (Florian Fuchs)

 - bitblit: Fix out-of-bounds read in bit_putcs* (Junjie Cao)

 - valkyriefb: Fix reference count leak (Miaoqian Lin)

 - fbcon: Fix slab-use-after-free in fb_mode_is_equal (Quanmin Yan)

 - fb.h: Fix typo in "vertical" (Piyush Choudhary)

* tag 'fbdev-for-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
  fbdev: atyfb: Check if pll_ops->init_pll failed
  fbcon: Set fb_display[i]->mode to NULL when the mode is released
  fbdev: bitblit: bound-check glyph index in bit_putcs*
  fbdev: pvr2fb: Fix leftover reference to ONCHIP_NR_DMA_CHANNELS
  fbdev: valkyriefb: Fix reference count leak in valkyriefb_init
  video: fb: Fix typo in comment in fb.h
…git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix three regressions, two recent ones and one introduced during
  the 6.17 development cycle:

   - Add an exit latency check to the menu cpuidle governor in the case
     when it considers using a real idle state instead of a polling one
     to address a performance regression (Rafael Wysocki)

   - Revert an attempted cleanup of a system suspend code path that
     introduced a regression elsewhere (Samuel Wu)

   - Allow pm_restrict_gfp_mask() to be called multiple times in a row
     and adjust pm_restore_gfp_mask() accordingly to avoid having to
     play nasty games with these calls during hibernation (Rafael
     Wysocki)"

* tag 'pm-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: sleep: Allow pm_restrict_gfp_mask() stacking
  cpuidle: governors: menu: Select polling state in some more cases
  Revert "PM: sleep: Make pm_wakeup_clear() call more clear"
…l/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These fix three ACPI driver issues and add version checks to two ACPI
  table parsers:

   - Call input_free_device() on failing input device registration as
     necessary (and mentioned in the input subsystem documentation) in
     the ACPI button driver (Kaushlendra Kumar)

   - Fix use-after-free in acpi_video_switch_brightness() by canceling a
     delayed work during tear-down (Yuhao Jiang)

   - Use platform device for devres-related actions in the ACPI fan
     driver to allow device-managed resources to be cleaned up properly
     (Armin Wolf)

   - Add version checks to the MRRM and SPCR table parsers (Tony Luck
     and Punit Agrawal)"

* tag 'acpi-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: SPCR: Check for table version when using precise baudrate
  ACPI: MRRM: Check revision of MRRM table
  ACPI: fan: Use platform device for devres-related actions
  ACPI: fan: Use ACPI handle when retrieving _FST
  ACPI: video: Fix use-after-free in acpi_video_switch_brightness()
  ACPI: button: Call input_free_device() on failing input device registration
…org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit fixes from Shuah Khan:
 "Fix log overwrite in param_tests and fixes incorrect cast of priv
  pointer in test_dev_action().

  Update email address for Rae Moar in MAINTAINERS KUnit entry"

* tag 'linux_kselftest-kunit-fixes-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  MAINTAINERS: Update KUnit email address for Rae Moar
  kunit: prevent log overwrite in param_tests
  kunit: test_dev_action: Correctly cast 'priv' pointer to long*
…b/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:
 "Fix build warning in cachestat found during clang build and add
  tmpshmcstat to .gitignore"

* tag 'linux_kselftest-fixes-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: cachestat: Fix warning on declaration under label
  selftests/cachestat: add tmpshmcstat file to .gitignore
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.