| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
drm/amd/display: Fix potential null-deref in dm_resume
[Why]
Fixing smatch error:
dm_resume() error: we previously assumed 'aconnector->dc_link' could be null
[How]
Check if dc_link null at the beginning of the loop,
so further checks can be dropped. |
| In the Linux kernel, the following vulnerability has been resolved:
drm/amdgpu: Fix size validation for non-exclusive domains (v4)
Fix amdgpu_bo_validate_size() to check whether the TTM domain manager for the
requested memory exists, else we get a kernel oops when dereferencing "man".
v2: Make the patch standalone, i.e. not dependent on local patches.
v3: Preserve old behaviour and just check that the manager pointer is not
NULL.
v4: Complain if GTT domain requested and it is uninitialized--most likely a
bug. |
| In the Linux kernel, the following vulnerability has been resolved:
staging: rtl8723bs: fix a potential memory leak in rtw_init_cmd_priv()
In rtw_init_cmd_priv(), if `pcmdpriv->rsp_allocated_buf` is allocated
in failure, then `pcmdpriv->cmd_allocated_buf` will be not properly
released. Besides, considering there are only two error paths and the
first one can directly return, so we do not need implicitly jump to the
`exit` tag to execute the error handler.
So this patch added `kfree(pcmdpriv->cmd_allocated_buf);` on the error
path to release the resource and simplified the return logic of
rtw_init_cmd_priv(). As there is no proper device to test with, no runtime
testing was performed. |
| In the Linux kernel, the following vulnerability has been resolved:
arm64: fix oops in concurrently setting insn_emulation sysctls
emulation_proc_handler() changes table->data for proc_dointvec_minmax
and can generate the following Oops if called concurrently with itself:
| Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
| Internal error: Oops: 96000006 [#1] SMP
| Call trace:
| update_insn_emulation_mode+0xc0/0x148
| emulation_proc_handler+0x64/0xb8
| proc_sys_call_handler+0x9c/0xf8
| proc_sys_write+0x18/0x20
| __vfs_write+0x20/0x48
| vfs_write+0xe4/0x1d0
| ksys_write+0x70/0xf8
| __arm64_sys_write+0x20/0x28
| el0_svc_common.constprop.0+0x7c/0x1c0
| el0_svc_handler+0x2c/0xa0
| el0_svc+0x8/0x200
To fix this issue, keep the table->data as &insn->current_mode and
use container_of() to retrieve the insn pointer. Another mutex is
used to protect against the current_mode update but not for retrieving
insn_emulation as table->data is no longer changing. |
| In the Linux kernel, the following vulnerability has been resolved:
arm64: compat: Do not treat syscall number as ESR_ELx for a bad syscall
If a compat process tries to execute an unknown system call above the
__ARM_NR_COMPAT_END number, the kernel sends a SIGILL signal to the
offending process. Information about the error is printed to dmesg in
compat_arm_syscall() -> arm64_notify_die() -> arm64_force_sig_fault() ->
arm64_show_signal().
arm64_show_signal() interprets a non-zero value for
current->thread.fault_code as an exception syndrome and displays the
message associated with the ESR_ELx.EC field (bits 31:26).
current->thread.fault_code is set in compat_arm_syscall() ->
arm64_notify_die() with the bad syscall number instead of a valid ESR_ELx
value. This means that the ESR_ELx.EC field has the value that the user set
for the syscall number and the kernel can end up printing bogus exception
messages*. For example, for the syscall number 0x68000000, which evaluates
to ESR_ELx.EC value of 0x1A (ESR_ELx_EC_FPAC) the kernel prints this error:
[ 18.349161] syscall[300]: unhandled exception: ERET/ERETAA/ERETAB, ESR 0x68000000, Oops - bad compat syscall(2) in syscall[10000+50000]
[ 18.350639] CPU: 2 PID: 300 Comm: syscall Not tainted 5.18.0-rc1 #79
[ 18.351249] Hardware name: Pine64 RockPro64 v2.0 (DT)
[..]
which is misleading, as the bad compat syscall has nothing to do with
pointer authentication.
Stop arm64_show_signal() from printing exception syndrome information by
having compat_arm_syscall() set the ESR_ELx value to 0, as it has no
meaning for an invalid system call number. The example above now becomes:
[ 19.935275] syscall[301]: unhandled exception: Oops - bad compat syscall(2) in syscall[10000+50000]
[ 19.936124] CPU: 1 PID: 301 Comm: syscall Not tainted 5.18.0-rc1-00005-g7e08006d4102 #80
[ 19.936894] Hardware name: Pine64 RockPro64 v2.0 (DT)
[..]
which although shows less information because the syscall number,
wrongfully advertised as the ESR value, is missing, it is better than
showing plainly wrong information. The syscall number can be easily
obtained with strace.
*A 32-bit value above or equal to 0x8000_0000 is interpreted as a negative
integer in compat_arm_syscal() and the condition scno < __ARM_NR_COMPAT_END
evaluates to true; the syscall will exit to userspace in this case with the
ENOSYS error code instead of arm64_notify_die() being called. |
| In the Linux kernel, the following vulnerability has been resolved:
nvme-rdma: fix possible use-after-free in transport error_recovery work
While nvme_rdma_submit_async_event_work is checking the ctrl and queue
state before preparing the AER command and scheduling io_work, in order
to fully prevent a race where this check is not reliable the error
recovery work must flush async_event_work before continuing to destroy
the admin queue after setting the ctrl state to RESETTING such that
there is no race .submit_async_event and the error recovery handler
itself changing the ctrl state. |
| In the Linux kernel, the following vulnerability has been resolved:
drm/gma500: Fix WARN_ON(lock->magic != lock) error
psb_gem_unpin() calls dma_resv_lock() but the underlying ww_mutex
gets destroyed by drm_gem_object_release() move the
drm_gem_object_release() call in psb_gem_free_object() to after
the unpin to fix the below warning:
[ 79.693962] ------------[ cut here ]------------
[ 79.693992] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[ 79.694015] WARNING: CPU: 0 PID: 240 at kernel/locking/mutex.c:582 __ww_mutex_lock.constprop.0+0x569/0xfb0
[ 79.694052] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer qrtr bnep ath9k ath9k_common ath9k_hw snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel ath3k snd_intel_dspcfg mac80211 snd_intel_sdw_acpi btusb snd_hda_codec btrtl btbcm btintel btmtk bluetooth at24 snd_hda_core snd_hwdep uvcvideo snd_seq libarc4 videobuf2_vmalloc ath videobuf2_memops videobuf2_v4l2 videobuf2_common snd_seq_device videodev acer_wmi intel_powerclamp coretemp mc snd_pcm joydev sparse_keymap ecdh_generic pcspkr wmi_bmof cfg80211 i2c_i801 i2c_smbus snd_timer snd r8169 rfkill lpc_ich soundcore acpi_cpufreq zram rtsx_pci_sdmmc mmc_core serio_raw rtsx_pci gma500_gfx(E) video wmi ip6_tables ip_tables i2c_dev fuse
[ 79.694436] CPU: 0 PID: 240 Comm: plymouthd Tainted: G W E 6.0.0-rc3+ #490
[ 79.694457] Hardware name: Packard Bell dot s/SJE01_CT, BIOS V1.10 07/23/2013
[ 79.694469] RIP: 0010:__ww_mutex_lock.constprop.0+0x569/0xfb0
[ 79.694496] Code: ff 85 c0 0f 84 15 fb ff ff 8b 05 ca 3c 11 01 85 c0 0f 85 07 fb ff ff 48 c7 c6 30 cb 84 aa 48 c7 c7 a3 e1 82 aa e8 ac 29 f8 ff <0f> 0b e9 ed fa ff ff e8 5b 83 8a ff 85 c0 74 10 44 8b 0d 98 3c 11
[ 79.694513] RSP: 0018:ffffad1dc048bbe0 EFLAGS: 00010282
[ 79.694623] RAX: 0000000000000028 RBX: 0000000000000000 RCX: 0000000000000000
[ 79.694636] RDX: 0000000000000001 RSI: ffffffffaa8b0ffc RDI: 00000000ffffffff
[ 79.694650] RBP: ffffad1dc048bc80 R08: 0000000000000000 R09: ffffad1dc048ba90
[ 79.694662] R10: 0000000000000003 R11: ffffffffaad62fe8 R12: ffff9ff302103138
[ 79.694675] R13: ffff9ff306ec8000 R14: ffff9ff307779078 R15: ffff9ff3014c0270
[ 79.694690] FS: 00007ff1cccf1740(0000) GS:ffff9ff3bc200000(0000) knlGS:0000000000000000
[ 79.694705] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 79.694719] CR2: 0000559ecbcb4420 CR3: 0000000013210000 CR4: 00000000000006f0
[ 79.694734] Call Trace:
[ 79.694749] <TASK>
[ 79.694761] ? __schedule+0x47f/0x1670
[ 79.694796] ? psb_gem_unpin+0x27/0x1a0 [gma500_gfx]
[ 79.694830] ? lock_is_held_type+0xe3/0x140
[ 79.694864] ? ww_mutex_lock+0x38/0xa0
[ 79.694885] ? __cond_resched+0x1c/0x30
[ 79.694902] ww_mutex_lock+0x38/0xa0
[ 79.694925] psb_gem_unpin+0x27/0x1a0 [gma500_gfx]
[ 79.694964] psb_gem_unpin+0x199/0x1a0 [gma500_gfx]
[ 79.694996] drm_gem_object_release_handle+0x50/0x60
[ 79.695020] ? drm_gem_object_handle_put_unlocked+0xf0/0xf0
[ 79.695042] idr_for_each+0x4b/0xb0
[ 79.695066] ? _raw_spin_unlock_irqrestore+0x30/0x60
[ 79.695095] drm_gem_release+0x1c/0x30
[ 79.695118] drm_file_free.part.0+0x1ea/0x260
[ 79.695150] drm_release+0x6a/0x120
[ 79.695175] __fput+0x9f/0x260
[ 79.695203] task_work_run+0x59/0xa0
[ 79.695227] do_exit+0x387/0xbe0
[ 79.695250] ? seqcount_lockdep_reader_access.constprop.0+0x82/0x90
[ 79.695275] ? lockdep_hardirqs_on+0x7d/0x100
[ 79.695304] do_group_exit+0x33/0xb0
[ 79.695331] __x64_sys_exit_group+0x14/0x20
[ 79.695353] do_syscall_64+0x58/0x80
[ 79.695376] ? up_read+0x17/0x20
[ 79.695401] ? lock_is_held_type+0xe3/0x140
[ 79.695429] ? asm_exc_page_fault+0x22/0x30
[ 79.695450] ? lockdep_hardirqs_on+0x7d/0x100
[ 79.695473] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 79.695493] RIP: 0033:0x7ff1ccefe3f1
[ 79.695516] Code: Unable to access opcode bytes at RIP 0x7ff1ccefe3c7.
[ 79.695607] RSP: 002b:00007ffed4413378 EFLAGS:
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
usb: typec: ucsi: fix use-after-free caused by uec->work
The delayed work uec->work is scheduled in gaokun_ucsi_probe()
but never properly canceled in gaokun_ucsi_remove(). This creates
use-after-free scenarios where the ucsi and gaokun_ucsi structure
are freed after ucsi_destroy() completes execution, while the
gaokun_ucsi_register_worker() might be either currently executing
or still pending in the work queue. The already-freed gaokun_ucsi
or ucsi structure may then be accessed.
Furthermore, the race window is 3 seconds, which is sufficiently
long to make this bug easily reproducible. The following is the
trace captured by KASAN:
==================================================================
BUG: KASAN: slab-use-after-free in __run_timers+0x5ec/0x630
Write of size 8 at addr ffff00000ec28cc8 by task swapper/0/0
...
Call trace:
show_stack+0x18/0x24 (C)
dump_stack_lvl+0x78/0x90
print_report+0x114/0x580
kasan_report+0xa4/0xf0
__asan_report_store8_noabort+0x20/0x2c
__run_timers+0x5ec/0x630
run_timer_softirq+0xe8/0x1cc
handle_softirqs+0x294/0x720
__do_softirq+0x14/0x20
____do_softirq+0x10/0x1c
call_on_irq_stack+0x30/0x48
do_softirq_own_stack+0x1c/0x28
__irq_exit_rcu+0x27c/0x364
irq_exit_rcu+0x10/0x1c
el1_interrupt+0x40/0x60
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x6c/0x70
arch_local_irq_enable+0x4/0x8 (P)
do_idle+0x334/0x458
cpu_startup_entry+0x60/0x70
rest_init+0x158/0x174
start_kernel+0x2f8/0x394
__primary_switched+0x8c/0x94
Allocated by task 72 on cpu 0 at 27.510341s:
kasan_save_stack+0x2c/0x54
kasan_save_track+0x24/0x5c
kasan_save_alloc_info+0x40/0x54
__kasan_kmalloc+0xa0/0xb8
__kmalloc_node_track_caller_noprof+0x1c0/0x588
devm_kmalloc+0x7c/0x1c8
gaokun_ucsi_probe+0xa0/0x840 auxiliary_bus_probe+0x94/0xf8
really_probe+0x17c/0x5b8
__driver_probe_device+0x158/0x2c4
driver_probe_device+0x10c/0x264
__device_attach_driver+0x168/0x2d0
bus_for_each_drv+0x100/0x188
__device_attach+0x174/0x368
device_initial_probe+0x14/0x20
bus_probe_device+0x120/0x150
device_add+0xb3c/0x10fc
__auxiliary_device_add+0x88/0x130
...
Freed by task 73 on cpu 1 at 28.910627s:
kasan_save_stack+0x2c/0x54
kasan_save_track+0x24/0x5c
__kasan_save_free_info+0x4c/0x74
__kasan_slab_free+0x60/0x8c
kfree+0xd4/0x410
devres_release_all+0x140/0x1f0
device_unbind_cleanup+0x20/0x190
device_release_driver_internal+0x344/0x460
device_release_driver+0x18/0x24
bus_remove_device+0x198/0x274
device_del+0x310/0xa84
...
The buggy address belongs to the object at ffff00000ec28c00
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 200 bytes inside of
freed 512-byte region
The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x4ec28
head: order:2 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x3fffe0000000040(head|node=0|zone=0|lastcpupid=0x1ffff)
page_type: f5(slab)
raw: 03fffe0000000040 ffff000008801c80 dead000000000122 0000000000000000
raw: 0000000000000000 0000000080100010 00000000f5000000 0000000000000000
head: 03fffe0000000040 ffff000008801c80 dead000000000122 0000000000000000
head: 0000000000000000 0000000080100010 00000000f5000000 0000000000000000
head: 03fffe0000000002 fffffdffc03b0a01 00000000ffffffff 00000000ffffffff
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000004
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff00000ec28b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff00000ec28c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff00000ec28c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff00000ec28d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff00000ec28d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
================================================================
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
scsi: imm: Fix use-after-free bug caused by unfinished delayed work
The delayed work item 'imm_tq' is initialized in imm_attach() and
scheduled via imm_queuecommand() for processing SCSI commands. When the
IMM parallel port SCSI host adapter is detached through imm_detach(),
the imm_struct device instance is deallocated.
However, the delayed work might still be pending or executing
when imm_detach() is called, leading to use-after-free bugs
when the work function imm_interrupt() accesses the already
freed imm_struct memory.
The race condition can occur as follows:
CPU 0(detach thread) | CPU 1
| imm_queuecommand()
| imm_queuecommand_lck()
imm_detach() | schedule_delayed_work()
kfree(dev) //FREE | imm_interrupt()
| dev = container_of(...) //USE
dev-> //USE
Add disable_delayed_work_sync() in imm_detach() to guarantee proper
cancellation of the delayed work item before imm_struct is deallocated. |
| In the Linux kernel, the following vulnerability has been resolved:
fbdev: core: fbcvt: avoid division by 0 in fb_cvt_hperiod()
In fb_find_mode_cvt(), iff mode->refresh somehow happens to be 0x80000000,
cvt.f_refresh will become 0 when multiplying it by 2 due to overflow. It's
then passed to fb_cvt_hperiod(), where it's used as a divider -- division
by 0 will result in kernel oops. Add a sanity check for cvt.f_refresh to
avoid such overflow...
Found by Linux Verification Center (linuxtesting.org) with the Svace static
analysis tool. |
| In the Linux kernel, the following vulnerability has been resolved:
seg6: Fix validation of nexthop addresses
The kernel currently validates that the length of the provided nexthop
address does not exceed the specified length. This can lead to the
kernel reading uninitialized memory if user space provided a shorter
length than the specified one.
Fix by validating that the provided length exactly matches the specified
one. |
| In the Linux kernel, the following vulnerability has been resolved:
ptp: remove ptp->n_vclocks check logic in ptp_vclock_in_use()
There is no disagreement that we should check both ptp->is_virtual_clock
and ptp->n_vclocks to check if the ptp virtual clock is in use.
However, when we acquire ptp->n_vclocks_mux to read ptp->n_vclocks in
ptp_vclock_in_use(), we observe a recursive lock in the call trace
starting from n_vclocks_store().
============================================
WARNING: possible recursive locking detected
6.15.0-rc6 #1 Not tainted
--------------------------------------------
syz.0.1540/13807 is trying to acquire lock:
ffff888035a24868 (&ptp->n_vclocks_mux){+.+.}-{4:4}, at:
ptp_vclock_in_use drivers/ptp/ptp_private.h:103 [inline]
ffff888035a24868 (&ptp->n_vclocks_mux){+.+.}-{4:4}, at:
ptp_clock_unregister+0x21/0x250 drivers/ptp/ptp_clock.c:415
but task is already holding lock:
ffff888030704868 (&ptp->n_vclocks_mux){+.+.}-{4:4}, at:
n_vclocks_store+0xf1/0x6d0 drivers/ptp/ptp_sysfs.c:215
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&ptp->n_vclocks_mux);
lock(&ptp->n_vclocks_mux);
*** DEADLOCK ***
....
============================================
The best way to solve this is to remove the logic that checks
ptp->n_vclocks in ptp_vclock_in_use().
The reason why this is appropriate is that any path that uses
ptp->n_vclocks must unconditionally check if ptp->n_vclocks is greater
than 0 before unregistering vclocks, and all functions are already
written this way. And in the function that uses ptp->n_vclocks, we
already get ptp->n_vclocks_mux before unregistering vclocks.
Therefore, we need to remove the redundant check for ptp->n_vclocks in
ptp_vclock_in_use() to prevent recursive locking. |
| In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: Fix NULL pointer deference on eir_get_service_data
The len parameter is considered optional so it can be NULL so it cannot
be used for skipping to next entry of EIR_SERVICE_DATA. |
| In the Linux kernel, the following vulnerability has been resolved:
crypto: sun8i-ce-cipher - fix error handling in sun8i_ce_cipher_prepare()
Fix two DMA cleanup issues on the error path in sun8i_ce_cipher_prepare():
1] If dma_map_sg() fails for areq->dst, the device driver would try to free
DMA memory it has not allocated in the first place. To fix this, on the
"theend_sgs" error path, call dma unmap only if the corresponding dma
map was successful.
2] If the dma_map_single() call for the IV fails, the device driver would
try to free an invalid DMA memory address on the "theend_iv" path:
------------[ cut here ]------------
DMA-API: sun8i-ce 1904000.crypto: device driver tries to free an invalid DMA memory address
WARNING: CPU: 2 PID: 69 at kernel/dma/debug.c:968 check_unmap+0x123c/0x1b90
Modules linked in: skcipher_example(O+)
CPU: 2 UID: 0 PID: 69 Comm: 1904000.crypto- Tainted: G O 6.15.0-rc3+ #24 PREEMPT
Tainted: [O]=OOT_MODULE
Hardware name: OrangePi Zero2 (DT)
pc : check_unmap+0x123c/0x1b90
lr : check_unmap+0x123c/0x1b90
...
Call trace:
check_unmap+0x123c/0x1b90 (P)
debug_dma_unmap_page+0xac/0xc0
dma_unmap_page_attrs+0x1f4/0x5fc
sun8i_ce_cipher_do_one+0x1bd4/0x1f40
crypto_pump_work+0x334/0x6e0
kthread_worker_fn+0x21c/0x438
kthread+0x374/0x664
ret_from_fork+0x10/0x20
---[ end trace 0000000000000000 ]---
To fix this, check for !dma_mapping_error() before calling
dma_unmap_single() on the "theend_iv" path. |
| In the Linux kernel, the following vulnerability has been resolved:
EDAC/skx_common: Fix general protection fault
After loading i10nm_edac (which automatically loads skx_edac_common), if
unload only i10nm_edac, then reload it and perform error injection testing,
a general protection fault may occur:
mce: [Hardware Error]: Machine check events logged
Oops: general protection fault ...
...
Workqueue: events mce_gen_pool_process
RIP: 0010:string+0x53/0xe0
...
Call Trace:
<TASK>
? die_addr+0x37/0x90
? exc_general_protection+0x1e7/0x3f0
? asm_exc_general_protection+0x26/0x30
? string+0x53/0xe0
vsnprintf+0x23e/0x4c0
snprintf+0x4d/0x70
skx_adxl_decode+0x16a/0x330 [skx_edac_common]
skx_mce_check_error.part.0+0xf8/0x220 [skx_edac_common]
skx_mce_check_error+0x17/0x20 [skx_edac_common]
...
The issue arose was because the variable 'adxl_component_count' (inside
skx_edac_common), which counts the ADXL components, was not reset. During
the reloading of i10nm_edac, the count was incremented by the actual number
of ADXL components again, resulting in a count that was double the real
number of ADXL components. This led to an out-of-bounds reference to the
ADXL component array, causing the general protection fault above.
Fix this issue by resetting the 'adxl_component_count' in adxl_put(),
which is called during the unloading of {skx,i10nm}_edac. |
| In the Linux kernel, the following vulnerability has been resolved:
ftrace: Add cond_resched() to ftrace_graph_set_hash()
When the kernel contains a large number of functions that can be traced,
the loop in ftrace_graph_set_hash() may take a lot of time to execute.
This may trigger the softlockup watchdog.
Add cond_resched() within the loop to allow the kernel to remain
responsive even when processing a large number of functions.
This matches the cond_resched() that is used in other locations of the
code that iterates over all functions that can be traced. |
| In the Linux kernel, the following vulnerability has been resolved:
tracing: Verify event formats that have "%*p.."
The trace event verifier checks the formats of trace events to make sure
that they do not point at memory that is not in the trace event itself or
in data that will never be freed. If an event references data that was
allocated when the event triggered and that same data is freed before the
event is read, then the kernel can crash by reading freed memory.
The verifier runs at boot up (or module load) and scans the print formats
of the events and checks their arguments to make sure that dereferenced
pointers are safe. If the format uses "%*p.." the verifier will ignore it,
and that could be dangerous. Cover this case as well.
Also add to the sample code a use case of "%*pbl". |
| In the Linux kernel, the following vulnerability has been resolved:
objtool, media: dib8000: Prevent divide-by-zero in dib8000_set_dds()
If dib8000_set_dds()'s call to dib8000_read32() returns zero, the result
is a divide-by-zero. Prevent that from happening.
Fixes the following warning with an UBSAN kernel:
drivers/media/dvb-frontends/dib8000.o: warning: objtool: dib8000_tune() falls through to next function dib8096p_cfg_DibRx() |
| In the Linux kernel, the following vulnerability has been resolved:
perf/x86/intel: KVM: Mask PEBS_ENABLE loaded for guest with vCPU's value.
When generating the MSR_IA32_PEBS_ENABLE value that will be loaded on
VM-Entry to a KVM guest, mask the value with the vCPU's desired PEBS_ENABLE
value. Consulting only the host kernel's host vs. guest masks results in
running the guest with PEBS enabled even when the guest doesn't want to use
PEBS. Because KVM uses perf events to proxy the guest virtual PMU, simply
looking at exclude_host can't differentiate between events created by host
userspace, and events created by KVM on behalf of the guest.
Running the guest with PEBS unexpectedly enabled typically manifests as
crashes due to a near-infinite stream of #PFs. E.g. if the guest hasn't
written MSR_IA32_DS_AREA, the CPU will hit page faults on address '0' when
trying to record PEBS events.
The issue is most easily reproduced by running `perf kvm top` from before
commit 7b100989b4f6 ("perf evlist: Remove __evlist__add_default") (after
which, `perf kvm top` effectively stopped using PEBS). The userspace side
of perf creates a guest-only PEBS event, which intel_guest_get_msrs()
misconstrues a guest-*owned* PEBS event.
Arguably, this is a userspace bug, as enabling PEBS on guest-only events
simply cannot work, and userspace can kill VMs in many other ways (there
is no danger to the host). However, even if this is considered to be bad
userspace behavior, there's zero downside to perf/KVM restricting PEBS to
guest-owned events.
Note, commit 854250329c02 ("KVM: x86/pmu: Disable guest PEBS temporarily
in two rare situations") fixed the case where host userspace is profiling
KVM *and* userspace, but missed the case where userspace is profiling only
KVM. |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: adjust subpage bit start based on sectorsize
When running machines with 64k page size and a 16k nodesize we started
seeing tree log corruption in production. This turned out to be because
we were not writing out dirty blocks sometimes, so this in fact affects
all metadata writes.
When writing out a subpage EB we scan the subpage bitmap for a dirty
range. If the range isn't dirty we do
bit_start++;
to move onto the next bit. The problem is the bitmap is based on the
number of sectors that an EB has. So in this case, we have a 64k
pagesize, 16k nodesize, but a 4k sectorsize. This means our bitmap is 4
bits for every node. With a 64k page size we end up with 4 nodes per
page.
To make this easier this is how everything looks
[0 16k 32k 48k ] logical address
[0 4 8 12 ] radix tree offset
[ 64k page ] folio
[ 16k eb ][ 16k eb ][ 16k eb ][ 16k eb ] extent buffers
[ | | | | | | | | | | | | | | | | ] bitmap
Now we use all of our addressing based on fs_info->sectorsize_bits, so
as you can see the above our 16k eb->start turns into radix entry 4.
When we find a dirty range for our eb, we correctly do bit_start +=
sectors_per_node, because if we start at bit 0, the next bit for the
next eb is 4, to correspond to eb->start 16k.
However if our range is clean, we will do bit_start++, which will now
put us offset from our radix tree entries.
In our case, assume that the first time we check the bitmap the block is
not dirty, we increment bit_start so now it == 1, and then we loop
around and check again. This time it is dirty, and we go to find that
start using the following equation
start = folio_start + bit_start * fs_info->sectorsize;
so in the case above, eb->start 0 is now dirty, and we calculate start
as
0 + 1 * fs_info->sectorsize = 4096
4096 >> 12 = 1
Now we're looking up the radix tree for 1, and we won't find an eb.
What's worse is now we're using bit_start == 1, so we do bit_start +=
sectors_per_node, which is now 5. If that eb is dirty we will run into
the same thing, we will look at an offset that is not populated in the
radix tree, and now we're skipping the writeout of dirty extent buffers.
The best fix for this is to not use sectorsize_bits to address nodes,
but that's a larger change. Since this is a fs corruption problem fix
it simply by always using sectors_per_node to increment the start bit. |