- 30 May, 2023 1 commit
Michael Tokarev authored
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
- 29 May, 2023 2 commits
Kevin Wolf authored
There are some error paths in blk_exp_add() that jump to 'fail:' before 'exp' is even created. So we can't just unconditionally access exp->blk. Add a NULL check, and switch from exp->blk to blk, which is available earlier, just to be extra sure that we really cover all cases where BlockDevOps could have been set for it (in practice, this only happens in drv->create() today, so this part of the change isn't strictly necessary).
Fixes: Coverity CID 1509238
Fixes: de79b526
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-3-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit a1845637)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
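
Not part of the patch: a minimal, self-contained C sketch of the guarded-cleanup pattern the message describes; the names (exp_add, Exp) are illustrative, not QEMU's actual code.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct Exp { void *blk; } Exp;

/* Illustrative only: error paths may jump to 'fail' before 'exp' exists,
 * so the cleanup code must not dereference it unconditionally. */
static int exp_add(int fail_early)
{
    Exp *exp = NULL;
    void *blk = malloc(16);          /* available earlier than 'exp' */

    if (!blk) {
        goto fail;
    }
    if (fail_early) {                /* failure before 'exp' is created */
        goto fail;
    }

    exp = calloc(1, sizeof(*exp));
    if (!exp) {
        goto fail;
    }
    exp->blk = blk;
    return 0;

fail:
    if (exp) {                       /* NULL check avoids the crash */
        free(exp);
    }
    free(blk);                       /* use 'blk', which always exists here */
    return -1;
}

int main(void)
{
    printf("early failure handled: %d\n", exp_add(1));
    return 0;
}
```
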
-
Michael Tokarev authored
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
- 27 May, 2023 1 commit
Paolo Bonzini authored
The VirtioInfoList is already allocated by QAPI_LIST_PREPEND and need not be allocated by the caller. Fixes Coverity CID 1508724.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0bfd1414)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
- 26 May, 2023 4 commits
Igor Mammedov authored
QEMU aborts when the default RAM backend should be used (i.e. no explicit '-machine memory-backend=' was specified) but the user has created an object whose 'id' equals the default RAM backend name used by the board:

  $QEMU -machine pc \
      -object memory-backend-ram,id=pc.ram,size=4294967296

Actual results:

  QEMU 7.2.0 monitor - type 'help' for more information
  (qemu) Unexpected error in object_property_try_add() at ../qom/object.c:1239:
  qemu-kvm: attempt to add duplicate property 'pc.ram' to object (type 'container')
  Aborted (core dumped)

Instead of aborting, check for the conflicting 'id' and exit with an error, suggesting how to remedy the issue.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2207886
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230522131717.3780533-1-imammedo@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit a37531f2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
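
Not from the patch itself: a tiny self-contained C sketch of the general idea — detect the id collision up front and exit with a friendly error instead of hitting an abort later. All names (user_object_ids, id_already_taken) are illustrative.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative only: before registering the board's default RAM backend,
 * check whether the user already created an object with the same id. */
static const char *user_object_ids[] = { "pc.ram", NULL };

static int id_already_taken(const char *id)
{
    for (int i = 0; user_object_ids[i]; i++) {
        if (strcmp(user_object_ids[i], id) == 0) {
            return 1;
        }
    }
    return 0;
}

int main(void)
{
    const char *default_ram_id = "pc.ram";

    if (id_already_taken(default_ram_id)) {
        fprintf(stderr,
                "object id '%s' clashes with the default RAM backend; "
                "use a different id or add '-machine memory-backend=%s'\n",
                default_ram_id, default_ram_id);
        return EXIT_FAILURE;   /* clean error instead of abort() */
    }
    printf("default RAM backend '%s' created\n", default_ram_id);
    return 0;
}
```
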
-
Thomas Huth authored
We cannot use the generic reentrancy guard in the LSI code, so we have to manually prevent endless reentrancy here. The problematic lsi_execute_script() function already has a way to detect whether too many instructions have been executed - we just have to slightly change the logic so that it also takes into account whether the function has been called too often in a reentrant way. The code in fuzz-lsi53c895a-test.c has been taken from an earlier patch by Mauro Matteo Cascella.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1563
Message-Id: <20230522091011.1082574-1-thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit b987718b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
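
Not from the patch: a minimal, self-contained C sketch of a manual reentrancy limit in the spirit of the lsi_execute_script() change; the names and the limit value are illustrative.

```c
#include <stdio.h>

/* Illustrative only: cap how deeply a script-execution routine may
 * re-enter itself. */
#define MAX_REENTRANCY 8

static int reentrancy_level;

static void execute_script(int depth)
{
    if (++reentrancy_level > MAX_REENTRANCY) {
        fprintf(stderr, "too many reentrant calls, bailing out\n");
        reentrancy_level--;
        return;
    }

    /* Device emulation work would go here; a malicious script can make
     * this call back into execute_script() again. */
    if (depth > 0) {
        execute_script(depth - 1);
    }

    reentrancy_level--;
}

int main(void)
{
    execute_script(20);   /* deep reentrancy is cut off at the limit */
    return 0;
}
```
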
-
Paolo Bonzini authored
When the OHCI controller's frame number is incremented, the HccaPad1 register should be set to zero (ref. OHCI spec 4.4). ReactOS uses HccaPad1 to determine whether the OHCI hardware is running, and consequently it fails this check on current qemu master.
Signed-off-by: Ryan Wendland <wendland@live.com.au>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1048
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6301460c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Akihiko Odaki authored
When _FORTIFY_SOURCE=2, glibc version is 2.35, and GCC version is 12.1.0, the compiler complains as follows:

  In file included from /usr/include/features.h:490,
                   from /usr/include/bits/libc-header-start.h:33,
                   from /usr/include/stdint.h:26,
                   from /usr/lib/gcc/aarch64-unknown-linux-gnu/12.1.0/include/stdint.h:9,
                   from /home/alarm/q/var/qemu/include/qemu/osdep.h:94,
                   from ../util/vfio-helpers.c:13:
  In function 'readlink',
      inlined from 'sysfs_find_group_file' at ../util/vfio-helpers.c:116:9,
      inlined from 'qemu_vfio_init_pci' at ../util/vfio-helpers.c:326:18,
      inlined from 'qemu_vfio_open_pci' at ../util/vfio-helpers.c:517:9:
  /usr/include/bits/unistd.h:119:10: error: argument 2 is null but the corresponding size argument 3 value is 4095 [-Werror=nonnull]
    119 |   return __glibc_fortify (readlink, __len, sizeof (char),
        |          ^~~~~~~~~~~~~~~

This error implies the allocated buffer can be NULL. Use g_file_read_link(), which allocates the buffer automatically, to avoid the error.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit dbdea0db)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
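
For context, a small standalone example of g_file_read_link() (a GLib API that returns a newly allocated buffer), not taken from the patch; the path used here is just an example.

```c
#include <glib.h>
#include <stdio.h>

int main(void)
{
    GError *err = NULL;
    /* g_file_read_link() allocates a buffer of the right size itself,
     * so there is no fixed-size stack buffer for fortify to complain about. */
    gchar *target = g_file_read_link("/proc/self/exe", &err);

    if (!target) {
        fprintf(stderr, "readlink failed: %s\n", err->message);
        g_error_free(err);
        return 1;
    }
    printf("link resolves to: %s\n", target);
    g_free(target);
    return 0;
}
```
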
-
- 24 May, 2023 8 commits
Stefan Hajnoczi authored
If the driver sets large_send_mss to 0 then a divide-by-zero occurs. Even if the division wasn't a problem, the for loop that emits MSS-sized packets would never terminate. Solve these issues by skipping offloading when large_send_mss=0.

This issue was found by OSS-Fuzz as part of Alexander Bulekov's device fuzzing work. The reproducer is:

  $ cat << EOF | ./qemu-system-i386 -display none -machine accel=qtest, -m \
    512M,slots=1,maxmem=0xffff000000000000 -machine q35 -nodefaults -device \
    rtl8139,netdev=net0 -netdev user,id=net0 -device \
    pc-dimm,id=nv1,memdev=mem1,addr=0xb800a64602800000 -object \
    memory-backend-ram,id=mem1,size=2M -qtest stdio
  outl 0xcf8 0x80000814
  outl 0xcfc 0xe0000000
  outl 0xcf8 0x80000804
  outw 0xcfc 0x06
  write 0xe0000037 0x1 0x04
  write 0xe00000e0 0x2 0x01
  write 0x1 0x1 0x04
  write 0x3 0x1 0x98
  write 0xa 0x1 0x8c
  write 0xb 0x1 0x02
  write 0xc 0x1 0x46
  write 0xd 0x1 0xa6
  write 0xf 0x1 0xb8
  write 0xb800a646028c000c 0x1 0x08
  write 0xb800a646028c000e 0x1 0x47
  write 0xb800a646028c0010 0x1 0x02
  write 0xb800a646028c0017 0x1 0x06
  write 0xb800a646028c0036 0x1 0x80
  write 0xe00000d9 0x1 0x40
  EOF

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1582
Closes: https://gitlab.com/qemu-project/qemu/-/issues/1582
Cc: qemu-stable@nongnu.org
Cc: Peter Maydell <peter.maydell@linaro.org>
Fixes: 6d71357a ("rtl8139: honor large send MSS value")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 792676c1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
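
Not from the patch: a minimal, self-contained C sketch of why a zero MSS must be rejected before segmenting; the function name is illustrative.

```c
#include <stdio.h>
#include <stdint.h>

/* Illustrative only: skip the offload path entirely when the guest programs
 * an MSS of 0, instead of dividing by it or looping forever. */
static int segment_packet(uint32_t payload_len, uint32_t mss)
{
    if (mss == 0) {
        /* fall back to sending the frame without offloading */
        printf("MSS is 0, skipping segmentation offload\n");
        return 0;
    }

    uint32_t packets = (payload_len + mss - 1) / mss;  /* safe: mss != 0 */
    for (uint32_t i = 0; i < packets; i++) {
        printf("emit segment %u of %u\n", i + 1, packets);
    }
    return packets;
}

int main(void)
{
    segment_packet(3000, 0);     /* the guest-controlled bad case */
    segment_packet(3000, 1460);  /* normal case */
    return 0;
}
```
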
-
Akihiko Odaki authored
igb_receive_internal() used to check the iov length to determine whether to copy the iovs to a contiguous buffer, but the check is flawed in two ways:
- It does not ensure that iovcnt > 0.
- It does not take the virtio-net header into consideration.
The size of this copy is just 22 octets, which can be even less than the code size required for the checks. This (wrong) optimization is probably not worth it, so just remove it. Removing this also allows igb to assume aligned accesses for the ethernet header.
Fixes: 3a977dee ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit dc9ef1bf)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Akihiko Odaki authored
e1000e_receive_internal() used to check the iov length to determine whether to copy the iovs to a contiguous buffer, but the check is flawed in two ways:
- It does not ensure that iovcnt > 0.
- It does not take the virtio-net header into consideration.
The size of this copy is just 18 octets, which can be even less than the code size required for the checks. This (wrong) optimization is probably not worth it, so just remove it.
Fixes: 6f3fbe4e ("net: Introduce e1000e device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 310a128e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Akihiko Odaki authored
igb does not properly ensure the buffer passed to net_rx_pkt_set_protocols() is contiguous for the entire L2/L3/L4 header. Allow it to pass scattered data to net_rx_pkt_set_protocols().
Fixes: 3a977dee ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 2f0fa232)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Akihiko Odaki authored
The datasheet makes contradicting statements regarding ICR accesses, so it is not a reliable way to determine the behavior of ICR accesses. However, e1000e does clear IMS bits when ICR is read, and Linux also expects ICR reads to clear IMS bits, according to:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/intel/igb/igb_main.c?h=v6.2#n8048
Fixes: 3a977dee ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit f0b1df5c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Akihiko Odaki authored
While the e1000e datasheet says it checks CTRL.VME for tx VLAN tagging, igb's datasheet has no such statement. It does say for "CTRL.VLE":

> This register only affects the VLAN Strip in Rx it does not have any
> influence in the Tx path in the 82576.

(Appendix A. Changes from the 82575)

There is no "CTRL.VLE", so it is more likely that this is a mistake for CTRL.VME.
Fixes: fba7c3b7 ("igb: respect VMVIR and VMOLR for VLAN")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit e2097167)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Akihiko Odaki authored
igb's advanced descriptor uses a packet type encoding different from the one used in e1000e's extended descriptor. Fix the logic to encode the Rx packet type accordingly.
Fixes: 3a977dee ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit ed447c60)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Akihiko Odaki authored
Before this change, e1000 and the common code updated BPRC and MPRC depending on the matched filter, but e1000e and igb decided to update those counters by deriving the packet type independently. This inconsistency caused a multicast packet to be counted twice. Updating BPRC and MPRC depending on the matched filter is fundamentally flawed anyway, as a filter can be used for different types of packets; for example, it is possible to filter broadcast packets with MTA. Always determine which counters to update by inspecting the packets.
Fixes: 3b274301 ("e1000: Implementing various counters")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit f3f9b726)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
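
Not from the patch: a self-contained C sketch of classifying a frame by its destination MAC rather than by which receive filter matched; the names and counters here are illustrative.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Illustrative only: decide broadcast/multicast/unicast by inspecting the
 * destination MAC itself, not the filter that happened to accept it. */
enum pkt_class { PKT_UNICAST, PKT_MULTICAST, PKT_BROADCAST };

static enum pkt_class classify(const uint8_t *dst_mac)
{
    static const uint8_t bcast[6] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

    if (memcmp(dst_mac, bcast, 6) == 0) {
        return PKT_BROADCAST;           /* counts toward BPRC */
    }
    if (dst_mac[0] & 0x01) {            /* I/G bit set: group address */
        return PKT_MULTICAST;           /* counts toward MPRC */
    }
    return PKT_UNICAST;
}

int main(void)
{
    uint64_t bprc = 0, mprc = 0;
    const uint8_t mcast[6] = { 0x01, 0x00, 0x5e, 0x00, 0x00, 0x01 };

    switch (classify(mcast)) {
    case PKT_BROADCAST: bprc++; break;
    case PKT_MULTICAST: mprc++; break;
    default: break;
    }
    printf("BPRC=%llu MPRC=%llu\n",
           (unsigned long long)bprc, (unsigned long long)mprc);
    return 0;
}
```
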
-
- 23 May, 2023 1 commit
timothee.cocault@gmail.com authored
The bytes and packets counter registers are cleared on read. Copying the "total counter" registers to the "good counter" registers has side effects. If the "total" register is never read by the OS, it only gets incremented. This leads to exponential growth of the "good" register. This commit increments the counters individually to avoid this.
Signed-off-by: Timothée Cocault <timothee.cocault@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 8d689f6a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
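
Not from the patch: a small self-contained C model of clear-on-read statistics registers showing why each counter is incremented individually instead of copying the "total" register into the "good" one; all names are illustrative.

```c
#include <stdio.h>
#include <stdint.h>

struct stats {
    uint64_t total_packets;   /* cleared when the guest reads it */
    uint64_t good_packets;    /* cleared when the guest reads it */
};

static void on_good_packet(struct stats *s)
{
    /* Increment both counters directly; never do
     * s->good_packets += s->total_packets, because total_packets keeps
     * growing if the guest never reads it. */
    s->total_packets++;
    s->good_packets++;
}

static uint64_t read_reg(uint64_t *reg)
{
    uint64_t val = *reg;
    *reg = 0;                 /* clear-on-read semantics */
    return val;
}

int main(void)
{
    struct stats s = { 0 };
    for (int i = 0; i < 5; i++) {
        on_good_packet(&s);
    }
    printf("GPRC=%llu TPR=%llu\n",
           (unsigned long long)read_reg(&s.good_packets),
           (unsigned long long)read_reg(&s.total_packets));
    return 0;
}
```
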
-
- 22 May, 2023 8 commits
Kevin Wolf authored
nbd_drained_poll() generally runs in the main thread, not in whatever iothread the NBD server coroutine is meant to run in, so it can't directly reenter the coroutines to wake them up. The code seems to have the right intention: it specifies the correct AioContext when it calls qemu_aio_coroutine_enter(). However, this function doesn't schedule the coroutine to run in that AioContext, but assumes it is already called in the home thread of the AioContext. To fix this, add a new thread-safe qio_channel_wake_read() that can be called in the main thread to wake up the coroutine in its AioContext, and use this in nbd_drained_poll().
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-3-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 7c1f51bf)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Kevin Wolf authored
In QEMU 8.0, we've been seeing deadlocks in bdrv_graph_wrlock(). They come from callers that hold an AioContext lock, which is not allowed during polling. In theory, we could temporarily release the lock, but callers are inconsistent about whether they hold a lock, and if they do, some are also confused about which one they hold. While all of this is fixable, it's not trivial, and the best course of action for 8.0.1 is probably just disabling the graph locking code temporarily.

We don't currently rely on graph locking yet. It is supposed to replace the AioContext lock eventually to enable multiqueue support, but as long as we still have the AioContext lock, it is sufficient without the graph lock. Once the AioContext lock goes away, the deadlock doesn't exist any more either and this commit can be reverted. (Of course, it can also be reverted while the AioContext lock still exists if the callers have been fixed.)
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-2-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 80fc5d26)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Stefan Hajnoczi authored
reader_count() is a performance bottleneck because the global aio_context_list_lock mutex causes thread contention. Put this debugging assertion behind a new ./configure --enable-debug-graph-lock option and disable it by default. The --enable-debug-graph-lock option is also enabled by the more general --enable-debug option.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230501173443.153062-1-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 58a2e3f5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: pick this one up so the next patch which disables this applies cleanly)
-
Stefan Hajnoczi authored
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230502184134.534703-3-stefanha@redhat.com>
[kwolf: Restrict to CONFIG_POSIX, Windows doesn't support polling]
Tested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 844a12a6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Stefan Hajnoczi authored
QEMU's event loop supports nesting, which means that event handler functions may themselves call aio_poll(). The condition that triggered a handler must be reset before the nested aio_poll() call, otherwise the same handler will be called and immediately re-enter aio_poll(). This leads to an infinite loop and stack exhaustion.

Poll handlers are especially prone to this issue, because they typically reset their condition by finishing the processing of pending work. Unfortunately it is during the processing of pending work that nested aio_poll() calls typically occur and the condition has not yet been reset.

Disable a poll handler during ->io_poll_ready() so that a nested aio_poll() call cannot invoke ->io_poll_ready() again. As a result, the disabled poll handler and its associated fd handler do not run during the nested aio_poll(). Calling aio_set_fd_handler() from inside nested aio_poll() could cause it to run again. If the fd handler is pending inside nested aio_poll(), then it will also run again.

In theory fd handlers can be affected by the same issue, but they are more likely to reset the condition before calling nested aio_poll(). This is a special case and it's somewhat complex, but I don't see a way around it as long as nested aio_poll() is supported.
Cc: qemu-stable@nongnu.org
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2186181
Fixes: c3827069 ("block: Mark bdrv_co_io_(un)plug() and callers GRAPH_RDLOCK")
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230502184134.534703-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6d740fb0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
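
Not from the patch: a toy, self-contained C dispatcher showing the idea of disabling a handler while it is serviced so a nested poll pass cannot invoke it again; the structure and names are illustrative, not QEMU's event loop.

```c
#include <stdio.h>
#include <stdbool.h>

struct handler {
    bool ready;        /* condition that triggers the handler */
    bool disabled;     /* set while the handler is being serviced */
    int depth;
};

static void poll_once(struct handler *h);

static void io_poll_ready(struct handler *h)
{
    h->depth++;
    printf("handling at depth %d\n", h->depth);
    if (h->depth < 3) {
        poll_once(h);  /* nested poll while the condition is still set */
    }
    h->ready = false;  /* condition finally reset here */
    h->depth--;
}

static void poll_once(struct handler *h)
{
    if (h->ready && !h->disabled) {
        h->disabled = true;      /* prevent re-entry from nested polls */
        io_poll_ready(h);
        h->disabled = false;
    }
}

int main(void)
{
    struct handler h = { .ready = true };
    poll_once(&h);    /* without 'disabled', this would recurse forever */
    return 0;
}
```
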
-
Mauro Matteo Cascella authored
Ensure op_info is not NULL in case of the QCRYPTODEV_BACKEND_ALG_SYM algtype.
Fixes: 0e660a6f ("crypto: Introduce RSA algorithm")
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reported-by: Yiming Tao <taoym@zju.edu.cn>
Message-Id: <20230509075317.1132301-1-mcascell@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 3e699089)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Eugenio Pérez authored
Commit 93a97dc5 ("virtio-net: enable vq reset feature") unconditionally enables the vq reset feature as long as the device is emulated. This makes it impossible to actually disable the feature, and it causes migration problems from QEMU versions earlier than 7.2. The commit is unneeded in the first place, as the device feature system already enables or disables the feature properly. This reverts commit 93a97dc5.
Fixes: 93a97dc5 ("virtio-net: enable vq reset feature")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230504101447.389398-1-eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 1fac00f7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Leonardo Bras authored
Since its implementation in v8.0.0-rc0, having PCI_ERR_UNCOR_MASK set for machine types < 8.0 will cause migration to fail if the target QEMU version is < 8.0.0:

  qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10a read: 40 device: 0 cmask: ff wmask: 0 w1cmask:0
  qemu-system-x86_64: Failed to load PCIDevice:config
  qemu-system-x86_64: Failed to load e1000e:parent_obj
  qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:02.0/e1000e'
  qemu-system-x86_64: load of migration failed: Invalid argument

The above test migrated a 7.2 machine type from QEMU master to QEMU 7.2.0, with this cmdline:

  ./qemu-system-x86_64 -M pc-q35-7.2 [-incoming XXX]

In order to fix this, the property x-pcie-err-unc-mask was introduced to control when PCI_ERR_UNCOR_MASK is enabled. This property is enabled by default, but is disabled if the machine type is <= 7.2.
Fixes: 010746ae ("hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register")
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Message-Id: <20230503002701.854329-1-leobras@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1576
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 5ed3dabe)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
- 19 May, 2023 1 commit
Hawkins Jiawei authored
QEMU invokes vhost_svq_add() when adding a guest's element into SVQ. In vhost_svq_add(), it uses vhost_svq_available_slots() to check whether QEMU can add the element into SVQ. If there is enough space, then QEMU combines some out descriptors and some in descriptors into one descriptor chain, and adds it into `svq->vring.desc` by vhost_svq_vring_write_descs().

Yet the problem is that `svq->shadow_avail_idx - svq->shadow_used_idx` in vhost_svq_available_slots() returns the number of occupied elements, or the number of descriptor chains, instead of the number of occupied descriptors, which may cause wrapping in the SVQ descriptor ring.

Here is an example. In vhost_handle_guest_kick(), QEMU forwards as many available buffers to the device by virtqueue_pop() and vhost_svq_add_element(). virtqueue_pop() returns a guest's element, and then this element is added into SVQ by vhost_svq_add_element(), a wrapper of vhost_svq_add(). If QEMU invokes virtqueue_pop() and vhost_svq_add_element() `svq->vring.num` times, vhost_svq_available_slots() thinks QEMU just ran out of slots and everything should work fine. But in fact, virtqueue_pop() returns `svq->vring.num` elements or descriptor chains, which take more than `svq->vring.num` descriptors due to guest memory fragmentation, and this causes wrapping in the SVQ descriptor ring.

This bug is valid even before marking the descriptors used. If the guest memory is fragmented, SVQ must add chains, so it can try to add more descriptors than possible.

This patch solves it by adding a `num_free` field to the VhostShadowVirtqueue structure and updating this field in vhost_svq_add() and vhost_svq_get_buf(), to record the number of free descriptors.
Fixes: 100890f7 ("vhost: Shadow virtqueue buffers forwarding")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230509084817.3973-1-yin31149@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
(cherry picked from commit 5d410557)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
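
Not from the patch: a self-contained C sketch of the accounting idea — track free descriptors explicitly rather than counting chains, because one chain may consume several descriptors; the ring model and names are illustrative.

```c
#include <stdio.h>
#include <stdbool.h>

#define RING_SIZE 8

struct ring {
    int num_free;   /* free descriptors, not free chains */
};

static bool add_chain(struct ring *r, int out_descs, int in_descs)
{
    int needed = out_descs + in_descs;

    if (needed > r->num_free) {
        printf("chain of %d descriptors rejected (only %d free)\n",
               needed, r->num_free);
        return false;
    }
    r->num_free -= needed;
    printf("chain of %d descriptors added, %d free left\n",
           needed, r->num_free);
    return true;
}

static void complete_chain(struct ring *r, int descs)
{
    r->num_free += descs;   /* descriptors are reclaimed on completion */
}

int main(void)
{
    struct ring r = { .num_free = RING_SIZE };

    add_chain(&r, 2, 1);   /* fragmented buffer: 3 descriptors, 1 chain */
    add_chain(&r, 3, 2);   /* 5 descriptors */
    add_chain(&r, 1, 1);   /* would wrap the ring if we only counted chains */
    complete_chain(&r, 3);
    add_chain(&r, 1, 1);   /* fits after completion */
    return 0;
}
```
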
-
- 18 May, 2023 10 commits
Xinyu Li authored
vzeroall: xmm_regs should be used instead of xmm_t0.
vpermdq: bits 3 and 7 of imm should be considered.
Signed-off-by: Xinyu Li <lixinyu20s@ict.ac.cn>
Message-Id: <20230510145222.586487-1-lixinyu20s@ict.ac.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 056d6490)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Paolo Bonzini authored
Compared to other SSE instructions, VUCOMISx and VCOMISx are different: the single and double precision versions are distinguished through a prefix; however, they use no prefix and 0x66 for SS and SD respectively, whereas scalar values are usually associated with 0xF2 and 0xF3. Because of this, they incorrectly perform a 128-bit memory load instead of a 32- or 64-bit load. Fix this by writing a custom decoding function. I tested that the reproducer is fixed and the test-avx output does not change.
Reported-by: Gabriele Svelto <gsvelto@mozilla.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1637
Fixes: f8d19eec ("target/i386: reimplement 0x0f 0x28-0x2f, add AVX", 2022-10-18)
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2b55e479)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Paolo Bonzini authored
With a Linux 6.x guest, at boot time an inquiry on a scsi-generic device makes QEMU crash. This is caused by a buffer overflow when scsi-generic patches the block limits VPD page. Do the operations on a temporary on-stack buffer that is guaranteed to be large enough.
Reported-by: Théo Maillart <tmaillart@freebox.fr>
Analyzed-by: Théo Maillart <tmaillart@freebox.fr>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9bd634b2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Richard Henderson authored
If vd == vm, copy vm to scratch, so that we can pre-zero the output and still access the gather indices.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1612
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230504104232.1877774-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit a6771f2f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Eric Blake authored
Commit fe904ea8 added a fail_inactivate label, which tries to reactivate disks on the source after a failure while s->state == MIGRATION_STATUS_ACTIVE, but didn't actually use the label if qemu_savevm_state_complete_precopy() failed. This failure to reactivate is also present in commit 6039dd5b (also covering the new s->state == MIGRATION_STATUS_DEVICE state) and 403d18ae (ensuring s->block_inactive is set more reliably). Consolidate the two labels back into one - no matter how migration fails, if there is any chance we can reach vm_start() after having attempted inactivation, it is essential that we have tried to restart disks before then. This also makes the cleanup more like migrate_fd_cancel().
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20230502205212.134680-1-eblake@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6dab4c93)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: minor context tweak near added comment in migration/migration.c)
-
Eric Blake authored
No need to declare a temporary variable.
Suggested-by: Juan Quintela <quintela@redhat.com>
Fixes: 1df36e8c6289 ("migration: Handle block device inactivation failures better")
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5d39f44d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Eric Blake authored
Consider what happens when performing a migration between two host machines connected to an NFS server serving multiple block devices to the guest, when the NFS server becomes unavailable. The migration attempts to inactivate all block devices on the source (a necessary step before the destination can take over); but if the NFS server is non-responsive, the attempt to inactivate can itself fail. When that happens, the destination fails to get the migrated guest (good, because the source wasn't able to flush everything properly):

  (qemu) qemu-kvm: load of migration failed: Input/output error

at which point our only hope for the guest is for the source to take back control. With the current code base, the host outputs a message, but then appears to resume:

  (qemu) qemu-kvm: qemu_savevm_state_complete_precopy_non_iterable: bdrv_inactivate_all() failed (-1)

  (src qemu) info status
  VM status: running

but a second migration attempt now asserts:

  (src qemu) qemu-kvm: ../block.c:6738: int bdrv_inactivate_recurse(BlockDriverState *): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.

Whether the guest is recoverable on the source after the first failure is debatable, but what we do not want is to have qemu itself fail due to an assertion. It looks like the problem is as follows: in migration.c:migration_completion(), the source sets 'inactivate' to true (since COLO is not enabled), then tries savevm.c:qemu_savevm_state_complete_precopy() with a request to inactivate block devices. In turn, this calls block.c:bdrv_inactivate_all(), which fails when flushing runs up against the non-responsive NFS server. With savevm failing, we are now left in a state where some, but not all, of the block devices have been inactivated; but migration_completion() then jumps to 'fail' rather than 'fail_invalidate' and skips an attempt to reclaim those disks by calling bdrv_activate_all(). Even if we do attempt to reclaim disks, we aren't taking note of failure there, either.

Thus, we have reached a state where the migration engine has forgotten all state about whether a block device is inactive, because we did not set s->block_inactive in enough places; so migration allows the source to reach vm_start() and resume execution, violating the block layer invariant that the guest CPUs should not be restarted while a device is inactive. Note that the code in migration.c:migrate_fd_cancel() will also try to reactivate all block devices if s->block_inactive was set, but because we failed to set that flag after the first failure, the source assumes it has reclaimed all devices, even though it still has remaining inactivated devices, and does not try again. Normally, qmp_cont() will also try to reactivate all disks (or correctly fail if the disks are not reclaimable because NFS is not yet back up), but the auto-resumption of the source after a migration failure does not go through qmp_cont(). And because we have left the block layer in an inconsistent state with devices still inactivated, the later migration attempt hits the assertion failure.

Since it is important not to resume the source with inactive disks, this patch marks s->block_inactive before attempting inactivation, rather than after succeeding, in order to prevent any vm_start() until it has successfully reactivated all devices.

See also https://bugzilla.redhat.com/show_bug.cgi?id=2058982

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 403d18ae)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Michael Tokarev authored
The linux-user getgroups(), setgroups(), getgroups32() and setgroups32() implementations used alloca() to allocate grouplist arrays, with an unchecked gidsetsize coming from the "guest". With NGROUPS_MAX being 65536 (on linux; and it is common for an application to allocate NGROUPS_MAX for getgroups()), this means a typical allocation is half a megabyte on the stack. This simply overflows the stack, which leads to an immediate SIGSEGV in the actual system getgroups() implementation. An example of such an issue is aptitude, e.g. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=811087#72

Cap gidsetsize to NGROUPS_MAX (return EINVAL if it is larger than that), and use heap allocation for grouplist instead of alloca(). While at it, fix coding style and make all 4 implementations identical. Try not to impose random limits - for example, allow gidsetsize to be negative for getgroups() - just do not allocate a negative-sized grouplist in this case, but still do the actual getgroups() call. But do not allow negative gidsetsize for setgroups() since its argument is unsigned.

Capping by NGROUPS_MAX seems a bit arbitrary - we could do more; it is not an error if the set size is NGROUPS_MAX+1. But we should not allow integer overflow for the array being allocated. Maybe it is enough to just call g_try_new() and return ENOMEM if it fails. Maybe there's also no need to convert setgroups() since this one is usually smaller and known beforehand (KERN_NGROUPS_MAX is actually 63 - this is apparently a kernel-imposed limit for the runtime group set).

The patch fixes the aptitude segfault mentioned above.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20230409105327.1273372-1-mjt@msgid.tls.msk.ru>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 1e35d327)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
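
Not from the patch: a self-contained C sketch of the pattern described above — cap the caller-supplied set size and allocate the group list on the heap instead of the stack; the wrapper name is illustrative.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef NGROUPS_MAX
#define NGROUPS_MAX 65536
#endif

/* Illustrative only: reject oversized requests and use the heap instead of
 * alloca(), so a huge gidsetsize can't blow the stack. */
static int do_getgroups(int gidsetsize)
{
    gid_t *grouplist = NULL;
    int ret;

    if (gidsetsize > NGROUPS_MAX) {
        errno = EINVAL;
        return -1;
    }
    if (gidsetsize > 0) {
        grouplist = calloc(gidsetsize, sizeof(*grouplist));
        if (!grouplist) {
            errno = ENOMEM;
            return -1;
        }
    }

    ret = getgroups(gidsetsize, grouplist);   /* size 0 queries the count */
    free(grouplist);
    return ret;
}

int main(void)
{
    printf("process is in %d groups\n", do_getgroups(0));
    printf("oversized request: %d (errno=%d)\n",
           do_getgroups(NGROUPS_MAX + 1), errno);
    return 0;
}
```
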
-
Daniil Kovalev authored
If a program requires fr1, we should set the FR bit of the CP0 control/status register and add the F64 hardware flag. The corresponding `else if` branch statement is copied from the linux kernel sources (see the `arch_check_elf` function in linux/arch/mips/kernel/elf.c).
Signed-off-by: Daniil Kovalev <dkovalev@compiler-toolchain-for.me>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <20230404052153.16617-1-dkovalev@compiler-toolchain-for.me>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit a0f8d270)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Alex Bennée authored
Stretch is going out of support so things like security updates will fail. As the toolchain itself is binary it hopefully won't mind the underlying OS being updated.
Message-Id: <20230503091244.1450613-3-alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reported-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 3217b84f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
- 17 May, 2023 4 commits
Lizhi Yang authored
Duplicated word "are".
Signed-off-by: Lizhi Yang <sledgeh4w@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230511080119.99018-1-sledgeh4w@gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit c70bb9a7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Claudio Imbrenda authored
Add a new -run-with option with an async-teardown=on|off parameter. It is visible in the output of the query-command-line-options QMP command, so it can be discovered and used by libvirt. The option -async-teardown is now redundant; deprecate it.
Reported-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Fixes: c891c24b ("os-posix: asynchronous teardown for shutdown on Linux")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20230505120051.36605-2-imbrenda@linux.ibm.com>
[thuth: Add curly braces to fix error with GCC 8.5, fix bug in deprecated.rst]
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 80bd81ca)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context tweak in docs/about/deprecated.rst)
-
Claudio Imbrenda authored
Kernel commit 292a7d6fca33 ("KVM: s390: pv: fix asynchronous teardown for small VMs") causes the KVM_PV_ASYNC_CLEANUP_PREPARE ioctl to fail if the VM is not larger than 2GiB. QEMU would attempt it and fail, print an error message, and then proceed with a normal teardown. Avoid attempting to use asynchronous teardown altogether when the VM is not larger than 2GiB. This will avoid triggering the error message and also avoid pointless overhead; normal teardown is fast enough for small VMs.
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Fixes: c3a073c6 ("s390x/pv: Add support for asynchronous teardown for reboot")
Link: https://lore.kernel.org/all/20230421085036.52511-2-imbrenda@linux.ibm.com/
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20230510105531.30623-2-imbrenda@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[thuth: Fix inline function parameter in pv.h]
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 88693ab2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
Richard Henderson authored
The REXW bit must be set to produce a 64-bit pointer result; the bit is disabled in 32-bit mode, so we can do this unconditionally.
Fixes: 7d9e1ee4 ("tcg/i386: Adjust assert in tcg_out_addi_ptr")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1592
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1642
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 98899850)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-