Symptoms
- ESXi may encounter the following system alert:
2019-05-02T19:44:08.966Z cpu55:66539)CpuSched: 4838: Unexpected runqueue state encountered!
- ESXi may encounter a PSOD when operating on a linked-list data structure inside the vmkernel. The PSOD backtrace will be similar to the following:
[0x418007b0688f]CpuSchedQueueAdd@vmkernel#nover+0xcf stack: 0x7388e490eaa7f
[0x418007b06ad4]CpuSchedVcpuMakeReady@vmkernel#nover+0xad stack: 0x4529eac23780
[0x418007b06cb2]CpuSchedWorldWakeup@vmkernel#nover+0x8b stack: 0x4529ead23100
[0x418007b07150]CpuSchedForceWakeupInt@vmkernel#nover+0x10d stack: 0x4529f20a3000
[0x418007b12d97]CpuSchedActionNotifyTraditional@vmkernel#nover+0x9c stack: 0x0
[0x418007b12e25]CpuSched_ActionNotifyHierarchical@vmkernel#nover+0x7e stack: 0x0
[0x418007b12f89]CpuSched_ActionNotifyVCPUs@vmkernel#nover+0x22 stack: 0x1c
[0x41800793659e]VMMVMKCall_Call@vmkernel#nover+0xf7 stack: 0x0
Or
[0x41803b2c215d]CpuSchedPcpuVcpuChooseInt@vmkernel#nover+0x19 stack: 0x418040000080, 0x10b, 0x0, 0x1, 0x418040000080
[0x41803b2c3f46]CpuSched_PcpuChoose@vmkernel#nover+0xfe stack: 0x3ff, 0xfffffffffffe, 0x0, 0x418040000000, 0x0
[0x41803b2d8359]CpuSchedRebalance_PcpuMigrateIdle@vmkernel#nover+0x17d stack: 0xc64ff6b3c8815, 0x4310080086e8, 0x4310080086d8, 0x43a3e6b9bc90, 0x400000000
[0x41803b2c9931]CpuSchedDispatch@vmkernel#nover+0x1331 stack: 0x410000000001, 0x418045400000,
Or
[0x41800fb1299a]CpuSched_Charge@vmkernel#nover+0x1ea stack: 0x418040400080
[0x41800fb0c483]CpuSchedDispatch@vmkernel#nover+0xac stack: 0x418040000108
[0x41800f9365e3]VMMVMKCall_Call@vmkernel#nover+0x13c stack: 0x0
[0x41800f95c7ed]VMKVMM_ArchEnterVMKernel@vmkernel#nover+0xe stack: 0x41800f95c7e0
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
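To check whether a host's logs show these symptoms, the excerpts above can be matched with a couple of regular expressions. The following is a minimal sketch, not a VMware tool: the function names `find_alerts` and `cpusched_frames` are hypothetical, and the patterns are inferred only from the example lines in this article, so they may need adjusting for other builds.

```python
import re

# Alert line as shown in the Symptoms section. The number after "CpuSched:"
# (4838 in the example) is assumed to vary, so any digits are accepted.
ALERT_RE = re.compile(r"CpuSched: \d+: Unexpected runqueue state encountered!")

# Backtrace frame such as:
#   [0x418007b0688f]CpuSchedQueueAdd@vmkernel#nover+0xcf stack: ...
# Only frames whose symbol starts with "CpuSched" are captured.
FRAME_RE = re.compile(r"\[?0x[0-9a-f]+\](CpuSched\w*)@vmkernel#nover\+0x[0-9a-f]+")


def find_alerts(lines):
    """Return the lines that contain the runqueue-state alert."""
    return [line for line in lines if ALERT_RE.search(line)]


def cpusched_frames(lines):
    """Return the CpuSched* symbol names found in backtrace lines, in order."""
    names = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            names.append(match.group(1))
    return names
```

Both functions take any iterable of text lines, for example the lines of a saved log file or of a transcribed PSOD screen. A match only indicates that the text resembles the examples in this article; it does not confirm the underlying issue.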
Cause
ESXi panics when it detects an invalid pointer value while operating on a linked list protected by a spinlock. While the spinlock is held, the hardware guarantees that the values in the linked list do not change during these operations; nevertheless, a transient invalid value is unexpectedly observed. The invalid value is frequently null (0), but is sometimes another incorrect value, such as a non-canonical address.
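As an illustration of the two kinds of invalid value mentioned above, the following sketch classifies a 64-bit pointer value as null, non-canonical, or canonical. It assumes x86-64 with 48-bit virtual addresses, where bits 63 through 47 must all be equal. The function names and the `va_bits` parameter are hypothetical and are not taken from the vmkernel.

```python
def is_canonical(addr, va_bits=48):
    """Return True if addr is a canonical x86-64 virtual address.

    Assumes va_bits implemented virtual-address bits (48 by default):
    bits 63 down to va_bits-1 must be all zeros or all ones.
    """
    if not 0 <= addr < (1 << 64):
        raise ValueError("addr must fit in 64 bits")
    upper = addr >> (va_bits - 1)               # bits 63..(va_bits-1)
    all_ones = (1 << (64 - va_bits + 1)) - 1    # same field with every bit set
    return upper == 0 or upper == all_ones


def classify_pointer(value):
    """Label a pointer value the way the description above distinguishes them."""
    if value == 0:
        return "null"
    if not is_canonical(value):
        return "non-canonical"
    return "canonical"
```

A canonical value is not necessarily a valid list pointer; this check only separates the two obviously invalid cases named above from everything else.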
The root cause of this issue is not known at this time and is being investigated by our hardware partners.
Resolution
There is no known resolution at this time.