Tuesday, March 6, 2012

So I found a bug with SMP, NMIs and KDB...

If two (or more) unknown NMIs arrive on different CPUs, there is a large chance both CPUs will wind up inside panic(). This is fine, unless you want to enter KDB -since now KDB cannot round up all CPUs, because some of them are stuck inside panic_smp_self_stop with NMI latched. This is easy to replicate with QEMU. Boot with -smp 4 and send an untargetted, broadcast NMI using the monitor.

Solution for this is simple - add a new call, try_panic, which will be invoked in cases where some special behavior is desired if someone else is already panicking. For handling unknown NMIs, we now call try_panic instead. If panic() is already active in the system, just exit out of the NMI handler. This lets KDB roundup CPUs.

This affects linux-next.


No comments:

Post a Comment