Re: SMP system shutdown hang (acpi_cpu_shutdown - smp_rendezvous)
- To: Glen <glen_(_dot_)_leeder_(_at_)_nokia_(_dot_)_com>
- Subject: Re: SMP system shutdown hang (acpi_cpu_shutdown - smp_rendezvous)
- From: Nate Lawson <nate_(_at_)_root_(_dot_)_org>
- Date: Thu, 01 Nov 2007 21:01:35 -0700
- Cc: ACPI mailing list <freebsd-acpi_(_at_)_freebsd_(_dot_)_org>
> I have been seeing intermittent hangs in the ACPI shutdown code on an
> Intel 2.4GHz 8 CPU system. I am running with a FreeBSD 6.1 code base
> but cannot see a reason why this can't happen in other FreeBSD versions.
> The hang is very irregular; I am recreating it using an expect script
> that repeatedly reboots the system. Sometimes I can do up to 200
> reboots before observing the hang; sometimes it happens after 5-20
> reboots.
>
> It has been difficult to pin down the hang as the system is not
> responding to NMI events but using breakpoints I believe the hang is in
> acpi_cpu.c:acpi_cpu_shutdown with the call to smp_rendezvous.
> My theory is that one of the CPUs does not respond to ipi_all_but_self
> and that all the other CPUs are waiting for it in smp_rendezvous_action.
> The smp_rv_waiters < mp_ncpus condition never gets met and the system
> hangs. This may happen due to other activity (or a deadlock?) on that
> CPU.
>
> I noticed a few threads relating to this and have already tried stuff
> like changing kern.sched.ipiwakeup.enabled & machdep.cpu_idle_hlt.
> Neither had any effect.
> 1) I tried removing the call to smp_rendezvous in acpi_cpu_shutdown and
> this stops the hang from happening. Does anyone know the purpose of this
> call in the shutdown code or if I might suffer some consequence by
> removing it?
I have one more thing I needed to consider. There's a race where a
thread could be entering acpi_cpu_idle() to read a C2-3 register but
that register state gets destroyed with the softc before the read. In
that case, I thought there could be a panic, which is why I originally put
in the smp_rendezvous(). However, I don't think device_shutdown() frees
softcs (need to look in the newbus code to be sure). So I still should
be able to remove this code after checking more closely.
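
For anyone following the thread, the wait condition Glen describes
(smp_rv_waiters < mp_ncpus never becoming false) can be illustrated with
a toy, single-threaded userland model. This is only a sketch and not the
kernel's smp_rendezvous_action(): model_rendezvous(), the responds[]
array, and MAX_SPINS are hypothetical names made up for the illustration,
and the spin bound exists only so the model can report a hang instead of
actually hanging.

```c
#include <stdbool.h>

#define MAX_SPINS 1000	/* model-only bound; assumed, the real loop has no timeout */

/*
 * Toy model of the rendezvous entry barrier. responds[i] says whether
 * CPU i ever handles the IPI and checks in. Returns true if the barrier
 * is passed, false if the spin limit is reached (standing in for a hang).
 */
static bool
model_rendezvous(const bool *responds, int ncpus)
{
	int waiters = 0;	/* stands in for smp_rv_waiters */
	int spins;

	/* Each CPU that takes the IPI increments the counter once. */
	for (int cpu = 0; cpu < ncpus; cpu++)
		if (responds[cpu])
			waiters++;

	/* Checked-in CPUs spin until every CPU has arrived. */
	for (spins = 0; waiters < ncpus; spins++)
		if (spins >= MAX_SPINS)
			return (false);
	return (true);
}
```

With all eight entries of responds[] set to true the model returns true
immediately. Clearing any single entry makes it return false, which
corresponds to the remaining CPUs spinning forever on a counter that can
never reach mp_ncpus.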