Re: kernel/1671: Not enough random bytes available (GnuPG & OBSD2.8)
- To: bugs_(_at_)_cvs_(_dot_)_openbsd_(_dot_)_org
- Subject: Re: kernel/1671: Not enough random bytes available (GnuPG & OBSD2.8)
- From: Brad Allen <Ulmo_(_at_)_Q_(_dot_)_Net>
- Date: Tue, 13 Feb 2001 05:20:01 -0700 (MST)
- Cc:
- Reply-to: Brad Allen <Ulmo_(_at_)_Q_(_dot_)_Net>
The following reply was made to PR kernel/1671; it has been noted by GNATS.
From: Brad Allen <Ulmo_(_at_)_Q_(_dot_)_Net>
To: hugh_(_at_)_openbsd_(_dot_)_org
Cc: Ulmo_(_at_)_Q_(_dot_)_Net
Subject: Re: kernel/1671: Not enough random bytes available (GnuPG &
OBSD2.8)
Date: Tue, 13 Feb 2001 08:14:04 -0400 (AST)
From: Hugh Graham <hugh_(_at_)_openbsd_(_dot_)_org>
Subject: Re: kernel/1671: Not enough random bytes available (GnuPG & OBSD2.8)
Date: Sat, 10 Feb 2001 17:47:32 -0800
Message-ID: <20010210174732_(_dot_)_A15287_(_at_)_argus_(_dot_)_oxide_(_dot_)_org>
hugh> On Sat, Feb 10, 2001 at 05:50:02PM -0700, Brad Allen wrote:
hugh> >
hugh> > BTW, the problem and fix *might* be related to the RTC (Real Time
hugh> > Clock). I am preparing a large bug report, and want more time to
hugh> > polish it off. Just in case it gets lost in time, I will give you
hugh> > this patch as a clue as to what I think is a clue, and tell you I
hugh> > haven't had this problem *yet*, but the last time I had it it was
hugh> > after a few weeks of uptime and I still don't know what the trigger
hugh> > is:
hugh> >
hugh> I suspected settimeofday() when I noticed a machine running ntpd
hugh> lost statclock every six months or so, but unfortunately it's
hugh> mission critical and I wasn't able to dig further than writing some
hugh> code to reliably reproduce the bug.
hugh>
hugh> This program does reliably recreate the problem on some PCs, while
hugh> others escape completely unaffected. Also, 65535 iterations is very
hugh> enthusiastic; usually it takes a lot less. It was passed around
hugh> OpenBSD a few months ago, but either no one had a machine with the
hugh> problem, or no one had enough time to look further.
hugh>
hugh> /Hugh
hugh>
hugh> ===================================================================
hugh>
hugh> #include <sys/types.h>
hugh> #include <sys/time.h>
hugh> #include <unistd.h>
hugh> #include <stdlib.h>
hugh> #include <stdio.h>
hugh>
hugh> /* Must run as root: settimeofday() steps the system clock. */
hugh> int main(void) {
hugh>         int i;
hugh>         signed short wiggler;   /* random offset in usec, -32768..32767 */
hugh>         struct timeval tvn;
hugh>
hugh>         if (gettimeofday(&tvn, 0) != 0)
hugh>                 exit(1);
hugh>
hugh>         srandom((tvn.tv_sec * getpid()) ^ tvn.tv_usec);
hugh>
hugh>         for (i = 0; i < 65535; ++i) {
hugh>                 wiggler = random();
hugh>
hugh>                 if (gettimeofday(&tvn, 0) != 0)
hugh>                         exit(2);
hugh>
hugh>                 printf("current time: %ld %9ld wiggle: %d\n",
hugh>                     (long)tvn.tv_sec, (long)tvn.tv_usec, wiggler);
hugh>
hugh>                 /* Apply the offset, keeping tv_usec in 0..999999. */
hugh>                 tvn.tv_usec += wiggler;
hugh>                 if (tvn.tv_usec < 0) {
hugh>                         --tvn.tv_sec;
hugh>                         tvn.tv_usec = 1000000 + tvn.tv_usec;
hugh>                 } else if (tvn.tv_usec > 999999) {
hugh>                         ++tvn.tv_sec;
hugh>                         tvn.tv_usec -= 1000000;
hugh>                 }
hugh>
hugh>                 if (settimeofday(&tvn, 0) != 0)
hugh>                         exit(3);
hugh>                 if (gettimeofday(&tvn, 0) != 0)
hugh>                         exit(4);
hugh>
hugh>                 printf("final time: %ld %9ld\n\n",
hugh>                     (long)tvn.tv_sec, (long)tvn.tv_usec);
hugh>         }
hugh>         exit(0);
hugh> }
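A side note on the arithmetic in the loop above: the wiggle is a signed
16-bit value, so a single borrow or carry is always enough to bring
tv_usec back into range. The sketch below pulls just that step into a
helper so it can be checked without root and without touching the clock.
The name tv_wiggle is made up for illustration and is not part of Hugh's
program; it assumes the input timeval is already normalized and that the
offset is smaller than one second in magnitude.

```c
#include <sys/time.h>

/*
 * Hypothetical helper (not in the original program): add a signed
 * microsecond offset to a normalized timeval and renormalize it.
 * Assumes 0 <= tv.tv_usec <= 999999 and |wiggle| < 1000000, so at
 * most one borrow or carry is needed.
 */
static struct timeval
tv_wiggle(struct timeval tv, long wiggle)
{
	tv.tv_usec += wiggle;
	if (tv.tv_usec < 0) {
		/* Borrow one second. */
		--tv.tv_sec;
		tv.tv_usec += 1000000;
	} else if (tv.tv_usec > 999999) {
		/* Carry one second. */
		++tv.tv_sec;
		tv.tv_usec -= 1000000;
	}
	return tv;
}
```

This only exercises the bookkeeping. Whatever kills statclock presumably
sits behind settimeofday() itself, which this helper never calls.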
Uh-oh. That reliably reproduces the bug here, even with the kernel
that I thought might fix it.
I still DEFINITELY have the bug:
The alternate system clock has died!
Reverting to ``pigs'' display.
(which also means ...
load averages: 1.08, 1.02, 0.78 07:55:42
34 processes: 1 running, 33 idle
CPU states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 0.0% idle
Memory: Real: 29M/62M act/tot Free: 61M Swap: 4K/33M used/tot
)
... and ...
well, uhm, for now, gpg has enough random bytes ... but I bet it is
just buffered up in the kernel, but it is not producing any more. So
I'll loop it.
no problems so far ... dd if=/dev/srandom ... no, still can use gpg!
but third symptom is definitely here:
* total system molasses.
Let's look at ntpq -pn:
not bad.
Well, so, let's see if the system really does stop collecting random
data. PERHAPS my patch to the kernel will continue collecting random
data, but not fix the actual bug. That would be at least nice for me
... able to sign this message, for instance.
Sorry, it's late -- I've been up a long time. I went to sleep early
yesterday.
ntpq -pn still not bad ...
-u
OK! Now, I had trouble using my mailer, and still don't know the error,
but I was in the process of rebooting (kill -USR1 1 while in X) when the
system froze on me and would not respond even to that little kernel
fault debugger thing ("boot sync" didn't seem to do anything, nor did
other variations).
Then, while coming up, my good old Dell system told me:
CMOS Time & Date Not Set
I ignored this, and got an interesting quote from "ntpdate" while the
bootup scripts were setting up for the NTP dragon to start breathing:
offset -287998.536487s
What that means to me is that about 3.3333 days were gained (or lost?)
during this process. After running your program for 65535 iterations,
though, the clock was only off by about +5m, and only about 3-4m of
that could be attributed to the program itself (it ran for less than 5m).
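For anyone checking the 3.3333 figure: it is just the magnitude of the
ntpdate offset divided by 86400 seconds per day. A minimal sketch of that
conversion follows; offset_days is a name invented here for illustration.

```c
#define SECS_PER_DAY 86400.0

/*
 * Hypothetical helper: magnitude of a clock offset (in seconds, as
 * printed by ntpdate) expressed in days. The sign is dropped because
 * only the size of the jump is of interest here.
 */
static double
offset_days(double offset_sec)
{
	double magnitude = offset_sec < 0 ? -offset_sec : offset_sec;

	return magnitude / SECS_PER_DAY;
}
```

offset_days(-287998.536487) comes out at roughly 3.3333. For scale, 80
hours is 288000 seconds, so the offset is about 1.5 seconds short of that.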
Right NOW, after this reboot, everything is working fine (top, systat
vmstat, and ntp). Let's see if this gets signed & sent now.
----Security_Multipart(Tue_Feb_13_08:14:04_2001_891)--
Content-Type: application/pgp-signature
Content-Transfer-Encoding: 7bit
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (OpenBSD)
Comment: For info see http://www.gnupg.org
iD8DBQA6iSUSmqBNOgDzEsoRAoQWAJ0X9h/HkVDslsTdeWrEeWqQPScIdACgvZE1
uYU28oCxfl0KDQKMu9O52lU=
=ox66
-----END PGP SIGNATURE-----
----Security_Multipart(Tue_Feb_13_08:14:04_2001_891)----