Sunday, February 26, 2006

Flaky Power Supply, or external factors? (Round III)

In a previous post, after the crashes returned despite swapping in a new power supply, I disabled the onboard sound, and that seemed to fix it for a while. But then the crashes returned again. By this point I'm thinking this is ridiculous.

Then I noticed my desk lamp seemed to be flickering a bit. It occurred to me that maybe I was getting dirty power, especially since this room has had power problems in the past (I run all my monitors on an extension cord from another room, because I sometimes get ripple in the picture when plugged into the outlets in this room).

I ran an additional extension cord from another room and hooked up the PC that has been crashing. Then I re-enabled everything I'd disabled - primarily onboard sound and AGP fast_writes. After firing up the PC, I started Prime95, launched the Mozilla web browser, and got Winamp playing some music while I browsed. That usually suffices to crash it quickly. There have been no crashes in the past several days.

Darn dirty power. I think that might be what it is. I've ruled out just about every other piece of hardware except the motherboard, and given how variable the problem is and how adjusting motherboard settings has had no effect at all, I suspect the cause may be an outside factor.

Friday, February 17, 2006

Update to Flaky Power Supply (Round II)

In my previous post, I explained that my solution to my system crashing (blue screening) all the time was to swap out power supplies (it was a somewhat odd case, in that the system ran fine with a different CPU, but it wasn't the CPU).

Well, I started experiencing BSODs again. Now bear in mind the system is entirely Prime95 and MemTest86 error-free, at least until it blue screens. One of the crashes occurred when I loaded up and tried Rollercoaster Tycoon* on the system, and the blue screen referenced the sound driver. So I tried disabling the onboard sound. It has now been running for 48 hours continuously with no BSODs.

I guess maybe it was the onboard sound driver or hardware. I'm going to see if Realtek has an updated driver in a few days (most likely the current one just got corrupted in a power failure-induced crash, but I like new drivers - they usually work better). If the crashes return with the onboard sound enabled, I'll try a discrete sound card and see if that works any better. I have rebooted once or twice just to make sure it wouldn't start doing it again after a reboot.

I'm still pretty suspicious that the root cause might be something I haven't identified yet. The crashes seem to come and go, and are affected by the strangest things (swapping CPUs made them go away, but the CPU that I was crashing with worked fine in another computer). So we shall see what happens.

* - (Rationale: "Since I can't do any real work...")

Monday, February 06, 2006

Flaky power supply causes problem with one CPU, but not other

The past few weeks, I've been having a problem with my main computer, in that it would crash, no matter whether I had the AMD Athlon XP 1800+ (1.53GHz) Thoroughbred B CPU overclocked and overvolted or not. I tried everything. I even tried swapping CPUs between my main computer and my secondary computer. The Athlon XP Thoroughbred ran fine in my secondary computer; no crashes for over a week, and my secondary computer's CPU (a Thunderbird 1GHz, overclocked to 1.45GHz) ran fine in my primary computer.

So I tried moving the CPUs back (Athlon XP in my primary, Thunderbird in my secondary). Instantly, my primary computer started crashing again. By this point, I've verified that the CPU works fine on another motherboard, and that this motherboard runs another CPU fine, so it must be something else. I've ruled out the video card (I tried exchanging my AGP Radeon 9800Pro for a PCI Matrox Millennium; same problem). I've tested the memory (no errors in Memtest86), and I've even tried swapping out DIMMs with no change - still crashes. Prime95 has no errors right up to the point where the computer blue screens.
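The swap tests above follow a simple elimination logic, which can be sketched roughly like this. It is only an illustration of the reasoning, not a diagnostic tool; the function names `cleared_parts` and `failing_configs` are made up for this sketch, and only the three results actually described in the post are recorded.

```python
# Each observation is (configuration, crashed?), taken from the swap tests
# described in the post. The helper names below are hypothetical.
observations = [
    ({"cpu": "Athlon XP", "system": "primary"}, True),
    ({"cpu": "Athlon XP", "system": "secondary"}, False),
    ({"cpu": "Thunderbird", "system": "primary"}, False),
]

def cleared_parts(observations):
    """Return every (role, part) pair seen in at least one stable configuration."""
    cleared = set()
    for config, crashed in observations:
        if not crashed:
            cleared.update(config.items())
    return cleared

def failing_configs(observations):
    """Return the configurations that crashed."""
    return [config for config, crashed in observations if crashed]

cleared = cleared_parts(observations)
print(sorted(cleared))
print(failing_configs(observations))
```

Both the Athlon XP and the primary system show up as cleared on their own, yet the two together still crash. That points at something marginal that only this particular combination exposes, rather than at a single outright dead part.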

By this point, I'm pretty sure that it's the power supply. It's even been making some funny noises, despite running fine with the secondary computer's overclocked Thunderbird CPU. Problem is, it's a PC Power & Cooling Turbo-Cool 300W, and none of my other power supplies match its amperage ratings, not even my Antec 300W power supplies. However, it is pretty old (>3 years), so yesterday I decided to bite the bullet and try a lower-rated Antec 300W in the system. Lo and behold: no crashes, even overclocked. I guess the Antec meets my amperage requirements. The PC Power & Cooling has now been relegated to the system I took the Antec out of, and it's running that fine (with another AMD Thunderbird, as it so happens).

I guess that the PC Power & Cooling, after several 100-degree days, and running an overclocked system, had finally become somewhat marginal; enough that a 0.13-micron Thoroughbred B CPU that might be somewhat more sensitive to voltage fluctuations would crash, where a 0.18-micron Thunderbird would continue running fine, even overclocked.

I also found that most of the system's heat was dumping through the power supply, despite the dual 80mm exhaust fans behind the CPU. It turns out that pairing 1.4W exhaust fans behind the CPU with a 2.0W exhaust fan in the power supply tends to pull most of the heat through the power supply, at least with my tower case set up the way it is. I opened one of the top drive bays (by removing the front 5.25" cover) and inserted a cardboard divider between the power supply and the lower portion of the case, taking care not to block any of the power supply vents, since the PCP&C only has front intake vents. That got the power supply running a lot cooler, since the air flowing through it now came directly from the front of the case, and all the hot air from the CPU, hard drives, and video card was exhausted out through the lower 80mm fans. The CPU temperature didn't even go up appreciably.

The Antec may have bottom vents facing the CPU, but it also runs its fan slower, and I still have the 5.25" drive bay cover removed, so airflow and heat seem to be going fairly equally out through the power supply and the lower rear exhaust fans. And the computer I stuck the PCP&C in has a different arrangement - it's in a Micron mini-tower case, and it already gets its airflow through an open 3.5" drive bay past the CPU and out an 80mm rear exhaust fan. Plus, the 80mm rear exhaust fan and the power supply fan match - they're both 2.0W Sunon 80mm ball-bearing fans. If anything, the rear exhaust fan has slightly higher airflow than the power supply fan, because I cut out the grille behind the rear exhaust fan, and the power supply still has to pull air in through its front intake vents.

So, in conclusion, a marginal power supply can cause really weird behavior, even to the extent of running one CPU fine but not another. Of course, 300W is already cutting it fine for an overclocked AMD Athlon XP with a Radeon 9800Pro and multiple hard drives, but if one uses quality power supplies, it's doable. And I can't afford a replacement power supply at this time, though I sure would like to upgrade to a 400W or better Antec, FSP Group (Fortron Source), PC Power & Cooling, or Seasonic eventually (all good, reliable manufacturers of power supplies, unlike most generic and many other brands).
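For anyone wanting to sanity-check whether a supply is "cutting it fine", a rough per-rail headroom check might look like the sketch below. Every number in it is a placeholder chosen for illustration, not a measurement of this system or a rating read off any real power supply label; the helper names `rail_headroom` and `total_watts` are likewise hypothetical. The only hard fact used is that power in watts equals volts times amps.

```python
# Nominal rail voltages for the main positive rails.
RAIL_VOLTS = {"3.3V": 3.3, "5V": 5.0, "12V": 12.0}

def rail_headroom(rated_amps, load_amps):
    """Return {rail: fraction of the rated current still unused}."""
    return {
        rail: 1.0 - load_amps.get(rail, 0.0) / rated_amps[rail]
        for rail in rated_amps
    }

def total_watts(load_amps):
    """Sum volts * amps over all rails."""
    return sum(RAIL_VOLTS[rail] * amps for rail, amps in load_amps.items())

# PLACEHOLDER values: substitute the ratings from your own supply's label
# and your own estimates of what the system draws on each rail.
rated = {"3.3V": 20.0, "5V": 30.0, "12V": 15.0}
load = {"3.3V": 8.0, "5V": 24.0, "12V": 9.0}

headroom = rail_headroom(rated, load)
tight_rails = [rail for rail, spare in headroom.items() if spare < 0.25]

print(headroom)
print(tight_rails)
print(total_watts(load))
```

The point of checking each rail separately is that a supply can look fine on total wattage and still be nearly maxed out on one rail, which is why amperage ratings matter more than the number on the box.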