Building a RELIABLE computer?

KevinL

Flashlight Enthusiast
Joined
Jun 10, 2004
Messages
5,866
Location
At World's End
I just had my main computer quit on me today, and of course, I only got everything up and running with a lot of pain and bloodshed (literally, got cut by a sharp edge). Damned computers. Why can't they be more reliable?

In networking and communications, the gold standard for reliability is 99.999% ("Five Nines"), which translates into a little over 5 minutes of downtime a year. I need to build a fault-tolerant computer capable of delivering that kind of field reliability. Any ideas, or am I wasting my time? I've been trying to build something like this for a long, long time.
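
For reference, the downtime budget behind each "nines" figure is simple arithmetic. A minimal sketch, assuming a 365-day year (the function name is only illustrative):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime allowed at a given availability percentage."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


for nines in (99.9, 99.99, 99.999):
    print(f"{nines}% -> {downtime_minutes_per_year(nines):.1f} minutes/year")
```

Five nines works out to roughly 5.3 minutes a year, while three nines already allows almost nine hours.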

Intel Architecture is a requirement, and it must run Windows (for compatibility, though I have 8+ years of UNIX admin background). Mean Time To Repair (MTTR) should be under sixty minutes if possible. Preferably it shouldn't break the bank either, although when the thing is down and you absolutely, positively need it to work, you really feel like you'd pay if they'd just make such a thing.
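
To see what those two targets imply together, here is a rough sketch using the usual steady-state relation, availability = MTBF / (MTBF + MTTR). It assumes every failure costs the full MTTR and nothing else causes downtime; the function name is made up for illustration.

```python
def required_mtbf_hours(availability: float, mttr_hours: float) -> float:
    """Solve A = MTBF / (MTBF + MTTR) for MTBF."""
    return mttr_hours * availability / (1 - availability)


# Five nines with a one-hour repair time
mtbf = required_mtbf_hours(0.99999, mttr_hours=1.0)
print(f"MTBF needed: {mtbf:,.0f} hours (~{mtbf / (24 * 365):.1f} years)")
```

With a one-hour repair, a single machine would have to average on the order of eleven years between failures to stay inside the five-nines budget, which is why redundancy usually enters the picture.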

Forgive me if I'm being a little unrealistic. I'm more than a little frustrated at not being able to get things done when I need to, and at having spent the last six hours debugging hardware and software step by step instead.
 

tiktok 22

Flashlight Enthusiast
Joined
Sep 8, 2002
Messages
1,273
Location
Illinois
Building a reliable PC depends on what you want to do with it. Will it be a gaming machine? Will it be a workstation? Will it be for basic functions such as surfing?

Other things to consider are what hardware you will need: processor and RAM speed, RAM size, HDD size. Do you need a floppy? What type of optical drive will you need? Getting a good quality power supply is important. Nowadays, I wouldn't get less than a 400 watt unit from a quality manufacturer such as Antec.

There is a lot that goes into building a RELIABLE computer, but it's far from difficult to build one. Since you've determined you want an Intel architecture, check out some motherboard manufacturers' websites to see if their boards offer what you want. That should be your next step. From there you can determine which other components will work with the board you choose.
 

gadget_lover

Flashaholic
Joined
Oct 7, 2003
Messages
7,148
Location
Near Silicon Valley (too near)
There used to be PCs that were very fault tolerant, but AT&T could not find a market for them. Dual power supplies, disks, etc., all based on the technology used in the telephone switching systems.

You can come close by starting with a system designed to be a server. They can be equipped with dual hot-swap power supplies, RAID and ECC memory. Many use SCSI because of the ability to hot swap disk drives. Servers generally have well engineered air flow for better cooling.

I hear tales of people who routinely go months at a time without rebooting their Windows 2000 systems. I'm not sure how they manage that, since it means they don't patch their OS to keep up with the hackers.

I use a very simple scheme for getting good uptime. 1) Clean power via a premium UPS. 2) Underclock the CPU. 3) Name-brand graphics cards and such from last year. 4) Set up a 'games system' that IS cutting edge and let that one crash as it will.

I also run Linux for my general workstation, so I usually boot only after an extended power failure. The last power failure was 74 days ago :)

Daniel
 

raggie33

*the raggedier*
Joined
Aug 11, 2003
Messages
13,559
Make sure ya research the parts ya buy. I like nVidia chipsets for mobos and Corsair memory. I've had bad experiences with power supplies, but I like Antec and Enlight. For modern PCs I say get 450 watts. Heck, some modern video cards use 100 watts by themselves. I like AMD CPUs. I like the AMD Barton 2500 for the price; they're good overclockers. I like Maxtor hard drives and, if I have the money, Pioneer optical drives. For cooling I like the Cooler Master Silent Dream coolers.
 

raggie33

*the raggedier*
Joined
Aug 11, 2003
Messages
13,559
Oh, let me add: I believe making a gamer's PC isn't too good an idea. Go get an Xbox or PS2. It's way more economical.
 

raggie33

*the raggedier*
Joined
Aug 11, 2003
Messages
13,559
tiktok, what PSUs do you like? I had one die last week. It was a no-name, but man it pissed me off: it died in less than 4 days. I like Antec, but they're so expensive.
 

gadget_lover

Flashaholic
Joined
Oct 7, 2003
Messages
7,148
Location
Near Silicon Valley (too near)
I kind of figure a computer could be considered reliable if it takes less maintenance than my car and has a longer MTBF (mean time between failures). I run my PCs underclocked and I drive my cars gently.

A car driven 12,000 miles a year at an average speed of 45 MPH will be driven for about 267 hours. I'd have the car's oil changed at least 2 times during that time. I'd also take care of the windshield wipers once, and have it washed numerous times.

A computer that's on 2 hours a day for a year will have run 730 hours. So, is the average computer more reliable than a good car?
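
The comparison above is quick to check. A small sketch using only the figures from this post:

```python
car_hours = 12_000 / 45      # miles per year / average mph
computer_hours = 2 * 365     # hours per day * days per year

print(f"car: {car_hours:.0f} h/year")
print(f"computer: {computer_hours} h/year")
print(f"the computer runs {computer_hours / car_hours:.1f}x as long as the car")
```

So even a lightly used PC puts in nearly three times the running hours of the car, with far less scheduled maintenance.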

The software is a different matter. It's not hard to build software with NO bugs. I've done it myself. It's just more work and requires methodical design, build and testing. Most commercial software is NOT bug free.

Daniel
 

rastaman

Newly Enlightened
Joined
Jul 24, 2004
Messages
122
Location
Germany
Build two 100% identical machines. You work with computer 1 and, let's say every night, make an image of the boot disk to a network or any external drive.

If computer 1 fails, restore the image onto the second (backup) computer ;-)
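
The nightly part of this scheme could be scripted roughly like this. It is only a sketch: it assumes some imaging tool has already written the boot-disk image to a file, and the function name, paths and retention count are all hypothetical.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_image(image_file: Path, backup_dir: Path, keep: int = 7) -> Path:
    """Copy an existing disk image into backup_dir under a dated name,
    then delete all but the newest `keep` copies."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = f"{date.today():%Y%m%d}"
    target = backup_dir / f"{image_file.stem}-{stamp}{image_file.suffix}"
    shutil.copy2(image_file, target)
    # Dated names sort chronologically, so the oldest copies come first.
    copies = sorted(backup_dir.glob(f"{image_file.stem}-*{image_file.suffix}"))
    for old in copies[:-keep]:
        old.unlink()
    return target


# Hypothetical usage, run once a night by the system scheduler:
# archive_image(Path("D:/images/boot.img"), Path("//backupserver/images"))
```

Keeping several dated copies rather than one means a corrupted image from last night does not overwrite the last good one.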
 

Al_Havemann

Enlightened
Joined
Sep 11, 2002
Messages
302
Location
New York City
Reliable computers exist, plenty of them. The problem isn't (for the most part), the hardware so much as the OS.

For example:

I have 30 file servers running Novell NetWare (no comments please), 12 Linux servers and 8 Windows 2003 servers. The Novell and Linux servers (all Dell), run for up to 5 years, 24/7, between replacement and without failure. Literally, they run without stopping for years, no downtime, no failures, no reboots. We average the golden 99.9% up time. Several years ago I set up a NetWare server (4.11) in one of my offices and didn't touch it for almost five years, no down time, never had to reboot it. It never asked for anything, 100% reliability.

On the other hand, we also have 8 Windows 2003 servers, same hardware. These usually require rebooting on a weekly basis or even more often on the busy machines. Loss of network connectivity, service failures, whatever.

So, most of the downtime is caused by the OS. Not to fault Windows, it has a tough job to do with the graphical environment and all but it just doesn't compare to the back room, lights out, set it and forget it machines running Linux and Novell.

And all that applies to workstations as well. True, they ARE getting better: XP is pretty stable and can run for days, even weeks, without needing a reboot, but they're not in the class of server OSes like Linux, Novell or an AS400.

So before you go in search of the perfect PC, just know in advance that Windows isn't going to give you the perfect reliability you're seeking.

Hardware failures do happen, not all that often but consumer grade equipment isn't built to the same standards as servers. Better cooling, higher quality components, conservative operation, no over-clocking, etc. all lead to long life and high up-time.

Until we get to the point where computers are really useful we won't get very high reliability. And no, I don't mean another version of Windows. I mean when we get to the point where computers diagnose and correct problems as they run. When computers can carry on a conversation with a person and both derive useful information from the exchange. When computers carry out the tasks we now have to do by manual keyboard input, and they do it from their own initiative, when they write their own code, error free.

It's going to be a while, I'm afraid.

Al
 

Joe Talmadge

Flashlight Enthusiast
Joined
Aug 30, 2000
Messages
2,200
Location
Silicon Valley, CA
If you're looking for anything close to five-nines type reliability with the type of equipment you're looking for, I'm thinking you're pretty much looking at a cluster, at least two nodes. I don't know enough about Windows clusters to know what cluster/failover/membership software to use, but I'd guess there's some out there.
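
As a back-of-the-envelope illustration of why a second node helps: if the cluster is up whenever at least one node is up, the unavailabilities multiply. This sketch assumes node failures are independent and failover is instant and flawless, which real clusters only approximate; the function name is illustrative.

```python
def cluster_availability(node_availability: float, nodes: int = 2) -> float:
    """Availability of a cluster that stays up while at least one node is up,
    assuming independent node failures and perfect failover."""
    return 1 - (1 - node_availability) ** nodes


for a in (0.99, 0.999):
    print(f"single node {a:.3%} -> two-node cluster {cluster_availability(a):.5%}")
```

On paper, two 99.9% nodes come out around 99.9999%. In practice shared components (power, network, the failover software itself) and failover time eat into that figure.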
 

Minjin

Flashlight Enthusiast
Joined
Sep 21, 2002
Messages
1,237
Location
Central PA
I think you guys are going a little overboard here. I've found computer failures to be VERY rare if you buy name-brand components. If you're talking about software compatibility problems, then you need to take more care in choosing which programs to run. I've been running XP for a few years now and it routinely stays up for several months at a time. I only shut down when power goes down (before the UPS runs out) or when I want to mess with the hardware.

For a home computer, I'd say just run name-brand components. Make sure you have good cooling; buy extra fans. Most hardware problems are from overheating. Run RAID 1 and back up once a month if you are concerned with data integrity. One step beyond that would be to switch to commercial-grade SCSI drives. Beyond that, stuff really doesn't go bad...

Mark

edit: just realized I put RAID 0 rather than RAID 1
 

raggie33

*the raggedier*
Joined
Aug 11, 2003
Messages
13,559
[ QUOTE ]
Al_Havemann said:
Reliable computers exist, plenty of them. The problem isn't (for the most part), the hardware so much as the OS.


[/ QUOTE ]
Windows 2000 SP4 has been reliable for me, but I do love Linux.
 

tiktok 22

Flashlight Enthusiast
Joined
Sep 8, 2002
Messages
1,273
Location
Illinois
Lots of great advice here.


[ QUOTE ]
raggie33 said:
tiktok, what PSUs do you like? I had one die last week. It was a no-name, but man it pissed me off: it died in less than 4 days. I like Antec, but they're so expensive.

[/ QUOTE ]

Hi Raggie,

I like Antec since they have never failed me. But yeah, they cost some bucks. Other ones I would consider are Enermax or Thermaltake. BUT, if I ever get another, this is the one I would like to try:

http://www.xoxide.com/fanlesspsu.html
 

gadget_lover

Flashaholic
Joined
Oct 7, 2003
Messages
7,148
Location
Near Silicon Valley (too near)
We've had self correcting computers since the 70's.

I worked at the phone company where the system that switched the phone calls had dual processors that ran in step. The processors were made up of discrete components and had inter-processor busses that were used to compare the results of the registers.

If one processor developed a hardware fault the other was made active during the same processor cycle. No one except the techs even knew that there was a problem.

Self-diagnosing and self-healing hardware is more common than you may think. I work with mini-computers, and I've used many of them. Tandem (bought by Compaq) made a series of fault tolerant systems that had duplicate everything and an OS that was designed to take advantage of the hardware. Sun makes computers with multiple processors that recognize faults and isolate the offending equipment.

I used a Sun with 10 CPUs, 10 GB of RAM and 1 terabyte of disk. The CPUs and memory could be swapped while the system was running. The disks had dual fibre connections and were all fully backed up. I could literally pull out any two disks without warning without slowing the system. I could disconnect either of the disk drive subsystems without slowing the system. The memory was ECC, so it warned you in advance of memory that was going bad. That system ran with 0 downtime for 2 years AND had its disk subsystems totally replaced during that time.

You can pick up that 6 year old Sun for less than 10 grand now.
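
For anyone curious how ECC memory can both fix an error and report it, here is a toy illustration using a Hamming(7,4) code. This is not what any particular server uses (real ECC memory applies wider codes over whole memory words), and the function names are made up; it only shows the principle that a single flipped bit can be located, corrected and logged.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position); error_position is 0 if none was seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s4   # the syndrome names the bad position
    if error_position:
        c[error_position - 1] ^= 1          # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], error_position


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1                                # simulate one bit going bad
data, position = hamming74_decode(word)
print(data, position)
```

The data comes back intact and the decoder knows which bit was wrong. A system that counts those corrected errors per module can flag a module that is starting to fail before it causes a crash.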

Daniel
 

gadget_lover

Flashaholic
Joined
Oct 7, 2003
Messages
7,148
Location
Near Silicon Valley (too near)
[ QUOTE ]
tiktok 22 said:
I like Antec since they have never failed me. But yeah, they cost some bucks. Other ones I would consider are Enermax or Thermaltake. BUT, if I ever get another, this is the one I would like to try:

http://www.xoxide.com/fanlesspsu.html

[/ QUOTE ]

I'd stay away from a PSU with that many ventilation slots that could dump heat INSIDE the computer. I don't want the CPU to have to cope with any more than it has to.


Daniel
 
EchoSierraTwo

Guest
Dudes, you can't compare TRUE BLUE SERVERS to what he needs for his application, though by description it looks like he wants a server. Question, though: what applications do you have or need that you can't get on *nix or open source? If you hold no loyalties to WinBLOWS or Intel, DUMP 'em. Go with a *nix variant.
 

raggie33

*the raggedier*
Joined
Aug 11, 2003
Messages
13,559
Sweet PSU, tiktok. Man, the last PSU I bought was awful. It was only 40 bucks with the case, though; I kept the case and threw the PSU away. It was rated 350 watts. I have no idea how they got that rating; it weighed like 8 ounces, lol. Well, I'm guessing at the weight, but what garbage. It blew very loudly, lol. I'll look into the one ya posted. This Enlight I'm using now is 300 watts and the rails are pretty stable: they stay within 3% on the 3.3, 5 and 12 and Vcore. But it's a tad small at 300 watts, so I stopped overclocking till I get a new PSU.
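
A quick way to sanity-check rail readings like these is to compute each one's deviation from nominal. The readings below are invented for illustration, and the 3% limit is just the figure from this post (the ATX tolerance for these positive rails is, as far as I know, a looser 5%).

```python
def rail_deviation_percent(nominal: float, measured: float) -> float:
    """Signed deviation of a measured rail voltage from nominal, in percent."""
    return (measured - nominal) / nominal * 100


# Hypothetical readings from a hardware monitor: nominal -> measured volts
readings = {3.3: 3.25, 5.0: 5.10, 12.0: 11.80}
LIMIT_PERCENT = 3.0

for nominal, measured in readings.items():
    deviation = rail_deviation_percent(nominal, measured)
    status = "ok" if abs(deviation) <= LIMIT_PERCENT else "OUT OF RANGE"
    print(f"{nominal:>4} V rail: {measured:.2f} V ({deviation:+.1f}%) {status}")
```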
 