Bear's Blog of Technology and Related Ruminations
20-Aug-2010 - When Drives Go Bad [00007]

The Network Attached Storage (NAS) array on my network has been complaining lately. It is reporting that one of the drives has SMART errors and that this could be an indication of a pending drive failure. So what is SMART and when will this drive fail?

History

Self-Monitoring, Analysis and Reporting Technology (SMART) began as IBM’s Predictive Failure Analysis (PFA), a reliability-prediction technology developed in 1992. The idea was to have a drive monitor key attributes (mechanical, electronic and the disk media) and report issues. This would permit a drive to be replaced in a controlled fashion. Compaq, in conjunction with Seagate Technology, Conner Peripherals and Quantum, introduced a similar ability named IntelliSafe in 1995. The IBM and Compaq solutions were then merged to create the SMART standard.

When Do Hard Drives Fail?

Hard disk manufacturers calculate the anticipated failure rate of their drives from accelerated life tests on a small number of their own drives. In 2007, researchers at Google undertook a study of over 100,000 drives from a variety of manufacturers. Their study showed that of the drives that failed, 56% did so without any strong SMART signal warning and 36% failed with no SMART signal warning at all.

So while it is possible that a drive will fail without any warning, what about the drives where there was a sign of trouble? Reallocation counts are one of the SMART signals worth paying attention to. (The others are scan errors, offline reallocations and probational counts.) Reallocation occurs when a drive’s onboard logic believes a sector is probably damaged. The bad sector is removed from use and a sector from a spare area on the drive is mapped into its place. The Google study showed that once a single reallocation occurs, there is a 15% chance the drive will fail in the next 8 months. It’s even worse for older drives.
If a drive has been operating for 10-20 months before that first reallocation, there is a 20% chance of failure within 8 months. For drives 20-60 months old the failure chance rises to about 24%. The bottom line is that once a drive suffers its first reallocation, watch it closely and be prepared to replace it.

An Example

One of the nice things about the NAS is that it talks to me when it is expecting trouble. The first email that got my attention was this:

Day 2 - Reallocated sector count has increased in the last day.
Disk 2:
Previous count: 5
Current count: 12
Growing SMART errors indicate a disk that may fail soon. If the errors continue to increase, you should be prepared to replace the disk.

The jump from 5 to 12 was quite different from the previous increases of a single sector. After this, it got worse fast.
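The kind of check behind that email can be sketched in a few lines of Python. This is only an illustration, not the NAS vendor's logic: the function name `is_suspicious` and the one-sector-per-day threshold are my own assumptions, the threshold chosen because the earlier increases were a single sector at a time.

```python
def is_suspicious(previous_count, current_count, max_daily_increase=1):
    """Return True when the reallocated sector count grew faster than expected.

    max_daily_increase is an assumed threshold, not a SMART-defined value.
    """
    return (current_count - previous_count) > max_daily_increase


# Counts taken from the emails quoted in this post.
for previous, current in [(5, 12), (195, 1843)]:
    if is_suspicious(previous, current):
        print(f"Reallocated sectors rose from {previous} to {current}: watch this disk")
```

A growth of one sector, say from 4 to 5, would not trip this check, while both increases reported by the NAS would.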
On day 9 I ordered a new drive. On the morning of day 11 I received this email:

Reallocated sector count has increased in the last day.
Disk 2:
Previous count: 195
Current count: 1843
Disk 2 did not pass SMART self-assessment test. Please replace this disk as soon as possible.

1,843 reallocated sectors! Fortunately the new hard drive arrived in the afternoon and all is good with the NAS.

But Wait There’s More

Don’t just toss your bad drive away. Aside from possible security and privacy issues related to a hard drive (even a bad drive can give up its data in the right hands), you may be able to get it replaced. Attach your drive to a computer (I use a Thermaltake BlacX for this) and use a utility like Seagate’s SeaTools to diagnose the problem. Check your warranty and follow your disk drive manufacturer’s process for returning the bad drive. In the case of my drive, I had three years left on the warranty. For the cost of shipping (one way) it is worth it to get a refurbished 1 TB drive.

References

Playing it S.M.A.R.T.

21-Jul-2010 - Software Troubleshooting - Part 2 [00006]

I'm not sure what the current exchange rate is, but at one time a picture was purported to be equal to a thousand words. So here's a picture of the hard drive of one of my computers:

OK, so actually it is a graphical representation of the files on the hard drive. Each colored block represents a single file. The colors and block size have meaning: the color indicates the type of file and the size indicates the number of bytes relative to the other files. This wonderful visualization is produced by the program WinDirStat (Windows Directory Statistics), which you can download from a link on that page.

This tool is useful on those occasions when you need to free up space on a crowded hard drive. By examining the Treemap created by WinDirStat, you can quickly discover what files, and types of files, are hogging all the space on the drive.
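For a rough, text-only version of the same idea, a short Python script can list the biggest files under a folder. This is a minimal sketch and has nothing to do with how WinDirStat itself works; the function name `largest_files` and its parameters are made up for illustration.

```python
import heapq
import os


def largest_files(root, count=10):
    """Return the `count` largest files under `root` as (size_in_bytes, path) pairs."""
    sizes = []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # Skip files that vanish or cannot be read while scanning.
                continue
    # Largest first, because tuples compare on size before path.
    return heapq.nlargest(count, sizes)
```

Calling `largest_files("C:\\", 20)` would return the twenty largest files on the C: drive, though without the treemap you lose the at-a-glance view of which file types dominate.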
Take the large green block in the upper right of the picture. Selecting that block reveals it is a GPS update program I downloaded a couple of months ago. The file is occupying 2.1 gigabytes of space that could be used for something more useful. A press of the delete key removes the file, but not before a popup window first questions your technical competency. (It begins by asking, "Do you know what you are doing?")

WinDirStat is freeware and runs on Windows 95 (IE5), Windows 98 SE, Windows ME, Windows NT4 (SP5), Windows 2000, Windows XP, Windows Server 2003 and Windows Vista. It probably runs on Windows 7 too.

11-Apr-2010 - Software Troubleshooting - Part 1 [00005]

The tool I use most often is CCleaner (http://www.piriform.com/ccleaner). This free application from Piriform is a system utility that is quite effective at cleaning up computers running the Windows operating system. The tool covers three areas:

Cleaner

Cleaner does just that. It finds and deletes unneeded files from your system. The Windows Disk Cleanup utility also does this, removing office setup files, temporary files and emptying the recycle bin, but it is incredibly slow. CCleaner is much faster and goes beyond the Windows utility by cleaning up memory dumps, Chkdsk file fragments, log files, DNS cache, various other caches, IIS log files, hotfix uninstallers and many others. Aside from cleaning system files, CCleaner also cleans out files from applications.

Registry

Selecting Registry allows CCleaner to scan the Windows registry for issues. Once they are found, you have the option to fix the issues en masse or individually. One other option is to save the changes made to the registry. Do not consider this an option! Always back up any changes CCleaner makes to the registry. I've never needed any of the backups, but it is good to know they are there.
Tools

CCleaner also includes options for managing your startup programs, excluding cookies from deletion, uninstalling applications, excluding registry entries and specifying folders to be emptied.

Warnings
Other Recent Blogs:
7-Apr-2010 - Software Troubleshooting - Part 0 [00004]
Complete list of prior blogs
4-Sep-2010
Copyright © 1997-2010 — Digital Bear Consulting. All Rights Reserved.