Understanding ScanDisk

The single most important thing to know about ScanDisk, as well as NT's equivalent ChkDsk, is that it is not a data recovery tool. Its job is to maintain the sanity of the file system, and if your directories or files get in the way of this objective, they will be sacrificed!

That's why it is dangerous to allow ScanDisk to automatically fix all errors. Instead, you want to be prompted on each problem, so that you can decline the automatic "fix" where appropriate and proceed to manual recovery. But to do this, you need to understand the problems that ScanDisk reports. In the order they are listed in ScanDisk.ini (and as commented therein), these are:

DS_Header: Damaged DoubleSpace volume file header
FAT_Media: Missing or invalid FAT media byte
Okay_Entries: Damaged, but repairable, directories/files
Bad_Chain: Files or directories which "should be truncated" (sic)
Crosslinks: FAT-level crosslinks

Boot_Sector: Damaged boot sector on DoubleSpace drive
FSInfo_Sector: Incorrect free space count
Invalid_MDFAT: Invalid MDFAT entries
DS_Crosslinks: Internal (MDFAT-level) crosslinks
DS_LostClust: Internal lost clusters
DS_Signatures: Missing DoubleSpace volume signatures
Mismatch_FAT: Mismatched FATs on non-DoubleSpace drives
Bad_Clusters: Physical damage or decompression errors

Bad_Entries: Damaged and "irrepairable" (sic) directories or files

LostClust: Lost clusters

Long File Name errors: Detected but not fixed in real-mode ScanDisk

One can group these problems according to data risk...

Safe to fix

You can let ScanDisk fix these problems, and if the fixes are not logged, it doesn't matter.

FSInfo_Sector: Incorrect free space count

This problem is trivial for ScanDisk to fix safely. Because the FAT32 File Allocation Tables can be large and thus slow to scan, Windows keeps a record of the free space capacity cached within the FAT32 volume boot record for quick reference. This value is updated only now and then, so after a bad exit from Windows it often differs from the true free space as deduced from the state of the FAT itself. That is why this problem is often seen after the system crashes, restarts, or is switched off while running Windows.

ScanDisk fixes this error by re-calculating the free space capacity from the FAT, and writing this value to the FAT32 volume's boot record.
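As a rough illustration of that re-calculation, here is a minimal Python sketch that counts zero entries in a raw FAT32 table and compares the result with a cached figure. The function names and the tiny sample FAT are invented for the example, and it is not a description of ScanDisk's actual code.

```python
import struct

FAT32_ENTRY_MASK = 0x0FFFFFFF  # the top 4 bits of each FAT32 entry are reserved


def count_free_clusters(fat_bytes, cluster_count):
    """Count free data clusters by scanning a raw FAT32 table.

    Entries 0 and 1 are reserved, so data clusters are numbered
    2 .. cluster_count + 1. A free cluster has a zero entry.
    """
    free = 0
    for cluster in range(2, cluster_count + 2):
        (entry,) = struct.unpack_from("<I", fat_bytes, cluster * 4)
        if (entry & FAT32_ENTRY_MASK) == 0:
            free += 1
    return free


def check_free_count(fat_bytes, cluster_count, cached_free):
    """Return (actual_free, needs_fix) for a cached free-cluster count."""
    actual = count_free_clusters(fat_bytes, cluster_count)
    return actual, actual != cached_free


# Hypothetical 6-cluster volume: clusters 2-3 form one chain, 4-7 are free.
EOC = 0x0FFFFFFF
sample_fat = struct.pack("<8I", 0x0FFFFFF8, EOC, 3, EOC, 0, 0, 0, 0)
result = check_free_count(sample_fat, 6, cached_free=5)
print(result)
```

The cached value is only a hint; the FAT itself is the authority, which is why this fix carries no data risk.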

LostClust: Lost clusters

This problem is often found after a bad exit from Windows, when there are data clusters chained out of free space but where there is no directory entry pointing to the cluster chain thus formed. Such files were typically opened for write at the time the system crashed, restarted, or was switched off. Often these were temporary files that would have been deleted during shutdown anyway, but they can also be data files that were in use at the time.

ScanDisk fixes this problem by creating a new directory entry in the root of the volume to point to the cluster chain. If the chain appears to be a directory, i.e. begins with the customary . and .. entries, then it is saved as a directory of the name DIRnnnnn, where nnnnn is a number left-padded with zeros. Otherwise, the chain is assumed to be a file, and the name will be FILEnnnn.CHK, where nnnn is a number left-padded with zeros, and the length of the file in bytes is set to include everything to the end of the last cluster in the chain.
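The detection and naming described above can be sketched as follows. This is a simplified model, assuming the FAT has already been decoded into a Python list of "next cluster" values and that the start clusters of all files and directories found in the directory tree are known; the function names are hypothetical, and numbering from FILE0000.CHK is an assumption of the example.

```python
BAD_CLUSTER = 0x0FFFFFF7  # FAT32 marker for a cluster flagged as bad


def walk(fat, start):
    """Follow a cluster chain from start; stop at end-of-chain, free, or a loop."""
    chain, seen = [], set()
    cluster = start
    while 2 <= cluster < len(fat) and cluster not in seen:
        chain.append(cluster)
        seen.add(cluster)
        cluster = fat[cluster]
    return chain


def find_lost_chains(fat, known_starts):
    """Return chains that are allocated in the FAT but not reachable
    from any directory entry (known_starts)."""
    referenced = set()
    for start in known_starts:
        referenced.update(walk(fat, start))
    allocated = {c for c in range(2, len(fat))
                 if fat[c] != 0 and fat[c] != BAD_CLUSTER}
    lost = allocated - referenced
    pointed_to = {fat[c] for c in lost}
    # A chain head is a lost cluster that no other lost cluster points to.
    # (A lost chain that loops on itself has no head and is missed here.)
    heads = sorted(c for c in lost if c not in pointed_to)
    return [walk(fat, head) for head in heads]


def name_lost_chains(chains, cluster_size):
    """Give each chain a FILEnnnn.CHK name and a length that runs to the
    end of its last cluster, since the true byte length is unknown."""
    return [("FILE%04d.CHK" % i, len(chain) * cluster_size)
            for i, chain in enumerate(chains)]


EOC = 0x0FFFFFFF
# Cluster 2 -> 3 belongs to a known file; 5 -> 6 -> 7 and 9 are lost.
sample_fat = [0x0FFFFFF8, EOC, 3, EOC, 0, 6, 7, EOC, 0, EOC]
lost_chains = find_lost_chains(sample_fat, known_starts=[2])
print(name_lost_chains(lost_chains, cluster_size=4096))
```

Note how the recovered length is always a whole number of clusters, so the tail of each .CHK file may contain slack-space garbage.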

It's important to save these lost cluster chains, rather than delete them, in case you wish to recover data from them; once deleted, it will be near impossible to find which clusters contain the data you wanted. In Windows 98, ScanDisk.ini is set to automatically delete these chains by default.

Safe to fix, but will damage data

You can let ScanDisk fix these problems, but because data may still be damaged, you should log the results - else you will have no way of knowing which files were damaged!

Crosslinks: FAT-level crosslinks

A crosslink occurs when a cluster is common to two different cluster chains - implying that all clusters following the crosslink will be common to both files or directories (as they share the same subsequent cluster chaining in FAT). One or both files or directories will be corrupted, as data from one will have overwritten that of the other.

ScanDisk fixes this problem by making a new copy of the affected cluster chain, using free clusters. One of the two affected directory entries is then set to point to the new copy, so that subsequent operations will not affect the other file or directory. However, because one or both of these entities will have already been corrupted, it is crucial to keep a log of what was "fixed" when ScanDisk repairs crosslinks.
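To make the idea of a crosslink concrete, here is a small Python sketch that walks each known chain and reports the first cluster claimed by two owners. It assumes the FAT is already decoded into a list of "next cluster" values and that the names and start clusters come from the directory entries; the function name is hypothetical and this is not ScanDisk's own algorithm.

```python
def find_crosslinks(fat, starts):
    """starts maps a file or directory name to its first cluster.

    Returns {cluster: (first_owner, second_owner)} for the first cluster
    at which a later chain runs into one that was already walked.
    """
    owner = {}
    crosslinks = {}
    for name, start in starts.items():
        cluster, seen = start, set()
        while 2 <= cluster < len(fat) and cluster not in seen:
            seen.add(cluster)
            if cluster in owner:
                # Everything after this point is shared as well,
                # so there is no need to walk any further.
                crosslinks[cluster] = (owner[cluster], name)
                break
            owner[cluster] = name
            cluster = fat[cluster]
    return crosslinks


EOC = 0x0FFFFFFF
# A.TXT is 2 -> 3 -> 4; B.TXT starts at 5 but chains into cluster 4.
sample_fat = [0x0FFFFFF8, EOC, 3, 4, EOC, 4]
links = find_crosslinks(sample_fat, {"A.TXT": 2, "B.TXT": 5})
print(links)
```

A report like this is exactly what you want in the log: it names both affected files, so you know which ones to inspect for damage afterwards.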

Bad_Clusters: Physical disk damage

When ScanDisk fails to read a sector in the data area of a hard disk volume, it repairs this problem by copying the readable sectors in that cluster to a new cluster taken from the end of the volume. But at least one sector is likely to be garbage, so unless this fell in the "slack space" after the end of data in the last cluster of the chain, the affected file or directory will be damaged. So once again, it is crucial to log these repairs so that you know what file or directory was affected.
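Whether an unreadable sector actually hurts a file comes down to simple arithmetic, sketched below in Python. The function and parameter names are made up for the example, and it applies to files only, since directories carry no byte length to test against.

```python
def sector_in_slack(file_size, cluster_index, sector_index,
                    sectors_per_cluster, bytes_per_sector=512):
    """Return True if a sector lies wholly beyond the end of the file's data.

    cluster_index is the 0-based position of the cluster within the
    file's chain; sector_index is the 0-based sector within that cluster.
    """
    sector_start = (cluster_index * sectors_per_cluster
                    + sector_index) * bytes_per_sector
    return sector_start >= file_size


# A 5000-byte file on 4 KB clusters (8 sectors of 512 bytes) uses two
# clusters; its data ends partway through sector 1 of the second cluster.
print(sector_in_slack(5000, cluster_index=1, sector_index=1,
                      sectors_per_cluster=8))
print(sector_in_slack(5000, cluster_index=1, sector_index=2,
                      sectors_per_cluster=8))
```

In this example the first sector still holds file data, so the file is damaged; the second lies in slack space and the file survives intact.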

Relocating the damaged cluster also kills the chance to read the damaged sector, as you will no longer know which one it was - but this is a slim chance anyway.

Spurious "bad clusters" can occur when a hard drive is incorrectly defined in CMOS, as if it has more capacity than is actually the case. This then causes problems when ScanDisk tries to relocate data to the last (non-existent) clusters on the volume!

Because ScanDisk fixes bad sectors by linking in a free cluster, the file system has to be logically sane. This is why ScanDisk does the surface scan after the logic scan, and this in turn makes ScanDisk's surface scan less useful as a means of testing for physical hard disk errors.

Note also that the firmware within modern hard drives may perform similar repairs "on the fly" when sectors are found to be defective - but there are several differences as well as similarities.

Further discussion of internal hard drive defect management, S.M.A.R.T., etc. lies beyond the scope of this page ;-)

Long File Name errors

In DOS mode, ScanDisk can detect but not repair Long File Name problems. Sometimes the problem is a corrupted LFN, but sometimes this happens because clusters within a directory chain are mis-sequenced so that the first entry in the next cluster is not preceded by the expected contents in the end of the previous cluster. This is a dangerous situation that is best fixed manually if data is to be preserved, and so LFN problems should be approached with caution.
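As an illustration of how a mis-sequenced directory shows up as an LFN problem, here is a minimal Python sketch that scans raw 32-byte directory entries and flags LFN entries that do not count down cleanly to a short 8.3 entry. It checks sequence numbers only (it ignores the LFN checksum), the function names are hypothetical, and it is not a description of what ScanDisk itself does.

```python
ENTRY_SIZE = 32
ATTR_LFN = 0x0F        # attribute byte that marks a Long File Name entry
LAST_LFN_FLAG = 0x40   # set on the sequence byte of the first stored LFN entry


def find_orphan_lfn(dir_bytes):
    """Return indexes of LFN entries not followed by the expected entries."""
    orphans, run, expected = [], [], None
    for i in range(len(dir_bytes) // ENTRY_SIZE):
        entry = dir_bytes[i * ENTRY_SIZE:(i + 1) * ENTRY_SIZE]
        if entry[0] == 0x00:          # end-of-directory marker
            break
        if entry[0] == 0xE5:          # deleted entry interrupts any run
            orphans.extend(run)
            run, expected = [], None
        elif entry[11] == ATTR_LFN:
            seq = entry[0]
            if seq & LAST_LFN_FLAG:   # start of a new LFN run
                orphans.extend(run)
                run, expected = [i], (seq & 0x3F) - 1
            elif expected is not None and expected >= 1 and seq == expected:
                run.append(i)
                expected -= 1
            else:                     # out-of-sequence LFN entry
                orphans.extend(run)
                orphans.append(i)
                run, expected = [], None
        else:                         # short 8.3 entry
            if run and expected != 0:
                orphans.extend(run)   # run ended before counting down to 1
            run, expected = [], None
    orphans.extend(run)
    return orphans


def lfn_entry(seq):
    return bytes([seq]) + bytes(10) + bytes([ATTR_LFN]) + bytes(20)


def short_entry(name11):
    return name11 + bytes([0x20]) + bytes(20)


# A good two-part LFN, then a stray LFN part as might appear at the
# start of a cluster that was chained in the wrong order.
sample_dir = (lfn_entry(0x42) + lfn_entry(0x01) + short_entry(b"LONGFI~1TXT")
              + lfn_entry(0x02) + short_entry(b"OTHER   TXT"))
print(find_orphan_lfn(sample_dir))
```

A stray LFN part at the very start of a directory cluster is a hint that the cluster before it in the chain is not the one that belongs there.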

Dangerous to fix

You should not allow ScanDisk to "fix" these errors, and should proceed to more formal data recovery instead.

FAT_Media: Missing or invalid FAT media byte

In itself, this problem would not be dangerous to fix - but in most cases, there is more than just an invalid media byte involved. Usually, the entire volume boot record is corrupted, or the geometry in effect is pointing to the wrong sectors as being the boot record for that volume. This problem should be seen as a reason to proceed directly to formal data recovery.

Mismatch_FAT: Mismatched FATs

This is the most dangerous of file system logic problems, with the potential for massive data loss. The File Allocation Table tracks which data cluster follows the previous cluster within files and directories, and this information is near-impossible to deduce if it is lost. For this reason, there are two copies of the FAT, and ScanDisk detects the "mismatched FAT" problem when these differ.

ScanDisk "fixes" this problem by assuming one FAT to be correct, and the other to be wrong. The logic by which it decides which copy to believe is dubious at best; for example, it may simply assume FAT1 is always correct and FAT2 is always wrong, and copy the whole of FAT1 over FAT2 so that they match. This destroys the information in the overwritten FAT, as well as the differences that would have pinpointed which files and/or directories were affected.

If one is familiar with the FATxx file system and has suitable tools (e.g. Norton DiskEdit), it's quite easy to assess FAT mismatches visually as to which FAT copy to believe. Often the damage is due to odd sectors of garbage splattered over both FAT copies, so that it is obvious which sectors are bad and thus which should be overwritten with the corresponding sector from the other FAT. Usually this requires some sectors to be copied from FAT1 to FAT2 as well as others from FAT2 to FAT1, which is why one should never blindly copy one entire FAT copy over the other.
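The first step of such an assessment, finding out which sectors of the two FAT copies actually differ, can be sketched in a few lines of Python. This assumes both FAT copies have already been read into memory from an image of the volume (never from a live, writable one); the function name is invented for the example.

```python
def differing_fat_sectors(fat1, fat2, bytes_per_sector=512):
    """Return the sector numbers (relative to the start of the FAT)
    at which the two FAT copies differ. Read-only; nothing is repaired."""
    if len(fat1) != len(fat2):
        raise ValueError("FAT copies must be the same length")
    diffs = []
    for offset in range(0, len(fat1), bytes_per_sector):
        end = offset + bytes_per_sector
        if fat1[offset:end] != fat2[offset:end]:
            diffs.append(offset // bytes_per_sector)
    return diffs


# Two hypothetical 4-sector FAT copies with damage in sectors 1 and 3.
fat_copy_1 = bytes(2048)
fat_copy_2 = bytearray(fat_copy_1)
fat_copy_2[600] = 0xFF
fat_copy_2[1600] = 0x01
print(differing_fat_sectors(fat_copy_1, bytes(fat_copy_2)))
```

The list of differing sectors is precisely the information that a blanket FAT1-over-FAT2 copy destroys; which copy to believe for each sector remains a human judgment.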

Manual repair of FAT mismatches is covered tersely in the page on data recovery. Any mismatched FAT problems should be handled via formal data recovery procedures! Windows should not be run, no file writes (including the ScanDisk log) should be allowed, and no automated repairs should be permitted, until the data is evacuated and/or the file system is manually repaired.

Bad_Entries: "Irrepairable" (sic) directories or files

ScanDisk's idea of what is "irrepairable" may differ from what manual recovery can do, and allowing ScanDisk to "fix" such problems will destroy any chance there may have been for data recovery. Problems of this nature suggest significant file system damage, or inappropriate disk geometry settings that cause a healthy file system to appear deranged.

ScanDisk "fixes" these problems by throwing away the affected file or directory, and returning the data clusters to the free space. When a directory is discarded in this way, all files and directories that were pointed to in that directory are cast adrift as lost cluster chains - thus losing their names, location, and true file lengths in bytes. Ugly stuff.

Bad_Chain: Files or directories "to be truncated" (sic)

Once again, allowing ScanDisk to "fix" such problems will irreversibly damage the affected data. In particular, if ScanDisk sees a directory entry starting with a zero byte (as can happen when a corrupted entry is created, or a random sector of garbage is written into a directory's cluster chain) then it discards all entries after that point - creating the same nightmare as described in the previous paragraph.
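Before allowing any truncation, it is worth knowing what lies beyond the zero byte. The Python sketch below scans raw 32-byte directory entries, finds the first entry that starts with a zero byte, and lists any non-blank entries after it. The function names are hypothetical and the directory contents are made up for the example.

```python
ENTRY_SIZE = 32


def entries_after_end_marker(dir_bytes):
    """Return (end_index, stranded), where end_index is the first entry
    starting with 0x00 (or None) and stranded lists the indexes of
    non-blank entries that follow it."""
    end_index = None
    stranded = []
    for i in range(len(dir_bytes) // ENTRY_SIZE):
        entry = dir_bytes[i * ENTRY_SIZE:(i + 1) * ENTRY_SIZE]
        if end_index is None:
            if entry[0] == 0x00:
                end_index = i
        elif any(entry):
            stranded.append(i)
    return end_index, stranded


def short_entry(name11):
    # 11-byte 8.3 name, archive attribute, remaining fields zeroed
    return name11 + bytes([0x20]) + bytes(20)


# A zeroed entry sits between two live entries.
sample_dir = (short_entry(b"FILE1   TXT") + bytes(ENTRY_SIZE)
              + short_entry(b"FILE2   TXT") + bytes(ENTRY_SIZE))
print(entries_after_end_marker(sample_dir))
```

Any index in the second list is an entry that would be thrown away by truncating the directory at the zero byte.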

Okay_Entries: "Repairable" directories/files

I'm not sure what the full range of circumstances is, for ScanDisk to consider files or directories to be "damaged, but repairable" - but there's at least one case where the fix will be inappropriate. If a directory pointer points to an arbitrary cluster, then that directory won't begin with the expected . and .. entries, and ScanDisk will "fix" this by writing these entries into the first cluster of the chain. That's a bad idea, if the directory's pointer was wrong and the first data cluster was actually part of some other file!
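A quick way to judge such a case yourself is to look at the first cluster and see whether it resembles a directory at all. This Python sketch checks for the customary . and .. entries in a raw cluster; the function name is invented for the example, and a failed check says only that the pointer or the cluster is suspect, not which of the two is wrong.

```python
ATTR_DIRECTORY = 0x10


def looks_like_subdirectory(cluster_bytes):
    """Return True if the cluster starts with '.' and '..' directory entries."""
    if len(cluster_bytes) < 64:
        return False
    dot, dotdot = cluster_bytes[:32], cluster_bytes[32:64]
    return (dot[:11] == b"." + b" " * 10
            and bool(dot[11] & ATTR_DIRECTORY)
            and dotdot[:11] == b".." + b" " * 9
            and bool(dotdot[11] & ATTR_DIRECTORY))


def dir_entry(name11):
    # 11-byte name, directory attribute, remaining fields zeroed
    return name11 + bytes([ATTR_DIRECTORY]) + bytes(20)


real_dir = dir_entry(b"." + b" " * 10) + dir_entry(b".." + b" " * 9)
file_data = b"Dear Sir, further to your letter of the 5th..." + bytes(32)
print(looks_like_subdirectory(real_dir), looks_like_subdirectory(file_data))
```

If the check fails and the cluster contents look like file data, writing . and .. entries there would overwrite the first 64 bytes of someone else's file.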

Disk compression errors

If you use disk compression, then you should forget about data recovery. Corollary: If the data is not disposable, do not store it on a compressed volume! Disk compression works by storing the contents within a single hidden Compressed Volume File on the host volume, and using additional disk compression code logic to navigate this.

Compressed data will appear as gibberish at the raw sector level, so scanning the raw disk for strings from sought-after lost files won't work. Even if you use a compressed volume with compression turned off, so that the data isn't scrambled, it is much more difficult to recover because file content is not aligned with cluster boundaries - and the internal file system structure within a CVF is obscure, if documented at all.

 

(C) Chris Quirke, all rights reserved - November 2002
