Understanding FAT

FAT stands for File Allocation Table, but is also used as a name for the family of file systems that have been used in DOS and Windows 9x and supported by NT and various *NIX. An understanding of this file system is required for data recovery and to understand ScanDisk's reporting; fortunately, this is fairly easy to do. Armed with this understanding, you can troubleshoot file system problems, even those involving the FAT itself.

Terminology

The terminology can be quite confusing, so I'll clarify it at the start:

FAT: File Allocation Table, a data structure present in all FAT volumes
FAT1: The first copy of the FAT
FAT2: The second copy of the FAT
FAT12: FAT file system using 12-bit cluster addressing
FAT16: FAT file system using 16-bit cluster addressing
FAT32: FAT file system using 32-bit cluster addressing (of which 28 bits are used); Win95 SR2 and later
FAT or FATxx: File systems that use File Allocation Tables, etc.
VFAT: The 32-bit code used to operate the file system in Win9x GUI mode
Cluster: Single unit of data storage at the FATxx file system logic level
Sector: Single unit of storage at the physical disk level
Physical sector address: Sector address in absolute physical hardware terms
CHS sector address: As above, expressed in Cylinder, Head, Sector terms
Logical sector address: Sector address relative to the FATxx volume
Folder: A collection of named items as seen via Windows Explorer
File Folder: Modern Windows-speak for "directory"
Directory: A file system data structure that lists files and/or directories
Directory entry: Points to a file or directory, and contains info about it
Attributes: A collection of bits in a directory entry that describes it

See also partition vs. volume terminology.

File System Structure

The FATxx volume is divided into four areas:

The boot record is the first sector of a FAT12 or FAT16 volume, and the first 3 sectors of a FAT32 volume. It defines the volume, as well as the whereabouts of the other three areas. If the volume is bootable, then the first sector of the boot record also contains the code required to enter the file system and boot the OS.

The File Allocation Table is a series of addresses that is accessed as a lookup table to see which cluster comes next, when loading a file or traversing a directory. For example, if the system had just loaded cluster 23, it would look up offset 23 in the FAT and the address there would be that of the next cluster; typically 24. Because the FAT is such a vital data structure, there are typically two copies (i.e. FAT1 and FAT2) so that corruption of the FAT can be detected and hopefully intelligently repaired.

The root directory is fixed in length and always located at the start of the volume (immediately after the FATs) in FAT12 and FAT16 volumes, but FAT32 treats the root directory as just another cluster chain in the data area. However, even in FAT32 volumes, the root directory will typically follow immediately after the two FATs.

The data area fills the rest of the volume, and is divided into clusters; it is here that the file data is stored. Subdirectories are files with a particular structure that is understood by the file system, and are marked as being directories rather than files by setting the "directory" attribute bit in the directory entries that point to them.
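As a rough illustration of how these areas follow one another on a FAT12 or FAT16 volume, the sketch below works out the logical sector at which each one starts. The function and parameter names are invented for this example; on a real volume the figures would be read from the boot record. It also assumes the usual convention that the first cluster of the data area is numbered 2.

```python
def fat16_layout(reserved_sectors, num_fats, sectors_per_fat,
                 root_entries, bytes_per_sector=512):
    """Return the first logical sector of the FAT, root directory and data area.

    Parameter names are illustrative; the real values come from the boot record.
    Applies to FAT12/FAT16 only, where the root directory is fixed in length.
    """
    fat_start = reserved_sectors                  # FAT1 follows the boot record
    root_start = fat_start + num_fats * sectors_per_fat
    # Each directory entry is 32 bytes; round the root up to whole sectors.
    root_sectors = (root_entries * 32 + bytes_per_sector - 1) // bytes_per_sector
    data_start = root_start + root_sectors
    return {"fat": fat_start, "root": root_start, "data": data_start}


def cluster_to_sector(cluster, data_start, sectors_per_cluster):
    """First logical sector of a data cluster (assumes clusters start at 2)."""
    return data_start + (cluster - 2) * sectors_per_cluster


# Example figures for a 1.44 MB diskette: 1 boot sector, 2 FATs of 9 sectors,
# 224 root directory entries, 1 sector per cluster.
layout = fat16_layout(1, 2, 9, 224)
print(layout)
print(cluster_to_sector(2, layout["data"], 1))
```

The same arithmetic does not carry over to FAT32 unchanged, because there the root directory is a cluster chain inside the data area rather than a fixed-length area of its own.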

Directory entry structure

Each directory entry is 32 bytes long, so that a 512 byte sector can contain 16 directory entries. There are four types of directory entry...

File pointer
Subdirectory pointer
Volume label
Long File Name component

...and these are distinguished by the attribute bits. If both the "directory" and "volume label" bits are reset, the entry is a file pointer; if the "directory" bit is set, it is a subdirectory pointer; if the "volume label" bit is set, it is a volume label (one or none of which is present in the root directory); and if the "volume label" bit is set together with the "read-only", "hidden" and "system" bits (attribute value 0Fh), it is a Long File Name component. Long File Name and volume label entries do not point to data clusters and have a "file length" of zero bytes.

Pointer entries contain a name in 8.3 format, the address of the first data cluster (if data is present), the length of the data, the attribute bits, and several other information fields such as time and date stamps. Long File Name entries are associated with the "real" pointer entry by preceding it in the directory, and contain name character data only.
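To make the pointer entry more concrete, here is a minimal sketch that decodes the fields mentioned above from a raw 32-byte entry. The function name and the dictionary keys are my own; the byte offsets are the standard ones (name at 0, attributes at 11, first cluster at 26, length at 28). It assumes the entry really is a file or subdirectory pointer, and it ignores the time and date fields.

```python
import struct

ATTR_DIRECTORY = 0x10  # the "directory" attribute bit


def parse_pointer_entry(raw):
    """Decode a 32-byte file or subdirectory pointer entry into a dict."""
    if len(raw) != 32:
        raise ValueError("a directory entry is exactly 32 bytes")
    base = raw[0:8].decode("ascii", errors="replace").rstrip()
    ext = raw[8:11].decode("ascii", errors="replace").rstrip()
    attributes = raw[11]
    # High word of the first cluster; only meaningful on FAT32 and
    # expected to be zero on FAT12/FAT16 volumes.
    cluster_high = struct.unpack_from("<H", raw, 20)[0]
    # Low word of the first cluster, then the 32-bit data length.
    cluster_low, length = struct.unpack_from("<HI", raw, 26)
    return {
        "name": base + "." + ext if ext else base,
        "attributes": attributes,
        "is_directory": bool(attributes & ATTR_DIRECTORY),
        "first_cluster": (cluster_high << 16) | cluster_low,
        "length": length,
    }


# A hand-built entry for a 1000-byte file README.TXT starting at cluster 23.
sample = (b"README  " + b"TXT" + bytes([0x20]) + bytes(8)
          + struct.pack("<H", 0) + bytes(4) + struct.pack("<HI", 23, 1000))
print(parse_pointer_entry(sample))
```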

For the purposes of this page, it is the pointer entries we are most interested in - and the address of the first data cluster in particular.

File structure

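A file has at least one and usually three components: the directory entry that points to it, the chain of FAT entries that links its data clusters, and the data clusters themselves.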

If a file has zero length, then there are no data clusters or FAT entries associated with it.

FAT structure

Each copy of the FAT starts with some marker bits, and then consists of cluster addresses. To look up the "next cluster", you read the FAT at an offset based on the cluster you are currently on; the address held at that offset is that of the next cluster.
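How the offset is worked out depends on the width of the FAT entries. The sketch below shows one way to read a single entry from a copy of the FAT held in memory as raw bytes; the function name is hypothetical. It assumes the usual little-endian layout, with FAT12 packing two 12-bit entries into every three bytes and FAT32 using only the low 28 bits of each entry.

```python
import struct


def read_fat_entry(fat, cluster, bits):
    """Return the FAT entry for `cluster` from raw FAT bytes (bits = 12, 16 or 32)."""
    if bits == 16:
        return struct.unpack_from("<H", fat, cluster * 2)[0]
    if bits == 32:
        # The top four bits of a FAT32 entry are not part of the address.
        return struct.unpack_from("<I", fat, cluster * 4)[0] & 0x0FFFFFFF
    if bits == 12:
        offset = cluster + cluster // 2           # 1.5 bytes per entry
        pair = struct.unpack_from("<H", fat, offset)[0]
        # Odd clusters use the high 12 bits of the pair, even ones the low 12.
        return pair >> 4 if cluster & 1 else pair & 0x0FFF
    raise ValueError("bits must be 12, 16 or 32")


# A tiny FAT12 fragment: entries 0 and 1 are the markers, 2 -> 3, 3 -> 4.
fat12 = bytes([0xF0, 0xFF, 0xFF, 0x03, 0x40, 0x00])
print(read_fat_entry(fat12, 2, 12), read_fat_entry(fat12, 3, 12))
```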

There are some "special" values; a zero means the cluster is not in use, and two particular high values are used to signify the last cluster of a chain and a "bad" cluster that should not be used, respectively.
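Following a chain then amounts to repeating that lookup until an end-of-chain value turns up. Here is a minimal sketch for a FAT16 volume, with the FAT already decoded into a list of integers. The constant and function names are my own, and the values used (0000h for free, FFF7h for bad, FFF8h and above for end of chain) are the FAT16 ones; FAT12 and FAT32 use correspondingly narrower or wider values.

```python
FREE = 0x0000      # cluster not in use
BAD = 0xFFF7       # "bad" cluster marker (FAT16)
EOC_MIN = 0xFFF8   # FFF8h and above mark the last cluster of a chain (FAT16)


def follow_chain(fat, first_cluster):
    """Return the list of clusters in a chain, starting at first_cluster."""
    chain = []
    seen = set()
    cluster = first_cluster
    while True:
        if cluster in seen:
            raise ValueError("chain loops back to cluster %d" % cluster)
        if cluster < 2 or cluster >= len(fat):
            raise ValueError("cluster %d is outside the FAT" % cluster)
        seen.add(cluster)
        chain.append(cluster)
        nxt = fat[cluster]                # look up the "next cluster"
        if nxt >= EOC_MIN:
            return chain                  # this was the last cluster
        if nxt == FREE or nxt == BAD:
            raise ValueError("chain runs into a free or bad cluster")
        cluster = nxt


# A file occupying clusters 23, 24 and 25.
fat = [0] * 32
fat[0], fat[1] = 0xFFF8, 0xFFFF
fat[23], fat[24], fat[25] = 24, 25, 0xFFFF
print(follow_chain(fat, 23))
```

The loop and range checks are there because a damaged FAT can send a chain anywhere; without them a corrupted entry could make the walk run forever.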

All other values in the FAT point to particular data clusters, and as no two files or directories should ever use the same cluster, no non-"special" value should occur more than once in each FAT.
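This rule gives a simple test for cross-linked chains: count how often each cluster address appears in the FAT, and report any that appear more than once. The sketch below does this for a FAT decoded into a list of integers; the function name is hypothetical, and only values that point at a real entry in the list are counted, so the "special" values are skipped automatically.

```python
from collections import Counter


def find_cross_links(fat):
    """Return the clusters that more than one FAT entry points to."""
    # Entries 0 and 1 are the markers at the start of the FAT, so skip them.
    targets = Counter(value for value in fat[2:] if 2 <= value < len(fat))
    return sorted(cluster for cluster, count in targets.items() if count > 1)


# Clusters 3 and 5 both claim cluster 4 as their "next cluster".
fat = [0xFFF8, 0xFFFF, 3, 4, 0xFFFF, 4, 0]
print(find_cross_links(fat))
```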

Directory structure

A directory contains directory entries, and all but the root directories of FAT12 and FAT16 volumes are merely files that contain directory entries and have the "directory" attribute bit set in their own directory entry.

All directories other than the root start off with . and .. entries that point to themselves and their parent directories respectively. For this reason, even an "empty" subdirectory uses one data cluster, to hold the . and .. pointers. It's important to remember that . and .. are actual entries, rather than conceptual entities; a corrupted . or .. pointer can point to anything! If .. points to zero, then that directory's parent is the root directory.
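Because . and .. are real entries, they can be checked like any other. The following sketch looks at the first two 32-byte entries of a subdirectory's data and compares them with where the directory actually lives. The function name and its arguments are made up for this example, and it reads only the 16-bit cluster field at offset 26, so it suits FAT12 and FAT16 volumes as written.

```python
import struct


def check_dot_entries(dir_data, own_cluster, parent_cluster):
    """Return a list of problems found in a subdirectory's . and .. entries.

    parent_cluster should be 0 when the parent is the root directory.
    """
    if len(dir_data) < 64:
        return ["directory data is too short to hold . and .."]
    problems = []
    expected = [(b".", own_cluster), (b"..", parent_cluster)]
    for index, (label, target) in enumerate(expected):
        raw = dir_data[index * 32:(index + 1) * 32]
        name = raw[0:11].rstrip(b" ")             # 8.3 name, space padded
        cluster = struct.unpack_from("<H", raw, 26)[0]
        if name != label:
            problems.append("entry %d is not %s" % (index, label.decode()))
        elif cluster != target:
            problems.append("%s points to cluster %d, expected %d"
                            % (label.decode(), cluster, target))
    return problems
```

An empty list means both entries look sane; anything else is a hint that the directory, or whatever is occupying its cluster, deserves a closer look before any repair is attempted.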

 

(C) Chris Quirke, all rights reserved - November 2002
