Sunday, May 16, 2004 1:55 AM bart

Sparse files

Math fans will know sparse matrices (that is, matrices that contain a lot of zeros). Files can be sparse as well if they contain long runs of zeros (for example, a region of multiple MBs that contains nothing but zeros). NTFS supports sparse files and lets you store them without allocating disk space for the zero regions. In fact, I'm in my fsutil investigation period, so this is just another feature of the fsutil tool.

Let's create a file with a large run of zeros (of course we write a program to do this):

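using System.IO;

class Sparse
{
 public static void Main(string[] args)
 {
  string file = args[0];

  // Dispose the writer (and the underlying stream) so that all data is flushed to disk.
  using (FileStream fs = new FileStream(file, FileMode.CreateNew))
  using (BinaryWriter bw = new BinaryWriter(fs))
  {
   // One non-zero byte, 1 MB - 2 zero bytes, one non-zero byte: exactly 1 MB in total.
   bw.Write((byte) 1);
   bw.Write(new byte[1024*1024 - 2]);
   bw.Write((byte) 1);
  }
 }
}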

Compile it and create the file by running sparse.exe test.sparse.

If you take a look in Windows Explorer right now, you'll find a file of 1.00 MB that also takes up 1.00 MB on the hard disk. Now we can mark the file as being sparse:

fsutil sparse setflag test.sparse

The next thing to do is mark the sparse region by passing its starting offset and its length in bytes:

fsutil sparse setrange test.sparse 1 1048575
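The two fsutil commands above can also be issued from code. What follows is a minimal sketch, not a description of how fsutil itself is implemented: it assumes that the Win32 control codes FSCTL_SET_SPARSE and FSCTL_SET_ZERO_DATA (sent through DeviceIoControl) are the programmatic counterparts of setflag and setrange. The class name MarkSparse, the struct name FileZeroDataInformation and the argument order are made up for illustration, and the code only works on Windows on a volume that supports sparse files.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

// Hypothetical helper: marksparse.exe <file> <offset> <length>
class MarkSparse
{
    // Control codes as defined in the Windows SDK header winioctl.h.
    const uint FSCTL_SET_SPARSE = 0x000900C4;
    const uint FSCTL_SET_ZERO_DATA = 0x000980C8;

    // Managed mirror of the Win32 FILE_ZERO_DATA_INFORMATION structure.
    [StructLayout(LayoutKind.Sequential)]
    struct FileZeroDataInformation
    {
        public long FileOffset;       // first byte of the range to zero
        public long BeyondFinalZero;  // first byte after the range
    }

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool DeviceIoControl(SafeFileHandle hDevice, uint dwIoControlCode,
        IntPtr lpInBuffer, uint nInBufferSize, IntPtr lpOutBuffer, uint nOutBufferSize,
        out uint lpBytesReturned, IntPtr lpOverlapped);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool DeviceIoControl(SafeFileHandle hDevice, uint dwIoControlCode,
        ref FileZeroDataInformation lpInBuffer, uint nInBufferSize, IntPtr lpOutBuffer,
        uint nOutBufferSize, out uint lpBytesReturned, IntPtr lpOverlapped);

    public static void Main(string[] args)
    {
        string file = args[0];
        long offset = long.Parse(args[1]);
        long length = long.Parse(args[2]);

        using (FileStream fs = new FileStream(file, FileMode.Open, FileAccess.ReadWrite))
        {
            uint returned;

            // Assumed counterpart of "fsutil sparse setflag".
            if (!DeviceIoControl(fs.SafeFileHandle, FSCTL_SET_SPARSE,
                    IntPtr.Zero, 0, IntPtr.Zero, 0, out returned, IntPtr.Zero))
                throw new Win32Exception(Marshal.GetLastWin32Error());

            // Assumed counterpart of "fsutil sparse setrange": the API takes an
            // end offset instead of a length, so convert.
            FileZeroDataInformation range = new FileZeroDataInformation();
            range.FileOffset = offset;
            range.BeyondFinalZero = offset + length;
            if (!DeviceIoControl(fs.SafeFileHandle, FSCTL_SET_ZERO_DATA,
                    ref range, (uint) Marshal.SizeOf(typeof(FileZeroDataInformation)),
                    IntPtr.Zero, 0, out returned, IntPtr.Zero))
                throw new Win32Exception(Marshal.GetLastWin32Error());
        }
    }
}
```

Keep in mind that FSCTL_SET_ZERO_DATA writes zeros over the whole range it is given, so any non-zero byte inside the range is lost.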

Now Windows Explorer tells us that only 64 KB is allocated on the disk to store the file (the non-zero data plus the data that records where the sparse region lives). A hex editor on the disk can be quite useful if you want to see how NTFS stores a sparse file and how it marks a file as sparse.
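If you'd rather check the allocation from code than from Windows Explorer, the Win32 function GetCompressedFileSize reports the number of bytes actually used on disk for compressed and sparse files. The sketch below is only an illustration: the class name AllocatedSize is made up, it runs on Windows only, and the number it prints may not match Explorer's "Size on disk" exactly, since Explorer does its own rounding.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical helper: allocatedsize.exe <file>
class AllocatedSize
{
    const uint INVALID_FILE_SIZE = 0xFFFFFFFF;

    // Returns the low 32 bits of the on-disk size; the high 32 bits
    // come back through the out parameter.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern uint GetCompressedFileSizeW(string lpFileName, out uint lpFileSizeHigh);

    public static void Main(string[] args)
    {
        string file = args[0];

        uint high;
        uint low = GetCompressedFileSizeW(file, out high);
        if (low == INVALID_FILE_SIZE)
        {
            // 0xFFFFFFFF can also be a legitimate low half, so it is only
            // an error when the last error code is non-zero.
            int error = Marshal.GetLastWin32Error();
            if (error != 0)
                throw new Win32Exception(error);
        }

        long onDisk = ((long) high << 32) | low;
        long logical = new FileInfo(file).Length;

        Console.WriteLine("Logical size: {0} bytes", logical);
        Console.WriteLine("Size on disk: {0} bytes", onDisk);
    }
}
```

Running it on test.sparse before and after the two fsutil commands should show the logical size staying the same while the size on disk drops.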

Comments

# re: Sparse files

Thursday, September 23, 2004 10:59 PM by bart

Do you know if there is a way to somehow scan a non-sparse file for zero data, set the file sparse, and then flag those ranges in the sparse file? In other words, would it be possible to "convert" a non-sparse file to a sparse one automatically?
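The scanning half of that idea can be sketched in a few lines. This is a rough illustration rather than a tested conversion tool: the class name ZeroRuns, the method Find and the 64 KB minimum run length are all assumptions made for the example, and the program only prints fsutil sparse setrange commands instead of changing the file. The file would still have to be flagged with fsutil sparse setflag first.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: zeroruns.exe <file>
class ZeroRuns
{
    // Returns { offset, length } pairs for every run of at least
    // minLength consecutive zero bytes in the stream.
    public static List<long[]> Find(Stream input, long minLength)
    {
        List<long[]> runs = new List<long[]>();
        byte[] buffer = new byte[64 * 1024];
        long position = 0;   // offset of the byte being examined
        long runStart = -1;  // start of the current zero run, or -1 if none
        int read;

        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++, position++)
            {
                if (buffer[i] == 0)
                {
                    if (runStart < 0)
                        runStart = position;
                }
                else if (runStart >= 0)
                {
                    // A non-zero byte ends the current run.
                    if (position - runStart >= minLength)
                        runs.Add(new long[] { runStart, position - runStart });
                    runStart = -1;
                }
            }
        }

        // A run may extend to the end of the file.
        if (runStart >= 0 && position - runStart >= minLength)
            runs.Add(new long[] { runStart, position - runStart });

        return runs;
    }

    public static void Main(string[] args)
    {
        string file = args[0];
        using (FileStream fs = File.OpenRead(file))
        {
            // Assumed threshold: runs shorter than 64 KB are ignored.
            foreach (long[] run in Find(fs, 64 * 1024))
                Console.WriteLine("fsutil sparse setrange {0} {1} {2}", file, run[0], run[1]);
        }
    }
}
```

Because the runs contain only zeros to begin with, marking them as sparse ranges should not change what a reader of the file sees.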