My first thought for filling the disks was to estimate the image size from the file sizes alone, rounding each file up to a whole 2048-byte sector. This worked well except when there were large numbers of small files (on the order of 100k files).
total = \sum_{files}\left\lceil\frac{size}{2048}\right\rceil \times 2048
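In Perl, that first estimator looks something like this (naive_total is my own name for the helper, not from the original):

#!/usr/bin/perl -w
use strict;
use POSIX;

# Round each file up to a whole 2048-byte sector and total the bytes.
sub naive_total {
    my $total = 0;
    $total += ceil($_ / 2048) * 2048 for @_;
    return $total;
}

# A 100-byte file and a 5000-byte file: 2048 + 6144 = 8192 bytes.
print naive_total(100, 5000), "\n";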
For a while I used this estimator with a pad factor based on the number of files, which worked fairly well but sometimes needed adjustment.
Last June I did some research, found the ISO9660 filesystem specification, and wrote a better estimator for my specific case:
- All files are 8.3 format.
- There are no extended formats.
- All the files are in the root directory.
sectors = 174 + \left\lfloor\frac{count}{42}\right\rfloor + \sum_{files}\left\lceil\frac{size}{2048}\right\rceil
where 174 is the disk overhead (padding, the volume descriptors, and the path tables), count is the number of file names on the disk, and size is the size of each file in bytes; the result is in 2048-byte sectors. This can be calculated incrementally with just two variables: a running file count and a running total of data sectors. This estimator has come out exactly right every time.
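As a minimal sketch of that incremental form (the add_file and estimate helper names are mine, not part of the original script):

#!/usr/bin/perl -w
use strict;
use POSIX;

# The two running variables: file count and data sectors so far.
my ($count, $data_sectors) = (0, 0);

# Account for one more file as it is added to the disk.
sub add_file {
    my ($size) = @_;
    $count++;
    $data_sectors += ceil($size / 2048);
}

# Current image size estimate, in 2048-byte sectors.
sub estimate {
    return 174 + floor($count / 42) + $data_sectors;
}

add_file(2048) for 1 .. 1000;
print estimate(), "\n";    # 174 + 23 + 1000 = 1197 sectors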
Some day I may even use the ECMA-119 standard (which was adopted as the ISO9660 standard) to write a disk image directly, instead of populating a directory and running mkisofs to make the image, but I have more important things to do before that.
Here is the same estimator as a Perl code example, computing the size for 1000 instances of a 2048-byte file.
#!/usr/bin/perl -w
use strict;
use POSIX;

# Sum a list of numbers.
sub sum {
    my $out = 0;
    for (@_) { $out += $_; }
    return $out;
}

# 1000 files of 2048 bytes each.
my @sizes = ( 2048 ) x 1000;

my $file_count = @sizes;
# Data sectors: each file rounded up to a whole 2048-byte sector.
my $data_size = sum(map { ceil($_ / 2048) } @sizes);
# Directory sectors: 42 directory records fit in each sector.
my $dir_size = floor( $file_count / 42 ) + 1;
# Fixed overhead: padding, volume descriptors, and path tables.
my $overhead = 173;

my $size = $overhead + $dir_size + $data_size;

# $\ appends a newline to every print.
$\ = "\n";
print $size;
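For these inputs it prints 1197: 1000 data sectors, 24 directory sectors, and 173 sectors of fixed overhead, which at 2048 bytes per sector is about 2.4 MB.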