Trim Null Bytes from a Sparse File

I was running a server that didn’t do log rotation, so when the log grew too big I had to truncate it. This left me with a sparse file with a bunch of 0s (null bytes) at the beginning.

To trim the null bytes, first find the offset of the first non-null byte. This can be done with some C:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#define BUF_SIZE 4 * 1024
int main(int argc, char *argv[]) {
char buf[BUF_SIZE];
unsigned long long count;
int n, i, fd;
if (argc != 2) {
fprintf(stderr, "Usage: %s FILE\n", argv[0]);
exit(1);
}
fd = open(argv[1], O_RDONLY);
if (fd < 0) {
fprintf(stderr, "Error opening file: %s: %s\n", argv[1], strerror(errno));
exit(1);
}
count = 0;
while ((n = read(fd, buf, BUF_SIZE)) > 0) {
for (i = 0; i < n; i++) if (buf[i] != 0) break;
count += i;
if (i != n) break;
}
if (n < 0) {
fprintf(stderr, "Error reading file: %s: %s\n", argv[1], strerror(errno));
exit(1);
}
printf("%ld\n", count);
return 0;
}

Find the offset of the first non-null byte:

1
2
3
gcc -o nulllength nulllength.c
./nulllength log.bin
7981307584

Trim the first 7981307584 bytes from the file:

1
dd if=log.bin of=trimmed.bin bs=7981307584 skip=1

Specifying the bs as offset and skipping 1 is faster than doing it the other way, because it would copy 1 byte at a time – slow!