O'Reilly Hacks
oreilly.comO'Reilly NetworkSafari BookshelfConferences Sign In/My Account | View Cart   
Book List Learning Lab PDFs O'Reilly Gear Newsletters Press Room Jobs  

Buy the book!
Linux Server Hacks
By Rob Flickenger
January 2003
More Info

Mincing Your Data into Arbitrary Chunks (in bash)
Use bash arithmetic and dd to chop large binary files into reasonable chunks
Listing: mince
[Discuss (1) | Link to this hack]

Listing: mince


if [ -z "$2" -o ! -r "$1" ]; then
echo "Usage: mince [file] [chunk size]"
exit 255

SIZE=`ls -l $1 | awk '{print $5}'`

if [ $2 -gt $SIZE ]; then
echo "Your chunk size must be smaller than the file size!"
exit 254

while [ $TOTAL -lt $SIZE ]; do
PASS=$((PASS + 1))
echo "Creating $1.$PASS..."
dd conv=noerror if=$1 of=$1.$PASS bs=$CHUNK skip=$((PASS - 1)) count=1 2> /dev/null

echo "Created $PASS chunks out of $1."

Note that we take advantage of conv=noerror, since the last chunk is almost guaranteed to run beyond the end of your file. Using this option makes dd blithely continue to write bits until we run out of source file, at which point it exits (but doesn't throw an error, and more importantly, doesn't refuse to write the last chunk).

This could be handy for slicing large files into floppy-sized (zip disk, cd-r, dvd, Usenet) chunks prior to archiving. As it uses dd's skip feature, it will work on any sized file, regardless of the amount of available RAM (provided that you supply a reasonable chunk size). Since the block size (bs) is set to whatever you have selected as your chunk size, it runs quite quickly, especially with chunks larger than a couple of kilobytes.

It saves your chunks as multiple files (ending in a . followed by the chunk number) in the current directory. Running ls FILENAME.* will show you your chunks in numerical order. But what if you have more than nine of them?

ls FILENAME.* | sort -n -t . +2 

How do you reassemble them?

cat `ls FILENAME.* | sort -n -t . +2` > FILENAME.complete 

Don't believe me?

diff FILENAME FILENAME.complete 

(This of course assumes that your filename has only one . in it. If you have a ridiculously.long.filename.with.multiple.dots, consult man sort(1).)

O'Reilly Home | Privacy Policy

© 2007 O'Reilly Media, Inc.
Website: | Customer Service: | Book issues:

All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.