Sending Large Files Over Faulty Networks
Recently, I found myself trying to send a large file over a faulty network connection. After waiting 3+ hours for the file to transfer only to find it corrupted... twice, I went in search of a better way. What follows are some helpful tips for transferring large files over a faulty connection.
Before Transfer
To improve our chances of success we need to compress and chunkify the file before we send it. The compression reduces the time required to transfer the file, and splitting it into chunks decreases the amount of data we have to resend if something gets corrupted during transfer.
Compress And Chunkify
To compress and chunkify the original file we will use a tool called 7-zip, though any compression program that can split its output into chunks will work. Before transfer, run the original file through 7-zip to get a smaller file called an archive. When creating the archive, choose the option to split it into multiple chunks. In 7-zip the option is labeled Split to volumes, bytes:. Choose 1457664 - 3.5'' Floppy to get a good number of chunks. If you do not get at least 10 chunks, re-create the archive, this time manually typing a value smaller than 1457664 into Split to volumes, bytes:.
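The "at least 10 chunks" rule comes down to a little arithmetic, which can be sketched as below. This is only an illustration: the function names pick_volume_size and chunk_count are made up for this example, and the size you pass in is that of the archive produced by a first, unsplit or too-coarsely split attempt.

```python
FLOPPY_BYTES = 1457664  # the "3.5'' Floppy" preset from the 7-zip dialog


def pick_volume_size(archive_bytes, min_chunks=10, preset=FLOPPY_BYTES):
    """Return a volume size in bytes that yields at least min_chunks chunks.

    Hypothetical helper: archive_bytes is the total size of the compressed
    archive, which is only known after compressing once.
    """
    if archive_bytes < min_chunks:
        raise ValueError("archive is too small to split into that many chunks")
    # Any volume size no larger than archive_bytes // min_chunks
    # produces at least min_chunks volumes.
    largest_allowed = archive_bytes // min_chunks
    return min(preset, largest_allowed)


def chunk_count(archive_bytes, volume_bytes):
    """Number of volumes a split produces (ceiling division)."""
    return -(-archive_bytes // volume_bytes)
```

For a 100,000,000 byte archive the floppy preset is already small enough; for a 5,000,000 byte archive the sketch suggests typing 500000 into Split to volumes, bytes: instead.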
At this point the archive is ready to be sent but it is a good idea to run a few tests on it first.
Test The Archive
Use 7-zip to test the archive. If this test fails, you will need to re-create the archive.
Calculate Checksums
We want to capture a checksum of each chunk so that later we can compare the checksum of each original chunk with that of its copy. To capture the checksums we will use the MD5 Command Line Message Digest Utility. Once downloaded, place the .exe in the same directory as the chunks and run
md5 * > original-checksums.txt
This will calculate a checksum for every file in the directory and save the results to a file called original-checksums.txt.
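If the md5 utility is not available on one of the machines, the same step can be approximated with Python's standard hashlib module. The following is a minimal sketch, not a replacement for the tool above: the function names, the "hash, two spaces, file name" line format, and the example pattern archive.7z.* are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, block_size=1 << 20):
    """Return the hex MD5 digest of a file, read in blocks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def write_checksums(chunk_dir, pattern, out_file):
    """Write one '<md5>  <file name>' line per matching chunk and return the lines.

    pattern is a glob such as "archive.7z.*" (an assumed naming scheme);
    matching only the chunks keeps the checksum file itself out of the list.
    """
    lines = []
    for chunk in sorted(Path(chunk_dir).glob(pattern)):
        if chunk.is_file():
            lines.append(f"{md5_of_file(chunk)}  {chunk.name}")
    Path(out_file).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines
```

Calling write_checksums(".", "archive.7z.*", "original-checksums.txt") on the sending side, and the same call with a different output name on the receiving side, gives two files that can be compared line by line.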
At this point we are ready to transfer the chunks.
After Transfer
After the transfer completes, the first thing we want to do is test the archive at the destination using 7-zip. If the archive passes the test, try to extract it. If the extraction succeeds, congratulations, the file transferred correctly. If, however, the archive test or the extraction fails, we need to determine which chunks were corrupted during transfer.
Calculate Checksums
To determine which chunks were corrupted and need to be resent, we will calculate a checksum on each chunk at the destination. Again, place the MD5 Command Line Message Digest Utility .exe in the same directory as the chunks and run
md5 * > destination-checksums.txt
This will calculate a checksum for each chunk and save the results to destination-checksums.txt.
Compare The Checksums
Now we can compare the checksums in the two files (original-checksums.txt and destination-checksums.txt). We could compare the files manually, but it is easier to use a diff tool like WinMerge. We simply feed the two checksum files into WinMerge and it will highlight the lines that differ, which correspond to the corrupted chunks. Now that you know which chunks are corrupted, just resend them and repeat the After Transfer steps until no corrupted chunks remain.
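When there are many chunks, the comparison can also be scripted. The sketch below assumes each line of a checksum file holds a hash followed by whitespace and a file name; the output of the md5 utility may be laid out differently, so treat the parsing as an assumption to adjust. The function names are hypothetical.

```python
def load_checksums(path):
    """Parse '<hash> <file name>' lines into a {file name: hash} dictionary."""
    checksums = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.split(None, 1)  # split on the first run of whitespace
            if len(parts) == 2:
                digest, name = parts
                # Lower-case the hash so upper/lower-case hex output compares equal.
                checksums[name.strip()] = digest.lower()
    return checksums


def chunks_to_resend(original_file, destination_file):
    """List chunks whose destination checksum is missing or differs from the original."""
    original = load_checksums(original_file)
    destination = load_checksums(destination_file)
    return sorted(
        name for name, digest in original.items()
        if destination.get(name) != digest
    )
```

chunks_to_resend("original-checksums.txt", "destination-checksums.txt") returns the names of the chunks to send again; an empty list means every chunk arrived intact.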