Repair Corrupted Downloads Without Downloading Entire File Again
■ Problem
■ Hashing
■ Solution
■ Process
■ Analysis
Downloading a big file over a slow connection is like watching a snail run a race. Okay, Turbo is an exception. But the worst part is getting a corrupted file after the download completes successfully. Goddamn it! Now I have to do all of this again!! I have felt that frustration many times, because I had a wonderful 64 kbps 2G internet connection before I moved to broadband. So I found a solution to the problem that worked great and saved many valuable hours of my life, which I could then spend sleeping, eating and experimenting. Here is how I did it.
Problem
When you download a file, there is a small chance of getting a corrupted file. It might be because of the network, the hardware or even buggy software that you used to download the file. In simple words, a corrupted file has some of its bits altered in some places. But we have no idea where those places are or what the correct data is.
Hashing
In simple words, hashing converts some data to a small value using some maths. Different data will generally give a different hash value. For example:
My data is 12345. I am hashing it simply by adding the digits. So the hash is 1+2+3+4+5 = 15.
If the data is 12340, the hash is 1+2+3+4+0 = 10.
So when the 5 became 0, which can be considered data corruption, the hash value changed. I hope you got the point.
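The toy example above can be written as a few lines of Python. This is only a sketch of the digit-sum idea; the function name digit_sum_hash is made up for illustration and is not a real hashing scheme.

```python
def digit_sum_hash(data: str) -> int:
    """Toy hash: add up the decimal digits of the input string."""
    return sum(int(ch) for ch in data)


original = digit_sum_hash("12345")   # 1+2+3+4+5
corrupted = digit_sum_hash("12340")  # 1+2+3+4+0

# The hashes differ, so the corruption is detected.
print(original, corrupted, original != corrupted)

# Weakness of the toy hash: swapping two digits goes unnoticed.
print(digit_sum_hash("12354") == original)
```

The last line shows why real tools use something like SHA instead: with a simple sum, 12354 gives the same value as 12345, so that kind of change would slip through.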
Solution
To find out where the error occurred, we can split the file into pieces, both the original file and the downloaded file. The piece size can be 512 KB or 256 KB, it doesn't matter. For each piece we calculate a hash value using a standard method such as SHA, a widely used family of hash functions. Then we compare the hash values to check whether each original piece and the corresponding received piece are the same. If the hash values of a piece differ, we download only that 512 KB piece again. Only that particular piece!
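Here is a minimal sketch of that split-and-compare idea in Python. It assumes SHA-1 as the hash and works on in-memory bytes for simplicity; the function names piece_hashes and corrupted_pieces are hypothetical, and a real tool would read the file from disk piece by piece.

```python
import hashlib

PIECE_SIZE = 512 * 1024  # 512 KB, the example piece size from the text


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> list:
    """Split data into fixed-size pieces and return the SHA-1 hex digest of each."""
    return [
        hashlib.sha1(data[start:start + piece_size]).hexdigest()
        for start in range(0, len(data), piece_size)
    ]


def corrupted_pieces(original_hashes: list, downloaded_hashes: list) -> list:
    """Return the indexes of pieces whose hashes differ.

    Assumes both lists describe files of the same length.
    """
    return [
        index
        for index, (good, got) in enumerate(zip(original_hashes, downloaded_hashes))
        if good != got
    ]


# Small demo with a tiny piece size so it runs instantly.
good_file = bytes(range(256)) * 8          # 2048 bytes -> 4 pieces of 512 bytes
bad_file = bytearray(good_file)
bad_file[600] ^= 0xFF                      # flip the bits of one byte in piece 1

print(corrupted_pieces(piece_hashes(good_file, 512),
                       piece_hashes(bytes(bad_file), 512)))
```

Only the pieces listed in the result need to be fetched again, which is the whole trick.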
We already have an amazing technology that does almost the same procedure: torrents! Yes, it does the splitting, the hashing and all that stuff. "A file downloaded from a torrent will be corrupted if and only if the original file on the server is corrupted". So what we have to do now is convert the download link into a torrent, then download it with a torrent client while feeding the corrupted file to it.
Process
In my case I was downloading Raspbian OS for the Raspberry Pi board. It was a ZIP file of 1.32 GB.
But when I tried to extract it I got this:
What a beauty! I screamed when I saw it.
Let's solve it. First get your download link. It must be a direct download link like this one:
http://downloads.raspberrypi.org/raspbian/images/raspbian-2015-11-24/2015-11-21-raspbian-jessie.zip
And it should not be https://, because torrents don't support it. So you must replace https:// with http:// in your link.
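If you would rather not edit the link by hand, the swap can be sketched in Python with the standard urllib.parse module. The helper name force_http is made up for this example, and it only helps if the server also serves the file over plain http.

```python
from urllib.parse import urlsplit, urlunsplit


def force_http(url: str) -> str:
    """Return the same URL with an https scheme replaced by http."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        parts = parts._replace(scheme="http")
    return urlunsplit(parts)


print(force_http(
    "https://downloads.raspberrypi.org/raspbian/images/"
    "raspbian-2015-11-24/2015-11-21-raspbian-jessie.zip"
))
```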
Now go to http://burnbit.com/. This website will convert our link to a torrent. It is called burning! But no fire, completely eco-friendly. Put your link in the text box and press Burn.
Now it is burning. Sit back and relax for 10 minutes. You can dream about things like what to do in the extra time or what to do with the file.
After it is finished you will get a button to Download Torrent file. Click on it.
Now open the torrent file in your favorite torrent client. I'm using uTorrent. In it, set the download directory to the one where you have kept your corrupted file. I saved my corrupted file in the E:\Raspbian folder, so I chose that. Also keep in mind that the filename of the corrupted file must be the same as the one in the torrent, as you see in the above image.
Click OK. It will now retrieve the torrent information such as the number of pieces, the size of a piece, their hash values etc. Then it will check those hash values against the downloaded file. That's what is happening now.
After that it will start downloading. You can see a small gap in the downloaded and availability graphs. That is the part which is corrupted and needs to be downloaded again. Below you can see that the file has 2720 pieces of 512 KB each, but the already downloaded file has only 2718 valid pieces. So 2 pieces are corrupted and a total of 1 MB has to be downloaded again.
And it took just 16 seconds, with only 1 MB downloaded. Funny!
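The piece numbers are easy to double-check. A quick sketch in Python, using only the figures shown by the torrent client above (the variable names are just for illustration):

```python
PIECE_SIZE_KB = 512   # size of one piece, as reported by the torrent client
TOTAL_PIECES = 2720   # pieces in the torrent
VALID_PIECES = 2718   # pieces of my file that passed the hash check

corrupted = TOTAL_PIECES - VALID_PIECES
redownload_kb = corrupted * PIECE_SIZE_KB

# Upper bound on the file size: the last piece may be smaller than 512 KB.
total_mb = TOTAL_PIECES * PIECE_SIZE_KB / 1024

print(corrupted, "pieces to fetch,", redownload_kb / 1024, "MB")
print("file size is at most", total_mb, "MB")
```

The total comes out at 1360 MB, which matches the 1.32 GB ZIP file.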
Then it will start seeding. You can stop it. After that I tried to extract the same file, and now it works.
Analysis
If I'm planning to download it again:
Time required: 10 mins (to scream) + 6 hours 24 mins (to download) = 6 hours 34 mins
Data required: 1.32 GB
If I'm using this method:
Time required: 10 mins (to burn) + 16 seconds (to download) = 10 mins 16 seconds
Data required: 1 MB
So I got an extra 1.319 GB to watch movies and 6 hours 23 mins and 44 seconds to do my stuff, without wasting electricity!
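For anyone who wants to check the time saved, here is the subtraction done with Python's datetime.timedelta, using the two totals listed above:

```python
from datetime import timedelta

# Downloading everything again: 10 mins to scream + 6 hours 24 mins to download.
redownload = timedelta(minutes=10) + timedelta(hours=6, minutes=24)

# Repairing with the torrent: 10 mins to burn + 16 seconds to download.
repair = timedelta(minutes=10) + timedelta(seconds=16)

saved = redownload - repair
print(saved)  # 6:23:44
```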
You can bookmark this page. And when you meet a similar situation, come back and do it. I hope this can save you time and money.
If you have any doubts, comment below and I'll help you. Thanks for reading!