Changes 2/24/14
NSF:
- Switched to Google for downloading the zip files (I downloaded the same file from both sources and got 0.26 MB/s from reedtech versus 1.33 MB/s from Google)
- Since Google doesn't put the links in tables, I set the parser to find all links and keep only the ones whose file names start with 'ipa' or 'pa'.
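A minimal sketch of that filtering step, using only the standard-library HTML parser (the actual scraper and the sample page markup here are assumptions, not the real code):

```python
from html.parser import HTMLParser

class ZipLinkParser(HTMLParser):
    """Collect hrefs whose file name starts with 'ipa' or 'pa'."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href', '')
        name = href.rsplit('/', 1)[-1]  # file name portion of the URL
        if name.startswith(('ipa', 'pa')):
            self.links.append(href)

# hypothetical snippet standing in for the Google bulk-download page
page = '<a href="/bulk/ipa140220.zip">grant</a><a href="/bulk/readme.txt">readme</a>'
parser = ZipLinkParser()
parser.feed(page)
print(parser.links)  # only the 'ipa'/'pa' link survives
```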
- The program now removes the URL of any file already in the temp folder (after verifying that it is a properly written zip file), so it no longer wastes resources parsing files it has already handled.
- I will have to do a bit more testing to make sure it is actually working correctly. I set it to print out every skipped file, and it only reports skipping every other file, even though it appears to skip all of them correctly.
- I realized that I am iterating over a list while removing elements from it. In Java this throws a ConcurrentModificationException, but Python technically allows it and silently skips elements (which would also explain the every-other-file logging above). I will fix this tomorrow: during the scan, collect every element found in both lists into a new list, and once the scan is complete, remove that list's elements from the urls list.
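The bug and the planned fix can be demonstrated with a toy list (the names here are placeholders, not the real data):

```python
# Buggy: removing from a list while iterating over it. After each
# remove() the later elements shift left, but the loop index still
# advances, so every other element is skipped -- no exception raised.
done = {'a.zip', 'b.zip', 'c.zip', 'd.zip'}
urls = ['a.zip', 'b.zip', 'c.zip', 'd.zip']
for url in urls:
    if url in done:
        urls.remove(url)
print(urls)  # ['b.zip', 'd.zip'] -- half the matches survive

# Fix: collect the matches during the scan, then remove them afterwards.
urls = ['a.zip', 'b.zip', 'c.zip', 'd.zip']
to_remove = [url for url in urls if url in done]
for url in to_remove:
    urls.remove(url)
print(urls)  # [] -- all matches removed
```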