NSF:
- Added a method to parse year, month, and day out of the filename
- Learned more regex
- Set the program to download and scrape everything from 2001, but it didn't finish
- So far, I've looked through 32 full XML files and found no NSF documents. I have tested the program against sample files that mention NSF, and it catches them, but it is still kind of discouraging not to get any results.
- Also, I need to find copies of the DTD from various years to see how the tags changed
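
A minimal sketch of the filename parsing mentioned in the first bullet, not the actual method from this project. The post does not give the real filename layout, so the sketch assumes a hypothetical one: some prefix followed by an 8-digit YYYYMMDD stamp, as in `bulk_20010102.xml`. The names `DATE_IN_NAME` and `parse_date_from_filename` are made up for illustration.

```python
import re
from datetime import date

# ASSUMPTION: the filename contains an 8-digit YYYYMMDD stamp that is not
# part of a longer run of digits, e.g. "bulk_20010102.xml". The real files
# may use a different layout, in which case the pattern needs adjusting.
DATE_IN_NAME = re.compile(
    r"(?<!\d)(?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2})(?!\d)"
)


def parse_date_from_filename(filename):
    """Return a datetime.date taken from the filename, or None if none is found."""
    match = DATE_IN_NAME.search(filename)
    if match is None:
        return None
    try:
        # date() raises ValueError for impossible values such as month 13.
        return date(int(match["year"]), int(match["month"]), int(match["day"]))
    except ValueError:
        return None
```

Named groups keep the year, month, and day readable at the call site, and the lookarounds stop the pattern from matching inside a longer number. Returning `None` instead of raising lets a long download-and-scrape run skip an oddly named file rather than stop partway through.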
Which resources are you using to learn regex? There are so many out there, and it has been several years since I've looked at any. Please recommend any you find especially helpful.
Again, please check in with Michelle regarding your last two bullets.