Changes 3/18/14
NSF:
- Ran scrape on 2013 application, began enhancing compatibility
- Fixed bug: I hadn't accounted for <us-patent-application> having attributes, which caused the string find to miss the start tag.
- Fixed bug: split_xml was splitting at the end tags, but it is supposed to look only at the data between the start and end tags. For example, it was looking at
<random-tag>
<us-patent-application>
<data-we-want>
</us-patent-application>, when the <random-tag> should have been ignored.
- Added dump_xml() method to patutil, which takes an XML document string and a filename and writes the string to that file. This is useful for debugging specific patent applications, and it is how I found the two bugs above.
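
For reference, here is a rough sketch of how the two fixes and the new helper fit together. It is not the actual patutil code: the regex, the function bodies, and the choice of Python are assumptions made for illustration, and only the names split_xml and dump_xml come from the notes above.

```python
import re

# Assumed pattern: matches the start tag with or without attributes,
# e.g. <us-patent-application> or <us-patent-application lang="EN">.
START_TAG = re.compile(r"<us-patent-application(?:\s[^>]*)?>")
END_TAG = "</us-patent-application>"


def split_xml(text):
    """Return each <us-patent-application>...</us-patent-application> block.

    Anything outside a start/end pair (such as a stray <random-tag>)
    is skipped instead of being attached to the next document.
    """
    docs = []
    pos = 0
    while True:
        start = START_TAG.search(text, pos)
        if start is None:
            break
        end = text.find(END_TAG, start.end())
        if end == -1:
            break  # unterminated document: ignore the tail
        end += len(END_TAG)
        docs.append(text[start.start():end])
        pos = end
    return docs


def dump_xml(xml_string, filename):
    """Write one XML document string to filename for inspection."""
    with open(filename, "w", encoding="utf-8") as f:
        f.write(xml_string)
```

With this sketch, running split_xml on the example above returns a single document that starts at <us-patent-application>, and dump_xml(docs[0], "debug.xml") writes it out so one application can be inspected on its own.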