Type: scraper | Language: ruby | Status: Public
134 lines of code. 2,301,697 rows of data.
Created 2 years, 10 months ago.
Scrapes the commodities listed under “Browse by Commodity” on the USDA Market News site: http://www.marketnews.usda.gov/portal/fv. Each day, the database stores category, commodity, lowPriceMin, highPriceMax, cityName, originName, itemSize, date and a pointer to the XML source. The scraper currently parses the XML source into the data store, but this is too slow and the connection frequently times out. I'd like to change it to parse the HTML table on the commodity page instead. (Feel free to do so!) Timeout ...
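As a rough starting point for the HTML-table approach, here is a minimal sketch of turning table rows into records with the field names listed above. The sample markup, the column order, the `'fruits'` category value and the helper names `table_rows` and `to_record` are all assumptions made for illustration; the real commodity page may be laid out differently. It uses plain regular expressions only to stay dependency-free, and an HTML parser such as Nokogiri would likely be more robust in the actual scraper.

```ruby
require 'cgi'

# Field names come from the description above; the column ORDER is an assumption.
FIELDS = %i[commodity cityName originName itemSize lowPriceMin highPriceMax date].freeze

# Hypothetical markup standing in for the commodity page's table (not real data).
SAMPLE_HTML = <<~HTML
  <table>
    <tr><th>Commodity</th><th>City</th><th>Origin</th><th>Size</th><th>Low</th><th>High</th><th>Date</th></tr>
    <tr><td>APPLES</td><td>CHICAGO</td><td>WASHINGTON</td><td>88s</td><td>24.00</td><td>26.50</td><td>01/15/2013</td></tr>
    <tr><td>APPLES</td><td>BOSTON</td><td>NEW YORK</td><td>100s</td><td>22.00</td><td>25.00</td><td>01/15/2013</td></tr>
  </table>
HTML

# Return the cell text of every data row. Header rows use <th> cells,
# so they yield no <td> matches and are dropped by the final reject.
def table_rows(html)
  html.scan(%r{<tr[^>]*>(.*?)</tr>}mi).map do |(row)|
    row.scan(%r{<td[^>]*>(.*?)</td>}mi).map do |(cell)|
      # Strip nested tags, decode entities, trim whitespace.
      CGI.unescapeHTML(cell.gsub(/<[^>]+>/, '')).strip
    end
  end.reject(&:empty?)
end

# Turn one row of cell strings into a record hash, converting prices to floats.
def to_record(cells, category)
  record = FIELDS.zip(cells).to_h
  record[:lowPriceMin] = Float(record[:lowPriceMin])
  record[:highPriceMax] = Float(record[:highPriceMax])
  record.merge(category: category)
end

RECORDS = table_rows(SAMPLE_HTML).map { |cells| to_record(cells, 'fruits') }
```

Each hash in `RECORDS` could then be written to the data store in place of the rows currently built from the XML source. Because the HTML page is fetched once per commodity rather than once per XML document, this may also reduce exposure to the timeouts, though that depends on how the site serves the page.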