Each of these tutorials keeps its instructions and example code in a single file that you can edit, learn from, and run entirely in your browser.

Ruby scrapers

How to Write a Screen Scraper: 1

Start here: check the ScraperWiki interface is working, then learn how to download a web page.

How to Write a Screen Scraper: 2

Slightly more advanced: scrape data from raw HTML, and save it to the ScraperWiki datastore.

How to Write a Screen Scraper: 3

Doing it again and again: following 'next' links to scrape multiple pages.

Advanced Scraping: .ASPX Pages

Scrape ASP.NET web pages (with an .aspx extension) using the Mechanize library.

Advanced Scraping: Pages Behind Forms

Scrape pages behind forms using the Mechanize library.

Advanced Scraping: Excel Files

Scrape Excel files using the spreadsheet library.

Advanced Scraping: CSV Files

Scrape CSV files using the FasterCSV library.

Advanced Scraping: PDFs

Scrape PDF files using PDF::Reader.
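
As a taste of what the CSV tutorial covers, here is a minimal sketch of parsing CSV text in Ruby. It assumes a modern Ruby, where FasterCSV's functionality is available as the standard `csv` library (it was merged in Ruby 1.9), and it uses made-up sample data in place of a downloaded file. The column names and values are hypothetical, and this is not the tutorial's own code.

```ruby
require "csv"

# Hypothetical sample data standing in for a CSV file fetched from the web.
raw = <<~DATA
  name,population
  Springfield,30000
  Shelbyville,25000
DATA

# headers: true treats the first row as column names,
# so each remaining row can be turned into a hash keyed by column.
rows = CSV.parse(raw, headers: true).map(&:to_h)

rows.each do |row|
  puts "#{row['name']}: #{row['population']}"
end
```

Every value comes back as a string, so numeric columns need converting (for example with `Integer(row['population'])`) before they are saved or compared.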

Ruby views

Simple table of values

Create a simple table of values from a scraper's output.
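
The idea behind such a view can be sketched in plain Ruby: take a list of records and print them as an HTML table. The sketch below does not call any ScraperWiki API. The `records` array is hypothetical data standing in for rows read from a scraper's datastore, and `html_table` is an illustrative helper name, not part of any library.

```ruby
require "erb"

# Hypothetical records standing in for rows saved by a scraper.
records = [
  { "name" => "Springfield", "population" => 30000 },
  { "name" => "Shelbyville", "population" => 25000 }
]

# Illustrative helper: builds an HTML table, using the keys of the
# first record as the column headings.
def html_table(records)
  return "<table></table>" if records.empty?

  columns = records.first.keys
  header = columns.map { |c| "<th>#{ERB::Util.html_escape(c)}</th>" }.join
  body = records.map do |record|
    # Escape every value so scraped text cannot break the markup.
    cells = columns.map { |c| "<td>#{ERB::Util.html_escape(record[c])}</td>" }.join
    "<tr>#{cells}</tr>"
  end

  "<table>\n<tr>#{header}</tr>\n#{body.join("\n")}\n</table>"
end

puts html_table(records)
```

Escaping each cell matters here because scraped values often contain characters such as `<` or `&` that would otherwise corrupt the page.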