Accessing Scraped Data with Web Interface

(This page does not apply to the LionSHARE scraper.)

The directory and Vergil scrapers archive their data on Amazon Web Services' Simple Storage Service (S3). Go to the AWS Console login page and sign in as a root user with the credentials in the spec-graphics section of the Secret Spec Graphics Credentials Doc. Then navigate to the S3 service.

Each bucket (directory-scraper or vergil-scraper) contains the scraped data for the corresponding scraper. Each file is named after the time at which the scraper was run, and files are grouped into directories in whatever way suits the scraper. The directory-scraper bucket has two directories: students and facultyandstaff. The directories in the vergil-scraper bucket represent different semesters of the year.
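To make the bucket layout concrete, here is a minimal sketch that groups S3 object keys by their directory. The function name group_keys and the example keys are made up for illustration; in particular, the real format of the run-time file names is not documented on this page, so the names below are placeholders.

```python
from collections import defaultdict
from posixpath import basename, dirname


def group_keys(keys):
    """Group S3 object keys by directory (everything before the last '/')."""
    groups = defaultdict(list)
    for key in keys:
        # S3 consoles sometimes create empty "folder" objects ending in '/'.
        if key.endswith("/"):
            continue
        groups[dirname(key)].append(basename(key))
    return dict(groups)


# Hypothetical keys, shaped like the directory-scraper bucket described above.
example_keys = [
    "students/run-time-1",
    "students/run-time-2",
    "facultyandstaff/run-time-1",
]

for directory, files in group_keys(example_keys).items():
    print(directory, files)
```

Each directory maps to the list of scraper runs stored in it, so picking a run comes down to picking a file name within the right directory.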

To access the Vergil data programmatically, check out the example.py script in https://github.com/graphicsdesk/vergil-data.

To access the directory data programmatically, the process is almost the same, but no example has been written yet. Ask Jenny, or maybe Michelle.
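Until an official example exists, the following is a rough sketch of one way to fetch the most recent directory scrape with boto3. It is not the team's script. It assumes boto3 is installed, that AWS access keys with read access to the bucket are configured locally (for example in ~/.aws/credentials), and that keys look like students/<run time>. The function names and the default destination file name are hypothetical.

```python
def newest_key(client, bucket, prefix):
    """Return the key of the most recently modified object under prefix, or None."""
    paginator = client.get_paginator("list_objects_v2")
    newest = None
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages with no matching objects have no "Contents" entry.
        for obj in page.get("Contents", []):
            if obj["Key"].endswith("/"):
                continue  # skip empty "folder" placeholder objects
            if newest is None or obj["LastModified"] > newest["LastModified"]:
                newest = obj
    return newest["Key"] if newest else None


def download_newest(bucket="directory-scraper", prefix="students/",
                    dest="latest_students_scrape"):
    """Download the newest object under prefix to dest and return its key."""
    import boto3  # third-party; imported here so newest_key works without it

    client = boto3.client("s3")  # credentials come from the local AWS config
    key = newest_key(client, bucket, prefix)
    if key is None:
        raise FileNotFoundError(f"no objects under s3://{bucket}/{prefix}")
    client.download_file(bucket, key, dest)
    return key
```

Calling download_newest() with no arguments would target the students directory; passing prefix="facultyandstaff/" would target the other one. The sketch picks the newest file by S3's LastModified timestamp rather than by parsing the file name, because the name format is not specified here.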
