Given a URL, this scraper will visit every page of that site and download each as a PDF for offline viewing. Especially useful for online versions of books spread across multiple pages.

Nezteb/scrape-pdf

Scrape PDF

Make sure you have pnpm installed: https://pnpm.io/installation

pnpm install
pnpm run scrape <url>

You'll see some console output while the crawl runs. When it finishes, you should have an output directory full of PDF files and a single ___urls.txt file.
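To make the "visit every page of that site" step more concrete, here is a minimal sketch of a same-origin, breadth-first crawl that collects the list of page URLs. It is an illustration only, not the project's actual implementation: `crawlSite`, `normalize`, and the injected `fetchLinks` callback are hypothetical names, and a real run would obtain links by loading each page in a browser.

```typescript
// Hypothetical callback: returns the raw href values found on a page.
// Injected so the sketch stays self-contained and needs no network access.
type LinkFetcher = (url: string) => Promise<string[]>;

// Resolve a link against the page it was found on and drop the fragment,
// so "/b" and "/b#top" count as the same page. Returns null for bad links.
function normalize(href: string, base: string): string | null {
  try {
    const u = new URL(href, base);
    u.hash = "";
    return u.href;
  } catch {
    return null;
  }
}

// Breadth-first crawl that never leaves the origin of the start URL.
async function crawlSite(startUrl: string, fetchLinks: LinkFetcher): Promise<string[]> {
  const start = normalize(startUrl, startUrl);
  if (start === null) {
    throw new Error(`Invalid start URL: ${startUrl}`);
  }
  const origin = new URL(start).origin;
  const seen = new Set<string>([start]);
  const queue: string[] = [start];
  const visited: string[] = [];

  while (queue.length > 0) {
    const url = queue.shift()!;
    visited.push(url);
    for (const href of await fetchLinks(url)) {
      const next = normalize(href, url);
      if (next === null || seen.has(next)) continue;
      if (new URL(next).origin !== origin) continue; // stay on the same site
      seen.add(next);
      queue.push(next);
    }
  }
  return visited;
}
```

In this sketch the returned array plays the role of the URL list; whether the real tool orders or filters pages the same way is an assumption.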

CLI Options

| Full | Short | Description |
| --- | --- | --- |
| --media | -m | Media type to generate PDFs with, if the site supports different media types: "screen" or "print" (default) |
| --colorScheme | -c | Color scheme to generate PDFs with, if the site supports color schemes: "light", "dark", or "no-preference" (default) |
| --withHeader | -h | Whether to generate PDFs with headers and footers (default false) |
| --dryRun | -d | Perform the web crawl without creating PDFs (default false) |
| --verbose | -v | Adds additional logging (default false) |
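The options listed above, with their defaults, can be summarized as a small typed parser. This is a hedged sketch and not the project's real argument handling: `ScrapeOptions`, `DEFAULTS`, and `parseArgs` are hypothetical names, and it assumes that `--media` and `--colorScheme` take their value as the next argument while the other three are plain boolean switches.

```typescript
type Media = "screen" | "print";
type ColorScheme = "light" | "dark" | "no-preference";

interface ScrapeOptions {
  media: Media;
  colorScheme: ColorScheme;
  withHeader: boolean;
  dryRun: boolean;
  verbose: boolean;
}

// Defaults as documented in the option list.
const DEFAULTS: ScrapeOptions = {
  media: "print",
  colorScheme: "no-preference",
  withHeader: false,
  dryRun: false,
  verbose: false,
};

// Short flag letter -> full option name.
const SHORT_TO_FULL: Record<string, string> = {
  m: "media", c: "colorScheme", h: "withHeader", d: "dryRun", v: "verbose",
};

function isMedia(v: string | undefined): v is Media {
  return v === "screen" || v === "print";
}

function isColorScheme(v: string | undefined): v is ColorScheme {
  return v === "light" || v === "dark" || v === "no-preference";
}

// Parse arguments such as ["https://example.com/", "-m", "screen", "--dryRun"].
function parseArgs(argv: string[]): { url: string | undefined; options: ScrapeOptions } {
  const options: ScrapeOptions = { ...DEFAULTS };
  const rest = [...argv];
  let url: string | undefined;

  while (rest.length > 0) {
    const arg = rest.shift()!;
    if (!arg.startsWith("-")) {
      url = arg; // the positional argument is the site to scrape
      continue;
    }
    const raw = arg.replace(/^-+/, "");
    const name = arg.startsWith("--") ? raw : SHORT_TO_FULL[raw];
    switch (name) {
      case "media": {
        const value = rest.shift();
        if (!isMedia(value)) throw new Error(`Invalid media: ${value}`);
        options.media = value;
        break;
      }
      case "colorScheme": {
        const value = rest.shift();
        if (!isColorScheme(value)) throw new Error(`Invalid color scheme: ${value}`);
        options.colorScheme = value;
        break;
      }
      case "withHeader": options.withHeader = true; break;
      case "dryRun": options.dryRun = true; break;
      case "verbose": options.verbose = true; break;
      default: throw new Error(`Unknown option: ${arg}`);
    }
  }
  return { url, options };
}
```

Under these assumptions, `parseArgs(["https://example.com/book/", "-m", "screen", "--dryRun"])` yields the screen media type, a dry run, and the documented defaults for everything else.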

TODO
