Show HN: GetHtml() open source web scraping waterfall fetcher
I’m open-sourcing a project that’s been in my web scraping toolkit for a while: Waterfall Fetch.
After building custom scraping infrastructure for startups again... and again... and again, I decided to do something potentially career-damaging. I’m making a generic version of my go-to solution available for everyone.
How does it work? Waterfall Fetch runs a sequence of stealth/proxied fetch strategies. If one fetch fails, it automatically moves on to the next, more robust strategy, and keeps going until one of them successfully retrieves the HTML. Think of it as a fallback chain that makes scraping resilient to blocked or failed requests.
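To make the waterfall idea concrete, here is a minimal TypeScript sketch of the pattern. It is not the waterfall-fetch API: `FetchStrategy`, `waterfallFetch`, and the result shape are hypothetical names picked for illustration, and treating an empty body as a failure is my own assumption.

```typescript
// Hypothetical types and names for illustration only; not the waterfall-fetch API.
type FetchStrategy = {
  name: string;
  // Resolves with the page HTML, or rejects/throws if the fetch fails.
  fetchHtml: (url: string) => Promise<string>;
};

type WaterfallResult = {
  html: string;
  strategy: string; // name of the strategy that succeeded
};

// Try each strategy in order, from cheapest to most robust.
// The first one that returns non-empty HTML wins.
async function waterfallFetch(
  url: string,
  strategies: FetchStrategy[]
): Promise<WaterfallResult> {
  const errors: string[] = [];

  for (const strategy of strategies) {
    try {
      const html = await strategy.fetchHtml(url);
      // Assumption: an empty body counts as a failure, so fall through.
      if (html.trim().length > 0) {
        return { html, strategy: strategy.name };
      }
      errors.push(`${strategy.name}: empty response`);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${strategy.name}: ${message}`);
    }
  }

  // Every strategy failed; report what each one said.
  throw new Error(`All strategies failed for ${url}: ${errors.join("; ")}`);
}
```

In a real setup the list would presumably be ordered something like plain HTTP request first, then a proxied request, then a headless browser, so the expensive strategies only run when the cheap ones are blocked.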
Features:
- Supports custom fetch strategies (and proxies).
- Allows you to execute JavaScript on pages.
- Works with a simple getHtml() function.
- Fully documented.
And yes, it’s live now on npm: `npm i waterfall-fetch`
It even has a nice little docs site to help you get started.
Want to support? Give it a star on GitHub: https://lnkd.in/gG6dkuhu