Comment by renegat0x0
Comment by renegat0x0 10 hours ago
What I have seen it is hard to tell what "serious scrapers" use. They use many things. Some use this, some not. This is what I have learned reading webscraping on reddit. Nobody speaks things like that out loud.
There are many tools, see links below
Personally I think that running selenium can be a bottle neck, as it does not play nice, sometimes processes break, even system sometimes requires restart because of things blocked, can be memory hog, etc. etc. That is my experience.
To be able to scale I think you have to have your own implementation. Serious scrapers complain about people using selenium, or derivatives as noobs, who will come back asking why page X does not work in scraping mechanisms.
https://github.com/lexiforest/curl_cffi
https://github.com/encode/httpx