Comment by dspillett 17 hours ago

3 replies

Is there a public dump of the data anywhere that this is based upon, or have they scraped it themselves?

Such a DB might be entertaining to play with, and the threadedness of comments would be useful for beginners to practise efficient recursive queries (more so than the StackExchange dumps, for instance).
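
To illustrate the kind of recursive query meant here, a minimal sketch using a recursive CTE in SQLite via Python. The `comments` table, its `id`/`parent_id`/`author` columns, the sample rows, and the `thread` helper are all made up for illustration; they are not the schema of the actual dump.

```python
import sqlite3

# Hypothetical schema: each comment stores its parent's id (NULL for top-level).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comments (id INTEGER PRIMARY KEY, parent_id INTEGER, author TEXT);
INSERT INTO comments VALUES
  (1, NULL, 'alice'),
  (2, 1,    'bob'),
  (3, 2,    'alice'),
  (4, 1,    'carol'),
  (5, NULL, 'dave');
""")

def thread(conn, root_id):
    """Return (id, author, depth) for root_id and all of its descendants."""
    return conn.execute(
        """
        WITH RECURSIVE thread(id, author, depth) AS (
            -- anchor: the comment we start from
            SELECT id, author, 0 FROM comments WHERE id = ?
            UNION ALL
            -- recursive step: children of rows already in the thread
            SELECT c.id, c.author, t.depth + 1
            FROM comments c JOIN thread t ON c.parent_id = t.id
        )
        SELECT id, author, depth FROM thread ORDER BY id
        """,
        (root_id,),
    ).fetchall()

for row in thread(conn, 1):
    print(row)
```

On a real dump an index on `parent_id` would probably matter for the "efficient" part, since the recursive step looks up children by parent on every iteration.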

keepamovin 11 hours ago

Yes, you can see the download HN bash script in the repository now; it simply extracts the data from BigQuery to your local machine and saves it as a series of gzipped JSON files.

  • dspillett 3 hours ago

    Ah, the repo was 404ing for me last time I checked (seems fine now) so I couldn't inspect that. I'll have a play later.