Comment by schmookeeg 4 days ago

We're an AWS shop, so for lightweight or one-off stuff, it's a TypeScript Lambda. Everything else ends up in a Python script that writes Glue-friendly output to S3.

Assume at some point, the data will bork up.

If you ingest Excel (ugh), treat it like free-range data. I have a TypeScript Lambda that just shreds spreadsheets in an "OK, scan for this string, then assume the thing to the right of it is the value we want" style -- it's goofy AF, but it's one of my favorite tools in the toolbox, since I look magical when I use it. It lets me express-pass janky spreadsheets into Athena in minutes, not days.

It's based on the convert-excel-to-json library, and once you grok how it wants to work (Excel -> giant freaky JSON object with keys that correspond to columns, so object.A, object.B, object.C etc., each holding that cell's value; array index for row number), you can use it as a real blunt-force chainsaw approach to unstructured data LARPing as an Excel doc :D
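
For anyone who wants to try the "scan for this string, grab the thing to the right" trick, here's a rough sketch. It is not my actual Lambda. It assumes you already have one sheet's rows in the shape described above (an array of row objects keyed by column letter), and the names `Row`, `columnToIndex`, `indexToColumn`, and `valueRightOf`, along with the sample rows, are made up for illustration.

```typescript
// Assumed shape: one object per row, keyed by column letter ("A", "B", ...).
type Row = Record<string, unknown>;

// Column letter to 1-based index: "A" -> 1, "Z" -> 26, "AA" -> 27.
function columnToIndex(col: string): number {
  const upper = col.toUpperCase();
  let n = 0;
  for (let i = 0; i < upper.length; i++) {
    n = n * 26 + (upper.charCodeAt(i) - 64);
  }
  return n;
}

// Inverse of columnToIndex: 1 -> "A", 27 -> "AA".
function indexToColumn(n: number): string {
  let col = "";
  while (n > 0) {
    const rem = (n - 1) % 26;
    col = String.fromCharCode(65 + rem) + col;
    n = Math.floor((n - 1) / 26);
  }
  return col;
}

// Scan every cell for `label` (case-insensitive, trimmed) and return whatever
// sits in the cell immediately to its right, or undefined if nothing matches.
function valueRightOf(rows: Row[], label: string): unknown {
  const wanted = label.trim().toLowerCase();
  for (const row of rows) {
    for (const col of Object.keys(row)) {
      const cell = row[col];
      if (typeof cell === "string" && cell.trim().toLowerCase() === wanted) {
        return row[indexToColumn(columnToIndex(col) + 1)];
      }
    }
  }
  return undefined;
}

// Made-up rows standing in for one sheet of a janky spreadsheet.
const sampleSheet: Row[] = [
  { A: "Quarterly report" },
  { B: "Invoice total:", C: 1234.5 },
  { A: "Prepared by", B: "J. Doe" },
];

const invoiceTotal = valueRightOf(sampleSheet, "invoice total:");
const preparedBy = valueRightOf(sampleSheet, "Prepared by");
```

The matching is deliberately sloppy (trimmed, case-insensitive, first hit wins), since the whole point is to survive spreadsheets where the label wanders around between files. If the same label shows up more than once, you'd want to return all the hits instead of the first.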