Comment by dataflowmapper 4 days ago
Thanks for sharing that workflow; for more straightforward flows, that sounds like a decent approach. My main thoughts on where it could be improved, or where I see potential struggles, are when:
- People aren't comfortable or familiar with coding/Python.
- You get into more complex imports like historical data, transactional data, etc. There you might have 15 transaction types to map, each with different fields, math, and conditional logic, where the requirements become too much to prompt ChatGPT with effectively, and iterating on the Python can get pretty involved (see the mapping sketch right after this list).
- The source data structure and transformation needs aren't consistent from one job to the next, leading to a lot of 'throwaway' or heavily modified scripts for each unique case.
- Tasks like VLOOKUPs or enriching data come into play, which can add manual coding or complexity beyond a simple 1-to-1 source-to-destination script (a lookup/merge sketch is at the end of this comment).
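To make the "many transaction types" point concrete, here's a minimal sketch of a per-type mapping table, the kind of thing that's hard to keep straight through prompting alone. The transaction and field names are made up for illustration, not anyone's actual import spec:

```python
# Hypothetical sketch: one transform spec per transaction type instead of
# a one-off script per file. Field names here are invented for illustration.
from decimal import Decimal

TRANSFORMS = {
    "invoice": {
        "amount": lambda row: Decimal(row["gross"]) - Decimal(row["tax"]),
        "account": lambda row: row["ar_account"],
    },
    "credit_memo": {
        # Conditional logic lives next to the type it belongs to.
        "amount": lambda row: -abs(Decimal(row["gross"])),
        "account": lambda row: row.get("override_account") or row["ar_account"],
    },
    # ... one entry per remaining transaction type
}

def map_row(row: dict) -> dict:
    spec = TRANSFORMS[row["txn_type"]]          # pick the spec for this row's type
    return {field: fn(row) for field, fn in spec.items()}
```

The appeal of something like this is that each type's fields, math, and conditions stay declarative, so adding a 16th type is another entry in the table rather than another round of script edits.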
These are the areas where I'm exploring whether a more structured approach could offer benefits in terms of repeatability and accessibility, both for a wider range of users and for more complex scenarios. Appreciate the insight into your process and your thoughts on this.
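For the VLOOKUP/enrichment case, the usual scripted equivalent is a join against a reference table. A rough pandas sketch, with file and column names assumed purely for the example:

```python
import pandas as pd

# VLOOKUP-style enrichment: pull account names in from a reference sheet.
# File names and column names below are assumptions, not a real spec.
txns = pd.read_csv("transactions.csv")
accounts = pd.read_csv("chart_of_accounts.csv")   # e.g. account_id, account_name

enriched = txns.merge(
    accounts[["account_id", "account_name"]],
    on="account_id",
    how="left",   # keep unmatched rows, like VLOOKUP wrapped in IFERROR
)
enriched.to_csv("transactions_enriched.csv", index=False)
```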
From doing historical data restorations (pre-JSON), using documented industry-standard structures and protocols, along with normalized data formats/layouts, helps out quite a bit.
Note: awk/shell scripts, roughly '80s-era, because they're relatively portable across platforms (Mac, PC, DEC, IBM (ignoring EBCDIC), etc.).
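As a rough illustration of the "normalize first, transform second" idea, and with the record layout and field names invented here rather than taken from any particular standard, the pattern is to coerce every source into one documented shape before any mapping logic runs:

```python
# Hypothetical normalization pass: whatever the source looked like,
# everything downstream sees the same documented layout.
from dataclasses import dataclass
from datetime import date, datetime

@dataclass
class NormalizedRecord:          # stands in for a documented standard structure
    record_id: str
    posted: date
    amount_cents: int

def normalize(raw: dict) -> NormalizedRecord:
    return NormalizedRecord(
        record_id=str(raw["id"]).strip(),
        posted=datetime.strptime(raw["date"], "%Y-%m-%d").date(),
        amount_cents=round(float(raw["amount"]) * 100),
    )
```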