Comment by badmonster 8 hours ago
This looks great! I'm curious about the plugin architecture - how does Marmot handle schema evolution and versioning across different data sources? For instance, if a Postgres table's schema changes, does the catalog automatically detect and update the lineage, or is there a manual reconciliation step?
Also, given that you're using OpenLineage for cross-system lineage tracking, have you considered building native integrations with data orchestration tools beyond Airflow (e.g., Dagster, Prefect) to automatically capture DAG-level lineage?
Hey, that's a good question! At the moment, Marmot treats the latest run as the desired state, so any change to a schema simply overwrites the old version. I'd like to version these so people can navigate schema versions in the UI. Plugins are currently triggered either via the CLI or on a schedule set in the UI, so updates only appear in the catalog after a plugin has run.
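To make the difference between "overwrite" and "versioned" concrete, here's a rough sketch of what keeping schema history per asset could look like. This is not Marmot's code: `AssetHistory`, `record_run`, and `schema_fingerprint` are hypothetical names, and the schema is assumed to be a plain column-name-to-type mapping.

```python
import hashlib
import json
from dataclasses import dataclass, field


def schema_fingerprint(schema: dict) -> str:
    """Stable hash of a schema; key order in the dict does not affect it."""
    canonical = json.dumps(schema, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class AssetHistory:
    """Hypothetical per-asset record that keeps every distinct schema seen."""

    versions: list = field(default_factory=list)

    def record_run(self, schema: dict) -> bool:
        """Store the schema from a plugin run; return True if it changed."""
        fingerprint = schema_fingerprint(schema)
        if self.versions and self.versions[-1]["fingerprint"] == fingerprint:
            # Same schema as the previous run: nothing new to version.
            return False
        self.versions.append(
            {
                "version": len(self.versions) + 1,
                "fingerprint": fingerprint,
                "schema": schema,
            }
        )
        return True

    @property
    def latest(self) -> dict:
        """Schema from the most recent run that changed something."""
        return self.versions[-1]["schema"]


# Usage: two identical runs, then one that adds a column.
history = AssetHistory()
history.record_run({"id": "integer"})
history.record_run({"id": "integer"})
history.record_run({"id": "integer", "email": "text"})
```

After those three runs, `history.versions` holds two entries rather than one, which is the extra state a UI would need to let people step through schema versions. Changes would still only be noticed when a plugin runs, same as today.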
I'd also love to have some native integrations beyond Airflow. Once I've matured the existing plugin ecosystem a bit more, it's high on my list (along with column-level lineage).