Comment by porker
> No, polars or spark is not a good answer, those are optimized for data engineering performance, not a holistic approach to data science.
Can you expand on why Polars isn't optimised for a holistic approach to data science?
This is a non-issue thanks to the Polars DataFrame to_pandas() method. You get all the performance of Polars for cleaning large datasets, and to_pandas() gives you backwards compatibility with other libraries. Besides, plotnine works directly with Polars DataFrame objects.
I have not worked with Polars, but I would imagine any incompatibility with existing libraries (e.g. plotting libraries like plotnine, bokeh) would quickly put me off.
It is a curse, I know. I would also choose a better interface. Performance is meh to me; I use SQL if I want to do something at scale that involves row/column data.