Comment by kremi 2 days ago

Some of the replies here are pretty good. I basically agree with “if it works for your data scientists then why not”.

I’m actually a software developer with 10 years of experience who also happens to do data science, and I’ve found myself in situations where I parametrized a notebook to run in production. So it’s not that I can’t turn it into plain Python. The main reasons are:

1. I prototype in a notebook. Translating that to Python code requires extra work. In this case there’s no extra dev involved, it’s just me, but it’s still extra work.

2. You can isolate the code out of the notebook, and in theory you’ve just turned your notebook into plain Python. You could even log every cell output to your standard logging system. But you lose the context of every log. Some cells might output graphs. The notebook just gives you a fast and complete picture that might be tedious to put together otherwise.

3. The saved notebook also acts as versioning. In DS work you can end up with lots of parameters or small variations of the same thing. In the end, what varies little I put in plain Python code; what’s more experimental and subject to change I put in the notebook. In certain cases it’s easier than going through commit logs.

4. I’ve never done this, but a notebook is just JSON, so in theory you could further process the output with PrestoDB or similar.
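
To make point 4 a bit more concrete, here is a minimal sketch using only Python's standard `json` module. It assumes the usual nbformat v4 layout (`cells`, `cell_type`, `outputs`, `output_type`); the sample notebook, the `flatten_outputs` helper, and the record fields are made up for illustration and are not something I run in production. It flattens every code-cell output into one JSON record per line, which is the kind of shape a query engine could plausibly ingest.

```python
import json

# Hypothetical, hand-written notebook following the nbformat v4 layout.
NOTEBOOK_JSON = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Report"]},
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["print('rows:', 42)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["rows: 42\n"]}
            ],
        },
        {
            "cell_type": "code",
            "execution_count": 2,
            "metadata": {},
            "source": ["plot(df)"],
            "outputs": [
                {
                    "output_type": "display_data",
                    "metadata": {},
                    "data": {"image/png": "<base64 elided>"},
                }
            ],
        },
    ],
})


def _as_text(value):
    """nbformat stores multi-line strings either as a str or a list of str."""
    return "".join(value) if isinstance(value, list) else value


def flatten_outputs(notebook):
    """Return one flat record per code-cell output, keeping the cell source for context."""
    rows = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            kind = output.get("output_type")
            if kind == "stream":
                summary = _as_text(output.get("text", ""))
            elif kind in ("execute_result", "display_data"):
                # Only list the MIME types; graphs stay as opaque payloads.
                summary = ", ".join(sorted(output.get("data", {})))
            elif kind == "error":
                summary = f"{output.get('ename')}: {output.get('evalue')}"
            else:
                summary = ""
            rows.append({
                "cell": index,
                "source": _as_text(cell.get("source", "")),
                "output_type": kind,
                "summary": summary,
            })
    return rows


rows = flatten_outputs(json.loads(NOTEBOOK_JSON))
for row in rows:
    print(json.dumps(row))  # newline-delimited JSON, one record per output
```

Note that this also illustrates point 2: once the outputs are flattened like this, a graph is reduced to a MIME type, and the at-a-glance picture the notebook gave you is gone.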