Comment by openrisk

Comment by openrisk 2 days ago


The bigger picture is the "democratization" of data science [1], not as an academic pursuit by white-coat researchers (which was the main early use case for R) but embedded in day-to-day operations in organizations of all sizes. "Apps" (web or mobile) are how this embedding is done. Reactivity is an important consideration when deploying a complex screen with many visible data elements and visualizations.

The problem is how to increase the efficiency with which back-end developments (by technical people) get rolled out to the end-user base. There are countless ways to do this, of course. The more technical resources one has available (e.g. knowledge of different stacks), the more options. But when resources are scarce, one is tempted to look for the (possibly only perceived) efficiency of a "full stack" approach.

[1] a loose term which nevertheless reflects something very real: the widespread use of data in society for all sorts of purposes

RA_Fisher 2 days ago

Cursed in your view because it’s not Python?

  • boccaff 2 days ago

    When I've dealt with R in production, "cursed" meant:
    - Package versions are difficult to pin, even with "renv".
    - If an analyst decides to use a single function from the "tidyverse", you pull in tons of dependencies.
    - Large Docker images (1 GB+) due to packages like "devtools" and the very large dependency tree of the "productivity packages" (see above).
    - It is hard to communicate with the process. With luck, you can set it up and work with 'r-script' [1]. Without luck, you are left with stdout from the process or simple files for I/O.

    In the end, to have a nice webapp, we ended up rewriting the R code in TypeScript. Julia doesn't solve this either, as it is hard to set up to communicate with other things. It seems that we can't avoid the "2 or 3 languages" problem if we don't use Python.

    [1] https://www.npmjs.com/package/r-script
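For readers who have not hit the "stdout from process" fallback described above, here is a minimal sketch of what it tends to look like from the calling side, written in Python. It is an illustration under assumptions, not the setup used in the comment: `run_json_worker`, `R_COMMAND`, and `STANDIN_COMMAND` are hypothetical names, the R one-liner assumes `Rscript` and the `jsonlite` package are installed, and a Python stand-in worker is included so the sketch runs without R.

```python
import json
import subprocess
import sys


def run_json_worker(command, payload, timeout=30):
    """Send `payload` as JSON on the worker's stdin and parse JSON from its stdout.

    Raises subprocess.CalledProcessError if the worker exits with a non-zero
    status, and json.JSONDecodeError if stdout holds anything other than JSON
    (for example a stray message printed by a package on load).
    """
    result = subprocess.run(
        command,
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return json.loads(result.stdout)


# Hypothetical R worker (assumption: Rscript and jsonlite are available).
R_COMMAND = [
    "Rscript",
    "-e",
    'x <- jsonlite::fromJSON(paste(readLines(file("stdin")), collapse = "")); '
    "cat(jsonlite::toJSON(list(mean = mean(x$values)), auto_unbox = TRUE))",
]

# Stand-in worker with the same contract, so the sketch runs without R.
STANDIN_COMMAND = [
    sys.executable,
    "-c",
    "import json, sys; d = json.load(sys.stdin); "
    "print(json.dumps({'mean': sum(d['values']) / len(d['values'])}))",
]

if __name__ == "__main__":
    print(run_json_worker(STANDIN_COMMAND, {"values": [1, 2, 3]}))
```

The fragile part is that everything the worker prints to stdout becomes part of the protocol, which is one reason this style of integration feels brittle in production.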

    • RA_Fisher 2 days ago

      Single functions and libraries can easily be imported in R, just like Python. It’s not necessary though, because the R community does a good job avoiding name conflicts (MASS aside).

      I think the core issue is that the coordination benefits of having everyone use Python are overestimated, and the benefits of better statistical tools in R and of SWEs skilling up in statistics are underestimated.

      • _Wintermute 2 days ago

        > It’s not necessary though, because the R community does a good job avoiding name conflicts

        That has not at all been my experience. Loading the tidyverse pulls over 1000 things into your global namespace and clobbers several standard library functions in the process. Never mind that seemingly every single package has its own "filter" function.

        When you start getting different results based on the order you import packages, it's usually a bad sign.
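The order dependence mentioned above comes from how names are resolved: in R, each newly attached package goes in front of the ones attached before it on the search path, so the most recently attached definition of a name wins. The following is a toy model of that lookup rule in Python, not R's actual implementation. The two "packages" are plain dictionaries, and `attach`, `load_in_order`, and the function bodies are hypothetical stand-ins.

```python
from collections import ChainMap

# Toy stand-ins for two packages that both export a function called `filter`.
stats_pkg = {"filter": lambda: "stats-style filter (time series)"}
dplyr_pkg = {"filter": lambda: "dplyr-style filter (rows)"}


def attach(search_path, package):
    """Return a new search path with `package` in front, masking older names."""
    return search_path.new_child(package)


def load_in_order(*packages):
    """Attach packages one after another, as a script's import lines would."""
    path = ChainMap({})
    for package in packages:
        path = attach(path, package)
    return path


stats_then_dplyr = load_in_order(stats_pkg, dplyr_pkg)
dplyr_then_stats = load_in_order(dplyr_pkg, stats_pkg)

# The same name resolves to different functions depending on attach order.
print(stats_then_dplyr["filter"]())
print(dplyr_then_stats["filter"]())
```

Qualifying the call with its package (in R, `dplyr::filter(...)`) sidesteps the search path entirely, which is the single-function import style mentioned earlier in the thread.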

    • ggrothendieck 2 days ago

      The poorman package (on CRAN) implements a lot of tidyverse with no dependencies. Other packages that provide alternate implementations are datawizard and tidytable.

  • openrisk 2 days ago

    Python has not really solved this either... For one thing it barely exists on mobile.