Comment by amarant
Comment by amarant 2 days ago
This is quite surprising to me. My (admittedly limited) understanding of R is that it's focused on data analysis and graph generation. What need does react solve in that context?
R is cursed beyond reason, but traditional software engineers are sleeping on it, IMO. It's very easy for quantitative people who are not software developers to get something done quickly. The downside is exactly what you described: most projects are not just the model. They eventually tend to incorporate generic data wrangling, UI/web code, etc., and a general-purpose language tends to work better overall.
I have a similar anecdote: I was brought in on a project where a group of terrorists implemented a solution for a TSP-like problem directly in R. We eventually replaced that thing with OR-Tools.
+1. I am a software engineer but I double majored in statistics and wrote a lot of R in undergrad. The library ecosystem is incredible. Essentially any technique in statistics has a well-documented R package that is one library() call away.
I keep wondering if I should learn the Python data science ecosystem at some point but it just seems like a waste of time. One of my personal projects is written in Python but calls into R for statistics/plotting.
The language itself however, incredibly cursed.
The same thing can be said about Python. Python itself is not such a great language, especially in terms of performance. However, they managed to have every single package in the world of analytics and ML added to the Python ecosystem, so it is impossible to stop using it.
One concrete example: R has 5 distinct, actively maintained class systems, at least 3 of which are somewhat commonly used for new projects. I.e., there are 3+ reasonable ways to declare a class, and the class will have different semantics for object access, method calling and dispatch, etc. depending on which one you choose.
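To make that concrete, here is a rough sketch of the same toy class in the three systems that ship with base R (S3, S4, and Reference Classes). The `Account` names, the `balance` field, and the `describe`/`describe4` generics are invented for illustration; add-on systems such as R6 are left out.

```r
# Sketch only: all class, field, and generic names below are hypothetical.
library(methods)  # S4 and Reference Classes are provided by the methods package

# S3: the class is just an attribute; methods are functions named generic.class
acct_s3 <- structure(list(balance = 100), class = "Account")
describe <- function(x, ...) UseMethod("describe")
describe.Account <- function(x, ...) paste("S3 balance:", x$balance)
s3_out <- describe(acct_s3)           # fields are reached with $

# S4: formal class with typed slots; generics and methods are registered explicitly
setClass("AccountS4", slots = c(balance = "numeric"))
setGeneric("describe4", function(x) standardGeneric("describe4"))
setMethod("describe4", "AccountS4", function(x) paste("S4 balance:", x@balance))
acct_s4 <- new("AccountS4", balance = 100)
s4_out <- describe4(acct_s4)          # slots are reached with @

# Reference Classes: mutable objects; methods belong to the object
AccountRC <- setRefClass("AccountRC",
  fields = list(balance = "numeric"),
  methods = list(
    deposit = function(x) {
      balance <<- balance + x         # modifies the object in place
      invisible(.self)
    }
  )
)
acct_rc <- AccountRC$new(balance = 100)
acct_rc$deposit(50)
rc_out <- acct_rc$balance
```

The first two use copy-on-modify values with function-style dispatch, while the third behaves like a mutable object with `obj$method()` calls, which is where the differing semantics come from.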
Another: R can’t losslessly represent JSON because 1 and [1] are identical. That’s a float (well, float vector) literal by the way, the corresponding int literal is 1L, though ints are very prone to being silently converted to float anyway.
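For anyone who hasn't run into this, a small base-R sketch of what that means in practice. The variable names are just for illustration.

```r
# A "scalar" in R is a vector of length one, so 1 and c(1) are the same object
x <- 1
y <- c(1)
same <- identical(x, y)        # no way to tell them apart
len <- length(1)               # even a bare literal has a length

# 1 is a double; the integer literal needs the L suffix
dbl_type <- typeof(1)
int_type <- typeof(1L)

# Integers are easily promoted to double
still_int <- typeof(1L + 1L)   # integer arithmetic stays integer
promoted <- typeof(1L + 1)     # mixing with a double promotes the result
divided <- typeof(4L / 2L)     # division returns a double even for integers
```

Because of this, a JSON library for R has to pick a convention for length-one vectors. As far as I know, jsonlite's `toJSON()` writes `[1]` by default and has an `auto_unbox` argument to write `1` instead, but either way the original distinction is gone once the value is in R.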
The R development community is (was?) consciously focused on a single operator managing their stuff on their workstation.
Considerations like repeatable procedures, reliable package hierarchies, etc. were clearly and more or less politely Not Interesting. I spent several years with one of my tasks being an attempt to get the R package universe into Gentoo, and later into RPM packages.
I wouldn't say the R devel community was rude about it, but the systems-administration view of how to maintain the language was just not on their radar.
At the time I was trying to provide a reliable taxonomy of packages to a set of research machines at a good sized university. Eventually, I gave up on any solution that involved system package managers, or repeatability. :)
So if you're a researcher driving your own train, R is freakin' FANTASTIC. If you're the SA attempting to let that researcher's department neighbors do the same thing on their workstations, anticipate fun.
In addition to what the neighboring comment says, there are multiple standards for everything, even core language features like objects, which makes reasoning about code difficult. dplyr and friends, which make the aforementioned data analysis easier, require interacting with one of the least consistent metaprogramming systems I've ever dealt with.
As an R developer, I can deeply relate to this. People who use R as their first language are probably from academia. They can be very good at data analysis and scripting, but usually have no idea of software development. They have little knowledge of versioning, modularization, documentation, and testing, which can lead to issues when they develop production-level R packages and Shiny applications.
On the flip side, R developers with strong software development skills can be incredibly valuable in academic settings.
I agree that researchers usually don’t need to worry about software engineering because their code is often simple, used only for one paper, and intended for personal use. But for large-scale research involving multiple researchers, users, and datasets, a software engineering mindset can be beneficial in the long term.
I have a friend who left tech after 15 years to go back and get a CS PhD. He ended up with a permanent job being the only "software guy" on a team of like exclusively math and physics post-docs. I'm still a little fuzzy on his day-to-day but I think it's basically just having enough of a handle on the research math to effectively product manage all the gnarly research code they produce. It sounds both frustrating and fun.
In R, you can build Single Page Applications with Shiny, created by Posit: https://shiny.posit.co/. It is very useful if you don't know HTML, JS, or CSS and want to create an interactive dashboard showcasing your analysis, models, and visualizations, or even an internal tool for your organization.
It seems that reactR provides functions for building React components directly from R that can be used in Shiny apps.
Shiny is an R package that is quite popular for building data analysis web tools, so I suppose this would be a useful extension for it! Despite R tending to give people 'the ick', it seems to have become quite a solid platform for building web apps.
I wouldn’t want to deploy a Shiny app externally (although loads of people seem to do so with no problems) but for internal tools it’s incredible. You can make reactive dashboards and analysis tools with no plumbing: just refer to an input when specifying your charts/outputs and they will be wired up and update automatically.
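A minimal sketch of what "no plumbing" looks like, assuming the shiny package is installed. The input id `n` and output id `hist` are arbitrary names chosen for this example.

```r
library(shiny)

# UI: one slider and one plot placeholder
ui <- fluidPage(
  sliderInput("n", "Number of points", min = 10, max = 500, value = 100),
  plotOutput("hist")
)

# Server: the plot only mentions input$n. Shiny tracks that dependency
# and re-runs the renderPlot block whenever the slider moves.
server <- function(input, output, session) {
  output$hist <- renderPlot({
    hist(rnorm(input$n), main = paste(input$n, "random draws"))
  })
}

# Build the app object; runApp(app) or printing it at the console starts it
app <- shinyApp(ui, server)
```

There is no explicit event handler or callback registration anywhere. Reading `input$n` inside the render block is what creates the link.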
I agree that it's perhaps not the most robust choice, but at least for my field (bioinformatics) it's a good balance between accessibility and performance. That being said, in most cases when I come across a paper >1 year old presenting the latest-and-greatest Shiny web app, it is wholly broken when I try to use it :|
Is it because they stopped paying for Shinyapps.io? Or did they exhaust the free monthly usage quota? In my experience these are the common reasons. If they did not update the app, there is no reason for it to stop working. I have apps running fine that were created 3-4 years ago.
Shiny is a great package. Not unlike R (and PHP back in the day) it was made by people who were not necessarily great programmers and who wanted something to get up and running quickly. Lowering the bar to entry meant sacrificing performance, security, and consistency. However, with Shiny you can punch above your weight class if you're a researcher who barely knows R and Python.
That being said, if you're investing time to learn React and dealing with all its pitfalls to use with R, you might as well just go all in with React at that point.
The bigger picture is the "democratization" of data science [1], not as an academic pursuit by white coat researchers (which was the main early use case for R) but embedded in day-to-day operations in organizations of all sizes. "Apps" (web or mobile) is how this embedding is done. Reactivity is an important consideration when deploying a complex screen with many visible data elements and visualizations.
The problem is how to increase the efficiency with which back-end developments (by technical people) get rolled out to the end-user base. There are countless ways to do this, of course. The more technical resources one has available (e.g. knowledge of different stacks), the more options. But when resources are scarce, one is tempted to look for the (possibly only perceived) efficiency of a "full stack" approach.
[1] a loose term which nevertheless reflects something very real: the widespread use of data in society for all sorts of purposes
When I've dealt with R in production, cursed meant:
- Difficult to manage package versioning, even with "renv".
- If an analyst decides to use a single function from the "tidyverse", you get tons of dependencies.
- Large Docker images (1 GB+) due to packages like "devtools" and the very large dependency tree of the "productivity packages" (see above).
- Hard to communicate with the process. With luck, you can set it up and work with 'r-script' [1]. Without luck, you fall back to the process's stdout or simple files for I/O.
In the end, to have a nice webapp, we ended up rewriting the R code in TypeScript. Julia doesn't solve this either, as you have a hard time setting it up to communicate with other things. It seems that we can't avoid the "2 or 3 languages" problem if you don't use Python.
Single functions and libraries can easily be imported in R, just like Python. It’s not necessary though, because the R community does a good job avoiding name conflicts (MASS aside).
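A short sketch of the options, using only packages that ship with R. The variable names are arbitrary.

```r
# Call one function without attaching its package
m <- stats::median(c(1, 5, 9))

# Bind just that one name locally, roughly like Python's `from x import y`
med <- stats::median
m2 <- med(c(1, 5, 9))

# Attach a package but expose only the listed functions (needs R >= 3.6.0)
library(tools, include.only = "file_ext")
ext <- file_ext("report.Rmd")
```

The `pkg::fun` form is also the usual way around a clash: presumably the MASS remark refers to `MASS::select` masking `dplyr::select` when both are attached, and qualifying the call sidesteps that.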
I think the core issue is that the coordination benefits of having everyone use Python are overestimated, and the benefit of better statistical tools in R and SWE skilling up in statistics is underestimated.
The poorman package (on CRAN) implements a lot of tidyverse with no dependencies. Other packages that provide alternate implementations are datawizard and tidytable.
As someone who is a data analyst, I am constantly reminded of the huge amount of necessary context needed to trust results from any analysis. I know that I lack some of that context, yet I also know that so do my collaborators on mathematical or computational aspects. A Shiny app makes it easy to get something in the face of everyone so we all can come to a consensus about what is useful and actionable.
Data visualization! Web is the go-to format for interactive data visualizations nowadays. This package is targeted at other package developers looking to integrate the abundant wealth of React-based data viz libraries into R. Check out https://quarto.org/ for an example of a popular way users publish their data analyses as reports.
(This package isn't really meant to be used directly by the typical user)
People who work with R for data analysis want to put their results on the web, even in the form of full web apps, without necessarily learning the full stack of front end technologies. Frameworks like Shiny offer the promise of writing web apps in the language they already know. In my experience watching this play out from the sidelines, it only ends in tears. But, technology progresses, and maybe it'll get better.
A bit of a tangent but this might be somewhat illuminating: I worked for a big utility company. They needed some economic analysis done on where to best put some transport pipes into the ground, at macro granularity (e.g. the smallest unit was a neighborhood).
The team that did the analysis apparently did a great job, so additional requirements were thrown their way: a frontend and an interactive planning module for designing the pipe networks at the micro level (the smallest unit was a single house).
A year later the absolute maniacs delivered the application. They only knew R and some html/js, so that's what they used.
Cursed as fuck of course, and because they were external they left shortly after. This was pretty bad for the people left holding the bag; I mostly found it awe inspiring in the "doom running on an electric toothbrush" kinda sense.