Comment by isoprophlex 2 days ago

24 replies

A bit of a tangent but this might be somewhat illuminating: I worked for a big utility company. They needed some economic analysis done on where best to put some transport pipes into the ground, at macro granularity (e.g. the smallest unit was a neighborhood).

The team that did the analysis apparently did a great job, so additional requirements were thrown their way: a frontend and an interactive planning module for designing the pipe networks at the micro level (the smallest unit was a single house).

A year later the absolute maniacs delivered the application. They only knew R and some html/js, so that's what they used.

Cursed as fuck of course, and because they were external they left shortly after. This was pretty bad for the people left holding the bag; I mostly found it awe-inspiring in the "doom running on an electric toothbrush" kinda sense.

qsort 2 days ago

R is cursed beyond reason, but traditional software engineers are sleeping on it, IMO. It's very easy for quantitative people who are not software developers to get something done quickly. The downside is exactly what you described: most projects are not just the model. They eventually tend to incorporate generic data wrangling, UI/web code, etc., and a general-purpose language tends to work better overall.

I have a similar anecdote: I was brought in on a project where a group of terrorists implemented a solution for a TSP-like problem directly in R. We eventually replaced that thing with OR-Tools.

  • senkora 2 days ago

    +1. I am a software engineer but I double majored in statistics and wrote a lot of R in undergrad. The library ecosystem is incredible. Essentially any technique in statistics has a well-documented R package that is one library() call away.

    I keep wondering if I should learn the Python data science ecosystem at some point but it just seems like a waste of time. One of my personal projects is written in Python but calls into R for statistics/plotting.
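
    For anyone curious what "Python calls into R" can look like, here is a minimal sketch, assuming `Rscript` is installed and on the PATH. The helper names `rscript_command` and `run_r` are made up for illustration, and this is only one possible approach, not necessarily how the project above does it.

```python
import shutil
import subprocess


def rscript_command(expression):
    """Build the argument list for evaluating one R expression with Rscript."""
    # `Rscript -e <expr>` evaluates the expression and exits.
    return ["Rscript", "-e", expression]


def run_r(expression):
    """Run an R expression and return whatever it wrote to stdout."""
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript not found on PATH")
    result = subprocess.run(
        rscript_command(expression),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if R exits with an error
    )
    return result.stdout


# Only attempt the call when R is actually available on this machine.
if __name__ == "__main__" and shutil.which("Rscript"):
    print(run_r("cat(mean(c(1, 2, 3)))"))
```

    Shelling out keeps the two runtimes fully separate, at the cost of passing everything through text or files.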

    The language itself, however, is incredibly cursed.

    • coliveira 2 days ago

      The same thing can be said about Python. Python itself is not such a great language, especially in terms of performance. However, they managed to have every single package in the world of analytics and ML added to the Python ecosystem, so it is impossible to stop using it.

      • senkora 2 days ago

        There are definitely a lot of similarities.

        I think of MATLAB, Mathematica, R, and Python together as "practitioner's languages". These are languages that are designed from the core to be highly productive for a specific kind of technical worker (in the sense of developer velocity).

        MATLAB for engineering. Mathematica for mathematics. R for statistics. Python for software engineering.

        You could also say "Python for ML", of course, and that would be true, but Python is also used for general purpose programming much more than the other three. I think that "Python for software engineering" is more correct.

        I think that each of the languages is shaped to the way that its users think about the problems that they want to solve with it.

        MATLAB is shaped around linear algebra. Mathematica is a term-rewriting system. R has lots of magic around data and scopes to make the surface syntax for stats nice. Python is shaped like a traditional OOP language but with a pseudocode-like syntax and hooks so that libraries can act magically.
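
        To make the "hooks so that libraries can act magically" part concrete, here is a rough sketch using Python's operator special methods (`__gt__`, `__lt__`, `__and__`). The `Col` and `Expr` classes are hypothetical, invented just for this example; real libraries that overload operators are far more elaborate.

```python
class Expr:
    """A hypothetical filter expression assembled by operator hooks."""

    def __init__(self, text):
        self.text = text

    def __and__(self, other):
        # `a & b` calls a.__and__(b); we combine instead of doing bitwise AND.
        return Expr(f"({self.text} AND {other.text})")

    def __repr__(self):
        return f"Expr({self.text!r})"


class Col:
    """A hypothetical column reference whose comparisons build expressions."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        # `col > 18` returns an Expr rather than a bool.
        return Expr(f"{self.name} > {value!r}")

    def __lt__(self, value):
        return Expr(f"{self.name} < {value!r}")


age = Col("age")
# Parentheses are needed because `&` binds tighter than `>` and `<`.
query = (age > 18) & (age < 65)
print(query.text)  # prints "(age > 18 AND age < 65)"
```

        The surface syntax reads like ordinary comparisons, but every operator is just a method call that the library is free to reinterpret.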

        This is kinda half-baked, I'm trying to express this for the first time. But essentially I think that Python is what you get when you have real programmers (^TM) try to create the programming equivalent of something like MATLAB, Mathematica, and R.

        And so of course ML, which is a field dominated by real programmers, adopted Python in order to create their ecosystem.

  • rrr_oh_man 2 days ago

    > R is cursed beyond reason

    Why would you say that?

    • tfehring 2 days ago

      One concrete example: R has 5 distinct, actively maintained class systems, at least 3 of which are somewhat commonly used for new projects. I.e., there are 3+ reasonable ways to declare a class, and the class will have different semantics for object access, method calling and dispatch, etc. depending on which one you choose.

      Another: R can’t losslessly represent JSON because 1 and [1] are identical. That’s a float (well, float vector) literal by the way, the corresponding int literal is 1L, though ints are very prone to being silently converted to float anyway.
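
      For contrast, a small Python sketch of the distinction that gets lost. In R a scalar is just a length-one vector, so a JSON library has to pick a convention for which of the two forms to emit; Python's standard `json` module has separate scalar and list types and round-trips both. This only demonstrates the Python side, it does not run any R.

```python
import json

# A bare number and a one-element array parse to different Python types.
scalar = json.loads("1")
array = json.loads("[1]")

# Serializing them again preserves the difference.
scalar_roundtrip = json.dumps(scalar)
array_roundtrip = json.dumps(array)

print(type(scalar).__name__, scalar_roundtrip)  # prints "int 1"
print(type(array).__name__, array_roundtrip)    # prints "list [1]"

# Integers and floats also stay distinct through a round trip.
print(json.dumps(json.loads("1.0")))  # prints "1.0"
```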

    • rout39574 2 days ago

      The R development community is (was?) consciously focused on a single operator managing their own stuff on their workstation.

      Considerations like repeatable procedures, reliable package hierarchies, etc. were clearly and more or less politely Not Interesting. I spent several years with one of my tasks being an attempt to get the R package universe into Gentoo, and later into RPM packages.

      I wouldn't say the R devel community was rude about it, but the systems-administration view of how to maintain the language was just not on their radar.

      At the time I was trying to provide a reliable taxonomy of packages to a set of research machines at a good sized university. Eventually, I gave up on any solution that involved system package managers, or repeatability. :)

      So if you're a researcher driving your own train, R is freakin' FANTASTIC. If you're the SA attempting to let that researcher's department neighbors do the same thing on their workstations, anticipate fun.

      • dataspun 2 days ago

        I think this SA perspective is outdated or perhaps never adequately investigated. Packages like renv solve for these issues, and they work great.

    • tomrod 2 days ago

      As someone who uses Python, R, Go, Rust, Fortran, and Java...

      I would never write a full stack application in R. Terrible maintainability.

      • chaosist 2 days ago

        I would have said this a year ago but R is a language for statisticians and not software engineers. It took me forever to understand this.

        Statistical Rethinking by McElreath is really what finally got me to see the value of R.

        You can find the Python versions of the class and they are certainly not better.

        A full application in R really makes absolutely no sense.

        • tomrod 2 days ago

          I've wrapped R in Python before. That was okay, a bit stilted, but I could still take it to production if absolutely necessary.

          You're 100% right that R is great for data scientists (my background) for frontier-level academic implementations as well as toy/simple models. It's generally a poor runtime for computation and suffers from many of the same issues as Python for data quality and typing. Python is better for battle-hardened type stuff and has better debugging tools for certain.

          R _can_ be done well, but the juice isn't worth the squeeze typically.

      • persedes 2 days ago

        As a maniac who has written a full stack application in R: I agree. It is easy for the average R user to get something out the door quickly, but maintenance will quickly show you how brittle everything is.

        • tomrod 2 days ago

          Hello fellow maniac!

          Glad to hear the agreement. Little small projects are so fun. Stadium-sized pools full of spaghetti less so!

      • waveBidder 2 days ago

        I'm in this picture, and I'm begging my collaborators to move to Julia or Python.

    • waveBidder 2 days ago

      In addition to what the neighboring comments say, there are multiple standards for everything, even core language features like objects, which makes reasoning about code difficult. dplyr and friends, which make the aforementioned data analysis easier, require interacting with one of the least consistent metaprogramming systems I've ever dealt with.

hackoo 2 days ago

As an R developer, I can deeply relate to this. People who use R as their first language are probably from academia. They can be very good at data analysis and scripting, but usually have no idea about software development. They have little knowledge of versioning, modularization, documentation, and testing, which can lead to issues when they develop production-level R packages and Shiny applications.

On the flip side, R developers with strong software development skills can be incredibly valuable in academic settings.

  • coliveira 2 days ago

    Everything you say explains why R is a great language in academia, and why people in research will stay away from "software engineering"-oriented languages like Java and JavaScript. Researchers already have too much to worry about.

    • hackoo 2 days ago

      I agree that researchers usually don’t need to worry about software engineering because their code is often simple, used only for one paper, and intended for personal use. But for large-scale research involving multiple researchers, users, and datasets, a software engineering mindset can be beneficial in the long term.

  • giraffe_lady 2 days ago

    I have a friend who left tech after 15 years to go back and get a CS PhD. He ended up with a permanent job being the only "software guy" on a team of like exclusively math and physics post-docs. I'm still a little fuzzy on his day-to-day but I think it's basically just having enough of a handle on the research math to effectively product manage all the gnarly research code they produce. It sounds both frustrating and fun.

    • hackoo 2 days ago

      It can be frustrating if those post-docs underestimate the time cost and engineering complexity of software development. They would complain about why it takes so long for your friend to "just" change a button in some app.