nerdponx 4 hours ago

I have some questions that are not answered by the homepage.

1) How does this work with function parameters that are intended to be captured unevaluated with substitute()? Do you type the input as "any" and document separately that the parameter is kept "unevaluated" as a symbol/name or call?

2) How does this work with existing untyped R code? Does it at least include types for the standard library (or some subset thereof?)

3) Is there any type inference, or does it require explicit type annotation everywhere?

4) How do you propose to handle NA (which can appear "within" any typed vector)? Does the compiler support refinement types? If not, how does checking for and preventing nullability work, when checking for NA values requires a runtime check?

5) How do data frames work? Are they typed like structs?

6) Which object systems does it support, if any? S3, S4, Reference Classes, or the 3rd-party R6?

As much as I like static types, I feel like R is maybe the language where I need or want them the _least_. How often do you really run into a situation where you pass a character vector to a function that requires a numeric vector and it crashes your program?

99% of the time what you really want is known-valid data frames for data processing, and statically-sized arrays for math stuff.
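
Question 4 above turns on the fact that missing values can only be ruled out by a runtime check. A minimal sketch of one common answer, written in Python rather than Vapour: the check runs once and returns a distinct type, so downstream functions can require "already checked" in their signature. The names `CompleteInts`, `require_complete`, and `total` are hypothetical and are not taken from the Vapour docs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class CompleteInts:
    """Hypothetical wrapper: an integer vector known to have no missing values."""
    values: Tuple[int, ...]


def require_complete(values: Sequence[Optional[int]]) -> CompleteInts:
    """Runtime check that narrows a possibly-missing vector to a complete one."""
    missing = [i for i, v in enumerate(values) if v is None]
    if missing:
        raise ValueError(f"missing values at positions {missing}")
    return CompleteInts(tuple(values))


def total(vec: CompleteInts) -> int:
    # By convention CompleteInts is only built by require_complete,
    # so no missing-value handling is needed here.
    return sum(vec.values)
```

Here `total(require_complete([1, 2, 3]))` returns 6, while `require_complete([1, None, 3])` raises a `ValueError` at the point where the data enters, not later inside `total`.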

  • fn-mote 14 minutes ago

    > As much as I like static types, I feel like R is maybe the language where I need or want them the _least_.

    I really disagree with this.

    I think one of the reasons there is a whole Tidyverse ecosystem is that the behavior of (some) R code is unintuitive in a way that adding typing would absolutely improve.

    It seems like you're deeply familiar with the R ecosystem, but as a user what I want is a safe subset of R that I can use.

    > How often do you really run into a situation where you pass a character vector to a function that requires a numeric vector and it crashes your program?

    In R the more likely situation is that you pass in the wrong typed thing and it silently continues with very unexpected values being passed, causing trouble or errors much later in the program. Which is very much a problem that typing helps with.

andrewla 8 hours ago

As an R programmer the examples given on the landing page seem very foreign to me -- you are almost always writing vectorized code in R, so I would think that would be front and center.

    let x: int = 1

Is this a list of ints or a pure singleton? R doesn't have scalar types, so it would seem the former, but the example makes it unclear. Later in the docs it makes it clearer:

    let x: int = (1, 2, 3)

And this, as an R developer, I can definitely get behind -- the c(...) syntax is always awkward and having a native syntax for static arrays is a welcome change.

  • juujian 7 hours ago

    Yeah, it's not an idiomatic example. I like the idea, but this makes me worry that the project does not have the right priorities. I.e., supporting my use cases :D

mushufasa 8 hours ago

The main reason we shy away from R for production apps is all the silent errors where things seem to succeed while being horribly wrong if you take a look. Typing would certainly help mitigate that.

johnnybzane 5 hours ago

How do I find jobs that use the R language? It's impossible to search for the letter "R" on LinkedIn or Indeed without getting a bunch of unrelated job postings

"R" is the only programming language I know and I can't find a job that uses R because job search engines don't allow you to sort by skill

"R language" is the closest substitute on LinkedIn but the results are still a jumbled mess of jobs, some looking more for other skills (SQL/Python)

I know R-heavy jobs exist but finding them on LinkedIn is virtually impossible

  • clircle 4 hours ago

    Why would you do that? R is just a tool for doing statistics or research. You need to search for jobs in your subject area like "ecologist", "econometrician", "green energy researcher", etc.

  • Balladeer 4 hours ago

    How does "R language" compare to searching for one of the popular R packages? Searching for "tidyverse", "dplyr", or "ggplot" seems to get a good chunk of hits. That being said, yeah, there does seem to be a trio of skills that often go together (R, python, SQL)

    • johnnybzane 4 hours ago

      If you search specific packages on LinkedIn the number of jobs is usually very small

      E.g. tidyverse or dplyr is like 20-40 jobs. ggplot is 88. There are definitely way more than 100 companies looking for R-heavy users.

  • dkga 2 hours ago

    Perhaps #rlang would work? Or #tidyverse if you are feeling tibblish :)

  • kagevf 3 hours ago

    I tried using "r" (with quotes) on Indeed, and got some hits where R was listed as one of the necessary skills.

ecshafer 8 hours ago

I think this is a great idea for the project. I don't dislike the syntax, but the syntax seems more ML than R to me. I think keeping the syntax more R-like could be worthwhile.

uptownfunk 8 hours ago

Will this fix the problems it claims to? The power of R is the rich package ecosystem. It caters to people who don’t want to think about engineering concerns but want a fast way to access the powers of computation rather than building a scalable system, two very different things. It excels at the former. A new language will not fix this, because this type of thinking has infected the entire package ecosystem. Frankly with code translation you probably don’t need a new language. Prototype in R and code translate to Python or whatever you want to use in prod. Or frankly just do code gen directly in Python so you can skip having to confirm if the results match.

To be clear, I love R, it excels in prototyping but I have seen too many real world struggles of folks trying to move to prod that I would say save it for EDA projects and one time analyses.

  • _Wintermute 7 hours ago

    I often find I want a specific statistical package that's only in R, but want a more general purpose language for all the other stuff that's involved (parsing, filesystem stuff, error handling etc). I don't want to risk re-writing the statistical methods and all their dependencies in the sensible language, so I end up calling R only for the statistical methods, but I can see this as an alternative.

  • joshdavham 5 hours ago

    > A new language will not fix this, because this type of thinking has infected the entire package ecosystem.

    Do you think the culture of the package ecosystem could possibly change in the future?

joshdavham 5 hours ago

Looks interesting! What types of programs do you think people would write in this language? I don't see an obvious need for traditional R programs which are usually just scripts for working with data, but maybe people could write R packages in this language?

clircle 8 hours ago

Statisticians and researchers, is this helpful?

  • tech_ken 6 hours ago

    I would say that the vast majority of type problems in data science/stats workflows come from data tables "trojan-horsing" type or missing data issues, rather than type problems strictly at the code level. Type annotations won't help you when your upstreams decide they want to change the format of their year-quarter strings without telling you.

    • dragonwriter 6 hours ago

      > Type annotations won't help you when your upstreams decide they want to change the format of their year-quarter strings without telling you.

      IME with both Python and JS/TS, it helps a lot (which is different than completely solving the problem), for reasons which should generalize to other typing add-ons/supersets for untyped languages. Typing your code forces validations at the boundaries, which obviously doesn't stop upstream sources from messing with formats but it does mean that you are much more likely to catch it at the boundary rather than having weird breakages deep in your code that you have to trace back to bad upstream data.

      • tech_ken 6 hours ago

        Is the idea that if my year_quarter parser is properly typed then it should detect the format change and throw an error? (kind of a silly example, just trying to be illustrative)
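
The boundary validation discussed in this exchange might look like the following Python sketch. It assumes, purely for illustration, that upstream sends strings such as "2024-Q3"; `YearQuarter` and `parse_year_quarter` are hypothetical names. The type annotation alone does not detect a format change, but it forces the raw string through one conversion function, and that function is where the change surfaces.

```python
import re
from dataclasses import dataclass

# Assumed upstream format for illustration: four-digit year, "-Q", quarter 1-4.
_YEAR_QUARTER = re.compile(r"(\d{4})-Q([1-4])")


@dataclass(frozen=True)
class YearQuarter:
    year: int
    quarter: int


def parse_year_quarter(raw: str) -> YearQuarter:
    """Convert a raw upstream string to a typed value, or fail loudly."""
    match = _YEAR_QUARTER.fullmatch(raw)
    if match is None:
        raise ValueError(f"unexpected year-quarter format: {raw!r}")
    return YearQuarter(year=int(match.group(1)), quarter=int(match.group(2)))
```

With this in place, `parse_year_quarter("2024-Q3")` yields `YearQuarter(year=2024, quarter=3)`, and a silently changed format such as "Q3 2024" raises a `ValueError` at ingestion instead of producing odd values deeper in the pipeline.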

  • ellisv 4 hours ago

    It is probably helpful in some cases and unhelpful in others. R uses multiple dispatch, so calling `foo` on different types can produce different output. It isn't clear to me how Vapour handles this. In general though, folks are passing around data.frame or similar objects.
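
For readers less familiar with dispatch, here is a rough Python analogue using the standard library's `functools.singledispatch`. It selects an implementation from the type of the first argument only, which is closer to R's S3 generics than to S4's multiple dispatch. The function `foo` mirrors the placeholder name in the comment above, and the method bodies are made up. How Vapour actually types such generics is not something this sketch claims to show.

```python
from functools import singledispatch


@singledispatch
def foo(x) -> str:
    # Fallback used when no more specific implementation is registered.
    return "default method"


@foo.register
def _(x: int) -> str:
    return "integer method"


@foo.register
def _(x: str) -> str:
    return "character method"
```

Calling `foo(1)`, `foo("a")`, and `foo(1.5)` returns "integer method", "character method", and "default method" respectively. A static checker needs to know the set of registered methods to say anything useful about the result of `foo(x)`.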

  • levocardia 5 hours ago

    Not really, because honestly a lot of us who came into programming via research never learned typed languages or unit tests or any of those best practices - we were just hacking around in MATLAB, R, or Python from the start. What I really need is a seamless and easy way to run statistical models that can only be fit in R, but from Python or Node. There are several categories of statistical modeling where R completely blows python out of the water, and it's incredibly wasteful (and error-prone) to try to re-implement these yourself in Python.

russellbeattie 4 hours ago

This isn't specifically about Vapour, just about what's become the common way to specify types.

I know this is totally bike shedding, semantics, vi vs Emacs, BigEndian vs LittleEndian and it's too late now to affect anything, but to me using a colon after the variable is just wrong!

    let x : int = 1

    func add(x: int, y: int): int { return x + y }

I see that and it looks like int = 1 and the function's return type is totally lost.

This seems completely backwards to me. Maybe I'm just used to the way C did it, but the variable modifiers should come first.

    let int x = 1

    func int add(int x, int y) { return x + y }

Why we reversed it and added in the colon just doesn't make much sense to me.

bachmeier 7 hours ago

I took a couple stabs at this long ago (even before there was a TypeScript for inspiration). The first attempt was to add types to the syntax of R, but that would have required a lot more time than I had. Properly catching errors is a massive undertaking requiring a lot of background I don't have. The second attempt was to add syntax for types to R and then compile the code to another language. That's easy to do, but really boring, so I wasn't able to stick with it. It comes with the advantages of static typing and R code that runs very fast. I gave up and went with embedding R inside a statically typed language. Very happy with my choice.

Good luck to the authors of this. I believe it solves an important problem for R package authors and others wanting to write bigger programs. It's hard to argue with the benefits of static typing for this type of work.

lloydatkinson 8 hours ago

This looks nice. I find R to be an unreadable mess. The comparison shows a great improvement.

  • qudat 7 hours ago

    The default IDE workflow is like a Python "notebook" where code can be, and is, run in whatever order the creator wants. All the R code I've read treats it as such, and it results in an absolute mess to read and manage.

layer8 7 hours ago

Sounds like vapourware. ;)

  • joshdavham 5 hours ago

    I mean, there is an alpha you can download. If it was just a landing page and an email waitlist, then that would be vaporware.

    • layer8 4 hours ago

      I was commenting on the naming choice.

brudgers 2 days ago

[flagged]

  • johncoene 2 days ago

    First, how is that "giving myself an excuse"? Second, it's a total non sequitur, and even then, it's a day old; has it broken?

    • brudgers 2 days ago

      the syntax might change, things will break, expect bugs.

      Bugs are normal software development.

      Changing syntax and breaking things make work for everyone else for the convenience of developers. Reliability is what makes a tool a tool.

      • Terretta 14 hours ago

        > Changing syntax and breaking things make work

        How else might one explore a new language (vapour) in the open among interested like-minded developers seeking to iterate on a tool found lacking (R)?

        Changing and iterating things makes.

        • ausbah 8 hours ago

          they aren’t wrong. backwards compatibility is supposed to be one of the first promises of any mature programming language, unless you make it explicit via noting breaking changes in major version updates (1.X.X -> 2.X.X) or the language is purely for R&D and makes no guarantee of anything
