Pyrefly: A new type checker and IDE experience for Python
(engineering.fb.com) 226 points by homarp a day ago
Hey Kevin, we overlapped for a bit during your time at FB when I was working on Flow. Nice to hear from you!
I’m working on Pyrefly now, but worked on Flow for many years before. For what it’s worth, we are taking a different approach compared to Flow and have explicitly prioritized open source and community building, something I know we both care a lot about.
Of course, nothing is guaranteed and we’ve seen plenty of volatility in bigco investments to infra lately, but I do believe we’re starting this journey on the right foot.
Cheers, Sam
Meta seems to place a pretty high premium on controlling its open source projects, especially dev tooling. I guess dating back at least to the git maintainers telling them they were doing things wrong with their monorepo and refusing to upstream scale fixes, which precipitated their migration to mercurial (who were more than happy to take the contributions).
Given the change velocity of internal tooling you can understand why owning your own project makes sense here.
JSX is my favorite thing to come out of Facebook (also the only good thing).
I feel bad for the people who love JSX and don't know about lit-html yet.
Hi folks, I work on the Pyrefly team at Meta. Our FAQ covers a good number of the questions raised here: https://pyrefly.org/en/docs/pyrefly-faq/. I can also try to answer some of your questions. Thanks for taking a look!
Hack is quite a bit more optimized than Pyre was, but Pyrefly is at least 10x faster than Pyre on the IG codebase.
I didn’t know about the Rust-based Hack checker. That’s really cool!
Seems there are at least three Rust-based competitors for type checkers in Python now (Microsoft, Facebook, Astral), and of course there's still mypy.
Close, but Microsoft’s type checker Pyright is written in TypeScript. It’s still faster than mypy for me, though.
Pls forgive my ignorance, but how is Typescript (a superset of Javascript) used to type-check Python?
just like the Python compiler/interpreter is written in C.
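To illustrate the point: a type checker never executes the program it checks, it only reads source text, so the checker can be written in any language. Below is a minimal sketch in Python using the standard `ast` module. The `find_unannotated` helper is hypothetical and far simpler than anything Pyright or Pyrefly actually do.

```python
import ast

SOURCE = '''
def add(a: int, b: int) -> int:
    return a + b

def greet(name):
    return "hi " + name
'''


def find_unannotated(source: str) -> list[str]:
    """Return names of functions that have at least one unannotated parameter."""
    tree = ast.parse(source)  # parse only; the analysed code is never run
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            if any(arg.annotation is None for arg in node.args.args):
                missing.append(node.name)
    return missing


print(find_unannotated(SOURCE))
```

The same analysis could be done by a TypeScript or Rust program with its own Python parser; nothing here depends on running the code under inspection.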
Yes. If you want runtime validation of data you’re taking in, people recommend pydantic. If you’re looking for runtime validation within your own code, I’ve seen people use beartype, though to be honest I don’t personally understand the value added from it.
...or Marshmallow, which allows one to do many complex validations in a relatively trivial manner.
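For readers unsure what "runtime validation" means next to static checking, here is a rough standard-library sketch. It is a stand-in for the idea only, not the API of pydantic, beartype, or Marshmallow, all of which do far more; the `User` class is a hypothetical example.

```python
from dataclasses import dataclass, fields


@dataclass
class User:
    name: str
    age: int

    def __post_init__(self) -> None:
        # Compare each value against its annotated type while the program runs.
        # This only handles plain classes such as str and int, not generics.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, f.type):
                raise TypeError(
                    f"{f.name} must be {f.type.__name__}, got {type(value).__name__}"
                )


print(User(name="Ada", age=36))
```

A static checker would flag `User(name="Ada", age="36")` before the program runs; the check above catches it when the data only arrives at runtime, for example from a JSON payload.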
this is the official announcement, but pyrefly was previously discussed a few weeks ago: https://news.ycombinator.com/item?id=43831524
"Today we’re releasing Pyrefly as an alpha. At the same time, we’re busy burning down the long-tail of bugs and features aiming to remove the alpha label this Summer."
Why is "written in Rust" a feature to be mentioned? Who cares? So my type checker has memory protection and is compiled. I'm not running my type checker on an embedded system or in a mission critical service. It seems kind of like "written in Erlang". I'd prefer to have non-performance critical code for Python written in Python. That way the broader community can support and extend it.
Have you used Rust before? As a user, the speed and safety are nice. But as a developer, Rust projects are easier to hack on and contribute to.
That's kind of the whole appeal of Astral: they want to bring Rust-quality tooling to Python. I know Python better than Rust, but it's a lot easier for me to hack on Rust projects.
> Rust projects are easier to hack on and contribute to.
This was actually the subject of a study at the University of Waterloo:
> We find that despite concerns about ease of use, first-time contributors to Rust projects are about 70 times less likely to introduce vulnerabilities than first-time contributors to C++ projects.
> but it's a lot easier for me to hack on Rust projects
That static typing is nice, I wonder if it's going to catch on one day.
The amount of energy spent trying to bend dynamically typed languages into being real ones is just comical. Even the standard library is barely typed, so they give no fucks https://github.com/python/cpython/blob/v3.13.3/Lib/re/__init...
What does it accept? Who knows. What does it return? Don't worry about it
Static typing is a big one, but I've been so steeped in Python that I don't appreciate it as much as maybe I should.
The big thing for me is that most Rust projects are statically(ish) compiled (like go) and only need a `cargo build`. No venvs or pip commands or node/npm/nvm or make, etc.
re.match takes a pattern and a string and returns a match object (or None). There are most likely stubs, if you are new to it and need support.
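As a concrete sketch of what the stubs give you: type checkers see `re.match` on a `str` pattern as returning an optional match object, so the `None` case has to be handled. The `leading_int` helper below is a hypothetical example.

```python
import re
from typing import Optional


def leading_int(text: str) -> Optional[int]:
    """Return the integer at the start of text, or None if there is none."""
    match: Optional[re.Match[str]] = re.match(r"\d+", text)
    if match is None:
        # Without this guard, a checker would flag match.group() below.
        return None
    return int(match.group())


print(leading_int("42 apples"))
print(leading_int("apples"))
```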
> Rust projects are easier to hack on and contribute to.
You can say that about any languages that you yourself know, or other people know. There are beautiful codebases in many other languages, and awful ones in the same languages.
If your Rust codebase has a lot of unwraps and lifetime annotations (among other things), I will probably not find it a joy to contribute to it, at all.
> You can say that about any languages that you yourself know, or other people know.
No, I'm saying that Rust was easier to hack on and contribute to (on my own) when I had never written any Rust before. Rust (and almost Go) is the only language I can confidently say this about. It's not even in my top 5 strongest languages now, but I still stand by this.
E.g. Look at the build instructions for Gimp and all its prerequisites: https://developer.gimp.org/core/setup/build/
Very normal C++ project, ~500 words of instructions. Once I started thinking about using a chroot to fix dependency issues after I'd already built babl and gegl, I gave up, because I ran out of free time. It didn't matter how much C++ I knew.
Rust projects, comparatively, almost never demand that. It's almost always just `cargo build`, with rare exceptions (the one I know of is Graphite, which is a web app and also uses npm).
I do not know about Rust (because of all this lifetime and borrow-checking stuff), but Go is definitely one of the languages one can easily contribute to without knowing much about the language, especially if you use VSCode with its Go extension.
I do not like C++ projects, they are behemoths, slow to compile, and C++ continues getting so much bullshit bolted on top of the language. It is extremely complicated, at least for me.
Most of my projects, regardless of language, have extremely simple build instructions, no matter how large they are.
As for GIMP, "meson" and "ninja" are not too bad. I have come across projects with much worse build instructions, but I agree, it is leaning towards "complicated".
> Once I started thinking about using a chroot to fix dependency issues after I'd already built babl and gegl
Been there, done that. I remember compiling GCC myself, it was not as straightforward as, say, LLVM + clang. I think the issue is not with the language itself, however, but the developers who do not simplify the build process.
Rust has very arcane syntax and a lot of rules that developers coming from interpreted / garbage-collected languages (like the ones using these tools) would have a hard time grasping. It’s easy for people who are already familiar with it, but isn’t that always the case?
It makes no sense to me. You found it easier to contribute to Rust projects because... Rust projects are significantly easier to build? What? You can just do "dune build" in OCaml, or run "make" for many C projects. Plus, it is also significantly slower to build Rust projects, you should have probably added that.
An LSP is performance-critical code. It directly affects responsiveness of your IDE, or even the viable scope of a project that the LSP can handle.
Rust is both CPU- and memory-efficient, quite unlike Python. (It could have been OCaml / Reason, or Haskell, they are both reasonably fast and efficient, and very convenient to write stuff like typecheckers in. But the circle of possible contributors would be much narrower.)
The circle of possible contributors doesn't really matter. It's a Meta project, they have others written in OCaml, and to this day they manage to have contributors, e.g. https://github.com/facebook/flow, because they hire and pay them.
>Why is "written in Rust" a feature to be mentioned? Who cares?
A lot of people. Correct or not, I think "written in Rust" has become synonymous with "very much faster than the alternatives" in the Python community.
I feel like the likelihood that a project will say what language it is written in is much higher if that language is Rust. I like Rust but I do find this trend a little annoying. (Even though I acknowledge that "written in Rust" probably means the tool is relatively new, not buggy, and easy to use.)
> I'd prefer to have non-performance critical code for Python written in Python
A type checker is performance critical code. You can watch how Pylint, just a linter, written in Python, lints your source code line by line. It's so slow it can take 30 seconds to update the linting after you change some lines.
Many of these make the mistake of running against an entire codebase instead of checking vcs first and only running against changed files.
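A rough sketch of the "ask the VCS first" idea, assuming a git checkout. The helper names are hypothetical and this is not how Pylint or any particular tool is implemented. One caveat: a type checker usually also has to re-check files that import the changed ones, so this fits linters better than type checkers.

```python
import subprocess


def python_files(diff_output: str) -> list[str]:
    """Keep only the .py paths from `git diff --name-only` style output."""
    paths = (line.strip() for line in diff_output.splitlines())
    return [path for path in paths if path.endswith(".py")]


def changed_python_files(base: str = "HEAD") -> list[str]:
    """Ask git which files differ from `base`; must run inside a repository."""
    result = subprocess.run(
        ["git", "diff", "--name-only", base],
        capture_output=True,
        text=True,
        check=True,
    )
    return python_files(result.stdout)


# The pure helper can be tried without a repository:
print(python_files("app/models.py\nREADME.md\ntests/test_models.py\n"))
```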
> Why is "written in Rust" a feature to be mentioned? Who cares?
If one is a "purist", the idea of non-python tool involvement may dissatisfy.
Scare-quoting "purist", given the general lack thereof anywhere in Open Source, python itself being a case in point.
I just tried pyrefly on a project that really needed it. It complained about an assignment of a new value to a global int variable within a function, even though the function contained the 'global' statement that should have made that OK, I think. I know that globals and assigning to them here and there are problematic for real good software, but I am surprised that Pyrefly is stricter than python on something that I don't see as a type-checking issue. But it did find a decent list of other problems that I haven't finished working my way through.
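The pattern described presumably looks something like the sketch below. This is a guess at the shape of the code, not the actual program, and the names are hypothetical. It runs without error in CPython; what exactly Pyrefly reported cannot be told from the description.

```python
counter: int = 0


def bump() -> int:
    global counter  # rebinds the module-level name instead of creating a local
    counter = counter + 1
    return counter
```

A minimal file like this is also a convenient reproduction to attach to a bug report if the checker flags it.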
I had gotten so messed up trying to put together a quickie hobby-type program to create a data structure of perhaps a hundred data items in various overlapping and inter-related hierarchies, tuples, dicts, and lists akimbo, that I gave up on it about 10 days ago. I hypothesized that bondage and discipline might be the way to control the confusion, so I'm rewriting, using SQLite for the dataflow from function to function, lots of little tables and no optional fields. Can anyone opine on whether that is a sensible option?
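For what "lots of little tables and no optional fields" might look like, here is a minimal sketch with the standard `sqlite3` module. The table and column names are hypothetical; overlapping hierarchies are modeled as a separate link table rather than nested dicts and lists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
conn.executescript("""
CREATE TABLE item (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE item_parent (
    child_id  INTEGER NOT NULL REFERENCES item(id),
    parent_id INTEGER NOT NULL REFERENCES item(id),
    PRIMARY KEY (child_id, parent_id)
);
""")


def add_item(name: str) -> int:
    cur = conn.execute("INSERT INTO item (name) VALUES (?)", (name,))
    rowid = cur.lastrowid
    assert rowid is not None
    return rowid


def children_of(parent_id: int) -> list[str]:
    rows = conn.execute(
        "SELECT item.name FROM item "
        "JOIN item_parent ON item.id = item_parent.child_id "
        "WHERE item_parent.parent_id = ? ORDER BY item.name",
        (parent_id,),
    )
    return [name for (name,) in rows]


root = add_item("root")
for child_name in ("b", "a"):
    conn.execute("INSERT INTO item_parent VALUES (?, ?)", (add_item(child_name), root))

print(children_of(root))
```

The `NOT NULL` constraints move the "no optional fields" rule into the database, so a missing value fails at insert time instead of surfacing later in another function.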
Thanks for trying it out! If you run into any blockers, please let us know by filing a GitHub issue or sending us a message on Discord. Pyrefly is still alpha software, so bugs are expected, but your feedback is extremely valuable as we work to squash them.
Happy to see instructions for integrating into Vim/Neovim: https://pyrefly.org/en/docs/IDE/#other-editors
The Rust code written here is so easy to follow, but all this new Python tooling being written in Rust worries me; it adds yet another vector to the N-language problem.
I hope Mojo can offer something here
For the Python ecosystem, it's natural to use Python where Python can cope, and a high-performance language where it cannot. There are two such languages in wide use around Python: Rust, and, inevitably, C. So N = 3.
(C, to my mind, should be eventually phased out from application programming altogether, so N would be 2, but it's a loooooong process; Python may become a legacy language before it converges.)
Yes but the idea is that by slightly upgrading python code to mojo (which is a controlled superset of python), you get compiled very high performance code. So for example if it were possible to convert mypy to mojo it could be as fast as rust but pythonic.
It has not. They decided to change the messaging to reflect what the language is today because people kept thinking it is already a superset.
Sadly I think the ship has sailed and Rust has hit critical mass now. Personally I find it aesthetically awkward, but for Python integration and tooling it seems like Rust has become the default C replacement. You would think Python devs might have preferred something more superficially Pythonic like Nim or perhaps something more C-ish like Zig, but those projects don't have the same buzz so here we are. There's probably more young devs who are into Rust than C nowadays.
I am not holding out much hope for Mojo because it feels deeply embedded in the AI/LLM hype space instead of being presented to Python devs outside of that niche as a useful language extension in its own right.
I don't think it really matters whether Rust has hit critical mass or not tbh, just the fact that it is entirely a new language to learn with very different semantics compared to Python is a blocker for many people.
Mojo right now is not much better, but I've seen Python compatibility factor into the language design and semantics again and again. It is not enough to be a language that looks like Python, like Nim, things also have to behave the same when the semantics of static typing allows.
Mojo is not deeply embedded in the AI/LLM hype, there is nothing in the language that is targeted specifically for AI. The standard Library has a GPU package for general-purpose gpu programming, but that isn't AI specific.
Hi! We address this question in our FAQ and probably could do a longer blog post about our experience after we are further along: https://pyrefly.org/en/docs/pyrefly-faq/#why-rust
> Not only is Pyrefly written in a new language (Rust instead of OCaml), but its design deviates in a major way from Pyre.
I'm sure you had reasons to do it this way. But given sufficient time to market, implementing the algorithm in pyre and then tooling/llm assisted conversion to pyrefly would've been preferable.
Maybe you'd have had some humans in the loop initially. But that tech is getting better and aligned with the direction Meta and the rest of the industry are taking.
Yes, I'm biased :)
It's probably cool n' all but fb isn't getting any of my attention. They'd need to come up with AGI for that to happen, and even then I'd shrug it off.
I agree. I simply can't support anything Mark Zuckerberg does at this point.
> Why we built Pyrefly: Back in 2017, we embarked on a mission to create a type checker that could handle Instagram’s massive codebase of typed Python
They're saying this on fb.com. How does it not have anything to do with fb?
The feedback section takes you to fb's github.
ok that would be impressive, "no no we're not interested in AGI -- we want to become god"
This is very cool but why wouldn’t they just contribute to uv and ruff and ty https://github.com/astral-sh/ty
I think astral and meta were both working on their own type-checkers independently. My current understanding is that meta released so they could preempt the initial release of ty. It seems like they're a bit further ahead in development. Not sure if there are going to be any real differences between the two down the line.
I tend to agree.
I don't know the differences between the two well enough to know if it was the case here, but in my experience sometimes you need to innovate on a fork, or from scratch in order to create the space/freedom to do so.
Once a project is popular, it's harder to justify and be confident about major changes (aka https://m.xkcd.com/1172/)
It seems like they share a lot of the same goals, but my impression is Poetry is much slower to pick up on standards. It’s normal to use uv with a project now that doesn’t have any [tool.uv] section in pyproject.toml at all, but every poetry project I’ve seen is littered with [poetry] sections, even dependencies. Makes me not want to use it.
I just ran ty and it can't resolve any imports whereas pyrefly passes. Why would that be? I hate Python so much.
ty doesn't invoke a Python interpreter to discover imports yet — so you need to either set `VIRTUAL_ENV` or pass `--python` to configure your target environment. We'll expand support here in the future, but this part of ty's interface is intentionally minimal while we focus on core type checking features.
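One way to find the environment to point the checker at is to ask the interpreter that already resolves your imports. The helpers below are hypothetical and use only `sys`; whether `--python` expects the environment root or an interpreter path should be checked against ty's own documentation.

```python
import sys


def active_environment() -> str:
    """Return the root directory of the environment this interpreter runs in."""
    return sys.prefix


def in_virtualenv() -> bool:
    """True inside a venv, where the prefix differs from the base installation."""
    return sys.prefix != sys.base_prefix


print(active_environment(), in_virtualenv())
```

Run it with the same interpreter that can import your dependencies; the printed path is the candidate value for `VIRTUAL_ENV`.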
Because this has been tested at Meta / Facebook scale, which means it's faster for any Python codebase, massive or small.
Since Meta built this, I have confidence this will be maintained more than others and I will use this and ask for Pyrefly experience in the future.
To repeat an earlier comment of mine from the launch of uv on hn (tl;dr: these new type checkers never support django):
The way these type checkers get fast is usually by not supporting the crazy rich reality of real-world python code.
The reason we're stuck on mypy at work is because it's the only type checker that has a plugin for Django that properly manages to type check its crazy runtime generated methods.
I wish more python tooling took the TS approach of "what's in the wild IS the language", as opposed to a "we only typecheck the constructs we think you SHOULD be using".
1. Maybe it's time to drop the crazy runtime generation and have something statically discoverable, or at least a way to annotate the typing statically.
2. Astral has already indicated they plan to just add direct support for Django and other popular frameworks.
3. As people replied to similar comments on the previous threads (maybe to you?): that's not why ty is fast and why mypy is slow. It's also an easy claim to disprove: run both without plugins and you'll see ty is still 100x+ faster.
> 1. Maybe it's time to drop the crazy runtime generation and have something statically discoverable, or at least a way to annotate the typing statically.
That, and duck typing, are among the biggest things that make Python what it is. If I have to drop all that for type checking and rewrite my code, why would I rewrite it in Python?
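On the duck typing half of this: static checking does not necessarily require giving it up. `typing.Protocol` (PEP 544) lets a checker verify "has the right methods" without any inheritance. A minimal sketch with hypothetical class names:

```python
from typing import Protocol


class Quacker(Protocol):
    def quack(self) -> str: ...


class Duck:
    def quack(self) -> str:
        return "Quack"


class Robot:
    def quack(self) -> str:
        return "Beep"


def speak(thing: Quacker) -> str:
    # Any object with a matching quack() is accepted; no base class is needed.
    return thing.quack()


print(speak(Duck()), speak(Robot()))
```

Runtime-generated attributes are a different matter, since a checker cannot see methods that only exist after the code has run.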
these are two different issues. supporting django involves adding a special-case module that essentially replicates its code generation and then adds that to the type-level view of the code. pyrefly or ty could do that and would still be just as fast. my guess is that once they have the basic python type checker as close to 100% as they can, they will start looking at custom modules for various popular metaprogramming libraries, or add enough of a plugin framework that the community can contribute them.
source: spent several years working on a python type checker
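To make "runtime generated methods" concrete, here is a toy imitation, loosely modeled on Django's `get_FOO_display` naming. Django's real mechanism uses metaclasses and is much more involved; the `Model` and `Article` classes below are hypothetical.

```python
class Model:
    """Toy base class that generates methods when a subclass is created."""

    fields: tuple = ()

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        for name in cls.fields:
            # Bind the current field name through a default argument.
            def getter(self, _name=name):
                return str(getattr(self, _name)).title()

            setattr(cls, f"get_{name}_display", getter)


class Article(Model):
    fields = ("status",)

    def __init__(self, status: str) -> None:
        self.status = status


article = Article("draft")
print(article.get_status_display())
```

Nothing in the source text of `Article` mentions `get_status_display`, so a checker that only reads the code has to either report the call as an unknown attribute or carry special-case knowledge that replicates the generation step.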
The problem is not with the type checkers.
https://code.djangoproject.com/ticket/32759
Similar (but lesser) problems exist with pydantic and sqlmodel. They're both fine projects except for:
https://www.reddit.com/r/Python/comments/1i5atpy/fquery_meet...
This is a long winded way of saying type checkers will deal with:
@sqlmodel
@pydantic
@dataclass
class MyModel:
    name: str
a lot better. Move what doesn't fit here to dataclass metadata.
The only type checker that fully works (meaning it successfully performs all the necessary type inference for inherited objects) for our large and highly modular Python codebase is PyCharm (I'm guessing it's their own custom tool from the ground up? Not really sure, actually.)
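Regarding the earlier suggestion to move extra information into dataclass metadata, a minimal sketch with the standard `dataclasses` module follows. The metadata keys (`max_length`, `db_index`) and the `column_options` helper are hypothetical, not part of sqlmodel or pydantic.

```python
from dataclasses import dataclass, field, fields


@dataclass
class MyModel:
    # The annotation stays a plain `str`, so type checkers need no plugin;
    # ORM-style options ride along in the field's metadata mapping.
    name: str = field(metadata={"max_length": 50, "db_index": True})


def column_options(model: type) -> dict:
    """Collect the per-field metadata that a framework could consume."""
    return {f.name: dict(f.metadata) for f in fields(model)}


print(column_options(MyModel))
```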
I lost all interest when I saw VS Code. I don’t get why people consider this a suitable IDE for python when you can have a real IDE like PyCharm.
What can you do in PyCharm that you cannot do in VS Code? I recently switched from PyCharm to VS Code to maintain a project with 250k LoC in Python (Django) and VS Code has been like a breath of fresh air. While you may need to install some plugins to get it "just right", it's more extensible. PyCharm is more "batteries included", and maybe that's the rub here.
pyrefly is not tied to vscode? Also please try to be more considerate of people's preferences, and pycharm is not strictly better. Remote dev on vscode is very convenient for me, should I go on the Internet saying that pycharm is trash? No
I’m not saying VS Code is trash but I think it’s closer to a text editor than an IDE. I even use it for some non-python things but I remain curious why people use it for python.
It might not be tied to VS Code but the title clearly says “New […] IDE experience for…” which is why I commented. I had hoped to see something for PyCharm or even a new IDE.
That's what you think, other people think something else.
I'm a little worried on behalf of the "Python Language Tooling Team" at Meta, because uv has been so popular, and I wouldn't be surprised if ty wins out in this space.
So watch out, or this will become like Atom or Flow, an internal competitor of a technology that is surpassed by the more popular external open source version, leaving the directors/vps muttering to themselves "It's too bad that this team exists at all. Could we get rid of them and just switch to the open source stuff?"
Perhaps just something for the manager (Aaron Pollack?) to keep an eye on....