Comment by gregsadetsky 3 days ago

I don't know this world well (I know what llvm is) but - does anyone know why this was made as a fork vs. contributing to llvm? I suppose it's harder to contribute code to the real llvm..?

Thanks

mysterymath 2 days ago

Hey, llvm-mos maintainer here. I actually work on LLVM in my day job too, and I don't particularly want llvm-mos upstream. It stretches LLVM's assumptions a lot, which is a good thing in the name of generality, but the way it stretches those assumptions isn't particularly relevant anymore. That is, it's difficult to find modern platforms that break the same assumptions.

Also, maintaining a fork is difficult, but doable. I work on LLVM a ton, so it's pretty easy to fold it into my work week-to-week. And quite surprisingly, I used AI to help last time, and it actually helped quite a lot!

  • nineteen999 2 days ago

    What's your take on sdcc 6502 support at the moment, if you have one? I'm just happy to finally have an 8-bit C compiler that supports both targets, even if the codegen for 6502 needs a lot of work right now.

    I'd happily take a llvm-z80 and llvm-6502 over sdcc if both were available

    Edit: oh wow, look at that https://github.com/grapereader/llvm-z80. Aw but not touched for 12 years.

    • bbbbbr 2 days ago

      Not the parent, but I have a take. :)

      For GBDK-2020 we've been using the 6502 support in SDCC to support the NES as a target console for about 2 years alongside the existing Game Boy and SMS/Game Gear targets.

      The 6502 port has been usable, but doesn't seem fully mature. There has been a lot of code churn for it during the last 12 months compared to the z80/sm83 ports as it gets improved. Recently (their recommended pre-release build 15614) this seems to have resulted in some breaking regressions that we haven't fully tracked down.

      Perhaps this port is getting less testing coverage than the z80/sm83 port. Unsure. The majority of the 6502 work seems to be done by a newer member of their team, with the longer term members seeming to be somewhat hands-off. That might be an additional factor.

      Edit: BTW, the 6502 port in SDCC at build 15267 (~4.5.0+) has been reasonably stable and usable, and is what we based our last GBDK-2020 release on (6 months ago).

      • nineteen999 a day ago

        Ah thank you, that is all very helpful - I've been using 4.4.0 which is fine for Z80 code, but yeah had the feeling 6502 code generation could be improved.

  • zozbot234 2 days ago

    Even if y'all don't particularly care about having the full backend upstream just yet, it still seems worthwhile to comprehensively document these assumptions within the project, and perhaps to upstream a few of the simpler custom passes where not too much "stretching" of assumptions is involved, if only to ease future forward-porting work.

weinzierl 3 days ago

These processors were very very different from what we have today.

They usually only had a single general purpose register (plus some helpers). Registers were 8-bit but addresses (pointers) were 16-bit. Memory was highly non-uniform, with (fast) SRAM, DRAM and (slow) ROM all in one single address space. Instructions often involved RAM directly and there were a plethora of complicated addressing modes.

Partly this was because there was no big gap between processing speed and memory access, but this makes it very unlikely that similar architectures will ever come back.

As interesting as experiments like LLVM-MOS are, they would not be a good fit for upstream LLVM.
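
To make the "8-bit registers, 16-bit addresses" point above concrete, here is a minimal C sketch of what a compiler for such a CPU has to do for a plain 16-bit addition: split it into two 8-bit additions and thread the carry between them, roughly mirroring a 6502-style add-with-carry sequence. The function name `add16_via_8bit` is hypothetical and only for illustration; this is not code from LLVM-MOS.

```c
#include <stdint.h>

/* Add two 16-bit values using only 8-bit quantities plus a carry bit,
 * the way an 8-bit accumulator machine has to. Hypothetical helper,
 * written for illustration only. */
uint16_t add16_via_8bit(uint16_t a, uint16_t b) {
    uint8_t a_lo = (uint8_t)(a & 0xFF);
    uint8_t a_hi = (uint8_t)(a >> 8);
    uint8_t b_lo = (uint8_t)(b & 0xFF);
    uint8_t b_hi = (uint8_t)(b >> 8);

    /* Low byte: add, then remember whether the sum overflowed 8 bits. */
    unsigned sum_lo = (unsigned)a_lo + (unsigned)b_lo;
    uint8_t carry = (uint8_t)(sum_lo > 0xFF);
    uint8_t r_lo = (uint8_t)sum_lo;

    /* High byte: add again, folding in the carry from the low byte.
     * Any carry out of this byte is dropped, so the result wraps. */
    uint8_t r_hi = (uint8_t)(a_hi + b_hi + carry);

    return (uint16_t)(((uint16_t)r_hi << 8) | r_lo);
}
```

For example, `add16_via_8bit(0x12FF, 0x0001)` yields `0x1300`: the low bytes overflow and the carry bumps the high byte. Every pointer increment on such a machine pays for this two-step dance.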

  • zozbot234 3 days ago

    > ... there was no big gap between processing speed and memory access, but this makes it very unlikely that similar architectures will ever come back. ...

    Don't think "memory access" (i.e. RAM), think "accessing generic (addressable) scratchpad storage" as a viable alternative to both low-level cache and a conventional register file. This is not too different from how GPU low-level architectures might be said to work these days.

    • djmips 2 days ago

      Great point. And you can even extend that to think like a 6502 or GPU programmer on an AMD, ARM or Intel CPU as well if you want the very best performance. Caches are big enough on modern CPUs that you can almost run portions of your code in the same manner. I bet TPUs at Google also qualify.

jjmarr 3 days ago

LLVM has very high quality standards in my experience. Much higher than I've ever had even at work. It might be a challenge to get this upstreamed.

LLVM is also very modular which makes it easy to maintain forks for a specific backend that don't touch core functionality.

  • codebje 3 days ago

    My experience is that while LLVM is very modular, it also has a pretty high amount of change in the boundaries, both in where they're drawn and in the interfaces between them. Maintaining a fork of LLVM with a new back-end is very hard.

    • jjmarr 3 days ago

      I know my company (AMD) maintains an llvm fork for ROCm. YMMV.

      • ahartmetz 2 days ago

        Do you know why it's a fork? Also, from this https://github.com/ROCm/llvm-project/commits/amd-staging/ it looks like it might be more appropriately called a staging branch than a fork.

        • jjmarr 2 days ago

          Various reasons, like embargoes on information, stuff we didn't want to wait for review on before shipping, or features that don't make sense for upstream like `hipcc` which is an `nvcc` wrapper.

          Our goal is to get most modifications that aren't in the third category upstream at some point, which makes the maintenance load bearable.

      • codebje 3 days ago

        I should have qualified: it's hard to do for an individual or very small team as a passion side-project. It's pretty time consuming to keep up with the rate of change in LLVM.

  • gregsadetsky 3 days ago

    Super interesting, thanks. I specifically thought that its modular aspect made it possible to just "load" architectures or parsers as ... "plugins"

    But I'm sure it's more complicated than that. :-)

    Thanks again

    • zozbot234 3 days ago

      LLVM backends are indeed modular, and the LLVM project does allow for experimental backends. Some of the custom optimization passes introduced by this MOS backend are also of broader interest for the project, especially the automated static allocation for provably non-reentrant functions, which might turn out to be highly applicable to GPU-targeting backends.

      It would be interesting to also have a viable backend for the Z80 architecture, which also seems to have a highly interested community of potential maintainers.
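
      As a rough illustration of the static allocation idea mentioned above (a hand-written sketch, not the actual llvm-mos pass): if a function can be proven never to be re-entered, its locals do not need a stack frame and can live at fixed addresses. In C terms that is roughly like turning the locals into `static` variables. The function names and the debug guard below are hypothetical.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Ordinary version: acc and i conceptually live in a stack frame. */
      uint16_t sum_bytes_stack(const uint8_t *buf, uint8_t len) {
          uint16_t acc = 0;
          for (uint8_t i = 0; i < len; i++) {
              acc += buf[i];
          }
          return acc;
      }

      /* Hand-written equivalent with locals at fixed addresses. This is only
       * valid if the function is never re-entered (no recursion, no call from
       * an interrupt handler while it is already running). */
      uint16_t sum_bytes_static(const uint8_t *buf, uint8_t len) {
          static uint8_t active; /* debug-only guard for the assumption */
          static uint16_t acc;
          static uint8_t i;

          assert(!active);
          active = 1;

          acc = 0;
          for (i = 0; i < len; i++) {
              acc += buf[i];
          }

          active = 0;
          return acc;
      }
      ```

      The point of doing this in the compiler rather than by hand is that the compiler can do the "provably non-reentrant" analysis itself, so the source stays ordinary C.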

Sharlin 3 days ago

Pretty sure that the prospects of successfully pitching the LLVM upstream to include a 6502 (or any 8/16-bit arch) backend are only slightly better than a snowball’s chances in hell.

  • alexrp 3 days ago

    Worth noting that LLVM has AVR and MSP430 backends, so there's no particular resistance to 8-bit/16-bit targets.

    • Sharlin 2 days ago

      Oh, thanks for the correction. I couldn’t find a comprehensive list of backends (which is weird) and the lists I did find only included 16+ bit targets.