Comment by HarHarVeryFunny

I wouldn't say intractable, but it's not clear whether LLVM's optimization framework is flexible enough for it.

From mysterymath's (LLVM-MOS) description, presenting some of zero page as 16 bit registers (to make up for lack thereof, and perhaps due to LLVM not having any other support for preferred/faster memory regions), while beneficial, still had limitations since LLVM just assumes that there will be FAST register-register transfer operations available, and that is not even true for the 6502's real registers (no TXY), let alone these ZP "registers" which would require using the accumulator to copy.

A code generation/optimization approach that would seem more guaranteed to do well on the 6502 might be to treat it more as tree search (with pruning) - generate multiple branching alternatives and then select the best based on whatever is being optimized for (clock cycles or code size).

Coding for the 6502 by hand was always a bit like that ... you had some ideas of alternate ways things could be coded, but there was also an iterative phase of optimization (cf search) to tweak it and save a byte here, a couple of cycles there ...

I've mentioned elsewhere I used to work for Acorn back in the day, developing for the BBC micro, with me and a buddy developing the ISO-Pascal system which was delivered in 2 16KB ROMs. Putting code in ROM gives you an absolute hard size budget, and putting a full Pascal compiler, interpreter, runtime library and programmers editor into a total 32KB was no joke! I remember at the end of the project we were still a few hundred bytes over what would fit in the ROMs, and had to fight for every byte to make it fit!

djmips a day ago

It is my conjecture that due to the 8 bit index registers, contrast that to 6800, 6809 and others, the 6502 becomes fundamentally a structure of arrays (SOA) system versus C which is coupled in it's libraries and existing code base with array of structures (A0S).

Optimizing code will never solve good data oriented design. This is just one of the reasons that Asm programmers routinely beat C code on the 6502. Another one is math. In the C language specification, if fixed point had been given equal footing with float that would also help.

These are such a blind spos that you rarely even see custom 6502 high level languages accommodate these fundamental truths.

BTW, growing up on the 6502, I had no problems moving into the very C friendly 68000 but later going backwards to the 6809, on the surface it looked so much like a 6502 that I was initially trying to force SOA in my data design before realizing it was better suited to AOS.

Reply View 5 replies

HarHarVeryFunny 13 hours ago

If we're comparing performance of code generated by a C compiler vs hand optimized assembler, then for it to be an apples-to-apples comparison the same data structures (e.g. SOA or AOS) need to be used in both cases.

Reply View | 3 replies
- djmips 8 hours ago
  
  Yes, that's true and should be how good 6502 high level code would be written.
  
  Reply View | 2 replies
  
  HarHarVeryFunny 3 hours ago
  
  Yep. C was always meant to be a "close to the metal" language providing a feature set that could be mapped pretty directly to the processors it was running on. It's a "low level, high level language" where the expectation is more what you see is what you get (WYSIWYG), even though a modern optimizer might be expected to remove invariant code out of loops, etc - localized efficiency gains, but not large scale transformation.
  So, optimal C targetting the 6502 is not going to look much like C targetting a modern processor. The developer still needs to be very aware of the limitations of the processor they are targetting.
  One somewhat radical thing that LLVM-MOS does is to analyze the program's call graph, and for functions that are not used recursively it will assign parameters and local variables to zero page instead, both for speed of access and to avoid need for a stack frame. Even though this violates the WYSIWYG mental model, this is a nice abstraction of what the assembly language programmer would have done themself.
  
  Reply View | 1 reply
  
  djmips 3 hours ago
  
  >One somewhat radical thing that LLVM-MOS does is to analyze the program's call graph, and for functions that are not used recursively it will assign parameters and local variables to zero page instead, both for speed of access and to avoid need for a stack frame. Even though this violates the WYSIWYG mental model, this is a nice abstraction of what the assembly language programmer would have done themself.
  very nice, it sounds similar to the 'compiled stack' concept. I've seen that here in the co2 language for the 6502
  "Variables declared as subroutine parameters or by using let are statically allocated using a "compiled stack", calculated by analyzing the program's entire call graph. This means scopes will not use memory locations used by any inner scopes, but are free to use them from sibling scopes. This ensures efficient variables lookups, while also not wasting RAM. However, it does mean that recursion is not supported."
  https://github.com/dustmop/co2
  
  Reply View | 0 replies
zozbot234 a day ago

There is a C standard extension for embedded/low-level programming which specifies fixed point arithmetic. (And other goodies such as hardware register access and multiple address spaces.)

Reply View | 0 replies