Comment by Joker_vD a day ago

2 replies

> so "5[arr]" is just as valid as "arr[5]"

This is, I am sure, one of the stupid legacy reasons we still write "lw a0, 4(a1)" instead of the more sensible "lw a0, a1[4]". The other one is that FORTRAN used round parentheses for both array access and function calls, so it stuck somehow.
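For anyone who hasn't seen the quoted equivalence spelled out: C defines `a[b]` as `*(a + b)`, and that addition doesn't care which operand is the pointer. A minimal sketch, where the helper name `subscript_commutes` is made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if both spellings of the subscript
   name the same element of the array. */
static int subscript_commutes(const int *arr, size_t i)
{
    assert(arr != NULL);
    /* arr[i] is *(arr + i) and i[arr] is *(i + arr); the addition is
       commutative, so both expressions designate the same object. */
    return &arr[i] == &i[arr] && &arr[i] == arr + i;
}
```

The assembly form `4(a1)` is the same sum of a base and an offset, just written with the constant first.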

kragen 17 hours ago

Generally such constant offsets are record fields in intent, not array indices. (If they were array indices, they'd need to be variable offsets obtained from a register, not immediate constants.) It's reasonable to think of record fields as functions:

            .equ car, 0
            .equ cdr, 8
            .globl length
    length: test %rdi, %rdi         # nil?
            jz 1f                   # return 0
            mov cdr(%rdi), %rdi     # recurse on tail of list
            call length
            inc %rax
            ret
        1:  xor %eax, %eax
            ret

To avoid writing out all the field offsets by hand, ARM's old assembler and I think MASM come with a record-layout-definition thing built in, but gas's macro system is powerful enough to implement it without having it built into the assembler itself. It takes about 13 lines of code: http://canonical.org/~kragen/sw/dev3/mapfield.S
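For comparison, here is roughly what the same record looks like when a compiler works out the layout. This is only a sketch: `struct cons` and `list_length` are names invented here, and the offsets in the comments assume an LP64 target such as x86-64 Linux.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cons cell mirroring ".equ car, 0" and ".equ cdr, 8". */
struct cons {
    intptr_t     car;   /* offset 0 */
    struct cons *cdr;   /* offset 8 when pointers are 8 bytes */
};

/* The compiler computes what the .equ lines spell out by hand. */
static_assert(offsetof(struct cons, car) == 0, "car is the first field");
static_assert(offsetof(struct cons, cdr) == sizeof(intptr_t),
              "cdr sits directly after car");

/* Same recursion as the assembly: nil is 0, otherwise 1 + length(tail). */
static long list_length(const struct cons *list)
{
    if (list == NULL)
        return 0;
    return list_length(list->cdr) + 1;
}
```

Here `list->cdr` plays the role of `cdr(%rdi)`: a field name applied to a base pointer.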

Alternatively, on non-RISC architectures, where the immediate constant isn't constrained to a few bits, it can be the address of an array, and the (possibly scaled) register is an index into it. So you might have startindex(,%rdi,4) for the %rdi'th start index:

            .data
    startindex:
            .long 1024
            .text
            .globl length
    length: mov (startindex+4)(,%rdi,4), %eax   # startindex[%rdi + 1]
            sub startindex(,%rdi,4), %eax       # minus startindex[%rdi]
            ret

If the PDP-11 assembler syntax had been defined to be similar to C or Pascal rather than Fortran or BASIC, we would, as you say, have used startindex[%rdi,4].
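In C terms the snippet above is just a subtraction of two neighbouring table entries, with the table's address as the constant and the index as the scaled register. A sketch, assuming a table with more than one entry; every value after 1024 is made up, and `item_length` is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table; only the first value appears in the assembly above. */
static const int32_t startindex[] = { 1024, 1100, 1228, 1500 };

/* Length of item i, computed as startindex[i + 1] - startindex[i]. */
static int32_t item_length(size_t i)
{
    /* Both i and i + 1 must be valid indices into the table. */
    assert(i + 1 < sizeof startindex / sizeof startindex[0]);
    return startindex[i + 1] - startindex[i];
}
```

With these made-up values, `item_length(0)` is 1100 - 1024 = 76.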

This is not very popular nowadays, both because it isn't RISC-compatible and because it isn't reentrant. AMD64 in particular is a kind of peculiar compromise: the immediate "offset" for startindex and startindex+4 is 32 bits, even though the address space is 64 bits, so you could conceivably make this code fail to link by placing your data segment in the wrong place.
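To make "the wrong place" concrete: the 32-bit displacement is sign-extended to 64 bits, so an absolute address is only encodable if it lies in the lowest or the highest 2 GiB of the address space. A small sketch of that check, where `fits_disp32` is a hypothetical helper and not anything a linker exposes:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if addr survives a round trip through
   a sign-extended 32-bit displacement, 0 otherwise. */
static int fits_disp32(uint64_t addr)
{
    /* Non-negative int32 values cover 0 .. 0x7fffffff; negative ones
       sign-extend to 0xffffffff80000000 .. 0xffffffffffffffff. */
    assert(sizeof(uint64_t) == 8);
    return addr <= UINT64_C(0x7fffffff)
        || addr >= UINT64_C(0xffffffff80000000);
}
```

A data segment placed at, say, 0x100000000 fails this check, which is the situation where the absolute `startindex(,%rdi,4)` form can't be encoded.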

(Despite stupid factionalist stuff, I think I come down on the side of preferring the Intel syntax over the AT&T syntax.)

beng-nl 20 hours ago

Yes, I find this one of the weird things about assembly - appending (or prepending?) a number means addition?! - even after many, many years of occasionally reading/writing assembly, I’m never completely sure what these instructions do, so I infer from context.