Comment by crote a day ago

10 replies

The horrifying side effect of this is that "arr[idx]" is equal to "idx[arr]", so "5[arr]" is just as valid as "arr[5]".

Your colleagues would probably prefer if you forget this.
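
A minimal sketch in C of the equivalence described above. The function names are made up for illustration; the only thing taken from the language itself is that `E1[E2]` is defined as `*((E1) + (E2))`, and that adding a pointer and an integer works in either order.

```c
#include <assert.h>
#include <stddef.h>

/* Three spellings of the same read. Each one evaluates to the element
   idx positions past the start of arr. */
int read_indexed(const int *arr, size_t idx)
{
    assert(arr != NULL);
    return arr[idx];
}

int read_swapped(const int *arr, size_t idx)
{
    assert(arr != NULL);
    return idx[arr];          /* parsed as *(idx + arr) */
}

int read_deref(const int *arr, size_t idx)
{
    assert(arr != NULL);
    return *(arr + idx);
}
```

Given a six-element array, all three return the last element for `idx == 5`. The same trick works on string literals, so `1["abc"]` and `"abc"[1]` are both `'b'`.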

miningape a day ago

Mom, please come pick me up. These kids are scaring me.

Joker_vD a day ago

> so "5[arr]" is just as valid as "arr[5]"

This is, I am sure, one of the stupid legacy reasons we still write "lr a0, 4(a1)" instead of more sensible "lr a0, a1[4]". The other one is that FORTRAN used round parentheses for both array access and function calls, so it stuck somehow.

kragen 17 hours ago
Generally such constant offsets are record fields in intent, not array indices. (If they were array indices, they'd need to be variable offsets obtained from a register, not immediate constants.) It's reasonable to think of record fields as functions:
          .equ car, 0
          .equ cdr, 8
          .globl length
  length: test %rdi, %rdi         # nil?
          jz 1f                   # return 0
          mov cdr(%rdi), %rdi     # recurse on tail of list
          call length
          inc %rax
          ret
  1:      xor %eax, %eax
          ret
To avoid writing out all the field offsets by hand, ARM's old assembler and I think MASM come with a record-layout-definition thing built in, but gas's macro system is powerful enough to implement it without having it built into the assembler itself. It takes about 13 lines of code: http://canonical.org/~kragen/sw/dev3/mapfield.S
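
For comparison, here is a rough C rendering of the same idea, with the record fields written as accessor functions. The `struct cons` layout and the name `list_length` are assumptions made for this sketch. The offsets 0 and 8 from the `.equ` lines only match this struct on targets with 8-byte pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cons cell; the field names mirror the .equ symbols. */
struct cons {
    void        *car;   /* first field, so offset 0 */
    struct cons *cdr;   /* offset 8 on typical 64-bit targets */
};

/* Field accessors as functions, playing the role of car(%rdi), cdr(%rdi). */
void *car(const struct cons *cell)
{
    assert(cell != NULL);
    return cell->car;
}

struct cons *cdr(const struct cons *cell)
{
    assert(cell != NULL);
    return cell->cdr;
}

/* Same recursion as the assembly: nil is 0, otherwise 1 + length of tail. */
size_t list_length(const struct cons *list)
{
    if (list == NULL)
        return 0;
    return 1 + list_length(cdr(list));
}
```

The compiler turns `cell->cdr` into exactly the displacement-plus-register form shown in the assembly, which is why reading `cdr(%rdi)` as a function applied to a pointer is a fair mental model.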
Alternatively, on non-RISC architectures, where the immediate constant isn't constrained to a few bits, it can be the address of an array, and the (possibly scaled) register is an index into it. So you might have startindex(,%rdi,4) for the %rdi'th start index:
          .data
  startindex:
          .long 1024
          .text
          .globl length
  length: mov (startindex+4)(,%rdi,4), %eax
          sub startindex(,%rdi,4), %eax
          ret
If the PDP-11 assembler syntax had been defined to be similar to C or Pascal rather than Fortran or BASIC we would, as you say, have used startindex[%rdi,4].
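
In C terms, the two instructions compute a difference of adjacent table entries. The sketch below assumes a global table of 32-bit start indices. Only the first value, 1024, comes from the assembly above; the other entries and the name `item_length` are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Global table, so its address is a link-time constant, like the
   startindex displacement in the assembly. Values after 1024 are made up. */
static const int32_t startindex[] = { 1024, 1030, 1045 };

/* One fewer item than table entries: item i spans [start[i], start[i+1]). */
#define ITEM_COUNT (sizeof startindex / sizeof startindex[0] - 1)

/* Mirrors: mov (startindex+4)(,%rdi,4), %eax
            sub startindex(,%rdi,4), %eax      */
int32_t item_length(size_t i)
{
    assert(i < ITEM_COUNT);
    return startindex[i + 1] - startindex[i];
}
```

The `+4` in the assembly is `sizeof(int32_t)`, which C hides inside `i + 1`, and the scale factor 4 is the same element size applied to the index register.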
This is not very popular nowadays both because it isn't RISC-compatible and because it isn't reentrant. AMD64 in particular is a kind of peculiar compromise—the immediate "offset" for startindex and endindex is 32 bits, even though the address space is 64 bits, so you could conceivably make this code fail to link by placing your data segment in the wrong place.
(Despite stupid factionalist stuff, I think I come down on the side of preferring the Intel syntax over the AT&T syntax.)
beng-nl 20 hours ago

Yes, I find this one of the weird things about assembly: appending (or prepending?) a number means addition?! Even after many, many years of occasionally reading and writing assembly, I'm never completely sure what these instructions do, so I infer it from context.

rocqua a day ago

That depends on sizeof(*arr) no?

unwind a day ago

Not in C no, since arithmetic on a pointer is implicitly scaled by the size of the value being pointed at (this statement is kind of breaking the abstraction ... oh well).
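
A small sketch of that scaling, assuming an array of `double`. The helper names are hypothetical. Both helpers measure the distance in bytes from the start of the array to element `idx`, once with the usual spelling and once with the operands swapped.

```c
#include <assert.h>
#include <stddef.h>

/* Byte distance from arr to &arr[idx]. The compiler multiplies idx by
   sizeof(double) for us, whichever side of the brackets idx is on. */
size_t offset_via_index(const double *arr, size_t idx)
{
    assert(arr != NULL);
    return (size_t)((const char *)&arr[idx] - (const char *)arr);
}

size_t offset_via_swapped(const double *arr, size_t idx)
{
    assert(arr != NULL);
    return (size_t)((const char *)&idx[arr] - (const char *)arr);
}
```

Both return `idx * sizeof(double)`, so the element size never has to appear in the source and cannot break the symmetry between `arr[idx]` and `idx[arr]`.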

messe a day ago

Nope, a[b] is equivalent to *(a + b) regardless of a and b.

- sureglymop a day ago
  
  Given that, why don't we use just `*(a + b)` everywhere?
  Wouldn't that be more verbose and less confusing? (genuinely asking)
  
  tomsmeding a day ago
  
  Do you really think that `*(a + i)` is clearer than `a[i]`?
  
  sureglymop 19 hours ago
  
  Not necessarily. I think it's confusing when there are two fairly close ways to express the same thing.
  