Comment by Joker_vD 2 days ago
> A word is a maximal string of characters delimited by spaces, tabs or newlines.
And then the actual code explicitly filters out and ignores every character larger than 0x7F. Just why.
Yes, that's true for that code. But that wasn't really the point. The point I made in my earlier post was that ASCII is 7 bits, i.e. values 0..127, so depending on where the characters came from, only values below 128 are valid ASCII. What I was getting at is that because a parity bit was common, ASCII was limited to 7 bits to leave room for one. With other transports, e.g. reading from a file, there aren't any parity bits (well, that's not entirely true: a minicomputer I worked with back in the day used parity bits on characters in text files, but that wasn't the case for the platform where this particular old 'wc' was used), so the code simply focuses on valid ASCII, which is below 128.
I am going on about the parity bit because of the arithmetic: 0x46 has an odd number of bits set (three, to be precise), so for the parity to check out (that is, for the total number of set bits to be even), the parity bit needs to be set, and the resulting encoding is 0xC6, with four bits set.
Because they thought that a word is something said in a human language that they can understand.
Probably because they're not characters. They're just bytes undefined by ASCII.