Comment by mrkeen

Comment by mrkeen a day ago

18 replies

This stuck out:

  email_t theEmail = parseEmail(untrustedInput);
  if (theEmail == PARSE_ERROR) {
    return error;
  }
An email_t is not a parse error, and a parse error is not one of the emails, so this shouldn't compile (and I don't take 'pseudocode' as an excuse).
parkcedar a day ago

> and I don't take 'pseudocode' as an excuse

Weird hill to die on, since neither email_t nor PARSE_ERROR were defined in the sample snippets. How do you know PARSE_ERROR is not email_t?

  • mrkeen a day ago

    It's the parse-versus-validate hill in this case.

    This pseudocode is "Validate" for at least 3 reasons:

    Forgetting to check:

      this check is fragile: it’s extremely easy to forget. Because its return value is unused, it can always be omitted, and the code that needs it would still typecheck.
    
    Repeatable/redundant checks:

      First, it’s just annoying. We already checked that the list is non-empty, why do we have to clutter our code with another redundant check?
    
      Second, it has a potential performance cost. Although the cost of the redundant check is trivial in this particular example, one could imagine a more complex scenario where the redundant checks could add up, such as if they were happening in a tight loop.
    
    Not using the type system:

      Use a data structure that makes illegal states unrepresentable. Model your data using the most precise data structure you reasonably can. If ruling out a particular possibility is too hard using the encoding you are currently using, consider alternate encodings that can express the property you care about more easily. Don’t be afraid to refactor.
    
    > How do you know PARSE_ERROR is not email_t

    It has to be for it to compile, right? Which means that email_t is the type which represents both valid and invalid emails. How do you know if it's valid? You remember to write a check for it. Why not just save yourself some keystrokes and use char* instead. This is validate, not parse.

    • lmz 21 hours ago

      > It has to be for it to compile, right? Which means that email_t is the type which represents both valid and invalid emails. How do you know if it's valid? You remember to write a check for it. Why not just save yourself some keystrokes and use char* instead. This is validate, not parse.

      I feel this kind of fundamentalism is letting the perfect be the enemy of the good.

      • mrkeen 21 hours ago

        Every C programmer is already doing it the 'good' way (validation), so this article doesn't really add anything.

        The only fundamentalism involved in PdV is: if you have an email, it's actually an email. It's not arbitrary data that may or may not an email.

        Maybe you want your emailing methods to accept both emails and not-emails in your code base. Then it's up to each method to validate it before working on it. That is precisely what PdV warns against.

  • VMG a day ago

    Because an error is not an email?

    • exe34 a day ago

      By that logic, a float couldn't store NaN.

      • mrkeen a day ago

        Correct. You'll never see a raw float on the LHS of a PDV expression.

        • exe34 a day ago

          Who puts their personal data vault on the left hand side of a raw float?

bmandale a day ago

> and I don't take 'pseudocode' as an excuse

They write the non-pseudo variant later. There, the return value is a pointer and the check is against NULL. Which is fairly standard for C code, albeit not always desirable.

  • mrkeen a day ago

    Correct, it is fairly standard C code. It is not Parse, Don't Validate.

Gibbon1 a day ago

I'm with you, don't do crap like that. Always return a valid object.

  email_t theEmail = parseEmail(untrustedInput);
  if (theEmail.error != PARSE_OK) {
    return error;
  }
  • mrkeen a day ago

    This is validate.

    You made an email-or-error type and named it email_t and then manually checked it.

    PDV returns an non-error-email type from the check method.

    • ykonstant 18 hours ago

      I don't understand; what is your suggested solution?

      • mrkeen 17 hours ago

        I'm not smart enough to suggest a fix here, I'm just pointing out that this article is not the PdV from the well known article.

        But I can spot when code is doing exactly what the cited article says not to do,

        This line is the "validate" in the expression "parse, don't validate":

          if (theEmail.error != PARSE_OK)
        
        You might like it, but that's not my business. Maybe this C article should have been "parse, then validate".

        You'd be better off reading the original: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...

      • teo_zero 16 hours ago

        parseEmail() should either return a valid email, or not return at all; whether that means panic, exit, or jump to an error handler... is left to the implementer