Comment by andyferris

Comment by andyferris 12 hours ago

13 replies

I see we need to add special syntax to the signature for dependent type variables.

If you take Zig, it's `comptime` parameters are kind of similar. They can be used to create functions that return types or whose output type depends on the inputs, etc. It seems to fulfil the three axioms at the start, no? The erasure stuff seems just as relavant.

Can I say that `comptime` is dependent types in imperative clothing? Or is there a missing capability making it strictly weaker?

codebje 10 hours ago

You could probably say that. AFAIK there isn't a single valid definition of a dependently typed language any more than there is a single valid definition of a functional language.

I usually go with "you can make a dependent pair", which is two terms where the type of the second depends on the value of the first; I think you could do that in Zig with a bit of handwaving around whether the comptime expression really is the first value or not.

What Zig's doing is also, as best I understand it all, the only real path to anything like dependent types in imperative languages: make a separate sub-language for constant expressions that has its own evaluation rules that run at compile time. See also C++ constexpr, or the constants permitted in a Rust static array size parameter (practically a dependent type itself!)

A functional language's basic building block is an expression and terms are built out of composed expressions; normalisation means reducing a term as far as possible, and equality is (handwaving!) when two terms reduce to exactly the same thing. So you can do your dependent terms in the usual source language with the usual evaluation rules, and those terms are freely usable at runtime or compile time, it's all the same code.

An imperative language's basic building block is a memory peek or poke, with programs composed of sequences of peeks and pokes scheduled by control flow operators. It's much less clear what "normalisation" looks like here, so imperative languages need that separate constant-value embedded language.

  • jibal 4 hours ago

    Zig's sub-language and evaluation rules are exactly the same as the Zig language except that `type` is a first class type and the TypeInfo of any type can be both examined and modified. In Zig, generics are implemented by computing types--most often by a function returning a struct that is specialized to a type argument to the function.

    I've been writing some D code and, while D "template" functions can take compile-time arguments that are types, they can only produce types via mixins, which are strings that are injected at the point of invocation and then parsed by the compiler. This gives you all of the power of Zig and then some, but it's not nearly as elegant (it would help a lot if D's token quote construct q{...} had lisp-like quasi-quote functionality). OTOH, I found when writing Zig it could become extremely confusing to have the comptime and runtime languages intermixed, which is often done for loop unrolling and "conditional compilation" -- whether it's a comptime or runtime operation is in many cases a function of whether the variables involved are comptime or runtime types. In D it's generally clearer because comptime operations are distinguished with such things as static if for flow control, enum for declaring comptime variables, foo!(comptime args)(runtime args) for function calls, etc.

pron 3 hours ago

Not really. With dependent types, you can have a function that returns a type that depends on a runtime value; with Zig's comptime, it can only depend on a compile-time value.

Note that a language with dependent types doesn't actually "generate" types at runtime (as Zig does at compile-time). It's really about managing proofs that certain runtime objects are instances of certain types. E.g. you could have a "prime number" type, and functions that compute integers would need to include a proof that their result is a prime number if they want to return an instance of the prime number type.

Using dependent types well can be a lot of work in practice. You basically need to write correctness proofs of arbitrary properties.

  • CoastalCoder 2 hours ago

    Sorry if this is obvious, but do these languages let programmers explicitly assert that a certain, relevant property holds at runtime?

    Or do these compilers insist on working these proofs out on there own without the benefit of programmer-supplied assertions?

    • pron an hour ago

      > do these languages let programmers explicitly assert that a certain, relevant property holds at runtime?

      Yes, but the programmer has to do the really, really, really hard work of proving that's the case. Otherwise, the compiler says: you haven't proved to me that it's true, so I won't let you make this assertion. Put another way: the compiler checks your proof, but it doesn't write it for you.

      The only programs for which all interesting assertions have been proven in this, or a similar, way have not only been small (up to ~10KLOC), but the size of such programs relative to that of the average program has been falling for several decades.

    • naasking an hour ago

      > Sorry if this is obvious, but do these languages let programmers explicitly assert that a certain, relevant property holds at runtime?

      You can assert something but then you also have to prove it. An assertion is like a type declaration: you're asserting values of this type can exist. The proof of that assertion is constructing a value of that type to show that it can be instantiated.

      This is what the Curry-Howard correspondence means. The types of most languages are just very weak propositions/assertions, but dependent types let you declare arbitrarily complex assertions. Constructing values of such arbitrarily complex assertions can get very hairy.

ChadNauseam 11 hours ago

I'm not familiar with Zig, but I'm pretty sure comptime is not quite as powerful.

In the post, I didn't give any examples of situations where the input to a dependent function was not known at compile time. However, that was just due to the lack of time when writing the post.

I gave this example:

    useValue : Bool -> String
    useValue True = "The number is: " ++ natToString (getValue True)
    useValue False = "The message is: " ++ getValue False
The difference is that nothing in dependent types requires that the depended-on value `b` is known at compile time. You could pass any Bool to `getValue`, including one that was read as input from the user. It would just be the case that, before using the value returned by `getValue b`, you'd have to check (at runtime) the value of `b`. Only then would you know `pickType b`, which would tell you the type of `getValue b`. My apologies that the post was unclear about this. (I've just updated it.)
  • [removed] 6 hours ago
    [deleted]
  • spockz 10 hours ago

    If you don’t know at compile time what the type of the value is that is returned, how do you know whether the remainder of the expression is still correct?

    The choice is either to have a total function that will be able to deal with every type and check at compile time whether they exist or to defer to the runtime and then throw an exception if the program doesn’t know what to do with the value.

    • ChadNauseam 10 hours ago

      The value returned by `getValue b` has type `pickType b`. It cannot be used in a way that depends on its type until you check the value of `b`. If `b = True`, then you know that `pickType b = pickType True = Nat` and that therefore `getValue b` has type `Nat`. There is no option to check the type of a value at runtime. You simply cannot use it in such a way as to create a runtime type error. (This is what it means for the types to be erased. The runtime doesn't know what type it's supposed to be!)

    • rtpg 6 hours ago

        getValue : (b : Bool) -> pickType b
        getValue True  = 42
        getValue False = "hello"
      
      So given something like this, imagine something like the following:

        someBool = flipACoinToGetABool
        thingy = getValue someBool
      
      You're already in trouble here right? what's the type of thingy?

      So maybe you write:

        thingy: (pickType someBool) = getValue someBool
      
      which is fine and great!

      then you write:

        print (thingy + 3) -- thingy can be an int right?
      
      But at this point thingy is of type "pickType someBool" while + is expecting Int. So it's up to you to restructure your code to prove to the type checker that someBool is in fact True.

      For example:

         print ((if someBool then thingy else 5) + 3)
      
      ^ and here your code proves the fact! So in practice when you want to explore the results that are hidden behind the dependent types, you will need to have written code that unwraps it like this. But the whole point is the underpinning type theory means that you will be fine while doing it! You've proven that it'll be fine, so if you add thingy to 3 you'll have an integer.

      I think of this stuff a bit like encryption password. You have this bundle of data, but to get in, you need to provide proof that what you're doing won't blow up. You'll need to write your code in a certain way to do it. But if you provide the right proof, then we're golden.

    • skybrian 8 hours ago

      Consider how Rust monomorphizes generic functions, making a copy for each possibility. I imagine a compiler could generate a different continuation for each possible return type, along with a runtime check that branches to the appropriate continuation.

      It seems like that might cause code bloat, though?

  • ioanaci 5 hours ago

    I commented this before but deleted it because I hadn't tested my code properly. Anyway, here is a version that works.

    You're right Zig isn't quite as powerful, but you don't need type erasure if you have enough monomorphization. It's a bit annoying to use since you have to capture the comptime version of b inside the inline switch prongs, and get_value does need to take a comptime bool instead of a runtime one. It's functionally the same, except that instead of using the bool to prove the type of the type-erased value is correct and delaying the actual if (b) check to runtime, we're moving it to compile time and instead proving that b has the right value for each case, then generating specialized code.

    This does NOT mean that it isn't dependent on user input, the call do_thing(get_user_input()) would absolutely work. do_thing has no compile-time parameters at all.

    I don't have the natToString thing because it's a bit more annoying in Zig and would obscure the point.

        const std = @import("std");
        
        fn get_value(comptime b: bool) if (b) u32 else []const u8 {
            if (b) return 12
            else return "string";
        }
        
        fn do_thing(b: bool) void {
            switch (b) {
                inline true => |known_b| std.log.info("{d}", .{get_value(known_b)}),
                inline false => |known_b| std.log.info("{s}", .{get_value(known_b)}),
            }
        }
        
        pub fn main() void {
            do_thing(true);
            do_thing(false);
        }
    
    You could define a function

        fn pick_type(comptime b: bool) type {
            if (b) return u32 else return []const u8;
        }
    
    and change the signature

        fn get_value(comptime b: bool) pick_type(b)
    
    if you wish.

    Perhaps more convincingly, you can use inline for with a similar effect. It's obviously horrible code nobody would ever write but I think it might illustrate the point.

        fn do_thing(b: bool) void {
            inline for (0..2) |i| { // go through all bool variants
                const known_b = (i != 0);
                
                if (known_b == b) {
                    // Essentially what we have now is a known_b which is proven
                    // at compile-time to be equal to whatever runtime b we have.
                    // So we're free to do any kind of dependent type things we'd like
                    // with this value.
                    
                    const value = get_value(known_b);
                    // Here we know the type of value but it still depends on the runtime b.
                    
                    if (known_b) std.log.info("{d}", .{value}) // treated as an int
                    else std.log.info("{s}", .{value}); // treated as a string
                    
                    // We can also just decide based on the type of value at this point.
                    if (@TypeOf(value) == u32) ... else ...;
                    // Or, a switch
                    switch (@TypeOf(value)) {
                        u32 => ...,
                        []const u8 => ...,
                    }
                }
            }
        }