Comment by naasking

Comment by naasking 3 days ago


> "I want this button to be bigger", formalizing it can't be a formal process.

    while (!is_button_big_enough()) {
        button.scaleUp(1.1);
    }

This is one trivial way to do it, and it seems formalizable. is_button_big_enough simply stands in for whatever process is responsible for judging such a thing, whether that is a design specification or input from a person.

tsimionescu 3 days ago

You've translated my informal problem statement into a quasi-formal process, using your inherent natural language processing skills, and your knowledge of general human concepts like size. But you haven't explained the formal process you followed to go from my problem statement to this pseudocode.

And your pseudocode template only works for one particular kind of informal problem statement. If I instead have the problem "how much money do I need to buy this house and this chair?", or "does this byte fit in my mouth?", your general form will not work.

And what's more, you haven't actually produced a formally solvable problem definition that we could analyze for complexity and computability, because you rely on two completely unspecified functions. Where is the formal definition of a button? Is it a physical push button, a UI control, or a clothing button? What does it mean for it to be bigger or smaller? When do we know it's big enough, and is that computable? And how do we scale it up? Do we increase its volume? Its surface area? One of its sides? Or maybe the radius? And how do we go about doing that? All of these, and many more, need to be explicitly defined before any kind of formal analysis can be applied to this problem. And there is no formal way to define them that is guaranteed to match the intent of whoever posed the problem.

  • naasking 3 days ago

    > And what's more, you haven't actually produced a formally solvable problem definition, that we could analyze for complexity and computability, because you rely on two completely unspecified functions. Where is the formal definition of a button?

    Well, your statement was underspecified. You said "I want this button to be bigger". There are procedures for translating informal statements into formal ones, and one basic step is to delegate underspecified referents to abstractions that encapsulate those details. So "this button" designates some kind of model of a button, and "I" refers to a subject outside the system, which implies some kind of interactive process that asks the subject whether the model is satisfactory, e.g. a dialog prompt asking "Is this button big enough now?"
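
A minimal sketch of that delegation, in Python. Every name here (Scalable, UiButton, make_bigger, ask_subject) is hypothetical and chosen only for illustration: "this button" becomes any model that can be scaled, and "I" becomes a judge callback supplied from outside the system. It is one possible reading under those assumptions, not a definitive formalization.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class Scalable(Protocol):
    """Stand-in for "this button": anything that can be scaled up."""

    def scale_up(self, factor: float) -> None: ...


@dataclass
class UiButton:
    """Hypothetical model: a rectangular UI control, scaled on both sides."""

    width: float
    height: float

    def scale_up(self, factor: float) -> None:
        self.width *= factor
        self.height *= factor


def make_bigger(
    button: Scalable,
    is_big_enough: Callable[[Scalable], bool],
    factor: float = 1.1,
    max_steps: int = 100,
) -> int:
    """Scale `button` until the judge accepts it; return the number of steps."""
    steps = 0
    while not is_big_enough(button):
        if steps >= max_steps:
            # Assumption: give up rather than loop forever on a judge
            # that never says yes.
            raise RuntimeError("judge never accepted the button")
        button.scale_up(factor)
        steps += 1
    return steps


def ask_subject(button: Scalable) -> bool:
    """Interactive judge: query the subject outside the system."""
    answer = input(f"Is this button big enough now? ({button}) [y/n] ")
    return answer.strip().lower().startswith("y")
```

Calling make_bigger(UiButton(40.0, 20.0), ask_subject) runs the dialog version; passing a check derived from a design spec, such as lambda b: b.width >= 48.0, swaps the judge without touching the loop. What "bigger" means (here, both sides scaled) stays a choice hidden inside the model.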

    You call these skills "inherent", but humans are not magical. We employ bug-riddled, poorly specified procedures for this kind of interpretive work, and LLMs have already started to do the same and will only get better. Is asking a deterministic LLM to generate a formal specification or program to achieve some result a formal process? I don't think these lines are as clear as many think, not anymore.

    • tsimionescu 2 days ago

      I think we're mostly in agreement, actually. I'm not trying to claim that this is an unsolvable problem, just that it's a difficult problem that we don't have a solution for yet. And yes, LLMs are probably our best tool so far. And asking clarifying questions is clearly part of the process.

      I will say that there is also a possibility the general form of the formal problem is in fact uncomputable. It seems possible to me it might be related to the halting problem. But, until we have a formal specification of it, we won't know, of course.

freejazz 3 days ago

What is the repeatable method by which you came to that conclusion? That is what needs to be formalized for your response to make sense.

  • naasking 3 days ago

    There are procedures for translating informal statements into formal ones. If I submit such informal statements to an LLM and ask it to generate a spec or program to achieve some result, that can be made repeatable. There are various arrangements to make this more robust, such as having a second LLM generate test cases to check the first one's work. Does this qualify?
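
A rough sketch of that generate-and-check arrangement, in Python. No real LLM API is called: stub_propose and stub_tests are hypothetical, deterministic stand-ins for the two models, and generate_and_check is an illustrative name. The point is only the shape of the loop, which is repeatable whenever both stand-ins are deterministic.

```python
from typing import Callable, List, Optional, Tuple

Candidate = Callable[[float], float]   # a generated "program"
TestCase = Tuple[float, float]         # (input, expected output)


def generate_and_check(
    statement: str,
    propose: Callable[[str, int], Candidate],
    propose_tests: Callable[[str], List[TestCase]],
    max_attempts: int = 3,
) -> Optional[Candidate]:
    """Return the first candidate that passes every generated test, else None."""
    tests = propose_tests(statement)
    for attempt in range(max_attempts):
        candidate = propose(statement, attempt)
        if all(candidate(x) == expected for x, expected in tests):
            return candidate
    return None


def stub_propose(statement: str, attempt: int) -> Candidate:
    """Stand-in for the first model: the first draft is wrong, the second is right."""
    drafts: List[Candidate] = [lambda w: w, lambda w: w * 2]
    return drafts[min(attempt, len(drafts) - 1)]


def stub_tests(statement: str) -> List[TestCase]:
    """Stand-in for the second model: test cases for "make the button twice as wide"."""
    return [(10.0, 20.0), (3.0, 6.0)]
```

With these stubs, generate_and_check("make the button twice as wide", stub_propose, stub_tests) rejects the first draft and accepts the second. Whether the generated tests themselves match the intent of the person who posed the statement is exactly the part this loop does not settle.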