XenophileJKO 2 days ago

I guess there are several unstated assumptions here. The article is by a domain expert working in their own domain. Throw work you understand at it and see what it does, before you even start on it yourself. I kind of assumed, based on the audience, that most people here would be domain experts.

As for building intuition, perhaps I am overestimating what most people are capable of.

Working with and building systems on top of LLMs over the last few years has helped me develop a pretty good intuition about what is breaking down when the model fails at a task. While having an ML background is useful in some very narrow cases (like: 'why does an LLM suck at ranking...'), I "think" a person can build a pretty good intuition purely from observing outcomes.

I've been wrong before, though. When we first started building LLM products, I thought, "Anyone can prompt, there is no barrier to this skill." That was not the case at all. Most people don't do well trying to quantify ambiguity, specificity, and logical contradiction when writing a process or set of instructions. I was REALLY surprised how I became a "go-to" person to "fix" prompt systems, all based on linguistics and systematic process decomposition. Some of this was understanding how the auto-regressive attention system benefits from breaking the work down into steps, but really most of it was just "don't contradict yourself and be clear".
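To make the decomposition point concrete, here's a purely illustrative sketch; the prompts below are invented for this comment, not taken from any real system:

```python
# Purely illustrative prompts, invented for this comment.

# Ambiguous and self-contradictory: "short but every detail",
# "formal but casual". People and models both handle this badly.
VAGUE_PROMPT = """
Summarize the support ticket. Keep it short but include every detail.
Be formal, but write like you're talking to a friend.
"""

# The same job decomposed into explicit steps. Each step has one
# unambiguous goal, and the output of the earlier steps gives the
# auto-regressive model concrete tokens to condition on for the later ones.
STEPWISE_PROMPT = """
You will process a support ticket in three steps.
Step 1: List the customer's concrete complaints, one per line.
Step 2: For each complaint, name the product area it touches.
Step 3: Using only the items from Steps 1 and 2, write a three-sentence
summary for an internal engineering audience.
"""
```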

Working with them extensively has also helped me home in on how the models get "better" with each release, though most of my expertise is with the OpenAI and Anthropic model families.

I still think most engineers "should" be able to build general intuition about what works well with LLMs and how to interact with them, but you are probably right. It will be just like most ML engineers: they see something work in a paper and then paste it onto their model with no intuition about what that structurally changes in the model's dynamics.

fn-mote 2 days ago

> I kind of assumed based on the audience that most people here would be domain experts.

No take on the rest of your comment, but it’s the nature of software engineering that we work on a breadth of problems. Nobody can be a domain expert in everything.

For example: I use a configurable editor every day, but I’m not a domain expert in the configuration. An LLM wasted an hour of my day pointing me in “almost the right direction” when after 10 minutes I really needed to RTFM.

I am a domain expert in some programming languages, but now I need to implement a certain algorithm… and I'm not an expert in that algorithm. There are lots of traps for the unwary.

I just wanted to challenge the assumption that we are all domain experts in the things we do daily. We are, but … with limitations.

  • imiric 2 days ago

    Exactly.

    A typical programmer works within unfamiliar domains all the time. It's not just about being familiar with the programming language or tooling. Every project potentially has new challenges you haven't faced before, new APIs to evaluate and design, new tradeoffs to consider, etc.

    The less familiar you are with the domain or API, the less instinct and influence you have to steer the LLM in the right direction, and the more inclined you are to trust the tool over yourself. So when the tool is wrong, as it often still is, you can spend a lot of time fighting with it to produce the correct output.

    The example in the article is actually the best-case scenario for these tools. It's essentially pattern matching using high-quality code, from someone who's deeply familiar with the domain and the code they've written. The experience of someone unfamiliar trying to implement the same algorithm from scratch by relying on LLMs would be vastly different.

    • XenophileJKO 2 days ago

      I mean I "understand" your point. However, this isn't any different than being a technical lead in a system of any significant complexity.. you will constantly be reviewing work that you are not always an expert on, it is a very similar practice.

      I'm constantly reviewing things that I am not a domain expert on. I have to identify what is risky, what I don't know, etc. Throwing work to the AI first is no different from throwing it to someone else first; I have the same requirements. Now I can choose how much I "trust" the person or LLM. I have had coworkers I trust less than LLMs... I'll put it that way.

      So, just like when reviewing a co-worker's work, pay attention to areas where you are not sure what the right approach is and maybe double-check them. This just isn't a "new" thing.

      • hitarpetar 16 hours ago

        > Throwing to the AI first is no different than throwing to someone else first

        except in all the ways that it is obviously different

      • imiric a day ago

        Well, you're right that reviewing someone else's work isn't new, but interacting with these tools is vastly different from communicating with a coworker.

        A competent human engineer won't confidently delude you with claims that have no basis in reality. They can be wrong about practical ways of accomplishing something, but they won't suggest using APIs that don't exist, or go off on wild tangents because a certain word was mentioned. They won't give a different answer every time you ask them the same question. Most importantly, conversations with humans can be productive in ways where both parties gain a deeper understanding of the topic and respect for each other. Humans can actually think and reason about topics and ideas, they can actually verify their own claims and yours, and they won't automatically respond with "You're right!" to any counterargument or suggestion.

        Furthermore, the marketing around "AI" leans heavily on promoting superhuman abilities. If we're led to believe that these are superintelligent machines, we're more inclined to trust their output. We have people using them as medical professionals, thinking that they're talking to a god, and being influenced by them. Trusting them to produce software is somewhere on that scale. All of this is highly misleading and potentially dangerous.

        Any attempt at anthropomorphizing "AI" is a mistake. You can get much more out of these tools by using them as what they are: excellent probabilistic pattern-matching tools.