Comment by mlyle 3 days ago

Yah. Latest thing I wrote was

* Code using sympy to generate math problems that test different skills for students, with difficulty values affecting which kinds of problems are selected, and various transforms that can be applied to a problem (e.g. having to solve for z+4 or 4a+b instead of x) to test different subskills.

(On this part, the LLM did pretty well. The code was correct after a couple of quick iterations, and the base classes and end-use interfaces are correct. There are a few things in the middle that are unnecessarily "superstitious" and check for conditions that can't happen, so I need to work with the LLM to clean it up.)

* Code to use IRT to estimate the probability that students have each skill and to request problems with appropriate combinations of skills and difficulties for each student.

(This was somewhat garbage. Good database & backend, but the interface to use it was not nice and it kind of contaminated things).

* Code to recognize QR codes in the corners of a worksheet, find answer boxes, and feed the image to ChatGPT to determine whether the scribble in the box is the answer in the correct form.

(This was 100%, first time. I adjusted the prompt it chose to better clarify my intent in borderline cases).
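For the IRT bullet above, here is a minimal sketch of what that kind of estimate might look like. This is not the code described in the comment. It assumes a logistic IRT model with a fixed discrimination of 1, a standard normal prior on ability, and a grid-based posterior mean, and it handles one skill at a time. The names `p_correct`, `estimate_ability`, and `pick_difficulty` and the 0.7 target success rate are all made up for illustration.

```python
import math

def p_correct(theta, difficulty, discrimination=1.0):
    """Logistic IRT: chance of a correct answer for ability theta."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

def estimate_ability(responses, grid=None):
    """Posterior mean of theta on a grid, under an assumed N(0, 1) prior.

    responses: list of (difficulty, was_correct) pairs for a single skill.
    """
    if grid is None:
        grid = [i / 10.0 for i in range(-40, 41)]  # theta from -4.0 to 4.0
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalized prior density
        for difficulty, was_correct in responses:
            p = p_correct(theta, difficulty)
            w *= p if was_correct else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total

def pick_difficulty(theta, available, target=0.7):
    """Pick the difficulty whose predicted success rate is closest to target."""
    return min(available, key=lambda d: abs(p_correct(theta, d) - target))

# Hypothetical student: right on two easier problems, wrong on a harder one.
history = [(-1.0, True), (0.0, True), (1.0, False)]
theta = estimate_ability(history)
next_difficulty = pick_difficulty(theta, [-2.0, -1.0, 0.0, 1.0, 2.0])
```

A real system would presumably track one ability estimate per skill and fit item parameters from data rather than fixing them, which is where the database and backend mentioned above would come in.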

The output was, overall, pretty similar to what I'd get from a junior engineer under my supervision -- a bit wacky in places that aren't quite worth fixing, a little bit of technical debt, a couple of things more clever than I'd have expected, etc. But I did all of this in three hours and $12 expended.

The total time supervising it was probably similar to the amount of time I'd spend supervising the junior engineer... but the LLM turns things around quickly enough that I don't need to context switch.

novembermike 2 days ago

I think it's fair to call code LLMs similar to fairly bad but very fast juniors that don't get bored. That's a serious drawback but it does give you something to work with. What scares me is non-technical people just vibe coding, because it's like a PM driving the same juniors with no one to give sanity checks.