Comment by kovek
What if you could check whether the user responds positively or negatively to the output, and then train the LLM on the input it received and the output it produced?
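
A minimal sketch of what that loop could look like: log each interaction with a feedback label, then keep only the positively received pairs as fine-tuning examples. The `Interaction` record, the +1/-1/0 feedback encoding, and `build_finetuning_set` are all hypothetical names made up for illustration. How feedback is actually detected, and the training step itself, are left out.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    prompt: str    # input the LLM received
    response: str  # output the LLM produced
    feedback: int  # assumed encoding: +1 positive, -1 negative, 0 unknown


def build_finetuning_set(interactions):
    """Keep only the (prompt, response) pairs the user reacted to positively."""
    return [
        {"prompt": it.prompt, "completion": it.response}
        for it in interactions
        if it.feedback > 0
    ]


# Made-up interaction log, for illustration only.
log = [
    Interaction("What is 2 + 2?", "4", +1),
    Interaction("What is the capital of France?", "Berlin", -1),
    Interaction("Name a prime number.", "7", 0),
]

dataset = build_finetuning_set(log)
print(dataset)
```

Negatively received and unlabeled pairs are simply dropped here. Making use of the negative ones would need a different training setup than plain fine-tuning on the kept pairs.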