Comment by jacob019
Found the web interface: https://ux.priorlabs.ai/ Really cool!
Just playing around with regression mode...
A very simple dataset, powers of two:
1:2, 2:4, 3:8, 5:32, 6:64, 7:128 (missing the #4 value)
Predictions (1-10):
1.582 5.236 13.150 22.943 37.584 67.475 109.945 155.322 218.001 300.425
Error (1-10):
-26.4% 23.6% 39.2% 30.3% 14.9% 5.2% -16.4% -64.8% -134.9% -240.9%
... well, it has a positive slope.
Let's see what happens if we copy the exact same values in the dataset 10 times first.
Predictions (1-10):
1.993 3.967 7.986 18.138 31.965 64.140 128.125 126.607 130.667 161.756
Error (1-10):
-0.3% -0.8% -0.2% 11.8% -0.1% 0.2% 0.1% -102.2% -291.8% -533.1%
Interesting, repeated values give the model a lot more confidence in the known values. The interpolated #4 value is still off by 12%. It does not extrapolate well at all.
Looking forward to trying it on real world data with more features.
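A note on how the error rows seem to be computed: the figures look like they are measured relative to the prediction rather than the true value, i.e. (prediction - truth) / prediction. That is an inference from the numbers, not something stated in the comment. A minimal sketch that checks this against the second run (the variable names are only illustrative):

```python
# True values are the powers of two for x = 1..10.
truth = [2 ** x for x in range(1, 11)]

# Predictions and error percentages copied from the second run
# (the one with the dataset repeated 10 times).
preds = [1.993, 3.967, 7.986, 18.138, 31.965,
         64.140, 128.125, 126.607, 130.667, 161.756]
quoted = [-0.3, -0.8, -0.2, 11.8, -0.1, 0.2, 0.1, -102.2, -291.8, -533.1]

# Assumed formula: error relative to the prediction, in percent.
errors = [(p - t) / p * 100 for p, t in zip(preds, truth)]

# Largest disagreement between the recomputed and the quoted percentages.
max_gap = max(abs(e - q) for e, q in zip(errors, quoted))

for x, (e, q) in enumerate(zip(errors, quoted), start=1):
    print(f"x={x:2d}  recomputed={e:8.1f}%  quoted={q:8.1f}%")
```

The recomputed values agree with the quoted ones to within rounding of the displayed predictions. Dividing by the prediction is why the extrapolated points show errors far below -100%: the prediction is much smaller than the true value.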
Yes! This makes sense from a learning perspective: more samples add evidence that the datapoint is actually what you observed. Based on a single sample, the model stays closer to a mean regression (which would translate to more balanced class probabilities in classification). Transformers have trouble counting repeated entries (a famous ChatGPT failure case was asking it to count the number of 1s and 0s in a string). This model has some tricks to solve this.
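The "more samples add evidence" point can be illustrated with a textbook toy. This is not how the model works internally: it is a conjugate Gaussian update, where a prior belief about a value is combined with n identical noisy observations. The prior, noise level, and function name below are all hypothetical, chosen only to show the direction of the effect.

```python
def posterior_mean(prior_mean, prior_var, obs_value, noise_var, n_copies):
    """Posterior mean of a Gaussian mean after n_copies identical observations.

    Standard conjugate update: precisions add, and the posterior mean is the
    precision-weighted average of the prior mean and the observations.
    """
    precision = 1.0 / prior_var + n_copies / noise_var
    weighted_sum = prior_mean / prior_var + n_copies * obs_value / noise_var
    return weighted_sum / precision


# Hypothetical numbers: prior centered at 50, one observed target of 128.
prior_mean, prior_var, noise_var = 50.0, 100.0, 100.0
observed = 128.0

once = posterior_mean(prior_mean, prior_var, observed, noise_var, n_copies=1)
ten_times = posterior_mean(prior_mean, prior_var, observed, noise_var, n_copies=10)

print(f"seen once:     {once:.3f}")
print(f"seen 10 times: {ten_times:.3f}")
```

With a single observation the estimate sits halfway between the prior mean and the observed value. With ten copies it moves most of the way to the observed value, which has the same flavor as the repeated-dataset predictions snapping onto the known points.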