Comment by gcr
Thanks for such a cool project! It's immediately apparent how to use it and I appreciate the brief examples.
Quick question: on the breast cancer example from the README, a simple support vector machine from sklearn (incidentally, the first thing I tried as a baseline comparison) seems to outperform TabPFN. Is this expected? I know the example is meant to demonstrate ease of use rather than SOTA performance, but I'm curious.
```python
# (TabPFN)
In [13]: print("ROC AUC:", roc_auc_score(y_test, prediction_probabilities[:, 1]))
ROC AUC: 0.996299494264216

# (LinearSVC)
In [27]: from sklearn.svm import LinearSVC
In [28]: clf = LinearSVC(C=0.01).fit(X_train, y_train)
In [29]: roc_auc_score(y_test, clf.decision_function(X_test))
Out[29]: 0.997532996176144
```
Author here! The breast cancer dataset is simple and heavily saturated, so small differences between methods are expected. As you say, single-split examples can be noisy because of the randomness in how the data is divided into training and test sets, especially on a saturated dataset like this one. Cross-validation reduces that variance by averaging over multiple splits; I just ran such a comparison.
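A minimal sketch of that kind of cross-validated comparison (the `TabPFNClassifier` import path and default constructor are assumptions based on the README, and the exact scores will vary from run to run):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC
from tabpfn import TabPFNClassifier  # import path assumed from the README

X, y = load_breast_cancer(return_X_y=True)

# Score both models with 5-fold cross-validated ROC AUC instead of a single
# train/test split; the "roc_auc" scorer uses predict_proba for TabPFN and
# decision_function for LinearSVC automatically.
svc_scores = cross_val_score(LinearSVC(C=0.01), X, y, cv=5, scoring="roc_auc")
tabpfn_scores = cross_val_score(TabPFNClassifier(), X, y, cv=5, scoring="roc_auc")

print(f"LinearSVC mean ROC AUC: {svc_scores.mean():.4f} +/- {svc_scores.std():.4f}")
print(f"TabPFN    mean ROC AUC: {tabpfn_scores.mean():.4f} +/- {tabpfn_scores.std():.4f}")
```

Averaging over folds this way gives a much less noisy comparison than the one-shot split in the README snippet above.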
It's hard to communicate this properly; we should probably have a more favourable example ready, but we just included the simplest one!