Comment by energy123
Did you try with 60k+ context? I found previous releases to be lacklustre which I tentatively attributed to the longer context, due to the model being trained on a lot of short context data.
Did you try with 60k+ context? I found previous releases to be lacklustre which I tentatively attributed to the longer context, due to the model being trained on a lot of short context data.