miohtama 2 days ago
All modern models are MoE already, no?
hasperdi a day ago
That's not the case. Some are dense and some are hybrid.
MoE is not the holy grail either, as there are drawbacks, e.g. less consistency and expert under- or over-use.
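
To make the expert under/over-use point concrete, here is a toy sketch of top-k routing. It is not any real model's router: the random scores, the `bias` offsets, and the function names are all made up for illustration. It only counts how many tokens land on each expert when the router favours a couple of them.

```python
import random


def route_top_k(scores, k):
    """Return the indices of the k highest-scoring experts for one token."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def expert_load(num_tokens, num_experts, k, bias, seed=0):
    """Count how many tokens each expert receives under top-k routing.

    `bias` is a hypothetical per-expert offset added to random router scores.
    It stands in for a router that has learned to favour some experts.
    """
    rng = random.Random(seed)
    load = [0] * num_experts
    for _ in range(num_tokens):
        scores = [rng.gauss(0.0, 1.0) + bias[e] for e in range(num_experts)]
        for e in route_top_k(scores, k):
            load[e] += 1
    return load


# Unbiased router: load spreads roughly evenly across the 8 experts.
balanced = expert_load(1000, 8, 2, bias=[0.0] * 8)
# Router that favours experts 0 and 1: they get over-used, the rest starve.
skewed = expert_load(1000, 8, 2, bias=[2.0, 2.0] + [0.0] * 6)

print("balanced:", balanced)
print("skewed:  ", skewed)
```

With the skewed bias, most of the 2000 token-to-expert assignments pile onto the first two experts, so the other six see few tokens. Real MoE training setups usually add some kind of load-balancing term to push back against this, but the details vary by model.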