Mixture of Experts (MoE) Models
Mixture of Experts (MoE) models are a form of conditional computation in which only parts of the network (the experts) are activated for each example, as selected by a learned gating function. This approach has been proposed as a way to dramatically increase model capacity without a proportional increase in computation [1].
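The sketch below illustrates the idea in PyTorch; it is not the implementation from [1], and the names (SimpleMoE, num_experts, top_k) and the simple top-k gating without noise or load balancing are illustrative assumptions.

```python
# Minimal sketch of a sparsely gated MoE layer (illustrative, assumed names):
# a gating network scores the experts, and only the top-k experts are
# evaluated for each input example.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleMoE(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Each expert is an independent feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )
        # The gate produces one score per expert for every example.
        self.gate = nn.Linear(d_model, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, d_model)
        scores = self.gate(x)                                # (batch, num_experts)
        top_vals, top_idx = scores.topk(self.top_k, dim=-1)  # keep k experts per example
        weights = F.softmax(top_vals, dim=-1)                # normalize over selected experts

        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            idx = top_idx[:, slot]                 # which expert fills this slot, per example
            w = weights[:, slot].unsqueeze(-1)     # its gating weight
            for e, expert in enumerate(self.experts):
                mask = idx == e
                if mask.any():
                    # Only the selected experts run on their assigned examples.
                    out[mask] += w[mask] * expert(x[mask])
        return out


if __name__ == "__main__":
    layer = SimpleMoE(d_model=16, d_hidden=32)
    y = layer(torch.randn(8, 16))
    print(y.shape)  # torch.Size([8, 16])
```

Because each example touches only top_k of the experts, total parameters grow with the number of experts while per-example compute stays roughly constant, which is the capacity/computation trade-off described above.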