Meta is reportedly scrambling ‘war rooms’ of engineers to figure out how DeepSeek’s AI is beating everyone else at a fraction of the price

☆ Yσɠƚԋσʂ ☆@lemmy.ml · edited 2 days ago

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 2 days ago

They can charge less because the model is more efficient to run: it needs less compute per token, and so less power. They leveraged a mixture-of-experts architecture to get far better performance for the cost than traditional dense models. While it has 671 billion parameters overall, it only activates about 37 billion for any given token, making it very efficient. For comparison, Meta’s Llama 3.1 is a dense model that uses all 405 billion of its parameters at once. You can read all about it here: https://arxiv.org/abs/2405.04434
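
To make the "only part of the model runs at a time" point concrete, here is a rough Python sketch of generic top-k expert routing. It is not DeepSeek's actual code: the 8 toy experts, the logits, and the function names (`route_top_k`, `moe_forward`, `active_fraction`) are all made up for illustration, and the real model's routing has more moving parts. The only numbers taken from the comment above are 671 billion total and 37 billion active parameters.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_logits, k):
    """Weighted sum of the k selected experts' outputs; the others never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))

def active_fraction(active_params, total_params):
    """Share of the model's parameters used for a single token."""
    return active_params / total_params

# Hypothetical toy setup: 8 "experts", each just scales its input.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Hypothetical router scores for one token.
logits = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

selected = route_top_k(logits, k=2)
print("experts used:", [i for i, _ in selected])
print("output:", moe_forward(1.0, experts, logits, k=2))

# Figures quoted in the comment: 37B active out of 671B total.
print(f"active share: {active_fraction(37e9, 671e9):.1%}")
```

With the figures quoted above, the active share works out to roughly 5.5% of the parameters per token, versus 100% for a dense model like Llama 3.1 405B. Parameter count per token is only a rough proxy for cost, though.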

melroy@kbin.melroy.org · 2 days ago

I see, OK. I only want to add that DeepSeek is neither the first nor the only model to use mixture-of-experts (MoE).

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 2 days ago

OK, but it is clearly the first one to use this approach to such effect.
