I'm curious about the training data size of GPT-4. Specifically, I want to know the exact number of tokens it was trained on.
5 answers
CryptoPioneerGuard
Sat Nov 02 2024
GPT-4 is reported to use a mixture-of-experts (MoE) architecture.
Raffaele
Fri Nov 01 2024
Training reportedly covered about 13 trillion tokens, counting repeated passes over the data.
EthereumEmpire
Fri Nov 01 2024
The model is said to comprise 16 experts.
Martino
Fri Nov 01 2024
Each expert reportedly has about 111 billion parameters.
BlockchainBrawler
Fri Nov 01 2024
The training run is estimated to have required roughly 2 × 10^25 FLOPs.
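As a rough sanity check, that compute figure lines up with the standard ≈ 6·N·D training-compute approximation, using the ~13 trillion token figure quoted above and an assumed ~280 billion parameters active per token (a commonly cited but unconfirmed number that is not stated in this thread):

```python
# Rough sanity check of the figures quoted above, using the common
# transformer training-compute approximation: FLOPs ≈ 6 * N * D,
# where N is the number of parameters active per token and D is the
# number of training tokens.
#
# The ~280B active-parameter figure is an assumption (roughly two of the
# sixteen 111B experts plus shared layers); it is not given in this thread.

active_params = 280e9        # assumed parameters active per forward pass
training_tokens = 13e12      # ~13 trillion tokens, as quoted above

total_flops = 6 * active_params * training_tokens
print(f"Estimated training compute: {total_flops:.2e} FLOPs")
# -> about 2.2e+25 FLOPs, consistent with the ~2e25 figure quoted above
```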