[Private]
Speculation
- Turns out humans preferred outputs from the Instruct models even when the base model was ~100x smaller than GPT-3 (1.3B vs. 175B parameters).
- Chinchilla suggests that at around the 175B-parameter scale we'd start running out of tokens to train on (compute-optimal training wants roughly 20 tokens per parameter, so ~3.5T tokens), and that GPT-3, trained on ~300B tokens, was way undertrained. Rough arithmetic sketched below.
- Makes me wonder if GPT-4 can be well under 1 trillion parameters and still very good.
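
A quick back-of-the-envelope in Python for the Chinchilla point above. The ~20 tokens-per-parameter ratio and the ~300B-token figure for GPT-3's training set are approximations I'm assuming for illustration, not exact fitted values from the paper.

```python
# Rough Chinchilla-style estimate: compute-optimal training needs
# roughly ~20 tokens per parameter (approximate ratio, assumed here).

TOKENS_PER_PARAM = 20  # approximate Chinchilla-optimal ratio

def optimal_tokens(n_params: float) -> float:
    """Approximate compute-optimal token count for a model with n_params parameters."""
    return TOKENS_PER_PARAM * n_params

gpt3_params = 175e9          # GPT-3 parameter count
gpt3_tokens_trained = 300e9  # tokens GPT-3 was actually trained on (approx.)

needed = optimal_tokens(gpt3_params)
print(f"Chinchilla-optimal tokens for 175B params: ~{needed / 1e12:.1f}T")
print(f"GPT-3 trained on ~{gpt3_tokens_trained / 1e9:.0f}B tokens "
      f"({gpt3_tokens_trained / needed:.0%} of optimal)")
```

That works out to ~3.5T tokens versus the ~300B GPT-3 actually saw, i.e. roughly a tenth of the compute-optimal budget, which is the sense in which it was undertrained.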