[Private]
Speculation
- Turns out humans preferred outputs from the Instruct models even when the base model was ~100x smaller than GPT-3 (1.3B vs. 175B parameters).
- Chinchilla suggests that at around the 175B-parameter scale we'd start running out of tokens to train on (compute-optimal training wants roughly 20 tokens per parameter, so ~3.5T tokens), and that GPT-3, trained on ~300B tokens, was way undertrained. Rough arithmetic sketched below.
- Makes me wonder if GPT-4 can be well under 1 trillion parameters and still very good.
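
A quick back-of-the-envelope in Python for the Chinchilla point above. The ~20 tokens-per-parameter ratio and the ~300B-token figure for GPT-3's training set are approximations I'm assuming for illustration, not exact fitted values from the paper.

```python
# Rough Chinchilla-style estimate: compute-optimal training needs
# roughly ~20 tokens per parameter (approximate ratio, assumed here).

TOKENS_PER_PARAM = 20  # approximate Chinchilla-optimal ratio

def optimal_tokens(n_params: float) -> float:
    """Approximate compute-optimal token count for a model with n_params parameters."""
    return TOKENS_PER_PARAM * n_params

gpt3_params = 175e9          # GPT-3 parameter count
gpt3_tokens_trained = 300e9  # tokens GPT-3 was actually trained on (approx.)

needed = optimal_tokens(gpt3_params)
print(f"Chinchilla-optimal tokens for 175B params: ~{needed / 1e12:.1f}T")
print(f"GPT-3 trained on ~{gpt3_tokens_trained / 1e9:.0f}B tokens "
      f"({gpt3_tokens_trained / needed:.0%} of optimal)")
```

That works out to ~3.5T tokens versus the ~300B GPT-3 actually saw, i.e. roughly a tenth of the compute-optimal budget, which is the sense in which it was undertrained.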