• 9 Posts
  • 237 Comments
Joined 3 years ago
cake
Cake day: June 19th, 2023

help-circle
















  • From my extremely limited understanding, it’s because of the sheer scale of the data that’s been fed into LLMs, and because of the (admittedly small) possibility that the people working on LLMs never really took the time to understand what sort of connections the LLM was making between all the datapoints it was interacting with and drawing connections between, or at least a deeper understanding of how the math worked; just that it did.

    As the scale of the project kept growing and LLM companies just kept throwing ‘more data, more neural networks, more hardware!’ into the mix, the black box became…well, blacker and it kept getting harder to figure out the internal ‘logic’ used by the LLM to predict the next word. Now, the people who’re trying to figure it all out are working with extremely large amounts of data with nothing to go off of.

    In short, the people making GPT were somehow smart enough to make it, but not smart enough to understand what they were making.