- cross-posted to:
- technology@beehaw.org
A new paper suggests diminishing returns from larger and larger generative AI models. Dr Mike Pound discusses.
The Paper (No “Zero-Shot” Without Exponential Data): https://arxiv.org/abs/2404.04125
gotem!
seriously tho, you don’t think OpenAI is tracking this? architectural improvements and training strategies are developing all the time
…and aren’t making progress on that front: A linear increase in generalisation still requires a more than linear increase in amount of data.
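To put rough numbers on "more than linear": here is a minimal Python sketch, assuming a log-linear trend in which accuracy grows linearly with the logarithm of the number of training examples. The functions `accuracy` and `examples_needed` and the coefficients `a` and `b` are hypothetical, chosen only for illustration; they are not fitted values from the paper.

```python
import math

# Hypothetical coefficients, for illustration only (not taken from the paper).
A = 0.0   # accuracy at a single example
B = 0.1   # accuracy gained per 10x increase in data


def accuracy(n_examples: float, a: float = A, b: float = B) -> float:
    """Assumed log-linear trend: accuracy is linear in log10(n_examples)."""
    return a + b * math.log10(n_examples)


def examples_needed(target_accuracy: float, a: float = A, b: float = B) -> float:
    """Invert the assumed trend: data required to reach a given accuracy."""
    return 10 ** ((target_accuracy - a) / b)


if __name__ == "__main__":
    # Each equal step in accuracy multiplies the required data by a constant factor.
    for target in (0.3, 0.4, 0.5):
        n = examples_needed(target)
        print(f"accuracy {target:.1f} needs about {n:,.0f} examples")
```

Under these assumed numbers, every extra 0.1 of accuracy costs ten times more data than the step before it, which is the shape of the "linear gain, exponential data" argument.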
Also, btw, it’s not as if we don’t know that our current architectures won’t lead to proper intelligence. tl;dr: While current architectures can learn and represent information, they cannot develop learning strategies or decide smartly how to represent a particular bit of information. All the improvements that are happening are in that “how to learn better” area; we have no idea whatsoever how to make the jump to teaching an AI to learn how to learn. AlphaZero is able to learn rules of a game, yes, but it can’t learn arbitrary information: once you throw something other than a game at it, it has no idea how to make sense of anything.
“we don’t know how” != “it’s not possible”
i think OpenAI more than anyone knows the challenges with scaling data and training. anyone working on AI knows the line: “a baby can learn to recognize elephants from a single instance”. reducing training data and time is fundamental to advancement. don’t get me wrong, it’s great to put numbers to these things. i just don’t think this paper is super groundbreaking or profound. a bit clickbaity and sensational for Computerphile
…and a baby doesn’t use the same architecture, not even close, as generative AIs. Babies are T3 systems, they aren’t simply systems which have rules on how to learn, they are systems which have rules on how to develop learning strategies that they then use to learn.
I’m not doubting, in the slightest, that AI can get there: It’s definitely possible. It’s just not possible with the current approaches, and the iterative refinements that “oh OpenAI is constantly coming up with new topologies” refers to are just more of the same. Show me a topology that can come up with topologies, then we’ll have a chance to break through the need for exponential amounts of data.