GPT-3

I haven’t been following the topic closely. My own impression from playing around with it is that it is just the most sophisticated chatbot to date, though that alone should already make it economically competitive in areas such as customer support.

There is still clearly no deep “semantic” understanding, as this extract from AI Dungeon (which now runs on GPT-3) illustrates:

[AI Dungeon screenshot]

I am, in general, quite skeptical about “hard takeoff” AI scenarios.

I think gwern, commenting at LessWrong, has one of the best takes on the topic:

As far as I can tell, this is what is going on: they do not have any such thing, because GB [Google Brain] and DM [DeepMind] do not believe in the scaling hypothesis the way that Sutskever, Amodei, and others at OA [OpenAI] do.

GB is entirely too practical and short-term focused to dabble in such esoteric & expensive speculation, although Quoc’s group occasionally surprises you. They’ll dabble in something like GShard, but mostly because they expect to be likely to be able to deploy it or something like it to production in Google Translate.

DM (particularly Hassabis, I’m not sure about Legg’s current views) believes that AGI will require effectively replicating the human brain module by module, and that while these modules will be extremely large and expensive by contemporary standards, they still need to be invented and finetuned piece by piece, with little risk or surprise until the final assembly. That is how you get DM contraptions like Agent57 which are throwing the kitchen sink at the wall to see what sticks, and why they place such emphasis on neuroscience as inspiration and cross-fertilization for reverse-engineering the brain. When someone seems to have come up with a scalable architecture for a problem, like AlphaZero or AlphaStar, they are willing to pour on the gas to make it scale, but otherwise, incremental refinement on ALE and then DMLab is the game plan. They have been biting off and chewing pieces of the brain for a decade, and it’ll probably take another decade or two of steady chewing if all goes well. Because they have locked up so much talent and have so much proprietary code and believe all of that is a major moat to any competitor trying to replicate the complicated brain, they are fairly easygoing. You will not see DM ‘bet the company’ on any moonshot; Google’s cashflow isn’t going anywhere, and slow and steady wins the race.

OA, lacking anything like DM’s long-term funding from Google or its enormous headcount, is making a startup-like bet that they know an important truth which is a secret: “the scaling hypothesis is true” and so simple DRL algorithms like PPO on top of large simple architectures like RNNs or Transformers can emerge and meta-learn their way to powerful capabilities, enabling further funding for still more compute & scaling, in a virtuous cycle. And if OA is wrong to trust in the God of Straight Lines On Graphs, well, they never could compete with DM directly using DM’s favored approach, and were always going to be an also-ran footnote.

While all of this hypothetically can be replicated relatively easily (never underestimate the amount of tweaking and special sauce it takes) by competitors if they wished (the necessary amounts of compute budgets are still trivial in terms of Big Science or other investments like AlphaGo or AlphaStar or Waymo, after all), said competitors lack the very most important thing, which no amount of money or GPUs can ever cure: the courage of their convictions. They are too hidebound and deeply philosophically wrong to ever admit fault and try to overtake OA until it’s too late. This might seem absurd, but look at the repeated criticism of OA every time they release a new example of the scaling hypothesis, from GPT-1 to Dactyl to OA5 to GPT-2 to iGPT to GPT-3… (When faced with the choice between having to admit all their fancy hard work is a dead-end, swallow the bitter lesson, and start budgeting tens of millions of compute, or instead writing a tweet explaining how, “actually, GPT-3 shows that scaling is a dead end and it’s just imitation intelligence” – most people will get busy on the tweet!)

What I’ll be watching for is whether orgs beyond ‘the usual suspects’ (MS ZeRO, Nvidia, Salesforce, Allen, DM/GB, Connor/LibreAI, FAIR) start participating or if they continue to dismiss scaling.
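
For the unfamiliar, the “God of Straight Lines On Graphs” gwern invokes refers to the neural scaling laws: language-model test loss falls as a smooth power law in parameter count (and in compute and data). A minimal sketch of that straight line, using the fitted constants reported by Kaplan et al. (2020) purely as illustrative values:

    # Scaling-law sketch: Kaplan et al. (2020) fit LM test loss as a power
    # law in non-embedding parameter count N: L(N) = (N_c / N) ** alpha_N.
    # The constants are the published fits, used here only for illustration.
    ALPHA_N = 0.076   # fitted exponent for model size
    N_C = 8.8e13      # fitted scale constant (non-embedding parameters)

    def predicted_loss(n_params: float) -> float:
        """Predicted cross-entropy loss (nats/token) for a model of given size."""
        return (N_C / n_params) ** ALPHA_N

    # On log-log axes this is a straight line: every 10x in parameters buys
    # a constant multiplicative reduction in loss.
    for n in (1.5e9, 1.75e11, 1e13):  # ~GPT-2, ~GPT-3, a hypothetical successor
        print(f"{n:.2e} params -> predicted loss {predicted_loss(n):.2f}")

The “scaling hypothesis” is simply the bet that this line keeps extrapolating.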


Anatoly Karlin is a transhumanist interested in psychometrics, life extension, UBI, crypto/network states, X risks, and ushering in the Biosingularity.

 

Inventor of Idiot’s Limbo, the Katechon Hypothesis, and Elite Human Capital.

 

Apart from writing books, reviews, travel writing, and sundry blogging, I Tweet at @powerfultakes and run a Substack newsletter.

Comments

  1. Please keep off-topic posts to the current Open Thread.

    If you are new to my work, start here.

  2. LessWrong, along with dead shit like Slate Star Codex (and many others of the “Rationalist” sphere), is just cringe mental masturbation for self-absorbed pseudos.
    Everyone with a few brain cells knew that AI is not intelligent and can’t ever be – the best it can ever be is imitative, as the commenter pointed out.

    And you know what’s worse? This has been known forever, yet people and corporations keep sinking money into it, which makes me think it’s either just a money scheme, or these people are simply retarded (as opposed to the “they’re rich evil geniuses” theory).

    https://scottlocklin.wordpress.com/2019/07/25/the-fifth-generation-computing-project/

  3. My educated guess is that GPT-3 is a slightly more sophisticated version of the predictive-text game.

    https://i.kym-cdn.com/photos/images/original/001/491/038/1f1.png

    My own is “I am a Virgo and that’s why I am not there yet and I will be working on it for a while and I will be working on it for a while and I will be working on it for a while and…”
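
    A toy sketch of why pure “pick the most likely next word” degenerates into loops like that (this is the phone-keyboard game, not how GPT-3 actually samples):

        # Toy bigram table plus greedy decoding: always taking the single most
        # likely next word collapses into a cycle, like the keyboard game above.
        bigrams = {
            "I": "will", "will": "be", "be": "working", "working": "on",
            "on": "it", "it": "for", "for": "a", "a": "while",
            "while": "and", "and": "I",
        }
        word, out = "I", ["I"]
        for _ in range(30):
            word = bigrams[word]  # greedy; real models sample with temperature
            out.append(word)
        print(" ".join(out))
        # -> I will be working on it for a while and I will be working on ...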

  4. AI Dungeon only runs on GPT-3 if you have a subscription; otherwise it runs on GPT-2.

  5. DM (particularly Hassabis, I’m not sure about Legg’s current views) believes that AGI will require effectively replicating the human brain module by module, and that while these modules will be extremely large and expensive

    I was able to replicate the human brain by buying my significant other a nice dinner. It then took about nine months. It wasn’t cheap, but probably cheaper than those modules.

    There’s nothing wrong with those electromechanical and software toys; they are great for crunching numbers, data storage, and automation. But ultimately, general intelligence advances will come from genetic engineering. Augmented by implanted computers, of course (a phone is practically an arm extension already), but fundamentally, cellular exponential growth and biochemistry will be a better path forward.

    Unless of course one of those computer boxes can tell me how come pigeons don’t get lost on their way home. Until then I’m betting on biology. All must bow to the Zerg Overmind! 🙂

  6. Daniel Chieh says

    Children cost a lot more than computers, and we usually don’t optimize biological beings for efficiency, short of slavery.

  7. Exactly. I don’t think actual AI will ever occur until we get to Dyson swarm tier of civilization, at least.

    However, making computers small enough and efficient enough so you can hook up to one and quadruple your intelligence is a good enough solution, and feasible.

    Yeah, we pretty much become either the Combine or the Borg, depending on the type of society that implements it.

  8. Well, kid prices vary, but even on the expensive side you are probably looking at $500k all in. In Afghanistan you can probably raise one for a few grand and still turn a profit.

    The kind of computer the AI people were talking about is probably way more than half a million dollars. Expensive modules and such.

    And in the future, designer babies are inevitable. People will want to have stronger, smarter, healthier spawn. So I think intelligence research will find receptive ground there.

  9. Neither AI (the real one, as opposed to marketing hype) nor designer babies will happen in the lifetimes of anyone living today.

  10. anonymous coward says

    Remind me, what problem are we solving with “AI” again?

    Business wants three things: a) reliability, b) performance, c) cost.

    a) is solved by math and statistics, c) is solved by open borders and globalism. That leaves b), but solving a problem quickly yet unreliably and expensively is a massively hard sell unless you’re in the scamming and pump-and-dump business.

  11. I think it will be sooner than 80 years.

  12. SIMP simp says

    GPT-3 is already more coherent than the average Unz commenter.

  13. I just want to point out that Gwern is actually pushing the possibility that the “scaling hypothesis” is correct. In this quote he is just speculating about why DeepMind and Google Brain are not (yet) following suit with massively scaled language (or multimodal) models.

    You should really spend some time digging into the differences in emergent skills between GPT-2 and GPT-3. I think GPT-3 is a big deal. Of course it remains to be seen how many steps we still have to go.

    I think short-term AGI is a distinct possibility, mainly because the scaling hypothesis becomes a bit more convincing every year. And, frankly, I have never seen convincing arguments against it. I wish one of the more competent AGI skeptics, like Rodney Brooks, would put some effort into actually making the case against it.
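
    For a concrete flavor of those emergent skills: the headline result in the GPT-3 paper is few-shot “in-context learning”, where the task is specified entirely by examples in the prompt, with no fine-tuning. A sketch of that kind of prompt (the translation pairs are illustrative):

        # Few-shot prompting, GPT-3's headline emergent skill: the task is
        # defined purely by examples inside the prompt; no gradient updates.
        prompt = (
            "English: cheese\nFrench: fromage\n\n"
            "English: house\nFrench: maison\n\n"
            "English: bread\nFrench:"
        )
        print(prompt)
        # GPT-2 tends to ramble off-task on prompts like this; GPT-3 reliably
        # continues with " pain".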