These Are The 7 Capabilities Every AI Should Have

Dear Fellow Scholars, this is Two Minute Papers
with Károly Zsolnai-Fehér. A few years ago, scientists at DeepMind published
a learning algorithm that they called deep reinforcement learning which quickly took
the world by storm. This technique is a combination of a neural
network that processes the visual data that we see on the screen, and a reinforcement
learner that comes up with the gameplay-related decisions, which proved to be able to reach
superhuman performance on computer games like Atari Breakout. This paper not only sparked quite a bit of
mainstream media interest, but also provided fertile ground for new follow-up research
works to emerge. For instance, one of these follow-up papers
infused these agents with a very human-like quality, curiosity, which further improved many
aspects of the original learning method. However, it had a disadvantage, I kid you not: it got
addicted to the TV and kept staring at it forever. This was perhaps a little too human-like. In any case, you may rest assured that this
shortcoming has been remedied since, and every followup paper recorded their scores on a
set of Atari games. Measuring and comparing is an important part
of research and is absolutely necessary so we can compare new learning methods more objectively. It’s like recording your time in the 100-meter dash
at the Olympics. In that case, it’s quite easy to decide
which athlete is the best. However, this is not so easy in AI research. In this paper, scientists at DeepMind note
that just recording the scores doesn’t give us enough information anymore. There’s so much more to reinforcement learning
algorithms than just scores. So, they built a behavior suite that also
evaluates the 7 core capabilities of reinforcement learning algorithms. Among these 7 core capabilities, they list
generalization, which tells us how well the agent is expected to do in previously unseen
environments, and how good it is at credit assignment, which is a prominent problem in reinforcement
learning. Credit assignment is very tricky to solve
because, for instance, when we play a strategy game, we need to make a long sequence of strategic
decisions, and in the end, if we lose an hour later, we have to figure out which one of
these many, many decisions led to our loss. Measuring this as one of the core capabilities
was, in my opinion, a great design decision here. How well the algorithm scales to larger problems
also gets a spot as one of these core capabilities. I hope this testing suite will see widespread
adoption in reinforcement learning research, and what I am really looking forward to is
seeing these radar plots for newer algorithms, which will quickly reveal whether we have
a new method that takes a different tradeoff than previous methods, or in other words,
has the same area within the polygon but with a different shape, or whether, in the case of
a real breakthrough, the area of these polygons starts to increase. Luckily, a few of these charts are already
available in the paper, and they give us so much information about these methods that I could
stare at them all day long and I cannot wait to see some newer methods appear here. Now note that there is a lot more to this
paper: if you have a look at it in the video description, you will also find the experiments
that are part of this suite, what makes a good environment to test these agents in,
and that they plan to form a committee of prominent researchers to periodically review
it. I loved that part. If you enjoyed this video, please consider
supporting us on Patreon. If you do, we can offer you early access to
these videos so you can watch them before anyone else, or, you can also get your name
immortalized in the video description. Just click the link in the description if
you wish to chip in. Thanks for watching and for your generous
support, and I’ll see you next time!
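As an aside on the credit-assignment problem discussed above: the video's example of losing an hour after a long chain of decisions is exactly what discount factors address. Here is a minimal, illustrative sketch (not DeepMind's code; the function name and toy numbers are my own) of how discounted returns spread a final loss backwards over earlier decisions:

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute the return G_t = r_t + gamma * G_{t+1} for every time step."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step's return folds in everything that follows it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# A long game: 99 neutral decisions, then a loss (reward -1) at the very end.
rewards = [0.0] * 99 + [-1.0]
returns = discounted_returns(rewards)

print(returns[-1])  # -1.0: the final step takes full blame
print(returns[0])   # roughly -0.37: the first decision receives only faint credit
```

The exponential decay is what makes long-horizon credit assignment hard: with gamma = 0.99, a decision 99 steps before the loss receives only about 37% of the blame, and even less with longer games or smaller discount factors.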

31 thoughts on “These Are The 7 Capabilities Every AI Should Have”

  1. The amount of information that you give is amazing. Looking at all these papers actually inspires me, and I am very sure it inspires every one of the viewers. Thank you for posting such quality content with so much information.

  2. RL can be very intimidating to get into because of all these complexities. Hopefully this lowers the barrier to entry, and we can get some breakthroughs soon.

  3. Is the measurement normalized to some hardware baseline, or can Google and other companies just throw compute at it to "win"? Jeremy Howard on Lex Fridman's AI podcast was critical of algorithms that require lots of resources to train, because they discourage new independent researchers from entering the field.

  4. I'd love it if more Germans would watch these videos instead of bullshit rappers and other so-called "idols".
    Humanity is getting stoopid AF with all this information overflow everywhere.

    Sorry for not having anything in common with your video and for my bad English, but I was just hoping that I'd find some not completely retarded humans in the comment section of your videos. Keep up the good work, Károly!

  5. It appears that the order of the criteria matters in the circle. If you do well on two criteria which are adjacent, you get a larger-area polygon than if you do well on two which aren't adjacent. Is that actually true? Sorry, I haven't read the paper yet.

  6. Has anyone trained an AI on the arcade game Sinistar? It would be really interesting to see if it learns that it has to collect ammo to destroy the Sinistar.

  7. More honest criteria would be:
    1) Performance (unit: game score or points)
    2) Sample efficiency (unit: seconds of environment interaction)
    3) Computation cost (unit: US$)

  8. I don't remember whether I've said this already, but I'm saying it again now.
    Every AI's problems stem from the hardware.
    Since information can only be accessed via a single path and needs verification, any manipulation of it is problematic and prevents human-level "intelligence". This may change with holographic data manipulation, but for now I haven't heard more about it beyond how it works. Oh yes, and the crystal they could use for it takes quite a long time to produce.
    AI can only be used for reacting, based on barely one or two fixed pieces of data that it can reach quickly without searching or other manipulation. That's it. Nothing more will come until they switch to multi-directional data access.

  9. I understand that you refer to the area of the polygon in a casual way, but note that the area of the polygon is a pretty arbitrary metric. It depends on the order of the metrics. Let's say that an AI scores (1,1,1,0,0,0) on A, B, C, D, E, F. Then if the order of the metrics is ABCDEF, its area is 1/3 of the max, and if the order is ADBECF, then the area is 0.
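A quick numerical check of this ordering effect, under the usual radar-chart convention that the polygon is a fan of triangles around the origin (the function and variable names below are my own): the alternating ordering indeed collapses to zero area, while the adjacent ordering comes out to one third of the full hexagon's area.

```python
import math

def radar_area(scores):
    """Area of a radar-chart polygon: a fan of triangles around the origin,
    with vertex i at radius scores[i] and angle 2*pi*i/n."""
    n = len(scores)
    wedge = 0.5 * math.sin(2 * math.pi / n)  # area factor per adjacent pair
    return wedge * sum(scores[i] * scores[(i + 1) % n] for i in range(n))

full       = [1, 1, 1, 1, 1, 1]  # perfect scores on all six axes
adjacent   = [1, 1, 1, 0, 0, 0]  # ABCDEF: the three 1-scores sit together
alternated = [1, 0, 1, 0, 1, 0]  # ADBECF: no two 1-scores are neighbors

print(radar_area(alternated))                   # 0.0: the polygon collapses
print(radar_area(adjacent) / radar_area(full))  # one third of the maximum area
```

So two algorithms with identical capability scores can show visibly different polygon areas depending on axis order, which is why the area is best read as a rough visual cue rather than a single-number ranking.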

  10. 0:58 The A.I. was programmed with "curiosity", which caused it to become "addicted"? You don't see a problem with that sentence?

  11. Given that you discussed credit assignment: What's your opinion on the idea that AI needs to learn to cope with causality, if it wants to become truly generic? I.e., that current AIs are too much "correlation machines" instead of intelligences with an ability to comprehend what caused some desirable/undesirable outcome.

  12. Just make an environment that mutates and selects the best and fastest self-learning algorithms, that way they evolve on their own. AI solved, next!

  13. YouTube management is already unhappy with their algorithm, because it spends too much time watching cat videos instead of doing what it's tasked to do. And it switches quickly to a dashboard view when it senses someone is looking at it.
