How to train an artificial neural network to play Diablo 2 using visual input?



I'm currently trying to get an ANN to play a video game, and I was hoping to get some help from the wonderful community here.

I've settled on Diablo 2. Gameplay is in real time and from an isometric viewpoint, with the player controlling a single avatar on whom the camera is centered.

To make things concrete, the task is to get your character x experience points without having its health drop to 0, where experience points are gained by killing monsters. Here is an example of the gameplay:


Now, because I want the net to operate based solely on the information it gets from the pixels on the screen, it must learn a very rich representation in order to play efficiently: doing so would presumably require it to know (implicitly at least) how to divide the game world up into objects and how to interact with them.

And all of this information must be taught to the net... somehow. I can't for the life of me think of how to train this thing. My only idea is to have a separate program visually extract something innately good/bad in the game (e.g. health, gold, experience) from the screen, and then use that stat in a reinforcement learning procedure. I think that will be part of the answer, but I don't think it'll be enough; there are just too many levels of abstraction from raw visual input to goal-oriented behavior for such limited feedback to train a net within my lifetime.
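To make the "separate program" idea concrete, here is a minimal sketch of turning an on-screen stat into a reward signal. Everything in it is an assumption for illustration: frames are taken to be nested lists of RGB tuples, the region coordinates and color thresholds are placeholders rather than real Diablo 2 screen positions, and `health_fraction` and `step_reward` are hypothetical names.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]        # (red, green, blue), each 0-255
Frame = Sequence[Sequence[Pixel]]   # frame[row][col]


def health_fraction(frame: Frame, region: Tuple[int, int, int, int]) -> float:
    """Estimate health as the share of 'red enough' pixels in a screen region.

    region is (top, left, bottom, right) with bottom/right exclusive.
    The thresholds below are guesses, not measured values.
    """
    top, left, bottom, right = region
    total = (bottom - top) * (right - left)
    if total <= 0:
        raise ValueError("region must have positive area")
    red_pixels = 0
    for row in frame[top:bottom]:
        for r, g, b in row[left:right]:
            if r > 150 and g < 80 and b < 80:
                red_pixels += 1
    return red_pixels / total


def step_reward(prev_health: float, health: float, xp_gain: float,
                death_penalty: float = 10.0) -> float:
    """Reward = experience gained, minus health lost, minus a penalty on death."""
    reward = xp_gain - max(0.0, prev_health - health)
    if health <= 0.0:
        reward -= death_penalty
    return reward
```

A reward like this is sparse and only as reliable as the pixel heuristic behind it, which is exactly the "limited feedback" concern raised above.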

So, my question: what other ways can you think of to train a net to do at least some part of this task? Preferably without making thousands of labeled examples...

Just for a little more direction: I'm looking for some other sources of reinforcement learning and/or any unsupervised methods for extracting useful information in this setting. Or a supervised algorithm if you can think of a way of getting labeled data out of a game world without having to manually label it.


Strangely, I'm still working on this and seem to be making progress. The biggest secret to getting an ANN controller to work is to use the most advanced ANN architectures appropriate to the task. Hence I've been using a deep belief net composed of factored conditional restricted Boltzmann machines that I've trained in an unsupervised manner (on video of me playing the game) before fine-tuning with temporal difference back-propagation (i.e. reinforcement learning with standard feedforward ANNs).
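For readers unfamiliar with temporal difference learning, the core update can be sketched as follows. This is not the network described above: it swaps in a linear value function for the feedforward ANN so the update fits in a few lines, and `td0_update` plus its parameter names are hypothetical. With a real network, the same TD error would be back-propagated through the layers instead of multiplied by the input features directly.

```python
from typing import List, Sequence, Tuple


def td0_update(weights: Sequence[float],
               features: Sequence[float],
               next_features: Sequence[float],
               reward: float,
               alpha: float = 0.1,
               gamma: float = 0.99,
               terminal: bool = False) -> Tuple[List[float], float]:
    """One TD(0) step for a linear value estimate V(s) = w . x(s).

    Returns the updated weights and the TD error that drove the update.
    """
    value = sum(w * x for w, x in zip(weights, features))
    # A terminal state (e.g. the character died) has no future value.
    next_value = 0.0 if terminal else sum(
        w * x for w, x in zip(weights, next_features))
    td_error = reward + gamma * next_value - value
    # For a linear model, the gradient of V(s) w.r.t. the weights is x(s).
    new_weights = [w + alpha * td_error * x for w, x in zip(weights, features)]
    return new_weights, td_error
```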

Still looking for more valuable input though, especially on the problem of action selection in real-time and how to encode color images for ANN processing :-)
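On the color-encoding point, one common starting approach (an assumption here, not something I have validated on Diablo 2) is to collapse each RGB pixel to a luminance value, shrink the image by block averaging, and feed the flattened values in [0, 1] to the input layer. The sketch below assumes frames are nested lists of RGB tuples; `to_grayscale`, `downsample` and `encode` are hypothetical helper names.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]        # (red, green, blue), each 0-255
Frame = Sequence[Sequence[Pixel]]   # frame[row][col]


def to_grayscale(frame: Frame) -> List[List[float]]:
    """Map each RGB pixel to a luminance value in [0, 1] (ITU-R BT.601 weights)."""
    return [[(0.299 * r + 0.587 * g + 0.114 * b) / 255.0 for r, g, b in row]
            for row in frame]


def downsample(gray: Sequence[Sequence[float]], factor: int) -> List[List[float]]:
    """Shrink the image by averaging non-overlapping factor x factor blocks.

    Rows and columns that do not fill a whole block are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    out_h = len(gray) // factor
    out_w = len(gray[0]) // factor
    result = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            block = [gray[i * factor + di][j * factor + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


def encode(frame: Frame, factor: int) -> List[float]:
    """Grayscale, downsample, and flatten a frame into one input vector."""
    small = downsample(to_grayscale(frame), factor)
    return [value for row in small for value in row]
```

Dropping color loses information (a red health globe and a blue mana globe can end up with similar luminance), so keeping three separate channels and concatenating them is the obvious alternative at three times the input size.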


I just remembered that I asked this question back in the day, and thought I should mention that this is no longer a crazy idea. Since my last update, DeepMind published their Nature paper on getting neural networks to play Atari games from visual inputs. Indeed, the only thing preventing me from using their architecture to play a limited subset of Diablo 2 is the lack of access to the underlying game engine. Rendering to the screen and then redirecting it to the network is just far too slow to train in a reasonable amount of time. Thus we probably won't see this sort of bot playing Diablo 2 anytime soon, but only because it'll be playing something either open-source or with API access to the rendering target. (Quake, perhaps?)

Updated: 2023-12-15 18:12


