Machine Learning in Video Games

J Young Kim

New Member
Contributor
Design
What is Machine Learning?

In simple terms, machine learning is when the artificial intelligence of a Non-Player Character (NPC) learns how to behave correctly from the mistakes it makes. In other words, the NPC teaches itself through trial and error.

Sources of Information

Machine Learning in Game Development (http://ai-depot.com/GameAI/Learning.html)

Summary: There are two methods of machine learning: direct (intentionally putting the NPC in certain situations so that it learns from them) and indirect (the NPC passively learns from whatever environment it is in). Direct learning is often preferred because it does not restrict which behaviors the NPC can learn (e.g. certain situations may never occur naturally in the environment, so they have to be set up intentionally). To be an effective learning agent, the NPC must be able to learn, perform, be curious, and self-evaluate the effectiveness of the action it performed. Essentially, the learning is a loop: the NPC is curious about what would happen if it takes a certain action in a situation → the NPC takes the action → the NPC grades how good the action was → the NPC learns that the action is either good or bad → the NPC becomes curious again, and the loop continues. The important thing is that the player should not tell the NPC whether an action was good; the NPC has to learn on its own to be an effective artificial intelligence. To have a decent AI when the player first boots up the game, the developers can let the NPC “play” around, and once it has a decent sense of what is right and wrong, they can “freeze” the AI and package the game for release.
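To make the loop above a bit more concrete, here is a minimal sketch in Python. It is not taken from the linked article. The class name, the situation and action names, the grading function, and the numbers are all made up for illustration, and “freeze” is modeled simply as switching off curiosity and learning, which is only one possible reading of the term.

```python
import random


class LearningAgent:
    """Toy trial-and-error learner: curiosity -> act -> grade -> learn."""

    def __init__(self, actions, curiosity=0.2, learning_rate=0.5, seed=None):
        self.actions = list(actions)
        self.curiosity = curiosity          # chance of trying a random action
        self.learning_rate = learning_rate  # how strongly a new grade counts
        self.frozen = False                 # True once "packaged for release"
        self.values = {}                    # (situation, action) -> learned score
        self.rng = random.Random(seed)

    def choose(self, situation):
        # Curiosity: sometimes try something other than the best-known action.
        if not self.frozen and self.rng.random() < self.curiosity:
            return self.rng.choice(self.actions)
        return max(self.actions,
                   key=lambda a: self.values.get((situation, a), 0.0))

    def learn(self, situation, action, grade):
        # The grade comes from the NPC's own evaluation, never from the player.
        if self.frozen:
            return
        old = self.values.get((situation, action), 0.0)
        self.values[(situation, action)] = old + self.learning_rate * (grade - old)

    def freeze(self):
        self.frozen = True


def self_grade(situation, action):
    # Hypothetical self-evaluation: jumping off a cliff hurts, the rest is fine.
    if situation == "at_cliff_edge" and action == "jump":
        return -10.0
    return 1.0


agent = LearningAgent(["jump", "walk_back", "climb_down"], seed=0)
for _ in range(200):  # direct learning: the situation is forced on the NPC
    action = agent.choose("at_cliff_edge")
    agent.learn("at_cliff_edge", action, self_grade("at_cliff_edge", action))
agent.freeze()
print(agent.choose("at_cliff_edge"))
```

After the training runs, the frozen agent should no longer pick "jump" at the cliff edge, because that action only ever received bad grades.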

Implementation in Terasology: The developers can first apply direct learning to the NPCs by putting them in “bad” situations (e.g. placing the NPC near a cliff). Once the game is packaged and released, the NPC would have to rely on indirect learning. Using a miner NPC as an example, it can learn that jumping off high cliffs is not optimal and that touching lava is not a good idea. In addition, we can improve the AI by placing hazards in the way of high-reward items (valuable ore blocks) and letting the NPC miner decide the best way to stay alive and still get the ore. Once the miner NPC can more or less survive its mining adventures, the miner’s AI can be frozen and packaged for the game’s release. When the player boots up the world, the miner will start learning indirectly, and as the player progresses through his or her world, the miner will become very smart. The unfortunate part about machine learning is that after a certain amount of time the miner may become so smart that it makes no errors, which makes it seem almost non-humanlike. Or, in another case, it may keep repeating dumb actions if its curiosity cannot come up with another action to take in a certain situation.


Drivatar™ (http://www.ign.com/wikis/forza-5/Drivatar)

Summary: Drivatar™ is a system used to control the opponent cars in Forza 5™. The system takes data from Xbox Cloud about how other players drive. As a result, the opponent cars are more realistic and seem to be controlled by a real human rather than a bot. Drivatar™ learns by taking note of its player’s driving style. For instance, when the player takes a corner, a rating is given, which Drivatar™ records so that it takes corners in a way that matches that rating. Drivatar™ also learns from how often the player brakes, crashes, etc. To keep Drivatar™ from becoming an insane bot that crashes all over the place because its player intentionally drives poorly to influence the system, the creators limited the extent to which Drivatar™ can be “crazy.”
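As a rough illustration of limiting how “crazy” a learned style can get, here is a small Python sketch. This is not how Drivatar™ actually works internally (that is not described in the source). The function name, the 0-to-1 rating scale, the blend rate, and the bounds are all assumptions made for the example.

```python
def update_style(current, observed, rate=0.1, low=0.3, high=1.0):
    """Blend one observed rating into the learned style, then clamp it.

    All parameters are hypothetical: ratings are assumed to be on a 0..1
    scale, and [low, high] is the range the developers allow.
    """
    blended = current + rate * (observed - current)
    return max(low, min(high, blended))


style = 0.8
for rating in [0.0] * 50:  # a player deliberately driving as badly as possible
    style = update_style(style, rating)
print(style)
```

No matter how many terrible corners the player feeds in, the learned style stops at the lower bound instead of dropping to zero.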

Implementation in Terasology: Drivatar™ brings the unique idea of using data from players all around the globe to make the cars’ artificial intelligence more human-like. If Terasology has humanoid NPCs (e.g. miners, shopkeepers, doctors, etc.), then a system like Drivatar™ would be useful. Unfortunately, the system would mostly apply to humanoid NPCs that can mimic the player’s activities (e.g. miners, diggers, and woodcutters), meaning that the artificial intelligence of NPCs such as doctors and shopkeepers would not benefit. To explain clearly how a Drivatar™-like system could be implemented in Terasology, we can take an NPC miner as an example. When players mine blocks, we can collect data. We can see when players tend to go back to the surface (e.g. when the inventory is half full vs. completely full, when the player has no more torches, etc.), whether a player would jump down from a great height or just mine downwards safely, and so on. Of course, to prevent the system from copying players who die intentionally, we can write a script that prevents the NPC from taking suicidal actions (e.g. jumping to its death). When a miner is brought into the world, the habits of one player would be selected for the miner to mimic. To make a realistic artificial intelligence, we can have the NPC choose only from players who have sufficient data to mimic (so no new players would be mimicked). The Terasology system would be more difficult to build because, unlike in Forza 5™, the worlds are not fixed. With this in mind, the NPCs would use decision trees and go with the choice that the player being mimicked tends to make. In other words, the NPC would have to see what it can do in a given situation (e.g. jump down, mine down, use the waterfall, etc.), see what the player data tends to do in a similar situation, and then make a decision.
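The last few steps (list what the NPC can do, drop suicidal options, then follow what the mimicked player usually did) could look something like the Python sketch below. The situation and action names, the forbidden list, and the 20-sample threshold are invented for the example, and a simple frequency count stands in for a full decision tree.

```python
from collections import Counter

# Hypothetical safety rule: (situation, action) pairs the NPC must never copy.
FORBIDDEN = {("high_ledge", "jump_down")}


def has_enough_data(history, min_samples=20):
    """Only players with enough recorded decisions are worth mimicking."""
    return len(history) >= min_samples


def choose_action(situation, available, history):
    """Pick the allowed action the mimicked player chose most often."""
    safe = [a for a in available if (situation, a) not in FORBIDDEN]
    if not safe:
        return None  # nothing safe to do here
    counts = Counter(a for s, a in history if s == situation and a in safe)
    if not counts:
        return safe[0]  # no matching data: fall back to the first safe option
    return counts.most_common(1)[0][0]


# Recorded (situation, action) pairs from one made-up player.
history = ([("high_ledge", "jump_down")] * 30
           + [("high_ledge", "mine_down")] * 12
           + [("high_ledge", "use_waterfall")] * 5)

if has_enough_data(history):
    decision = choose_action(
        "high_ledge", ["jump_down", "mine_down", "use_waterfall"], history)
    print(decision)
```

Even though this player jumped most of the time, the NPC ends up mining down, because jumping is filtered out before the player data is consulted.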

Final Words:
Currently, there are two types of machine learning covered here. In one, the NPC learns on its own; in the other, player data is used to emulate human behavior. To introduce more interesting aspects to Terasology, the self-learning method would be best. Although ‘Single Player’ mode has the player playing alone, we can make it seem as if he or she is not alone: we can have NPCs hunting in the wilderness, cutting wood, fishing, etc. For NPCs to do such tasks effectively, self-learning may be best, since it relies solely on the NPC’s own data, not player data. The gameplay would also benefit, since an AI that learns at a decent pace gives the illusion that the NPCs are evolving while the player is learning to play the game more effectively. This does not mean that emulation using player data is not useful. Player data can save the developers time by giving the NPC a decent AI before it is subjected to either direct or indirect learning. Regardless of which method is used, the most important part is for the developer to prevent the NPC AIs from being overly dumb, whether that means woodcutters teaching themselves that punching trees is the best way to cut wood or woodcutters copying player data in which the shovel is the main woodcutting tool. For those who are interested in creating machine learning AIs for Terasology, the following links will be helpful.


Links: The following links describe algorithms or models commonly used in machine learning

· A site of Machine Learning algorithms: http://satirist.org/learn-game/

· Belief-Desire-Intention Software Model (https://en.wikipedia.org/wiki/Belief–desire–intention_software_model)

· Rule-Based System (https://en.wikipedia.org/wiki/Rule-based_system)

· Decision Tree (https://en.wikipedia.org/wiki/Decision_tree)

· Perceptron (https://en.wikipedia.org/wiki/Perceptron)
 

Skaldarnar

Development Lead
Contributor
Art
World
SpecOps
Very good research :)

The Drivatar idea seems interesting. We might find a way around the issue that we need behavior data for non-typical player actions, such as those of miners and shopkeepers: we could integrate the data collection into simple, time-limited quests. Say, for instance, the player is given a new pickaxe and asked to go down a mineshaft and mine 25 coal blocks within 5 minutes. The movement behavior, mining, etc., could then be recorded for those 5 minutes. Thus, we could get "real" data from players for typical NPC tasks.

What do you think about that idea?
 

J Young Kim

New Member
Contributor
Design
I believe that it's an excellent idea. I did not mention this in the original post, but Drivatar also has an offline function which relies on data which the developers embedded before shipping the game. For Terasology, various recordings of players performing the task during a certain time frame would allow the NPC to have a decent pool of data to use.

However, it may be infeasible to get players to do timed tasks.

So here is another idea along the same lines of using player actions for the NPC AI: players can be given questionnaires, and the data collected from them can be translated into NPC behavior.

For instance (sample quiz for miners):
1. If you see diamonds but lava is in the way, would you:
  • Ignore the diamonds
  • Try to build across the lava
  • Go around the lava via mining
2. If your inventory is full but there are ores around you, would you:
  • Place a torch, go to the surface and empty your inventory, and find your way back to the torch
  • Ignore the ore
  • Free up a slot in your inventory
3. Around how much damage would you take before surfacing?
  • A lot
  • A little
  • A scratch
  • Doesn't matter
Although the NPC will be a bit robotic, the generated NPC miner can make weighted decisions based on the polls (e.g. a 50% chance of ignoring the diamonds, a 10% chance of trying to build across, and a 40% chance of going around the lava), which gives the feel that it's making human-like decisions.
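Here is a quick Python sketch of what such a weighted decision could look like, using the example percentages above as made-up poll results for question 1. The dictionary keys and the function name are just placeholders.

```python
import random

# Hypothetical poll results for question 1 (share of players per answer).
poll_votes = {
    "ignore_diamonds": 50,
    "build_across_lava": 10,
    "mine_around_lava": 40,
}


def pick_from_poll(votes, rng=random):
    """Choose one answer at random, weighted by how many players picked it."""
    options = list(votes)
    weights = [votes[option] for option in options]
    return rng.choices(options, weights=weights, k=1)[0]


rng = random.Random(42)  # fixed seed only so the sketch is repeatable
sample = [pick_from_poll(poll_votes, rng) for _ in range(10_000)]
share_ignored = sample.count("ignore_diamonds") / len(sample)
print(round(share_ignored, 2))
```

Over many decisions the miner ignores the diamonds roughly half the time, but any single decision stays unpredictable, which is where the human-like feel would come from.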
 

Cervator

Org Co-Founder & Project Lead
Contributor
Design
Logistics
SpecOps
This is a super interesting topic, like the other AI one :)

Two quick things I'll add to it:
  1. I wrote up a thread on various kinds of automated testing, parts of which might well be prerequisites for machine learning as we'd need better interfaces to the game (such as a "headless client")
  2. We've talked with patham9 over from OpenNARS about their interest in eventually having their machine learning agents learn Terasology by playing it even through a regular video-feed only - no hints, just simulated input and parsing out what's on the screen. Sounds crazy! Probably years off still, but a cool idea
 