Robots, like humans, have to make decisions, and this in turn requires weighing many options and hundreds of potential outcomes. A robot can simulate these outcomes to determine which course of action is most likely to lead to a successful operation. But what happens if other options also lead to success, and perhaps to a safer result? This question is posed in a report by the "Future Observatory" of the Dubai Future Foundation.

The report adds that the Office of Naval Research awarded Brendan Englot, a mechanical engineer at the Stevens Institute of Technology, its 2020 Young Investigator award, worth about $509,000, to develop a new form of artificial intelligence that allows robots to anticipate the many potential consequences of their actions and how likely each one is to occur. The framework will let robots determine the best option for achieving their goals and identify which options are safest, most effective and least likely to fail.

"If the fastest way to complete a robot is to do a certain task is to walk on the edge of a cliff, speed will threaten its safety," said Inglot, one of the first to use the enhanced learning tool to train robots. And complete the required mission ».

Reinforcement learning has been used for years to train robots to move autonomously on land, in water and in the air. But this AI tool has its limitations, because it makes decisions based on a single expected outcome for each available action, while in reality many other outcomes are possible. Englot uses distributional reinforcement learning, an artificial intelligence algorithm with which a robot can evaluate all possible outcomes of each action, predict how likely each one is to succeed, and choose the option best suited to success while preserving the robot's safety and security.
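To make the difference concrete, the sketch below (written in Python, and not taken from Englot's actual framework) compares the two approaches on a made-up choice between a fast route along a cliff edge and a slower inland route: a classical agent ranks actions by their average return, while a distributional agent keeps the full distribution of possible returns and can apply a risk-aware criterion such as the average of the worst few percent of outcomes. All numbers, route names and the cvar helper are illustrative assumptions.

```python
import numpy as np

def expected_value(returns, probs):
    """Classical criterion: compare actions by their average return."""
    return float(np.dot(returns, probs))

def cvar(returns, probs, alpha=0.1):
    """Risk-aware criterion: average return over the worst `alpha`
    fraction of outcomes (Conditional Value at Risk)."""
    order = np.argsort(returns)                 # worst outcomes first
    r, p = returns[order], probs[order]
    cum = np.cumsum(p)
    k = int(np.sum(cum <= alpha))               # atoms fully inside the tail
    covered = cum[k - 1] if k > 0 else 0.0
    tail_returns = np.append(r[:k], r[k])       # include the boundary atom
    weights = np.append(p[:k], alpha - covered)
    return float(np.dot(tail_returns, weights) / alpha)

# Hypothetical return distributions for two routes to the same goal:
# the cliff-edge route is faster on average but occasionally catastrophic,
# the inland route is slower but never fails.
cliff_route  = (np.array([-100.0, 40.0]), np.array([0.05, 0.95]))
inland_route = (np.array([25.0, 30.0]),  np.array([0.50, 0.50]))

for name, (r, p) in [("cliff edge", cliff_route), ("inland", inland_route)]:
    print(f"{name:10s} mean return = {expected_value(r, p):7.2f}   "
          f"worst-10% return = {cvar(r, p):7.2f}")
```

By the average-return criterion the cliff-edge route looks better; by the worst-case criterion the inland route wins, which is exactly the trade-off a distributional view of the outcomes is meant to expose.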

Englot's first task is to master the algorithm before developing it further and applying it to a robot. Englot and his team create a number of decision-making scenarios to test their algorithm, and they often turn to one of their favorite pastimes: Atari games.

For example, in the game Pac-Man, the player decides how Pac-Man behaves. The goal is to collect all the dots in the maze, along with some fruit. But ghosts drift around and can kill him at any moment, forcing a decision at every junction: go straight ahead, or turn left or right? Which path yields the most points while also avoiding the ghosts?

Englot's artificial intelligence algorithm, using distributional reinforcement learning, replaces the human player, simulating every possible move in order to move safely.

How will the robot be rewarded?

If the robot falls off the cliff, it gets -100 points. If it takes a slower but safer route, it receives -1 point for every step along the path, and if it successfully reaches the goal it gets +50. "One of our secondary goals is to learn how to design reward signals that positively influence the robot's decision-making and its training process," says Englot. "We hope that the technologies we develop in this project will be used in even more complex AI applications."
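As a rough illustration of how such a reward scheme plays out, the sketch below builds a hypothetical cliff-walk grid using the three values quoted in the report; the grid layout and the example paths are assumptions made for the illustration, not part of Englot's project.

```python
# A minimal sketch of the reward scheme quoted above, on a hypothetical
# cliff-walk grid. The reward values (-100, -1, +50) come from the report;
# the grid layout and the example paths are illustrative assumptions.

FALL_PENALTY = -100   # the robot falls off the cliff
STEP_COST    = -1     # small cost for every step on a safe cell
GOAL_REWARD  = +50    # the robot reaches its destination

WIDTH = 6
GOAL  = (WIDTH - 1, 0)
CLIFF = {(x, 0) for x in range(1, WIDTH - 1)}   # cells along the bottom edge

def reward(cell):
    """Reward for entering `cell`, and whether the episode ends there."""
    if cell in CLIFF:
        return FALL_PENALTY, True     # catastrophic failure
    if cell == GOAL:
        return GOAL_REWARD, True      # mission accomplished
    return STEP_COST, False           # keep moving, pay a small time cost

def path_return(path):
    """Total reward accumulated along a sequence of visited cells."""
    total = 0
    for cell in path:
        r, done = reward(cell)
        total += r
        if done:
            break
    return total

# All three paths begin at the corner cell (0, 0), which is not itself rewarded:
#   1. the short route hugging the cliff edge, when nothing goes wrong
#   2. the same route when the robot slips onto a cliff cell partway through
#   3. a longer detour that stays well clear of the cliff
edge_ok   = [(0, 1), (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), GOAL]
edge_slip = [(0, 1), (1, 1), (2, 1), (2, 0)]          # (2, 0) is a cliff cell
detour    = [(0, 1), (0, 2), (1, 2), (2, 2), (3, 2),
             (4, 2), (5, 2), (5, 1), GOAL]

for name, path in [("edge, no slip", edge_ok),
                   ("edge, slipped", edge_slip),
                   ("safe detour  ", detour)]:
    print(f"{name}: return = {path_return(path)}")
```

The edge-hugging route earns the highest return when nothing goes wrong, but a single slip turns it into the worst outcome by far, while the detour gives up a few points for a result that never varies.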
