Recent reports about an EU project on automated border control usually mentioned two facts. First, a lie detector could help officials at the EU's external borders screen travelers, as part of the iBorderCtrl research project funded by the EU's Horizon 2020 program. Second, in initial experiments outside the researchers' labs, the detector achieved an accuracy of around 75 percent.

The reports raised many questions, and on social media some catchy headlines provoked considerable indignation. But what do you really need to know about the project?

The best place to start is the detector's accuracy. In this case, an accuracy of 75 percent does not mean that 25 percent of all those entering the EU will in future be classified as 'liars' and possibly turned away. "It's a risk score," says Jim O'Shea of the University of Manchester School of Computing, Mathematics and Digital Technology, who heads the polygraph section of iBorderCtrl, "not a statement that someone is a liar."

As part of iBorderCtrl, which launches a nine-month test phase with volunteers next year, travelers will be asked about, among other things, the contents of their baggage, their origin, and whether they have a relative or friend who can confirm their identity, and what relationship they have with this person.

It's about behavior

The lie detector system, which is based on machine learning, does not actually 'listen': what matters is not the content of the spoken words but how the person behaves. The system infers whether someone is presumably lying from, among other things, the finest facial movements and body language, explains Jim O'Shea, for example "whether someone is shifting back and forth in the chair".

This score flows into an overall value that is supplemented by other results, such as biometric facial recognition data, fingerprints and the like. That value, in turn, is reported to the border guards, who then decide whether they want to check the person more closely.
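How these individual results are combined is not described in the article; as a purely illustrative sketch, one could picture a weighted combination of hypothetical sub-scores (all names, values and weights below are assumptions, not details of iBorderCtrl):

```python
# Purely illustrative sketch: combining several hypothetical sub-scores
# into one overall risk value. Names, values and weights are assumptions,
# not details of the actual iBorderCtrl system.

def overall_risk(scores: dict, weights: dict) -> float:
    """Weighted average of sub-scores, each between 0.0 (low risk) and 1.0 (high risk)."""
    total_weight = sum(weights[k] for k in scores)
    return sum(scores[k] * weights[k] for k in scores) / total_weight

traveler = {
    "deception_score": 0.62,      # result of the avatar interview
    "face_mismatch": 0.10,        # 1.0 would mean the live face does not match the passport photo
    "fingerprint_mismatch": 0.05,
    "document_risk": 0.20,
}
weights = {"deception_score": 0.25, "face_mismatch": 0.35,
           "fingerprint_mismatch": 0.25, "document_risk": 0.15}

print(f"overall risk: {overall_risk(traveler, weights):.2f}")
```

The border guard would then see only this aggregated value, which is why O'Shea stresses that it is an indicator, not a verdict.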

But why do you need a polygraph at all if face recognition, fingerprints and visa checks are already quite reliable? "It's always better to have multiple factors," says Anastasia Garbi, coordinator of the iBorderCtrl project. "The more information you have about possible issues that can be faked in the process, the more accurate this process is overall." If, for example, forged passports go undetected, the lie detector would serve as a kind of backup.

"Morphing can be incredibly difficult to recognize"

Bernhard Strobl of the Austrian Institute of Technology, who also conducts research on automated border controls but is not involved in the current project, considers it sensible to test such innovative systems. "Face recognition is pretty good, but it can be bypassed."

Automated border controls, where travelers with a biometric passport place it in a scanner, already exist today. A face recognition system then checks whether the person is the same as in the photo.

But so-called morphing attacks can outsmart current technology: someone who wants to enter illegally can use a valid ID whose photo is a blend of their own image and that of the rightful passport holder. "Morphing can be incredibly difficult to recognize," says Bernhard Strobl.
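Real morphing attacks align facial landmarks and warp both faces before mixing them, but the basic blending idea can be sketched in a few lines (file names are placeholders, and this simplification ignores the alignment step entirely):

```python
# Minimal sketch of the blending step behind a face morph.
# Real attacks also align and warp facial landmarks first;
# the file names are placeholders.
from PIL import Image

face_a = Image.open("face_a.jpg").convert("RGB")
face_b = Image.open("face_b.jpg").convert("RGB")

# Bring both photos to the same size, then mix them 50/50.
face_b = face_b.resize(face_a.size)
morph = Image.blend(face_a, face_b, alpha=0.5)
morph.save("morph.jpg")
```

Because the result still resembles both faces, a scanner comparing it with either person can be fooled.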

Additions such as plausibility checks and biometric checks (3D face, fingerprint, hand vein, iris) are therefore useful for improving the reliability of such systems. "An adapted lie detector could be included if its performance in this particular environment could be proven." But here, too, the question arises whether travelers could "fool" the detector.

Strobl emphasizes that AI-supported automated border control eases the burden on all travelers: "We expect passenger traffic to double over the next few years, so a risk assessment in advance, as decision support, is needed." This should not, however, be confused with profiling.

Is the subject lying or is he telling the truth?

Even more important, however, as mentioned at the beginning, is interpreting the 75 percent accuracy correctly. If, in future, 25 percent of all travelers triggered an alert on the border guards' screens calling for closer scrutiny, such a system would be pointless: the officials would have more work, and the lines at the borders would grow even longer.

"Such a system would be useless," says Bernhard Strobel, "it only makes sense as a decision-making aid if the responsible officials receive few alerts."

The 75 percent refers to the average rate at which the system judged the 13 questions correctly. In the experiment, 32 subjects were each asked 13 questions. Half of the subjects were asked to lie, the other half told the truth. From various factors, the system produced one verdict per question: is the subject lying or telling the truth? Averaged over individual questions, the system was correct in 75 percent of cases.
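Made concrete: 32 subjects times 13 questions gives 416 individual verdicts, and the 75 percent is the share of those verdicts that were correct. A small sketch with made-up data (the experiment's real verdicts are not public) shows the calculation:

```python
# Illustrative per-question accuracy calculation with made-up verdicts;
# the real data of the experiment are not public.
import random

random.seed(0)
subjects, questions = 32, 13

# Ground truth per subject: roughly half were asked to lie.
is_liar = [s < subjects // 2 for s in range(subjects)]

correct = 0
for s in range(subjects):
    for _ in range(questions):
        # Made-up system verdict that matches the truth about 75% of the time.
        predicted_liar = is_liar[s] if random.random() < 0.75 else not is_liar[s]
        correct += predicted_liar == is_liar[s]

total = subjects * questions  # 416 individual verdicts
print(f"{correct} of {total} verdicts correct -> {correct / total:.0%}")
```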

O'Shea and his colleagues also consulted psychologists about these factors. "Much is already known about how lies can be recognized." All of these factors are built into the system, but machine learning additionally 'looks for' factors of its own: patterns in the training data that people would not notice, that is, correlations in the data, for example in the behavior of all the 'lying' subjects taken together. "There are certainly more factors than previously known," says O'Shea, but unfortunately the system does not reveal them; it is the well-known 'black box' problem of machine learning.
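That a learned model picks up decision patterns nobody explicitly programmed can be illustrated with a generic classifier on invented behavioral features; this is a sketch of the principle, not the iBorderCtrl model:

```python
# Generic sketch: a classifier trained on behavioral features develops its
# own internal decision rules. Features and data are invented; this is not
# the iBorderCtrl model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 200  # made-up interview recordings

# Invented features per recording, e.g. fidgeting rate, gaze shifts,
# micro-expression count, answer latency.
X = rng.normal(size=(n, 4))
y = rng.integers(0, 2, size=n)  # 1 = instructed to lie, 0 = truthful

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# The model outputs a probability (a 'risk score'), but its rules are spread
# over hundreds of trees: the black box described in the article.
print(model.predict_proba(X[:1]))
print(model.feature_importances_)  # only a rough hint at what it relies on
```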

But do people who are only pretending to lie actually behave like 'real' liars? "We got advice from psychologists on how to create the most realistic setup possible," says O'Shea. Still, this is a well-known problem for artificial intelligence in this context: people who lie on instruction presumably show different micro-gestures than liars under the stress of really being caught, with all the consequences that would entail.

"If we have more data, the accuracy will increase"

The system naturally has other weaknesses, which arise among other things from the very small number of test subjects and from a general problem of modern machine learning systems: they need large amounts of training data.

"But that's very expensive in our case," says Jim O'Shea. He would have to ask a lot of people to lie in front of a camera or tell the truth. Since the system identifies quite a few factors, a high amount of computation is added.

In addition, the subjects were not particularly representative: the 'lying' group consisted of ten men and seven women, of whom 13 were white Europeans and 4 had Asian or Arab roots. The truth-telling group consisted of twelve men and three women, including six Asian or Arab subjects and nine white Europeans. O'Shea assures that the team is aware of the possible racial bias that can arise from such data.

"If we have more data, the accuracy will increase," he says, with the current 75 percent, he was already satisfied: "This is a great accuracy, I would recommend such a system to the authorities in any case."

It lacks transparency

The black-box problem may lead to a further problem, says Tina Krügel: the lawyer at the Institute for Legal Informatics of the University of Hanover and her team are responsible for the legal and ethical issues of the project. The new General Data Protection Regulation stipulates that those affected by an "automated individual decision" must receive information about the underlying logic.

If such a system were actually used for border control, this would become complicated, because public safety is of course also at stake. In addition, the question arises of how far this right actually extends, that is, how detailed the information has to be.

Why an artificial intelligence has decided one way or the other is not easy to understand. "Explainable AI is currently a huge area of research," says Krügel.

On the way to avatar interviews

But is there not a danger that border officials will be unduly influenced by such a score and assume that the person is actually lying, instead of approaching them with an open mind? "Of course we have to explain it to those concerned," says Jim O'Shea. "It's a risk score, it does not mean that the person is actually dishonest." The officer must therefore always judge, based on his own experience and assessment, whether to carry out a check. "He can also check travelers who have a low score."

Whether that would happen often in reality may be doubted. "If you get a risk score, you're inclined to follow it," says Tina Krügel. Her group has therefore proposed various principles for the system: data categories that could lead to discrimination are not included in the risk value. In addition to the "privacy by design" approach, which has already been implemented, Krügel demands further measures for real-world operation, such as supervision by an ethics committee and extensive training of the officials.

Whether the system will ultimately become reality is still far from decided. The increase in cross-border crime and international terrorism has led to such support systems being researched, Krügel says. "We have to see what works and what such a system does to us." That is why it is particularly important that the project is funded with public EU money and does not serve commercial interests. "Then a detailed ethical assessment is guaranteed."

The independent researcher Strobl praises the extensive evaluation: "I am convinced that the colleagues are looking at everything very closely."

"In the end, it's a question of what society wants, " says Tina Krügel. Strobel also shares this viewpoint: "We always work with ethicists, sociologists, and stakeholders to see what acceptance is." Through smartphones, the acceptance of biometric methods such as face recognition and fingerprints has risen sharply in recent years. If you see it that way, avatar interviews might well happen to us soon.