An AI algorithm can draw faces just from people's voices

2 min read 2 min 1 Comment 1

Jun 12, 2019, 11:06 AM

AI robot 06 — © metamorworks/Shutterstock

Andrés Castellano

Read in other languages:

We have all heard the voice of an unknown person, and made an imaginary portrait of that person in our minds, with varying degrees of success. Now, an algorithm is doing the same experiment. But accurate is it?

San Francisco is first city in the US to ban facial recognition surveillance

The algorithm in question is called Speech2Face. A group of scientists trained the neural network using millions of videos located on the network, in which more than 100,000 people can be heard talking. According to what was written by the researchers in their study, the algorithm used the Speech2Face data to develop associations between vocal lines and certain physical features of the human face. Later, the AI began to make portraits of various people using only their voices as a reference.

2019 06 12 IA hace retratos 1 — Photos and portraits created by the AI. Pretty close....? / © LiveScience

The results of the research were uploaded to the network on May 23, in the pre-publication arXiv. However, such data have not yet been contrasted by other scientists working in the same field.

But how accurate is the algorithm? We can say that, (fortunately), AI still cannot identify individuals solely on the basis of samples of their voices. Rather, the neural network identifies traits associated with certain factors, such as gender, age, and ethnicity, but these traits are shared by a considerable number of people. Therefore, the images generated are more of an "average" than accurate individual portraits.

Face recognition fail: teen sues Apple after false arrest

That said, Speech2Face has generated portraits of astonishing accuracy, but has also shown certain weaknesses when confronted with language and/or pronunciation variations. For example, the AI produced two totally different portraits of the same person, having listened to her speaking Chinese and English. Anyway, in general, the ability of the algorithm to portray the human being is much greater than when trying to portray cats, as you can see in the image below.

2019 06 12 IA hace retratos 2 — The cats have come out less well... / © LiveScience

What do you think? Do you like idea of an algorithm that can picture our faces from our voices? Or would it be better to preserve the 'anonymity of audio'? Tell us your opinion in the comments below.

Source: LiveScience

Comparison: The best foldable phones

	Best large foldable	Best clamshell flip	Best foldable alternative	Best passport foldable	Best flip alternative	Price pick
Product	OnePlus Open	Samsung Galaxy Z Flip 5	Samsung Galaxy Z Fold 5	Google Pixel Fold	Motorola Razr+ (2023)	Motorola Razr (2023)
Image
Rating	Rating: OnePlus Open	Rating: Samsung Galaxy Z Flip 5	Rating: Samsung Galaxy Z Fold 5	Rating: Google Pixel Fold	Rating: Motorola Razr+ (2023)	Rating: Motorola Razr (2023)
Price Comparison	Check offer $1,699.99 (Amazon - new) * Check offer (OnePlus) * Find on eBay (eBay) *	Check offer $799.97 (128GB - new) * Check offer (Samsung) * Free w/ trade-in (T-Mobile) *	Check offer $1,599.99 (Amazon - new) * Check offer (Samsung) * $1000-off w/ trade (T-Mobile) *	$1000-off w/ trade (T-Mobile) * Check offer (BestBuy) * Find on Amazon (Amazon) *	Check offer $699.99 (Amazon - new) * Check offer (Motorola) * Free w/ trade-in (T-Mobile) *	Check offer $499.99 (Amazon - new) * Check offer (Motorola) * Find on eBay (eBay) *