“Ok Google, find me an open bakery near my home”. Spoken aloud to a machine, this request still belonged to the realm of science fiction in the early 2000s, as Steven Spielberg’s movie “Minority Report” shows. Voice search is now gradually entering people’s lives, as the commercial success of smart speakers such as Google’s Google Home or Amazon’s Alexa attests. But questions remain as to the source of the spoken answer.
Although data on voice search and on the answers given by the various solutions remain scarce, a ROAT report examined the type of response given by the Google Assistant to more than 600 voice requests. The agency first concludes that Google could give a result for 75% of the requests. Among these 75%, 80% of the responses took the form of an “Answer Box”, which amounts to about 60% of all the requests tested. Google’s “Answer Box” is, according to the company itself, “a special block that contains an optimized extract of a page”. This block contains a summary of the answer extracted from the web page, accompanied by the link to that page, its title, and its URL. But are there “Answer Box” responses in which the voice assistant does not cite a source? A simple search on the match between Paris Saint-Germain and Real Madrid on March 6 shows that Google returns a result gathering all the relevant information, such as the time of the match, the scores of previous matches, etc. Yet in this case, no source appears in the “Answer Box” provided by the Google Assistant.
We can therefore legitimately wonder whether Google benefits from structured information published on other websites, gathering it to enrich its “Answer Box” as seen above. This question raises issues of intellectual property law.
In France, the right of quotation is recognized: a short excerpt may be reproduced provided that the name of the author and the source are indicated, according to Article L122-5 of the Code of Intellectual Property. Search engines are subject to this Code, so a search engine must cite the source of the excerpts it gives in the single answer delivered by the voice assistant.
However, it is harder to know whether information analyzed on one or more web pages and grouped by relevance in an “Answer Box” would constitute an infringement of the intellectual property rights of those who created the content of those pages. As early as 2014, Olivier Andreu reported on this activity, called “scraping”, which Google carried out through its Knowledge Graph.
According to the Code of Intellectual Property (CPI), the authors of various kinds of data benefit from copyright protection. This protection of their work is conditional on the originality of the creation, according to Article L.112-3 of the CPI. Article L.122-5 of the same Code, in turn, governs the reproduction of an author’s work. Beyond the work itself, the legislator also wanted to protect database producers, a database being “a collection of works, data or other independent elements, arranged in a systematic or methodical manner, and individually accessible by electronic means or by any other means” according to Article L.112-3 of the CPI. This protection prohibits extraction from the database when the extraction is substantial, according to Article L342- of the CPI.
In the case of Google, it is difficult to determine whether its “scraping” activity through its voice assistant violates the Code of Intellectual Property. Indeed, it is difficult to measure accurately the extent of the data that Google reuses in its voice “Answer Box”. Moreover, could these same voice “Answer Boxes” present a form of originality that would allow them, too, to benefit from copyright? These are problems that will certainly require legal studies to deepen the debate and identify solutions.

About Pierre VEYRET