Dark data: new El Dorado or economic mirage?
By 2020, 85% of the world's 40 zettabytes of data will be dark data, according to the 2016 « Veritas Global Databerg Survey ».
But what is dark data? The Gartner IT Glossary defines it as « the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing). […] Storing and securing data typically incurs more expense (and sometimes greater risk) than value. »
While big data and data analysts are beginning to find their way into business practice, are companies ready to tackle this new field? If not, they must make it a priority: according to International Data Corporation (IDC), 90% of corporate data is dark data.
The first difficulty, however, is profitability: exploring, storing, or even exploiting this data while keeping it secure comes at a cost. The investment of financial and human resources may well yield zero or even negative returns.
Under a « data-centric » approach, four preliminary questions must be asked: What are the objectives? What data is involved? What are the risks? What tools and skills are needed?
An effective, ethical, and economically viable solution requires a combination of IT, legal, and data analysis skills.
In light of the European General Data Protection Regulation (GDPR), a privacy impact assessment (PIA) of the use of this data reveals a strong and very real risk. The principles of « privacy by design and by default » and « accountability » must be respected, on pain of a severe fine (up to 2% or 4% of the group's worldwide annual turnover, depending on the infringement).
A first step would be to establish an ethical charter for the use of dark data, modeled on what already exists for the Internet.