Data protection regulation in Europe has become tougher. Some fear that the GDPR will limit the potential of Artificial Intelligence in Europe. True: Data provides the building blocks for AI. But is this fear really warranted?
What is AI?
Artificial intelligence (AI) is revolutionizing everything from industrial production to cancer diagnosis. AI is not a singular technology but rather a multitude of techniques deployed for different commercial and policy objectives which – in order to be called "artificial intelligence" – fulfill three criteria: The system can (i) perceive its environment and process what is perceived, (ii) independently solve problems, make decisions and act, and (iii) learn from the results and effects of these decisions and actions. In fact, the term ‘AI’ is often used to describe a basket of different computing methods, which are used in combination to produce a result but which aren’t necessarily AI by themselves.
The distinctive feature of AI is that the machine can go beyond its code, "learn" new things and thus outgrow its original programming. Some may have heard of "AlphaZero", a machine that had only the rules of the games "Go", "Shogi" and "Chess" built in (including the conditions for winning the respective game). AlphaZero learned to play the games by playing against itself on a massive scale and, in only a few hours (!), became far better than any of its predecessor programs (inter alia the one that had beaten the best human Go player almost two years earlier) and all other specialized programs for these games on the market.
Where does data protection come in?
Data provides the building blocks in the learning phase of AI. Neural networks, machine learning, deep learning – they all have one thing in common: They need huge amounts of data to become better. AI can only outgrow itself if fed with enormous amounts of data. For many AI applications the data used to train the system also contains personal data within the meaning of applicable data protection regulation. Companies established in the EU and beyond will thus have to comply with the requirements of the GDPR when developing or using AI applications.
Fundamental principles of data protection
The fundamental principles set out in Article 5 of the GDPR also apply to data processing in AI applications. Some of these principles can be quite challenging for both the development phase and the execution phase:
The fairness principle: AI seems to be objective, but it is no more objective than the data used in the training phase. That data may well be (and often will be) biased in some way. If arbitrary discriminatory treatment of individuals cannot be ruled out, the processing violates the fairness principle.
The transparency principle: As AI outgrows its original programming, in many cases it is difficult to understand why the AI application has produced a particular output. In particular, this "black box" makes it hard to explain how information is correlated and weighed to produce a particular result.
The data minimization principle: According to this principle, data used shall be adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed. This means that only personal data that proves relevant for the training may be used in the learning process. But what if the algorithms are already adequately trained and further input data does not provide any added value? It might be argued that such further input data is still adequate, relevant and necessary for the training of the AI, as only with such data can it be shown that the AI has "reached its learning peak".
The purpose limitation principle: For AI applications, too, the reason for processing personal data must be clearly established and indicated when the data is collected. The purpose definition therefore has to take into account that, in the course of training, the AI system will most probably combine data in unexpected ways to create new results. Further, when personal data is re-used for several purposes, each "new" data processing has to be assessed separately.
None of these principles is a show-stopper per se, but it can sometimes prove challenging to develop AI applications meeting the high standards of Article 5 of the GDPR.
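To make the fairness point more concrete: one simple way to look for bias in training data is to compare how often each group receives a favourable outcome. The following is only a minimal sketch under stated assumptions. The field names "group" and "favourable", the toy records and the idea of using the gap between groups as a warning sign are all illustrative and are not prescribed by the GDPR.

```python
from collections import defaultdict


def favourable_rates(records, group_key="group", outcome_key="favourable"):
    """Return the share of favourable outcomes for each group."""
    totals = defaultdict(int)
    favourable = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        if record[outcome_key]:
            favourable[group] += 1
    return {group: favourable[group] / totals[group] for group in totals}


def parity_gap(rates):
    """Largest difference in favourable-outcome rates between any two groups."""
    return max(rates.values()) - min(rates.values())


# Hypothetical toy training data: group labels and outcomes are invented.
training_records = [
    {"group": "A", "favourable": True},
    {"group": "A", "favourable": True},
    {"group": "A", "favourable": False},
    {"group": "A", "favourable": True},
    {"group": "B", "favourable": True},
    {"group": "B", "favourable": False},
    {"group": "B", "favourable": False},
    {"group": "B", "favourable": False},
]

rates = favourable_rates(training_records)
gap = parity_gap(rates)
print(rates, gap)
```

A large gap does not prove unlawful discrimination, and a small one does not rule it out. It merely flags data that deserves a closer look before it is used for training.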
In general, personal data may only be processed if the processing can be justified on one of the grounds of justification set out in Articles 6 and 9 of the GDPR. One might think of relying on consent as the legal basis for an AI application, but in practice this is the wrong choice: besides the difficulty of obtaining all the consents needed, one has to deal with data subjects withdrawing their consent. As a better solution, data processing within AI applications might be based on "overriding legitimate interests" of the AI developer or the user of an AI application. Notably, this ground of justification does not apply to "special categories of data" (e.g. health data), but other grounds of justification might be available depending on the specific AI application and the purpose(s) pursued.
Automated decisions and profiling
The GDPR limits how organizations can use personal data to make individual automated decisions. Individual automated decisions are decisions relating to individuals that are based on machine processing. Under the GDPR, fully automated individual decisions are permitted only in certain circumstances, e.g. with the explicit consent of the data subject. As mentioned, obtaining (and keeping) valid consent from all data subjects is quite unrealistic. Thus, to avoid having to rely on consent, AI systems should not decide automatically, without some form of "human intervention" in the decision-making process, on anything that has legal effect on the data subject (e.g. whether a loan is granted) or similarly significantly affects a person (e.g. not being able to select certain payment methods).
For AI applications performing individual automated decisions, in addition to the general information obligations, the data controller must also provide to the data subject "meaningful information about the logic involved, as well as the significance and the envisaged consequences of such processing for the data subject".
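The "human intervention" requirement described in this section can be sketched as a simple routing rule in front of the AI system's output. This is a minimal illustration, not a compliance implementation. The names ProposedDecision, Route and route_decision are hypothetical, and only the explicit-consent exception mentioned above is modelled. Any other circumstances in which the GDPR permits fully automated decisions are deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTOMATED = "automated"
    HUMAN_REVIEW = "human_review"


@dataclass
class ProposedDecision:
    subject_id: str
    outcome: str
    # True if the decision has legal effect or similarly significantly
    # affects the person (e.g. a loan refusal).
    significant_effect: bool
    # True only if the data subject has given explicit consent to a
    # fully automated decision (assumed to be tracked elsewhere).
    explicit_consent: bool = False


def route_decision(decision: ProposedDecision) -> Route:
    """Send significant decisions to a human unless explicit consent exists."""
    if decision.significant_effect and not decision.explicit_consent:
        return Route.HUMAN_REVIEW
    return Route.AUTOMATED


loan = ProposedDecision("subject-1", "loan refused", significant_effect=True)
tip = ProposedDecision("subject-2", "show article", significant_effect=False)
print(route_decision(loan), route_decision(tip))
```

Keeping the rule in one place makes it easier to document the logic involved, which is also useful when meaningful information about that logic has to be provided to the data subject.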
Data subject rights
Sometimes, for an AI application to reach and maintain its full potential, some of the data used in the learning process has to remain stored in the AI system, as otherwise the learning could not be refined any further. If this is the case, it has to be ensured that valid data subject requests, e.g. for erasure or rectification, can nonetheless be fulfilled.
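One practical precondition for fulfilling such requests is being able to find every stored training record that belongs to a given data subject. The sketch below shows one possible way to do this with an index per data subject. The class TrainingStore and its methods are hypothetical names chosen for illustration. The sketch only covers records kept in storage; whether and how a trained model itself has to be adjusted is outside its scope.

```python
from collections import defaultdict


class TrainingStore:
    """Keeps retained training records indexed by data subject."""

    def __init__(self):
        self._records = {}                   # record_id -> payload
        self._by_subject = defaultdict(set)  # subject_id -> record_ids

    def add(self, record_id, subject_id, payload):
        """Store a record and remember which data subject it relates to."""
        self._records[record_id] = payload
        self._by_subject[subject_id].add(record_id)

    def erase_subject(self, subject_id):
        """Delete all stored records of one data subject; return the count."""
        record_ids = self._by_subject.pop(subject_id, set())
        for record_id in record_ids:
            self._records.pop(record_id, None)
        return len(record_ids)

    def __len__(self):
        return len(self._records)


# Hypothetical usage with invented identifiers.
store = TrainingStore()
store.add("r1", "alice", {"age": 41})
store.add("r2", "alice", {"age": 42})
store.add("r3", "bob", {"age": 35})
erased = store.erase_subject("alice")
print(erased, len(store))
```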
Data protection impact assessment
The European Data Protection Board has endorsed guidelines setting out nine criteria for when processing activities are subject to a data protection impact assessment. If two or more of these criteria are fulfilled, a data processing activity is "likely to result in high risk for the data subject" and, thus, a data protection impact assessment has to be performed for the respective data processing activities. AI applications often fulfil one or more of the criteria "automated decision making with legal or similar significant effect", "data processed on a large scale", "matching or combining datasets", and/or "innovative use or applying new technological or organizational solutions". If personal data is involved, an obligation to perform a data protection impact assessment for an AI application is likely to exist.
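The "two or more criteria" rule of thumb lends itself to a simple screening check. The sketch below is an assumption-laden illustration. Four of the criteria labels are quoted from the paragraph above. The remaining five are paraphrased from the guidelines and do not appear in this article. The function name dpia_screening and the default threshold parameter are hypothetical. A positive result is only an indicator and does not replace a legal assessment.

```python
# The nine criteria; labels other than the four quoted in the text are
# paraphrased from the guidelines and are not taken from this article.
DPIA_CRITERIA = (
    "evaluation or scoring",
    "automated decision making with legal or similar significant effect",
    "systematic monitoring",
    "sensitive data or data of a highly personal nature",
    "data processed on a large scale",
    "matching or combining datasets",
    "data concerning vulnerable data subjects",
    "innovative use or applying new technological or organizational solutions",
    "processing that prevents data subjects from exercising a right "
    "or using a service or contract",
)


def dpia_screening(criteria_met, threshold=2):
    """Return (dpia_indicated, matched_criteria) for a processing activity."""
    met = set(criteria_met)
    unknown = met - set(DPIA_CRITERIA)
    if unknown:
        raise ValueError(f"Unknown criteria: {sorted(unknown)}")
    matched = [criterion for criterion in DPIA_CRITERIA if criterion in met]
    return len(matched) >= threshold, matched


# Hypothetical AI application: a credit scoring model trained on merged data.
indicated, matched = dpia_screening([
    "automated decision making with legal or similar significant effect",
    "matching or combining datasets",
])
print(indicated, matched)
```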
So … friends or foes?
The answer is: AI and GDPR are neither friends nor foes. They are, however, closely interrelated: If AI applications process personal data, they have to be in line with all applicable GDPR requirements. This is not per se impossible but, depending on the AI application, may prove quite challenging. Yet, as data protection regulation is also becoming tougher in jurisdictions outside Europe, there is a need for privacy-friendly development and use of AI.