Development and use of AI models
Principles of GDPR in the development and use of AI models
Several principles of the GDPR are relevant to the development and use of AI models. Companies must always comply with the seven basic principles set out in Article 5 of the GDPR.
Principle of purpose limitation
Companies may not process more personal data than is necessary to achieve the purpose of the processing. In addition, the purpose must be explicit, specific and legitimate.
Principle of purpose limitation in AI model development
If a company originally collected personal data for a purpose other than developing an AI model, it may need a new legal basis to use the data for such development, although this is not always the case.
Compatible with the original purpose
Personal data that was not collected for the purpose of developing an AI model may still be processed for that purpose, provided that the new purpose is compatible with the original one. The company therefore needs to make an overall assessment to determine whether this is the case.
Factors to consider when deciding whether the processing of personal data in the development of an AI model is compatible with the original purpose

Links
What are the links between the new and the original purpose of the processing?

Collection
How has the personal data been collected?

Relationship
What is the relationship between the company and the data subject?

Reasonability
Can the data subject reasonably expect such processing?

Category
What category of personal data does the processing relate to?

Possible consequences
What are the possible consequences of the processing for the data subjects?

Technical and organisational security measures
What technical and organisational security measures has the company taken?
Assessments of purpose limitation
Example of unauthorised processing: developing an AI model to find patterns in employees' sick leave
A company wants to develop an AI model that can determine whether there is a pattern in when its employees get sick, so that the company can predict when it should have extra staff ready. The company wants to give the AI model all the data about its employees that it is entitled to process under the employment contracts. The company engages the developers itself, to avoid transferring the personal data to any third party. The personal data will not be encrypted when it is entered into the AI model.
This processing of personal data would most likely not be allowed.
Here are some factors that justify the decision:
There seems to be a link between the original and the new purpose of the processing, but the two purposes concern quite different types of cases.
Data subjects have little influence over the processing and over what personal data is processed, as the processing is a condition of their employment.
There is an unequal power relationship between the parties, as it is between the employer and employees.
The personal data processed (sick leave) is sensitive according to the GDPR (data revealing information about an individual's health constitutes a special category of personal data according to Article 9 of the GDPR).
The company does not intend to encrypt the personal data before entering it into the AI model, which should be done as a technical security measure when processing sensitive personal data.
The personal data was collected on the legal basis of a contract with the data subject, in order to conclude and fulfil the employment contract. Developing an AI model is not necessary for that purpose, and this legal basis therefore cannot support the new purpose of the processing.
It is difficult to argue that the development of the AI model is in the best interests of the employees, which could otherwise have been a good argument to include in the assessment.
Do the data subjects’ expectations of how the processing can take place have any effect on the assessment?
Yes, it affects the decision. If the new processing concerns something other than the original purpose, it may mean that it is outside what the data subjects can reasonably expect. Although there is a link between the original and new purpose, the processing for the development and use of an AI model may be something that is not reasonably expected.
Does it matter in the assessment what consequences the processing may have for the data subjects?
Yes, it may affect the decision. If the personal data is used to develop an AI model, there may be less transparency about the processing towards the data subjects. For example, it may make it more difficult for them to exercise their rights and understand the processing.
Example of permitted processing: to optimise electricity consumption, an electricity supplier wants to map electricity prices by developing an AI model
A company, which is an electricity supplier, hires an external party to develop the AI model. The company needs to process certain personal data in order to deliver electricity and invoice its customers. It wants to use this personal data to develop an AI model, in order to optimise its electricity prices. The purpose of the processing is thus to reduce costs for customers, by mapping at which times the electricity price is lowest and highest.
The company needs all the personal data processed about its customers, but has divided it into different geographical areas. It will not give all the data to the AI model at once; instead, it starts with one geographical area at a time. In this way, the company tries to avoid processing more personal data than necessary for the purpose.
After entering the data for each geographical area, the company will assess whether it is sufficient for the AI model to do what it is intended to do. If not, the company will enter the data for a new area, until the required result is reached.
No direct identifiers, such as names and social security numbers, will be shared with the AI model; instead, the personal data has been encrypted.
The question is whether the new purpose can be considered compatible with the original purpose.
In this case, there seems to be a clear link between the purposes. However, several factors affect whether the new processing is allowed. Here are some circumstances supporting the conclusion that the processing is authorised:
Both the original and the new purpose concern electricity consumption.
There is no unequal power relationship between the parties.
Data subjects are not in a vulnerable position.
The personal data is not sensitive according to Article 9 GDPR or Article 10 GDPR.
The aim is to reduce costs for customers, which is positive for them.
To avoid processing more personal data than necessary, the company starts by providing the AI model with data for one geographical area at a time, until the desired result is achieved, instead of entering all the data at once.
It is also important to take adequate technical and organisational security measures to protect the personal data. In this case, the company has encrypted the personal data, which is an appropriate technical security measure in this situation.
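As a rough illustration of such a measure, direct identifiers can be replaced with keyed hashes (HMAC pseudonymisation, a relative of the encryption mentioned above) before any record reaches the model. This is a minimal Python sketch, not part of the guidance itself; the field names and the key handling are illustrative assumptions:

```python
import hashlib
import hmac

# Illustrative only: in practice the key must be stored securely,
# separately from the training pipeline.
SECRET_KEY = b"replace-with-a-securely-stored-key"

def pseudonymise(identifier: str) -> str:
    """Replace a direct identifier (e.g. a name or social security
    number) with a keyed hash before the data reaches the AI model."""
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"),
                    hashlib.sha256).hexdigest()

customer = {"ssn": "19900101-1234", "area": "North", "kwh_per_day": 12.4}

# Only the pseudonymised identifier and model-relevant fields are shared.
model_input = {
    "customer_id": pseudonymise(customer["ssn"]),
    "area": customer["area"],
    "kwh_per_day": customer["kwh_per_day"],
}
```

A keyed hash still lets the company link records belonging to the same customer, while whoever handles the model input cannot recover the original identifier without the key.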
Circumstances that may militate against the compatibility of processing with the original purpose:
If the company transfers the personal data to another business for the development of the AI model, this entails, among other things, a greater risk that the personal data may fall into the wrong hands, although it does not necessarily mean that the processing is incompatible with the original purpose. Another drawback is that it may be more difficult to be transparent towards the data subjects, making it harder for them to understand the processing and exercise their rights under the GDPR.

It is important to analyse the consequences that the new processing may have for data subjects. In addition, the importance of protecting the personal data in question should be analysed. For example, if it is credit card information, the consequences of it falling into the wrong hands can be serious.

Please note that each case requires its own assessment, and even minor circumstances may affect whether the new processing is compatible with the original purpose or not.
Principle of data minimisation
Companies may only process personal data that is adequate and relevant in relation to the purpose. In other words, do not process more personal data than necessary to achieve the purpose.
Proportionality requirement
The processing must be proportionate to the purpose to be achieved. This means that the privacy breach must not be too extensive in relation to the benefits of the processing.
Principle of data minimisation in AI model development
The principle of data minimisation needs to be carefully considered when training and developing an AI model, as there is a risk of processing more personal data than necessary. It may also be that one does not always know exactly what is to be achieved, and therefore processes unnecessary personal data along the way, which is not compatible with the principle. It is therefore important to first analyse carefully what the purpose of the processing is, in order to know what personal data is needed. Complying with this principle can be difficult when a company develops an AI model, but it is possible.

Statistically correct
It is important not to discriminate against certain groups when creating a statistically accurate AI model. The AI model therefore needs to be trained with relevant, non-discriminatory data. Relevant data thus matters both for not processing more personal data than is needed to achieve the purpose of the processing and for making the model as statistically accurate as possible.

Supervised learning
When training an AI model through supervised learning, one should deliberately ensure that the characteristics of the data used are relevant to the purpose.
A good starting point for complying with the principle of data minimisation
As AI models tend to use very large amounts of data, it can be difficult to comply with the principle of data minimisation. However, it is possible. A good approach is to start by providing the AI model with a small amount of data and then, if necessary, gradually increase it. Since it is difficult to know exactly how an AI model will develop, it is good to start on a smaller scale.
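The "start small, expand only if needed" approach above can be sketched as a simple loop. This is a hypothetical Python sketch; the evaluation function and target score are assumptions that each company would choose itself:

```python
# Minimal sketch of incremental data provision: add one batch of data
# at a time and stop as soon as the model performs well enough, so no
# more personal data than necessary is processed.

def train_incrementally(batches, train_and_score, target=0.9):
    """Feed batches one at a time; return the data actually used."""
    used = []
    for batch in batches:
        used.extend(batch)
        if train_and_score(used) >= target:
            break  # good enough: stop before touching further data
    return used

# Toy usage: the "score" here is just the fraction of a notional goal.
batches = [["a", "b"], ["c", "d"], ["e", "f"]]
used = train_incrementally(batches, lambda data: len(data) / 4)
# Only the first two batches were needed; the third is never processed.
```

The point of the design is that the stopping condition is checked before each new batch, so the decision "is more data really necessary?" is built into the training procedure itself.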
Do not forget to delete personal data that is no longer necessary
Companies must delete or anonymise personal data when it is no longer necessary for the purpose for which it was collected. Where the personal data is no longer necessary after the AI model has been developed, it must be deleted or anonymised. It may also be useful to analyse whether the AI model could be developed with anonymous data instead of personal data.
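As one hypothetical way of honouring this, identifying records can be reduced to aggregate statistics once the model is developed, and the originals deleted. The field names below are illustrative assumptions:

```python
# Sketch: keep only aggregate, anonymous figures after development,
# then delete the underlying personal data.

def anonymise(records):
    """Drop identifying fields; keep only aggregate statistics."""
    total = sum(r["kwh"] for r in records)
    return {"customers": len(records), "avg_kwh": total / len(records)}

records = [
    {"name": "Customer A", "ssn": "19900101-1234", "kwh": 10.0},
    {"name": "Customer B", "ssn": "19850505-5678", "kwh": 14.0},
]
summary = anonymise(records)  # {'customers': 2, 'avg_kwh': 12.0}
records.clear()               # the personal data itself is deleted
```

Note that true anonymisation under the GDPR requires that individuals can no longer be identified; simple aggregation as above is not always sufficient on its own and must be assessed case by case.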
Discriminatory algorithms: Discrimination is prohibited
An AI model may not use discriminatory algorithms, as this is contrary to the principle of fairness. It is prohibited whether the discrimination is intentional or not. It is therefore good practice to regularly test the AI model to ensure that it does not discriminate against groups or individuals.
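Such recurring testing can be as simple as comparing the model's outcome rates across groups. Below is a minimal Python sketch of one common fairness check, the demographic parity difference; the group labels and the tolerance are illustrative assumptions, not a legal threshold:

```python
# Compare the model's positive-decision rate between groups; a large
# gap is a signal to investigate possible discrimination.

def parity_difference(outcomes):
    """outcomes: {group: list of 0/1 model decisions}.
    Returns the largest gap between any two groups' positive rates."""
    rates = [sum(v) / len(v) for v in outcomes.values()]
    return max(rates) - min(rates)

decisions = {"group_a": [1, 1, 0, 1],   # 75% positive decisions
             "group_b": [1, 0, 0, 0]}   # 25% positive decisions
gap = parity_difference(decisions)      # 0.75 - 0.25 = 0.5
if gap > 0.2:  # illustrative tolerance
    print("warning: large outcome gap between groups; investigate")
```

A single metric like this never proves or disproves discrimination on its own, but running such checks on every retrained version of the model makes unintended bias visible early.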
Legal bases that may be useful in the development and use of AI models
There are several legal bases that may support the processing of personal data in the development and use of AI models. Legitimate interest is commonly used, as it is a flexible legal basis. However, a written balancing of interests should be carried out first, and it is important to know that the data subject has the right to object to the processing. Consent is usually difficult to rely on when developing an AI model, but may be more appropriate for supporting the processing of personal data when using one.