Types of data (Data as a Trigger)

While the overview presented below aims to provide a comprehensive taxonomy of data types within the applicable EU regulatory frameworks, the classification of relevant data types remains a work in progress and may be adjusted or further developed in the next version of the Blueprint.

The following table lists different types of data that may trigger the need for compliance with various legal frameworks. The second column describes the actual needs (for the data space governance authority, data space participants or other stakeholders in the data space) in various circumstances.

Types of data

Description and assessment of the impact

Personal data

  • Legal landscape relevant to personal data: The data protection legal framework, including the GDPR, as well as the Law Enforcement Directive and the e-Privacy Directive (if particular circumstances apply), remains fully applicable to the processing of personal data within the context of a data space, so that parties involved in the processing activities of personal data will need to ensure compliance with relevant legal provisions.

  • Parties affected: Neither the involvement of a data space, nor that of a personal data intermediary, relieves the parties involved in such processing of their duties as data controllers or processors.

    • The clear establishment of the roles of data controller and data (sub-)processor should also be a priority, taking into account the essential criterion of decision-making powers regarding purpose and means of processing.

  • Continuous compliance: Issues of data protection should be an important consideration from the very start of the design of a data space and throughout all of its development stages.

  • Applicability to data spaces:

    • In the context of data spaces, the GDPR is highly relevant for use cases. For example, if a use case or transaction involves any information relating to an identified or identifiable natural person (‘data subject’), the use case or data transaction participants will need to ensure compliance with data protection legislation, most notably the principles relating to the processing of personal data.

    • The GDPR also applies to mixed datasets (comprised of both non-personal and personal data). This remains valid even if personal data represents only a small part of the dataset.

    • The concept of the “purpose” under data protection law should be given particular attention, as it is fundamental to clearly define any personal data processing activity.

    • Consideration should be given to how accountability can be ensured within a data space and how compliance with data protection principles, such as lawfulness, transparency and purpose limitation, should be facilitated.

  • Additional resources: The Spanish Data Protection Authority, in reference also to the EDPB-EDPS Joint Opinion 03/2022 on the Proposal for a Regulation on the European Health Data Space, highlights the importance of a data protection policy, which should state “how the principles and rights set out in the data protection regulation and the guidelines in this document are to be implemented in a concrete, practical and effective manner.”

 

Some of the most important elements to be considered for a data protection policy that the Spanish Data Protection Authority lists in its report (p. 89-97) include the involvement of data protection officers (DPOs) and advisors in the design of data spaces; implementation of procedures for authorising the processing of personal data within the data space; a precise definition of the purposes of data processing; and risk management, including data protection impact assessments (DPIAs) coordinated between involved parties.

Synthetic data (sometimes referred to as “fake data”) can be understood as data artificially generated from original data while preserving the statistical properties of that original data. Some data may also be created entirely artificially, without an underlying real-world data asset (e.g. virtual gaming environments). From a technical perspective, the primary purpose of generating synthetic data is to increase the amount of data available: it addresses dataset insufficiency and improves the variability of available data. It also serves as a way to mitigate risks to the fundamental rights of individuals. According to the Spanish Data Protection Authority, the use of synthetic data, along with other techniques such as generalisation, suppression, or the use of Secure Processing Environments, can be a way to comply with the data minimisation or privacy by design/default principles. It is important to remember that when personal data is used to generate synthetic data, this generation is considered a processing operation and is therefore subject to compliance with the GDPR. However, depending on the original data, the model and additional techniques applied, the resulting synthetic data can be anonymous data.
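As an illustrative sketch only (not part of the Blueprint), the Python snippet below shows the basic idea of preserving statistical properties: it fits the mean and standard deviation of an original numeric sample and draws synthetic values from a normal distribution with the same parameters. The function and variable names are hypothetical, the normality assumption is a simplification, and a toy generator like this does not by itself guarantee anonymity under the GDPR.

```python
import random
import statistics

def synthesise_numeric(values, n, seed=None):
    """Generate n synthetic values preserving the mean and standard
    deviation of the original sample, assuming the data is
    approximately normally distributed (a simplifying assumption)."""
    rng = random.Random(seed)
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    return [rng.gauss(mu, sigma) for _ in range(n)]

# Hypothetical original values (e.g. ages from a real dataset)
original = [34, 41, 29, 55, 47, 38, 62, 30, 44, 51]
synthetic = synthesise_numeric(original, n=1000, seed=42)

# The synthetic sample approximates the original's statistics
print(round(statistics.mean(original), 1),
      round(statistics.mean(synthetic), 1))
```

Real-world synthetic data generation typically relies on far more sophisticated models (e.g. generative models fitted to multivariate distributions), and a re-identification risk assessment remains necessary before treating the output as anonymous.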

The report also emphasises the role of traceability in data spaces in providing control mechanisms relevant to the processing activities within data spaces. In that regard, data traceability should help to identify roles and to implement access control and access logging policies. It should help fulfil particular objectives set by the GDPR, in particular addressing the transparency requirements towards data subjects, enabling the effective exercise of data subjects’ rights (such as the management of consent), facilitating the fulfilment of the controller’s obligations (e.g., ensuring restriction of processing, purposes compliant with the legal bases, or the oversight of processors/sub-processors), and allowing Supervisory Authorities to exercise their powers in accordance with Article 58(1) of the GDPR.

More specifically, keeping a log of accesses and of the actions performed by data space participants within a data space could be a way to implement the obligations laid down by Article 32(1) GDPR, which requires controllers and processors to implement appropriate technical and organisational measures to ensure a level of security appropriate to the risk of varying likelihood and severity for the rights and freedoms of natural persons.
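As a minimal sketch of how such access logging might be implemented (the class and field names below are illustrative and not prescribed by the GDPR or the Blueprint), each log entry can embed a hash of the previous entry, so that subsequent tampering with the recorded history becomes detectable:

```python
import hashlib
import json
from datetime import datetime, timezone

class AccessLog:
    """Append-only access log: each entry embeds the hash of the
    previous one, making tampering with history detectable."""

    def __init__(self):
        self.entries = []

    def record(self, participant, dataset, action, legal_basis):
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "participant": participant,     # illustrative field names
            "dataset": dataset,
            "action": action,               # e.g. "read", "share", "erase"
            "legal_basis": legal_basis,     # e.g. "consent", "contract"
            "prev_hash": prev_hash,
        }
        # Hash is computed over the entry body (without the hash itself)
        entry["hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()).hexdigest()
        self.entries.append(entry)

    def verify(self):
        """Check that the hash chain is unbroken."""
        prev = "0" * 64
        for e in self.entries:
            body = {k: v for k, v in e.items() if k != "hash"}
            if e["prev_hash"] != prev or e["hash"] != hashlib.sha256(
                    json.dumps(body, sort_keys=True).encode()).hexdigest():
                return False
            prev = e["hash"]
        return True

log = AccessLog()
log.record("participant-42", "dataset-health-01", "read", "consent")
log.record("participant-42", "dataset-health-01", "share", "contract")
print(log.verify())  # → True
```

In practice such a log would be persisted in tamper-evident storage and combined with access control; the hash chain shown here only illustrates one technical measure that could contribute to Article 32(1) compliance.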

To read more about Provenance and Traceability in data spaces, please check the Provenance and Traceability Building Block.

  • Technical implementation: As part of the personal data protection arrangements, a relevant solution could be to implement the W3C’s Data Privacy Vocabulary that enables the expression of machine-readable metadata about the use and processing of personal data based on legislative requirements such as the GDPR. More details about the W3C Standards/Credentials can be found in the Identity and Attestation Management building block.
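As an illustrative sketch, a processing activity could be described as machine-readable JSON-LD metadata using DPV terms. The specific term choices below are indicative only (the record identifier is a made-up example), and the current DPV specification should be consulted for the exact vocabulary:

```python
import json

# Illustrative machine-readable record of a personal data processing
# activity, expressed as JSON-LD with DPV terms (term choice is
# indicative; consult the current DPV specification).
processing_record = {
    "@context": {"dpv": "https://w3id.org/dpv#"},
    "@id": "https://example.org/data-space/processing/001",  # example ID
    "@type": "dpv:PersonalDataHandling",
    "dpv:hasPurpose": {"@type": "dpv:ResearchAndDevelopment"},
    "dpv:hasLegalBasis": {"@type": "dpv:Consent"},
    "dpv:hasProcessing": [{"@type": "dpv:Collect"}, {"@type": "dpv:Analyse"}],
    "dpv:hasPersonalData": {"@type": "dpv:PersonalData"},
}

serialised = json.dumps(processing_record, indent=2)
print(serialised)
```

Metadata of this kind could accompany each data product so that participants and supervisory authorities can inspect, in machine-readable form, the purpose and legal basis under which personal data is processed.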

  • According to Article 9(1) GDPR, personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade-union membership, as well as genetic data, biometric data processed for the purpose of uniquely identifying a natural person, data concerning health, and data concerning a person’s sex life or sexual orientation, are considered special categories of personal data, the processing of which is generally prohibited unless one of the strictly enumerated legal bases set out in paragraph (2) applies. These legal bases shall be interpreted strictly and considered separately for each processing activity.

  • If one of the legal bases is applicable, it is important to remember that processing of such data still needs to be compliant with the principles, rights and obligations established in the GDPR. As indicated by the Spanish Data Protection Authority, in the case of processing of these data, it should be demonstrated that the conditions for lifting the prohibition of such processing set out in Article 9(2) of the GDPR are met. Processing special categories of data referred to in Article 9(2)(g) (essential public interest), (h) (purposes of preventive or occupational medicine, assessment of the worker’s capacity to work, medical diagnosis, provision of health or social care or treatment, or management of health and social care systems and services) and (i) (public interest in the field of public health) of the GDPR has to provide appropriate safeguards and has to be covered by a regulation having the force of law.

The GDPR also requires assessing the following: 

  • The necessity of processing the data (This also includes appropriateness. For example, when processing is necessary to protect the data subject's vital interests or for reasons of substantial public interest based on Union or Member State law).

  • The proportionality of the processing (for example, when data is processed under essential public interest, archiving purposes in the public interest, scientific or historical research purposes, or statistical purposes).

  • Whether the processing activity can be considered “high-risk processing”. If so, an appropriate data protection impact assessment must be carried out to manage the high risk and to demonstrate that the assessment of necessity, appropriateness and strict proportionality has been passed.

  • One of the types of sensitive data is biometric data. Biometric data can allow for the authentication, identification or categorisation of natural persons and for the recognition of emotions of natural persons. It is defined in Article 4(14) GDPR as personal data resulting from specific technical processing relating to the physical, physiological or behavioural characteristics of a natural person, which allow or confirm the unique identification of that natural person, such as facial images or dactyloscopic data. As it is considered one of the special categories of data under the GDPR, the processing of biometric data is prohibited in principle, with only a limited number of conditions under which such data may be lawfully processed. Due to the increased use of biometric data in AI development and the immutability of physiological traits, the AI Act regulates the use of such data. In accordance with the AI Act, AI systems used for biometric categorisation based on sensitive attributes (protected under Article 9(1) GDPR) and for emotion recognition should be classified as high-risk, in so far as they are not prohibited under that regulation. Biometric systems intended to be used solely for the purpose of enabling cybersecurity and personal data protection measures should not be considered high-risk AI systems.

  • Intellectual Property Rights / Trade Secret-Protected Data(sets)

Intellectual property (IP) law can confer rights over datasets, often through copyright and the sui generis database right. Data may also be protected by trade secrets, which constitute a separate legal regime. It is the responsibility of each data space participant to ensure legal compliance and to have an authorisation and/or legal basis for data sharing if data is protected by copyright, sui generis or trade secrets.

Copyright law:

  • Protects creative works such as text, images, video and sound. It also protects software (e.g., source code) and databases (i.e., collections of independent works, data or other materials).

  • Copyright does not protect data itself; unlike other works, raw data does not meet the specific qualifying conditions for protection, such as being an author’s original creation.

  • To be protected by copyright, a database has to be original, reflect the author's intellectual creation, and be fixed in tangible form. In the context of databases, a specific test of originality reflecting the special characteristic of databases is required (whether the selection or arrangement of their contents constitutes the author’s own intellectual creation).

  • Purely factual information is usually not eligible for protection (Article 2, InfoSoc Directive).

Sui Generis Database Right (EU Database Directive 96/9/EC):

  • Databases could be protected by the sui generis database right in addition to copyright protection.

  • Grants sui generis database rights to creators who made qualitatively and/or quantitatively substantial investment in obtaining, verifying, or presenting contents.

  • Provides right to prevent unauthorized extraction or re-utilisation.

  • Defines databases as collections arranged systematically and individually accessible.

Trade Secret Protection (EU Trade Secrets Directive 2016/943):

  • Encompasses various data types, requiring secrecy and commercial value.

  • Enforceable rights against unlawful use and misappropriation, without conferring property rights.

  • Secrecy is preserved as long as persons having access to information are bound by confidentiality agreements.

Solutions to be implemented on a data space level

  • The acknowledgement of intellectual property and quasi-IP rights (trade secrets), both for identifying existing assets and creating new ones, should be addressed in the intellectual property policy within the general terms & conditions and the intellectual property clauses of particular data product contracts. More details can be found in the Contractual Framework building block.

  • Data holders are responsible for providing information about the IP rights and/or trade secrets they possess over particular datasets before sharing them with potential data recipients.

  • Specific legal provisions concerning trade secrets’ aspects in the context of data sharing can be found in the Data Act (primarily in the context of business-to-consumer and business-to-business data sharing).

  • Examples of possible legal, organisational and technical measures to preserve intellectual property rights or trade secrets can be found in the recently adopted European Health Data Space Regulation. Such measures could include data access contractual arrangements, specific obligations in relation to the rights granted to the data recipient, or pre-processing the data to generate derived data that protects a trade secret but still has utility for the user or configuration of the secure processing environment so that such data is not accessible by the data recipient (recital 60, art. 53 EHDS-R).

Non-personal data

 

  • IoT Data

  • The Data Act lays down a harmonised framework specifying the rules for using product data or related service data, including data from Internet of Things devices, smartphones, and cars. 

  • It imposes an obligation on data holders to make data available to users and to third parties of the user’s choice in certain circumstances. It also ensures that data holders make data available to data recipients in the Union under fair, reasonable and non-discriminatory terms and conditions and in a transparent manner. These provisions apply to data of this specific origin, irrespective of its personal or non-personal character. If the processing of personal data is involved, it is important to remember that the GDPR still applies.

  • Data spaces should pay attention to Chapter II of the Data Act, especially in the context of data transactions involving the processing of product data and related service data. 

  • More specifically, data spaces should consider the rights of the users of connected products or related services that they hold towards the data they produce by using these products or services. 

  • These rights include access to and use of the data for any lawful purpose. There are some exceptions to access rights (for example, if the data user requests access to personal data of which he is not a data subject). 

  • The data provided to the user should be of the same quality as the data available to the data holder and should be provided easily, securely, free of charge, and in a comprehensive, structured, commonly used and machine-readable format. 

  • If the data transaction is to be concluded without the data user being directly involved, it is important to remember that the scope of such a transaction is predefined by the contract with the user. Data holders shall not make non-personal product data available to third parties for commercial or non-commercial purposes other than the fulfilment of such a contract.

  • Data holders can also decide to make their data available via third parties of their choice (for example, data intermediation service providers as defined by the DGA) for commercial purposes. These third parties then hold certain obligations towards the data they receive on behalf of the user. For example, they should be able to transfer the data access rights granted by the user to other third parties, including in exchange for compensation. Data intermediation services may support users or third parties in establishing commercial relations with an undetermined number of potential counterparties for any lawful purpose falling within the scope of the Data Act, provided that users remain in complete control of whether to provide their data to such aggregation and of the commercial terms under which their data are to be used.

  • High-Value Datasets (HVDs)

  • High-Value Datasets (HVDs) are defined in the Open Data Directive as “documents held by a public sector body, the reuse of which is associated with important benefits for society, the environment and the economy”.

  • HVDs can be re-usable for any purpose (as is the case for open data).

  • Public sector bodies are not allowed to charge fees for the reuse of HVDs.

  • In the context of data transactions within a data space, it is important to remember that the reuse of documents should, as a rule, not be subject to conditions; in some cases, however, conditions may be justified by a public interest objective. In these situations, public sector bodies may issue a licence imposing conditions on reuse by the licensee, dealing with issues such as liability, the protection of personal data, the proper use of documents, guaranteeing non-alteration, and the acknowledgement of source.

In addition to the above, there are categories of data whose legal status is not clearly defined. Such data is therefore de facto under the control of the data holder.