5. What level of enrichment transforms raw and pre-processed data into inferred or derived data, excluding it from Chapter II?
Chapter II of the Data Act treats the level of enrichment of the data as one of the key factors in achieving a balanced and fair allocation of data value. Users have the right to receive, use, and port data that they have (co-)generated. This right applies to raw and pre-processed data, as well as accompanying metadata. Metadata is relevant for understanding the conditions (e.g. time, weather, location) under which the data was collected or generated.
At the same time, the Data Act seeks to preserve incentives to invest in data technologies that, for instance, transform the data, give additional insights, or allow processes to take actions autonomously. To distinguish between raw and pre-processed data on the one hand, and derived or inferred data on the other, Recital 15 mentions notions such as “substantial modification”, “substantial investments in cleaning and transforming the data”, and “proprietary and complex algorithms”. It is not, however, the complexity of the processing that takes data out of scope. Rather, what matters is whether the inferred or derived data constitutes new, value-added insights going beyond the nature of the information represented by the source data. Recital 15 establishes that “information inferred or derived from such data” refers to new information resulting from additional investments or proprietary algorithms. This distinction protects data holders’ innovations while preserving user access to data.
As explained in Recital 15, the data in scope (raw and pre-processed data) includes measurements of a “physical quantity or quality”. Considering the Data Act’s objective to enable processing of data by a wide array of actors in the data economy, such data should be “easily” usable and understandable by entities other than those who generated it.
While all sensor measurements require some level of interpretation before they can be communicated in a digital format, additional investments, such as cleaning, transforming, or reformatting, may be necessary to make the data usable and understandable. However, this does not translate into an obligation on the data holder to make substantial investments in these processes.
Users, or third parties chosen by the user, are expected to have a reasonable level of technical capability to interpret the data. As explained in question 13, processing that is designed to preserve the privacy of the information, such as anonymisation, pseudonymisation, or encryption, should not be considered sufficient for the data to be excluded from the scope of Chapter II.