Katja de Vries conducts research in legal informatics and our coexistence with Artificial Intelligence (AI)

The Ragnar Söderberg Foundation has awarded research support and an associate senior lectureship in public law (2020-24) to Katja de Vries, who will now begin work at the Department of Law. Her research focuses on legal informatics and our coexistence with Artificial Intelligence (AI). Over the next four years, Katja de Vries will explore the implications of creating AI within the framework of European law, touching upon cybercrime, intellectual property, data protection and freedom of expression.

Katja de Vries has an exciting background. She studied three Master’s programmes simultaneously at Leiden University (Netherlands), in civil law, cognitive psychology and philosophy, and supplemented her education with a law degree from the University of Oxford. This coincided with the start of the big data era and the growing question of how to handle big data and manage the information in a legally correct manner. While studying cognitive psychology, Katja developed a great interest in statistics. This led her to a doctoral thesis in legal informatics, entitled “Machine learning/informational fundamental rights. Making of sameness and difference” (de Vries, 2016). Since then, Katja de Vries has researched the legal and social implications of AI and machine learning for privacy, data protection and discrimination.

One of Katja’s research topics is the General Data Protection Regulation (GDPR), which has applied since 2018 and regulates how personal data is processed within the EU. We asked Katja what she finds so exciting about GDPR research.

GDPR addresses a very important dilemma: how can we utilise the financial and social value of personal data without ending up in a digital panopticon, in which our every move is assessed by the state and major corporations? This problem has become highly topical during the coronavirus pandemic; various apps are being developed to monitor the spread of the virus, which raises ethical, political and legal problems. Who do we trust the most? Major corporations? Public authorities? The state?

It is easy to see GDPR as a major bureaucratic hurdle that leads to great frustration in everyday life. Personal data is everywhere, constantly, and meeting all the requirements of GDPR takes time and energy. I too can get frustrated when I can’t just take a picture of activities at my child’s preschool, or when I need to continually consent to my personal data being processed. Researchers, companies and public authorities might feel restricted by having to meet GDPR requirements, so it’s no wonder that many try to circumvent the rules by anonymising personal data.

It’s always exciting when legislation meets reality – the significance of the law is challenged. A law may seem simple on paper, but then some new technology is developed that the legislator had not foreseen, or clever tricks are devised to avoid the rules. This in turn raises difficult, almost philosophical questions.

One good example is the anonymisation of personal data. Whether information has been sufficiently anonymised to fall outside the scope of GDPR is not always obvious. Even if something does not appear to be clearly linked to an identifiable person, GDPR still applies if it is possible to re-identify a person after a little detective work. Many major corporations and public authorities push the limits to be able to retain as much information as possible without the data being classified as personal. Part of my research examines how AI and machine learning have become tools for improving anonymisation.

Over the next four years, you’ll be examining the legal implications of creative AI – tell us more!

Until recently, AI and machine learning were mostly used to classify and sort people, objects, transactions or, within medicine, cells (classifying AI). Creative AI (generative AI) gives us the opportunity to create synthetic data and to compose convincing new variations of existing patterns: anonymised datasets, faces that don’t exist, fake videos (“deepfakes”), new artworks – all of which create new perspectives and challenges... and a new focus for me!

How does creative AI link with criminal justice?

Creative AI poses a challenge to several legal areas, such as criminal law when fraudulent material is used against a person. A lot of attention has focused on how creative AI can facilitate the creation of fake news and undermine democracy. However, the danger to private lives is probably greater than the danger to democracy. Journalists and other organisations invest resources in verifying news – but who will protect an individual being blackmailed or bullied via social media? Here, there is less protection. The question is: when does the publication of synthetic image material constitute defamation or insult?

How does creative AI link with intellectual property?

Generative AI will affect intellectual property, as both the data entered into systems to train models and the new data output by those systems need to be protected. As regards training data, the EU has adopted a new copyright directive that creates new opportunities for text and data mining to train AI models within research. As for the new synthetic data produced by creative AI, its status is not entirely clear. There are major financial implications, for example within medicine and the development of new pharmaceuticals, or within the arts.

How is creative AI best regulated from a societal perspective?

Creative AI and synthetic data can be applied in so many areas that it is difficult to give a straight answer. One way might be to regulate synthetic output through copyright or patents, for example, to protect and encourage innovation. However, any regulation needs to be evaluated based on the specific usage.

You hear a lot about how automated decision-making can lack transparency and produce discriminatory decisions. Is there a way to use synthetic data to make classifying AI systems fairer and more transparent?

This is part of my research – and it looks rather promising! Classifying AI may become biased if the system is fed information that under-represents specific categories during its learning process. Using more varied, synthetic data as input could be one way to reduce bias by creating a more representative learning process. It could also be possible to use synthetic data to create transparency in algorithm-based decision-making: synthesised data can serve as a counter-factual, parallel history, showing us what would need to change for the system to have made a different decision.
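The counter-factual idea above can be illustrated with a deliberately simplified sketch (not taken from de Vries’s research): a toy “classifying AI” makes a decision from a weighted score, and we search for the smallest synthetic change to one input that would have flipped the outcome. All function names, weights and figures here are invented for illustration only.

```python
# Toy sketch of counter-factual transparency: invented weights and threshold,
# purely illustrative of the idea described in the interview.

def decide(income, debt):
    """Toy linear classifier: approve when the weighted score exceeds 30."""
    score = 0.6 * income - 0.9 * debt
    return score > 30  # True = approved, False = rejected

def counterfactual_income(income, debt, limit=200):
    """Smallest income increase that would flip a rejection into approval.

    This is the 'parallel history': a synthetic variant of the applicant
    that shows what would have had to change for a different decision.
    """
    for extra in range(limit):
        if decide(income + extra, debt):
            return extra
    return None  # no counterfactual found within the search limit

# A rejected applicant...
print(decide(50, 10))                 # False: rejected (score 21 <= 30)
# ...and the minimal synthetic change that would have produced approval.
print(counterfactual_income(50, 10))  # 16
```

In real systems the model is far more complex and the counterfactual search correspondingly harder, but the principle is the same: the synthetic, altered input explains the decision by showing its nearest alternative.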

Creative AI can be used in many areas and has implications in different areas of the law. This doesn’t scare Katja de Vries – if anything, the opposite is true. She is used to talking to laypeople, has collaborated in several interdisciplinary projects, and will continue to do so. She is currently researching with an artist and also hopes to work with medical data. Some people involved with generative AI talk about machine creativity and machine imagination. She believes that just as creative AI changes reality – to a greater or lesser extent, depending on the parameters used – we, in a world where an image is no longer proof, also need to change the way we view data and information.

To learn more about “deepfakes” and generative AI, read de Vries (2020).

Sources:
de Vries K. (2016). Machine learning/informational fundamental rights. Making of sameness and difference. Doctoral thesis, Vrije Universiteit Brussel.

de Vries K. (2020). You never fake alone. Creative AI in action. Information, Communication & Society. https://doi.org/10.1080/1369118X.2020.1754877
