Digital clones: Navigating the data protection and privacy issues
Posted: May 28, 2025
A digital clone is an avatar that looks, sounds, and acts like you. Digital clones have the potential to be fun and valuable in fields like entertainment, employment, and gaming. But they also create serious privacy and fraud risks.
This article explains what digital clones are, how they are developed, and why anyone considering creating or commissioning a digital clone should carefully assess the data protection and privacy risks first.
What is a digital clone?
A digital clone (or “digital twin”) is a dynamic virtual representation of a real person, created by using personal data to replicate the person’s likeness, voice, mannerisms, and potentially even aspects of their personality or knowledge.
Digital clones can potentially be used in the following contexts:
- Entertainment: Virtual actors or musicians used in content, shows, and video games.
- Customer service: Human-like virtual assistants or support agents.
- Companionship or therapy: AI-based “friends” or virtual therapists.
- Education: Virtual teachers or mentors.
- Legacy: AI versions of people who have died.
- Avatars: In-game characters closely representing the player or another person.
In Arizona on 1 May 2025, an AI avatar representing deceased road rage victim Christopher Pelkey presented a victim impact statement in court. The statement had been prepared by Pelkey’s sister.
“I love that AI. Thank you for that,” Judge Todd Lang told Pelkey’s sister following the statement. “As angry as you are and justifiably angry as the family is, I heard the forgiveness.”
It’s debatable whether Pelkey’s avatar represents a “digital clone”. But the use of AI to simulate the presence of a dead person in court has triggered a broader public conversation about the topic.
How are digital clones created?
Here’s a simplified overview of the process involved in developing a digital clone:
- Acquire personal data. A functional virtual clone requires large amounts of personal data about the person to be “cloned”. This data might include high-resolution scans of the person’s face and body, voice recordings, and videos capturing the person’s mannerisms. Copies of the person’s communications or writing can help simulate their personality and knowledge.
- Process the data. Clean and structure the data so that the models can recognise patterns across images, audio, and text.
- Train the models. Input the data into AI models using technologies such as generative adversarial networks (GANs), natural language processing (NLP), and text-to-speech (TTS) synthesis (a simplified sketch of this step appears below).
- Integrate the model. Create a visual avatar representing the person and integrate the audio and visual models to simulate speech and movement.
- Test and refine. Compare the model’s outputs to the source material, adjusting the model parameters to better simulate the data subject.
Each stage of this process requires careful consideration to ensure compliance with data protection and privacy law.
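To make the model-training stage more concrete, here is a minimal voice-cloning sketch. It assumes the open-source Coqui TTS library and its XTTS v2 model, which this article does not otherwise require or endorse, and the file paths are hypothetical placeholders rather than a production pipeline.

```python
# A minimal voice-cloning sketch, assuming the open-source Coqui TTS library
# and its XTTS v2 model. File paths are hypothetical placeholders; a real
# pipeline would involve far more data, validation, and safeguards.
from TTS.api import TTS

# The reference recording of the data subject. In practice, you should only
# proceed once their explicit, informed consent has been documented.
SPEAKER_SAMPLE = "consented_voice_sample.wav"  # hypothetical path

# Load a multilingual model capable of zero-shot voice cloning.
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

# Generate new speech in the cloned voice from arbitrary text.
tts.tts_to_file(
    text="Hello, this is a synthetic rendering of my voice.",
    speaker_wav=SPEAKER_SAMPLE,
    language="en",
    file_path="cloned_output.wav",
)
```

Even this simplified example processes a person’s voice data, which is why the considerations below apply from the very first step.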
Digital clones: From theory to reality
The concept of digital clones is rapidly moving from theory to reality. Startups like Delphi are enabling individuals, such as Beehiiv founder Tyler Denk, to create AI versions of themselves.
Denk’s “DenkBot” was trained on his writings and media appearances, and it allows newsletter subscribers to interact with his virtual persona for advice. The goal is to enable Denk to scale his expertise and engagement without requiring his constant personal attention.
Other companies have also entered the “digital clone” space. For example, edtech company MasterClass has digitally cloned high-profile people, such as investor Mark Cuban, to act as mentors on its digital learning platform.
Digital clones: Data protection and privacy considerations
Creating a digital clone requires large amounts of highly sensitive personal data, and the process can be highly intrusive. Maintaining the clone afterwards carries further significant risks. As such, the creation of a digital clone should be approached with extreme caution.
Transparency
AI models increasingly require less data to accurately recreate a person’s voice or likeness. Some AI models can convincingly clone a person’s voice from less than thirty seconds of recorded audio, or generate a video avatar from facial images alone.
Nonetheless, you should never create a digital clone unless you can be entirely transparent with the person you are “cloning” (the “data subject”). You must be fully transparent about every data point you use in the process, even where that data is publicly available (e.g., social media content).
The data subject must be fully aware of the risks involved in the creation of a digital clone and fully informed about all aspects of the process of creating it.
You may also need to consider digitally “watermarking” the digital clone to ensure that videos are detectable as synthetic. The EU AI Act will require the watermarking of “deepfakes” in certain circumstances.
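As a simple illustration of the general idea, the sketch below stamps a visible disclosure label onto a generated frame using the Pillow imaging library. This is only one very basic option: machine-readable or imperceptible watermarks embedded with dedicated tooling are likely to be needed in practice, and the file paths here are hypothetical.

```python
# An illustrative, visible disclosure label applied to a single generated
# frame with the Pillow library. Robust compliance is likely to require
# machine-readable or imperceptible watermarks from dedicated tooling; this
# sketch only shows the general idea. File paths are hypothetical.
from PIL import Image, ImageDraw

def label_frame(input_path: str, output_path: str,
                text: str = "AI-generated content") -> None:
    """Overlay a disclosure banner along the bottom edge of a frame."""
    frame = Image.open(input_path).convert("RGB")
    draw = ImageDraw.Draw(frame)
    # Draw a black banner with white text in the bottom-left corner.
    draw.rectangle([(0, frame.height - 30), (frame.width, frame.height)], fill=(0, 0, 0))
    draw.text((10, frame.height - 25), text, fill=(255, 255, 255))
    frame.save(output_path)

label_frame("clone_frame.png", "clone_frame_labelled.png")
```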
Consent
Digital clones use biometric data derived from images, videos, and audio clips. Biometric data enables the identification of a person across different contexts based on immutable characteristics such as their face and voice.
The processing of biometric data requires consent in many jurisdictions, including in most US states with comprehensive privacy laws.
In the EEA and UK, the General Data Protection Regulation (GDPR) requires a “legal basis” for the processing of all personal data, and a further condition for processing “special category data” such as biometric data. It is likely that creating a digital clone would require the data subject’s explicit consent under the GDPR.
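Whatever the applicable law, you will need to be able to evidence the consent you rely on. The sketch below shows one hypothetical way to record it; the fields are illustrative only and are not a substitute for legal advice on what valid explicit consent requires.

```python
# A hypothetical structure for evidencing explicit consent, using a Python
# dataclass. The fields are illustrative; what counts as valid explicit
# consent depends on the applicable law.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ConsentRecord:
    subject_id: str
    purposes: list[str]          # e.g. ["voice cloning", "video avatar"]
    data_categories: list[str]   # e.g. ["voice recordings", "facial scans"]
    notice_reference: str        # the privacy notice shown to the data subject
    granted_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    withdrawn_at: datetime | None = None

    def is_active(self) -> bool:
        """Processing based on consent must stop once consent is withdrawn."""
        return self.withdrawn_at is None

record = ConsentRecord(
    subject_id="subject-0001",
    purposes=["voice cloning"],
    data_categories=["voice recordings"],
    notice_reference="privacy-notice-v3",
)
```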
Security
Digital clones carry significant potential for misuse and fraud. If an unauthorised actor gains control of a digital clone, they could use it to defraud the data subject’s friends, family, and business associates.
As such, strong data security is of the utmost importance at all stages of creating, maintaining, or using a digital clone. You must have sufficient resources to secure the underlying personal data, the AI models that power the digital clone, and the end product itself.
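By way of illustration only, the sketch below shows one such control: encrypting source recordings at rest with the widely used “cryptography” library. Real deployments also need key management, access controls, audit logging, and protection of the trained models themselves; the file paths are hypothetical.

```python
# A minimal sketch of a single security control: encrypting source recordings
# at rest using the "cryptography" library's Fernet primitive. Key management,
# access controls, audit logging, and protection of the trained models are all
# still needed. File paths are hypothetical.
from cryptography.fernet import Fernet

# In practice the key would live in a secrets manager or HSM, never on disk
# alongside the data it protects.
key = Fernet.generate_key()
fernet = Fernet(key)

with open("consented_voice_sample.wav", "rb") as f:
    ciphertext = fernet.encrypt(f.read())

with open("consented_voice_sample.wav.enc", "wb") as f:
    f.write(ciphertext)

# Decrypt only at the point the data is actually needed for training.
plaintext = fernet.decrypt(ciphertext)
```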
Data subject rights
The data subject should be allowed to maintain as much control as possible over their digital clone.
Depending on the relevant jurisdiction, the data subject may have a legal right to have their digital clone deleted on request. You should consider allowing the data subject to delete their digital clone at will, even if there is no clear legal obligation to do so.
You should also provide a mechanism for data subjects to “correct” their digital clone if it develops inaccuracies. This might involve uploading additional personal data for further fine-tuning or refinement of the clone.
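The sketch below illustrates what a basic erasure handler might look like, assuming a hypothetical directory layout in which each data subject’s source data, trained models, and generated outputs are stored separately. A real system would also need identity verification, deletion from backups, and a record of the request and its fulfilment.

```python
# A minimal erasure-handler sketch for honouring a deletion request. The
# directory layout (source_data/, models/, outputs/) is hypothetical; identity
# verification, backup deletion, and record-keeping are omitted.
import shutil
from pathlib import Path

DATA_ROOT = Path("clone_storage")  # hypothetical storage root

def handle_deletion_request(subject_id: str) -> None:
    """Remove a data subject's source data, trained models, and generated outputs."""
    for category in ("source_data", "models", "outputs"):
        target = DATA_ROOT / category / subject_id
        if target.exists():
            shutil.rmtree(target)

handle_deletion_request("subject-0001")
```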
Charting a responsible future for digital clones
As generative AI technologies improve, digital clones are likely to be used by movie studios, educational institutions, and consumers themselves.
But the technology should never be used without a full and detailed assessment of the legal issues, privacy risks, and fraud potential.
As such, any company venturing into this space must have access to experts in data protection, technology, and risk management.