Can we adequately protect the privacy and security of data used in machine learning?
As the adoption of machine learning (ML) increases, it is becoming clear that ML poses new privacy and security challenges that are difficult to prevent in practice. This leaves the data involved in ML exposed to risks in ways that are frequently misunderstood.
While traditional software systems already have standard best practices, such as the Fair Information Practice Principles (FIPPs) to guide privacy efforts or the Confidentiality, Integrity, and Availability (CIA) triad to guide security activities, no widely accepted best practices yet exist for the data involved in ML.
Adapting existing standards or creating new ones is critical to the successful, widespread adoption of ML. Without such standards, neither privacy professionals, security practitioners, nor data scientists will be able to deploy ML with confidence that the data they steward is adequately protected. And without such protection, ML will face significant barriers to adoption.
This post aims to create the beginnings of a framework for such standards by focusing on specific privacy and security vulnerabilities within ML systems. At present, these vulnerabilities are best viewed as warning signs: either of a future in which the benefits of ML are not fully embraced, or of one in which ML's liabilities are insufficiently guarded against.
The goal is to raise awareness of the new privacy and security issues confronting ML-based systems for everyone from the most technically proficient data scientists to the most legally knowledgeable privacy personnel, along with the many in between. Ultimately, this post aims to suggest practical methods to mitigate these potential harms, thereby contributing to the privacy-protective and secure use of ML.
Why Machine Learning Is Exposed to New Privacy and Security Risks
Experience has already shown that security and privacy as applied to ML differ from the data protection frameworks applied to traditional software systems. The sheer volume of data collected, the range of uses for existing models (beyond those envisioned by their creators), and the power of the inferences such models generate are unlike anything seen in traditional use cases.
Past frameworks for data protection, for example, were largely premised on harms derived from the point of access, either to the collected data or to software systems themselves. In information security, harms began with unauthorized access to datasets or to networks. In privacy, overly broad or insufficiently enforced access to data again served as the starting point for all subsequent harms, such as unauthorized use, sharing, or sale. Preventing or managing access was, as a result, a relatively intuitive task that privacy and security teams could prioritize as the basis of their efforts.
Harms from ML, however, do not always require the same type of direct access to underlying data to infringe upon that data’s confidentiality or to create privacy violations. This exposes ML systems to privacy and security risks in novel ways, as we will see below.
Both privacy and security harms can occur, for example, absent direct access to underlying training data because ML models themselves may subtly represent that data long after training. Similarly, the behavior of models can be manipulated without needing direct access to their source code. The types of activities that once required hacking under a traditional computing paradigm can now be carried out through other methods.
Informational vs. Behavioral: Two Types of Harms in Machine Learning
The types of security and privacy harms enabled by ML fall into roughly two categories: informational and behavioral.
Informational harms relate to the unintended or unanticipated leakage of information. Behavioral harms, on the other hand, relate to manipulating the behavior of the model itself, impacting the predictions or outcomes of the model.
In describing the specific “attacks” that constitute these types of harms below, we treat each such attack as a warning sign of future, more widely known and exploited vulnerabilities associated with ML.
INFORMATIONAL HARMS
Membership Inference: This attack involves inferring whether or not an individual’s data was contained in the data used to train the model, based on a sample of the model’s output. While seemingly complex, this analysis requires much less technical sophistication than is frequently assumed. A group of researchers from Cornell University, for example, recently released an auditing technique meant to help the general public learn if their data was used to train ML models, hoping to enable compliance with privacy regulations such as the EU’s GDPR. If used by malicious third parties, such analysis could compromise the confidentiality of the model and violate the privacy of affected individuals by revealing whether they are members of sensitive classes.
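To make the idea concrete, below is a minimal sketch of a confidence-based membership test, assuming only query access to a model's predicted probabilities. The `victim_model`, the synthetic dataset, and the threshold are illustrative stand-ins, not the auditing technique described above.

```python
# Minimal membership-inference sketch: overfit models tend to be more
# confident on records they were trained on than on records they were not.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_members, X_outsiders, y_members, _ = train_test_split(X, y, test_size=0.5, random_state=0)

# An overfit "victim" model memorizes much of its training data.
victim_model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_members, y_members)

def guess_membership(model, x, threshold=0.95):
    """Guess 'member' when the model's top predicted probability is unusually high."""
    return model.predict_proba(x.reshape(1, -1)).max() >= threshold

member_rate = np.mean([guess_membership(victim_model, x) for x in X_members[:200]])
outsider_rate = np.mean([guess_membership(victim_model, x) for x in X_outsiders[:200]])
print(f"Flagged as members: {member_rate:.0%} of training rows vs. {outsider_rate:.0%} of unseen rows")
```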
Model Inversion: Model inversion uses ML outputs to recreate the actual data the model was originally trained upon. In one well-known example of model inversion, researchers were able to reconstruct an image of an individual’s face that was used to train a facial recognition model. Another study, focused on ML systems that used genetic information to recommend dosing of specific medications, was able to directly predict individual patients’ genetic markers.
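The sketch below illustrates the flavor of such an attack in the simplest possible setting: gradient ascent on the input of a plain logistic regression, assuming white-box access to its weights. It recovers a class-representative input rather than any single person's record, and every name and setting in it is illustrative.

```python
# Minimal model-inversion sketch: climb the gradient of the model's confidence
# for class 1 to reconstruct an input the model "thinks" belongs to that class.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=1)
model = LogisticRegression(max_iter=1000).fit(X, y)
w, b = model.coef_[0], model.intercept_[0]

x_hat = np.zeros(10)                               # start from a blank guess
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(w @ x_hat + b)))     # model confidence for class 1
    x_hat += 0.1 * p * (1.0 - p) * w               # gradient of that confidence w.r.t. x

# x_hat now points along the feature pattern the model associates with class 1,
# a crude analogue of reconstructing a recognizable face from a facial recognition model.
print("Reconstructed class-1 representative:", np.round(x_hat, 2))
```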
Model Extraction: This type of attack uses model outputs to recreate the model itself. Such attacks have been publicly demonstrated against ML-as-a-service providers like BigML and Amazon Machine Learning, and can have implications for privacy and security as well as for the intellectual property or proprietary business logic of the underlying model. While myriad harms can arise from this type of attack, the very fact that models retain representations of their training data, as described above, makes the threat of extraction an inherent vulnerability from the privacy perspective.
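A simplified extraction attack can be sketched as follows, assuming the attacker can only send queries and observe predicted labels; the victim model, the surrogate, and the query budget are all illustrative choices.

```python
# Minimal model-extraction sketch: query a black-box "victim" model and train
# a local surrogate on its answers, then measure how closely the copy agrees.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=8, random_state=2)
victim = DecisionTreeClassifier(max_depth=4, random_state=2).fit(X, y)   # the remote model

rng = np.random.default_rng(2)
queries = rng.normal(size=(5000, 8))          # the attacker never sees X or y
stolen_labels = victim.predict(queries)

surrogate = DecisionTreeClassifier(max_depth=4, random_state=2).fit(queries, stolen_labels)

test_points = rng.normal(size=(2000, 8))
agreement = (surrogate.predict(test_points) == victim.predict(test_points)).mean()
print(f"Surrogate agrees with the victim on {agreement:.0%} of fresh inputs")
```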
Collective Harms Posed by ML-Enabled Inferences
ML exacerbates one particularly thorny informational harm in the world of data analytics: creating dangers for individuals with no relation to the underlying training data or the model itself. That is, if ML models are able to make increasingly powerful predictions, the ability to apply those predictions to new individuals raises serious privacy concerns on its own. In that sense, a narrow focus on protecting the privacy and security of only the individuals whose data is used to train models is mistaken; all individuals may be affected by sufficiently powerful ML.
One such example is the recent creation of a model that can detect anxiety and depression in children based solely on statistical patterns in each child's voice. The model takes ordinary input data (voice recordings) and produces what amounts to sensitive diagnostic data (the presence of anxiety or depression in a specific child). As a result, the very act of any child speaking, not just the children involved in this study, now carries new privacy implications.
BEHAVIORAL HARMS
Poisoning: Model poisoning occurs when an adversary is able to insert malicious data into training data in order to alter the behavior of the model at a later point in time. This technique may be used in practice for a variety of malicious activities, such as creating an artificially low insurance premium for particular individuals, or otherwise training a model to intentionally discriminate against a group of people. Altering the behavior of models can have both security and privacy implications, and does not necessarily require that the malicious actor have direct access to a model once deployed.
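A toy version of label-flipping poisoning is sketched below, assuming the attacker can inject labeled records before training; the "target" profile, the nearest-neighbor model, and the injection size are all illustrative.

```python
# Minimal poisoning sketch: injecting mislabeled records near a target profile
# changes how the trained model classifies that profile.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=1000, n_features=5, random_state=3)
clean_model = KNeighborsClassifier(n_neighbors=5).fit(X, y)
target = X[clean_model.predict(X) == 1][0]       # a profile currently classified as class 1

# Poison: add near-duplicates of the target, all labeled with the attacker's preferred class (0).
rng = np.random.default_rng(3)
poison_X = target + 0.05 * rng.normal(size=(50, 5))
poison_y = np.zeros(50, dtype=int)

poisoned_model = KNeighborsClassifier(n_neighbors=5).fit(
    np.vstack([X, poison_X]), np.concatenate([y, poison_y])
)

print("Clean model's class for the target:   ", clean_model.predict([target])[0])
print("Poisoned model's class for the target:", poisoned_model.predict([target])[0])
```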
Evasion: Evasion occurs when input data is fed into an ML system in a way that intentionally causes the system to misclassify that data. Such attacks may occur in a range of scenarios, and the manipulation of the input data may not be noticeable to humans. In one such example, researchers were able to cause a road sign classifier to misidentify road signs by placing small black and white stickers on each sign. In systems such as autonomous vehicles, this type of attack could cause traffic violations. Similar evasion attacks have been demonstrated in a variety of other sensitive contexts as well.
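For a linear model, the attack can be written in a few lines, assuming white-box access to the weights; the input chosen and the perturbation budget computed below are illustrative.

```python
# Minimal evasion sketch: a small, targeted perturbation in the direction that
# lowers the model's score is enough to flip its decision.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=10, random_state=4)
model = LogisticRegression(max_iter=1000).fit(X, y)
w = model.coef_[0]

x = X[model.predict(X) == 1][0]                  # an input classified as class 1
margin = model.decision_function([x])[0]
eps = (margin + 0.1) / np.abs(w).sum()           # just enough budget to cross the boundary
x_adv = x - eps * np.sign(w)                     # nudge every feature slightly

print("Original prediction:   ", model.predict([x])[0])
print("Adversarial prediction:", model.predict([x_adv])[0], f"(per-feature change {eps:.3f})")
```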
A Layered Approach to Data Protection in ML
What can we do to guard against these harms in practice? While there are no easy answers, there are a number of actions that can make such harms less likely to occur or that minimize their impact. Below are a handful of approaches.
Noise Injection: From a technical perspective, one of the most promising techniques involves adding tailored amounts of noise into the data used to train the model. Rather than training directly on the raw data, models can train on data with slight perturbations, which increases the difficulty of gaining insight into the original data or manipulating the model itself. One such method, known as differential privacy, is among the most widely accepted (and promising) methods of randomized noise injection.
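The core idea can be illustrated with the Laplace mechanism, the simplest differentially private primitive. Real deployments would rely on vetted libraries and training-time mechanisms such as DP-SGD rather than this hand-rolled sketch, and the epsilon value below is an illustrative choice.

```python
# Minimal differential-privacy sketch: release an aggregate with noise scaled
# to the query's sensitivity (a count changes by at most 1 per person).
import numpy as np

rng = np.random.default_rng(5)
ages = rng.integers(18, 90, size=10_000)         # stand-in for a sensitive dataset

def dp_count(mask, epsilon):
    """Count with Laplace noise of scale sensitivity / epsilon (sensitivity = 1)."""
    return mask.sum() + rng.laplace(loc=0.0, scale=1.0 / epsilon)

print("True count of people over 65:          ", int((ages > 65).sum()))
print("Differentially private count (eps=0.5):", round(dp_count(ages > 65, epsilon=0.5), 1))
```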
Intermediaries: Another approach relies on inserting intermediaries or additional layers between the raw training data and the model, which can be implemented in a variety of ways. Federated learning, for example, trains models against data that remains separated in silos, which can make the attacks discussed above more difficult to carry out. Another method is the teacher-student approach, in which several teacher models are trained on disjoint portions of the underlying data, and their aggregated outputs are then used to train the student model that is actually deployed.
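The sketch below shows the teacher-student pattern in the spirit of PATE: teachers are trained on disjoint partitions of the sensitive data, and only their noisy votes ever reach the deployed student. The partition count, the models, and the noise scale are illustrative choices.

```python
# Minimal teacher-student sketch (PATE-style): the deployed student never
# touches the sensitive records, only noisy aggregate votes from the teachers.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=3000, n_features=10, random_state=6)
X_private, y_private = X[:2000], y[:2000]        # sensitive data, split across silos
X_public = X[2000:]                              # unlabeled public data for the student

# One teacher per disjoint partition of the sensitive data.
teachers = [
    LogisticRegression(max_iter=1000).fit(X_part, y_part)
    for X_part, y_part in zip(np.array_split(X_private, 5), np.array_split(y_private, 5))
]

# Aggregate noisy teacher votes to label the public data.
rng = np.random.default_rng(6)
votes = np.stack([t.predict(X_public) for t in teachers])          # shape (5, n_public)
counts = np.stack([(votes == c).sum(axis=0) for c in (0, 1)])      # per-class vote counts
student_labels = (counts + rng.laplace(scale=1.0, size=counts.shape)).argmax(axis=0)

# Only this student model is ever deployed.
student = LogisticRegression(max_iter=1000).fit(X_public, student_labels)
print("Student agreement with its noisy labels:", round(student.score(X_public, student_labels), 3))
```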
Transparent ML Mechanisms: A motivated attacker may be able to learn more about a black-box ML model than is known by its original creators, creating the possibility for privacy and security harms that they might not have envisioned. While traditionally dominated by black-box modeling routines, the field of ML has experienced a renaissance of research and techniques for training transparent models, which can help to address such concerns. Examples of such techniques, with accompanying open source code, include explainable boosting machines and scalable Bayesian rule lists.
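As a brief illustration, the snippet below trains one such glass-box model with the open source interpret package (whose installation is assumed); the synthetic dataset and settings are illustrative.

```python
# Minimal transparent-model sketch using an explainable boosting machine,
# a glass-box alternative to black-box gradient boosting.
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=7)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=7)

ebm = ExplainableBoostingClassifier(random_state=7).fit(X_train, y_train)
print("Held-out accuracy:", round(ebm.score(X_test, y_test), 3))

# ebm.explain_global() exposes the per-feature shape functions the model learned,
# so its owners can inspect its behavior before an attacker does.
```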
Access Controls: While it is broadly true that attacks against ML do not require the type of direct access needed to cause harm in traditional software systems, access to the model's output is still required in many cases. For this reason, attempts to limit access to model output, along with methods to detect when such access is being abused, are among the simplest and most effective ways to protect against the attacks described above.
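A simple output-side control might look like the sketch below: a per-key query budget plus returning hard labels instead of full probability vectors. The class, the limits, and the exception are illustrative, not a reference to any particular product.

```python
# Minimal access-control sketch: budget queries per caller and withhold
# probability scores, both of which raise the cost of the attacks above.
from collections import defaultdict

class GuardedModel:
    def __init__(self, model, max_queries_per_key=1000):
        self.model = model
        self.max_queries_per_key = max_queries_per_key
        self.query_counts = defaultdict(int)

    def predict(self, api_key, x):
        self.query_counts[api_key] += 1
        if self.query_counts[api_key] > self.max_queries_per_key:
            # A spike in queries from one caller is also a useful abuse signal.
            raise PermissionError("Query budget exceeded; possible extraction attempt")
        # Return only the hard label; full probabilities make inference,
        # inversion, and extraction attacks easier.
        return self.model.predict([x])[0]
```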
Model Monitoring: It can be difficult to predict how ML systems will respond to new inputs, making their behavior difficult to manage over time. Detecting when such models are misbehaving is therefore critical to managing security and privacy risks. Key components of monitoring include outlining major risks and failure modes, devising a plan for how to detect complications or anomalies that occur, along with mechanisms for responding quickly if problems are detected.
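One concrete monitoring signal is distribution drift in the model's inputs, sketched below with a two-sample Kolmogorov-Smirnov test. Real monitoring plans track many more signals, and the alert threshold here is an illustrative choice.

```python
# Minimal monitoring sketch: compare incoming feature values against the
# training distribution and raise an alert when they diverge.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(8)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5000)    # distribution at training time
incoming_feature = rng.normal(loc=0.4, scale=1.0, size=500)     # production traffic has drifted

statistic, p_value = ks_2samp(training_feature, incoming_feature)
if p_value < 0.01:
    print(f"Drift alert: input distribution shifted (KS statistic {statistic:.2f})")
```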
Model Documentation: A long-standing best practice in the high-stakes world of credit scoring, model documentation formally records information about modeling systems, including but not limited to: business justifications; business and IT stakeholders; data scientists involved in model training; names, locations, and properties of training data; assumptions and details of ML methodologies; test and out-of-time sample performance; and model monitoring plans. A good model report should allow future model owners or validators to determine whether a model is behaving as intended.
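In code, even a simple machine-readable report goes a long way. The fields below loosely mirror the items listed above, and every value is a hypothetical placeholder rather than a reference to a real system.

```python
# Minimal model-documentation sketch: a machine-readable report that travels
# with the model through training, validation, and deployment.
model_report = {
    "model_name": "example-credit-risk-v1",                     # hypothetical
    "business_justification": "why the model exists and who approved it",
    "stakeholders": {"business": "...", "it": "...", "data_science": "..."},
    "training_data": {"name": "...", "location": "...", "properties": "row counts, date ranges, known gaps"},
    "methodology": {"algorithm": "...", "assumptions": "...", "hyperparameters": "..."},
    "performance": {"test": "...", "out_of_time": "..."},
    "monitoring_plan": "signals tracked, alert thresholds, response owners",
}
```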
White Hat or Red Team Hacking: Because many ML attacks are described in technical detail in peer-reviewed publications, organizations can use those details to test public-facing or mission-critical ML endpoints against known attacks. White hat or red teams, whether internal or provided by third parties, may therefore be able to identify and potentially remediate discovered vulnerabilities.
Open Source Software Privacy & Security Resources: Nascent open source tools for private learning, accurate and transparent models, and debugging of potential security vulnerabilities are currently being released and are often associated with credible research or software organizations.
No Team Is an Island: The Importance of Cross-Functional Expertise
Ongoing, cross-functional communication is required to help ensure the privacy and security of ML systems. Data scientists and software developers need access to legal expertise to identify privacy risks at the beginning of the ML lifecycle. Similarly, lawyers and privacy personnel need access to those with design responsibilities and security proficiencies to understand technical limitations and to identify potential security harms. Processes for ongoing communication, for risk identification and management, and for clear setting of objectives should be established early and followed scrupulously to ensure that no team operates in isolation.
There is no point in the process of creating, testing, deploying, and auditing production ML at which a model can be ‘certified’ as free from risk. There are, however, a host of methods to thoroughly document and monitor ML throughout its lifecycle, keeping risk manageable and enabling organizations to respond to fluctuations in the factors that affect that risk. Identifying, preventing, minimizing, and responding to such risks must be an ongoing and thorough process.
Conclusion
This post aimed to outline a framework for understanding and addressing privacy and security risks in ML. We welcome suggestions or comments to improve this analysis; please reach out with feedback.