What is GDPR?
The General Data Protection Regulation (GDPR) is an EU regulation adopted by the European Parliament in April 2016 that became enforceable on 25 May 2018. It governs the collection, processing, and movement of personal data of individuals in the EU and the wider European Economic Area (EEA). Non-compliance can attract penalties of up to €20,000,000 or 4% of global annual turnover, whichever is higher. Crucially, the GDPR extends to any company, wherever it is based, that holds personal data of individuals in the EU. This includes Australian businesses.
In short, with regard to machine learning, the GDPR imposes a significant compliance burden on automated decision-making (e.g. machine learning) systems. It outlines three grounds on which automated decisions are lawful:
Where it is necessary for contractual reasons;
Where it is separately authorised by another law;
When the data subject (individual) has explicitly consented.
Moreover, the legislative text is widely read as establishing a general ‘right to explanation’ of automated decision-making processes. In practice, data subjects are entitled to an explanation of an automated decision after it has been made, and are further entitled to contest that decision.
How Does it Affect Australian Businesses?
The GDPR will apply to Australian businesses that:
Have an establishment in the EU (regardless of whether they process personal data in the EU); or
Do not have an establishment in the EU, but offer goods and services to, or monitor the behaviour of (for example, through online tracking or profiling), individuals in the EU.
For example, an Australian business will need to comply with GDPR if it:
Ships products to individuals in the EU;
Sells a health gadget that can monitor the behaviour of an individual in the EU;
Deals with the personal information of an individual in the EU (for example, an individual located in the EU obtains tax advice remotely from an Australian accountant).
What Does This Mean for Machine Learning?
One of the main legislative points affecting the application of machine learning in businesses is the aforementioned ‘right to explanation’ for all EU individuals affected by automated decision-making systems. In short, if an outcome of interest to an individual is decided automatically from their data, that individual has a right to know why the machine learning model made the decision.
A common first reaction on hearing this is that the GDPR and subsequent data-privacy legislation will constrain the use of more accurate black-box machine learning models. This is true to a degree: businesses and machine learning practitioners can no longer focus purely on the accuracy of their models while neglecting a holistic and individual-level understanding of the algorithm, or ignore the quirks and potential biases that a business’s dataset may carry.
Although the application of machine learning may initially be constrained by higher compliance costs, the need for greater model interpretability is an exciting shift for both academia and industry. Existing research and methodologies are able to facilitate both global interpretability (understanding the aggregate relationships between the factors and the recommended decision) and local interpretability (understanding the individual factors that led to a decision being made) for almost all areas of the supervised machine learning spectrum.
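The distinction between global and local interpretability can be sketched with a toy example. The model below is a purely illustrative linear credit scorer — the feature names, weights, and bias are assumptions invented for this sketch, not drawn from any real system — but it shows how the same model can answer both questions: which factors matter in aggregate, and why one particular decision was made.

```python
# Illustrative linear model: weights assumed to have been learned elsewhere.
# All names and values here are hypothetical.
WEIGHTS = {"income": 0.8, "existing_debt": -1.2, "years_employed": 0.5}
BIAS = -0.1

def local_explanation(applicant):
    """Local interpretability: each feature's contribution to this
    one applicant's score (why was *this* decision made?)."""
    return {f: WEIGHTS[f] * applicant[f] for f in WEIGHTS}

def global_explanation(applicants):
    """Global interpretability: mean absolute contribution of each
    feature across many applicants (which factors matter overall?)."""
    n = len(applicants)
    return {
        f: sum(abs(WEIGHTS[f] * a[f]) for a in applicants) / n
        for f in WEIGHTS
    }

applicant = {"income": 1.0, "existing_debt": 2.0, "years_employed": 4.0}
contributions = local_explanation(applicant)
score = BIAS + sum(contributions.values())
# For this applicant, existing_debt contributes -2.4 to the score,
# which is exactly the kind of per-decision account the GDPR's
# ‘right to explanation’ is read as requiring.
```

For a linear model these contributions are exact; for black-box models, techniques such as surrogate-based local explanations approximate the same idea around each individual prediction.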
Model interpretability is not a new concept; academia has long held that a lack of understanding of black-box models is not something to simply accept. In one well-known example, researchers probing an image classifier found that their convolutional neural network was distinguishing ‘dog’ from ‘wolf’ based solely on snow in the background – not on the visual attributes of the animals.
Rather than constraining machine learning, GDPR will accelerate research in model interpretability and mature the overall data science industry.
Contact us today if you are interested in better understanding your machine learning algorithms and their role in data-driven decision making.