Month: June 2018

Blog

Accessing Jupyter Lab in Amazon SageMaker

Introduction
Amazon SageMaker makes it easy to build machine learning models using the familiar Jupyter Notebook interface. Jupyter Notebooks are prominent in the data science and machine learning workflow, and Jupyter Lab is the next-generation user interface for Jupyter notebook files. All notebook files are fully supported within the Lab interface. This tutorial details how to access Jupyter Lab on the Amazon SageMaker platform. Note, however, that Jupyter Lab is currently in beta, so more features (such as real-time collaboration) and improvements are still under development.
Accessing Jupyter Lab in Amazon SageMaker
Jupyter Lab is easily accessible on the Amazon SageMaker platform. First, start your SageMaker notebook instance from the management console. The platform is currently only available in the Tokyo, Northern Virginia, Ohio, Ireland, and Oregon regions (as of June 2018), so you may have to change your region to one of these. Once the instance is running, open the notebook interface in your browser; you will see the familiar Jupyter notebook interface start up in the home directory. It only takes one step to change the user interface: simply replace the ‘tree’ at the end of the URL with ‘lab’.
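The URL change described above can also be scripted. The snippet below is a minimal sketch: the function name to_lab_url and the example hostname are hypothetical and used only for illustration, so substitute the address your own notebook instance opens on.

```python
from urllib.parse import urlsplit, urlunsplit


def to_lab_url(notebook_url: str) -> str:
    """Swap a trailing 'tree' path segment for 'lab'."""
    parts = urlsplit(notebook_url)
    path = parts.path.rstrip("/")
    if not path.endswith("/tree"):
        raise ValueError("expected a URL whose path ends in '/tree'")
    # Drop the final 'tree' segment and append 'lab' in its place.
    new_path = path[: -len("tree")] + "lab"
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, parts.query, parts.fragment)
    )


# Hypothetical instance name and region, for illustration only.
print(to_lab_url("https://my-notebook.notebook.us-east-1.sagemaker.aws/tree"))
```

The function raises a ValueError rather than guessing when the address does not end in ‘tree’, for example when a notebook file is already open.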

3 Reasons to Use Jupyter Lab:
Below we list three of our favourite features of Jupyter Lab in no particular order. For more details and features, visit the user documentation at http://jupyterlab.readthedocs.io/en/latest/.
Dark Theme
A dark theme is now built into the interface. We also expect custom themes and extensions from the community in the near future.

Multiple Notebooks and Windows
Having multiple notebooks snapped alongside a bash terminal or Python console creates a modular environment that is entirely customisable to the user or task.

Live Markdown Editor
A live markdown editor is included in Jupyter Lab, enabling users to preview their edits in real time.

GDPR and its Implications for Australia and Machine Learning

What is GDPR?
The General Data Protection Regulation (GDPR) is a regulation in EU law that was adopted by the European Parliament in April 2016 and became enforceable on May 25th, 2018. The regulation applies to the collection, processing, and movement of personal data for individuals residing in 32 European states. Non-compliance can result in a penalty of up to €20,000,000 or 4% of global annual revenue, whichever is higher. Furthermore, GDPR extends to all companies, wherever they are based, that hold information on data subjects in the EU. This includes Australian businesses.
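The penalty ceiling quoted above is simple arithmetic: the larger of a fixed amount and a share of revenue. The sketch below only illustrates that upper bound; the function name max_gdpr_fine is hypothetical, and actual fines are set case by case by regulators.

```python
FIXED_CAP_EUR = 20_000_000  # fixed ceiling quoted in the text
REVENUE_SHARE = 0.04        # 4% of global revenue


def max_gdpr_fine(global_revenue_eur: float) -> float:
    """Return the upper bound of the penalty: whichever amount is higher."""
    return max(FIXED_CAP_EUR, REVENUE_SHARE * global_revenue_eur)


# Illustrative revenue figures only.
print(max_gdpr_fine(100_000_000))    # 4% is below the fixed cap, so the cap applies
print(max_gdpr_fine(1_000_000_000))  # 4% exceeds the fixed cap
```

In other words, the fixed amount is the binding ceiling for any business with global revenue under €500,000,000; above that, the 4% share takes over.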

For machine learning, GDPR creates a significant compliance burden for automated decision-making systems (which include many machine learning systems). It outlines three circumstances in which automated decisions are legal:

Where it is necessary for contractual reasons;
Where it is separately authorised by another law;
Where the data subject (individual) has explicitly consented.

Moreover, the legislative text describes a general ‘right to explanation’ of automated decision-making processes. In greater detail, data subjects are entitled to an explanation of the automated decision-making systems after the decisions are made. Data subjects are further entitled to contest those decisions.
How Does it Affect Australian Businesses?
The GDPR will apply to Australian businesses that:

Have an establishment in the EU (regardless of whether they process personal data in the EU); or
Do not have an establishment in the EU, but offer goods and services, or monitor the behaviour of individuals (through sensitive personal data) in the EU.

For example, an Australian business will need to comply with GDPR if it:

Ships products to individuals in the EU;
Sells a health gadget that can monitor the behaviour of an individual in the EU;
Deals with the personal information of an individual in the EU (for example, an EU citizen living in Australia obtains tax advice from a local accountant).

What Does This Mean for Machine Learning?
One of the main legislative points affecting the application of machine learning in businesses is the aforementioned ‘right to explanation’ for all EU individuals affected by automated decision-making systems. In short, if an outcome that matters to an individual is determined by a model using their data, they have a right to know why the machine learning model made that decision.

A common first reaction is that GDPR and further data privacy legislation will constrain the application of more accurate black-box machine learning models. This is true to a certain degree. Businesses and machine learning practitioners can no longer naïvely focus purely on model accuracy while neglecting a holistic and individual-level understanding of the algorithm and ignoring the quirks and potential biases that a business’s dataset may bring.

Although the application of machine learning may initially be constrained by higher compliance costs, the need for greater model interpretability is an exciting shift for both academia and industry. Existing research and methodologies are able to facilitate both global interpretability (understanding the aggregate relationships between the factors and the recommended decision) and local interpretability (understanding the individual factors that led to a decision being made) for almost all areas of the supervised machine learning spectrum.
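The difference between local and global interpretability can be illustrated with a deliberately simple model. The sketch below assumes a hypothetical linear scoring model; the feature names, weights, and applicant records are all invented for illustration. A linear model is used because its per-feature contributions are exact, whereas black-box models need dedicated explanation techniques to approximate them.

```python
# Hypothetical linear scoring model; weights and data are invented.
WEIGHTS = {"income": 0.5, "debt": -0.8, "years_employed": 0.3}

APPLICANTS = [
    {"income": 1.0, "debt": 0.5, "years_employed": 2.0},
    {"income": 3.0, "debt": 2.0, "years_employed": 0.0},
    {"income": 2.0, "debt": 0.0, "years_employed": 4.0},
]


def local_explanation(applicant):
    """Local view: each feature's contribution to one applicant's score."""
    return {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}


def score(applicant):
    """The model's output is the sum of the per-feature contributions."""
    return sum(local_explanation(applicant).values())


def global_importance(applicants):
    """Global view: mean absolute contribution of each feature overall."""
    totals = {name: 0.0 for name in WEIGHTS}
    for applicant in applicants:
        for name, contribution in local_explanation(applicant).items():
            totals[name] += abs(contribution)
    return {name: total / len(applicants) for name, total in totals.items()}


# Why did the second applicant receive this score?
print(score(APPLICANTS[1]), local_explanation(APPLICANTS[1]))
# Which factors matter most across all applicants?
print(global_importance(APPLICANTS))
```

The local explanation answers the question an affected individual would ask (which of my factors drove this decision?), while the global summary answers the question a business or regulator would ask (what does the model rely on in aggregate?).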

Model interpretability is not a new concept. Academia has long noted that a lack of understanding of black-box models is not something to simply accept. For example, a team of researchers looking to understand their image classifier found that their convolutional neural network was distinguishing ‘dog’ from ‘wolf’ based solely on snow in the background, rather than on the visual attributes of the animals.
Our View
Rather than constraining machine learning, GDPR will accelerate research in model interpretability and mature the overall data science industry.

Contact us today if you are interested in better understanding your machine learning algorithms and their role in data-driven decision making.

References
https://www.eugdpr.org

https://www.oreilly.com/ideas/how-will-the-gdpr-impact-machine-learning