


Visit Intellify at the AWS Summit Sydney 2019

Intellify is proud to announce that we will be exhibiting at this year’s AWS Summit at the International Convention Centre (ICC), Sydney. The Summit aims to bring together the cloud computing community to enhance and celebrate knowledge about creating future-ready businesses.

Explore how cloud technology can help businesses innovate, reduce costs and promote efficiency at scale. The event is open to cloud users of all levels! It’s a free event designed to inspire new skill and application development, and to educate attendees about the newest technologies and the experts who have built their solutions on AWS.


What to Expect:

Day 1 (30th of April) is Amazon Innovation Day. The agenda focuses on first-hand experience and insights into how Machine Learning, Robotics, and AI are changing the way we do business and live our daily lives.

Days 2 & 3 (1st and 2nd of May) are the AWS Summit itself. These days include exciting workshops, networking, and presentations about the latest in cloud computing.


Intellify will be exhibiting on all three days, so come ask us some questions and join the cloud computing conversation. See you there!

Book your ticket for the Summit before it’s too late: https://pages.awscloud.com/anz-summit-2019.html

For more information about the Summit follow this link: https://aws.amazon.com/events/summits/sydney/

Also, keep up to date with Intellify: http://www.intellify.com.au/


PyData Sydney Meetup: March 2019

PyData Sydney’s March meetup featured presentations from two guest speakers: Nick Halmagyi and Zhuo Jia Dai. These presentations showcased two different areas of data science. Our session had a great turnout, and provided an opportunity for data enthusiasts to learn and network amongst their peers. To find the presentations from this session, please click here.

Below are the speakers’ summaries of their presentations.

Nick presented on Graph Analytics in Python:

Graph Analytics in Python: Predicting Missing Links
The goal of the missing link problem in graph theory is to predict the pairs of nodes that are not currently linked yet are most similar in character to existing links. It is a problem regularly studied by operators of social networks to suggest new connections between people, but the only limit to what can be represented as the nodes and edges of a graph is your own creativity! The state-of-the-art benchmarks are all rooted in machine learning, and I will provide an example-based exposition of how to approach this problem in Python.
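To make the idea concrete, here is a minimal sketch (not taken from Nick’s talk) of a similarity-based approach in Python using networkx: score currently unlinked pairs in a toy graph by the Jaccard coefficient of their neighbourhoods. The graph and node names are purely illustrative.

# Minimal link-prediction sketch: rank unlinked node pairs by Jaccard similarity.
import networkx as nx

# A small, illustrative undirected "friendship" graph
G = nx.Graph()
G.add_edges_from([
    ("alice", "bob"), ("alice", "carol"), ("bob", "carol"),
    ("carol", "dave"), ("dave", "erin"), ("erin", "frank"),
])

# Score every currently unlinked pair by the Jaccard coefficient of their neighbourhoods
candidates = nx.jaccard_coefficient(G)  # defaults to all non-edges

# The highest-scoring pairs are the most plausible "missing links"
for u, v, score in sorted(candidates, key=lambda t: t[2], reverse=True)[:3]:
    print(f"{u} -- {v}: {score:.2f}")

A full solution would feed similarity features like these into a supervised model, which is the machine learning framing described in the talk.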

Zhuo presented on Machine Learning in Banking:

Scorecards: How banks use Machine Learning to make lending decisions
Credit scoring has a long history in banking. Banks use it to make lending decisions on a daily basis. In this talk, ZJ will explain what a scorecard is, how a scorecard is built, and how modern ML frameworks, such as XGBoost, can be applied to build scorecards automatically.
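As a rough, illustrative sketch of the idea (not the material from the talk), the snippet below fits an XGBoost classifier to synthetic application data and maps the predicted default probabilities onto a scorecard-style scale; the feature names, synthetic target, and score scaling are all invented for the example.

# Illustrative only: fit a boosted-tree model to synthetic credit data and
# map predicted default probabilities onto a scorecard-style scale.
import numpy as np
import pandas as pd
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "income": rng.normal(60_000, 15_000, n),
    "utilisation": rng.uniform(0, 1, n),
    "missed_payments": rng.poisson(0.5, n),
})
# Synthetic target: higher utilisation and more missed payments mean higher default risk
default = ((0.5 * df["utilisation"] + 0.3 * df["missed_payments"]
            + rng.normal(0, 0.3, n)) > 0.8).astype(int)

model = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
model.fit(df, default)

# Convert probability of default into a familiar "higher is better" score using a
# points-to-double-the-odds style transformation (the offsets here are arbitrary).
p = model.predict_proba(df)[:, 1].clip(1e-6, 1 - 1e-6)
score = 600 + 100 * np.log2((1 - p) / p)
print(score[:5].round())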

Interested in taking the reins for the next PyData presentation? Want to teach your peers about the topic of your choice in the data science realm? Send us an email at pydatasydney@gmail.com to enquire. Alternatively, please keep an eye on our Meetup page for more information about our next session in April.


PyData Sydney Meetup: Feb 2019

PyData Sydney’s first meetup of the new year resulted in an evening full of networking, pizza, and most importantly, learning about the best and brightest in new data management, processing, analytics, and visualisation. The PyData network provides rare opportunities for data science enthusiasts to meet and chat about the newest trends and developments in the industry.

This meetup put a spotlight on time series models, and how they have progressed past statistical methods. Attendees learned about the state-of-the-art methods and how they compare to classical ones.

In a field that’s growing at such a fast pace, it’s important that its followers do the same. The crowd of data science aficionados has certainly grown, and we hope to see more and more eager scientists at our future events!

Interested in taking the reins for the next PyData presentation? Want to teach your peers about the topic of your choice in the data science realm? Send us an email at pydatasydney@gmail.com to enquire! Alternatively, please keep an eye on our Meetup page for more information about our next session on 6th March.


Happy New Year from the Intellify Team

Happy new year from the Intellify team!
As we enter our second year of operation, we look forward to creating more great Machine Learning and Data Science solutions for our clients. Through our growth in 2018, we have built an environment in which our employees can thrive, and a consultancy creating big wins in this rapidly developing space.

We thank you for your support and participation in our journey thus far! We eagerly anticipate the possibilities 2019 brings, especially with the growing excitement and opportunities for organisations using Machine Learning and Data Science.

Our focus this year will be to continue to invest in building a world-class Data Science team, and to make sure our staff have every opportunity to grow and develop. Our team may be from all around the world, but they all speak the same language. Check out the video below for new year’s well wishes from the Intellify team! 


re:Invent 2018 – Machine Learning and AI Take Centre Stage

This year’s AWS re:Invent was full of new releases and updates, cutting-edge tools, innovative products, and valuable resources to help take advantage of the rapid pace of growth and development in cloud services. At Intellify, we were most excited by the focus on building out Machine Learning tools and environments. SageMaker has received many new updates, making optimisation, recommendation systems, reinforcement learning (RL) and forecasting models deployable by more users than ever. Tellingly, much of Andy Jassy’s keynote was dedicated to announcing AWS’ massive new focus on Machine Learning and AI, and we are gearing up to help our customers take full advantage of the benefits.

5 Key Takeaways

While we continue to digest the content coming from re:Invent – it is a huge week after all – and begin to work with the expansions to SageMaker, Reinforcement Learning, Elastic Inference, Marketplace, and other ML-based toolkits, here are some of the major takeaways we had from the keynote presentations:

New SageMaker offerings: Ground Truth, Neo, Reinforcement Learning. We have been developing and deploying for our customers on SageMaker (see here, and here) since its first release and have found it a powerful environment for ML projects. AWS is expanding SageMaker even more, including:

Ground Truth: adds the capability to outsource data labelling to human agents (through Mechanical Turk, third parties, or in-house teams), or to a combination of human and intelligent systems. Accurate labelling of data is crucial to successful implementation and integration of data sets for ML, and we can see ourselves taking advantage of this new service to ensure solution accuracy.
Reinforcement Learning (RL): a new environment dedicated to RL algorithms and compute power, allowing deployment of RL-based ML solutions, which have very different requirements from supervised and unsupervised learning techniques.


Allowing ML vendors to share their cutting-edge ML algorithms and solutions in AWS Marketplace. Not only will this greatly expand the options available to those seeking pre-built ML solutions, it will also enable us to share some of our expertise on the platform, and more rapidly deploy flexible customer solutions using an array of customisable tools.

Significant expansions to Elastic Inference, and the arrival of AWS’ specialised ML chip: Inferentia. It is particularly exciting for us to see this expansion of flexible compute power for ML. Inference requires significant power when running our models, but for relatively sparse periods of time. The advantage of Elastic Inference will be in providing that high compute when it is needed, rather than paying for the availability of high compute all month. We’re sure this will make our future inference-based ML projects more cost-effective, while retaining the compute power we need for successful deployment.

The introduction of DeepRacer and the DeepRacer League. We loved this! Based on the new RL environment in AWS, watch out for an Intellify team flexing their ML and data science muscles in the League!

Amazon Textract, Personalize, and Forecast. This past year, customers have shown a lot of interest in document recognition/parsing; recommender systems, especially in ecommerce and customer experience-focused businesses; and time series modelling and forecasting. There are so many vital applications of these ML-based tools, and we can’t wait to get on board with Textract, Personalize, and Forecast to take them to a whole new realm of customers seeking the benefits of AI/ML.

Our Thoughts

AI and ML are hugely on the rise for both organisations and individuals. AWS’ new suite of releases and expansions to SageMaker, RL, Inference, and Marketplace will only help this field grow at an even faster rate. We’ve been deploying on SageMaker since its release, and these tools will only help us expand what we can offer to customers in terms of cutting edge, competitive AI/ML solutions.

Get in Contact

If you or your organisation are looking to take advantage of AI/ML and its enormous opportunities to boost revenue, create operational efficiency, and enhance competitiveness, please get in contact via phone (02 8089 4073) or email (info@intellify.com.au). We are AWS Consulting partners for Machine Learning, with a range of projects already completed across SageMaker and AWS cloud services.


In the Media: [This is My Code] on AWS

Our very own Kale Temple was featured in an AWS segment on their YouTube channel. The video featured an in-depth discussion of Particle Swarm Optimisation using Amazon SageMaker. In an age where companies are looking for better ways to optimise pricing, discounts, and offers across their product portfolios, environments like SageMaker are game changing. Watch the video below for a detailed look at this incredible process.

We’re thrilled to be a part of the conversation involving machine learning and AWS tools, including SageMaker. We look forward to the exciting new things that will come out of re:Invent 2018. 


Executive Breakfast – AI & ML Opportunities for Australian Business

On Thursday 1st November we held our Executive Briefing breakfast session, a roundtable presentation and discussion with an array of 20 senior managers and executives, at the Four Seasons Sydney. Our discussion centred on the growing market and opportunities for enterprises looking to implement data science, machine learning and AI. This session focused not just on the economic importance of implementing ML & AI now to secure competitive advantage, but also on some of the tools and techniques to get AI projects successfully underway. With a presentation from AWS, and a sumptuous Four Seasons breakfast, attendees reported that they found the event very useful and informative.

We will be running our next Executive Briefing in March 2019. If you are at the stage of investigating AWS Machine Learning for data projects, or looking at leveraging current opportunities in Machine Learning, SageMaker, or Artificial Intelligence, keep an eye out: we’ll be inviting selected CxOs and senior managers for another exclusive opportunity to network and discuss the future of ML & AI.


Amazon SageMaker vs. AWS Lambda for Machine Learning

Introduction
Advances in serverless compute have enabled applications and micro-services to scale out horizontally and elastically, absorbing the volatility of service demand. AWS Lambda is perhaps the most prominent serverless platform for developers and machine learning engineers alike. Its scalability and accessibility make it a top choice for serving and deploying modern machine learning models.

SageMaker, on the other hand, was introduced by Amazon at re:Invent 2017 and aims to make machine learning solutions more accessible to developers and data scientists. SageMaker is a fully managed platform that enables quick and easy building, training, and deploying of machine learning models at any scale. The platform, accessed through a Jupyter Notebook interface, also makes it straightforward to perform exploratory data analysis and preprocessing on training data stored in Amazon S3 buckets. Moreover, SageMaker includes 12 common machine learning algorithms that are pre-installed and optimised for the underlying hardware – delivering up to 10x performance improvements compared to other frameworks.

In this article, we will compare and contrast the various advantages and disadvantages of SageMaker (server) and Lambda (serverless) for the machine learning and data science workflow. We will use the categories of cost, model training, and model deployment to detail the characteristics of both services.
Pricing
The pricing model for SageMaker mirrors that of EC2 and ECS, albeit at a premium compared to the bare-bones virtual machines. Like most server deployments, serving a machine learning model on the SageMaker platform is more costly for sparse prediction workloads. Interestingly, SageMaker instance prices are divided into the segments of model building, training, and deployment. Below are example prices from the US West – Oregon region; for more detail, see the official SageMaker pricing page.
Pricing tables: Model Building, Model Training, and Model Hosting instance pricing (US West – Oregon).

AWS Lambda, in contrast, allows model predictions to scale horizontally and flexibly with the workload. Lambda pricing is based on the number of requests per month and the GB-seconds consumed by Lambda executions (https://aws.amazon.com/lambda/pricing/). Lambda, and serverless compute in general, is more cost-effective for models with low or highly volatile interaction rates.
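To make the trade-off concrete, here is a back-of-the-envelope comparison of a pay-per-request Lambda deployment against an always-on hosting instance. The unit prices below are illustrative placeholders only; substitute the current figures from the AWS pricing pages linked above.

# Back-of-the-envelope cost comparison (all unit prices are illustrative placeholders).
requests_per_month = 100_000
avg_duration_s = 0.5           # average seconds per prediction
lambda_memory_gb = 1.0         # memory allocated to the function

price_per_million_requests = 0.20    # USD, placeholder
price_per_gb_second = 0.0000167      # USD, placeholder
endpoint_price_per_hour = 0.30       # USD, placeholder for an always-on hosting instance

lambda_cost = (requests_per_month / 1_000_000) * price_per_million_requests \
            + requests_per_month * avg_duration_s * lambda_memory_gb * price_per_gb_second

endpoint_cost = endpoint_price_per_hour * 24 * 30

print(f"Lambda:   ~${lambda_cost:,.2f} per month")
print(f"Endpoint: ~${endpoint_cost:,.2f} per month")

At low request volumes the pay-per-request model wins comfortably; as traffic grows and flattens out, an always-on endpoint becomes the cheaper option.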
Model Training
For model training, SageMaker is reliant on the notebook interface and lifecycle while Lambda is agnostic to the model training environment – one can choose to train either locally or through EC2. This means that Lambda requires no changes to the model training process. On the other hand, SageMaker provides the added benefit of highly optimised native implementations of popular machine learning algorithms such as:

Linear regression – https://docs.aws.amazon.com/sagemaker/latest/dg/linear-learner.html
K-means – https://docs.aws.amazon.com/sagemaker/latest/dg/k-means.html
Principal component analysis (PCA) – https://docs.aws.amazon.com/sagemaker/latest/dg/pca.html
Latent Dirichlet Allocation (LDA) – https://docs.aws.amazon.com/sagemaker/latest/dg/lda.html
Factorisation machines – https://docs.aws.amazon.com/sagemaker/latest/dg/fact-machines.html
Neural topic modelling (NTM) – https://docs.aws.amazon.com/sagemaker/latest/dg/ntm.html
Sequence to sequence modelling (Based on Sockeye) – https://docs.aws.amazon.com/sagemaker/latest/dg/seq-2-seq.html
Boosted decision trees (XGBoost) – https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html
Image Classification (Based on ResNet for full training or transfer learning) – https://docs.aws.amazon.com/sagemaker/latest/dg/image-classification.html
Recurrent Neural Network Forecasting – https://docs.aws.amazon.com/sagemaker/latest/dg/deepar.html

Using SageMaker C5 or P3 instances enables further optimisations for neural network training. C5 instances come with Intel’s Advanced Vector Extensions (AVX-512) optimisations, while the CUDA 9 and cuDNN 7 drivers on P3 instances take advantage of mixed-precision training on the Volta V100 GPUs. Lastly, SageMaker’s formal deployment and serving module enables developers and data scientists to build models offline while using the service for model training through the SDK.
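As a rough sketch of what training with one of these built-in algorithms looks like through the SageMaker Python SDK (parameter names reflect the v1-era SDK; the IAM role ARN, S3 bucket, and training data below are placeholders):

# Sketch: train SageMaker's built-in K-means algorithm via the Python SDK (v1-era API).
import numpy as np
import sagemaker
from sagemaker import KMeans

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder ARN

kmeans = KMeans(
    role=role,
    train_instance_count=1,
    train_instance_type="ml.c5.xlarge",
    output_path="s3://my-bucket/kmeans-output",   # placeholder bucket
    k=10,
    sagemaker_session=session,
)

# Toy training data; in practice this would come from your feature pipeline or S3
train_data = np.random.rand(1000, 20).astype("float32")

# record_set() converts and uploads the data in the recordIO-protobuf format
# that the built-in algorithms expect
kmeans.fit(kmeans.record_set(train_data))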
Model Deployment
SageMaker makes model deployment easy and accessible. The platform can autoscale inference APIs in a manner similar to attaching application load scalers to EC2 instances (https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-auto-scaling.html). Furthermore, SageMaker conveniently enables data scientists to A/B test candidate model deployments in real time. A diagram in the official SageMaker documentation outlines the model deployment process: when deploying a model, SageMaker first creates a new EC2 instance, downloads the serialised model data from S3, pulls the Docker container from ECR, and deploys the runtime required to run and serve the model. Finally, it mounts a volume with the model data to the runtime.
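Continuing the training sketch from the previous section, deploying the fitted estimator behind a real-time HTTPS endpoint is a single call; the instance type and count below are illustrative.

# Deploy the trained estimator to a real-time endpoint (v1-era SDK, continuing the sketch above).
predictor = kmeans.deploy(
    initial_instance_count=1,
    instance_type="ml.m4.xlarge",
)

# Invoke the endpoint with a few records and inspect the cluster assignments
print(predictor.predict(train_data[:5]))

# Tear the endpoint down when finished to stop the hourly hosting charge
predictor.delete_endpoint()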


In comparison, AWS Lambda provides greater flexibility and cost efficiency for model deployments with low interaction rates. However, the main drawbacks of Lambda include the inherent cold-start delay and the lack of GPU support (not that GPUs are typically needed for model predictions). Moreover, Lambda deployments are constrained to a maximum of 5 minutes of computation time and 50MB of compressed deployment size (in a zip or jar format). One can work around the package size constraint by uploading large dependencies to S3 and then downloading and caching them in a temporary directory such as /tmp on the first request. Furthermore, one can reduce cold-start delays by using model architectures with higher prediction throughput, driver optimisations, and/or periodically pinging the API every 5-10 minutes.
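A common pattern for that workaround is to download and deserialise the model once, outside the handler, so that warm invocations reuse the cached copy. The sketch below is illustrative only; the bucket, key, pickled-model format, and event shape are placeholders.

# Minimal Lambda handler sketch: fetch the model artefact from S3 on cold start,
# cache it in /tmp, and reuse it across warm invocations.
import json
import os
import pickle

import boto3

S3_BUCKET = "my-model-bucket"     # placeholder
S3_KEY = "models/model.pkl"       # placeholder
LOCAL_PATH = "/tmp/model.pkl"

_model = None  # module-level cache survives warm invocations


def _load_model():
    global _model
    if _model is None:
        if not os.path.exists(LOCAL_PATH):
            boto3.client("s3").download_file(S3_BUCKET, S3_KEY, LOCAL_PATH)
        with open(LOCAL_PATH, "rb") as f:
            _model = pickle.load(f)
    return _model


def handler(event, context):
    model = _load_model()
    features = json.loads(event["body"])["features"]
    prediction = model.predict([features])[0]
    return {"statusCode": 200, "body": json.dumps({"prediction": float(prediction)})}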
Conclusion
Overall, both Amazon SageMaker and AWS Lambda provide many benefits for the machine learning workflow. SageMaker’s Jupyter notebook interface for model building and training makes it extremely accessible to developers and data scientists. Moreover, the optimised implementations of popular machine learning algorithms such as K-means and boosted trees make it attractive for model training. For machine learning models with variable demand, we recommend deploying on AWS Lambda. However, despite Lambda providing continuous horizontal scaling to the workload, we find that SageMaker’s ‘one-click’ deployment and built-in A/B testing enable us to iterate quickly across candidate model deployments, multiple minimum viable products at a time.
Further Reading

https://aws.amazon.com/sagemaker/
https://aws.amazon.com/lambda/



Accessing Jupyter Lab in Amazon SageMaker

Introduction
Amazon SageMaker makes it easy and accessible to build machine learning models using the familiar Jupyter Notebook interface. Given the prominence of Jupyter Notebooks in the data science and machine learning workflow, Jupyter Lab, the next-generation user interface for Jupyter notebooks, is a welcome development, and all notebook files are fully supported within the Lab interface. This tutorial will detail how to access Jupyter Lab on the Amazon SageMaker platform. Note, however, that Jupyter Lab is currently in beta, so more features (such as real-time collaboration) and improvements are still under development.
Accessing Jupyter Lab in Amazon SageMaker
Jupyter Lab is easily accessible on the Amazon SageMaker platform. First, start your SageMaker notebook instance from the management console. The platform is currently only available in Tokyo, North Virginia, Ohio, Ireland, and Oregon (as of June 2018), so you may have to change your region to one of the available regions. After some time, open the notebook interface in your browser; you will see the familiar Jupyter notebook interface start up in the home directory. It only takes one step to change the user interface – simply replace the ‘tree’ at the end of the URL with ‘lab’.
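For example, a notebook instance URL changes roughly as follows (the exact hostname depends on your instance name and region, so treat this as illustrative):

https://my-notebook.notebook.us-west-2.sagemaker.aws/tree
https://my-notebook.notebook.us-west-2.sagemaker.aws/lab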

3 Reasons to Use Jupyter Lab:
Below we list three of our favourite features of Jupyter Lab in no particular order. For more details and features, visit the user documentation at http://jupyterlab.readthedocs.io/en/latest/.
Dark Theme
A dark theme is now built into the interface. We also expect custom themes and extensions from the community in the near future.

Multiple Notebooks and Windows
Having multiple notebooks snapped to a bash terminal or python console facilitates a modular environment that is entirely customisable to the user or task.

Live Markdown Editor
A live markdown editor is included in Jupyter Lab, enabling users to preview their edits in real time.


GDPR and its Implications for Australia and Machine Learning

What is GDPR?
The General Data Protection Regulation (GDPR) is a regulation in EU law that was adopted by the European Parliament in April 2016 and became enforceable on May 25th, 2018. The regulation applies to the collection, processing, and movement of personal data of individuals residing in the European Union. Non-compliance can result in a penalty of up to €20,000,000 or 4% of annual global turnover, whichever is higher. Furthermore, GDPR extends to all companies that hold information on data subjects in the EU. This includes Australian businesses.

In short, with regard to machine learning, GDPR introduces a significant compliance burden for automated decision-making (i.e. machine learning) systems. It outlines three circumstances in which automated decisions are lawful:

Where it is necessary for contractual reasons;
Where it is separately authorised by another law;
When the data subject (individual) has explicitly consented.

Moreover, the legislative text describes a general ‘right to explanation’ of automated decision-making processes. In greater detail, data subjects are entitled to an explanation of the automated decision-making systems after the decisions are made. Data subjects are further entitled to contest those decisions.
How Does it Affect Australian Businesses?
The GDPR will apply to Australian businesses that:

Have an establishment in the EU (regardless of whether they process personal data in the EU); or
Do not have an establishment in the EU, but offer goods and services, or monitor the behaviour of individuals (through sensitive personal data) in the EU.

For example, an Australian business will need to comply with GDPR if it:

Ships products to individuals in the EU;
Sells a health gadget that can monitor the behaviour of an individual in the EU;
Deals with the personal information of an individual in the EU (for example, an EU citizen living in Australia obtains tax advice from a local accountant).

What Does This Mean for Machine Learning?
One of the main legislative points affecting the application of machine learning in businesses is the aforementioned ‘right to explanation’ for all EU individuals affected by automated decision-making systems. In short, if an outcome of interest to an individual is decided using their data, they have a right to know why the machine learning model made that decision.

A common first reaction is that GDPR and further data privacy legislation will constrain the application of more accurate black-box machine learning models. This is true to a certain degree. Naïve businesses and machine learning practitioners can no longer focus purely on the accuracy of their models while neglecting a holistic and individual-level understanding of the algorithm, ignoring the quirks and potential biases that a business’s dataset may bring.

Although the application of machine learning may initially be constrained by higher compliance costs, the need for greater model interpretability is an exciting shift for both academia and industry. Existing research and methodologies are able to facilitate both global interpretability (understanding the aggregate relationships between the factors and the recommended decision) and local interpretability (understanding the individual factors that led to a decision being made) for almost all areas of the supervised machine learning spectrum.
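As a deliberately simple illustration of global interpretability, the sketch below estimates permutation feature importance for a fitted classifier: shuffle each feature in held-out data and measure how much accuracy drops. Dedicated tooling such as LIME or SHAP goes much further, particularly for local explanations; the data here is synthetic.

# Simple global-interpretability sketch: permutation feature importance.
# Shuffle one feature at a time and measure the drop in held-out accuracy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=6, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
baseline = model.score(X_test, y_test)

rng = np.random.default_rng(0)
for j in range(X_test.shape[1]):
    X_shuffled = X_test.copy()
    X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])  # break the feature's link with y
    drop = baseline - model.score(X_shuffled, y_test)
    print(f"feature {j}: accuracy drop {drop:.3f}")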

Model interpretability is not a new concept. Researchers have long noted that the lack of understanding of black-box models is not something to simply accept. For example, a team of researchers looking to understand their image classifier found that their convolutional neural network was identifying ‘dog’ and ‘wolf’ based solely on the snow in the background, rather than on the visual attributes of the animals.
Our View
Rather than constraining machine learning, GDPR will accelerate research in model interpretability and mature the overall data science industry.


Contact us today if you are interested in better understanding your machine learning algorithms and their role in data-driven decision making.

References
https://www.eugdpr.org

https://www.oreilly.com/ideas/how-will-the-gdpr-impact-machine-learning