Optimising AI with Multiple Objectives

Why AI Efficiency is Critical for Scaling AI


Introduction

Artificial Intelligence (AI) is an iterative optimisation process to achieve multiple objectives.

In the real business world, where resources are constrained, there are many trade-offs to tackle before deploying AI in business processes. These include accuracy, model complexity, explainability, running speed and cost.

Let’s simplify Optimal AI as the formula below, where F is the optimal AI model, X is a specific use case, and each fi(X), for i = 1 to m, is the value of one objective:

F = {f1(X), f2(X), …, fm(X)}.

In each AI use case X, there is an ideal target value for each objective. For example, an AI model running on a smart watch to predict heart attacks may require:
1) Y1 Accuracy = f1(X) > 98%

2) Y2 Speed (time it takes to generate a prediction) = f2(X) < 1 millisecond

3) Y3 Energy Consumption = f3(X) < XX w/h, to avoid draining the battery of the device

4) Y4 Explainability = f4(X) = High
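The targets above can be read as a checklist that a candidate model either meets or misses. The following is a minimal sketch in Python, not a definitive implementation. The `Measurement` class, the `unmet_targets` helper and the two candidates are hypothetical and exist only for illustration. The energy target is left out because the text does not give its threshold.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Measurement:
    """Measured objective values for one candidate model (hypothetical values)."""
    accuracy: float        # fraction in [0, 1]
    latency_ms: float      # time to generate one prediction
    explainability: str    # "low", "medium" or "high"


# One predicate per objective, mirroring targets 1, 2 and 4 in the list above.
TARGETS: Dict[str, Callable[[Measurement], bool]] = {
    "accuracy > 98%": lambda m: m.accuracy > 0.98,
    "latency < 1 ms": lambda m: m.latency_ms < 1.0,
    "explainability = high": lambda m: m.explainability == "high",
}


def unmet_targets(m: Measurement) -> List[str]:
    """Return the names of the targets that the candidate fails to meet."""
    return [name for name, ok in TARGETS.items() if not ok(m)]


candidate_a = Measurement(accuracy=0.991, latency_ms=0.7, explainability="high")
candidate_b = Measurement(accuracy=0.995, latency_ms=3.2, explainability="low")

print(unmet_targets(candidate_a))  # empty list: every target is met
print(unmet_targets(candidate_b))  # misses the latency and explainability targets
```

Note that candidate_b is the more accurate of the two, yet it is the one that fails: a higher score on one objective does not compensate for a missed target on another.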



The optimal AI model is the one that satisfies all of these criteria simultaneously. Of these criteria, accuracy and explainability receive most of the attention today. However, to scale AI across different business units, clouds and devices, efficiency is critical. As most companies are only just starting their first AI projects, they tend to overlook AI efficiency.

In this article, we will answer the key questions that every business should ask about AI efficiency. The answers will allow them to improve their AI strategies with efficiency in mind, to deliver optimal business outcomes. Thus, the questions this article answers are:

  1. Why is AutoML not enough?
  2. What is AI efficiency?
  3. Why is AI efficiency important?
  4. Why is AI efficiency difficult?
  5. How does TurinTech optimise AI with multiple objectives?


1. Why is AutoML Not Enough?

As AI is starting to show its true value, more and more organisations are starting to adopt it, but implementing it in real-world scenarios has proven more difficult than anticipated.

The arrival of automated machine learning (AutoML) makes AI more widely accessible. AutoML allows both ML and non-ML experts to automatically build ML models to handle different predetermined tasks without the need for domain knowledge. This makes the whole process of creating ML systems easier and quicker. However, AutoML alone may create accurate AI, but not efficient AI that can satisfy multiple objectives and scale in the real business world.

As we explain in the Introduction, AI is a multi-objective optimisation problem. AI built on AutoML platforms needs to meet the specific restrictions of its application environment before it can deliver value. For example, some AI models need very low latency to predict results in real time; some need a small memory footprint to run on small devices; some need to be more energy-efficient and environmentally friendly, so that the company won’t break the bank while training and running AI at scale.

Therefore, for AI to drive real business value and truly scale, AutoML is not enough. AI needs to be efficient to satisfy multiple objectives and constraints, without overusing expensive computational resources.



2. What is AI Efficiency?

The real business world is resource constrained.

Efficient AI means models that can run fast to achieve a specific capability using fewer computing resources. Two factors are critical for AI efficiency: time and space.

    Time Efficiency refers to the execution time an AI model needs to accomplish its tasks: the faster the better.

    Space Efficiency refers to the memory an AI model needs to complete work on a set of data: the less the better.
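Both factors can be measured directly. The sketch below uses Python's standard `time` and `tracemalloc` modules to record the elapsed time and peak memory of a single call. The `measure` helper and the `toy_predict` workload are hypothetical stand-ins for a real model's prediction step, and `tracemalloc` only tracks memory allocated by Python itself, so the figures are indicative rather than exact.

```python
import time
import tracemalloc


def measure(fn, *args):
    """Run fn(*args) once; return (result, elapsed_seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()  # returns (current, peak)
    tracemalloc.stop()
    return result, elapsed, peak


def toy_predict(n):
    """Stand-in for a model's prediction step (hypothetical workload)."""
    return sum([i * i for i in range(n)])


result, elapsed, peak = measure(toy_predict, 100_000)
print(f"time: {elapsed * 1000:.2f} ms, peak memory: {peak / 1024:.1f} KiB")
```

Tracing memory slows execution down, so in practice time and memory are often measured in separate runs.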






AI Efficiency Will Differentiate Business Leaders from Followers.

AI is computationally expensive. Currently, most AI models reside on a remote cloud server or in a giant data centre. Data often has to travel long distances from the device to the data centre and back before useful information is returned. Consequently, cloud-based AI models have higher latency, cost more to implement, lack autonomy, and are often less personalised.

As more devices are embedded with sensors, software, and other technologies for connecting and exchanging data with other devices and systems, AI is shifting from the cloud to the Internet of Things (IoT), where ‘things’ are physical devices such as mobile phones, smart watches and vehicles. But the biggest roadblock for on-device AI is efficiency. These edge devices have significant resource constraints in terms of memory and computing power. If a model is very slow at inference or requires too much memory, it is not practical for edge computing.

“Efficiency will allow computing to move from data centres to edge devices like smartphones, making AI accessible to more people around the world; shifting computation from the cloud to personal devices reduces the flow, and potential leakage, of sensitive data; and processing data on the edge eliminates transmission costs, leading to faster inference with a shorter reaction time, which is key for interactive driving and augmented/virtual reality applications.”

Bilge Yildiz, MIT Professor of Nuclear Science and Engineering, and Professor of Materials Science and Engineering

Therefore, only AI that runs fast and efficiently uses hardware can truly scale in the real world. And only those businesses that efficiently scale AI can gain competitive advantages.



Figure 1: Various benefits of on-device AI

Source: qualcomm.com



3. Why is AI Efficiency Important?

In the real world, deploying AI is intrinsically a multi-objective problem in which efficiency plays a crucial role. High accuracy makes an AI model desirable, but high efficiency enables businesses to deploy it into production and unlock its full potential. Adopting AI applications and developing them iteratively is capital-intensive and energy-consuming. Therefore, it is important to invest in AI technologies that will drive higher ROI with less energy consumption.

3.1 Efficiency is Money



High AI efficiency can not only boost profits but also reduce costs.

Efficient AI runs fast, with low latency. Even one millisecond can be monumental. In high-frequency trading, a 1-millisecond advantage in latency can be worth upwards of $100 million per year. To put that in context, a blink of an eye takes about 300 milliseconds.

In the competitive e-commerce world, AI efficiency means successful customer conversion, which directly drives revenue. According to Booking.com, an increase of about 30% in AI latency costs more than 0.5% in conversion rate. A typical conversion rate for an e-commerce website is about 2%, so if that 0.5% is read as an absolute drop of half a percentage point, high latency may cut conversions, and with them revenue, by 25% (0.5 / 2), which can amount to millions of dollars per day.

With over 1.5 million room nights reserved on Booking.com each day, even a fraction-of-a-percent increase in conversion can make a big difference to profit.
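The arithmetic behind the 25% figure can be made explicit. The sketch below assumes that the 0.5% cost is an absolute drop in percentage points against a 2% baseline conversion rate. If the 0.5% is instead a relative change, the impact is far smaller. The function name is illustrative only.

```python
def relative_conversion_loss(baseline_rate, absolute_drop):
    """Fraction of conversions lost when the rate falls by absolute_drop."""
    return absolute_drop / baseline_rate


baseline = 0.02   # typical e-commerce conversion rate (2%), from the text
drop = 0.005      # 0.5 percentage points, read as an absolute drop (assumption)

loss = relative_conversion_loss(baseline, drop)
print(f"{loss:.0%} of conversions lost")  # 25% of conversions lost
```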

In addition, high AI efficiency results in low energy consumption. In 2020, US data centres spent $13 billion on energy. At that scale, a 1% improvement in efficiency corresponds to roughly $130 million in cost savings.

3.2 Efficiency is Customer Experience



Efficient AI helps businesses champion customer experience. AI is being deployed closer to the customer, from smart retail to health wearables.

Customers crave immediacy. Efficient AI can anticipate customer expectations faster and give what they need at the right time and the right place. Efficient AI can gauge customer preferences and interests quicker on real-time behaviour data, enabling real-time improvements in personalisation.

Furthermore, on-device AI efficiency drives customer experience by impacting device operating life. An image classification model can drain the smartphone battery within an hour. As more activities go mobile, longer battery life is a must for smartphone users.

3.3 Efficiency is Risk



Efficient AI can help enterprises identify potential risks quickly and so protect the bottom line. For instance, payment processors typically have only milliseconds to match account information and detect fraud. Slow AI-powered fraud detection applications can mean billions in losses from undetected payment fraud.

When it comes to healthcare, AI efficiency can be a matter of life and death. For a patient with critical and time-sensitive needs, slow AI-embedded healthcare applications could have a dramatically negative impact and increase the risk of death in emergency situations.

3.4 Efficiency is Carbon Emission



Developing AI algorithms involves using servers that consume a large amount of electricity. As AI usage has increased dramatically in recent years, its electricity consumption and CO2 emissions have become a great concern for the environment.

“We need to rethink the entire stack — from software to hardware. Deep learning has made the recent AI revolution possible, but its growing cost in energy and carbon emissions is untenable.”

Aude Oliva, MIT director of the MIT-IBM Watson AI Lab

A study conducted by researchers at the University of Massachusetts, Amherst, found that training one large model can emit more than 626,000 pounds of CO2, while another found that Google’s AlphaGo Zero (the Go-playing AI system) released 192,000 pounds of CO2 during 40 days of training. This is the equivalent of 1000 hours of air travel.



4. Why is AI Efficiency Difficult?

4.1 Optimisation Targets Are Dynamic Rather than Static

Unlike traditional software, which is only subject to code changes, AI solutions are subject to changes on three axes: the code, the model, and the data. These make solving the Optimal AI formula a complex, difficult and continuous process. Many different aspects have to be considered, including model selection, model tuning, data volume and system architecture. Optimisation requires finding the right balance across all these aspects.

As a result, manual optimisation is almost impossible.



Figure 2: the 3 axes of change in a Machine Learning application — data, model, and code — and a few reasons for them to change
Source: martinfowler.com

Finding the right balance across these aspects is a matter of trial and error. Even the most talented professionals in the data science field still need to tweak all those aspects to get to the final solution.

4.2 Hardware Efficiency Optimisation Is Not Sufficient

Efficient hardware is important for AI development. Just as humans need their bodies to function, AI needs a physical device to thrive. New hardware such as GPUs and TPUs is specifically designed to accelerate the training and performance of neural networks and to reduce power consumption. However, hardware alone is not enough to keep up with the abundance of more complex, power-hungry models.

Apart from hardware, there are two key factors driving AI efficiency: algorithmic improvement and code optimisation.

4.3 Algorithmic Improvement Is Advancing, But Not Good Enough

Currently, most efforts at making AI software more efficient have been done at the algorithmic level, either by using less complicated ML algorithms (such as random forests) or by making changes to existing ones so that they can utilise fewer resources.

A recent study from OpenAI shows that algorithmic advances have steadily reduced the computational power needed for training: compared with 2012, it now takes 44 times less compute to train a neural network to the same level of performance. By contrast, Moore’s law alone would have yielded only an 11x cost improvement over the same period.
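To see how the two figures compare, the sketch below converts them into rates. It assumes the comparison spans seven years (2012 to 2019) and that Moore’s law means a doubling every two years. Both are assumptions made here for illustration and are not stated in the text above.

```python
import math

years = 7                 # assumed span of the comparison, 2012 to 2019
algorithmic_gain = 44     # "44 times less" compute, from the text

# How often must the required compute halve to reach a 44x reduction?
halvings = math.log2(algorithmic_gain)
months_per_halving = years * 12 / halvings

# Moore's law, assumed here to be a doubling every two years.
moore_gain = 2 ** (years / 2)

print(f"compute halves every {months_per_halving:.1f} months")   # about 15.4
print(f"Moore's law over the same span: {moore_gain:.1f}x")      # about 11.3x
```

Under these assumptions, algorithmic progress halves the required compute roughly every 15 to 16 months, noticeably faster than the two-year cadence assumed for hardware.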

However, some challenges must be addressed before models can be moved to, and trained on, edge devices. For instance, algorithms are often designed in isolation from the hardware they run on, and metrics rarely take resource limitations into account. AI needs to be hardware-aware to unlock optimal efficiency.

4.4 Code Optimisation Is the Secret Sauce

So far, progress in the field of optimisation has mostly focused on achieving better accuracy and rate of convergence at the expense of memory usage and compute resources. However, different devices have different constraints (e.g. memory, battery life). AI needs to be hardware-aware and satisfy the multiple requirements of a specific device. This requires AI to be tailored for each device to deliver optimal performance.

To make AI hardware-aware, code optimisation is the secret sauce. TurinTech’s research in code optimisation has shown that by removing inefficiencies in AI code, users can significantly increase AI efficiency on specific hardware, without compromising accuracy and other metrics.

Powered by this proprietary research, TurinTech has built a platform to automate the whole lifecycle of building and optimising AI. This enables businesses to automatically generate hardware-aware AI, with tailored algorithm and tailored code for a specific device.

By complementing hardware optimisation with TurinTech’s optimisation at both algorithm and code level, businesses will be able to achieve optimal AI that delivers superior prediction accuracy, execution speed or other user-specific metrics.



Figure 3: Co-optimised solution for Efficient AI



5. How Does TurinTech Optimise AI with Multiple Objectives?

To unlock the full potential of AI, businesses need to continuously optimise AI for ever-changing business and technical objectives, within a short time frame and with resource constraints.

At TurinTech, we automate AI optimisation to make efficient AI scalable. Our EvoML platform enables businesses to automatically build, optimise and deploy models within days. These smart and efficient AI models run faster anywhere, both in the cloud and on devices, without compromising accuracy or other business metrics. Thus, businesses can easily scale AI across multiple clouds and edge devices.



5.1 Evolving for Optimal Model

Inspired by Darwin’s theory of evolution, and based on our proprietary research, EvoML creates and evolves thousands of candidate models. These models can then learn and optimise themselves based on interaction with the business environment, which is analogous to natural selection. The models evolve over multiple generations, and only the optimal models for your use case survive. This evolutionary approach not only creates better models, but also speeds up model development by more than 30%.
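The general idea of evolving candidates under several objectives can be illustrated with a toy example. The sketch below is not EvoML's algorithm, which is proprietary and not described here. It is a generic evolutionary loop with Pareto selection, in which each "model" is a hypothetical pair of numbers (complexity, waste) and the two objective functions are invented purely for illustration.

```python
import random


def evaluate(genome):
    """Toy objectives for a hypothetical model: (accuracy, latency_ms)."""
    complexity, waste = genome
    accuracy = 0.70 + 0.29 * complexity                 # more capacity, more accuracy
    latency_ms = 1.0 + 9.0 * complexity + 5.0 * waste   # waste only adds latency
    return accuracy, latency_ms


def dominates(a, b):
    """True if a is at least as good as b on both objectives and better on one."""
    (acc_a, lat_a), (acc_b, lat_b) = a, b
    return acc_a >= acc_b and lat_a <= lat_b and (acc_a > acc_b or lat_a < lat_b)


def pareto_front(population):
    """Keep only the genomes that no other genome dominates."""
    scored = [(g, evaluate(g)) for g in population]
    return [g for g, s in scored if not any(dominates(t, s) for _, t in scored)]


def mutate(genome, rng, step=0.1):
    """Perturb each gene slightly, clamped to [0, 1]."""
    return tuple(min(1.0, max(0.0, x + rng.uniform(-step, step))) for x in genome)


def evolve(generations=30, size=40, seed=0):
    rng = random.Random(seed)
    population = [(rng.random(), rng.random()) for _ in range(size)]
    for _ in range(generations):
        survivors = pareto_front(population)             # selection
        if len(survivors) > size:                        # keep the population bounded
            survivors = rng.sample(survivors, size)
        children = [mutate(rng.choice(survivors), rng) for _ in range(size)]
        population = survivors + children                # next generation
    return pareto_front(population)


front = evolve()
for genome in sorted(front, key=lambda g: evaluate(g)[1])[:3]:
    acc, lat = evaluate(genome)
    print(f"accuracy={acc:.3f} latency={lat:.2f} ms")
```

The result is a set of trade-offs rather than a single winner: each surviving candidate is better than every other survivor on at least one objective, and variants that add latency without adding accuracy tend to be eliminated.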

5.2 Multi-objective Optimisation On-demand

As we mentioned earlier, AI performance drifts continuously as a result of changes in the code, the model, and the data. Being able to optimise AI and deploy the latest models in production is critical to operational success.

TurinTech continuously monitors the model performance, detects outdated models and triggers retraining. TurinTech’s multi-objective optimisation enables businesses to tackle difficult trade-offs between accuracy and performance, with the purpose of rolling out AI models to various clouds and devices at scale. Thus, businesses can always have the optimal model available at speed, even under dynamic circumstances.
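As a rough illustration of the monitor-and-retrain loop described above, the sketch below tracks rolling accuracy over a window of recent predictions and flags the model when it falls below a threshold. The `AccuracyMonitor` class, the window size and the threshold are hypothetical. This shows the general pattern only and does not describe how TurinTech's platform works.

```python
from collections import deque


class AccuracyMonitor:
    """Track rolling accuracy over the last `window` predictions (hypothetical helper)."""

    def __init__(self, window=100, threshold=0.9):
        self.outcomes = deque(maxlen=window)   # oldest outcomes drop off automatically
        self.threshold = threshold

    def record(self, prediction, actual):
        """Store whether the latest prediction was correct."""
        self.outcomes.append(prediction == actual)

    def rolling_accuracy(self):
        """Share of correct predictions in the window, or None if it is empty."""
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else None

    def needs_retraining(self):
        """Flag the model once the window is full and accuracy is below the threshold."""
        full = len(self.outcomes) == self.outcomes.maxlen
        return full and self.rolling_accuracy() < self.threshold


monitor = AccuracyMonitor(window=5, threshold=0.8)
for prediction, actual in [(1, 1), (0, 0), (1, 1), (1, 0), (0, 1)]:
    monitor.record(prediction, actual)

print(monitor.rolling_accuracy())   # 0.6
print(monitor.needs_retraining())   # True
```

In a multi-objective setting the same pattern could track latency or memory alongside accuracy, with one threshold per objective.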

5.3 Optimise at AI Code Level

EvoML is the only platform that optimises AI at source code level. As mentioned above, hardware optimisation is not sufficient to build efficient AI, and algorithmic improvement is not yet good enough. An effective solution requires code optimisation.

Powered by our proprietary research in automatic code optimisation, EvoML automatically scans the code to identify inefficiencies. Those inefficiencies will then be replaced with optimised code, enabling AI to run faster in any given environment. Furthermore, EvoML provides the optimised code, allowing businesses to do further customisation to solve specific business problems.
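To give a sense of what a code-level inefficiency looks like, the sketch below shows a hand-written example: two functions that return the same result, one of which performs membership tests against a list and the other against a set. It is an illustrative example chosen here, not output from EvoML, and the function names and data are hypothetical.

```python
import timeit


def count_known_slow(items, known):
    """Baseline: a membership test against a list costs O(len(known)) per item."""
    return sum(1 for item in items if item in known)


def count_known_fast(items, known):
    """Same result, but a set gives average O(1) membership tests."""
    known_set = set(known)
    return sum(1 for item in items if item in known_set)


items = list(range(0, 6000, 3))   # multiples of 3
known = list(range(0, 6000, 2))   # even numbers

# Both versions must agree before the faster one can replace the slower one.
assert count_known_slow(items, known) == count_known_fast(items, known)

slow = timeit.timeit(lambda: count_known_slow(items, known), number=1)
fast = timeit.timeit(lambda: count_known_fast(items, known), number=1)
print(f"list lookup: {slow:.4f} s, set lookup: {fast:.4f} s")
```

The optimisation changes only how the answer is computed, not the answer itself, which is why accuracy is unaffected.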

Conclusion

2021 marks the start of a new decade with the pandemic accelerating digitalisation and AI adoption across industries. In the resource-constrained business world, only businesses that efficiently scale AI can win in this new decade.

Accuracy is not enough for moving AI from experiment to production level, and from cloud to devices. AI is an iterative optimisation process to achieve multiple objectives, where efficiency is crucial. In this article, we defined AI efficiency and explained its significant impacts on business. High AI Efficiency boosts profits, reduces cost, enhances customer experience, mitigates risks and minimises carbon emissions.

Building efficient AI is a complex, difficult, and continuous process. Most optimisation technologies work at the hardware or algorithm level. However, the results they can achieve are not enough for AI to become ubiquitous. There is still a big gap between current solutions and optimal AI efficiency, and that gap is growing rapidly.


TurinTech automates the end-to-end process of building, optimising and deploying AI. Our multi-objective optimisation enables businesses to build accurate and efficient AI in just a few clicks. Powered by our proprietary research in code optimisation, businesses can optimise AI at code level on-demand, to achieve optimal AI performance in dynamic business environments.

Think future, act now. Embed efficient AI to be ahead in the AI game.

Learn more about scaling AI at https://turintech.ai/





References:

https://martinfowler.com/articles/cd4ml.html
https://openai.com/blog/ai-and-efficiency/
https://arxiv.org/abs/1908.04909
https://arxiv.org/abs/2011.09926
https://medium.com/acing-ai/ml-ops-data-science-version-control-5935c49d1b76
https://www.qualcomm.com/news/onq/2020/06/12/we-are-making-ai-ubiquitous
https://towardsdatascience.com/from-cloud-to-device-the-future-of-ai-and-machine-learning-on-the-edge-78009d0aee9
https://booking.ai/150-successful-machine-learning-models-6-lessons-learned-at-booking-com-681e09107bec
https://booking.ai/how-booking-com-increases-the-power-of-online-experiments-with-cuped-995d186fff1d
https://arxiv.org/abs/1809.05476
https://www.ibm.com/watson/advantage-reports/future-of-artificial-intelligence/ai-innovation-equation.html
https://www.sciencedirect.com/topics/engineering/multiobjective-optimization-problem
https://www.wired.com/2012/04/netflix-prize-costs/
https://www.idevnews.com/stories/7322/In-2020-Low-Latency-Will-Differentiate-Business-Leaders-from-Laggards
https://moallemi.com/ciamac/papers/latency-2009.pdf
https://venturebeat.com/2020/04/23/mit-csails-ai-trains-models-in-an-energy-efficient-way/
https://medium.com/sciforce/ai-hardware-and-the-battle-for-more-computational-power-3272045160a6