Machine Learning Inference Chips, Custom Designed By Amazon
Amazon Web Services (AWS), Amazon’s cloud business, is no longer sticking to software alone; it is also building its own chips. At its ongoing “AWS re:Invent Conference” in Las Vegas, it unveiled two new chips and 13 new machine learning capabilities and services, aiming to attract more developers to AWS by broadening its range of tools and services. The chips, it claims, will provide A.I. researchers "high performance at low cost."
The first chip, called “Inferentia”, is a machine learning inference chip designed to deliver high performance when finding patterns in large amounts of data. AWS Inferentia provides high-throughput, low-latency inference performance. Each chip delivers hundreds of TOPS (tera operations per second) of inference throughput, allowing complex models to make fast predictions. For even more performance, multiple AWS Inferentia chips can be used together to drive thousands of TOPS of throughput. AWS Inferentia will support the TensorFlow, Apache MXNet, and PyTorch deep learning frameworks, as well as models that use the ONNX format, and will be available for use with Amazon SageMaker, Amazon EC2, and Amazon Elastic Inference. AWS has also introduced Elastic Inference, which allows developers to size their inference acceleration capacity and cloud deployments according to workload. According to AWS, developers can reduce inference costs by up to 75% by attaching GPU-powered inference acceleration to Amazon EC2 and Amazon SageMaker instances.
AWS hasn’t stopped there. It has also announced a second chip, this one Arm-based, that represents an alternative to traditional computing processors from chipmakers like Intel. Across the clouds, and in corporate on-premises data centres, customers' computing workloads often run on Intel-based chips. The chip, dubbed “Graviton” and introduced as a potential alternative to Intel, is akin to an Arm-based AMD chip. The CPU core is based on Arm’s 2015 Cortex-A72 design and is clocked at 2.3GHz. It is a 64-bit Armv8-A, non-NUMA processor with floating-point math that supports SIMD, AES, SHA-1, SHA-256, and GCM, plus hardware acceleration of the CRC-32 algorithm.
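For developers wondering whether a given machine is running a 64-bit Arm processor like the one described above rather than an Intel-style x86 chip, a quick check is possible from Python's standard library. The following is only a minimal sketch: the helper name `is_arm64` and the set of recognised names are illustrative assumptions, not anything published by AWS, and the check says nothing about which specific Arm chip is present.

```python
import platform

# Names commonly reported for 64-bit Arm machines
# (Linux typically reports "aarch64"; some other systems report "arm64").
# This set is an assumption for illustration and may not be exhaustive.
ARM64_NAMES = {"aarch64", "arm64"}


def is_arm64(machine: str) -> bool:
    """Return True if the machine string names a 64-bit Arm architecture."""
    return machine.strip().lower() in ARM64_NAMES


if __name__ == "__main__":
    # platform.machine() returns the hardware name reported by the OS.
    machine = platform.machine()
    label = "64-bit Arm" if is_arm64(machine) else "not 64-bit Arm"
    print(f"{machine}: {label}")
```

Software that ships native binaries can use a check like this to decide which build to load, which is one of the practical concerns when moving workloads from Intel-based to Arm-based servers.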
It has been an interesting trend over the last few years: software giants are entering the tricky business of AI chip design. Amazon’s launch of these chips echoes Google’s AI chip, the Tensor Processing Unit (TPU). Since 2016, Google has introduced new TPU chips that compete with NVIDIA for training AI models. Last year Microsoft demonstrated its Windows Server operating system running on Arm servers, but Arm-based computing is not currently available from Microsoft's Azure public cloud. Microsoft has also partnered with Qualcomm to create an AI developer kit enabled by Azure. IBM has its Power9 processor chips, which it sells to third-party manufacturers and to cloud vendors, and it offers the Power9-based AC922 server on the IBM Cloud, but it currently has no plans for AI chips. Earlier this year, Alibaba announced that it is developing its own AI chip, called the “Ali-NPU”, and that the chips will become available for anyone to use through its public cloud.
AWS is by far the leader in public cloud infrastructure, but its advantage is shrinking as Microsoft, Google, and IBM compete with it for business while companies move their workloads from traditional data centres to the cloud. The increasing number of cloud vendors in the AI chip market shows how rapidly that market is expanding. It also illustrates just how competitive the cloud business has become, and how important innovation is for players who want to stay in the race.
With the rising influence of AI, cloud vendors will try to provide the infrastructure for AI in order to attract business. Many AI training needs can be abstracted away by cloud providers, just as the cloud business did for the server industry. This removes the hassle of infrastructure setup and large start-up costs, lowering the barrier for companies trying AI projects. It has the potential to democratize AI in the same way that the cloud-as-a-service model did for various online businesses.
AI infra-as-a-service could be the new game changer!