Amazon Refreshes AI Story With New Chips, Models And Platform Tools
Summary:
- At its annual re:Invent conference in Las Vegas, Amazon’s Web Services (AWS) business announced a range of products and services focused primarily on enhancing the company’s existing offerings rather than adding completely new ones.
- AWS refined its tools and bundled a number of existing products and services in ways that take major steps toward simplifying the process of creating and deploying GenAI technologies.
- New AWS CEO Matt Garman announced the general availability of the Trainium 2 chip and of EC2 compute instances that use those chips for AI training and inferencing workloads. He also previewed the development of Trainium 3, sharing some initial specs.
- Garman unveiled a number of enhancements to Amazon’s SageMaker and Bedrock platforms, including the announcement of SageMaker Unified Studio, which brings together a number of previously independent AWS services into a single UI.
One thing that’s become very clear when it comes to Generative AI (GenAI) is that we’re still in the early days of the technology, and major evolutions and refinements of existing products are going to be a standard part of the tech industry news cycle for some time to come.
At its annual re:Invent conference in Las Vegas, Amazon’s (NASDAQ:AMZN) Web Services (AWS) business provided a clear example of this trend, with product and service announcements primarily focused on enhancements to the company’s existing range of offerings as opposed to completely new additions. To be clear, there were certainly a few genuinely new entries in the firehose of announcements that has become the hallmark of AWS keynotes – particularly in regard to foundation models. Even there, however, the headline entry was arguably a newly branded replacement for an existing product.
Part of the reason for this approach is that big tech companies like Amazon did a pretty good job of initially defining and creating an overall approach and set of offerings for enabling GenAI at a high level. As time has passed, however, there’s a growing awareness that these tools and processes weren’t completely meeting the needs of many customers. Put simply, it was – and still is – too hard for most companies to leverage the capabilities of GenAI with what was available.
With that mindset in place, AWS took it upon itself to fill in the gaps at this year’s re:Invent. The company refined its tools and bundled a number of existing products and services in ways that take major steps toward simplifying the process of creating and deploying GenAI technologies. Plus, it did so for companies at many different levels of technological sophistication. Importantly – and impressively – it also did so across a huge range of offerings, from custom silicon to foundation models, database enhancements, developer tools and software platforms.
Starting, appropriately enough, at the base silicon level, new AWS CEO Matt Garman kicked off his keynote by discussing the company’s major investments and successes in custom chips over the last decade or so. He highlighted the company’s now-prescient decision to invest in Arm-based CPUs with its Graviton chip and shared that the Graviton-based business is now larger than the entire AWS compute business was when Graviton was first introduced. He went on to announce the general availability of the Trainium 2 chip and of EC2 compute instances that use those chips for AI training and inferencing workloads.
Taking things a step further, he even claimed that Trainium 2 is the first truly viable alternative to Nvidia (NVDA) GPUs – most importantly, at a significantly lower cost of operation. Whether those claims hold up remains to be seen, but initial discussions around the chip’s architecture suggest it’s a huge improvement over the first-generation Trainium. Interestingly, he also previewed the development of, and some initial specs for, Trainium 3, highlighting how committed the company is to the ongoing development of custom silicon. Despite its own silicon efforts, however, AWS also made it clear that Nvidia remains a critical partner, officially announcing that new EC2 instances with Nvidia’s Blackwell GPUs are coming soon.
Of course, a huge part of any GenAI compute system – regardless of its underlying silicon – is the capability of the software tools used to build and fine-tune the models and applications that run on that hardware. In this area, Garman unveiled a number of enhancements to Amazon’s SageMaker and Bedrock platforms, including the announcement of SageMaker Unified Studio, which brings together a number of previously independent AWS services into a single UI.
Building on its history as a tool for data scientists and early machine learning models, SageMaker has taken on an important role in the GenAI era as a means to develop, train and fine-tune foundation models. Not surprisingly, SageMaker now offers enhancements that can take full advantage of the new hardware capabilities in Trainium 2, positioning the combination as a serious alternative to Nvidia’s GPUs and its CUDA software. Enhancements to Bedrock, which is positioned as a higher-level tool more focused on GenAI app developers who want to tap into existing foundation models, include the addition of several well-known models and a new Bedrock Marketplace for an even wider range of options.
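To make “tapping into existing foundation models” a bit more concrete, here is a minimal sketch of building a request for Bedrock’s Converse API with the AWS SDK for Python. The model ID, prompt and token limit are illustrative assumptions, not details from the keynote; check the Bedrock console for the models actually available in your account and region.

```python
# Hedged sketch: constructing a request for Amazon Bedrock's Converse API.
# The model ID below is an illustrative example, not a recommendation.
def build_converse_request(model_id, prompt, max_tokens=256):
    """Return keyword arguments for boto3's bedrock-runtime converse() call."""
    return {
        "modelId": model_id,
        "messages": [
            # Converse uses role/content message lists, similar to chat APIs.
            {"role": "user", "content": [{"text": prompt}]}
        ],
        "inferenceConfig": {"maxTokens": max_tokens},
    }

req = build_converse_request("amazon.nova-lite-v1:0",
                             "Summarize re:Invent in one line.")

# With AWS credentials configured, the request would be sent like this:
#   import boto3
#   client = boto3.client("bedrock-runtime")
#   response = client.converse(**req)
#   print(response["output"]["message"]["content"][0]["text"])
```

Because the same request shape works across Bedrock-hosted models, swapping providers is largely a matter of changing the `modelId` string.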
Two of the most intriguing additions to Bedrock are a model distillation feature and a method for reducing hallucinations. The Bedrock Model Distillation capability provides a way to dramatically reduce the size of a large frontier model – for example, shrinking a 405-billion-parameter Llama model down to something as small as an 8-billion-parameter version – via specialized customization techniques. In essence, the end result is meant to be akin to what an organization can achieve with Retrieval Augmented Generation (RAG), but the process is very different, and the results are potentially even more effective. Automated Reasoning checks, a new technique in Bedrock Guardrails, use mathematically verifiable methods to dramatically reduce the problem of hallucinations in GenAI model outputs. While details on how the feature works were sparse, it certainly sounded like a potentially very important breakthrough.
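Amazon hasn’t published the internals of Bedrock’s distillation feature, but the generic technique it builds on – training a small “student” model to mimic a large “teacher’s” output distribution – can be sketched in a few lines. Everything below (the toy linear models, the temperature value, the training loop) is purely illustrative of classic knowledge distillation, not Amazon’s method.

```python
# Minimal, generic knowledge-distillation sketch (NOT Amazon's implementation):
# a small "student" learns to match a larger "teacher's" softened outputs.
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T          # temperature softens the distribution
    z = z - z.max(axis=-1, keepdims=True)       # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))                    # toy inputs
W_teacher = rng.normal(size=(8, 4))             # stand-in for a large frontier model
W_student = np.zeros((8, 4))                    # smaller, untrained model
T = 2.0                                         # distillation temperature

teacher_probs = softmax(X @ W_teacher, T)       # soft labels from the teacher

def kl_loss(W):
    # Mean KL divergence between teacher and student output distributions.
    p = softmax(X @ W, T)
    return np.mean(np.sum(teacher_probs * (np.log(teacher_probs) - np.log(p)), axis=-1))

loss_before = kl_loss(W_student)
lr = 0.5
for _ in range(200):
    student_probs = softmax(X @ W_student, T)
    # Gradient of the KL loss w.r.t. the student's weights.
    grad = X.T @ (student_probs - teacher_probs) / (T * len(X))
    W_student -= lr * grad
loss_after = kl_loss(W_student)                 # student now tracks the teacher closely
```

The practical appeal is the same one AWS is pitching: once distilled, the small model serves requests far more cheaply than the frontier model it learned from.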
Bedrock also now incorporates some of the same kinds of fine-tuning capabilities found in SageMaker, but abstracted to a higher level. The net result is that Bedrock is now a much more capable platform. At the same time, its overlap with some of SageMaker’s features can create confusion over which tool is best suited for a given task or type of user.
To be fair, Amazon faced the same kind of confusion over the roles of SageMaker, Bedrock and its Q agent capabilities when it first introduced Q at last year’s re:Invent (see “The Amazon AWS GenAI Strategy Comes With A Big Q” for more). Since then, I believe the company has improved the positioning of each option in its development stack, but it’s still extremely complex and worthy of even more simplification and clarification in its messaging.
To better address the challenges companies face in organizing their data for ingestion into GenAI foundation models, AWS also made a number of important enhancements to its various S3 storage and database offerings. Two in particular stood out: support for managed Apache Iceberg data tables for faster data lake analytics, and the automatic creation of searchable metadata. Collectively, these and many other related announcements highlight the company’s continued commitment to improving the data wrangling and organization process.
For developers, Amazon announced several important new capabilities in Amazon Q Developer that offer AI-powered agentic features for writing new code, modernizing older Java and mainframe code, automating the code documentation process, and more.
Two of the biggest surprises from the AWS keynote were the return of former AWS CEO (and now Amazon CEO) Andy Jassy and his announcement of the company’s completely new line of foundation models, branded Nova. The range includes four tiers of multimodal models plus one model each dedicated to image and video generation. In conjunction with the new Trainium chip, the SageMaker and Bedrock platform enhancements, and the improved range of database capabilities, the new Amazon Nova models complete a full suite of GenAI offerings that the company believes positions it very capably as a full-solution provider for GenAI. Certainly, at first glance, it seems like a defensible position. However, the fact that these new models will end up replacing Amazon’s previous Titan family of models – which the company positioned as a key part of its AI strategy not that long ago – does muddy the messaging a bit.
Still, as we have seen from others, things in the world of GenAI do evolve at an extraordinarily rapid pace. Discussions with AWS representatives suggested that they essentially found a much better and more performant architecture and approach with Nova than they did with Titan. As a result, they had to make the difficult decision to move away from Titan and restart with Nova. It’s bound to raise a few eyebrows for companies and developers who started working with Titan, but such is the nature of the GenAI beast in today’s dynamic environment.
As I walked away from the event, I couldn’t help but be impressed by the comprehensive range of enhancements that AWS has clearly made to its initial round of tools and services. While the evolution will undoubtedly continue, as we migrate from the era of GenAI POCs to enterprise-wide GenAI deployments, having access to a full suite of tools from a major cloud computing provider that addresses a number of early pain points is bound to be an important step forward.
Disclaimer: Some of the author’s clients are vendors in the tech industry.
Disclosure: None.
Editor’s Note: The summary bullets for this article were chosen by Seeking Alpha editors.