Covering Disruptive Technology Powering Business in The Digital Age

Google Cloud Next ’23: AI-Optimised Infrastructure to Build Bold and Responsible Gen AI 

At Google Cloud Next ’23, Google Cloud announced a series of new partnerships and product innovations to empower every business and public sector organisation in Southeast Asia to easily experiment and build with large language models (LLMs) and generative AI (gen AI) models, customise them with enterprise data, and smoothly integrate and deploy them into applications with built-in privacy, safety features, and responsible AI (artificial intelligence).

Enhancements to Google Cloud’s Purpose-Built, AI-Optimised Infrastructure Portfolio

The capabilities and applications that make gen AI so revolutionary demand the most sophisticated and capable infrastructure. Google Cloud has been investing in its data centres and network for 25 years, and now has a global network of 38 cloud regions, with a goal to operate entirely on carbon-free energy 24/7 by 2030. This global network includes cloud regions in Indonesia and Singapore, with new cloud regions coming to Malaysia and Thailand.

Building on this, Google Cloud’s AI-optimised infrastructure is the leading choice for training and serving gen AI models, with more than 70% of gen AI unicorns already building on Google Cloud, including AI21, Anthropic, Cohere, Jasper, Replit, Runway, and Typeface.

To help organisations in Southeast Asia run their most demanding AI workloads cost-effectively and at scale, Google Cloud today unveiled significant enhancements to its AI-optimised infrastructure portfolio: Cloud TPU v5e, now available in public preview, and the general availability of A3 VMs with NVIDIA H100 GPUs.

Training AI on Google Cloud

Cloud TPU v5e is Google Cloud’s most cost-efficient, versatile, and scalable purpose-built AI accelerator to date. Now, customers can use a single Cloud Tensor Processing Unit (TPU) platform to run both large-scale AI training and inferencing.

Cloud TPU v5e delivers up to two times higher training performance per dollar and up to 2.5 times higher inference performance per dollar for LLMs and gen AI models compared to Cloud TPU v4, making it possible for more organisations to train and deploy larger, more complex AI models. Cloud TPU v5e is currently available in public preview in Google Cloud’s Las Vegas and Columbus cloud regions, with plans to expand to other regions, including Google Cloud’s Singapore cloud region later this year.
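The per-dollar figures above translate directly into cost deltas for a fixed workload. A back-of-envelope sketch, using a hypothetical US$100,000 TPU v4 training bill (the baseline figure is illustrative, not from the announcement):

```python
# Back-of-envelope cost comparison from the stated "up to 2x training
# performance per dollar" and "up to 2.5x inference performance per dollar"
# figures for Cloud TPU v5e vs. v4. The $100,000 baseline is hypothetical.

v4_training_cost = 100_000       # dollars for a fixed training workload on TPU v4
training_gain = 2.0              # up to 2x training performance per dollar on v5e
inference_gain = 2.5             # up to 2.5x inference performance per dollar on v5e

v5e_training_cost = v4_training_cost / training_gain
print(f"Same training workload on v5e: as little as ${v5e_training_cost:,.0f}")

v4_inference_cost = 10_000       # hypothetical monthly serving bill on TPU v4
v5e_inference_cost = v4_inference_cost / inference_gain
print(f"Same serving workload on v5e: as little as ${v5e_inference_cost:,.0f}")
```

Since both figures are "up to" claims, these numbers are best-case floors; actual savings depend on the model and workload shape.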

A3 VMs, supercomputers powered by NVIDIA’s H100 Graphics Processing Unit (GPU), will be generally available next month, enabling organisations to achieve three times faster training performance than A2, the prior generation. A3 VMs are purpose-built to train and serve especially demanding LLM and gen AI workloads. On stage at Google Cloud Next ’23, Google Cloud and NVIDIA also announced new integrations to help organisations utilise the same NVIDIA technologies employed over the past two years by Google DeepMind and Google research teams.

A Range of Infrastructure Advancements

Google Cloud also announced other key infrastructure advancements, including:

  • Google Kubernetes Engine (GKE) Enterprise. This enables the multi-cluster horizontal scaling required for the most demanding, mission-critical AI and machine learning (ML) workloads. Customers can now improve AI development productivity by leveraging GKE to manage large-scale AI workload orchestration on Cloud TPU v5e. In addition, GKE support for A3 VM with NVIDIA H100 GPU is now generally available.
  • Cross-Cloud Network. This is a global networking platform that helps customers connect and secure applications between clouds and on-premises locations. It is open and workload-optimised, which is crucial for end-to-end performance as organisations adopt gen AI, and it offers machine learning (ML)-powered security to deliver zero trust.
  • New AI offerings for Google Distributed Cloud (GDC). GDC is designed to meet the unique demands of organisations that want to run workloads at the edge or in their data centres. The GDC portfolio will bring AI to the edge, with Vertex AI integrations and a new managed offering of AlloyDB Omni on GDC Hosted.

“For two decades, Google has built some of the industry’s leading AI capabilities: from the creation of Google’s Transformer architecture that makes gen AI possible, to our AI-optimised infrastructure, which is built to deliver the global scale and performance required by Google products that serve billions of users like YouTube, Gmail, Google Maps, Google Play, and Android,” said Mark Lohmeyer, Vice President and General Manager, Compute and ML Infrastructure, at Google Cloud.

He added: “We are excited to bring decades of innovation and research to Google Cloud customers as they pursue transformative opportunities in AI. We offer a complete solution for AI, from computing infrastructure optimised for AI to the end-to-end software and services…”

Extending Enterprise-Ready Gen AI Development with New Models and Tools on Vertex AI

On top of Google Cloud’s world-class infrastructure, the company delivers Vertex AI, a comprehensive AI platform that enables customers to access, tune, and deploy first-party, third-party, and open-source models, and build and scale enterprise-grade AI applications. Building on the launch of gen AI support on Vertex AI, Google Cloud is now significantly expanding Vertex AI’s capabilities. These include:

  • Enhancements to PaLM 2. Thirty-eight languages, including Simplified Chinese, Traditional Chinese, Indonesian, Thai, and Vietnamese, are now generally available for PaLM 2 for Text and Chat—first-party models for summarising and translating text, and maintaining an ongoing conversation.
  • Enhancements to Codey. Improvements have been made to the quality of Codey, Google Cloud’s first-party model for generating and fixing software code, by up to 25% in major supported languages for code generation and code chat.
  • Enhancements to Imagen. Google Cloud introduced Style Tuning for Imagen, a new capability to help enterprises further align their images to their brand guidelines with 10 images or less. Imagen is Google Cloud’s first-party model for creating studio-grade images from text descriptions.
  • New models. Llama 2 and Code Llama from Meta, and Technology Innovation Institute’s Falcon LLM, a popular open-source model, are now available on Vertex AI’s Model Garden.
  • Vertex AI extensions. Developers can access, build, and manage extensions that deliver real-time information, incorporate company data, and take action on the user’s behalf.
  • Vertex AI Search and Conversation. Now generally available, these tools enable organisations to create advanced search and chat applications using their data in just minutes, with minimal coding and enterprise-grade management and security built in.
  • Grounding. Google Cloud announced an enterprise grounding service that works across Vertex AI Search and Conversation and foundation models on Vertex AI’s Model Garden, giving organisations the ability to ground responses in their own enterprise data and deliver more accurate answers.
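The internals of the managed grounding service are not described in the announcement, but the underlying pattern is retrieval augmentation: fetch relevant enterprise documents, then constrain the model's answer to that context. A minimal, self-contained sketch of that pattern (all function names, documents, and the keyword-overlap retriever are illustrative assumptions, not the Vertex AI implementation):

```python
# Illustrative retrieval-augmented ("grounding") sketch. All names and the
# naive keyword-overlap retriever are hypothetical; the managed Vertex AI
# grounding service is not shown here.

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by keyword overlap with the query, return the top k."""
    terms = set(query.lower().split())
    ranked = sorted(documents,
                    key=lambda d: -len(terms & set(d.lower().split())))
    return ranked[:k]

def grounded_prompt(query: str, documents: list[str]) -> str:
    """Prepend retrieved snippets so the model answers from enterprise data."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical enterprise documents.
docs = [
    "Our refund window is 30 days from delivery.",
    "Support is available weekdays 9am-6pm SGT.",
    "Shipping to Malaysia takes 3-5 business days.",
]
print(grounded_prompt("What is the refund window?", docs))
```

A production service would replace the keyword retriever with semantic search over an indexed corpus, but the contract is the same: the grounded prompt carries the enterprise context, so responses can be traced back to the organisation's own data.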

Enabling Complete Control

Google rigorously evaluates its models to ensure they meet its Responsible AI Principles. When using Vertex AI, customers retain complete control over their data: it does not need to leave the customer’s cloud tenant, is encrypted both in transit and at rest, and is not shared or used to train Google models.

“Equally important to discovering and training the right model is controlling your data. From the beginning, we designed Vertex AI to give you full control and segregation of your data, code, and intellectual property, with zero data leakage,” said Thomas Kurian, CEO at Google Cloud. “When you customise and train your model with Vertex AI . . . you are not exposing that data to the foundation model. We take a snapshot of the model, allowing you to train and encapsulate it together in a private configuration, giving you complete control over your data.”

Organisations across industries and around the world are already using Vertex AI to build and deploy AI applications, including affable.ai, Aruna, Bank Rakyat Indonesia, FOX Sports, GE Appliances, HCA Healthcare, HSBC, Jiva, Kasikorn Business-Technology Group Labs, KoinWorks, The Estée Lauder Companies, the Singapore Government, Mayo Clinic, Priceline, Shopify, Wendy’s, and many more.

“Since announcing gen AI support on Vertex AI less than six months ago, we’ve been thrilled and humbled to see innovative use cases from customers of all kinds—from enterprises like GE Appliances, whose consumer app SmartHQ offers users the ability to generate custom recipes based on the food in their kitchen, to startup unicorns like Typeface, which helps organisations leverage AI for compelling brand storytelling. We’re seeing strong demand, with the number of Vertex AI customer accounts growing more than 15 times in the last quarter,” added Kurian.