Using serverless solutions to harness generative AI capabilities
- Alibaba Cloud unveils a serverless version of its Platform for AI-Elastic Algorithm Service.
- The PAI-EAS is designed to offer a cost-efficient solution for model deployment and inference to individuals and enterprises.
- Serverless computing can be used to run AI workloads.
For enterprises looking to harness generative AI, investing in the right infrastructure is crucial to getting good results. Businesses can now choose between physical and virtual infrastructure for their AI workloads, but challenges remain.
Scalability, cost and flexibility are among the challenges organizations face when building generative AI solutions. Businesses often want quick deployments, and while modern infrastructure caters to speed, applications still require several rounds of testing to ensure they work properly.
One way of speeding up the process is by relying on serverless technology. Serverless computing is a cloud computing model that allows developers to run code without having to manage servers, provision resources, or worry about scalability.
Serverless computing can be used to run AI workloads. But the workloads need to be compatible with the serverless model and the limitations set by the cloud provider. Only then can businesses enjoy the advantages of using serverless computing for AI.
Alibaba Cloud and serverless computing for AI
Alibaba Cloud recently unveiled a serverless version of its Platform for AI (PAI)-Elastic Algorithm Service (EAS) at the inaugural AI & Big Data Summit in Singapore. The PAI-EAS is designed to offer a cost-efficient solution for model deployment and inference to individuals and enterprises.
The PAI-EAS platform allows users to tap into computing resources as needed, eliminating the need to oversee the management and upkeep of physical or virtual servers. What’s more, users will be billed only for the computing resources they employ, which could mean a 50% reduction in inference costs when compared with the traditional pricing model.
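To make the billing comparison concrete, here is a minimal sketch of the arithmetic behind pay-per-use pricing. The hourly rate, the hours and the function names are all hypothetical and are not Alibaba Cloud's published pricing; the sketch also assumes the same unit price in both models, which may not hold in practice.

```python
# Illustrative only: every number and name here is a hypothetical assumption,
# not Alibaba Cloud's actual pricing or API.

def provisioned_cost(hourly_rate: float, hours_provisioned: float) -> float:
    """Cost when an instance is billed for every hour it is kept running."""
    return hourly_rate * hours_provisioned


def pay_per_use_cost(hourly_rate: float, hours_busy: float) -> float:
    """Cost when only the hours actually spent serving requests are billed."""
    return hourly_rate * hours_busy


def saving_fraction(baseline: float, actual: float) -> float:
    """Fraction of the baseline cost that is saved (0.5 means 50%)."""
    return (baseline - actual) / baseline


# Hypothetical month: the instance is up for 720 hours but busy for only 360.
always_on = provisioned_cost(hourly_rate=2.0, hours_provisioned=720)
serverless = pay_per_use_cost(hourly_rate=2.0, hours_busy=360)
print(f"saving: {saving_fraction(always_on, serverless):.0%}")
```

Under these assumptions the saving simply mirrors how idle the always-on instance would have been, so a workload that is busy around the clock would see little benefit.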
Currently in beta testing, the serverless offering is accessible for image generation model deployment. In March 2024, the serverless version is scheduled to expand its capabilities to support the deployment of prominent open-source LLMs and models from Alibaba’s AI model community, ModelScope. This includes models tailored for tasks such as image segmentation, summary generation, and voice recognition.
Alibaba Cloud also announced the latest integration of its vector engine technology into more product offerings, including its data warehouse Hologres and its search services Elasticsearch and OpenSearch. The integration is designed to make it easier for enterprises to access various LLMs and build customized generative AI applications.
With LLM serving, training services and vector engine technology, Alibaba Cloud is able to support a Retrieval-Augmented Generation (RAG) process, letting enterprises enhance LLMs with their own knowledge bases for improved outcomes. This translates to improved accuracy, accelerated retrieval of relevant information, and more nuanced insights for enterprises, contributing to heightened efficiency and decision-making capabilities across a wide range of applications.
“Our technology updates underscore our commitment to empowering enterprises with the latest intelligence-driven solutions for heightened efficiency and performance. This marks a significant stride in our mission to provide innovative solutions that redefine the possibilities of artificial intelligence in diverse applications,” Zhou Jingren, chief technology officer of Alibaba Cloud, commented during the summit.
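The RAG flow described above can be sketched in a few lines: retrieve the most relevant passages from a knowledge base, then place them in the prompt sent to the model. This is a toy illustration, not Alibaba Cloud's implementation. A simple word-overlap score stands in for a real vector engine, the model call itself is left out, and the documents and function names are invented for the example.

```python
import re

# Toy RAG sketch. Word overlap is a stand-in for vector search, and the
# knowledge base below is made up purely for illustration.

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents sharing the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, documents: list[str], top_k: int = 2) -> str:
    """Assemble a prompt that grounds the question in retrieved passages."""
    passages = retrieve(question, documents, top_k)
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


knowledge_base = [
    "Refunds are issued within 14 days of purchase.",
    "The office is closed on public holidays.",
    "Refunds require the original receipt.",
]
print(build_prompt("How many days do refunds take?", knowledge_base))
```

In a production setup the resulting prompt would be passed to a hosted LLM, and the retrieval step would query a vector index rather than compare word sets.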
Make model training easier
Training your LLM is critical for AI workloads. During the summit, Alibaba Cloud also announced an upgrade to its big data service, called MaxCompute MaxFrame. A distributed Python data processing framework, the service taps into the growing demand for data preprocessing and offline or online data analysis in AI-related computing tasks. It allows users to process massive amounts of data more efficiently and flexibly while launching AI tasks such as LLM training.
For image generation, Alibaba Cloud has introduced PAI-Artlab to foster enhanced creativity among designers. The comprehensive platform for model training and image generation empowers designers to quickly produce professional-grade designs and unlock greater creative potential.
With PAI-Artlab, designers can generate design images for a variety of applications, including interior home design, product promotional posters, gaming character creation, and gaming scene development. While it is only available in China currently, it will be available in Singapore soon. The platform also provides a rich ecosystem of ready-to-use tools to enable designers with no coding background to develop and train custom models that generate images tailored to their specific requirements.
Last year, Alibaba Cloud elevated its entire range of database solutions, including the cloud-native database PolarDB, cloud-native data warehouse AnalyticDB, and cloud-native multi-model database, Lindorm. The solutions integrated proprietary vector engine technology to significantly enhance performance and capabilities.
Vector engines transform text and data into vectors in a high-dimensional space, optimizing AI performance by efficiently embedding large volumes of structured and unstructured context. This streamlines tasks like similarity comparisons and semantic analysis, particularly benefiting LLMs and other advanced AI functionalities.
“There is an increasing demand for AI technologies among our global customers. By open-sourcing our proprietary language models, we are well-equipped to offer powerful computing solutions and cutting-edge AI innovations to support clients in developing customized generative AI applications, addressing their unique challenges and positioning them to harness the wave of opportunities emerging from the dynamic generative AI sector,” commented Selina Yuan, president of international business at Alibaba Cloud.
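As a rough illustration of what a similarity comparison in vector space looks like, the sketch below ranks a handful of items by cosine similarity to a query vector. The three-dimensional vectors are hand-made for the example. Real embeddings come from a trained model and typically have hundreds or thousands of dimensions, and a vector engine would use an index rather than the brute-force scan shown here.

```python
import math

# Hand-made toy vectors; real embeddings are produced by a model.

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def nearest(query: list[float], index: dict[str, list[float]], k: int = 1) -> list[str]:
    """Return the names of the k stored vectors most similar to the query."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


toy_index = {
    "invoice policy": [0.9, 0.1, 0.0],
    "refund policy": [0.8, 0.3, 0.1],
    "office map": [0.0, 0.2, 0.9],
}
print(nearest([1.0, 0.2, 0.0], toy_index, k=2))
```

Items whose vectors point in nearly the same direction as the query rank first, which is why semantically related text ends up close together once it has been embedded.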