
NVIDIA unveils enterprise-grade AI microservices for custom applications

Tue, 19th Mar 2024

NVIDIA has released a range of enterprise-grade generative AI microservices that let businesses create and deploy custom applications on their own platforms while retaining complete ownership and control of their intellectual property. Built on the NVIDIA CUDA platform, the cloud-native catalogue includes NVIDIA NIM microservices for optimised inference on more than two dozen AI models from NVIDIA and its partner network.

Beyond NVIDIA NIM, the company offers software development kits, libraries and tools as NVIDIA CUDA-X microservices for retrieval-augmented generation (RAG), guardrails, data processing and high-performance computing. NVIDIA also separately revealed more than two dozen healthcare-focused NIM and CUDA-X microservices.

The microservices form an additional layer on NVIDIA's full-stack computing platform, connecting the AI ecosystem of model developers, platform providers and enterprises with a standardised path to run custom AI models optimised for NVIDIA's CUDA installed base of hundreds of millions of GPUs across clouds, data centres, workstations and PCs.

Among the first to use the new NVIDIA generative AI microservices are leading application, data and cybersecurity platform providers including Adobe, Cadence, CrowdStrike, Getty Images, SAP, ServiceNow and Shutterstock.

Jensen Huang, founder and CEO of NVIDIA, explained, "Established enterprise platforms are sitting on a goldmine of data that can be transformed into generative AI copilots". He added that the AI microservices, built with NVIDIA's partner ecosystem, are the crucial building blocks for businesses in every industry to transform into AI companies.

NVIDIA's NIM microservices offer pre-built containers powered by NVIDIA inference software, such as Triton Inference Server and TensorRT-LLM, reducing deployment times from weeks to minutes. They expose industry-standard APIs for domains including language, speech and drug discovery, so developers can quickly build AI applications using proprietary data hosted securely in their own infrastructure.
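
The announcement itself contains no code, but as a rough illustration of how an application might call such an industry-standard API, the Python sketch below posts a chat request to an OpenAI-style completions endpoint on a locally deployed container. The endpoint URL, port and model name are assumptions made for this example, not details from NVIDIA's announcement.

    # Illustrative sketch: querying a locally deployed inference container
    # through an OpenAI-style chat completions endpoint. The URL, port and
    # model name are placeholders assumed for this example.
    import requests

    NIM_ENDPOINT = "http://localhost:8000/v1/chat/completions"  # assumed local deployment

    payload = {
        "model": "example-llm",  # hypothetical model identifier
        "messages": [{"role": "user", "content": "Summarise our 30-day returns policy."}],
        "max_tokens": 256,
    }

    response = requests.post(NIM_ENDPOINT, json=payload, timeout=60)
    response.raise_for_status()
    print(response.json()["choices"][0]["message"]["content"])

Because the request and response follow a widely used API shape, existing application code can often be pointed at a different endpoint with minimal changes.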

Alongside the leading application providers, data, infrastructure and computing platform providers from across the NVIDIA ecosystem are incorporating the microservices to bring generative AI to enterprises. Top data platform providers such as Box, Cloudera, Cohesity, Datastax, Dropbox and NetApp are working with NVIDIA microservices to help customers optimise their RAG pipelines and integrate their proprietary data into generative AI applications.
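
To make the RAG idea concrete, the toy sketch below shows the general shape of such a pipeline: relevant proprietary documents are retrieved for a query and folded into the prompt sent to a generative model. The word-overlap retriever and sample documents are simplifications invented for illustration; a production pipeline would use an embedding model and a vector store from one of the platforms above.

    # Toy sketch of a retrieval-augmented generation (RAG) flow: retrieve the
    # most relevant proprietary documents, then build an augmented prompt for a
    # generative model. The naive word-overlap retriever is illustrative only.
    from collections import Counter

    documents = [
        "Customers may return items within 30 days with proof of purchase.",
        "Standard shipping takes 3-5 business days within New Zealand.",
    ]

    def retrieve(query, docs, top_k=1):
        """Rank documents by how many words they share with the query."""
        query_words = Counter(query.lower().split())
        scored = sorted(
            docs,
            key=lambda text: sum((Counter(text.lower().split()) & query_words).values()),
            reverse=True,
        )
        return scored[:top_k]

    query = "How long do customers have to return an item?"
    context = "\n".join(retrieve(query, documents))

    # The augmented prompt is what would be sent to a generation endpoint,
    # such as the chat completions call sketched earlier.
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    print(prompt)
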

NVIDIA microservices can be deployed with NVIDIA AI Enterprise 5.0 on infrastructure of the customer's choice, including leading clouds such as Amazon Web Services (AWS), Google Cloud and Oracle Cloud Infrastructure. The microservices are also supported on more than 400 NVIDIA-Certified Systems, including servers and workstations from Cisco, Dell Technologies, HP, Lenovo and Supermicro.

NVIDIA's ecosystem of hundreds of AI and MLOps partners, including Abridge, Anyscale, Dataiku and DataRobot, is offering support for NVIDIA microservices through NVIDIA AI Enterprise.

Developers can trial NVIDIA's new microservices free of charge on NVIDIA's website, and enterprises can deploy production-grade NIM microservices with NVIDIA AI Enterprise 5.0 running on NVIDIA-Certified Systems and leading cloud platforms.
