
NVIDIA NIM microservices are your go-to for deploying the latest AI models from NVIDIA and the community quickly and reliably. They pair NVIDIA TensorRT-LLM for accelerated, optimized inference with popular community backends like vLLM and SGLang, so you can seamlessly deploy a broad range of community LLMs.
The result: high-performance inference on NVIDIA GPUs with rapid, reliable deployment.
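Once a NIM microservice is up, it exposes an OpenAI-compatible API you can query with standard clients. Here's a minimal sketch using the openai Python package against a locally hosted endpoint; the port (8000) and model name (meta/llama-3.1-8b-instruct) are illustrative assumptions and will vary with your deployment.

from openai import OpenAI

# NIM microservices serve an OpenAI-compatible endpoint; this assumes a
# local deployment listening on port 8000 (adjust host/port to your setup).
# Local NIM deployments typically don't require an API key, so a placeholder is fine.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")

# The model name below is an example; list what your deployment serves
# with client.models.list().
response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Summarize what NVIDIA NIM is in one sentence."}],
)
print(response.choices[0].message.content)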
➡️ Try out the developer example:
➡️ Technical Deep Dive:
➡️ Join the NVIDIA Developer Program:
➡️ Read and subscribe to the NVIDIA Technical Blog:
00:00:00 - Introduction and Overview
00:00:36 - Setting Up and Launching the Model
00:02:50 - Model Profiles and Compatibility
00:05:02 - Example
LLM, inference, agentic AI, generative AI, AI engineer