A version of this content was previously published on Forbes.
Vultr recently announced the launch of Vultr Cloud Inference, a new serverless platform aimed at transforming AI scalability and reach. The solution streamlines AI model deployment and delivers inference capabilities worldwide.
Vultr Cloud Inference
The industry is rapidly shifting its focus from training AI models to deploying them in production. As enterprises and technology providers alike roll out inference solutions, infrastructure that can handle the surge in inference workloads while maintaining high performance is critical.
Vultr addresses this with Vultr Cloud Inference, a solution built on a serverless architecture that simplifies the integration and deployment of AI models regardless of their training environment.
The offering meets growing demand for inference-optimized cloud infrastructure capable of supporting the complex requirements of deploying and managing large AI models, whether those models were trained on other cloud platforms, on-premises, or on Vultr Cloud GPUs powered by Nvidia.
That flexibility is one of the platform’s primary benefits: models developed on Vultr Cloud GPUs, in a user’s own data center, or on another cloud can be integrated and deployed globally, letting businesses comply with data sovereignty, residency, and privacy regulations by running AI applications in the appropriate regions.
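To make that model-integration story concrete, here is a minimal sketch of what querying a serverless inference endpoint might look like from an application. Vultr has not published API details in this announcement, so the endpoint URL, model name, and credential below are hypothetical placeholders; the request shape follows the common OpenAI-compatible convention, which is an assumption for illustration rather than a documented Vultr interface.

```python
# Illustrative sketch only: assumes a serverless inference endpoint that
# speaks an OpenAI-compatible chat API. The URL, model name, and API key
# below are hypothetical placeholders, not documented Vultr values.
import os
import requests

API_BASE = "https://inference.example.com/v1"  # hypothetical endpoint
API_KEY = os.environ["INFERENCE_API_KEY"]      # hypothetical credential

def ask(prompt: str) -> str:
    """Send a single chat-completion request and return the model's reply."""
    resp = requests.post(
        f"{API_BASE}/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "example-model",  # hypothetical model identifier
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

print(ask("Summarize today's sales anomalies in two sentences."))
```

The appeal of the serverless model is visible in what the sketch omits: there is no GPU provisioning, cluster sizing, or scaling logic in the application code, because the platform is responsible for matching capacity to request volume.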
The platform’s serverless architecture automatically scales inference-optimized infrastructure up and down with demand, matching resources to workloads and thereby cutting costs and reducing environmental impact.
Vultr Cloud Inference also offers private, dedicated computing resources for sensitive or high-demand workloads, enhancing security and performance. This feature is essential for organizations with strict data protection and regulatory compliance needs.
Vultr Cloud Inference simplifies AI deployment and allows AI applications to be deployed globally, making AI technology more accessible and efficient for organizations worldwide. The offering is available now for early access.
Analysis
Like many in the industry, Vultr has been on a rapid run of AI-focused innovation. Earlier this month, the company announced it was expanding its footprint in Seattle with a new data center and a significant increase in Nvidia HGX H100 clusters. Last month at the Mobile World Congress in Barcelona, Vultr demonstrated generative AI solutions for telecom.
It would be unfair to compare Vultr to GPU-cloud providers like Lambda Labs and CoreWeave. Those companies, riding high on the current marketing hype cycle around AI, provide much-needed GPU resources, but Vultr offers much more.
Vultr chief marketing officer Kevin Cochrane recently told me that Vultr is more about allowing enterprises to deliver business intelligence where it matters most, which is often at the edge. The edge is global, Cochrane said, so Vultr has built data centers in 32 locations across six continents, one of the largest footprints outside the large public cloud providers.
Vultr’s global footprint solves more than just a logistical problem for AI; it also addresses growing enterprise needs for compliance and governance. The capability to deploy AI applications in line with local regulations is increasingly important, giving Vultr a competitive differentiator.
While you can certainly train AI models on Vultr’s infrastructure, the company’s focus isn’t on AI training. As the broader market begins realizing value from the models the industry has spent the past year training, inference will become a prime driver of demand. The ability to provide inference at the edge, delivering insights where they matter most, is critical.