IBM and Groq have announced a new partnership aimed at speeding up the deployment of artificial intelligence (AI) in enterprises. The collaboration will make GroqCloud, Groq’s inference platform, available through IBM’s watsonx Orchestrate. This move is intended to provide clients with faster AI inference capabilities while managing costs.
The companies plan to integrate Red Hat’s open source vLLM technology with Groq’s LPU architecture. Support for IBM Granite models on GroqCloud is also planned for IBM clients.
Many organizations encounter challenges when moving AI projects from pilot stages to full production, particularly in sectors such as healthcare, finance, government, retail, and manufacturing. These challenges often involve issues of speed, cost, and reliability. By combining Groq’s fast and cost-effective inference with IBM’s orchestration tools for agentic AI, the partnership aims to provide the necessary infrastructure for enterprises to scale their AI solutions.
GroqCloud uses a custom LPU that reportedly delivers more than five times the speed and cost efficiency of traditional GPU systems. This results in low latency and consistent performance even as demand increases globally—a feature highlighted as especially useful for regulated industries.
In healthcare settings, IBM says its clients receive thousands of complex patient questions at once. With Groq technology, IBM’s AI agents can analyze information in real time and deliver immediate responses. The same approach is being used in other sectors; for example, retail and consumer packaged goods companies are deploying HR agents powered by Groq to automate processes and boost employee productivity.
Rob Thomas, Senior Vice President of Software and Chief Commercial Officer at IBM, said: “Many large enterprise organizations have a range of options with AI inferencing when they’re experimenting, but when they want to go into production, they must ensure complex workflows can be deployed successfully to ensure high-quality experiences. Our partnership with Groq underscores IBM’s commitment to providing clients with the most advanced technologies to achieve AI deployment and drive business value.”
Jonathan Ross, CEO and Founder of Groq, added: “With Groq’s speed and IBM’s enterprise expertise, we’re making agentic AI real for business. Together, we’re enabling organizations to unlock the full potential of AI-driven responses with the performance needed to scale. Beyond speed and resilience, this partnership is about transforming how enterprises work with AI, moving from experimentation to enterprise-wide adoption with confidence, and opening the door to new patterns where AI can act instantly and learn continuously.”
IBM will immediately offer access to GroqCloud’s capabilities. The joint teams will focus on delivering several features for clients:
– High-speed inference designed for applications like customer care and employee support.
– Security-focused deployment methods suitable for stringent regulatory requirements.
– Seamless integration with watsonx Orchestrate so clients can tailor agentic patterns across different use cases.
Plans also include enhancing Red Hat open source vLLM technology within Groq’s architecture—aiming to help developers address common challenges related to inference orchestration, load balancing, and hardware acceleration.
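For developers curious what consuming such an inference service looks like in practice: GroqCloud exposes an OpenAI-compatible HTTP API, so a client prepares a standard chat-completions payload and posts it with an API key. The sketch below is illustrative only; the model name, prompt, and parameter values are placeholder assumptions, not details from the announcement, and available models should be checked against Groq’s own documentation.

```python
import json

# GroqCloud's OpenAI-compatible chat-completions endpoint.
GROQ_ENDPOINT = "https://api.groq.com/openai/v1/chat/completions"

def build_chat_request(model: str, question: str) -> dict:
    """Build the JSON payload for an OpenAI-compatible chat completion."""
    return {
        "model": model,  # placeholder model name; check Groq's model list
        "messages": [
            {"role": "system", "content": "You are a customer-care assistant."},
            {"role": "user", "content": question},
        ],
        "temperature": 0.2,  # low temperature for consistent support answers
    }

payload = build_chat_request("llama-3.1-8b-instant", "Where is my order?")

# Actually sending the request needs an API key, e.g. with `requests`:
#   requests.post(GROQ_ENDPOINT, json=payload,
#                 headers={"Authorization": f"Bearer {api_key}"})
print(json.dumps(payload, indent=2))
```

Because the API follows the OpenAI wire format, existing OpenAI client libraries can typically be pointed at the GroqCloud endpoint unchanged, which is part of what makes this kind of drop-in inference integration attractive.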
Both companies emphasize that their statements regarding future direction are subject to change or withdrawal without notice.