Should you run generative AI workloads on premises or in the public cloud? Yes, but your operational maturity and ability to master risk are more critical.
Buckets of digital ink have been spilled of late over whether generative AI applications should run internally or externally. Where GenAI workloads run does matter, as the wrong location can have significant consequences for your organization.
The increasingly distributed nature of applications and data suggests these workloads will run in a combination of places, including on-premises environments, public and private clouds, and the edge.
However, it’s more important that your organization’s operational maturity is strong enough to mitigate risks to the business. This means ensuring that your data quality, security and other IT attributes help support the business strategy.
In that vein, how well your GenAI workloads run is more important than where they run—whether that’s the public cloud, on-premises systems or other locations across your multicloud estate.
The Current Debate
Zooming out, let’s contextualize the current narrative. Organizations are still figuring out the best place to run large language models (LLMs) that power ChatGPT and other popular conversational assistants.
The primary locations for running GenAI workloads are on-premises, the public cloud or at the edge, said EY Americas Emerging Technologies Leader Matt Barrington in a recent webinar regarding how organizations should articulate their GenAI strategy.
Evidence suggests this is happening, with most organizations choosing on-premises and edge environments or the public cloud, according to industry pundits Dave Vellante and David Linthicum. “Where the systems run depends mainly on the type of problem you’re looking to solve and the attributes of that generative AI system,” Linthicum wrote.
For some mature organizations the answer may be both—in different stages of a workload’s lifecycle. An organization may initially run a GenAI application in a public cloud but bring it back on premises when it decides to incorporate sensitive intellectual property.
Or consider this scenario: AI workloads typically require two main components—inferencing and training. It may make more sense for IT organizations to train GenAI models, which are performance intensive, on premises, according to IDC research sponsored by Dell.1
Fifty-five percent of IT decision makers Dell surveyed cited performance as the main reason for running GenAI workloads on premises.2 Conversely, inferencing tasks can be run in a distributed fashion at edge locations, in public cloud environments or on premises.
Security is also a critical factor in where organizations decide to run AI workloads. For instance, organizations can scrub the data of any sensitive information during the modeling phase before moving AI workloads to another environment.
Fears of intellectual property leakage through GenAI consumption in the enterprise are very real. Thirty percent of ITDMs Dell surveyed said more control over their AI model was a critical factor in running their GenAI workloads on premises. For such organizations, it may make the most sense to bring the AI to their data or procure an off-the-shelf or open source LLM for their on-premises environment. This approach can deliver more value and better performance while reducing cost.
Ultimately, the right location will depend on several factors, including performance, latency, reliability and security. As such, IT leaders would do well to square their GenAI requirements with their operating model options and risk appetite.
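To make these trade-offs concrete, the decision factors above could be sketched as a simple placement heuristic. This is purely illustrative; the attribute names, priorities and placement rules are assumptions for the sake of example, not guidance from any vendor or survey cited here:

```python
from dataclasses import dataclass


@dataclass
class WorkloadRequirements:
    """Hypothetical attributes of a GenAI workload (illustrative only)."""
    handles_sensitive_ip: bool   # e.g., proprietary data used in fine-tuning
    latency_sensitive: bool      # e.g., real-time inference near users
    performance_intensive: bool  # e.g., model training


def suggest_placement(req: WorkloadRequirements) -> str:
    """Rough heuristic weighing security, latency and performance, in that order."""
    if req.handles_sensitive_ip:
        return "on-premises"   # keep intellectual property under direct control
    if req.latency_sensitive:
        return "edge"          # run inference close to users and devices
    if req.performance_intensive:
        return "on-premises"   # training often favors dedicated hardware
    return "public cloud"      # default to elastic, pay-as-you-go capacity


# A training job over proprietary data lands on premises:
print(suggest_placement(WorkloadRequirements(True, False, True)))
# A general-purpose inference service defaults to the public cloud:
print(suggest_placement(WorkloadRequirements(False, False, False)))
```

In practice the ordering of these checks is itself a judgment call — an organization with a low risk appetite might veto any off-premises option outright, which is exactly the kind of policy that belongs in a governance model rather than in code.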
Why Operational Maturity Matters Most
Because LLMs are “somewhat accurate and 100% confident,” organizations must orchestrate a great deal of governance while consuming them, said EY’s Barrington.
He recommended that organizations embrace top-down approaches; that is, executive leadership must work closely with line-of-business leaders to rethink customer interaction models and other operations. Ideally, IT will take the lead in deploying technology and architecting governance models.
Organizations with strong operational maturity will deploy people, process and technology to extract the most value out of GenAI workloads. This includes assigning the right talent to build and train models, empowering them with the right tools, and implementing the right operational guardrails. Greater operational maturity, the thinking goes, will minimize business risks.
There is no plug-and-play approach for how to achieve operational maturity; however, centralizing GenAI efforts within IT and implementing strong data management techniques are critical. Organizations that pair their corporate data with the right GenAI tools will have an advantage.
“Data is the fuel for all of this,” Barrington said. Companies that fail to build the right data constructs “are behind and going to have to play catch up.”
GenAI technologies are evolving swiftly. In this era of digital innovation, you don’t want to chase more nimble competitors.
How will you rise to the challenge?