Red Hat promises AI trained on 'curated' and 'domain-specific' data
Says it'll keep track of what it hoovers up at post-layoff summit
Opinion In Red Hat land, some things remain the same – Fedora will still be supported, we're told – while others, including AI-driven applications, are starting to surface.
This year's Red Hat Summit wasn't the usual low-key event. Coming on the heels of Red Hat's first layoffs, it felt fair to brace for a somber air. Instead, the energy seemed high as Red Hat talked up its forthcoming releases.
After the company recently laid off four percent of its staff, many Fedora Linux users noticed that the popular community Linux distro took some hits. In particular, Fedora Program Manager Ben Cotton was laid off. This left Fedora fans wondering if their favorite Linux distribution would be cut back.
I asked Matt Hicks, Red Hat's CEO, about this. "I think Fedora has an incredible opportunity. It's still the distribution base for what Red Hat Enterprise Linux (RHEL) will look like in five years." It will also, Hicks added, be the "default for AI as we change and push innovation. That's why we have our community. Nothing changes for us, and it's still a critical innovation vehicle for us."
When Hicks mentioned AI, he wasn't simply joining the flood of companies "AI-washing" his company's product line the way so many businesses "cloud-washed" their products in 2009. Long before ChatGPT turned AI into the buzziest of buzzwords, Red Hat had been working on turning AI into a useful tool.
This began in 2021 with IBM Research's Project CodeNet. From this, Red Hat and IBM created Project Wisdom. This enabled users to input a coding command as a straightforward English sentence. For example, "Deploy Web Application Stack" or "Install Nodejs dependencies."
This matured into Red Hat's first major AI success: Ansible Lightspeed. This takes the Ansible DevOps program and extends it with IBM Watson Code Assistant. This generative AI service delivers, according to Red Hat, more consistent, accurate, and faster automation. It uses natural language processing and integrates with Code Assistant to access IBM Foundation Models built on OpenShift, Red Hat's Kubernetes service.
CTO Chris Wright told The Register in an interview that, unlike ChatGPT, which built its large language models (LLMs) on essentially all the publicly available data it could vacuum in, Red Hat's LLMs are curated and domain-specific.
That means, we're told, these LLMs have been built on data that Red Hat knows is correct. When Lightspeed generates a particular Ansible Playbook – a reusable, simple configuration management and multi-machine deployment system – Red Hat says it's based on tested, high-quality data and code. Not some garbage someone wrote up in a hurry to meet a deadline.
It's also not just built on good code. Wright told us: "We make sure the models are accurate because we build metrics into the whole end-to-end process." This includes business metrics to make sure your projects aren't just technically successful but deliver successful results for your business as well.
Another major plus IBM and Red Hat are bringing to the table, Wright said, is something the AI projects getting all the headlines can't offer: "We can tell you exactly where the data [for] our domain-specific LLMs comes from." This is a dramatic difference from the response that ChatGPT gives you when you ask it where it gets its answers from.
This is similar to the push by the open source community towards Software Bills of Materials (SBOMs) to make sure open source code really is what it says it is. Knowing exactly what's in LLMs is rapidly becoming critical for quality, accuracy, and legal reasons. For example, if you use code from GitHub Copilot, do you know if the code it produces is sourced from a copyrighted open source project? Can you be sued for using it? Stay tuned. The courts are working on that exact question.
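To make the SBOM comparison concrete, here is a minimal Python sketch of what a "bill of materials" for training data could look like. It is not how Red Hat or IBM track their data; the names `SourceRecord`, `record_source`, `build_manifest`, and `unapproved` are hypothetical, as are the dataset names and the license allow-list. Each source is recorded with a license label and a SHA-256 digest of the exact content used, and anything with an unvetted license gets flagged.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SourceRecord:
    """One entry in a hypothetical training-data manifest."""
    name: str      # illustrative dataset or repository name
    license: str   # license label, e.g. an SPDX-style identifier
    sha256: str    # digest of the exact content that was used


def record_source(name: str, license: str, content: bytes) -> SourceRecord:
    """Fingerprint a piece of training content so it can be audited later."""
    return SourceRecord(name, license, hashlib.sha256(content).hexdigest())


def build_manifest(records: list[SourceRecord]) -> str:
    """Serialize the records as JSON, sorted by name for stable output."""
    ordered = sorted(records, key=lambda r: r.name)
    return json.dumps([asdict(r) for r in ordered], indent=2)


def unapproved(records: list[SourceRecord], allowed: set[str]) -> list[str]:
    """Return the names of sources whose license is not on the allow-list."""
    return [r.name for r in records if r.license not in allowed]


if __name__ == "__main__":
    records = [
        record_source("playbooks-a", "Apache-2.0", b"- name: install nginx"),
        record_source("snippets-b", "UNKNOWN", b"- name: copied from a forum"),
    ]
    print(build_manifest(records))
    print("Needs review:", unapproved(records, {"Apache-2.0", "MIT"}))
```

With a manifest like this, "where did that come from?" has an answer you can look up rather than guess at.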
Businesses, once they recover from getting drunk on AI's potential, must deal with these issues.
Red Hat says it knows that, so it and IBM have been focusing on making sure the data their LLMs are using is good and legal.
"Enterprises require this kind of accuracy," says Wright. Besides, "where a model is trained on a set of data, and the world around us is highly dynamic, a model can drift in terms of its accuracy. We must constantly work to maintain the continued accuracy of models. that form the picture is really critical for enterprises to be successful in bringing AI and making kind of meaningful business impact."
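The drift Wright describes can be watched for with very little machinery. The Python sketch below is an illustration only, not Red Hat's or IBM's process: the `DriftMonitor` class, its window size, and its tolerance are all assumptions. It keeps a rolling window of whether recent model outputs were judged correct and flags drift once accuracy over a full window falls below a baseline by more than the tolerance.

```python
from collections import deque
from typing import Optional


class DriftMonitor:
    """Hypothetical rolling-accuracy check for a deployed model."""

    def __init__(self, baseline: float, window: int = 100, tolerance: float = 0.05):
        self.baseline = baseline        # accuracy measured at release time
        self.tolerance = tolerance      # how far below baseline is acceptable
        self.outcomes = deque(maxlen=window)  # 1 = correct, 0 = incorrect

    def record(self, correct: bool) -> None:
        """Store the outcome of one reviewed prediction."""
        self.outcomes.append(1 if correct else 0)

    def rolling_accuracy(self) -> Optional[float]:
        """Accuracy over the current window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def has_drifted(self) -> bool:
        """True only when a full window falls below baseline minus tolerance."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough evidence yet
        return self.rolling_accuracy() < self.baseline - self.tolerance


if __name__ == "__main__":
    monitor = DriftMonitor(baseline=0.90, window=10, tolerance=0.05)
    for _ in range(10):
        monitor.record(True)
    print(monitor.has_drifted())   # full window, all correct
    for _ in range(3):
        monitor.record(False)
    print(monitor.rolling_accuracy(), monitor.has_drifted())
```

In practice the hard part is deciding what counts as "correct" for generated code and who reviews it, which is where the business metrics Wright mentions would come in.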
Put it all together, and Red Hat, which was first known as a Linux power and then gained fame as a hybrid-cloud power, may – just may – become known, as it enters its third decade, as the AI company that got it right. We'll see. There's a long road ahead. ®