The case for a national language model

Introduction

Countries outside the US and China, particularly those with a rich talent base in AI research like Scotland, need to stake a claim in the landscape of disruption brought about by recent advances in AI language models. This is not about national pride or boosterism; it's a strategic move to build the next generation of national infrastructure, one that could greatly enhance the efficiency of public sector workflows, improve national resilience and security, and act as a cornerstone for a digital industrial policy.

This piece explores why such an undertaking is not only possible but necessary, considering historical precedents, resource availability, and economic implications. It takes its cues from France, which is already paving the way for AI nationalism with its investment in Mistral AI.

AI nationalism

It has been five years since Ian Hogarth wrote his article on AI nationalism¹. Ian's article posited that machine learning would disrupt all areas of the economy and society, and that this disruption would lead governments to enact nationalist, protectionist policies around AI.

At the time, DeepMind was seen as the poster child for the one that got away. Since then, no alternative AI champion has emerged in the UK. DeepMind continues to release impressive results, but it has been overshadowed by OpenAI. Microsoft, Google, Facebook and Amazon have all been stung into some form of response by OpenAI's success, most notably Microsoft to the tune of $10B. All the while, China has (presumably) continued to execute a policy of coordinated AI nationalism out of the direct gaze of the rest of the world.

Five years later, the USA and China remain a duopoly in AI leadership, and we are still left asking "What can countries that aren’t China or America do?"

The rise and rise of the LLMs

Transformer-based Large Language Models (LLMs) are the AI technology underpinning OpenAI's GPT-3 and ChatGPT. If you had a quibble with Ian's prophecy back in 2018, my bet is that you've since been convinced by the seemingly endless stream of advances delivered by LLMs over the past 18 months. I podcasted and subsequently wrote about GPT-3 not long after it came out², questioning whether the emergent capabilities of very large transformer models would be enough to bring about a step-change in the usefulness of language models, or whether another model architecture would be required. It turned out that model tuning, specifically Reinforcement Learning from Human Feedback (RLHF), was sufficient to greatly increase the utility of these underlying models without requiring an alternative core architecture.

Today it is generally accepted that LLMs are capable of the following:

Answering questions as well as many experts
Exhibiting pseudo-reasoning as well as many knowledge workers
Performing text ingestion, summarisation and generation better than most bespoke software

All this means that we stand on the cusp of a radical shift in how knowledge work is performed, where many workflows traditionally undertaken by knowledge workers can be semi- or fully automated by LLMs. While technologies like blockchain and AR/VR have to some extent remained solutions searching for a problem to solve, LLMs have very quickly passed through this phase to become accepted as general-purpose technologies that can and do deliver value today across a plethora of use cases.

Most predictions are that private enterprises will eventually need to make heavy use of LLMs to stay competitive, but that the path they will go down will be to train and control private LLMs, fed their jealously guarded customer data in a safe and secure manner. Specifically, the argument being trotted out by a million CIOs in a million boardrooms is that an enterprise cannot afford to give up its competitive advantage by feeding its internal data into a public LLM such as OpenAI's ChatGPT.

To fill this void, we will probably see large platform providers such as AWS offering hosted and trained LLMs, enabled by the existence of open-source language models.

Fundamental infrastructure

So where does government fit in this picture? I believe that running the machinery of state efficiently will also require extensive use of LLMs. Access to an advanced LLM won't be a nice-to-have; it will be a matter of fundamental infrastructure. Much of the mechanism of the state could be classed, to a greater or lesser extent, as knowledge work. It is inconceivable that there won't be a productivity benefit to the state if it is able to adopt LLMs into governmental workflows. Being able to capture AI productivity benefits is as important now as rolling out railway infrastructure was in the 19th century.

The question is one of requirements. Specifically, what requirements for the use of an LLM would the state have, over and above those of any other large enterprise? Here's my guess:

1. Confidence that the data that trained the LLMs came from government-approved sources³
2. Ability to fine-tune using government-held, confidential data
3. Ability to send highly confidential, potentially secret data to the LLM for results
4. Requirement to keep the data inside the country's borders

To some extent 1-3 are shadows of existing enterprise requirements, but I do believe these will be turned up to a whole new level for the state. Similarly with 4, this is a concern for enterprise data compliance in general, but the state is going to do it for reasons of national resilience and security.

France is seeding an LLM champion

You know what's better than making a smart decision about which LLM to use? Using that decision to seed an AI champion for your nation. This is exactly what the French state appears to be doing with Mistral AI⁴:

"We are proud to initiate this global project from France, our home country, and to contribute, at our level, to the emergence of a credible new player in generative artificial intelligence from Europe," Mensch, Mistral AI's president, said.

How much leverage will the French state have over what Mistral produces? Who knows, probably not much. But what they will get is the ability to use an LLM that is not hosted by an American or a Chinese company. I think that matters to them. And I think it should matter to every country that isn't America or China. They also potentially get further dividends from seeding a French AI champion. They get to put that company at the heart of a forward-looking French industrial policy. They get national control over a technology that will be critical to productivity going forwards. They get a reason to convince talented French AI researchers to stay in France. They get a place to attract overseas AI talent into France. They get to direct a chunk of their future state IT spend into a French company that they can tax how they like. They get a French technology ecosystem built around a successful global player in AI. All these benefits (if they happen) make whatever share the French public investment bank Bpifrance put into Mistral seem like the bargain of the century.

Yes, there is a chance that this could become another state-funded éléphant blanc of the kind Europe is so desperately keen to avoid. Cynics will point out that state intervention was never going to produce an EU Google. However, LLMs are different because they are a form of technology infrastructure that the state itself can leverage, in a way that consumer-facing search never was. They are a multiplier on human capital, and who would argue that the state shouldn't invest to remain competitive when it comes to human capital?

Could Scotland seed its own champion?

I have no doubt that the UK could seed an LLM champion if it wanted to. A classical liberal might argue that the UK already enjoys many of the potential benefits of an LLM champion in its laissez-faire relationship with DeepMind, and that to go any further would be state interference in private markets. To side-step that argument, and because I live in Scotland, I want to ask a different question. Specifically, could Scotland seed its own LLM champion? At first sight this seems ridiculously parochial. But let me make the case for why I think it's potentially an opportunity that is almost uniquely well suited to Scotland.

Firstly, Scotland has a large public sector: over 22% of total employment as measured in 2022⁵. I'm not going to comment on whether that's good or bad; I'm just observing that a state LLM would proportionally benefit Scotland more than most places.

Secondly, the University of Edinburgh was a pioneer in AI for many years. Some would argue it still is, and yet too many of our experts end up elsewhere. The rich history in university-based AI research hasn’t yet led to local commercialisation of that expertise. We have a talent pipeline in place. We just need a local destination for that talent.

Thirdly, an ethically responsible job based in Scotland is globally attractive. It seems strange to identify this as a particular advantage, but I strongly believe top AI talent can be persuaded to stay in Scotland, or come to Scotland, based on those factors. Talent entering the global workforce isn't just motivated by money anymore. We need to play to that strength, and we simply can't do that without a credible company for these people to join.

Fourthly, Scotland has abundant renewable energy. At some point in the future, high usage of power-hungry GPUs becomes, at the margins, an arbitrage on energy prices. I'm aware that we aren't at this point yet. I'm also aware that we cannot necessarily capture the potential of abundant renewable energy generation in Scotland and put it to direct use as a marginal advantage in running a GPU-heavy datacentre. But this is an area we should explore as part of a national industrial strategy. Far better this than some of the other potential energy arbitrage opportunities in technology⁶.

Fifthly, the Scottish Government has control over technology procurement. Why can't the Government put out a bold statement of intent stipulating the intention to purchase LLM access at scale for a model that satisfies criteria they have laid out... and heck, why can't the criteria be:

The LLM must be trained on data from Scottish Government-approved sources
The data must be hosted in a datacentre in Scotland
The datacentre must be powered directly by 100% renewable energy
The LLM must be built and maintained by an applied research group at least 50% based in Scotland
The LLM pricing model must be globally competitive

If OpenAI, Microsoft, Facebook or Amazon want to meet those requirements then great. Scotland still wins. But even better if a home-grown company can meet them. Frankly, I think the state should be willing to seed a company to make this happen because I think it would be the most forward-thinking and ambitious thing it could do. But if the Scottish Government laid out a future LLM Access Agreement⁷ for a model that met those requirements, it wouldn't even need to venture fund the idea. It could all be done through future obligations to spend.

Yes, someone will complain about procurement rules. I don't care. Let's just find a way to make it work. You can't just gift the contract to any old startup and, equally, you can't deliver this through a government body. The company needs to have a commercial model, because you need to find a way to attract the world's best talent into it. But it doesn't need to be a direct copy of the OpenAI model. Even OpenAI doesn't follow a traditional technology company model, with its use of non-standard Profit Participation Units over stock options. We could do something equally innovative here.

A reality check on GPU availability

At the time of writing this article, even if you wanted to train a national language model from a clean dataset you couldn't do it on your own hardware, and you might not be able to get access to cloud GPUs at all. Venture capital is busy funnelling money (via startup balance sheets) into Nvidia's hardware at a rate limited only by Nvidia's manufacturing capacity. Current availability of high-end Nvidia H100 and A100 GPUs is heavily concentrated in a few large companies⁸:

Meta has 21,000 A100s
Google has 25,000 A100s
Microsoft has 40,000 H100s
Tesla has 7,000 A100s
Stability AI has 5,000 A100s

And to give you some idea of how many GPUs you might want access to:

Falcon-40B was trained on 384 A100s
Inflection's model was trained on 3,500 H100s
GPT-4 was likely trained on 10,000 to 25,000 A100s
GPT-5 might need 25,000 to 50,000 H100s
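
To put those two lists side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above; the names `fleets`, `training_runs`, `concurrent_runs` and `comparison_table` are mine, A100 and H100 counts are only compared like with like, the low end of the GPT-4 range is used, and training time, interconnect and utilisation are ignored entirely.

```python
# Reported GPU fleets from the first list above: owner -> (GPU type, count).
fleets = {
    "Meta": ("A100", 21_000),
    "Google": ("A100", 25_000),
    "Microsoft": ("H100", 40_000),
    "Tesla": ("A100", 7_000),
    "Stability AI": ("A100", 5_000),
}

# GPUs used per training run from the second list: model -> (GPU type, count).
# Assumption: the low end of the quoted GPT-4 range is used.
training_runs = {
    "Falcon-40B": ("A100", 384),
    "Inflection": ("H100", 3_500),
    "GPT-4 (low estimate)": ("A100", 10_000),
}


def concurrent_runs(fleet_size, gpus_per_run):
    """How many runs of a given size would fit in a fleet at the same time."""
    return fleet_size // gpus_per_run


def comparison_table(fleets, runs):
    """Pair every fleet with every run that uses the same GPU type."""
    rows = []
    for owner, (fleet_gpu, fleet_size) in fleets.items():
        for model, (run_gpu, run_size) in runs.items():
            if fleet_gpu == run_gpu:
                rows.append((owner, model, concurrent_runs(fleet_size, run_size)))
    return rows


if __name__ == "__main__":
    for owner, model, n in comparison_table(fleets, training_runs):
        print(f"{owner}: {n} concurrent {model}-sized run(s)")
```

On these figures, even the smallest fleet listed could host a Falcon-40B-sized run more than ten times over, while only Meta's and Google's A100 fleets reach the low GPT-4 estimate.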

But this isn't necessarily the barrier it first seems. Training an LLM from scratch isn't the only path to delivering on these requirements in the near term: we could make use of existing open-source models. Sure, that doesn't fulfil the requirement to be trained on government-approved sources right now, but we could build up to a position of data independence and hardware capability while delivering value via homegrown applied research.

The time to act is now

Why have I written this article now? I firmly believe that the time to act is now. Lots of technologies go through hype cycles, and goodness knows AI has been through its own share of them. But if you eavesdrop on any corporate boardroom around the world right now, you'll hear a consistent message. We are at a moment of discontinuity. To paraphrase Ian, LLMs will disrupt all areas of the economy and society, and they'll do it in the next few years. If we don't act now to seed a national AI champion, 60 years of Scotland leading the world in AI research will be eclipsed by commercial labs in America and China. Let's take a bold step to give Scotland a role in the future of AI, and not just in its past.

¹ https://www.ianhogarth.com/blog/2018/6/13/ai-nationalism
² https://mattfarrugia.com/is-gpt-3-a-giant-leap
³ "Government-approved" means, at a minimum, sources that aren't copyrighted, but it might also mean more than that.
⁴ https://www.reuters.com/technology/french-company-mistral-ai-raises-105-mln-euros-shortly-after-being-set-up-2023-06-13/ and https://www.bpifrance.com/2023/06/30/bpifrance-supports-french-companies-in-the-artificial-intelligence-revolution/
⁵ https://www.gov.scot/publications/public-sector-employment-scotland-statistics-2nd-quarter-2022/pages/2/
⁶ I'm looking at you, bitcoin miners.
⁷ Modelled on Virtual Power Purchase Agreements.
⁸ https://gpus.llm-utils.org/nvidia-h100-gpus-supply-and-demand/