
That attacking transition that clinched the game for England? It started in the cloud…

How the English FA uses Google Cloud to visualize success

Sponsored Feature When we talk about football, we're used to hearing words like passion, natural talent, and commitment. These are all admirable qualities, but the most successful managers, coaches, and indeed players have all brought something else to the game: the ability to analyze their own performance and their opposition's, and use that to tilt the odds of victory in their favor.

But with the best will in the world, there's only so much information one manager or coach can absorb and process. So, as England's senior women's team prepares for the Euros, and the senior men's team for the World Cup, the English Football Association will be relying on the cloud to give England an edge, whether in team selection, evolving tactics, or even managing the kit and equipment taken to the tournament.

That we've come this far shouldn't really surprise anyone.

The stakes in the game have become higher, both financially and in terms of glory and national pride. As with many sports, natural talent, intuition, and hunger for victory are no longer quite enough to win. Every team, every coach, every team manager is constantly searching for that extra element that will give them the edge over the competition.

Team of performance staff

The business world increasingly garners competitive insight and intelligence by analyzing every piece of data available, with sophisticated algorithms supported by highly scalable systems in the cloud. So why should football be any different?

The English Football Association has built its own team of performance staff to support the men's and women's teams from a base at St George's Park. And there's certainly no shortage of potential information for them to work on.

Its staff monitor over 3,500 players across the various squads the FA runs – senior men's; senior women's; under-23s/21s; youth; and para teams – and collect performance data from both international and club games. The FA also maintains an archive of match videos – in the halfway line high and wide format coaches prefer – each of which means a 5GB file, with 600MB added per game.

Directorate staff track a range of wellness data when players are training directly for England, from wearables and gym data, to sleep patterns. Add to this player profiles, medical information, nutritional data, and whatever other nuggets might give that elusive edge. Even the reports that come in from the network of experts the FA relies on to identify future prospects are increasingly being digitized, with the association encouraging its highly skilled scouts and emerging talent reporters to shift to tablet devices.

All of this information is pulled together in the FA's Helix system, which integrates its own data resources, analytics capabilities, and third party applications. Helix stores 22 million player data points and counting, and works through an associated data mountain of 40TB a week covering around 2,500 fixtures. Think of it as an ERP system for football. And then some.

The platform also supports warehousing – ie, tracking the vast amount of kit that needs to travel with the national team – and calendaring to schedule the (never quite enough) time that the national team is able to prise players away from their professional clubs.

So perhaps it's no surprise that the FA has turned to Google Cloud to underpin Helix.

From the Big Match to Big Query

Just managing that stream of video content is a major challenge. As the FA explains, it wasn't that long ago that a team coach wanting to analyze a game would have had to reach into a cupboard full of VHS tapes. Even as games became available in digital formats, getting them onto its systems and then to the coaches and other interested parties would have meant a succession of batch processing jobs. And coaches would then have to rely on FTP to download individual files to their laptop or other device. Inevitably, the extent to which that video footage could be viewed alongside and associated with other data sources was limited, leaving coaches to perform their own mental gymnastics.

With the system expected to hoover up as many as 400 games a day, carrying on with the same approach was clearly not feasible. So the FA's data team has moved to an event-driven architecture to get the video into Cloud Storage. This is based on a Kubernetes-based video streaming system, running on a Google Compute Engine instance, supplemented by Cloud Functions and Cloud Run, and using Pub/Sub.

When a file is pushed, this triggers an ingestion job. If the ingestion is unsuccessful, it gets flagged, but assuming the import goes through it is added to the FA's data lake, based on Google Cloud's BigQuery data warehouse platform, ready for transformation.
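The flow described above, where an upload triggers an ingestion job and failures are flagged rather than halting the pipeline, can be sketched roughly as follows. This is a minimal, self-contained illustration and not the FA's implementation: the `UploadEvent` and `IngestionLog` types, the bucket and file names, and the validation rules are all hypothetical, and a plain in-memory list stands in for the storage trigger and the data lake.

```python
from dataclasses import dataclass, field


@dataclass
class UploadEvent:
    """Hypothetical notification emitted when a video file lands in storage."""
    bucket: str
    name: str
    size_bytes: int


@dataclass
class IngestionLog:
    """In-memory stand-in for the data lake and the failure queue."""
    ingested: list = field(default_factory=list)
    flagged: list = field(default_factory=list)


def ingest(event: UploadEvent) -> dict:
    """Validate the upload and build the record that would go to the data lake."""
    # Assumed validation rules, chosen only to give the sketch a failure path.
    if not event.name.endswith(".mp4"):
        raise ValueError(f"unsupported file type: {event.name}")
    if event.size_bytes <= 0:
        raise ValueError(f"empty file: {event.name}")
    return {"uri": f"gs://{event.bucket}/{event.name}", "size_bytes": event.size_bytes}


def handle_upload(event: UploadEvent, log: IngestionLog) -> None:
    """Run one ingestion job per event; flag failures instead of stopping."""
    try:
        log.ingested.append(ingest(event))
    except ValueError as err:
        log.flagged.append((event.name, str(err)))


log = IngestionLog()
for ev in [
    UploadEvent("match-video", "fixture-001.mp4", 5_000_000_000),
    UploadEvent("match-video", "fixture-002.tmp", 1_024),
]:
    handle_upload(ev, log)

print("ingested:", log.ingested)
print("flagged:", log.flagged)
```

Because each file is handled as its own event, one bad upload only adds an entry to the flagged list, and the remaining games continue through to the lake.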

That enables the FA to ensure videos are tagged according to its standard set of definitions. So, one of its data architects explains, "No matter what our provider, we're mapping to the same set of criteria that we've got internally."

"When a match event comes through, we want to know which tournament that was part of, which camp that was part of, what was the wellness score, as well as the scores leading up to that point." And this is all mapped within BigQuery.

More broadly, BigQuery, together with Google's Cloud SQL, is used to track data and join multiple data sources, providing the basis for further sophisticated analysis by the FA's performance staff. For example, players will wear smart vests during training – at both club and national level – which allow performance and positioning stats to be captured.
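To make the mapping and joining idea concrete, here is a small sketch of how events from different providers might be normalised to one internal vocabulary and then enriched with tournament, camp, and wellness context. Everything in it is an assumption made for illustration: the provider names, the event labels, the `CAMP_CONTEXT` lookup, and the wellness value are invented, and in the FA's setup this work is described as happening in BigQuery rather than in application code.

```python
# Hypothetical provider vocabularies mapped onto one internal set of event types.
PROVIDER_TO_INTERNAL = {
    "provider_a": {"Pass": "pass", "Shot": "shot", "Tackle": "tackle"},
    "provider_b": {"PASS_COMPLETED": "pass", "SHOT_ON_GOAL": "shot"},
}

# Hypothetical second data source, keyed by camp, to join against each event.
CAMP_CONTEXT = {
    "camp-01": {"tournament": "example-tournament", "wellness_score": 7},
}


def normalise_event(provider: str, raw: dict, camp_id: str) -> dict:
    """Translate a provider-specific event and join it to its camp context."""
    mapping = PROVIDER_TO_INTERNAL[provider]
    label = raw["type"]
    if label not in mapping:
        # Surface unmapped labels instead of silently guessing a definition.
        raise KeyError(f"no internal definition for {provider}:{label}")
    return {
        "event_type": mapping[label],
        "minute": raw["minute"],
        "camp": camp_id,
        **CAMP_CONTEXT[camp_id],
    }


a = normalise_event("provider_a", {"type": "Shot", "minute": 63}, "camp-01")
b = normalise_event("provider_b", {"type": "SHOT_ON_GOAL", "minute": 63}, "camp-01")
print(a)
print(a == b)
```

The two providers describe the same shot with different labels, yet both end up as identical internal records, which is the property the data architect is describing.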

FA staff apply a range of techniques to this ever-growing data lake, including machine learning, regression, clustering and autoencoder functions. Originally the data team had to export data outside of BigQuery for this sort of analysis. It was a big jump forward being able to do the modeling within BigQuery.

"It just allows for much tighter versioning on data and models. Being able to track metrics and variables of the data set in a familiar UI is a benefit," said a spokesperson. "Tracking data is a very complex data structure and using BigQuery ML and Autoencoder allows us to simplify the process within various applications and use cases."

Getting data into the dugout

This modelling work is geared towards "implementations you would expect a football organisation to do - expected goals (xG) and expected threat (xThreat), pitch value modelling and control, etc... All that revolves around ML and regression analysis."
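For readers unfamiliar with expected goals, the idea is to estimate the probability that a given shot is scored. The sketch below fits a deliberately tiny logistic regression on a single feature, distance to goal, using plain gradient descent. It is an illustration under stated assumptions, not the FA's model: the shot data is invented, real xG models typically use many more features, and the learning rate and epoch count are arbitrary choices.

```python
import math

# Toy, invented training data: (distance to goal in metres, 1 if the shot scored).
SHOTS = [(6, 1), (8, 1), (11, 0), (12, 1), (16, 0), (20, 0), (25, 0), (30, 0)]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_xg(shots, learning_rate=0.1, epochs=5000):
    """Fit P(goal) = sigmoid(w * distance / 10 + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(shots)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for distance, scored in shots:
            x = distance / 10.0  # scale the feature to keep the steps stable
            error = sigmoid(w * x + b) - scored
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


def expected_goals(distance: float, w: float, b: float) -> float:
    """Estimated scoring probability for a shot from the given distance."""
    return sigmoid(w * distance / 10.0 + b)


w, b = fit_xg(SHOTS)
close_range = expected_goals(6, w, b)
long_range = expected_goals(30, w, b)
print(f"xG from 6m: {close_range:.2f}, xG from 30m: {long_range:.2f}")
```

The fitted weight on distance comes out negative, so the model rates a close-range shot as more likely to score than a long-range one, which is the basic intuition behind xG.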

Meanwhile, he explains, clustering is used for analyzing general patterns of play, such as passes. "We've applied ML techniques such as clustering to tactical analysis with great success."
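As a rough picture of what clustering passes can mean, the sketch below runs a basic k-means over passes represented as (forward distance, sideways distance) vectors in metres. The pass data, the two-dimensional representation, and the choice of two clusters with hand-picked starting centroids are all assumptions made for the example; the article does not say which clustering method or features the FA uses.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, iterations=10):
    """Basic k-means: assign each point to its nearest centroid, then re-average."""
    centroids = list(initial_centroids)
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Invented passes as (metres forward, metres sideways).
PASSES = [(4, 1), (5, -1), (6, 0), (29, 9), (30, 11), (31, 10)]

labels, centroids = kmeans(PASSES, initial_centroids=[PASSES[0], PASSES[-1]])
print(labels)
print(centroids)
```

On this toy data the algorithm separates short passes from long diagonal ones, the kind of grouping an analyst could then inspect as a pattern of play.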

Needless to say, the data flow is spiky – most of it will be concentrated around the weekend and midweek fixtures, ten months of the year, with additional spikes around training camps, and of course, international matches and tournaments. That makes the elasticity and scalability of GCP attractive. The scalability of the Google platform also means the team can focus its efforts on analysis, he says, rather than on more formal DevOps types of work.

But while the performance staff can perform ever more sophisticated functions, this counts for nothing if it doesn't get in front of the coaches and managers. They're the ones making the actual decisions about who, and what, goes on the pitch. So the FA team uses Tableau to provide "intuitive" representations of the data, in part using a library of pre-built charts, reports and dashboards.

The coaches also use third party platforms such as sports performance analysis platform Hudl and Academy Soccer Coach. Access to both the in-house analytics and third party apps is managed using Google Cloud's single sign-on (SSO) technology.

This all creates a "common model", FA CIO Craig Donald says, which allows performance staff to think and visualize in different ways.

The aim, he says, is to provide the managers and coaches with, "A very concise summary of a player's performance, and the ability to model a squad based on the up-to-date performance they see across all the eligible players in the team."

So, has the team finally cracked the secret of mining data and turning it into winners' medals? We'll see over the course of this year.

But as Donald says, the task is never-ending. There's always more data, and there are always new players working their way up the football pyramid. In fact, the breakout stars of World Cup 2030 are probably already in there. Maybe not in arrays of assists, expected goals, and hours of video. But you can be sure they're flexing their muscles in the form of those digitized scouts' notes.

Sponsored by Google Cloud.
