
How I made a Chrome extension for converting Reg articles to UK spelling

Long live the King's English or something like that

Hands On: The Register began life in London in 1994 and today has journalists and other staff all over the world, which is to say San Francisco, Sydney, Singapore, Berlin, and beyond.

It used to be that our vultures wrote in their local style: Americans used US spellings, the British relied on UK spellings, the Australians were pretty much in UK-mode, too, and everyone else did what felt natural. If an article was, for instance, written by the US team and published in the UK morning, the spelling would often be changed to British.

As we added more writers and editors around the world, and reached more people globally – now about 40 million unique readers a year – across all sorts of time zones, it became necessary for logistical and professional reasons to agree on just one consistent style for the whole site. After The Register moved from a .co.uk to a .com during the pandemic, we chose American spelling.

Why? Because it honestly reflects the true global nature of our readership. UK spelling may make us appear UK-only. Yes, we grew from Great Britain, but today we cater for as many enterprise tech folks as we can around the planet. We want our flavor of irreverent, informed technology journalism to continue spreading far and wide, so that we can do our bit to challenge vendors, explain what's going on, and bite the hand that feeds IT. That's not changing.

The US spelling isn't everyone's cup of tea, though, or at least that's the impression we got upon reading brick-centered notes hurled through our digital windows lately.

As an olive branch to that passionate segment of our readership with a preference for the King's English, this vulture decided to create a Chrome browser extension to change the words in published articles from US to UK spellings.

It's called Spellerizer, because I was going for subtly stupid in the hope of standing apart from the sprawling, unrepentant witlessness that is the internet's new normal. Also, the daft name serves as a reminder that the extension does not work all that well. But hey, it's free.

Translation, even between US and UK spellings, is an art that requires consideration of context. Spellerizer is not that artful; it relies on a brute-force search-and-replace algorithm that pays no attention to the words around each match. So if it sees "check," it will replace the US spelling with "cheque," even if the appropriate UK spelling in that instance also happens to be "check."
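To make that limitation concrete, here is a minimal sketch of context-free replacement. The function name naiveSwap and the two-entry word list are made up for illustration and are not taken from Spellerizer's source; the point is only that a lookup by word alone cannot tell the verb "check" from the noun.

```javascript
// Hypothetical two-entry word list; the real extension loads a much larger data file.
const usToUk = new Map([
  ["check", "cheque"],
  ["color", "colour"],
]);

// Replace every listed word, paying no attention to the words around it.
function naiveSwap(text, pairs) {
  return text.replace(/[A-Za-z]+/g, (word) => pairs.get(word) ?? word);
}

console.log(naiveSwap("Please check the color of the check.", usToUk));
// prints "Please cheque the colour of the cheque."
```

The first "check" is a verb and should have been left alone, which is exactly the kind of mistake a context-aware approach would avoid.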

Really, I should have relied on a machine learning model to make more informed spelling changes. But then this wouldn't have been a weekend coding project. Feel free to submit improvements via GitHub.

My hope in creating Spellerizer is to demonstrate that it's reasonably easy to write a browser extension and to encourage those modestly familiar with JavaScript to try their hand at it.

Of course there are similarly approachable options, such as shell, Perl, or Python scripts to fetch, translate, and spit out pages. But the browser is a particularly important bit of software and is worth customizing if you enjoy writing code.

Spellerizer can be downloaded and installed from the Chrome Web Store or from its GitHub repo. The former is a better option if you want it to persist and receive updates – some people report that Chrome removes manually loaded (unpacked) extensions on browser restart as a security precaution. However, that's not been my experience.

As I haven't had an extension in the Chrome Web Store before, you can expect to see a warning – "This extension is not trusted by Enhanced Safe Browsing." – on the chrome://extensions/ page post-installation. Google explains, "For new developers, it generally takes a few months to become trusted."

Spellerizer is not officially endorsed or supported by The Register or its publisher, Situation Publishing, which makes no warranty about its fitness or function. As the Chrome Web Store listing says, you probably don't need it. But if you want it, if you really really want it, my editors are okay with the extension's existence for the time being.

Google's Chrome Extension API documentation is a good place to get familiar with the quirks of building a browser extension. Once you have a basic sense of how the major components (the manifest file, the service worker, content scripts, and other extension-related web pages) relate to one another, it's worth installing an extension like Chrome extension source viewer (CRX) to view extension source code from the Chrome Web Store prior to download.

It's possible to view the source code of an extension that's already installed, but it takes a bit more effort because you have to know the path to Chrome's extension folder (chrome://version/ -> [Profile Path field]/extensions) and then recognize the correct 32-character identifier used for the extension's directory name among others that may be present.

Spellerizer is perhaps more complicated than it needs to be because I chose to implement internationalization – a way to replace visible text strings in the extension with translated text based on the browser's set language. The localize.js script gathers text values from HTML page elements marked with the "data-i18n" attribute and then substitutes translations pulled from the messages.json file.
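For readers who want to try the same pattern, here is a rough sketch of how such a localization pass might look. It is not the actual localize.js: the function name localizePage is invented, and it assumes the value of each data-i18n attribute is a key in _locales/<language>/messages.json, looked up through chrome.i18n.getMessage.

```javascript
// Replace the text of every element carrying data-i18n with its translation.
// getMessage defaults to chrome.i18n.getMessage, which reads messages.json
// for the browser's language and returns "" when a key is unknown.
function localizePage(root, getMessage = (key) => chrome.i18n.getMessage(key)) {
  let replaced = 0;
  for (const el of root.querySelectorAll("[data-i18n]")) {
    const key = el.getAttribute("data-i18n"); // assumed to hold the message key
    const message = getMessage(key);
    if (message) {
      el.textContent = message;
      replaced += 1;
    }
  }
  return replaced; // number of elements that received a translation
}

// In an extension page, run the pass once the DOM is ready.
if (typeof document !== "undefined" && globalThis.chrome?.i18n) {
  document.addEventListener("DOMContentLoaded", () => localizePage(document));
}
```

Leaving the original text in place when no translation is found means a missing key degrades to the default-language string rather than a blank element.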

The extension, written using Manifest v3, functions by loading the service worker, background.js, and showing onboarding-page.html upon installation. Preliminaries out of the way, it adds a listener function to the Spellerizer icon, which, if you followed the instructions and pinned it to the browser bar, will be visible without poking around the Extensions popup menu.
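A service worker along those lines might look roughly like the sketch below. This is an illustration rather than Spellerizer's background.js: the helper names are invented, the hostname check is an assumption, and it presumes the manifest declares a background service worker, an action without a popup (chrome.action.onClicked only fires in that case), and the scripting and activeTab permissions. The chrome object is passed in as a parameter purely so the logic can be exercised outside a browser.

```javascript
// True when the URL belongs to theregister.com or one of its subdomains.
function isRegisterUrl(url) {
  try {
    const host = new URL(url).hostname;
    return host === "theregister.com" || host.endsWith(".theregister.com");
  } catch {
    return false; // missing or unparsable URL
  }
}

function registerBackgroundHandlers(chromeApi) {
  // Show the onboarding page once, on first install (not on updates).
  chromeApi.runtime.onInstalled.addListener((details) => {
    if (details.reason === "install") {
      chromeApi.tabs.create({
        url: chromeApi.runtime.getURL("onboarding-page.html"),
      });
    }
  });

  // Inject the content script when the toolbar icon is clicked on a Register page.
  chromeApi.action.onClicked.addListener((tab) => {
    if (!isRegisterUrl(tab.url)) return;
    chromeApi.scripting.executeScript({
      target: { tabId: tab.id },
      files: ["content-script.js"],
    });
  });
}

// Inside a real service worker the global chrome object is available.
if (globalThis.chrome?.runtime) registerBackgroundHandlers(globalThis.chrome);
```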

The listener function attached to the Spellerizer's vulture icon triggers the content-script.js file, which does the bulk of the work. If clicked while viewing a Register page, it fetches the spelling data file, spelling_data.json (made available by developer Heiswayi Nrird under an MIT license), and iterates through all the DOM nodes – a tree structure used to organize the elements on a web page.

The approach I took was not particularly sophisticated – I have since learned that there's a more concise way to iterate DOM nodes using a TreeWalker object – but loops within loops get the job done. The scanWords function (line 20, content-script.js) separates display text associated with DOM nodes into individual words and compares each to every one of the 1,700 or so US/UK word pairs. If it finds a match, it swaps the US spelling for the UK one, preserving the case of the original.
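For comparison, the TreeWalker route could be sketched as follows. This is not the extension's scanWords function: it swaps the nested loops for a Map lookup, the names matchCase, swapWords, and swapInDocument are invented, and the two word pairs stand in for the real data file, whose format is not shown here.

```javascript
// Hypothetical word pairs standing in for the real spelling data.
const pairs = new Map([
  ["color", "colour"],
  ["center", "centre"],
]);

// Copy the original word's capitalization pattern onto the replacement.
function matchCase(original, replacement) {
  if (original === original.toUpperCase()) return replacement.toUpperCase();
  if (original[0] === original[0].toUpperCase()) {
    return replacement[0].toUpperCase() + replacement.slice(1);
  }
  return replacement;
}

// Swap known words in a string; returns the new text and the number of swaps.
function swapWords(text, wordPairs) {
  let count = 0;
  const result = text.replace(/[A-Za-z]+/g, (word) => {
    const uk = wordPairs.get(word.toLowerCase());
    if (uk === undefined) return word;
    count += 1;
    return matchCase(word, uk);
  });
  return { text: result, count };
}

// Browser-only: visit every text node under root with a TreeWalker.
function swapInDocument(root, wordPairs) {
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  let total = 0;
  while (walker.nextNode()) {
    const node = walker.currentNode;
    const parentTag = node.parentNode?.nodeName;
    // Leave code and stylesheets alone; "color" in CSS must stay "color".
    if (parentTag === "SCRIPT" || parentTag === "STYLE") continue;
    const { text, count } = swapWords(node.nodeValue, wordPairs);
    if (count > 0) node.nodeValue = text;
    total += count;
  }
  return total;
}
```

In a content script, calling swapInDocument(document.body, pairs) would return the total number of swaps. Because the lookup is keyed on the lowercased word, each word costs one Map access instead of a pass over every pair.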

If the script finds any words to replace, it will update the extension icon's badge with a number representing swapped word count. The badge disappears when you click to a different browser tab. And if you reload a Register page, any changes made disappear as they're only client-side.
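One way to wire up such a badge is sketched below. Content scripts cannot call chrome.action directly, so this version assumes the content script reports its count to the service worker with chrome.runtime.sendMessage; the "swap-count" message shape and the function names are hypothetical, and the real extension may do this differently.

```javascript
// Convert a swap count into badge text; an empty string clears the badge.
function badgeTextFor(count) {
  return count > 0 ? String(count) : "";
}

// Service-worker side: listen for counts reported by content scripts.
// The content script would send, for example:
//   chrome.runtime.sendMessage({ type: "swap-count", count: 12 });
function registerBadgeUpdater(chromeApi) {
  chromeApi.runtime.onMessage.addListener((message, sender) => {
    if (message?.type !== "swap-count" || !sender.tab) return;
    // Passing tabId scopes the badge to the tab that reported the count.
    chromeApi.action.setBadgeText({
      text: badgeTextFor(message.count),
      tabId: sender.tab.id,
    });
  });
}

if (globalThis.chrome?.runtime) registerBadgeUpdater(globalThis.chrome);
```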

There's also an options page that I added to test persisting data via the local storage API. Accessible via a ctrl-click on the Spellerizer icon, the options menu has a single checkbox labeled "World Peace."
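Persisting a single checkbox can be sketched in a few lines. This assumes "local storage API" means chrome.storage.local (which needs the storage permission in the manifest) rather than window.localStorage; the key name worldPeace and the element id world-peace are invented for the example.

```javascript
const OPTION_KEY = "worldPeace"; // hypothetical storage key

// Read the saved value, falling back to false when nothing is stored yet.
async function loadOption(storageArea) {
  const stored = await storageArea.get({ [OPTION_KEY]: false });
  return stored[OPTION_KEY];
}

// Write the checkbox state to storage.
async function saveOption(storageArea, checked) {
  await storageArea.set({ [OPTION_KEY]: Boolean(checked) });
}

// Restore the checkbox on load and save it again whenever it changes.
async function bindCheckbox(checkbox, storageArea) {
  checkbox.checked = await loadOption(storageArea);
  checkbox.addEventListener("change", () =>
    saveOption(storageArea, checkbox.checked)
  );
}

// In the options page, bind the (hypothetically named) checkbox element.
if (typeof document !== "undefined" && globalThis.chrome?.storage) {
  document.addEventListener("DOMContentLoaded", () => {
    bindCheckbox(document.getElementById("world-peace"), chrome.storage.local);
  });
}
```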

As you might imagine, it has no effect. Enjoy. ®
