What happens when you host code and git clone turns into a DDoS? Let's ask SourceHut
OK Google, whatever you're doing, please don't
Git-based code host SourceHut will not have to block the Go Module Mirror as planned, after Google took notice of its complaints.
Here's the situation: for the past two years, SourceHut has struggled to deal with the amount of data demanded by the Go module proxy when developers use that tool to fetch repositories from the biz via git clone operations.
In Google's Go programming language, modules consist of sets of Go packages with specific versions bundled together. Running the go get command from a command line interface fetches the requested packages with any new dependencies declared in the module.
Gathering this code from version control can cause latency and can tax storage because the command may scour the entire commit history of a repo with a transitive dependency – whether built or not – to resolve the version.
The Go Module Mirror is supposed to work faster by requesting only the specific metadata or source code it needs.
"A module mirror is a special kind of module proxy that caches metadata and source code in its own storage system, allowing the mirror to continue to serve source code that is no longer available from the original locations," the Go documentation explains. "This can speed up downloads and protect you from disappearing dependencies."
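To make the mirror's role concrete, here is a minimal sketch of how a client might form the request URLs that a module proxy answers. It assumes the endpoint layout described in the Go module reference (a version list plus .info, .mod and .zip files under /@v/) and its rule that uppercase letters in a module path are written as an exclamation mark followed by the lowercase letter. The helper names escapePath and proxyURL, the module path and the version are hypothetical, chosen only for illustration, and the sketch does not escape version strings.

```go
package main

import (
	"fmt"
	"strings"
)

// escapePath applies the case-encoding assumed for proxy URLs:
// each ASCII uppercase letter becomes '!' plus its lowercase form.
func escapePath(s string) string {
	var b strings.Builder
	for _, r := range s {
		if r >= 'A' && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + ('a' - 'A'))
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// proxyURL joins a proxy base URL, an escaped module path and an
// endpoint such as "list" or "v1.0.0.zip". The endpoint is used as given.
func proxyURL(base, modulePath, endpoint string) string {
	return strings.TrimSuffix(base, "/") + "/" + escapePath(modulePath) + "/@v/" + endpoint
}

func main() {
	base := "https://proxy.golang.org"
	mod := "git.sr.ht/~example/Project" // hypothetical module path

	// The four kinds of resource a client may request for a module.
	for _, ep := range []string{"list", "v1.0.0.info", "v1.0.0.mod", "v1.0.0.zip"} {
		fmt.Println(proxyURL(base, mod, ep))
	}
}
```

Each of these is a plain HTTP GET against the mirror, so a developer's machine never has to contact the code host directly. The mirror itself still has to fetch from the origin repository to fill and refresh its cache.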
Alas, the proxy proved to be impolite, asking for more data than a small code hosting firm could reasonably afford to bear. A year ago, Drew DeVault, founder of SourceHut, likened the situation to a distributed denial of service attack. And last month, he resolved to ban the Go Module Mirror over its excessive refresh traffic to SourceHut repos.
Finally, DeVault's two-year crusade – documented in detail as a GitHub Issue post – has produced results. On Tuesday, in an update to his January 9 post, he said that Russ Cox from the Go team had got in touch. After some discussion, the Go team plans to revise its go command line tool to support a -reuse flag, which will reduce the traffic created by fetching modules.
"In the meantime, the automated refresh traffic from proxy.golang.org was disabled for SourceHut, which the Go team assures us should have little to no impact on users and which reduces the burden on our system to a manageable level," explained DeVault.
He also suggested that the Go team has acknowledged that it is responsible for demanding too much from small data hosts.
"The Go team has decided that the automatic refresh behavior is their responsibility, not the responsibility of other operators, so any other small hosts will hopefully not be affected as the Go team will enable or disable the refresh behavior at their discretion with the burden on third-party operators in mind," he said.
So the Go ban plan is a no-go. Go traffic to git.sr.ht once again has a green light.
A Google spokesperson declined to comment, saying only that the details in the SourceHut blog post speak for themselves. ®