Meltdown ahoy!: Net king returns to save the interwebs
Cometh the hour. Cometh the Van. Again
When you need to save the internet, who ya gonna call? Van Jacobson.
Long before Facebook snared five million users, before Gmail revolutionized web email by stuffing inboxes with free storage, and years before Jim Clark and Marc Andreessen developed Netscape as the first commercial browser, the internet couldn't cope.
It was 1986 and more and more government and academic networks were linking up to form what became known as the internet. But more computers meant more chatter, and the networks began suffering a series of congestion collapses under all the talk. Data packets got lost or delayed as the computers choked the bandwidth, causing slowdowns and dead zones.
The culprit was the fledgling TCP/IP protocol, which had been installed on ARPANET and other large networks connecting to it. TCP/IP was the networks' lingua franca. The problem was that TCP/IP hadn't been built for such huge scale, and computers running the protocol simply didn't know when to stop talking. They shoved their bits and bytes online, regardless of bandwidth, throughput, or whether packets were actually being received.
Jacobson: web privacy is not a compromise
Enter Jacobson, who had been a primary contributor to TCP/IP since 1978. He'd spotted the slowdown when the link between his Lawrence Berkeley National Laboratory and the University of California at Berkeley — which were 400 yards and two IMP hops apart — dropped from 32 Kbps to 40 bps.
Jacobson devised a congestion-avoidance algorithm that lets a computer determine how much bandwidth the network can bear. In essence, a machine increases its throughput until packet loss occurs, then backs off and probes again. He worked with a team who supplied additional algorithms, and his fix was applied as a patch to machines across the network by an army of sysadmins. It was so successful that it became incorporated into the TCP/IP stack. He also authored the Congestion Avoidance and Control paper (SIGCOMM '88), which has become seminal reading.
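The principle behind that fix can be sketched in a few lines. This toy simulation is only illustrative — the capacity figure and the packet-free loop are assumptions, not the real algorithm's mechanics — but it shows the characteristic sawtooth: the sender's window grows steadily until the path is overdriven, then is cut back to probe again.

```python
# Toy simulation in the spirit of Jacobson's congestion avoidance:
# additive increase while the network is happy, multiplicative
# decrease when it isn't. Capacity and units are invented for the demo.

def simulate(capacity=20, rounds=40):
    """Return the congestion window after each round of AIMD."""
    cwnd = 1.0          # congestion window, in packets
    history = []
    for _ in range(rounds):
        if cwnd > capacity:      # sending faster than the path allows
            cwnd = cwnd / 2      # multiplicative decrease on "loss"
        else:
            cwnd += 1.0          # additive increase while all is well
        history.append(cwnd)
    return history

trace = simulate()
print(trace[:6])   # window climbs: [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
```

The result is that each sender backs off just enough, collectively keeping the link loaded without collapsing it.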
Jacobson's work is credited with having helped save the internet. Today, we have 800 million nodes online, according to the Internet Systems Consortium, versus the paltry 28,000 in 1986, when the internet was struggling to accommodate the demand for bandwidth.
Jacobson didn't just help save the web from a technology perspective. He saved its reputation at a critical time, as it was slipping from the protected world of government and academia. Had there been no fix, it's possible that an unforgiving private sector and the general public would have grown tired of the net's unreliability and moved on. We wouldn't have the internet we have today.
More than twenty years on, the internet is on the cusp of yet another congestion and scale crisis, thanks to an explosion of content on sites such as Facebook and YouTube, and the proliferation of devices used for accessing and downloading that material: smart phones, tablets, and netbooks. Service providers are worried.
According to industry group the GSM Association, the increase in mobile phones with internet access will at least quadruple the amount of online traffic within the next three years.
The International Telecommunication Union (ITU) says there will be five billion cell-phone subscriptions by the end of this year on a planet of 6.8 billion people. Mobile broadband subscriptions are expected to break one billion — up from 600 million in 2009.
Service providers are spooked by this anticipated demand and what it means for their aging networks. They feel they must charge us more. Spanish carrier Telefonica recently told the European Union that it might have to "optimize scarce shared network resources" to cope, while Nokia Siemens Networks said "we have to be fair, and we cannot beat the laws of physics" — in other words, charge or de-prioritize free services like Skype.
Unlimited hits its limit
Meanwhile, AT&T plans to spend $2bn this year upgrading its networks, adding thousands of new cells and fiber lines for better 3G speeds. But even as they upgrade, providers are attempting to cut back on bandwidth. This past June, three years into selling the iPhone, AT&T killed all-you-can-eat data plans, and Verizon has capped its data plans ahead of offering the iPhone on its network.
It's a situation that will alarm fans of net neutrality as well as consumers and businesses who resent paying more for the same service.
Just like in 1986, it's TCP/IP that's the problem. But Jacobson — a former chief scientist for network giant Cisco Systems and for Packet Design — has a new beef, and it goes beyond just making the technology work better: privacy.
The terror of Facebook
Tired of hearing tech companies belittle your concerns about privacy online? Telling you that you have no privacy on the web and to "get over it" while they sell service providers more servers or expose more of your data to advertisers? So is Jacobson. "I don't like that — that's an architecture failing — it doesn't need to be," Jacobson told us recently. "It terrified me — my daughter is on Facebook and I cringe because their default is to expose everything."
McNealy: you have no privacy on the internet — get over it
To be fair, Jacobson isn't too hard on Facebook's chief executive Mark Zuckerberg or Sun Microsystems' former chairman and founder Scott McNealy, who said you have no privacy online. He blames the technology cards they've been dealt. In other words: TCP/IP.
"Everybody has to build with the tools that they've got. Facebook has the internet as their TCP/IP model and the context of Scott's comment was the TCP/IP model," he told us. "We are trying to add to the toolbox and add a set of tools that let you do different models."
Jacobson is now proposing a fundamental shake-up to the way the internet is architected, to solve not just the scale problem but also to put privacy and disclosure in the hands of users.
He proposes to reduce network load by redistributing where content is stored online, moving it away from service providers' overloaded central servers and networks, while also allowing content creators — that's you — to set access controls and say who sees what. His idea is called Content Centric Networking (CCN), and it's impossible to implement using TCP/IP.
"One of my biggest worries about the internet is — structurally — it's hard to do a security architecture because the nature of how you secure calls is always going to be hard. Securing the content is easy — but it requires a shift in thinking," Jacobson said.
"CCN is trying to make that model where you name the content at the low level rather than the high level... Starting with that model it's real easy to do content-focused security, because you can start to name the things that are important."
It's an idea Jacobson has been evangelizing for at least a half-decade, but it will finally start becoming reality in 2011. We first wrote about CCN on the 40th anniversary of Xerox PARC, but decided it was worth hearing more from Jacobson and getting an update.
A project of Xerox company PARC — where Jacobson's been a research fellow since 2006 — CCN received funding in September from an $8m US National Science Foundation (NSF) award looking at the future of the web. CCN falls under the Named Data Networking (NDN) architecture project, which aims to make the web "more usable."
The idea is to achieve this by focusing on the data people want, and not where the data's based. In a TCP/IP network, the focus is on where the data lives — endpoints like the server.
How serious is this? It was the NSF who in 1986 initiated development of NSFNET, which started as a project to connect five US universities via a high-speed network. It plugged into ARPANET and — for a while — was a major internet backbone connecting 4,000 institutions and 50,000 networks across the US, Canada, and Europe.
The NSF cash will go to work solving basic problems such as fast forwarding, trust, network security, content protection, and privacy — in short: a new communications theory.
Jacobson and his PARC team have produced early protocol specifications released as an open-source implementation called CCNx, used in NDN. Separately, PARC is talking to network, consumer, and cellular service providers about using the technology in the near term.
CCNx contains early protocols that the project's website stresses are still experimental and may change. These cover a transport protocol based on named data rather than packet addresses, and basic naming conventions that assign meaning to names via application, institution, and/or global conventions. You can see the rest here.
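To give a flavour of the naming idea: CCN names are hierarchical, like file paths, so a request for a name prefix can match any content published beneath it. The sketch below is a simplification under stated assumptions — real CCNx names are binary component lists with their own conventions for versions and segments, and the example names are invented.

```python
# Simplified CCN-style hierarchical names as /-delimited strings.
# Real CCNx names are binary component lists; this is illustrative only.

def components(name):
    """Split a /-delimited content name into its components."""
    return [c for c in name.split("/") if c]

def is_prefix(prefix, name):
    """True if `prefix` names `name` itself or an ancestor of it."""
    p, n = components(prefix), components(name)
    return n[:len(p)] == p

# A request for the video as a whole matches any of its chunks:
assert is_prefix("/parc/videos/demo", "/parc/videos/demo/v1/chunk0")
assert not is_prefix("/parc/videos/demo", "/parc/photos/demo")
```

Because names carry the structure, a router can satisfy a request by matching on the name alone, with no idea of where the bytes physically live.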
TCP/IP: a success disaster
The NSF's $8m means that PARC, working on NDN with nine universities including the University of California, Los Angeles, can now fund the engineering to build out Jacobson's concept. The immediate priorities are intelligence, infrastructure security, and internet routing — making the net more robust, more expressive, and less in need of configuration.
Given that TCP/IP has had a good 35 years to mature since it was co-drafted by Vint Cerf and Robert Kahn in 1974, you'd think that all the kinks had been knocked out. You'd also assume that as a prime contributor since 1978, Jacobson would be happy with the state of things. But no.
TCP/IP's success is that it unified ARPANET with other large networks like NSFNET over public telephone lines and laid the foundations of today's internet. TCP/IP replaced closed protocols devised by different government and research operations that had used their own addressing and encapsulation structures — such as ARPANET's Network Control Protocol (NCP) — with something that was infinitely more open, efficient, and flexible. The military officially "turned on" TCP/IP on ARPANET on 1 January 1983 and TCP/IP went on to provide a "terrific way of doing networking" according to Van Jacobson.
The problem is that TCP/IP's produced what Jacobson calls a "success disaster."
When Michael Jackson killed the internet
TCP/IP is a "success" because it provided a ubiquitous communications infrastructure where anything can talk to anything. It's a "disaster" because TCP/IP is not built to handle today's wealth of data, unlimited numbers of users, or mobile computing. TCP/IP comes from a world of a few, fixed PCs used by lots of users processing a relatively small quantity of data. As such, TCP/IP connects one endpoint to another using a stable, known IP address.
This is a "conversational" model borrowed from the phone system, where the endpoints are trusted and known. According to Jacobson, the problem is that people on the net aren't having "conversations" — despite what the Web 2.0 crowd say. Ninety-nine per cent of traffic is for named chunks of data — or content. People are downloading web pages or emails.
TCP/IP was not built to know what content people want, just to set up the conversation between the endpoints and to secure those connections. That's a problem because people can — and do — flock to the same servers to watch exactly the same video or get the same piece of information, and proceed to overload sections of the network and take sites down.
Connecting conversations: not the way today's web works
In the past, Jacobson has cited the example of an NBC network server severely congested with requests for 6,000 copies of the same piece of Winter Olympics video: US skier Bode Miller storming to a downhill medal. Everybody wanted the same video, but the NBC router had no idea. It thought it was handling 6,000 different conversations, not 6,000 requests for exactly the same piece of content.
More recently, in the summer of 2009, we saw the same effect when Google News, TMZ, Twitter, the LA Times, and other sites all slowed down or failed as people rushed the web to find out about one big event: the death of Michael Jackson. CNN claimed a fivefold rise in traffic in just over an hour, receiving 20 million page views in the hour the story broke.
Network overload isn't the only problem. Privacy is an issue too. Over on sites like Facebook, as you post content, you're offered such broad disclosure options that they really provide very little control. Your choices are friends, friends and acquaintances, or world + dog. These are not very accommodating if you want to share specific content, on a case-by-case basis, with only a select group of people — a video of your toddler walking meant just for the grandparents, a post about wearing your airline's uniform in an out-of-work context, or your contact details.
YouTube is similar. You can upload your video, but if you want only selected people to see it, then you have to make sure the recipients have a YouTube account — which suits YouTube's owner Google because it wants to serve more ads to as many people as possible. Otherwise, you can upload your videos to YouTube's "unlisted category", which won't put your video in YouTube's search results, but it does mean your video can be shared by anyone who happens to come across it. And I do mean anyone.
"We have these wonderful, useful web services like Twitter and Facebook and YouTube, but by their nature you got to make a lot of privacy compromises because they are aggregating the content in one place to distribute it," Van Jacobson told us. "That's because the architecture doesn't solve scalable content distribution."
Research dead end
One way around this would be to send that video of the kids directly to the grandparents, but then the ISP would shut you down for file-sharing. "The only way I can do that is to upload the videos to YouTube, but then I have to work in their business model and their privacy mode. I'd like to encrypt them and hand out the keys to the people," he said.
Another problem in the TCP/IP world is that hackers and spammers get a foot in the door. You may well have a secure connection to your bank's website, but what if the site's been compromised and the packet you're downloading contains a worm or a keystroke logger? TCP/IP doesn't know, because it doesn't know what the content is.
Jacobson reckons that network research in the US has failed to keep pace with any of this. Since the middle of the last decade, network research has been stuck in a dead end, when this should be a wonderful time thanks to ubiquitous wireless devices and phones, and a wealth of information made available and retrievable through things such as Google indexing.
CCN is his answer. CCN caches content as it passes along the network and remembers where it has been delivered. The next time somebody requests a hot story, CCN remembers where it was last downloaded and can direct you there, instead of to those central Google News, TMZ, Twitter, or LA Times servers. That copy could be on a local router or the smartphone of the guy sitting next to you on a plane.
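A minimal sketch of that caching behaviour: each node keeps a content store keyed by name, answers repeat requests locally, and forwards upstream only on a miss. The names and the stand-in "origin" server below are invented for illustration, not part of the CCNx protocol.

```python
# Sketch of CCN-style on-path caching. A node that has seen a piece of
# named content once can serve every later request for it locally.

class CCNNode:
    def __init__(self, upstream):
        self.store = {}          # content store: name -> data
        self.upstream = upstream # next hop toward the publisher

    def request(self, name):
        if name in self.store:               # cache hit: serve locally
            return self.store[name], "cache"
        data, _ = self.upstream(name)        # miss: fetch upstream
        self.store[name] = data              # remember it for neighbours
        return data, "upstream"

def origin(name):
    """Stand-in for the publisher's central server."""
    return f"<data for {name}>", "origin"

node = CCNNode(origin)
print(node.request("/cnn/news/big-story"))  # first request goes upstream
print(node.request("/cnn/news/big-story"))  # repeats never leave the router
```

In a flash-crowd event, the 6,000th request for the same video would be absorbed by caches near the edge rather than hammering one server.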
The project also includes content authentication, according to PARC, so the content's publisher or the author can wrap it with their credentials to verify that the content is genuine.
When it comes to privacy, it's the data not the server endpoint that is encrypted. That gives the author the ability to say who's allowed to view their work. Data is signed cryptographically, so there's a notion of an identity provider, with the crypto key itself treated as just another content object. You, the content creator, manage and set identities for recipients of your data while CCN handles the encryption.
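A rough sketch of that shift — securing the data rather than the connection — looks like this. Real CCN uses public-key signatures, and keys are themselves distributed as content objects; the shared HMAC key and the names here are simplifications for illustration only.

```python
import hashlib
import hmac

# Illustrative content-centric security: the object itself carries a
# name and a signature, so any copy is verifiable no matter which
# cache delivered it. Shared-key HMAC stands in for real signatures.

def publish(name, payload, key):
    """Wrap a payload as a signed content object."""
    sig = hmac.new(key, name.encode() + payload, hashlib.sha256).digest()
    return {"name": name, "payload": payload, "sig": sig}

def verify(obj, key):
    """Check that the name and payload match the signature."""
    expect = hmac.new(key, obj["name"].encode() + obj["payload"],
                      hashlib.sha256).digest()
    return hmac.compare_digest(expect, obj["sig"])

key = b"grandparents-shared-secret"
obj = publish("/home/videos/toddler", b"<video bytes>", key)
assert verify(obj, key)          # a genuine copy checks out anywhere
obj["payload"] = b"<tampered>"
assert not verify(obj, key)      # a tampered copy is rejected
```

Only people holding the key can verify (and, with encryption layered on, view) the content — which is the all-or-nothing Facebook problem solved at the architecture level.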
Vint Cerf: no idea what he'd unleashed with TCP/IP
That has a big pay-off for media companies concerned about digital rights management, but also for private individuals who don't like that all-or-nothing approach offered by Facebook and YouTube. Now you have more control over videos and posts fed to Zuckerberg's beast in the sky.
"Architecturally, you can make privacy the default, unlike now where privacy is expendable," Van Jacobson told us. "If you make the communications very simple and very robust, you can make access to the information highly controlled."
These are early days for CCN. The CCNx code is experimental and NDN is only just starting to think about turning Jacobson's idea into something that might actually work on a web scale. PARC has done proofs of concept at its Silicon Valley campus, and it's working with those unnamed commercial partners, who it reckons have bought into the CCN vision.
The real problem will be overcoming inertia — the inertia of those controlling the internet who feel they have too much time and money invested in the TCP/IP infrastructure to change, that they can simply throw a few billion dollars' worth of servers at the situation while recouping cost through tiered data plans for iPhone and iPad users.
There's also the potential inertia of Facebook and others, who quite like the commercial aspects of sharing your data with third parties. CCN doesn't just promise to change the internet's architecture. It also promises to alter who controls content online, and it will kill the carpetbagging business model of the web giants that make free money off user-generated content.
There's also the scale aspect. This is 2010, not 1988. There are hundreds of millions of nodes, not tens of thousands. It'll take more than a simple patch applied by willing sysadmins to fix this one. If we're talking about a major upgrade to the way the entire internet runs, we're talking about a fundamental shift in how the internet is built — right down to the packet level.
According to Jacobson, CCN should at least have a low barrier to adoption from a technology standpoint. CCN has been architected to have the same universality as IP, which runs over Ethernet, Wi-Fi, and phone lines. CCN runs over these plus broadcast.
CCNx just hit its first mobile platform — Google's Android. PARC said Android — a Linux operating system that uses Java — was first on the list because it's "very compatible" with CCNx's code base — CCNx has a mix of portable C code and Java. The fact that PARC is engaged in research collaboration with handset maker Samsung also helped. Samsung, the world's second-largest mobile phone maker, is an Android convert that expects to sell a million of its Android-based Galaxy tablets in the coming year.
The future, again?
Van Jacobson, who recently showed us chat, email, and video apps all working with encryption over CCN, told The Reg: "You can deploy this over existing infrastructure. The web proxy targets CCN with a bog standard browser." The idea is not to rip out TCP/IP and start over.
It's a comment on Jacobson's drive and commitment that one of those who helped build TCP/IP should still be gnawing away at a system he helped refine nearly 30 years ago. Since those early days of beige flares and analog phones, Jacobson has devised algorithms to improve TCP/IP's performance over slow serial links; written network diagnostics tools traceroute, pathchar, and tcpdump; and worked on standards for Voice over IP and for multimedia.
With the funding now here to prove his idea on a web scale, will CCN be as big for the internet as TCP/IP was and as big as his subsequent work to help TCP/IP scale in 1986? "Ask me in six months... I'll give you an answer. They are working on that now," Jacobson says of the researchers now starting to work out how to deliver his vision.
Bear in mind that back in the mid 1970s, Jacobson, with Cerf and Kahn, couldn't have imagined their work on TCP/IP would go on to revolutionize the planet, either. "It was a little thing that grew, and it grew slowly at first," Jacobson said of TCP/IP.
"Most people's perceptions of the internet was it happened in 1995, to others it happened around 1975. It stayed small until there was more general understanding and exposure to the ideas — and how simple and powerful the ideas were and what you could build with them. We hope we will follow in those footsteps, but only history will be the judge of that." ®