Sysadmin blog Next year will see the 20th anniversary of IEEE 802.1Q, the standard that defines the tagged VLAN for Ethernet networks. Despite it being two decades since modern VLANs started being used in anger, a significant number of systems administrators remain afraid of them. Unfortunately, the time is upon us where VLANs are becoming a necessity even in small businesses.
VLANs are used to simulate the physical segmentation of networks without having to actually create separate physical networks. Let's say, for example, that I had two networks that were both set up to be 10.0.0.0/24, and for some reason I needed to run them on the same physical infrastructure - perhaps because of a merger - but they couldn't actually be on the same network.
Before VLANs, this was impossible. You simply could not have two systems with the same IP address on the same physical network. With VLANs, you can, so long as at least one of the networks is set up to be on a separate VLAN.
To understand the technical bits, it's easiest to start with a little bit about how tagging works in practice.
Tagging in practice
Implementing VLANs requires understanding the three basic port types. Access ports only allow untagged packets. Trunk ports only allow tagged packets. Hybrid ports allow both kinds of packets.
On most networks, end user devices plugged into switches don't have to configure a VLAN to access the network. Your Windows workstation, for example, probably just plugs in, gets an IP address via DHCP and you're off to the races. This doesn't mean that the traffic from that port isn't going to participate in a VLAN, it just means that anything attached to that port doesn't need to worry about it.
One can configure a switch to tag all traffic on an untagged port as a given VLAN. So, for example, I could say that all traffic in and out of ports 1-8 was VLAN 10, all traffic in and out of ports 9-12 was VLAN 20, and ports 13-16 were trunk ports that only passed tagged traffic.
If I plugged computers into port 1 and port 8, they could talk to one another because they are both on VLAN 10. However, a computer on port 1 could not talk to one on port 9. This is because port 1's untagged traffic is set for VLAN 10 and port 9's untagged traffic is set for VLAN 20. Similarly, if I plugged a computer into any of ports 13-16, it couldn't talk to anything because those ports would simply drop untagged traffic.
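For those who think better in code, here is a minimal Python sketch of the 16-port example above. It is a toy model, not how any real switch is implemented: the Port class and the classify, build_switch and can_talk names are all made up for illustration, and I have assumed the trunk ports carry VLANs 10 and 20.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Port:
    """Hypothetical port model: mode is 'access', 'trunk' or 'hybrid'."""
    mode: str
    untagged_vlan: Optional[int] = None       # VLAN applied to untagged frames
    tagged_vlans: frozenset = frozenset()     # VLANs accepted when tagged


def classify(port: Port, tag: Optional[int]) -> Optional[int]:
    """Return the VLAN an incoming frame lands in, or None if it is dropped."""
    if tag is None:
        # Untagged frame: access and hybrid ports assign their untagged VLAN,
        # trunk ports (as defined in this article) drop it.
        if port.mode in ("access", "hybrid"):
            return port.untagged_vlan
        return None
    # Tagged frame: only trunk and hybrid ports accept it, and only for
    # VLANs they have been configured to carry.
    if port.mode in ("trunk", "hybrid") and tag in port.tagged_vlans:
        return tag
    return None


def build_switch() -> Dict[int, Port]:
    """Ports 1-8 on VLAN 10, ports 9-12 on VLAN 20, ports 13-16 as trunks."""
    ports = {}
    for n in range(1, 9):
        ports[n] = Port("access", untagged_vlan=10)
    for n in range(9, 13):
        ports[n] = Port("access", untagged_vlan=20)
    for n in range(13, 17):
        # Assumption: the trunks carry both VLANs from the example.
        ports[n] = Port("trunk", tagged_vlans=frozenset({10, 20}))
    return ports


def can_talk(ports: Dict[int, Port], a: int, b: int) -> bool:
    """True if ordinary (untagged) computers on ports a and b share a VLAN."""
    vlan_a = classify(ports[a], None)
    vlan_b = classify(ports[b], None)
    return vlan_a is not None and vlan_a == vlan_b


switch = build_switch()
print(can_talk(switch, 1, 8))    # True: both land in VLAN 10
print(can_talk(switch, 1, 9))    # False: VLAN 10 versus VLAN 20
print(can_talk(switch, 13, 1))   # False: the trunk drops untagged frames
```

Note that a tagged frame arriving on an access port is dropped in this model, which matches the "access ports only allow untagged packets" rule above.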
Also important is the concept of the native VLAN. When creating a switch fabric that includes VLANs, switches need to know which VLAN will be connected to ports otherwise unconfigured for VLANs. By default, this is VLAN 1, but this can be changed.
Trunk ports don't need to be switch-to-switch only. A virtual server, for example, is a good candidate for using a trunk port. There's a reasonable chance that different VMs operating on the server will use different VLANs.
Virtual switches can be set up to operate with VLANs. In most hypervisors one can create multiple VM networks, each with a different VLAN and attach them to a single virtual switch. That virtual switch is then connected to a physical network card (or cards) which are connected to physical switches.
Assigning a virtual machine to a VM network configured to a specific VLAN means that all untagged traffic from that VM will be tagged by the virtual switch as belonging to the relevant VLAN. Configuring a virtual switch to allow the guest VM to handle tagging is possible, but it's usually a bad idea.
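To make "tagged" concrete: an 802.1Q tag is four bytes inserted after the source MAC address. The first two bytes are the tag protocol identifier 0x8100, and the next two hold a 3-bit priority, a 1-bit drop-eligible flag and the 12-bit VLAN ID. The Python sketch below shows roughly what a switch (physical or virtual) does to an untagged frame. The function names are mine, and the frame is assumed to exclude the trailing checksum, which real hardware recalculates.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tagged frame


def add_vlan_tag(frame: bytes, vid: int, pcp: int = 0, dei: int = 0) -> bytes:
    """Insert an 802.1Q tag after the destination and source MAC addresses."""
    if not 1 <= vid <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    if not 0 <= pcp <= 7 or dei not in (0, 1):
        raise ValueError("PCP must be 0-7 and DEI must be 0 or 1")
    # Tag control information: 3 bits priority, 1 bit DEI, 12 bits VLAN ID.
    tci = (pcp << 13) | (dei << 12) | vid
    return frame[:12] + struct.pack("!HH", TPID_8021Q, tci) + frame[12:]


def read_vlan_id(frame: bytes):
    """Return the VLAN ID of a tagged frame, or None if it is untagged."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != TPID_8021Q:
        return None
    (tci,) = struct.unpack("!H", frame[14:16])
    return tci & 0x0FFF


# A made-up untagged frame: broadcast destination, locally administered
# source MAC, IPv4 EtherType and a dummy payload.
untagged = b"\xff" * 6 + bytes.fromhex("020000000001") + b"\x08\x00" + b"payload"
tagged = add_vlan_tag(untagged, vid=10)

print(tagged[12:16].hex())     # 8100000a
print(read_vlan_id(tagged))    # 10
print(read_vlan_id(untagged))  # None
```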
Personally, I tend to use hybrid ports for my virtual servers. I leave my management networking untagged and make all my VM traffic tagged. This way, if I screw something up in configuring VLANs I still can get to the management interfaces and rectify my problem.
To understand why guest VLAN tagging is rare, or why all ports aren't simply configured as hybrid ports, one must understand the security implications.
It's not hard for a server, switch or virtual machine to be configured to work with VLANs, so don't think that VLANs are some sort of security holy grail. VLANs are absolutely part of proper network segmentation and security; however, vigilance must be applied to ensure that workloads aren't given the opportunity to access networks they shouldn't.
This is why, as a general rule, administrators tend to limit VLANs to switches, both physical and virtual. The individual workloads should communicate with their switch untagged. An application administrator shouldn't have the opportunity to simply decide "I want to see what's going on in VLAN 20, so I'll configure my network card for that and start interrogating the network".
As an additional layer of security, network administrators will typically only allow trunk and hybrid ports access to VLANs which are required on that port instead of any arbitrary VLAN. This helps mitigate the security impact of, for example, accidentally setting a port to hybrid instead of access.
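The same idea can be turned into a simple audit. The Python sketch below compares the VLANs each trunk or hybrid port is allowed to carry against the VLANs it actually needs, and reports the excess. The port names and VLAN numbers are invented, and in practice the "allowed" data would have to come from your switch configuration by whatever means your vendor provides.

```python
from typing import Dict, Set


def excess_vlans(allowed: Dict[str, Set[int]],
                 required: Dict[str, Set[int]]) -> Dict[str, Set[int]]:
    """Return, per port, the VLANs that are allowed but not required."""
    report = {}
    for port, vlans in allowed.items():
        # A port with no entry in 'required' is treated as needing nothing.
        extra = vlans - required.get(port, set())
        if extra:
            report[port] = extra
    return report


# Hypothetical data: port 13 has been given one VLAN too many.
allowed = {"port13": {10, 20, 30}, "port14": {10}}
required = {"port13": {10, 20}, "port14": {10}}

print(excess_vlans(allowed, required))  # {'port13': {30}}
```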
It may be appropriate for virtualized routers to have unrestricted tagged access to the network. It is probably appropriate for virtual servers to have restricted tagged access. There aren't a lot of other scenarios I can think of.
GARP, GVRP, MRP, MVRP and VTP
Dynamically registering network attributes is not a new concept. Generic Attribute Registration Protocol (GARP) is old, defined in 802.1p and later incorporated into 802.1D, way back in 1998. GARP VLAN Registration Protocol (GVRP) is a means by which switches and standards-compliant connected servers could register attributes with the network dynamically, including VLAN membership.
Multiple Registration Protocol (MRP) and Multiple VLAN Registration Protocol (MVRP) were the replacement protocols for GARP and GVRP. MRP and MVRP were designed as less bandwidth-intensive versions of their predecessors that also allowed for faster network convergence (response to change) times. This became increasingly critical in the mid-2000s as networks with large numbers of VLANs began to appear.
Both of these have been around so long that you'll find support for them even on cheap-as-chips Netgear switches. Of course, some vendors will claim that "there isn't widespread support" and thus push their own proprietary versions of dynamic registration protocols.
Cisco's version of the above is VLAN Trunking Protocol (VTP). Unlike the standards-based versions which rely on advertisements, VTP uses a client-server model to distribute VLAN information. I will leave it up to the real network nerds to engage in debate about why one or the other is better.
What's worth noting is that many virtual switches – most notably VMware's – don't support these protocols. This means that if one were to create a VM network on a virtual host it would not advertise to the rest of the network. Network administrators would still have to manually allow the switch ports connected to that host's network cards to carry that VLAN's traffic.
To say that this is highly inconvenient is putting it rather mildly.
Spanning Tree and Shortest Path Bridging
Spanning Tree Protocol (STP) is a means of allowing switches to cope with multiple interconnections without getting into nasty broadcast loops that can bring down entire fabrics. Think about that time you thought "I need more speed between these two switches", went ahead and plugged a second cable between them and watched the whole network collapse. STP is the thing that prevents that.
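The core idea can be sketched in a few lines of Python. To be clear, this is not the real protocol: actual STP elects a root bridge and compares path costs using messages exchanged between switches. The sketch below only illustrates the end result, which is that every switch stays reachable while any link that would close a loop is blocked. It uses a union-find structure, and the function and switch names are made up.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]


def block_redundant_links(links: List[Link]) -> Tuple[List[Link], List[Link]]:
    """Split links into a loop-free forwarding set and a blocked set."""
    parent: Dict[str, str] = {}

    def find(switch: str) -> str:
        # Find the representative of the group this switch belongs to.
        parent.setdefault(switch, switch)
        while parent[switch] != switch:
            parent[switch] = parent[parent[switch]]  # shorten the chain
            switch = parent[switch]
        return switch

    forwarding, blocked = [], []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # Both ends are already connected, so this link would form a loop.
            blocked.append((a, b))
        else:
            parent[root_a] = root_b
            forwarding.append((a, b))
    return forwarding, blocked


# Two cables between sw1 and sw2, plus a triangle through sw3.
links = [("sw1", "sw2"), ("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw1")]
forwarding, blocked = block_redundant_links(links)
print(forwarding)  # [('sw1', 'sw2'), ('sw2', 'sw3')]
print(blocked)     # [('sw1', 'sw2'), ('sw3', 'sw1')]
```

The blocked links are not wasted: if a forwarding link fails, the tree can be recalculated to bring one of them into service.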
STP was defined in 802.1D and restated in 802.1Q along with its more grown-up versions, 802.1w (Rapid STP) and 802.1s (Multiple STP). It was then completely supplanted by the more VLAN-aware Shortest Path Bridging (SPB) in 802.1aq, which was approved in 2012. SPB is generally considered one of the most significant changes in Ethernet's long history, and it's not hard to see why.
STP was significant because it allowed organizations to wire up redundant links between switches without cratering the whole network, adding reliability that would otherwise not have been possible until rapid-reconvergence software-defined network fabrics emerged two decades later. It made modern networks possible.
SPB allows organizations to not only wire up redundant links between switches, but to use all those links simultaneously. Previous attempts (such as TRILL) were either proprietary or suffered from practical setbacks that severely limited their deployment.
SPB can understand double-tagged VLANs and ensure that packets from a given VLAN follow the same path through the network. This ensures that a given VLAN's packets don't have wildly varying latency. It also supports 16M VLANs instead of the classic 4k.
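The 4k and 16M figures fall straight out of the width of the identifier. A classic 802.1Q VLAN ID is 12 bits, while the larger figure corresponds to a 24-bit identifier. As I understand it, that is the service identifier used by SPB's MAC-in-MAC mode, and two stacked 12-bit tags give the same number of combinations. The arithmetic, as a quick Python sketch with names of my own choosing:

```python
VLAN_ID_BITS = 12       # classic 802.1Q VLAN ID field
EXTENDED_ID_BITS = 24   # assumption: 24-bit service identifier

classic_ids = 2 ** VLAN_ID_BITS
usable_classic_ids = classic_ids - 2   # IDs 0 and 4095 are reserved
extended_ids = 2 ** EXTENDED_ID_BITS

print(classic_ids)          # 4096
print(usable_classic_ids)   # 4094
print(extended_ids)         # 16777216
```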
In other words, SPB allows you to wire your network up like Dr Seuss' worst nightmare and it will not only not break, it will operate efficiently. Basically, it's black magic.
The solution preferred by VMware (and some of the other SDN players) to all of the above is to bypass switch awareness of VLANs altogether. By all means, set up a resilient physical network fabric that uses links efficiently, but VMware believes control of VLANs should be up to software.
Instead of using 802.1Q VLANs, VMware prefers the use of the VXLAN encapsulation protocol, which does what it says on the tin. It encapsulates packets to segment networks instead of changing the packet header. (STT and GRE are competing encapsulation protocols, but VXLAN appears to be winning.)
VXLAN has a lot of advantages over VLANs. For example, it's routable, and can handle 16M VLANs. Its routability means that virtual networking can be more easily accomplished across physically disconnected networks, allowing administrators to shrink the layer 2 failure domain.
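For the curious, the encapsulation itself is simple. VXLAN wraps the entire original Ethernet frame in an 8-byte header carrying a 24-bit network identifier (the VNI), and the result travels as the payload of an ordinary UDP packet, conventionally to port 4789, which is why it can be routed. The Python sketch below builds and parses only the VXLAN header. The outer UDP, IP and Ethernet layers are left out, and the function names are hypothetical.

```python
import struct

VXLAN_UDP_PORT = 4789         # IANA-assigned destination port, shown for reference
VXLAN_FLAG_VNI_VALID = 0x08   # "I" flag: the VNI field is valid


def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prefix an Ethernet frame with an 8-byte VXLAN header."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: 8 bits of flags then 24 reserved bits.
    # Word 2: 24-bit VNI then 8 reserved bits.
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decapsulate(payload: bytes):
    """Return (vni, inner_frame) from a VXLAN UDP payload."""
    if len(payload) < 8:
        raise ValueError("payload too short for a VXLAN header")
    flags_word, vni_word = struct.unpack("!II", payload[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8, payload[8:]


packet = vxlan_encapsulate(b"original ethernet frame", vni=5000)
print(packet[:8].hex())           # 0800000000138800
print(vxlan_decapsulate(packet))  # (5000, b'original ethernet frame')
```

The inner frame is untouched, which is the key difference from 802.1Q, where the tag is spliced into the frame's own header.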
While a single large network with lots of interconnects between switches can provide for high throughput without needing expensive routers, with classic VLANs a single bad NIC transmitting bad frames can wreck a VLAN or even the entire fabric.
That said, neither classic VLANs nor encapsulated VLANs are the whole of the solution. What administrators both need and want is for both to interoperate and do so in a dynamic, scriptable and centrally controllable fashion.
Unfortunately, getting to that utopia would require various tech industry titans to stop trying to create monopolies and actually work together. In other words: that isn't going to happen any time soon.
For the foreseeable future classic and encapsulated VLANs will have to be managed separately, and usage of both will likely increase as networks grow increasingly complex. Best practices calling for network segmentation and the increasing pressure of regulatory compliance will drive adoption, even within small networks.
VLANs aren't scary. Even the newfangled SDN VXLANs aren't all that hard. If you haven't taken the plunge, it's time to experiment. Good luck. ®