Reverse DNS queries may reveal too much, computer scientists argue
When you combine it with DHCP, that spells TRACK ME
Computer scientists at the University of Twente in the Netherlands have found the interplay between the internet and local networks can be analyzed to reveal private data and facilitate tracking.
In a study titled "Saving Brian’s Privacy: the Perils of Privacy Exposure through Reverse DNS," Olivier van der Toorn, Raffaele Sommese, Anna Sperotto, Roland van Rijswijk-Deij, and Mattijs Jonker look at how DNS interacts with DHCP and find that some of the data exchanged can be exposed by Reverse DNS (rDNS) queries.
DHCP is a network management protocol that allows IP addresses to be dynamically assigned to devices on a network. This involves a client-server model where the device joining the network (the client) requests an address from the DHCP server.
The client retains this address for a set amount of time (a lease period) or until it sends a release message and leaves the network, to allow the assigned IP address to be reallocated. But clients may also leave a network without sending a release message, creating a time gap between client departure and automated record removal that provides an opportunity for further rDNS network interrogation.
Typically, DNS maps host and domain names to IP addresses, a process known as forward DNS that uses an "A Record" to match a domain name like theregister.com to an IPv4 address [don't start – ed.].
Reverse DNS goes the other way, using a DNS pointer (PTR) record to map an IP address back to a hostname. For example, if we want to know what hostname points to 8.8.8.8, we juggle that IPv4 address around, reversing its octets, into a special in-addr.arpa address, look up the PTR record for 8.8.8.8.in-addr.arpa, and see that it's dns.google, Google's public DNS offering.
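The address juggling is mechanical, and a minimal Python sketch can show it. The function names here are ours, not the paper's, and the example uses the documentation address 192.0.2.10 because its octets differ, which makes the reversal visible. The second function asks the system resolver for the PTR record and so needs network access; it is defined but not called.

```python
import ipaddress
import socket
from typing import Optional


def ptr_query_name(address: str) -> str:
    """Return the DNS name whose PTR record holds the hostname for `address`."""
    # reverse_pointer reverses the octets and appends in-addr.arpa
    # (or produces the nibble-reversed ip6.arpa form for IPv6).
    return ipaddress.ip_address(address).reverse_pointer


def reverse_lookup(address: str) -> Optional[str]:
    """Ask the system resolver for the hostname behind `address`.

    Returns None when no reverse mapping is found. Needs network access.
    """
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(address)
        return hostname
    except (socket.herror, socket.gaierror):
        return None


print(ptr_query_name("192.0.2.10"))  # 10.2.0.192.in-addr.arpa
```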
That also means if we run through all public IPv4 addresses, looking up their reverse DNS, we can get all the associated hostnames. For devices on, say, university LANs that are assigned public IP addresses via DHCP, their hostnames can therefore be discovered.
126.96.36.199 might point to
188.8.131.52 could be secret-nas.example.edu, and so on.
You don't even have to scan all of IP space, just home in on the IP blocks of institutions or organizations you're interested in.
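As a rough sketch of what homing in on a block looks like, the Python below walks every host address in a CIDR range and records whichever ones have a reverse mapping. The resolver is passed in as a function so the example runs offline against a made-up table; the block 192.0.2.0/29 and the hostname gw.example.edu are illustrative assumptions, not data from the study.

```python
import ipaddress
from typing import Callable, Dict, Optional


def sweep_block(cidr: str,
                lookup: Callable[[str], Optional[str]]) -> Dict[str, str]:
    """Map each host address in `cidr` to its hostname, skipping misses."""
    found: Dict[str, str] = {}
    for host in ipaddress.ip_network(cidr).hosts():
        name = lookup(str(host))
        if name is not None:
            found[str(host)] = name
    return found


# Stand-in for real PTR data: a hypothetical reverse zone.
FAKE_PTR = {
    "192.0.2.1": "gw.example.edu",
    "192.0.2.2": "secret-nas.example.edu",
}

result = sweep_block("192.0.2.0/29", FAKE_PTR.get)
print(result)
```

Swapping `FAKE_PTR.get` for a function that performs a real PTR query would turn this into a live sweep.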
In practice, these hostnames probably won't reveal much at all, or the interesting systems you want to know about won't be assigned public IPs. When public-facing hostnames contain sensitive or revealing information, though, and can be read via rDNS queries by anyone on the internet, you have a potential privacy problem, the research team argues.
It gets interesting when you can watch DHCP-issued IP addresses drop their hostnames after a delay and pick them up again later, as that gives you an idea of someone's movements. We'll leave it up to readers to decide how much of a risk this is to their own users and network environments.
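To make the idea concrete, here is a hedged Python sketch of the bookkeeping involved: take repeated snapshots of a reverse zone and report which hostnames appeared or vanished between them. The snapshot format, timestamps, and addresses are assumptions made for illustration; the paper's own measurement pipeline is not described here.

```python
from typing import Dict, List, Tuple

Snapshot = Dict[str, str]  # IP address -> hostname seen via rDNS


def presence_events(
    snapshots: List[Tuple[str, Snapshot]]
) -> List[Tuple[str, str, str]]:
    """Return (timestamp, 'appeared' or 'disappeared', hostname) events."""
    events: List[Tuple[str, str, str]] = []
    previous: set = set()
    for timestamp, snapshot in snapshots:
        current = set(snapshot.values())
        for name in sorted(current - previous):
            events.append((timestamp, "appeared", name))
        for name in sorted(previous - current):
            events.append((timestamp, "disappeared", name))
        previous = current
    return events


# Hypothetical observations of one small address block over a day.
observed = [
    ("09:00", {"192.0.2.10": "brians-mbp"}),
    ("10:00", {"192.0.2.10": "brians-mbp", "192.0.2.11": "brians-phone"}),
    ("18:00", {}),
]

events = presence_events(observed)
for event in events:
    print(event)
```

A disappearance only gives an upper bound on when the device left, since a record can linger until the lease runs out.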
Past privacy research, the paper's authors claim, has already established that network hostnames can contain information that's useful to adversaries. They point to studies in which rDNS data has been used to infer the link speeds of routers and switches, network topology, geographic information, and so on. Hostnames can also reveal the hardware in use, and the user's name.
The researchers say their work builds on these findings to show that automated and continual changes to rDNS records, via DHCP, may reveal client identifiers that imperil privacy.
"Our results show a strong link: in 9 out of 10 cases, records linger for at most an hour, for a selection of academic, enterprise and ISP networks alike," the paper says. "We also demonstrate how client patterns and network dynamics can be learned, by tracking devices owned by persons named Brian over time, revealing shifts in work patterns caused by COVID-19 related work-from-home measures, and by determining a good time to stage a heist."
The suggestion here is that being able to track individuals through their devices from the internet provides the opportunity to rob an associated location when it's unoccupied.
Not a new issue
The researchers observe that the privacy risk of DHCP has been recognized at least since 2016 in RFC 7844, which describes how DHCP clients can remain anonymous on a network.
"Our findings do not only demonstrate that identifiers are in fact carried over in the wild, but also reveal that the content contained in identifiers is in itself privacy-sensitive," the paper claims. "For example, being able to tell the make and model of a client device may benefit sophisticated attackers, who could use this information to pre-select relevant exploits. Owner names, in turn, can tie IP addresses to users, which could be used for a number of malicious purposes."
Often, the researchers speculate, phone and computer names get revealed via the DHCP hostname parameter. And because people often choose an identifying name when setting up devices, this information may be available to miscreants using the described techniques.
"We view this as a serious problem that may very well be in a blind spot of network operators," explained Mattijs Jonker, assistant professor at University of Twente's Faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS), in an email to The Register.
"First, the practice of dynamically adding records when devices join and leave a network offers a path for miscreants to remotely learn network dynamics and internals, even if traditional mechanisms to stop tracking by outsiders are in place.
"Let's say that a firewall was placed in front of a campus or enterprise network to block ping probes from the internet to devices inside the network to stop outsiders from learning the presence of devices. This function would be undermined if the presence of said devices is signaled through dynamically added records.
"Second, if we look at the content of the records itself, privacy-sensitive and/or uniquely identifiable device information makes it onto the public internet."
To demonstrate how individuals can be tracked, the researchers used rDNS data to follow one or more individuals named Brian around a US university network over a six-week period. The rDNS queries yielded hostnames like brians-air, brians-galaxy-note9, brians-ipad, brians-mbp, and brians-phone.
"The Brians mentioned and tracked in the paper are real people, although we deliberately chose not to identify an individual Brian because of the privacy concerns," explained Jonker. "We suspect that in our case we tracked a limited number of persons named Brian (in the network that we targeted in our case study)."
Because the Galaxy Note 9 appeared for the first time on the Monday afternoon after the US Thanksgiving holiday, they speculate that one of these Brians bought the device in a sale on the Friday after the holiday, or on that Monday itself.
The boffins say that their study shows rDNS data can provide insight into the behavior of clients that have received dynamically assigned hostnames. And because these hostnames often map to a device owner's name or reveal other identifying info, associated individuals can be tracked from the internet.
"Our findings are disconcerting," they conclude. "While existing literature has shown that meaningful information can be extracted from hostnames primarily without considering continual changes to reverse DNS records, we reveal that observing automated changes to rDNS can provide insights into client presence and network dynamics.
"The publicness of rDNS severely increases this risk, enabling anyone on the Internet to observe automated changes. An adversary with measurement capability and knowledge about a potential target can gain valuable insights following an approach similar to ours."
To mitigate these risks, the researchers argue that DHCP client-provided information, such as device names, should not be mapped to publicly accessible PTR records. And they urge network operators to prevent hostname information from propagating from DHCP to DNS. ®
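One way an operator might follow that advice is to build PTR targets from the address alone and ignore whatever name the client supplies. The Python below is a sketch under that assumption; the `host-` prefix and the zone dyn.example.edu are hypothetical, and the paper does not prescribe a naming scheme.

```python
import ipaddress


def generic_ptr_target(address: str, zone: str = "dyn.example.edu") -> str:
    """Build a PTR target from the IP address only.

    The client-supplied DHCP hostname is deliberately never consulted,
    so nothing like 'brians-iphone' can reach public DNS.
    """
    ip = ipaddress.ip_address(address)
    # exploded gives a fixed, fully written-out form for IPv4 and IPv6.
    label = ip.exploded.replace(".", "-").replace(":", "-")
    return f"host-{label}.{zone}"


print(generic_ptr_target("192.0.2.10"))  # host-192-0-2-10.dyn.example.edu
```

A name like this still signals that an address is in use, so it addresses the identifying content of records but not the presence signal the researchers describe.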