The Tor network enables anonymous communication over the internet. Please see the official site for details on how it works.
While I was working on this project, I tried to find out whether anyone had been able to break the anonymity of the Tor concept, and I found a report titled "Peeling the Onion: Unmasking TOR users". At first I was intrigued by the title, and after reading it I was a bit disappointed with Tor, because the authors appeared to extract a wide variety of information from the Tor network. Then I read it again and realised that there is actually nothing wrong with the Tor network. They simply got their title wrong; it should have been something like "Accessing browser information through anonymizers". If you read the report (which is a prerequisite for understanding this blog entry), you can see that they try to uncover some vandals who exploited a weakness on some sites and used Tor to hide their IP.
The first point made in the report is that it is easy to find a complete list of exit nodes in the Tor network. That list can be used to block access from Tor machines. This does not pose any threat to Tor users, because an exit node cannot be mapped to any particular user. It can, however, be useful for site operators who want to limit anonymous access (e.g. disallow comments from Tor users).
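A minimal sketch of how a site operator might use such a list, assuming the exit nodes are available as one IP address per line. The function names and the sample addresses (taken from the reserved documentation ranges) are made up for illustration; they are not real Tor exit nodes.

```python
import ipaddress


def load_exit_nodes(lines):
    """Parse one IP address per line, skipping blank lines and '#' comments."""
    nodes = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        nodes.add(ipaddress.ip_address(line))
    return nodes


def is_tor_exit(client_ip, exit_nodes):
    """Return True if the client address appears in the exit node set."""
    return ipaddress.ip_address(client_ip) in exit_nodes


# Hypothetical list contents; a real list would be downloaded and refreshed.
sample_list = ["# hypothetical exit list", "192.0.2.10", "198.51.100.7", ""]
exit_nodes = load_exit_nodes(sample_list)

# A site could, for example, hide the comment form for these visitors.
allow_comments = not is_tor_exit("192.0.2.10", exit_nodes)
```

Note that the check only says "this request came out of Tor"; it says nothing about which user sent it.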
The next point in the report is that it is possible to figure out who was attacking a site by using timing statistics (they use a simplified approach at first). This very point is described on the Tor site itself, so there is nothing new here. This possibility is not considered a risk, because it can only provide an indication that two machines are communicating.
The basic idea is this:
- Set up a listener at machine A and B.
- Every time A sends a packet, record the size and time.
- Every time B receives a packet, record the size and time.
- Do the last two steps with A and B switched.
Now plot the data in a graph. If A and B were communicating, it should be visible that most packets sent from A were received by B, delayed by some (constant) time factor.
This will ONLY work if you suspect A and B BEFORE they start communicating, AND you have access to set up the listening devices.
If the machines communicate heavily with other machines all the time, the pattern will be very hard to see.
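The basic idea can be sketched in a few lines instead of a graph. This is only an illustration of the simplified approach described above, under the assumption that each log is a list of (time in seconds, size in bytes) records and that packet sizes are comparable at both ends. The function names, the tolerance and the sample data are all made up.

```python
def match_fraction(sent, received, delay, tolerance=0.02):
    """Fraction of A's packets that B received `delay` seconds later with the same size."""
    if not sent:
        return 0.0
    unused = list(received)  # each received packet may be matched only once
    hits = 0
    for t_sent, size in sent:
        for i, (t_recv, r_size) in enumerate(unused):
            if r_size == size and abs(t_recv - (t_sent + delay)) <= tolerance:
                hits += 1
                del unused[i]
                break
    return hits / len(sent)


def best_delay(sent, received, candidates, tolerance=0.02):
    """Pick the candidate delay that explains the most packets."""
    return max(candidates, key=lambda d: match_fraction(sent, received, d, tolerance))


# Made-up logs: three of A's four packets show up at B about 0.3 s later.
sent_by_a = [(0.00, 512), (0.10, 1024), (0.25, 512), (0.40, 256)]
received_by_b = [(0.30, 512), (0.40, 1024), (0.55, 512), (0.90, 256)]

candidates = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
delay = best_delay(sent_by_a, received_by_b, candidates)
score = match_fraction(sent_by_a, received_by_b, delay)
```

A high score at one particular delay is the "visible pattern". If B also receives lots of unrelated traffic, many delays will produce accidental matches and the peak gets much harder to spot.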
The report claims that since there are only a few Tor servers in Denmark (they mistakenly write Tor user where they mean Tor server), it would be possible to see whether a server was online at the time of the attack. They do mention that this will only work if the offender is running a server. Well, yes, it would require the offender to run a server AND two other conditions:
- The server is only online during attacks.
- The offender forces the use of his/her own onion router (a highly weird choice).
It's pretty obvious that these two other conditions are highly unlikely, and thus the entire technique is completely useless.
They all use browsers
The next way they try to unmask the user is by exploiting browser bugs/features (a detail they forget to mention). This method will work against almost any unsuspecting user. The simple solution for the user is to ensure that the machine does not know its own IP, through the use of proxies and outbound firewall rules.
Following the browser exploits are a number of weird takes that show dumps of the protocol-specific data that a program sends, such as locale and OS version. I have a hard time seeing how a flag saying "Locale Danish, OS WinXP" can in any way help to identify anyone.
Linux Encryption FS
Right after that weird entry, they point to a very old problem with the Linux encryption engine, which used sector numbers as initialization vectors. They claim that they have not investigated this further. Well, if they had, they would know the following:
- It was fixed a looong time ago.
- It cannot be exploited by sending anything to the machine. It is a bug that will enable people with low-level access to the disk, to guess the passphrase faster.
- Since sector numbers do not exist in Tor, it is completely unrelated to the Tor network.
My guess is that they threw that link in there to try to lend their report some weight by pointing to some real investigation.
Timing attacks, again
As a last point, they try to visualize the timing attacks and introduce the idea of tagging Tor data. If they had read the protocol, they would know that the packets are encrypted. Encryption makes it impossible to change the headers inside the real packet, and the layered approach makes sure that no TCP headers remain intact. It is possible to inject such headers from the attacked machine, but the output would only be readable to the client and thus very hard to intercept.
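To illustrate the layered approach, here is a toy sketch. It is NOT Tor's actual cryptography: the keystream construction, the key names and the three-relay setup are assumptions made purely for illustration, and the toy cipher has no authentication at all. It only shows that each relay peels one layer and that nothing readable is visible until the last layer is removed.

```python
import hashlib


def keystream(key, length):
    """Toy keystream: SHA-256 over key plus a counter (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(key, data):
    """Add or remove one layer; XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


# Hypothetical per-relay keys, in path order: entry, middle, exit.
relay_keys = [b"entry-key", b"middle-key", b"exit-key"]
payload = b"GET / HTTP/1.0\r\n\r\n"

# The client wraps the payload once per relay, innermost layer for the exit.
cell = payload
for key in reversed(relay_keys):
    cell = xor_layer(key, cell)

# Each relay sees the cell as it arrives, then peels off its own layer.
views = []
for key in relay_keys:
    views.append(cell)
    cell = xor_layer(key, cell)

# After the exit relay has removed the last layer, the payload is back.
recovered = cell
```

None of the entries in `views` contains the original request, so a relay in the middle of the path has no headers to read in the first place.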
Traffic analysis, sort of...
This section is one of the funniest. At first, it is exciting that someone who runs such a server can extract this info. Then again, you don't know who is requesting the data.
The reason I find this section funny is that when you read their findings on what people use the server for, you see some HTTP traffic and some telnet traffic. Then, in the next section, you see that they configured the server to accept only HTTP and telnet traffic. Well, it would be very odd if they had found anything else!
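The selection effect is easy to show. This is a small sketch, assuming an exit policy that only allows the standard HTTP and telnet ports (80 and 23); the policy table and the list of attempted ports are made up and are not taken from the report.

```python
from collections import Counter

# Assumed exit policy: only these destination ports are allowed out.
ALLOWED_PORTS = {80: "http", 23: "telnet"}


def observed_protocols(attempted_ports):
    """Count only the traffic the exit policy lets through."""
    counts = Counter()
    for port in attempted_ports:
        if port in ALLOWED_PORTS:
            counts[ALLOWED_PORTS[port]] += 1
    return counts


# Hypothetical mix of what users actually wanted to reach.
attempted = [80, 80, 23, 22, 25, 443, 80, 6667]
stats = observed_protocols(attempted)
```

Whatever the users were really doing, the statistics can only ever contain the two protocols the server was configured to accept.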
The state of affairs in Denmark
If you live in Denmark, you should start using Tor NOW. The government of Denmark has recently passed a law that requires all phone and internet usage to be logged. Internet service providers are required by law to record all your phone calls and all your internet connections for a whole year. If you think: "I am not a criminal, why should that affect me?", consider the following scenario:
You visit a number of websites and get a virus. The virus turns out to be a trojan, and your computer is used to attack various sites. The police check the data and find your IP. Now every website you have visited, or have pulled an advertisement from, will be on the list, and you may have to explain why you visited each and every one of them.