Introduction to Anonymizing Networks – Tor vs I2P
The Right to Anonymity
Every operation performed in cyberspace, every web site visited, and every web service accessed leaves traces of the user's experience on the Internet. This information is considered very valuable for commercial and intelligence purposes.
Private companies and governments constantly monitor the World Wide Web to collect and correlate information for analyzing users' behavior. But who manages this data, how is it handled, and what are the real purposes of these monitoring activities?
Data acquired while monitoring the Internet is often personal information, sometimes socially harmful, that may remain available, intentionally or not, to anyone beyond the time limits dictated by the principle of purpose limitation. Even if such data is deleted, it may still be accessible through storage mechanisms such as caches.
Surveillance and monitoring are activities of primary interest to many governments, which in many cases track political opponents, with dramatic consequences that result in fierce persecution.
Recently the demand for anonymity has increased, mainly in response to the wide diffusion of surveillance platforms deployed all over the world, yet the concept of anonymity induces fear because of the direct link usually made to illicit activities. It must be considered that anonymity of a user's experience on the web can also be motivated by noble reasons, such as the fight for the human right to freedom of expression, the avoidance of censorship, and the free promotion and circulation of thought.
Anonymous communications have an important role in our political and social discourse. Individuals desire to hide their identities because they may be concerned about political or economic retribution, harassment, or even threats to their lives.
How to anonymize the user's experience?
On the Internet, every machine is identified by its IP address, which can be hidden by using anonymizing services and networks such as I2P and Tor. Usually the anonymizing process is based on the distribution of routing information. During the transmission of data between two entities in a network, it is impossible to know in advance the path between source and destination, and every node of the network handles only the minimal information needed to route packets to the next hop, without keeping any history of the path. To avoid interception, extensive use is made of encryption algorithms, which prevent wiretapping of the information and reconstruction of the original messages.
The Tor Network
The Deep Web is the set of information resources on the World Wide Web that are not reported by normal search engines. It is a network of interconnected, non-indexed systems whose size is hundreds of times larger than the currently visible web.
A parallel web holding a much larger amount of information represents an invaluable resource for private companies, governments, and especially cybercrime. In the imagination of many people, the term Deep Web is associated with anonymity in the service of criminal intents that cannot be pursued because they are submerged in an inaccessible world. It is fundamental to remark that this interpretation of the Deep Web is deeply wrong.
Tor is the acronym of "The Onion Router", a system implemented to enable online anonymity by routing Internet traffic through a worldwide volunteer network of servers that hides the user's information.
As often happens, the project was born in the military sector, sponsored by the US Naval Research Laboratory, and from 2004 to 2005 it was supported by the Electronic Frontier Foundation.
Access to the network is possible using a Tor client, software that allows the user to reach network resources that are otherwise inaccessible. Today the software is developed and maintained by the Tor Project. Using the Tor network, a user can avoid being traced; privacy is protected by the unpredictable route the information takes inside the network and by the extensive use of encryption.
Connecting to the Tor network
Imagine a typical scenario in which Alice wants to connect to Bob using the Tor network. Let's see step by step how this is possible.
She makes an unencrypted connection to a centralized directory server containing the addresses of the Tor nodes. After receiving the address list from the directory server, the Tor client software connects to a random node (the entry node) through an encrypted connection. The entry node makes an encrypted connection to a random second node, which in turn does the same to connect to a random third Tor node. The process goes on until it reaches a node (the exit node) connected to the destination.
Consider that during Tor routing the nodes for each connection are chosen randomly, and the same node cannot be used twice in the same path.
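The selection rule just described (random nodes, none repeated within a path) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the relay names and the `choose_path` helper are hypothetical, and real Tor clients take more than pure chance into account when picking relays.

```python
import random

def choose_path(directory, hops=3, rng=random):
    """Pick `hops` distinct relays at random: entry first, exit last."""
    if len(directory) < hops:
        raise ValueError("not enough relays to build a path")
    # sample() draws without replacement, so no relay appears twice in a path
    return rng.sample(directory, hops)

# Hypothetical address list, standing in for what the directory server returns
relays = ["relay-a", "relay-b", "relay-c", "relay-d", "relay-e"]

path = choose_path(relays)
entry_node, exit_node = path[0], path[-1]
print("entry:", entry_node, "exit:", exit_node, "full path:", path)
```

Drawing without replacement is what enforces the "never twice in the same path" constraint; a loop of independent random choices would need an explicit duplicate check.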
To ensure anonymity the connections have a fixed duration. Every ten minutes, to avoid statistical analysis that could compromise the user's privacy, the client software changes the entry node.
Up to now we have considered an ideal situation in which a user accesses the network only to connect to another user. To further complicate the discussion, in a real scenario the node Alice uses could in turn be used for routing purposes by connections established between other users.
A malevolent third party would not be able to tell which connections the node initiates as a user and which it merely relays as a node, making the monitoring of the communications impossible.
Figure 1 - Tor Routing
The Tor client distributed from the official web site of the project can be run on all major platforms, and many freely available add-ons allow it to be integrated with existing web browsers. Although the network was designed to protect the user's privacy, to be really anonymous it is suggested to go through a VPN as well.
A better way to navigate the Deep Web is to use the Tails OS distribution, which is bootable on any machine without leaving a trace on the host. Once installed, the Tor Bundle comes with its own portable Firefox version, ideal for anonymous navigation thanks to appropriate control of installed plugins.
The user must be aware that many browser plugins expose his privacy to serious risks. Many of these plugins could be used to reveal a user's information during navigation.
As said, the resources inside the Tor network are not indexed, and it is very hard to find them if we are accustomed to classic search engines. The way to search for information is profoundly different due to the absence of content indexing. A practical suggestion for new users is to refer to wikis and BBS-like sites that aggregate links, grouping them into categories that are easier to consult. Another difference the user has to keep in mind is that instead of classic extensions (e.g. .com, .gov), domains in the Deep Web generally end with the .onion suffix.
The following is a short list of links, published on Pastebin, that have made the Deep Web famous:
Figure 2 - Tor Links
The Cleaned Hidden Wiki should also be a good starting point for first navigations:
http://3suaolltfj2xjksb.onion/hiddenwiki/index.php/Main_Page
Be careful: some content is labeled with commonly used tags such as CP (child porn) or PD (pedophile). Stay far away from them.
The Deep Web is considered the place where everything is possible; you can find every kind of material and service for sale, most of them illegal. The hidden web offers cybercrime great business opportunities: hacking services, malware, stolen credit cards, and weapons.
We all know the potential of e-commerce on the ordinary web and its impressive growth in the last couple of years. Now imagine the Deep Web market, which is more than 500 times bigger and where there are no legal limits on the goods for sale. We are faced with enormous businesses controlled by cybercriminal organizations.
I2P
According to the official definition, "I2P is a scalable, self organizing, resilient packet switched anonymous network layer, upon which any number of different anonymity or security conscious applications can operate".
I2P is an open source project developed in early 2003 by a group of full time developers with a group of part time contributors from all over the world.
It is fundamental to understand that inside an I2P network the "hidden" component is the application running on a node and, of course, the path followed by the information to reach the destination. Another important concept for I2P is the "tunnel", a directed path which extends through an explicitly selected list of routers. The first router that belongs to a tunnel is named the "gateway".
Communication within a tunnel is unidirectional; this means that it is impossible to send data back without using another, separate tunnel.
I2P also implements a layered encryption model, known as "garlic routing" and "garlic encryption": the information transits through network routers that are each able to decrypt only their respective layer. The information managed by each single node is composed of:
- IP address of the next router
- encrypted data to transfer.
The original architecture provides two further definitions:
- "outbound" tunnels are those tunnels used to send messages away from the tunnel creator
- "inbound" tunnels are those tunnels used to bring messages to the tunnel creator.
Another element of critical importance for the I2P model is the network database (known as "netDb"), a pair of algorithms used to share the following metadata with the network:
- "routerInfo" - a data structure that gives routers the information necessary for contacting a specific router (its public keys, transport addresses, etc.). Each router sends its routerInfo directly to the netDb, which collects information on the entire network.
- "leaseSets" - a data structure to give routers the information necessary for contacting a particular destination. A leaseSet is a collection of "leases". Each of them specifies a tunnel gateway to reach a specific destination. It is sent through outbound tunnels anonymously, to avoid correlating a router with its leaseSets. A lease contains the following info:
- Inbound gateway for a tunnel that allows reaching a specific destination.
- Time when a tunnel expires.
- Pair of public keys to be able to encrypt messages (to send through the tunnel and reach the destination).
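To make the two netDb record types more concrete, here is a minimal Python sketch of how they could be modeled. All class and field names are hypothetical and simply mirror the lists above; the in-memory dictionaries are a stand-in for the real distributed database, not a description of I2P's actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class RouterInfo:
    router_id: str
    public_keys: tuple          # keys other routers need to contact this one
    transport_addresses: list   # where the router can be reached

@dataclass
class Lease:
    inbound_gateway: str        # router id of the tunnel's gateway
    expires_at: float           # time when the tunnel expires
    public_keys: tuple          # keys used to encrypt messages for the destination

@dataclass
class LeaseSet:
    destination: str
    leases: list = field(default_factory=list)

    def active_gateways(self, now):
        """Return gateways of leases whose tunnels have not yet expired."""
        return [l.inbound_gateway for l in self.leases if l.expires_at > now]

class NetDb:
    """Toy in-memory stand-in for the network database."""
    def __init__(self):
        self.router_infos = {}
        self.lease_sets = {}

    def publish_router_info(self, info):
        self.router_infos[info.router_id] = info

    def publish_lease_set(self, lease_set):
        self.lease_sets[lease_set.destination] = lease_set

    def lookup_lease_set(self, destination):
        return self.lease_sets.get(destination)

netdb = NetDb()
netdb.publish_lease_set(LeaseSet("bob.dest", [
    Lease("gw-1", expires_at=50.0, public_keys=(b"k1", b"k2")),
    Lease("gw-2", expires_at=150.0, public_keys=(b"k1", b"k2")),
]))
usable = netdb.lookup_lease_set("bob.dest").active_gateways(now=100.0)
print(usable)
```

The expiry check reflects the fact that each lease is only valid until its tunnel expires, so a sender has to ignore stale entries.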
I2P Routing
When Alice wants to send a message to Bob, she does a lookup in the netDb to find Bob's leaseSet, giving her his current inbound tunnel gateways.
Alice's router aggregates multiple messages into a single "garlic message" and encrypts it with a particular public key, so that only the owner of that key can open the message.
For typical end to end communication between Alice and Bob, the garlic will be encrypted using the public key published in Bob's leaseSet, allowing the message to be encrypted without giving out the public key to Bob's router.
She selects one of her outbound tunnels and sends the data down it, with instructions for the outbound tunnel's endpoint to forward the message on to one of Bob's inbound tunnel gateways. When the outbound tunnel endpoint receives those instructions, it forwards the message accordingly, and when Bob's inbound tunnel gateway receives it, the message is forwarded down the tunnel to Bob's router.
Be aware that, as we have said, transmission is unidirectional: if Alice wants Bob to be able to reply to the message, she needs to transmit her own destination explicitly as part of the message itself.
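The sending steps can be summarized with a short Python sketch. It shows only the bundling of messages, the forwarding instructions for the outbound tunnel endpoint, and the explicit reply destination; the encryption with Bob's public key is deliberately omitted. Function names, field names, and identifiers such as "alice.dest" are hypothetical.

```python
import json
import random

def build_garlic(messages, reply_destination):
    """Bundle several messages ("cloves") together with the sender's destination."""
    return {
        "cloves": [{"payload": m} for m in messages],
        # Tunnels are one-way, so the reply address must travel inside the message
        "reply_to": reply_destination,
    }

def send_via_outbound_tunnel(garlic, outbound_tunnels, inbound_gateways, rng=random):
    """Pick an outbound tunnel and tell its endpoint which gateway to forward to."""
    tunnel = rng.choice(outbound_tunnels)
    gateway = rng.choice(inbound_gateways)
    return {
        "outbound_tunnel": tunnel,
        "endpoint_instructions": {"forward_to_gateway": gateway},
        # In the real flow this body would be encrypted for Bob; omitted here
        "body": json.dumps(garlic),
    }

bob_gateways = ["gw-1", "gw-2"]   # as found in Bob's leaseSet
garlic = build_garlic(["hello", "how are you?"], reply_destination="alice.dest")
envelope = send_via_outbound_tunnel(garlic, ["out-1", "out-2"], bob_gateways)
print(envelope["endpoint_instructions"])
```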
I2P uses end-to-end encryption. No information, including the sender and recipient, is sent in the clear or decrypted along its path. Each node is assigned an internal network address distinct from its network IP address, which is not used.
Figure 3 - I2P Routing
Layered Encryption
The term layered encryption refers to the encryption process used during the transfer from a source to the destination through a series of peers that compose the tunnel.
Both Tor and I2P use layered cryptography. Intermediate entities only need to know how to forward the connection on to the next hop in the chain, and cannot decipher the contents of the connections.
I2P uses cryptographic IDs to identify both routers and endpoint services. For naming identifiers it uses the "Base 32 Names" technique, which applies a SHA-256 digest to the base64 representation of the destination. The hash is base 32 encoded and ".b32.i2p" is concatenated onto the end of the hash.
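The derivation of such a name can be sketched with the Python standard library. The sketch assumes the caller supplies the destination as bytes; exactly which serialization of the destination is hashed should be checked against the I2P naming documentation. Lowercasing the result and stripping the "=" padding are likewise assumptions made here to obtain a hostname-friendly string, and the `b32_name` helper is hypothetical.

```python
import base64
import hashlib

def b32_name(destination_bytes: bytes) -> str:
    """Hash a destination, base32-encode the digest, append the .b32.i2p suffix."""
    digest = hashlib.sha256(destination_bytes).digest()   # 32 bytes
    encoded = base64.b32encode(digest).decode("ascii")
    # Assumption: lowercase, unpadded form for use as a hostname
    encoded = encoded.lower().rstrip("=")
    return encoded + ".b32.i2p"

example = b32_name(b"hypothetical-destination-bytes")
print(example)
```

Because a SHA-256 digest is always 32 bytes, the encoded part has a fixed length of 52 characters regardless of how long the destination itself is.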
Figure 4 - Base 32 Names
The sender repeatedly encrypts the data to transmit, and at each hop the corresponding decryption step is applied. During the tunnel-building phase, only the routing instructions for the next hop are exposed to each peer; during the transfer phase, messages are passed through the tunnel, and the message and its routing instructions are exposed only to the endpoint of the tunnel.
Note that an additional end-to-end layer of encryption is necessary to hide the data from the outbound tunnel endpoint and the inbound tunnel gateway, while each tunnel has its own layered encryption to avoid unauthorized disclosure to peers inside the network.
At each hop the peer decrypts the message, extracting data and routing instructions, and sends them to the successive peer, encrypting all using the recipient's public key. The process is repeated until it has one layer of encryption per hop along the path. The algorithms used for encryption of the packets are ElGamal and AES.
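The idea of one encryption layer per hop can be illustrated with a deliberately simple Python sketch. It does not use ElGamal or AES: the `keystream_xor` function is a toy cipher built from SHA-256, chosen only so the example runs with the standard library, and it must not be used for real security. The key values and helper names are hypothetical.

```python
import hashlib

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256 based keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(b ^ k for b, k in zip(data, stream))

def wrap(message: bytes, hop_keys: list) -> bytes:
    """Sender side: add one layer per hop, innermost layer for the last hop."""
    data = message
    for key in reversed(hop_keys):
        data = keystream_xor(key, data)
    return data

def peel(data: bytes, key: bytes) -> bytes:
    """Hop side: remove exactly one layer with this hop's key."""
    return keystream_xor(key, data)

hop_keys = [b"key-hop-1", b"key-hop-2", b"key-hop-3"]   # hypothetical per-hop keys
message = b"data for the tunnel endpoint"

onion = wrap(message, hop_keys)
after_first_hop = peel(onion, hop_keys[0])
after_second_hop = peel(after_first_hop, hop_keys[1])
delivered = peel(after_second_hop, hop_keys[2])
print(delivered)
```

Only after the last hop has removed its layer does the original message appear; the intermediate values seen by earlier hops remain unreadable to them.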
Garlic Routing
Garlic routing is very similar to onion routing, with several differences. First of all, in garlic routing it is possible to aggregate multiple messages. Another difference from Tor is that the tunnels are unidirectional.
Garlic routing in I2P is adopted mainly in three distinct phases:
- For building and routing through tunnels (layered encryption). In I2P, communication tunnels are unidirectional; this means that each party has to create a pair of tunnels, one for outbound and one for inbound traffic. Since the recipient may also reply, it needs its own pair of tunnels, for a total of four tunnels.
- For bundling, determining the success or failure of end to end message delivery.
- For publishing some network database entries.
Figure 5 - Garlic Message
Comparison
The core design goal of the I2P network is to allow the anonymous hosting of services, like Tor Hidden Services, rather than to focus on anonymous access to the public Internet as the Tor network does. I2P does provide access to the public Internet via "out proxies". This functionality is offered by various internal services that proxy out onto other anonymizing systems such as Tor.
I2P APIs are designed specifically for anonymity and security, while SOCKS is designed for functionality. I2P aims to protect against the detection of client activity.
Without doubt the Tor network has greater visibility in the landscape of anonymizing networks. It is widely used by governments, hackers, and also ordinary people who in many countries try to elude censorship. The documentation for the Tor network is more detailed and complete than that of I2P, and it is available in different languages.
From a technical point of view, Tor appears more efficient due to better memory management and low bandwidth overhead for its clients. Despite these considerations, I2P services are faster than hidden services in Tor. I2P implements a performance-ranking mechanism that allows the real performance of nodes to be analyzed.
Every node in the I2P architecture is generally also a router, so there is no rigid distinction between a server and a pure client as there is in the Tor architecture. I2P implements packet-switched routing instead of circuit-switched routing, which allows better balancing of data across the network and greater reliability. Let's remember that I2P tunnels are unidirectional, while Tor circuits are bidirectional.
Unlike Tor, I2P doesn't use centralized directory servers; instead it relies on a Distributed Hash Table (DHT). A distributed architecture eliminates the risk of a single point of failure.
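As a rough illustration of how a DHT spreads lookups across nodes without a central server, the following Python sketch picks the routers "closest" to a key. It assumes a Kademlia-style XOR distance over SHA-256 hashes, which is a common DHT design; the text above does not specify the metric I2P uses, and all names here are hypothetical.

```python
import hashlib

def key_of(name: str) -> int:
    """Map any identifier into the DHT key space (256-bit integer)."""
    return int.from_bytes(hashlib.sha256(name.encode("utf-8")).digest(), "big")

def closest_routers(target: str, router_ids: list, count: int = 2) -> list:
    """Return the `count` routers whose keys are nearest to the target key."""
    target_key = key_of(target)
    # Assumption: XOR of the two keys is used as the distance
    return sorted(router_ids, key=lambda r: key_of(r) ^ target_key)[:count]

routers = ["router-1", "router-2", "router-3", "router-4"]
responsible = closest_routers("bob.dest", routers)
print(responsible)
```

Any node can compute the same answer locally, so no single directory has to be consulted, which is what removes the single point of failure.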
While Tor is developed in C, I2P is based on Java.
Conclusion
The main purpose of this article is to introduce the basics of the two most widespread software packages for anonymizing a user's experience on the web, Tor and I2P. Their importance is very high; thanks to these networks it is possible to avoid censorship and monitoring. At the moment I have meaningful experience with the Tor network; its community, as said, provides great support for users who desire or need to be anonymous on the Internet.
I believe that although I2P has existed for about a decade, it is very under-utilized; its limited community represents, in my opinion, a brake on its growth.
I have used both and found both efficient and effective. I also tried to sniff packets using specific software with the intent of disclosing navigation data or any reference to the user's identity, of course without success.
The success of an anonymizing network is related to its diffusion, and in this respect Tor is without doubt a step ahead: the more users have access to and share resources, the faster navigation will be.
References
http://www.i2p2.de/_static/pdf/i2p_philosophy.pdf
http://dougvitale.wordpress.
http://www.i2p2.de/how_intro