
Internet privacy and security beyond 2016

adolfomaria.rosasgomez@gmail.com                                     http://www.adolforosas.com

(Download this article as PDF : Security and Internet beyond 2016_v1)


We feel that we need ‘security’ in our networks especially in internet. But what do we mean by ‘security’? Do we have clarity about what we want? There are concepts that live in the security arena but are not equivalent: reliability vs. trust vs. safety, identity vs. anonymity vs. privacy.

By increasing the scale, reach and speed of our networks we are redefining what we used to name as ‘communication’. Let us explore the paradoxes that internet is introducing in human communication. Let us take a look at what we can/cannot expect today from internet and let us look beyond 2016 for the possible evolution of human communication through internet.

Modern day trade-offs

In today’s news you can easily find ‘Famous Government’ vs. ‘Famous Company’ in a clear case of privacy vs. security. You can recall a recent breach that exposed the data of customers of a business in which privacy was key, a case of privacy vs. morality. In recent years we have seen information leaks suggesting that governments perform massive-scale interception of citizens’ communications, so the leak itself is a case of the government’s right to keep secrets vs. the citizen’s right to keep secrets.

The Room

When thinking about communication in The Internet I propose to apply a simple paradigm: imagine that all the actors in a communication are people in the same closed room, with no other device than their voice and the ability to walk around. Anyone can speak aloud and publicly to the whole room, or N people can gather and ‘try’ to talk at short range in a low voice. I propose this simplified model to analyse all possible cases of privacy, anonymity and trust in today’s internet. Given the current capabilities of internet HW and SW, and given its wide reach, there are many similarities in terms of privacy and trust between The Room and The Internet.

Reliability and Privacy

We want the network to deliver the message to the intended recipient and no one else, but worldwide networks like internet cannot guarantee that a single message will ever reach the intended recipient. This statement has a meaning beyond technological prowess. High-quality fibre links have virtually no losses, and connection-oriented protocols like TCP are supposed to guarantee that every message makes it through the network, but routers, fibre links and TCP connections can be spied on and attacked by anyone with physical access to them (and in the case of internet, anyone can spy).

VPNs and shared secrets

VPNs (Virtual Private Networks) are subnetworks created ‘on top of’ or ‘through’ internet. They are a kind of private tunnel through a public space. Access to the VPN is restricted to N endpoints that share some common property. VPN technology adds reliability by making it more difficult to break into or spy on the conversation. VPNs fight impersonation, tampering, injection or deletion of messages. VPNs rely on encryption (ciphering messages), but encryption is not completely safe under attack. There are many methods to cipher data today. All of these methods rely on pieces of information, or ‘keys’, that must be known only by the two entities that communicate. The message plaintext is combined with the key using some transformation that is difficult to invert without knowing the key. Difficult, but not impossible. Inverting the transformation must be straightforward if you know the ciphered message and the key. The most obvious cipher is based on a symmetric key: the same key is used to cipher and to decipher. Having key + plaintext, the direct transformation is easy to do and it renders a ciphered message. Having the ciphered message and the key, the inverse transformation is easy to do and it renders the plaintext. Symmetric key cryptography requires that sender and receiver both have the key, so there is a ‘key distribution problem’. The transformation selected must be very difficult to invert when you have the ciphered message but not the key.
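
As a toy illustration (not a real cipher), the sketch below shows the symmetric idea in a few lines of Python: the same shared key drives both the direct and the inverse transformation, here a simple XOR with a repeated key.

    import os

    def xor_cipher(data: bytes, key: bytes) -> bytes:
        # The same operation ciphers and deciphers: XOR with the (repeated) key.
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

    key = os.urandom(16)                      # shared secret: it must reach both ends somehow
    plaintext = b"meet me in The Room at noon"
    ciphertext = xor_cipher(plaintext, key)   # sender side
    recovered = xor_cipher(ciphertext, key)   # receiver side, identical transformation
    assert recovered == plaintext

Note that a short key repeated over a lot of traffic is exactly the kind of construction that invites the statistical attack discussed next.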

Statistical attack

As long as the same key is applied to a growing amount of information (many messages), the channel between sender and receiver becomes more and more vulnerable to statistical attack. In everyday life, keys are short pieces of information compared to the amount of information they encrypt. As Claude Shannon demonstrated, the only possible escape from statistical attack is to use a key that is at least as long as the message it ciphers. Shannon’s demonstration underpins the encryption technique known as the ‘one-time pad’. Sender and receiver have a shared secret (key) as long as the message. Once the key is used to cipher a message, the key is discarded (hence the ‘one time’). To send a new message the sender must use a new key.
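
A minimal Python sketch of the one-time pad idea: the key is as long as the message, is combined with it byte by byte, and must never be reused.

    import secrets

    def otp_encrypt(plaintext: bytes):
        key = secrets.token_bytes(len(plaintext))    # key as long as the message
        ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
        return ciphertext, key                       # key must be shared out of band, then discarded

    def otp_decrypt(ciphertext: bytes, key: bytes) -> bytes:
        return bytes(c ^ k for c, k in zip(ciphertext, key))

    ciphertext, key = otp_encrypt(b"attack at dawn")
    assert otp_decrypt(ciphertext, key) == b"attack at dawn"
    # Reusing 'key' for a second message would destroy the perfect secrecy.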

Everything is a VPN

Beyond TCP we could use any imaginable protocol to make our channel through internet more resistant to losses and less vulnerable to attacks and/or spies, but from a logical point of view any imaginable ‘secure’ channel built through internet is completely equivalent to a VPN and always relies on some kind of encryption, so it shares the vulnerability of known VPNs to statistical attack.

A VPN is only as robust as its encryption technique. Establishment of a VPN link is based on the existence of secrets shared by both ends. State-of-the-art VPNs use short, static keys. How do we share these secrets? If we use internet we can be spied on and the keys can be discovered.

Public keys

A proposed solution to key distribution is public key cryptography. This is the solution adopted in certificates and in state-of-the-art VPNs. I want to share a secret (key) with many people, so I divide the key into two parts. I distribute part 1 widely (public) and keep part 2 secret (private). Anyone having part 1 can use it to cipher a message and send it to me. I can use part 2 to decipher what was ciphered using part 1, but no one having only part 1 can decipher it. If I want to reply to a message I need to know part 1 of the receiver’s key, his ‘public’ key, and he will use part 2, his ‘private’ key, to decipher. This is not really ‘sharing a secret’, as public keys are no secret (everyone knows them) and private keys are never shared. The relationship between public key and private key is what mimics sharing secrets: it mimics sharing because it exports some information about part 2, the private key, without exporting the whole key. The methods used to divide a key into public + private are difficult to invert when you only have the public key and the message but not the private key, but inversion is not impossible, it is only computationally difficult.
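
A toy numeric sketch of how the public/private pair behaves (textbook-sized numbers, purely illustrative; real keys are thousands of bits long):

    # Toy RSA with tiny numbers, illustration only.
    p, q = 61, 53
    n = p * q                     # 3233, published as part of the public key
    phi = (p - 1) * (q - 1)       # 3120
    e = 17                        # public exponent: (n, e) is "part 1", distributed widely
    d = pow(e, -1, phi)           # private exponent: "part 2", never shared (2753)

    message = 65
    ciphertext = pow(message, e, n)    # anyone holding the public key can do this
    recovered = pow(ciphertext, d, n)  # only the private-key holder can invert it cheaply
    assert recovered == message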

Out Of Band secret sharing

An alternative approach to public keys is a key distribution method based on ‘out of band secrets’. ‘Out of band’ means that we share a secret (the key) with the other end by means of some channel that is not internet. Two people in The Room can communicate aloud in front of everyone else with perfect secrecy as long as they have shared enough secrets outside The Room.

Grid Cards

As you can verify, people that need privacy over internet channels have put in place VPN-like mechanisms that rely on out-of-band secrets. Banks provide ‘pads’ (grid cards) to their customers: cards with an indexed sequence of characters. With each new transaction the bank provides an index and asks the customer to provide the indexed character. This mechanism uses the OOB secret to authenticate the user before establishing the channel. A grid card cannot hold many different keys, so the reservoir of keys is far too small to implement an OTP (one-time pad).
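
A hypothetical sketch of the grid-card exchange; the card size and alphabet below are illustrative values, not any bank’s actual scheme.

    import secrets
    import string

    # The bank prints this card and hands it to the customer at the desk (the OOB step).
    grid_card = [secrets.choice(string.ascii_uppercase + string.digits) for _ in range(50)]

    def bank_challenge() -> int:
        return secrets.randbelow(len(grid_card))      # "give me character number 37"

    def customer_response(index: int) -> str:
        return grid_card[index]                       # read from the physical card

    index = bank_challenge()
    assert customer_response(index) == grid_card[index]   # the bank checks its own copy
    # With only ~50 characters, indexes are soon reused: far too small for a one-time pad.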

E-Tokens

Some companies provide e-Tokens. Every e-Token is a portable device with a reasonably accurate local clock that is synced once to a central server at ‘device-to-user-linking time’. Every e-Token generates a short pseudorandom sequence (PIN) based on the current time and a seed, and every e-Token uses the same PRNG (pseudorandom number generator) algorithm to generate the PIN. This mechanism ensures that we can ‘share’ a secret (PIN) OOB (not using the internet) between all tokens and the server. The server ‘knows’ how to generate the PIN for the current time slot and it has the same seed, so it can check whether a PIN is good at any given time. When a user needs to start a secure VPN to the server, the user can add the PIN to his identity to qualify as a member of the server + e-Tokens closed group (a kind of ‘family’ of users). This authentication mechanism is called 2-factor authentication (password + PIN) or multi-factor authentication. It works as long as the PRNG algorithm remains unknown and the timestamp of the e-Token cannot be recreated by an attacker. The PIN is only valid and unique inside the current time slot; usually the server allows slots to be 10 s to 30 s long. Quartz IC clocks in e-Tokens have considerable drift, and they cannot be reset by the user, so if there is no resync at the server side for that user account (and there usually isn’t), after some time the PIN authentication will fail. To overcome this limitation a better (more expensive) quartz clock can be used, or the server may try to adjust to the drift of each specific user by maintaining a drift number per user and adjusting it with each new PIN authentication request. As you can see, it suffices to reveal the PRNG method and the seed to compromise the whole network, as it is not really difficult to recreate a valid timestamp to feed the PRNG inside a 30 s slot.
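
A sketch of a time-slot PIN along the lines of the well-known TOTP construction (the seed value and slot length below are illustrative): token and server run the same computation on the shared seed and the current time slot.

    import hmac, hashlib, struct, time

    def pin_for_slot(seed: bytes, slot: int, digits: int = 6) -> str:
        # The same computation runs inside the token and on the server.
        mac = hmac.new(seed, struct.pack(">Q", slot), hashlib.sha1).digest()
        offset = mac[-1] & 0x0F
        code = int.from_bytes(mac[offset:offset + 4], "big") & 0x7FFFFFFF
        return str(code % 10 ** digits).zfill(digits)

    seed = b"shared-at-linking-time"       # placed in the token and in the server, never sent
    slot = int(time.time() // 30)          # 30-second time slot
    print(pin_for_slot(seed, slot))        # e.g. '492039', valid only during this slot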

Connected e-Token

A refinement of the e-Token is the ‘connected e-Token’. This is a portable device with a clock, a PRNG, memory and a CPU with crypto functions, plus a communication link (more expensive). The physical format may be a smart card, or it can even be an app living in a smartphone. The connection to the server solves the drift problem, and that is all the merit of the device. Crypto functions are used to implement ciphered protocols that handle the synchronization. These crypto functions will normally apply a symmetric cipher to the PIN extracted from the PRNG. As you can see, the connected device does not protect the ‘family’ (the set of users that share the server) against any attack that reveals the PRNG method. An interesting property of some connected e-Tokens is that they can generate PINs in sequence, one per time slot, and provide them over a USB link to a host, which will use them to cipher a sequence of transactions (faster than entering the PINs by hand). The connected e-Token adds a weakness not present in the e-Token: synchronization takes place in-band, so it can be attacked statistically. Now there are two ways to attack the connected e-Token: 1) discover the PRNG method, 2) spy on the synchronization messages. By means of 2) an attacker can solve 1).

Secure transaction vs secure channel

As you can see, bank grid cards and e-Tokens just protect the start of a session. They protect a single transaction. The rest of the session in the VPN is protected by a static key, and no matter how securely this key is stored, the key is short compared to the message. Connected e-Tokens may protect a small number of transactions per minute: the token-to-server latency limits the minimum size of the time slot in which a PIN is unique, so forget about apps that need more than 2 to 6 short messages per minute. In internet, physical access to the links cannot be avoided. This means that all the messages can be captured and analysed statistically. The current usage of bank pads and e-Tokens provides just an illusion of privacy to users. The best we can say about grid cards and e-Tokens is that the less they are used, the more secure they are against statistical attacks. But hey, the most secure transaction is the one that never happened, so did we need to buy an OOB device to rediscover that? Definitely these devices will not work for people that want to ‘talk a lot’ privately through internet.

Identity and perfect Trust

We want to ensure that our message reaches the intended recipient and no other, but at the same time we know that there are many entities in the internet with the capacity and the motivation to detect and intercept our messages (remember The Room). Again, the only perfect method to determine identity based on messages received over internet is ‘shared secrets’. We need to ask the other end for some information that only this other end can know. As we have discussed above, OOB secret sharing is the only method that can guarantee perfect secrecy. Authentication (determination of identity) can today be done with perfect reliability as long as we have an OOB channel available (for instance we can walk to our bank’s desk to get a grid card, or we can walk to our IT help desk to get an e-Token). Authentication is easily protected by a small shared secret because it is a small and infrequent transaction: it carries little information and we do not do it 10^6 times a second, so it may be enough to replenish our shared-secrets reservoir once a month, or once a year.

The problem with current implementations of perfect authentication via OOB shared secrets is that this method is ‘only’ used to secure ‘the start’ of a connection (a VPN or a TLS session), and it is never implemented as an OTP, because keys are reused: grid cards reuse keys as the card does not hold many keys; e-Tokens have a much wider key space so they reuse less, but knowing the seed and the method you could reproduce the actual key at any moment, so the ‘real’ key is just the seed, and that seed is reused in every transaction. To simplify, let us assume that we implemented the OOB secret reasonably to protect the start of the conversation. We ‘started’ talking to the right person, but after the start an attacker may eventually break our VPN by statistical attack, and then he can read, delete or inject messages. The right solution would be to apply OOB authentication to every message. Clearly the grid card, the e-Token and the connected e-Token do not work for this purpose. Can you imagine hand-entering a 6-digit PIN for every 6 characters that you transmit? Can you imagine chatting with the other end at a pace of 6 characters every 30 s? It does not look very attractive.

Can we have perfect Trust? Trust usually means we can be assured that the message is not modified and that the identity of our partner is known. We cannot protect an internet channel of any decent bitrate using the OOB secret sharing technology available today. So no, in general, we cannot have perfect Trust. For a reduced amount of transactions, or for a low bitrate, we can use one-time pads. Two partners can agree to meet (OOB) physically once a year and share, let’s say, 1 TB, 1 PB, whatever amount of secret data they require in a physical device (HD/flash memory), and then they will consume the secret over the next year, enjoying perfect secrecy. OK, that works. But as noted it is a cumbersome technique and it has not been implemented mainstream.
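
A rough sketch of how two partners could consume such a pre-shared pad; the file names (pad.bin, cursor.txt) and the layout are purely illustrative assumptions.

    # pad.bin is filled with random bytes during the physical meeting; both partners keep an
    # identical copy plus a cursor that records how much of the pad has already been consumed.

    def send_with_pad(message: bytes, pad_path: str = "pad.bin", cursor_path: str = "cursor.txt") -> bytes:
        with open(cursor_path) as f:
            cursor = int(f.read())
        with open(pad_path, "rb") as f:
            f.seek(cursor)
            key = f.read(len(message))
        if len(key) < len(message):
            raise RuntimeError("pad exhausted: meet again and share a new one")
        with open(cursor_path, "w") as f:
            f.write(str(cursor + len(message)))      # never reuse consumed key material
        return bytes(m ^ k for m, k in zip(message, key))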

Anonymity

Anonymity in communications may have two meanings: 1) I communicate with N receivers, none of the N can know my identity, and interceptors cannot know my identity; 2) I communicate with N receivers, all of them know my identity, and interceptors cannot know my identity. Since at any time during my communication I can explicitly reveal my identity, the important difference between 1) and 2) is that 1) describes a mechanism in which a receiver must accept a message from an unidentified sender (as in telephony), while in 2) there cannot exist unidentified senders but there are identity-hiding techniques aimed at interceptors. Internet today is in case 2). It is not possible to hide the origin of a message; it can always be tracked. There are mechanisms to obfuscate the identity of the sender (the Tor network), but these methods only make the task difficult, and this difficulty can be overcome with a decent amount of computational power.

Do we really want anonymity? Anti-tampering

In the phone world there is no real anonymity, as any call can be tracked by the carriers if there is motivation (a court order, for example). But outside those extreme cases, it is possible, and really annoying, to receive calls from unidentified callers. Many people have developed the habit of not taking unidentified calls, which is a very reasonable choice. In internet it is not really possible to hide the sender address. Yes, there are people with the capability to do ‘spoofing’: tampering with the lower-level headers and faking the sender address in a message. This spoofing technique looks scary at first sight, but then you must remember that the address of the sender, like any other flag, header or bit of information in a message, is unprotected in internet and can be tampered with. That tampering capability means that the message can be modified, faked or even destroyed, but it does not mean that the message can be understood by the attacker. Without understanding the message semantics, it is easy for the two ends that communicate to devise mechanisms that warn of tampering: checksums, timestamps, sequenced messages, identity challenges and many others. These mechanisms will use OOB information, so they cannot be attacked statistically. So, no, we do not want or need anonymity, and we are not afraid of message tampering as long as we have enough OOB secrets and we know how to build an anti-tampering protocol based on them.
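
A sketch of such an anti-tampering envelope, assuming a secret shared out of band: each message carries a sequence number, a timestamp and a keyed checksum (HMAC) that the receiver can verify.

    import hmac, hashlib, json, time

    oob_secret = b"secret shared outside the internet"   # illustrative value
    sequence = 0

    def protect(payload: str) -> dict:
        global sequence
        sequence += 1
        envelope = {"seq": sequence, "ts": int(time.time()), "payload": payload}
        body = json.dumps(envelope, sort_keys=True).encode()
        envelope["mac"] = hmac.new(oob_secret, body, hashlib.sha256).hexdigest()
        return envelope

    def verify(envelope: dict, last_seq: int, max_age: int = 30) -> bool:
        received_mac = envelope.pop("mac")
        body = json.dumps(envelope, sort_keys=True).encode()
        good_mac = hmac.compare_digest(received_mac,
                                       hmac.new(oob_secret, body, hashlib.sha256).hexdigest())
        fresh = abs(time.time() - envelope["ts"]) <= max_age      # reject stale replays
        in_order = envelope["seq"] == last_seq + 1                # detect deletion/injection
        return good_mac and fresh and in_order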

Current levels of Trust

It is interesting to note that the internet that we have in 2016 does not provide what we demand from it in terms of security. As we have briefly reviewed, everyone is in need of a VPN to connect to every other person or service. This is not happening yet. Even in the case of a hypothetical VPN boom tomorrow morning, every commercial VPN is vulnerable to statistical attack, so we would just be reducing the set of attackers that can do harm to those with patience and big computers: governments? big corporations? organized crime? Can we really implement VPNs based on OTPs that in turn rely on OOB secrets? Well, we can do it on a one-by-one basis, so if we meet someone in the real world and we have periodic access to this person in the real world, we can replenish our OOB secrets and conduct perfectly secret VPN traffic. But as you can easily see, we would not like to do that for every relationship that we have today through internet, with everyone and with every service that we use. And by the way, current commercial VPNs do not implement proper OTP.

Devaluation of privacy, identity, responsibility and trust

So no, in internet we don’t trust. We can’t. Those with private info, important transactions or a lot of responsibility know how to use OTP based on OOB secrets. Those who don’t, maybe you, are probably not aware of the solution, or of the perils of not having a solution. The result is that people do not expose through internet those bits of information that are really valuable to them, unless they have no other option. If you suspect that your bank’s grid card is not secure enough for your operations, you have very little option beyond doing every transaction personally at your bank’s desk. To buy a book via internet you are not going to worry. If you are the target of an online fraud you will take it as a risk of modern life. If someone impersonates you on LinkedIn or Facebook, things may get more serious. You may end up in court. Even in that case, what can you do? Are you going to ask LinkedIn or Facebook to implement OTPs? I don’t think so. How could they do it? Will they have a LinkedIn or Facebook desk in every small village of the world to share OOB secrets with billions of users?

We are seeing increased usage of VPNs, especially for remote workers. We are also seeing increased usage of multi-factor authentication, naturally for VPNs and remote workers, but it is also becoming common for wide-spectrum services like Facebook, Gmail, LinkedIn and others. Trust is ‘forced’. We trust online retail platforms because we want to shop online; we cannot live without that. But we will not shop in the first online portal that we bump into. Prestige in the online world is more important than ever. Companies that have been longer in the market and have a track record of few or no data leaks or compromised transactions will be the choice.

What to expect in the near future

Internet evolution is accelerating. Many companies that are in business today will not be in 5 years. Many companies that do not exist while I write this will become dominant 5 to 10 years from now. In terms of security we cannot expect a breakthrough in such a short time. We may see some sophistication reaching the common internet user. We can expect to have free personal VPN services with state-of-the-art security, which is not really 100% secure, but it is what ‘free’ will buy you in the short term. VPNs for businesses will grow in security. The best of them will opt for higher levels of encryption, maybe even OOB + OTP. Services that have a wide range of users will target multi-factor security for authentication and transactions. They will soon surpass the level of security that we can currently find in banks.

Banks need to evolve

Banks really do not seem to be taking the challenge very seriously. The technology that they use is far too old and insecure to be dealing with customers’ money. As the non-banking world evolves to provide electronic payment, we can assume that banks will improve their technology to attract customers. One of the first moves must be to provide VPNs to all their customers and better, more complex OOB secrets handed out at their desks. Grid cards are no good for protecting frequent transactions. As micropayments for online services become much more popular (they are already popular now), and thus much more frequent, grid cards need to be replaced by an OOB method with a much wider key space. I do not think e-Tokens are a good replacement. Much better would be a gigantic grid card implemented as an indexed ROM created per user: let’s say 32-64 GBytes of random bytes burned on an unalterable IC given to every customer. Add a display and a keyboard to enter the index and you are done. This kind of IC can be created today and is affordable for banks. The eGridCard must not have connectivity. Any connectivity would make it weaker, as the keys could be spied on over USB, wifi or any other kind of link.
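
A hypothetical sketch of how the bank side of such an eGridCard could work; the ROM size, key length and challenge policy below are illustrative assumptions, not an existing product.

    import secrets

    # Sketch of the 'gigantic grid card': a per-customer ROM of random bytes, indexed by the bank.
    # A real card would hold 32-64 GBytes; the demo ROM below is tiny just to keep the example runnable.
    rom = secrets.token_bytes(4096)          # burned once per customer; the bank keeps an identical copy
    KEY_LEN = 8                              # bytes revealed per transaction
    used = set()

    def bank_challenge() -> int:
        while True:
            index = secrets.randbelow(len(rom) - KEY_LEN)
            if index not in used:            # never ask for the same region twice (one-time use)
                used.add(index)
                return index

    def card_response(index: int) -> str:
        return rom[index:index + KEY_LEN].hex()   # shown on the card's display, typed by the customer

    index = bank_challenge()
    print(index, card_response(index))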

Social and Retail

Multi-factor authentication will take over. Social networks do not have a great return on each individual member (a few cents per year from ads), so they are unlikely to invest in HW OOB + OTP, but I can see cheaper multi-factor coming: adding phone numbers to your identity (the phone network is a good OOB separate channel). I also see premium services coming from social networks. Paid premium services make it possible to provide OOB + OTP HW, as described for the case of banks. Online retail sites and premium online social networks can offer true privacy to their members via eGridCards, at least to protect the start of a session. To protect long messages we will need a better way to share a huge secret.

Professional big secret sharing

Corporations wanting to seriously protect a channel, not just a transaction, will push the state of the art of VPNs. Combining the VPN methods for reliable channels (sequencing, timestamping, identity challenges, checksums, multiple handshakes and others) with OOB + OTP will make corporations much safer. This effort will require new HW and new SW. In contrast to protecting a single transaction, protecting a channel requires a continuous feed of secrets to the transmission device. This feed cannot be delegated to a human (as in an e-Token or grid card), but we cannot rely on an ‘open interface’ such as USB, Ethernet, radio or any other existing link either. The solution that comes to mind is that the secret-holding HW must be coupled to the internet-connected device only briefly, while the channel is active, and the coupling must be a one-way interface that does not work from the internet side. This kind of HW interface is not available today (at least it is not mainstream), but there is no difficulty in building it.

Size of secrets

We can speculate that any ‘decent’ communication today is very likely to move from KBytes per minute to MBytes per minute. Non-media-intensive talks will be in the lower range of 1 kbps to 100 kbps, while state-of-the-art media-mixed talks may be 100 kbps to 500 kbps, and some rich-media talks will reach the Mbps range (1 Mbps to 5 Mbps). This speculation applies to very general communication taking place in social media and to micro-transactions in banking and retail (small photographs included), on mobile terminals and desktop computers. In other, more professional environments like VoIP and videoconferencing we may move up the Mbps scale. If we want to protect a channel of 1 Mbps that is active 8 h/day, 300 days/year, we need 8.64×10^12 bits (8.64 Tbits = 1.08 TBytes). It will be easy to build OOB shared secrets worth 1 TByte/year. A cheap HD will do.
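
A quick back-of-the-envelope check of that figure:

    # One-time-pad key material needed to cover a 1 Mbps channel, 8 h/day, 300 days/year.
    bitrate = 1_000_000                     # 1 Mbps
    seconds = 8 * 3600 * 300                # active seconds per year
    bits = bitrate * seconds                # 8.64e12 bits of key needed per year
    print(bits, bits / 8 / 1e12, "TBytes")  # 8640000000000  1.08 TBytes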

Internet fabric

Internet is made of routers and links. We have said that every link and every router is accessible to eavesdroppers today, which is true, and you had better act as if you believe that statement. Internet is multi-tenant (many entities own the routers and the links), so we could reasonably guess that some portion of internet could be hardened against eavesdroppers while remaining standards-compliant in its connection to the rest of internet. Yes, this can be done by replacing every router in that area with routing machines that cipher every single packet that goes through them using OOB + OTP secrets. Ciphering works end to end in the area where the secret is shared. As this area cannot be the whole internet, we can think of a new kind of router that admits periodic replacement of a HW storage module containing OOB secrets. All routers in the area will receive the module, let’s say once a week or once a month. Modules will be written centrally at the network owner’s premises. Traffic that comes into that ‘spot’ in internet will be ciphered via OOB + OTP, so only routers in that spot will understand the low-level packets. Egressing traffic will be just ‘normal’ traffic, as low-level packets will be deciphered at the border. The spot will be transparent to the rest of internet, but now traffic cannot be spied on inside that spot. This is a business advantage: if a customer’s traffic originates in that spot and terminates in that spot it will be much more secure, and the customer does not need to do anything special. This claim may attract final customers to a specific ISP or telco or network service provider. This could be called an STN (Secure Transaction Network), by similarity to a CDN, which is a closed infrastructure. Today we call a Software Defined Network an SDN. Interestingly, SW-defined networking will make it much easier to build custom routers and thus STNs. Imagine how easy it will be to build a ‘device’ out of a fast packet-forwarding engine (HW based) plus SDN modules, written in house, that cipher every packet with OOB + OTP and support our proprietary storage module. I would move from my current ISP to another ISP that can swear (and demonstrate) that my traffic will ONLY go through this kind of routers in my country. At least I could reach my bank and a good number of friends and family inside a secure spot.

Standards

It is very unlikely that we will see a new standard appear that includes ciphering in the base internet protocols to transform all routers into secure routers. Even if we see that standard appear in the next few years (5 years), it will be based on classical cryptography, which is vulnerable to statistical attack. This is due to the impossibility of specifying OOB mechanisms in a standard, and to the fact that very few global-coverage networks exist that are not internet-accessible (OOB). The two most practical networks that can be used for OOB are people carrying secrets in their pockets and the voice (non-data) phone network; the second is much less reliable as an OOB channel than the first. Even if an agreement were reached on an OOB method (impossible in my view), adoption through a significant part of internet would take over 10 years, which would render the effort useless.

Conclusion

You have to do your part. If you want to have an increased level of privacy you cannot count on current privacy protection from internet links and/or routers, internet protocols, bank grid cards, e-Tokens, or VPNs. You cannot count on this situation improving to a practical level over the next 5 to 10 years. You can implement some sort of OOB + OTP today on your own. Just look for the pieces of technology out there to implement your secret sharing at the level that you require.


CDNs and Net Neutrality

(Download this article as PDF:  CDNs and Net Neutrality)

 


1. Introduction

These weeks many articles appear in favour of (and very few against) net neutrality. This ‘net neutrality’ topic has been present in the legal, technological and business worlds for years, but it is gaining momentum, and now every word spoken by the FCC or by any of the big players in these issues unleashes a storm of responses.

In this hyper-sensitive climate it may seem that anyone could have and show an opinion.

I’ve seen people that do not work in the Internet, do not work for the Internet, do not work by means of the Internet, and who do not have a background in technology, legal issues or social issues, shout their opinion in many media. Defending net neutrality is popular today. It sounds like defending human rights.

The natural result of this over-excitement is that a lot of nonsense is being published.

How can we harness such a difficult topic and bring it back to reason?

In this article I will try to pose the ‘net neutrality’ discussion in the right terms, or at least in reasonable terms, connected to the roles that Internet has reached in technology, society, economics and in specific businesses, such as content businesses and CDNs, which are especially affected by this discussion.

 

2. What do they mean by Net Neutrality?

The whole discussion starts with the very definition of Net Neutrality as there are many to choose from. The simpler the better: Net Neutrality is the policy by which we ensure that no net-user is ‘treated differently’ or ‘treated worse’ than other net-user for the sole reason of having a different identity.

I have selected a definition that on purpose avoids business terms and technology terms. This is a good starting point to dig into the meaning of ‘net neutrality’ and to inquire into the massive response that it is raising.

What does ‘policy’ mean in the ‘net neutrality’ definition? A policy is a consolidated behavior. It is a promise of future behavior. It is a continued and consistent practice.

What is the ‘net’ in the ‘net neutrality’ definition? It is Internet. This means every segment of network that talks IP to/from public IP addresses.

Who is the ‘net-user’ in the ‘net neutrality’ definition? Anyone linked to a public IP address; anyone who can send and receive IP packets through a public IP address.

What is ‘treating someone differently’ or ‘treating someone worse than others’ in the ‘net neutrality’ definition? As we are talking about network-level behaviors, treating means dealing with network-level objects: packets, addresses, traffic… So we can translate that ambiguous ‘treating worse’ to: handling packets, addresses or traffic from someone differently/worse than we handle packets, addresses or traffic from anyone else, just because we know who the traffic originator is.

How can we ‘deal worse’ with packets/traffic? There are some network-level-actions that affect traffic adversely and thus can be interpreted as ‘bad treatment’: delaying/dropping packets in router queues.

Why would anyone delay/drop packets from anyone else in a router queue? There is no reason to harm traffic flow in a router just for the sake of doing it or to bother someone. It is plain nonsense. But every minute of every day routers delay packets and drop packets… why? The reason is that routers are limited and can only deal with X packets/s. In case they receive more, they are forced to ignore (drop) some. This fact should be no major drama, as most transport protocols (TCP) deal with lost packets, and all end-to-end applications should deal with continuity of communication no matter which transport they use. The only effect that we can acknowledge from packet drops is that they create a ‘resistance’ to making packets go through Internet that increases with network occupation, but this ‘resistance’ is statistically distributed across all net-users that have communications going through each router. No one sees a problem in that. It is the same condition as a crowded highway: it is crowded for the rich and for the poor… no discrimination. Classic routing algorithms do not discriminate individual IP addresses. They maximize network utilization.
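
A minimal sketch of that best-effort behaviour: a finite first-come, first-served queue that drops whatever does not fit, without looking at who sent it.

    from collections import deque

    class RouterPort:
        # Best-effort, first-come first-served: packets beyond the queue capacity are dropped,
        # regardless of who sent them.
        def __init__(self, capacity: int):
            self.queue = deque()
            self.capacity = capacity
            self.dropped = 0

        def enqueue(self, packet) -> bool:
            if len(self.queue) >= self.capacity:
                self.dropped += 1          # no discrimination, just no room left
                return False
            self.queue.append(packet)
            return True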

 

3. Does (recent) technology threaten net neutrality?

Although congestion problems have been present in internet from the beginning of this now famous network, some people have recently developed a tendency to think that router owners will decide to drop some specific net-user’s packets all the time, just to harm this user. But why complicate the operator’s life so much? It is easier to let routers work in best-effort mode, for instance with a first-come, first-served policy. This is in fact the way most internet routers have behaved for years. Just consider that ‘routing IP packets’ is a network-layer activity that involves only checking the origin IP address, the destination IP address and the IP priority bits (optional). This is very scarce information, which deals only with the network level, without identifying applications or users, and even for those ‘quick routing choices’ the routing algorithms are complex enough, and internet big enough, to have kept routers for years under line-rate speed (usually well under line-rate speed), even the most expensive and most modern routers. ‘Priority bits’ were rarely used by operators until recently, as creating a policy for priorities used to degrade router performance badly. Only very recently has technology got over that barrier. Read on to see whether we can go much further.

As technology evolves it is now possible (in the last few years, maybe 5) to apply complex policies to routing engines so they can ‘separate’ traffic into ‘classes’ according to multiple criteria that go beyond the classic layer 3 info (origin, destination, priority). With the advent of line-rate DPI (Deep Packet Inspection), some routing and prioritization choices can be taken based on upper-layer info: the protocol on top of IP, such as FTP, HTTP, MAIL… (this information belongs to layers 4-7); the SW application originating the packets (this info belongs to layer 7, and has been used to throttle P2P, for instance); the content transported (layer 7 info, which has been used to block P2P downloads)…

So it is only now (maybe in the last 5 years) that it is commercially possible to buy routers with something close to line-rate DPI and program them to create first-class and second-class Internet passengers according to many weird criteria. It is possible, yes, no doubt… but is it happening? Does it make sense? Let’s see.

 

4. What is an ISP and what is ‘Internet Service’?

Internet Service Providers, taken one by one, do not own a significant portion of the Internet; nobody does. So how can anyone offer you ‘Internet Service’? Do you think that all of them (the ISPs of the world) have a commercial alliance so that any of them can sell you the whole service on behalf of all of them? No.

Then, what is Internet Service?

We could define IS as being close to a ‘Carrier Service’, that is, a point-to-point messaging service. This basic service takes a ‘packet’ from one entry point (a public IP address) and delivers this packet to another point (another public IP address). That is all. Well, it is in fact usually much less than that. This ‘point-to-point service’ is not exactly what ISPs sell to us. In case both points lie in the ISP network then yes, the ISP contract means end-to-end service inside that ISP’s ‘portion of internet’ (nothing very attractive for anyone to pay for), but what if the ‘destination point’ is outside the ISP? Does our ISP promise to deliver our IP packet to any public IP? Nope. Here lies the difference with Carrier Services. Internet Service cannot be sold as an end-to-end service. It is impossible to provide that service: it is physically impossible to reach the required number of bilateral agreements to have reasonable confidence that an IP packet can be delivered to any valid IP address. Internet Service is an ‘access service’. This means that you are promised all reasonable effort to put your packet ‘on the way’ to the destination IP, but they never promise to deliver. What does ‘all reasonable effort’ mean? This is the subject of much controversy, but usually national laws of fair trade force the ISP to behave ‘correctly’ with locally originated traffic and ‘deliver’ this traffic in good condition to any Internet exchange or any peering point with any other company. That is all. Is this good for you? Will this ensure your IP packet is delivered, or better, ‘delivered in good time’? Nope. As internet exchanges and peering points may distract us from the focus of our discussion, let’s save these two concepts for a couple of sections later (see section 6).

(NOTE: we will see later that Internet Service, not being end to end, is not currently under the legal denomination of ‘Common Carrier’, and that is extremely important for this discussion.)

The Internet service is essentially different from ‘Carrier services’ that you may be used to.

It is important to review classic Carrier services in search of resemblances & differences to Internet Service.

 

5. Classic Carrier Services

The paradigm of Carrier Services is ‘snail mail’, or the traditional postal service. No company owns the whole world’s postal resources, but there is a tacit agreement between all of them to ‘terminate each other’s services’. Each postal company (usually owned by a government) deals internally with in-house messages, originated in its territory and addressed to its territory. When a message is addressed ‘out of territory’, the company at the origin requests a fee from the sender that is usually proportional to the difficulty (distance) to the destination. At the destination there is another postal company. This worldwide postal transport is always a business of two companies. The biggest cost is moving the letter from country to country, and there is no controversy: the postal company at the originating country takes the burden and cost of moving the letter to the destination territory, and this effort is pre-paid by the sender. For the company at the destination, doing the final step of distribution does not take more effort than dealing with a local message. Of course the letter must be correctly addressed.

These two companies usually do not exchange money. They consider all other postal companies to be ‘peers’, as roughly the same effort (cost) is involved in sending letters ‘out’ of territory (few letters, but a high cost per letter) as in distributing foreign letters ‘in’ (many more letters, but at a low local cost per letter). Roughly, each company will spend the same sending out as it could charge all the other companies for delivering their messages, so it is just more ‘polite’ not to receive payment for dealing with foreign origins and, in return, not to pay for sending messages abroad. Notice also that the local postal company does not need to ‘prepare’ anything special to receive letters from out of the country. The same big central warehouse that is used to collect all local letters is used to receive letters from abroad. This has worked well for postal companies for hundreds of years and it still works. Of course, if a country fell into the rare state in which no local people sent letters out while local inhabitants received tons of letters from other countries, the local postal company would have big losses, as it has costs but no income. Anyway, these situations are rare, if possible at all, and postal companies have usually been subsidized or owned by the local government, so losses have been taken as a fact of life.

Important facts to remember about this service: the sender pays per message. The originator company bears the whole cost of moving messages to destination postal company. Each destination may have a different price. Each destination may have a different delivery time. Letters above a certain size and weight will have extra cost proportional to actual size and weight and also to distance to destination. Local companies do not create additional infrastructure to receive foreign messages. There are no termination costs.

Another, more modern Carrier Service is wired telephony. As in the postal service, no company owns the whole telephony network. As in the postal service, there are local companies that take incoming calls from their territory and deliver calls to their territory. When a call goes out of territory the caller must do something special: he must add a prefix identifying the destination territory. In the destination country a local company has an explicit (not tacit) agreement with many other companies out there (not all) to terminate the incoming call. As in the postal service, the termination business always involves exactly two companies, and the highest cost is transporting the call to the ‘doors’ of the destination company. As in the postal service, the caller (sender) pays for the extra cost. An important difference is that the caller usually pays at the end of the month for all the calls, not in advance.

Again these telephony companies consider themselves ‘peers’, but with some important differences: in this service it is necessary to build and pay for a physical connection from company to company. In the postal service the originating company was free to hire trains, planes, trucks, ships or whatever means to carry letters to the door of the local postal company. The volume of letters may vary substantially and it does not mean big trouble for anyone except the originator, which must pay for all the transport means; the receiving infrastructure is exactly the same as for local letters and it is not changed in size or in function by the foreign workload. In telephony, the local company must allow the entrance of calls at a specific exchange station. Telephony works over switched circuits, and this means that the originating company must extend its circuits to other companies in other countries and connect on purpose, through a switch, into the other company’s circuits. This now has a cost (which is not minor, by the way). More important: this cost depends on the forecast capacity that this exchange must have, the estimated number of simultaneous calls that may come from the foreign company. Now the infrastructure for local calls cannot simply be ‘shared’ with foreign calls. We need to add new switches that cannot be used by the local workload. Notice that every new company out there wanting access to my company’s circuits will require additional capacity from my switches. No telephony company will carry alone the cost of interconnection to other companies in other countries. Now ‘balance’ is important. If telephony company A sends X simultaneous calls to company B, and company B sends Y simultaneous calls to company A, it is very important to compare X to Y. In case they are similar, X~Y, ‘business politeness’ leads to no money being exchanged. In case A sends much more than B, X>>Y, B will charge for the extra cost of injecting A’s calls into B’s circuits. Remember that callers pay A, but B terminates the calls and receives nothing for doing that.

Important facts to remember about this service: caller pays per call (or per traffic). The originator company bears the cost of extending circuits to the ‘door’ (switch) of the destination company. Each destination may have a different price for calls. Cost will be proportional to call duration and distance to destination. Local companies MUST create (and pay for) specific infrastructure (switches) and MUST reserve capacity PER EACH foreign company. This infrastructure MUST be planned in advance to avoid congestion. Cost of infrastructure is proportional to expected traffic (expected incoming simultaneous calls). There are termination costs in case of unbalanced traffic between companies.

 

6. Internet Service compared to Carrier Services

The Internet Service is sometimes viewed as similar to telephony. In the end, in many cases telephony companies have picked up the responsibility (and the benefits) of providing Internet Service. But Internet Service is an access service, not an end-to-end service. How is this service built and run? An ISP builds a segment of IP network. If there are public IP addresses inside and they are ‘reachable’ from other public IP addresses, this segment is now part of internet. For the ISP it is no big deal to move packets to and from its own machines, its own IP addresses. The small ISP just applies ‘classic routing’ inside its segment. (Applying classic routing means: all routers in this small network share a common view of this small network and they run well-known algorithms that can determine the best path, or route, crossing this network from machine 1 to machine 2, possibly jumping through several routers inside the network. These routers form a distributed implementation of a shortest-path algorithm based on selecting the next hop from a regularly recomputed routing table. As the required capacity of the routers depends on the number of IP addresses managed and the number of routers inside this ‘small’ network, there is a limit in cost and performance to the size of a network that can apply classical routing.)
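
A compact sketch of the kind of shortest-path computation such routers perform (a Dijkstra-style calculation over an illustrative three-router topology); real implementations are distributed and far more elaborate.

    import heapq

    def shortest_paths(graph: dict, source: str) -> dict:
        # graph: {router: {neighbour: link_cost}}; returns the next hop towards every destination.
        dist = {source: 0}
        next_hop = {}
        heap = [(0, source, None)]
        while heap:
            d, node, first = heapq.heappop(heap)
            if d > dist.get(node, float("inf")):
                continue                         # stale entry, a shorter path was found already
            if first is not None:
                next_hop[node] = first
            for neigh, cost in graph[node].items():
                nd = d + cost
                if nd < dist.get(neigh, float("inf")):
                    dist[neigh] = nd
                    heapq.heappush(heap, (nd, neigh, first if first else neigh))
        return next_hop

    net = {"A": {"B": 1, "C": 4}, "B": {"A": 1, "C": 1}, "C": {"A": 4, "B": 1}}
    print(shortest_paths(net, "A"))   # {'B': 'B', 'C': 'B'}: A reaches C through B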

What is interesting is what happens when destination IP is out of the ‘small network’. The new segment of internet does not have a clue about how to deliver to destination IP. That destination IP may be at the other end of the world and may belong to an ISP we have never heard of and of course we do not have business with them. The ISP does not feel bad about this. It is confident that ‘It is connected to internet’. How? The ISP is connected to a bigger ISP through a private transit connection and the smaller ISP pays for transit (pays for all the traffic going to the bigger ISP), or it is connected to a similar ISP through a peering connection, or it is connected to many other ISPs at an internet exchange. Usually peering happens between companies (ISPs) that are balanced in traffic, so following the same reasoning that was applied to telephony they do not pay each other. Internet exchanges are places in which physical connection to others is facilitated but nothing is assumed about traffic balance. The Internet Exchange is ‘the place’ but the actual traffic exchange must be governed by agreements 1-to-1 and can be limited to keep it balanced as ‘free peering’ (no charge) or on the contrary it may be measured and paid for as a ‘paid peering’.

We have said that smaller ISPs pay for transit. What is ‘transit’? Small ISPs route packets inside their small network, but to connect to internet they must direct all outgoing traffic through a bigger ISP’s router. This bigger ISP will see all the IP addresses from the small ISP as its own addresses and apply classical routing to and from its own machines. The bigger ISP supports the whole cost of the transit router. For an ISP to accept traffic in transit from smaller ISPs, the transit routers must be dimensioned according to the expected traffic. This big ISP may not be very big, so it may in turn direct traffic through transit to a bigger ISP… At the end of the chain, the biggest, worldwide ISPs are called ‘tier 1’. These are owners of huge networks; they are all connected to all the other tier 1’s. They see the IP addresses of other tier 1’s through ‘peering points’ in which they put powerful routers. The cost of the peering infrastructure is supported by both ISPs connecting there. They do not pay each other for traffic, but they invest regularly in maintenance and capacity increases. It is of key importance to both peers to account for the traffic going through the peering point in both directions. They must keep it balanced. If an imbalance occurs, it is either corrected or the ISP that injects substantially more traffic will have to pay for the extra effort it is causing on the other side.

We have not yet shown that the IP packet coming from the small ISP can find its way to the destination IP. Let’s say that the destination IP belongs to a small ISP that is 20 ‘hops’ (jumps) away from the origin. In the middle there can be ten or more middle-sized ISPs that pay for transit to bigger ISPs, and there may be 3 tier 1 ISPs that peer with each other. The IP packet will be moved onto a higher-rank network ‘blindly’ on its way up for a single reason: all routers on the way notice that they do not know where the destination IP lies, so their only possibility is to send the packet through the door (port) marked as ‘way out into internet’. At some point on this way up, the IP packet will reach a tier 1 network that talks BGP to other tier 1’s. Some portion of the destination IP will match a big IP pool that all BGP-speaking machines handle as a single AS (Autonomous System). Many tier 1’s have one or more ASes registered. Many ISPs that are not tier 1’s also talk BGP and have registered one or more ASes. What is important is that routers that talk BGP have a way of telling when an IP address is the responsibility of some AS. Let’s say that, in this case, the first moment at which our destination IP is matched to a ‘responsible destination network’ happens at a tier 1 router talking BGP. This router knows one or more ways (routes) from itself to the destination AS, so it simply sends the packet to the next hop (router) that best suits its needs. The next router does the same, and in this way our IP packet traverses the tier 1 networks. At one of the tier 1’s, the destination IP address will be matched to a local sub-network; this means our packet can now be routed through classical routing algorithms. This classical routing will make our packet go down from bigger ISPs, through transit routers, to smaller ISPs, until it reaches the destination machine.
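
A small sketch of the forwarding decision described above: the router picks the most specific prefix it knows (longest prefix match), and anything it cannot match more precisely goes out through the default ‘way out into internet’. Prefixes and port names are illustrative.

    import ipaddress

    routing_table = {
        ipaddress.ip_network("0.0.0.0/0"): "port-to-transit-provider",    # default: way out into internet
        ipaddress.ip_network("203.0.113.0/24"): "port-to-peer-AS",        # learned via BGP, illustrative prefix
        ipaddress.ip_network("203.0.113.128/25"): "port-to-local-subnet", # classic routing inside our network
    }

    def next_port(destination: str) -> str:
        dest = ipaddress.ip_address(destination)
        matches = [net for net in routing_table if dest in net]
        best = max(matches, key=lambda net: net.prefixlen)    # longest prefix match
        return routing_table[best]

    print(next_port("203.0.113.200"))   # port-to-local-subnet
    print(next_port("198.51.100.7"))    # port-to-transit-provider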

What has happened in comparison to our good old carrier services? Now no one ‘in the middle’ knows what happened to the packet. They only know they treated this packet ‘fairly’. Essentially, transit ISPs just worry about the statistics of dropped packets in their transit routers and make sure that the number of dropped packets stays within a reasonable margin. For instance, 1 drop per 10^5 packets is not a drama for any transit router. But notice that a sudden change in a remote part of the world may increase this router’s losses to 1 drop in 10^3 packets, and there is little that the router owner can do. In fact all he can do is rely on global compensation mechanisms implemented at the highest routing level (BGP) that are supposed to slowly balance the load. In the meantime lost packets are lost, and they must be retransmitted if possible. In any case, the application that is managing the communication will suffer in one way or another.

It is now impossible to plan for capacity end to end, as routes traverse many companies, and these companies are left with the only resource of measuring their bilateral agreements and reacting to what happens. The transit companies cannot know when traffic is going to increase, as it may very well be that the originator of the traffic does not have contracts with any of them, so this originator is not going to inform all the companies in the middle of its capacity forecasts. It is especially hard to accept that, due to the dynamic nature of routing, the same traffic may at one moment put effort on a certain set of companies and at the next moment put effort on a different set of companies. For this reason Internet Service is NOT a Carrier Service; it does NOT carry a message ‘end to end’. The IP protocol works end to end, but it is in practice impossible to establish the responsibilities and trade duties of the companies involved in the journey of a single packet. It is impossible to tell a customer of an ISP who (which companies) have taken part in carrying his packets to destination. It is impossible to tell him the average quality of the services that his packets have used to reach the destination. Worse than all of this, it is impossible for all the companies in the middle to prepare to give good service to all the traffic that may possibly go through their routers at any time.

So in this terrible environment the companies carrying packets fall back to ‘statistical good behavior’.

For this reason ISPs cannot charge their users for ‘transactions’, as they are not responsible for terminating a transaction, nor are they able to set up a commercial agreement with one, two or one hundred companies that could make them assume the responsibility of guaranteeing the transaction. So, as they do not charge per transaction, they need a different business model to take traffic from net users. They have decided that the best model is to charge per traffic, considering traffic in a wide statistical sense. In the past, and still in mobile data today, they charge for the total amount of information sent over a period of time: GBytes/month. It is more common to charge for ‘capacity available’ at the access, not for ‘capacity used’: Mbps up/down. This is a dirty trick and a business fallacy, as you may easily see: you are receiving an access service, and your ISP wants to charge you $40 a month for a 5/50 Mbps link, no matter whether you use it or not. But does this mean you can feed 5 Mbps to any destination IP in the Internet? Or do they guarantee you can receive 50 Mbps from any combination of other IPs? Of course not. How could it be? Your ISP can at best move your 5 Mbps up to the next router in the hierarchy. But, as you will see, no ISP in the world will make a contract with you promising to do that. They will say they do not know how much of those 5 Mbps can go out up into internet.

I think it would be fair to force an ISP to measure incoming and outgoing throughput as an aggregate at the network edge. This means measuring all contributions in from accesses + transit + peering, then measuring all contributions out to accesses + transit + peering. Of course it is impossible to tell how many customers need to go ‘out’ at any moment, so outgoing throughput may sometimes be a small fraction of incoming throughput. This ratio will probably vary wildly over time. The only number that should be forced as a quality measure onto the ISP is this: the aggregate number of bytes taken from users at the edges of the network (from accesses + incoming transit + incoming peering) must equal the aggregate number of bytes delivered to the network edges (out to accesses + out to transit + out to peering). If an ISP claims to provide ‘Internet Service’ it should be prepared to handle any situation, any possible distribution of incoming bytes to outgoing bytes.

Notice that it is cheaper if all bytes from local accesses go to other local accesses in the same ISP; in this case transit and peering are unused. Much more dramatic is the case in which all accesses suddenly need to send packets out through a transit router. This will not work in any ISP in the world: packets will be lost massively at the transit routers. The usual business is to estimate the amount of outgoing traffic as a fraction of the local access traffic and then make transit contracts that guarantee ONLY that fraction of throughput out. These contracts are reviewed yearly, sometimes monthly, but there is not much more flexibility. A sudden surge of traffic still can, and often does, cause bad experiences to users who coincide in trying to get their packets through internet at the same time. This ‘resistance’ to getting through is experienced in different ways by different applications: email will not suffer much, Voice over IP and videoconference will be almost impossible, and viewing movies can be affected seriously depending on the buffering model and average bitrate.

You could hardly convince an ISP to over-dimension its transit contracts out. What happens if this ISP sees the transit unused for 12 months while still having to pay a tremendous amount of money for those expensive transit contracts? Naturally the ISP will soon reduce the contracts to the point at which the transit carries just the traffic the ISP is paying for. Unfortunately the interconnection machinery cannot be made flexible; it cannot be made to increase capacity as needed. For many reasons this is impossible. As you can see, a cascading effect will be created if a few ISPs start over-dimensioning their peering/transit machinery while keeping their contracts low… In case they need to flush a sudden surge of traffic, the next ISP in the chain, not having over-dimensioned machinery, will not be able to cope with the surge. Also notice that paying for a router that can handle 2X or 5X the traffic it actually has is very bad business and no one will do it.

Important facts to remember about this service:

-The sender does NOT pay per message; he pays per 'maximum sending capacity installed at his premises'.

-The originator company does NOT bear the cost of extending circuits to the 'door' (switch) of the destination company; it extends circuits JUST TO ANY other carrier in the middle.

-Interconnection machinery costs are borne by the bigger carriers, not by the smaller ones. Interconnection costs between equal-sized carriers are shared between them.

-Each destination has exactly the SAME price; cost canNOT be proportional to distance to destination.

-The service is NOT an end-to-end transaction.

-Local companies MUST pay for using specific infrastructure (transit) to go out of their territory, and they MUST build specific infrastructure and reserve capacity for EACH transit company that carries traffic into their territory. Both the outgoing contract and the inbound infrastructure MUST be planned in advance to avoid congestion.

-The cost of infrastructure is proportional to expected traffic (expected aggregate rate of packets/s). It is impossible to forecast the traffic of an incoming connection from a transit company, as this company is a dynamic middle-man for an unknown number of traffic sources.

-There are transit costs in case of unbalanced traffic between companies: the small ISP pays the big ISP. Equal-sized ISPs do not charge each other for traffic; they just 'peer'.

-There are infrastructure costs in every interconnection. Both 'peers' and transit providers spend lots of money buying and maintaining interconnection machinery.

 

7. Mapping ‘net neutrality’ to the crude reality of Internet access service

We have seen that Internet Service is an Access Service, not a Carrier Service. This fact is reflected in some legislation, particularly in the Anglo-Saxon world, under the concept of 'Common Carrier Services'. These services include, but are not limited to, postal service and wired telephony. They are so important that they have usually been accepted as 'public interest services', and thus governments have interfered with market rules to make sure that they are widely available, non-abusive, reliable, affordable, responsive... beyond pure market dynamics.

So when we say 'do not treat IP packets differently because they belong to different net users', how does this statement map onto the Internet Service we have just described?

How can an ISP comply with the above definition of ‘net neutrality’?

ISPs deal with packets; they have to route packets inside their network and hand them to the next middle-man in the Internet. They get money from accesses (for availability) and charge/pay for transit contracts (for available capacity and/or transferred traffic). Can they reconcile their business model with net neutrality? Yes, I do not see the problem; it is fairly simple. The flaw is in the business model itself: it is a very weak proposition to sell 'a reasonable effort to put your packets on their way to destination'. I'm sure people only buy this service because no alternative is available. It is easy to drop packets at every interface when there is congestion, possibly frustrating some users out there, and at the same time keep my promise of treating all net users equally (badly), and at the same time maintain my business model based on 'reasonable effort'. Who decides what a reasonable effort is? Currently entities like the FCC have a hard time trying to regulate Internet Service. As they cannot treat it as a 'Common Carrier', it would not be fair to force ISPs to follow a strict policy on transit contracts. How could they create such a policy? Let's say the FCC forces every ISP to contract transit for a total amount of 10% of its upstream aggregated accesses... Is this enough? Who knows; it is impossible to tell. Would this guarantee absence of congestion? No: internet traffic varies wildly. As you can see, the regulator's actions would be arbitrary for all ISPs and, in the end, would not solve congestion.

 

8. CDNs… businesses beyond plain Internet Access

We have now seen that plain Internet Service can be kept neutral, because the 'access business model' is a weak commercial proposition that is easy to honour to the letter while still frustrating users.

Are there other Internet services that cannot be reconciled with net neutrality? Do these other services (if any exist) distort plain Internet Service? Is any ISP in the world abusing new technology to violate net neutrality, despite it being easy to strictly maintain the neutrality claim? I will try to address all these interesting questions.

CDN stands for Content Delivery Network; CDNs are businesses that go beyond Internet Service. I'm sure you do not find it strange that companies with an important message to communicate are NOT happy with a service that promises 'a reasonable effort to put your packets on their way to destination'. I would not be happy either. If I had the need and the money to pay for it, I would buy something better.

CDNs are end-to-end services. CDNs are true carrier services. Very interestingly, CDNs are not regulated as Common Carriers (not yet at least), but in my humble opinion they are as much carriers as any other end-to-end service. They sell transactions. They also sell traffic, but not bulk, best-effort delivery: they attach SLAs to delivery. CDNs build special routing over known paths through the internet, so they avoid the middle-man loss of responsibility. Once you know all the actors in the middle you can distribute responsibility and cost, and you can make the service meet whatever quality parameters you would like to set end to end.

Of course sending bytes through a CDN implies a loss of flexibility. Routing can no longer be the all-purpose dynamic routing of the internet. You have to place machines of your own along the way from one end to the other. You have to sign many special agreements with network owners for colocation; you have to hire private links, install your own powerful routers, and install your own intermediate storage. All these actions cost an incredible amount of money. Who is going to pay for this? Of course the sender will pay.

Does CDN service violate net neutrality? No. Why? CDNs treat packets very differently from plain Internet Service, but who is the owner of those packets? Is it you at home viewing a movie? Nope. The packets you receive are courtesy of someone who paid a CDN to host them. You request an item (a movie) by sending packets as part of your Internet Service. In that Internet Service your packets can be lost with the same probability as those of any other user sending email, viewing a non-CDNized movie, chatting, or whatever. But when you receive a response that is 'sponsored' by someone through a CDN, special care is taken not by your Internet Service Provider, no, do not fool yourself: it is through the resources of this 'someone' and this CDN that special actions happen to the packets that reach you. It is not 'your' service anymore; it is this 'someone's' service that is better. But the benefit is all FOR YOU.

We can now compare CDN service to our good old Carrier Services. Imagine that you use regular Royal Mail/US Mail/any national mail to request an important document from an entity (maybe even from a government). Your request is nothing 'special' in size, urgency, quality or confidentiality, so regular mail service is just fine to carry it. You are using an entirely neutral service. The response to you is a very unique and valuable object/document, so the responder pays a courier service to deliver it to you urgently and securely. Does this violate the neutrality of the postal service? No, absolutely not. When you receive this high quality service you are not diminishing or subtracting resources from the regular postal service. You do not even pay anything for the extra quality; it is the sender who 'makes you a present' by enhancing the speed, security and reliability of delivery. The extra effort is done by an independent third party, and this party receives extra payment, which is completely fair. No one violates postal service neutrality by providing courier services.

Have you ever wondered whether the existence of courier companies violates 'postal service neutrality'? Are the DHLs and UPSs of this world 'bad people' because they make money offering better service than national mail services? Of course they are not. At the same time you would like courier prices to be lower if possible, but that depends only on the differential quality versus national mail and on the price of national mail.

 

9. Regulation

Have you ever wondered why so many people call for 'regulation' of so many things? They want regulation of Internet behaviour, of telecommunications prices, of highways, of their capacity and their pricing... We are all the time asking 'someone' to come and regulate things. No one seems to have a problem with that. Don't you think there should be limits to regulation? These claims are childish.

Regulation has had a good effect on 'public interest services', and, as we have said, there is a fair number of these in our world: water distribution, postal service, energy, telephony, first aid and emergency health services (not in all countries), education (not in all countries)... The regulator places itself above the market and disrupts 'pure' market behaviour. Of course, to do this, only someone with more authority than money can buy can take the role of regulator. Only governments can do it, and they usually do. There are enormous differences in regulation across different cultures and political arrangements.

But even regulation cannot work without legal limits. In the Anglo-Saxon legal tradition the figure of the 'Common Carrier' defines the conditions under which a public-interest carrier service becomes a candidate for government regulation. At least it tries to set the conditions under which a service can be considered to be 'carrying messages for anyone without discrimination' and thus be considered 'public interest' and regulated to ensure that everyone has equal access to it. It comes from the postal world, by the way.

Another reason for an authority to intervene in a service is the 'natural right' that emanates from the ownership of resources. For big, common, public infrastructures like the ones needed for water transportation, energy, waste collection, telecommunications, roads and highways, postal service... it is necessary to 'take' terrain that belongs to the community and restrict it to a new usage. This capture of resources is done to serve the community, but some authority that represents the community (a government) must take control of the action so that, in the end, the community does not receive more damage than benefit.

Internet does not consume physical space (at least nothing that would bother any community). Cabling installations may cross national borders, like trucks working for the postal service do, but there is no need to make 'border checks' on information bits, as they are all born equal and harmless in the physical world. There are no national borders for telephony cabling; companies do not pay fees to governments to cross borders with cabling. So you start to see that there is no 'natural right' to regulate telecommunications emanating from community resources. The only reason to allow for regulation comes from the 'public utility' of being connected to the internet.

No one doubts today that there is value in having access to the internet: economic, social, political, personal value. So internet access has become like water, energy, health, education. But notice at the same time that these important 'public interest matters' are not equally regulated across the planet. Why would you expect internet access to be?

 

10. DPI, transparent caching and communication secrecy

I have mentioned DPI as a new technology that makes it possible to break network neutrality (in case some router owner is very interested in breaking it).

There is a bigger controversy around DPI than just unfair traffic handling. Notice that if DPI allows someone to harm your traffic by imposing a 'higher resistance' to crossing the Internet, compared to the 'average resistance' that all users suffer, then prior to causing this damage the one applying DPI must have had access to information in the upper-level protocols (above layer 3). In the world of carrier services this is comparable to 'looking inside the envelope'. It violates communication secrecy. It is a major sin.
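As a rough illustration of what 'above layer 3' means, the sketch below takes the raw bytes of an IPv4/TCP packet, skips the layer-3 and layer-4 headers and reads the HTTP Host header from the application payload. It is a toy parser over a hand-built packet, not any vendor's DPI engine.

```python
# Minimal sketch of 'looking inside the envelope': read application data
# (layer 7) out of a raw IPv4/TCP packet. Packet capture itself is out of scope.
import struct

def http_host_from_packet(raw: bytes):
    ihl = (raw[0] & 0x0F) * 4                          # IPv4 header length in bytes
    if raw[9] != 6:                                    # protocol field: 6 = TCP
        return None
    data_off = ((raw[ihl + 12] >> 4) & 0x0F) * 4       # TCP header length
    payload = raw[ihl + data_off:]                     # everything above layer 4
    for line in payload.split(b"\r\n"):
        if line.lower().startswith(b"host:"):
            return line.split(b":", 1)[1].strip().decode()
    return None

# Hypothetical captured packet: 20-byte IP header + 20-byte TCP header + HTTP request.
ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 0, 0, 0, 64, 6, 0,
                 bytes([10, 0, 0, 2]), bytes([93, 184, 216, 34]))
tcp = struct.pack("!HHIIBBHHH", 40000, 80, 0, 0, 5 << 4, 0x18, 65535, 0, 0)
http = b"GET /movie.mp4 HTTP/1.1\r\nHost: video.example.com\r\n\r\n"
print(http_host_from_packet(ip + tcp + http))          # -> video.example.com
```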

In life there are envelopes for a reason: on the outside you place information for the handler; on the inside you place information for the receiver. You do not want the handler to peek inside your envelopes. Regulation of the postal service not only helped ensure reasonable prices and whole-territory coverage, so that anyone has the 'right' to send and receive letters; postal regulation also set up an authority (usually the government) to protect the secrecy of communication, and it is this authority that prosecutes infringers. This is taken very seriously in most parts of the world.

Wired telephony inherited the protection of the postal service, so telephone calls cannot be lawfully intercepted. Both services, postal and telephony, have evolved towards the Internet. Has the Internet inherited the protection of carrier services? Oh, it is difficult to tell. My first reaction would be to answer: no, not yet. You need to review the question country by country.

Not being a 'Common Carrier', things get messy. There are some rules out there that seem to protect 'all communications'. Outside the Anglo-Saxon world, in Europe, many countries have rules that protect secrecy in communication, and these seem to cover Internet messaging. But those countries find it difficult to distribute responsibility among what is essentially an unknown number of message handlers between sender and receiver.

One good example is ‘caching services’. CDNs have been considered caching services in many regulations.

Did you know that for effective caching it is necessary to look inside your messages? Did you know that early caching services started to do it without telling anyone and without the permission of sender or receiver? For this very reason many early caching services were found to violate secrecy and were shut down.

As caching turned out to be 'useful' for 'common messaging', that is, good for the sender and the user in many circumstances, law-makers were 'compelled' to modify the secrecy protection and allow exceptions. The 'caching exception' is translated into 'Common Carrier' laws all around as a list of circumstances that limit the purpose of, and the ways in which, information 'inside the envelope' can be accessed by the handler.

Of course this is just a 'patch' to the law. Smart people can now eavesdrop on your messages claiming they adhere to the 'caching exception' to secrecy. Like any patch, it is a dirty, misaligned thing set against a very solid basic right: communication secrecy.

How to overcome 'secrecy protection' to offer CDN service? Easy: ask the sender for permission to look inside. As the sender is not the one that receives the traffic (a movie for example), but the one who hires a CDN to serve the movie, the services contract contains a clause allowing technical middle-man packet inspection for caching purposes that complies with the 'caching exception' rules. The movie viewer cannot complain. The movie owner does not want to complain; he is happy about the caching.

What about transparent caching? If I do not hire a CDN... can anyone in the middle inspect my messages claiming a 'caching exception'? Of course not, but sometimes they do. Some ISPs install transparent caches. They inspect traffic from ANY source in search of clues to cache repeated content. They do not ask anyone for permission to do that. Prior to the 'caching exceptions' they could have been considered liable for secrecy violation. Today you would need to take the laws of the country in which the DPI/cache is physically placed and carefully study the method and purpose of the transparent caching. In many circumstances you will have a legal case against the one doing DPI/transparent caching.
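The caching logic itself is simple once the inspection has been done; a minimal sketch follows, with invented class and host names, of how a transparent cache might key repeated content and serve it without ever contacting the origin again.

```python
# Minimal sketch of the logic a transparent cache applies once it has inspected
# a request (names and flow below are illustrative, not a real product).
import hashlib
import time

class TransparentCache:
    def __init__(self, ttl=300):
        self.store, self.ttl = {}, ttl

    def _key(self, host, path):
        # Repeated content is recognised by hashing what was seen 'inside the envelope'.
        return hashlib.sha256(f"{host}{path}".encode()).hexdigest()

    def fetch(self, host, path, origin_fetch):
        key = self._key(host, path)
        hit = self.store.get(key)
        if hit and time.time() - hit[0] < self.ttl:
            return hit[1]                   # served from the ISP's cache; origin never sees it
        body = origin_fetch(host, path)     # first request goes to the origin server
        self.store[key] = (time.time(), body)
        return body

cache = TransparentCache()
fake_origin = lambda h, p: f"<content of {p} from {h}>"
print(cache.fetch("video.example.com", "/movie.mp4", fake_origin))  # miss -> origin
print(cache.fetch("video.example.com", "/movie.mp4", fake_origin))  # hit  -> cache
```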

Did you know that, to avoid legal prosecution, it is very probable that you have been made to sign a clause in your ISP contract allowing the ISP to perform DPI/transparent caching? Of course this clause does not say '...we are hereby granted permission to eavesdrop...'. No, the clause will more or less say '...we are granted permission to manipulate traffic through technical means, including caching under the caching exceptions to telecommunication laws...'.

The fact is that asking for permission is the best way to eavesdrop. There is a well-known company that gives you free email, but in exchange you allow it to classify your messages by inspecting everything inside them.

Another fact is that if someone does not have a contract with you, he can neither ask for nor receive your permission to look into your traffic. That is, if someone other than my ISP places a cache in the middle of my traffic (for example, caching a web page of mine, or intercepting traffic from any server at my home), or does DPI on packets going out of my home, then, not being my ISP, he cannot have asked me for permission, and so I have not agreed to him looking into my messages.

It is important to notice that this is happening and that you can do very little to stop it. You could figure out that an ISP in the middle is doing transparent caching, find the country in which the DPI/cache is placed, find the specific law of that country, (try to) find the specific method and purpose the ISP applies, and, if you find yourself with enough money and strength, take them to court. Honestly, you do not have much hope of success.

 

11. Conclusion

We have seen that net neutrality is about dealing with traffic in equal conditions independently of the identity of traffic owner.

We have seen that technology now allows neutrality to be broken, but the violator still needs a reason.

We have seen that Internet service is not a Carrier Service; it is not end-to-end, it is an Access Service.

We have seen that from the legal perspective, Internet Service is not a ‘Common Carrier’ service.

We have seen that regulators like the FCC cannot simply force ISPs to increase capacity at every interconnection, and that doing so would not solve congestion problems.

We have seen that neutrality is violated everyday by transparent caching and DPI. We have seen that a ‘patch’ has been applied to law to allow violating secrecy to a certain extent.

It seems clear that, even supposing DPI/transparent caching is lawful (which in many cases is objectionable), once DPI has been performed the router owner can do other things that go beyond violating secrets: he can use the information to change the natural balance of traffic; he can prioritize at will.

This prioritization can be a net neutrality violation.

As net neutrality is not a law, not even a legal principle, but just a policy, that is, a recommendation, no one can take an ISP to court over 'creative traffic engineering' once it is proved that the DPI performed by that ISP was lawful (under the caching exceptions or allowed by the ISP-user contract).

It is still possible to take to court those ISPs and service providers that have not asked you for permission to unbalance your traffic and that cannot allege a lawful caching exception.

Applying these conclusions to some recent cases of 'famous movie Distribution Company' vs 'famous ISP', you can see that the regulator or the courts will have a very difficult day (or year) dealing with their responsibility to control the behaviour of the ISP or of the distribution company.

The most probable outcome is that these complaints will be judged under trade law, not under communication law. The courts will not feel competent to apply 'recommendations' such as 'net neutrality', but they will be happy to look for contract infringements.

What is uncertain is if they will find any contract infringement. In my own view it is very likely they won’t.

We can conclude that ‘Net Neutrality’ is an aspiration; it is not backed by law.

Net neutrality is a complex issue that requires society, companies, law and courts to mature and change.

Today net neutrality is badly understood. I have had the sad experience of reading articles, even from reputable writers and journalists who usually have a clear understanding of the topics they cover, that completely missed the point.

They miss the point because they let themselves be carried away by outrage and by a misleading comparison to 'basic human rights'. They feel it is important to do something to guarantee neutrality, and they fail to realize that the network is essentially neutral, and that someone whose traffic does not make it through cannot for that reason claim the network is not neutral.

At the same time there are neutrality violators (DPI/transparent caching), but our society has created laws that support them. It is important to realize that these violations are serious and that the laws must be changed.

I hope that this long reflection about all factors involved in Net Neutrality may have been interesting to all of you.

Have a nice and neutral day.

 


Some thoughts about CDNs, Internet and the immediate future of both

(Download this article in pdf format : thoughts CDN internet )

1. INTRODUCTION

A CDN (Content Delivery Network) is a network overlaid on top of the internet. Why bother to put another network on top of the internet? The answer is easy: the internet as of today does not work well for certain things, for instance content services for today's content types. Any CDN that ever existed was intended just to improve the behaviour of the underlying network in some very specific cases: for 'some services' (content services for example) and for 'some users' (those who pay, or at least those whom someone pays for). CDNs neither want to nor can improve the Internet as a whole.

The Internet is just yet another IP network combined with some basic services, for instance the translation of 'object names' into 'network addresses' (network names): DNS. The Internet's 'service model' is multi-tenant, collaborative, non-managed and 'open', as opposed to private networks, which have a single owner, follow standards that may vary from one network to another, are non-collaborative (though they may peer and do business at some points) and are managed. It is now accepted that the Internet's 'service model' is not optimal for some things: secure transactions, real-time communications, uninterrupted access to really big objects (coherent sustained flows)...

The service model of a network like the Internet, so little managed, so little centralized, with so many 'open' contributions, can today grant very few things to the end-to-end user, and the more the network grows and interconnects with itself, the fewer good properties it has end to end. It is a paradox, and it relates to the size of complex systems: the basic mechanisms that are good for a network of size X with connection degree C may not be good for another network 10^6 X in size and/or 100 C in connection degree. Solutions to internet growth and stability must never compromise its good properties: openness, decentralisation, multi-tenancy... This growth & stability problem is important enough to have several groups working on it: the Future Internet Architecture groups, which exist in the EU, the USA and Asia.

The Internet's basic tools for service building are: a packet service that is not connection-oriented (UDP), a packet service that is connection-oriented (TCP), and, on top of the latter, a service that is text-query-oriented and stateless (HTTP; sessions last for just one transaction). A name translation service from object names to network names helps a lot in writing services for the Internet and also allows these applications to keep running even when network addresses change.

For most services/applications the Internet is an 'HTTP network'. The spread of NAT and firewalls makes UDP inaccessible to most internet consumers, and when it comes to TCP, only port 80 is always open; moreover, many filters only allow through TCP flows marked with HTTP headers. These constraints make today's internet a limited place for building services. If you want to reach the maximum possible number of consumers you have to build your service as an HTTP service.
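As a minimal illustration of that constraint, the sketch below wraps an arbitrary service in plain HTTP using Python's standard http.server module; the port and the JSON payload are arbitrary choices for the example.

```python
# Minimal sketch: to reach the widest audience, whatever the service really does
# is wrapped in HTTP so NATs, firewalls and port filters let it through.
# (Port 80 normally needs privileges, so this illustrative server uses 8080.)
from http.server import BaseHTTPRequestHandler, HTTPServer

class MyService(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b'{"service": "demo", "path": "%s"}' % self.path.encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), MyService).serve_forever()
```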

 

2. OBJECTS, OBJECT NAMES AND CONTENT

A decent 'network' must be flexible and easy to use. That flexibility includes the ability to find your counterpart when you want to communicate. In the voice network (POTS) we create point-to-point connections; we need to know the other endpoint's address (phone number), and there is no service inside POTS to discover endpoint addresses, not even a translation service.

In the Internet it was clear from the very beginning that we needed names more meaningful than network addresses. To make the network more palatable to humans, the Internet has been complemented with mechanisms that support 'meaningful names'. The 'meaning' of these names was designed to be a very concrete one: 'one name, one network termination', and the semantics applied to these names were borrowed from set theory through the concept of 'domain' (a set of names) with strict inclusion. Name-address pairs are modelled by giving the 'name' a structure that represents a hierarchy of domains. When a domain includes some other domain, that is expressed by means of a chain of 'qualifiers', a 'qualifier' being a string of characters. The way to name a subdomain is to add one more qualifier to the chain, and so on and so forth. If two domains do not have any inclusion relationship, then they are necessarily disjoint.
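A few lines of Python are enough to capture these semantics: a name is read as a chain of qualifiers, inclusion means one chain is a prefix of the other when read from the root, and anything else is disjoint. This is only a sketch of the set-theoretic idea, not of any DNS implementation.

```python
# Sketch of domain semantics: names as chains of qualifiers with strict inclusion.
def qualifiers(name: str):
    return name.lower().rstrip(".").split(".")[::-1]   # root-first chain of qualifiers

def includes(domain: str, other: str) -> bool:
    d, o = qualifiers(domain), qualifiers(other)
    return len(d) <= len(o) and o[:len(d)] == d        # prefix from the root = inclusion

print(includes("example.com", "www.example.com"))      # True: subdomain
print(includes("example.com", "example.org"))          # False: disjoint
```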

This naming system was originally intended just to identify machines (network terminals) but it can be, and has been, easily extended to identify resources inside machines by adding subdomains. This extension is a powerful tool that offers flexibility to place objects in the vast space of the network using 'meaningful names'. It gives us the ability to name machines, files, files that contain other files (folders), and so on. These are all the 'objects' that we can place on the internet for the sake of building services/applications. It is important to realise that only the names that identify machines get translated to network entities (IP addresses). Names that refer to files or 'resources' cannot be mapped to IP network entities, and thus it is the responsibility of the service/application to 'complete' the meaning of the name.

To implement these semantics on top of the Internet, a 'name translator' was built that ended up being called a 'name server'; the Internet feature is the Domain Name System (DNS). A name server is an entity that you can query to resolve a 'name' into an IP address. Each name server only 'maps' objects placed in a limited portion of the network, and the owner of this area has the responsibility of keeping object names associated with the proper network addresses. DNS gives us just part of the meaning of a name: the part that can be mapped onto the network. The full meaning of an object name is rooted deep in the service/application in which that object exists. To implement a naming system compatible with DNS domain semantics we can, for instance, use the syntax described in RFC 2396, which gives us the concept of the URI (Uniform Resource Identifier). This concept is compatible with, and encompasses, the earlier concepts of URL (Uniform Resource Locator) and URN (Uniform Resource Name).
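The split of responsibilities can be seen with a couple of standard-library calls: DNS translates only the machine part of a URL, while the resource part means nothing to the network and is left for the application to interpret (the URL here is an arbitrary example and the lookup obviously needs network access).

```python
# Illustrative only: DNS resolves just the machine name in a URL; the resource
# path has no network mapping and only means something to the HTTP server.
import socket
from urllib.parse import urlparse

url = "http://www.example.com/videos/2016/movie.mp4"
parts = urlparse(url)
addresses = {ai[4][0] for ai in socket.getaddrinfo(parts.hostname, 80)}
print("machine name :", parts.hostname, "->", addresses)                 # DNS can translate this
print("resource name:", parts.path, "-> meaning completed by the application")
```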

For the naming system to be sound and useful it is necessary that an authority exists to assign names, to manage the 'namespace'. Bearing in mind that the translation process is hierarchical and can be delegated, many interesting intermediation cases are possible that involve cooperation among service owners and between service and network owners. In HTTP the naming system uses URLs. These URLs are names that help us find a 'resource' inside a machine inside the Internet. In the framework that HTTP provides, the resources are files.

What is ‘Content’?

It is not possible to give a non-restrictive definition of ‘content’ that covers all possible content types for all possible viewpoints. We should agree that ‘content’ is a piece of information. A file/stream is the technological object that implements ‘content’ in the framework of HTTP+DNS.

 

3. THE CONTENT DISTRIBUTION PROBLEM

We face the problem of optimising the following task: find & recover some content from the internet.

Observation 1: current names do not have a helpful meaning. URLs (in the HTTP+DNS framework) are 'toponymic' names: they give us an address for a content name or machine name. There is nothing in the name that refers to the geographic placement of the content. The name is not 'topographic' (as it would be if, for instance, it contained UTM coordinates). The name is not 'topologic' (it gives no clue about how to get to the content, about the route). In brief: Internet names, URLs, do not have a meaningful structure that could help in optimising the find & recover task.

Observation 2: current translations have no context. DNS (the current implementation) does not take into account information about the query originator, nor any other context for the query. DNS does not care WHO asks for a name translation, or WHEN, or WHERE, as it is designed for a 1:1 semantic association, one name to one network address, and thus, why worry? We could properly say that DNS, as it is today, has no 'context'. Current DNS is a kind of dictionary.

Observation 3: there is a diversity of content distribution problems. The content distribution problem is not usually a 1-to-1 transmission; it is usually 1-to-many. Usually, for one content 'C' at any given time 'T' there are 'N' consumers, with N>>1 most of the time. The keys to quality are delay and integrity (time coherence is a result of delay). Audio-visual content can be consumed in batch or as a stream. A 'live' content can only be consumed as a stream, and it is very important that latency (the time shift T = t1 - t0 between an event that happens at t0 and the time t1 at which that event is perceived by the consumer) is as low as possible. A pre-recorded content is consumed 'on demand' (VoD for instance).

It is important to notice that there are different ‘content distribution problems’ for live and recorded and also different for files and for streams.

A live transmission gives all consumers simultaneously the exact same experience (broadcast/multicast), but it cannot benefit from networks with storage, as store-and-forward techniques increase delay. It is also impossible to pre-position the content in many places in the network to avoid long-distance transmission, as the content does not exist before consumption time.

An on-demand service cannot be a shared experience. If it is a stream, there is a different stream per consumer. Nevertheless, an on-demand transmission may benefit from store-and-forward networks: it is possible to pre-position the same title in many places across the network to avoid long-distance transmission. At the same time this technique impacts the 'naming problem': how will the network know which is the best copy for a given consumer?

We soon realise that the content distribution problem is affected by (at least): the geographic position of the content, the geographic position of the consumer and the network topology.

 

4. CURRENT CDNS: OPTIMISING INTERNET FOR CONTENT

-To distribute live content the best network is a broadcast network with low latency: classical radio & TV broadcasting and satellite are optimal options. It is not possible to do 'better' with a switched, routed network such as an IP network. The point is: IP networks just do NOT do well with one-to-many services. It takes incredible effort from a switched network to carry a broadcast/multicast flow compared to a truly shared medium like radio.

-To distribute on-demand content the best network is a network with intermediate storage. In those networks a single content must be transformed into M 'instances' that will be stored in many places throughout the network. For a content title 'C', the function 'F' that assigns a concrete instance 'Cn' to a concrete query 'Ric' is the key to optimising content delivery. This function 'F' is commonly referred to as 'request mapping' or 'request routing'.

Internet + HTTP servers + DNS have both storage and naming. (Neither HTTP nor DNS is a must.)

There is no 'standardised' storage service in the internet, just a bunch of interconnected caches. Most of the caches work together as CDNs. A CDN, for a price, can guarantee that 99% of the consumers of your content will get it properly (low delay + integrity). It makes sense to build CDNs on top of HTTP+DNS; in fact most CDNs today build 'request routing' as an extension of DNS.

A network with intermediate storage should use the following info to find & retrieve content:

-content name (identity of the content)

-geographic position of the requester

-geographic position of all existing copies of that content

-network topology (including dynamic status of the network)

-business variables (cost associated to retrieval, requester identity, quality,…)

Nowadays there are services (some paid) that give us the geographic position of an IP address: MaxMind, Hostip.info, IPinfoDB,… . Many CDNs leverage these services for request routing.
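Putting the inputs listed above together, a request-routing function 'F' could look like the toy below: it scores every known copy of a content item by distance, hop count and price, and returns the cheapest one. Servers, coordinates and weights are all invented for illustration; real CDNs use far richer data and policies.

```python
# Toy request-routing function: choose the copy that minimises a cost built from
# geography, topology and business variables. All figures are made up.
import math

COPIES = {   # content name -> list of (server, lat, lon, hops_to_core, cost_per_gb)
    "movie.mp4": [("edge-mad-1", 40.4, -3.7, 2, 0.8),
                  ("edge-par-1", 48.9,  2.4, 3, 1.0),
                  ("origin-nyc", 40.7, -74.0, 6, 0.3)],
}

def request_route(content, req_lat, req_lon, w_dist=1.0, w_hops=50.0, w_price=100.0):
    def cost(server):
        _, lat, lon, hops, price = server
        dist_km = math.dist((lat, lon), (req_lat, req_lon)) * 111   # rough degrees -> km
        return w_dist * dist_km + w_hops * hops + w_price * price
    return min(COPIES[content], key=cost)[0]

# A requester near Madrid is routed to the Madrid edge copy.
print(request_route("movie.mp4", req_lat=40.3, req_lon=-3.8))
```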

It seems that there are solutions to geo-positioning, but we still have a naming problem. A CDN must offer a 'standard face' to content requesters. As we have said, content dealers usually host their content on HTTP servers and build URLs based on HTTP+DNS, so CDNs are forced to build an interface to the HTTP+DNS world. On the internal side, today the most relevant CDNs use non-standard mechanisms to interconnect their servers (IP spoofing, DNS extensions, Anycast,…).

 

5. POSSIBLE EVOLUTION OF INTERNET

-Add context to object queries: identify the requester's position through DNS. Today some networks use proprietary versions of 'enhanced DNS' (Google runs one of them). The enhancement is usually implemented by transporting the IP address of the requester in the DNS request and preserving this info across DNS messages so it can be used for DNS resolution. We would prefer to use geo-position rather than IP address. This geo-position is available in terminals equipped with GPS, and can also be available in static terminals if an admin provides positioning info when the terminal is started.
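This idea has since been standardised as the EDNS Client Subnet option (RFC 7871). Assuming the dnspython library is available and the chosen resolver honours the option, a query carrying a truncated requester prefix could be built roughly as follows; the resolver address and the prefix are illustrative.

```python
# Sketch: attach an EDNS Client Subnet option to a DNS query with dnspython,
# so the resolving side can pick an answer near the requester's network.
import dns.edns
import dns.message
import dns.query

ecs = dns.edns.ECSOption("203.0.113.0", 24)            # the requester's /24, not the full address
query = dns.message.make_query("www.example.com", "A", use_edns=0, options=[ecs])
response = dns.query.udp(query, "8.8.8.8", timeout=3)  # a public resolver assumed to honour ECS
for rrset in response.answer:
    print(rrset)
```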

-Add topological + topographical structure to names: enhance DNS+HTTP. A web server may know its geographic position and build object names based on UTM. An organization may handle domains named after UTM. This kind of solution is plausible because servers' mobility is 'slow': servers do not need to change position frequently, so their IP addresses could be 'named' in a topographic way. It is more complicated to include topological information in names. This complexity is addressed through successive name-resolution and routing processes that painstakingly give us back the IP addresses in a dynamic way, consuming the efforts of BGP and classical routing (IS-IS, OSPF).

Nevertheless it is possible to give servers names that could be used collaboratively with the current routing systems. The AS number could be part of the name. It is even possible to increase 'topologic resolution' by introducing a sub-AS number. Currently Autonomous Systems (AS) are not subdivided topologically nor linked to any geography; these facts prevent us from using the AS number as a geo-locator. There are organisations spread over the whole world that have a single AS, so the AS number is a political ID, not a geo ID nor a topology ID. An organizational revolution could be to eradicate overly spread and/or overly complex ASes. This could be achieved by breaking ASes into smaller parts, each confined to a delimited geo-area and with a simple topology. Again we would need a sub-AS number. There are mechanisms today that could serve to create a rough implementation of geo-referenced ASes, for instance BGP communities.

-Request routing performed mainly by network terminals: /etc/hosts sync. The above-mentioned improvements in the structure of names would allow web browsers (or any SW client that recovers content) to do their request routing locally. It could be done entirely in the local machine using a local database of structured names (similar to /etc/hosts), taking advantage of the structure in the names to guess the parts of the mapping not explicitly declared in the local DB. Taking the naming approach to the extreme (super-structured names), the DB would not even be necessary, just a set of rules to parse the structure of the name and produce the IP address of the optimal server on which the content carrying that structured name can be found. It is important to note that any practical implementation we could imagine will require a DB; the more structured the names, the smaller the DB.
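A toy version of such client-side request routing might look like the sketch below: the name itself carries an AS qualifier and a UTM grid cell, so a tiny local table (the /etc/hosts analogue) plus one parsing rule resolves it with zero network lookups. The naming scheme, labels and addresses are all invented for illustration.

```python
# Toy sketch of client-side request routing over 'structured names': the name
# carries AS and UTM-grid qualifiers, so a small synced table plus one parsing
# rule is enough to pick a server locally, with no DNS round trip.

LOCAL_DB = {                                  # periodically synced; small because names do the work
    ("as3352", "30tvk"): "203.0.113.10",      # hypothetical edge server for that AS + UTM cell
    ("as3352", "*"):     "203.0.113.1",       # fallback for the whole AS
}

def resolve_structured(name: str) -> str:
    # e.g. "movie.mp4.30tvk.as3352.cdn.example": qualifier positions fixed by the naming rule
    labels = name.split(".")
    as_label, utm_cell = labels[-3], labels[-4]
    return LOCAL_DB.get((as_label, utm_cell)) or LOCAL_DB[(as_label, "*")]

print(resolve_structured("movie.mp4.30tvk.as3352.cdn.example"))   # -> 203.0.113.10
```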

 

6. POSSIBLE EVOLUTIONS OF CDNS

It makes sense to think of a CDN that has a proprietary SW client for content recovery and uses an efficient naming system that allows 'request routing' to be performed in the client, on the consumer's machine, without depending on (unpredictably slow) network services.

Such a CDN would host all content on its own servers, naming objects in a sound way (probably with geographical and topological meaning), so that each consumer with the proper plugin and a minimal local DB can reach the best server in the very first transaction: resolution time is zero! This CDN would rewrite its customers' web pages, replacing names with structured names that are meaningful to the request-routing function. The most dynamic part of the intelligence the plugin requires is a small pre-computed DB that is created centrally and periodically, using all the relevant information to map servers to names. This DB is updated from the network periodically and includes: updated topology info, business policies, updated lists of servers. It is important to realise that a new naming structure is key to making this approach practical; if the names do not help, the DB will end up being humongous.

Of course this is not so futuristic. Today we already have a name cache in the web browser + /etc/hosts + caches in the DNS servers. It is a little subtle to notice what the best things about the new scheme are: it suppresses the first query (and every first query after TTL expiration); there is no influence of TTLs, which are controlled by DNS owners outside the CDN; and there are no TTLs that may be built into browsers.

This approach may succeed for these reasons:

1- Not all objects hosted on the internet are important enough to be indexed in a CDN, and the dynamism of the key routing information is so low that it is feasible to keep all terminals up to date with infrequent sync events.

2- Today the computing and storage capacity of terminals (even mobile ones) is enough to handle this task, and the time penalty paid is far less than in the best possible situation (with the best luck) using collaborative DNS.

3- It is possible, attending to the geographic position of the client, to download only the part of the map of servers that the client needs to know; it suffices to recover the 'neighbouring' part of the map. In the uncommon case of a chained failure of many neighbouring servers, it is still possible to dynamically download a farther portion of the map.