
Modern Exploitation of Content Businesses


[Figure: 3D schema. Exploitation of Content. By Adolfo M. Rosas, www.adolforosas.com]

1 Introduction

Today some say ‘content is king’… others say ‘network is king’, or ‘user is king’, or ‘data is king’, or that whatever other thing is king. There are many ‘kings’ in today’s markets and industry. But you have to concede that content services flourish today, as of mid-2014, on the network. We have more online video than ever before, more shared media: photos, videos, messages (sometimes with embedded media), downloadable apps, streaming games… and more video streaming services than anyone could have imagined just a few years ago.

If we admit that the internet economy is growing, and at the same time admit that content services are an important part of that economy, it would be reasonable to assume that people have discovered how to exploit content services: how to build services that are not only convenient for consumers (easy to use, appealing, affordable, modern, engaging) but also convenient for the people behind the service, convenient for exploitation: easy to run, understandable, reliable, profitable.

To a certain extent it is true that some people have learnt to build those services and made them as good as can be conceived, but only to a certain extent. The fact is you can name only a few (fewer than five) content services that surpass all others; you can name even fewer leading search engines; very few leading streaming services; essentially a couple of leading movie services; very few first-line music services; virtually only one leading e-book service… The very existence of this clear category of ‘leading content services’ tells us that not everyone building services knew how to make them as good as possible.

I want to devote this article to the art and science of making content services ‘exploitable’ in modern terms.


2 Service ideation

Many managers I’ve known just cannot imagine the ‘whole coherent idea’ of a new content service. They cannot create a mental representation of a coherent set of actions, definitions and processes that involves all actors (producer, aggregator, sellers, end users…), all infrastructure (production stages, repositories, distribution infrastructure, catalogues, front ends, payment systems…), and all processes (content ingestion, metadata generation, catalogue and frontend provisioning, transaction processing, billing, analytics…).

Content business managers do not always understand how entangled the ‘service idea’ or ‘service model’ is with technology. Some of these managers will leave ‘technical people’ alone to build the manager’s ‘current view’ of the service. Usually in these cases that ‘view’ is a short list of requirements that describes some concrete aspects of what the user should see, what should happen when the user does this and that, and what options ‘the system’ (doesn’t it make you smile to realize how many people refer to almost any part of the service as ‘the system’?) should provide to end users and administrators.

But please do not take me wrong on this; I would not say that ‘technical people’ can do it better. In fact most ‘technical people’ I’ve known would have an even shorter, less coherent view of the whole service. The problem is that many managers tend to imagine the service only as a money-maker, forgetting completely about how it works, while many technicians tend to imagine only isolated parts of the service at work, and they can’t and won’t imagine how the service makes money.

What happens usually?

Any implementation built on a partial and incoherent view of the service will be flawed, impossible to evolve once we start asking ourselves very basic questions that test the coherence of the service model. If a technician raises one of these questions, many times he will need the manager to give an answer, to think about choices he never imagined. If the manager is the one to discover something he does not like in the behavior of the service, many times he will need an assessment from the technician, making the technician think about options he never considered. It is only a matter of (very short) time before other managers, or even the same manager, will need very simple variations on the original idea: instead of single items, sell bundles; instead of fixed prices, run promotions; instead of billing per total traffic at the end of the month, bill per percentile 95/5 min of instantly available bitrate; instead of…

In many cases the first implementation of a content service must be thrown in the bin and started all over again just after the first round of ‘improvement ideas’. It turns out that those ‘clever requirements’ we had at the start were mutually conflicting, short-sighted, and did not allow for even the most basic flexibility.

So, what can we do to avoid recreating the whole service from scratch every month?

The approach that I would suggest is to do a lot of planning before actually building anything.

Write your ideas down. Draw flow diagrams on paper. Draw mockups on paper. Put them through ‘logic tests’. Ask yourself questions in the style of ‘what if…?’, ‘how can the user do…?’, or ‘would it be possible later to enhance/add/change…?’ Show your work to others. Rethink. Rewrite. Retest…

Spend a few days, or better a few weeks, doing this and I can assure you that your requirements list will grow, your idea of the complexity of the service will change, your enthusiasm will increase substantially, your understanding of possible problems will be much better, and the time and effort needed to build and maintain your service will be noticeably reduced.

It is important that in the ideation process different people work together: marketing & pricing experts, engineering & operations experts, end user experience experts, billing, analytics, support… I would also say: please do not use computers in this phase. PowerPoint is great… but your hands with a pencil are far better and much faster. And in a room with no computers there is no possibility for someone to go and check email and thus be out of the group for a while. (Smartphones are computers too and should be prohibited in this phase.) Recall: ‘if you cannot imagine it you cannot draw it; if you cannot draw it with a pencil you cannot draw it in PowerPoint’. If your boxes are not perfectly rectangular and your circles are not perfectly round, you can rest assured that at some point later a good computer will fix that with no pain.

If you are not forced to imagine the internals of the service at work, you will not find the problems you have to solve, and you will never find the solutions. You must realize that everyone who has a voice to claim requirements over the service MUST imagine the service at the SAME time as the others and SHARE his ideas and concerns immediately with the entire group.


3 Building coherency: a semantic model

Oops… I couldn’t avoid the word ‘semantic’; I’ve immediately lost half of my readers. I’m sorry. For the brave remaining readers: yes, it is an academic way of saying that the whole service must make sense and not contradict itself, either in ‘concept’ or in ‘implementation’.

I’ve noticed that some people, including some of my colleagues, start to wander mentally when we speak about modelling. But it is important to create models of services. Services do not grow on trees. A service is not a natural thing; it is a contraption of the human mind, and it must be polished, rounded and perfected until it can be communicated to other minds.

A ‘model’ is a simplified version of what the service will be once built. Our model can start as an entity that lives only on paper. A few connected boxes with inputs, outputs and processes inside may be a perfectly good model. Of course the model starts to get complicated as we imagine more and more functions and capabilities that will be in the service. At some point our model will need to live off paper, just because paper cannot support the complexity of the model… but recall that the model is always a simplification of the real service, so if you think your model is complicated, what about your not-yet-existent service? Do you really want to start spending money on something you do not really understand?

Have you noticed that you need to establish a common language (also known as a ‘nomenclature’) to communicate ‘about’ the service inside the service team? Let me show you an example: what happens when you tell another member of the team, “…then the user will be able to undo (some operation)…”? But who is ‘the user’? For the designer of the consumer interface it is unequivocally ‘the end user consuming the service’; for the OSS/support/operations guys it is unequivocally ‘the operator of the support interfaces’; and for other people ‘the user’ may be yet another kind of human.

You probably think I’m exaggerating… you think this is not a problem, as every expert has a good understanding of ‘the user’ in his own context. But what happens if the end-user-at-home must be mixed in the same sentence with the user-of-the-admin-interfaces? Let me tell you what happens: everyone dealing with that sentence will express it in the way that is most convenient to him. The sentence will be expressed in the terms of the mind in charge of that document/chapter/paragraph. Names will be applied to distinguish between roles, but these names belong to the experience of the one writing the text, and they do not always fit the ideas of other people on the team. These names have never been agreed a priori. And worse: in other sentences that also mix the two concepts, a different guy can name the same roles completely differently. This is one of the reasons why some technical documents about a service are impossible to read by anyone who has common sense but lacks 1500 hours of meetings with all the internal stakeholders in the service.

I cannot be sure that I have convinced you of the need to establish a ‘common language’ about the service, but if you have ever had to coordinate a documentation department or a big program management office, you know what I’m saying. In fact the problem goes much further than ‘names’. The problem extends to ‘concepts’.

I would never have thought at the start of my career that people could develop such different ideas of what ‘billing’ is, what ‘support’ is, what ‘reporting’ is, what ‘consuming a content’ is, what ‘consumer behavior’ is, and a few hundred other concepts… But over time I learnt. These, and essentially all the concepts we deal with in our human language, do not have the same representation in everyone’s mind. That does not usually create problems in our daily life, but when we design something very detailed, if we want others to understand it at the first attempt we need to agree on the language and concepts we deal with. We need to start with a model of the service that is semantically correct, and never use terms foreign to this model to communicate about the service. The paradigm of correctness in semantic models is called an ‘ontology’ (oops, bye bye another half of my readers). Ontologies are very hard to ‘close’, as coherence must be complete, but we may do reasonably well with a simpler semantic model that contains ‘primitive concepts’ (definitions of objects and actions that will not be questioned), semantic relations between primitive concepts, and ‘derivative concepts’, which are objects and actions defined in terms of the primitive concepts.
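To make the idea concrete, here is a minimal sketch (in Python, with invented names; no real library is implied) of a vocabulary that accepts primitive concepts and only admits derivative concepts built from already-defined ones, so the model cannot silently reference an undefined term:

```python
# A toy service vocabulary: primitives are declared once; derivative
# concepts must be defined only in terms of already-known concepts.
class Vocabulary:
    def __init__(self):
        self.primitives = set()
        self.derived = {}        # name -> tuple of concepts it is built from

    def add_primitive(self, name):
        self.primitives.add(name)

    def add_derived(self, name, *based_on):
        unknown = [c for c in based_on
                   if c not in self.primitives and c not in self.derived]
        if unknown:
            raise ValueError(f"undefined concepts: {unknown}")
        self.derived[name] = based_on

v = Vocabulary()
v.add_primitive("content_product")
v.add_primitive("consumer")
v.add_derived("content_service", "content_product", "consumer")
# v.add_derived("bundle", "pricing")   # would fail: 'pricing' was never defined
```

The point is not the code itself but the discipline it enforces: any term used to talk about the service must first be anchored in the agreed model.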


4 Content Service primitive concepts

The word ‘service’, taken as a ‘semantic field’ within human experience, can extend to touch many concepts. If we shrink our area of interest to Telecommunication services, the possible interpretations of ‘service’ are far fewer, and if we go further and define our domain as ‘Telecommunication Content Services’, then the number of possible concepts we will touch, and the language appropriate to deal with them, becomes much more ‘manageable’.

‘Exploitation model’ for a content service: simple semantic model that contains all definitions of entities and actions relevant to describe ‘exploitation’ of a content service.

We define ‘exploitation’ as: all actions that pursue the creation and maintenance of value/utility for consumers and all accompanying actions that ensure that service stakeholders obtain ‘something’ in return.

‘Exploitation system’ for a content service: information system built to implement the exploitation model of a content service.

The following could be a very simple Exploitation Model for Content Services:

Content or Content Product: a piece of information that is valuable to humans and can thus be sold.

Content Service: a continued action that provides value through Content Products. A service can be sold as a product, but the service is action(s) while Content Product is information. A concrete service is defined by specifying the actions allowed to the consumer (see Capabilities below).

Consumer or Customer: Human that receives value from a product or service. (P.S.: a subtle remark: Consumer gets utility and Customer pays for it. Sometimes they are not the same person.)

Service Capabilities: actions available through a service.

Contract: statement of commitment that links consumer identity + product/service + pricing + SLA

Bundle (of products/services): group of products/services gathered through some criteria.  The criteria used to bundle may vary: joint packaging, joint charging, joint promotion … but these criteria define the bundle as much as its components and thus criteria + components must be stated explicitly.

– (one possible) list of Content Products that attempts to define a CDN-based service:

Streaming VoD (Movies…): content pre-stored and consumed on demand as a stream.

Streaming Live (channels, events…): content consumed live, as the event is captured, as a stream.

Non-streaming Big Object (SW, docs…): pre-stored huge content consumed on demand as a block.

Non-streaming Small Object (web objects…): pre-stored tiny content consumed on demand as a block.
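The model above can be sketched as plain data structures; the field names and values below are illustrative assumptions, not a prescribed schema:

```python
# Sketch of the exploitation model: Content Product, Contract, Bundle.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ContentProduct:
    sku: str
    kind: str            # "vod", "live", "big_object", "small_object"

@dataclass
class Contract:
    # statement of commitment: consumer identity + product + pricing + SLA
    customer_id: str
    product_sku: str
    pricing: str         # e.g. "per_gbyte" or "percentile_95_5"
    sla: str

@dataclass
class Bundle:
    # the bundling criteria define the bundle as much as its components,
    # so both are stated explicitly
    criteria: str        # joint packaging / joint charging / joint promotion
    components: List[ContentProduct] = field(default_factory=list)

movie = ContentProduct(sku="VOD-001", kind="vod")
deal = Contract("cust-1", "VOD-001", "per_gbyte", "99.9% availability")
promo = Bundle(criteria="joint promotion", components=[movie])
```

Note how the Contract ties together exactly the four elements named in the definition above, and nothing else.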


The following could be a possible minimum description of an Exploitation system for the above model:

Portals: (Human Interaction tools)

-Product owner (product definition) portal: meta-portal that permits defining products + capabilities.

-Seller portal: meta-portal that permits defining ‘soft’ product properties: name, pricing…

-Customer portal: handles customer actions: consumption and feedback.

-Value added portals & tools: customer analytics, trends, help, guides…

Mediation systems

-report gathering & analytics: log gathering & processing, analytics processing

-billing & clearing: ‘money analytics’

Operation systems

-customer provisioning: customer identity management

-service provisioning: contract management + resource allocation

-event (incident) ticketing: link customer + product + ticket + alarms + SLA

-SLA monitoring: link product properties + analytics + contract -> alarms -> billing

Inventory systems

-commercial (product) inventory: database of products + capabilities (properties)

-resources (technical) inventory: database of infrastructure items (OSS)

-customer inventory: (protected) database of identity records
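The minimal exploitation system above can be captured as a checklist structure and verified programmatically. This is purely an illustrative sketch (the names are mine, not an established schema):

```python
# Checklist of the four blocks of a minimal exploitation system.
EXPLOITATION_SYSTEM = {
    "portals": ["product_owner", "seller", "customer", "value_added"],
    "mediation": ["log_gathering_analytics", "billing_clearing"],
    "operations": ["customer_provisioning", "service_provisioning",
                   "event_ticketing", "sla_monitoring"],
    "inventories": ["commercial", "resources", "customer"],
}

def missing_blocks(system):
    """Return the top-level blocks a proposed system forgot to cover."""
    required = {"portals", "mediation", "operations", "inventories"}
    return sorted(required - set(system))
```

A design review can start by running every proposed architecture through a check like this: a service plan that leaves one of the four blocks empty is usually the plan that gets thrown away after the first round of ‘improvement ideas’.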


5 Human interactions: Portals

This is usually the first (and sadly sometimes apparently the only) part of the service that a company works on.

It is not bad to devote time to designing great interaction with your consumers and operators, but you must go beyond nice interfaces. In fact a streamlined interface that does just what is needed and nothing else (extra functionality is not a gift; it is just a distraction) is not so easy to build.

As I mentioned before, it is important to have ‘perfected’ the idea of the service. Once that idea is clear, and once you have written down your objectives (what ‘powers’, or capabilities, you want to put in the hands of your consumers, and what it would take to implement those capabilities in the unavoidable explosion of support systems, operation portals, processes…), it is time to go and draw a few mockups of the interfaces.

It is not easy to build simple interfaces. The simpler the look & feel, the more clever you have to be to handle the interaction correctly. It is especially difficult to get rid of ‘bad habits’ and ‘legacies’ in interface design. You must consider that today the interface you design will be ‘interpreted’ by devices that have new and appealing possibilities: HD displays with video even in small mobile devices, multi-touch displays, high quality sound… and best of all: always-on connectivity and social interaction with other ‘places/sites/services’. This interaction must be ‘facilitated’ by making a natural ‘shift’ from your interface to any other site/interface that you share your consumer identity with, at the click (or swipe) of one widget.

5.1 The customer portal

You need interaction with your consumers/customers. They need a place to contract with you (in case you support ‘online contract generation’, which by the way is the trend), and a place to consume your product, which as far as this article is concerned is ‘content’, that is: information intended for humans.

This portal is the place to let consumers browse products, get info about them, and get them… and it is also the place to let them see transparently everything that links them to you: their purchases (contracts), their wishes, their issues/complaints, their payment info, and their approved social links to other sites…

What makes you a preferred provider for a consumer is trust, ease of use and clarity.

Through the customer portal the Consumer registers his identity with the service (maybe totally automated or aided by Operations), purchases products, receives billing information and places complaints. We’ll see later that there are a number of entities related to these actions: Customer Inventory, Product Inventory, Mediation, Event Ticketing, Operations,…

5.2 The seller (internal) portal

You may run a small company, and in that case you run your business directly through a relatively small portal in which you do everything: advertising, contracting, delivery, feedback, ticketing…

Or you may be a huge company with commercial branches in tens of countries, with local pricing in different currencies, portals in many languages and a localized portfolio in every region.

In both cases it is useful to keep a clear distinction between ‘hard’ and ‘soft’ properties of your product.

‘Hard properties’ of your (content) product/service are those properties that make your product valuable to customers: the content itself (movie, channel, episodes, events…), the ease of use (view in a click, purchase in a click…), the quality (high bandwidth, quality encoding, and flexibility of formats…), the responsiveness of your service (good personalized attention, quick response times, knowledgeable staff…), etc.

‘Soft properties’ of your (content) product/service are those properties that you have added to make exploitation possible but that are not critical to your consumers : the ‘names’ that you use to sell (names of your options, packages, bundles, promotions, IDs, SKUs,…), the prices and price models (per GByte, per movie, per Mbps, per event, per bundle, per channel, per promotion…), the ads you use to promote (ads targeting by population segment, by language, by region,…), the social links and commercial alliances you build, the themes and colors, the time-windows of pricing, ….

The best way to materialize the distinction between ‘hard’ and ‘soft’ properties of a product/service is to keep two distinct portals (and all their associated backend) for ‘product owner’ and for ‘product seller’.

In the ‘product owner’ portal you will manage hard properties.

In the ‘product seller’ portal you will manage soft properties.

The customer portals are built ON the seller portals. That means you have at least as many customer portals as ‘sellers’ in your organization. If you have branches in 30 countries and each of them has autonomy to localize the portfolio, pricing, ads, names, etc., each branch is a seller. Each branch needs an internal portal to build its entire commercial portfolio, adding all the soft properties to a very basic set of common hard properties taken from the product owner internal portal (see below). Each branch (seller) will build one or more customer portals on top of its internal seller portal.

You can even be a ‘relatively small’ company that licenses products/tools to resellers. In that case you provide an internal reseller portal to your licensees so they can sell your hard product as their own by changing names, prices, ads, links, etc.
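The hard/soft split described in this section can be sketched as a shared, read-only ‘hard’ record that each seller overlays with its own ‘soft’ properties (all names and values here are hypothetical):

```python
# Hard properties live with the product owner and are shared by all sellers;
# each seller overlays its own soft properties without touching them.
HARD = {"sku": "VOD-001", "formats": ["HD", "SD"], "sla": "99.9%"}

def seller_view(hard, soft):
    view = dict(hard)    # copy: the shared hard record is never mutated
    view.update(soft)    # soft properties are per-seller additions/overrides
    return view

spain = seller_view(HARD, {"name": "Cine a la carta", "price": "3.99 EUR"})
uk    = seller_view(HARD, {"name": "Movies on demand", "price": "2.99 GBP"})
```

Both sellers present the same underlying product (same SKU, formats and SLA) under different names and prices, which is exactly the autonomy a localized branch needs.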

5.3 The product owner (internal) portal

This is the sancta sanctorum of the product definition. This is the place where you define the internals of your product. This is the place where you empower your consumers by ‘creating’ fantastic actions over fantastic content pieces and putting all these actions and portfolio in the hands of your consumers.

It is VERY difficult to find a tool flexible enough to take ANY piece of content and link it to ANY possible action that makes commercial sense. In fact it is impossible.

For this reason the ‘owner portal’ lives more time in the realm of developers than in the realm of administrators. (This MUST NOT be the case for the other portals: seller, customer… or you would be in serious trouble.)

What I mean is: it is impossible to design the ‘tool of tools’ that can graphically modify, in every imaginable way, the actions you make available to your consumers. The new ideas that you come up with will surely require some new code, and unfortunately this code will sit at the heart of your exploitation systems. For this reason it is better to cleanly separate your sellers’ internal backend from the mother company’s backend, and your customer portals from the sellers’ internal portals.

But do not despair; it is possible to implement a very serious backend for content exploitation, tremendously powerful, and a flexible tool that manages the most common hard properties of a content product/service.

The common ‘product owner portal’ must implement the following concepts and links:

-the complete product list: items not listed here are not available to sellers

-the complete capabilities list: actions and options not listed here are not available to sellers

-general ‘hard’ restrictions: internal SKUs, internal options (formats, quality steps, viewing options…), billing options (per item, per info unit, per bandwidth…), SLA (availability, BER, re-buffering ratio, re-buffering events/minute…)

Every Content Product must go through a cycle: ingestion – delivery – mediation (accounting & billing).

The links between consumer id, contract, product id, options for consumption, options for billing, options for SLA, etc. must be implemented in several information systems (databases, registries, logs, CRMs, CDRs, LDAP directories…).
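The ingestion, delivery, mediation cycle that every Content Product must go through can be sketched as an ordered list of stages with a helper that enforces the order. This is an illustrative toy, not a real workflow engine:

```python
# Every Content Product passes through these stages, in this order.
LIFECYCLE = ["ingestion", "delivery", "mediation"]

def next_stage(current):
    """Return the stage that follows `current`, or None if the cycle is done."""
    i = LIFECYCLE.index(current)     # raises ValueError for unknown stages
    if i + 1 >= len(LIFECYCLE):
        return None                  # mediation is the last stage
    return LIFECYCLE[i + 1]
```

Making the cycle explicit in code (or in any provisioning system) prevents the classic failure of shipping a product that was ingested and delivered but never wired into mediation, and therefore never billed.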

Of these three segments in a service life-cycle (ingestion, delivery, mediation), most academic articles on content services focus on ‘delivery’, as this is the aspect of the service that creates the hardest problems and is thus fertile soil for innovation (CDNs are a great example). This article focuses on all the rest of the effort, everything that is not pure service delivery. One important goal of this article is to demonstrate that creating a great online service first and figuring out how to exploit it later is a bad idea.

5.4 Value added portals & tools

These tools and added values are, as I’ve said, ‘added’. No one needs them to find, get, pay for and enjoy a Content Product. But who ‘needs’ a movie anyway? I mean, ‘need’ is not a good word to describe the reasons that drive content consumption.

Many times there are actions that no one ‘needs’, but the value they add to a basic service makes it so attractive to consumers that the service becomes popular and the new actions become a convenience that every competing service must implement. Examples: search engines, comparison engines, wish lists, social appreciation lists, ranks, comments… People buy ‘objects’ that no one needs to stay alive; there is nothing obvious per se in a Content Product’s value. We must compute its value before purchase, judging by other people’s appreciation and comments.

All the ‘tools’ we can imagine that may help people understand our offer (portfolio) and navigate through it are positive for our business, and they should be appropriately placed in the customer portals or gathered together in a tools tab, not distracting consumers but helping them know and consume Content.

Some modern tools that help monetize Content: product search, product comparison, social wish list, price evolution over time, price comparison, related products, buyers ranks, social (our group) ranks, open comments, long term trend analysis…


6 Mediation Systems

These are key systems for exploitation, but the name is really ugly. What do we mean by ‘Mediation’?

We need a middle-man, an intermediary, a mediator, when we do not have all the capabilities required to do something, or when it is convenient to delegate some task to others who will do it better (or at least equally well but cheaper than us)… or simply when we prefer to focus on other tasks.

In a commercial exploitation system, ‘Mediation’ usually means ‘doing everything needed to apply and enforce contracts’. Sounds easy, right? Well, it isn’t.

Enforcing contracts is ‘Mediation’ for us because we choose not to identify with all the ‘boring’ actions needed to apply the contract… we prefer to identify ourselves with the actions that deliver the service, and that is human. Delivery is much more visible. Delivery usually drives much more engagement from consumers.

Mediation usually involves accounting and processing of data items to prepare billing and data analyses.

Mediation in Content Services includes:

log gathering & processing

billing & clearing (and sometimes payment)

analytics processing and report rendering

In the CDN business, log gathering & processing is a cornerstone of the business. Many CDNs in fact offer edge logs as a byproduct, and some sell them. In some legal frameworks, especially in Europe, CDN service providers are even required to keep edge logs for 12 months, available to any authority that may demand them for an audit.

CDNs are huge infrastructures, usually with thousands of edge machines in tens or hundreds of PoPs distributed over tens of countries. Almost 100% of CDNs bill their customers by the total number of bytes delivered over a month (GBytes/month). Only a few CDNs bill customers per percentile 95, measured in 5-minute slots of delivery speed over a month (Mbps/month). In any case it is necessary to measure traffic at the delivery points (the edge). But the edge is huge in a CDN, so lots of log files will need to be moved to a central element for processing. This processing involves separating CDRs (Customer Data Records) that belong to different customers, different Content Products, different regions, etc. In case a CDN implements percentile 95/5 billing, the downloads have to be processed in 5-minute slots, the average Mbps per slot and customer calculated, the slots for the whole month ranked, and the percentile 95 calculated per customer.
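The percentile 95/5 calculation just described can be sketched in a few lines. This sketch assumes the nearest-rank method; real billing contracts vary on the exact percentile definition, so treat it as illustrative:

```python
# Percentile 95/5 billing: rank the average Mbps of every 5-minute slot in
# the month, discard the top 5% of slots, and bill the highest remaining value.
import math

def p95_billable_mbps(slot_mbps):
    """slot_mbps: list of average Mbps values, one per 5-minute slot."""
    ranked = sorted(slot_mbps)
    k = math.ceil(0.95 * len(ranked)) - 1   # nearest-rank 95th percentile index
    return ranked[k]

# e.g. one day of slots: 288 slots of 10 Mbps with 8 spikes of 100 Mbps
slots = [10.0] * 280 + [100.0] * 8
month_p95 = p95_billable_mbps(slots)   # 10.0: the spike slots fall in the top 5%
```

This is why percentile billing is attractive to customers with bursty traffic: short spikes (up to 5% of the slots, about 36 hours in a month) do not raise the bill.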

Usually other interesting calculations are worth doing over edge logs.

We now live in the era of ‘Big Data’, which is a new buzzword for an activity that has long been present in some businesses (like CDNs) and long absent in others (like many online services): behavior recording (journaling) and offline analysis (trend spotting and data correlation).

Analytics and billing should be related in a CDN. As time goes by, more and more data analyses become popular with CDN customers. We started with very basic billing information (traffic/month), and that is still valid, but many other analyses have become feasible these days due to increased processing power and new, interesting propositions about data correlation. Online content businesses appeared in a world where other online services already existed and there were established billing information systems. These billing systems for online services were mostly of two kinds: continuous service accounting for deferred billing (CDR-based, common in telephony) and discrete event billing (common in online shops).

Discrete event billing is easy to understand: one SKU is purchased, one SKU is billed. No more time spent.

CDRs (Customer Data Records) are tiny pieces of information that must be collected over a period (usually monthly) to help reconstruct the ‘service usage history’. Each CDR is, as far as possible, an independent piece of information intended to be processed later by a ‘billing machine’. When creating the CDR we must not rely on any context information being available later, and thus the CDR must contain everything needed to convert it into money: customer ID, service ID, units consumed, a time-stamp, and other ‘creative specific billing data’. The fact is that there is always some context needed at processing time, so no CDR system is perfect, but the whole idea of keeping CDRs is to reduce the dependence on context that exists at the time of CDR creation; in this way we are able to post-process, adding information that was not available at consumption time (in case this information ever appears).
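A self-contained CDR along the lines just described might look like this (the field names are illustrative assumptions, not any operator’s actual record format):

```python
# A CDR carries everything needed to turn it into money later,
# so the billing machine does not depend on context at processing time.
import json
import time

def make_cdr(customer_id, service_id, units, unit_type, extra=None):
    cdr = {
        "customer_id": customer_id,
        "service_id": service_id,
        "units": units,
        "unit_type": unit_type,          # e.g. "gbyte", "minute", "event"
        "timestamp": time.time(),
        "billing_data": extra or {},     # the 'creative specific billing data'
    }
    return json.dumps(cdr)               # independent, self-describing record
```

Each record can then be shipped to the billing back end and consolidated at the end of the period without needing to look up what the service or tariff was at consumption time.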

Consolidation of CDRs is a cumbersome process that allows great flexibility in Telco billing, but it does not come for free. In fact this ‘flexibility’ has created one of the biggest problems in data back ends: processing of CDRs usually cannot start until the billing period has ended (explanation below), and at that moment the CDRs in the billing system can amount to thousands of millions of records. Huge datacenters have been built for billing. They are expensive, they are slow, they are complex, they are unreliable (no matter how enthusiastic the vendor is and how small he claims the amount of ‘impossible to charge for’ records to be). Why is this? Creativity in business models for ‘Telecom utilities’ has been enormous in recent times, especially since the advent of mobile communications. A subscriber is usually charged at the end of a month, and in the middle he can contract, refuse, modify and consume a variety of communication products, receive promotions and discounts, obtain fidelity points, redeem points… All this complexity of actions that affect the monthly bill must be recorded, enriched with context, time-stamped, stored… and a consolidation process must be run at the end of the billing period to transform CDRs into a bill per customer. This high complexity is supported willingly by Telcos today. They seem to have a preference for creating a plethora of different promotions, personal plans, personal discounts, special discounts, special surcharges, different pricing time windows… It seems that currently this complexity is good for the Telco business, but the other side of it is that you need CDR billing.

Now you should be asking yourself: business-wise, is a Content Service more like a shop or more like a mobile Telco service? Will we do better with discrete-event billing or with CDR billing? That may be a tricky question. In my own humble opinion, any Content Service is better thought of as a shop, and a CDN is no exception. CDNs create an interesting paradox: the Customer (the one who looks for the service, eventually gets it and pays for it) usually is not the same human that ‘consumes’ the service. The typical CDN customer is a company that has some important message to deliver through the internet. There can be millions of ‘consumers’ on demand of that message. There can be thousands of millions of consumption actions in a billing period, exerted by millions of different humans. This fact distracts many people from other, more important facts:

-the Service Capabilities are completely determined and agreed before starting to serve

-the SLA is completely established and agreed before starting to serve

-pricing is completely clear and agreed before starting to serve

-no matter that there are millions of termination points, it is perfectly possible to trace all of them to the CDN service and bill all the actions to the proper customer

-a Telco service is strongly asymmetric: the customer is many orders of magnitude less ‘powerful’ than the service provider; a CDN is not. For a CDN, many customers may in fact be financially bigger than the service provider, so there is room for initial negotiation, and there is NO room for wild contract changes in the middle of the billing period just because the service provider gets creative about tariffs or whatever.

So I would say that CDR billing only complicates things for a CDN. Logs of edge activity are the ultimate source for service audit and billing, but there is no point in separating individual transactions, time-stamping each one, adding all the context that makes a transaction independent from all others, and storing all those millions of records.

A CDN deserves something that sits midway between event billing and CDR billing. I like to call it ‘report-based billing’. Some distributed processing (distributed along the edge and regions of the world) allows us to produce ‘reports’ about the bytes downloaded from the edge and accountable to each of our customers. These reports are not CDRs. Neither are they ‘unique events’ to be billed. These reports are partial bills for some time window and some customer. We may do processing daily, hourly or even finer than that. We will end up having the daily (for instance) bill for each customer in each region. This daily bill can easily be accumulated over the month, so we will have the bill up to day X in month Y with little added effort over daily processing. These reports easily support corrections due to failures in service that have an effect on billing (compensations to customers, free traffic, promotions…) and also support surgical amendments of the daily report consolidation in case (for instance) some edge log was unrecoverable at the time of daily processing but was recovered later.

By implementing this ‘continuous consumption accounting and continuous report consolidation’ it is possible to bill a CDN (or any content business) immediately after the billing period (usually a month) ends, but most importantly there is no need to process billions of CDRs to produce our bills, nor do we need a huge datacenter for this specific purpose.
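The report-based billing idea above can be sketched in a few lines. This is a minimal illustration, not a production design: the log record layout (customer, region, bytes) and the function names are my own assumptions, not part of any real CDN system.

```python
from collections import defaultdict

def consolidate_daily(log_records):
    """Aggregate raw edge log records into per-customer, per-region daily totals.
    Each record is assumed to be a (customer_id, region, bytes_delivered) tuple."""
    report = defaultdict(int)
    for customer_id, region, nbytes in log_records:
        report[(customer_id, region)] += nbytes
    return dict(report)

def accumulate_month(daily_reports):
    """Roll daily reports up into a month-to-date bill per customer/region.
    A late-recovered edge log only requires re-running one daily report."""
    month = defaultdict(int)
    for daily in daily_reports:
        for key, nbytes in daily.items():
            month[key] += nbytes
    return dict(month)

# Two days of (hypothetical) edge activity:
day1 = consolidate_daily([("acme", "EU", 500), ("acme", "EU", 300), ("globex", "US", 200)])
day2 = consolidate_daily([("acme", "EU", 100)])
mtd = accumulate_month([day1, day2])  # month-to-date bill, no CDRs stored
```

The point of the sketch is that only small daily aggregates are kept, never billions of individual transaction records.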


7 Operation Systems

This concept of ‘operation’ leads us to an interesting discussion. In the Telco world operation is always present. No system or service can work with ‘zero operation’. This concept of operation goes beyond ‘maintenance’. Operation means ‘keeping the service up’, a task that varies greatly from one service to another. One may think that the better the service was imagined, the less operation it needs… and that is not a bad idea. It is true. But in the real world ‘zero operation’ is not yet possible.

Put simply, the services we create have so many actions inside, affecting so many machines and lines of code, that we cannot really believe they will work without keeping an eye on them. Taking care of that is ‘monitoring’, and by the way we never really discovered how to accomplish some tasks automatically (customer contact, contracting, support calls, replacement of SW versions, etc…) and that is ‘support’.  These human concepts of ‘monitoring’ and ‘support’ have been given names in the Telco world: OSS (Operation Support Systems) and BSS (Business Support Systems), but in real life there is a high overlap between them.  How could you possibly think of a task that is part of operating a service without it being a support to the business?  Have you ever seen any business whose operations do not carry costs?  Do you have operations that do not produce business? (If you answered ‘yes’ to either question you had better review your business…).

The most important (in my view) OSS/BSS in Content Services are:

customer provisioning: customer identity management

service provisioning:  contract management + resource allocation

event (incident) ticketing: link customer + product + ticket + alarms + SLA

SLA monitoring: link product properties + analytics + contract -> alarms -> billing

7.1 Customer/Consumer provisioning

This kind of system, an information system that acquires and handles human identity records, has evolved enormously in recent years. ‘Managing Identity’ is an incredibly powerful capability for a business, and it carries great responsibility that will be enforced by law.  However, only very recently have we seen some part of the power of Identity Management in real life.

In a series of internal research articles that I wrote seven years ago I promoted the idea of a ‘partially shared identity’. At that moment the idea was certainly new, as some syndication of ‘whole identities’ was entering the industry and some more or less ‘promising standards’ were in the works.  We built a demonstration of three commercial platforms that were loosely coupled by sharing fragments of the whole identity of the end user.

Today I’m happy to see that the once ‘promising’ standards, which were overly complex, have been forgotten, but the leading commercial platforms and the leading identity management platforms (social networks) now allow cross-authentication by invoking APIs inspired by the idea of a ‘set of identity records and set of permissions’. The platform that requires access to your identity data will let you know what ‘items’ it is requesting from your authenticator before you allow the request to go on.  This is a practical implementation of ‘partial identity’.
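The ‘set of identity records and set of permissions’ idea can be sketched very simply. This is an illustrative toy, not any real social network’s API: the field names and the helper function are my own assumptions.

```python
# Hypothetical identity store held by the authenticating platform.
full_identity = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "birthdate": "1990-01-01",
    "payment_token": "tok_123",
}

def share_partial_identity(identity, requested_items, user_approved_items):
    """Release only the identity items that were both requested by the
    relying platform AND explicitly approved by the user."""
    granted = set(requested_items) & set(user_approved_items)
    return {k: v for k, v in identity.items() if k in granted}

# The relying platform asks for three items; the user approves only two:
shared = share_partial_identity(
    full_identity,
    ["name", "email", "payment_token"],
    ["name", "email"],
)
# payment_token was requested but not approved, so it is never shared
```

This mirrors the consent screen described above: the user sees the requested ‘items’ and the intersection of request and consent is all that crosses platform boundaries.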

But let’s focus on the simplest purpose of ‘Customer Provisioning’: we need to acquire a hook to some human so we can recognize her when she is back, give service to some ‘address’, take her feedback, send her bills and charge her account for the price of our service.

As I’ve said the most intelligent approach to knowing our users today is …going directly to our would-be-customer and saying: … ‘Do you have a social network in which you are well known? Good, please let me know which one. By the way I have bridges to the biggest three. You can choose the one you prefer to authenticate with me and I will not bother you a single minute entering your data.’

Usually social networks do not hold information about payment methods (VISA, PayPal, etc…), so, fortunately for the peace of mind of our customer/consumer, that part of the personal data cannot be shared. But taking the more general concept of a ‘platform’ in which a consumer has a personal account, one can imagine a business relationship with another platform in which the consumer would occasionally like to make a purchase, but he does not want to rely on that platform to handle his payment. If the consumer gives permission, the charge could be sent to the first platform, which the consumer already trusts. The first platform will handle the consumer’s money and the new (or second) platform will just be a provider of goods to the first platform, sending those goods (in our case Content Products) directly to the address of the consumer. In this way the consumer obtains the benefits of sharing his payment data without actually sharing them.

I have to say that I’m also happy to see this concept implemented today in Amazon Marketplace. In the case of virtual goods (Content) it could be even easier to implement (or more complicated; it depends on the nature of the content and the kind of delivery that is appropriate).

7.2 Service Provisioning

This is hard stuff. As I mentioned at the beginning of this article, ‘…today we are not talking about delivery…’ But in fact delivery is the most attractive part of content businesses from a technological perspective. It is also the biggest source of pain for a content business: it is where you can fail, where you can be wrong, have the wrong strategy, the wrong infrastructure, the wrong scale… It is a hard problem to solve, but that is the reason it is so exciting. CDNs are exciting. Service Provisioning is directly related to how you plan and execute your content delivery.

Provisioning more service is a daily problem in CDNs. It may be due to a new customer arriving or to existing customers demanding ‘more service’.  It cannot be taken lightly. Customers/Consumers can be anywhere throughout your footprint, even worldwide, but you do not have PoPs everywhere and your PoPs do not have infinite capacity. Service provisioning must be the result of thorough thinking and data analysis about your current and projected demand.

As I commented in a different article, a CDN takes requests from essentially anywhere and then has to compute ‘request routing’ to decide, per request, which is the best resource to serve it. Resources are not ‘anywhere’. There is a ‘footprint’ for a CDN.  There are many strategies to do this computation, and there are many high-level strategies to geographically distribute resources. Recently the edge of CDNs has started to be less distributed. Or it would be better to say that the original trend of ‘sprawling the edge’ through the world has greatly slowed down. CDNs nowadays enhance the capacity of their edges but they have almost stopped branching the edge finely. There is a reason for this behavior: the most consumed content in CDNs (per byte) is VoD, and pre-recorded content delivery is not very sensitive to edge branching. With appropriate buffering a few-PoPs edge can do very well with VoD. On the contrary, live events and low-latency events depend very much on proper branching of the edge.

When the probability of dropping requests in our request routing, due to misalignment between our demand and our resources’ capacity/position, rises above a certain threshold, we will need to increase our service capacity.
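That trigger condition is simple enough to sketch. This is a minimal illustration under my own assumptions (a 1% drop-ratio threshold is arbitrary; real CDNs would use measures agreed in their SLAs):

```python
def needs_more_capacity(dropped_requests, total_requests, threshold=0.01):
    """Signal a service-provisioning action when the observed ratio of
    requests dropped by request routing exceeds the agreed threshold.
    The 1% default threshold is purely illustrative."""
    if total_requests == 0:
        return False  # no demand observed, nothing to provision
    return dropped_requests / total_requests > threshold

# 5% of requests dropped: provisioning action needed
needs_more_capacity(50, 1000)   # True
# 0.5% dropped: still within threshold
needs_more_capacity(5, 1000)    # False
```

In practice this check would run per region and per time window, feeding the demand/resource maps discussed later in the inventory section.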

In a CDN there is usually dynamic allocation of resources to requests. There is no static allocation of resources to some requests, for example to some customer. But there are always exceptions. In a sophisticated CDN it is possible to compute the request routing function with reservation of resources for some customers. This technique of course makes global request routing much more complicated but introduces new business models and new possibilities in SLAs that are worth considering.

In case your CDN applies capacity reservation then a new customer with a reservation will have an immediate impact in service provisioning.

Other impacts in service provisioning emanate from the very nature of some CDN services. For example, when a CDN starts caching a domain of a new customer it is usually necessary to inform the caches of the name of this domain so they (the caches) change their policy to active caching. This action should be triggered by a proper service provisioning system.

7.3 Event ticketing

In any online service it is important to keep track of complaints and possible service failures. I would say that this is not a very special part of a Content service. (Please understand me right: being a Content Service does not make this more special than in other Services.) Essentially it is a workflow system that will let you account for events and link them to Customer Identity + Operations work orders.  Easy as it is to implement a simple workflow, it is worth the time to use alarms and time stamps to implement a ‘prompt communication’ policy. Once you have received notice of a potential problem, the clock starts ticking and you must ensure that all stakeholders receive updates on your actions in due time. The ticketing system does exactly that. It creates ‘tickets’ and manages their lifecycle. A ticket is a piece of information that accounts for a potential problem. As more details are added to the ticket, all stakeholders benefit from its existence: the customer gets responses and corrective actions, operations get information to address the problem, the whole system gets repaired, other users avoid running into problems, and your data backend and analytical accounting get information about your time to solve problems, number of problems and cost of repairs.
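The ticket lifecycle and the ‘prompt communication’ policy just described can be sketched as a small class. This is illustrative only; the field names and the two-hour update interval are my own assumptions, not a standard:

```python
from datetime import datetime, timedelta

class Ticket:
    """Minimal sketch of an incident ticket. The prompt-communication policy
    is modeled as: while a ticket is open, stakeholders must receive an
    update at least every `update_interval`."""

    def __init__(self, ticket_id, customer_id, description,
                 update_interval=timedelta(hours=2)):
        self.ticket_id = ticket_id
        self.customer_id = customer_id
        self.description = description
        self.update_interval = update_interval
        self.opened_at = datetime.utcnow()
        self.last_update = self.opened_at
        self.status = "open"
        self.history = []  # (timestamp, note) pairs for all stakeholders

    def add_update(self, note):
        """Record progress; resets the prompt-communication clock."""
        self.last_update = datetime.utcnow()
        self.history.append((self.last_update, note))

    def update_overdue(self, now=None):
        """True when the communication policy is being violated."""
        now = now or datetime.utcnow()
        return self.status == "open" and now - self.last_update > self.update_interval

    def close(self, resolution):
        self.add_update(resolution)
        self.status = "closed"
```

A real system would link `customer_id` to the Customer Inventory and attach operations work orders, as the text explains.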

All in all the ticketing system is your opportunity to implement transparency and a ‘communication bus’ that works for emergencies and gives the right priority to many different events and incidents.

7.4 SLA Monitoring

This is an information system that I rarely see ‘out of the box’. You need to build your own most of the time. Many vendors of video equipment and/or OVPs sell ‘probes’ that you can insert at a variety of points in your video distribution chain. These probes can give you a plethora of measures or ‘quality insights’ about your service. Many other vendors will provide you with network probes, traffic analysis, etc… It is advisable to have a solid background in performance analysis before trying to use a vendor’s suggested set of SLOs (Service Level Objectives) to build an SLA (Service Level Agreement) for a customer. It happens many times that the understanding we get from the written SLA is not the same that the customer gets. And it happens even more frequently that the measures the probes give us DO NOT implement what we have announced in our SLA.  It is key to clear up any doubt about what is measured and how, exactly, it is measured. (For more in-depth information you may want to read my previous article: CDN Performance Management.)

The SLA is our commitment before our customer to grant certain characteristics of content traffic. Today no one can sell a Content Service on the ‘soft promise’ that the Service will scale seamlessly with demand, the traffic shaping will be correct, the delay low, the losses nonexistent, the encoding quality superb… All these ‘fuzzy statements about service quality’ are simply not accepted.  The ‘reach’ of the service, in terms of what that service can really do, cannot be an aspiration. It must be documented in an SLA. This SLA will state clearly what we can expect from the service, using quantitative measures.

There are very serious differences between ‘cheap’ CDNs/content services and ‘serious’, ‘high quality’ services. Even when the finished product may occasionally look the same (video on demand, channels, events…) there is a whole world of complexity in preparing the service in advance to support any eventuality. A quality service provider may easily spend 5X to 10X more than a cheap provider preparing for variations in load and for all kinds of performance threats.  Of course taking care of performance in advance is expensive. It involves lots of analysis of your systems, constant re-design and improvement, buying capacity in excess of demand, buying redundancy, hiring emergency teams, buying monitoring systems… How can a business survive this investment? This investment is an opportunity for positive publicity and for a business model based on quality and SLAs. If you are highly confident in your performance you can sign a very aggressive SLA, promising high quality marks and accepting penalties for casual infringement.

There used to be a huge difference in the delivery options available to a Content Service in the early days of CDNs (15 years ago). At that moment the options were:

Option 1: Plain carrier connectivity service: no content-oriented SLA. Use it at your own risk. Only percentiles of dropped packets and average Mbps available were eligible to describe quality. Nothing was said about the integrity of individual transactions.

Option 2: CDN. A ‘shy’ SLA, promising a certain uptime of the service, a certain bounded average transaction latency, and a certain set of content-related quality KPIs: buffering ratio, time to show, a certain level of cache-hit ratio…

At that moment option 2 was much more valuable than option 1 (no surprise…), and for that reason prices could be 10X raw Carrier traffic prices for CDN customers.

Today, after years of CDN business, after continued improvement in Carrier services, but also after a serious escalation in demand for Content and a serious escalation in typical Content bitrates… SLAs have to be different, and the ratio of CDN prices to traffic prices has to be different. Anyway this is a matter for a much longer article.

What happens today is that SLAs are a much less impressive sales tool. Almost all CDNs show very similar SLAs. I’ve noticed a very interesting trend: some CDNs are getting increasingly ‘bold’, promising to achieve certain SLOs that are close to impossible to grant.  This is probably an effect of the way most customers check SLAs: they check them only in case of serious failure.  Or even disastrous failure. There is no culture of reviewing the quality of the traffic when there are no complaints from end users.   Companies that commercialize SLA-based services have noticed this, and they may in some cases be relaxing their vigilance on SLAs, moving resources to other more profitable activities and reacting only in the rare case of a disastrous infringement of the SLA. In that case they just refund the customer and go on with their activity. But at the same time they keep on selling service on SLA promises.

My own personal view on managing SLAs is not aligned with this ‘react only in case of problems’ style. It is true that the underlying carrier services are more reliable today than 15 years ago, but as I’ve said, Content Technology keeps pushing the envelope, so it would be better to redefine the quality standards.  We should not assume that IP broadcasting of a worldwide event ‘must’ carry a variable delay of 30s to 60s. We should not assume that the end user will have to live with a high buffering ratio for 4K content. We should not assume that the end user must optimize his player for whatever transport my content service uses.

It is a good selling point to provide SLA monitoring reports for all services contracted by the customer on a monthly basis.  These reports will show how closely we have monitored the SLA, and what margin we have had across the month for every SLO in the SLA. Of course these reports also help our internal engineering in analyzing the infrastructure. Good management will create a cycle of continuous improvement that gives us a bigger margin in our SLOs and/or the ability to support more aggressive SLOs.

SLAs and their included SLOs are great opportunities for service differentiation. If my service can offer seriously low latency, or no buffering for 4K, let us demonstrate it month by month with reports that we send for free to all customers.

So having SLA reports for all customers all the time is a good idea. These reports can usually be drawn from our Performance Management Systems and through mediation can be personalized to each Customer.
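A monthly SLO-margin report of the kind described can be sketched as follows. The SLO names, targets and report layout here are illustrative assumptions of mine, not taken from any real SLA:

```python
def slo_margin(measured, target, higher_is_better=True):
    """Margin of a measured KPI against its SLO target, as a
    fraction of the target (positive means the SLO was met)."""
    if higher_is_better:
        return (measured - target) / target
    return (target - measured) / target

def monthly_sla_report(slos, measurements):
    """slos: {name: (target, higher_is_better)};
    measurements: {name: measured value for the month}."""
    report = {}
    for name, (target, higher) in slos.items():
        margin = slo_margin(measurements[name], target, higher)
        report[name] = {
            "target": target,
            "measured": measurements[name],
            "margin": round(margin, 3),
            "met": margin >= 0,
        }
    return report

# Hypothetical SLOs: availability should be high, latency should be low.
slos = {"availability_pct": (99.9, True), "avg_latency_ms": (80.0, False)}
report = monthly_sla_report(slos, {"availability_pct": 99.95, "avg_latency_ms": 72.0})
```

Sending such a report monthly makes the SLO margins, and the continuous-improvement cycle behind them, visible to every customer.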


8 Inventory Systems

These are of course core components of our exploitation. As commented above, we must keep track of at least: tech resources, customers and products.

I like to start with the hardcore components of a good delivery: tech resources.

8.1 Technical Inventory

This technical inventory is a concept that comes very close to the classical OSS inventory of machines. I say close and not identical because a content service should go beyond connectivity in the analysis of the technical resources.

The technical inventory must contain a list of all machines in the service (mostly delivery machines in a content service) with all their key characteristics: capacity, location, status…  These are long-term informative items. Real-time load is not represented in the inventory. An alarm (malfunction) may or may not be represented in the inventory. It may be there to signal that a machine is out of service: status ‘out’.

Having a well-structured tech inventory helps a lot when implementing automated processes for increasing the delivery capacity. In a CDN it is also especially important to regularly compute the resource map and the demand map. In fact the request routing function is a mapping of the demand onto the resources. Ideally this mapping would be computed instantly and the calculation repeated continuously.

The technical inventory is not required to represent the current instantaneous load of every machine. That is the responsibility of the request routing function. But the request routing is greatly supported by a comprehensive, well-structured technical inventory in which a ‘logical item’ (like for instance a cache) can be linked to a hardware part description (inventory).

Having this rich-data HW inventory allows us to implement an automated capacity forecasting process. In case a new big customer wants to receive service, we may quickly calculate a projection of demand and determine (through the inventory) the best place to increase capacity.
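That forecasting step can be sketched by comparing in-service capacity against projected demand per region. The inventory record fields and the Gbps figures below are illustrative assumptions, not a real schema:

```python
# Hypothetical technical inventory records (long-term items only,
# as the text notes: status, not real-time load).
inventory = [
    {"machine": "edge-eu-01", "region": "EU", "capacity_gbps": 40, "status": "up"},
    {"machine": "edge-eu-02", "region": "EU", "capacity_gbps": 40, "status": "out"},
    {"machine": "edge-us-01", "region": "US", "capacity_gbps": 80, "status": "up"},
]

def spare_capacity_by_region(inventory, projected_demand_gbps):
    """Compare installed in-service capacity with projected demand per
    region; negative spare capacity marks where to grow first."""
    capacity = {}
    for m in inventory:
        if m["status"] == "up":  # machines marked 'out' do not count
            capacity[m["region"]] = capacity.get(m["region"], 0) + m["capacity_gbps"]
    return {region: capacity.get(region, 0) - demand
            for region, demand in projected_demand_gbps.items()}

spare = spare_capacity_by_region(inventory, {"EU": 60, "US": 50})
# EU is short by 20 Gbps (one machine is out of service); US has 30 Gbps spare
```

Note how the `status` field from the inventory directly drives the forecast: the out-of-service EU machine is excluded, exactly the link to the ticketing system described next.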

It is also very useful to link the inventory to the Event ticketing system. In case a machine is involved in a service malfunction, that machine can be quickly identified, marked as out of service, and retired from our delivery and request routing functions. At the same time our OSS will be triggered for an on-site repair, a replacement… or we may simply mark the datacenter as eligible for an end-of-month visit.

The tech inventory must also be linked to our cost computation process, which also takes data from our mediation systems and our purchasing department. We want to know the lifetime of each machine that we operate and the impact of each machine on our costs. This impact has CAPEX and OPEX components.  Having these links between analytic systems allows us to implement a long-term profitability analysis of our business.

8.2 Product Inventory AKA Commercial portfolio

As we saw when talking about service ideation, there is a portfolio of different products. In the case of Content Products this portfolio maps to a list of titles and a wealth of ‘actions’ or capabilities that our customers buy the right to execute. We may package titles with actions in the most creative way anyone could imagine: channels, pre-recorded content on demand, events… any combination of the above with an idea of quality through ‘viewing profiles’ (bitrate, frame size, audio quality, frame rate, codec, color depth…), monthly subscription, pay per view, hour bonus, flat rate, premium flat rate… whatever. But how do we map all these ‘products’ to our other systems: technical inventory, customer inventory, mediation, analytics, portals, SLA monitoring, event ticketing…?

The best solution is to build links from the Product Inventory to all the systems in a way that makes sense. And that ‘way’ is different for each of the exploitation system components that we have described.

For instance, when designing a VoD product we should map it to the Technical Inventory to be sure that the list of codecs + transports is supported by our streamers. If we have a heterogeneous population of streamers, in which some support the new Product and some do not, we need to link that knowledge to customer provisioning so we do not sell a footprint for that product that we cannot serve. If that same VoD product will be billed through a monthly flat rate with a cap on traffic and a closed list of titles, but we allow premium titles to be streamed for an extra fee, we must include informational tips in the Product inventory so the link to Mediation can properly build the monthly bill for this customer.  If we want to apply different pricing in different parts of the world, we need to include those tips in the Product inventory and use them to link to Mediation, to Customer provisioning and to Portals.
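The first of those links, checking a new product’s codec requirements against the streamer population, can be sketched like this. The record fields and codec names are illustrative assumptions:

```python
# Hypothetical streamer records drawn from the Technical Inventory.
streamers = [
    {"id": "str-eu-01", "region": "EU", "codecs": {"h264", "h265"}},
    {"id": "str-us-01", "region": "US", "codecs": {"h264"}},
]

def sellable_footprint(product_codecs, streamers):
    """Regions where every codec required by the new product is supported
    by at least one streamer, so customer provisioning never sells a
    footprint we cannot serve."""
    regions = {}
    for s in streamers:
        regions.setdefault(s["region"], set()).update(s["codecs"])
    return {r for r, codecs in regions.items() if set(product_codecs) <= codecs}

# A new VoD product requiring both H.264 and H.265:
footprint = sellable_footprint({"h264", "h265"}, streamers)
# Only the EU streamers support both codecs, so only EU can be sold
```

This is the kind of ‘informational tip’ the text describes: a Product Inventory entry queried by customer provisioning before a contract is signed.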

Of course the most obvious link of the Product inventory is to the Product Owner Portal. The Product Owner Portal is the technical tool used to design the product capabilities (actions) and, as I’ve said, it is a system that lives at the core of the Exploitation system, in a dungeon where only developers and a few product owners can touch it. As it goes through frequent updates to provide new and exciting capabilities, the same happens to the Product Inventory. This inventory evolves with the Product Owner Portal, to reflect and account for every new capability and to store the tips that link, via many processes, to the rest of the exploitation system components.

8.3 Customer Inventory

As we have mentioned before, having information about our customers is an asset that has turned out to be more powerful than ever before. In fact there is a serious fight among commercial platforms over who holds the personal data records of customers. For this reason ‘sharing’ part of the customer identity is the new trend.

Anyway, let’s assume that we are the primary source of data about one particular customer. In that case we need to record enough information to legally approach our customer: full name, full address, fiscal ID, payment data.  On top of that we may pile up whatever data we dare to ask our customer about himself.

And on top of ‘what our customer knows we know about him’… we will add a ton of ‘insights’ that we can get about our customer just by watching his public activity.  Important: watching public activity means taking notes on actions that we are supposed to notice as service providers… It is not, and will never be, spying on other activities of our customer, of course!   There are many insights that online businesses do not exploit, or at least exploiting them was cumbersome and not very fashionable until recently.  The ‘Big Data’ age is changing that.  Profiling customers is a hard task that involves lots of interesting algorithms to correlate data sources, but the good part is that we already have the data sources: purchases, timestamps of purchases, traffic, clicks on media players, ‘behavior’ at large. And the other good thing about collecting all these insights is that it is lawful and it is a ‘win-win’ action that benefits the service and the Customer equally.

The Customer Inventory is of course linked to Portals, to Mediation, to event ticketing, to some Analytics and to SLA monitoring.


9 Conclusions

We have seen that Exploitation Systems are a set of ‘systems’ that rival the core service systems (usually called the ‘service delivery infrastructure’) in complexity. But Services must be exploitable… easy for the service provider.

We have seen that we cannot buy Exploitation Systems off the shelf. OK, we can. But is it good to go with an all-purpose exploitation suite with 100+ modules that are designed to behave the same whether selling cars, houses, apples… or movies? My guess is that Content Businesses have some specifics that set them apart from other Services, and there are even specifics that separate one Content Service from another. If we buy a famous exploitation suite for online businesses we MUST have a clear design in mind to customize it.

We have seen that some formality at the design stage helps later. I suggest creating a Service Exploitation Model first and implementing a Service Exploitation System after it.

We have decomposed the typical pipeline for exploitation of Content Services into major subsystems: Portals, Mediation, Operation, Inventories.

We have reviewed the major subsystems of the exploitation, analyzed the good properties that each subsystem should have for Content Services, and discussed trends in the design of these subsystems.

While reviewing the desired properties of the subsystems we have noticed the links and processes that we need to create between them.  We have noticed the huge possibilities that come from linking subsystems that in other Service views (alien to Content) are kept separate. These links are key to the coordinated behavior of the Exploitation, and they must be instrumented by adding information that makes the subsystems cooperate.

As a final remark I would like to emphasize how important it is to apply innovation, analysis and continuous improvement methods to the Exploitation of Content Services. I know it looks fancier to deal with the infrastructure for the delivery of video but there are a lot of interesting and even scientific problems to solve in Exploitation.

Best Regards.                                                                                       Adolfo M. Rosas


(download as PDF: CDN interconnection business and service)



What does ‘connecting two (or more) CDNs’ mean?

There could be many possible ‘CDN-to-CDN connections’. Let us have from the beginning a wide definition of ‘interconnection of CDNs’:

<<Agreement between two businesses (CDNs) by which some responsibilities and activities of one party (CDN) are delegated to another party or parties and some compensation is agreed and received for this exchange>>

Technical means may be needed to implement the exchange of responsibilities/activities and to handle compensation.

Why connect two (or more) CDNs?

Two distinct CDNs are two separate businesses, two separate networks. The reason that stands out is ‘connect to increase both businesses’. If we cannot find business advantages for both parties in the connection, the effort does not make sense.

What business advantages are enabled by interconnection?

A CDN gets money from bulk-delivered traffic and/or per transaction. A CDN receives requests over a ‘URL portfolio’ and gives either bulk traffic or transactions in response. A CDN is designed to gather requests from anywhere in the world BUT to deliver traffic only to some footprint (part of the world). The only way to increase your business results is to increase income or cut costs (obvious). Increasing income can be done by increasing delivery/transactions, by raising traffic prices, or by raising transaction value (offering VAS, value-added services). Cutting cost is another tale entirely, achievable through many technically intricate ways.




1) Increase delivery

More requests coming in are an opportunity for more business (more transactions and/or more traffic). The only action available to CDN admins who want to increase the number of incoming requests is to increase the URL portfolio: ‘CDN A’ will see more requests if ‘CDN B’ delegates a set of URLs to CDN A (even if it is done temporarily).

(NOTE: End-user demand is unpredictable. The total sum of requests over the current URL portfolio may increase without actually increasing the size of the portfolio (number of URLs); just imagine that each URL in our current portfolio receives some more requests, but that increase happens as the result of an action of the end users, not the CDN.)

But, why would CDN B delegate part of its precious portfolio to CDN A?

Resources are limited. More processing/delivery resources allow for more business. A CDN can never know nor control how many requests come in, so it is possible that some requests coming into CDN B cannot be served. In that case CDN B could ‘overflow’ some requests to CDN A, thus retaining otherwise lost profit.

1.1) Temporary delegation of portfolio (impulsive overflow).

Maybe CDN B just does not have enough machines (limited processing and limited delivery) and ‘success took them by surprise’; in that case this is a pure temporary overflow and may be handed to another CDN A that operates in the same footprint. Over time CDN B will adjust capacity and stop overflowing to CDN A, as it is usually more profitable to use your own capacity and retain your whole portfolio of URLs and Clients. The handover in and out must be fast. It is important to be able to trigger overflow based upon some real-world variables that are inspected in real time. The trigger action is real time, but all the agreements needed for this to happen are negotiated in advance, and the limits and conditions are set in advance.
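The real-time trigger with pre-negotiated limits can be sketched as a per-request decision. The 95% load threshold and the return labels are illustrative assumptions of mine:

```python
def route_request(own_load, own_capacity, overflow_enabled,
                  overflow_threshold=0.95):
    """Per-request, real-time decision: serve with our own capacity or
    overflow to the partner CDN. Whether overflow is enabled at all, and
    at what threshold, is agreed in advance; only this trigger is
    evaluated at request time."""
    if own_load / own_capacity < overflow_threshold:
        return "serve_locally"        # normal case: our own capacity suffices
    if overflow_enabled:
        return "overflow_to_partner"  # impulsive overflow within agreed limits
    return "reject"                   # no agreement in place: request is lost

route_request(50, 100, overflow_enabled=True)    # plenty of headroom
route_request(98, 100, overflow_enabled=True)    # at capacity, hand over
route_request(98, 100, overflow_enabled=False)   # at capacity, profit lost
```

The last case is exactly the ‘otherwise lost profit’ the text mentions: without the pre-negotiated agreement, the fast handover cannot happen.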

1.2)  Long term delegation of portfolio. (delegation or ‘traffic termination’).

Maybe CDN B is not deployed in some footprint, so it is suboptimal (not profitable) for CDN B to deliver traffic/transactions to that footprint. In this case CDN B needs a long-term delegation of the delivery of some URLs to CDN A for a specific footprint. This is a case of ‘keeping growth without increasing cost’.


2) Adjust prices

2.1) Mutual Overflow in the same footprint (Hidden balancing)

Competing CDNs that operate in the same footprint usually force a trend of diminishing prices, trying to capture clients. Two-way interconnection at low transit prices may have the effect of presenting a stable price to clients in the area (footprint). Any CDN in the area may benefit from a new client of another CDN if a mutual overflow is agreed at a reasonable price. This sounds much more reasonable than trying to compete on quality in the same footprint, as the biggest contributor to quality is the performance of the underlying carrier, and that is something all CDNs in the area can equally get. Mutual overflow means that, under some conditions evaluated in real time, a URL normally served by CDN A will be served by CDN B in the same footprint, and the other way round. This mutual overflow can be thought of as ‘hidden balancing’ between CDNs, as the mechanism is transparent to the CDN clients.

2.2) Balanced CDNs in the same footprint (Explicit balancing)

Two or more CDNs may go to market together through a CDN balancer. In fact many balancers are now in place, built by CDN clients or by third parties. The business that comes out of the balancer rests on the fact that over a big area (multi-country) and through a long time (up to a year) the performance of any CDN is going to vary unexpectedly due to these factors:

-unpredictable behavior of demand over the own URL portfolio

-unknown evolution of the own client portfolio

-unknown instant performance of underlying transport networks (carriers)

A balancer will take a request and route it to the best-suited CDN in real time. As opposed to ‘mutual overflow’ (hidden balancing), this can be called ‘explicit balancing’, as the mechanism is now visible to CDN clients. The reasoning for ‘best CDN’ will be complex, based on the real-time status of the involved CDNs, but also on ‘fairness for business’ of all involved CDNs in case the balancer is controlled by all of them. (In case the balancer is the property of a third party, fairness for all involved CDNs is not guaranteed.)

Many CDN clients feel better when they know that their portfolio has more than one CDN ready for delivery. In that case the best option for the CDNs is to work on their mutual balancing. If a third party balances them the results will not be as good, and some money will go to the third party for joining the CDNs. It is better to identify CDNs that may complement our CDN and build together a balancer that can be sold jointly and directly to clients.
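The ‘best CDN’ reasoning can be sketched as a scoring function over real-time status. Everything below is illustrative: the CDN names, the status fields and the 0.7/0.3 weights are assumptions, and a real balancer would add the business-fairness policies discussed above.

```python
# Sketch of an explicit balancer: route each request to the best-suited CDN.
def pick_cdn(url: str, footprint: str, cdns: list[dict]) -> str:
    # Only CDNs that hold the URL in their portfolio AND serve the footprint.
    candidates = [c for c in cdns
                  if url in c["portfolio"] and footprint in c["footprints"]]
    if not candidates:
        raise LookupError("no CDN can serve this URL in this footprint")
    # Lower latency and lower load are better; weights are illustrative.
    def score(c):
        return 0.7 * c["latency_ms"] + 0.3 * (c["load"] * 100)
    return min(candidates, key=score)["name"]

cdns = [
    {"name": "CDN-A", "portfolio": {"/live/ch1"}, "footprints": {"EU"},
     "latency_ms": 40, "load": 0.50},
    {"name": "CDN-B", "portfolio": {"/live/ch1"}, "footprints": {"EU", "US"},
     "latency_ms": 25, "load": 0.80},
]
print(pick_cdn("/live/ch1", "EU", cdns))   # CDN-B (lower combined score)
```

If the balancer is run jointly by the CDNs, the score function is exactly where the agreed ‘fairness for business’ rules would be encoded.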


3) Balance costs vs. income: increase occupation

Planning cost in a CDN is not straightforward. Agreements with carriers cost money (CDNs have to pay for traffic and sometimes for dedicated links/ports or housing). The cost of traffic is directly linked to income, but the costs of ports, links and housing are fixed costs, not related to the amount of activity or the success of the service. Machinery for delivery (the edge) costs money, but maintenance of that machinery, installation and operation, may be the most costly part of a CDN.

If a CDN is not able to maintain high occupation, the fixed costs will make the business not worth the effort; thus it is a good idea to offer capacity to other CDNs, either as overflow, through a balancer, or as traffic termination. The real-time measurement of available capacity in our CDN may be an input to the balancer/overflow/delegation.





High-level CDN-Interconn Service goals:

.Temporary Delegation of some part of portfolio (Impulsive Overflow)

.Long Term Delegation of some part of portfolio (delegation or Traffic Termination)

.Mutual Overflow (Hidden Balancing)

.Explicit balancing


Requirements to implement Long Term Delegation one-way (receiving) in CDN ‘X’:

  1. ‘X’ must have the ability to receive a delegation of some URL set from another CDN, intended ONLY for a given footprint.
  2. Metadata must be given to the receiving CDN (‘X’) identifying the account in ‘X’ that is the owner of the URL set prior to handling any transaction on behalf of this account. (Delegation Metadata).
  3. Metadata must be given to the receiving CDN (‘X’) containing any configuration data needed to perform the transactions on behalf of the donating CDN. Examples of transactions are: geo-targeting, IP blacklisting, geo-blocking, security token verification, etc…. (These metadata may apply for the whole account, and in that case they are part of Delegation Metadata or they may apply to a URL or to a URL set, in that case we call them ‘URL-service Metadata’.)
  4. Creation of the client account in ‘X’ (to handle delegated transactions) could be done ‘on the fly’ on receiving the URLs + metadata (client auto-provisioning) or could be done in advance by an admin process in ‘X’ (client pre-provisioning).
  5. Long Term Delegation must be ‘actionable’ immediately (for instance at the request from ‘X’) and also it must be possible to ‘schedule’ activation/termination planned ahead by both CDNs administrators.
  6. Long Term Delegation must start to be effective ‘fast’, could have a defined termination date and must stop ‘fast’ either immediately (by admin action) or at the scheduled date. Here, in the context of ‘X’, ‘fast’ means as fast as it is convenient (SLA) for the donating CDN. (Usually this ‘fast’ will be in the range of minutes.)
  7. Delegation must carry with it a feedback channel so the donating CDN (the one that ‘gives URLs’) regularly receives details of the delivery/transactions performed by the receiving CDN (the one that ‘takes URLs’). This feedback as the very minimum must contain the full technical records generated at the edge of receiving CDN (this is commonplace in CDN business).
  8. It is desirable that the receiving CDN (‘X’ ) builds ‘Analytics’ specific to delegated traffic, thus offering info about the extra business that comes to it through compensation. In absence of specific arrangements the Analytics and Mediation (Billing) Services in ‘X’ will create graphs and reports of the delegated traffic as for any other account so delegated traffic is not distinguishable ‘per se’. For this reason it is desirable to mark delegation accounts so we can analyze separately traffic due to delegations.
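As a sketch, the Delegation Metadata and URL-service Metadata of requirements 2-3 might look like the structure below. All field names are hypothetical; no standard format is implied, this is only one way to carry the information the requirements call for.

```python
# Hypothetical shape of the metadata handed to receiving CDN 'X' with a
# long-term delegation. Field names are illustrative, not a standard.
delegation = {
    "donating_cdn": "CDN-B",
    "account": "acme-video",             # account in 'X' owning the URL set
    "footprint": ["BR", "AR", "CL"],     # delegation valid ONLY here
    "schedule": {"start": "2014-07-01T00:00Z", "end": None},
    "feedback": {"format": "edge-logs", "interval_hours": 24},
    "urls": [
        {"pattern": "http://cdn.example.com/vod/*",
         # URL-service metadata: what 'X' must enforce per transaction
         "service": {"geo_blocking": ["US"], "token_auth": True}},
    ],
}

def footprint_allowed(meta: dict, country: str) -> bool:
    """'X' must serve delegated URLs ONLY inside the agreed footprint."""
    return country in meta["footprint"]

print(footprint_allowed(delegation, "BR"))   # True
print(footprint_allowed(delegation, "US"))   # False
```

Whether such a record is pushed on the fly (auto-provisioning) or loaded in advance (pre-provisioning) is exactly the choice described in requirement 4.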


Requirements to implement unrestricted (two-way) Long term Delegation:

  1. Feedback data could transparently update any Analytics & Mediation (billing) service that the donating CDN may have. Records of deliveries/transactions that have been delegated must appear mixed and added with all the regular records of the donating CDN.
  2. Records of deliveries/transactions that have been delegated could be separated from all the regular records of the donating CDN (in a view different from the mixed view), as an additional action that tries to give more information to the business. This information serves to plan capacity increases.
  3. A CDN admin could have the ability to select a subset of the own URL portfolio and delegate it to another CDN ONLY for a given footprint. (Implementation of delegation ‘the other way round’: not receiving but exporting URLs.)


Requirements to implement Temporary Delegation:

  1. Temporary delegation (impulsive overflow) must be transparent to clients.
  2. Temporary delegation (impulsive overflow) must be transparent to the exploitation systems of the donating CDN. Consumption records must be accounted as ‘normal’ deliveries/transactions.
  3. Temporary delegation (impulsive overflow) must be ‘triggered’ by rules based on real time inspection of several variables.
  4. Variables that trigger temporary delegation (impulsive overflow) must be defined by the CDN business management.


Requirements to implement Mutual overflow:

  1. A CDN admin must have the ability to select a subset of the own URL portfolio and add it to a mutual balancing set with another CDN ONLY for a given footprint. All URLs in the mutual balancing set must be served to end users by both CDNs. Technical devices and rules used to balance must be set up by both CDNs.

Requirements to implement Explicit Balancing:

  1. A CDN must have a tool for clients: an explicit balancer. The balancer must act as the highest level request router for all involved CDNs, affecting subsets of each CDN portfolio, and applying to a specific footprint for each URL in the sum of portfolios.
  2. The explicit balancer must have public (known to anyone including clients) balancing rules, public input variables and public policies.
  3. The explicit balancer must account for every transaction/delivery and offer Analytics that allow analyzing the behavior of the balancer, fine-tuning balancing rules and giving feedback for pricing models of the balanced CDN products.

Worldwide live distribution through cascaded splitting

(Download as pdf: worldwide live splitting cascading or not)


In this short paper I want to analyze the impact of different architectures in worldwide distribution of live streams over internet: cascaded splitting vs. non-cascaded multi region local splitting.

In a worldwide CDN overlaid on top of the Internet we will need to pay attention to regional areas (REGs), each equipped with a single entry point that is a First Level Splitter (FLS) gathering many input channels from customers. In the same REG we have many endpoints/streamers, or Second Level Splitters (SLS), that split each input channel into many copies on demand from consumers. The SLS are classical splitters, the classical caches at the border of any current CDN; in brief: endpoints. They talk to consumers directly. Each endpoint must obtain its live input from the FLS output. FLS + endpoints act as a hierarchy of splitters inside the REG.

We have two very different scenarios:

Non cascading entry-points: every endpoint in a REG can connect to local channels (local FLS) and can connect to far (foreign) channels (foreign FLSs).  The FLS in the REG receives as input ONLY local channels and gives as output all endpoint connections: local and foreign.

Cascading entry-points: endpoints in a REG connect ONLY to the FLS in that REG. The FLS in a REG receives as input local channels and also foreign channels from other REGs. The FLS gives as output ONLY local connections to local endpoints.

We want to compare these two scenarios in terms of:

Quality of connections: the shorter the connection, the lower the latency, the fewer the losses and the higher the usable throughput. We want to keep connections local (both ends in the same REG) as much as possible.

Capacity of worldwide CDN: We do not want to open more connections than strictly needed.

So we need to count connections in both scenarios and we need to evaluate the average length of connections in both scenarios.

There are also other properties of connections that matter in our study:

-input connections cost MORE to a splitter than output connections. This applies both to FLS and endpoint.

(Note: Intuitively every input connection is a ‘write’ and it is different from any other write. It requires a separate write buffer and separate quota of the transmission capacity. Output connections on the contrary are ‘reads’. Two reads about the same input have separate quota of transmission capacity but they share the same read buffer. This buffer can even be exactly the same write buffer of the input. So creating an input connection is more expensive than creating an output connection. It consumes much more memory. Both input and the output from that input consume exactly the same transmission capacity quota.)
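The write/read buffer asymmetry in the note above can be illustrated with a toy splitter model. The 4 MiB buffer size is an arbitrary assumption; the point is only that every output reading a channel shares the input’s single write buffer, so memory grows with inputs, not with outputs.

```python
# Toy model: inputs are 'writes' (one buffer each); outputs are 'reads'
# that share the input's buffer. Buffer size is illustrative.
class Splitter:
    BUF_BYTES = 4 * 1024 * 1024   # one write buffer per input channel

    def __init__(self):
        self.inputs = {}    # channel -> its shared write buffer
        self.outputs = []   # (channel, consumer): no buffer of their own

    def add_input(self, channel: str):
        self.inputs[channel] = bytearray(self.BUF_BYTES)

    def add_output(self, channel: str, consumer: str):
        # An output only reads the channel's existing buffer.
        self.outputs.append((channel, consumer))

    def buffer_memory(self) -> int:
        return len(self.inputs) * self.BUF_BYTES

s = Splitter()
s.add_input("ch1")
for i in range(1000):            # 1000 consumers of the same channel
    s.add_output("ch1", f"user{i}")
print(s.buffer_memory())         # still one buffer: 4 MiB
```

Transmission capacity, on the other hand, is consumed per connection in both directions, which is why only memory (not bandwidth) is saved by sharing.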

Statements and nomenclature:

O             = Number of REGs

S_i            = Number of source channels in REG ‘i’

E_i            = Number of endpoints in REG ‘i’

p_ihk          = Probability that channel ‘k’ (1<=k<=S_h) originated in REG ‘h’ is requested by ANY consumer in REG ‘i’.   (Note that it is possible that i=h, and it is also possible that i≠h.)

pe_ijhk        = Probability that channel ‘k’ (1<=k<=S_h) originated in REG ‘h’ is requested by ANY consumer in endpoint ‘j’ (1<=j<=E_i) in REG ‘i’.   (Note that it is possible that i=h, and it is also possible that i≠h.)

d_ik           = distance from REG ‘i’ to REG ‘k’.   (Note that, as usual, d_ii=0, d_ik=d_ki.)




We try to compare two architectural options for a problem in worldwide distribution of channels. This problem would not exist if we just had local channels that do not need to be distributed far from their production site. We would not have this problem either if there were a single source and a worldwide distribution tree for that source; in that case we would build a static tree for that source that could be optimal. What I’m dealing with here are the choices we have when there are many live sources in many regions and there is cross-consumption of live distribution from region to region. The questions are:

-which is the best splitting architecture in terms of quality and CDN effort?

-which are the alternatives?

-which is the quantitative difference between them?

I’ve presented two alternatives that exist in the real world:

1) Non-cascading distributed splitting: two-level splitting, first at the entry point, second at the edge (endpoints). When a source is foreign, the local endpoints must establish long-reach connections.

2) Cascading distributed splitting: two-level splitting, first at the entry point, second at the edge, AND entry point to entry point. When a source is foreign, the local endpoints are not allowed to connect directly to it. Instead the local entry point connects to the foreign source and turns it local, making it available to local endpoints as if it had been a local channel.

In the following pages you can see a diagram of several (5) REGs with cross traffic and the analysis of CDN effort and resulting quality in terms of the nomenclature introduced in the above paragraph.

Non-Cascading entrypoints



Cascading entrypoints



Practical calculations:

It is clear to anyone that the most difficult piece of information to obtain is the list of probabilities p_ihk and pe_ijhk.

No matter the complexity of our real-world demand, there are some convenient simplifications that we can make. These simplifications come from our observation of real-world behavior:

Simplification 1:

p_ihk is always ‘0’ or ‘1’. A channel is either ‘exported for sure’ from REG ‘h’ to REG ‘i’ or ‘completely prohibited’.

(Note: there are so many consumers in a REG that once a channel is announced there it is really easy to find someone interested in it. A single consumer is enough. If one out of 500,000 wants the channel in REG ‘i’, then the channel MUST be exported to REG ‘i’. The probability p_ihk here models the behavior of isolated people, and so we observe that once the population is high enough p_ihk approximates 1. If channel ‘k’ in REG ‘h’ is geo-locked to its original REG, it cannot be requested by any REG ‘i’, and then p_ihk is 0.)

Simplification 2:

p_iik is always ‘1’. Following the same reasoning as above, once a channel is announced in a REG it is always consumed in that REG.

Simplification 3:

pe_ijhk has approximately the same value for all ‘j’ in REG ‘i’. That means every endpoint ‘j’ in REG ‘i’ receives an equal load of requests for channel ‘k’ from REG ‘h’. Why is this? Just because endpoints receive requests that are balanced to them by a regional Request Router (usually a supercharged local DNS). The aim of the Request Router is to distribute requests evenly to the streamers working for the REG.

P.S.: this behavior depends on the implementation of the Request Router. A reasonable strategy is based on specializing endpoints in some content (through the use of URL hashes, for instance), and thus a given channel will always be routed through the same endpoint. If correctly implemented this would mean that for a given pair (h,k), pe_ijhk is ‘0’ for all values of ‘j’ except one in REG ‘i’, and for that value of ‘j’ pe_ijhk is ‘1’.
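Under Simplifications 1 and 2, counting long (inter-REG) connections in the two architectures becomes straightforward. The sketch below assumes illustrative sizes (5 REGs, 20 endpoints and 10 channels per REG, full cross-consumption) and, unlike the P.S. above, assumes no endpoint specialization, so every endpoint in a REG carries every channel consumed there.

```python
# Count long (inter-REG) connections under Simplifications 1-2.
# All sizes are illustrative, not measurements.
O = 5                              # number of REGs
E = [20] * O                       # endpoints per REG
S = [10] * O                       # source channels per REG
p = [[1] * O for _ in range(O)]    # p[i][h] = 1: REG i imports all of REG h

def long_connections_non_cascading():
    # Each endpoint in REG i opens one long connection per foreign channel.
    return sum(E[i] * S[h]
               for i in range(O) for h in range(O)
               if i != h and p[i][h])

def long_connections_cascading():
    # Only FLS_i connects to foreign FLS_h: one connection per foreign channel.
    return sum(S[h]
               for i in range(O) for h in range(O)
               if i != h and p[i][h])

print(long_connections_non_cascading())   # 5*4*20*10 = 4000
print(long_connections_cascading())       # 5*4*10    = 200
```

The gap (a factor of E_i) is exactly what cascading buys: long, expensive connections collapse into one per foreign channel per REG, at the price of an extra splitting hop.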

Current content technology: appropriate or not up to the task?

(Download this article in pdf format: Content technology apropriate or not)


‘Content’ is a piece of information. We may add ‘intended for humans’. That may be the most defining property of content compared with other pieces of information. In a world increasingly mediated by machines it would not be appropriate to say that machines ‘do not consume content’; they do, on our behalf. The key idea is that any machine that processes content does it immediately on behalf of a human consumer, so it processes content human-like. STBs, browser players, standalone video players, connected TVs… may be deemed ‘consumers’ of content with a personality of their own (more on this through the article), but they are acting over information that was intended for humans. They are mimicking human information consumption models, as opposed to M2M (machine-to-machine) information processing, which happens in a way totally unnatural to humans.

What is the state of the art in ‘content technology’? By ‘content technology’ I mean: ways to represent information natural to humans, devices designed to take information from humans and provide information to humans. Are these technologies adequate today? How far are we from reaching perceptual limits? Are these technologies expensive or cheap? Do we have significantly better ‘content technology’ than we used to have, or not?


The most natural channel to feed info to humans is the combination of audio + video. Our viewing and hearing capacities work together extremely well. We humans own some other sensory equipment: a fair amount of surface sensitive to pressure and temperature (aka skin), an entry-level chemical lab for solids and liquids (the tongue) and a definitely sub-par gas analyser (the nose). As suggested by these descriptions, we cannot be really proud of our perception of chemicals, but we do fairly well perceiving and interpreting light, sound and pressure. This is not due to the quality of our sensors, which are surpassed easily by almost every creature out there, but to the processing power connected to those sensors. We see, hear and feel through our brain. We humans have devoted more brain real estate to image and sound processing than other animals.

It is no coincidence, thus, that technology intended to handle human information has been focused on audiovisual. Relatively recently a branch of technology took off, ‘Haptics’, that may give our pressure sensors some fun (multi-touch surfaces, force-feedback 3D devices, gesture capture…), but Haptics is still underdeveloped if we compare it to audiovisual technologies.

So we have created our information technology around audio-visual. Let’s see where we are.


We are able to perceive light. We develop brain interpretations linked to properties of light: intensity, wavelength (colour is our interpretation of light wavelength + composition of wavelengths). There are clear limits to our perception. We perceive only some of the frequencies. We cannot ‘see’ below red frequencies (infrared) or above violet (ultraviolet). We have an array of light sensors of different types, ‘cones’ and ‘rods’, and a focus system (an organic ‘lens’). Equipped with this hardware we are able to sample the light field of the outer world. I have chosen to say ‘sample’ because there is still no consensus about how we process data from time-varying light fields: does our brain work in entire ‘frames’? Does it keep a ‘static image’ memory? Does it use a hierarchical scene representation? Does it always sample at the same pace (a fixed rate)? We know very little about how we process images. Nevertheless some of the physical limits of our viewing system come from the ‘sensory hardware’, not from the processor, not from the brain.

Lens + light sensor:  aperture, angular resolution, viewing distance

If you browse any good biomechanics literature (e.g. Richard Dawkins, Climbing Mount Improbable, or http://micro.magnet.fsu.edu/primer/lightandcolor/humanvisionintro.html), you will find that the human eye’s lens has on average an aperture of 10 deg. We form ‘stereoscopic’ images by overlapping information from two eyes. We have roughly 125-130 million light receptors per eye; most of them are rods, unable to detect colour, and less than 5% (about 6-7 million) are cones, sensitive to colour. They are not laid out in a comfortable square grid, so our field of view may be represented as delimited by something like an oval with its major axis horizontal. The central part of this oval is another oval with major axis vertical where the two vision cones overlap; this small spot is our stereoscopic high-resolution view spot, though we still receive visual info from the rest of the ‘whole field of view’, which may be 90º vertical and 120º horizontal.

Optical resolution can be measured in line pairs per degree (LPD) or cycles per degree (CPD). Humans are able to resolve 0.6 LP per arc minute (1/60 deg). We can tell two points (each lying on a different line of a line pair of contrasting colour) are different when they are more than 0.3 arc minute apart (see for instance www.clarkvision.com or www.normankoren.com/Tutorials/MTF.html). Visual acuity defined as 1/a is 1.7, where ‘a’ is the abovementioned 0.6 LP per arc minute. This is called 20/20 vision. You, standing at 20 feet from the target, see the same detail as any normal viewer standing at 20 feet. If you have better acuity than average you can see the same detail standing farther, for instance 22/20. If you are subpar you need to get closer, for instance 18/20. As a reference, a falcon’s eye can be as good as 20/2. Cells in the fovea (the central region of the view field, spanning 1.4 deg of the total 10 deg) are better connected to the brain (1 cone, 1 nerve) and cones are more tightly packed there, so spatial resolution in the fovea is higher than in peripheral view. The ratio of connections in peripheral view can drop as much as 1:20, which means that 20 light receptors sum up their signals into a single signal fed to a single nerve.

I know that this way of measuring the resolving power of our eyes is cumbersome, but it happens to be the only right method! Let’s do some practical math. Say we read a book or we read on a tablet. Normal reading distance may be 18” = 45.7 cm. Our eyesight cone at that distance is just 8 cm across at its base (the spot height). We see 0.6 LP/arcmin * 600 arcmin = 360 LP in 8 cm vertical, so we can tell 720 ‘points’ apart in a vertical high-contrast alternating-colour strip of points. This is 720 points / 8 cm = 90 points/cm, or 228 points per inch (ppi). You may have noticed that you cannot tell apart two adjacent dots printed by modern laser printers (300-600 dpi) at normal reading distance. It would not be fair to say that 250 dpi suffices for print to be read comfortably at normal distance, as different printing technologies may need more than one ink drop (a dot) to represent a pixel. This is the reason state-of-the-art printing moves between 300 and 600 dpi, and it does not make much sense to go beyond. (P.S.: You may notice there are printers that offer well above that, 1200-1400 dpi, but most of them confuse ‘ink-drop dots’ with pixels. They cannot represent a single pixel with a single drop or dot. There are also scanners boasting as much as 4000 dpi… but this is an entirely different world, as it may make sense to scan a surface at much closer than viewing distance so we can correct the scanner’s optical defects to produce a right image for normal viewing distance.)
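The reading-distance arithmetic above can be checked in a few lines of Python (just reproducing the numbers in the text, no new data):

```python
# Check of the reading-distance math: 0.6 line pairs per arc minute
# across a 10-degree viewing cone, at 18 inches.
import math

ACUITY_LP_PER_ARCMIN = 0.6   # 20/20 vision, as in the text
CONE_DEG = 10.0              # aperture of the eyesight cone
distance_in = 18.0           # normal reading distance

# Base of the viewing cone at reading distance (a diameter, not a radius)
width_in = 2 * distance_in * math.tan(math.radians(CONE_DEG / 2))
width_cm = width_in * 2.54   # ~8 cm, the 'spot height' above

line_pairs = ACUITY_LP_PER_ARCMIN * CONE_DEG * 60   # 360 LP across the cone
points = 2 * line_pairs                             # 720 distinguishable points
ppi = points / width_in                             # ~228 points per inch

print(round(width_cm, 1), round(ppi))
```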

Assuming that display/print technology is not too bad translating pixels to dots, we can say that a good surface for reading at 18” must be capable of showing no less than 250 dpi or ppi so you can take full advantage of your eyes. By these standards ‘regular’ computer displays are not up to the task, as they have 75-100 ppi and typically are viewed at 20”, so they would require over 200 ppi. The iPad Retina seems a more appropriate display, as it has been designed with these magic numbers in mind. Retina devices have pixel densities from 326 ppi (phone) through 264 ppi (tablet) down to 220 ppi (monitor). As the viewing distance for a phone is less than 10”, 326 ppi fits in the same acuity range as 264 ppi for 18”, and so does 220 ppi for 20”. Other display manufacturers have followed on the trail of Apple: Amazon Fire HD devices have matched and then surpassed Retina displays: Kindle Fire HD 8.9” (254 ppi), Kindle Fire HDX 7” (323 ppi), Kindle Fire HDX 8.9” (339 ppi). Especially note that the HDX devices are tablets while they use pixel densities that Apple Retina reserves for phones… so these tablets are designed to fit eyes much better than standard. Newer phones like the HTC One (468 ppi), Huawei Ascend D2 (443 ppi), LG Nexus 5 (445 ppi), Samsung Galaxy S4 (443 ppi)… go the same way. (http://en.wikipedia.org/wiki/List_of_displays_by_pixel_density)

What about a Full HD TV or a 4K TV? We can calculate the optimal viewing distance for perfect eyesight. Let’s do the math for a 56” Full HD TV and a 56” 4K TV assuming a 16:9 aspect ratio and square pixels. The right triangle is 16:9:18.36, which is homothetic to ~48.8:27.45:56, so in the vertical 27.45” we have 1080 points (Full HD), which is 39.34 ppi, or double that for 4K. Let’s round to 40 ppi for a 56” Full HD and 80 ppi for a 56” 4K TV. The optimal viewing distance corresponds to 360 points per 5 deg; matching that to 40 ppi: 360/40 = 9 inches viewed at 5 deg, or distance in inches = 9/tan(5 deg) = 103 inches = 2.6 m for Full HD, and 1.3 m for 4K. So if you have a nice 56” Full HD TV you will enjoy it best by sitting closer than 2.6 m. If you are lucky enough to have a 56” 4K TV you can sit as close as 1.3 m and enjoy the max of your eyes’ resolving power.
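The TV numbers above can be reproduced the same way:

```python
# Reproducing the TV math: ppi of a 56" 16:9 panel and the farthest viewing
# distance at which 20/20 eyes still resolve every pixel.
import math

def panel_ppi(diagonal_in: float, vertical_px: int) -> float:
    height_in = diagonal_in * 9 / math.hypot(16, 9)   # 16:9, square pixels
    return vertical_px / height_in

def optimal_distance_m(ppi: float) -> float:
    # 0.6 LP/arcmin -> 72 points/deg -> 360 points per 5 deg; the distance
    # is set so that 360/ppi inches subtend 5 degrees, as in the text.
    inches = (360 / ppi) / math.tan(math.radians(5))
    return inches * 0.0254

ppi_fhd = panel_ppi(56, 1080)             # ~39.3 ppi, rounded to 40 above
print(round(ppi_fhd, 2))
print(round(optimal_distance_m(40), 1))   # Full HD: ~2.6 m
print(round(optimal_distance_m(80), 1))   # 4K: ~1.3 m
```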

Colour perception.

Humans have three types of cones (the colour receptors), each sensitive to a different range of light wavelengths: red, green, blue. Colour is NOT an objective property of light. Colour is an interpretation of two physical phenomena: 1) the wavelength of radiation, 2) the composition of ‘pure tones’, or single-wavelength radiations. A healthy eye can distinguish more than 16 M different shades of colour (some lab experiments say even as many as 50 M). As we have commented, cones are scarce compared to rods, so we do not have the same resolving power for colour as we have for mere light presence. Pure ‘tones’ range from violet to red. They are called ‘spectral colours’. Non-spectral colours must be produced by compositing any number of pure tones. For example white, grey, pink… need to be obtained as compositions.

Colour is subjective. Within a range, different people will see slightly different shades of colour when presented with exactly the same light (the same exact composition of wavelengths). This is due to the way cones react to light. Cones are pigmented, and thus when receiving photons of a certain wavelength range their pigment reacts, triggering a current to the nerve. But cone pigment ‘quality’ varies from human to human, so cones may trigger a different signal for the same stimulus and, on the contrary, may trigger the same signal for a slightly different stimulus. The colour we see is an interpretation of light. It happens that different lighting conditions may render exactly the same electrical response. This means that the composition of wavelengths needed to produce some colour output is not unique; there are a number of input combinations that render the same output (metamers).

Light intensity range.

To make things even more difficult, the colour response (to light) function of our eyes depends on the intensity of radiation. Cones may respond differently to the same wavelength when the intensity of light is much higher or much lower (bear in mind that intensity relates to the energy carried by the photons; it is crudely the number of photons reaching the cone per time unit, and has nothing to do with the individual energy of each photon, which relates solely to its wavelength). Our eyes have a dynamic sensitivity range that is truly amazing: it covers 10 decades. We can discern shapes in low light receiving as few as 100-150 photons, and we can still see all the way up to 10 orders of magnitude more light! Of course we do not perceive colour information equally well all across the range. When we are in the lower 4 decades of the range we need to sum up all possible receptors to trigger a decent signal, so we ‘see’ mostly through rods (scotopic vision), and through many peripheral rods that are less individually connected to nerves: many of them share a nerve, losing spatial resolution but gaining sensitivity, as very low photon counts per receptor may excite the nerve when summed over several receptors. When we are in the 6 upper decades of the intensity range we can perceive colour (photopic vision), although in extreme intensity we just perceive washed-out or white colours. Some authors and labs have checked the human intensity range for a single scene (see www.clarkvision.com). This range is different from the whole range, as the eye is capable of 10 decades but not in the same scene, only through a few minutes of adaptation to low/high light. For a single night scene with very low light (gazing at stars, for instance) the range is estimated to be 6 decades, 1:10^6; for daylight the range is estimated to be 4 decades, 1:10^4.

What about current display devices? Are they good enough to represent colour in front of our eyes?

State-of-the-art displays consist of a grid of picture elements (pixels), each formed by three light-emitting devices selected to be pure tones: R, G, B. The amount of light emitted by each device can be controlled independently by polarizing a liquid crystal with a variable voltage, allowing more or less light to go through. The polarization range is discretized to N steps by feeding a digital signal through a DAC to the LC. The superposition of light from three very closely placed emitters produces a mix of wavelengths, a colour shade, concentrated in one pixel.

Today most LCD panels are 8 bpc (bits per channel) and only the most expensive are 10 bpc. That means each pure tone (R, G, B) in each pixel can be modulated through 256 steps (8 bits per channel), so roughly 2^24 tones are possible (16.7 M). The best panels support 2^30 tones (1073.7 M). VGA and DVI interfaces only provide 8 bpc input: RGB-24. To drive a 10 bpc panel, a DisplayPort interface or an HDMI 1.3 interface with DeepColour (RGB-30) enabled is needed. Video sources for these panels may be PCs with high-end video cards that support true 30-bit colour, or high-end Blu-ray players with DeepColour (and a Blu-ray title encoded in 30-bit colour, of course!). As we are capable of distinguishing over 16 M shades you may think that 8 bpc could be barely enough, but here comes the tricky part: who told you that the 16 M shades of an 8 bpc panel are ‘precisely those’ 16 M different shades that your eye can see? They are not, for most cheap panels. Even when a colorimeter may tell us that the panel is producing 16 M different shades, our eyes have a colour transfer function that must be matched by any source of light that pretends to have 16 M recognizable shades. If not, many of the shades produced by the source will probably lie in ‘indistinguishable places’ of our ‘colour transfer function’, rendering effectively much fewer than 16 M distinguishable shades. Professional monitors and very high-end TVs have ‘corrected gamma’ output. This means they do not produce a linear amount of polarization for each channel (R, G and B) in the range 0-255. Instead they have pre-computed tables with the right amount of R, G and B that our eyes can see all through the visible gamma. They use internal processing to ‘bias’ a standard RGB-24 signal to render it into ‘the gamma of shades your eyes can see’, with a preference for some shades and some disrespect for others, so they can render RGB-24 input into truly 16 M distinguishable shades.
To achieve this goal these devices store the mapping functions (gamma correction curves) in a LUT (Look-Up Table), a table that produces the right voltage to excite the LCD for each RGB input. That voltage may have finer steps in the parts of colour space where human eyes have more 'colour density'. For this reason an 8-bit DAC is not enough; more often the 8-bit signal is fed through the LUT to a 10- or 12-bit DAC. As you can see, more than 8 bits of internal calculation are needed to handle 8-bit input, so many 8 bpc displays are described as 8 bpc panels with a 10-bit LUT or 12-bit LUT. Today the best displays are 10 bpc panels with a 14-bit LUT or 16-bit LUT. It is also worth mentioning a technique called FRC (Frame Rate Control). Some manufacturers claim they can show over 16.7 M colours while using true 8-bit panels: they double the frame rate and excite the same pixel with two alternating voltages, faking a colour by mixing two colours through modulation in time. This technique seems to work perceptually, but it is always good to know whether your panel is true 10-bit or 8-bit + FRC. Once available, this technique has also been used to make regular monitors cheaper by going all the way down to 6-bit + FRC.
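The LUT idea can be sketched in a few lines of Python. This is a minimal sketch assuming an idealized pure power-law gamma of 2.2; real panels store measured correction curves, not a single exponent, and the 8-in/12-out bit widths are just the example sizes mentioned above:

```python
# Sketch: mapping 8-bit input codes through a gamma LUT to a 12-bit DAC.
# The 2.2 exponent is a common approximation, not the spec of any panel.

def build_gamma_lut(in_bits=8, dac_bits=12, gamma=2.2):
    in_levels = 2 ** in_bits
    dac_max = 2 ** dac_bits - 1
    # For each input code, compute the DAC value whose light output
    # follows the intended power-law response instead of a linear ramp.
    return [round(((code / (in_levels - 1)) ** gamma) * dac_max)
            for code in range(in_levels)]

lut = build_gamma_lut()
# Mid-grey input (code 128) maps to well under half the DAC range,
# because light output grows faster than the input code.
print(lut[0], lut[128], lut[255])   # 0 899 4095
```

The extra DAC bits are what make this possible: near black, consecutive 8-bit input codes would collapse onto the same output level if the DAC also had only 8 bits.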

We can conclude that today's high-end monitors that properly implement gamma correction are able to show us the maximum range of colours we are capable of seeing (normal people see somewhat above 16 M shades of colour, maybe even 50 M) when using LCD panels that are true 8 bpc or better (true 10 bpc). Unfortunately, mainstream computer monitors and TVs do NOT have proper gamma correction, and those that have the feature are rarely calibrated correctly (especially TVs), so digital colour is not yet where it should be in our lives. Most cheap monitors take 8-bit-per-channel colour and feed it straight to an 8-bit DAC to produce whatever range of colour that ends up being, resulting in far fewer than 16 M viewable shades. Many cheap computer screens and TVs are even 6-bit + FRC. With a colorimeter you can measure the gamut that your device produces and match it to some locus in a standard colour space. A standard colour space is a bi-dimensional representation of all the colour shades the eye can see. This representation can be built only for some fixed intensity level, which means that for different intensities the corresponding bi-dimensional representation will differ. You can imagine a 'cone' with its vertex in the 'intensity 0' plane. Slices of this cone (one per fixed intensity value) lie in parallel planes, each occupying a 'locus' (a connected bi-dimensional plot) that gets bigger and richer as intensity increases, until we reach optimal intensity; above that it starts to get washed out. In most representations the locus of viewable shades takes the shape of a deformed triangle with pure Red, Green and Blue at the three vertices. Different colour spaces differ in shape, but more or less all of them look like a triangle deformed into a 'horseshoe'. You may know Adobe RGB and sRGB.
All these representations are subsets of the viewable locus, and they tried to standardize, respectively, what a printing device and a monitor should do (at the time they were created). Today's professional monitors can match 99.x% of Adobe RGB, which is wider than sRGB. Most TV sets and monitors can only produce a locus much smaller than sRGB.

Refresh rates, interlacing and time response

How do we see moving objects? Does our brain create a sequence of frames? Does our brain even have the notion of a still frame? How do technology-produced signals compare to the natural world in front of our eyes? Are we close to fooling ourselves with fake windows replacing the real world?

It turns out that we can only see moving objects. Yes, that is true. You may be disturbed by this statement, but no matter how solidly static an object is, and how static you think you are, when you stare at it you are constantly moving your eyes. If your eyes do not move and the world does not move, your brain simply sees nothing. The image processor in our brain likes movement and searches for it continually. If there is no apparent movement in the world, our eyes need to move so the flow of information can continue.

We just do not know if there is something like a frame memory in our brain, but it seems that our eyes continually scan space, stopping at the places that show more movement. To understand a 'still frame' (reasonably still; let's say that for humans something is still if it does not change over 10 to 20 ms) our eyes need to scan it many times looking for movement/features. If there is no movement, the eyes will focus on edges and high-contrast spots. This process gives our brain a collection of 'high quality patches', the places where the high-resolution channel that is the fovea has been aimed, selected by image characteristics (movement, edges, contrast). These patches may not cover our whole field of vision, so effectively we may not see things that our brain deems 'unimportant'. Our pretended 'frame' would look like a big black poster with some high-resolution photos stuck over it following strange patterns (edges, for instance), surrounded by many low-resolution photos around the HQ ones, and a lot of empty space (black, you may imagine).

It seems we do not scan whole frames. It seems we can live well with partial frames and still understand the world. This cognitive process is helped by involuntary eye movement and by voluntary gaze aiming. Our brain has 'image persistence'. We rely on partial frame persistence to cover a greater part of the visual field: old samples of the world last in our perceptual system and are added to fresh samples in a kind of time-based collage. Cinema and TV benefit from image persistence by encoding the moving world as a series of frames that are rapidly presented to the eye. As our brain only samples the world from time to time, it does not seem very unnatural to look at a 'fake world' that is not there all the time but only some part of the time. Of course the trick to fool our brain is: don't be slower than the brain.

Cinema uses 24 frames per second (24 fps), and this is clearly slower than our scanning system, so we need an additional trick to fake motion: overexposure of the film frame. To capture a movement spanning, let's say, one second over 24 film frames, we allow the film to blur with the fast movement by overexposing each frame, so the moving object impresses a trail on the film instead of a static image. If cinema were shown to us without this overexposure we would perceive jerky movement, as 24 fps is not up to our brain's capability to scan for movement. Most people are much more comfortable with properly exposed shots played at 100 fps. The use of overexposure defines 'cinema style' motion as opposed to 'video style' motion. People get used to cinema, and when a movie is shot not on film but on video with proper exposure, the 'motion feeling' is different; they say it is 'too lifelike, not cinema'.

Today we see a mixture of techniques to represent motion. In TV we have been using the PAL and NTSC standards, which captured 576 and 480 horizontal lines respectively to form frames in a tricky way. They capture half a frame (what they call a 'field') by sampling just the even lines, then just the odd lines, taking a field every 1/50 s in PAL and every 1/60 s in NTSC. This scheme produces 25 fps or 30 fps on average, but note that in fact it produces 50 or 60 fields per second. Due to the abovementioned image persistence, two fields seem to combine into a single image; but notice that two consecutive fields were never sampled at the same time, but shifted by 1/50 s or 1/60 s, so if displayed simultaneously they will not match. Edges will show 'combing' (you will see dents like in a comb). This is precisely what happens when an interlaced TV signal arrives at a progressive TV set and fields must be turned into frames. Of course there are de-combing filters built into modern TVs. I just want to point out that with 95% of TV sets out there being progressive monitors capable of showing 50/60 fps progressive, interlaced TV signals just do not make sense anymore... but we still 'enjoy' interlaced TV in most places of the world. Of course de-interlacing TV for progressive TV sets comes at a cost: the image filters degrade quality. De-combing filters blur the image, and the perceptual result is a loss of resolution. Maybe you do not realise how deeply interlaced imagery is embedded in our lives: most DVD titles have been captured and stored interlaced. This makes even less sense than broadcasting interlaced signals. DVD is a digital format, and you are very likely to play it on a digital TV set (a progressive LCD panel), so why bother interlacing the signal so that your DVD player or your TV or both will need to de-interlace it? It plainly does not make sense, and by the way it worsens image quality. Even some Blu-ray titles have been made from interlaced masters.
Here the nonsense gets extreme, but it happens anyway. We may forgive DVD producers, as when the DVD standard came out most TV sets were still interlaced (cathode ray tubes), but having 'modern' content shot in interlaced format today is plain heresy.
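The combing effect described above can be shown with a toy model. This is a sketch, not a real de-interlacer: each 'line' is reduced to a single number, the horizontal position of a moving edge, and 'weave' is the naive reconstruction that simply interleaves the two fields:

```python
# Sketch: why interlaced fields show 'combing'. A moving vertical edge
# is sampled twice, 1/50 s apart; weaving the fields back into one
# frame misaligns even and odd lines.

def weave(even_field, odd_field, height):
    """Interleave even/odd lines into one frame (no filtering)."""
    frame = [0] * height
    frame[0::2] = even_field
    frame[1::2] = odd_field
    return frame

# Edge at x=10 when the even lines are sampled; the object has moved
# to x=14 by the time the odd lines are sampled 1/50 s later.
even = [10, 10]   # lines 0, 2
odd = [14, 14]    # lines 1, 3
frame = weave(even, odd, 4)
print(frame)      # [10, 14, 10, 14] -> alternating 'comb teeth'
```

A de-combing filter would average adjacent lines to hide the teeth, which is exactly the blur and resolution loss mentioned above.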


The human hearing system

The human hearing system is made of two channels, each one acquiring information independently, both mapping information to a brain area, in the same way that two eyes combine information for stereoscopic vision.

The hearing sensory equipment is complex. It is made of external devices designed for directional wave capture (the ear, the inner duct and the tympanic membrane). We cannot aim our ears (we lack the muscles some animals have); we can just turn our head, which is a 'massive movement' subject to great inertia and thus slow, so our hearing attention must work all the time on surrounding sounds. There are internal mechanisms (the chain of tiny bones: the ossicles) and pressure transmission that shape the spectral response of human hearing, amplifying some frequencies more than others. At the innermost part of the hearing system there is the cochlea, a tubular, spiralling duct that is covered with sensitive 'hairs'. It is in this last stage that individual frequencies are identified. All the rest of the equipment, the middle parts and the ear, is just a sophisticated amplifier. As in the visual system, there are limitations that come from the sensory equipment, not the brain.

Sound: differential pressure inside a fluid produced as vibration.

What we call sound is a vibration of a fluid. As in any fluid (gas or liquid), there is an average pressure at every point in space. Our hearing system is able to detect pressure variations deviating from the average. We do not detect arbitrarily small variations (but almost!), and we will not detect a really strong one (at least we will not detect a sound, just pain and possibly damage to the hearing system). So there is a range of intensities: the human hearing system has an amazing range of 13 decades in intensity (sound intensity, the energy delivered per unit surface by the vibration, is proportional to the square of the differential pressure). The smallest perceivable pressure difference is 2×10^-5 Newton/m^2; the highest is 60 Newton/m^2. The intensity range is 10^-12 Watt/m^2 to 10 Watt/m^2. We do not detect isolated pulses of differential pressure; we need sustained vibration to excite our hearing system (there is a minimum excitation time of about 200 ms to 300 ms), and thus there is a minimum frequency and also a maximum. Roughly, we can hear from 20 Hz to 20 kHz. Aging severely reduces the upper limit. We are not equally sensitive to pressure (intensity) across the range. As with our eyes and colour perception, there is a transfer function, a spectral response in frequency space that is not flat.
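The figures above can be checked in a couple of lines. Note that intensity level uses 10·log10 while sound pressure level uses 20·log10, because intensity goes as the square of the pressure, so both limits describe the same perceptual range:

```python
import math

# The '13 decades' and the pressure limits above, expressed in decibels.
I0, I_max = 1e-12, 10.0    # W/m^2: hearing threshold and pain limit
p0, p_max = 2e-5, 60.0     # N/m^2 (pascals)

intensity_range_db = 10 * math.log10(I_max / I0)
pressure_range_db = 20 * math.log10(p_max / p0)

print(round(intensity_range_db))     # 130 -> the '13 decades'
print(round(pressure_range_db, 1))   # 129.5 -> essentially the same range
```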

Frequency resolution (pitch discrimination) and intensity resolution (loudness)

Our hearing system retrieves the following information from sound: frequency (pitch), intensity (loudness) and position (using two ears). We are able to detect from 20 Hz to 20 kHz, and we can tell about 1500 different pitches in that range. The separation of individually recognizable pitches is not the same across the range; there is a fairly variable transfer function. It is assumed that frequency resolution is 3.6 Hz in the octave going from 1 kHz to 2 kHz. This relates to the perception of changes in pitch of a pure tone. As with colour, sound can be a perception of combined pure tones. When there is more than one pure tone, interference between tones can be perceived as a modulation of intensity (loudness) called 'beating', and the human ear is then more sensitive to frequency differences. For instance, two pure tones of 220 Hz and 222 Hz heard simultaneously interfere, producing a beating of 2 Hz that the human ear can perceive; but if we raise a single pure tone from, let's say, 200 Hz to 202 Hz, the human ear will not perceive the change.
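The beating effect follows directly from the identity sin(a) + sin(b) = 2·sin((a+b)/2)·cos((a−b)/2): the sum of the two tones is a single 221 Hz tone whose amplitude envelope pulses twice per second. A small sketch:

```python
import math

# Two tones at 220 Hz and 222 Hz. Their sum equals a 221 Hz tone whose
# amplitude envelope is 2*cos(pi*(f2 - f1)*t); perceived loudness
# therefore pulses |f2 - f1| = 2 times per second: 'beating'.

f1, f2 = 220.0, 222.0

def two_tone(t):
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)

def envelope(t):
    # from the identity sin(a) + sin(b) = 2 sin((a+b)/2) cos((a-b)/2)
    return abs(2 * math.cos(math.pi * (f2 - f1) * t))

print(envelope(0.0))    # 2.0 -> maximum loudness
print(envelope(0.25))   # ~0  -> silence a quarter of a second later
print(envelope(0.5))    # 2.0 -> loud again: two beats per second
```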

We perceive sound intensity (loudness) differently across the frequency range: several different pressures can be perceived as equal if they vibrate at different frequencies. For this reason the human ear is characterized by drawing 'loudness curves'. There is one line per perceived loudness value (a contour line, like level curves in geographical maps); these lines cover the whole range of audible frequencies and they (obviously) do not cross. It is noticeable that the lower threshold of loudness perception has a valley in the range 2 kHz to 4 kHz. That is where we can hear the least intense sounds, and in that frequency range lies most of the energy of the human voice spectrum.

Sound Hardware: is it up to the task?

It seems that audio-only content is not very fashionable these days; it does not attract the attention of the masses as it did in the past. Anyway, let's take a look at what we have.

Sound storage and playback are mostly digital in our days. Since the inception of the CD, a whole culture of sound devices employs the same audio capacities. The CD spec: two channels, each sampled at 44.1 kHz with 16-bit samples using PCM. (The sound input is filtered by a low-pass filter with cut-off frequency at 20 kHz, and then sampled. As you may know, to preserve a tone of 20 kHz you must sample at least at 40 kHz: the Nyquist theorem.) Is the CD spec on par with human perception? Let's see.
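As a quick sanity check, these are the raw numbers the CD spec implies:

```python
# Raw numbers behind the CD spec: 2 channels, 44.1 kHz, 16-bit LPCM.
channels, sample_rate, bits = 2, 44_100, 16

bitrate = channels * sample_rate * bits   # bits per second, uncompressed
print(bitrate)              # 1411200 -> about 1.41 Mbps

# Nyquist: this sampling rate preserves tones up to sample_rate/2,
# safely above the 20 kHz low-pass cut-off applied before sampling.
print(sample_rate / 2)      # 22050.0
```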

Audio sampling takes place in the time domain (light sampling happens in frequency). Each sample takes an analog value for pressure (a microphone converts pressure into voltage or current during the sampling interval); then this value is quantized in N steps (16 bits provide 2^16 = 65536 steps). The resulting bit stream can be further compressed afterwards for storage and/or transmission efficiency. Some techniques can be applied before quantization to reduce the amount of data (amplitude compression), but usually data reduction is applied while quantizing, leading to Differential PCM and Adaptive PCM, which deviate from Linear PCM.

If we look at frequency resolution, LPCM with a cut-off frequency at 20 kHz is fine. Two separate tones below 20 kHz will be properly sampled and can be distinguished perfectly, even if they are separated by only 2-4 Hz.

If we look at pressure resolution, the limitation is not in the CD spec. A loudness curve for each pressure value can be properly encoded using the CD spec. What plays here is microphone technology (sensitivity) and the whole chain of manipulations (AD conversion, storing, transmitting, DA conversion, amplifying), analog and digital, that intervene to put sound in front of your ears.

It is easier to look at dynamic range to compare the fidelity of sound handling. As we said, the human ear is capable of 13 decades (130 dB), but, as it happens with the human eye, not in the same scene, or in this case not in the same sound segment. For human hearing this range-reduction effect is called 'masking': loud sounds make our hearing system adapt by reducing the range, so we cannot hear faint sounds when an intense signal is playing. Some experiments (http://en.wikipedia.org/wiki/Dynamic_range) show that the CD spec (16-bit samples) can render a range of 98 dB for sine-shaped signals, or 120 dB using special dithering (not plain LPCM); 20-bit LPCM can render 120 dB, and 24-bit LPCM can render 144 dB. But at the same time other elements in the chain (AD/DA steps, amplifiers, transmission) are very likely to reduce the range below 90 dB.
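The quoted figures come from the standard quantization-noise estimate for a full-scale sine (SNR ≈ 6.02·n + 1.76 dB) and from the rougher '6 dB per bit' rule; a short sketch shows both:

```python
# Where the dB figures come from. For a full-scale sine quantized with
# n bits, SNR is approximately 6.02*n + 1.76 dB; the rougher
# '6 dB per bit' rule yields the 120 dB / 144 dB figures quoted above.

def sine_dynamic_range_db(n_bits):
    return 6.02 * n_bits + 1.76

def six_db_per_bit(n_bits):
    return 6.0 * n_bits

for n in (16, 20, 24):
    print(n, round(sine_dynamic_range_db(n), 2), six_db_per_bit(n))
# 16 -> 98.08 dB, 20 -> 122.16 dB, 24 -> 146.24 dB (vs 96/120/144)
```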

So theoretically there are high-end sound devices with high dynamic range that could be paired so that the resulting end-to-end system gets close to 125 dB, but that may take a fair amount of money. To technically achieve the maximum possible fidelity one way (recording) and the other way around (playing), you must ensure that all your equipment fits together without breaking the 125 dB dynamic range. For the recording segment this is no problem; studios have the money to afford that and more. For the playing segment you will find trouble in DACs, amplifiers and loudspeakers. Cheap hardware does not have the highest dynamic range. The symptom of not being up to the task is the amount of distortion that appears in the range, usually measured as a maximum % of distortion in the range (see http://books.google.es/books?id=00m1SlorUcIC&pg=PA75&redir_esc=y). But anyway, do we need the maximum dynamic range all over the chain? Is audio quality available? Is it expensive?

Reading this (http://www.aes.org/e-lib/browse.cfm?elib=14195) and this (http://www.tomshardware.com/reviews/high-end-pc-audio,3733.html) may help us reach the conclusion that YES, we have today the quality needed to experience the best possible sound that our perceptual system can detect, and NO, it is not expensive. Using computer parts and peripherals you can build a cheap and perceptually very decent sound system. Of course pressure wave propagation is a tricky science, and to adapt to any possible room you may need to invest in more powerful, much more expensive equipment. But for 'direct ray' sound we are fortunate: virtually anyone can afford perceptually correct equipment today.



With the advent of digital technologies we have inherited a world in which video is digital. This means, of course, that the video signal is encoded in a digital format. Today the EBU, DVB and other organisations like the DVD Forum and the Blu-ray Disc Association have narrowed the variety of encoding options to a few standards: MPEG (1, 2 & 4) and VC-1; there are also other well-known video coders: WebM, On2 VP6, VP8 and VP9. By far the most successful standard is MPEG (Moving Picture Experts Group), which is today well consolidated after more than 25 years of existence. Most of the TV channels in the world are today delivered as MPEG-2 video over M2TS (MPEG-2 transport stream), and more recently HD TV is delivered as MPEG-4 Part 10 video (also called ISO H.264 or AVC) over M2TS. The latest addition from MPEG is HEVC (H.265).

We are seeing a digital world that moves much faster than the EBU/DVB/DVD Forum/Blu-ray Disc Association... These big entities may take as much as 5 years to standardize a new format. Then manufacturers NEED to change production lines and keep them running for some years producing devices that adhere to the new standard (so they can derive profit from the investment). So you cannot expect a breakthrough in commodity electronics for video in less than 5-7 years, and even that is an acceleration: in the past (1960-2000), TV sets were built essentially equal in viewing specs for more than 20 years in a row.

But as I've said, today we can expect to see much more dynamism in video sources and video displays. Thanks to Internet video distribution, people can encode and decode video in a wealth of formats that may fit their needs better than regular broadcast TV. Of course the hardest limitation is the availability of displays. It does not make sense to encode 30-bit colour video if you do not have a capable display... but assuming we have one, today high-end PC video boards can feed HDMI 1.4 or DisplayPort signals to these panels, overcoming the limitations of broadcasting standards. For this single reason 4K TVs are being sold today: only PCs can feed 4K content to current 4K TVs, and most of the time this content must be downloaded from the Internet. Today, streaming 4K content would take upwards of 25 Mbps encoded in H.265 HEVC.
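A quick back-of-the-envelope shows what that 25 Mbps figure means in practice for storage and transfer:

```python
# What a 25 Mbps H.265 stream implies per hour of 4K video.
mbps = 25
bytes_per_hour = mbps * 1_000_000 * 3600 / 8   # bits/s * s / (bits/byte)

gigabytes_per_hour = bytes_per_hour / 1e9
print(gigabytes_per_hour)   # 11.25 -> GB for one hour of 4K video
```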

We have seen that state-of-the-art displays have just met the perceptual minimum resolution (250 ppi at 18”) and are getting better every day. We are seeing the introduction of decent colour handling with 10 bpc LCDs and 30-bit RGB colour. We are seeing the introduction of large-format high-resolution displays: 4K displays starting at 24” for computer monitors (200 ppi) and at 56” for TV sets (80 ppi). The BDA has recently announced that the 4K extension to the Blu-ray spec will be available before the end of 2014; in the meantime they need to choose the codec (H.265 and VP9 are contenders) and cut some corners of the spec. The available displays have a decent dynamic range, usually better than 1000:1 and getting close to 10000:1. At least if we take contrast ratio as if it were a 'real' intensity range, which it is not. Of course our monitors cannot light up with the intensity of sunlight and at the same time, or even in a different scene, show a star field with distinguishable faint stars. No, the dynamic range of displays will not get there soon, but HDR (High Dynamic Range) techniques are starting to appear, and they can compress the dynamic range of real-world input much better than current technology, which, by the way, does not compress the range, it just clips it. Current cameras can capture the high lights clipping the low lights or, on the contrary, the low lights clipping the high lights, and you are fortunate if you even get to select which part of the range you want. Near-future HDR cameras will capture the full range. As displays will not be on par with the full range, some image processing, already available today, must be done to compress the range and adapt it to the display. (PS: today you can process RAW files to produce your own HDR images, or even take multi-exposure shots to produce HDR files. The problem is that to see the results you must compress the range to standard RGB, or otherwise select an 'exposure' value to view the HDR file.)
We can expect to see incredibly high-definition, high-dynamic-range content in full glory using 4K 30-bit-colour displays.


Digital audio is not living up to digital video expectations. In the past decade a few high-definition audio formats appeared: SACD (Super Audio CD), DVD-A (DVD Audio) and multichannel uncompressed LPCM on BD. Of these formats, SACD and DVD-A have proved real failures. It has been demonstrated (http://www.aes.org/e-lib/browse.cfm?elib=14195) that increasing the bit count per sample above 16 bits (20, 24, 32, 48...) and increasing the sampling rate above 44.1 kHz (48 kHz, 96 kHz, or SACD's 1-bit DSD in the MHz range) does not produce perceptually distinguishable results... so the answer is clear: we got there many years ago. We achieved the maximum 'reasonable' fidelity with the CD spec. (OK, OK, the noise floor could be improved by dithering or by moving from 16 to 20 bits, but the perceptual effect is negligible and the change is not worth the investment at all.) The only breakthrough in digital audio comes from the fact that now we have more space available on content discs, so we can go back to uncompressed formats and enjoy LPCM again after years of MP3 compression or other sorts of compression: DTS, Dolby, MPEG. Also, state-of-the-art audio is multichannel. So the reference audio today is uncompressed LPCM, 16 bits/sample, 44.1 or 48 kHz, in 5.1 or 7.1 multichannel format stored on a Blu-ray disc.


I started this article posing fairly open questions about the availability of perceptually correct technology to display image and sound in front of our eyes and ears. After careful examination of our viewing and hearing sensory equipment, of the recent achievements of the CE industry in displays and audio equipment and their prices, and of the market acceptance for content and the ways to distribute it, we can conclude that we are living in an extremely interesting time for content. We got 'there' and virtually no one noticed. We have the technology to provide perceptually perfect content, we have the distribution paths, and we (almost) have the market.

On the way to this discovery we have found that today only a very small number of devices and content encodings put all the pieces together, but that is changing. We will no longer be delayed by broadcast standards; we will no longer be fooled by empty promises in audio specs. The right technology is at hand and the rate of price decline is accelerating. Full HD adoption took more than 15 years, but maybe 4K adoption will take less than 5 years, and maybe most content will not get to us via broadcast anymore...

Some thoughts about CDNs, Internet and the immediate future of both

(Download this article in PDF format: thoughts CDN internet)


A CDN (Content Delivery Network) is a network overlaid on top of the Internet. Why bother to put another network on top of the Internet? The answer is easy: the Internet as of today does not work well for certain things, for instance content services for today's content types. Any CDN that ever existed was just intended to improve the behaviour of the underlying network in some very specific cases: 'some services' (content services, for example), for 'some users' (those who pay, or at least those whom someone pays for). CDNs neither intend to nor can improve the Internet as a whole.

The Internet is just yet another IP network combined with some basic services, for instance the translation of 'object names' into 'network addresses' (network names): DNS. The Internet's 'service model' is multi-tenant, collaborative, non-managed and 'open', as opposed to private networks, which are single-owner, tied to standards that may vary from one to another, non-collaborative (though they may peer and do business at some points) and managed. It is now accepted that the service model of the Internet is not optimal for some things: secure transactions, real-time communications and uninterrupted access to really big objects (coherent sustained flows)...

The service model of a network like the Internet, so little managed, so little centralized, with so many 'open' contributions, can today guarantee very few things to the end-to-end user, and the more the network grows and the more it interconnects with itself, the fewer good properties it has end to end. It is a paradox, and it relates to the size of complex systems: the basic mechanisms that are good for a network of size X with a connection degree C may not be good for another network 10^6 X in size and/or 100 C in connection. Solutions to Internet growth and stability must never compromise its good properties: openness, decentralisation, multi-tenancy... This growth-and-stability problem is important enough to have several groups working on it: the Future Internet Architecture groups. These groups exist in the EU, the USA and Asia.

The Internet's basic tools for service building are a packet service that is non-connection-oriented (UDP), a packet service that is connection-oriented (TCP) and, on top of the latter, a service that is text-query-oriented and stateless (HTTP), where sessions last for just one transaction. A name translation service from object names to network names helps a lot when writing services for the Internet, and also allows these applications to keep running even when network addresses change.

For most services/applications the Internet is an 'HTTP network'. The spread of NAT and firewalls makes UDP inaccessible to most Internet consumers, and when it comes to TCP, only port 80 is always open; moreover, only TCP flows marked with HTTP headers are allowed through many filters. These constraints make today's Internet a limited place for building services. If you want to reach the maximum possible number of consumers, you have to build your service as an HTTP service.



A decent 'network' must be flexible and easy to use. That flexibility includes the ability to find your counterpart when you want to communicate. In the voice network (POTS) we create point-to-point connections. We need to know the other endpoint's address (phone number), and there is no service inside POTS to discover endpoint addresses, not even a translation service.

In the Internet it was clear from the very beginning that we needed names more meaningful than network addresses. To make the network more palatable to humans, the Internet has been complemented with mechanisms that support 'meaningful names'. The 'meaning' of these names was designed to be a very concrete one, "one name, one network termination", and the semantics applied to these names was borrowed from set theory through the concept of 'domain' (a set of names) with strict inclusion. Name-address pairs are modelled by giving 'name' a structure that represents a hierarchy of domains. If a domain includes some other domain, that is clearly expressed by means of a chain of 'qualifiers'. A 'qualifier' is a string of characters. The way to name a subdomain is to add one more qualifier to the string, and so on and so forth. If two domains do not have any inclusion relationship, then they are necessarily disjoint.

This naming system was originally intended just to identify machines (network terminals), but it can be, and has been, easily extended to identify resources inside machines by adding subdomains. This extension is a powerful tool that offers flexibility to place objects in the vast space of the network using 'meaningful names'. It gives us the ability to name machines, files, files that contain other files (folders), and so on. These are all the 'objects' that we can place on the Internet for the sake of building services/applications. It is important to realise that only the names that identify machines get translated to network entities (IP addresses). Names that refer to files or 'resources' cannot map to IP network entities, and thus it is the responsibility of the service/application to 'complete' the meaning of the name.
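The strict-inclusion semantics of domains can be sketched with plain string handling. This is a toy model (real DNS name matching has additional rules, e.g. for internationalized names); qualifiers are read right to left, so `com` contains `example.com`, which contains `www.example.com`:

```python
# Sketch: 'strict inclusion' of domains. A domain contains another iff
# its qualifier chain is a proper suffix of the other's.

def qualifiers(name):
    return name.lower().rstrip('.').split('.')

def contains(domain, other):
    """True if 'other' is a subdomain strictly inside 'domain'."""
    d, o = qualifiers(domain), qualifiers(other)
    return len(o) > len(d) and o[-len(d):] == d

print(contains('example.com', 'www.example.com'))   # True
print(contains('example.com', 'example.org'))       # False
# Two domains with no inclusion relationship are disjoint:
print(contains('a.example.com', 'b.example.com'))   # False
```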

To implement these semantics on top of the Internet, a 'names translator' was built that ended up being called a 'name server'; the Internet feature is called the Domain Name Service (DNS). A name server is an entity that you can query to resolve a 'name' into an IP address. Each name server only 'maps' objects placed in a limited portion of the network; the owner of this area has the responsibility of keeping the names of objects associated with the proper network addresses. DNS gives us just part of the meaning of a name, the part that can be mapped onto the network. The full meaning of an object name is rooted deep in the service/application in which that object exists. To implement a naming system compatible with DNS domain semantics we can, for instance, use the syntax described in RFC 2396. There we are given the concept of the URI (Uniform Resource Identifier); this concept is compatible with and encloses the previous concepts of URL (Uniform Resource Locator) and URN (Uniform Resource Name).

For the naming system to be sound and useful it is necessary that an authority exists to assign names, to manage the 'namespace'. Bearing in mind that the translation process is hierarchical and can be delegated, many interesting intermediation cases are possible that involve cooperation among service owners and between service and network owners. In HTTP the naming system uses URLs. These URLs are names that help us find a 'resource' inside a machine inside the Internet. In the framework that HTTP provides, the resources are files.
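Python's standard library shows how a URL bundles the part DNS can translate (the host) with the part only the service/application can interpret (the path). The URL below is a made-up example using the reserved example.com domain:

```python
from urllib.parse import urlsplit

# A URL splits into a DNS-resolvable name and an application-resolved
# resource path.
url = 'http://www.example.com/videos/trailer.mp4'
parts = urlsplit(url)

print(parts.scheme)    # 'http'  -> protocol the service speaks
print(parts.hostname)  # 'www.example.com' -> resolved by DNS to an IP
print(parts.path)      # '/videos/trailer.mp4' -> resolved by the HTTP server
```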

What is ‘Content’?

It is not possible to give a non-restrictive definition of ‘content’ that covers all possible content types for all possible viewpoints. We should agree that ‘content’ is a piece of information. A file/stream is the technological object that implements ‘content’ in the framework of HTTP+DNS.



We face the problem of optimising the following task: find & recover some content from the internet.

Observation 1: current names do not have a helpful meaning. URLs (HTTP+DNS framework) are ‘toponymic’ names: they map a content or machine name to an address. There is nothing in the name that refers to the geographic placement of the content. The name is not ‘topographic’ (as it would be if, for instance, it contained UTM coordinates). The name is not ‘topologic’ (it gives no clue about how to get to the content, about the route). In brief: Internet names, URLs, do not have a meaningful structure that could help in optimising the task (find & recover).

Observation 2: current translations don’t have context. DNS, as currently implemented, carries no information about the query originator, nor any other context for the query. DNS does not worry about WHO asks for a name translation, or WHEN, or WHERE… as it is designed for a 1:1 semantic association (one name, one network address), and thus, why worry? We could properly say that DNS, as it is today, does not have ‘context’. Current DNS is a kind of dictionary.

Observation 3: there is a diversity of content distribution problems. The content distribution problem is not, usually, a 1-to-1 transmission; it is usually 1-to-many. Usually, for one content ‘C’ at any given time ‘T’ there are ‘N’ consumers, with N >> 1 most of the time. The keys to quality are delay and integrity (time coherence is a result of delay). Audio-visual content can be consumed in batch or as a stream. A ‘live’ content can only be consumed as a stream, and it is very important that latency (the time shift T = t1 − t0 between an event that happens at t0 and the time t1 at which that event is perceived by the consumer) is as low as possible. A pre-recorded content is consumed ‘on demand’ (VoD for instance).

It is important to notice that there are different ‘content distribution problems’ for live and recorded and also different for files and for streams.

A live transmission gives all the consumers simultaneously the same exact experience (broadcast/multicast), but it cannot benefit from networks with storage, as store-and-forward techniques increase delay. It is also impossible to pre-position the content in many places in the network to avoid long-distance transmission, as the content does not exist before consumption time.
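The penalty that store-and-forward imposes on live latency can be sketched with some hypothetical numbers (hop counts and per-hop delays are illustrative, not measured):

```python
def end_to_end_latency(hops, per_hop_transmit_ms, store_forward_ms=0.0):
    """Total latency t1 - t0 across a chain of hops.
    Each hop adds transmission delay; a store-and-forward hop additionally
    buffers the full unit of data before sending it onwards."""
    return hops * (per_hop_transmit_ms + store_forward_ms)

# Hypothetical path: 5 hops, 10 ms transmission each
cut_through = end_to_end_latency(5, 10.0)              # 50 ms: acceptable for live
store_and_forward = end_to_end_latency(5, 10.0, 40.0)  # 250 ms: delay grows with every buffering hop
```

The buffering term multiplies with the hop count, which is exactly why storage in the path helps on-demand delivery but hurts live delivery.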

An on-demand service cannot be a shared experience. If it is a stream, there is a different stream per consumer. Nevertheless, an on-demand transmission may benefit from store-and-forward networks. It is possible to pre-position the same title in many places across the network to avoid long-distance transmission. This technique in turn impacts the ‘naming problem’: how will the network know which copy is the best one for a given consumer?

We soon realise that the content distribution problem is affected by (at least): the geographic position of the content, the geographic position of the consumer, and the network topology.



-to distribute live content the best network is a broadcast network with low latency: classical radio & TV broadcasting and satellite are optimal options. It is not possible to do ‘better’ with a switched, routed network, as IP networks are. The point is: IP networks just do NOT do well with one-to-many services. It takes incredible effort from a switched network to carry a broadcast/multicast flow compared to a truly shared medium like radio.

-to distribute on-demand content the best network is a network with intermediate storage. In those networks a single content must be transformed into M ‘instances’ that will be stored in many places throughout the network. For the content title ‘C’, the function ‘F’ that assigns a concrete instance ‘Cn’ to a concrete query ‘Ric’ is the key to optimising content delivery. This function ‘F’ is commonly referred to as ‘request mapping’ or ‘request routing’.

The Internet + HTTP servers + DNS give us both storage and naming (though neither HTTP nor DNS is a must).

There is no ‘normalised’ storage service on the internet, but rather a bunch of interconnected caches. Most of the caches work together as CDNs. A CDN, for a price, can guarantee that 99% of the consumers of your content will get it properly (low delay + integrity). It makes sense to build CDNs on top of HTTP+DNS; in fact, most CDNs today build ‘request routing’ as an extension of DNS.

A network with intermediate storage should use the following info to find & retrieve content:

-content name (identity of the content)

-geographic position of the requester

-geographic position of all existing copies of that content

-network topology (including the dynamic status of the network)

-business variables (cost associated to retrieval, requester identity, quality, …)

Nowadays there are services (some paid) that give us the geographic position of an IP address: MaxMind, Hostip.info, IPinfoDB, … Many CDNs leverage these services for request routing.
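Putting these inputs together, the mapping function ‘F’ could be sketched as follows. Server names, coordinates and costs are hypothetical; a real implementation would use great-circle distances, live topology state and richer business policies:

```python
import math

def geo_distance(a, b):
    """Rough planar distance between two (lat, lon) points; a real
    implementation would use great-circle distance."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def request_route(content_id, requester_pos, copies, cost_per_unit_distance=1.0):
    """Sketch of the mapping F: for content 'C' and a concrete query 'Ric',
    choose the instance 'Cn' minimising distance plus business cost.
    `copies` maps content_id -> {server name: {'pos': (lat, lon), 'cost': float}}."""
    candidates = copies.get(content_id, {})
    if not candidates:
        return None
    def score(server):
        info = candidates[server]
        return geo_distance(requester_pos, info["pos"]) * cost_per_unit_distance + info["cost"]
    return min(candidates, key=score)

# Hypothetical catalogue: content "C" replicated on two edge servers
copies = {"C": {
    "edge-madrid": {"pos": (40.4, -3.7), "cost": 1.0},
    "edge-london": {"pos": (51.5, -0.1), "cost": 1.0},
}}
best = request_route("C", (41.4, 2.2), copies)  # requester near Barcelona
# -> "edge-madrid"
```

The `cost` term is where requester identity, quality tiers or retrieval pricing would enter; the geometry term is what geo-IP services like the ones above make possible.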

It seems that there are solutions for geo-positioning, but we still have a naming problem. A CDN must offer a ‘standard face’ to content requesters. As we have said, content dealers usually host their content on HTTP servers and build URLs based on HTTP+DNS, so CDNs are forced to build an interface to the HTTP+DNS world. On the internal side, today the most relevant CDNs use non-standard mechanisms to interconnect their servers (IP spoofing, DNS extensions, Anycast, …).



-add context to object queries: identify the requester position through DNS. Today some networks use proprietary versions of ‘enhanced DNS’ (Google runs one of them). The enhancement is usually implemented by transporting the IP address of the requester in the DNS request and preserving this info across DNS messages so it can be used for DNS resolution. We would prefer to use geo-position rather than IP address. This geo-position is available in terminals equipped with GPS, and can also be available in static terminals if an admin provides positioning info when the terminal is started.

-add topological + topographical structure to names: enhance DNS+HTTP. A web server may know its geographic position and build object names based on UTM coordinates. An organisation may handle domains named after UTM. This kind of solution is plausible because servers’ mobility is ‘slow’: servers do not need to change position frequently, so their IP addresses could be ‘named’ in a topographic way. It is more complicated to include topological information in names. Today this complexity is addressed through successive name-resolution and routing processes that painstakingly give us back IP addresses in a dynamic way, consuming the efforts of BGP and classical routing (IS-IS, OSPF).

Nevertheless, it is possible to give servers names that could be used collaboratively with the current routing systems. The AS number could be part of the name. It is even possible to increase ‘topologic resolution’ by introducing a sub-AS number. Currently, Autonomous Systems (AS) are not subdivided topologically nor linked to any geography; these facts prevent us from using the AS number as a geo-locator. There are organisations spread over the whole world that have a single AS. Thus the AS number is a political ID, not a geo-ID nor a topology-ID. An organisational revolution would be to eradicate overly spread and/or overly complex ASes. This goal could be achieved by breaking such ASes into smaller parts, each confined to a delimited geo-area and with a simple topology. Again we would need a sub-AS number. There are mechanisms today that could serve to create a rough implementation of geo-referenced ASes, for instance BGP communities.

-request routing performed mainly by network terminals: /etc/hosts sync. The abovementioned improvements in the structure of names would allow web browsers (or any SW client that recovers content) to do their request routing locally. It could be done entirely in the local machine using a local database of structured names (similar to /etc/hosts), taking advantage of the structure in the names to guess the parts of the mapping not explicitly declared in the local DB. Taking the naming approach to the extreme (super-structured names), the DB would not be necessary: just a set of rules to parse the structure of the name, producing an IP address that identifies the optimal server on which the content carrying that structured name can be found. It is important to note that any practical implementation we can imagine will require a DB; the more structured the names, the smaller the DB.
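A minimal sketch of such client-side resolution, assuming a hypothetical name layout `<content>.<region>.as<ASN>.cdn.example` and an invented local DB (all names and addresses below are illustrative):

```python
# Explicit entries, playing the role of a tiny /etc/hosts-like local DB
LOCAL_DB = {
    "eu-west.as65001": "192.0.2.10",
    "eu-east.as65002": "192.0.2.20",
}
# Rule-based fallback: the structure of the name lets us guess a mapping
# that is not explicitly declared in the local DB
REGION_FALLBACK = {"eu-west": "192.0.2.10", "eu-east": "192.0.2.20"}

def resolve_locally(name):
    """Resolve a structured name entirely in the client, with no network
    round trip: first the explicit local DB, then rules that parse the name."""
    labels = name.split(".")
    region, asn = labels[1], labels[2]   # positions fixed by the assumed layout
    key = f"{region}.{asn}"
    if key in LOCAL_DB:
        return LOCAL_DB[key]
    return REGION_FALLBACK.get(region)   # structure-derived guess

ip = resolve_locally("title42.eu-west.as65001.cdn.example")
# -> "192.0.2.10", with zero resolution time on the network
```

Note how the fallback rules shrink the DB: any name whose region label is known resolves even without an explicit entry, which is the ‘more structure, smaller DB’ trade-off in miniature.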



It makes sense to think of a CDN that has a proprietary SW client for content recovery, one that uses an efficient naming system allowing the ‘request routing’ to be performed in the client, on the consumer machine, not depending on (unpredictably slow) network services.

Such a CDN would host all content on its own servers, naming objects in a sound way (probably with geographical and topological meaning), so that each consumer with the proper plugin and a minimal local DB can reach the best server in the very first transaction: resolution time is zero! This CDN would rewrite the web pages of its customers, replacing names with structured names that are meaningful to the request-routing function. The most dynamic part of the intelligence that the plugin requires is a small pre-computed DB that is created centrally and periodically, using all the relevant information to map servers to names. This DB is updated from the network periodically, and it includes: updated topology info, business policies, and updated lists of servers. It is important to realise that a new naming structure is key to making this approach practical; if names do not help, the DB will end up being humongous.

Of course, this is not so futuristic. Today we have a name cache in the web browser + /etc/hosts + caches in the DNS servers. It is a little subtle to notice what the best things about the new schema are: it suppresses the first query (and all the ‘first queries’ after TTL expiration). Also, there is no influence of TTLs, which are controlled by DNS owners outside the CDN’s control, and no TTLs that may be built into browsers.

This approach may succeed for these reasons:

1- Not all objects hosted on the internet are important enough to be indexed in a CDN, and the dynamism of the key routing information is so low that it is feasible to keep all terminals up to date with infrequent sync events.

2- Today, computing and storage capacity in terminals (even mobile ones) are enough to handle this task, and the time penalty paid is far less than in the best possible situation (with the best luck) using collaborative DNS.

3- It is possible, based on the geographic position of the client, to download only the part of the server map that the client needs to know; it suffices to recover the ‘neighbouring’ part of the map. In the uncommon case of a chained failure of many neighbouring servers, it is still possible to dynamically download a farther portion of the map.
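Point 3 can be sketched as a radius query against a hypothetical central server map (server names, coordinates and the radius are illustrative):

```python
import math

# Hypothetical full server map held centrally: server name -> (lat, lon)
FULL_MAP = {
    "edge-madrid": (40.4, -3.7),
    "edge-paris":  (48.9,  2.4),
    "edge-tokyo":  (35.7, 139.7),
}

def neighbouring_map(client_pos, radius=15.0):
    """Return only the portion of the server map near the client.
    On an uncommon chained failure of all neighbours, the client can
    re-query with a larger radius to fetch a farther portion of the map."""
    def near(pos):
        return math.hypot(pos[0] - client_pos[0], pos[1] - client_pos[1]) <= radius
    return {name: pos for name, pos in FULL_MAP.items() if near(pos)}

local = neighbouring_map((41.4, 2.2))   # client near Barcelona
# Tokyo is excluded: the client only syncs the part of the map it can use
```

Syncing only this slice is what keeps the terminal-side DB small enough for point 2 to hold.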