Quality of Service: Lessons from Content Delivery World Conference 2015

Facebooktwittergoogle_plusredditpinterestlinkedinmailFacebooktwittergoogle_plusredditpinterestlinkedinmail

content_deliveryQuality of Service: Lessons from Content Delivery World Conference 2015

CDW 2015: The evolution of content delivery architectures and workflows, and their role in distributing content throughout the globe.

In October 2015 Speedchecker hosted a stand at the Content Delivery World Conference 2015. Featuring  influential speakers from Canal+, Time Warner Cable, Wuaki.tv, Sky, Telecom Italia, BT, Telefonica, Cisco, Freeview and others.

This is a summary of the event from Steve Gledhill (Head of Content, Speedchecker Ltd) with focus on how to improve the Quality of Service for the end user.

SAMSUNG CSC
Typical activity between presentations

The Agenda

Content Delivery World 2015 brought together players from across the content delivery ecosystem, enabling the exchange of ideas and innovations and formation of partnerships that will fuel the growth of the content delivery industry, furthering the potential of this multi-billion dollar market.

SAMSUNG CSC
Standing room only

Of particular interest was how to measure the Quality of Experience for the end user rather than only the Quality of Service of the suppliers (CDNs, ISPs and Media Providers). Most of the information that we see regarding the performance of Content Delivery focus on latency, bitrate, throughput and other important factors that relate to the efficient data transmission. We were keen to learn from the key players at the conference how they translate these cold figures into a meaningful Quality of Experience. It was heartening to see that QoE was covered in most of the presentations indicating that this is seen as a key factor in improving the industry moving forward.

Measuring Quality of Experience

During a discussion about the need for multi-CDNs (redundancy, control, independence, QoE, peak traffic management) a number of Key Performance Indicators were listed:

  • Availability
  • Throughput
  • Buffer status
  • Routes taken
  • Bandwidth
  • Latency
  • User experience
  • Video impairments
  • Packet loss
  • Concurrent users
  • User viewing patterns

With regards to QoE it was agreed that “quick enough is good enough”; black screens are unacceptable; less than 2 seconds to switch channels is OK.

VTT in Finland have used user panels to discover how much (or how little) latency and buffering will cause their viewers’ QoE to become unacceptable.

Delivering Quality of Experience

Time Warner reported that they have seen a steady 20% annual increase in the growth of IP traffic year on year and they note that 78% of this is video. In order to cope with this steady increase they see a need for TCP Tuning, OS Tuning, NIC Tuning and a general reduction in the protocol overheads. They see the key metrics need to be captured passively and actively and include the monitoring of the system, analysis of logs and simulated clients.

Alcatel made the point that the CDNs need to be more content aware than they are at the present to ensure the highest quality. They also recommend that each CDN end point should be aware of existing cache to prevent unnecessary delays.

Another common discussion was around variable bit rate. For example, Sky Italia use Adaptive Encoding to ensure that users with 2.4 Mbps experience the same quality as those with 9 Mbs using near real-time encoding. This relies on a high quality original input. This bit rate is managed by the Sky CDN Selector at the Edge Server as close as possible to the consumer.

Orange presented some interesting concerns around certain protocols. Their international CDN provides average speeds of 4Mbps but they feel they have a number of issues to contend with. First, they have issues with caching (many presenters referred to caching as being a key area for improvement). Second, they highlighted the HTTPS issues for carriers under HTTP 2.0. Third, the need to be flexible and responsive to changes makes it hard to provide consistently high quality. Finally, they made reference to the Microsoft Smooth Streaming minimum latency requirement being too high to provide live content as they would wish to.

Multicasting

Looking to the future and how to deal with increased demand and higher quality video BT TV talked about Multicasting. They acknowledge that although they can distribute at speeds up to 12GBs at production that they lose control the closer they get to the customer. They have no control over the Home User’s devices or network; their equipment; the wiring in the home or the core/backhaul network all of which can lead to packet loss. Multitasking reduces the number of individual streams required and thus reduces delays and congestion. They can use the application layer to control dropped packets / retransmission and can even use 2 identical streams via different routes to be combined at the receiver in the home thus providing built-in redundancy. Problems are identified by end to end monitoring of the network data and user behaviour. The break-even point for multicasting compared to unicasting is 500 users or more. Quality of Experience is improved in terms of immediacy, quality and
continuity.

Problems with multicasting: Requires Unicast Tunnelling across Gaps where Multicast is not possible; Speeds can fall to slowest bit rate; Unicasting is recommended in the home/office.

Delivering on Mobile

Aventeq predict that by 2020 the average smartphone contract will allow 5Gb of data to be downloaded each month. This confirms that we need to ensure that the user experience for the mobile user is given high priority when considering QoE.

This emphasis was highlighted by EE when they showcased their 4G video offering showing seamless streaming from dense urban areas, to high speed (legal) motorway driving and in to the countryside and forests of England. They say this is possible because of the UK having an average speed over LTE of 20Mbps – the fastest in the World.

EE also took a different approach to improving QoE by not just dealing with latency and speed but with the actual content. They propose giving audiences to live events the option to see a choice of angles and bespoke statistics. This is available in the home but they propose making this available on mobile devices.

CDNs – Content Delivery Networks

CDNs are the backbone of the delivery mechanism and were a common point of discussion and debate throughout the conference

A number of Media providers are using or developing their own CDN (Content Delivery Network) to ensure that they are in control of the users’ QoE. Canal+ in France have 20-25 Free To Air channels and other services that will be transmitted via their own CDN in the next 12 months. Most of their traffic is driven by Live (premium) content and it is important for Canal+ that they maintain a high QoS for this content. They report that their users complain about streaming that doesn’t launch; video quality issues and buffering in roughly equal measure. Their research shows that download speeds of between 2 and 2.5Mbps are acceptable for the QoE of the end user. These speeds are achieved at all times except for the peak times of 7pm and later when problems start. Canal+ plan to provide their content directly to the ISPs instead of using a CDN as is currently the case. This should save money but they also hope that it will improve users’ QoE.

This lead to recommendations for choosing CDNs:

  • Point of Contact if there’s a problem
  • Excellent throughput and latency performance
  • Live content handling
  • Cloud computing support

Net Neutrality, Copyright and DRM

Most of the discussions and presentations dealt with moving the data around as efficiently as possible with no concern with regard to Net Neutrality. That’s not to say that this is ignored but rather, I suggest, that it is acknowledged that best technical practice needs to be modified to comply with regulations in any and all countries. For that matter, DRM and other copyright issues were only touched upon in a few presentations for similar reasons: the focus of the conference was on efficiency and how the technology can be used and improved upon.

Facebooktwittergoogle_plusredditpinterestlinkedinmailFacebooktwittergoogle_plusredditpinterestlinkedinmail

API Changes> State targeting for USA, new API limits and web reputation check

Facebooktwittergoogle_plusredditpinterestlinkedinmailFacebooktwittergoogle_plusredditpinterestlinkedinmail

Added support for targeting probes with state level accuracy

We have added optional input StateCode parameter to those methods:

StartDIGTestByCountry
StartHttpGetTestByCountry
StartPageLoadTestByCountry
StartTracertTestByCountry
StartPingTestByCountry

Currently only probes located in the USA can be targeted on state level. The API response will also contain StateCode (e.g. VA) and also State (e.g. Virginia) if you request CountryCode=US.

Added new limits on the number of probes you can request test from To avoid abuse we have added limits to how many probes you can request results at the same time in 1 API call. This is controlled by the probeLimit parameter which is required.

Here are the maximum values allowed for different tests.

PING – 100
DIG – 100
TRACEROUTE – 100
PAGELOAD – 20
HTTPGET – 20

If you require those limits to be lifted, please contact us.

Added web reputation services for HTTP tests

To avoid abuse and protect the probes from accessing risky content, we have added white and black lists that will be checked for all HTTP tests (e.g. HTTP GET or Page load). Following errors can happen>

StatusCode: 221
StatusDescription: Destination blocked Message: “This URL destination has been marked as risky by our web classification engine. We are unable to test this URL. ”

StatusCode: 222
StatusDescription: Destination not classified Message: “This URL destination has not been classified by our web classification engine. We are unable to test this URL at this time. Please contact support to whitelist this URL”

Facebooktwittergoogle_plusredditpinterestlinkedinmailFacebooktwittergoogle_plusredditpinterestlinkedinmail

API improvements

Facebooktwittergoogle_plusredditpinterestlinkedinmailFacebooktwittergoogle_plusredditpinterestlinkedinmail

We have deployed few new improvements to the API.

Changes to default timeout and commandTimeout

We have changed default timeout and commandTimeout parameters to maximize the number of returned results and minimize the timeouts on probes (of course you can set any time out as you want by specifying those parameters)

Ping
timeout 8000
commandtimeout = 1000

DIG
timeout = 4000
commandtimeout = 1000

Traceroute
timeout 52000
commandtimout 1000

HTTP GET
timeout 8000
commandtimeout = timeout – 3000

Pageload
timeout 45000
commandtimeout = timeout – 3000

Global error handler. Supported errors in the HTTP Header
To better understand returned results from the API we are now returning more information about the possible API problesms right in the HTTP Header.

Some of the new Status Codes:
200 – OK
500 – Unexpected Error from the API side
520 – Errors related with the empty results

If HTTP Status Code is <> from 200, then we also respond with [Message] element in the response body that will provide more user friendly message what the error is about.

Changes in the data output model

We are returning only objects which are currently in use. Objects as PING / HTTPGET etc are hidden for DIG. And the same rules exists for the others type of tests

e.g based on results for DIG OLD RESPONSE:

"StartDIGTestByCountryResult": [ { "ASN": { "AsnID": "AS12741", "AsnName": "Netia SA" }, "CDNEdgeNode": null, "Country": { "CountryCode": "PL", "CountryFlag": "http:\/\/speedcheckerapi.blob.core.windows.net\/bsc-img-country-logos\/pl.png", "CountryName": "Poland" }, "DIGDns": [ { "AdditionalInformation": null, "Destination": "www.broadbandspeedchecker.co.uk", "QueryTime": "13", "Status": "NoError" } ], "DateTimeStamp": "\/Date(1435158434263+0000)\/", "HTTPGet": null, "ID": 31658310, "Location": { "Latitude": 50.320525, "Longitude": 19.132328 }, "LoginDate": null, "Network": { "LogoURL": "https:\/\/speedcheckerapi.blob.core.windows.net\/bsc-img-providers\/logo_0039_ef7487217cc7_60.jpg", "NetworkID": "226", "NetworkName": "Netia SA" }, "PAGELoad": null, "Ping": null, "PingTime": null, "TRACERoute": null } ]

NEW RESPONSE:

"StartDIGTestByCountryResult": [ { "ASN": { "AsnID": "AS197480", "AsnName": "SerczerNET Malgorzata Nienaltowska" }, "Country": { "CountryCode": "PL", "CountryFlag": "http:\/\/speedcheckerapi.blob.core.windows.net\/bsc-img-country-logos\/pl.png", "CountryName": "Poland" }, "DIGDns": [ { "Status": "OK", "Destination": "www.broadbandspeedchecker.co.uk", "QueryTime": "1" } ], "DateTimeStamp": "\/Date(1435150848253+0000)\/", "ID": 17854019, "Location": { "Latitude": 53.29718, "Longitude": 23.28092 }, "Network": { "LogoURL": null, "NetworkID": "3362", "NetworkName": "SerczerNET Malgorzata Nienaltowska" } } ]

New properties available for probe results

For all probe results we added new property: Status

Status: OK, Timeout , TtlExpired (+ potentially other statuses added in the future)

For Ping and Traceroute we added also PingTime array, which can be useful to calculate standard deviations and other statistical analysis of the individual pings that were performed.

Ping Response "Ping": [ { "Status": "OK", "Destination": "www.interia.pl", "Hostname": "www.interia.pl", "IP": "217.74.65.27", "PingTime": 12, "PingTimeArray": [ "13", "11", "12" ] } ]

Traceroute response

"TRACERoute": [ { "Destination": "www.interia.pl", "HostName": "www.interia.pl", "IP": "217.74.65.27", "Tracert": [ { "HostName": "", "IP": "10.27.1.1", "PingTimeArray": [ "1", "0", "0" ], "Ping1": "1", "Ping2": "0", "Ping3": "0", "Status": "OK" }, { "HostName": "", "IP": "178.217.192.254", "PingTimeArray": [ "42", "31", "33" ], "Ping1": "42", "Ping2": "31", "Ping3": "33", "Status": "OK" }, { "HostName": "interia.tpix.pl", "IP": "195.149.232.12", "PingTimeArray": [ "12", "14", "11" ], "Ping1": "12", "Ping2": "14", "Ping3": "11", "Status": "OK" }, { "HostName": "", "IP": "217.74.64.187", "PingTimeArray": [ "11", "11", "11" ], "Ping1": "11", "Ping2": "11", "Ping3": "11", "Status": "OK" }, { "HostName": "www.interia.pl", "IP": "217.74.65.27", "PingTimeArray": [ "12", "11", "13" ]", "Ping1":"12", "Ping2":"11", "Ping3":"13", "Status":"OK"}]}]}

Facebooktwittergoogle_plusredditpinterestlinkedinmailFacebooktwittergoogle_plusredditpinterestlinkedinmail

New parameters available for test methods

Facebooktwittergoogle_plusredditpinterestlinkedinmailFacebooktwittergoogle_plusredditpinterestlinkedinmail

We have deployed few new improvements to the API.

Support for new parameters

PING Test methods

ttl – Max allowed hops for packet (default=128, max=255)
bufferSize – Size of the Packet data (default=32, max=65500)
fragment – Fragmentation of sending packets. (default=1)
resolve – IP addresses will be resolved to domain names for each hop (default=0)
ipv4only – Force using IPv4. If no IPv4 IP address is returned will return error (default=0)
ipv6only – Force using IPv6. If no IPv6 IP address is returned will return error (default=0)

DIGDNS Test methods

commandTimeout – DNS query timeout in milliseconds (default=1000ms)
retries – Total number of retries (default=0)

TRACEROUTE Test methods

maxFailedHops – Stop the command execution after maximum errors in a row (e.g. stop after 5 ping timeouts, default=0)
ttlStart – First Hop from which the trace route should start (default=1) bufferSize – Size of the Packet data (default=32, max=65500)
fragment – Fragmentation of sending packets (default=1)
resolve – IP addresses will be resolved to domain names for each hop (default=0)
ipv4only – Force using IPv4. If no IPv4 IP address is returned will return error (default=0)
ipv6only – Force using IPv6. If no IPv6 IP address is returned will return error (default=0)

PAGELOAD Test methods

commandTimeout – Maximum allowed time for pageload in milliseconds

HTTPGET Test methods

commandTimeout – Maximum allowed time to send HTTP GET request and receive the response in milliseconds
maxBytes – Max bytes to download from response stream

2. Changes in how timeouts are handled by the API

Each method has 2 main time out parameters:

timeout – Amount of time in milliseconds in which API responds with result
commandTimeout – timeout for the actual test command. For most of the tests its obvious that commandTimeout correlates with total timeout of the API method. However, in case of Ping and Traceroute its not the case. Ping methods by default execute 3 ping commands. Traceroute executes even more and it depends on many parameters such as TTL, count etc.

Facebooktwittergoogle_plusredditpinterestlinkedinmailFacebooktwittergoogle_plusredditpinterestlinkedinmail