Quality of Experience in Streaming
In line with Eyevinn Technology’s ambition to broaden our sharing of knowledge, we now expand it to address quality. In today’s landscape of media systems, quality has grown into a differentiating factor. We start this initiative by discussing Quality of Experience (QoE): its definitions and influence factors. This first article is written by Andreas Rossholm, Ph.D. in applied signal processing, with over 15 years’ experience in audio and video communication in both real-time and streaming scenarios.
Quality of Experience and its Influence Factors
With the increasing use of audiovisual communication systems that deliver media to humans, evaluating quality has become essential. This applies both to media-to-human systems, e.g. streaming services such as Netflix, and to human-to-human systems, e.g. two-way audiovisual applications such as Skype. Quality as such is also receiving increased attention, both in the context of higher resolution and dynamic range, e.g. 4K and HDR, and in combination with efficient use of bandwidth, as in Netflix’s per-title encode optimization or in enabling video over cellular networks such as 3G/4G.
One of the most used concepts when it comes to perceived quality is Quality of Experience (QoE), a term that is used in a very broad sense. In this text I try to explain its meaning from a more theoretical point of view.
In audiovisual communication, the processing chain can be represented by four stages: acquisition, compression, transmission over network, and reconstruction. Each stage includes fundamental parts that enable the communication, see the illustration in Fig. 1.
This processing chain can be applied to audio or speech only, e.g. GSM and VoIP (voice over IP), or to audiovisual communication, e.g. streaming or two-way real-time communication. When the chain is divided into these four stages, many similarities can be seen between the two, and the requirement for low latency, i.e. low processing delay, represents the major difference between streaming and two-way real-time communication.
Research on the concept of quality really took off in the early twenty-first century, involving researchers from different disciplines, e.g. signal processing, telecommunications, psychophysics, and psychology. One of the outcomes of this work was the term “Quality of Experience”, used for the evaluation of media transmission systems, services, or applications, where the primary aim was to consider perceived quality from an engineering point of view. The most frequently used standardized definition is ITU-T’s:
Definition 1 (QoE (ITU-T)). The overall acceptability of an application or service, as perceived subjectively by the end-user.
Note 1: Includes the complete end-to-end system effects.
Note 2: May be influenced by user expectations and context.
The definition of QoE differs from the concept of “Quality of Service” (QoS), which explicitly takes the perspective of the operator of a system or service, see ITU-T’s definition:
Definition 2 (QoS (ITU-T)). Totality of characteristics of a telecommunications service that bear on its ability to satisfy stated and implied needs of the user of the service.
However, QoE depends on QoS: since QoS reflects network performance, it can be seen as a contributor to the potential QoE. QoE is also sometimes compared with User Experience (UX), since the two have similarities. One general difference is that QoE, through its relationship with QoS, is mainly technology driven, whereas UX is human-centered and has its origins in the field of Human-Computer Interaction.
Some possible impediments were found in the formulation of ITU-T’s definition of QoE. To address these, and to be more specific about the context of users of applications and services, a new definition of QoE was developed by the European Network on Quality of Experience in Multimedia Systems and Services, Qualinet. This work started in 2011 and resulted in the Qualinet White Paper on Definitions of Quality of Experience:
Definition 3 (QoE (Qualinet)). The degree of delight or annoyance of the user of an application or service. It results from the fulfillment of his or her expectations with respect to the utility and/or enjoyment of the application or service in the light of the user’s personality and current state.
Application: A software and/or hardware that enables usage and interaction by a user for a given purpose. Such purpose may include entertainment or information retrieval, or other.
Service: An episode in which an entity takes the responsibility that something desirable happens on the behalf of another entity. (Dagstuhl Seminar 09192, May 2009, cited after Möller, 2010)
This definition was also adopted in 2016 by the International Telecommunication Union (ITU-T Recommendation P.10: Vocabulary for performance and quality of service, Amendment 5 (07/16)).
When looking at QoE, or quality in general, one fundamental area is human perception. This involves how stimuli reach one or more of the human sensory organs, how they are converted into neural signals, and how the brain transforms these into more abstract or symbolic representations. How this process works in detail is studied in neuroscience and is not covered further here.
Another area that has been studied concerns the factors that influence QoE in the context of users of applications and services. These have also been defined by Qualinet:
Definition 4 (Influence Factor (Qualinet)). Any characteristic of a user, system, service, application, or context whose actual state or setting may have influence on the Quality of Experience for the user.
The influence factors (IFs) can be grouped into three categories: Human IF, System IF, and Context IF, defined as follows:
Human IF (HIF): Any variant or invariant property or characteristic of a human user. The characteristic can describe the demographic and socioeconomic background, the physical and mental constitution, or the user’s emotional state.
System IF (SIF): Properties and characteristics that determine the technically produced quality of an application or service. They are related to media capture, coding, transmission, storage, rendering, and reproduction/display, as well as to the communication of information itself from content production to user.
Context IF (CIF): Factors that embrace any situational property to describe the user’s environment in terms of physical, temporal, social, economic, task, and technical characteristics.
These IFs must not be regarded as isolated, as they may interrelate. It should also be noted that the impact a specific IF has on QoE is not deterministic. Examples of HIFs are gender, age, and expertise level. Examples of CIFs are time of day, duration, cost of service, brand of the service or system, level of focus, whether the user is alone or with other people, and technical interconnectivity.
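The three categories can be sketched as a small data structure. This is only an illustration: the names `Category`, `InfluenceFactor`, and `group_by_category` are made up for this sketch and are not part of the Qualinet white paper. The HIF and CIF entries are the examples given above, and packet loss is used as a SIF example since it appears among the network-related factors later in this article.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The three influence factor categories (names chosen for this sketch)."""
    HUMAN = "HIF"
    SYSTEM = "SIF"
    CONTEXT = "CIF"


@dataclass(frozen=True)
class InfluenceFactor:
    name: str
    category: Category


# Example factors taken from the text; the list is not exhaustive.
EXAMPLES = [
    InfluenceFactor("gender", Category.HUMAN),
    InfluenceFactor("age", Category.HUMAN),
    InfluenceFactor("expertise level", Category.HUMAN),
    InfluenceFactor("time of day", Category.CONTEXT),
    InfluenceFactor("cost of service", Category.CONTEXT),
    InfluenceFactor("alone or with other people", Category.CONTEXT),
    InfluenceFactor("packet loss", Category.SYSTEM),
]


def group_by_category(factors):
    """Return a dict mapping each category to the names of its factors."""
    grouped = {category: [] for category in Category}
    for factor in factors:
        grouped[factor.category].append(factor.name)
    return grouped


for category, names in group_by_category(EXAMPLES).items():
    print(f"{category.value}: {', '.join(names)}")
```

Note that the structure only records which category a factor belongs to. It does not capture that the factors may interrelate or that their impact on QoE is not deterministic.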
The SIF category relates to the technical part of producing QoE and has been divided further into four sub-categories: Content-related, Media-related, Network-related, and Device-related. Applied to the audiovisual communication chain in Fig. 1, these can be described as:
Content-related: Refers to the content type and content reliability, e.g. audio signal bandwidth and dynamic range, and video spatial and temporal information, in the acquisition block.
Media-related: Refers to media configuration factors in the compression block, e.g. compression-related influences such as blocking and ringing artifacts, but also factors from the rate controller performance, bandwidth estimator, and resource manager that result in temporal and spatial scaling, aliasing artifacts, bandwidth limitation, and overall introduced delay.
Network-related: Refers to data transmission over a network. These factors can arise in different ways depending on the type of network and connection, but some generic IFs in the transmission over network block are bandwidth fluctuation and congestion, packet loss, jitter, and delay.
Device-related: Refers to the end systems or devices involved along the end-to-end communication path, including system specifications, equipment specifications, device capabilities, and provider specifications and capabilities, e.g. the quality and performance of the display and its scaling, the loudspeaker, the computational power available for decoding, and the delay introduced in the reconstruction block.
When the processing chain is divided into IFs as described, it can be seen that some factors may originate from several places, and in some cases it can be problematic to distinguish between these origins, e.g. for delay and for aliasing. This highlights the importance of taking the whole processing chain into account when optimizing QoE.
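The point about shared origins can be made concrete with a minimal sketch. It maps a few system influence factors to the stages of the processing chain and then lists the factors that appear in more than one stage. The short factor names, the function names, and the delay values are assumptions made for this illustration. Listing aliasing under reconstruction, as a side effect of display scaling, is also an assumption of the sketch, not something stated in the descriptions above.

```python
from collections import defaultdict

# The four stages of the processing chain, in order.
STAGES = ("acquisition", "compression", "transmission", "reconstruction")

# System influence factors per stage, shortened from the examples in the text.
# "aliasing" under reconstruction is an assumption (aliasing from display scaling).
FACTORS_BY_STAGE = {
    "acquisition": {"dynamic range", "spatial information", "temporal information"},
    "compression": {"blocking", "ringing", "scaling", "aliasing", "delay"},
    "transmission": {"congestion", "packet loss", "jitter", "delay"},
    "reconstruction": {"display quality", "scaling", "aliasing", "delay"},
}

# Hypothetical per-stage delays in milliseconds, for illustration only.
DELAY_MS = {"compression": 40, "transmission": 60, "reconstruction": 20}


def origins(factors_by_stage):
    """Invert the mapping: factor name -> stages where it can originate, in chain order."""
    result = defaultdict(list)
    for stage in STAGES:
        for factor in factors_by_stage.get(stage, ()):
            result[factor].append(stage)
    return dict(result)


def multi_origin(factors_by_stage):
    """Keep only the factors that can originate in more than one stage."""
    return {f: s for f, s in origins(factors_by_stage).items() if len(s) > 1}


def end_to_end_delay_ms(delay_by_stage):
    """Sum the stage delays; stages without an entry count as zero."""
    return sum(delay_by_stage.get(stage, 0) for stage in STAGES)


for factor, stages in sorted(multi_origin(FACTORS_BY_STAGE).items()):
    print(f"{factor}: {', '.join(stages)}")
print("end-to-end delay (ms):", end_to_end_delay_ms(DELAY_MS))
```

A measurement taken at the end of the chain only sees the summed delay, so it cannot tell on its own which stage contributed what. Attributing the delay requires observing each stage, which is one reason to consider the whole chain.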
As described in this article, QoE is a complex concept that spans many areas, both in its definitions and in the factors that influence the experienced quality. Two takeaways from the text above deserve emphasis. First, when the system influence factors, which relate to the technical part of producing QoE, are evaluated, the whole end-to-end chain needs to be taken into account. Second, it is not only the technical aspects that contribute to the final perceived experience, and their impact is not deterministic.
ITU. ITU-T Recommendation P.10, Amendment 3. 2011.
ITU. ITU-T Recommendation P.10, Amendment 2. 2008.
Kjell Brunnström, Sergio Ariel Beker, Katrien De Moor, Ann Dooms, Sebastian Egger, et al. Qualinet White Paper on Definitions of Quality of Experience. Output from the fifth Qualinet meeting. 2013.