
Playout scheduling of multiple streams

 

In setting the playout schedule at the receiver, we have to deal with the tradeoffs among delay, loss and voice SNR degradation. Here we refer to the event in which both descriptions are lost or discarded as packet loss, and we use the term SNR degradation for the case in which only one description is lost or discarded.

A playout deadline must be set before the arrival of each packet i. Since the network delay that packet i will experience is not known in advance, we set the playout deadline according to the most recent delays recorded. We denote the playout deadline for packet i by $d^i_{play}$, which is the time from when the packet is handed to the network until it has to be played out. It is the total end-to-end delay of packet i (excluding the packetization time at the sender), and it characterizes the latency of transmission and playout.

Table 1: Basic Notation.

In order to determine the playout deadline $d^i_{play}$, we define a Lagrange cost function for packet i as follows:

  [Equation (1): Lagrange cost function for packet i]

where $\hat{\varepsilon}^i_1$ and $\hat{\varepsilon}^i_2$ are the estimated loss probabilities of the packet from stream 1 and stream 2, respectively, for a given playout deadline $d^i_{play}$. These estimates are based on the past delays recorded for the two streams and are discussed in detail in Subsection 2.4. The Lagrange multipliers $\lambda_1$ (on the packet-loss term) and $\lambda_2$ (on the SNR-degradation term) are predefined parameters that balance the tradeoffs. The third term of (1) also contains the SNR degradation of packet i, that is, the noise power introduced by receiving only one description. It is a constant in (1) and depends on the codecs used. Since our concern here is transmission, the received SNR is compared to that of the quantized (full-resolution) signal at the sender.
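As a reading aid, one form of the cost function that is consistent with the description above can be written out. This is a sketch under stated assumptions, not necessarily the exact expression of (1): it assumes that losses on the two paths are independent, so that the probability of losing both descriptions is the product of the two estimates, and the symbols $C^i$ (cost for packet i) and $\Delta$ (the constant SNR degradation) are labels chosen here for illustration.

```latex
% Sketch of a three-term Lagrange cost for packet i (assumed form).
%   term 1: playout deadline (latency)
%   term 2: probability that both descriptions are lost (packet loss)
%   term 3: probability that exactly one description is lost,
%           weighted by the constant SNR degradation \Delta
\begin{eqnarray}
C^i & = & d^i_{play}
      + \lambda_1 \, \hat{\varepsilon}^i_1 \hat{\varepsilon}^i_2 \nonumber \\
    &   & + \, \lambda_2 \, \Delta \left[ \hat{\varepsilon}^i_1 (1 - \hat{\varepsilon}^i_2)
      + (1 - \hat{\varepsilon}^i_1) \, \hat{\varepsilon}^i_2 \right]
\end{eqnarray}
```

In this form both loss estimates shrink as the deadline grows, so the first term pulls the deadline down while the second and third terms pull it up.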

The playout deadline is obtained by searching for the value of $d^i_{play}$ that minimizes the cost function. Perceptually, the quality degradations caused by high latency and by high loss rate are ``orthogonal''. The multiplier $\lambda_1$, which weights the packet-loss term, trades off total delay against loss probability. A larger $\lambda_1$ penalizes loss more heavily, so the optimization yields a lower loss rate at the cost of higher latency.

For multiple streams, we are also concerned about the voice quality when not all the MDC descriptions are received. The third term in (1), with multiplier $\lambda_2$, penalizes the SNR degradation that results from receiving only one description. The greater $\lambda_2$ is, the better the SNR of the reconstructed signal, at the cost of higher delay. Note that packet loss caused by losing both descriptions (the second term in (1)) and SNR degradation (the third term in (1)) are not orthogonal perceptual experiences, since packet loss also impairs SNR greatly. From (1), it can be observed that a greater $\lambda_2$ also leads to a lower loss rate, which makes the multiplier $\lambda_1$ on the packet-loss term appear redundant. However, with a very small $\lambda_2$, only packet loss is emphasized. In this case good reconstruction quality is not a priority and latency is the main concern, with the tradeoff between loss rate and delay determined mainly by $\lambda_1$. In practice, this is usually desired, since human perceptual experience is impaired most by high latency, while degraded voice quality can be largely tolerated [9].
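The search for the deadline and the roles of the two multipliers can be sketched in a few lines of Python. This is a minimal illustration under several assumptions, not the authors' implementation: the loss probability of each stream is approximated by the fraction of recently recorded delays that exceed a candidate deadline (a stand-in for the estimator of Subsection 2.4), losses on the two paths are treated as independent, and the function names and the `snr_penalty` parameter are hypothetical.

```python
from typing import Sequence


def late_fraction(past_delays: Sequence[float], deadline: float) -> float:
    """Fraction of recorded delays that would miss the given deadline.

    Assumption: a simple empirical stand-in for the loss-probability
    estimate; the paper's own estimator is described in Subsection 2.4.
    """
    return sum(d > deadline for d in past_delays) / len(past_delays)


def cost(deadline: float, delays_1: Sequence[float], delays_2: Sequence[float],
         lam1: float, lam2: float, snr_penalty: float = 1.0) -> float:
    """Assumed three-term cost: delay + lam1 * P(both lost) + lam2 * P(one lost)."""
    e1 = late_fraction(delays_1, deadline)
    e2 = late_fraction(delays_2, deadline)
    both_lost = e1 * e2                        # assumes independent paths
    one_lost = e1 * (1 - e2) + (1 - e1) * e2   # exactly one description missing
    return deadline + lam1 * both_lost + lam2 * snr_penalty * one_lost


def best_deadline(delays_1: Sequence[float], delays_2: Sequence[float],
                  lam1: float, lam2: float, snr_penalty: float = 1.0) -> float:
    """Return the candidate deadline with the lowest cost.

    Only the recorded delay values are tried as candidates: between two
    recorded values the loss estimates stay constant while the delay term grows.
    """
    candidates = sorted(set(delays_1) | set(delays_2))
    return min(candidates,
               key=lambda d: cost(d, delays_1, delays_2, lam1, lam2, snr_penalty))


if __name__ == "__main__":
    # Hypothetical delay histories in milliseconds for the two streams.
    stream_1 = [40, 42, 45, 90]
    stream_2 = [50, 52, 55, 60]
    print(best_deadline(stream_1, stream_2, lam1=20, lam2=0))
    print(best_deadline(stream_1, stream_2, lam1=100, lam2=0))
    print(best_deadline(stream_1, stream_2, lam1=100, lam2=200))
```

With these made-up histories, raising the loss multiplier from 20 to 100 moves the chosen deadline from 45 ms to 60 ms, and adding a large SNR multiplier moves it further to 90 ms, where both descriptions of every recorded packet would have arrived in time.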

Figure 2: Playout scheduling of multiple streams.

Fig. 2 illustrates the scheduling process when the multiplier $\lambda_2$ on the SNR-degradation term is small and low latency is emphasized. The source signal is coded and sent in two streams, p and q. The playout deadline is kept as low as possible and is adjusted dynamically according to the varying delay jitter of the two paths. At the receiver, the first two packets played are taken from stream p, since they have lower delays. As the delay of stream p increases, the playout switches to stream q and the schedule is adjusted accordingly, so as to avoid late loss while keeping the buffering delay low. The playout switches back to stream p from the 5th packet on, when the turbulence on path p is over and the network delay returns to normal. In adaptive playout, continuous output speech is reconstructed by scaling individual voice packets with a time-scale modification technique that changes the rate of speech [10].
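The per-packet decision described for Fig. 2 can be sketched as follows. This is an illustrative sketch only: the function name `playout_source`, the use of `None` for a description lost in the network, and the example delays are assumptions made here, and the time-scale modification step is not modelled.

```python
from typing import Optional


def playout_source(delay_p: Optional[float], delay_q: Optional[float],
                   deadline: float) -> str:
    """Classify what can be played for one packet at its playout deadline.

    A delay of None means the description was lost in the network; a delay
    above the deadline means it arrives too late and is discarded.
    Returns "both", "p", "q" (only one description, SNR degradation)
    or "lost" (packet loss: neither description is usable).
    """
    on_time_p = delay_p is not None and delay_p <= deadline
    on_time_q = delay_q is not None and delay_q <= deadline
    if on_time_p and on_time_q:
        return "both"
    if on_time_p:
        return "p"
    if on_time_q:
        return "q"
    return "lost"


if __name__ == "__main__":
    # Hypothetical (delay on p, delay on q, deadline) triples in milliseconds.
    packets = [(40, 55, 45), (42, 56, 45), (90, 50, 55), (95, 52, 55), (41, 54, 45)]
    for i, (dp, dq, deadline) in enumerate(packets, start=1):
        print(i, playout_source(dp, dq, deadline))
```

In this made-up sequence the first two packets are played from stream p, the next two from stream q while path p is congested, and the fifth from stream p again.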



Yi Liang
Mon Mar 12 21:52:19 PST 2001