Key words: MOS, interference, BER, C/I, power control, DTX, frequency hopping, PESQ, PSQM/PSQM+, PAMS
Abstract: As radio networks develop, mobile operators are focusing more on end users’ experience than on key performance indicators (KPIs) alone. The improvement of the end users’ experience and the improvement of the network capacity are themselves regarded as KPIs. Therefore, Huawei must pay close attention to improving the soft capability of network quality as well as to fulfilling KPIs. At present, there are three methods of evaluating the speech quality: subjective evaluation, objective evaluation, and estimation. Among the three methods, objective evaluation is the most accurate. The PESQ algorithm defined by the ITU can objectively evaluate the speech quality of the communication network. This document uses the mean opinion score (MOS) to label the speech quality after objective evaluation.
This document describes the factors that affect the MOS, the impact of each factor on the MOS, and the methods of improving the network QoS and thereby the speech quality. It also describes the points to note when testing the speech quality of the existing network and the device capability values obtained in lab tests. In addition, this document introduces the differences between the speech test tools. The methods and principles of using the test tools are omitted. This document serves as a reference for the acceptance of network KPIs and for bidding.
|MOS||Mean Opinion Score|
|PESQ||Perceptual Evaluation of Speech Quality|
|PSQM||Perceptual Speech Quality Measurement|
|PAMS||Perceptual Analysis Measurement System|
1 Basic Principles of MOS
1.1 Subjective Speech Quality Evaluation
ITU-T Rec. P.830 defines a subjective method of evaluating speech quality, that is, the MOS. In this method, different persons subjectively compare the original speech materials with the system-processed speech materials and each gives an opinion score. The MOS is obtained by dividing the sum of the opinion scores by the number of persons. The MOS reflects the opinion of a person about the speech quality, so the MOS method is widely used. The MOS method uses an evaluation system of five quality grades, each quality grade mapping to a score. In the MOS method, dozens of persons are invited to listen in the same channel environment and to give a score. Then, a mean score is obtained through statistical treatment. The scores vary widely from listener to listener. Therefore, a large number of listeners, a large set of speech materials, and a fixed test environment are required to obtain an accurate result.
Note that the opinion of a listener about the speech quality is generally related to the listening effect of the listener. Therefore, the listening effect scale is introduced in this method. Table 1 describes the relations between the quality grade, score, and listening effect scale.
Table 1 Relations between the quality grade, score, and listening effect scale
|Quality Grade||Score||Listening Effect Scale|
|Very good||5||The listener can be totally relaxed without paying attention.|
|Good||4||The listener should pay some attention.|
|Average||3||The listener should pay close attention.|
|Poor||2||The listener should pay very close attention.|
|Very poor||1||The listener cannot understand even with very close attention.|
The formal subjective listening test is the most reliable evaluation method, and it can evaluate the network performance and any coding/decoding algorithm. The test result, however, varies from listener to listener. In addition, factors such as the listening environment, listeners, and speech materials must be strictly controlled during the test. As a result, this method consumes a lot of time and money. Therefore, several objective evaluation methods, such as PSQM, PESQ, and P.862.1, have been introduced. For details about the objective evaluation methods, see the next section.
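The averaging described in this section can be sketched as follows. This is a minimal illustration, assuming the grade-to-score mapping of Table 1; the function name and the panel of eight listeners are hypothetical and are not taken from any test campaign.

```python
from statistics import mean, stdev

# Five-grade scale from Table 1 (quality grade -> score).
GRADE_SCORE = {
    "Very good": 5,
    "Good": 4,
    "Average": 3,
    "Poor": 2,
    "Very poor": 1,
}

def mean_opinion_score(grades):
    """Return (MOS, sample standard deviation) for a list of quality grades."""
    scores = [GRADE_SCORE[g] for g in grades]
    # The spread shows how much listeners disagree; it needs >= 2 scores.
    spread = stdev(scores) if len(scores) > 1 else 0.0
    return mean(scores), spread

# Hypothetical panel: eight listeners rating one processed speech sample.
panel = ["Good", "Good", "Average", "Very good",
         "Good", "Average", "Poor", "Good"]
mos, spread = mean_opinion_score(panel)
print(f"MOS = {mos:.3f}, spread between listeners = {spread:.3f}")
```

A large spread relative to the mean is one sign that more listeners or more speech materials are needed before the MOS can be trusted.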
1.2 Objective Speech Quality Evaluation
1.2.1 PSQM (P.861) Recommendation or Algorithm
The perceptual speech quality measurement (PSQM) recommendation or algorithm introduces the system of five quality grades, with each grade further classified in the form of percentages through the %PoW (Percent Poor or Worse) and %GoB (Percent Good or Better) scales. Although the PSQM involves subclassification, it is still one of the subjective evaluation methods. In current practice, a computer generates a wave file, and the quality grade is derived from the changes in the wave file before and after network transmission to evaluate the speech quality. In 1996, the PSQM was accepted as Recommendation P.861 by the ITU-T. In 1998, an optional system based on measuring normalizing blocks (MNBs) was added to P.861 as an appendix.
1.2.2 PESQ (P.862) Recommendation or Algorithm
Jointly developed by British Telecom and KPN, the Perceptual Evaluation of Speech Quality (PESQ) was accepted as ITU-T Recommendation P.862 in 2001. The PESQ compares an original signal with a degraded signal and then provides an MOS. The MOS is similar to the result of a subjective listening test. The PESQ is an intrusive test algorithm. The algorithm is powerful enough to test both the performance of a network element (NE), such as a decoder, and the end-to-end speech quality. In addition, the algorithm can give test results by degradation cause, such as codec distortion, errors, packet loss, delay, jitter, and filtering. The PESQ is regarded as the industry’s best standard algorithm in commercial use.
For both the PSQM and the PAMS, a speech reference signal is transmitted on the telephone network. At the other end of the network, the reference signal and the received signal are compared through digital signal processing so that the speech quality of the network can be estimated. The PESQ incorporates the advantages of both the PSQM and the PAMS. It improves the handling of VoIP and hybrid end-to-end applications and modifies the MOS and MOS-LQ calculation methods. Initially, these methods were used to measure coding algorithms. Afterwards, they were also used to measure VoIP network systems.
1.2.3 P.862.1 Recommendation (Mapping Function for Transforming)
The perceptual evaluation of speech quality (PESQ) is a method of objectively evaluating the speech quality of the communication network. It is developed on the basis of the PSQM+ and PAMS. In February 2001, the PESQ was accepted as ITU-T Recommendation P.862. Afterwards, P.862.1 (mapping function for transforming) was added. P.862.1 is not an independent protocol; it is only a mapping applied to the P.862 score. P.862.1 simulates the human ear’s perception of speech more exactly than P.862. Therefore, P.862.1 is more comparable to a subjective listening test than P.862. The high scores obtained according to P.862.1 are higher than those obtained according to P.862. The low scores obtained according to P.862.1 are lower than those obtained according to P.862. The watershed is at a score of about 3.4. Therefore, according to P.862.1, the percentage of MOSs above 3.4 should be increased to enhance end users’ experience.
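The effect of the mapping can be illustrated with a short sketch. It assumes the logistic mapping function and coefficients commonly quoted for P.862.1 (0.999, 4.999, -1.4945, 4.6607); these values are not given in this document and should be checked against the Recommendation before use. The function name is hypothetical.

```python
import math

def pesq_to_mos_lqo(raw_score):
    """Map a raw P.862 score to a P.862.1 score (MOS-LQO).

    Assumed form: y = 0.999 + (4.999 - 0.999) / (1 + exp(-1.4945*x + 4.6607)).
    """
    return 0.999 + 4.0 / (1.0 + math.exp(-1.4945 * raw_score + 4.6607))

# Show how low scores are pushed down and high scores are pushed up.
for raw in (2.0, 3.0, 3.4, 4.0, 4.5):
    mapped = pesq_to_mos_lqo(raw)
    print(f"P.862 {raw:.1f} -> P.862.1 {mapped:.2f} (change {mapped - raw:+.2f})")
```

Under these assumed coefficients, the mapped score crosses the raw score slightly below 3.4, which is consistent with the watershed described above.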
1.2.4 P.563 Recommendation
The P.563 Recommendation was prepared by the ITU in May 2004. As a single-end objective measurement algorithm, P.563 can process only the received audio streams. The MOSs obtained according to P.563 are spread more widely than those obtained according to P.862. For an accurate result, several measurements should be performed and the scores should be averaged. This method is not applicable to individual calls. If it is used to measure the QoS of several calls, a reliable result can be obtained.
1.3 Speech Processing of Involved NEs
This section introduces the speech processing of all the involved network elements (NEs): MS, BTS, BSC, and UMG. Faulty speech processing of any one of the NEs will affect the speech quality.
Accordingly, four transmission procedures are involved in the transmission of speech signals: Um-interface transmission, Abis-interface transmission, Ater-interface transmission, and A-interface transmission. Faults in any one of the transmission procedures will lead to bit errors. Therefore, if a speech-related problem occurs, the four NEs and the four transmission procedures should all be checked.
If the problem occurs on the Um interface, the transmission quality on the Um interface should be optimized. If the problem occurs on the other interfaces, the fault should be located on the basis of the bit error rate (BER). The BSC6000 can perform BER detection.
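In principle, a BER figure is the share of received bits that differ from the transmitted bits. The sketch below shows only this calculation on a known test pattern; it is not the BSC6000 detection procedure, and the function name, the pattern, and the injected errors are all hypothetical.

```python
def bit_error_rate(sent: bytes, received: bytes) -> float:
    """Return the fraction of differing bits between two equal-length patterns."""
    if len(sent) != len(received):
        raise ValueError("patterns must have the same length")
    if not sent:
        raise ValueError("patterns must not be empty")
    # XOR leaves a 1 in every bit position where the two bytes differ.
    errors = sum(bin(a ^ b).count("1") for a, b in zip(sent, received))
    return errors / (8 * len(sent))

# Hypothetical test pattern of 8000 bits and a copy with three flipped bits.
pattern = bytes([0x55, 0xAA] * 500)
looped = bytearray(pattern)
looped[10] ^= 0x01    # one bit error
looped[700] ^= 0x18   # two bit errors
ber = bit_error_rate(pattern, bytes(looped))
print(f"BER = {ber:.2e}")
```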
The BSC modules other than the GTCS perform transparent transmission on the speech signals. Instead of participating in the speech coding/decoding, these modules are only responsible for the establishment of the speech channel, wiring, and speech connection. For the transparent transmission process, see the BSC6000 speech process figure.
FTC Processing on Speech
Coding/decoding is performed on the speech signals and rate adaptation is performed on the data signals so that the communication between a GSM subscriber and a PSTN subscriber is realized and the transparent transmission on the SS7 signaling over the A interface is implemented.
FTC Loopback
In a loopback, a message is transmitted by a transmission device or transmission channel and is then received by the same device or channel to check the health of the hardware and the settings of the software parameters. The FTC loopback is one of the most commonly used methods for locating transmission problems and for checking whether the settings of the trunk parameters are correct.
The UMG performs the coding/decoding conversion. Different coding/decoding algorithms have different impacts on the speech quality. The coding/decoding conversion is required if the communication is performed between different networks, if the MSs use different coding/decoding algorithms, or if the same coding/decoding algorithm is used at different rates. Generally, the UMG8900 uses codec cascading to perform speech conversions. As shown in Figure 8, codec A is cascaded with codec B. First, the compressed code stream is restored to the PCM linear code through the corresponding decoder. Then, the PCM linear code is encoded through another coding/decoding algorithm. Cascaded codecs involve many redundant operations, so the speech quality is degraded to some extent.
2 Factors That Affect the MOS in GSM
The MOS is affected by many factors, such as the background noise, silence suppression, low-rate coder, frame error rate, echo, and mobile station (MS). Here, the frame error rate pertains to the frame handling strategy (handling of frame loss during signaling transmission), frame stealing, bit errors, handover, and number of online subscribers (congestion degree). During speech propagation, several NEs participate in the speech handling: MS, BTS, TC, and MGW. The following paragraphs describe the impact of each NE on the speech quality.
2.1 Introduction to GSM Speech Acoustic Principles
In a radio network, the basic processing of speech data involves source sampling, source coding, framing, Um-interface radio transmission, internal NE processing, handover, terrestrial transmission, and source decoding at the receive end.
A fault in any segment of the speech transmission will result in bit errors, thus leading to poor speech quality.
For the wireless communication system, the speech quality is significantly affected by the Um interface, that is, the radio transmission part. An intrinsic characteristic of radio transmission is time-variant fading and interference. Even for a normally functioning network, the radio transmission characteristics change from time to time. A speech signal is transmitted to the BSS over the Um interface. Then, the signal is transmitted within the BSS through the standard and non-standard interfaces. This process requires the transmission lines to be stable and the port BER to be lower than the predefined threshold. If a transmission alarm is generated, the related speech transmission lines should be checked. If the speech quality is poor, a port BER test should be conducted.
2.2 Impact of Field Intensity and C/I on the Speech Quality
As described in 2.1, the radio transmission on the Um interface is subject to time-variant fading and interference and therefore has a great impact on the speech quality.
If the changes in the signal field intensity do not cause the BER/FER to be greater than zero, the RXQUAL remains zero. In this case, the speech quality is theoretically not affected. If the changes in the signal field intensity cause the BER/FER to be greater than zero (which is equivalent to some interference being present), the C/I and the field intensity have a great impact on the MOS.
Both the in-network interference and the out-network interference may affect the C/I and the receive quality and degrade the demodulation capability of the BTS. This leads to continuous bit errors and faulty parsing of speech frames. Thus, frame loss may occur, adversely affecting the speech quality.
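The relation between the BER and the RXQUAL level can be sketched as below. The BER boundaries (0.2%, 0.4%, 0.8%, 1.6%, 3.2%, 6.4%, 12.8%) are the ones generally quoted from the GSM radio subsystem link control specification; they are not listed in this document and are stated here as an assumption. The function name is hypothetical.

```python
import bisect

# Assumed upper BER bounds (as fractions) for RXQUAL 0..6.
# A BER at or above the last bound maps to RXQUAL 7.
RXQUAL_UPPER_BOUNDS = [0.002, 0.004, 0.008, 0.016, 0.032, 0.064, 0.128]

def rxqual_from_ber(ber):
    """Return the RXQUAL level (0 = best, 7 = worst) for a BER given as a fraction."""
    if not 0.0 <= ber <= 1.0:
        raise ValueError("BER must be between 0 and 1")
    # Count how many bounds the BER has reached or exceeded.
    return bisect.bisect_right(RXQUAL_UPPER_BOUNDS, ber)

for ber in (0.0, 0.001, 0.003, 0.05, 0.2):
    print(f"BER {ber:.3%} -> RXQUAL {rxqual_from_ber(ber)}")
```

Under these assumed boundaries, RXQUAL 0 also covers small non-zero BER values, so an RXQUAL of zero does not strictly guarantee an error-free channel.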
2.3 Impact of Handover on the Speech Quality
The GSM network uses hard handovers, so a handover from a source channel to a target channel definitely causes loss of downlink speech frames on the Abis interface. Therefore, audio discontinuity caused by handovers is inevitable during a call. Hence, the handover parameters should be properly set to avoid frequent handovers. In addition, the audio discontinuity caused by handovers should be minimized to improve the speech quality.
2.4 Impact of DTX on the Speech Quality
If the DTX is enabled for a radio network, comfort noise and voice activity detection (VAD) are introduced. Affected by the background noise and system noise, the VAD cannot be totally exact. This inevitably leads to some clipping of speech signals. Thus, the loss of speech frames and the distortion of speech may occur, and the speech quality and the MOS test result may be greatly affected. When the Comarco device calculates a speech score, it also collects statistics on the clipping. Generally, the clipping value has a positive correlation with the clipped portion of speech. Therefore, if an intrusive algorithm is used with the DTX enabled, the MOS is low.
2.5 Impact of Speed (Frequency Deviation) on the Speech Quality
Generally, at a speed of 200 km/h, the BER increases and the speech quality deteriorates because of multi-path interference. If the speed is increased to 400 to 500 km/h, a certain frequency deviation occurs in the signals received by the BTS from the MS because of the Doppler effect. The uplink and downlink frequency deviations may accumulate to 1,320 Hz to 1,650 Hz. Thus, the BTS cannot correctly decode the signals from the MS.
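The order of magnitude of these figures can be checked with the standard Doppler relation f_d = (v / c) · f_c. The sketch below assumes a carrier of about 1.8 GHz, which this document does not state, and assumes that the accumulated deviation is roughly twice the one-way shift because the MS locks to a shifted downlink and its uplink is shifted again. Function names are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def doppler_shift_hz(speed_kmh, carrier_hz):
    """Maximum one-way Doppler shift for a terminal moving along the radio path."""
    speed_ms = speed_kmh / 3.6
    return speed_ms / SPEED_OF_LIGHT * carrier_hz

def accumulated_deviation_hz(speed_kmh, carrier_hz):
    """Assumed worst case seen by the BTS: downlink shift plus uplink shift."""
    return 2.0 * doppler_shift_hz(speed_kmh, carrier_hz)

CARRIER_HZ = 1.8e9  # assumption: a carrier in the 1800 MHz band

for speed in (200, 400, 500):
    dev = accumulated_deviation_hz(speed, CARRIER_HZ)
    print(f"{speed} km/h -> about {dev:.0f} Hz accumulated deviation")
```

With this assumed carrier, the results for 400 km/h and 500 km/h fall close to the 1,320 Hz to 1,650 Hz range quoted above; at 900 MHz the deviation would be about half as large.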
With the development of high-speed railways and maglev trains, mobile operators pay increasing attention to the speech quality in high-speed scenarios. In 2007, the Dongguan Branch of China Mobile requested Huawei to optimize the speech quality for the railways in Dongguan under the coverage of Huawei equipment. After the optimization, the HQI (the percentage of measurement reports with quality levels 0-3 among all reports with quality levels 0-7) reached 97.2%, which matches the competitor’s level, and the highest HQI reached 98.5%. However, only 40% of the SQI values fall between 20 and 30, and another 40% fall between 16 and 20. This share of top-range SQI values is much lower than the roughly 90% observed at a low speed with the same speech quality. Therefore, high speed greatly affects the speech quality. Ensure that the speed is stable during acceptance tests or comparative tests.
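The HQI definition given above can be written as a short calculation. The sketch assumes that the input is a count of measurement reports per quality level; the function name and the sample counts are hypothetical and are not the Dongguan data.

```python
def hqi_percent(quality_counts):
    """Return the HQI: share of reports at quality levels 0-3 among levels 0-7.

    quality_counts: sequence of 8 report counts, index = quality level.
    """
    if len(quality_counts) != 8:
        raise ValueError("expected one count for each quality level 0..7")
    total = sum(quality_counts)
    if total == 0:
        raise ValueError("no measurement reports")
    return 100.0 * sum(quality_counts[:4]) / total

# Hypothetical distribution of 10,000 measurement reports.
counts = [5000, 2500, 1500, 600, 200, 100, 60, 40]
hqi = hqi_percent(counts)
print(f"HQI = {hqi:.1f}%")
```

Because the HQI only separates levels 0-3 from levels 4-7, two routes with the same HQI can still differ in SQI distribution, as the comparison above shows.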
2.6 Impact of Speech Coding Rate on the Speech Quality
The speech coding schemes are HR, FR, EFR, and AMR.
Each speech coding scheme maps to an MOS. Table 3 lists the mapping between the speech coding scheme and the MOS value.
Table 3 Mapping between the speech coding scheme and the MOS value
2.7 Impact of Transmission Quality on the Speech Quality
Generally, if the transmission quality is poor, the BER and the slip rate are high and the transmission is intermittent. The statistics on OBJTYPE LAPD cover the retransmission of LAPD signaling, LAPD bad frames, and overload. These counters are used to monitor the transmission quality on the Abis interface. If too many bad frames are generated or if signaling retransmission occurs frequently, the transmission quality is probably poor. In principle, poor transmission quality is equivalent to the loss of some speech frames. If speech frames are lost, the speech quality deteriorates greatly.