IEEE P1918.1.1


Haptic Codecs for the Tactile Internet Task Group
Proposal for Tactile Codec
TUM Vibrotactile Perceptual Codec based on DWT and SPIHT (TUM-VPC-DS)
DCN: HC_NGS_19-1-r0_Proposal_for_Tactile_Codec
Date: 2019-3-29
Abstract
This document describes a proposal for a tactile codec for the IEEE P1918.1.1 standardization
activity in response to the respective Call for Contributions. The proposed codec uses a
perceptual approach with a DWT and subsequent quantization. The quantizer is designed to be
adaptive considering a psychohaptic model. After quantization we further compress using the
SPIHT algorithm and generate the bitstream. The whole process is modular, hence the encoder
can work with any psychohaptic model. This allows for future enhancements.
Subclause 5.2.1 of the IEEE-SA Standards Board Bylaws states, "While participating in IEEE
standards development activities, all participants...shall act in accordance with all applicable laws
(nation-based and international), the IEEE Code of Ethics, and with IEEE Standards policies and
procedures."
The contributor acknowledges and accepts that this contribution is subject to
The IEEE Standards copyright policy as stated in the IEEE-SA Standards Board Bylaws,
section 7, http://standards.ieee.org/develop/policies/bylaws/sect6-7.html#7, and the
IEEE-SA Standards Board Operations Manual, section 6.1,
http://standards.ieee.org/develop/policies/opman/sect6.html
The IEEE Standards patent policy as stated in the IEEE-SA Standards Board Bylaws,
section 6, http://standards.ieee.org/develop/policies/bylaws/sect6-7.html#6, and the
IEEE-SA Standards Board Operations Manual, section 6.3,
http://standards.ieee.org/develop/policies/opman/sect6.html
1 Technical Description

The proposed compression scheme involves various operations in the encoder as depicted in
Figure 1.
Figure 1: Encoding structure of the proposed tactile codec.
The input signal is split into blocks. Each block is first decomposed by a Discrete Wavelet
Transform (DWT) using CDF 9/7 filters. At the same time, each block is transformed by a DFT.
The obtained spectrum is passed on to a psychohaptic model that computes masking and
perception thresholds for the corresponding block. Results from the psychohaptic model are used
to adapt the quantization and allocate bits to different DWT bands. After quantizing the wavelet
coefficients we further compress them with an adaptation of the SPIHT algorithm from [1]. The
compressed bitstream is then multiplexed with side information from the quantization and stored
or transmitted. In the following, we will explain all steps in more detail.
1.1 DWT
The DWT operates on the blocks by applying the CDF 9/7 filters. These filters are chosen as they
have a symmetric impulse response, which implies linear phase. Symmetric filters also permit a
symmetric signal extension at the block boundaries, so we obtain the same number of wavelet
coefficients in each block as we have input signal values. In addition, the CDF 9/7 filters are
almost orthogonal, meaning we can calculate signal energy values in the wavelet domain with
acceptable accuracy.
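As a minimal sketch of this step, the following Python code implements a CDF 9/7 DWT with the lifting scheme and mirrored block edges. The lifting coefficients are the values commonly published for this wavelet; the scaling constant K follows one of several conventions in use, and all function names are hypothetical. It is meant to illustrate that the number of coefficients equals the number of input samples and that the transform is invertible, not to reproduce the proposal's implementation.

```python
# Commonly published CDF 9/7 lifting coefficients; K is one scaling convention.
A1, A2, A3, A4 = -1.586134342, -0.05298011854, 0.8829110762, 0.4435068522
K = 1.149604398


def _predict(s, d, c, sign):
    # Correct odd samples from their two even neighbours; mirror at the right edge.
    for i in range(len(d)):
        right = s[i + 1] if i + 1 < len(s) else s[i]
        d[i] += sign * c * (s[i] + right)


def _update(s, d, c, sign):
    # Correct even samples from their two odd neighbours; mirror at the left edge.
    for i in range(len(s)):
        left = d[i - 1] if i > 0 else d[i]
        s[i] += sign * c * (left + d[i])


def analyze(x):
    """One DWT level on an even-length list: returns (approximation, detail)."""
    s, d = list(x[0::2]), list(x[1::2])
    _predict(s, d, A1, 1)
    _update(s, d, A2, 1)
    _predict(s, d, A3, 1)
    _update(s, d, A4, 1)
    return [K * v for v in s], [v / K for v in d]


def synthesize(s, d):
    """Inverse of analyze: undo the lifting steps in reverse order."""
    s, d = [v / K for v in s], [v * K for v in d]
    _update(s, d, A4, -1)
    _predict(s, d, A3, -1)
    _update(s, d, A2, -1)
    _predict(s, d, A1, -1)
    x = [0.0] * (2 * len(s))
    x[0::2], x[1::2] = s, d
    return x


def dwt(block, levels):
    """Returns the bands as [approx, detail_L, ..., detail_1]."""
    details, approx = [], list(block)
    for _ in range(levels):
        approx, detail = analyze(approx)
        details.insert(0, detail)
    return [approx] + details


def idwt(bands):
    approx = bands[0]
    for detail in bands[1:]:
        approx = synthesize(approx, detail)
    return approx
```

For a block of 512 samples and 7 levels, `dwt` returns 8 bands holding 512 coefficients in total, and `idwt` recovers the block up to floating-point rounding.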
1.2 Psychohaptic Model
The psychohaptic model plays a crucial role in the codec as it adapts the quantizer to introduce
distortion where it is least perceivable. We start by computing the FFT of an input block and
representing its magnitude in dB. We then determine the dominant peaks in the spectrum. It has been shown in [2] that for tonal
signals we observe masking phenomena. Thus, in a more complex signal we assume that masking
will occur as well, which means that dominant peaks will increase perception thresholds around
them. To model this we use quadratic spreading functions that constitute masking thresholds for
peaks at different frequencies. Then all masking thresholds are added together with the absolute
threshold of perception by power additive combination. This yields the so-called global masking
threshold. This process is illustrated in Figure 2.
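The power-additive combination can be sketched in Python as below. The parabola parameters `offset_db` and `curvature` are hypothetical placeholders, since the actual shape of the spreading functions is not specified here; only the combination rule (summing powers, then converting back to dB) follows the description above.

```python
import math


def masking_threshold_db(freq_hz, peak_freq_hz, peak_level_db,
                         offset_db=5.0, curvature=1e-3):
    """Hypothetical quadratic spreading function centred on a spectral peak."""
    return peak_level_db - offset_db - curvature * (freq_hz - peak_freq_hz) ** 2


def power_additive_db(levels_db):
    """Combine thresholds given in dB by summing their powers."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_db))


def global_masking_threshold_db(freqs_hz, peaks, absolute_threshold_db):
    """peaks: list of (frequency in Hz, level in dB).
    absolute_threshold_db: one value per entry of freqs_hz."""
    result = []
    for f, abs_db in zip(freqs_hz, absolute_threshold_db):
        masks = [masking_threshold_db(f, pf, pl) for pf, pl in peaks]
        result.append(power_additive_db(masks + [abs_db]))
    return result
```

Two equal thresholds combine to a level about 3 dB above either one, and the global threshold is never below the absolute threshold of perception.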
After obtaining the global masking threshold, we compute the so-called Signal-to-Mask-Ratio
(SMR) for each DWT band. That is, we divide the energy of the spectrum in each band by the
energy of the global masking threshold in the same band. The SMR values are passed on to
the quantizer together with the values of the signal energy in each band.
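A possible way to compute the SMR per band is sketched below. It assumes that the DWT bands map to dyadic ranges of DFT bins over the half-spectrum and that the SMR is expressed in dB; both the mapping and the function names are assumptions made for illustration.

```python
import math


def dyadic_band_edges(num_bins, levels):
    """Bin ranges of [approx, detail_L, ..., detail_1] over the half-spectrum."""
    first = num_bins >> levels
    if first < 1:
        raise ValueError("too many levels for this number of bins")
    edges = [0, first]
    while edges[-1] < num_bins:
        edges.append(edges[-1] * 2)
    return list(zip(edges[:-1], edges[1:]))


def smr_per_band_db(signal_db, mask_db, levels):
    """signal_db and mask_db hold one dB value per DFT bin of the half-spectrum."""
    smr = []
    for lo, hi in dyadic_band_edges(len(signal_db), levels):
        e_signal = sum(10.0 ** (v / 10.0) for v in signal_db[lo:hi])
        e_mask = sum(10.0 ** (v / 10.0) for v in mask_db[lo:hi])
        smr.append(10.0 * math.log10(e_signal / e_mask))
    return smr
```

With 256 bins and 7 levels this yields 8 bands of 2, 2, 4, 8, ..., 128 bins.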
1.3 Quantization
The quantizer is the core component of our codec. It allocates a certain bit budget to the different
DWT bands according to the psychohaptic model to reduce the rate considerably without
introducing any perceivable distortion.
To accomplish this task, the quantizer takes into account the values from the psychohaptic
model. In a loop, a total of n bits is distributed among the bands. We start with 0 bits allocated
to all bands. In every iteration we calculate the Signal-to-Noise-Ratio (SNR) in dB for each band,
using the signal energy values passed over by the psychohaptic model and the noise energy
introduced by the quantization. We then calculate the so-called Mask-to-Noise-Ratio
$\mathrm{MNR} = \mathrm{SNR} - \mathrm{SMR}$. Then we
allocate one bit to the band with the lowest MNR value and repeat until all n bits are allocated.
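The allocation loop can be sketched as follows. The sketch assumes that the Mask-to-Noise-Ratio is the band SNR minus the band SMR (both in dB), that the noise energy is measured by actually quantizing each band with a deadzone quantizer of interval `w_max / 2**bits`, and that ties are resolved in favour of the first band. The function names are hypothetical.

```python
import math


def quantize(w, delta):
    """Deadzone quantizer: round the magnitude down to a multiple of delta."""
    return math.copysign(math.floor(abs(w) / delta) * delta, w)


def band_snr_db(band, bits, w_max):
    delta = w_max / 2 ** bits
    signal = sum(w * w for w in band)
    noise = sum((w - quantize(w, delta)) ** 2 for w in band)
    if signal == 0.0 or noise == 0.0:
        return math.inf  # nothing left to improve in this band
    return 10.0 * math.log10(signal / noise)


def allocate_bits(bands, smr_db, w_max, total_bits):
    """Greedy allocation: one bit per iteration to the band with the lowest MNR."""
    bits = [0] * len(bands)
    for _ in range(total_bits):
        mnr = [band_snr_db(band, b, w_max) - smr
               for band, b, smr in zip(bands, bits, smr_db)]
        bits[mnr.index(min(mnr))] += 1
    return bits
```

Because the quantization interval is derived from the block maximum, a band with small coefficients needs several bits before its SNR rises above 0 dB, so it may receive more bits than a band with large coefficients at equal SMR.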
Since in general the bands will have a different number of quantizer bits, we design the quantizer
itself as an embedded deadzone quantizer adapted from [2]. We first calculate the maximum
wavelet coefficient for the current block, $w_{\max}$. This value is quantized to a fixed point number
with 3 integer bits and 4 fraction bits by a ceiling operation to receive $\hat{w}_{\max}$. The 7 bits
representing this maximum value are passed on to the bitstream encoding as side information. The
quantizer then takes the bits allocated to each band and this maximum value to determine the
quantization interval as
$\Delta = \hat{w}_{\max} / 2^{b}$,
where $b$ is the number of bits allocated to a particular band. The wavelet coefficients $w$ are then
quantized according to
$\hat{w} = \operatorname{sgn}(w) \lfloor |w| / \Delta \rfloor \Delta$.
Figure 2: Magnitude spectrum of an exemplary block (blue), computed masking thresholds (red), absolute threshold
of perception (green) and the resulting global masking threshold (black).
Thus, the wavelet coefficients are quantized to the original range. This formula also implies the
addition of one sign bit. After all bits have been allocated and therefore all wavelet coefficients
have been quantized, we scale all the quantized wavelet coefficients to integers.
These quantized integer wavelet coefficients are passed on to the SPIHT algorithm.
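A small sketch of the quantizer, under the assumption that the quantization interval is the fixed-point block maximum divided by two to the power of the allocated bits, and that magnitudes are rounded down to a multiple of that interval. The function names are hypothetical, and the final scaling to integers is left out.

```python
import math


def quantize_maximum(w_max, fraction_bits=4):
    """Round the block maximum up to a fixed-point value with 4 fraction bits."""
    scale = 2 ** fraction_bits
    return math.ceil(w_max * scale) / scale


def quantize_band(band, bits, w_max_q):
    """Deadzone quantization of one band with interval w_max_q / 2**bits."""
    delta = w_max_q / 2 ** bits
    return [math.copysign(math.floor(abs(w) / delta) * delta, w) for w in band]
```

For example, a block maximum of 1.3 is stored as 1.3125, and with 2 bits the interval becomes 0.328125, so the coefficient 0.7 is mapped to 0.65625 while -0.2 falls into the deadzone and becomes zero.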
1.4 SPIHT
In order to efficiently compress the quantized wavelet coefficients, we employ a 1D version of the
Set Partitioning in Hierarchical Trees (SPIHT) algorithm proposed in [1]. SPIHT is a zero-tree
based coding method that outperforms Embedded Zero-tree Wavelet (EZW) coding. It utilizes two
types of zero trees and encodes the significant coefficients and zero trees by successive
significance and refinement passes. The details of the algorithm for coding 2D wavelet
coefficients are provided in [1] and exemplified in [3]. We adapt the algorithm to the quantized
1D wavelet coefficients by constructing the parent-child relationship in only one dimension. The
output of the SPIHT module is a bitstream that losslessly encodes the quantized 1D wavelet
coefficients.
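To illustrate what a one-dimensional parent-child relationship may look like, the sketch below assumes the coefficients are stored as [approx, detail_L, ..., detail_1] with a single approximation coefficient at index 0, so that coefficient i (for i >= 1) has the offspring 2i and 2i + 1. This layout is an assumption for illustration; the exact tree used by the proposal is not specified here. The significance test is the one SPIHT applies to sets at each bit plane.

```python
def children(i, n):
    """Offspring of coefficient i in a length-n coefficient array."""
    return [c for c in (2 * i, 2 * i + 1) if 0 < c < n]


def descendants(i, n):
    """All coefficients below i in the tree, level by level."""
    out, frontier = [], children(i, n)
    while frontier:
        out.extend(frontier)
        frontier = [c for p in frontier for c in children(p, n)]
    return out


def is_significant(coeffs, indices, bitplane):
    """A set is significant if any magnitude reaches 2**bitplane."""
    return any(abs(coeffs[j]) >= (1 << bitplane) for j in indices)
```

A set of descendants that is insignificant at the current bit plane can be signalled with a single bit, which is where the compression gain of zero-tree coding comes from.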
1.5 Bitstream Encoding
In order for the decoder to be able to decompress the signals correctly, we need to pass some side
information in the bitstream. We therefore add a header in front of every compressed block. This
header consists of 32 bits for a 512 sample long block and codes the following information:
- 14 bits: Length of the following bitstream segment belonging to one block
- 2 bits: Coding of block length chosen from 64, 128, 256, and 512.
- 6 bits: Integer number coding the maximum number of bits allocated to the DWT bands
- 3 bits: Integer coding the level of the DWT
- 7 bits: Fixed point number with 3 integer and 4 fraction bits coding the maximum wavelet
coefficient value of the current block.
For smaller block lengths the length of the header can be reduced accordingly.
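The 32-bit header for a 512-sample block could be packed as sketched below. The field widths follow the list above; the field order, the encoding of the block length as an index, and the representation of the maximum as an integer count of sixteenths are assumptions made for this sketch.

```python
BLOCK_LENGTHS = (64, 128, 256, 512)


def pack_header(segment_bits, block_length, max_bits, dwt_level, w_max_q):
    """Pack the five header fields into one 32-bit integer (first field in the MSBs)."""
    fields = (
        (segment_bits, 14),
        (BLOCK_LENGTHS.index(block_length), 2),
        (max_bits, 6),
        (dwt_level, 3),
        (round(w_max_q * 16), 7),  # 3 integer + 4 fraction bits
    )
    header = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        header = (header << width) | value
    return header


def unpack_header(header):
    w_max_q = (header & 0x7F) / 16
    dwt_level = (header >> 7) & 0x7
    max_bits = (header >> 10) & 0x3F
    block_length = BLOCK_LENGTHS[(header >> 16) & 0x3]
    segment_bits = (header >> 18) & 0x3FFF
    return segment_bits, block_length, max_bits, dwt_level, w_max_q
```

The widths add up to 14 + 2 + 6 + 3 + 7 = 32 bits, and a DWT level of 7 is the largest value the 3-bit field can hold.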
1.6 Decoding
The decoder consists of only four operations. First, the blocks are separated out of the
bitstream, followed by an inverse SPIHT algorithm. Then we dequantize and apply an inverse DWT
to obtain the reconstructed signal suitable for playback.
2 Performance Evaluation
We aim to show the performance of our compression scheme by examining its rate-distortion
behavior. We use the provided test data set consisting of 280 vibrotactile signals recorded with an
accelerometer. The test dataset contains signals of various materials for different exploration
speeds. We compress the signals using a block length of 512 samples and a DWT of level 7.
All signals are encoded and decoded, and the resulting output is then compared to the original. We
vary the bit budget n of the quantizer between 8 and 112 bits to achieve different rates and therefore
quality levels. We define the compression ratio (CR) as the ratio between the original rate and the
compressed rate. Then, we compute MSE, SNR and PSNR for all 280 test signals for different CR values.
The respective scatter plots for all three metrics with averages are given in the following plots. In
blue are the scatter plots for all test signals at different rates and in red the average over all test
signals. It is clearly visible that the quality decreases with increasing compression. At a CR of 10
we have an SNR of about 10 dB and a PSNR of about 52 dB.
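For reference, the metrics can be computed as in the following sketch. The peak used for the PSNR is assumed to be the maximum absolute value of the original signal, since the exact definition is not stated here; the function names are hypothetical.

```python
import math


def compression_ratio(original_bits, compressed_bits):
    return original_bits / compressed_bits


def mse(original, reconstructed):
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def snr_db(original, reconstructed):
    signal_power = sum(a * a for a in original) / len(original)
    return 10.0 * math.log10(signal_power / mse(original, reconstructed))


def psnr_db(original, reconstructed):
    peak = max(abs(a) for a in original)  # assumed peak definition
    return 10.0 * math.log10(peak * peak / mse(original, reconstructed))
```

With these definitions, SNR and PSNR of one signal differ by a constant that depends only on the ratio of its squared peak to its mean power.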

Additionally, the results for different bit budgets are given in the following table.
Here we also computed the required runtime per block of our algorithm in MATLAB. Especially
for low rates, this time is low enough to allow for a real-time scenario. In this case we would
have to choose a significantly lower block length, since 512 samples already account for a
delay of about 180 ms. A block length of 64 samples would deliver 23 ms of delay at the cost of
slightly worse compression performance.
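The delay figures can be checked with a short calculation. The sampling rate is not stated in this section; the default of 2800 Hz below is an assumption inferred from the two quoted delays.

```python
def block_delay_ms(block_length, sampling_rate_hz=2800.0):
    """Buffering delay of one block; the 2800 Hz default is an inferred assumption."""
    return 1000.0 * block_length / sampling_rate_hz
```

Under this assumption a 512-sample block corresponds to roughly 183 ms and a 64-sample block to roughly 23 ms, to which the runtime per block has to be added.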
To assess the behavior of our algorithm in more detail, we examine individual signals in terms of
their PSNR over CR. The resulting plots are given in the following figures.
n CR MSE SNR (dB) PSNR (dB) Runtime per block (ms)
8 54.65 1.51 × 10FG 2.56 45.12 4.3
10 41.62 1.38 × 10FG 3.20 45.75 4.2
12 32.58 1.23 × 10FG 3.81 46.36 4.7
14 26.74 1.10 × 10FG 4.44 47.00 5.4
16 22.24 9.61 × 10FH 5.02 47.58 5.9
20 15.90 6.68 × 10FH 6.24 48.80 6.9
24 11.53 4.19 × 10FH 7.78 50.34 8.5
28 8.73 2.46 × 10FH 9.56 52.11 9.7
32 6.90 1.31 × 10FH 11.50 54.06 11.2
40 4.98 4.00 × 10FI 15.12 57.67 12.9
48 3.69 1.17 × 10FI 19.22 61.78 15.0
56 2.77 3.26 × 10FJ 24.77 67.33 17.1
64 2.29 8.32 × 10FK 30.65 73.20 18.7
80 1.78 5.93 × 10FO 42.55 85.11 20.6
96 1.47 9.28 × 10FP 54.41 96.97 23.4
112 1.26 6.21 × 10FP 66.38 108.94 26.0
128 1.10 6.03 × 10FP 78.26 120.81 28.4
Figure panels (one plot per signal):
'Direct_-_1spike_Probe_-_cork_-_slower.mat' (Signal #20)
'Direct_-_3x1spike_Probe_-_antiVibPad_-_fast.mat' (Signal #84)
'Direct_-_3x1spike_Probe_-_polyesterPad_-_slow.mat' (Signal #107)
'Direct_-_3x3small-round_Probe_-_felt_-_fast.mat' (Signal #175)
'Direct_-_big-round_Probe_-_foam_-_fast.mat' (Signal #255)
'Direct_-_big-round_Probe_-_foam_-_medium.mat' (Signal #256)
'Direct_-_big-round_Probe_-_foam_-_tooSlow.mat' (Signal #258)
'Direct_-_finger_Probe_-_foam_-_slower.mat' (Signal #274)
We see that the quality decreases for all signals as the compression ratio increases.
Lastly, we aim to exemplify how our method affects the signal shape. This will help
to gain some further intuition into how perceivable the introduced distortions are. We take the first
signal from the 8 examples before ('Direct_-_1spike_Probe_-_cork_-_slower.mat') and plot the
first 200 samples together with reconstructed signals for n = 8, 16, 32, 64. The results are given
in Figure 3. We can see that the general structure of the signal is preserved even for very high
levels of compression (n = 8 is equivalent here to CR ≈ 62). At n = 64 the two signals are so
close that we assume that no distortions should be perceivable. To assess the codec correctly in
terms of its transparency, we need to conduct extensive experiments and develop new metrics
based on human haptic perception.
3 Conclusion
We have presented a novel method to compress and encode 1D tactile signals. The rate-distortion
performance is good and the algorithm allows for offline and online encoding with the appropriate
choice of block length. The transparency of the codec should be evaluated in terms of subjective
experiments and newly developed perceptual metrics.
The presented codec works with any choice of perceptual model, which readily allows for future
enhancements as better psychohaptic models are being developed. In addition, it can fairly easily
be extended to higher dimensional signals to allow for more points of interaction.
Figure 3: First 200 samples of the signal 'Direct_-_1spike_Probe_-_cork_-_slower.mat' for various levels of
compression determined by bit budget n.
4 References
[1] A. Said and W. A. Pearlman, "A new, fast, and efficient image codec based on set partitioning
in hierarchical trees," IEEE Transactions on Circuits and Systems for Video Technology, vol.
6, no. 3, pp. 243-250, June 1996.
[2] R. Chaudhari, C. Schuwerk, M. Danaei and E. Steinbach, "Perceptual and Bitrate-Scalable
Coding of Haptic Surface Texture Signals," IEEE Journal of Selected Topics in Signal
Processing, vol. 9, no. 3, pp. 462-473, November 2014.
[3] D. S. Taubman and M. W. Marcellin, JPEG2000: Image compression fundamentals,
standards, and practice, Kluwer Academic, 2002.
Annex A: Information form for the submission of contributions
Name of Contribution: TUM Vibrotactile Perceptual Codec based on DWT and SPIHT
(TUM-VPC-DS)
Authors and Affiliation: Andreas Noll, Basak Gülecyüz, Eckehard Steinbach; Chair of
Media Technology, Technical University of Munich
Addressed Requirements and Test Conditions (see Section 4.2.1): Test condition 1: test data
traces
Summary of Proposal: The proposed codec uses a perceptual approach with a DWT and
subsequent quantization. The quantizer is designed to be adaptive considering a psychohaptic
model. After quantization we further compress using the SPIHT algorithm and generate the
bitstream. The whole process is modular, hence the encoder can work with any psychohaptic
model. This allows for future enhancements.
Comments on Relevance to CfC: Fully in line with the CfC
