The VLC and VLR Codecs and the Big Data

Abstract

With compatible data, our codecs can integrate efficiently in a Big Data architecture, with a reduction in data size for the storage and the transmission, and a reduction in the memory used for the processings if the latter use the frequency domain.

The Big Data

The Big Data is one of the hottest themes at the moment in computer science.
For the data that support lossy compression, the VLC and VLR codecs are compatible with Big Data algorithms such as MapReduce, storage systems like NoSQL, and can integrate perfectly in architectures like Hadoop or Spark.

For more information on these topics, see at the following addresses:
MapReduce
Hadoop
Spark
NoSQL



The VLC and VLR Codecs

In this document we will discuss the nearest neighbor search using methods called kNN (or k-NN, k-Nearest Neighbor) or ANN (Approximate Nearest neighbor).

We begin by briefly recalling here the main characteristics of the VLC and VLR codecs:

- Audio codecs based entirely on FFT (Fast Fourier Transform), using two planes: the foreground constituted of the largest points (or points of greater magnitude), and the background constituted of the most energetic bands (excluding the foreground points).
- One can use the greatest local peaks only, instead of the greatest points.
- The foreground consists of magnitudes and phases. One can also work directly with the amplitudes of the sine waves and the amplitudes of the cosine waves.
- The coding of the background is less precise than that of the foreground. The phases in the background can be reduced to only one bit of sign.
- In the background, decimations can be carried out which consist in taking the biggest point on N (N = 2, 3, 4, ...) and to memorize its local position.
- The background is particularly redundant, especially if you can increase the size of the FFT buffers (non-audio data). It is also possible to carry out additional lossless compressions on the background and increase the compression rates.
- Contrary to the background, the number of points in the foreground is not proportional to the size of the FFT buffers, but depends only on the nature of the signal.
- The magnitude or amplitude coding can be done with the logarithm or a faster binary transformation algorithm, on 8 bits or less.
- If you take a contiguous zone in the background, there is no need to calculate the energies in the bands and sort them. There is also no need to transmit their positions.

For further information see at the following addresses:
Algorithms
VLR++
VLB



The Use with the Big Data

Very large data are divided into constant-sized buffers.
Each buffer is compressed using the methods described above. These compressed buffers are sent to processing servers (local or remote servers).
The processing servers search in one or more databases for the closest vector of magnitudes and the closest vector of positions (or the closest vectors of amplitudes and positions).
The whole compressed frame is represented by an integer (or some integers). The databases are built using partitioning (data clustering) algorithms, such as the k-means algorithm, or the LGB algorithm, or an equivalent algorithm.

For additional information on the codebook version of the codecs (with the voice), see at the following address:
Codebook Version

From very large data, we find ourselves with a restricted set of integers, which can serve as a signature, for carrying out rapid operations of detection, alignment or classification.
In addition, for a quick visualization, or to obtain an approximate original, from an integer (or some integers), it is enough to query for the vectors of the magnitudes (or amplitudes) and positions, and then performing an iFFT (inverse FFT).
For a fast cross-correlation or a fast convolution, from an integer (or some integers), it is enough to query for the necessary vectors and to carry out a multiplication of vectors in the frequency domain, and then performing an iFFT (inverse FFT).
Finally, as in general the magnitudes of the points are negligible towards the higher frequencies, if one limits the frequency range, one can reduce considerably the amount of memory required for the storage of the vectors and for the treatments. Everything can happen with a reduced memory, even if the size of the FFT buffers choosen for the compression is very large.

Notes

- If the samplings are not regular, one must do interpolations to have regular samplings before FFT.
- We can look for several nearest neighbors (k-NN) instead of just one (1-NN). One can then easily process the classification problems (k-NN classification) and the regression problems (k-NN regression).
- LSH (Locality-Sensitive Hashing) is an effective algorithm for the nearest neighbors search.
For more information:
LSH

- The examples in this document assume that we just want to compress very strongly, for future uses, perform a filtering, or that we want to use some frames.
With the appropriate data, all operations that can be paralleled can be considered.
- The data models, which are very simple or very complex, but which tend to repeat themselves, are particularly well suited, since in the frequency domain, the majority of their energy is in harmonics, so in local peaks.
- With the local peaks alone, memory usage is particularly optimized for the processings taking place in the frequency domain (the values of the magnitudes or the amplitudes are contiguous, the frequencies are in the vectors of the positions).
- The methods described in this document are particularly well suited for the analysis of the immense masses of data coming from the Internet of Things (IoT).
- Due to the compression, it is often necessary to use a frame overlapping (50% or less) to ensure the continuity of the data after iFFT (inverse FFT). In this case, the database containing the codebook must also be constructed with a frame overlapping.