These notes are written for investors, developers and decision makers.
A codebook version is planned. The database still has to be built, and GPU (Graphics Processing Unit) acceleration support still has to be added.
The main features of this version are:
- The use of one or more databases of vectors containing the magnitudes and positions of the greatest local peaks. Each frame provides either a single vector of positions and magnitudes, or one vector of magnitudes and one vector of positions. The database is built with the k-means algorithm, the LBG (Linde-Buzo-Gray) algorithm or an equivalent algorithm. The local peaks must remain local peaks.
- The transmitter sends a simple index (a number) corresponding to the nearest vector in the database. A nearest neighbor search (kNN) is used to find this index.
- The database can be built with a given number of local peaks (e.g. 16, 24 or 32) and a given precision (e.g. 8 bits or even 16 bits). An index can then be determined, and a search performed, using only the first elements of the vector (e.g. the first 6 or 8 elements).
- Because the order of the first positions (frequencies) is very important, a weighting coefficient is applied to the first elements (e.g. the first 4 to 8 elements), both just before the database is loaded and in the search queries.
- The nearest neighbor search (kNN) is very fast with approximate methods (ANN), and it can be accelerated much further with GPU support, thanks to parallel programming.
- Completion, release of the source code and deployment are expected in 2017 or 2018.
- Bitrates below 1000 bps (VLC and VLR) and mean bitrates below 600 bps for VLR (excluding silent zones) can be expected.
- A patent application covering the algorithms used was filed in France (INPI).
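The codebook features listed above can be sketched as follows. This is a minimal illustration and not the codec's implementation: it uses plain k-means and an exhaustive search rather than an ANN or GPU method, and the function names, the weighting coefficient of 2.0 and the random training data are all assumptions made for the example. It also does not enforce the constraint that local peaks remain local peaks after averaging.

```python
import random

def weight(vec, n_weighted=4, coeff=2.0):
    """Scale the first n_weighted elements so they dominate the distance (coeff is hypothetical)."""
    return [v * coeff if i < n_weighted else v for i, v in enumerate(vec)]

def sq_dist(a, b, n_search=None):
    """Squared Euclidean distance over the first n_search elements (all elements if None)."""
    n = len(a) if n_search is None else n_search
    return sum((x - y) ** 2 for x, y in zip(a[:n], b[:n]))

def nearest_index(codebook, vec, n_search=None):
    """Exhaustive nearest neighbor search; the returned index is what would be transmitted."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], vec, n_search))

def build_codebook(training, k, iterations=20, seed=0):
    """Plain k-means (Lloyd iterations) over already weighted training vectors."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(training, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in training:
            clusters[nearest_index(codebook, v)].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the previous centroid when a cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook

# Hypothetical data: 200 frames of 16 values with 8-bit precision.
rng = random.Random(1)
training = [weight([rng.randint(0, 255) for _ in range(16)]) for _ in range(200)]
codebook = build_codebook(training, k=8)

# The same weighting is applied to the query; the search uses only the first 8 elements.
query = weight([rng.randint(0, 255) for _ in range(16)])
index = nearest_index(codebook, query, n_search=8)
decoded = codebook[index]  # the receiver looks the vector up from the index alone
```

With a codebook of 8 entries the index fits in 3 bits; a real database would be far larger, which is where approximate search and GPU acceleration become relevant.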
The compatibility of the codebook version with some stationary signals, such as the electrocardiogram (ECG), the arterial blood pressure (ABP) or the photoplethysmogram (PPG), also remains to be demonstrated.
If the results are acceptable, transmitting for example one compressed frame per second, or one compressed frame every two seconds, instead of the 25 or 31.25 frames per second used for voice, could bring bit rates below 32 bits per second per channel.
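The arithmetic behind this estimate is simply bits per frame multiplied by frames per second. The short calculation below is only an illustration: the 24-bit frame code is an assumption, derived from a 600 bps voice stream at 25 frames per second, and is not a figure stated for the stationary-signal case.

```python
def bitrate_bps(bits_per_frame, frames_per_second):
    """Bit rate of a stream that sends one fixed-size code per frame."""
    return bits_per_frame * frames_per_second

# Voice: 600 bps spread over 25 frames per second leaves 24 bits per frame on average.
voice_bits_per_frame = 600 / 25

# Stationary signals, reusing that 24-bit frame code (an assumption) at lower frame rates.
one_frame_per_second = bitrate_bps(24, 1.0)        # 24.0 bps
one_frame_per_two_seconds = bitrate_bps(24, 0.5)   # 12.0 bps
```

Under this assumption both rates stay below the 32 bits per second mentioned above; any frame code of up to 32 bits would do so at one frame per second.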
Each compressed frame contains the code of a vector of the magnitudes and positions of the greatest local peaks of the frame. These local peaks are the points of the spectral envelope of the frame. The codes can be used directly in database-driven abnormality detection algorithms. A single integer can represent a vector of the positions and/or the magnitudes of the local peaks, and can therefore serve as a signature for a frame.
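As an illustration of what such a vector contains, the sketch below picks the greatest local peaks of a magnitude spectrum. It is a simplified example, not the codec's peak picker: the function name, the strict-maximum test and the ordering by decreasing magnitude are assumptions.

```python
def greatest_local_peaks(magnitudes, n_peaks):
    """Return (positions, magnitudes) of the n_peaks largest strict local maxima,
    ordered by decreasing magnitude (ties keep the lower position first)."""
    peaks = [
        (magnitudes[i], i)
        for i in range(1, len(magnitudes) - 1)
        if magnitudes[i - 1] < magnitudes[i] > magnitudes[i + 1]
    ]
    peaks.sort(key=lambda p: -p[0])  # stable sort on magnitude only
    top = peaks[:n_peaks]
    return [pos for _, pos in top], [mag for mag, _ in top]

# Hypothetical 9-bin spectrum with local maxima at bins 1, 3, 5 and 7.
spectrum = [0, 3, 1, 7, 2, 5, 4, 9, 0]
positions, mags = greatest_local_peaks(spectrum, n_peaks=3)
# positions == [7, 3, 5], mags == [9, 7, 5]
```

The two lists can be sent as separate vectors or concatenated into one, matching the two layouts described for the database.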
Dynamic and local codebook:
A version using a dynamic and local codebook can also be proposed (searching for similar frames among the past frames, with a very large buffer). This codebook is contextual and requires memory and GPU programming (nearest neighbor searches and sorts).
Unlike the global codebook, it does not tolerate frame loss unless a resynchronization mechanism is implemented. Finally, this codebook cannot be shared on a server.
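One possible reading of the dynamic and local codebook is sketched below, under stated assumptions: both ends keep an identical buffer of past reconstructed frames, and the transmitter sends either a reference to a similar past frame or the frame itself. The class name, the packet layout and the distance threshold are hypothetical, and the search is a plain CPU scan rather than a GPU search.

```python
from collections import deque

def sq_dist(a, b):
    """Squared Euclidean distance between two frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

class LocalCodebook:
    """Sliding buffer of previously reconstructed frames, kept identical at both ends."""

    def __init__(self, capacity):
        self.frames = deque(maxlen=capacity)

    def encode(self, vec, max_dist):
        """Return ('ref', index) if a close enough past frame exists, else ('raw', vec)."""
        best = None
        if self.frames:
            best = min(range(len(self.frames)), key=lambda i: sq_dist(self.frames[i], vec))
        if best is not None and sq_dist(self.frames[best], vec) <= max_dist:
            packet = ("ref", best)
        else:
            packet = ("raw", list(vec))
        self.decode(packet)  # the encoder updates its buffer exactly as the decoder will
        return packet

    def decode(self, packet):
        """Rebuild the frame and append it to the buffer."""
        kind, payload = packet
        frame = list(self.frames[payload]) if kind == "ref" else list(payload)
        self.frames.append(frame)
        return frame

encoder, decoder = LocalCodebook(capacity=4), LocalCodebook(capacity=4)
frames = [[10, 20], [50, 60], [11, 20]]
packets = [encoder.encode(f, max_dist=4) for f in frames]
decoded = [decoder.decode(p) for p in packets]
# The third frame is sent as a reference to the first: packets[2] == ("ref", 0)
```

Because a reference is only meaningful against an identical buffer, a single lost packet shifts the decoder's buffer relative to the encoder's, which is why frame loss calls for resynchronization.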
Contacts and Comments: