Synthetic Hearing And Speech (SHAS)
What is Synthetic?
Synthetic means engineered (constructed by design), as opposed to evolved by trial and error (Darwinism). Even so, the processes that the software emulates were themselves evolved by nature and are governed by the laws of physics.
What is Hearing?
Hearing is the capture of sound samples and the parsing (decoding) of those samples into their most compact data representation.
What is Speaking?
Speaking is the generation (encoding) of sound samples from the most compact data representation of those sounds.
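The two definitions above can be illustrated with a toy round trip. This is only a minimal sketch under strong assumptions, not the SHAS method: the "sound" is a single pure tone, the compact representation is just three numbers (frequency, amplitude, duration), and the names SAMPLE_RATE, speak and hear are hypothetical, chosen for this illustration.

```python
import math

SAMPLE_RATE = 8000  # samples per second; an assumed value for this sketch


def speak(frequency_hz, amplitude, duration_s):
    """Encode: turn a three-number description into raw sound samples."""
    count = int(SAMPLE_RATE * duration_s)
    step = 2 * math.pi * frequency_hz / SAMPLE_RATE
    return [amplitude * math.sin(step * n) for n in range(count)]


def hear(samples):
    """Decode: recover the three-number description from raw samples."""
    duration_s = len(samples) / SAMPLE_RATE
    amplitude = max(abs(s) for s in samples)
    # Every full cycle crosses zero in the upward direction exactly once,
    # so counting those crossings gives a crude frequency estimate.
    rising = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return rising / duration_s, amplitude, duration_s


# Round trip: description -> samples -> description
samples = speak(440, 0.5, 1.0)
frequency_hz, amplitude, duration_s = hear(samples)
```

Here 8000 samples collapse back into three numbers. The zero-crossing estimate is deliberately crude (it can be off by about one cycle) and only works for a single clean tone; real recordings with several overlapping sources need far more than this.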
What is it used for?
SHAS is one of the two pillars on which DSS establishes a data link between the world we live in and computer software.
One could think of SHAS as being for the spoken word what ASCII is for the written word: a direct, bidirectional translation between sound shapes and data tokens.
How does it work?
It is no secret that every sound recording is a mix of multiple sound sources, each with characteristics that are limited and defined by the real-world laws of physics. For example, any sound that is not white noise (meaning it oscillates at stable frequencies) always settles into stable standing oscillations at frequencies that are integer multiples of the lowest (fundamental) frequency.
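The harmonic structure described above can be made visible with a small experiment. The sketch below is an illustration only, using a textbook discrete Fourier transform rather than anything specific to SHAS. It builds a tone from a 50 Hz fundamental plus two overtones and then lists the frequencies that carry energy. SAMPLE_RATE, harmonic_tone, spectrum and partials are hypothetical names, and the signal is exactly one second long so that every spectrum bin is 1 Hz wide.

```python
import cmath
import math

SAMPLE_RATE = 1000  # samples per second; assumed, kept low so the naive DFT stays fast


def harmonic_tone(fundamental_hz, amplitudes, duration_s=1.0):
    """Sum of sine partials at 1x, 2x, 3x, ... the fundamental frequency."""
    count = int(SAMPLE_RATE * duration_s)
    return [
        sum(a * math.sin(2 * math.pi * fundamental_hz * (h + 1) * n / SAMPLE_RATE)
            for h, a in enumerate(amplitudes))
        for n in range(count)
    ]


def spectrum(samples):
    """Naive DFT amplitudes for the lower half of the bins (slow, dependency-free)."""
    size = len(samples)
    twiddle = [cmath.exp(-2j * math.pi * i / size) for i in range(size)]
    amplitudes = []
    for k in range(size // 2):
        acc = sum(x * twiddle[(k * n) % size] for n, x in enumerate(samples))
        amplitudes.append(abs(acc) * 2 / size)
    return amplitudes


def partials(samples, threshold=0.05):
    """Return (frequency_hz, amplitude) for every bin above the threshold."""
    bin_width_hz = SAMPLE_RATE / len(samples)
    return [(k * bin_width_hz, round(a, 3))
            for k, a in enumerate(spectrum(samples)) if a > threshold]


tone = harmonic_tone(50, [1.0, 0.5, 0.25])
found = partials(tone)
fundamental_hz = found[0][0]
multiples = [freq / fundamental_hz for freq, _ in found]
```

For this synthetic tone the only bins with energy sit at whole-number multiples of the lowest one. Recorded sound is messier (the partials drift, decay and overlap), so this shows the principle rather than a working analyser.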
Knowing that (and using a few other tricks), we are able to isolate the characteristics of each separate sound source and describe them in only a few bytes.
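To give a feel for what "a few bytes" per sound source could look like, here is a hedged sketch. The byte layout is purely hypothetical (the actual SHAS representation is not specified here): a 16-bit fundamental frequency in Hz, an 8-bit partial count, and one 8-bit relative amplitude per partial. The function names pack_source and unpack_source are likewise made up for this example.

```python
import struct


def pack_source(fundamental_hz, amplitudes):
    """Pack one source description into 3 + len(amplitudes) bytes.

    Hypothetical layout (little-endian): uint16 fundamental in Hz,
    uint8 number of partials, then one uint8 amplitude level per partial.
    """
    levels = [min(255, max(0, round(a * 255))) for a in amplitudes]
    return struct.pack(f"<HB{len(levels)}B", fundamental_hz, len(levels), *levels)


def unpack_source(blob):
    """Inverse of pack_source; amplitudes come back quantized to 1/255 steps."""
    fundamental_hz, count = struct.unpack_from("<HB", blob, 0)
    levels = struct.unpack_from(f"<{count}B", blob, 3)
    return fundamental_hz, [level / 255 for level in levels]


blob = pack_source(220, [1.0, 0.5, 0.25])
restored_hz, restored_amplitudes = unpack_source(blob)
```

In this sketch a tone with three partials takes 6 bytes, whatever its duration. The price is quantization: the amplitudes are only restored to within 1/255 of their original value.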
The thing is, nobody has yet managed to get it to work...
The result will replace the existing DCT-based algorithms used by MP3 and the sound container formats derived from it.
Those formats account for more or less the entire worldwide sound media library to date.
Technology enabler?
SHAS is a technology enabler in that it makes possible things that cannot be done without information about the separate sound sources that were recorded together.
With perfect isolation it becomes possible to tell apart different people talking at the same time, to filter out background noise, and even to decode animal vocalizations that were assumed not to contain data.
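As a toy illustration of telling two simultaneous sources apart, the sketch below sorts measured partial frequencies by the fundamental they are a whole-number multiple of. It assumes the fundamentals are already known and the sources are perfectly harmonic, which is a large simplification. The function group_by_source and its tolerance_hz parameter are hypothetical and not part of SHAS.

```python
def group_by_source(partials_hz, fundamentals_hz, tolerance_hz=1.0):
    """Assign each partial to the first fundamental it is a multiple of."""
    groups = {f0: [] for f0 in fundamentals_hz}
    unassigned = []
    for freq in partials_hz:
        for f0 in fundamentals_hz:
            harmonic = round(freq / f0)  # nearest whole-number multiple
            if harmonic >= 1 and abs(freq - harmonic * f0) <= tolerance_hz:
                groups[f0].append(freq)
                break
        else:
            # Fits no known source: treat it as background noise.
            unassigned.append(freq)
    return groups, unassigned


# Two "voices" at 110 Hz and 150 Hz mixed together, plus one stray component.
mixed = [110, 150, 220, 300, 330, 450, 517]
groups, noise = group_by_source(mixed, [110, 150])
```

A partial that is a multiple of both fundamentals (3300 Hz in this example) simply goes to the first match, so a real separator would need more evidence than frequency alone, such as timing or amplitude behaviour.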
As a bonus, the resulting data contains references to established sound descriptors, making the resulting file format an order of magnitude more compact than existing MP3 variants.
SHAS is able to perform recognition tasks that previously required neural networks (NN), with the difference that neural networks were never able to properly show how they do what they have learned.