Dec 2000
The Goldmund/Job/Stellavox Fundamental R&D
Some preliminary precautions...
Fundamental Research has nothing to do with product development.
Since Goldmund came to Switzerland in 1980, we have clearly separated the
product development departments, which differ for each company because they
develop products for different fields, from Fundamental Research, which is
common to all our brands.
Fundamental Research has nothing to do with technology development.
We are constantly experimenting with several advanced technologies, such as
FireWire, wireless, voice activation, UPnP components (see our technical page),
and fast Internet and broadband links between sites for Audio and Video. We may
be among the very few who currently do so, but this is not Fundamental Research
either.
We also master the Computer Technologies that will be used extensively in the
future of Audio, and we are certainly the most advanced company in that field
among high-end Audio manufacturers, but this is not Fundamental Research.
Fundamental Research addresses the long-term development fields and, for that
reason, works on the most advanced technologies. There is usually no immediate
product application of the results, even if we hope that most of them will one
day be used in products developed by more than one of our brands.
Sound is Everywhere
- in the Hifi playing music.
- in the Radio playing music or speech.
- in the TV programs.
- At the Movies.
- in the Telephone.
- and more recently in the Computer field, as:
  - Background music (using the computer as a CD or DVD player, or downloading
    from the Internet, while programming or doing business work).
  - Game sound.
  - Speaking computers (very soon everywhere).
  - Voice Recognition.
  - Storage and distribution (MP3, AAC or other compression systems) through
    the Internet.
Sound is Vital
- It is half of human communication. In most of the applications listed above,
sound is at least half the message, the other half being the image.
Human perception can be split into two separate domains:
- The visual perception (the image)
- The audible perception (the sound).
These two domains are both necessary to rebuild the full perception. However,
for the accuracy of human perception, sound is even more important than the
image. Even a movie is more understandable without the image than without the
sound. Video conferencing is nothing without sound, but the telephone is often
sufficient for human communication.
- It is the emotional half. Sound is the part of the message carrying the most
emotion. Try a movie without sound, or a computer game without sound.
Conversely, radio and telephone prove that emotion can easily be carried by
sound alone.
Sound Quality is mandatory
- It guarantees the emotional content. Poor-quality sound is not impressive.
Top-quality sound induces greater emotional impact.
- Sound Quality is needed to help recognition of the original source.
Intelligibility is mandatory in movie sound reproduction because otherwise
dialogue would not be understood. Accuracy is mandatory for voice recognition.
- Sound quality is a guarantee for health and safety.
Noise and severe distortion (especially in the digital domain and time domain)
have a detrimental effect on the human nervous system. Just as today's computer
screens have to avoid emitting harmful radiation or magnetic fields, the best
sound systems have to avoid creating irritating noise and distortion.
At Goldmund, our work is to Master Sound Quality
Within a few years, most sound (music, movie sound or video conference sound)
will be coming from a computer. Here is a simple, basic possible configuration
(still quite accurate, even though we introduced it 3 years ago!).

Our fundamental research, for the time being (we may also extend some of our
Audio research to Video), covers the following 4 domains:
- The Digital Sound Signal Processing inside the computer or in DSPs.
- The A/D and D/A Audio converters (converting analogue signals to the digital
domain and back).
- The Power Amplifiers (powering the speakers with maximum accuracy).
- The Speaker Systems (to make the sound signal audible
to the human ear).
These 4 domains are fully explored in the following 3 principal directions:
- Space Recreation.
Stereo and multichannel formats (AC3, etc.) are meant to reproduce space in
sound imaging. We do extensive research on all of these ways to properly
recreate the Sound Imaging.
The most significant of the technologies we use or develop today are the Single
Stereo Speaker (the JOB Speaker), the Goldmund 100% software multichannel
decoder (AC3, Prologic, etc.), as well as some experimental 2/3/4-channel
formats with 3/4 speakers to recreate a more accurate spatial image in very
large rooms (where the single stereo speaker does not give sufficient image
width). The same program covers the holophonic recreation needed in virtual
reality to precisely locate a sound in space.
All these technologies involve a lot of digital signal processing.
- Intelligibility.
Everything (or nearly everything) in sound is based on signal transients, in
speech as well as in music. The capability of sound equipment to reproduce
credible sounds (which brings intelligibility) is directly related to its
capability to rebuild transients with extreme time accuracy. This means the
signal has to be treated by circuits that are both extremely fast and
time-accurate. Today, the time accuracy we reach in the lab (and in the "Alize"
D/A converters) is already better than 100 picoseconds, and the circuit
bandwidth necessary is around 10 MHz. These circuits already provide far better
intelligibility. But speed and time accuracy also have to be improved in
speakers and in digital domain signal processing.
The largest domain of application, and one of our main concerns, is voice
recognition, for which the inclusion of one of our circuits may immediately and
dramatically increase the recognition rate of existing software.
- Naturalness.
Sound naturalness is what produces recognition by resemblance to reality. It is
brought by a combination of several factors:
  - Time Accuracy in the reproduction. Time Distortion doesn't exist in mother
    nature and is literally impossible to avoid in electronic circuitry. The
    Human Brain is impossible to fool on this matter. It is probably the most
    important element of the Recognition Factor.
  - Low sound Coloration. Mechanical vibrations produced by speakers, or even
    by power amplifiers, coloration induced by non-linear electronic components
    or cabling, and the natural, mechanically related coloration of components
    (capacitors, coils but also semiconductors): all these phenomena hide the
    original timbre of the sound by modifying its harmonic/temporal
    distribution.
  - High sonic Dynamics (difference between high level and low level sound).
    Dynamics are needed to accurately reproduce the very strong impact of
    changes in sound level in music, noises or even human speech. Most of
    today's circuits are inherently limited in dynamics, and so are the usual
    speakers.
  - Low Distortion (especially those types not found in mother nature, like
    time distortion and intermodulation distortion). Distortion affects the
    tonal balance and the credibility of sound by adding spurious components
    that do not exist in mother nature, which confuses the brain.
  - Absence of sonic artifacts. It is quite easy to make most music and speech
    sound "nicer" by elaborate digital signal manipulation. But naturalness (as
    we want it) implies that no sound artifact of any kind is applied, so that
    the original sound carries all its emotional content. And the technology to
    avoid all artifacts is even more complex than the technology to produce
    them...
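To put the timing figures quoted under Intelligibility (100 picoseconds and a bandwidth of around 10 MHz) into perspective, the short Python sketch below applies two common textbook approximations. It is only an illustration under stated assumptions and does not describe the actual Goldmund circuits: the first-order rise-time rule, the reading of "time accuracy" as RMS sampling jitter, the 20 kHz test tone and both function names are assumptions introduced here.

```python
import math


def rise_time_from_bandwidth(bandwidth_hz: float) -> float:
    """10-90% rise time of a first-order low-pass: t_r ~= 0.35 / bandwidth.

    Assumption: the circuit behaves like a single-pole system.
    """
    return 0.35 / bandwidth_hz


def jitter_limited_snr_db(signal_hz: float, rms_jitter_s: float) -> float:
    """Approximate SNR ceiling for a full-scale sine sampled with random
    timing error: SNR = -20 * log10(2 * pi * f * sigma_t).

    Assumption: the timing error is uncorrelated RMS jitter.
    """
    return -20.0 * math.log10(2.0 * math.pi * signal_hz * rms_jitter_s)


# Figures taken from the text: ~10 MHz bandwidth, 100 ps time accuracy.
rise_ns = rise_time_from_bandwidth(10e6) * 1e9
snr_db = jitter_limited_snr_db(20e3, 100e-12)  # 20 kHz tone is an assumption

print(f"Rise time at 10 MHz: {rise_ns:.1f} ns")              # 35.0 ns
print(f"Jitter-limited SNR at 20 kHz: {snr_db:.1f} dB")      # 98.0 dB
```

Under these assumptions, a 10 MHz bandwidth corresponds to transients rebuilt in about 35 ns, and a timing error of 100 ps would leave a 20 kHz tone with a noise floor roughly 98 dB below full scale.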
Copyright © Job&Goldmund. All rights
reserved.