
MAKE SOMETHING THAT TALKS?
Modeling the Human Vocal Tract

[Figure: sound path of the vocal tract. The vocal folds act as a pulse source; formant cavity 1 and formant cavity 2 act as resonant filters; the lips, teeth, and tongue add noise and dynamics. Pitch, timing, and formant control signals drive these stages.]
Sound Path Functional Blocks

vowels: pulse source -> two resonant filters in parallel -> envelope (attack, decay)
consonants: noise source -> resonant filter -> envelope (attack, decay)

The combined outputs create a functional approximation of the human vocal tract!
Mixing the output of two filters approximates double-peak speech resonances.

[Figure: four spectra (dB vs. frequency), one each for the vowels “ee”, “ah”, “ih”, and “oo”.]

Spectra captured with Visual Analyzer software by Alfredo Accattatis http://www.sillanumsoft.org/
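The double-peak idea can be checked numerically. The following is a minimal sketch, not a model of the littleBits filter itself: it assumes each filter behaves like an idealized second-order band-pass resonance, and it borrows the “ee” formant values (400 Hz and 2500 Hz) from the Control Reference table later in this deck. The function names and the Q value of 8 are illustrative assumptions.

```python
import math

def resonance_gain(freq_hz, center_hz, q=8.0):
    """Magnitude of an idealized second-order band-pass resonance (1.0 at the peak)."""
    ratio = freq_hz / center_hz
    return 1.0 / math.sqrt(1.0 + (q * (ratio - 1.0 / ratio)) ** 2)

def mixed_spectrum_db(freqs_hz, f1_hz, f2_hz, q=8.0):
    """Sum the magnitudes of two resonances and convert to decibels."""
    # Summing magnitudes ignores phase; that is enough to show the two-peak shape.
    return [
        20.0 * math.log10(resonance_gain(f, f1_hz, q) + resonance_gain(f, f2_hz, q))
        for f in freqs_hz
    ]

def find_peaks(freqs_hz, levels_db):
    """Return the frequencies whose level is higher than both neighbors."""
    return [
        freqs_hz[i]
        for i in range(1, len(levels_db) - 1)
        if levels_db[i - 1] < levels_db[i] > levels_db[i + 1]
    ]

freqs = list(range(100, 4001, 50))          # 100 Hz to 4 kHz in 50 Hz steps
spectrum = mixed_spectrum_db(freqs, 400, 2500)
peaks = find_peaks(freqs, spectrum)
print(peaks)
```

Changing the two center frequencies moves the two peaks independently, which is what the filter “cutoff” control voltages do in the synth.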


What's in a Word? Phonemes!
“Grist” = 1 word
--- it has 1 syllable
--- but 5(!) phonemes: /g/ /r/ /ih/ /s/ /t/
We will be making syllables by combining phonemes. If we want to synthesize
“grist”, it will require that we synthesize 5 phonemes and link them together in time.

But that's pretty difficult.

Let's try to simplify. Our system will allow us to use 1- or 2-phoneme syllables,
as long as they start with a consonant. And let's use simple consonants too.
Things like “koo”, “bah”, or “toe”.

“Koo” has 2 phonemes: /k/ and /oo/. Here's a timing diagram of it:

/k/ /oo/

So we need to build a machine that first produces the consonant, then after
that produces the vowel. This is something we can handle. So let's discuss how
to build the control circuits of the synthesizer.
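The consonant-then-vowel timing can be written down as a tiny schedule. This is only a sketch: the function name and the durations (60 ms for the consonant, 300 ms for the vowel) are made-up values for illustration, not measurements from the synth.

```python
def syllable_timeline(consonant, vowel, consonant_ms=60, vowel_ms=300):
    """Return (phoneme, start_ms, end_ms) tuples: consonant first, then vowel."""
    timeline = []
    start = 0
    if consonant:  # a 1-phoneme syllable has no leading consonant
        timeline.append((consonant, start, start + consonant_ms))
        start += consonant_ms
    # The vowel always begins where the consonant (if any) ended.
    timeline.append((vowel, start, start + vowel_ms))
    return timeline

print(syllable_timeline("k", "oo"))
```

The control circuits described next do the same job in hardware: one clock starts the consonant, and a delayed clock starts the vowel.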
Spectral Control
Your brain controls the resonant frequencies of formants by
changing the shape of the formant cavities in your throat and mouth.
The littleBits Synth Kit controls the resonant frequencies of
voltage controlled filters by modulating the control voltage to the filter.
The microsequencer has 4 independent steps, each having
its own knob to set the control voltage for that step.

[Figure: control voltage (0V to 5V) at sequencer steps 1 to 4, and the corresponding filter peaks on a dB vs. frequency plot.]
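A toy model may make the sequencer's role clearer. This sketch assumes a linear knob, so 0% gives 0V and 100% gives 5V; the real module's response may differ, and the class and method names are hypothetical.

```python
class MicroSequencer:
    """Toy model of a 4-step sequencer: one knob (0 to 100%) per step."""

    def __init__(self, knob_percents, max_volts=5.0):
        if len(knob_percents) != 4:
            raise ValueError("expected one knob setting per step (4 steps)")
        self.knobs = list(knob_percents)
        self.max_volts = max_volts
        self.step = 0  # index of the currently active step

    def control_voltage(self):
        """Control voltage of the active step, assuming a linear knob."""
        return self.knobs[self.step] * self.max_volts / 100.0

    def clock(self):
        """Advance to the next step on a clock pulse, wrapping after step 4."""
        self.step = (self.step + 1) % 4
        return self.control_voltage()

seq = MicroSequencer([20, 60, 40, 100])
voltages = [seq.control_voltage()] + [seq.clock() for _ in range(4)]
print(voltages)
```

Each voltage would be fed to a filter's control input, so every step selects a different resonant frequency.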
Timing Control
This circuit uses a sawtooth (ramp) wave from an
oscillator and a few logic modules to
generate a normal positive-edge-triggered
clock and a delayed clock. The length
of the delay depends on the
frequency of the oscillator, so you will need
to experiment with different frequencies
until the delay sounds right.
[Figure: waveforms of the positive-edge-triggered clock and the delayed clock.]
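To see why the delay changes with oscillator frequency, here is a simplified sketch. It is not the inverter/NOR/XOR wiring used in this deck: it assumes that each clock simply goes high when the ramp crosses a threshold, with the vowel threshold set higher than the consonant threshold. Both threshold values are hypothetical.

```python
def ramp(t, freq_hz):
    """Sawtooth from 0.0 to 1.0 that restarts every 1/freq_hz seconds."""
    return (t * freq_hz) % 1.0

def clocks(t, freq_hz, consonant_level=0.1, vowel_level=0.4):
    """Return (consonant_clock, vowel_clock) logic levels at time t.

    Each clock goes high once the ramp crosses its threshold, so the
    vowel clock rises later in every cycle than the consonant clock.
    """
    level = ramp(t, freq_hz)
    return level >= consonant_level, level >= vowel_level

def delay_seconds(freq_hz, consonant_level=0.1, vowel_level=0.4):
    """Time between the two rising edges; it shrinks as frequency rises."""
    return (vowel_level - consonant_level) / freq_hz

for freq in (1.0, 2.0, 4.0):
    print(freq, "Hz ->", round(delay_seconds(freq) * 1000), "ms delay")
```

Because the delay is a fixed fraction of the ramp period in this model, doubling the oscillator frequency halves the delay, which matches the advice to experiment with different frequencies.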
Control Path Functional Blocks
timing: a ramp source drives the positive-edge-triggered clock and the delayed clock
spectra: three sets of variable control voltages (0V to 5V), one each to the consonant formant, vowel formant 1, and vowel formant 2

Functional approximation of the brain and nerve connections to the vocal tract!
Building the Synth
Tune All Three Filters:

1. Set filter “cutoff” to ~ 45%


2. Set filter “peak” to 100%
3. Set speaker “volume” to 20%
4. Connect the circuit below, and listen to the pitch produced.

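[Figure: filter “cutoff”, filter “peak”, and speaker “volume” knobs at 45%, 100%, and 20%, read at each knob's indicator line.]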
5. Now listen to each filter in succession, making small adjustments with the “cutoff” knob
until all three produce the same pitch.
6. Now don't change the “cutoff” of the filters ever again, unless you want to re-tune them.
7. Turn the “peak” control down (counterclockwise) until the filter just stops oscillating (until it
stops producing a tone).
Building the Synth
Build and test the Clock Generator:

1. Set oscillator mode to “saw”
2. Oscillator “tune” knob does not matter
3. Set oscillator “pitch” to about 10%
4. Connect the circuit below:
[Figure: clock generator circuit built from power, dimmer, oscillator, fork, inverter, NOR, XOR, and wire modules, with one LED on the consonant clock output and one LED on the delayed vowel clock output.]
5. Turn on the power and observe the LEDs
6. Adjust the dimmer and oscillator knobs to set the LEDs to slow flashing.
7. The delayed clock LED should stay mostly on, and turn off then back on quickly
8. The delayed vowel clock LED should turn on AFTER the consonant clock LED.
Building the Synth
Build and test the Consonant Generator:

1. Set microsequencer mode to “step”
2. Set all knobs on microsequencer to approximately 50%
3. Set random mode to “noise”
4. Set speaker volume to approximately 50%
5. Set dimmer to 0% (off)
6. Build the circuit pictured.
7. Turn the dimmer on slowly to advance the sequencer
8. Check that at each step, moving the knobs on the sequencer changes the sound
9. Remove the power, dimmer, and speaker modules when done.

[Figure: consonant generator circuit built from power, dimmer, micro sequencer, split, random, filter, envelope, and speaker modules.]


Building the Synth
Build and test the Vowel Generator:

1. Set microsequencer mode to “step” on all microsequencers
2. Set all knobs on all microsequencers to approximately 50%
3. Set number mode to “values”
4. Set oscillator mode to “saw”
5. Set oscillator “pitch” to 30%
6. Oscillator “tune” knob does not matter
7. Set speaker volume to approximately 50%
8. Set dimmer to 0% (off)
9. Set envelope “attack” to 0%, and “decay” to max
10. Build the circuit pictured
11. Turn the dimmer on slowly to advance the sequencers
12. Check that at each step, moving the knobs on the sequencers changes the values on the number modules and changes the sound
13. Remove the power, dimmer, and speaker modules when done.

[Figure: vowel generator circuit built from power, dimmer, oscillator, fork, micro sequencer, number, split, filter, mix, envelope, and speaker modules.]


The Complete Vocal Synth
Assemble the Components:
1. Connect the consonant generator to the clock generator
2. Connect the vowel generator to the clock generator
3. Connect a mix module to the outputs of the vowel and consonant generators
4. Connect the mix module to the speaker
5. Switch on power and turn up the dimmer
6. Verify that the micro sequencers step and are synchronized
7. Set mix knobs to max
8. Verify that you hear vowels and consonants.
Tuning the Synth
1. Set vowel mix knobs to max
2. Use dimmer to cycle to step 1
3. Use micro sequencer knobs to set filter values
4. Cycle to next step and set filter values
5. Repeat for up to 4 steps
6. Start continuous cycling with dimmer
7. Adjust envelope and mix settings for best sound. You will probably need to reduce the consonant volume. Often, increasing the vowel attack can improve quality.
8. Keep in mind that these settings are starting points! You will need to make Very Small Adjustments as you listen to the sounds. With time and practice, you will learn how to quickly achieve good results.

[Figure: the complete synth with its controls labeled: vowel-consonant mix, consonant envelope, vowel envelope, dimmer, and the HiFilt and LoFilt sequencers with steps 1 to 4.]
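The effect of the envelope's attack and decay settings can be sketched with a simple linear model. The slides give these settings only as knob percentages, so the millisecond values below are hypothetical, and the real envelope module may not be linear.

```python
def envelope_level(t_ms, attack_ms, decay_ms):
    """Level (0.0 to 1.0) of a linear attack/decay envelope at time t_ms."""
    if t_ms < 0:
        return 0.0
    if t_ms < attack_ms:
        return t_ms / attack_ms                      # rising during the attack
    if t_ms < attack_ms + decay_ms:
        return 1.0 - (t_ms - attack_ms) / decay_ms   # falling during the decay
    return 0.0

# Hypothetical timings: a short, sharp consonant and a slower, longer vowel.
times = list(range(0, 60, 10))
consonant = [round(envelope_level(t, 5, 40), 2) for t in times]
vowel = [round(envelope_level(t, 50, 400), 2) for t in times]
print(consonant)
print(vowel)
```

With a longer attack the vowel fades in gradually instead of starting at full volume, which is one way to read the advice in step 7 about increasing the vowel attack.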
Word 1: Cookie
Vowel envelope:
Attack: ~20%
Decay: max

Consonant envelope:
Attack: ~5%
Decay: ~10%

Sequencer Settings:
Step phoneme LoFilt HiFilt Cons.
-----------------------------------
1 "koo" 05 20 15
2 "kee" 10 85 15
3
4
Word 2: Barbecue
Vowel envelope:
Attack: ~20%
Decay: max

Consonant envelope:
Attack: ~5%
Decay: ~10%

Sequencer Settings:
Step phoneme LoFilt HiFilt Cons.
-----------------------------------
1 "bah" 30 60 5
2 "bee" 10 85 5
3 "koo" 05 20 15
4
A Few Enhancements
Adding delays in front of the filters can add motion to the vowels, which can be more life-like. Start with “delay” at min, and turn up the “feedback”.

Then, adding a dimmer in front of the vocal oscillator allows you to add inflection. Move the vocal dimmer in small amounts during speech to give expressive pitch changes to the word.
Word 3: Autobahn
Vowel envelope:
Attack: ~20%
Decay: max

Consonant envelope:
Attack: ~5%
Decay: ~10%

Sequencer Settings:
Step phoneme LoFilt HiFilt Cons.
-----------------------------------
1 "ah" 30 60 0
2 "toh" 15 40 25
3 "bah" 30 60 5
4 "n" 0 90 0

[Figure: the synth with the enhancements added; new labels are HiFilt Delay, LoFilt Delay, and Vocal Dimmer.]
Word 4: Robot
Vowel envelope:
Attack: ~20%
Decay: max

Consonant envelope:
Attack: ~5%
Decay: ~10%

Sequencer Settings:
Step phoneme LoFilt HiFilt Cons.
-----------------------------------
1 "roh" 15 40 0
2 "bah" 30 60 5
3 "t" 0 0 25
4

Control Reference
All filters tuned to ~800Hz (about 45%), peak set above 50%

Vowel envelope:
Attack: ~20%
Decay: max

Consonant envelope:
Attack: ~0%, varies
Decay: ~10%, varies

Vowel Formant Frequencies:
phoneme f1 f2 filt#1 filt#2
---------------------------------
"oh" 450 1000 15 40
"ah" 700 1300 30 60
"ee" 400 2500 10 85
"oo" 350 700 05 20
"ih" 350 2500 05 85
"eh" 750 2300 35 80
"uh" 420 1200 15 45
"er" 450 1400 20 55
"ll" 300 3000 00 90

Consonant Formant Frequencies:
cons. f1 Cons. Seq. Knob Setting
-----------------------------
w 290 12
y 260 07
r 310 15
l 310 15
f 340 17
v 220 05
s 320 15
Z 240 10
ch 350 19
jh 260 08
p 400 24
b 200 00
t 400 24
d 200 00
k 300 14
g 200 00
m 270 10
n 270 10

Adapted from Dennis H. Klatt p987
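The two reference tables can be combined into starting sequencer settings for a word. The sketch below copies the knob settings from the tables above; the function names are hypothetical, and it assumes that LoFilt corresponds to filt#1 and HiFilt to filt#2, which matches the “koo” row on the Word 1 slide.

```python
# Knob settings copied from the Control Reference tables (percent of knob travel).
VOWEL_KNOBS = {  # phoneme: (filt#1, filt#2)
    "oh": (15, 40), "ah": (30, 60), "ee": (10, 85), "oo": (5, 20),
    "ih": (5, 85), "eh": (35, 80), "uh": (15, 45), "er": (20, 55),
    "ll": (0, 90),
}
CONSONANT_KNOBS = {  # consonant: consonant sequencer knob
    "w": 12, "y": 7, "r": 15, "l": 15, "f": 17, "v": 5, "s": 15, "Z": 10,
    "ch": 19, "jh": 8, "p": 24, "b": 0, "t": 24, "d": 0, "k": 14,
    "g": 0, "m": 10, "n": 10,
}

def sequencer_row(consonant, vowel):
    """Return (LoFilt, HiFilt, Cons.) starting values for one syllable."""
    lo_filt, hi_filt = VOWEL_KNOBS[vowel]
    # A syllable without a leading consonant gets 0 in the consonant column.
    cons = CONSONANT_KNOBS[consonant] if consonant else 0
    return lo_filt, hi_filt, cons

def word_table(syllables):
    """Build up to four sequencer rows from (consonant, vowel) pairs."""
    if len(syllables) > 4:
        raise ValueError("the micro sequencer has only 4 steps")
    return [sequencer_row(c, v) for c, v in syllables]

for step, row in enumerate(word_table([("k", "oo"), ("k", "ee")]), start=1):
    print(step, row)
```

The word slides do not always match this lookup exactly. “Cookie” uses 15 in the consonant column where the table gives 14 for k, and “bah” uses 5 where the table gives 00 for b. That fits the advice that all of these settings are starting points to be adjusted by ear.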
Further Research Reference
Vocal Synthesis History:
Voder
Vocoder
Speak-N-Spell

Speech Science:
Homer Dudley (Voder, Vocoder)
Dennis H. Klatt (Rules based synthesis)
http://www.cs.indiana.edu/rhythmsp/ASA/Contents.html
http://dspace.mit.edu/handle/1721.1/29185 *good bibliography

Good search terms:
“vowel formants”
“consonant formants”
“speech synthesis”
“rules based speech synthesis”
“formant synthesis”
Next Steps 1:
The consonant block should really have two filters, just like the vowel block, but using noise instead of an oscillator for the source.

[Figure: littleBits wiring diagram of the synth, showing power, dimmer, oscillator, random, microsequencer, number, filter, delay, envelope, mix, and synth speaker modules, connected with wire, fork, split, inverter, NOR, and XOR modules.]
Next Steps 2:
Control and timing are difficult. Using a programmable controller would improve intelligibility. If we could program each syllable or word individually, all the detailed timings and filter movements would make speech more realistic. Our new Arduino module, for example, would be a perfect fit for this job.

[Figure: the full wiring diagram with a littleBits Arduino Module overlaid on part of it; the module's connections are labeled Power, Clock, CV1, and CV2.]
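To illustrate what programming each syllable individually could look like, here is a sketch of the data a programmable controller might step through. It is a plain simulation, not Arduino code. The Clock, CV1, and CV2 names come from the figure labels; everything else is an assumption, including the linear 0V to 5V mapping, the use of CV1 for the low filter and CV2 for the high filter, and the durations.

```python
from dataclasses import dataclass

@dataclass
class Syllable:
    name: str
    cv1_percent: float   # low formant filter setting (assumed to map to CV1)
    cv2_percent: float   # high formant filter setting (assumed to map to CV2)
    duration_ms: int     # hypothetical per-syllable timing

def to_volts(percent, max_volts=5.0):
    """Assumed linear mapping from a knob percentage to a 0V to 5V control voltage."""
    return percent * max_volts / 100.0

def build_program(syllables, clock_pulse_ms=10):
    """Expand syllables into (time_ms, clock, cv1_volts, cv2_volts) events."""
    events = []
    t = 0
    for s in syllables:
        cv1, cv2 = to_volts(s.cv1_percent), to_volts(s.cv2_percent)
        events.append((t, 1, cv1, cv2))                   # clock high starts the syllable
        events.append((t + clock_pulse_ms, 0, cv1, cv2))  # clock low, voltages held
        t += s.duration_ms
    return events

# "Cookie" filter values from the Word 1 slide; the durations are made up.
cookie = [Syllable("koo", 5, 20, 250), Syllable("kee", 10, 85, 350)]
program = build_program(cookie)
for event in program:
    print(event)
```

Unlike the single shared clock in the hardware version, each syllable here carries its own duration, which is the kind of per-syllable timing control this slide is asking for.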
Next Steps 2:
Here you see how you could use two littleBits Arduino modules to replace about 17 regular modules and get improved functionality.

[Figure: the reduced circuit, with two littleBits Arduino Modules (each labeled Power, Clock, CV1, and CV2) driving the filter, delay, envelope, mix, random, oscillator, and synth speaker modules.]