Using SAM (Voice Synthesizer)

Thomas Cherryhomes edited this page Sep 18, 2020 · 11 revisions

#FujiNet contains a built-in S.A.M. speech synthesizer, available on printer port P4:.

Open a channel to P4:

You can use it by opening a channel to it, e.g. in BASIC:

OPEN #1,8,0,"P4:"

Ask him to say Hello!

Print text to this channel, just as you would to a printer, and SAM will recite it through your audio speaker.

PRINT #1;"HELLO!"

Defaults

By default, SAM runs with the following parameters:

Setting Value
Phoneme Mode OFF
Sing Mode OFF
Speed 72
Pitch 64
Throat 128
Mouth 128

These can be restored at any time by sending a CTRL-R (ATASCII 18) as part of the print stream, e.g.

? #1;CHR$(18)

Commands

Six commands can be sent to SAM at any point in the data stream by sending a control character followed, if needed, by a parameter. The control character and the parameter must each be set off by whitespace.

The following set of control characters sets a very alien voice:
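A minimal sketch of one such voice, borrowing the Extra-Terrestrial values from the Examples table further down (speed 100, pitch 64, throat 150, mouth 200). It assumes that each control character is sent as its ATASCII code via CHR$ (CTRL-S is 19, CTRL-I is 9, CTRL-T is 20, CTRL-M is 13) and that a single space on either side of each value satisfies the whitespace rule above. The spoken text is only an illustration.

```basic
10 OPEN #1,8,0,"P4:"
20 REM CTRL-S=19 SPEED, CTRL-I=9 PITCH
30 PRINT #1;CHR$(19);" 100 ";CHR$(9);" 64 ";
40 REM CTRL-T=20 THROAT, CTRL-M=13 MOUTH
50 PRINT #1;CHR$(20);" 150 ";CHR$(13);" 200 ";
60 REM THE TEXT ENDS THE LINE, SO NO TRAILING SEMICOLON
70 PRINT #1;"HELLO!"
80 CLOSE #1
```

The trailing semicolons keep the settings and the text on one output line. Whether SAM also accepts the settings on a line of their own is not something this sketch relies on.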

Phoneme Mode

Using CTRL-P in a print stream sets phoneme mode. This gives much greater control over speech output: you specify the exact sounds to use for each utterance, and you can also specify the inflection of each utterance for more lifelike speech.

To get back to the default RECITER text to speech mode, use the RESET command (CTRL-R).

A list of phonemes, along with common words and their phonemes, is provided below.
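As a small sketch, the program below switches to phoneme mode and speaks the words "bad man" spelled with phonemes from the tables that follow (B, AE, D and M, AE, N). It assumes that CTRL-P is sent as CHR$(16), its ATASCII code, and that one space after it is enough to separate it from the phonemes. The spellings are built by hand from the tables, not taken from a tested word list.

```basic
10 OPEN #1,8,0,"P4:"
20 REM CTRL-P (ATASCII 16) ENTERS PHONEME MODE
30 REM BAED = B + AE + D, MAEN = M + AE + N
40 PRINT #1;CHR$(16);" BAED MAEN"
50 CLOSE #1
```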

Sing Mode

Using CTRL-G in a print stream turns on sing mode, which alters the transitions between phonemes and removes some of them to give the illusion of singing. It is best to experiment with this mode in combination with phoneme mode to understand what is possible. By default, sing mode is OFF.

Speed (1-256)

Using CTRL-S in a print stream sets the speed of successive utterances. Larger values mean slower speech.

Value Description
0-20 impractical
20-40 very fast
40-60 fast
60-70 fast conversational
70-75 normal conversational
75-90 narrative
90-100 slow
100-225 very slow

SAM's default value is 72.
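To hear how the ranges in the table compare, a short loop can step through several speeds. This is a sketch that assumes CTRL-S is sent as CHR$(19), followed by a space, the value, and another space before the text. Atari BASIC prints positive numbers without padding, so the spaces are written out explicitly.

```basic
10 OPEN #1,8,0,"P4:"
20 REM STEP THROUGH SPEEDS 40, 60, 80 AND 100
30 FOR S=40 TO 100 STEP 20
40 REM CTRL-S (ATASCII 19), SPACE, VALUE, SPACE, TEXT
50 PRINT #1;CHR$(19);" ";S;" THIS IS SPEED ";S
60 NEXT S
70 CLOSE #1
```

The same pattern with CHR$(9) (CTRL-I) in place of CHR$(19) should sweep the pitch instead, under the same assumptions.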

Pitch (1-256)

Using CTRL-I in a print stream sets the overall pitch of successive utterances. Larger values mean a lower overall pitch.

Value Description
0-20 impractical
20-30 very high
30-40 high
40-50 high normal
50-70 normal
70-80 low normal
80-90 low
90-255 very low

Mouth

Using CTRL-M in a print stream sets the mouth value, that is, the overall intensity of fricative and plosive transitions in each utterance. Higher values increase this intensity.

Throat

Using CTRL-T in a print stream sets the throat value, that is, the overall intensity of formants between transitions, most notably heard in vowel sounds. Higher values increase this intensity.

Examples

DESCRIPTION SPEED PITCH THROAT MOUTH
Elf 72 64 110 160
Little Robot 92 60 190 190
Stuffy Guy 82 72 110 105
Little Old Lady 82 32 145 145
Extra-Terrestrial 100 64 150 200
SAM 72 64 128 128
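The voices in this table can be auditioned one after another by keeping them in DATA statements. The program below is a sketch, not a tested recipe: it assumes the control codes CTRL-S=19, CTRL-I=9, CTRL-T=20 and CTRL-M=13 sent through CHR$, with a space around each value. The variable names and the spoken word are only illustrative.

```basic
10 DIM N$(20)
20 OPEN #1,8,0,"P4:"
30 FOR I=1 TO 6
40 READ N$,S,P,T,M
50 REM SHOW THE VOICE NAME ON SCREEN
60 PRINT N$
70 REM SPEED, PITCH, THROAT, MOUTH, THEN THE TEXT
80 PRINT #1;CHR$(19);" ";S;" ";CHR$(9);" ";P;" ";
90 PRINT #1;CHR$(20);" ";T;" ";CHR$(13);" ";M;" ";
100 PRINT #1;"HELLO"
110 NEXT I
120 CLOSE #1
130 DATA ELF,72,64,110,160
140 DATA LITTLE ROBOT,92,60,190,190
150 DATA STUFFY GUY,82,72,110,105
160 DATA LITTLE OLD LADY,82,32,145,145
170 DATA EXTRA-TERRESTRIAL,100,64,150,200
180 DATA SAM,72,64,128,128
```

Because the SAM row comes last, the loop finishes with the default values in effect.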

More About Phoneme Mode

S.A.M. is equipped with a version of the easy-to-learn, very readable International Phonetic Alphabet. There are about fifty phonemes which will let you spell all the words in English. Some sounds from foreign languages are not available in the system at this time.

Why use the phonetic system? There are two compelling reasons. 1.) In the phonetic system, all the words will be pronounced correctly; and 2.) You can put inflection into the speech however and wherever you want it.

If you have already tried the RECITER text-to-speech program, you know that it does a fair job of pronouncing English words. However, it does make mistakes. Some words sound a little strange and others are difficult to understand. The reasons for this are not hard to understand. English is a language of exceptions rather than rules; words that are spelled alike are pronounced differently ("have" vs. "gave"). A rule system like RECITER cannot pronounce all words correctly unless it stores an enormous dictionary that takes up vast amounts of memory. But the second flaw in text-to-speech conversion is more serious. Such a rule system cannot decide where the stress belongs in what is being said. The phonetic system in S.A.M., on the other hand, allows you to decide where to accent syllables within a word and where to stress words within a sentence.

So it is clear that the preferred way to make S.A.M. speak is with the phonetic alphabet. But how hard is it to use? It's really easier than writing in English because you don't have to know how to spell! You only have to know how to say the word in order to spell it phonetically.

Here is the complete list of phonemes, each presented with a sample word containing its sound. Note that there are many vowels, which is why they are all indicated by two letters rather than one.

The phonemes are classified into two categories: vowels and consonants. Among the vowels are the simple vowel sounds such as the "i" in "sit", the "o" in "slot", and the "a" in "hat". These vowels do not change their quality throughout their duration. There are also vowels called diphthongs such as the "i" in "site", the "o" in "slow", and the "a" in "hate", as well as the "oi" in "oil" and the "ow" in "how". These vowels start with one sound and end with another (e.g. "oi" glides from an "oh" sound to an "ee" sound).

The consonants are also divided into two groups: voiced and unvoiced. The voiced consonants require you to use your vocal cords to produce the sound. The "b", "l", "n", and "z" sounds fall into this category. The unvoiced consonants, on the other hand, are produced entirely by rushing air and include such sounds as the "p", "t", "h", and "sh" sounds.

Phonetic Alphabet

Vowels

PHONEME Example
IY feet
IH pin
EH beg
AE Sam
AA pot
AH budget
AO talk
OH cone
UH book
UX loot
ER bird
AX gallon
IX digit

Voiced Consonants

PHONEME Example
R red
L allow
W away
WH whale
Y you
M Sam
N man
NX song
B bad
D dog
G again
J judge
Z zoo
ZH pleasure
V seven
DH then