The Death of the Authors, 1945
Anne Frank for Public Domain Day 2017
Git Repository: https://gitlab.constantvzw.org/death-of-the-authors/1945-Anne_Frank
Sources of texts:
Olivier Ertzscheid http://affordance.typepad.com//mon_weblog/2016/01/anne-frank.html
https://scinfolex.com/2016/01/01/liberer-anne-frank-pour-le-jour-du-domaine-public/
TODO
- Check browser type and add warning if not Chrome or Firefox.
- add recording conditions Voxforge: http://www.voxforge.org/home/dev/mansegaudio
- document & add source code + link on http://publicdomainday.constantvzw.org
- add reference Séverine
- test sound micro on computer Steph - ok
- buy new battery microphone (An)
- finalize css/images http://publicdomainday.constantvzw.org (Femke)
VOXFORGE
To contribute to VoxForge, we have to make a post on the forum here: http://www.voxforge.org/home/audacity
Template: http://www.voxforge.org/home/audacity/audio-file-submission#rU976vlXOBuEIjX51SpZKA
Example: http://www.voxforge.org/home/audacity/audio-file-submission---spanish#7mzZmIMdnjf-nsu_LcS7FA
All ligatures (fi, fl, ffi) were dropped by the OCR
List of replacements:
" jn" = " fijn", " lm" = " film", " rma" = " firma", " ets" = " fiets", " a e" = " afle", " uister" = " fluister", "proe es" = "proefles", " its" = " flits", " ink " = " flink ", "a oopt" = "afloopt", " auw" = " flauw", " ge irt" = " geflirt", "a oop" = "afloop", "kof e" = "koffie", " anel" = "flanel", " guurlijk" = " figuurlijk", " uks" = " fluks", "zel ngenomen" = "zelfingenomen", " uweel" = " fluweel", " ltreren" = " filtreren", "ophef ng" = "opheffing", " irt" = " flirt", "Ongeloo ijk" = "Ongelooflijk", "Magni ek" = "Magnifiek", " nesse" = " finesse", "philoso e" = "philosofie", "biogra e" = "biografie", "saf aantjes" = "????" (saffietjes?, rolled cigarettes), "in atie" = "inflatie", "pam et" = "pamflet", "proe anding" = "proeflanding", "of cier" = "officier"
Other words to check: aflopen, floot/fluiten, enfin, fles(je)
Corrected version in TeX: https://github.com/skadge/diary-anne-frank/blob/master/anne-frank.tex
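The replacement list above can be applied automatically. What follows is only a minimal sketch in Python, using a subset of the list; the names REPLACEMENTS and repair_ligatures are made up for this example. Longer patterns are applied first, so that " ge irt" is repaired before the shorter " irt" can match inside it.

```python
# Subset of the replacement list above: OCR output -> corrected Dutch.
# Leading spaces are part of the pattern; they keep short keys such as
# " lm" from matching in the middle of an unrelated word.
REPLACEMENTS = {
    " lm": " film",
    " rma": " firma",
    " ets": " fiets",
    "proe es": "proefles",
    "a oopt": "afloopt",
    "a oop": "afloop",
    "kof e": "koffie",
    " ge irt": " geflirt",
    " irt": " flirt",
    "of cier": "officier",
    "in atie": "inflatie",
}


def repair_ligatures(text, replacements=REPLACEMENTS):
    """Put back the fi/fl/ffi ligatures that the OCR dropped."""
    # Longest pattern first, so " ge irt" wins over " irt".
    for broken in sorted(replacements, key=len, reverse=True):
        text = text.replace(broken, replacements[broken])
    return text


if __name__ == "__main__":
    print(repair_ligatures("Ik drink kof e na de lm."))
```

Plain string replacement can also hit correct words (" ets" would turn "etsen" into "fietsen"), so the output still needs to be checked against the corrected TeX version.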
Text-to-Speech tools:
- espeak
- espeak-ng
- festival
- say
- spd-say
- kdeaccessibility-jovie
- kdeaccessibility-kmouth
- Orca
- flite
- gespeaker (frontend for espeak)
- blather (python + gstreamer)
- epos
- marytts
- ivona
- mbrola
- mimic
- python-pyttsx
- python2-pyvona
- python-espeak
- svox-pico-bin (on Android phones)
- praat (nl)
Speech recognition tools:
- julius
- freespeech-vr-devel
- htk
- opensmile
- pocketsphinx
NLP tools:
- Frog
- python-frog
- python-speechrecognition (Google-powered)
- python2-gtts (interface to Google speech)
FESTIVAL
sudo apt-get install festival
Test your setup by typing in a Terminal
*festival
You will be presented with a > prompt. Type
*(SayText "Hello")
The computer should say "hello".
To listen to a text file named FILENAME, type
*(tts "FILENAME" nil)
Note FILENAME must be in quote marks.
--> not usable for Dutch
ESPEAK & mbrola -> will be the solution
sudo apt-get install espeak
espeak --stdout -f text.txt > text.wav
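For the Dutch recordings, the espeak call can be wrapped in a small script. This is a sketch under assumptions: the voice name "nl+f3" (Dutch with a female variant), the "+whisper" variant and the speed of 150 words per minute are example values, not settings recorded in these notes, and espeak_command and render are hypothetical helper names. The flags -v, -s, -f and -w are standard espeak options.

```python
import subprocess


def espeak_command(text_file, wav_file, voice="nl+f3", speed=150):
    """Build the argument list for an espeak run that writes a WAV file.

    voice and speed are assumed example values: "nl+f3" is Dutch with a
    female variant, "nl+whisper" would select the whispering variant.
    """
    return ["espeak", "-v", voice, "-s", str(speed), "-f", text_file, "-w", wav_file]


def render(text_file, wav_file, **options):
    # Runs espeak; raises CalledProcessError if espeak reports an error.
    subprocess.run(espeak_command(text_file, wav_file, **options), check=True)


if __name__ == "__main__":
    print(" ".join(espeak_command("text.txt", "text.wav")))
```

With the mbrola data installed, passing voice="mb-nl2" should select the mbrola Dutch voice instead; this was not verified here.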
SPHINX
Dutch model was very old, it has not been updated for 5 years. I've just uploaded a new model on cmusphinx website.
https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/Dutch/
It should be more accurate but still it is trained only with 13 hours of data. English models are trained with 1000+ hours. We need more transcribed Dutch data.
Someone made a language model for Dutch, published here:
http://www.voxforge.org/home
"VoxForge was set up to collect transcribed speech for use with Free and Open Source Speech Recognition Engines (on Linux, Windows and Mac).
We will make available all submitted audio files under the GPL license, and then 'compile' them into acoustic models for use with Open Source speech recognition engines such as CMU Sphinx, ISIP, Julius (github) and HTK (note: HTK has distribution restrictions)."
install Pocketsphinx
1. install Sphinxbase: https://github.com/cmusphinx/sphinxbase
add dependencies: libtool, swig
follow the instructions in the README:
./configure --enable-fixed, then make install (as root!)
2. install Pocketsphinx
3. test installation Pocketsphinx: pocketsphinx_continuous -inmic yes - ok
TODO NEXT: look at Gijs' Obamascript
--------------------------------------
OVERVIEW
1. corrected text ok
2. espeak in NL + female voice + recording ok
3. espeak in NL + female voice + whispering + recording ok -> the result cannot be understood without seeing the text
3.bis. record using mbrola voice with espeak
4. RESULT: we have wav file, to be passed in Pocketsphinx
*4.1. install Sphinxbase + Pocketsphinx (error)
*4.2. create /test the reuse of a Dutch language model : http://www.repository.voxforge1.org/downloads/Dutch/Trunk/AcousticModels/
*4.3. pass the wav, see what comes out (for the moment: accuracy 55%, error rate of 44%)
*4.4. possibly retrain the model with more data to optimise results
5. write contextualising introduction
6. lay-out text file + introduction
7. Print at least 2 books & upload PDF
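The notes do not say how the accuracy and error rate in step 4.3 were measured. One common measure is the word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the recognised text into the reference text, divided by the number of reference words. A minimal sketch in Python, with word_error_rate as a hypothetical helper name:

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference text is empty")
    # prev[j] holds the edit distance between the reference words seen
    # so far (one fewer than in curr) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("lieve kitty ik ben blij", "lieve kitty ik was blij"))
```

Because insertions count as errors, the WER can exceed 100% when the recogniser outputs many extra words.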
Options: espeak reading html & including breaks, silences....
* names/titles/days-dates/'Lieve Kitty'/'Je Anne' in 'highlighted slower voice'
* add phonemes for frequently used words that have bad accents now
* punctuation: replace by words so we can find it back?
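The punctuation option could be tried out as follows. This is a sketch only: the spoken tokens ("punt", "komma", ...) are an assumed vocabulary that the notes do not fix, and both function names are hypothetical. The first function prepares the text for espeak, the second puts the marks back into a transcript.

```python
import re

# Assumed spoken tokens for punctuation marks (Dutch).
PUNCTUATION_WORDS = {
    ".": "punt",
    ",": "komma",
    "?": "vraagteken",
    "!": "uitroepteken",
}
WORD_TO_MARK = {word: mark for mark, word in PUNCTUATION_WORDS.items()}


def spell_out_punctuation(text):
    """Replace each punctuation mark by its spoken token."""
    pattern = "[" + re.escape("".join(PUNCTUATION_WORDS)) + "]"
    spoken = re.sub(pattern, lambda m: " " + PUNCTUATION_WORDS[m.group()] + " ", text)
    return " ".join(spoken.split())  # collapse the extra spaces


def restore_punctuation(transcript):
    """Attach recognised tokens back to the preceding word as marks."""
    out = []
    for token in transcript.split():
        if token in WORD_TO_MARK and out:
            out[-1] += WORD_TO_MARK[token]
        else:
            out.append(token)
    return " ".join(out)


if __name__ == "__main__":
    spoken = spell_out_punctuation("Lieve Kitty, hoe gaat het?")
    print(spoken)
    print(restore_punctuation(spoken))
```

Note that "punt" is also an ordinary Dutch word, so restoring is ambiguous whenever the diary itself uses it; a less common token would avoid this.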
Read the book
duration: 7 hours of reading (170 fragments)
build Anne Frank's voice
website with 1 random fragment
anne.constantvzw.org
contact the VoxForge NL developer
http://cmusphinx.sourceforge.net/wiki/tutorialam
When you need to train
You want to create an acoustic model for new language/dialect
OR you need specialized model for small vocabulary application
AND you have plenty of data to train on:
1 hour of recording for command and control for single speaker
5 hour of recordings of 200 speakers for command and control for many speakers
10 hours of recordings for single speaker dictation
50 hours of recordings of 200 speakers for many speakers dictation
AND you have knowledge on phonetic structure of the language
AND you have time to train the model and optimize parameters (1 month)
When you don't need to train
You need to improve accuracy - do acoustic model adaptation instead
You don't have enough data - do acoustic model adaptation instead
You don't have enough time
You don't have enough experience
Mail to Kmaclean@voxforge.org
We would like to produce a series of sound files, read by at least 170 people, based on Anne Frank's diary.
It would be great to contribute them to Voxforge. Could you let us know what form you prefer for this?
Rather sentences? At random? Or fragments of the text?
For your information:
We are two artists from Brussels, working exclusively with Free Software and Free Licenses.
For the Belgian Day of the Public Domain we contribute with an installation based on Anne Frank's diary, in the series 'The Death of the Authors':
http://publicdomainday.constantvzw.org/
The first version of the diary is officially in the public domain, but it has been claimed back by the two foundations on the basis of specific copyright conventions. We would therefore like to invoke the right of quotation and present the book as a collection of speech fragments read by a large group of people, as such somehow reconstructing Anne Frank's voice through people who defend her work as a call for peace.
Ideally, in a second stage, we would like to use these recordings to create an Anne Frank language model for Speech-to-Text, but we are looking for a person who has the right skills for that or might be happy to include this in her/his research. Or who could guide us somehow.
Instructions for useful recordings - to mention on the homepage/recording:
http://www.voxforge.org/home/read/recording-how-to
In Dutch:
http://www.voxforge.org/nl/read