Friday 4th December 2015
Look further into text generators  (after Relearn + K. Goldsmith's Reading session) 
WTC25, 12-20h

Workshop ideas:

*Vladimir Propp - FSA model for tales -> translate into the Dada engine?
*Cognitive exhaustion through FreeLing. Detecting syntactic styles. What are the inner repetitions in a text? Finding out where the repetitions/patterns are.
*(Applied) Starting with neural networks: run them on a Raspberry Pi - what is a useful context?
*Something more theoretical on neural networks. Depending on whether somebody has time to prepare it. 
*Creating libraries - "the Obama Oxford Dictionary". Build libraries of 'wordclips' / dictionaries from certain movies. Rewrap content / text into movie vocabularies. Replay one movie in clips of the other. (a subtitle-index sketch follows this list)
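The subtitle-index sketch mentioned above, a possible starting point for the wordclips library, written against the SRT subtitle format ('obama_speech.srt' is a placeholder file name): index every word to the timecodes of the subtitle blocks it occurs in, then hand those timecodes to a video cutter.

    import re
    from collections import defaultdict

    # SRT blocks look like:
    #   12
    #   00:01:04,600 --> 00:01:07,200
    #   yes we can
    TIMECODE = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")

    def srt_word_index(path):
        """Map every word to the (start, end) timecodes it occurs in."""
        index = defaultdict(list)
        start = end = None
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                m = TIMECODE.match(line)
                if m:
                    start, end = m.groups()
                elif line and not line.isdigit() and start:
                    for word in re.findall(r"[a-zA-Z']+", line.lower()):
                        index[word].append((start, end))
        return index

    # index = srt_word_index("obama_speech.srt")  # placeholder file
    # index["change"] -> all clips containing the word 'change'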

Dates of confirmed workshops:

*23.02.2016 10:00, Gijs: Build libraries of 'wordclips' / dictionaries from certain movies. Rewrap content / text into movie vocabularies. Replay one movie in clips of the other.
*12.03.2016 10:00, Olivier: Addressing cognitive exhaustion through FreeLing. Detecting syntactic styles.
*23.04.2016 10:00, Ann: Vladimir Propp - FSA model for tales -> translate into the Dada engine (an FSA sketch follows this list)
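The FSA sketch mentioned above, a minimal take on the Propp idea using a hand-picked handful of his tale functions (the states, transitions and phrases are illustrative, not Propp's full 31-function scheme); the Dada engine would encode the same walk as recursive rewrite rules:

    import random

    # Toy finite-state automaton over a few of Propp's tale functions.
    # From each state we pick a random allowed successor until we reach
    # a state with no outgoing transitions (the tale ends).
    TRANSITIONS = {
        "initial": ["absentation", "interdiction"],
        "absentation": ["villainy"],
        "interdiction": ["violation"],
        "violation": ["villainy"],
        "villainy": ["departure"],
        "departure": ["struggle"],
        "struggle": ["victory"],
        "victory": ["return"],
        "return": [],  # accepting state
    }
    PHRASES = {
        "absentation": "A family member leaves home.",
        "interdiction": "The hero is warned not to do something.",
        "violation": "The warning is ignored.",
        "villainy": "The villain causes harm.",
        "departure": "The hero sets out.",
        "struggle": "Hero and villain meet in combat.",
        "victory": "The villain is defeated.",
        "return": "The hero comes home.",
    }

    def generate_tale(state="initial"):
        tale = []
        while TRANSITIONS[state]:
            state = random.choice(TRANSITIONS[state])
            tale.append(PHRASES[state])
        return " ".join(tale)

    print(generate_tale())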


Threads from before and now:

*Follow-up on Poet assistant (Mia - tbc - yes I will. http://pad.constantvzw.org/public_pad/Poetry%20generating%20softwares)
*Follow-up on Dada-generator - introduction to the model of FSA & finite-state transducers
*http://www.clips.uantwerpen.be/cl1516/cl-2015-2.pdf
*Interest in trying Google's TensorFlow http://www.wired.com/2015/11/google-open-sources-its-artificial-intelligence-engine and a more critical take: http://www.wired.com/2015/11/google-open-sourcing-tensorflow-shows-ais-future-is-data-not-code/
*automated novel generators: http://arcade.stanford.edu/blogs/nanogenmo-dada-20
*https://github.com/dariusk/NaNoGenMo-2015/issues
*Kenneth Goldsmith Uncreative Writing Session Notes: http://www.paramoulipist.be/algolit/Logbook_A4_all.pdf
*https://en.wikipedia.org/wiki/Philip_M._Parker
*model for algorithmic production of books: http://www.nytimes.com/2008/04/14/business/media/14link.html?pagewanted=all&_r=0


*Introduction to Automata Theory, Languages, and Computation (Hopcroft & Ullman)
*I put it online here:
*https://cp.sync.com/dl/f41ae9970#9p3tdeq8-2bbvzs5s-hqw5jiqv-z8z7erpa
*:)
*e.g. Click-o-Tron http://www.clickotron.com "clickbait articles generated by neural networks"
*http://larseidnes.com/2015/10/13/auto-generating-clickbait-with-recurrent-neural-networks/ completely computer-generated - a neural network based on words -> combined with preformatted models using linguistic structures -> looks for common word combinations, based on what is there (not on a 'history' or 'previous model'); a much simpler bigram sketch follows this list
*Possible to show different steps of neural network processing? Are there forms of illustrating/visualising the process? But that's not the same as understanding it.
*ref: Mario "gaining consciousness" with the help of a NN https://www.youtube.com/watch?v=qv6UVOQ0F44
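The bigram sketch mentioned above: a word-bigram Markov chain is the simplest 'common word combinations' model, a much cruder cousin of the recurrent network in the article. The headline corpus below is invented for the example:

    import random
    from collections import defaultdict

    def train(corpus):
        """Record, for every word, which words follow it in the corpus."""
        model = defaultdict(list)
        words = corpus.split()
        for a, b in zip(words, words[1:]):
            model[a].append(b)
        return model

    def generate(model, seed, length=12):
        out = [seed]
        for _ in range(length):
            followers = model.get(out[-1])
            if not followers:
                break
            out.append(random.choice(followers))
        return " ".join(out)

    # Toy corpus invented for the example:
    headlines = ("you will not believe what happens next "
                 "you will never guess what happens when "
                 "what happens when cats discover the internet")
    print(generate(train(headlines), "you"))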

*A nice introduction on neural networks: http://neuralnetworksanddeeplearning.com
*To get an idea what neural networks are doing: http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/
*TensorFlow is relatively easy to use in Python scripts, but it needs a basic understanding of neural networks to be able to make sense of it. The tutorials contain interesting information on how to use this, e.g. concerning text: http://www.tensorflow.org/tutorials/word2vec/index.html and http://www.tensorflow.org/tutorials/seq2seq/index.html (a hello-world sketch follows this list)
*Deep learning is a form of neural networks, but other mathematical models are used that improved results a lot. (Geoffrey Hinton: Boltzmann machines - https://www.youtube.com/watch?v=AyzOUbkUf3M)
*Recurrent Neural network package: https://github.com/karpathy/char-rnn
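The hello-world sketch mentioned above, written against the TensorFlow API as released in 2015: first a computation graph is defined, then it is run in a session. This two-step structure is what the word2vec and seq2seq tutorials build on.

    import tensorflow as tf

    # Step 1: define a computation graph (nothing runs yet).
    a = tf.constant([[1.0, 2.0]])
    w = tf.constant([[3.0], [4.0]])
    y = tf.matmul(a, w)

    # Step 2: run the graph in a session.
    with tf.Session() as sess:
        print(sess.run(y))  # -> [[ 11.]]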

Notes from today: 

About Distant Reading
http://www.nytimes.com/2011/06/26/books/review/the-mechanic-muse-what-is-distant-reading.html
Franco Moretti from Stanford Literary Lab 
https://litlab.stanford.edu/current-projects/
https://www.historypin.org/en/victorian-london/geo/51.5128,-0.116085,12/bounds/51.471972,-0.278477,51.553591,0.046307
& also Turing Love Letters ref / Love Letters Generator
http://elmcip.net/creative-work/muc-love-letter-generator
http://www.turing.org.uk/scrapbook/test.html
http://www.bbc.com/news/uk-england-manchester-12430551

A House of Dust
http://zachwhalen.net/pg/dust/
database of adjectives, verbs, nouns - simple replacements (a remake is sketched below)
performance
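A minimal remake of the generator's mechanism, assuming an illustrative selection of word lists (the original cycles through quadruples of material, location, light source and inhabitants):

    import random

    # Illustrative word lists standing in for the original quadruples.
    MATERIALS = ["dust", "sand", "glass", "paper", "leaves", "straw"]
    LOCATIONS = ["in a desert", "by the sea", "among high mountains",
                 "in a metropolis", "underwater"]
    LIGHT = ["candles", "natural light", "electricity"]
    INHABITANTS = ["people who sleep very little", "vegetarians",
                   "collectors of all types", "horses and birds"]

    def stanza():
        return "\n".join([
            "a house of " + random.choice(MATERIALS),
            "     " + random.choice(LOCATIONS),
            "          using " + random.choice(LIGHT),
            "                inhabited by " + random.choice(INHABITANTS),
        ])

    print(stanza())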

viral infection of a text by another text
-> gives the taste of the invader
http://nlp.lsi.upc.edu/freeling/

Presentation Mia
Follow-up on Poet assistant: http://pad.constantvzw.org/public_pad/Poetry%20generating%20softwares

curatorial role of the human
most use Markov chains or n-grams (cf. the bigram sketch above)
-> most interesting results?
-> less challenging?

why so early, so passionate, about making poetry?
why poetry? related to the Turing test
very similar results
reason for making it so different
cf. Turing Love Letters

first things to test as a meeting between humans and computers
Herman De Vries
http://www.hermandevries.org
related to the Zero movement / Holland
https://en.wikipedia.org/wiki/A_Million_Random_Digits_with_100,000_Normal_Deviates

Are these serious poetry generators?
Markov chain output looks like poetry

-> easy to make poetry, an open field for language: 'what is poetry'?

more complex approaches: the result might not be as interesting
'distant reading': analysing big data (18th-century literature), how it is written; next step: how to write
'not necessary to read books anymore'
http://www.newyorker.com/books/page-turner/an-attempt-to-discover-the-laws-of-literature
Franco Moretti

semantic approach
syntactic approach

Olivier:
PhD in Computational Biology
tools from computational linguistics to analyse the genome
the genome as a specific kind of text
find patterns
huge databases of text now / parts of which we don't know what they are
big corpus of texts, try to find patterns in it

overproduction, you can't consume it all anymore - what would you do with the texts then?
what are the side effects of an overload of poetry?
no validation system / Poetry Rank, as exists for the analogue poetry tradition
-> too easy? no authenticity?
plastics got there; 'exclusive plastic' now

Mia & Martino are setting up a local radio (illegal)
produce poetry for radio?
see algolit radio show @ Désert Numérique http://desert.numerique.free.fr//archives/?id=1011&ln=fr

Gijs: text generator built on a YouTube subtitling tool http://192.168.42.103/
Markov chain based on speeches by Obama
candidate speeches
president speeches
promises he couldn't keep
Relearn: someone trained a neural network on Shakespeare
wanted to do the same thing with Obama; this is a first sketch

recurrent neural network
based on text you're feeding it

https://github.com/karpathy/char-rnn
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
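For intuition, a character-trigram stand-in for what char-rnn does: it also continues text character by character based on what you feed it, though char-rnn's recurrent network captures much longer dependencies. 'speeches.txt' is a placeholder corpus file:

    import random
    from collections import defaultdict

    def train_chars(text, order=3):
        """For every 3-character context, record which character follows it."""
        model = defaultdict(list)
        for i in range(len(text) - order):
            model[text[i:i + order]].append(text[i + order])
        return model

    def sample(model, seed, n=300, order=3):
        out = seed
        for _ in range(n):
            followers = model.get(out[-order:])
            if not followers:
                break
            out += random.choice(followers)
        return out

    text = open("speeches.txt", encoding="utf-8").read()  # placeholder corpus
    model = train_chars(text)
    print(sample(model, text[:3]))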

recreate new speeches based on automatic subtitling
the human touches the robot
possible because Obama talks in a very systemized way!!! protocol-like
-> remake videos & messages, 1984
-> 'watch the library'
cf. Le Petit Journal (France)
patterns of speech, iterative aspects

FreeLing
Cognitive exhaustion
We all talk in templates
*Example: not talking in your mother tongue gives one a limited set of templates and forces one to recycle sooner.
Hypothesis: when we write a text or talk, we recycle structures. We have a very small number of structures when we talk.
Trying to detect such repetitions.
Searching for certain genes and structures.
But being able to detect them despite mutations.

This works on regular languages;
the Levenshtein distance can be computed efficiently by dynamic programming (a sketch follows)
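A minimal sketch of that dynamic-programming computation; tolerating insertions, deletions and substitutions is what makes detecting patterns 'despite mutations' affordable:

    def levenshtein(a, b):
        """Edit distance by dynamic programming, two rows at a time."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                # deletion
                                curr[j - 1] + 1,            # insertion
                                prev[j - 1] + (ca != cb)))  # substitution
            prev = curr
        return prev[-1]

    assert levenshtein("kitten", "sitting") == 3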

Relate through: translation to syntactic structure,
translating symbols to the function of the word (its part of speech).
What is the use?
*Detect styles on a syntactic level
*What are the inner loops in a text? Finding out where the repetitions/patterns are. (a tagging sketch follows below)
*Workshop
In the hope that it will teach us something. -> hope, mystery vs. knowledge
(The mythology of the algorithm)
create an instant linguistic expansion tool
-> texts that are simple in syntax, rich in semantics (e.g. a song), or vice versa
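A toy end-to-end sketch of the idea: FreeLing would do the real part-of-speech tagging, so the tiny tag dictionary below is only a stand-in that lets the repetition-finding step be shown:

    import re
    from collections import Counter

    # Toy tag dictionary standing in for FreeLing's real tagger.
    TOY_TAGS = {"the": "DET", "a": "DET", "cat": "N", "dog": "N",
                "bird": "N", "sees": "V", "eats": "V", "chases": "V",
                "small": "ADJ"}

    def tag(text):
        return [TOY_TAGS.get(w, "X") for w in re.findall(r"[a-z']+", text.lower())]

    def repeated_patterns(tags, n=3):
        """Count every length-n tag sequence; repeats are recycled structures."""
        grams = Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))
        return [(g, c) for g, c in grams.most_common() if c > 1]

    text = "the cat sees a bird. the dog chases a cat. a small dog eats."
    print(repeated_patterns(tag(text)))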


Semantic decontamination
As a way of getting people into detox. To enrich/enlarge their vocabulary.
Thesaurus Python script - searching synonyms - using them to rewrite (a WordNet sketch follows).
Also analogue: finding lists of words, with the goal to expand.
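One possible shape for the thesaurus script, assuming NLTK's WordNet corpus is available (nltk.download('wordnet')); it swaps each word for a randomly chosen synonym:

    import random
    from nltk.corpus import wordnet as wn  # needs nltk.download('wordnet')

    def expand(sentence):
        """Swap each word for a random WordNet synonym, if one exists."""
        out = []
        for word in sentence.split():
            lemmas = {l.name().replace("_", " ")
                      for s in wn.synsets(word) for l in s.lemmas()}
            lemmas.discard(word)
            out.append(random.choice(sorted(lemmas)) if lemmas else word)
        return " ".join(out)

    print(expand("the poet writes small strange machines"))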

A really, really applied workshop idea:
running neural networks
on a Raspberry Pi
from June onwards?

Make your wikis!
Wiki log-in for the hdsb wiki to create your own article for your Publish and Destroy projects:
User: hdsb
Password: raspberry

Here you find the links to your wiki articles:
    http://wiki.hackersanddesigners.nl/mediawiki/index.php/Published_and_Destroyed#.21.21BOOM.21.21
    
If you want to format your text, check out this tutorial about the wiki markup:
http://wiki.hackersanddesigners.nl/mediawiki/index.php/Wiki_Tutorial