Pad of next session: http://pad.constantvzw.org/public_pad/neural_networks_2
Programme:
12:30 Reading of tutorial?
13:00 Lunch
14:00
18:00 End
Some resources on Deep Learning and neural networks
Software for Deep Learning and neural networks
Tensorflow (Google): https://www.tensorflow.org/
-> a lot of tutorials, including on language processing: https://www.tensorflow.org/versions/r0.11/tutorials/index.html
Free course on Deep Learning (Tensorflow) by Google: https://www.udacity.com/course/deep-learning--ud730
*Their Jupyter-notebooks: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/udacity
Some links to how it is used: https://research.googleblog.com/2016/11/celebrating-tensorflows-first-year.html
Theano: http://deeplearning.net/software/theano/index.html / http://deeplearning.net/tutorial/intro.html
other: overview November 2015
http://venturebeat.com/2015/11/14/deep-learning-frameworks/?imm_mid=0dd6b0&cmp=em-data-na-na-newsltr_20151209
Microsoft Cognitive Toolkit
https://www.microsoft.com/en-us/research/product/cognitive-toolkit/
Free online course
Other source from Seda: http://course.fast.ai/
Deep Learning4J
https://deeplearning4j.org/neuralnet-overview.html
Free online books
Deep Learning - Ian Goodfellow and Yoshua Bengio and Aaron Courville
http://www.deeplearningbook.org/
(this book is a bit more in depth)
Neural Networks and Deep Learning - Michael Nielsen
http://neuralnetworksanddeeplearning.com
(this book is quite accessible)
Freed books on Tensorflow:
http://gen.lib.rus.ec/search.php?req=tensorflow&lg_topic=libgen&open=0&view=simple&res=25&phrase=1&column=def
Useful are Rodolfo Bonnin, Building Machine Learning Projects with TensorFlow, Packt Publishing (2016), and Sam Abrahams, Danijar Hafner, Erik Erwitt, Ariel Scarpinelli, TensorFlow for Machine Intelligence: A Hands-On Introduction to Learning Algorithms, Bleeding Edge Press (2016)
blogs and articles
The Unreasonable Effectiveness of Recurrent Neural Networks
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
Embed, encode, attend, predict: The new deep learning formula for state-of-the-art NLP models
https://explosion.ai/blog/deep-learning-formula-nlp?imm_mid=0eaadb&cmp=em-data-na-na-newsltr_20161116
Deep Visual-Semantic Alignments for Generating Image Descriptions
http://cs.stanford.edu/people/karpathy/deepimagesent/
Lecture by Nicolas Malevé on the architecture of TensorFlow compared to scikit-learn: http://sound.constantvzw.org/cqrrelations/afterlife/ogg/09_NicolasMalev%c3%a9_discussion.ogg.ogv
Simulation tools
Simulation of NN in browser: http://playground.tensorflow.org
Nice introduction using this tool: https://cloud.google.com/blog/big-data/2016/07/understanding-neural-networks-with-tensorflow-playground
More fundamental explanations, including what the hidden layers 'see':
http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/
Training process visualization
http://cs.stanford.edu/people/karpathy/convnetjs//demo/classify2d.html
Agenda
- experiment with TensorFlow using tutorials
- install machines (possibly in the afternoon)
Playground Tensorflow ( http://playground.tensorflow.org )
- visual tool to start to understand what Neural Nets do.
- while running it, the tool tries to find the difference between 'blue' and 'orange'
- aiming to solve a classification problem
- in mathematical terms, if you want to classify, you want to draw a line (a decision boundary) between 'blue' and 'orange'
- the features are functions that you can use to draw different types of lines
install Tensorflow
using pip:
https://www.tensorflow.org/versions/r0.7/get_started/os_setup.html
error on 32-bit systems:
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.1-cp27-none-linux_x86_64.whl
tensorflow-0.7.1-cp27-none-linux_x86_64.whl is not a supported wheel on this platform.
> The above error comes because of trying to install TensorFlow onto a 32 bit system. As you could observe, the wheel was linux_x86_64, which is intended to be installed on 64 bit. http://stackoverflow.com/questions/33637208/getting-tensorflow-is-not-a-supported-wheel-on-this-platform
> there is no 32-bit package prepared for Tensorflow
> to run it on 32-bit: install from source > https://www.tensorflow.org/versions/r0.11/get_started/os_setup.html#installing_from_sources
if installing through pip does not work:
use conda for a virtual environment (BSD License)
"Conda is a part of the Anaconda distribution." https://pypi.python.org/pypi/conda
you need to install Anaconda or Miniconda to work with conda ( http://conda.pydata.org/docs/download.html )
> pip install conda - https://pypi.python.org/pypi/conda
*or from source: http://conda.pydata.org/docs/download.html
> conda create --name snowflakes python=2.7
> source activate snowflakes (and to deactivate: source deactivate snowflakes)
> conda search tensorflow
> conda install tensorflow
When using an old Ubuntu (e.g. 14.04), you can get conflicts with an old version of numpy. Using virtualenv is a workaround.
mac users:
conda install -c conda-forge tensorflow=0.11.0rc2
(tested with python 2.7)
install through venv / Python2.7 worked as well
virtualenv
create
virtualenv path/to/foldername
activate
source path/to/foldername/bin/activate
. foldername/bin/activate
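Whichever route you take (pip, conda, or pip inside a virtualenv), a quick way to check that the install actually works is the classic hello-world session; a minimal sketch, assuming the 0.x-era Python API used in these notes:
$ python
>>> import tensorflow as tf
>>> print(tf.__version__)
>>> sess = tf.Session()
>>> print(sess.run(tf.constant('Hello, TensorFlow!')))   # should print: Hello, TensorFlow!
Hello, TensorFlow!
>>> sess.close()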
software as a service
Big companies like Google, Facebook, IBM and Microsoft offer neural network frameworks under an open source licence in order to create a community around their software: people invest in learning it by working with the software. Another important aspect is that there are currently many cloud services offering space to run your neural networks, since they use a lot of hardware power.
example scripts
training handwritten digits > in the getting started section
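For reference, a condensed sketch of that getting-started example (the MNIST softmax classifier for handwritten digits), assuming the TensorFlow 0.x API; the "MNIST_data/" path is just a local folder of our choosing:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# downloads the handwritten digits the first time it runs
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x  = tf.placeholder(tf.float32, [None, 784])   # 28x28 pixel images, flattened
W  = tf.Variable(tf.zeros([784, 10]))          # one weight column per digit class
b  = tf.Variable(tf.zeros([10]))
y  = tf.nn.softmax(tf.matmul(x, W) + b)        # predicted probabilities
y_ = tf.placeholder(tf.float32, [None, 10])    # true labels, one-hot encoded

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

sess = tf.Session()
sess.run(tf.initialize_all_variables())        # tf.global_variables_initializer() in later versions
for _ in range(1000):                          # train in batches of 100 images
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

correct = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))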
lexicon:
reinforcement learning / supervised learning
recurrent neural networks
"%" in python is called a 'modulo'
Experiments
An: looking into this: https://www.tensorflow.org/versions/r0.11/tutorials/word2vec/index.html
Word2vec: "Word2vec is a particularly computationally-efficient predictive model for learning word embeddings from raw text."
"why we would want to represent words as vectors."
"We look at the intuition behind the model"
"For tasks like object or speech recognition we know that all the information required to successfully perform the task is encoded in the data (because humans can perform these tasks from the raw data)." (?)
" natural language processing systems traditionally treat words as discrete atomic symbols, and therefore 'cat' may be represented as Id537 and 'dog' as Id143."
"These encodings are arbitrary, and provide no useful information to the system regarding the relationships that may exist between the individual symbols. This means that the model can leverage very little of what it has learned about 'cats' when it is processing data about 'dogs' (such that they are both animals, four-legged, pets, etc.). Representing words as unique, discrete ids furthermore leads to data sparsity, and usually means that we may need more data in order to successfully train statistical models. Using vector representations can overcome some of these obstacles." >>> but these words does not contain any of these specifics ... ???
"semantically similar words are mapped to nearby points" >>> what is semantically similar here? how is this decided? >>> "words that appear in the same contexts share semantic meaning" >>> to do this:
- count-based methods (e.g. Latent Semantic Analysis) >>> analysis
- or predictive methods (e.g. neural probabilistic language models). >>> versus prediction (based on ???)
Word2Vec comes in two flavors, the Continuous Bag-of-Words model (CBOW) and the Skip-Gram model
Neural probabilistic language models:
- maximum likelihood (ML) principle: maximize the probability of the next word (the "target") given the previous words (the "history"), expressed in terms of a softmax function (see the sketch after this list)
context: We first form a dataset of words and the contexts in which they appear. We could define 'context' in any way that makes sense
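A sketch of that softmax formulation, following the notation of the TensorFlow word2vec tutorial, where score(w_t, h) measures the compatibility of a candidate word w_t with the history h:
\[
P(w_t \mid h) = \mathrm{softmax}\big(\mathrm{score}(w_t, h)\big)
              = \frac{\exp\big(\mathrm{score}(w_t, h)\big)}{\sum_{w' \in V} \exp\big(\mathrm{score}(w', h)\big)}
\]
Training maximizes the log-likelihood \(\log P(w_t \mid h)\) over the corpus; word2vec avoids the expensive sum over the whole vocabulary V with tricks such as negative sampling (see the 'word2vec Explained' paper below).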
Building the Graph
"embedding" >>> representing a word through its surroundings by counting its 'window' (words that appear on the left, and words that appear on the right) either through analysis or prediction
to transform words into something computable, we need to turn them into numbers and use a structure to store them: the vector space
how are the words counted?
example sentence: the quick brown fox jumped over the lazy dog
> 8 unique words ('the' is used twice)
first:
> you set the dimension of the model with the embedding-size?? embedding-size is the taboo word of the day ;-)
> you set all your vectors at random
> and by training batch by batch, the nn updates the word-embeddings of your vocabulary (see the sketch below)
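A minimal sketch of those first steps with the example sentence; variable names like vocabulary_size and embedding_size follow the 'basic' script linked below, while the window of 1 word and the embedding size of 128 are just choices for the illustration (TensorFlow 0.x API):
import tensorflow as tf

sentence = "the quick brown fox jumped over the lazy dog".split()
vocabulary = sorted(set(sentence))                 # the 8 unique words
word_to_id = {w: i for i, w in enumerate(vocabulary)}

# skip-gram pairs (target, context), taking 1 word on each side as context
pairs = []
for i, word in enumerate(sentence):
    for j in (i - 1, i + 1):
        if 0 <= j < len(sentence):
            pairs.append((word_to_id[word], word_to_id[sentence[j]]))

vocabulary_size = len(vocabulary)                  # 8
embedding_size = 128                               # the dimension you choose for the model
# every word starts as a random vector; training then nudges these values batch by batch
embeddings = tf.Variable(tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))

sess = tf.Session()
sess.run(tf.initialize_all_variables())
print(pairs[:5])                                   # the first few (target id, context id) pairs
print(sess.run(embeddings).shape)                  # (8, 128)
sess.close()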
two references that explain word2vec:
- https://arxiv.org/pdf/1402.3722v1.pdf word2vec Explained: Deriving Mikolov et al.'s Negative-Sampling Word-Embedding Method, February 14, 2014
- https://www.youtube.com/watch?v=wTp3P2UnTfQ MLMU.cz - Radim Řehůřek - Word2vec & friends (7.1.2015)
This 'basic' script shows the different steps and generates a plot at the end, showing the distances between words
https://github.com/tensorflow/tensorflow/blob/r0.11/tensorflow/examples/tutorials/word2vec/word2vec_basic.py
questions:
- is the position of the datapoints in the plot meaningful? how?
Code & small movie is here:
www.algolit.net/neural_networks_tensorflow