Pad of next session: http://pad.constantvzw.org/public_pad/neural_networks_2

Programme:
12:30 Reading of tutorial?
13:00 Lunch
14:00
18:00 End

Some resources on Deep Learning and neural networks

Software for Deep Learning and neural networks
Tensorflow (Google): https://www.tensorflow.org/ -> a lot of tutorials, including on language processing: https://www.tensorflow.org/versions/r0.11/tutorials/index.html
Free course on Deep Learning (Tensorflow) by Google: https://www.udacity.com/course/deep-learning--ud730
*Their Jupyter notebooks: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/udacity
Some links to how it is used: https://research.googleblog.com/2016/11/celebrating-tensorflows-first-year.html
Theano: http://deeplearning.net/software/theano/index.html / http://deeplearning.net/tutorial/intro.html
Other: overview of frameworks, November 2015: http://venturebeat.com/2015/11/14/deep-learning-frameworks/?imm_mid=0dd6b0&cmp=em-data-na-na-newsltr_20151209
Microsoft Cognitive Toolkit: https://www.microsoft.com/en-us/research/product/cognitive-toolkit/
Deep Learning4J: https://deeplearning4j.org/neuralnet-overview.html

Free online course
Other source from Seda: http://course.fast.ai/

Free online books
Deep Learning - Ian Goodfellow, Yoshua Bengio and Aaron Courville: http://www.deeplearningbook.org/ (this book is a bit more in depth)
Neural Networks and Deep Learning - Michael Nielsen: http://neuralnetworksanddeeplearning.com (this book is quite accessible)
Freed books on Tensorflow: http://gen.lib.rus.ec/search.php?req=tensorflow&lg_topic=libgen&open=0&view=simple&res=25&phrase=1&column=def
Useful are Rodolfo Bonnin, Building Machine Learning Projects with TensorFlow, Packt Publishing (2016) and Sam Abrahams, Danijar Hafner, Erik Erwitt, Ariel Scarpinelli, TensorFlow for Machine Intelligence: A Hands-On Introduction to Learning Algorithms, Bleeding Edge Press (2016)

Blogs and articles
The Unreasonable Effectiveness of Recurrent Neural Networks: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
Embed, encode, attend, predict: The new deep learning formula for state-of-the-art NLP models: https://explosion.ai/blog/deep-learning-formula-nlp?imm_mid=0eaadb&cmp=em-data-na-na-newsltr_20161116
Deep Visual-Semantic Alignments for Generating Image Descriptions: http://cs.stanford.edu/people/karpathy/deepimagesent/
Lecture by Nicolas Malevé on the architecture of Tensorflow compared to Scikit Learn: http://sound.constantvzw.org/cqrrelations/afterlife/ogg/09_NicolasMalev%c3%a9_discussion.ogg.ogv

Simulation tools
Simulation of a NN in the browser: http://playground.tensorflow.org
Nice introduction using this tool: https://cloud.google.com/blog/big-data/2016/07/understanding-neural-networks-with-tensorflow-playground
More fundamental explanations, including what the hidden layers 'see': http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/
Training process visualization: http://cs.stanford.edu/people/karpathy/convnetjs//demo/classify2d.html

Agenda
- experiment with Tensorflow using the tutorials
- install machines (possibly in the afternoon)

Playground Tensorflow ( http://playground.tensorflow.org ) - a visual tool to start to understand what neural nets do.
- while running it, the tool tries to find the difference between 'blue' and 'orange'
- it is aiming to solve a classification problem
- in mathematical terms: if you want to classify, you want to draw a line between 'blue' and 'orange'
- the features are functions that you can use in order to draw different types of lines
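To make the 'drawing a line' idea concrete, here is a minimal sketch (not part of the workshop material; it uses plain numpy rather than the playground): it generates two clouds of 2D points and trains a logistic regression, i.e. it learns the line that separates 'orange' from 'blue'.

import numpy as np

rng = np.random.RandomState(0)

# two classes of points: 'orange' scattered around (-2, -2), 'blue' around (+2, +2)
orange = rng.randn(100, 2) - 2.0
blue = rng.randn(100, 2) + 2.0
X = np.vstack([orange, blue])          # the features here are simply the x and y coordinates
y = np.array([0] * 100 + [1] * 100)    # labels: 0 = orange, 1 = blue

# logistic regression: probability of 'blue' = sigmoid(w . x + b)
w = np.zeros(2)
b = 0.0
learning_rate = 0.1

for step in range(1000):
    p = 1.0 / (1.0 + np.exp(-(X.dot(w) + b)))     # predicted probability of 'blue'
    error = p - y                                 # gradient of the cross-entropy loss
    w -= learning_rate * X.T.dot(error) / len(y)
    b -= learning_rate * error.mean()

accuracy = ((p > 0.5) == y).mean()
print(accuracy)    # close to 1.0 for these well-separated clouds
# the learned boundary is the line w[0]*x + w[1]*y + b = 0;
# adding a hidden layer (as in the playground) lets the network draw curved boundaries instead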
Install Tensorflow using pip: https://www.tensorflow.org/versions/r0.7/get_started/os_setup.html

Error on 32-bit systems:
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.1-cp27-none-linux_x86_64.whl
tensorflow-0.7.1-cp27-none-linux_x86_64.whl is not a supported wheel on this platform.
> The above error comes from trying to install TensorFlow on a 32-bit system. As you can see, the wheel is linux_x86_64, which is intended for 64-bit systems. http://stackoverflow.com/questions/33637208/getting-tensorflow-is-not-a-supported-wheel-on-this-platform
> there is no 32-bit package prepared for Tensorflow
> to run it on 32-bit: install from source
> https://www.tensorflow.org/versions/r0.11/get_started/os_setup.html#installing_from_sources

If installing through pip does not work: use conda for a virtual environment (BSD License)
"Conda is a part of the Anaconda distribution." https://pypi.python.org/pypi/conda
You need to install Anaconda or Miniconda (or purchase Anaconda) to work with conda ( http://conda.pydata.org/docs/download.html )
> pip install conda - https://pypi.python.org/pypi/conda
*or from source: http://conda.pydata.org/docs/download.html
> conda create --name snowflakes python=2.7
> source activate snowflakes (and to deactivate: source deactivate snowflakes)
> conda search tensorflow
> conda install tensorflow
When using an old Ubuntu (e.g. 14.04), you can get conflicts with an old version of numpy. Using virtualenv is a workaround.
Mac users: conda install -c conda-forge tensorflow=0.11.0rc2 (tested with Python 2.7)

Installing through virtualenv / Python 2.7 worked as well:
> create: virtualenv path/to/foldername
> activate: source path/to/foldername/bin/activate (or: . foldername/bin/activate)
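Once one of the installs succeeds, a quick way to check it from Python (a minimal sketch, assuming the 0.x session-based API that the tutorials above use):

import tensorflow as tf

hello = tf.constant('Hello, TensorFlow!')
a = tf.constant(10)
b = tf.constant(32)

sess = tf.Session()      # in the 0.x API all computation runs inside a session
print(sess.run(hello))   # Hello, TensorFlow!
print(sess.run(a + b))   # 42
sess.close()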
"These encodings are arbitrary, and provide no useful information to the system regarding the relationships that may exist between the individual symbols. This means that the model can leverage very little of what it has learned about 'cats' when it is processing data about 'dogs' (such that they are both animals, four-legged, pets, etc.). Representing words as unique, discrete ids furthermore leads to data sparsity, and usually means that we may need more data in order to successfully train statistical models. Using vector representations can overcome some of these obstacles." >>> but these words does not contain any of these specifics ... ??? "semantically similar words are mapped to nearby points" >>> what is semantically similar here? how is this decided? >>> "words that appear in the same contexts share semantic meaning" >>> to do this: - count-based methods (e.g. Latent Semantic Analysis) >>> analysis - or predictive methods (e.g. neural probabilistic language models). >>> versus prediction (based on ???) Word2Vec comes in two flavors, the Continuous Bag-of-Words model (CBOW) and the Skip-Gram model Neural probabilistic language models: *- maximum likelihood (ML) principle to maximize the probability of the next word (for "target") given the previous word (for "history") in terms of a softmax function * context: We first form a dataset of words and the contexts in which they appear. We could define 'context' in any way that makes sense Building the Graph "embedding" >>> representing a word through its surroundings by counting its 'window' (words that appear on the left, and words that appear on the right) either through analysis or prediction to transform words into something computable, we need to transforms words in numbers and use a structure to store them which is the vector space how are the words counted? example sentence: the quick brown fox jumped over the lazy dog > 8 words (the is used twice) first: > you set the dimension of the model in the embedding-size?? embedding-size is tabou word of the day ;-) > you set all your vectors at random > and by training batch by batch, the nn looks at the word-embeddings of your vocabulary two reference that explain the word2vec: - https://arxiv.org/pdf/1402.3722v1.pdf word2vec - Explained: Negative-Sampling Word-Embedding Method, Deriving Mikolov et al.’s, February 14, 2014 - https://www.youtube.com/watch?v=wTp3P2UnTfQ MLMU.cz - Radim ?eh??ek - Word2vec & friends (7.1.2015) This 'basic' script shows the different steps and generates a plot at the end, showing the distances between words https://github.com/tensorflow/tensorflow/blob/r0.11/tensorflow/examples/tutorials/word2vec/word2vec_basic.py questions: - is the position of the datapoints in the plot meaningful? how? Code & small movie is here: www.algolit.net/neural_networks_tensorflow