Michael Murtaugh - techniques of computer vision
https://fr.wikipedia.org/wiki/Vision_par_ordinateur
http://sicv.activearchives.org/mondo/
an interface for an archive that asks questions of the archive
scans from the University of Gent; the PDFs are split into pages
if you click on a layer for 1 page, it reorders the images on all the other layers
1st treatment: gradient
e.g. take all the black in an image
the more black, the higher it is -> turn the black into a 3D relief; you look at how steep the peaks are (not how high they are); on top of a plateau the gradient would be 0; think of the speed of water running down the slope
the gradient has an orientation (water runs in a certain direction, from black to white)
these techniques are not normally used for thinking about an archive or a text
the tool is a way of trying to understand what it could mean to use them for this purpose
order of the images: the image with the most gradients comes first
each pixel has a gradient, but the resolution is too low (raster)
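A minimal sketch of the gradient idea, in plain Python, not the code behind the Mondo interface. Assumptions: an image is a list of rows of brightness values (0 = black, 255 = white), so here the gradient points from dark towards light; the names gradients and total_gradient are made up for this note, and ranking pages by the summed steepness is only one possible reading of "the most gradients".

```python
import math

def gradients(image):
    """Return (magnitude, orientation) grids for a grayscale image.

    Central differences inside the image, one-sided differences at the borders.
    Magnitude is the steepness of the slope, orientation is its direction
    in radians (0 = pointing right, towards increasing x).
    """
    h, w = len(image), len(image[0])
    mag = [[0.0] * w for _ in range(h)]
    ori = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            gx = (image[y][x1] - image[y][x0]) / (x1 - x0) if x1 != x0 else 0.0
            gy = (image[y1][x] - image[y0][x]) / (y1 - y0) if y1 != y0 else 0.0
            mag[y][x] = math.hypot(gx, gy)
            ori[y][x] = math.atan2(gy, gx)
    return mag, ori

def total_gradient(image):
    """Sum of the steepness over all pixels: a plateau scores 0."""
    mag, _ = gradients(image)
    return sum(sum(row) for row in mag)

# A flat grey page (plateau) and a page with one sharp black-to-white step.
flat = [[100] * 4 for _ in range(4)]
step = [[0, 0, 255, 255] for _ in range(4)]

# Hypothetical ordering: the page with the most gradient first.
pages = {"flat": flat, "step": step}
ranking = sorted(pages, key=lambda name: total_gradient(pages[name]), reverse=True)
```

With these two toy pages, ranking puts "step" before "flat": the plateau has no slope anywhere, whatever its grey level.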
2nd treatment: contour
it looks for the edges of objects
the colour itself is not important, but a change of colour indicates an edge
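A small sketch of why the colour does not matter but the change does. This is a plain thresholded difference between neighbouring pixels, a much cruder stand-in for whatever contour detector the tool really uses; edge_mask and the threshold of 50 are assumptions for illustration.

```python
def edge_mask(image, threshold=50):
    """Mark a pixel as an edge when the brightness jump to its right or
    lower neighbour is larger than `threshold`."""
    h, w = len(image), len(image[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            right = abs(image[y][x + 1] - image[y][x]) if x + 1 < w else 0
            below = abs(image[y + 1][x] - image[y][x]) if y + 1 < h else 0
            mask[y][x] = max(right, below) > threshold
    return mask

# Same shape in two different "colours": only the jump between them counts.
dark_version = [[10, 10, 200, 200] for _ in range(3)]
light_version = [[50, 50, 240, 240] for _ in range(3)]

same_edges = edge_mask(dark_version) == edge_mask(light_version)
```

Both versions give the same mask: the edge sits at the column where the brightness jumps, regardless of the absolute values on either side.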
graphic designers use the same tools as engineers for treating images (for aesthetic / algorithmic objectives)
3rd treatment: SIFT
shows the different entry points into an image
e.g. Google Street View: composed of different images that are 'stitched' together -> SIFT helps to find the corresponding points between 2 images
what is a 'feature' in a book? or for 1 image?
http://sicv.activearchives.org/logbook/sifting-through-the-pages-of-arkiv/
a feature looks at the directions of the gradients
it looks for correspondences between images
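A toy sketch of these two ideas, describing a patch by the directions of its gradients and then looking for the closest description in another image. It is explicitly not SIFT: real SIFT also detects keypoints across scales, normalises for rotation and uses a grid of histograms. The function names and the choice of 8 bins are assumptions for this note.

```python
import math

def orientation_histogram(patch, bins=8):
    """Describe a patch by a histogram of its gradient directions,
    each pixel voting with the steepness of its gradient."""
    h, w = len(patch), len(patch[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % (2 * math.pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += magnitude
    # Normalise so that overall contrast does not change the description.
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0
    return [v / norm for v in hist]

def best_match(descriptor, candidates):
    """Index of the candidate descriptor closest to `descriptor`."""
    distances = [math.dist(descriptor, c) for c in candidates]
    return min(range(len(candidates)), key=distances.__getitem__)

# Patches from "image A": brightness rising downwards, and rising to the right.
vertical = [[y * 10 for x in range(5)] for y in range(5)]
horizontal = [[x * 10 for x in range(5)] for y in range(5)]
# Patch from "image B": rising to the right, but brighter and more contrasted.
query = [[x * 20 + 30 for x in range(5)] for y in range(5)]

candidates = [orientation_histogram(vertical), orientation_histogram(horizontal)]
match = best_match(orientation_histogram(query), candidates)
```

Here match is 1: the query is paired with the horizontal patch even though its grey values differ, because only the directions of the gradients are compared.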
-> that's the way in which an algorithm can learn if you feed it 100000 images...
it is consistent in seeing the same details in different images
it makes a lot of mistakes
(book as database - what could algorithmic indexing be?)
4th treatment: texture
OCR applies a series of different treatments: layout analysis to find the columns, then it looks for lines, and afterwards for letters
-> it makes mistakes: it sees letters in images
5th treatment: lexicality
OCR passes this letter recognition through a dictionary -> shows words that are recognized on a page
it is linked to 1 language
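A toy sketch of the lexicality step as a filter applied after letter recognition. A real OCR engine uses its dictionary in a more involved way; recognized_words and the two tiny word lists are assumptions for illustration only.

```python
import re

def recognized_words(ocr_text, dictionary):
    """Keep only the tokens of the raw OCR text that exist in `dictionary`."""
    tokens = re.findall(r"[^\W\d_]+", ocr_text.lower())
    return [token for token in tokens if token in dictionary]

# Raw letter recognition with typical misreadings ("Tbe", "tlie").
raw = "Tbe archive of tlie book"

english = {"the", "archive", "of", "book"}
french = {"le", "livre", "archive", "de"}

in_english = recognized_words(raw, english)
in_french = recognized_words(raw, french)
```

The same page gives a different list of "recognized" words depending on which language's dictionary is used, which is one way to see that the treatment is tied to one language.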
TODAY proposal: the software Tesseract; feed it an image and it produces HTML output of the page
there is more information than what you see (the structure of paragraph, line and word is inside it)
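A sketch of reading that hidden structure back out of the HTML. Assumptions: the output is in the hOCR format (Tesseract can write it, e.g. with `tesseract page.png page hocr`, where page.png is a made-up file name), words are marked with class ocrx_word and carry their bounding box in the title attribute; the sample markup below is hand-written for this note, not real Tesseract output.

```python
from html.parser import HTMLParser

def parse_bbox(title):
    """Extract (x0, y0, x1, y1) from an hOCR title such as
    'bbox 50 40 120 70; x_wconf 96'. Returns None if there is no bbox."""
    for part in title.split(";"):
        fields = part.split()
        if len(fields) == 5 and fields[0] == "bbox":
            return tuple(int(value) for value in fields[1:])
    return None

class HocrWords(HTMLParser):
    """Collect (word, bounding box) pairs from hOCR markup."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._in_word = False
        self._bbox = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            self._in_word = True
            self._bbox = parse_bbox(attrs.get("title") or "")

    def handle_data(self, data):
        if self._in_word and data.strip():
            self.words.append((data.strip(), self._bbox))
            self._in_word = False

def words_from_hocr(markup):
    parser = HocrWords()
    parser.feed(markup)
    return parser.words

# Hand-written sample in the shape of hOCR: page > paragraph > line > word.
SAMPLE = """
<div class='ocr_page' title='bbox 0 0 600 800'>
 <p class='ocr_par' title='bbox 50 40 400 70'>
  <span class='ocr_line' title='bbox 50 40 400 70'>
   <span class='ocrx_word' title='bbox 50 40 120 70; x_wconf 96'>book</span>
   <span class='ocrx_word' title='bbox 130 40 180 70; x_wconf 91'>as</span>
   <span class='ocrx_word' title='bbox 190 40 400 70; x_wconf 88'>database</span>
  </span>
 </p>
</div>
"""

words = words_from_hocr(SAMPLE)
```

Each entry of words pairs a recognized word with its position on the scanned page, information that is in the file but invisible when the HTML is simply displayed.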
3D view of the text in Firefox