sicv_finissage

We redid the vitrine three times in three months.
How we rewrote it ...

What tools do we use?

What kinds of images and archives are we using, and how are they interpellated by/through the work?

How does everything work together?
http://pad.constantvzw.org/p/sicv_finissage
First try: different screens, using Node (JavaScript), a way to run servers. Node is built on V8, the JavaScript engine Google wrote for Chrome.
V8 is FLOSS, and that allowed Node to happen: a command-line version of JavaScript. It mixes front-end JavaScript developers (designers) and back-end developers.
Promiscuous: code sliding between different communities.
Plus OpenCV (late 1990s), an Intel project released as FLOSS to get a community around it (and Python). Processors got better, and cameras as well. Many images to process.

Node IS a server. That is different from classic HTTP server programming, which is 'hard' and handles one connection at a time. Node can keep multiple connections open at once (think Etherpad).
Each screen had a different program running, but they could share information.
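
To make the "different programs sharing information" idea concrete, here is a minimal sketch. It is not the vitrine's Node code: it is a Python asyncio analogue, with in-memory queues standing in for network connections. The Hub class, the screen names and the messages are all hypothetical.

```python
import asyncio


class Hub:
    """Relays each message to every joined screen except the sender."""

    def __init__(self):
        self.inboxes = {}  # screen name -> asyncio.Queue

    def join(self, name):
        self.inboxes[name] = asyncio.Queue()
        return self.inboxes[name]

    async def publish(self, sender, message):
        for name, inbox in self.inboxes.items():
            if name != sender:
                await inbox.put((sender, message))


async def screen(hub, name, outgoing, expected):
    """One 'program': sends its own messages, then collects what others sent."""
    inbox = hub.inboxes[name]
    for message in outgoing:
        await hub.publish(name, message)
    return [await inbox.get() for _ in range(expected)]


async def demo():
    hub = Hub()
    names = ("left", "middle", "right")
    for name in names:
        hub.join(name)
    # All three screens run concurrently on one event loop.
    results = await asyncio.gather(
        screen(hub, "left", ["face found"], 1),
        screen(hub, "middle", ["eye found"], 1),
        screen(hub, "right", [], 2),
    )
    return dict(zip(names, results))
```

Running asyncio.run(demo()) returns, per screen, the messages it received from the others. A single loop serves all three without one blocking the rest, which is roughly the property the notes attribute to Node.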

The archive was pre-processed (so faces were already detected). Guttormsgaard was "only" 1400 images, made into a data dump of faces and eyes.

But: the Haar cascade algorithm (end of the 1990s) does more than face detection. It is called 'cascades' for a reason: you go from a face to an eye. Run on its own, the eye detector would find eyes everywhere.
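
A minimal sketch of going "from a face to an eye": the eye detector only runs inside rectangles the face detector returned, and the eye coordinates are shifted back into the full image. The two detectors are passed in as plain functions and are hypothetical stand-ins; this is not the vitrine's code and not OpenCV's API. Images are lists of pixel rows, rectangles are (x, y, width, height).

```python
def crop(image, rect):
    """Cut the rectangle (x, y, w, h) out of an image stored as rows of pixels."""
    x, y, w, h = rect
    return [row[x:x + w] for row in image[y:y + h]]


def eyes_within_faces(image, detect_faces, detect_eyes):
    """Run the eye detector only inside each detected face rectangle."""
    results = []
    for face in detect_faces(image):
        fx, fy, _, _ = face
        region = crop(image, face)
        # Eye rectangles are relative to the crop; shift them back
        # into the coordinates of the full image.
        eyes = [(fx + ex, fy + ey, ew, eh)
                for ex, ey, ew, eh in detect_eyes(region)]
        results.append((face, eyes))
    return results
```

Restricting the search this way is one way to stop the eye detector from finding eyes everywhere.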

The first treatment you make is blurring the image!
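
The notes do not say which blur was used. As the simplest stand-in, here is a sketch of a box blur on a grayscale image stored as a list of rows; the function name and the radius parameter are assumptions for illustration.

```python
def box_blur(image, radius=1):
    """Replace each pixel by the mean of its neighbours in a square window.

    The window is (2 * radius + 1) pixels wide and is clipped at the borders.
    """
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            values = [image[j][i] for j in ys for i in xs]
            row.append(sum(values) / len(values))
        blurred.append(row)
    return blurred
```

Blurring spreads a single bright pixel over its neighbours, so isolated noise weighs less in whatever detection step comes next.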

"Feature detection" can be ... edges, patterns, or objects.

OpenCV comes with about 20 pre-trained Haar cascades. Somewhere there is a database of faces annotated by humans. The popular cascade (the one that works well) is called "frontal face".
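
For a sense of what a Haar cascade measures, here is a sketch of one Haar-like feature: the brightness of the left half of a window minus the right half, computed through an integral image. It is a toy in plain Python with illustrative function names, not OpenCV's implementation, and a trained cascade combines many such features.

```python
def integral_image(image):
    """ii[y][x] holds the sum of all pixels in rows < y and columns < x."""
    height, width = len(image), len(image[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            ii[y + 1][x + 1] = (image[y][x] + ii[y][x + 1]
                                + ii[y + 1][x] - ii[y][x])
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels inside rectangle (x, y, w, h), using four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def vertical_edge_feature(ii, x, y, w, h):
    """Two-rectangle feature: left half of the window minus the right half."""
    half = w // 2
    left = rect_sum(ii, x, y, half, h)
    right = rect_sum(ii, x + half, y, half, h)
    return left - right
```

The value is large where a bright area sits next to a dark one and zero on a flat surface, which links this back to "edges" as a kind of feature.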

What is interesting ... we want to come back to it ... we wanted to work with machines that were not in the same space.

http://192.168.0.24:10987/collage.html

The second version felt like a step back: the three displays were disconnected. But it was much more stable than the peer-to-peer version, which had too many interdependencies.

Face detection is in any camera or phone.

Where do the faces go? http://sicv.activearchives.org/vitrine/
http://sicv.activearchives.org/vitrine/20160207/ -- comments on the files. What to do with it?

To be seen by the algorithm ... there are people that come repeatedly.

The combination of a tree and a body ... some elements seem to help detection.

Can you train the software? This is still a bit of a mystery.

Three sets of images:
    Camera
    Archive
    Models

The camera is the mediator between the archive of models and the artist archives.

Body-detection: is it taking movement into account? It seems these models only take frontal images into account.

The datasets for training used to be really small. The first large one is:
http://groups.csail.mit.edu/vision/SUN/
The SUN dataset was initiated by Antonio Torralba at MIT. They used the web to get people (students, in fact) involved, then shifted to Mechanical Turk for annotation. Precarious work. 131,000 images: you don't want to wait three years to complete the task. The task is describing elements of the image.

They use this database to benchmark the algorithm. How well does it perform? If everyone uses the same dataset ... it becomes a standard, the measurement. Most computer vision is trained on this.
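
A sketch of what "how well does it perform?" can mean in practice: compare detected rectangles with human-annotated ones and count the matches. The overlap measure (intersection over union) and the 0.5 threshold are common conventions, assumed here rather than taken from the SUN benchmark. Rectangles are (x, y, width, height).

```python
def iou(a, b):
    """Intersection over union of two rectangles given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def score(detections, annotations, threshold=0.5):
    """Return (precision, recall) of detections against human annotations."""
    unmatched = list(annotations)
    true_positives = 0
    for det in detections:
        # Greedily pair each detection with its best still-unmatched annotation.
        best = max(unmatched, key=lambda ann: iou(det, ann), default=None)
        if best is not None and iou(det, best) >= threshold:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(detections) if detections else 0.0
    recall = true_positives / len(annotations) if annotations else 0.0
    return precision, recall
```

Precision is the share of detections that match an annotation; recall is the share of annotations that were found. Both numbers depend entirely on what the annotators marked, which is how the dataset becomes the measurement.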

The images are mainly CC-licensed.

http://image-net.org/ - images mapped onto WordNet. WordNet is quite a precise worldview.
http://image-net.org/search?q=face

"vision is hierarchical" What can you see in 25 ms? Statistically, people seem to give the same range of descriptions.
A model of vision ... the quicker you look, the fewer problems you have; what remains are the priorities.
Short-sightedness.

Do you need it to be more "accurate"?

People agreeing to pose for the algorithms vs. the algorithm being successful

What you think a face is vs. what the algorithm thinks it is.

The fact that it gives visual feedback makes it different from surveillance

The collection of images ... "what needs to be there"

When the algorithm is not trained, where is the intelligence of the algorithm?
The Haar cascade is generic.