Notes from the Workshop Anyware - Location Privacy at VUB:
13 October 2016
Organized by Mireille Hildebrandt and Irina Baraliuc
Also in celebration of Katja De Vries' defense
https://vublsts.wordpress.com/2016/09/28/anyware-privacy-and-location-data-in-the-era-of-machine-learning/
Solon: ask not what your algorithm can do for you: the ethics of autonomous experimentation
waze:
outsmarting traffic together
was an app on smartphones
routing instructions for driving
have the people using the app report back what the driving conditions are
police officer here, certain things happening on this road
learning about road conditions: from the people using the app
by using the app you were reporting
purchased by google
now under alphabet
what does this have to do with autonomous experimentation?
this seems to be the optimal path
based on what we know about traffic conditions
learning from other experiences
if the service begins to direct everyone to location a, they may no longer know what happens in location b, where they diverted people from
to deal with this problem
there is a known technique: you send people to some other paths, uncertainty about the driving conditions, to see how the conditions are there
users are being used to collect information that you have told some other users to avoid
there are different terms for this:
explore/exploit algorithms
for any given instance: routing information, you can exploit what you know about driving conditions
or you can use drivers to explore and see what they discover
it optimizes for the system overall so that in general you have an optimal solution for most users
at the expense of individual users
one user bears a risk, the others share the benefit
machine learning: observational data, historical information that you have
explore exploit: it is experimental, you are proposing alternatives and looking at the consequences of the alternatives
varying treatments and comparing effects
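The explore/exploit trade-off described above can be sketched as an epsilon-greedy bandit over candidate routes; the function names, the reward convention, and the value of epsilon are illustrative assumptions, not from the talk:

```python
import random

def epsilon_greedy(estimates, epsilon=0.1):
    """Choose a route index: with probability epsilon explore a random
    route, otherwise exploit the one with the best current estimate."""
    if random.random() < epsilon:
        return random.randrange(len(estimates))                        # explore
    return max(range(len(estimates)), key=lambda i: estimates[i])      # exploit

def update(estimates, counts, route, reward):
    """Incremental-mean update after observing the outcome for a route."""
    counts[route] += 1
    estimates[route] += (reward - estimates[route]) / counts[route]
```

With epsilon = 0 the system only exploits and, as the talk notes, may never learn what happens on the roads it diverted everyone away from.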
these things are now being merged
online learning:
ethical implications at the intersection of the two procedures
credit: you make a credit decision, it affects the world and who is getting credit
and you want to know what happened to your assessment of people
we deploy the model, and see the effect, and retrain the model
online makes that continuous
A/B testing:
Multi-Arm Bandits
in optimization problems, you think you are getting to an optimal solution
you have confirming evidence that this is the best solution
but you may notice that there is another local maximum that is better
using randomness
reinforcement learning:
machine learning and experimentation mixed
alpha go:
this is the success of AI
alpha go had two primary steps:
look at previous go games
then have the computer play itself, using reinforcement techniques, which includes exploring other areas not explored by humans
same methods to deal with things that policy people are concerned with:
you deploy more police in one part of the city
as a consequence you are deploying less police to other parts of the city
so you cannot see what is going on elsewhere
bandits, it is proposed, could solve this problem
help you avoid being confronted only with predictions
driven by uncertainty:
if there is something you don't know much about, that is where exploration is directed
uncertain effect is not equal to worse treatment
but there is uncertainty
should uncertainty be borne by individual users
and who is the person selected to explore the uncertain area
who gets to exploit that information
why is there greater uncertainty to certain solutions to a problem?
ethics:
belmont report:
autonomy
beneficence: do no harm
justice: unjust for prisoners, if they were all the subjects of risky experiments where the welfare flows to others
what is the relevance:
autonomy issue
consent issue
beneficence: users are being knowingly put into uncertain conditions where they may incur extra cost
maybe there is a reason why we have certain historical data, because humans know that it is risky or inconvenient
justice: you can imagine that there will be much more uncertainty about less common solutions
if a minority exhibits different behavior than the majority group
being less common, that population will be subject to more experimentation
the direct beneficiaries will be that population, so maybe it is ok
but the question still stands: is this the appropriate way to learn the information
could it be that they would benefit from a different solution
how do we avoid subjecting them to significant risk
and that is what beneficence is telling us to do
baseline:
what is the baseline?
humans sometimes don't have an intuition about what is the right solution
the argument has to be that there will be many circumstances where there is knowledge of preferable solutions which are historically not well known to certain populations, like google?
we need to ask about why there is uncertainty in certain areas?
providing greater social context to what is currently known by these platforms
naive and obvious things to answer:
does the person know they are part of an experiment?
what information is the experiment intended to discover?
could that information have been obtained otherwise?
judith simon:
philosophy of sts: copenhagen
location based data and privacy: some epistemological considerations
is there a paradigm shift in science (due to changes in methods)?
target case:
invasion of privacy:
illegitimate access to data versus informed consent through payback cards
invasion of privacy not due to the gathering of data, but due to data processing and inferences
big data practices as epistemic practices
proposals:
big data practices looked through epistemics and politics
location data:
thomas hoffman
eth zurich: data analytics
hermeneutics of location data
zur Hermeneutik digitaler Daten (on the hermeneutics of digital data)
person-based raw data + common background data = enriched person-based data
semantic maps + social information + predictive models
problems: regarding privacy, discrimination, ...
what is the epistemic difference between consumer and location based data
location data has non-inferential relevance
location data: data on presence and movement (this is banal)
if i know where your mobile phone is, i know where you live, your workplaces, if i zoom in, i know where you spend time, whether you have a child
i know where you are
secret endeavors
data of movement: mode, route and speed of transportation, when, from where, to where
it gives us a deep description of a sample of one
versus inferential statistics
you make a relationship between an individual and aggregated individual and make inferences
this is not new
you can do this with location data but not necessary for all usages
how do you ensure location privacy if it is so expressive:
solution: restrict data usage to aggregate level
solution 2: restrict data gathering, data sparsity
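A minimal sketch of solution 1 (restricting usage to the aggregate level); the k-threshold suppression is my own illustrative addition in the spirit of k-anonymity, not something proposed in the talk:

```python
from collections import Counter

def aggregate_visits(records, k=5):
    """Release only per-location visit counts, suppressing locations
    visited by fewer than k distinct users."""
    counts = Counter()
    seen = set()
    for user, loc in records:
        if (user, loc) not in seen:      # count each user once per location
            seen.add((user, loc))
            counts[loc] += 1
    return {loc: n for loc, n in counts.items() if n >= k}
```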
jean paul van bendegem
plato in the background (or in the machine)?
i want to explain simple point:
lots of space to explore
philosopher of mathematics
when i read stuff on ml: to my taste, there is not so often a questioning of the applicability of the mathematics
but not of the mathematics itself
in 10 minutes i will question the whole of mathematics
Niccolo Tartaglia (1499 - 1557)
beginning of the 16th century
i am fond of this kind of etching
we today would interpret this as a superposition
you have trees, a cannon, the smoke of the cannonball, and we are tempted to say there is a superimposed geometry
model of a cannonball's path
introduction of a machine learning/location thesis
we model the data "as if it were generated by a State Space Model....
we will make it discrete
for simplicity's sake, we use the uniform distribution as the start model
there are different places, and as a start position, we assume it can be anywhere
you have a probability distribution:
you take it discrete, so that you have the same probability of being everywhere
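The discrete uniform start model quoted from the thesis can be written down directly; the location names here are placeholders, not from the thesis:

```python
# Uniform prior over a discrete set of candidate locations: before any
# observation, the person is assumed equally likely to be anywhere.
locations = ["home", "work", "gym", "shop"]
prior = {loc: 1.0 / len(locations) for loc in locations}

assert abs(sum(prior.values()) - 1.0) < 1e-9  # a proper distribution
```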
what has happened between the tartaglia etching and the phd thesis
important role of purification of mathematics
i dare to use that term, since we talk about "pure mathematics" vs. applied
if you look at the historical development
if you start with tartaglia
the term they used was not the opposition between pure and applied but mixed mathematics
if you forget the background
wait, tartaglia is one picture: as if you could pull the two apart and have a mixture of the two
they would only consider arithmetic as being pure
but back then they did not have infinity of numbers, they only had finite numbers
it takes some time for the distinction between pure and applied maths to emerge
whenever part of what was considered to be pure math, became infected with applications, the consequence was elimination
Hardy: a mathematician's apology
what i have been doing all this time could have caused no harm whatsoever, because it has no use whatsoever
i can't have done anything wrong
that goes together with an ontology and epistemology
and it is related to platonism
in a sense platonism has been created!
in opposition to the (neo)platonist view is the constructivist view
mathematical objects are created
involves procedures and notations
also applies to identity
you say either things are the same, they are different or nothing can be said about it
major difference between
location (in pure sense): that being there
location + procedure (program): a program
and always think of the two together
if you think that is the basic unit, you...
i need to insert wittgenstein
if there is no basic unit, what location is, to extend this as a basic unit, is it interesting?
in road maps, some places are interesting?
being intentional: what are the purposes?
basic unit: location, procedure , and all these other things
why did the ancient egyptians need right angles to measure the banks of the nile after the flooding?
it is easy to calculate rectangles, and that is great for taxes
and if you include the taxes: that is why in tartaglia it is not an accident that it is a cannon; it could have been a ball, but it would not have been so useful
a paper explaining random walks
two pictures
a person going 45 degrees left or right with equal probability; at the end you get a drunk person's path
later you see: a graph of the USD and Euro exchange rate and it looks like a random walk
why is there no drunk person at the end of that curve
this means something else
the pictures are similar
mathematically they are the same, but you treat them differently because of all the other elements in the basic unit
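The 45-degrees-left-or-right walk from that paper reduces, for the lateral position, to a one-dimensional random walk; a minimal simulation (step size and seeding are my assumptions):

```python
import random

def random_walk(steps, seed=None):
    """One-dimensional drunkard's walk: each step goes left (-1) or
    right (+1) with equal probability; returns the whole path."""
    rng = random.Random(seed)
    position = 0
    path = [position]
    for _ in range(steps):
        position += rng.choice([-1, 1])
        path.append(position)
    return path
```

Mathematically the same curve could describe a drunk person or an exchange rate; as the talk stresses, what differs is the rest of the basic unit, not the mathematics.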
Q: is it better to be treated as member of a class or an individual?
maybe for some people it is better to be treated as an individual and for others as a group?
Q: is this related to the uncertainty?
so that we can experiment to find out what it is where we want to be?
insurance: if you know for certain that somebody would get sick, insurance system would unravel: charge them the exact amount
it is the uncertainty that allows us to socialize the cost
uncertainty enforces group categorizations which, if we treated people as individuals, we would lose
the way you aggregate, what if there is an outlier, is that the counterpart to the individual
or is it not an interesting feature anymore?
outlier would not be at the level of the person but to events
a person leaves his cell phone at home
it doesn't move the entire day
does that make me something special?
Q: what would your opinion be, if one of these systems, an ethical reflection, attempts in AI, to code ethical reflection through the belmont principles to automated decision making?
the paper gave me appreciation for people critical of singularity
this is often what they have in mind: bostrom
the scenario is, you teach a machine to make paper clips
and it tries to make every atom in the universe into a paper clip
now i understand where these ideas are coming from
but particular way of designing a machine that intervenes, learns, ... that is oblivious to a bunch of things in the world
goal alignment is what you are describing
as much as i still find those people obnoxious, there is some legitimacy
separate work: can we design systems that take this into account
how to use multi-arm bandits so that they do not engage populations that cannot bear the burden
Q: is it the balancing choice that the developers are making, of how many users they can screw before they lose their audience
to what degree of experimentation would be acceptable
they are mathematically designed to be very efficient: so they can involve a small number of experiments
but there is an independent business decision to be made about it
q: location data is more direct than web data
i was surprised by that statement
certain web pages also give a lot of intimate information of things that i am interested in
if i look for something: it is indicative of something i am doing but it is not directly telling you
you cannot really know for sure, i am looking for someone else,
where i am is where i am
the more you dive in, the shadier the distinctions will be
i asked my father to forward me the post-factual emails he receives
what merkel is doing, paranoid emails
what does this tell you: if you just have this information, you don't know if i oppose these news or not
web behavior is only probabilistically indicative
unlike in location
q: why the producers of these applications
would respect principles of bioethics, they are not producing science, but exploiting a business opportunity?
facebook contagion study
people rely on ethical practice to do corporate practice
the paper that we have written is far too simple, it does not acknowledge
that this is a simplistic way of dealing with a practical problem
there is no expectation that people will adopt these principles
encourage people to think about how you might design systems that take ethics into consideration
that gives some source to the argument
q: contrasting very routine location data, with very marginal web data
i received some stuff i am not interested about
i have my routines on the web, random places are of less importance to me
Lydia Nicholas, NESTA
most of the stuff you are talking about doesn't happen in local governments
medium data
basic statistics
who can push back against applications like waze
case workers may not be able to look at certain data
and there can be a system that looks at the different databases and provide risk scores
then you don't break confidentiality
smart places
traffic management and whether there are icy patches on the road
that is all that is there in the much of UK
optimizing resources: making your bins get collected effectively
governments collect a lot of data
most of the data interactions that i have with the government is at a life event
the times i may apply for benefits, crime
significant life events: but they don't know what i am doing in between
benefit system: requires knowing who you are living with, how long, your relationship with them
you may oversurveil certain populations
apart from your communications data,
which is not part of governance
case workers report about what was going on in a troubled family where a child is being abused
a lot of the data entered into systems is garbled, it is sensitive
it will take three years to get the permission to get into the room where the information is introduced
but then i was told the system doesn't work so they are writing it down
so what is going on in data practice with governments and governance
resistance from people working on the ground to data capture
you generally take this job because you want to help people
building relationships of trust
and help them transform their situation
you don't want to make them legible to the state
you are seeing the impossibility of doing big data work
people don't have the language, but they are thinking about the privacy issues
they want to protect their job, don't want to get fired
i find that extremely interesting as a point of focus
you get to what we are actually trying to do
you are creating a social and healthcare system that looks like an amazon warehouse
it is perfectly efficient, but there is no time to have a cup of tea
steps forward:
AI will get very good at parsing unstructured data
are you going to make that deliberately obfuscated
or, one of the interesting projects, they are embedding data analysts in the small teams
in place
when you try to identify families that need support
there are concern markers, attending school, drug use
usually you have to tick above a threshold
to go to a general social worker
ML use here: to segment the most significant risks and paths forward for those families
and put them into specialist social groups
and get data analysts to work with them
and find out not only what patterns there are
but which ones are important
end goal: idealized social landscape
what our government could do is turn this whole thing into an amazon warehouse
or you can find patterns that can transform
governments transforming from providing services to commissioning services
this idea of whether you are trying to watch, to tick boxes and to sustain
or identify patterns to transform
produces a very different relationships between capture and machine learning
than google and apple's predictive models
it involves a lot of human and data expertise
difficulty of getting care workers, data analysts in the place
and improving statistical knowledge
people reject correlations: are you god!
basic denial of correlation is something that you may need to battle with here.
i have questions about the data analysis/care work
northern city: where to put cycle lanes
give away bikes with gps on them
30 million pounds
they were proud of the strategy
someone knocked on the door: do you realize that the bike hub would have given you the location data and routes for free?
they are people who chose to bike and engage with the app
there are basic issues in education and infrastructure
day by day work of taking state knowledge and how it impacts people
Arjen de Vries
who controls your search log data
information retrieval
i did a project on IR for children
improve it for children
best way to do that was to use log data from yahoo
i was enthusiastic about the profiling you can do
you can do cool things for normal people
somebody wanted to teach high school kids about profiling and digital identity
to teach them to think about what they do online
i was asked for advice on the tech aspects
they know everything about our online searches and what your interests are
at the same time: there is this web getting more centralized
web was intended as a decentralized thing
decentralized web summit
too much centralization: too much power put together in a few parties that control the info online
the photos of the event are on a google drive
but they are meant to be shared! :::::>>>> centralization is not the only way to share?
mobile makes things only worse
there is one way to earn money as a company: sell data
what can we do for seach
decentralized and localize
for search that might be harder
the agile turn is causing the line to flatten: because these are bought by data center parties and people who illegally download films
we are used to this: central heating, doing something for our house that way
we can store the whole web at home??
the data that i need from the web is much smaller than the web
it is a naive proposal
two problems:
how to get the data on the personal search engine?
how to replace the lack of usage data from many?
getting the data:
idea: organize the data in bundles and use techniques inspired by query obfuscation to hide the real user's interests when downloading bundles
web archive to the rescue?
we ship the part that is only of your interest
but does this mean that the web archive gets all my information instead of google?
but search gets slow
query obfuscation to hide a bit your profile and still get the chunks that would be most useful for you
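A toy version of the query-obfuscation idea sketched above: hide the real bundle request among decoys so the archive cannot tell which one reflects the user's interest. The decoy pool, function name, and counts are illustrative assumptions:

```python
import random

def obfuscated_requests(real_query, decoy_pool, n_decoys=3, seed=None):
    """Return the real request mixed with n_decoys decoy requests,
    in random order, so the server sees them as equally plausible."""
    rng = random.Random(seed)
    requests = rng.sample(decoy_pool, n_decoys) + [real_query]
    rng.shuffle(requests)
    return requests
```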
how do you keep the data fresh?
make use of the fact that you use certain sites frequently
no log data: is that bad? yes!
all sorts of google search functionality
predict query intent/rank verticals etc. etc.
this is only available in exchange for the data that we give to these technologies
they do something good with the mass surveillance
without log data:
hinders retrieval experiments in academia
related problem: reproducibility vs. representativeness of research results?
we can't study these companies, but only from the outside or as interns
alternative sources of clicks:
bitly api lists how often links are shortened and clicked
wikistats
google trends
anchor text and timestamps
anchor text with timestamps can be used to capture and trace entity evolution
or to find popular topics!
trade log data:
data markets for usage data
behavior data turns into something valuable for us
so i am much less concerned
challenges: how to select the part of your log data you are willing to share?
how to estimate the value of log data?
truly personal search:
safely gain access to rich personal data including email, browsing history, documents read and contents of the user's home directory
can high quality evidence about an individual's recurring long-term interests replace the shallow information of many?
alison:
slightly different ways of talking about our shared interests
rather than privacy: talk about communication as something people have always claimed a right to do
privacy and communication are corollaries with each other
to communicate and not to be forced to communicate
data and citizenship
in the broader framework of communication
seda: sketch out a transformation from a paradigm of communication about access to a network
to a communication paradigm of production of data (a compulsion to produce data for someone to take action)
how we organize our institutions is affected by this problem:
the second one is not about rights
the shift towards producing data for the purposes of action: rights claims become more contested
sketch the shift
in the communication paradigm of access to a network
15 years ago
info society discourse: was about getting online and connecting with other people
glorification of a global network of peers
it was electric
speak to each other without intermediation
that network grew in which many entities, people or not, are on the network
totalizing
increasingly people have access to the network :::::::>>>>NO, they continue to exist for other people!
that is the end of the electric vision of globalization and equality
and the end of a certain way of making communication based rights claims
the right to communicate is now a compulsion to create consistent data
the network cannot continue to generate value, unless people continue to produce data
business models have shifted, from a whole bunch of companies providing access to a network
to intermediary giants, whose business is data processing and calculation
shift from rights claims to data acts
data acts happen through intermediation
location is a good example here
a coordinate: long/lat is nothing in itself, it needs to be constructed, you must be placed on a map, and positioned in relation to other information
next argument:
this intermediation creates what is meaningful about data
and the way that data can be connected to action
my concerns about citizenship
and the ability to act on things that people care about in the world
the intermediation happens through a framework that is first applied by the corporate and then public sector
public sector often responds, not knowing if they should repeat these things
frames:
consumption
optimization
the corporate actors make the conceptual space: what sorts of things get to be data
i work with civic orgs
bottom up orgs
they reappropriate the same kind of frameworks
data is valuable because you can link it to consumption and optimization
there are some interesting responses in the realm of ethics
normative
critical
we have this notion of data citizenship
taking information and calculating it
and making a calculative judgement
that produces a consumer model
what is location data good for: things that have consistent identities in space
based on the location of an object that gets information
which entities will create the first structured data: the ones that want to sell you something
classic consumer model
second optimization model
let's use data to make things more efficient
this rests on the idea of an expansion of the things that are computable
expansion of areas of life that can become more optimal
comes with the datafication
commuting applications
two sided business model
you buy app as a commuter
"don't bother taking the central line"
the info depends on the availability of fully up to date streaming information
transit authority
take the potential of that mass data and turn it into something that an individual can act on
the applications sell the data back to the transport authorities
this positions the ideal citizen as producer and consumer data
public institution as also free producer of data and consumer of data
and optimization
what's interesting: these are the early low hanging fruit frameworks for how to use location data for an improved civic experience
the civic groups are doing the same thing
fixmystreet
aggregate citizen identification of problem issues in local areas
how do i demonstrate to a government that they should not close the local library?
this illustrated to me that there is a particular kind of computable civic action
for people who are working with the data
which does not fundamentally challenge the consumer paradigm of data production for consumption and optimization
normative response:
justice: inclusion/exclusion in relation to the results of the calculation
existing frameworks: protected classes, abstractions and contextual integrity
critical response:
boltanski's notion of critique: questioning the nature of things
can we question this fundamental shift in citizenship
how it has been redefined through datafication
what it means to have only spaces of action
instead of rights claims as a way to claim civic positions
play of uncertainty: make things unoptimal
you may think about optimization resistant behavior
good research that shows that if you seek to datafy an organization, there is always information that cannot be turned into data
one possible way of thinking about this is to shift from an ex-post framework, at the end of the technology
to thinking about the construction of ethical behaviors all the way through developing a technology
Irina shklovski
mutable stories: the shifting accountability of data interpretation
the conditions we are discussing here are not about tech per se, but a blend of human decision making and tech practice
not about algorithms and data itself
but about the output and its interpretation
the connection between data and action
information as thing: -> deserves careful examination
michael buckland
data is information as thing
giving attention to data is now strange: people used to think this is worth scholarly attention
information as thing: it is just an object, may or may not be informative
information as thing has become central to the worries and criticisms
we worry about data and services
what seems to have been said over and over
there is a difference between describing patterns in data and the attempts to understand and explain these patterns
what must be known for data to be interpretable?
how does one tell a story with/from data?
how do data shift from thing to knowledge?
what is the minimum necessary dataset for interpretation?
given enough data, the output can be interpreted in a way that is actionable?
location is commodity:
commodification: removing something from its context of production such that its value is determined by its context of use
commodity fetishism: when the commodity's value is entirely determined by other means, such that the richness of its context of production is lost, and its value, which once came from social relations, is invested in the object itself
commodification here shifts the process of interpreting by changing the available minimum necessary dataset
location data is commodified, and it displaces the practices for social meaning
interpretations of data can reshape expectation, accountability etc
interpretation of data is about managing uncertainty
when data used for decision making are interpreted through the lens of the social relations and contexts of its production - the minimum necessary dataset is inflected with the consequence of the decision taken
commodified data are never the entire story but form a partial basis for it...
Katja de Vries
Thesis presentation
Baroque
17th century
time of scientific revolution
cartesian separation between object and subject
newtonian society: difference between subjective beliefs and objective facts
baroque: in reaction to protestantism
endless theatrical art style
17th century merchantile capitalism
the first financial bubbles
first insurances and probability calculus
a very baroque work of art
fresco ceiling of a church in rome
visual illusion
if you stand in this jesuit church in rome
you have to stand in a specific marble stone
it is as if the sky opens above your head
if you step down, the visual effect stops
as if the building collapses on you
andrea ... had a problem with perspective painting that works in more than one stop
another interpretation
is that pozzo exactly wanted to show that this is an illusion
it is good to rejoice in the performativity and artificiality of the painting
you step down from the marble stone and it collapses on your head
it rejoices in the work of making something work
baroque is not only an artistic style but a style of thinking
what is it to make something work and in general
it is a style, the process of making, and how to work with differences and probabilities
leibniz: great thinker of the baroque
working with princess sophie and something happens
sophie says: is everything really different
and he says, let's do empirical philosophy
and find leaves that are identical
they walk around for a while, and they say we cannot find identical leaves
hegel says this is like children looking for the same snowflakes
but the question remains: how do we deal with difference and sameness
chapter 2:
traces
if an animal walks through the snow, it leaves a trace
the trace is open for interpretation
a deer, a wolf, and what is it that you are after
scientific, hunter, tourist
do you follow the trace, or turn around and run for it
we give interpretations to traces and act on it
interpretations can be dead serious
it is what decides who is in the stew: the deer, or the hunter
interpretation is a word i don't use in my thesis
it sounds voluntaristic
as if you can choose what you want to see
but it is limited by your body, the space that has constituted you
this is a tick: it has three sensations
sweaty or not, hairy skin or not, 27 degrees or not
is the experience of the tick wrong, or not?
perceptions or percepts is the word
perceptions are not deterministic
things can appear in different ways
picture that can be interpreted as a young girl or an old woman
you can extend bodies in ...
with a dog, glasses, google glasses, laptop
what the body can see and perceive will not be the same
train a body in a particular way of doing things
practice of law, science or medicine
again, this body will not be the same, and what the body can see and perceive will not be the same
what makes someone act or perceive
mapping what makes someone act
latour calls this the mapping of the network
or work-net
the work that goes into the perception is getting foregrounded
i look at two worknets
eu informational fundamental rights
and network of ML algorithms that are applied to human behavior and characteristics
we return to the baroque
the sameness of those two networks
practices that escape the opposition between art and science
these are not completely baroque
they are a bit modern and a bit baroque
and they could even become more baroque
i juxtaposed these two networks and how they contaminate each other
the exercise of studying these networks has two effects on two other levels
what is the best way to study the practice of making: law making, how do you study that as a philosopher of technology
do you look at it as a general way
or do you look at the specificities of making
the analysis is also used to read philosophy against the grain as to how identity and difference are related to each other
a recurrent theme is the heigh ho heigh ho song
what does it mean to make something work today
digital traces: not only animals leave traces in the snow
the amount of traces we leave has exploded
the traceability is not limited to what we do behind the screen
but the signals emitted by mobile phones
and footage captured by smart cameras
footage from a smart camera: how we stroll through a real shop
chapter 4
machine perceptions
machines categorize
they give interpretation
separating male from female faces
google making a mistake
smart face analytics
the tech that is used is probabilistic
this face is mostly happy and a little surprised, sad
summarized in clearly understandable labels
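the summarization step can be sketched minimally (scores invented for illustration; real face-analytics APIs differ):

```python
# a face is scored against several emotion labels; the probabilistic
# result is summarized as the dominant, human-readable label.
# the scores below are invented for illustration.
scores = {"happy": 0.72, "surprised": 0.18, "sad": 0.06, "angry": 0.04}

dominant = max(scores, key=scores.get)  # label with the highest score
print(f"mostly {dominant}")             # the "clearly understandable" summary
```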
a trace
it is nothing without an interpretation
also nothing without acting upon that information
the question with modern tech is the same as with the trace in the snow: who ends up in the stew
let's look at the face again
a startup:
if you are angry, our app will offer you a whiskey
or if we recognize you are angry we will not respond at all
what is the worknet of the network of machine learning
i looked at 11 specific algorithms
how they construct identity and difference
is this face angry, sad, male, female
main idea:
classical programming:
explicit instruction:
machine learning:
examples, instructions how to extract patterns, sometimes feedback (you classified wrong)
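the contrast can be sketched with a toy example (all names and numbers invented): a hand-written rule versus a rule extracted from labelled examples.

```python
# classical programming: the rule is written explicitly by a human.
def classify_by_rule(value):
    return "positive" if value > 10 else "negative"

# machine learning: the rule is extracted from labelled examples.
# a trivial "learner" here picks the threshold midway between the
# largest negative example and the smallest positive example.
def learn_threshold(examples):
    pos = [v for v, label in examples if label == "positive"]
    neg = [v for v, label in examples if label == "negative"]
    return (max(neg) + min(pos)) / 2

examples = [(2, "negative"), (6, "negative"), (14, "positive"), (20, "positive")]
threshold = learn_threshold(examples)  # learned from data, not hand-coded

def classify_learned(value):
    return "positive" if value > threshold else "negative"
```

the point of the contrast: in the second case the human supplies examples and a pattern-extraction procedure, and feedback ("you classified wrong") can shift the threshold.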
chapter 6
percept creation by eu fundamental rights
privacy, data protection and anti-discrimination rights
the choice of these rights allows us to see something about these rights
what is important in understanding fundamental rights is that they are embedded in a certain political constitution
end of 18th century: the state is no longer leviathan, but a pastoral quality, the state is more like a shepherd
why do i compare these networks
they are both
they create similarities between cases
they are not scientific at all
but with the case of fundamental rights, there is something else at stake
it is not just a baroque practice
but there is also a political idea behind it
can pastoral power be included in machine learning
chapter 7
contamination of ways of being
can ML be expected to make a political option
so that the sheep and people are given the possibility to resist
one of the possibilities is to have some transparency about how the algorithms work
how do we co-exist with ml being applied to us
one is that we would be able to see what the algorithms do and think about alternatives
chapter 8
what is making, and how to study it?
looking at two ways of making
making through informational rights and ML
what is actually to make something
can we by looking at two different ways of making, say something about the nature of making
solon: you say we need more baroqueness: what problems would baroqueness help us solve?
i think it is good to say that ml is both baroque and not
it is not baroque in that it comes from a tradition of statistics
which is often about presenting an objective reality
the classical statistical approach is: we look at how people with psychosis behave, we create a model and present reality in some way
the ml applications are applied at a quick pace to reality
this means that, suddenly it is not just knowledge from paper but it is knowledge that is being applied
when we are aware that this is performative interventions into reality
then it is important for the practitioners who create these algorithms to realize
all the human effort that goes into it
what is the goal in creating these algorithms
better world, optimizing profit
how do we create the variables
construct validity
these are choices that are being made
how do we test what is the standard against which we measure the algorithms
they are being tested often on standardized databases
how well can an algorithm recognize those in a standard database (in comparison to others)
this is all very constructivist
it would be good if the practitioners acknowledged that
there is no ultimate standard of whether this is a model that represents reality or not
ML person once said: there is no best, the question is does it work
a plane doesn't represent reality
but if it flies from a to b without crashing, then it is a good plane
so what is it that we want algorithms to do
in the case of the plane: not destroy the environment
it is important to clarify to practitioners it is more an art
creating an algorithm that cannot be measured with standards of reality and truth
but to make clear to them, you are making something that is like recipes
one recipe has certain advantages over others
solon: i like the idea of recognizing the constructivist aspect of ml. do the practitioners not recognize it? who has this mistaken belief: practitioners or others?
two answers:
on a high level, the top level ml engineers are concerned about the implications that ml has
when you look at the concerns, they are mainly ethical concerns
cathy o'neil: she gave a course to students
she asked: can you develop an algorithm to grade your own essays
the algorithm affects the people themselves
an ethical awareness of their real life effects
in my thesis, i extend the view: it is not only ethical concerns, it is also a legal concern
it is also a political concern about the constitution
in lower levels of ml there is little understanding: when a company does not have high-quality know-how
and you are throwing in data, and something will come out
this will be based on what the black box puts out
outside these highly educated ml circles
there is a great belief: this is data, so this model must be true
there is a lot of work to do in making clear that this is not about reality
judith simon:
reads her review: summarizes and says it is great!
how do you bring power asymmetry matter to the ml engineers?
law can change the obligations companies have towards users
once there is a legal situation where industry has to incorporate a certain power balance
between users and more powerful info tech structures
it will be necessary to incorporate in the pragmatism of ml
if you are a big it company
it is much easier to think: let's look at the dp requirements
what to do
and how to comply with the anti-discrimination rule
as a more general way
there should be power equality - article 8
force companies to think about power balances in a broader way
this will be a way in which machine learning pragmatism can be affected
engineers can be compelled to do something about it
but i acknowledge that the economic incentives are big
thomas heskens:
i call myself a machine learner
you said ml is not science, what do you count as science: statistics, etc.
all modern science is disappearing
all of it is turning into engineering
i draw a line at the 17th century and experimental science
a researcher doing an experiment and representing reality
so it is about the experimental part and what the truth is
the idea that nature is discovered
science is really biology, physics, but then when you start making models
then most of the models are completely wrong but useful
that is the line
if they are wrong they are not science
if they are right it is engineering?
this is one of the quotes
all models are wrong and some are useful
in fact, this is the criterion that should hold for everything that is called science now
there is not a representation of reality but models that are very useful
but ml is more upfront about it
it can be more upfront about the performative
ibm watson
medical data, systems for medical diagnosis
you can go to watson or your physician
what would you choose? watson has seen much more data
i currently would ask both
ask them both
they give different answers?
even the people who make them cannot tell exactly
but you can't tell what the physician's reasoning is
but you can ask the physician
i can make a rule of a neural network to respond to such a question...
neither can the doctor give you more
medical practitioners will know that there is a model on top of the neural network
they can have some assessment of what is going on in the black box
what you see in those debates is that for medical practitioners it is not clear what input you should put into the machine
a doctor may sometimes not be able to explain either, but decides based on his experience
that is weird:
jean paul van bendegem
as a mathematician and philosopher i have one question
i am grateful to tom for getting my question started
there was a nice exchange about what is science and not
if i understood right, math is not science
i am not a mathematician
i write about probability
and sometimes i collide logic with mathematics
math is a formal system with a cleaned-up space where certain manipulations are possible
in some ways, mathematics can be very useful
you can build bridges with it
you can predict fluctuations on the financial market
it can be useful as a tool, some times
i have the impression when reading the thesis
you have an idea of pure mathematics
i still have the feeling that you are acknowledging this purity
i have used mathematics as a counter side
as a contrast
to show what baroque sciences can be
this question about science, i contrast law and machine learning with modern science
i acknowledge that modern science does not really exist
if you are going to disentangle that
in ml baroqueness was more visible
if you would not leave math behind but baroquify it, too, that would be great
our whole approach is similar to
we are standing at the right stone to see the illusion of law
if i want to show you the same illusion with machine learning, we need to make a move
that is your rhetorical strategy
there are two answers
contrast positions
when you speak two languages
if you discuss an issue in another language, you see the limits of the first language
if you only speak english, you don't know how a language shapes you
2) the front cover has the illusion by pozzo
it is important in law and in machine learning to be able to make it noticeable
that you are standing on the stone and to step down and see it is artificial
gloria gonzalez
my battery died!!!
========== NOTES IN PREPARATION OF TALK AT ANYWARE ===================
we are and always have been continuously reshaped by the artifacts we shape, to which we ask: who designed the lives we live today? What are the forms of life we inhabit, and what new forms are currently being designed? Where are the sites, and what are the techniques, to design others?
location determines who you are: you are where you live
you make up location
The first is the ZIP (Zone Improvement Plan) code. Mandated as an element of President Kennedy’s attempt to rationalize government, the ZIP code allowed for the first time the quantification and thereby the easy organization of both residence and business addresses
Under the ZIP code system, households were aggregated into units served by a single post office, serving at most perhaps 15,000 people, each indicated by a five-digit number. The US Postal Service at the same time established a system of numbering of postal carrier routes, the routes traveled by each letter carrier, and they received two-digit numbers, so that ZIP codes could in turn be divided into units of perhaps 800 people. As an incentive to using the systems the Postal Service (or, actually, its predecessor, the Post Office Department) gave a discount to mass mailers who sorted their mail by carrier route, so that that geographical unit, defined by the daily path of the individual letter carrier, came to be the preferred unit of division
The US Bureau of the Census had begun the establishment of first the GBF-DIME files and then the (currently used) TIGER files. First for urban areas and then for the entire country, these files were the basis of a computerized mapping system to be used for the first time in the 1970 decennial census
More to the point here, these computerized files, consisting in part of latitude and longitude values for the four corners of every block in every city in the US, allowed - through a process of matching with the Postal Service’s ZIP code files - the determination of the geographical coordinates of every mailing address in every city in the US. (Rural addresses created special problems, which have only recently begun to be resolved through the development of new rural addressing systems, the impetus for which has been the perceived need to rationalize and support emergency response, or 911, systems.) This in turn allowed the creation not merely of lists, but also of maps of ZIP codes, postal carrier routes, and so on
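the matching process described above can be sketched as a toy lookup (addresses, block ids, and coordinates invented for illustration; the real GBF-DIME/TIGER and ZIP-code files are far richer):

```python
# a (hypothetical, tiny) geographic base file maps street blocks to
# coordinates, and a postal file maps addresses to those blocks,
# so any mailing address resolves to latitude/longitude.
block_coords = {              # block id -> (lat, lon) of block centroid
    "B001": (38.8977, -77.0365),
    "B002": (38.8990, -77.0400),
}
address_to_block = {          # mailing address -> block id
    "1600 Pennsylvania Ave NW 20500": "B001",
}

def geocode(address):
    # address -> block -> coordinates; None if no match is found
    block = address_to_block.get(address)
    return block_coords.get(block) if block else None
```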
Numerical taxonomy is a sophisticated way of clustering similar individuals by imagining them to be in an N-dimensional space, where “n” is the number of socioeconomic variables. So using the roughly 600 socioeconomic variables available at the block-group level, the creators of the geodemographic systems determined the distance of each of 230,000 block groups to all the others in 600-dimensional space. The ones that were “closest” were characterized as being most alike, just as the ones farthest from one another were seen as least alike
These earliest systems, developed in an era before the advent of desktop computing, relied on computationally intensive numerical taxonomy, and ran on mainframe computers.
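the clustering idea can be sketched at toy scale (three invented block groups with four variables instead of 600; real systems used far heavier numerical taxonomy):

```python
import math
from itertools import combinations

# each block group is a point in n-dimensional space, one axis per
# socioeconomic variable; the "closest" pair counts as most alike.
# the profiles below are invented for illustration.
block_groups = {
    "A": [0.6, 0.1, 0.8, 0.3],
    "B": [0.5, 0.2, 0.7, 0.3],
    "C": [0.1, 0.9, 0.2, 0.8],
}

def most_alike(groups):
    # euclidean distance between every pair; smallest distance wins
    return min(combinations(groups, 2),
               key=lambda pair: math.dist(groups[pair[0]], groups[pair[1]]))
```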
****Software was inevitably written specifically for the project at hand, and customized results were as a consequence expensive****
At the same time, the systems began to direct their attention to smaller and smaller units of measure, to the individual and household level.
The geodemographic industry was founded upon resources generated by public programs. These programs, ZIP codes, census data, and 911 address standardization, produced standardized regions as well as locational data (that is, latitude and longitude information) for particular entities. These standard regions were not developed for the specific requirements of the geodemographic industry. Nevertheless, the industry used these legacy regions as the foundation of their marketing analysis. System improvement was marked by the ability to define, categorize, and target smaller and smaller fixed regions
What changed at the end of the 90s?
1) The first is a new attention to what are viewed as temporally fluid regions
People's behavior can be understood only by understanding that people are mobile, and that they routinely move from home to work (school, markets etc.) and back
2) The second trend is the development of location based services (LBS). These services are in the first instance designed to coordinate or assist the activities of mobile individuals as they pass through stable regions
the recognition of temporal changes in the character of regions and the interest in tracking mobile individuals - are in an obvious sense closely connected
Two factors have been especially influential in the latest development of geodemographic systems. The first is the role of government-subsidized information. Especially critical here have been global positioning systems, the development of which was originally financed and implemented by the U.S. Department of Defense. Also, U.S. Federal legislation now mandates that mobile telecommunications devices transmit, under certain circumstances, their location
BUT OF COURSE, THIS MATTER HAS BEEN ECLIPSED BY THE GSM INFRASTRUCTURE, LOCATION FINGERPRINTING, AND SERVICES
IMPORTANT ANALYSIS:
we wish to trace the ways in which the sociological and geographical understandings of the developers of these systems and the systems themselves have been mutually influential.
NOTICE THE GREAT POINTERS HERE TO CLASS AND RACE!
Indeed, the today-familiar forms of horizontal segregation, emblematic of the idea that you are where you live, developed only through the nineteenth century, as a result of a growing middle class wishing to express its new status, and at the same time wishing to attain a degree of separateness from the less well-off people whom they had economically left behind (Johnson 1978). Each small region or neighborhood came to be seen as a place wherein
The ideal, if not the practice, of locational marketing can be found in this historical moment. Regional space was considered a container which people occupied without really affecting
Each of these neighborhoods could be conceptualized as a container, within which there were households and residents who occupied that neighborhood much like sardines in a can. That is, their inhabiting a neighborhood was just a matter of being there; it was fundamentally passive. Computational energy was spent, not in redefining regions, but in defining a set of classificatory categories, and assigning the extant regions to those categories
But with the 1960s, even as the flight to the suburbs continued, and even as this sociological ideal was being implemented in working geodemographic systems, the nation was increasingly rent by schism. The notion of the suburbs, or of any segment of American life, as united in a set of core values or ideals, was increasingly challenged. The very premise of locational marketing - the social cohesion of neighborhoods - became increasingly questionable. Marketers responded with the technological means at their command, and within a conservative ideological framework. They made their locational analysis more and more precise in the seemingly desperate belief that at some level - if not 40,000 people then 1,000 people, and if not there, well, then 40 people - the ideal refuge of a like-minded group of neighbors could be resuscitated
The increasing availability of ever more precise locational data as well as ever more abundant personal information, and marketers’ sense of devolving social cohesion, have gradually led to an ever more narrow definition of the “where” of “you are where you live,” until it is now thought of as the skin that marks the boundaries of your physical extension
SO IMPORTANT:
In carrying “You are where you live” to its technological and analytic extreme, it has turned on itself, and begun to recognize that the spatial container cannot be the primary definer of its individual contents
MOST PRIVACY WORK ASSUMES THAT THIS MODEL IS STILL VALID: FROM WHERE YOU ARE I CAN INFER WHAT YOU ARE DOING AND WHAT YOUR INTENTIONS ARE.
marketers and demographers have begun to understand regions themselves as constituted by the patterns of activity of individuals
TEMPORALITY OF HOW THE UTILITY OF THE SYSTEM IS MEASURED
The individual is an active geographical agent, making decisions on the fly, as opportunities arise. And here those decisions seem inevitably to occasion responses on the part of the users of the systems, just because the systems for the first time allow immediate validation of their worth. If a store uses a geodemographic system to offer electronic coupons to people walking by, or if a digital sign promoting a sale is set to appeal to an especially large group of people with certain tastes, again known to be walking or driving by, the utility of the system is immediately evident.
AND THIS IS THE POINT: BUT THE NEGATIVE SPACES ARE WHERE THIS AMBITION OF COURSE TURNS WEIRD
So just as from a philosophical point of view the new systems are fulfillments of the desire for a richer way of understanding people’s geographical behavior, they at the same time are themselves active agents in manipulating that behavior to create “ideal” geographies
WHAT ARE DIFFERENCES IN WHICH POWER MANAGES SPACE: TIME AND VISIBILITY
There is another sense in which the public domain becomes privatized through new developments in geodemographic systems. That is the degree to which the character of lived regions becomes the product of the goals and strategies of ever fewer, more interlinked, well-capitalized, and private corporate interests. Corporate and state actors have always been significant actors in the social construction of place. Historically, though, the mechanisms of those actions have at least been visible and to a degree opposable. Highway projects loom before they are built. The effects of redlines are enduring, relatively stable, and noticeable. However, new systems will potentially allow the instantaneous reconfiguring of spatial elements toward any emergent strategic end. The spatial contours of places will become more fluid, and the means by which the existence, the meaning, and the social importance of places are negotiated will become more fast-paced, and less visible to their inhabitants
The sociological belief that “You are where you live” has fostered a drive to understand “where you live”
BUT NOW THE YOU IS ALSO DISSECTED, YOU ARE NOT REALLY SEEN AS ONE BUT MULTIPLE, AS A SET OF GESTURES, EMOTIONS, AND OTHER THINGS
Since regions are created by the behavior of individual inhabitants, the goal becomes to influence those behaviors through direct, persuasive appeals. Regions are managed by managing individuals
creation of temporal spaces:
predicting action
gaming action
simulation:
simulate locations so that you can plan for them
apple xcode
android emulator: mock location data
not just used by app providers, but also pokemon go users
the world needs to be legible
in place and time
so that it can simulate things
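a minimal sketch of such simulated location data (coordinates approximate, interpolation linear; real tools like Xcode and the Android emulator replay fixes like these to the device):

```python
# simulate a walk between two points as a series of mock GPS fixes,
# by linear interpolation between start and end coordinates.
def simulate_route(start, end, steps):
    (lat0, lon0), (lat1, lon1) = start, end
    return [(lat0 + (lat1 - lat0) * i / steps,
             lon0 + (lon1 - lon0) * i / steps)
            for i in range(steps + 1)]

# e.g. a short simulated stroll across brussels (coordinates approximate)
fixes = simulate_route((50.8466, 4.3528), (50.8503, 4.3517), steps=4)
```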
creation of negative spaces:
the meeting spot of couriers in brussels (femke snelting)
or the airplane tickets that become cheaper because they fall outside the predictive profile of the masses