Notes from the Workshop Anyware - Location Privacy at VUB: 13 October 2016
Organized by Mireille Hildebrandt and Irina Baraliuc
Also in celebration of Katja De Vries' defense
https://vublsts.wordpress.com/2016/09/28/anyware-privacy-and-location-data-in-the-era-of-machine-learning/

Solon: ask not what your algorithm can do for you: the ethics of autonomous experimentation

Waze ("outsmarting traffic together") was a smartphone app giving routing instructions for driving. People using the app report back what the driving conditions are: a police officer here, certain things happening on this road. Learning about road conditions comes from the people using the app; by using it, you were reporting. Purchased by Google, now under Alphabet.

What does this have to do with autonomous experimentation? "This seems to be the optimal path based on what we know about traffic conditions" means learning from other users' experiences. If the service begins to direct everyone to location A, it may no longer know what happens in location B, where it diverted people from. To deal with this problem there is a known technique: you send people down some other paths, about whose driving conditions there is uncertainty, to see what the conditions are there. Users are being used to collect information about routes that you have told other users to avoid.

There are different terms for this: explore/exploit algorithms. For any given routing instance, you can exploit what you know about driving conditions, or you can use drivers to explore and see what they discover. This optimizes for the system overall, so that in general you have an optimal solution for most users at the expense of individual users: one user bears a risk, the others share the benefit.

Machine learning works from observational data, historical information that you already have. Explore/exploit is experimental: you are proposing alternatives and looking at the consequences of the alternatives, varying treatments and comparing effects. These things are now being merged: online learning raises ethical implications at the intersection of the two procedures.

Credit: you make a credit decision, it affects the world and who is getting credit, and you want to know what happened as a result of your assessment of people. We deploy the model, see the effect, and retrain the model; online learning makes that continuous.

A/B testing and multi-armed bandits: in optimization problems you think you are getting to an optimal solution, and you have confirming evidence that this is the best solution, but by using randomness you may notice that there is another local maximum that is better.

Reinforcement learning is machine learning and experimentation mixed. AlphaGo is the success story of AI; it had two primary steps: look at previous Go games, then have the computer play itself using reinforcement techniques, which includes exploring areas not explored by humans.

The same methods apply to things that policy people are concerned with: you deploy more police in one part of the city, and as a consequence you deploy less police in other parts, so you cannot see what is going on elsewhere. Bandits could solve this problem and help you avoid it.

You are confronted with predictions driven by uncertainty: if there is something you don't know much about, the prediction is uncertain. The effect is not necessarily worse treatment, but there is uncertainty. Should uncertainty be borne by individual users? Who is the person selected to explore the uncertain area, and who gets to exploit that information? Why is there greater uncertainty about certain solutions to a problem?
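The explore/exploit trade-off described above can be sketched as an epsilon-greedy multi-armed bandit. A minimal illustration, not Waze's actual algorithm; the route names and success probabilities are hypothetical:

```python
import random

# Toy epsilon-greedy bandit over alternative routes (all numbers hypothetical).
# Each "arm" is a route; its hidden quality is the chance of a fast trip.
TRUE_FAST_PROB = {"route_a": 0.8, "route_b": 0.6, "route_c": 0.3}

counts = {r: 0 for r in TRUE_FAST_PROB}
values = {r: 0.0 for r in TRUE_FAST_PROB}  # running mean reward per route

def choose_route(epsilon=0.1):
    """Exploit the best-known route, but explore a random one with prob epsilon."""
    if random.random() < epsilon:
        return random.choice(list(TRUE_FAST_PROB))  # explore: this user bears the risk
    return max(values, key=values.get)              # exploit: use what we know

def update(route, reward):
    counts[route] += 1
    values[route] += (reward - values[route]) / counts[route]  # incremental mean

random.seed(0)
for _ in range(5000):
    r = choose_route()
    reward = 1.0 if random.random() < TRUE_FAST_PROB[r] else 0.0
    update(r, reward)

print(max(values, key=values.get))  # with enough trials, typically the best route
```

The ethical point of the talk is visible in the code: the `epsilon` fraction of drivers are knowingly sent down uncertain routes so that the system as a whole can learn; the benefit accrues to everyone, the risk to the explorer.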
Ethics: the Belmont report: autonomy; beneficence (do no harm); justice (it would be unjust for prisoners to be the only subjects of risky experiments whose welfare flows to others).

What is the relevance here? Autonomy is a consent issue. Beneficence: users are knowingly put into uncertain conditions where they may incur extra cost; maybe there is a reason why we have certain historical data, because humans know that a route is risky or inconvenient. Justice: you can imagine that there will be much more uncertainty about less common solutions; if a minority exhibits different behavior than the majority group, then, being less common, that population will be subject to more experimentation. The direct beneficiaries will be that same population, so maybe it is ok, but the question still stands: is this the appropriate way to learn the information? Could they benefit from a different solution? How do we avoid subjecting people to significant risk? That is what beneficence is telling us to do.

Baseline: what is the baseline? Humans sometimes don't have an intuition about what the right solution is. The argument has to be that there will be many circumstances where there is knowledge of preferable solutions which is historically not well known for certain populations. Like Google, we need to ask why there is uncertainty in certain areas, providing greater social context to what is currently known by these platforms.

Naive and obvious questions to answer: does the person know they are part of an experiment? What information does the experiment intend to discover? Could that information have been obtained otherwise?

Judith Simon (philosophy of STS, Copenhagen): location-based data and privacy: some epistemological considerations. Is there a paradigm shift in science (due to changes in methods)?
Target case: invasion of privacy: illegitimate access to data versus informed consent through payback cards. The invasion of privacy is not due to the gathering of data, but due to data processing and inferences. Big data practices are epistemic practices; the proposal is to look at big data practices through epistemics and politics.

Location data: Thomas Hofmann (ETH Zurich), data analytics: hermeneutics of location data ("zur Hermeneutik digitaler Daten"). Person-based raw data + common background data = enriched person-based data: semantic maps + social information + predictive models. Problems regarding privacy, discrimination, ...

What is the epistemic difference between consumer data and location-based data? Data about location is non-inferential. Relevance of location data: it is data on presence and movement (this is banal). If I know where your mobile phone is, I know where you live and your workplaces; if I zoom in, I know where you spend time, whether you have a child; I know your secret endeavors. Data of movement: mode, route and speed of transportation, when, from where to where. It gives us a deep description of a sample of one, versus inferential statistics, where you make a relationship between an individual and aggregated individuals and draw inferences. That is not new; you can do it with location data too, but it is not necessary for all usages.

How do you ensure location privacy if location data is so expressive? Solution 1: restrict data usage to the aggregate level. Solution 2: restrict data gathering, data sparsity.

Jean Paul Van Bendegem: Plato in the background (or in the machine)?
I want to explain a simple point: there is lots of space to explore. I am a philosopher of mathematics. When I read machine learning material, to my taste there is not often a questioning of the applicability of the mathematics, let alone of the mathematics itself. In 10 minutes I will question the whole of mathematics.

Niccolò Tartaglia (1499-1557), beginning of the 16th century. I am fond of this kind of etching. We today would interpret it as a superposition: you have trees, a cannon, the smoke of the cannonball, and we are tempted to say there is a superimposed geometry, a model of the cannonball's path.

Compare the introduction of a machine learning/location thesis: we model the data "as if it were generated by a state space model"... we make it discrete for simplicity's sake, and we take the uniform distribution as the start model. There are different places, and as a start position we assume it can be anywhere: you have a probability distribution, you take it discrete, so that you have the same probability of being everywhere.

What has happened between the Tartaglia etching and the PhD thesis? An important role was played by the purification of mathematics. I dare to use that term, since we talk about "pure mathematics" vs. applied. If you look at the historical development, starting with Tartaglia, the term they used was not the opposition between pure and applied but "mixed mathematics". Tartaglia's etching is one picture; only if you could pull the elements apart would you have a mixture of the two. They would only consider arithmetic as being pure, but back then they did not have the infinity of numbers, they only had finite numbers. It took some time for the separation between pure and applied maths to emerge; whenever part of what was considered pure math became infected with applications, the consequence was elimination.

Hardy, A Mathematician's Apology: what I have been doing all this time could have caused no harm whatsoever, because it has no use whatsoever; I can't have done anything wrong. That goes together with an ontology and epistemology, and it is related to Platonism. In a sense, Platonism has been created! In opposition to the (neo)Platonist view stands the constructivist view: mathematical objects are created, which involves procedures and notations, and it also applies to identity: you say either things are the same, they are different, or nothing can be said about it.

The major difference: location in the pure sense, just being there, versus location + procedure (program): always think of the two together. If you think that is the basic unit... I need to insert Wittgenstein here. If there is no basic unit of what location is, is extending this into a basic unit interesting? In road maps, some places are interesting. Being intentional: what are the purposes? The basic unit is location, procedure, and all these other things. Why did the ancient Egyptians need right angles to measure the banks of the Nile after the flooding?
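The state-space passage quoted from the thesis above (discrete places, uniform start distribution) can be illustrated with a minimal Bayesian filtering sketch. All matrices and numbers here are hypothetical illustration values, not those of the thesis:

```python
import numpy as np

# Minimal discrete state-space sketch: a phone can be at one of N places.
# Start model: the uniform distribution (equal probability of being anywhere),
# as in the thesis passage quoted above.
N = 4                      # places: e.g. home, work, shop, park (hypothetical)
prior = np.full(N, 1 / N)  # uniform start distribution

# T[i, j] = probability of moving from place i to place j (hypothetical values)
T = np.array([[0.7, 0.2, 0.05, 0.05],
              [0.1, 0.7, 0.1,  0.1 ],
              [0.2, 0.2, 0.5,  0.1 ],
              [0.3, 0.1, 0.1,  0.5 ]])

# O[i, z] = probability of observing signal z at place i (2 signal types)
O = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.5, 0.5],
              [0.6, 0.4]])

def filter_step(belief, obs):
    """One step of Bayesian filtering: predict with T, then correct with O."""
    predicted = belief @ T
    updated = predicted * O[:, obs]
    return updated / updated.sum()

belief = prior
for z in [0, 0, 1]:        # a short hypothetical observation sequence
    belief = filter_step(belief, z)
print(belief)              # posterior over places after three observations
```

Starting from "it can be anywhere," a few observations already concentrate the probability mass on a few places, which is exactly the expressiveness of location data that Hofmann's talk pointed at.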
It is easy to calculate rectangles, and that is great for taxes. If you include the taxes, you see why in Tartaglia it is not an accident that it is a cannon: it could have been a ball, but that would not have been so useful.

A paper explaining random walks has two pictures. First, a person going 45 degrees left or right with equal probability; at the end you get a drunk person. Later you see a graph of the USD/EUR exchange rate, and it looks like a random walk. Why is there no drunk person at the end of that curve? It means something else. The pictures are similar; mathematically they are the same, but you treat them differently because of all the elements in the basic unit.

Q: is it better to be treated as a member of a class or as an individual? Maybe for some people it is better to be treated as an individual and for others as a group. Q: is this related to the uncertainty, so that we can experiment to find out where we want to be? Insurance: if you knew for certain that somebody would get sick, the insurance system would unravel: you would charge them the exact amount. It is the uncertainty that allows us to socialize the cost; uncertainty enforces group categorizations that we would lose if we treated people as individuals.

On aggregation: what if there is an outlier? Is that the counterpart to the individual, or is it not an interesting feature anymore? The outlier would be not at the level of the person but of events: a person leaves his cell phone at home and it doesn't move the entire day; does that make me something special?

Q: what would your opinion be of attempts in AI to code ethical reflection, through the Belmont principles, into automated decision making?
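Van Bendegem's two "identical" pictures can be reproduced directly: the drunkard's path and a price-looking curve are the same mathematical object, a cumulative sum of random unit steps. A minimal sketch; the "exchange rate" series is purely synthetic, not real market data:

```python
import random

def random_walk(steps, seed):
    """A path of unit steps left (-1) or right (+1) with equal probability."""
    random.seed(seed)
    pos, path = 0, [0]
    for _ in range(steps):
        pos += random.choice([-1, 1])   # 45 degrees left or right, equal probability
        path.append(pos)
    return path

drunkard = random_walk(1000, seed=1)                           # "a drunk person"
exchange = [1.10 + 0.001 * x for x in random_walk(1000, seed=2)]  # fake "USD/EUR" curve

# Same generating process; only the labels, scale and units differ.
# The difference in meaning lies entirely outside the mathematics.
```

Plotted, the two series are indistinguishable in kind, which is the talk's point: what distinguishes the drunkard from the currency is the whole "basic unit" of location plus procedure plus purpose, not the formula.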
A: The paper gave me appreciation for people critical of the singularity. This is often what they have in mind: Bostrom's scenario where you teach a machine to make paper clips and it tries to turn every atom in the universe into a paper clip. Now I understand where these ideas are coming from: a particular way of designing a machine that intervenes and learns but is oblivious to a bunch of things in the world. Goal alignment is what you are describing; as much as I still find those people obnoxious, there is some legitimacy to it. Separate work: can we design systems that take this into account, e.g. a multi-armed bandit that does not experiment on populations that cannot carry the burden?

Q: is it a balancing choice that the developers are making, of how many users they can screw before they lose their audience, as to what degree of experimentation would be acceptable? A: the algorithms are mathematically designed to be very efficient, so they can involve a small number of experiments, but there is an independent business decision to be made about it.

Q: location data is more direct than web data? I was surprised by that statement; certain web pages also give a lot of intimate information about things that I am interested in. A: if I look for something, it is indicative of something I am doing, but it is not directly telling you; you cannot really know for sure, I may be looking on behalf of someone else, whereas where I am is where I am. The more you dive in, the shakier the distinctions will be. I asked my father to forward me the post-factual emails he receives, about what Merkel is doing, paranoid emails. What does this tell you? If you just have this information, you don't know whether I oppose these news items or not. Web behavior is only probabilistically indicative, unlike location.

Q: why would the producers of these applications respect principles of bioethics? They are not producing science, they are exploiting a business opportunity.
A: The Facebook contagion study: people rely on ethical practice to do corporate practice. The paper that we have written does not fully acknowledge that this is a simplistic way of dealing with a practical problem; there is no expectation that companies will adopt these principles. The point is to encourage people to think about how you might design systems that take ethics into consideration; that gives some force to the argument.

Q: contrasting very routine location data with very marginal web data: I receive some stuff I am not interested in; I have my routines on the web, and random places are of less importance to me.

Lydia Nicholas, NESTA: most of the stuff you are talking about doesn't happen in local governments. Medium data, basic statistics. Who can push back against applications like Waze? Case workers may not be able to look at certain data, but there can be a system that looks at the different databases and provides risk scores; then you don't break confidentiality. "Smart places": traffic management and whether there are icy patches on the road; that is all there is in much of the UK, plus optimizing resources, making your bins get collected effectively.

Governments collect a lot of data, but most of the data interactions I have with the government are at life events: the times I apply for benefits, crime, significant life events. They don't know what I am doing in between. The benefit system requires knowing who you are living with, for how long, and your relationship with them; you may oversurveil certain populations (apart from your communications data, which is not part of governance). Case workers report about what is going on in a troubled family where a child is being abused; a lot of the data entered into these systems is garbled, and it is sensitive. It will take three years to get permission to get into the room where the information is entered, but then I was told the system doesn't work, so they are writing it down on paper. So what is going on in data practice within government and governance? There is resistance from people working on the ground to data capture. You generally take this job because you want to help people, build relationships of trust, and help them transform their situation; you don't want to make them legible to the state. You are seeing the impossibility of doing big data work here. People don't have the language, but they are thinking about the privacy issues; they also want to protect their job, they don't want to get fired. I find that extremely interesting as a point of focus: you get to what we are actually trying to do. You risk creating a social and healthcare system that looks like an Amazon warehouse: perfectly efficient, but with no time to have a cup of tea.

Steps forward: AI will get very good at parsing unstructured data; are you going to make your records deliberately obfuscated? Or, in one of the interesting projects, they are embedding data analysts in the small teams in place. When you try to identify families that need support, there are concern markers (attending school, drug use); usually you have to tick above a threshold to be referred to a general social worker. The ML use here is to segment the most significant risks and paths forward for those families, put them with specialist social groups, and get data analysts to work with them to find out not only what patterns there are but which ones matter.

End goal: an idealized social landscape. What our government could do is turn this whole thing into an Amazon warehouse, or you can find patterns that can transform government, transforming it from providing services to commissioning services. Whether you are watching in order to tick boxes and sustain, or to identify patterns in order to transform, produces a very different relationship between capture and machine learning than Google's and Apple's predictive models. It involves a lot of human and data expertise: the difficulty of getting care workers and data analysts in place and improving statistical knowledge. People reject correlations: "are you god?!"
Basic denial of correlation is something that you may need to battle with here. I have questions about the data analysis/care work relationship. A northern city wanted to know where to put cycle lanes, so they gave away bikes with GPS on them, 30 million pounds; they were proud of the strategy. Then someone knocked on the door: you realize that the bike hub would have given you the location data and routes for free? And those are the people who chose to bike and engage with the app anyway. There are basic issues in education and infrastructure, in the day-by-day work of producing state knowledge and how it impacts people.

Arjen de Vries: who controls your search log data? Information retrieval. I did a project on IR for children, to improve it for children; the best way to do that was to use log data from Yahoo. I was enthusiastic about the profiling: you can do cool things for normal people. Somebody wanted to teach high school kids about profiling and digital identity, to teach them to think about what they do online, and I was asked for advice on the tech aspects: these companies know everything about our online searches and what your interests are.

At the same time, the web is getting more centralized, although the web was intended as a decentralized thing (cf. the Decentralized Web Summit). Too much centralization means too much power put together in a few parties that control the info online. The photos of that very event are on a Google Drive; but they are meant to be shared! :::::>>>> centralization is not the only way to share. Mobile makes things only worse. There is one way to earn money as a company: sell data. What can we do for search? Decentralize and localize; for search that might be harder. The price curve for storage is flattening, because the machines are bought by data center parties and by people who illegally download films. We are used to doing things for our own house, like central heating; could we, in the same way, store the whole web at home?
The data that I need from the web is much smaller than the web. It is a naive proposal with two problems: how to get the data onto the personal search engine, and how to replace the lack of usage data from many users.

Getting the data: the idea is to organize the data in bundles and use techniques inspired by query obfuscation to hide the real user's interests when downloading bundles. The web archive to the rescue? We ship only the part that is of your interest; but does this mean that the web archive gets all my information instead of Google? And search gets slow. Query obfuscation hides your profile a bit while you still get the chunks that would be most useful for you. How do you keep the data fresh? Make use of the fact that you use certain sites frequently.

No log data: is that bad? Yes! All sorts of Google search functionality (predicting query intent, ranking verticals, etc. etc.) is only available in exchange for the data that we give to these technologies: they do something good with the mass surveillance. Without log data, retrieval experiments in academia are hindered. A related problem: reproducibility vs. representativeness of research results. We can't study these companies, only look at them from the outside or as interns.

Alternative sources of clicks: the bitly API lists how often links are shortened and clicked; Wikistats; Google Trends; anchor text with timestamps can be used to capture and trace entity evolution, or to find popular topics!

Trading log data: data markets for usage data. Behavior data turns into something valuable for us ourselves, so I am much less concerned. Challenges: how to select the part of your log data you are willing to share? How to estimate the value of log data?

Truly personal search: safely gain access to rich personal data, including email, browsing history, documents read and the contents of the user's home directory. Can high-quality evidence about an individual's recurring long-term interests replace the shallow information of many?
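The bundle-download idea with query obfuscation can be sketched in a few lines. A toy illustration only; the bundle names are hypothetical, and real query-obfuscation schemes are considerably more careful about making decoys statistically plausible:

```python
import random

# Toy sketch of query obfuscation for bundle downloads: alongside the bundle
# the user actually wants, request decoy bundles from unrelated topics, so the
# server (web archive) cannot tell which bundle reflects the real interest.
BUNDLES = ["cycling", "cooking", "astronomy", "gardening",
           "politics", "chess", "fishing", "opera"]   # hypothetical bundle names

def obfuscated_request(real_bundle, n_decoys=3, rng=random):
    """Return a shuffled request mixing the real bundle with random decoys."""
    decoys = rng.sample([b for b in BUNDLES if b != real_bundle], n_decoys)
    request = decoys + [real_bundle]
    rng.shuffle(request)                 # hide which position is the real one
    return request

random.seed(42)
req = obfuscated_request("cycling")
print(req)   # the server sees four bundle names, only one of which is genuine
```

The trade-offs named in the talk show up immediately: bandwidth and latency grow with `n_decoys` (search gets slow), and naive uniform decoys leak information over repeated requests, since the real interest recurs while decoys do not.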
Alison: slightly different ways of talking about our shared interests. Rather than privacy, talk about communication as something people have always claimed a right to do; privacy and communication are corollaries of each other: to communicate, and not to be forced to communicate. Data and citizenship belong in the broader framework of communication.

Seda: sketch out a transformation from a communication paradigm about access to a network to a communication paradigm of production of data (a compulsion to produce data for someone to take action); how we organize our institutions is affected by this. The problem: the second paradigm is not about rights. With the shift towards producing data for the purposes of action, rights claims become more contested.

First, the communication paradigm of access to a network. 15 years ago the information society discourse was about getting online and connecting with other people, the glorification of a global network of peers. It was electric: speak to each other without intermediation. That network grew into one in which many entities, people or not, are on the network; it is totalizing. Increasingly people have access to the network :::::::>>>> NO, they continue to exist on it for other people!

That is the end of the electric vision of globalization and equality, and the end of a certain way of making communication-based rights claims. The right to communicate is now a compulsion to create consistent data; the network cannot continue to generate value unless people continue to produce data. Business models have shifted from a whole bunch of companies providing access to a network to intermediary giants whose business is data processing and calculation: a shift from rights claims to data acts. Data acts happen through intermediation, and location is a good example here: a coordinate, long/lat, is nothing in itself; it needs to be constructed, you must be placed on a map and positioned in relation to other information.

Next argument: this intermediation creates what is meaningful about data and the way that data can be connected to action. My concerns are about citizenship and the ability to act on things that people care about in the world. The intermediation happens through a framework that is first applied by the corporate sector and then the public sector; the public sector often responds, not knowing whether it should repeat these things. The frames are consumption and optimization. The corporate actors make the conceptual space: what sorts of things get to be data. I work with civic orgs, bottom-up orgs; they reappropriate the same kind of frameworks: data is valuable because you can link it to consumption and optimization. There are some interesting responses in the realm of ethics, normative and critical; we have this notion of data citizenship.

Taking information, calculating it and making a calculative judgement produces a consumer model. What is location data good for? Things that have consistent identities in space, based on the location of an object that gets information. Which entities will create the first structured data? First, the ones that want to sell you something: the classic consumer model. Second, the optimization model: let's use data to make things more efficient. This rests on the idea that an expansion of the things that are computable, an expansion of the areas of life that can become more optimal, comes with datafication. Commuting applications have a two-sided business model: you buy the app as a commuter ("don't bother taking the Central line"); the info depends on the availability of fully up-to-date streaming information from the transit authority; the apps take the potential of that mass data and turn it into something that an individual can act on, and then sell the data back to the transport authorities. This positions the ideal citizen as producer and consumer of data, and the public institution likewise as free producer of data and consumer of data and optimization. What's interesting: these are the early, low-hanging-fruit frameworks for how to use location data for an improved civic experience.

The civic groups are doing the same thing. FixMyStreet aggregates citizen identification of problem issues in local areas. How do I demonstrate to a government that they should not close the local library? This illustrated to me that there is a particular kind of computable civic action for people who are working with the data, one which does not fundamentally challenge the consumer paradigm of data production for consumption and optimization.

The normative response is about justice: inclusion/exclusion in relation to the results of the calculation; existing frameworks: protected classes, abstractions and contextual integrity. The critical response draws on Boltanski's notion of critique, questioning the nature of things: can we question this fundamental shift in citizenship, how it has been redefined through datafication, and what it means to have only spaces of action instead of rights claims as a way to claim civic positions? The play of uncertainty: make things unoptimal; you may think about optimization-resistant behavior. There is good research showing that if you seek to datafy an organization, there is always information that cannot be turned into data. One possible way of thinking about this is to shift from an ex-post framework, applied at the end of the technology, to thinking about the construction of ethical behaviors all the way through developing a technology.

Irina Shklovski: mutable stories: the shifting accountability of data interpretation. The conditions we are discussing here are not about tech per se, but a blend of human decision making and tech practice; not about algorithms and data themselves, but about the output and its interpretation, the connection between data and action.

Information as thing deserves careful examination. Michael Buckland: data is "information as thing". Giving this kind of attention to data now seems strange; people used to debate whether it was worth scholarly attention. Information-as-thing is just an object; it may or may not be informative. Yet information as thing has become central to our worries and criticisms: we worry about data and services. What seems to be said over and over: there is a difference between describing patterns in data and attempting to understand and explain those patterns. What must be known for data to be interpretable? How does one tell a story with/from data? How do data shift from thing to knowledge? What is the minimum necessary dataset for interpretation? Given enough data, can the output be interpreted in a way that is actionable?

Location is a commodity. Commodification: removing something from its context of production such that its value is determined by its context of use. Commodity fetishism: when the commodity's value is entirely determined by other means, such that the richness of its context of production is lost, and its value, which once came from social relations, is invested in the object itself. Commodification here shifts the process of interpreting by changing the available minimum necessary dataset. Location data is commodified, and this displaces the practices for making social meaning. Interpretations of data can reshape expectations, accountability, etc.
interpretation of data is about managing uncertainty when data used for decision making are interpreted through the lens of the social relations and contexts of its production - the minimum necessary dataset is inflected with the consequence of the decision taken commodified data are never the entire story byt form a partial basis for it... Katja de Vries Thesis presentation Baroque 17th century time of scientific revolution cartesian separation between object subject newtorn society: difference between subjective beliefs and objectives facts baroque: in reaction to protestantism endless theatrical art style 17th century merchantile capitalism the first financial bubbles first insurances and probability calculus a very baroque work of art frescal ceiling of a church in home visual illusion if you stand in thie jesuit church in rome you have to stand in a specific marble stone it is as if the sky opens above your head if you step down, the visual effect stops as if the building collapses on ou andrea ... 
had a problem with perspective painting that works in more than one stop another interpretation is that potzo, exactly wanted to show that this is an illusion it is good to rejoice in the performativity and artificiality of the paingin you step down from the marble stone and it collapses on your head it rejoices in the work of making something work baroque is not only an artistic style but a style of thinking what is it to make something work and in general it is a style, the process of making, and how to work with differences and probabilities leibnitz: great thinker of baroque working with princess sophie and something happens sophie says: is everything really different and he says, let's do emprical philosophy and find leaves that are identical they walk around for a while, and they say we cannot find identical leaves hegel says this is like children looking for the same snow sakes but it is turn: how do we deal with difference and sameness chapter 2: traces if an animal walks through the snow, it leaves a trace the trace is open for interpretation a dear, a wolf, and what is it that you are after scientific, hunter, tourist do you follow the trace, or turn around and run for it we give interpretations to traces and act on it interpretations can be dead serious it is what decides who is in the stew: the dear, or the hunter interpretation is a word i don't use in my thesis it sounds voluntaristic as if you can chose what you want to see but it is limited by your body, the space that has constituted you this is a tick: it has three sensations sweaty or not, hairy skin or not, 27 degrees or not is the experience of the tick wrong, or not??? perceptions or percepts is the word perceptions are not determinisitc things can appear is diferent ways picture that can be interpretted as a young fgirl or an old woman you can extend bodies in ... 
with a dog, glasses, google glasses, laptop what the body can see and perceive will not be the same train a body in a particular way of doing things practice of law, science or medicine again, this body will not be the same, and what the body can see and perceive will not be the same what makes someone act or perceive mapping what makes someone act latour calls the mapping of the netowrk or work-net the work that goes into the percetption that is getting foregrounded i look at two worknets eu informational fundamental rights and network of ML algorithms that are applied to human behavior and characteristics we return to the baroque the sameness of those two networks practices that escape the opposition between art and science these are not completely baroque they are a bit modern and a bit baroque and they could even become more baroque i juxtaposed these two networks and how they contaminate each other the exercise of studying these networks has two effects on two other levels what is the best way to study the practice of making: law making, how do you study that as a philosopher of technology do you look at it as a general way or do you look at the specificities of makins the analysis is also used to read philosophy against the grain as to how identity and difference are related to each other a recurrent theme is the heigh ho heigh ho song what does it mean to make something work today digital traces: not only animals leave traces in the snow the amount of traces we leave has exploded the traveability isnot limited to what we do behind the screen but the signals emitted by mobile phones and footage captured by smart cameras footage from a smart camera: how we stroll from a real shop capter 4 machine perceptions machines categorize they give interpretation separating male from female faces google making a mistake smart face analytics the tech that is used is probabilistic this face is mostly happy and a a little surprised, sad summarized in clearly understandable 
labels. A trace is nothing without an interpretation, and also nothing without acting upon that information. The question with modern tech is the same as with the trace in the snow: who ends up in the stew? Let's look at the face again. A startup: if you are angry, our app will offer you a whiskey; or, if we recognize that you are angry, we will not respond at all. What is the work-net of the network of machine learning? I looked at 11 specific algorithms and how they construct identity and difference: is this face angry, sad, male, female? The main idea: classical programming is explicit instruction; machine learning is examples, instructions on how to extract patterns, and sometimes feedback ("you classified this wrong"). Chapter 6: percept creation by EU fundamental rights: privacy, data protection, and anti-discrimination rights. The choice of these rights allows us to see something about them. What is important in understanding fundamental rights is that they are embedded in a certain political constitution. At the end of the 18th century the state is no longer Leviathan but has a pastoral quality; the state is more like a shepherd. Why do I compare these networks? They both create similarities between cases; they are not scientific at all. But in the case of fundamental rights there is something else at stake: it is not just a baroque practice, there is also a political idea behind it. Can pastoral power be included in machine learning? Chapter 7: contamination of ways of being. Can ML be expected to make a political option so that the sheep and the people are given the possibility to resist? One of the possibilities is to have some transparency about how the algorithms work. How do we co-exist with ML being applied to us? One answer is that we would be able to see what the algorithms do and think about alternatives. Chapter 8: what is making, and how do we study it?
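The contrast drawn in these notes, classical programming as explicit instruction versus machine learning as probabilistic scores over labels ("this face is mostly happy and a little surprised"), can be illustrated with a minimal sketch. Everything here is invented for illustration: the threshold, the label set, and the raw scores stand in for a trained model's output; this is not any particular vendor's face-analytics API.

```python
import math

# A classical program: the rule is written down explicitly by the programmer.
def is_smiling_rule(mouth_curvature: float) -> bool:
    # Hand-picked threshold: the instruction is stated directly.
    return mouth_curvature > 0.3

# A probabilistic classifier instead outputs a score per label, learned
# from examples rather than written as a rule. Here the "learned" scores
# are hard-coded just to show the shape of the output.
def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["happy", "surprised", "sad", "angry"]
raw_scores = [2.1, 0.4, 0.1, -1.0]   # would come from a trained model
probs = softmax(raw_scores)

# Summarized in "clearly understandable labels":
for label, p in sorted(zip(labels, probs), key=lambda t: -t[1]):
    print(f"{label}: {p:.0%}")
```

The point of the contrast is that the second program contains no rule connecting pixels to "happy"; the mapping lives in the scores, which is why the output is a graded mixture of labels rather than a yes/no answer.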
Looking at two ways of making: making through informational rights and through ML. What is it actually to make something? Can we, by looking at two different ways of making, say something about the nature of making? Solon: you say we need more baroqueness; what problems would baroqueness help us solve? I think it is good to say that ML is both baroque and not. It is not baroque in that it comes from a tradition of statistics, which is often about presenting an objective reality. The classical statistical approach: we look at how people with psychosis behave, we create a model, and we present reality in some way. ML applications are applied to reality at a quick pace. This means that, suddenly, it is not just knowledge on paper but knowledge that is being applied. When we are aware that these are performative interventions into reality, then it is important for the practitioners who create these algorithms to realize all the human efforts that go into them. What is the goal in creating these algorithms: a better world, optimizing profit? How do we create the variables (construct validity)? These are choices that are being made. How do we test? What is the standard against which we measure the algorithms? They are often tested on standardized databases: how well can an algorithm recognize those in a standard database (in comparison to others)? This is all very constructivist. It would be good if practitioners acknowledged that there is no ultimate standard for whether a model represents reality or not. An ML person once said: there is no "best", the question is "does it work". A plane doesn't represent reality, but if it flies from A to B without crashing, then it is a good plane. So what is it that we want algorithms to do? In the case of the plane: not destroy the environment. It is important to clarify to practitioners that creating an algorithm is more an art, one that cannot be measured by standards of reality and truth, and to make clear to them: you are making something that is like
recipes; one recipe has certain advantages over others. Solon: I like the idea of recognizing the constructivist aspect of ML. Do the practitioners not recognize it? Who has this mistaken belief, practitioners or others? Two answers. On a high level, the top ML engineers are concerned about the implications that ML has; when you look at their concerns, they are mainly ethical concerns. Cathy O'Neil gave a course to students and asked: can you develop an algorithm to weight your own essays? The algorithm affects the people themselves: an ethical awareness of its real-life effects. In my thesis I extend the view: it is not only an ethical concern, it is also a legal concern, and a political concern about the constitution. At lower levels of ML there is little understanding: when a company does not have high-quality know-how, you throw in data, something will come out, and decisions will be based on what the black box puts out. Outside these highly educated ML circles there is a great belief that "this is data, so this model must be true". There is a lot of work to do in making clear that this is not about reality. Judith Simon reads her review: summarizes and says it is great! How do you bring the matter of power asymmetry to the ML engineers?
Law can change the obligations companies have towards users. Once there is a legal situation where industry has to incorporate a certain power balance between users and the more powerful info-tech structures, it will be necessary to incorporate it into the pragmatism of ML. If you are a big IT company it is much easier to think: let's look at the DP requirements, what to do and how to comply with the anti-discrimination rules. As a more general matter there should be power equality (article 8): force companies to think about power balances in a broader way. This will be a way in which machine-learning pragmatism can be affected; engineers can be compelled to do something about it. But I acknowledge that the economic incentives are big. Thomas Heskens: I call myself a machine learner. You said ML is not science; what do you count as science: statistics, etc.? All modern science is disappearing; all of it is turning into engineering. I draw a line at the 17th century, with experimental science: a researcher doing an experiment and representing reality. So it is about the experimental part, and about truth, the idea that nature is discovered. Science is really biology, physics; but then when you start making models, most of the models are completely wrong but useful. That is the line. If they are wrong they are not science; if they are right it is engineering? This is one of the quotes: "all models are wrong and some are useful". In fact, this is the criterion that should now apply to everything called science: not a representation of reality, but models that are very useful. But ML is more upfront about it; it can be more upfront about the performative. IBM Watson: medical data, systems for medical diagnosis. You can go to Watson or to your physician; what would you choose? Watson has seen much more data. I currently would ask both. Ask them both; what if they give different answers?
Even the people who make them cannot tell exactly. But you can't tell what the physician's reasoning is either; though you can ask the physician. I can make a rule for a neural network to respond to such a question... neither can the doctor give you more. Medical practitioners will know that there is a model on top of the neural network; they can have some assessment of what is going on in the black box. What you see in those debates is that for medical practitioners it is not clear what input you should put into the machine. A doctor sometimes cannot explain, but knows based on his experience: that is weird. Jean Paul Van Bendegem: as a mathematician and philosopher I have one question; I am grateful to Tom for getting my question started. There was a nice exchange about what is science and what is not. If I understood right, math is not science. I am not a mathematician; I write about probability and sometimes I collide logic with mathematics. Math is a formal system, a cleaned-up space where certain manipulations are possible. In some ways mathematics can be very useful: you can build bridges with it, you can predict fluctuations on the financial market; it can be useful as a tool. Sometimes I have the impression, when reading the thesis, that you have an idea of pure mathematics; I still have the feeling that you are acknowledging this purity. I have used mathematics as a counter-side, as a contrast, to show what baroque sciences can be. On this question about science: I contrast law and machine learning with modern science, and I acknowledge that modern science does not really exist. If you were to disentangle that, in ML the baroqueness was more visible. If you would not leave math behind but baroquify it too, that would be great. Our whole approach is similar: we are standing at the right stone to see the illusion of law; if I want to show you the same illusion with machine learning, we need to make a move. That is your rhetorical strategy. There are two answers: 1) contrast positions: when you speak two languages and you discuss an
issue in another language, you see the limits of the first language; if you only speak English, you don't know how a language shapes you. 2) The front cover has the illusion by Pozzo. It is important, in law and in machine learning, to be able to make it noticeable that you are standing on the stone, and to step down and see that it is artificial. Gloria Gonzalez: my battery died!!! ========== NOTES IN PREPARATION OF TALK AT ANYWARE ========== We are, and always have been, continuously reshaped by the artifacts we shape, of which we ask: who designed the lives we live today? What are the forms of life we inhabit, and what new forms are currently being designed? Where are the sites, and what are the techniques, to design others? Location determines who you are: you are where you live; you make up location. The first is the ZIP (Zone Improvement Plan) code. Mandated as an element of President Kennedy's attempt to rationalize government, the ZIP code allowed for the first time the quantification, and thereby the easy organization, of both residence and business addresses. Under the ZIP code system, households were aggregated into units served by a single post office, serving at most perhaps 15,000 people, each indicated by a five-digit number. The US Postal Service at the same time established a system of numbering postal carrier routes, the routes traveled by each letter carrier; these received two-digit numbers, so that ZIP codes could in turn be divided into units of perhaps 800 people. As an incentive to use the systems, the Postal Service (or, actually, its predecessor, the Post Office Department) gave a discount to mass mailers who sorted their mail by carrier route, so that that geographical unit, defined by the daily path of the individual letter carrier, came to be the preferred unit of division. The US Bureau of the Census had begun the establishment of first the GBF-DIME files and then the (currently used) TIGER files.
First for urban areas and then for the entire country, these files were the basis of a computerized mapping system used for the first time in the 1970 decennial census. More to the point here, these computerized files, consisting in part of latitude and longitude values for the four corners of every block in every city in the US, allowed, through a process of matching with the Postal Service's ZIP code files, the determination of the geographical coordinates of every mailing address in every city in the US. (Rural addresses created special problems, which have only recently begun to be resolved through the development of new rural addressing systems, the impetus for which has been the perceived need to rationalize and support emergency response, or 911, systems.) This in turn allowed the creation not merely of lists, but also of maps of ZIP codes, postal carrier routes, and so on. Numerical taxonomy is a sophisticated way of clustering similar individuals by imagining them to be in an N-dimensional space, where N is the number of socioeconomic variables. So, using the roughly 600 socioeconomic variables available at the block-group level, the creators of the geodemographic systems determined the distance of each of 230,000 block groups to all the others in 600-dimensional space. The ones that were "closest" were characterized as being most alike, just as the ones farthest from one another were seen as least alike. These earliest systems, developed in an era before the advent of desktop computing, relied on computationally intensive numerical taxonomy and ran on mainframe computers. ****Software was inevitably written specifically for the project at hand, and customized results were as a consequence expensive.**** At the same time, the systems began to direct their attention to smaller and smaller units of measure, down to the individual and household level. The geodemographic industry was founded upon resources generated by public programs.
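The numerical-taxonomy step described above, placing each block group in an N-dimensional space of socioeconomic variables, computing all pairwise distances, and treating the closest groups as most alike, can be sketched in a few lines. This is an illustrative reconstruction, not the original mainframe software: the counts are scaled far down (200 groups, 6 variables instead of ~230,000 and ~600) and the data is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for census block-group data: 200 block groups,
# each described by 6 socioeconomic variables.
block_groups = rng.normal(size=(200, 6))

# Standardize each variable so no single one dominates the distance.
z = (block_groups - block_groups.mean(axis=0)) / block_groups.std(axis=0)

# All pairwise Euclidean distances in the N-dimensional variable space.
diff = z[:, None, :] - z[None, :, :]
distances = np.sqrt((diff ** 2).sum(axis=-1))

# "Closest" block groups are deemed most alike: for each group,
# find its nearest neighbor (excluding itself on the diagonal).
np.fill_diagonal(distances, np.inf)
nearest = distances.argmin(axis=1)

print(nearest[:10])  # index of the most-similar block group, per group
```

At full scale the pairwise-distance matrix alone has ~230,000 x 230,000 entries, which makes concrete why these systems needed mainframes and bespoke software before desktop computing.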
These programs, ZIP codes, census data, and 911 address standardization, produced standardized regions as well as locational data (that is, latitude and longitude information) for particular entities. These standard regions were not developed for the specific requirements of the geodemographic industry. Nevertheless, the industry used these legacy regions as the foundation of its marketing analysis. System improvement was marked by the ability to define, categorize, and target smaller and smaller fixed regions. What changed at the end of the 90s? 1) The first trend is a new attention to what are viewed as temporally fluid regions. People's behavior can be understood only by understanding that people are mobile, and that they routinely move from home to work (school, markets, etc.) and back. 2) The second trend is the development of location-based services (LBS). These services are in the first instance designed to coordinate or assist the activities of mobile individuals as they pass through stable regions. The two trends, the recognition of temporal changes in the character of regions and the interest in tracking mobile individuals, are in an obvious sense closely connected. Two factors have been especially influential in the latest development of geodemographic systems. The first is the role of government-subsidized information. Especially critical here have been global positioning systems, the development of which was originally financed and implemented by the U.S. Department of Defense. Also, U.S. federal legislation now mandates that mobile telecommunications devices transmit, under certain circumstances, their location. BUT OF COURSE, THIS MATTER HAS BEEN ECLIPSED BY THE GSM INFRASTRUCTURE, LOCATION FINGERPRINTING, AND SERVICES. IMPORTANT ANALYSIS: we wish to trace the ways in which the sociological and geographical understandings of the developers of these systems, and the systems themselves, have been mutually influential. NOTICE THE GREAT POINTERS HERE TO CLASS AND RACE!
Indeed, the today-familiar forms of horizontal segregation, emblematic of the idea that you are where you live, developed only through the nineteenth century, as a result of a growing middle class wishing to express its new status, and at the same time wishing to attain a degree of separateness from the less well-off people whom they had economically left behind (Johnson 1978). Each small region or neighborhood came to be seen as a place wherein like-minded people lived. The ideal, if not the practice, of locational marketing can be found in this historical moment. Regional space was considered a container which people occupied without really affecting it. Each of these neighborhoods could be conceptualized as a container, within which there were households and residents who occupied that neighborhood much like sardines in a can. That is, their inhabiting a neighborhood was just a matter of being there; it was fundamentally passive. Computational energy was spent not in redefining regions, but in defining a set of classificatory categories and assigning the extant regions to those categories. But with the 1960s, even as the flight to the suburbs continued, and even as this sociological ideal was being implemented in working geodemographic systems, the nation was increasingly rent by schism. The notion of the suburbs, or of any segment of American life, as united in a set of core values or ideals was increasingly challenged. The very premise of locational marketing, the social cohesion of neighborhoods, became increasingly questionable. Marketers responded with the technological means at their command, and within a conservative ideological framework.
They made their locational analysis more and more precise, in the seemingly desperate belief that at some level (if not 40,000 people then 1,000 people, and if not there, well, then 40 people) the ideal refuge of a like-minded group of neighbors could be resuscitated. The increasing availability of ever more precise locational data, as well as ever more abundant personal information, together with marketers' sense of devolving social cohesion, have gradually led to an ever narrower definition of the "where" of "you are where you live," until it is now thought of as the skin that marks the boundaries of your physical extension. SO IMPORTANT: In carrying "you are where you live" to its technological and analytic extreme, the industry has turned on itself, and begun to recognize that the spatial container cannot be the primary definer of its individual contents. MOST PRIVACY WORK ASSUMES THAT THIS MODEL IS STILL VALID: FROM WHERE YOU ARE I CAN INFER WHAT YOU ARE DOING AND WHAT YOUR INTENTIONS ARE. Marketers and demographers have begun to understand regions themselves as constituted by the patterns of activity of individuals. TEMPORALITY OF HOW THE UTILITY OF THE SYSTEM IS MEASURED: The individual is an active geographical agent, making decisions on the fly, as opportunities arise. And here those decisions seem inevitably to occasion responses on the part of the users of the systems, just because the systems for the first time allow immediate validation of their worth. If a store uses a geodemographic system to offer electronic coupons to people walking by, or if a digital sign promoting a sale is set to appeal to an especially large group of people with certain tastes, again known to be walking or driving by, the utility of the system is immediately evident.
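The walk-by coupon scenario above reduces, at its core, to a proximity test: is a device's reported position within some radius of the store? A minimal sketch of that test, with the store location, radius, and device positions all invented for illustration:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    r = 6_371_000  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def should_offer_coupon(device_pos, store_pos, radius_m=50):
    """Trigger the offer when the device is within radius_m of the store."""
    return haversine_m(*device_pos, *store_pos) <= radius_m

store = (50.8467, 4.3525)  # illustrative coordinates in central Brussels
print(should_offer_coupon((50.8468, 4.3526), store))   # a few meters away
print(should_offer_coupon((50.8600, 4.3525), store))   # over a kilometer away
```

The immediacy the text points to follows from this structure: the trigger fires (or doesn't) the moment a position report arrives, so the system's worth is validated in real time rather than by later survey.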
AND THIS IS THE POINT: BUT THE NEGATIVE SPACES ARE WHERE THIS AMBITION OF COURSE TURNS WEIRD. So just as, from a philosophical point of view, the new systems are fulfillments of the desire for a richer way of understanding people's geographical behavior, they are at the same time themselves active agents in manipulating that behavior to create "ideal" geographies. WHAT ARE THE DIFFERENCES IN HOW POWER MANAGES SPACE: TIME AND VISIBILITY. There is another sense in which the public domain becomes privatized through new developments in geodemographic systems: the degree to which the character of lived regions becomes the product of the goals and strategies of ever fewer, more interlinked, well-capitalized, and private corporate interests. Corporate and state actors have always been significant actors in the social construction of place. Historically, though, the mechanisms of those actions have at least been visible and to a degree opposable. Highway projects loom before they are built. The effects of redlines are enduring, relatively stable, and noticeable. However, new systems will potentially allow the instantaneous reconfiguring of spatial elements toward any emergent strategic end. The spatial contours of places will become more fluid, and the means by which the existence, the meaning, and the social importance of places are negotiated will become faster-paced and less visible to their inhabitants. The sociological belief that "you are where you live" has fostered a drive to understand "where you live." BUT NOW THE "YOU" IS ALSO DISSECTED: YOU ARE NOT REALLY SEEN AS ONE BUT AS MULTIPLE, AS A SET OF GESTURES, EMOTIONS, AND OTHER THINGS. Since regions are created by the behavior of individual inhabitants, the goal becomes to influence those behaviors through direct, persuasive appeals.
Regions are managed by managing individuals. Creation of temporal spaces: predicting action, gaming action. Simulation: simulate locations so that you can plan for them. Apple Xcode, the Android emulator: mock location data, used not just by app providers but also by Pokémon Go users. The world needs to be legible in place and time so that it can simulate things. Creation of negative spaces: the meeting spot of couriers in Brussels (Femke Snelting), or the airplane tickets that become cheaper because they fall outside the predictive profile of the masses.
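The mock-location feeds mentioned above (Xcode's simulated locations, the Android emulator's mock GPS) are, underneath, just streams of timestamped coordinates injected in place of the real sensor. A minimal sketch of generating such a synthetic trace by linear interpolation between two points; the coordinates, step count, and interval are arbitrary choices for illustration:

```python
from datetime import datetime, timedelta, timezone

def mock_route(start, end, steps, t0, interval_s=5):
    """Linearly interpolate a fake GPS trace between two (lat, lon) points."""
    trace = []
    for i in range(steps + 1):
        f = i / steps
        lat = start[0] + f * (end[0] - start[0])
        lon = start[1] + f * (end[1] - start[1])
        trace.append((t0 + timedelta(seconds=i * interval_s), lat, lon))
    return trace

# Example: a fake stroll across central Brussels (coordinates illustrative).
t0 = datetime(2016, 10, 13, 9, 0, tzinfo=timezone.utc)
route = mock_route((50.8466, 4.3528), (50.8503, 4.3517), steps=10, t0=t0)
for ts, lat, lon in route[:3]:
    print(ts.isoformat(), round(lat, 5), round(lon, 5))
```

This is also the sense in which "the world needs to be legible in place and time": whatever consumes the feed cannot distinguish a synthesized sequence of plausible coordinates from a lived one.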