20160915 MARIUS WATZ - PAPERS WE LOVE, ST. LOUIS

ABUSE OF AN ALGORITHM COMES AS NO SURPRISE
Pseudo-random thoughts about algorithms as creative
materials and instruments of power.
Color and geometry gone wild (in ~1000 slides)
Inconvenient truths about technology and bias
Creative code - the practice and the cliché
Note: Slightly altered from version presented at PWL conference.
@MARIUSWATZ
MARIUSWATZ.COM

INSTAGRAM.COM/NOSUCHFUTURE
FLICKR.COM/PHOTOS/WATZ

Jenny Holzer: Truisms (1982)

HI, I'M MARIUS AND I'M AN ARTIST


But I never went to art school.
I started coding on a TRS-80 Color Computer at age 11, then
later dropped out of Computer Science without a degree (too
damn boring.)
Wanting to explore the application of computational logic to
graphic design in 1994 meant being self-taught. Fortunately,
the Dot Com boom soon made being a coder with graphic skills
potentially lucrative.
Today I am an artist, freelance creative technologist and
educator, working with code and data as materials.

Getting Started With Color Basic

Marius Watz - The Work

SHOCK & AWE


The short, sharp shock way to see all my work in 4 minutes
and 23 seconds.
900+ slides
Recent and not-so-recent work (1994 to present)
Unseen work (sketches and unpublished)
Credit goes to Golan Levin for the original suggestion that I do an interactive talk based on
slides of all my work. Sadly, I'm less interactively minded.

AND NOW FOR SOMETHING LESS PLEASANT.

BEFORE WE BEGIN: CONCLUSIONS


Some of the ideas that follow may appear controversial. They
shouldn't be.
My basic argument is simple: Technology, whether it's a
machine learning algorithm or an infrared sensor, is never
neutral, nor is it created in a (cultural) vacuum.
The perfect world of algorithms is an illusion, instantly broken
whenever technology intersects with human behavior.
Coders are people, too, remember?

BEFORE WE BEGIN: CONCLUSIONS


My goal in the following is not political correctness, but
to extend basic ethics and human decency to issues of
technology.
From IBM's Watson to social media and machine learning, the
biggest trends in tech are deeply linked to understanding and
facilitating human experiences.
Even the most unassuming software developer can have
the power to affect society. If Uber can disrupt labor
politics and Airbnb can undermine urban planning policy, why
shouldn't apps and APIs be able to reinforce positive change?

PERSONAL DISCLAIMER
As a Caucasian male born and educated in a country (Norway)
that offers universal healthcare and free education (even
at college level), I am the beneficiary of multiple layers of
privilege.
I do not presume to be able to speak to the lived reality of
some of the issues I will discuss, but I hope to address the
pitfalls developers face in creating tools and participating in a
critical discourse around technology.

Dave Schroeder of Eyeo, being awesome.

WE ARE ALL AWESOME. SO WHY DOES THE WORD "BROGRAMMER" EVEN EXIST?
The tech, design and startup worlds are full of smart
people. Geniuses, even.
Sadly, this does not preclude the persistence of
discrimination based on gender identity, race or sexual
preference.
Diversity is an agreed-upon universal goal, but simply
agreeing does not make it so.
PS. I am not implying that the present audience is racist or sexist.
I'm just saying stupid shit does go on.

Geek Feminism Wiki


http://geekfeminism.wikia.com/

Moritz Stefaner: Gender Balance (2013)


Data collection+viz, gender balance of creative tech speakers

The Atlantic, Oct 2015

Aanand Prasad, Diversity Calculator

CULTURAL BIAS VS. BIAS AS TECH


Culturally reinforced biases are bad enough. But what
happens when bias is (un)intentionally included in
algorithm development?
Can an algorithm be racist? (Take a wild guess.)
Software developers routinely adhere to principles
related to accessibility. So why isn't preventing bias or
cultural insensitivity a priority?

2009: HP's racist web cams

Syreeta McFadden: Teaching the camera to see my skin

Shirley cards (color balance reference sheets)


Year unknown

Contemporary color reference


Getty Images, year unknown

Greg Dorsainville: DataFaces.net

Adam Harvey: CV Dazzle

MACHINE LEARNING IS SO EXCITING (AND ABSOLUTELY TERRIFYING)
If simple image processing or sensor sensitivity can lead
to people of color not being seen, consider the exciting and
terrifying potential of machine learning.
Recognizers are famous for hilariously mis-identifying
objects, largely due to limitations in training data.
But what is hilarious while debugging can be horribly
inappropriate (and potentially brand-destroying) when
deployed unchecked.

Flickr autotagger labels Dachau gate "jungle gym"


Daily Mail, May 2015

Jacky Alciné on Twitter

ProPublica on bias in crime risk assessment software models (2016)

by Julia Angwin, Jeff Larson, Surya Mattu and Lauren Kirchner, ProPublica
May 23, 2016

ON A SPRING AFTERNOON IN 2014, Brisha Borden was running late to pick up her god-sister from school when she spotted an unlocked kid's blue Huffy bicycle and a silver Razor scooter. Borden and a friend grabbed the bike and scooter and tried to ride them down the street in the Fort Lauderdale suburb of Coral Springs.

Just as the 18-year-old girls were realizing they were too big for the tiny conveyances - which belonged to a 6-year-old boy - a woman came running after them saying, "That's my kid's stuff." Borden and her friend immediately dropped the bike and scooter and walked away.

But it was too late - a neighbor who witnessed the heist had already called the police. Borden and her friend were arrested and charged with burglary and petty theft for the items, which were valued at a total of $80.

Compare their crime with a similar one: The previous summer, 41-year-old Vernon Prater was picked up for shoplifting $86.35 worth of tools from a nearby Home Depot store.

Prater was the more seasoned criminal. He had already been convicted of armed robbery and attempted armed robbery, for which he served five years in prison, in addition to another armed robbery charge. Borden had a record, too, but it was for misdemeanors committed when she was a juvenile.

Yet something odd happened when Borden and Prater were booked into jail: A computer program spat out a score predicting the likelihood of each committing a future crime. Borden - who is black - was rated a high risk. Prater - who is white - was rated a low risk.

Two years later, we know the computer algorithm got it exactly backward. Borden has not been charged with any new crimes. Prater is serving an eight-year prison term for subsequently breaking into a warehouse and stealing thousands of dollars' worth of electronics.

Scores like this - known as risk assessments - are increasingly common in courtrooms across the nation. They are used to inform decisions about who can be set free at every stage of the criminal justice system, from assigning bond amounts - as is the case in Fort Lauderdale - to even more fundamental decisions about defendants' freedom. In Arizona, Colorado, Delaware, Kentucky, Louisiana, Oklahoma, Virginia, Washington and Wisconsin, the results of such assessments are given to judges during criminal sentencing.

Rating a defendant's risk of future crime is often done in conjunction with an evaluation of a defendant's rehabilitation needs. The Justice Department's National Institute of Corrections now encourages the use of such combined assessments at every stage of the criminal justice process. And a landmark sentencing reform bill currently pending in Congress would mandate the use of such assessments in federal prisons.

In 2014, then U.S. Attorney General Eric Holder warned that the risk scores might be injecting bias into the courts. He called for the U.S. Sentencing Commission to study their use. "Although these measures were crafted with the best of intentions, I am concerned that they inadvertently undermine our efforts to ensure individualized and equal justice," he said, adding, "they may exacerbate unwarranted and unjust disparities that are already far too common in our criminal justice system and in our society."

The sentencing commission did not, however, launch a study of risk scores. So ProPublica did, as part of a larger examination of the powerful, largely hidden effect of algorithms in American life.
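
The kind of comparison behind a story like this can be illustrated with a small calculation. The following is a minimal sketch, not ProPublica's method or data: it assumes a hypothetical list of toy records (a group label, whether the person was rated high risk, whether they later reoffended) and computes, per group, how often people who did not reoffend were still rated high risk.

```python
from collections import defaultdict

# Hypothetical toy records: (group, predicted_high_risk, reoffended).
# These values are invented for illustration; they are NOT real data.
records = [
    ("A", True, False),
    ("A", True, True),
    ("A", True, False),
    ("A", False, False),
    ("B", False, False),
    ("B", True, True),
    ("B", False, True),
    ("B", False, False),
]


def false_positive_rates(rows):
    """Per group: share of non-reoffenders who were rated high risk."""
    flagged = defaultdict(int)  # non-reoffenders rated high risk
    total = defaultdict(int)    # all non-reoffenders
    for group, predicted_high, reoffended in rows:
        if not reoffended:
            total[group] += 1
            if predicted_high:
                flagged[group] += 1
    return {group: flagged[group] / total[group] for group in total}


rates = false_positive_rates(records)
for group, rate in sorted(rates.items()):
    print(f"group {group}: false positive rate {rate:.2f}")
```

If that rate differs sharply between groups, the score is making its mistakes unevenly, which is the pattern the Borden and Prater example points at.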

ALSO: DATA COLLECTION (AND NON-COLLECTION)


Data collection is largely accepted as the privilege of
government and corporations, often unchecked and rarely
questioned. (Compare US to European policies on data
collection and retention.)
Both what data is collected and *how* it is collected have
major implications for harm potential.
What are the mechanisms of collection, what fields are
collected, is the data anonymized and reliably updated /
deprecated?
Conversely: The absence of data, where data could
reasonably be expected to exist, is most revealing.

The Guardian: The Counted (2015)


Database of US police shootings

Fatal Encounters (D. Brian Burghart et al., 2012)


Tracking of US Officer-involved homicides

The potential irony of Open Data success

COUNTER-COLLECTION AND CITIZEN DATA SCIENCE:


NGOs, journalists and independent groups often collect data
on controversial issues and in dangerous locales, where
official sources are either missing or likely unreliable.
Distributed, crowdsourced and encrypted tools for data
collection (apps, APIs, wikis) provide the means to develop
alternative networks for information exchange.
Citizen data science challenges existing data narratives and
may provide input on issues outside the scope of corporate
and government efforts.
Examples: Tracking CIA rendition routes, drone strike casualties, citizen weather stations,
networks of corporate influence.

Mimi Onuoha: Missing Data Sets

MimiOnuoha / missing-datasets
An overview and exploration of the concept of missing datasets.

README.md

On Missing Data Sets

This repo will be periodically updated with more information, links, and topics. Most recent update: 08/15/16.

Overview

What is a Missing Data Set?
"Missing data sets" are my term for the blank spots that exist in spaces that are otherwise data-saturated. My interest in them
stems from the observation that within many spaces where large amounts of data are collected, there are often empty spaces
where no data live. Unsurprisingly, this lack of data typically correlates with issues affecting those who are most vulnerable in
that context.
The word "missing" is inherently normative, it implies both a lack and an ought: something does not exist, but it should. That
which should be somewhere is not in its expected place; an established system is disrupted by distinct absence. Just because
some type of data doesn't exist doesn't mean it's missing, and the idea of missing data sets is inextricably tied to a more
expansive climate of inevitable and routine data collection.

Why Do They Matter?
That which we ignore reveals more than what we give our attention to. It's in these things that we find cultural and colloquial
hints of what is deemed important. Spots that we've left blank reveal our hidden social biases and indifferences.

Why Are They Missing?
There are a number of reasons why a data set that seems like it should exist might not, and they are all tied to the quiet
complications inherent in data collection. Below are four reasons, with accompanying real-world examples.

1. Those who have the resources to collect data lack the incentive to.
Police brutality towards civilians provides a powerful example. Though policing and crime are among the most data-driven
areas of public policy, traditionally there has been little history of standardized and rigorous data collected about
police brutality.
Nowadays we've got a political and cultural climate where this issue has become one of public discussion. Public interest
campaigns like Fatal Encounters and the Guardian's The Counted have helped fill that void. But even for these
individuals/organizations the work is difficult and time-consuming. The group who would make the most sense to
monitor this issue - the law enforcement agents who create the data set in the first place - have no incentive to actually
gather such data, which could prove incriminating.

2. The data to be collected resist simple quantification (corollary: we prioritize collecting things that fit our modes of
collection).
The defining tension of data collection is the struggle of taking a messy, organic world and defining it in formats that are
neat, clean, and structured.
Some things are difficult to collect and quantify by nature of their structure. We don't know how much US currency is
outside of our borders. There's no incentive for other countries to monitor US currency within their countries, and the
very nature of cash and the anonymity it affords makes it difficult to track.
But then there are other subjects that resist quantification entirely. Things like emotions are hard to quantify (at this time,
at least). Institutional racism is subtle and deniable; it reveals itself more in effects than in acts. Not all things are easily
quantifiable, and at times the very desire to render the world more abstract, trackable, and machine-readable is an idea
that itself deserves questioning.

3. The act of collection involves more work than the benefit the presence of the data is perceived to give.
Sexual assault and harassment are woefully underreported. And while there are many reasons why this is, one major one
is that in many cases the very act of reporting sexual assault is a very intensive, painful, and difficult process. For some,
the benefit of reporting isn't perceived to be equal or greater than the cost of the process.

4. There are advantages to nonexistence.
To collect, record, and archive aspects of the world is an intentional act. There are situations in which it can be
advantageous for a group to remain outside of the oft-narrow bounds of collection. In short, sometimes a missing dataset
can function as a form of protection.

Below is an ever-expanding list of missing datasets. Contributions are extra welcome.

An Incomplete List of Missing Data Sets

This list will always be incomplete, and is designed to be illustrative rather than comprehensive.
Civilians killed in encounters with police or law enforcement agencies
Sales and prices in the art world and relationships between artists and gallerists
People excluded from public housing because of criminal records
Muslim mosques/communities surveilled by the FBI/CIA
Trans people killed or injured in instances of hate crime
Poverty and employment statistics that include people who are behind bars
Mobility for older adults with physical disabilities or cognitive impairments
LGBT older adults discriminated against in housing
Undocumented immigrants currently incarcerated and/or underpaid
Undocumented immigrants for whom prosecutorial discretion has been used to justify release or general punishment
Measurements for global web users that take into account shared devices and VPNs
True measures around how often sexual harassment happens in the workplace
Firm statistics on how often police arrest women for making false rape reports

CONCLUSION: POWER IMPLIES RESPONSIBILITY


Creating technology should come with the responsibility to
make sure that the potential of that technology to do harm is
predicted and minimized. (Preferably ahead of time.)
One developer's innocent assumption about calibration
parameters can become a user's hurtful experience.
Collect only the data you need. Consider harmful cross-correlations.
Don't blame the algorithm. It's not a puppy.
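
One way to read "collect only the data you need" in code is an explicit allow-list applied before anything is stored. This is a minimal sketch under stated assumptions: the field names, the `ALLOWED_FIELDS` set and the `minimize` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical allow-list: the only fields this imagined app needs to store.
ALLOWED_FIELDS = {"user_id", "step_count", "timestamp"}


def minimize(record, allowed=ALLOWED_FIELDS):
    """Return a copy of the record that keeps only allow-listed fields."""
    return {key: value for key, value in record.items() if key in allowed}


# Hypothetical incoming record with more detail than the app needs.
raw = {
    "user_id": 17,
    "step_count": 4200,
    "timestamp": "2016-09-15T10:00:00",
    "home_address": "(not needed)",
    "ethnicity": "(not needed)",
}

stored = minimize(raw)
print(sorted(stored))
```

Fields that are never stored cannot later be cross-correlated with other data sets, so the allow-list doubles as a place to have the "do we really need this?" conversation.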

CONCLUSION: BE NICE AND THINK.


Acknowledging that you have bias / privilege is not admitting
fault or guilt. It's being honest and human.
Remember, there are both known unknowns and unknown
unknowns. Acknowledging limits to your personal knowledge
and asking for input is the starting point of a conversation
about possible concerns.
Don't be the team behind Apple Health, omitting the crucial
health metric of period tracking from an otherwise extensive
data platform.
When all else fails, apologize.

CONCLUSION: BE NICE AND THINK.


Stereotypes (the nerd, the jock, the clingy boyfriend, the
always-angry feminist) are seductive due to their apparent
ability to explain observed behavior. In reality, they reinforce
subconscious bias and belittle individual complexity.
Expecting those who are being harmed or discriminated
against to speak up and provide solutions, only serves to
silence (as well as annoy) them.
To quote a good friend of mine, "you can be the best. Or the worst."

AND NOW FOR SOME LIGHTER MATERIAL.

CREATIVE CODE
An awkward label loosely applied to creative practices in
architecture, design and art.
Implies forms of creative expressions directly based on
computational logic, both as a process tool and a material to
manipulate.
Requires the articulation of aesthetic principles and decision-making as a set of algorithms, along with the parameter sets
that define them.

CREATIVE CODE
Common sub-genres:
Generative art
Parametric design / architecture
Data visualization (the new-fangled kind)
Computational typography
Interaction design

The Algorithm Thought Police


Marius Watz, blog post, Feb 2012

COMMON ALGORITHMS
Circle packing
Reaction diffusion
Fractals (yes, all of them)
Strange attractors
Voronoi / Delaunay diagrams
Flocking / boids
Cellular Automata (Game of Life etc)
Polygon subdivision
Iso-surfaces aka blobs

Google Image search: circle packing

Google Image search: circle packing architecture

Marius Watz: Packing (2007)

Google Image search: voronoi

Google Image search: voronoi architecture

COMMON ALGORITHMS, YOUR PROBLEMATIC FRIEND


All of these are awesome (and beautiful) tools. But they are
not neutral vessels. In fact, their popularity stems directly from
their usefulness and/or ability to produce strong visual forms.
Algorithms provide the means to produce specific outcomes,
typically through generative logic or data processing. But in
the process they leave their distinct footprints on the result.
Speaking through algorithms, your way of thinking about
a problem and your range of expression are shaped by their
syntax.

THE TEMPTATION
Upon discovering an elegant algorithm that yields compelling
visual results (say, circle packing or reaction-diffusion) there
is a strong temptation to exploit it as-is, crank out a series of
images and brag about it on social media.
Problem is, the kid down the block often has the same idea.
And both of you have access to GitHub.

grasshopper3d.com forums
How to emulate my Grid Distortions series

Marius Watz: Grid Distortions

ALGORITHMS AND DATA AS FOUND OBJECTS


Untreated and unmodulated, a standard algorithm is just a
found form - a preset structure producing preset results.
Similarly, many data sets have striking intrinsic forms or
data textures:
Network structures
GPS traces
Plots of timestamped events
Audio waveforms
FFT spectrum analysis (sound landscapes)

Marius Watz: The Happy


Message / call viz (metadata of a relationship)

ALGORITHMS AND DATA AS FOUND OBJECTS


Given the seductive visual impact of many of these preset
forms, awareness of what you bring to the final creation must
be a part of any critical computational creativity.
Most importantly, consider:
Craftsmanship (trite, I know)
Originality / transformation
Credible claim to authorship

A HIGHLY UNSCIENTIFIC CRITERION FOR ALGO CRITIQUE


As stated in the original Algo Thought Police post:
Unless you can make it *rock*, stay away. (And if you don't
think algorithms can rock, we have nothing to talk about.)

What I meant: Well, make it rock. (It seems obvious, doesn't it?)

INSTANTLY KNOWABLE AND INFINITELY MASTERABLE (GOLAN LEVIN)
Unlike a pencil or a piano, an algorithm for visual composition
or parametric design is rarely (if ever) instantly knowable or
infinitely masterable.
More commonly it is a terra incognita, the features of which
must be discovered through experimentation.

THANK YOU FOR LISTENING!


Marius Watz is:
marius@mariuswatz.com
@mariuswatz
mariuswatz.com
instagram.com/nosuchfuture
flickr.com/photos/watz
