📕 What is the Intersectional AI Toolkit?

A ZINE COLLECTION FOR ARTISTS, ACTIVISTS, MAKERS, ENGINEERS, AND YOU

The Intersectional AI Toolkit gathers ideas, ethics, and tactics for more ethical, equitable tech. It shows how established queer, antiracist, antiableist, neurodiverse, feminist communities contribute needed perspectives to reshape digital systems. The toolkit also offers approachable guides to both intersectionality and AI. This endeavor works from the hope that code can feel approachable for everyone, can move us toward care and repair---rather than perpetuating power imbalances---and can do so by embodying lessons from intersectionality.

Of course, this toolkit is not the first or only resource on intersectionality or AI. Instead, it gathers together some of the amazing people, ideas, and forces working to re-examine the foundational assumptions built into these technologies. In the tradition of '90s zine aesthetics and politics, it celebrates these radical efforts by sharing them---connecting concepts, creators, tools, and tactics across disciplines and counter-histories---hoping to spark further conversation and collaboration. It does so imperfectly and incrementally, showing rough edges and edit marks, in the belief that no text is final, all code can be forked, and everything is better with friends.

Please join in by exploring the toolkit, commenting with your questions or thoughts, or remixing it into your own new text(s). No experience is necessary to participate; all backgrounds and perspectives are welcome!

🖨 Zine collection

Our zine collection grows with every in-person and online zine-making workshop:

💯 Intersectional Data Bodies created at Halfsister Gallery Jan 2023
🌀 Intersectional AI Anarchies created at Akademie der Künste Oct 2022
💜 An AI Love Letter from Berlin created at the HIIG IAI Edit-a-thon Sep 2021
😍 Help Me Code IAI created at the Creative Code Collective USC Zine-Making Workshop Nov 2021
🚩 What Shouldn't AI Be Used For? created at Mozilla Festival online Mar 2022
🔮 Intersectional AI Futures created at USC in Los Angeles May 2022
🖌 How Can Artists Help Reshape AI? created at ZK/U in Berlin Sep 2022
🙋 Hi AI, We Have Questions created at ZK/U in Berlin Sep 2022

These online issues are continual works-in-progress: READABLE, REMIXABLE, SHAREABLE.

🍦🍦 A-to-Z IAI & FAQ A double-sided glossary from AI technical perspectives and social perspectives
💜 Who Loves IAI Great practitioners and projects already at work
🔥 Why We Need IAI Making the case for more equitable, empathic tech
🌶 Tactics for IAI Practical approaches from wide-ranging communities and decades of intersectional effort
😍 Help Me Code IAI Afraid of programming but want to save AI from itself?
🤩 Intersectionality Means to Me Why listen to academic, activist, or artistic approaches?

Full Toolkit PDF Render

😎 How do I make my own zine?

"fold at solid lines, cut at center scored lines only" "refold and pinch toward center"

fold your paper in half long ways, then unfold.
fold your paper in half short ways, then unfold.
fold the edges of your paper toward the middle, and unfold. you should have eight mini sections.
cut ONLY along the two short folds in the middle, by folding the paper in half short ways again. find the folded edge and cut only halfway in (not from the side that is open). do not cut all the way across. unfold. the goal is to have a slice in the center that does not connect to any edges.
finally, fold the paper in half long ways again so the printed side faces out. pinch open the sliced center and separate those pages apart from each other until they join their neighbors.
fold the book closed with the covers on the front and back.

... and it's pronounced zeen, right?

Yep, like "magazine." Like, "I'm so excited to read this zeeeeeeen!"

This wiki/zine library of digital-print hybrid zines is written for non-academic, non-technical audiences and is intended as a set of practical introductory field guides to key concepts, strategies, and resources around inclusive, intersectional AI. But mainly the zines are intended as jumping-off points for your own practice, inspiration, and continued conversation. They celebrate, cite, remix, reframe, and you should feel welcome to do the same.

Christina Dunbar-Hester, in Hacking Diversity (2020), notes that the perhaps surprising appearance of zines and crafting in feminist technology circles makes sense: "In feminist zine making, forms of knowledge like folk medicine can be filtered through the riot grrrl practice of zine-making, which is itself connected to long traditions of feminine papercraft and journaling. They are identity practices in addition to circulations of knowledge" (111). As digital-print hybrids, they can utilize the liveliest aspects of both: GIFs and images, the expandability and nonlinearity of hyperlinks, and other dynamic content online; but also the accessible, 'handmade', distributable, low-power novelty of paper at a moment of maximum screen fatigue. The print versions of these zines are formatted to print double-sided on a single landscape page, making reversible two-part mini-zines (in multiple combinations as the library grows).

They are also inspired by the Tiny Tech Zines festival and two fantastic zines acquired there: "Bite-Size Networking" by Julia Evans and "How to Cite Like a Badass Feminist Tech Scholar of Color" by Data and Society's Rigoberto Lara Guzmán and Sareeta Amrute. Check out the Who Loves IAI zine for other zines we love, like the Techno-Galactic Guide to Software Observation.

🤓 Who's making this IAI TK?

Developed by Sarah Ciston while a virtual fellow at the HIIG, with valued inspiration and collaboration from many others included on the Community page. See Process Notes for more on Sarah's making of this collection-in-progress and read more below about how you can help it grow.

🤗 ...and how can I contribute?

Comment, edit, & remix this

There are many ways to contribute. One of the most immediate is to join in the co-creation of these texts! Feel free to read along, and add your thoughts on the project's archive. I think of this as an expanded form of reading-writing.

At the top of any page on this site, you'll see links to "Suggest an edit" and "Git repository." When you follow these, you're invited to make a user account to make edits and suggestions to any page. You can also look at all the prior versions and compare any old version to any new one. Don't worry, you can't break anything! You can even copy this entire repository (archive) as a model to make your own digital zine too.

Add your thoughts, edits, and new pages in style

How do I format my contribution?

🤖 See also

Creative Code Collective Resource Hub

The Creative Code Collective Resource Hub is an ever-expanding, community-sourced database of inspiring projects, tutorials, and coding tools — curated for critical and creative learning. The IAI Toolkit is closely partnered with the Resource Hub, which has been curated by members of Creative Code Collective and friends. It is an interactive, searchable, sortable database that will point you to the many different kinds of projects, tools, and research being made around creative coding, Intersectional AI, and related topics. It's fed by a user-friendly spreadsheet where you can add your own resources and tell us why you like them. Get inspired and get cracking making your own intersectional projects and tools for others.

Trans*formative TechnoCraft & Coding.Care

Elsewhere Sarah is working on documenting the process of building the Intersectional AI Toolkit and Creative Code Collective as an experimental dissertation. Trans*formative TechnoCraft includes how-to guides for creating and sustaining communities around critical-creative code, working critically with machine learning datasets, and intersectionally with AI systems.

Workshop zines 🖨

Intersectional Data Bodies

Print & Fold Edition: A3/Tabloid

AI Anarchies Intersectional AI Zine

Print & Fold Edition: A3/Tabloid

An AI Love Letter from Berlin

Print & Fold Edition: A3/Tabloid

Help Me Code IAI

Print & Fold Edition: A3/Tabloid

What Shouldn't AI Be Used For?

Print & Fold Edition: A3/Tabloid

How Can Artists Help Reshape AI?

Print & Fold Edition: A3/Tabloid

Intersectional AI Futures

Print & Fold Edition: A3/Tabloid

Hi AI, We Have Questions

Print & Fold Edition: A3/Tabloid

💜 Love Notes to Intersectional AI

Many, many people believe better AI futures are possible. We hope you draw inspiration for IAI from a wide range of practitioners already in action! Here is an ever-growing list of projects, tools, and resources to learn from. Some of these are AI- or tech-adjacent, but we think all of them are cool ways to get into thinking about tech differently.

Creative Code Collective Resource Hub is an interactive, sortable version of this effort (work in progress). Please add any and all items listed here to the hub: Share via this form.

Wishlist...

More international examples: would love to know more from outside US & EU!
Local and in-person examples from where you live

Intersectional Tools & Resources

Approaches & Frameworks

Not My AI
Indigenous AI
AI Decolonial Manyfesto
If AI, Then Feminist
Queer.AI "conversational agents for the advancement of new eroticisms"
Guidelines for Checking Essential Properties of AI-Based Systems in German
Feminist Tech Policy by Superrr
Dreaming Beyond AI "a space for critical and constructive knowledge, visionary fiction & speculative art and community-organising"
AIxDesign "nourishing alternative, feminist, + participatory approaches to ai for the rest of us"
Katrin Fritsch: Towards an Emancipatory Understanding of Widespread Datification
Feminist Internet "There is no feminism, only possible feminisms. There is no internet, only possible internets."
Algo.Rules
Civic Data Library of Context
Afrotechtopia
Data for Black Lives
MOTIF feminist futures
Superrr "building diverse & equal futures in tech and beyond"
Data Justice Lab
Our Data, Ourselves and the works of Tactical Tech
Interspecies Internet
Fix the Glitch Toolkit 2.0 "Helping to End Online Gender Based Violence for Black Women"
Framework for Participatory Data Stewardship

Tutorials

Elements of AI
fast.ai course Practical Deep Learning for Coders, Code-First Intro to NLP, Practical Data Ethics
Why.AI dossier debunking myths in plain language
Dive Into Deep Learning open source book, technical
Generative Engine exploration of VQGAN-CLIP
@TheAnnaLytical Anna Lytical Tutorial Glamorous Javascript: Makeup and Coding Edition #tutorial

Projects & Examples

Drag Deep Fakes Screen Walks. (2021, April 7) https://www.youtube.com/watch?v=qQSSl533rb8 #vision #deepfake #gan
Library of Missing Datasets #datacollection by Mimi Onuoha
Glaze Project disguise your artwork
Syb Trans Voice Interface
Feminist Guide to AI Bias (chatbot)
Designing Feminist Chatbots by Josie Swords @swordstoyoung

Teaching

Critical Coding Cookbook
DecarceratingTheClassroom
TextGenEd Book of prompts for teaching critically with text-generating AI, edited by Annette Vee, Tim Laquintano, Carly Schnitzler

Communities

Local: Berlin

CreativeCode.Berlin
School of Machines, Making, & Make-Believe classes on art, tech, & design

Tools

Mukurtu CMS data platform for Indigenous communities to share and protect cultural heritage
Switching.Software #deplatforming
Anti Capitalist Software License
In Solidarity
Threads anonymity in forums tool by Berkman Klein
Tiny Tools Directory gathers open source tools, small, free or experimental #compendium #collection #tool
Read the Feminist Manual chrome add-on to swap pronouns to gender-neutral

Zines, Principles, Readings, and...

Internet Teapot and Algorithms of Late Capitalism Zines, esp. special edition: Reconstructing AI
Oracle for Transfeminist Technologies by Coding Rights' Joana Varon and Clara Juliano
A is for Another: A Dictionary of AI
A New AI Lexicon from AI Now Institute, specifically the first sequence of CARE
Detroit Digital Justice Coalition Principles Access, Participation, Common Ownership, Healthy Communities
Tiny Tech Zines
Our Data Bodies' Digital Defense Playbook
Feminist Data Set & Toolkit Caroline Sinders
How to Write Non-Violent Creative Code
We Need to Talk AI: A Comic Essay
Feminist Principles of the Internet
Feminist Data Manifest-NO
Techno-Galactic Guide to Software Observation
Women in Computation, Portable Syllabus
A People's Guide to Artificial Intelligence, Mimi Onuoha and Mother Cyborg (a.k.a. Diana Nucera)
coveillance toolkit
Networks of One's Own publishing an issue as a hardware-software stack with feminist server principles etc. Issue 2, NoOO: Three Takes on Taking Care
Feminist Infrastructure Zine in English and en Español
Reclaiming Digital Infrastructures Zine

Web/print tools

Bindery Javascript Library or Paged.js or html2print by OSP
Markdown Cheatsheet
ZineMachine
P5.js
p5.js Contributors Zine
Decent Patterns "Decentralization Off the Shelf" (DOTS) offers design patterns for rethinking user applications
In Solidarity github app for ensuring more inclusive language in technical documents and code
How to make a zine

Beloved readings

Cyberfeminist Index #cyberfeminism #compendium #collection
Radical Software
People's Computer Company Newsletter
Computer Lib Dream Machines
2600 Hacker Magazine
Information Activism: A Queer History of Lesbian Media Technologies, by Cait McKinney
Whole Earth Index "access to tools, ideas, and practices," an archive of Whole Earth Catalogs and offshoots between 1970–2002, including the Whole Earth Software Review and other counterculture takes on technology
Homebrew Server Club

🍦 Intersectional AI A-to-Z & FAQ

These glossaries of terms for Intersectional AI A-to-Z are a great place to get started. They are by no means complete; they offer only limited examples of definitions for complex ideas. They put the technical and social aspects of AI systems in conversation by interweaving definitions found in disparate fields. They try to show the complexity of the topic from many angles, while breaking down the concepts into plain language. The goal is that a common vocabulary will allow more people to join the key conversations about AI's futures. Please chime in, ask questions, help make these definitions better! These glossaries would not be possible without the careful thought in the other glossaries and readings that inform them. See below for references and resources.

"'[A]rtificial intelligence' means lots of things, depending on whether you're reading science fiction or selling a new app or doing academic research."¹

When defining and talking about AI we have to be cautious as many of the words that we use can be quite misleading. Common examples are learning, understanding, and intelligence.²

AI terms are easy to mix up. AI is a subset of the field of computer science. Within it, machine learning is currently a commonly used technique and includes a variety of practices, like deep learning and neural networks. Almost all of these make use of work from the highly related field of data science.

"it's often the way the technology is being used, rather than the technology itself, that determines whether it is appropriate to call it AI or not."²

¹

Shane, Janelle. 2019. You Look Like a Thing and I Love You. Little, Brown and Company.

²

University of Helsinki, Minna Learn. Elements of AI

A is for artificial intelligence

Artificial intelligence (1)

AI colloquially refers to various systems that look for patterns in provided data. From the outside, these can appear similar to "human" abilities such as "understanding" or "seeing." But such results can be achieved through many different systems, which vary widely from simple calculations that a programmer would not call AI to complex programs that search for patterns without being given directions in advance. AI systems are often made up of multiple components, combining machine learning tasks and other techniques.

Artificial intelligence (2)

There is no agreed definition of AI, but in general the ability to perform tasks without supervision and to learn so as to improve performance are key parts of AI.³ Even AI researchers have no exact definition of AI. The field is rather being constantly redefined when some topics are classified as non-AI, and new topics emerge.² No matter their context or complexity, AI tools are always socio-technical systems, meaning they are designed, operated, and influenced by humans, rather than entirely autonomous, neutral systems.

³

University of Helsinki, Minna Learn. "Ethics of AI."

B is for bias

Bias & variance

In a machine learning problem, bias is the technical term for when a model is underfit to the problem as defined by its designers, meaning it cannot find the pattern in the data as expected. This can happen when the model is too simple for the problem, or when there is not enough, or not representative enough, data used to train it. Meanwhile variance, or overfit, is when a model overstates a pattern, overcomplicating the relationships in the data based on the creator's expected outcome. Typically, as bias decreases, variance increases, and vice versa. Other trade-offs include accuracy vs. interpretability, complexity vs. scalability, domain-specific knowledge vs. data-driven approaches, and a better algorithm vs. more data.⁴
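
To make underfit and overfit a little more concrete, here is a minimal sketch in Python. It is our own illustration, not taken from the lecture cited here. It assumes made-up data that follows y = 2x plus some noise, and it compares three hypothetical models: one that always guesses the average (too simple, so high bias), a straight line, and one that memorizes every training point (too flexible, so high variance).

```python
# Toy data (made up for illustration): the pattern is y = 2x, plus noise of +3 or -3.
def make_points(xs, first_sign):
    return [(x, 2 * x + first_sign * 3 * (-1) ** i) for i, x in enumerate(xs)]

train = make_points(range(0, 20, 2), +1)  # data the models get to see
test = make_points(range(1, 20, 2), -1)   # new data they have not seen


def mse(model, points):
    """Mean squared error: the average squared gap between guess and answer."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)


def fit_constant(points):
    """Underfit: always guess the average y, ignoring x entirely."""
    mean_y = sum(y for _, y in points) / len(points)
    return lambda x: mean_y


def fit_line(points):
    """Ordinary least-squares straight line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    spread_x = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / spread_x
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(points):
    """Overfit: copy the answer of the closest training point, noise and all."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


models = {
    "constant": fit_constant(train),
    "line": fit_line(train),
    "memorizer": fit_memorizer(train),
}
results = {name: (mse(m, train), mse(m, test)) for name, m in models.items()}

for name, (train_error, test_error) in results.items():
    print(f"{name:10s} train error {train_error:7.2f}   test error {test_error:7.2f}")
```

In this toy setup the memorizer makes no mistakes on the data it has already seen but does worse than the straight line on new data, while the constant guess does poorly on both.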

Bias, implicit & systemic

Bias cannot be 'removed' entirely from algorithmic systems, from analog systems, nor from individuals — because "there is no way to create something without some intention and intended user in mind."⁵ Acknowledging implicit bias helps account for its existence and address its root causes.

"Implicit bias, also known as hidden bias, refers to the numerous ways in which we organize patterns 'thus creating real-world implications.' Exposure to structural and cultural racism has enabled stereotypes and biases to penetrate deep into our psyches. Implicit bias is one part of the system of inequity that serves to justify racist policies, practices and behaviors that persist in mainstream culture and narratives. Current research on implicit bias also provides some promise that individual neural associations can be changed through specific practices (debiasing). If those biases can be changed at the individual level, by definition, they can be changed at the societal level, given sufficient will and investment. Since some biases are unconscious, it may contribute to individuals shirking responsibility rather than actively disrupting the behavior. It is critical for implicit bias to be discussed in the context of how bias, racism, and privilege operate together and systemically." – Racial Equity Tools, via Studio Pathways Glossary⁶

Implicit and systemic biases are embedded in and are often amplified by digital systems, because computation replicates, speeds up, and compounds human decision-making. Unfortunately, "tech fixes often hide, speed up, and even deepen discrimination, while appearing to be neutral or benevolent when compared to the racism of a previous era." ⁵

"Codes are both reflective and predictive." ⁵ Bias is not the whole story; it merely points to preexisting systemic inequality, highlighted by classification thinking used in computation. "The tendency to focus on the issue of bias in artificial intelligence has drawn us away from assessing the core practices of classification in AI, along with their attendant politics." ⁷

⁴

Rajati, M.R. 2021. Lecture. Machine Learning for Data Science, USC, Los Angeles. June 2021.

⁵

Benjamin, R. (2019). Race After Technology: Abolitionist Tools for the New Jim Code. Polity.

⁶

Studio Pathways

⁷

Crawford, K. (2021). Atlas of AI: power, politics, and the planetary costs of artificial intelligence. Yale UP.

C is for codes & confidence

Confidence interval

A range of numbers that helps describe how uncertain an estimate is. A confidence interval is built so that it has a high (e.g., 95%) chance of containing the "true" value (that is, the accurate answer to the question being asked). The wider the interval, the more uncertain the estimate. Confidence intervals are used in statistics and in AI to gauge the reliability of a model's estimates. The classical approach presumes a hidden but "true" unknown value that is independent of the model (which is not always the case, of course). Unfortunately, uncertainty remains inherent in prediction and difficult to comprehend in models, even for the researchers who create them.⁸
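
As a rough sketch of how such a range can be calculated, the Python below estimates a 95% confidence interval for an average. The measurements are made up, and the function name `mean_confidence_interval` is our own. It uses a normal approximation, which is a simplifying assumption; for very small samples a t-distribution would usually be more appropriate.

```python
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(samples, confidence=0.95):
    """Normal-approximation confidence interval for the average of `samples`."""
    # For 95% confidence this multiplier is about 1.96.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * stdev(samples) / len(samples) ** 0.5
    center = mean(samples)
    return center - margin, center + margin


small = [4.8, 5.1, 5.3, 4.9, 5.4]  # five made-up measurements
large = small * 20                 # the same values repeated: 100 measurements

small_low, small_high = mean_confidence_interval(small)
large_low, large_high = mean_confidence_interval(large)

print(f"5 measurements:   {small_low:.2f} to {small_high:.2f}")
print(f"100 measurements: {large_low:.2f} to {large_high:.2f}")
```

With more measurements of similar spread, the interval narrows: the estimate has not become "true," but there is less doubt around it.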

⁸

D'Ignazio, C. and Klein, L. 2020. Data Feminism. MIT Press.

Codes of conduct

Usually written together by a group, these guidelines outline expectations for behavior and procedures for when members of a community don't meet those expectations. While some argue for structureless, free-speech zones online, many counter that a lack of guidelines highlights power dynamics existing in broader culture.⁹

⁹

Dunbar-Hester, C. 2020. Hacking Diversity. Princeton UP.

D is for data

Data cleaning

Data does not come in ready to go; it must be preprocessed. Sometimes called cleaning, this process involves checking and modifying data before analyzing it or using it for training a system. Preprocessing includes many adjustments that can affect the outcome, including selecting a subset of data (sampling), standardizing and scaling it in relation to a baseline (normalization), handling missing data and outliers with decision trees (which Adrian MacKenzie calls "affiliated with arbitrariness"),¹⁰ as well as feature creation and extraction (discussed in Feature extraction). The transformation of real-world information into data is never a neutral process but relies heavily on the conditions and goals of the research in context. For more on preprocessing data, see "A Critical Field Guide for Working with Machine Learning Datasets: Transforming Datasets."¹¹
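
Here is a small, hedged sketch of three of those adjustments in Python, using a made-up list of ages. Every step involves a choice (drop missing answers or fill them in? which baseline to scale against?), and different choices would give different results. The variable and function names are our own.

```python
import random

# Made-up survey answers; None marks a missing answer.
raw_ages = [34, 29, None, 41, 29, 52, None, 38, 45, 31]

# Handling missing data: here we simply drop missing entries.
# Filling them in (for example with an average) would be a different choice.
present = [age for age in raw_ages if age is not None]

# Sampling: select a subset. A fixed seed makes the selection repeatable.
rng = random.Random(0)
sample = rng.sample(present, k=5)


# Normalization: rescale values so the smallest becomes 0 and the largest 1.
def min_max_normalize(values):
    low, high = min(values), max(values)
    return [(value - low) / (high - low) for value in values]


normalized = min_max_normalize(present)

print("kept:", present)
print("sample:", sample)
print("normalized:", [round(value, 2) for value in normalized])
```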

Data colonialism

Data are values that can be assigned to a thing and can take a variety of forms.¹² How you think about and utilize the information is what turns it into data. Sensing, observing, and collecting are all acts of interpretation that have contexts, which shape the data. Data do not just exist but have to be generated, through sensors and human effort.¹¹ The human labor to produce and modify data can become another form of extraction and exploitation, say researchers Nick Couldry and Ulises A. Mejias, who describe the "data relations" required to convert "daily life into a data stream":

"data relations enact a new form of data colonialism, normalizing the exploitation of human beings through data, just as historic colonialism appropriated territory and resources and ruled subjects for profit. [...] These new types of social relations implicate human beings in processes of data extraction, but in ways that do not prima facie seem extractive. That is the key point: the audacious yet largely disguised corporate attempt to incorporate all of life, whether or not conceived by those doing it as 'production,' into an expanded process for the generation of surplus value. The extraction of data from bodies, things, and systems create new possibilities for managing everything. This is the new and distinctive role of platforms and other environments of routine data extraction." ¹³

Datasheets

Datasheets are documents describing each dataset’s characteristics and composition, motivation and collection processes, recommended usage and ethical considerations, and any other information to help people choose the best dataset for their task. Datasheets were proposed by diversity advocate and computer scientist Timnit Gebru, et al., as a field-wide practice to "encourage reflection on the process of creating, distributing, and maintaining a dataset, including any underlying assumptions, potential risks or harms, and implications for use." ¹⁴ Datasheets are also resources to help people select and adapt datasets for new contexts.¹¹

¹⁰

MacKenzie, A. 2018. Machine Learners. MIT Press.

¹¹

Ciston S (2023) "A Critical Field Guide for Working with Machine Learning Datasets." Crawford K and Ananny M, Eds., Knowing Machines project.

¹²

Engine Room. (n.d.). Responsible Data Handbook

¹³

¹⁴

Gebru T, et al. (2020). "Datasheets for Datasets," arXiv:1803.09010 [cs], Mar 2020

E is for embeddings

Embeddings

Embeddings are the complex numerical approximations of words, images, or other media, created in order for them to be processed by computers. For example, word embeddings are created by repeatedly comparing each word (or word fragment) in a document (or large group of documents) to all the other words around it. The frequency with which each word appears near other words is recorded in a matrix, which is repeatedly manipulated and reduced until each word can be represented by a long string of usually hundreds of numbers (word vectors). Those numbers 'represent' the word — but only in the context of the other terms with which it was trained. Many large models use these embeddings to compare words or images to one another or to predict what word or image to produce next in a series. They do this by using the numerical representations to output another appropriate numerical representation (not by understanding as a human might).
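
The first step described above, counting which words appear near which, can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: the four-sentence corpus is made up, the window size of 2 is arbitrary, and the step that reduces the matrix is skipped entirely, so these are raw count vectors rather than real embeddings.

```python
from collections import Counter
from math import sqrt

# A tiny made-up corpus; real embeddings are built from vastly more text.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the dog",
    "code runs on machines",
]

window = 2  # how many words to each side count as "near"
vocab = sorted({word for line in corpus for word in line.split()})

# Record how often each word appears near every other word.
cooccurrence = {word: Counter() for word in vocab}
for line in corpus:
    words = line.split()
    for i, word in enumerate(words):
        start, end = max(0, i - window), min(len(words), i + window + 1)
        for j in range(start, end):
            if j != i:
                cooccurrence[word][words[j]] += 1


def vector(word):
    """One row of the matrix: a list of numbers standing in for the word."""
    return [cooccurrence[word][other] for other in vocab]


def cosine(a, b):
    """Similarity of two vectors: near 1 means similar direction, 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


sim_cat_dog = cosine(vector("cat"), vector("dog"))
sim_cat_code = cosine(vector("cat"), vector("code"))

print(f"cat vs dog:  {sim_cat_dog:.2f}")
print(f"cat vs code: {sim_cat_code:.2f}")
```

The numbers make "cat" look closer to "dog" than to "code" only because of the company those words keep in this particular corpus, which is the point: the representation depends on the text it was built from.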

F is for features & free software

Feature extraction

Features are the attributes being analyzed, considered, or explored across the dataset, often viewed as a column in a table. Feature extraction and feature engineering are techniques used to focus on the specific information in a dataset that is relevant to the researchers or model designers. They may need to create features (e.g., add columns to a table) to show data from new perspectives. This can impact how the dataset can be analyzed going forward, how the model can be designed, and how the data subjects and subjectees might be affected.
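
As a minimal sketch of feature creation, the Python below starts from a small table (a few workshop zine titles and years from this toolkit) and adds two new columns. The features chosen here, `title_word_count` and `is_question`, are hypothetical and picked only for illustration; someone else might create very different ones from the same rows.

```python
# Each row is one record; each key is a feature (a column in the table).
rows = [
    {"title": "Help Me Code IAI", "year": 2021},
    {"title": "What Shouldn't AI Be Used For?", "year": 2022},
    {"title": "Hi AI, We Have Questions", "year": 2022},
    {"title": "Intersectional Data Bodies", "year": 2023},
]


def add_features(row):
    """Return a copy of the row with two newly created feature columns."""
    enriched = dict(row)  # copy, so the original table is left untouched
    enriched["title_word_count"] = len(row["title"].split())
    enriched["is_question"] = row["title"].endswith("?")
    return enriched


table = [add_features(row) for row in rows]

for row in table:
    print(row)
```

Once these columns exist, any later analysis can only "see" the zines through them, which is one way feature choices shape what a dataset can say.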

FLOSS

FLOSS stands for Free Libre Open Source Software. Broadly, open source means the dataset or source code is available to be viewed, changed, and used free of charge by the public. In most cases, licenses must be observed that describe how it should (not) be used.¹⁵ FLOSS philosophies represent many different approaches to licensing information around the production and distribution of technologies, with a focus on access and permissive use. This has been a positive for many academic and scientific endeavors, but it has also been exploited by many corporate endeavors who build on FLOSS work to develop and profit privately.

¹⁵

Training the Archive. Glossary

Foundation models

See Models

G is for global & generative

GPT

GPT stands for Generative Pre-trained Transformer. It's a type of machine learning model first developed in 2018 that relies on giant collections of unlabeled data. In the case of OpenAI's language generating tools and many others, these data often come from scraped public websites, including Wikipedia, Reddit, GitHub, Smashwords, Flickr, and Project Gutenberg.

Global majority

The phrase 'global majority' (sometimes also referred to as the global south) suggests reframing how we consider the many identities who are left out of conversations and calculations about technologies that impact them, whether because they are treated as 'minorities' or edge cases, or because they have been denied access and resources due to wealth disparity in the global north — often both. Western, educated, industrialized, rich, democratic (W.E.I.R.D.) populations, along with heterosexual, monogamous, white men, are usually treated as the norm when conducting research or creating technologies — against which all others are differentiated. However, the W.E.I.R.D. may average to a middle, but they are not a majority, nor a default. The global majority of people exist in all intersecting variations outside this normalized baseline. What happens when we rethink the design and use of AI systems using new baselines?

"The (largely North led) agenda setting has material implications in terms of which problems are studied, with the limited funding and resources available. However, not only are southern populations more vulnerable to ‘existential’ risk (in part because of their post-colonial contexts), but North-led development of AI perpetuates extractive patterns that exacerbate these vulnerabilities."¹⁶

¹⁶

Singh A, Vale D. 2021. "Existential Risk." A New AI Lexicon.

GAN

GAN stands for generative adversarial network and is a now-popular kind of machine learning used to generate new data, such as images seen in the "AI dreaming" aesthetic. It requires two parts: One part is trained on existing data in order to check the second part's work. The second part is trying to generate new data that can fool the first part (hence adversaries).

H is for heteronormative

Heteronormative

"Attitudes and behaviors that incorrectly assume gender is binary, ignoring genders besides women and men, and that people should and will align with conventional expectations of society for gender identity, gender expression, and sexual and romantic attraction".

–UC Davis LGBTQIA Resource Center,¹⁷ cited by Studio Pathways Glossary⁶

Normative categories affect AI because they impact how computational systems are designed and implemented. For example, if a survey is designed with only two choices for gender, or if a programming language uses only true and false to encode those choices, machine learning models trained on data from that survey will present already limited viewpoints. They will not be able to account for survey takers who did not fit into those two choices, nor for viewers of the machine learning outputs who also fall outside those choices. Despite no explicit decision being made to exclude anyone, many people end up unrepresented when only normative lenses are used.
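
The true/false example can be sketched in Python. The survey answers below are made up, and the `is_woman` encoding is a hypothetical design choice of the kind described above, shown to illustrate what gets lost rather than as a recommendation.

```python
from collections import Counter

# Made-up answers to an open question about gender.
responses = ["woman", "man", "nonbinary", "woman", "agender", "man"]


def encode_binary(answer):
    """A true/false field can only record one thing: is_woman, yes or no."""
    return answer == "woman"


binary_column = [encode_binary(answer) for answer in responses]

# What a model trained on the boolean column gets to see:
binary_counts = Counter(binary_column)

# What people actually answered:
open_counts = Counter(responses)

print("boolean column:", dict(binary_counts))
print("open answers:  ", dict(open_counts))
```

Nobody wrote a rule to exclude anyone, yet in the boolean column the nonbinary and agender respondents are silently counted together with the men, and nothing downstream can recover that difference.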

¹⁷

UC Davis LGBTQIA Resource Center

I is for Intersectionality

Intersectionality

"Intersectionality, as first named by Kimberlé Crenshaw (1989), center[s] interlocking systems of oppression and in doing so make[s] visible the normative value systems that facilitate erasure."¹⁸

In Kimberlé Crenshaw’s original formulation of intersectionality, which originated in her legal scholarship and has been expanded broadly, intersectionality analyzes differences in structural power and how it operates at scale. It is not only about individuals' identities but about how multiple forms of discrimination have compounding, interdependent effects. She argues that intersectional analysis is critical for examining both discrimination and privilege, as these are two aspects of the same systems. ¹⁹ Intersectional methods are essential for addressing bias and power in AI, because they draw on important work by a wide range of communities — Black feminists, queer and disabled theorists, and others — who have been considering difference and equitable systems for decades before these questions became digital.

¹⁸

Gipson, B., Corry, F., & Noble, S. U. (2021). Intersectionality. In Uncertain Archives: Critical Keywords for Big Data. https://doi.org/10.7551/mitpress/12236.003.0027

¹⁹

Crenshaw, K. (1989). Demarginalizing the Intersection of Race and Sex: A Black Feminist Critique of Antidiscrimination Doctrine, Feminist Theory and Antiracist Politics. University of Chicago Legal Forum, 1989, 139–168. ———. (2021, March 29). What Does Intersectionality Mean? : 1A. https://www.npr.org/2021/03/29/982357959/what-does-intersectionality-mean

J is for Justice

Justice, transformative

Rather than punitive justice as practiced by governments, which punishes people who perpetuate harm or removes them from their communities, transformative justice is a way to respond to violence and harm within our own communities that avoids reproducing harm and instead seeks to repair it and to address the root of problems, so that the conditions which created the issue cannot be repeated.²⁰

"How do we change, heal, transform so that this harm is no longer possible? How do we understand that the state is committed to punitive justice and transformative justice is not possible from the state?" ²⁰

²⁰

What is Transformative Justice? (2020, March 5). Barnard Center for Research on Women. https://bcrw.barnard.edu/videos/what-is-transformative-justice/

K is for K

K-Means & K-Nearest Neighbor (KNN)

K-Means and K-Nearest Neighbor are two commonly used algorithms for machine learning. The k represents something different in each, and their processes of grouping information are also different. K-Means separates data into groups. It finds patterns by clustering an unlabeled set of data into a selected number (k) of categories (see "Unsupervised" in Supervised & unsupervised). In contrast, K-Nearest Neighbors sorts like with like. It categorizes new data based on similarity to existing data, by looking for a selected number (k) of closest nearby datapoints (see "Supervised" in Supervised & unsupervised).
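To make the difference concrete, here is a small sketch of both ideas in plain Python. The points, labels, and function names are made up for illustration, and the K-Means starting guess (the first k points) is a simplifying assumption; real libraries choose starting points more carefully.

```python
import math
from collections import Counter

# Hypothetical 2D datapoints with labels (invented for this sketch).
labeled_points = [
    ((1.0, 1.0), "a"), ((1.5, 2.0), "a"), ((2.0, 1.5), "a"),
    ((8.0, 8.0), "b"), ((8.5, 9.0), "b"), ((9.0, 8.5), "b"),
]

def knn_classify(new_point, labeled, k=3):
    """K-Nearest Neighbors: label a new point by a vote of its k closest labeled points."""
    nearest = sorted(labeled, key=lambda item: math.dist(new_point, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def k_means(points, k=2, steps=10):
    """K-Means: split unlabeled points into k groups around moving center points."""
    centers = list(points[:k])  # naive starting guess: the first k points
    groups = []
    for _ in range(steps):
        # Assign every point to its closest center.
        groups = [[] for _ in range(k)]
        for p in points:
            closest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[closest].append(p)
        # Move each center to the average of the points assigned to it.
        centers = [
            tuple(sum(axis) / len(group) for axis in zip(*group)) if group else centers[i]
            for i, group in enumerate(groups)
        ]
    return groups

knn_label = knn_classify((2.0, 2.0), labeled_points, k=3)  # uses the labels
unlabeled = [point for point, _ in labeled_points]          # labels thrown away
clusters = k_means(unlabeled, k=2)

print(knn_label)
print(clusters)
```

Notice that in both cases a person picked k, picked the distance measure, and decided which features count as "close."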

L is for Lynn

Lynn and her Alto at Xerox PARC (1983)

Lynn Conway

Lynn Conway is a prominent computer scientist who is known for designing innovations in supercomputers and very large-scale integrated circuits. In 1968 she was fired from IBM for notifying them she would be undertaking gender transition. In 2020, 52 years later and after an extremely successful career at Xerox PARC, Memorex, and the University of Michigan, she received a formal apology from IBM.

M is for models & machines

Machine learning

Machine learning is a set of tools used by computer programmers to find a formula that best describes (or models) a dataset. Whereas in other kinds of software the programmer will write explicit instructions for every part of a task, in machine learning, programmers will instruct the software to adjust its code based on the data it processes, thus "learning" from new information. Its learning is unlike human understanding and the term is used metaphorically. Due to their increasing complexity, the outputs of machine learning models are not reliable for making decisions about people, especially in highly consequential cases. Include machine learning as one suite of options in a broader toolkit — rather than a generalizable multi-tool for every task.¹¹

Model

Models are the result of the machine learning process. A model is the saved output of training, revised to take into account the data it was exposed to and ready to make predictions about new data. One way to think of a model is as a very complex mathematical formula (algorithm) containing millions or billions of variables (values that can change). These variables are designed to transform the numerical input into the desired outputs. The process of model training requires adjusting the variables that make up the formula until the output matches the desired output. Much focus is put on machine learning models, but models depend directly on datasets for making their predictions.¹¹ Increasingly, models are trained on top of other 'foundation models'. These popular models were originally trained with a broad scope of data, with the intention of making them widely adaptable for many uses; however, because they are so large and because they get folded repeatedly into new models and new contexts, their results are harder to understand and potentially more fraught.²¹

²¹

"Reflections on Foundation Models." 2021. Stanford Human-Centered Artificial Intelligence. October 18, 2021. https://hai.stanford.edu/news/reflections-foundation-models

N is for 'neural'

Neural network

Neural networks describe some of the ways to structure machine learning models, including large language models. Named for the inspiration they take from brain neurons (very simplified), they move information through a series of nodes (steps) organized in layers or sets. Each node receives the output of the previous layers' nodes, combines or processes them using a mathematical formula, then passes the output to the next layer of nodes.¹¹
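As a rough sketch of that node-and-layer flow, the Python below passes two input numbers through one hidden layer and one output layer. All of the weights are made-up numbers (a trained model would have learned them), and the sigmoid "squashing" formula is just one common choice assumed here for illustration.

```python
import math

def sigmoid(x):
    """Squash any number into the range between 0 and 1."""
    return 1 / (1 + math.exp(-x))

def layer(inputs, weights, biases):
    """Each node multiplies every input by its own weight, adds them up, then squashes."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]

# Invented numbers: 3 hidden nodes that each read 2 inputs,
# then 1 output node that reads the 3 hidden results.
hidden_weights = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [[0.7, -0.5, 0.2]]
output_biases = [0.05]

inputs = [1.0, 2.0]
hidden = layer(inputs, hidden_weights, hidden_biases)   # first layer's output...
output = layer(hidden, output_weights, output_biases)   # ...becomes the next layer's input

print(hidden)
print(output)
```

Training a network means nudging all of those weight and bias numbers, over and over, until the final output gets closer to what its designers want.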

O is for opacity

Opacity

Rather than more data and more transparency, many people who bear the brunt of the harmful impacts of machine learning systems like facial recognition have been arguing for the right not to be included in datasets (opt-out) and for the right to have their information removed from systems (sometimes referred to as machine unlearning). For much longer than there has been AI, Black activists and scholars have been arguing for alternatives to totalizing approaches that demand to 'know' in ways that capture and reduce bodies, difference, and freedom:

"If we examine the process of 'understanding' people and ideas from the perspective of Western thought, we discover that its basis is this requirement for transparency. In order to understand and thus accept you, I have to measure your solidity with the ideal scale providing me with grounds to make comparisons and, perhaps, judgments. I have to reduce. [...] —But perhaps we need to bring an end to the very notion of a scale. Displace all reduction. Agree not merely to the right to difference but, carrying this further, agree also to the right to opacity that is [not enclosure within an impenetrable autarchy but] subsistence within an irreducible singularity. Opacities can coexist and converge, weaving fabrics. To understand these truly one must focus on the texture of the weave and not on the nature of its components. For the time being, perhaps, give up this old obsession with discovering what lies at the bottom of natures. There would be something great and noble about initiating such a movement, referring not to Humanity but to the exultant divergence of humanities. Thought of self and thought of other here become obsolete in their duality. Every Other is a citizen and no longer a barbarian. What is here is open, as much as this there. I would be incapable of projecting from one to the other. This-here is the weave, and it weaves no boundaries." –Édouard Glissant, Poetics of Relation

Overfitting & underfitting

See Bias & variance

P is for peers & protocols

Protocol

A protocol is a set of well-defined rules for how data is sent between computers. ²²

Peer-to-peer

In peer-to-peer networks, there is no central authority. Rather, each machine acts as a server offering content to the others.²² This model could be aligned with mutual aid networks and other decentralized systems built outside of big (government or corporate) infrastructure platforms.

²³

You and AI Festival Glossary

²²

Decentralized Off the Shelf

Q is for queer

Queer OS

"Queer OS," as theorized by Black film scholar Kara Keeling, names a way of seeing queerness and gender as technologies and a way of seeing and using technologies [queerly/as queer infrastructure], "to facilitate and support imaginative, unexpected, and ethical relations between and among living beings and the environment, even when they have little, and perhaps nothing, in common." Keeling positions a Queer OS as an operating system at the social level and computational level, which reconfigures power based on queer values: "Because Queer OS ideally functions to transform material relations, it is at odds with the logics embedded in [existing] operating systems [.... It] seeks to undermine the relationships secured through those logics, even as [...] it acknowledges its own imbrication with and reliance on those logics while still striving to forge new relationships and connections."²⁴

²⁴

Keeling, K. (2014). Queer OS. Cinema Journal, 53(2), 152–157. https://doi.org/10.1353/cj.2014.0004

R is for regression & racialization

Racialization

Structural racism: A system in which public policies, institutional practices, cultural representations, and other norms work in mutually reinforcing ways to perpetuate racial group inequity. A structural analysis of racism identifies dimensions of our history and culture that have allowed privileges associated with "whiteness" and disadvantages associated with "color" to endure and adapt over time. Structural racism is not something that a few people or institutions choose to practice. Instead it is a feature of the social, economic and political systems in which we all exist. –Mimi Onuoha and Mother Cyborg (Diana Nucera), A People's Guide to AI[^Onuoha]

Racialization: "Processes of racialization begin by attributing racial meaning to people's identity and, in particular, as they relate to social structures and institutional systems, such as housing, employment, and education." –Encyclopedia of Race, Ethnicity, and Society, via Studio Pathways⁶

Regression & classification

Regression tasks model the relationship between features in a dataset in order to predict a numerical value along a continuous range, for example estimating a dog's age from its weight and number of spots. These are distinguished from classification tasks, which label and sort items in a dataset by discrete categories. For example, asking whether an image is a dog or a cat is handled by a classification task.¹¹

S is for sustainability & supervision

Supervised & unsupervised

In supervised machine learning techniques, at least a portion of the training data will already indicate the patterns that the model is designed to "learn." Unsupervised machine learning techniques also find patterns, but these are not already labeled in the dataset. They use different kinds of machine learning techniques, such as clustering groups of data together by the features they share. However, don't think that conclusions drawn from unsupervised machine learning are somehow more pure or rational. Just as much human judgment goes into developing unsupervised machine learning models as supervised ones. Often supervised and unsupervised approaches are used in combination to ask different kinds of questions, or techniques are used that are somewhere between the two approaches.¹¹

Sustainability

Is AI sustainable? Creating and using AI systems accumulates huge environmental impacts. The ease with which we can get quick, if fallible, answers from systems like ChatGPT obscures their resource-hungry consumption, like the bottles of water consumed with every few queries and the electricity needed to power their training and keep their data centers cool. Considering "environmental justice" as a more complex intersectional lens can help us wrestle with the harms of large technical systems. This includes but goes beyond quantifying AI systems' impacts, which are disproportionate across categories of difference like class, race, and gender. "Reframing sustainability and AI in terms of environmental justice offers a way to center the material contexts and implications of AI technologies and provides a framework for imagining community-led, socially just futures."²⁵

²⁵

Rachel Bergmann & Sonja Solomun. 2021. AI Now's A New AI Lexicon: Sustainability

T is for trans

Transformer

A transformer is a common structure for current machine learning models, including GPT. Transformers are designed to digest huge datasets of unsorted text, images, audio, or video. Those input data are converted into numerical form based on their relationship to nearby word fragments or pixel values, for example (see Values). These processes are repeated many times to 'encode' the numbers and to 'normalize' or standardize those numbers in relation to each other. Then the process can be repeated to 'decode' data and return an output in the form of text, image, audio, or video. At each stage the designers can control what passes through the model with filters called 'activation functions', and additional information can be used to 'weight' the existing information, or to focus it toward a new topic.

U is for un

Uncertainty

Louise Amoore argues that the output of an AI system is never simply either true or false, but an effect of relations, a series of optimized probabilities, eventually labeled as a certainty. She says, "Where politics expresses the fallibility of the world and the irresolvability of all claims, the algorithm expresses optimized outcomes and the resolvability of the claim in the reduction to a single output."[^Amoore] Further, she argues that these systems are "geared to profit from uncertainty, or to output something that had not been spoken or anticipated."[^Amoore] However, uncertainty remains at every point in the process, says Amoore:

"Though at the point of optimized output, the algorithm places action beyond doubt, there are multiple branching points, weights, and parameters in the arrangements of decision trees and random forest algorithms, branching points at which doubt flourishes and proliferates."[^Amoore]

V is for values, variables, vectors, oh my!

Values

Values are the perspectives and ethics each person and community holds which determine how they act and how they evaluate situations, what they protect and who they esteem (and don't). Values are embedded in and expressed by technologies because they are designed and produced by people who hold values and make decisions based on those values, whether consciously or unconsciously. Values are also a term for information that can be represented as numbers or strings of text. On a social media profile, your age may be a value in a dataset that also includes values like your username, password, and personal interests that are of use to advertisers.

Variables

Variables are labeled containers for information. They are placeholders you can name and store values in to recall for later. Let's say let x = 1. That means we declared x is a variable storing the number 1. Later we can ask, "What was x?" or say, "Change x = 3 now." (And then next time we ask what x equals, the answer will be 3.) This allows information to be moved through a program and manipulated. By combining lots of variables and processes, programmers can perform powerful manipulations of large amounts of data.

This naming has power. As a programmer, you decide what to name your variables, and you decide how your systems are organized and structured. That means you decide what information means, whether it is the weight of a feature in a machine learning model or the threshold that separates one color from another. These decisions matter and are informed by your values.

Vectors

Vectors are lists of numbers used for machine learning calculations. Sometimes called arrays, these lists of numbers can be compared with each other and can be graphed in space to understand the relationships among the data they represent. You might remember plotting [x,y] coordinates in a geometry class. Imagine plotting vector coordinates with many more dimensions [x,y,z,...], sometimes hundreds. Vectors are often used in machine learning tasks to represent words, images, and other media, for example as word embeddings.
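One common way to compare two vectors is cosine similarity, which measures how closely they point in the same direction. The sketch below uses tiny three-number vectors with invented values; real word embeddings are learned from data and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return a score near 1.0 for vectors pointing the same way, near 0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    length_a = math.sqrt(sum(x * x for x in a))
    length_b = math.sqrt(sum(y * y for y in b))
    return dot / (length_a * length_b)

# Made-up "embeddings" for illustration only.
vectors = {
    "zine": [0.9, 0.1, 0.3],
    "magazine": [0.8, 0.2, 0.4],
    "server": [0.1, 0.9, 0.2],
}

sim_close = cosine_similarity(vectors["zine"], vectors["magazine"])
sim_far = cosine_similarity(vectors["zine"], vectors["server"])

print(round(sim_close, 2), round(sim_far, 2))
```

Whatever ends up "near" or "far" in this space depends on the data and choices that produced the numbers, not on any neutral measure of meaning.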

W is for

(bag of) Words

"Bag of words" is a natural language processing method of analyzing and classifying text that looks only at how frequently each word occurs, while disregarding the order of the words, syntax, or grammar, as if the words were all thrown in a bag. In current approaches to creating word embeddings (see E is for embeddings), the "continuous bag of words" (CBOW) technique is used to predict a single word given a set number of surrounding words for context. In contrast, the "skip-gram" technique tries to predict the surrounding context words given a single input word.
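A basic bag of words can be sketched in a few lines of Python with the standard library's `Counter`. The helper name `bag_of_words` and the example sentences are made up for illustration, and the word-splitting here is deliberately crude.

```python
from collections import Counter

def bag_of_words(text):
    """Count how often each word appears, ignoring word order and punctuation."""
    words = text.lower().split()
    return Counter(word.strip(".,!?;:\"'") for word in words)

bag_a = bag_of_words("The dog chased the cat.")
bag_b = bag_of_words("The cat chased the dog.")

print(bag_a)
print(bag_a == bag_b)  # the two sentences mean different things but fill identical bags
```

Because word order is thrown away, the method cannot tell who chased whom, which is exactly the kind of context it trades for simplicity.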

X is for X (input)

X (input)

"Garbage in, garbage out" the saying goes. How datasets are created, shaped, and implemented as training input for AI systems fundamentally informs their resulting outputs. Datasets are not the only element affecting how AI works, but they are a key element in all AI systems. To dive deep into datasets, see A Critical Field Guide for Working with Machine Learning Datasets, which is a friendly introduction, defining all the types and parts of datasets, all the benefits and pitfalls of how to use datasets practically and critically.

Y is for output ... YOLO!

Y (output)

y = f(x) + ε is what the simplest machine learning model looks like. Don't be afraid of the math; it's shorthand like you might have learned in high school. Written out in words, it means that the output or result y of a model is a function f(). A function just means that some calculation (algorithm, operation, recipe) is performed on the stuff inside ( ). Here we see a function of some inputs x (which are known, and are also called features) plus the error ε (which is unknown). The function is adjusted based on what the model's creators determine will yield expected, "appropriate" results.
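Here is a toy version of that formula in Python. It assumes a made-up "true" function (a straight line) and adds random noise to stand in for the error term, then uses ordinary least squares, one standard way of fitting a line, to adjust a slope and intercept until they match the data. Every number here is invented for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch gives the same numbers each run

def f(x):
    """A hypothetical 'true' relationship that the model will try to recover."""
    return 2 * x + 1

xs = [float(i) for i in range(20)]                # known inputs
noise = [random.gauss(0, 0.5) for _ in xs]        # the unknown error term
ys = [f(x) + e for x, e in zip(xs, noise)]        # observed outputs

# Ordinary least squares for a straight line: find the slope and intercept
# that make the line sit as close as possible to the observed points.
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

print(slope, intercept)  # close to, but not exactly, 2 and 1
```

The fitted numbers never land exactly on the "true" ones, because the model only ever sees outputs that already include the error.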

YOLO predictions, credit: YOLO Joseph Redmon

YOLO

"You Only Look Once" (YOLO) is a popular computer vision algorithm used for real-time and multiple object detection and classification. It identifies and labels items in still and moving images, and it has been applied to self-driving cars and surveillance. YOLO is an example of a convolutional neural network (CNN) that finds patterns in the number representations of image pixels. As the network layers accumulate, the patterns get more complex and it is trained to categorize these patterns into objects based on labels it is given in advance.²⁶ The YOLO9000 system, for example, was trained on prior datasets like ImageNet (itself a highly contested dataset) for its classification labels and on COCO for detection.²⁷

²⁶

ml5.js object detector

²⁷

Redmon J, Farhadi A. 2016. "Yolo9000: Better, Faster, Stronger"

Z is for

Zero-shot learning

Usually, machine learning systems are trained by being exposed to tons of examples, but in few-shot or zero-shot learning, they are designed to output desired results after seeing only a handful of examples of a category (few-shot) or none at all (zero-shot). These approaches are becoming more common in recent "generalizable" models that are supposed to work for many kinds of purposes and topics. They operate by inferring from information in other categories and fields of knowledge.

Zines

Zines, short for 'magazines', are publications that come in many print and digital forms. Often self-published, they can be photocopied and stapled or elaborately constructed. They have a history rooted in diverse politics and perspectives. Some of our favorite zines are collected by Tiny Tech Zines. Read more writing about the history and variety of zine culture in these books and articles:

Duncombe, S. (2008). Notes from underground: Zines and the politics of alternative culture. Microcosm Publishing.
Hono, M. (2021). Scrappy Messiness Increases Affection – Zines as Rebellion Against the Cultural Dominance of Digital Self-Publishing. https://you.stonybrook.edu/zines/scrappy-messiness-increases-affection/
Oakley, B. (2023). Imperfect Archiving, Archiving as Practice: The Ethics of the Archive. GenderFail.
Piepmeier, A. (2008). Why Zines Matter: Materiality and the Creation of Embodied Community. American Periodicals, 18(2), 213–238.
Simanjuntak, R., Espinoza, T., & Yin, T. Tiny Tech Zines. http://tinytechzines.org/

Code Basics for IAI 😍

Print & Fold Edition: A3/Tabloid

Don't let { curly braces } intimidate you -- let's jump in!

All Programs Combine Just a Few Basic Concepts

AI and coding can be intimidating to learn. But the entrance is framed in some simple ideas.

All code combines just a few key components—no matter the scale or complexity! Anything you want to build will be composed of these pieces layered together.

Data

Data are pieces of input to the machine. They are the information you are giving, manipulating, and getting back out.

Data types are all the different kinds of information you can store. Some languages, like Java, ask you to specify which kind of data a variable holds. Others, like Python and JavaScript, can tell just by how the data is formatted.

Strings are simple lines of text (always seen in between single or double quotation marks).

Numbers include integers aka whole numbers like 100; floats are numbers with decimals like 0.25; sometimes other kinds too.

Booleans are either True or False (or a sneaky NULL).
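Here is a quick sketch of those data types in Python, with made-up values. Python spells its "sneaky NULL" as `None`, and the built-in `type()` will tell you what kind of data you are holding.

```python
title = "Intersectional AI Toolkit"  # string: text inside quotation marks
pages = 100                          # integer: a whole number
share = 0.25                         # float: a number with a decimal
is_zine = True                       # Boolean: True or False
not_answered = None                  # Python's name for NULL: no value at all

# Ask Python what kind of data each value is.
type_names = [
    type(value).__name__
    for value in (title, pages, share, is_zine, not_answered)
]

print(type_names)
```

Note that `None` is not the same thing as `False`: "no answer" and "no" are different pieces of information.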

Variables

Variables are labeled containers for information. They are placeholders you can name and store things in to recall for later. Let's say let x = 1. That means we declared x is a variable storing the data 1. Later we can ask, "What was x?" or say, "Change x = 3 now." (And then next time we ask what x equals, the answer will be different.)
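In Python the same idea looks like this (Python skips the word `let` and just uses the equals sign). The names `first_answer` and `second_answer` are only there for this sketch.

```python
x = 1              # declare that x stores the data 1
first_answer = x   # "What was x?"

x = 3              # "Change x = 3 now."
second_answer = x  # ask again, and the answer is different

print(first_answer, second_answer)
```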

Naming has power. ENCODE <><> DECODE

Pssst... you decide what to name your variables.

You are deciding what information means. e.g. the weight of a feature in a machine learning model or the threshold that demarcates one color from another.

Data in Groups

There are different data types we use to group collections of data together. They each have different benefits:

Arrays AKA Lists are simple run downs of data. They can usually be of any length and be of any kind of data:

numberList = [3, 4, 5, 6]
emojiList = ["peach", "eggplant", "heart"]
mixedList = ["peach", 33, "heart", True, 0.314159]

Dictionaries contain pairs of information, where each pair has a key and a value assigned to match, like a label for each piece of info:

dict = {"key": "value", "name": "Sarah", "color": "teal"}

Sets are special lists that forbid duplicates, which can sometimes come in handy!

Any of the above data types can be stored in a variable that you name (almost whatever you want).
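A short Python sketch of getting information back out of each kind of group. The variable names and values are invented for illustration.

```python
emoji_list = ["peach", "eggplant", "heart", "peach"]
profile = {"name": "Sarah", "color": "teal"}

second_emoji = emoji_list[1]       # lists keep their order; counting starts at 0
favorite_color = profile["color"]  # dictionaries look up a value by its key
unique_emoji = set(emoji_list)     # sets drop the duplicate "peach"

print(second_emoji, favorite_color, len(unique_emoji))
```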

Action! Calculations & Operations

Now that you've got some data in your program, time to do something with it! One of the main things programs do is make calculations on data using mathematical operations like add, subtract, multiply, divide, etc. (+ - * /), as well as Boolean operations that act as filters, for example AND (&&) and OR (||).
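Here are a few of those operations sketched in Python with made-up values. Python writes the Boolean operators as the words `and` and `or`, where some other languages use `&&` and `||`.

```python
total = 3 + 4 * 2          # multiplication happens before addition
average = (3 + 4 + 5) / 3  # parentheses go first; division gives a float

likes_zines = True
likes_code = False
both = likes_zines and likes_code   # True only if both are True
either = likes_zines or likes_code  # True if at least one is True

# Boolean operations as a filter: keep numbers that are bigger than 3 AND even.
numbers = [3, 4, 5, 6]
filtered = [n for n in numbers if n > 3 and n % 2 == 0]

print(total, average, both, either, filtered)
```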

Functions: Take It with You

Functions are a way to organize and reuse code, along with its embedded ideas and values. Like putting snippets of code and ideas in a bag.

Wow that's some useful code!

What about those ideas, though?

The more you organize your code into functions, the easier it is to reuse in later projects. And guess what, you can also save your functions with almost any variable name you want!

Like variables, functions store stuff, but instead they can save a whole phrase that expresses an action. They can execute and act on whatever data you feed into them when you put them to work later.
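A tiny Python sketch of that idea follows. The function name `greet` and its inputs are made up for illustration: the action is saved once under a name, then put to work on whatever data is fed in later.

```python
def greet(name, greeting="Hello"):
    """Save a whole action under a name: build a greeting for whoever is passed in."""
    return f"{greeting}, {name}!"

# Reuse the same function on different data.
messages = [greet(name) for name in ["Sarah", "Lynn"]]
custom = greet("friends", greeting="Welcome")

print(messages)
print(custom)
```

The default greeting is a small design decision baked into the function, and it travels along every time the code is reused.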

The best (and sometimes worst) part about computation is that it's designed to make patterns to apply to multiple situations.

Do It Again! Loops & Conditionals

A loop helps you repeat an action many times.

What factors are overlooked when working at a large scale?

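dataset = ["photo_1.jpg", "photo_2.jpg"]  # hypothetical image files
for image in dataset:
    # find out if blueberry muffin
    # or chihuahua
    print("muffin or chihuahua?", image)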

How do we treat different things?

A conditional helps you decide whether to do an action or not.

How do you decide on the right judgment call?

object_on_plate = True  # hypothetical observation
if object_on_plate:
    # muffin!
    # unless SOMETHING is horribly wrong
    print("muffin!")

Computers can compound decision-making work at high speed and great volume, allowing for amazing automation but also enormous, unforeseen ethical impacts.

NO SUCH THING AS RAW DATA

Input | Output
Data entered into a program, stored in variables, used and modified when the program is run. | Data the program sends out into "the world."
e.g. from sensors, keyboard, mouse, camera, microphone, database, API | e.g. print(), translation, recommendation, calculation, search result, prediction

Pro Tip: The quality of the information coming out cannot be better than the quality of the information coming in.

Creative Code Collective, Resource Hub

Sharon Lee De La Cruz: Can computers understand slang? Code Slang: crowdsourced library, flexible, visual output. "Retaining a culture, meeting people where they're at, celebrating the way we communicate." & the Digital Citizens Lab

Aesthetic Programming

Exploratory Programming for the Arts and Humanities, 2nd Ed