Repositories

Below are code and dataset repositories produced in our previous research studies. Browse them for datasets or tools that may support your own research.

Social B(eye)as Dataset

Authors: Barlas, Pinar; Kyriakou, Kyriakos; Kleanthous, Styliani; Otterbacher, Jahna
Published: 2019

Image analysis algorithms have become an indispensable tool in our information ecosystem, facilitating new forms of visual communication and information sharing. At the same time, they enable large-scale socio-technical research which would otherwise be difficult to carry out. However, their outputs may exhibit social bias, especially when analyzing people images. Since most algorithms are proprietary and opaque, we propose a method of auditing their outputs for social biases. To be able to compare how algorithms interpret a controlled set of people images, we collected descriptions across six image tagging APIs. In order to compare these results to human behavior, we also collected descriptions on the same images from crowdworkers in two anglophone regions. While the APIs do not output explicitly offensive descriptions, as humans do, future work should consider if and how they reinforce social inequalities in implicit ways. Beyond computer vision auditing, the dataset of human- and machine-produced tags, and the typology of tags, can be used to explore a range of research questions related to both algorithmic and human behaviors.

Download

Social B(eye)as Dataset v2.0

Authors: Barlas, Pinar; Kyriakou, Kyriakos; Guest, Olivia; Kleanthous, Styliani; Otterbacher, Jahna
Published: 2020

Researchers of Web and social media rely extensively on image analysis tools to understand users’ sharing behaviors and engagement with content on the large scale. However, it has been made clear over the past years that there are disparities in the way that these tools treat images depicting people from different social groups. Previously, we released the Social B(eye)as Dataset, consisting of machine- and human-generated descriptions on a controlled set of people images without context. This resource allows researchers to compare the behaviors of taggers and humans systematically. We now update this, with a process that imposes the people-images onto backgrounds. The current release uses four stereotypically “feminine” and four “masculine” contexts. Thus, it enables us to consider the possible influences upon the gender inferences that are made by tagging algorithms. We also provide an updated typology of tags used by the six proprietary taggers as well as initial analyses. Our methodology for imposing semi-transparent images onto background images is publicly available, allowing others to repeat the process with other combinations of images for various research topics.

Download

CFD background stimuli generator

Authors: Barlas, Pinar; Kyriakou, Kyriakos; Guest, Olivia; Kleanthous, Styliani; Otterbacher, Jahna
Coded by: Guest, Olivia
Published: 2020

This Python tool superimposes profile photos of people onto a set of preset backgrounds. The input photos must first have their original white backgrounds rendered transparent. The tool then combines each photo with each background (all possible combinations) to create stimuli that show a person in the foreground and a specific environment in the background.

More specifically, in previous research we used this tool to combine input images from the Chicago Face Database with selected backgrounds, in order to study how image tagging algorithms behave when the background context of the depicted person changes.
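The compositing idea described above can be illustrated with a minimal, dependency-free sketch. This is not the code from the repository: the function names (white_to_transparent, superimpose, generate_stimuli), the near-white threshold of 250, and the representation of an image as rows of (R, G, B, A) tuples are all assumptions made for illustration. A real pipeline would more likely rely on an imaging library.

```python
from itertools import product

# Assumed representation: a pixel is an (R, G, B, A) tuple of ints in 0..255,
# and an image is a list of rows of pixels. All names here are hypothetical.


def white_to_transparent(image, threshold=250):
    """Set alpha to 0 for near-white pixels (assumed preprocessing step)."""
    return [
        [(r, g, b, 0) if min(r, g, b) >= threshold else (r, g, b, a)
         for (r, g, b, a) in row]
        for row in image
    ]


def over(fg, bg):
    """Blend one foreground pixel over a background pixel.

    The background is treated as fully opaque; its alpha value is ignored.
    """
    alpha = fg[3] / 255
    blended = tuple(
        round(alpha * f + (1 - alpha) * b) for f, b in zip(fg[:3], bg[:3])
    )
    return blended + (255,)


def superimpose(person, background):
    """Composite a person image over a same-sized background, pixel by pixel."""
    return [
        [over(fg, bg) for fg, bg in zip(p_row, b_row)]
        for p_row, b_row in zip(person, background)
    ]


def generate_stimuli(people, backgrounds):
    """Build one stimulus for every (person, background) pair."""
    return {
        (p_name, b_name): superimpose(p_img, b_img)
        for (p_name, p_img), (b_name, b_img)
        in product(people.items(), backgrounds.items())
    }
```

With two people and three backgrounds, generate_stimuli returns six stimuli, one per pair. Semi-transparent foreground pixels are blended in proportion to their alpha value rather than simply replacing the background.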

Find it on GitHub