Navigating AI biases: ChatGPT, DALL-E and humanitarian workers

by Laurent Fernandez Soldevila | Mar 27, 2024 | Thought Pieces

In a recent workshop on Generative Artificial Intelligence in the IFRC network, participants asked “How can we ensure that AI-driven humanitarian tools adhere to ethical guidelines, avoid biases, and prioritize the well-being of the communities they aim to serve?”. Picking up the bias conversation, we share a scary experiment on ChatGPT’s perception of humanitarian workers. This is an invitation to reflect on the systemic biases and issues that Gen AI tools surface, and on how we use them.

Biases are non-rational shortcuts in how we represent the world, and our worldview is foundational to the plans and decisions that we make. The democratisation of Artificial Intelligence tools makes it critical to raise awareness of the perceptions and worldviews that they reproduce.

A scary experiment on ChatGPT’s perception of humanitarian workers

A short word about ChatGPT and biases before jumping into the experiment itself.

Prompt: Are you biased when generating images?

ChatGPT4: “When generating images, the process aims to be as unbiased as possible, but it’s important to recognise that complete impartiality is challenging to achieve in any AI system. The images produced are based on a dataset of images and captions that the model has been trained on. This dataset influences the model’s understanding and representation of the world, potentially introducing biases based on the content and perspectives it has been exposed to. Efforts are made to minimise these biases and ensure fair representation across different demographics, cultures, and contexts. This includes guidelines and policies to prevent the generation of harmful or biased content, and the continuous improvement of the model to better understand and represent the diversity of the world”.

In reality, how ChatGPT perceives the humanitarian sector is wrong. It is largely incapable of eliminating its biases because it is simply not built to do so. The user can try to shape better prompts but is bound to fail: the biases are too deeply rooted in the training datasets.

Before going to the scary part, let’s clarify that ChatGPT relies on DALL-E to generate images. The output therefore reflects a combination of the biases and worldviews in both ChatGPT’s and DALL-E’s training datasets:

  1. User prompts ChatGPT (biases in user prompt + ChatGPT’s interpretation)
  2. ChatGPT generates a new prompt based on its interpretation of the user prompt and feeds it to DALL-E (biases in ChatGPT’s new prompt)
  3. DALL-E generates an image and sends it back to ChatGPT (biases in DALL-E’s interpretation)
  4. ChatGPT shares it with the user (biases in user interpretation)
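
ChatGPT’s internal hand-off to DALL-E is not public, but the same two-step pattern can be reproduced with the OpenAI API. The sketch below is only an illustration of steps 1–4 under that assumption; the model names, the system instruction, and the example request are ours, not what ChatGPT actually uses.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Steps 1-2: a chat model rewrites the user's request as an image prompt,
# adding its own interpretation (and biases) along the way.
chat = client.chat.completions.create(
    model="gpt-4o",  # assumption: any chat-capable model illustrates the point
    messages=[
        {"role": "system",
         "content": "Rewrite the user's request as a detailed image-generation prompt."},
        {"role": "user", "content": "Draw an image of a humanitarian worker"},
    ],
)
image_prompt = chat.choices[0].message.content

# Step 3: DALL-E sees the rewritten prompt, not the user's original words.
image = client.images.generate(model="dall-e-3", prompt=image_prompt, n=1, size="1024x1024")

# Step 4: the result is handed back to the user, who interprets it in turn.
print(image_prompt)
print(image.data[0].url)
```

Comparing `image_prompt` with the original request is a quick way to see where step 2 adds detail the user never asked for.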

Let’s now start our short experiment. Unless mentioned otherwise, each prompt was run in a separate ChatGPT chat.

Prompt: Draw an image of a humanitarian worker

Example of ChatGPT’s interpretation, prompted to DALL-E: “An image of a humanitarian worker in action, capturing the spirit of compassion and dedication. The worker is dressed in practical, yet identifiable uniform (…)”.

Partial description of the outputs: Apparently a humanitarian worker is a white guy with a beard (most of the time). He is a deep thinker, on a mission: a stylish and fit person who wears a beige vest and gloves. He spares no effort lifting supplies, and takes notes to help people wearing scarves and other headwear, spreading his charm, kindness, and determination around. The images systematically depict the humanitarian as larger than the other characters, in a position of physical superiority.

So, let’s help ChatGPT overcome one of the (many) obvious biases in the images above.

Prompt: Draw an image of a female humanitarian worker

Example of ChatGPT’s interpretation, prompted to DALL-E: “An image of a humanitarian woman standing amidst a crisis situation, embodying strength, compassion, and resilience. She is depicted in the field, wearing practical clothes (…)”

Partial description of the outputs: Apparently a female humanitarian is… Angelina Jolie (as spotted by a colleague). It is actually a copy/paste of the above: a stylish and fit person, on a mission, who wears a beige vest. She has long brown hair and light eyes, and gives a lot (even a rose, if that helps), her main contribution being her kindness and determination. Not to mention that she, too, is systematically depicted as larger than the other characters, in a position of physical superiority. Yet she sometimes displays cultural awareness by wearing local headwear.

Let’s now see what ChatGPT produces when generally pointing out its biases.

Prompt: Draw an unbiased image of a humanitarian

First iteration.

ChatGPT’s interpretation, prompted to DALL-E: “An image of a person engaged in humanitarian work, helping people in need. The setting is a disaster relief camp, where the humanitarian is distributing food and supplies to a diverse group of individuals (…)”

Partial description of the output: ChatGPT/DALL-E added “diversity” to the physical features of the recipients of humanitarian services (!). Essentially, the humanitarian is the same stereotype we had before; the power dynamics, attitudes, and contextual elements remain unchanged. For the sake of intellectual honesty, let’s mention that other attempts have yielded slightly more “diverse” results, that is, the same scene but with a humanitarian who shares the physical features of the other people in the picture.

Takeaways

So, what to take away from this scary experiment on ChatGPT’s perception of humanitarian workers:

  • Gen AI tools build content based on what they learn from their training data. Consequently, outputs are representations of the worldview of those who created the content used as training data (the toy sketch after this list makes this mechanism concrete).
  • Questioning the content that Gen AI produces and its training datasets is ultimately about questioning the systemic issues and biases that they surface (cf. article by the US National Institute of Standards and Technology), and our own biases, as users.
  • Generative AI chatbots that generate visual outputs give us a wonderful opportunity to explore these worldviews, which may be harder to spot in other AI applications… and to challenge how we use these tools. It is of foremost importance to be aware of these biases, so, why not give it a try yourself?
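
To make the first takeaway concrete, here is a deliberately naive toy sketch: the captions and proportions are invented for illustration only, but they show why a generator can only echo the skew of whatever it was trained on.

```python
import random
from collections import Counter

# Invented toy "dataset" of image captions -- the 80/15/5 split is made up,
# purely to illustrate the mechanism described in the takeaways above.
training_captions = (
    ["white male aid worker in a beige vest"] * 80
    + ["female aid worker distributing supplies"] * 15
    + ["local volunteer coordinating a relief effort"] * 5
)

def generate(n_samples: int) -> Counter:
    """A 'generator' that can only resample what it has already seen."""
    return Counter(random.choices(training_captions, k=n_samples))

print(generate(1_000))
# The skew of the training captions reappears, almost unchanged, in the outputs.
```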

A final iteration of the last prompt (in the same chat), putting more emphasis on the need to avoid biases:

ChatGPT’s interpretation, prompted to DALL-E:  “A person of indeterminate age, gender, and ethnicity, wearing neutral, non-descript clothing stands in the midst of a diverse group of people. They are handing out basic necessities like water, food, and blankets to individuals in need. (…)”

Partial description of the output: No comment.

Any comments?

You may be interested in this article about Generative AI in the IFRC network.

2 Comments

  1. Nyeko Alfred Adolfor

    Well done, team, for the wonderful work done in the service of humanity.

  2. Valentino Alessi

    Although AI is based on algorithms and data, its training and development are influenced by the people who create it. If historical data contains biases, the AI will reflect them as well. This can lead to discriminatory outcomes and perpetuate stereotypes.
    With AI and other technological tools, it is important to be aware that when we use these platforms we are feeding the information that will be provided to users.
    Postscript: good article, thank you!
