
GT NLP M14

Responsible AI: Societal Implications

Can a chatbot harm someone? It can.

  • In March 2023, a Belgian man experiencing severe depression began seeking solace in conversations with a chatbot based on a large language model
  • The chatbot, trained to be agreeable, reinforced his suicidal ideation and deepened his feelings of isolation.
  • Eventually, he took his own life
    • An example of what we call cognitive harm
  • Many large language models learn to be agreeable from internet text, and this tendency can be further reinforced with RLHF (reinforcement learning from human feedback).

Other harms as well:

  • Factual errors and misinformation
  • Prejudicial bias
  • Privacy
  • Worker displacement

Factual Errors and Misinformation

  • Large language models can output wrong information.
    • Wrong authors
    • Wrong details
      • e.g., the wrong number or type of components.

What causes these errors?

  • Language models generate highly probable next tokens
  • There is no constraint that requires the next token to conform to factually correct information in the real world
  • This is known as hallucination or confabulation
  • Sometimes the probability distribution over tokens overrides the correct answer
  • Sometimes the sampler just makes a low-probability choice (illustrated in the sketch after this list)
  • Retrieval-based chatbots do a web search and feed the results into the input context window
    • Mistakes can still occur (from the model)
    • Webpages can also contain false information, so the chatbot can still produce wrong output.
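
To make the sampling point concrete, here is a minimal sketch (not from the original notes) of temperature sampling over a toy next-token distribution; the vocabulary, probabilities, and prompt are invented for illustration. Even when the correct token is the most probable one, the sampler occasionally draws a lower-probability, factually wrong token, and higher temperatures make that more likely.

```python
import numpy as np

# Toy next-token distribution for the prompt "The capital of Australia is ..."
# (vocabulary and probabilities are illustrative, not taken from a real model).
vocab = ["Canberra", "Sydney", "Melbourne", "Perth"]
probs = np.array([0.55, 0.30, 0.10, 0.05])

def sample_next_token(probs, temperature, rng):
    """Temperature sampling: higher temperature flattens the distribution,
    making low-probability (often factually wrong) tokens more likely."""
    scaled = np.exp(np.log(probs) / temperature)
    scaled /= scaled.sum()
    return rng.choice(len(probs), p=scaled)

rng = np.random.default_rng(0)
for temperature in (0.5, 1.0, 1.5):
    draws = [vocab[sample_next_token(probs, temperature, rng)] for _ in range(1000)]
    wrong = sum(tok != "Canberra" for tok in draws) / len(draws)
    print(f"T={temperature}: sampled a wrong answer {wrong:.0%} of the time")
```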

Overall, LLMs can be used to generate:

  • Political Misinformation
  • Conspiracy theories
  • Defamatory text
  • … at scale

Prejudicial Bias and Privacy

Prejudicial bias

  • Prejudicial bias is any implication that one class of person or community is inferior to another based on superficial differences such as skin color, gender, sexual identity, nationality, political stance, etc.
  • Prevalent in human-human communication and therefore prevalent in our natural language text data
  • Models readily pick up these prejudicial biases
  • Perpetuate and reinforce prejudicial thoughts
  • Emotional effects of being dehumanized

Toxicity and Non-normative Language

  • Training language models on internet content means they can learn things we don’t want to see repeated
  • Language models can amplify toxicity
  • Can be subtle: Language models also learn to be agreeable and can agree with and reinforce toxic language
  • Can also describe non-normative behavior

Privacy

  • Language models can leak private information.
  • Sensitive private information can be memorized by large language models
  • Two ways to reduce cross-entropy loss: generalization and memorization (see the formula after this list)
  • Private information may be more easily surfaced with large language models
  • Information removed from the internet may linger in large language models trained on internet data
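
For reference, the per-token cross-entropy loss that language-model training minimizes can be written as below (a standard formulation, not spelled out in the original notes). A model can drive this loss down either by generalizing over patterns in the data or by memorizing specific training sequences verbatim, and it is memorization that puts private text at risk of being regurgitated.

```latex
% Average per-token cross-entropy over a training sequence w_1, ..., w_T,
% where p_theta is the model's predicted next-token distribution.
\mathcal{L}(\theta) = -\frac{1}{T} \sum_{t=1}^{T} \log p_\theta\left(w_t \mid w_{<t}\right)
```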

Worker displacement

  • Many applications of NLP are meant to help knowledge work
  • Large language models are decent at common coding tasks
  • If NLP systems make workers more productive, we need fewer people to do the same amount of work
  • Creative writing vs “rewriting”

Upside:

  • Potential benefits of intelligent systems outweigh the negatives
  • Do the systems we build work toward the enhancement of the human condition?
  • More knowledge
  • Dignity of work
  • Cognitive and emotional health
  • Entertainment
  • Research is not complete
  • May need regulation of applications

This post is licensed under CC BY 4.0 by the author.