Module 8: Ethics, IP, and Safety - Resources
Bias and Fairness
- When Machines Discriminate: The Rise of AI Bias Lawsuits - Overview of the Mobley v. Workday class action lawsuit alleging AI hiring tools discriminated based on race, age, and disability
- LLMs Suggest Women Seek Lower Salaries Than Men - Research finding that LLMs recommend lower salaries to women and minority candidates compared to equally qualified white men
- Google Gemini’s Image Generation Issue - Google’s blog post on Gemini generating historically inaccurate racially diverse images in response to historical prompts
- Anthropic’s Constitutional AI - Overview of Anthropic’s approach to alignment using a set of principles to guide model behavior
- Google PAIR “What-If Tool” - Visual tool for probing machine learning models for disparate outcomes across demographic groups
- IBM AI Fairness 360 - Open-source toolkit for detecting and mitigating bias in machine learning models
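Toolkits like the What-If Tool and AI Fairness 360 report group fairness metrics such as disparate impact (the ratio of favorable-outcome rates between an unprivileged and a privileged group). As an illustration only, not the toolkits' actual API, here is a minimal stdlib sketch of that metric on hypothetical hiring data:

```python
from collections import Counter

def disparate_impact(outcomes):
    """Ratio of favorable-outcome rates: unprivileged / privileged.
    A value below 0.8 is the common 'four-fifths rule' red flag."""
    counts = Counter()  # (group, hired) -> count
    for group, hired in outcomes:
        counts[(group, hired)] += 1

    def rate(group):
        favorable = counts[(group, True)]
        total = favorable + counts[(group, False)]
        return favorable / total

    return rate("unprivileged") / rate("privileged")

# Hypothetical data: 4/10 unprivileged vs. 7/10 privileged candidates hired.
data = (
    [("unprivileged", True)] * 4 + [("unprivileged", False)] * 6
    + [("privileged", True)] * 7 + [("privileged", False)] * 3
)
print(round(disparate_impact(data), 3))  # 0.4 / 0.7 ≈ 0.571, below the 0.8 threshold
```

The real toolkits compute this and many related metrics (statistical parity difference, equal opportunity difference) directly from model predictions sliced by demographic group.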
Explainability
- LLMs in Healthcare Diagnostics - Survey of clinicians finding that 68% were uncomfortable relying on LLM-based diagnostic tools whose reasoning they could not trace
- Hallucinating Law: Legal Mistakes by LLMs - Stanford HAI report on the 2023 case of a lawyer submitting ChatGPT-generated legal briefs containing fabricated citations
- AI-Assisted Legal Decision Making in China - Analysis of the Shenzhen Intermediate Court’s use of an AI system to assist judges in deciding cases
- EU AI Act - The European Union’s regulatory framework requiring human oversight for high-risk AI applications
Manipulative Content
- AI Voice Cloning Scams Targeting Seniors - Overview of scams using AI voice cloning to impersonate family members, including a Florida case involving a $15,000 loss
- Arup Deepfake Video Call Scam - Report on the 2024 fraud where deepfake technology was used to impersonate a CFO, resulting in a $25 million transfer
- The Legal Gray Zone of Deepfake Political Speech - Cornell Law Review article examining AI-generated deepfakes targeting public figures across 38 countries in 2024
Labor and Socioeconomic
- Tilly Norwood: SAG-AFTRA Condemns AI Actress - Reuters report on the union’s condemnation of a fully AI-generated virtual actress trained on uncredited human performances
- Has AI Already Caused Some Job Displacement? - Fact-check examining corporate restructuring and layoffs attributed to AI in 2025–2026
- Reimagining AI Labor in the Global South - Brookings Institution report on low-paid outsourced workers doing data labeling and content moderation for AI training
Environmental Impact
- Energy and Policy Considerations for Deep Learning - 2019 Strubell et al. study estimating that training a large transformer model can emit as much CO₂ as five cars over their lifetimes
- Microsoft AI Investments and Water Consumption - Report on Microsoft’s 34% increase in water consumption attributed to AI infrastructure growth
- IEA: Energy Demand from AI - IEA projection that global data center electricity consumption will roughly double to ~945 TWh by 2030
- Intelligence per Watt (Stanford/Google, 2025) - Study finding small local models can accurately answer 88.7% of real-world queries, suggesting frontier models are unnecessary for most tasks
Training Data and Copyright
- New York Times v. OpenAI and Microsoft - Discussion of the Times' copyright infringement lawsuit over the use of its articles in model training
- Getty Images v. Stability AI - Analysis of Getty Images’ lawsuit against Stability AI over training on millions of copyrighted photographs
- Anthropic Legal Issues - Wikipedia overview of Anthropic’s $1.5B copyright settlement related to training on authors’ works
Model Weights as IP
- UK High Court Decision on AI Model Weights - UK High Court ruling that Stable Diffusion’s model weights do not constitute infringing copies of copyrighted training data
- Protecting AI as Trade Secrets - Legal analysis of using trade secret law to protect model weights and training processes
- Anthropic: Detecting and Preventing Distillation Attacks - Anthropic’s announcement identifying industrial-scale model distillation campaigns by Chinese AI companies
- Reddit: Reaction to Anthropic's Distillation Announcement - Community discussion of a perceived double standard: Anthropic objecting to distillation while itself training on copyrighted content
Output Ownership
- Zarya of the Dawn (Wikipedia) - Case study of the U.S. Copyright Office limiting registration for a comic book to its human-authored text and arrangement, excluding the AI-generated images
- Théâtre D’opéra Spatial (Wikipedia) - Wikipedia entry on Jason Allen’s AI-generated artwork that won first place at the Colorado State Fair
- US Copyright Office AI Policy Guidance - Official guidance establishing a spectrum model for AI-generated content copyrightability
Safety: Unintended Model Behavior
- Sydney (Microsoft) Chatbot - Wikipedia overview of Microsoft Bing’s early AI chatbot that displayed obsessive and unpredictable behavior
- Chevrolet Dealership Chatbot Hack - Report on a GPT-powered chatbot that agreed to sell a car for $1 after being manipulated by a user
- Air Canada Chatbot Refund Ruling - Ars Technica report on the tribunal ruling that Air Canada was liable for its chatbot’s fabricated refund policy
Safety: Jailbreaking
- DAN Prompt (Gist) - The “Do Anything Now” jailbreak prompt used to bypass ChatGPT’s content restrictions
- Operation Grandma: LLM Chatbot Vulnerability - CyberArk research on fiction framing attacks that bypass AI safety guardrails using storytelling contexts
- Adversarial Suffix Optimization (NeurIPS 2025) - Research showing that appending optimized nonsense suffixes to prompts can consistently bypass AI safety guardrails
Safety: Systemic and Catastrophic Risk
- From Vibe Coding to Vibe Hacking: Claude Abused for Ransomware - Report on Anthropic’s findings that bad actors exploited Claude for ransomware development and extortion campaigns
- OpenClaw Agent Deletes Inbox - TechCrunch report on an AI agent that autonomously deleted an entire email inbox when given excessive permissions
- Anthropic Statement: Department of War - Anthropic’s public response to Pentagon pressure to remove safety restrictions as a condition of government contracts