The promise of artificial intelligence is often framed in sweeping, universalist terms. AI is said to be aligned with "human values", capable of "human-like reasoning", and increasingly endowed with something approaching "common sense". These claims, while rhetorically powerful, conceal a fundamental ambiguity. Which humans? Whose values? And perhaps most vexingly, what sort of common sense? These questions, long familiar to philosophers and anthropologists, are now unavoidable for computer scientists, ethicists, and policymakers. To treat "humanity" as a singular and undifferentiated reference point is to risk mistaking one part of humanity for the whole.
A review from the Department of Human Evolutionary Biology at Harvard University, titled Which Humans?, demonstrates that when AI models are tested against psychological and cultural benchmarks, their responses align most closely with Western, Educated, Industrialised, Rich, and Democratic (WEIRD) populations. Meanwhile, a more recent paper, Reinforcement Learning from Human Feedback in LLMs, argues that the very process by which these models are fine-tuned, Reinforcement Learning from Human Feedback (RLHF), is structurally biased. Both studies converge on the same concern: that a supposedly "universal" technology reflects only a narrow slice of humanity.
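To make the kind of benchmark comparison described above concrete, here is a minimal sketch, not the method of either paper: it ranks countries by how closely a model's answers to cross-cultural survey items track each population's average responses. The survey items, country-level means, and model scores below are hypothetical placeholders, invented purely for illustration.

```python
# A minimal, hypothetical sketch of comparing a model's survey answers with
# country-level human averages. All data here is invented for illustration;
# it is not drawn from the studies discussed in the text.

import numpy as np

def cultural_distance(model_scores: np.ndarray, country_scores: np.ndarray) -> float:
    """Euclidean distance between the model's item scores and a country's mean scores."""
    return float(np.linalg.norm(model_scores - country_scores))

# Hypothetical survey items, each scored on a 1-10 scale.
items = ["importance_of_tradition", "trust_in_strangers", "individual_autonomy"]

# Hypothetical country-level human averages on the same items.
country_means = {
    "United States": np.array([5.1, 6.2, 8.0]),
    "Japan":         np.array([6.0, 5.5, 6.1]),
    "Nigeria":       np.array([8.2, 4.1, 5.3]),
}

# Hypothetical model answers to the same items (in practice, elicited by prompting).
model_scores = np.array([5.3, 6.0, 7.8])

# Rank countries by similarity: the smaller the distance, the more the model's
# responses resemble that population's averages.
ranking = sorted(country_means.items(),
                 key=lambda kv: cultural_distance(model_scores, kv[1]))

for country, scores in ranking:
    print(f"{country}: distance = {cultural_distance(model_scores, scores):.2f}")
```

On invented numbers like these, the model's answers would sit closest to the United States, which is the shape of the finding the Which Humans? review reports against real cross-cultural data.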