Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are

A data scientist reveals how our private online behavior exposes our true thoughts, feelings, and desires that we hide from the world.

Introduction

"Certain online sources get people to admit things they would not admit anywhere else. They serve as a digital truth serum." Seth Stephens-Davidowitz discovered this while analyzing Google search data: people tell search engines truths they hide from surveys, friends, even themselves.

The book's central finding is uncomfortable: the gap between public statements and private searches is enormous.

Search data reveals higher levels of racism than any survey captures, sexual anxieties nobody admits, and everyday concerns that social media carefully hides. Big Data doesn't just collect more information; it collects different information, the kind people systematically lie about in every traditional research method.

Stephens-Davidowitz covers what this digital truth serum reveals about sex, prejudice, parenting, success factors, and life outcomes. Some findings confirm suspicions; others overturn accepted wisdom.

The methodology has limits: correlation isn't causation, the variables that get measured aren't always the ones that matter, and with enough data even trivial effects become statistically significant.

But the core insight stands: if you want to know what people actually think and do rather than what they claim, watch what they search when nobody's watching.

The Honesty Gap

So. Let's start with the foundation, the uncomfortable truth that makes everything else possible. Every survey we use to understand human behavior is corrupted by the same systematic error. People lie. Not occasionally. Not just about embarrassing things. Constantly, and about everything. Here's the evidence.

1950, Denver. Researchers obtained official records on residents: voting, charitable donations, library cards. Then they surveyed the same people.

The surveys were anonymous, no names attached. People still lied. They claimed to vote more than records showed. They inflated their charitable giving. They overstated library card ownership.

This is the part that matters. These weren't face-to-face interviews where you might lie to seem better. These were anonymous forms where lying served zero purpose. People lied anyway.

Fast forward to University of Maryland graduates, with official academic records compared against survey responses. Only 2 percent admitted to graduating with a GPA below 2.5. In reality, 11 percent had GPAs that low. The gap isn't small. It's more than 5 to 1.
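That ratio is simple arithmetic, and it is worth seeing how it falls out of the two figures quoted above. A minimal Python sketch; the function name `underreporting_ratio` is made up for illustration and does not come from the book:

```python
def underreporting_ratio(actual_pct: float, reported_pct: float) -> float:
    """How many times larger the recorded share is than the self-reported share."""
    if reported_pct <= 0:
        raise ValueError("reported_pct must be positive")
    return actual_pct / reported_pct


# Figures quoted above: 11 percent per official records, 2 percent per the survey.
gpa_gap = underreporting_ratio(actual_pct=11, reported_pct=2)
print(f"Records show {gpa_gap:.1f} times as many low GPAs as the survey does")
```

The same comparison works for any pair of "what records show" and "what people claim" figures, as long as both describe the same group.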

Now, Google searches change the equation completely. When you search for something, you have an actual incentive to be honest. You want accurate results. No interviewer is watching. No social judgment exists. If you think you might be depressed, admitting that to a survey helps nobody. Searching Google for depression symptoms helps you.

This difference shows up everywhere. Surveys say 25 percent of men watch pornography. Search data says Americans search for porn more than weather. One of these numbers is real.

The gay population research proves the point. Surveys suggest 2 to 3 percent of American men are gay, with huge state differences. Rhode Island shows twice the rate of Mississippi. There are two possible explanations. Either gay men migrate to tolerant states, or gay men in intolerant states hide their orientation in surveys.

Facebook mobility data kills the migration theory. Gay men do move from Oklahoma City to San Francisco, but not enough to explain the differences.

High school students can't choose where they live, yet the same pattern appears. Two in one thousand male high school students in Mississippi identify as openly gay on Facebook.

Pornography searches tell the real story. Nationwide, 5 percent of male porn searches are for gay content. Mississippi, 4.8 percent. Rhode Island, 5.2 percent. The geographic variation almost disappears. Sexual orientation distributes evenly. Expression of it does not.

The math confirms it. For every 20-percentage-point increase in support for gay marriage, 1.5 times more men identify as openly gay on Facebook. Extrapolate to full tolerance, and you get 5 percent, the same number as the porn searches. This means millions of American men are closeted, particularly in intolerant regions.
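The extrapolation can be run backwards as a sanity check. The sketch below assumes the 1.5x factor compounds multiplicatively for each 20 points of support, which is one reading of the claim and not necessarily the model used in the book. The function name and the 50 percent support level are hypothetical inputs chosen for illustration.

```python
FULL_TOLERANCE_RATE = 0.05   # 5 percent, the extrapolated figure quoted above
FACTOR_PER_STEP = 1.5        # 1.5 times more openly gay men ...
STEP_POINTS = 20             # ... per 20 percentage points of support


def implied_open_rate(support_pct: float) -> float:
    """Share of men openly gay on Facebook implied at a given support level.

    Assumption: the 1.5x factor compounds, so the rate at 100 percent
    support is divided by 1.5 for every 20 points below 100.
    """
    if not 0 <= support_pct <= 100:
        raise ValueError("support_pct must be between 0 and 100")
    steps_below_full = (100 - support_pct) / STEP_POINTS
    return FULL_TOLERANCE_RATE / FACTOR_PER_STEP ** steps_below_full


# Hypothetical state where half the population supports gay marriage.
rate = implied_open_rate(50)
print(f"Implied openly gay share at 50% support: {rate:.2%}")
```

Under this reading, the difference between 5 percent and the implied rate at a given support level is the share that stays hidden.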

The evidence appears in their wives' searches. "Is my husband gay" ranks as a top completion for "Is my husband." It's 10 percent more common than "cheating" and eight times more common than "alcoholic." These searches concentrate in the least tolerant states.

The AOL data leak from 2006 showed individual search histories. One user, six days of searches.

Gay pornography, then gay test, then gay pornography, then quiz for confused men, then gay pornography again. The cycle repeated. Someone consuming content while desperately hoping to confirm they're not actually gay.

This isn't about morality. It's about measurement. Traditional methods, such as surveys and social media profiles, capture what people are willing to admit. Search data captures what people actually want and think. The gap between the two is the foundation. Everything else builds on recognizing that the gap exists.

Review

So here's the uncomfortable truth: you're more knowable than you think, and more unknown than you'd like. The data doesn't lie—you do. But recognizing that gap? That's where wisdom starts.

Next time you fill out a survey or post something carefully curated, ask yourself: what would my search history say instead? Because somewhere in that difference between your public self and your 3 AM Google queries lives the person you actually are.

And maybe, just maybe, that person deserves more honesty than an algorithm.