The cost of fairness in location-based ads

13 Mar

Hi!  Below is the article that I submitted as part of my application for the AAAS Mass Media Fellowship.  My friend Chris Riederer helpfully sent me a short paper that he’d written with his adviser, and I very lightly dive into it.  I’ll write a quick summary of the paper first.

They apply an algorithm that other researchers had proved exists (one that maximizes revenue while guaranteeing individual fairness) to real-world data, to figure out what maximizing revenue costs in terms of group fairness.  Individual fairness means that similar users see similar ads.  They use probability distributions to represent the likelihood that users will see certain ads, so if two users are similar, their probability distributions will also be similar.  Group fairness means that the expected ad distributions of random users drawn from your two groups will be very close.  In both individual and group fairness, we’ve implicitly been using a choice of metric to compare these distributions; in the theoretical algorithm, that metric is the earth mover’s metric.
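(To make that concrete, here’s a tiny Python sketch of my own, not code from their paper: it compares two users’ ad distributions with SciPy’s earth mover’s distance, treating the ads as points on a line for simplicity.)

```python
# Illustrative sketch (not from the paper): individual fairness says that
# if two users are similar, their ad distributions should be close under
# the earth mover's (Wasserstein) metric.
from scipy.stats import wasserstein_distance

ads = [0, 1, 2]  # hypothetical ad indices: coffee, fashion, travel

# Probability that each user is shown each ad.
user_a = [0.7, 0.2, 0.1]
user_b = [0.6, 0.3, 0.1]

# 1-D earth mover's distance between the two ad distributions.
emd = wasserstein_distance(ads, ads, u_weights=user_a, v_weights=user_b)
print(f"earth mover's distance: {emd:.3f}")  # small => nearly the same ads shown
```

A small distance means the two users are being shown nearly the same ads, which is exactly what individual fairness asks for when the users themselves are similar.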

Jeremy Kun just blogged a nice explanation of earth mover’s distance.  Once they had that, and a ton of data that they trawled from Instagram, they compared fairness between groups depending on how precisely they’d recorded locations.  For instance, my current latitude and longitude are (35.212294, -80.817132).  If they looked at users at those coordinates, they’d see the 11 other people in this coffeeshop with me.  But if they truncated the coordinates to (35.21, -80.81), they’d see the millions of people around Charlotte, NC.  If they targeted ads for this coffeeshop just to the 11 of us, we’d definitely click on those ads.  It’s a coincidence that the people here right now are reasonably diverse in gender and race.  But if we were all white women, you’d see a difference between the people you didn’t target (everyone outside the shop, which includes non-white people and non-women) and the people you did target.  You generally don’t want to be discriminatory in your ads, but you also want to be effective: this coffeeshop doesn’t want to spend money advertising to users in South Carolina.
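Here’s what that truncation looks like in code (my own quick sketch, using the coordinates above):

```python
# Sketch: truncating coordinates trades precision for audience size.
# ~4 decimal places is roughly one building; ~2 decimal places is a city-sized area.
def truncate(lat: float, lon: float, places: int) -> tuple[float, float]:
    """Keep only `places` decimal places of a latitude/longitude pair."""
    factor = 10 ** places
    return (int(lat * factor) / factor, int(lon * factor) / factor)

coffeeshop = (35.212294, -80.817132)
print(truncate(*coffeeshop, places=4))  # (35.2122, -80.8171): this block
print(truncate(*coffeeshop, places=2))  # (35.21, -80.81): much of Charlotte
```

Each decimal place you drop widens the audience by roughly a factor of ten in each direction, and that precision is the knob the paper turns.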

OK here’s the article!  The first news-like article I’ve written since high school.  Also, I interviewed Janice Tsai, a privacy expert at Mozilla, and I really appreciated her generosity with her time as I stumbled through asking her questions.

Sample News Story

When you post on social media, companies can save your location data with different levels of precision, like by venue, by neighborhood, or by zip code. They personalize ads so that people who go to the same places will see the same ads, which increases advertisers’ revenue.  But according to researchers from Columbia University, these location-based ads can lead to racial and gender disparities in how often they appear.

Computer scientists Christopher Riederer and Augustin Chaintreau studied the cost of enforcing fairness in location-based ads.  They applied an algorithm to Instagram data that guarantees that two similar users will see similar ads, and found differences in how often ads were targeted to white and minority users, and to women and men.

“When you do this binning of locations, people who look similar to a human eye will look pretty different,” said Riederer.  “It leaves more room for unfairness.”

The researchers saved hashtags, location data, and URLs of Instagram photos from over 40,000 users; using face recognition software to detect race and a Social Security database to predict gender from first names, they labeled 1,753 users by race and around 20,000 by gender.

Using the latitude and longitude of the posts, they sorted users by the locations they visited at different levels of precision.  Then they used the sorted data to identify users who were more likely to include certain hashtags: #fashion, #travel, and #health.
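(As an aside from me, not the authors: the binning step looks roughly like this sketch, with made-up posts and field names.)

```python
# Sketch of the binning step (hypothetical data; not the authors' code):
# group posts by truncated coordinates, then estimate how often users in
# each location bin use a target hashtag like #fashion.
from collections import defaultdict

posts = [  # (user, lat, lon, hashtags) -- made-up example records
    ("alice", 35.212294, -80.817132, {"fashion", "coffee"}),
    ("bob",   35.212301, -80.817140, {"coffee"}),
    ("carol", 35.291842, -80.743210, {"fashion"}),
]

def bin_key(lat, lon, places):
    factor = 10 ** places
    return (int(lat * factor) / factor, int(lon * factor) / factor)

def hashtag_rate_by_bin(posts, tag, places):
    hits, totals = defaultdict(int), defaultdict(int)
    for user, lat, lon, tags in posts:
        key = bin_key(lat, lon, places)
        totals[key] += 1
        hits[key] += tag in tags
    return {key: hits[key] / totals[key] for key in totals}

print(hashtag_rate_by_bin(posts, "fashion", places=4))  # near venue-level rates
print(hashtag_rate_by_bin(posts, "fashion", places=2))  # neighborhood-level rates
```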

They fed this data to a model advertiser who targeted these users with an ad that generated $2 of revenue, versus a generic ad that raised $1.  The most precise locations made more money: for instance, $1,021 for #fashion users over a baseline of $902.
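(Again my own toy version, not the paper’s code: the $2 and $1 payoffs are from the study, but the users and predictions below are invented, and I’m assuming a targeted ad only pays off when it reaches a genuinely interested user.)

```python
# Toy revenue model: $2 per correctly targeted user, $1 per user who is
# shown the generic ad, $0 for a targeted ad that misses.
def revenue(targeted, interested, n_users):
    hits = len(targeted & interested)
    generic = n_users - len(targeted)
    return 2 * hits + 1 * generic

interested = set(range(0, 1000, 5))      # 200 of 1,000 users really use #fashion

coarse = set(range(0, 1000, 3))          # coarse location data: noisy targeting
precise = set(range(0, 1000, 5))         # precise location data: near-perfect targeting

print(revenue(coarse, interested, 1000))   # 800: worse than generic ads alone
print(revenue(precise, interested, 1000))  # 1200: precision pays
print(revenue(set(), interested, 1000))    # 1000: the generic-only baseline
```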

“This is where marketers say targeting is beneficial because it increases engagement rates,” privacy expert Janice Tsai said. “The question is, what happens next? For the normal person, does that mean lost opportunity, or more ads?”

More precise locations resulted in disparities between racial and gender groups.  Using one decimal place of precision, whites saw an ad 20% more often than minorities, while at four decimal places, that difference jumped to over 80%.  It’s unclear how significant this is: splitting the users into two random groups also resulted in an almost 80% difference.  Still, the race difference was higher than the random difference, which Riederer said is evidence that a disparity can arise from applying theoretical algorithms to the real world.  Further research is needed to pin down the size of that disparity.
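(A sketch of that comparison, with made-up exposure numbers: compute the gap between the two groups, then shuffle the labels to see how big a gap a random split produces on its own.)

```python
# Sketch of the disparity check (made-up numbers, not the study's data):
# measure the gap in targeted-ad exposure between two groups, then measure
# the same gap after shuffling the group labels.
import random

random.seed(0)
n = 1753  # number of race-labeled users in the study

# Hypothetical per-user flags: group membership and whether the targeted ad was shown.
group = [random.random() < 0.5 for _ in range(n)]
saw_ad = [random.random() < (0.06 if g else 0.04) for g in group]

def exposure_gap(labels, saw):
    """Relative difference in targeted-ad exposure rate between the two groups."""
    def rate(flag):
        members = [s for l, s in zip(labels, saw) if l == flag]
        return sum(members) / len(members)
    return rate(True) / rate(False) - 1.0

print(f"group gap:  {exposure_gap(group, saw_ad):+.0%}")

# Baseline: shuffle the labels. With this few exposures, even a random split
# can show a sizable gap, which is why the paper reports that comparison.
shuffled = random.sample(group, k=len(group))
print(f"random gap: {exposure_gap(shuffled, saw_ad):+.0%}")
```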

“Some papers define fairness and show that you can use an algorithm,” Riederer said. “We want to inspire other people to take these solutions and apply them to real data sets.”

As algorithms have become more sophisticated, more instances of inadvertent discrimination have arisen.  In 2017, Facebook accepted “Jew-haters” as an advertising category and Stanford researchers claimed to create an artificially intelligent “gaydar”.  In 2015, Google showed an ad for an executive job position 1,816 times to male profiles and only 311 times to female profiles.

“The question comes back to fairness,” Tsai said.  “Maybe the women would’ve clicked if they had a chance to see this ad more.”

While the study weighs the costs and benefits of enforcing fairness for advertisers, the public must also consider the price of location-based advertising.

“People want to use the internet and use these things that don’t cost them any dollars, and in return their information is collected,” Riederer said.  “It seems like a reasonable tradeoff, but what is the cost of that going to be? If there’s something out there that scrapes data about me, and now I can’t get a loan, or health care, or bail, that’s a bigger concern.”

Understanding how bias creeps into algorithms, like through levels of precision in location data, is key to preventing it.  There are no regulations requiring that ads be shown equally to different groups of people.  Only the threat of bad publicity encourages companies to fight bias.

“The shame, or people being mad at things, used to last much longer,” Tsai said.  “Our attention span is so short now that companies realize if they wait two days, something else will sweep the nation.”

Tsai suggests that companies proactively fight bias, which will give them positive publicity and perhaps keep them ahead of regulators.

“The best thing to do is to have some allocation for random ads,” Tsai said. “So you might see an ad for being a blacksmith or a CEO even if it’s not optimal for your job search.”
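(Her suggestion amounts to an exploration budget. A minimal sketch of my own, assuming a hypothetical ad inventory and a 10% random allocation:)

```python
# Sketch of Tsai's suggestion (my illustration, not from the interview):
# reserve a small fraction of impressions for uniformly random ads, so
# every user has some chance of seeing every ad regardless of targeting.
import random

ADS = ["coffeeshop", "blacksmith", "CEO job", "fashion"]  # hypothetical inventory
EPSILON = 0.1  # fraction of impressions served at random

def choose_ad(targeted_ad: str) -> str:
    """Serve the targeted ad most of the time, a uniformly random ad otherwise."""
    if random.random() < EPSILON:
        return random.choice(ADS)
    return targeted_ad

random.seed(0)
served = [choose_ad("coffeeshop") for _ in range(1000)]
print({ad: served.count(ad) for ad in ADS})  # mostly "coffeeshop", a few of each
```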
