Commissioner Julie Brill

“Big Data and Consumer Privacy:

Identifying Challenges, Finding Solutions”

Address at the Woodrow Wilson School of

Public and International Affairs

Princeton University

February 20, 2014

Thank you, Ed, for that kind introduction and for inviting me to speak today. It is always a pleasure to come home to Princeton, and today is no exception. Princeton, and in particular the Woodrow Wilson School, cultivates leaders of all types and in many fields, including those that have helped fuel our global technological revolution. As a lifelong consumer protection advocate, I have spent a lot of time focusing on the privacy implications of emerging technologies. Today, I would like to focus on one of the fastest growing and most promising areas in our technological revolution – big data analytics – and its effect on consumer privacy.

Technology is transforming our lives. Its enormous benefits have become part of our daily routine. Tripadvisor plans our travel. Google Now keeps us on schedule. Birthdays are celebrated on Facebook. And our newborns’ first pictures appear on Instagram.

But these now-familiar services are just the beginning of our connected future. Our cars are computers with wheels, wearable medical devices notify others when we are ill, and our connected refrigerator will soon tell us that we’ve had a sufficient amount of beer for one night.

These transformative online and mobile experiences collectively yield an enormous amount of data about us.

Technology used by others reaps even more data every minute we walk the street, park our cars, or enter a building. When we go outside, CCTV and security cameras capture our movements. Some retailers use video surveillance, facial recognition, and cell phone signals to track customers’ in-store movements. 1 And every time we go online or use a smartphone or credit card, our purchases and movements are tracked.

In a real sense, we are becoming the sum of our digital parts.

The estimates of the data we collectively generate are staggering. One estimate, already more than two years out of date, suggests that 1.8 trillion gigabytes of data were created in the year 2011 alone – that’s the equivalent of every U.S. citizen writing three tweets per minute for almost 27,000 years. 2 Ninety percent of the world’s data, from the beginning of time until now, has been generated over the past two years, 3 and it is estimated that that total will double every two years from now on. 4

This gold mine of data can be put to important, even transformative uses. We are all familiar with big data’s ability to personalize our daily activities – helping companies determine which ads to pitch to us, which newspaper articles to recommend, and which movie should be next in our queue. But big data analytics aims for loftier goals. It promises to bring us more profound benefits by addressing important societal issues like keeping kids in high school; 5 conserving our natural resources by making our use of electricity more efficient; 6 providing first responders in crisis situations with real-time information about the injured or those who lack power, water, or food; 7 and performing other miracles in the health care sector. Indeed, the opportunities big data analytics may provide in the field of medicine are staggering: prevention of infections in premature children, 8 mobile apps that distribute information to clinicians about bacteria types and resistance patterns in relevant communities, 9 and the development of preventive programs that anticipate a person’s health status. 10

We are all eager to reap the potential benefits of big data. Yet consumers, policy makers, and academics also see threats from these vast storehouses of data. Most of us have been loath to examine too closely the price we pay by forfeiting control of our personal data in exchange for the convenience, ease of communication, and fun in a free-ranging and mostly free cyberspace.

See Lisa Wirthman, What Your Cellphone Is Telling Retailers About You, FORBES EMCVOICE, Dec. 16, 2013, available at http://www.forbes.com/sites/emc/2013/12/16/what-your-cellphone-is-telling-retailers-about-you/.

Lucas Mearian, World’s data will grow by 50X in next decade, IDC study predicts, COMPUTERWORLD, June 28, 2011, available at http://www.computerworld.com/s/article/9217988/World_s_data_will_grow_by_50X_in_next_decade_IDC_study_predicts?pageNumber=1.

Science News, Big Data, for Better or Worse: 90% of World’s Data Generated over Last Two Years, SCIENCE DAILY, May 22, 2013, available at http://www.sciencedaily.com/releases/2013/05/130522085217.htm.

Steve Lohr, The Age of Big Data, N.Y. TIMES, Feb. 11, 2012, available at http://www.nytimes.com/2012/02/12/sunday-review/big-datas-impact-in-the-world.html?pagewanted=all&_r=0.

Centre for Information Policy Leadership, Big Data and Analytics: Seeking Foundations for Effective Privacy Guidance, at 6-7 (Feb. 2013), available at http://www.hunton.com/files/Uploads/Documents/News_files/Big_Data_and_Analytics_February_2013.pdf (discussing efforts to reduce the high school drop-out rate using student record analysis in Mobile County, Alabama).

See Omer Tene & Jules Polonetsky, Big Data for All: Privacy and User Control in the Age of Analytics, 11 NW. J. TECH. & INTELL. PROP. 239 (2013).

See Lisa Wirthman, How First Responders Are Using Big Data To Save Lives, FORBES EMCVOICE, Jan. 10, 2014, available at http://www.forbes.com/sites/emc/2014/01/10/how-first-responders-are-using-big-data-to-save-lives/#.

Brian Proffitt, Big Data Analytics May Detect Infections Before Clinicians, ITWORLD, Apr. 12, 2012, available at http://www.itworld.com/big-datahadoop/267396/big-data-analytics-may-detect-infection-clinicians.

See Sue Poremba, Can Big Data And Mobile Make Health Care More Effective?, FORBES EMCVOICE, Jan. 22, 2014, available at http://www.forbes.com/sites/emc/2014/01/22/can-big-data-and-mobile-make-health-care-moreeffective/ (discussing the use of big data to predict individuals’ health status and guide preventative health programs).

See id.

This examination is becoming all the more urgent as phones, cars, and other everyday objects join the Internet of Things. Again, the potential benefits may be profound. Medical wearable devices—such as Google’s contact lenses that help diabetics track glucose levels in their tears 11—have the potential to affect millions of people suffering from a wide range of health conditions. But “smart” devices are about to become always-on sources of deeply personal information. This will be a big shift for consumers. Instead of having a handful of devices – a smartphone, tablet and laptop – that mainly serve to connect consumers to the Internet, consumers may have many devices that they buy for one purpose – making coffee, storing food, driving to work – but that collect and use a vast amount of personal information about them. Whether it is a connected car, home appliance, or wearable device, the data that these connected devices generate could be higher in accuracy, quantity, and sensitivity, and – if combined with other online and offline data – could have the potential to create alarmingly personal consumer profiles.

Will consumers know that connected devices are capable of tracking them in new ways, especially when many of these devices have no user interface? Will companies that for decades have manufactured appliances and other “dumb” devices take the steps necessary to keep secure the vast amounts of personal information that their newly smart devices will generate? And how will the new data from all of these connected devices flow into the huge constellation of personal data that already exists about each of us? 12

These questions echo the ones that have long surrounded the vast amount of data collection and profiling performed by ad networks, data brokers, and other entities that consumers generally know nothing about because they are not consumer facing. In some instances, these entities track consumers’ online behavior. In other instances, these entities merge vast amounts of online and offline information about individuals, turn this information into profiles, and market this information for purposes that may fall outside of the scope of our current regulatory regime.

As we further examine the privacy implications of big data analytics, I believe one of the most troubling practices that we need to address is the collection and use of data — whether generated online or offline — to make sensitive predictions about consumers, such as those involving their sexual orientation, health conditions, financial condition, and race.

Let’s look at a well-known, and by now infamous, example. Before Target made news for a data security breach that may involve 110 million consumers’ credit cards and debit cards, the company received a lot of attention for its big-data-driven campaign to identify pregnant customers through an analysis of consumers’ purchases at its stores, a so-called “pregnancy predictor score.” 13 Target was able to calculate, not only whether a consumer was pregnant, but also when her baby was due. 14 It used the information to win the expectant mom’s loyalty by offering coupons tailored to her stage of pregnancy. 15

To be clear, I don’t have any information indicating that Target sold its pregnancy predictor score or lists of pregnant customers to third parties, or that doing so would have violated the law. Yet we can easily imagine a company that could develop algorithms that will predict other health conditions – diabetes, cancer, mental illness – based on store purchases and other seemingly innocuous activities, and sell that information to marketers and others.

See Brian Fung, Yes, Google Glass Users Look Weird. But Google’s Smart Contact Lens Will Change All That, WASH. POST, Jan. 17, 2014, available at http://www.washingtonpost.com/blogs/the-switch/wp/2014/01/17/yes-google-glass-users-look-weird-but-googles-smart-contact-lens-will-change-all-that/.

See Julie Brill, Op-Ed., From Regulators, Guidance and Enforcement (contribution to Room for Debate: Privacy, When Your Shoes Track Every Step), N.Y. TIMES, Sept. 8, 2013, available at http://www.nytimes.com/roomfordebate/2013/09/08/privacy-and-the-internet-of-things/regulators-must-guide-theinternet-of-things.

And actually, you don’t have to imagine it. The U.S. Government Accountability Office (GAO) reports that one data broker includes in its consumer profiles information about 28 or more specific diseases, including cancer, diabetes, clinical depression, and prostate problems. 16 The U.S. Senate Commerce Committee staff describes another data broker that keeps 75,000 data elements about consumers in its system, including the use of yeast infection products, laxatives, and OB/GYN services, among other health-related data. 17 And the Wall Street Journal recently informed us about a company that analyzes innocuous data from social media and the like to predict disease conditions like diabetes, obesity, and arthritis in order to persuade particular consumers to join medical trials. 18

All of this creation, analysis and use of consumers’ health information is happening outside of HIPAA – outside the U.S. regulatory regime designed to protect health information.

Another troubling practice that we need to address is the creation and sale of profiles to identify financially vulnerable consumers. A number of the consumer lists that data brokers sell carry such titles as “Rural and Barely Making It,” “Ethnic Second-City Strugglers,” “Tough

See Charles Duhigg, How Companies Learn Your Secrets, N.Y. TIMES, Feb. 16, 2012, available at http://www.washingtonpost.com/blogs/the-switch/wp/2014/01/17/yes-google-glass-users-look-weird-but-googles-smart-contact-lens-will-change-all-that/.

See id.

See id.




See STAFF OF S. COMM. ON COMMERCE, SCIENCE, AND TRANSP., 113TH CONG., A REVIEW OF THE DATA BROKER INDUSTRY: COLLECTION, USE, AND SALE OF CONSUMER DATA FOR MARKETING PURPOSES 12, 14 (2013) (citing documentary submission from Equifax and listing health care-related data elements that Equifax maintains) [hereinafter “DATA BROKER REPORT”].

