Monday, November 25, 2013

Little Data, Big Problem

As a computer scientist, I think about data a lot.

And as someone who is a fairly private person, I'm particularly interested in personal data. Not only my own, but everyone's. I gape at fellow customers at the store who give their phone number and zip code to the cashier without a thought. I am appalled at friends who post private information publicly - photos, geolocation data, their political affiliation, their religion, their "likes". Everything from restaurant check-ins to where they delivered their baby.

I am shocked that people purchase devices that track their physiological data 24/7, data which is automatically uploaded and shared publicly. I am stunned that people voluntarily give samples of their DNA to 23andme.

The shocking thing is when I mention something about this to someone, I receive one of three responses:
1) "I don't care, I have nothing to hide".
2) "Bah. I'm honestly not that interesting."
3) "Well, I know X is evil, but it's just so darn convenient. And anyway, all my friends use X. I can't stop using it now."

Never does someone say, "Wow, FCS, you're right - this data deluge is terrifying! And that anyone with cash can buy all our data willy nilly! Yikes! We should lobby the government to regulate the personal information brokering industry."

Never. Yet one word of the NSA spying snafu and POOF - people freaking out. But I think they're freaking out about the wrong thing.

My security friends talk about threat models. "What's the threat model?" I don't think it's the government. The government is far too monolithic, tech-unsavvy, and sequestered to pull off what we see in the Bourne movies. And there's no Machine, sitting in a warehouse in Iowa continually monitoring, processing, and understanding the content of every phone call and surveillance camera feed. That's NP hard.

The threat model is - we have no clue. Right now, any person with the means can purchase a large lot of your private data. If you use a credit card, cell phone, or ATM, ever, you're toast.

When people say, "I don't care, I have nothing to hide," I want to whack them with a #firstworldproblems foam bat. It's not the #FWP people I'm worried about. It's the most vulnerable of our society: those who are abused, those who are stalked. Those who are bullied. Those who simply are not technologically savvy enough to realize they have hung not only their dirty laundry but their entire existence out on the clothesline.

I know what Scott McNealy said. But it still pains me. I think about data a lot.


  1. I do believe it is not unreasonable to distinguish the personal and the society level. When it comes to my personal data, I do employ a mixture of the three arguments given: I consider it unlikely that someone will construct an in-depth profile of me, unlikely that it would harm me if someone did, and there are various ways of leaking data that are convenient enough to make up for the remaining risk.

    On the other hand, I can definitely imagine stuff going horribly wrong on a society level. But that seems to be a case for better IT solutions and laws, not for a personal boycott.

  2. There are differences between information collected by corporations and information collected by the government. In both cases, the data can be sold to third parties or viewed by potentially nefarious actors (stalking ex-girlfriends, anyone?). Businesses may alter service and prices based on my data, but I usually have other options to obtain said service. The government, however, has much more power. They can: prosecute me, search me, retain my property as "evidence", cause me large legal bills, limit access to bank accounts, deny me grant funding, threaten government funding to firms that hire me, and offer zero recourse for any of these violations. All of these measures (except for the grant funding part, as far as I can tell) have been used against whistle blowers, activists, etc. in both the Bush and Obama administrations. While I trust Facebook about as much as the government (both moderately), government data collection is much more alarming because of the government's ability to inflict harm.

    1. I only see two advantages to corporations doing collection vs. government. First, because the collection is entirely profit-driven, there is a fiscal incentive for companies to sell only some of your data to information brokers, in order to yield as much profit as possible. (Thus, some of your data may remain somewhat secure sometimes.)

      The second is that the data may, possibly, be more decentralized, adding significant delays to acquiring it. So even with a subpoena (if they use those any more), data acquisition could take days/weeks/months, and significant expense and person-power to find all the bits of interest. (So, some privacy through decentralization.)

      From a whistleblower/activism perspective, the government can still do all the things you mention even if it is not the one directly doing the collection. So that's more of a general concern about the law, which is valid.

      My main point is: All of our data needs to be better protected, from all parties, period. Our legal system is still far back in the technological stone ages. None of the lawmakers have any clue of the scope of this problem, and I think in some ways that's our failure as computer scientists in communicating this to them.

    2. In reading your article, I thought: At last! Someone gets it!

      But I respectfully disagree with the thrust of the comment. Consider the recent case of Bank of America going all out to protect its shenanigans from seeing the light of day in its attack against its 'enemies' and sympathizers.

      When WikiLeaks obtained thousands of BoA documents revealing (it's believed) illegal activity in the nation's largest bank, the bank went on the attack. BoA wasn't satisfied with shutting down access to donations via credit cards and PayPal and freezing accounts, quite an intimidating assault in itself. (How many of us depend upon access to our bank funds?)

      BoA returned to its mafia roots. They conspired with government security experts HBGary Federal, Palantir Technologies, Berico Technologies, the mega-law firm Hunton & Williams and possibly the US Chamber of Commerce. (You may remember HBGary Federal's parent company developed the Magenta virus root-kit for lease to select parties.)

      Not satisfied with WikiLeaks, the consortium targeted sympathizers: individuals, mostly reporters, but also 'enemies' of BoA and CoC. The assault failed and came to light thanks to their own hubris and over-reaching.

      I'm an ardent free market proponent, but capitalism isn't all cute Coca-Cola polar bears, Pillsbury Doughboy, and those wonderful Apple products. Some of the world's best known companies (Volkswagen, Mercedes, Bayer, Hugo Boss, Siemens, even IBM) collaborated only decades ago in the worst mass murder and genocide in human history.

      As you say, the legal safeguards are centuries out of date. I agree: our data needs not only to be secured, but restricted, filtered, and available to the ultimate owners of that data, the individuals themselves: us.

  3. This reminds me of how HeLa was sequenced a little while back, and how it can be tied to Ms. Lacks's descendants. Scientists could take these sequences and expose information that no one should have. There are intimate details that corporations and the government could exploit based on some probability that you could have mental issues or criminal behavior, or that companies could use to push you toward elective procedures. Not thinking like a conspiracy theorist, just thinking like someone who prefers their information be given as a gift, not extracted from a giant bolus of data in a cloud.