Protecting the creepy line

Challenges in data security and privacy for next-generation data innovators

We are living in the age of data innovation.

Everywhere around us, companies are starting to understand that the data that they collect, generate, store and infer about their customers has value.

From our user account details and personally identifiable information (PII) to data about how we use services and interact in our daily lives, this data paints a rich portrait of us as individuals. These portraits are worth a lot of money to organisations creating targeted, intelligent or customised services.

These are times where Big Data, augmented reality and artificial intelligence are no longer the realm of science fiction but are emerging in the business and consumer spaces.

From the songs we hear on Spotify to the way we find romance, products and services, the recommendations and information we receive are generated just for us.

Data has become a material asset to an organisation. Spotify’s recent acquisition of Niland only reinforces this.

Are our lives richer for these experiences or are we risking something much greater?

There are some undeniable benefits to this aggregation of data.

We have known for a very long time that the more data points we have when making decisions, the more accurate those decisions tend to be. It is therefore inevitable that, as organisations strive to provide more sophisticated and autonomous products and services, we will want as rich a data set feeding them as we can get.

In the fields of intelligent automation (such as smart pacemakers and self-driving cars), we are comforted by the fact that there is a lot of data going into each decision that is made. If the future of medicine is an AI-based doctor then we’d all hope that it had access to all the information it needs to provide us care.

The trouble is, the applications of this data go far beyond medical technology and the latest Tesla.

Every system or organisation can use this data, indiscriminately, to build whatever system they can imagine.

Can we protect this data and its owners?

There are some genuine, legitimate concerns surrounding this freedom.

  • Do consumers genuinely understand what their data will be used for in a system?

  • How much control do users have over their data, and are they aware of their rights and able to act upon them?

  • What controls are we enforcing around data sharing and transfer, particularly in case of acquisition, merger or change of market direction?

  • Are organisations able to (willingly or not) discriminate against groups based on this data set and how can we validate and intervene if this is happening?

Is this risk creating a creepy line?

The potential for harm here is huge.

The phrase “the creepy line” is attributed to the early days of Google, when the potential for misuse of the Google data set was being discussed. It was the line that Google would not cross lest the impact on its users be too damaging or upsetting. The company hoped to get right up to that line, but not cross it.

It’s fair to say that Google’s original creepy line is now a distant memory, replaced by something much more global, much more personal and in some cases far more scary.

How far are we willing to go with our systems and data now before that line is deemed crossed? Are there sufficient laws to guide these decisions and protect the people using these systems?

More than privacy and law

The law in this space is still immature. While privacy law globally is maturing, it is not universal, nor does it account for the new technologies that are emerging.

The General Data Protection Regulation (GDPR) framework launches in Europe in 234 days and is a big step towards the maturity needed to manage data sets of this type with privacy in mind. The privacy world is watching carefully to see how industries react to the new protections, reporting guidelines and fines that will come into force.

Until this framework is in open use, however, we will not know how effective it is (or what measures companies will take to minimise or avoid its impact).

As the privacy world watches and matures, in the security world, we are watching with a mix of curiosity and fascination.

In our field we don’t have such frameworks or measures. There are few statutes and frameworks for security that truly understand the complexity and value of the aggregated and enriched customer data sets of the future.

Protecting the future of data

So what do we need to start considering from a cyber security perspective if we are to meet our data security and privacy obligations without getting in the way of this wave of innovation?

Firstly, the answer is not to say “no” to innovation

The cyber security industry’s risk-avoidance approaches of the past 20 years will not work in this space. When choosing between security and innovation, most organisations will come out in favour of innovation, choosing to accept the risk instead.

This isn’t naivety, it’s survival.

So if we can’t say no then we will have to find ways to support and enable innovators. This will be no easy feat for an industry of professionals that are terrified that the next data breach will cost them their job.

So what are the challenges we will need to address to secure the data sets and innovative systems of the future (and their users)?

Four security challenges when protecting future data innovation

1) Threat assessment of artificial intelligence

Can we understand the logic and pathways of artificially intelligent systems well enough to predict their behaviours and identify any potential risks or signs of malicious activity?

MIT have set up an entire research department around morals and ethics in this space. It should not be long before we see security research in the same field.

2) Data anonymization and protecting the individual

Data anonymization is a hard problem. Can we protect individuals sufficiently when aggregating data for new innovative systems? Can we measure whether we are successful? What are the risks to individuals if we fail in this?

The risk here is serious enough that UK data protection law has specifically outlined rules to address it.
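One way to make “can we measure if we are successful?” concrete is k-anonymity: the size of the smallest group of records that share the same combination of indirectly identifying fields. The sketch below is illustrative only. The record layout, the field names and the generalise_age helper are assumptions made for the example, and k-anonymity on its own is not a guarantee against re-identification.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for every quasi-identifier (0 for an empty data set)."""
    if not records:
        return 0
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values())


def generalise_age(record, bucket=10):
    """Hypothetical helper: replace an exact age with a coarser age band."""
    low = (record["age"] // bucket) * bucket
    return {**record, "age": f"{low}-{low + bucket - 1}"}


# Made-up records, purely for illustration.
records = [
    {"age": 34, "postcode": "AB1", "condition": "asthma"},
    {"age": 36, "postcode": "AB1", "condition": "diabetes"},
    {"age": 52, "postcode": "CD2", "condition": "asthma"},
    {"age": 57, "postcode": "CD2", "condition": "none"},
]

# With exact ages, every person is unique on (age, postcode).
raw_k = k_anonymity(records, ["age", "postcode"])

# After coarsening ages into ten-year bands, each person shares
# their (age band, postcode) combination with one other record.
generalised = [generalise_age(r) for r in records]
generalised_k = k_anonymity(generalised, ["age", "postcode"])

print(raw_k, generalised_k)
```

A low value of k suggests that individuals could be singled out from the aggregated data. A higher value reduces that risk but does not remove it, which is part of why the problem is hard.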

3) Data ownership, transfer and acquisition

Can we consider personally identifiable information salable? How do we protect individuals when selling and transferring data between companies? How much is too much data for one company to hold?

4) Data corruption, tampering and poisoning of artificially intelligent systems

If our AI is only as effective and safe as the data that feeds its algorithms, supply chain security needs to expand to include the data supply chain. Can we protect our systems from data poisoning? Could a malicious individual or group inject data to bias or limit our systems, or to make them unsafe?

If this sounds far-fetched, let’s not forget Microsoft’s failed experiment with AI chat bots.
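As a rough illustration of what a data supply chain control might look like, the sketch below screens a batch of numeric values before they reach a model, using the median and the median absolute deviation (MAD) to flag values that sit far from the rest. The function name, the threshold of 3.5 and the sample values are all assumptions made for the example. A filter like this would only catch crude injection attempts, not carefully crafted poisoning.

```python
from statistics import median


def split_suspicious(values, threshold=3.5):
    """Split values into (kept, rejected) using a modified z-score
    based on the median absolute deviation (MAD)."""
    values = list(values)
    if not values:
        return [], []
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread to measure against, so nothing can be judged an outlier.
        return values, []
    kept, rejected = [], []
    for v in values:
        score = 0.6745 * abs(v - med) / mad
        (rejected if score > threshold else kept).append(v)
    return kept, rejected


# Made-up readings: six plausible values plus two injected extremes.
trusted = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2]
injected = [55.0, 60.0]

kept, rejected = split_suspicious(trusted + injected)
print("kept:", kept)
print("rejected:", rejected)
```

The median and MAD are used here rather than the mean and standard deviation because they are less easily dragged around by the injected values themselves.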

Solutions needed today for tomorrow’s creepy line

Technical innovators globally have a big challenge ahead of them when protecting their users and their data.

It is only by recognising these challenges and addressing them early and openly that we as an industry will be able to meet our obligations and keep people safe online.

While this shouldn’t come at the cost of innovation, we should be mindful that until we have solved some of the challenges in this space, we need to be careful to distinguish between what we can build and what we should.