“The best way to avoid damage from data theft is not to collect sensitive data at all.” (the author)

Do you remember the data hacks at Sony, Ashley Madison and eBay? They were all data leaks, data breaches or data theft, whatever you want to call it. Wikipedia defines a data breach as follows: “A data breach is the intentional or unintentional release of secure information to an un-trusted environment.”

“Secure” here means not only secret, but also that the organization or individual keeping the data is responsible for preventing unintended access or use.

There’s even a webpage dedicated to data breaches, compiled by the Privacy Rights Clearinghouse. I was unpleasantly surprised by the sheer number of breaches this site lists. And it lists only privacy-related breaches; the theft of state or industrial secrets isn’t even included.

Unauthorized access to secure data should be prevented. Other people, organizations and even governments can abuse that data, by using it for purposes the supplier of the data didn’t consent to.

With the breakthrough of the Internet of Things, security issues around the data transmitted and collected are becoming more prominent. IoT devices are expected to collect vast amounts of personal data. This privacy-sensitive information should be transmitted and stored securely. Quite a task ahead!


“The higher the classification of secrecy, the quicker you will report it.” (George P. Shultz)

Most sites that describe the history of data theft and breaches go back only a couple of decades, to the time when we started to collect and store large amounts of data in computer systems.

This rather surprised me. As an ECM expert, I know the phenomenon of data theft has been around since the invention of… well, writing? Data hasn’t always been stored in digital systems. It’s also on documents, or in filing systems. We have all seen stories of espionage in books, movies and real life, about information and misinformation, about copying or photographing secret documents. Even Julius Caesar encrypted his documents to avoid data theft. (He probably called it something else.)

While most definitions of “documents” aren’t really concerned about what’s in the document, for security purposes we need to know the contents. When we regard a document as a container of data, we can see that this data might be sensitive and should be secured. Systems have been developed around classifying the contents of the document, from “public” to “top secret”.
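To make the idea of a classification scale concrete, here is a minimal sketch in Python. Only “public” and “top secret” come from the text above; the intermediate levels, the `Document` class and the `may_read` rule are assumptions for illustration, not a description of any particular standard.

```python
from dataclasses import dataclass
from enum import IntEnum


class Classification(IntEnum):
    # Only PUBLIC and TOP_SECRET are named in the article; the levels
    # in between are assumed for this example.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    SECRET = 3
    TOP_SECRET = 4


@dataclass
class Document:
    """A document seen as a container of data with one classification."""
    title: str
    classification: Classification


def may_read(clearance: Classification, doc: Document) -> bool:
    """Hypothetical rule: a reader may open documents at or below their clearance."""
    return clearance >= doc.classification


menu = Document("Canteen menu", Classification.PUBLIC)
plan = Document("Merger plan", Classification.SECRET)

print(may_read(Classification.INTERNAL, menu))  # True
print(may_read(Classification.INTERNAL, plan))  # False
```

Because the levels are ordered, a single comparison decides access. Real schemes usually add a need-to-know check on top of the level.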

Be aware that the most important security threat comes from people. Not only from individuals with malicious intentions, but also from your own employees and business relations. You won’t be able to keep a secret for long when you share it with over 125 people. So be careful when sharing your confidential information.

Recurring questions

“The first rule in keeping secrets is nothing on paper: paper can be lost or stolen or simply inherited by the wrong people; if you really want to keep something secret, don’t write it down.” (Thomas Powers)

Whenever I design a document management or archival system, I ask questions about the contents that need to be stored and managed. I explicitly ask whether secret or privacy-related documents will be part of the system.

At most Dutch government organizations, secret documents are kept out of the electronic systems. They’re physically stored in safes, outside the realm of the digital network. And rightly so: nobody can guarantee that the document management systems are secure enough to stop every intrusion.

But when I want to classify data stored in other systems, like databases, the owners have difficulty telling me which security levels the data needs to have. In some cases, the data owners don’t even know what’s in the databases.

I know establishing the appropriate classification for individual documents is difficult enough, but at least the awareness and standards are there. For other data, not captured in documents, that awareness still seems far away.

History has proven that classifying and securing documents is needed to protect state and trade secrets. But it has also shown that this comes with cost and effort. Classifying information helps to focus those efforts and investments on the data that really needs protection. Indeed, we should aim to selectively secure only the data that deserves protection: privacy-related data, intellectual property, trade secrets, competitive information…

Opening up data

“It ain’t what you do to data, it’s what you do with it.” (Edd Dumbill)

All the data that isn’t vital for your business, the data you can assume your competitor already has, the data that can be obtained from other sources too and doesn’t contain any really new information: all of it can be classified lower or even declassified. For example: your competitor has already reverse engineered your product or, worse, has already put a copy of your idea on the market. So why keep all your product information confidential?

Sharing by creating open data is (becoming) common in research and government. But in the business world it’s still uncommon. By anonymizing or masking data, you can even create open data from confidential data. A question you could ask: do I really need (to collect) personal data? It takes quite an effort to keep information limited to a selected group of users, because almost everything is vulnerable to hacking and theft. So why not make that information open, so everybody can use it and learn from it? Maybe this is even the only way to avoid misinformation, because everyone can check the information at the source, and not in some copied or hacked variant.
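As a rough illustration of the anonymizing and masking mentioned above, the sketch below turns a confidential customer record into one that is safer to share. The field names, the key and the helper functions are all hypothetical. Note that a keyed hash gives pseudonymization rather than true anonymization: records could still be re-identified by combining the remaining fields, so treat this as a starting point only.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would be kept secret and never published.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a stable keyed hash (HMAC-SHA256, shortened)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def mask_email(email: str) -> str:
    """Keep the first character and the domain, hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain


def open_record(record: dict) -> dict:
    """Build a shareable record: drop the name, pseudonymize the e-mail
    address, and generalize the birth year to a decade."""
    return {
        "customer_id": pseudonymize(record["email"]),
        "birth_decade": record["birth_year"] // 10 * 10,
        "city": record["city"],
        "purchase": record["purchase"],
    }


confidential = {
    "name": "Alice Example",
    "email": "alice@example.com",
    "birth_year": 1984,
    "city": "Utrecht",
    "purchase": "bicycle",
}

print(mask_email(confidential["email"]))  # a***@example.com
print(open_record(confidential))
```

The simplest option remains the one from the question above: fields that are never collected don’t need masking at all.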

Shared information cannot be stolen and doesn’t need additional security. The question is: is the price we pay for securing that data higher than its value? Remember, the real value of data lies in what you do with it, not in keeping it to yourself.

Photo Public Domain via Wikimedia