Big data is the latest hype and turns up in a steady stream of press releases and presentations. As long as we define it as traditional, internal, structured data, it is really a matter of more manipulation and more processing power to analyze what we already have, and more products and services keep arriving to help us do just that. However, the real challenge, and indeed the real value, lies elsewhere: most businesses want to find and use external, unstructured data to achieve breakthroughs in seeing the ‘real’ world through the data of others. The questions are therefore: From which external sources can we get at least reasonably trusted data? How can we store such huge amounts of data? How can we manage access to it all?
September wasn’t a bad month for progress on these issues. The Open Government Partnership held its first meeting in New York, supported by the United Nations, and set out its principles in the following statement: “In a world marked by so much turmoil, we need open government to build trust and to revitalize the social compact between states and citizens. Openness can bring governments and citizens together, cultivate shared understandings, and help solve our practical problems. It starts with sharing information.”
The participating governments, and increasingly local government units as well, are signing up to the ‘open data’ movement: they are making at least a reasonable amount of their trusted data available for use by businesses or in the development of new solutions. Open data, the practice of making your data sets available for others to use without copyright or other hindrance, has already allowed some interesting new services to be introduced. Google Transit Feed is a particularly well-known example: it ties a metropolitan transit authority’s data to Google Maps and feeds a new generation of apps for mobile devices, such as NextBus.
At the root of this is a very serious point about ‘using’ open data in the full sense of the word: being able to find and use new real-time feeds, and also making some of your own data available in the same way to encourage others to make your company more ‘visible’ in the market. Open data needs an application programming interface (API) through which it is accessed. Although that API can be defined and published to suit the open data set at the moment the data is released, it is good policy when developing any ‘service’ to separate the data set behind its own API and have the service consume the data through that API. Think of it as a valuable practice for all app development. For more on how this works in government programs, take a look at the work of Code for America.
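To make the idea of separating a data set behind its own API concrete, here is a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the `Departure` record, the `DepartureApi` contract, the in-memory implementation and the sample transit data. The point it tries to show is simply that the service function never touches the stored rows directly; it only calls the API.

```python
from dataclasses import dataclass
from typing import Iterable, List, Protocol


@dataclass(frozen=True)
class Departure:
    """One row of a hypothetical transit data set."""
    route: str
    stop: str
    minutes_away: int


class DepartureApi(Protocol):
    """The published contract: the only way anyone reads the data set."""

    def departures(self, stop: str) -> Iterable[Departure]:
        ...


class InMemoryDepartureApi:
    """Illustrative implementation; a real one might sit in front of a database or feed."""

    def __init__(self, rows: Iterable[Departure]) -> None:
        self._rows: List[Departure] = list(rows)

    def departures(self, stop: str) -> List[Departure]:
        return [row for row in self._rows if row.stop == stop]


def next_departure_message(api: DepartureApi, stop: str) -> str:
    """The 'service': it consumes data through the API and nothing else."""
    upcoming = sorted(api.departures(stop), key=lambda d: d.minutes_away)
    if not upcoming:
        return f"No departures listed for {stop}"
    first = upcoming[0]
    return f"Route {first.route} leaves {stop} in {first.minutes_away} min"
```

Because the service depends only on the `DepartureApi` contract, the same API could later be published to outside developers as an open data feed without the service having to change.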
The other interesting event was the release by the Storage Networking Industry Association (SNIA) of the Cloud Data Management Interface (CDMI), the standard for providing virtualized storage in the form of Data as a Service (DaaS). In their own words, from the standards pages of their website:
CDMI defines the functional interface that applications will use to create, retrieve, update and delete data elements from the Cloud. As part of this interface the client will be able to discover the capabilities of the cloud storage offering and use this interface to manage containers and the data that is placed in them. In addition, metadata can be set on containers and their contained data elements through this interface. This interface is also used by administrative and management applications to manage containers, accounts, security access and monitoring/billing information, even for storage that is accessible by other protocols. The capabilities of the underlying storage and data services are exposed so that clients can understand the offering.
The whole point of CDMI is to provide a ‘simple’, yet secure and reliable, interface that encourages the use of virtualized storage and enables access to the data held there, which of course is the link back to open data! CDMI works with most types of data but is designed around REST (Representational State Transfer), as one might expect for building the new generation of apps based on the Web architecture with HTML5. CDMI doesn’t just simplify access and use; it also defines a cohesive set of security measures, comprehensive enough that on their own they would justify adopting CDMI.
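As a rough illustration of what a RESTful create, retrieve and delete cycle against CDMI might look like, the Python sketch below only builds request descriptions and sends nothing. The endpoint `cloud.example.com`, the paths and the helper names are hypothetical. The version header, the `application/cdmi-*` media types and the `/cdmi_capabilities/` path follow the CDMI specification as we understand it, so treat them as assumptions and check them against the version your provider actually implements.

```python
import json
from dataclasses import dataclass
from typing import Optional

BASE_URL = "https://cloud.example.com/cdmi"  # hypothetical provider endpoint
CDMI_VERSION = "1.0.2"  # assumption: use the version your provider supports


@dataclass(frozen=True)
class CdmiRequest:
    """A plain description of an HTTP request; any HTTP client could send it."""
    method: str
    url: str
    headers: dict
    body: Optional[str] = None


def _request(method: str, path: str, kind: str, payload: Optional[dict] = None) -> CdmiRequest:
    # CDMI distinguishes resource kinds by media type, e.g. application/cdmi-object.
    media_type = f"application/cdmi-{kind}"
    headers = {"X-CDMI-Specification-Version": CDMI_VERSION, "Accept": media_type}
    body = None
    if payload is not None:
        headers["Content-Type"] = media_type
        body = json.dumps(payload)
    return CdmiRequest(method, BASE_URL + path, headers, body)


def discover_capabilities() -> CdmiRequest:
    """Ask the cloud what its storage offering can do."""
    return _request("GET", "/cdmi_capabilities/", "capability")


def create_container(path: str, metadata: Optional[dict] = None) -> CdmiRequest:
    """Create a container and set metadata on it."""
    return _request("PUT", path, "container", {"metadata": metadata or {}})


def create_object(path: str, value: str, metadata: Optional[dict] = None,
                  mimetype: str = "text/plain") -> CdmiRequest:
    """Create (or update) a data element inside a container."""
    payload = {"mimetype": mimetype, "metadata": metadata or {}, "value": value}
    return _request("PUT", path, "object", payload)


def read_object(path: str) -> CdmiRequest:
    return _request("GET", path, "object")


def delete_object(path: str) -> CdmiRequest:
    return _request("DELETE", path, "object")
```

Keeping the request construction separate from the sending makes it easy to inspect what would go over the wire before pointing the code at a real cloud storage offering.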
So, these are two big moves that make it easier to deploy apps for mobility clients, meaning new-generation capabilities that combine data sources on a Web model and run from clouds. But, as is generally the case in this new environment, new development methods and standards are all-important!