The use of agile development methodologies in the field of Business Intelligence (BI) is very common today; nevertheless, such BI projects fail repeatedly. Although an iterative approach, in which work and development proceed continuously in stages, is the right way, the IT expert should know the success factors of agile BI in advance.
What are these success factors?
- Steadily improving front-end tools. As noted in an earlier post, visualization of Big Data is a key factor in making insights understandable to everyone. Especially when it comes to gaining knowledge during the data exploration phase, these tools let departments operate independently and respond to changing requirements.
But on another level, there are technological developments that enable an agile approach in the field of BI. As more and more data sources become relevant for the business departments, building a data warehouse environment becomes very time-consuming, because connections to the data sources have to be created and data transformation processes have to be developed.
In contrast, self-service BI enables a significantly faster agile approach. Various data sources, from Excel files and databases to Web services, can be connected easily. This “self-provisioning” of data gives business users flexibility in reacting to new requirements. For instance, for an exploratory analysis that re-evaluates machine measurements, the measured values can easily be connected through various export formats. In addition, this new data can be combined with the previously linked data. This virtualization creates a kind of virtual data repository you can work with inside the BI tools.
These options offer a variety of advantages:
- New data sources can be connected, so the business can have a look at them without greater involvement of IT.
- It is not necessary to create time-consuming extraction mechanisms to get the information from the source systems.
- Development can be done in very short iterations, since time-consuming implementations of ETL (extract-transform-load) routines may be omitted in the first step.
- Quick, short iterations deliver results faster and lead to greater acceptance by the business teams.
- Since the fundamental connection of new data sources through self-service BI tools requires little effort, a trial-and-error approach is conceivable. In this way, the requirements of the department can be scrutinized, designed, and specified very quickly and in sufficient depth.
- Data virtualization is another technology approach to support agile BI. New data pools (in addition to traditional databases, also flat files and new Big Data architectures such as a Business Data Lake) can be connected, corresponding analyses created, and new knowledge thus gained quickly. Because in-memory engines are integrated in the BI tools, running these analyses on the combined data sets is also very fast, since the tool can calculate everything in main memory. This speed helps the business react flexibly to changing requirements.
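To make the idea of combining a newly connected source with previously linked data more concrete, here is a minimal sketch in Python. It is not how any particular BI tool works internally: an in-memory SQLite database merely stands in for the tool's in-memory engine, and the table names, column names, and sample values (`machines`, `measurements`, `MACHINE_EXPORT`) are assumptions made up for this illustration.

```python
import csv
import io
import sqlite3

# Hypothetical machine export, as a business user might receive it as a CSV file.
MACHINE_EXPORT = """machine_id,temperature_c
M1,71.5
M2,64.0
M1,73.5
"""


def combine_sources(export_text):
    """Join a freshly loaded CSV export with already linked master data."""
    # An in-memory database stands in for the in-memory engine of a BI tool.
    conn = sqlite3.connect(":memory:")

    # Previously linked data: a small master-data table (illustrative values).
    conn.execute("CREATE TABLE machines (machine_id TEXT PRIMARY KEY, plant TEXT)")
    conn.executemany(
        "INSERT INTO machines VALUES (?, ?)",
        [("M1", "Hamburg"), ("M2", "Munich")],
    )

    # Newly connected source: read the export directly, without a separate ETL job.
    conn.execute("CREATE TABLE measurements (machine_id TEXT, temperature_c REAL)")
    rows = [
        (row["machine_id"], float(row["temperature_c"]))
        for row in csv.DictReader(io.StringIO(export_text))
    ]
    conn.executemany("INSERT INTO measurements VALUES (?, ?)", rows)

    # Combine both sources in a single query: average temperature per plant.
    result = conn.execute(
        """
        SELECT m.plant, AVG(x.temperature_c)
        FROM measurements AS x
        JOIN machines AS m ON m.machine_id = x.machine_id
        GROUP BY m.plant
        ORDER BY m.plant
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for plant, avg_temp in combine_sources(MACHINE_EXPORT):
        print(plant, avg_temp)
```

The point of the sketch is only that the export is read and joined in one short step, which is what makes quick trial-and-error iterations feasible before anyone builds a permanent load process.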
But data virtualization is not only found in BI front-ends; it is found in database management systems as well. Various database vendors offer ways to integrate external data easily through ‘virtual tables’.
IT just has to define the corresponding connections to the external system and then create the ‘virtual tables’. These are just links to tables in another database, as opposed to local tables. In SAP HANA, for example, this is called “Smart Data Access”. This technology makes it possible to connect a wide variety of external database systems, such as Sybase IQ, Teradata or a Hadoop cluster, to a HANA DB.
If the user runs a query on the HANA DB that uses a virtual table, a corresponding SQL statement is generated and sent to the external data source; the result set is returned and combined with the locally available data. This enables the user to work with the HANA DB without worrying about bridging and combining data sets from different sources in the frontend tool. This speeds up the entire process of data acquisition, data preparation, data linkage and information gathering.
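The flow just described (generate a SQL statement, send it to the external source, combine the returned result set with local data) can be sketched in a few lines of Python. This is explicitly not SAP HANA's Smart Data Access API: two in-memory SQLite databases stand in for the local and the external system, and the `VirtualTable` class, the helper functions, and all table names and values are hypothetical and exist only for this illustration.

```python
import sqlite3


def build_remote():
    """Stand-in for an external system that holds the sales data."""
    remote = sqlite3.connect(":memory:")
    remote.execute("CREATE TABLE sales (customer_id INTEGER, amount REAL)")
    remote.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [(1, 100.0), (2, 250.0), (1, 50.0)],
    )
    return remote


def build_local():
    """Stand-in for the local database that holds customer master data."""
    local = sqlite3.connect(":memory:")
    local.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
    local.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    return local


class VirtualTable:
    """Hypothetical link to a table that lives in another database."""

    def __init__(self, remote_conn, remote_table, columns):
        self.remote_conn = remote_conn
        self.remote_table = remote_table
        self.columns = columns

    def fetch(self, where="", params=()):
        # Generate the SQL statement that is sent to the external source.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.remote_table}"
        if where:
            sql += f" WHERE {where}"
        return self.remote_conn.execute(sql, params).fetchall()


def revenue_by_customer(local, virtual_sales, min_amount):
    # Push the filter to the external source so only matching rows come back.
    remote_rows = virtual_sales.fetch("amount >= ?", (min_amount,))

    # Combine the returned result set with the locally available data.
    local.execute("CREATE TEMP TABLE remote_sales (customer_id INTEGER, amount REAL)")
    local.executemany("INSERT INTO remote_sales VALUES (?, ?)", remote_rows)
    result = local.execute(
        """
        SELECT c.name, SUM(s.amount)
        FROM customers AS c
        JOIN remote_sales AS s ON s.customer_id = c.customer_id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    local.execute("DROP TABLE remote_sales")
    return result


if __name__ == "__main__":
    local_db = build_local()
    sales = VirtualTable(build_remote(), "sales", ["customer_id", "amount"])
    print(revenue_by_customer(local_db, sales, 75.0))
```

From the caller's point of view there is only one place to query; where the sales rows physically live is hidden behind the virtual table, which is the property that spares the frontend tool from bridging the sources itself.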
With data virtualization, you can connect new data sources easily without asking IT for complex connections and ETL processes. In the first step, this can be done in the BI tools to give the business the ability to have a quick look at the new data and try to get first insights. Afterwards, when it comes to larger amounts of data and more (and more complex) data sources and structures, this combination has to be made one level below, in the database. And here (e.g. with SAP HANA) you can do the same. The only change: IT makes the connections and configurations, and the business no longer does this in the frontend tool. There, the business users have only one data source: the backend.
To summarize, a key success factor for the agile development of BI projects is data virtualization. These technologies reduce the time-intensive and development-intensive steps of building a data warehouse architecture to a minimum, because in many cases they bypass the ETL part, which is often the most time-consuming job.
But technical factors alone do not determine the success of agile BI. Many factors in the fields of organization and processes are important and decide whether BI projects can be carried out successfully in an agile way or not.
More about that in a later blog post.
Have questions about agile BI? I look forward to a personal dialogue with you.