Given that Predictive Analytics can help organizations in multiple ways, it is no surprise that organizations are looking for effective ways to build their predictive analytics footprint. However, some enablers are required to ensure efficiency and effectiveness.
Ecosystem (Data and Technology)
First and foremost are the data and the technology environment. Data availability and sufficiency are the most important prerequisites for successful model building. There will be times when the relevant data points are not captured, sufficient history is not available, or the data is not granular enough. While some techniques may help in the short run, data capture and sufficiency need to be addressed for an effective long-term solution. An example that comes to mind is when operational risk needed to be calculated for banking capital requirements (Basel computations). Most banks did not have a mechanism for capturing the risks and associated losses with the required granularity. This led to many workshops that defined risk categories, losses, mechanisms for capturing actual and probable losses, and, most importantly, a system for capturing these data points to set up the ecosystem for effective future modeling.
On the other hand, Big Data presents the opposite problem. When dealing with Big Data, analysts need to quickly sift through voluminous amounts of data to identify the information nuggets. Nowhere are the ramifications of data quality more evident than in predictive modeling. GIGO (garbage in, garbage out) only gets magnified in predictions. While predictive modelers may again employ specific techniques to clean the development data, persistent problems in data quality will lead to model deployment issues in the production environment.
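To make the data sufficiency and quality point concrete, here is a minimal sketch of a readiness check that could be run before any model building starts. It is only an illustration: the function names, the field names (`event_date`, `risk_category`, `loss_amount`) and the thresholds are all assumptions made up for this example, not part of any particular tool.

```python
from datetime import date

def missing_rate(records, field):
    """Share of records where `field` is absent or None."""
    if not records:
        return 1.0
    missing = sum(1 for r in records if r.get(field) is None)
    return missing / len(records)

def months_of_history(records, date_field):
    """Number of distinct calendar months covered by the records."""
    months = {(r[date_field].year, r[date_field].month)
              for r in records if r.get(date_field) is not None}
    return len(months)

def readiness_report(records, required_fields, date_field, min_months, max_missing):
    """List fields that are too sparse and flag history that is too short."""
    issues = []
    for field in required_fields:
        rate = missing_rate(records, field)
        if rate > max_missing:
            issues.append(f"{field}: {rate:.0%} missing")
    covered = months_of_history(records, date_field)
    if covered < min_months:
        issues.append(f"history: {covered} months, need {min_months}")
    return issues

# Hypothetical operational loss events, invented for illustration only.
loss_events = [
    {"event_date": date(2023, 1, 15), "risk_category": "fraud", "loss_amount": 1200.0},
    {"event_date": date(2023, 2, 3), "risk_category": None, "loss_amount": 300.0},
    {"event_date": date(2023, 2, 20), "risk_category": "systems", "loss_amount": None},
    {"event_date": date(2023, 3, 9), "risk_category": "fraud", "loss_amount": 800.0},
]

issues = readiness_report(
    loss_events,
    required_fields=["risk_category", "loss_amount"],
    date_field="event_date",
    min_months=12,     # assumed minimum history for modeling
    max_missing=0.10,  # assumed tolerance for missing values
)
for issue in issues:
    print(issue)
```

A report like this does not fix anything by itself, but it turns "we do not have enough data" into a specific list of gaps that the data capture process can then address.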
Predictive analytics combines business knowledge with statistical techniques and applies them to business data to achieve insights. Advanced analytics resources and analysts are very different from a typical IT/BI developer. A business analytics organization does have roles for Business Analysts, Administrators, Developers, and the like. But the key role is that of the advanced data analyst (a.k.a. statistical modeler/data miner/data scientist), who actually works with the data, exploring patterns and selecting and implementing an appropriate analytic methodology to extract business insight in the form of, say, a model score or customer profile. These resources work at the interface between IT and business – while they are comfortable in both environments, they belong fully to neither. The IDC paper “Take care of your quants” very aptly talks about prioritizing and factoring in the needs of the modeling teams to effectively leverage data and infrastructure.
Predictive models are only useful if the organization acts on the model outputs. While some model outputs may be presented in static reports, most will require integration for effective and timely deployment. Depending on what the model is trying to accomplish, predictive models can be integrated with the following:
1. Operational systems – Predictive Analytics output, such as credit scores or replenishment triggers, has to be integrated into the operational systems to help trigger operational decisions around loan approvals in banks or stock re-ordering in retail stores.
2. Data warehouse – Integration with the data warehouse (DW) achieves faster processing times. Also, some of the complex calculations, such as data transformations, are now best done using in-database features to achieve speed and efficiency.
3. BI reports – BI reports with embedded analytics can provide insights to assist middle managers and senior leadership in decision making. For example, OLAP reports on loan usage and repayment patterns by customer segments and risk score are used by managers for credit policy decisions. Dashboards and scorecards with key metrics and KPIs enable senior management to get a quick handle on the health of the business and the key levers that they can use to set or change business direction.
4. Other Models/Analytics – Sometimes the output of one model has to be integrated with another model. For example, an asset maintenance forecast needs to be integrated with resource optimization models to ensure effective resource planning.
The nature of predictive modeling means that models have to be checked periodically for accuracy and relevance. Models need to be monitored for the impact on performance of changes in the environment, such as customer behavior, the economy, or the weather. Thus, credit scores and credit risk segments will have to be revised as the economy changes, and anti-fraud models need to be updated to capture new types of fraud or refined to reduce false positives.
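One simple way to put this kind of monitoring into practice is to compare the distribution of recent model scores against the distribution seen when the model was built. The sketch below uses the Population Stability Index (PSI), a measure widely used in credit scoring; it is my choice of illustration rather than the only option, and the function name, bin count, and sample scores are assumptions.

```python
import math

def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between baseline and recent scores.

    Bins are equal-width over the baseline range; recent scores that fall
    outside that range are counted in the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all scores are equal

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        # Floor at eps so an empty bin does not cause log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical credit scores: development sample vs. a recent month
# in which every score has drifted 60 points lower.
baseline_scores = list(range(300, 800, 5))
recent_scores = [s - 60 for s in baseline_scores]

print(f"PSI, no drift:   {psi(baseline_scores, baseline_scores):.3f}")
print(f"PSI, with drift: {psi(baseline_scores, recent_scores):.3f}")
```

A commonly cited rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a signal to review the model, but these cut-offs are conventions and should be tuned to the business context.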
In many cases, business analytics teams are small and overworked, and model maintenance is the last of their priorities. In my view, poorly maintained or inaccurate models are worse than a “no predictive modeling” scenario. Inaccurate models give inaccurate results, yet business users are not aware that the models have outlived their usefulness. They have a false sense of security that their decisions are backed by accurate predictive models and reams of data, so they may be lulled into complacency and may not exercise the usual due diligence. As a result, it can be a long time before anybody challenges an erroneous model output in production.
Completing the Feedback Loop
Models too need to learn from their follies! Hence it is important that the decisions taken and the actual observations are recorded, and that there is a mechanism for refining models based on the outcomes. For example, a retail store stock replenishment forecasting model could over-estimate the need for a cold drink based on a temporary spike in sales, say, due to a neighborhood party. But the model should be able to scale down future replenishment estimates once later observations show that this one-time event has no long-term impact on product demand.
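The cold drink example can be sketched in a few lines. The code below is a deliberately simple illustration, not a production forecasting method: it uses exponential smoothing with a cap on how far a single observation can pull the forecast, and it logs every forecast alongside the actual so the errors are available for later review. The function names, the smoothing factor `alpha`, the `spike_ratio` cap, and the sales figures are all assumptions.

```python
def update_forecast(forecast, actual, alpha=0.3, spike_ratio=2.0):
    """Blend the latest actual into the forecast, capping one-off spikes."""
    capped = min(actual, forecast * spike_ratio)
    return forecast + alpha * (capped - forecast)

def run_feedback_loop(initial_forecast, actuals, alpha=0.3, spike_ratio=2.0):
    """Record each forecast against its actual, then refine the forecast."""
    forecast = initial_forecast
    log = []
    for actual in actuals:
        log.append({"forecast": forecast, "actual": actual,
                    "error": actual - forecast})
        forecast = update_forecast(forecast, actual, alpha, spike_ratio)
    return forecast, log

# Hypothetical daily unit sales; the 400 is the neighborhood party.
daily_sales = [100, 102, 98, 400, 101, 99]
final_forecast, history = run_feedback_loop(100.0, daily_sales)

for row in history:
    print(f"forecast={row['forecast']:.1f} actual={row['actual']} "
          f"error={row['error']:.1f}")
print(f"next forecast={final_forecast:.1f}")
```

The forecast rises after the spike, but far less than the spike itself, and it drifts back toward normal demand as ordinary sales days are fed back in. The recorded errors are just as important as the update rule, because they are what lets someone notice when the model is consistently wrong.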
While predictive modeling can be a great tool to empower decision makers at all levels in the organization, it is important that projects in this space have all the key enablers in place to ensure success. Errors in this process can lead to nebulous outcomes and, in extreme cases, can add chaos and mislead the decision makers.