Mitigating the Data Management in the Cloud Conundrum
Published by Konstantin Polukhin on Thursday, August 23, 2012 20:00
As more companies rely on data to drive strategic decisions and to underpin the development of their core products and services, the value of that data is understandably rising. Yet, despite an awareness that deteriorating data quality almost certainly degrades business processes, many organizations still do not put enough time and effort into keeping their data as timely, accurate and consistent as possible.
The growing use of the cloud is now adding further complexity to the management of data quality - and even fewer organizations are taking this into proper account. According to a recent report by Ventana Research, only 15 percent of organizations have completed a quality initiative for their cloud data, and that number drops to five percent for master data management. It comes as no surprise, then, that less than a quarter of organizations trust their cloud data, while just under half trust data from on-premise applications.
While cloud applications per se don't necessarily pose an immediate danger to data quality, issues are most likely to arise when moving data between the cloud and on-premise systems, or when integrating data between the two. This is primarily because even companies that have instituted data quality management processes are often unable to extend them to data produced by cloud applications.
Ease of use over quality control
Cloud applications are often provided by companies whose business models are based on providing functionality and ease of use rather than quality control. The content (in this instance, the data and its quality), after all, is the customer's concern. While many cloud application providers offer service level agreements (SLAs) that outline their data management practices, the reality remains: when going to the cloud, the owner is essentially surrendering oversight of the data in exchange for flexibility and elasticity.
While cloud applications can provide significant business value, they can also severely complicate data management. For obvious reasons, the more critical to the business the data produced by the cloud application, the more complicated the problem of integrating the cloud data back into an on-premise data store.
Let's consider, for instance, a hypothetical bank that already collects and manages large amounts of customer data, and has made considerable investments in building a reliable master database and ensuring its data quality. What would happen if the bank introduced a cloud-based campaign management and execution platform to automate and enhance its direct marketing? Simply creating a database for such a highly involved function is a serious project in itself, but maintaining the quality of the in-house data will now require ongoing, elaborate integration with the cloud to keep the structure and unique identifiers of the core database intact. As a result, the bank would likely face significant data duplication, serious integration overhead and the associated data quality risks - not to mention considerably more work to keep things running smoothly day-to-day.
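To make the duplication risk concrete, here is a minimal sketch of what reconciling records exported from a cloud campaign platform back into an on-premise master database might look like. All record structures, field names and the matching rule are illustrative assumptions, not a real system: the point is that records lacking a stable master identifier must be flagged for review rather than inserted blindly.

```python
# Hypothetical on-premise master customer database, keyed by a stable
# master id that the bank controls.
master = {
    "CUST-001": {"email": "alice@example.com", "opted_out": None},
    "CUST-002": {"email": "bob@example.com", "opted_out": None},
}

# Records coming back from the cloud platform; some carry the master id,
# some do not (e.g. contacts created directly in the cloud tool).
cloud_records = [
    {"master_id": "CUST-001", "email": "alice@example.com", "opted_out": False},
    {"master_id": None, "email": "bob@example.com", "opted_out": True},
    {"master_id": None, "email": "new.lead@example.com", "opted_out": False},
]

def reconcile(master, cloud_records):
    """Merge cloud records into the master store, preserving master ids.

    Records that cannot be matched by id or email are returned for manual
    review rather than inserted, to avoid silently duplicating customers.
    """
    by_email = {rec["email"]: cid for cid, rec in master.items()}
    unmatched = []
    for rec in cloud_records:
        cid = rec["master_id"] or by_email.get(rec["email"])
        if cid is None:
            unmatched.append(rec)  # potential duplicate: review, don't insert
        else:
            master[cid]["opted_out"] = rec["opted_out"]
    return unmatched

review_queue = reconcile(master, cloud_records)
```

In this toy run, the first two records fold cleanly into the master store, while the third lands in the review queue - the kind of manual exception handling that drives up the day-to-day workload described above.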
What if the bank had the option of keeping the data on-premise where it's governed by its internal data quality and management policies, rather than have it duplicated in the cloud - and yet continue to have access to the business logic in the cloud? If the computing process could be "re-mapped" so the bank could retain control of the data while enabling the cloud application to "borrow" the relevant data for processing as needed (and write the appropriate data back), the business would be able to reap the benefits of the SaaS model while escaping the data quality management problem. Such a solution would effectively extend quality management practices to cloud data, thus eliminating the conundrum.
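The "borrow" model above can be sketched in a few lines. This is a deliberately simplified illustration under assumed names: `cloud_score` stands in for the remote business logic, and the on-premise store remains the sole system of record - only the required fields are lent out, and results are written back under the same identifiers.

```python
# On-premise system of record; "score" is to be filled in by cloud logic.
on_premise = {
    "CUST-001": {"balance": 12000, "score": None},
    "CUST-002": {"balance": 300, "score": None},
}

def cloud_score(records):
    # Placeholder for processing that would actually run in the cloud app.
    return {cid: ("high" if r["balance"] > 1000 else "low")
            for cid, r in records.items()}

def borrow_and_write_back(store, ids):
    # Lend only the fields the cloud logic needs, for the requested ids...
    lent = {cid: {"balance": store[cid]["balance"]} for cid in ids}
    results = cloud_score(lent)
    # ...then write the results back into the on-premise system of record.
    for cid, score in results.items():
        store[cid]["score"] = score

borrow_and_write_back(on_premise, ["CUST-001", "CUST-002"])
```

Because the master data never leaves the bank's governance boundary in duplicated form, the existing quality and management policies continue to apply to it unchanged.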
One may think of it as a "cloud-to-earth" connector that combines reliable communication across an unstable network and a robust queuing mechanism on both ends with a separate data access layer that uses a logical representation of the physical data structures to implement comprehensive mapping between the cloud and the on-premise data.
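The data access layer in such a connector might look something like the following sketch: a logical field map that translates between on-premise column names and the cloud application's field names in both directions, so that neither side needs to know the other's physical structure. The field names here are invented for illustration.

```python
# Logical-to-physical field map: on-premise names -> cloud field names.
FIELD_MAP = {
    "customer_id": "ContactExternalId",
    "email_address": "Email",
    "marketing_opt_out": "HasOptedOutOfEmail",
}
REVERSE_MAP = {v: k for k, v in FIELD_MAP.items()}

def to_cloud(record):
    """Translate an on-premise record into the cloud app's field names."""
    return {FIELD_MAP[k]: v for k, v in record.items() if k in FIELD_MAP}

def to_on_premise(record):
    """Translate a cloud record back into on-premise column names."""
    return {REVERSE_MAP[k]: v for k, v in record.items() if k in REVERSE_MAP}

outbound = to_cloud({"customer_id": "CUST-001",
                     "email_address": "alice@example.com",
                     "marketing_opt_out": False})
inbound = to_on_premise({"ContactExternalId": "CUST-001",
                         "HasOptedOutOfEmail": True})
```

In a real connector this mapping layer would sit behind the queuing mechanism, so that messages surviving a network outage are still translated consistently on arrival at either end.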
This approach is not without its trade-offs: some amount of automation, speed and cost savings may be lost in exchange for data quality. But it also includes an element of "having your cake and eating it too." The connector allows companies to leverage software in the cloud, thus reaping all the benefits of SaaS, without significant changes to their data governance processes; the on-premise data would appear to the cloud apps as though it actually resided in the cloud.
With analysts predicting that growth in cloud-based applications will outstrip that of on-premise applications over the next few years, companies must close the gap between what integrating cloud and on-premise data requires and what their current capabilities deliver. Mitigating these effects with a cloud-to-earth connector is one way of avoiding the costly errors that might otherwise arise from the inconsistencies and inefficiencies that often result when reconciling different sets of data.
Konstantin Polukhin is a managing partner of First Line Software Inc.