First, let me address the elephant in the room and demonstrate some humility. For too many boards and CEOs, technology seems daunting, particularly given the pace of innovation and the myriad acronyms. It all seems so overwhelming. I have certainly found that in my career, and I am not afraid to admit it. And now the stakes seem so very high, with data being the last untapped asset on the balance sheet and quite possibly the key driver of enterprise value going forward. How can non-technical executives and board members understand the strategic importance of data, Ai, and machine learning, particularly as they relate to their businesses and industries? CEOs and boards have to feel comfortable asking these seminal questions and exploring the foundations of what enables a company to drive enterprise value with data and Ai. There is no mystery, and there is no reason to feel inadequate.
Here is what I have learned and what I share with the largest companies in the world. Data is tough, and extracting business value from data is even harder. Data is spread across multiple systems and is often incomplete and inaccurate. And the tsunami of data continues to grow exponentially. We all know about structured data that is neatly organized and searchable in relational databases. But what about the massive growth in unstructured data and, in particular, voice, video, and documents such as PDFs? How do we extract the nuggets of wisdom from all that data and put it into a form to which we can apply machine learning and natural language processing (NLP) to drive consequential value creation quickly?
Simply put, the first step is to build a data catalog – an itemization of the company’s data. If we ask a CEO to list all the locations of their manufacturing facilities, most would have no problem doing so. Same with IP or other critical assets. But ask the same about the data assets, and the answers will not be satisfying. In fact, most companies in America don’t even have a data catalog. Think about that. If data is the last untapped asset on the balance sheet and the key to competitive success, why don’t most companies know what data they have, much less maintain a catalog of it?
To my fellow CEOs and board members, here is my message. Start by cataloging what data you have so that data engineers and data scientists can organize and leverage it in the future.
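For readers who want to see what "an itemization of the company's data" might look like in practice, here is a minimal sketch in Python. It is an illustration only, not a description of any particular product: the class names, the fields (`system`, `owner`, `kind`), and the sample entries are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CatalogEntry:
    """One data asset in the inventory. All fields are illustrative assumptions."""
    name: str
    system: str   # where the data lives, e.g. a CRM or a file share
    owner: str    # the team accountable for the data
    kind: str     # "structured" or "unstructured"
    description: str = ""
    tags: List[str] = field(default_factory=list)


class DataCatalog:
    """A hypothetical in-memory catalog: register assets, then look them up."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        # Refuse duplicates so each asset is listed exactly once.
        if entry.name in self._entries:
            raise ValueError(f"asset already cataloged: {entry.name}")
        self._entries[entry.name] = entry

    def find(self, kind: Optional[str] = None,
             tag: Optional[str] = None) -> List[CatalogEntry]:
        # Return every entry that matches all of the filters that were given.
        results = []
        for entry in self._entries.values():
            if kind is not None and entry.kind != kind:
                continue
            if tag is not None and tag not in entry.tags:
                continue
            results.append(entry)
        return results


# Made-up example assets.
catalog = DataCatalog()
catalog.register(CatalogEntry(
    name="customer_orders", system="ERP", owner="Finance",
    kind="structured", tags=["revenue"]))
catalog.register(CatalogEntry(
    name="call_center_audio", system="Telephony archive", owner="Support",
    kind="unstructured", tags=["voice"]))

unstructured_assets = catalog.find(kind="unstructured")
```

Even a list this simple answers the question most companies cannot: what data do we have, where does it live, and who is responsible for it?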
Once we have cataloged the data assets, management teams and boards must make a frank assessment of the data quality. Is it incomplete, inaccurate, or messy? The answer is yes. It is all of those things. How do I know? Because we work with the biggest banks, retailers, manufacturers, insurers, airlines, and hospitality firms across the globe. And you know what? They all tell us the data is a mess. Everybody has the same problem. They call the ElectrifAi confessional and spill the beans, saying they have spent millions of dollars on integrators, platforms, and cloud providers. Still, the data is a mess. But don’t despair. There is a way to ensure data quality, and that is by creating data pipelines.
What is a data pipeline, and why should boards and CEOs care? The data pipeline is the process through which the enterprise prepares data to be analyzed. It is the process of ingesting data, performing a data quality check, cleaning it, and enriching it as appropriate for the task at hand. Most companies get this foundational data preparation step wrong, and every analysis built on poorly prepared data inherits its flaws. Building data pipelines consistently and at scale is critical because enterprises have a lot of data. It needs to be accessed and prepared for visualization and for more sophisticated tasks such as machine learning. Get the data pipeline wrong, and you have a world of problems. Some enterprises build thousands of data pipelines each year. The importance of this cannot be overstated. So, it is critical to start with these foundational steps (the data catalog and the data pipeline) and have “control” over your data to turn it into a competitive weapon to drive business value.
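The four pipeline steps described above (ingest, quality check, clean, enrich) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a production design: the sample rows, the field names, and the `REGION_BY_COUNTRY` lookup are all made up, and a real pipeline would read from source systems rather than a hard-coded list.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]

# Hypothetical reference data used in the enrichment step.
REGION_BY_COUNTRY = {"US": "Americas", "DE": "EMEA", "FR": "EMEA"}


def ingest() -> List[Record]:
    """Stand-in for reading from a source system; the rows are invented."""
    return [
        {"customer_id": "001", "country": "us", "revenue": "1200.50"},
        {"customer_id": "002", "country": " DE ", "revenue": None},
        {"customer_id": "001", "country": "us", "revenue": "1200.50"},  # duplicate
        {"customer_id": None, "country": "FR", "revenue": "80"},  # no identifier
    ]


def quality_report(rows: List[Record]) -> Dict[str, Any]:
    """Count rows, missing values per field, and exact duplicate rows."""
    missing: Dict[str, int] = {}
    seen = set()
    duplicates = 0
    for row in rows:
        for key, value in row.items():
            if value is None:
                missing[key] = missing.get(key, 0) + 1
        fingerprint = tuple(sorted(row.items()))
        if fingerprint in seen:
            duplicates += 1
        seen.add(fingerprint)
    return {"rows": len(rows), "missing": missing, "duplicates": duplicates}


def clean(rows: List[Record]) -> List[Record]:
    """Drop rows with no customer_id, normalize values, remove duplicates."""
    cleaned: List[Record] = []
    seen = set()
    for row in rows:
        if row["customer_id"] is None:
            continue
        country = (row["country"] or "").strip().upper()
        revenue = float(row["revenue"]) if row["revenue"] is not None else None
        fingerprint = (row["customer_id"], country, revenue)
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append({"customer_id": row["customer_id"],
                        "country": country, "revenue": revenue})
    return cleaned


def enrich(rows: List[Record]) -> List[Record]:
    """Add a region to each row from the hypothetical lookup table."""
    return [{**row, "region": REGION_BY_COUNTRY.get(row["country"], "Unknown")}
            for row in rows]


def run_pipeline() -> Tuple[Dict[str, Any], List[Record]]:
    raw = ingest()
    report = quality_report(raw)   # measure the mess before fixing it
    return report, enrich(clean(raw))


report, prepared_rows = run_pipeline()
```

The point for a board is not the code itself but the sequence: the quality report makes the mess measurable before anyone cleans it, and only the cleaned, enriched rows are passed on to visualization or machine learning.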