Is It Time to Abandon Data Lakes?
It seems like all we hear about today is how important data is when it comes to staying afloat in the digital marketplace. Our phones capture data. Our cars capture data. Even Alexa is logging information while we chat with friends over dinner. In fact, studies show half of the world’s data today was created in the last 10 months! Unfortunately, the frantic search to capture data has led some companies to become paralyzed with data overload. Indeed, if you’re like many businesses today, you’re sitting on a vast lake of data that your company has no clear idea how to effectively use and may be tempted to abandon. Eventually that lake will start to look more like a data swamp. Why?
As it turns out, more data isn’t always better. My colleague Shelly Kramer expressed it this way: it’s not about who’s using big data—it’s about who is using it well. That means taking a hard look at your company’s data strategy—including how and what data is being generated, captured, and—ultimately—used. The following are a few tips for getting a handle on your data storage and making it work for your company. Maybe you’ll find that you can abandon your lake altogether and just use Data-as-a-Service, or maybe you’ll find that your lake is just what you need. Let’s dive in.
Know that Less is More
I know it’s counter-intuitive in an age when everything from our walking path through Target to how frequently we purchase toilet paper is tracked by someone, somewhere. But what we’re learning is that just because you can collect and store data points, it doesn’t mean you should.
Before you even start pulling data, take time to outline what you’re trying to find out from it. What is your goal? What insights are you trying to gain? Start there and work backwards. If you’re trying to better understand customer retention, for instance, you only need to look at the data that shows when and where customers leave in their journey. You don’t need to clutter your data set with names, addresses, and social security numbers. From a security standpoint, it’s better not to. The point is that all stored data, whether it’s structured, unstructured, or historical, needs to have a purpose. If it doesn’t, it’s just another liability.
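To make the retention example concrete, here is a minimal sketch in Python of the "start with the goal and work backwards" idea: decide which fields the question actually needs, and drop everything else before it lands in storage. The field names (`customer_id`, `journey_stage`, `exited_at`) and the sample record are hypothetical, invented purely for illustration; your own schema will differ.

```python
# Hypothetical allow-list: the only fields a retention analysis is assumed to need.
RETENTION_FIELDS = {"customer_id", "journey_stage", "exited_at"}


def minimize(record, allowed=RETENTION_FIELDS):
    """Return a copy of the record containing only allow-listed fields."""
    return {key: value for key, value in record.items() if key in allowed}


# Made-up raw event that carries more personal data than the question requires.
raw_event = {
    "customer_id": "c-1029",
    "name": "Jane Doe",
    "address": "12 Example St",
    "ssn": "000-00-0000",
    "journey_stage": "checkout",
    "exited_at": "2024-03-01T14:05:00",
}

# Only the three retention fields survive; name, address, and SSN are never stored.
slim_event = minimize(raw_event)
```

An allow-list is used here rather than a block-list on the assumption that new, unexpected fields should be kept out by default until someone can say what they are for.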
Automate When Possible
Clearly, with so much information being generated by our customers every day, it would be impossible for a person, or even a full data team, to sort through it effectively. That’s why it’s critical to build at least some form of machine learning and automation into your data strategy. Yes, some smaller companies may have smaller budgets for data collection. But the hard truth is this: if you’re not using machine learning to manage and process your data, you probably shouldn’t be collecting it at all. Research shows companies that leverage customer behavior insights outperform their competitors by 85 percent in sales growth. Machine learning is what will help find insights and patterns humans will likely miss. It will also help process unstructured data like social media posts and customer service notes so their value isn’t lost in the process.
Yes, one of the great things about big data is that it is forcing companies to pull their data out of silos and pour it into a shared data lake. But just because we’re pulling that information from multiple sources doesn’t mean the lake is a free-for-all. All of that data needs to be clearly governed. That means establishing rules around:
- Who manages it?
- Who can access it?
- What type of information can and should be added to it?
- How is it secured?
- How is it classified?
- Is it easily searchable?
Without these guidelines, your data lake will become a useless data swamp in no time.
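One way to keep those questions from staying abstract is to write the answers down next to each data set. The sketch below is a hypothetical Python example, not a reference to any particular governance tool: the `DatasetPolicy` class, its field names, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetPolicy:
    """Hypothetical governance record for one data set in the lake."""
    name: str
    owner: str              # who manages it
    readers: set            # who can access it
    allowed_fields: set     # what information can be added to it
    encrypted: bool         # how it is secured (simplified to a flag)
    classification: str     # how it is classified, e.g. "internal"
    tags: set = field(default_factory=set)  # keywords that keep it searchable


def can_read(policy, user):
    """The owner and any listed reader may access the data set."""
    return user == policy.owner or user in policy.readers


def rejected_fields(policy, record):
    """List the fields in a record that the policy does not allow."""
    return sorted(set(record) - policy.allowed_fields)


# Made-up policy for a retention data set.
retention = DatasetPolicy(
    name="customer_retention",
    owner="data-team",
    readers={"marketing"},
    allowed_fields={"customer_id", "journey_stage", "exited_at"},
    encrypted=True,
    classification="internal",
    tags={"retention", "churn"},
)

marketing_ok = can_read(retention, "marketing")
extra = rejected_fields(retention, {"customer_id": "c-1029", "ssn": "000-00-0000"})
```

In this sketch a record carrying an `ssn` field would be flagged before it reaches the lake, and a team not listed as a reader would be turned away. A real deployment would likely lean on the access controls and catalog features of whatever platform hosts the lake.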
Last but not least, make data a part of your culture. In today’s digital transformation, many companies are so quick to jump on the tech or data bandwagon that they forget to incorporate their company culture into the process. But big data and data lakes aren’t just behind-the-scenes players in today’s digital economy. They’re a valuable resource that all employees need to be equally committed to keeping up to date and secure. As such, make sure your leadership walks the walk when it comes to data by explaining the importance of data collection and how it impacts the company’s growth and performance. Provide training on data governance to keep your data lake clean. And make data a part of the ongoing business discussion.
It’s likely not time to abandon our data lakes. But it is time to think smarter when it comes to managing them. It’s time to think of your data lake as a living and breathing member of your company’s leadership team. Involve it in big decisions. Consider its capacity. And most of all, help it become all it can truly be.