A data lake is a centralized repository in which an organization can store all of its data, both structured and unstructured. The organization can then quickly access that data to build dashboards and visualizations, run real-time analytics, train machine learning models, or perform any other analysis across diverse data types, from images to tables. Often these processes use the data as-is, that is, without structuring it first.
An Aberdeen survey found that organizations which had implemented a data lake outperformed their peers by 9% in organic revenue growth. Their top three motivations were “Increase operational efficiency”, “Make data available from department silos, mainframe and legacy systems” and “Lower transactional costs”.