Data Lakes For Dummies. Alan R. Simon


answer: Get rid of the data marts … or at least most of them!

      You have three main options for how to deal with your proliferation of independent data marts as part of your data lake initiative:

       Retire some or all of the data marts, and replace them with data lake functionality.

       Isolate some of the data marts, and leave them in place alongside your new data lake.

       Incorporate some of your data marts as components of your data lake.

      Data mart retirement

      If your existing data marts are creaking and groaning and are now coming up short even for the analytical needs of their respective users, here’s a great idea: Get rid of them!


      FIGURE 2-3: Using a data lake to retire data marts.

      

Chances are, most of your data marts, especially those that have been around for a while, support descriptive analytics (basic business intelligence functions such as drilling deeper into summarized data to gain additional insights from lower levels of your data). But what about advanced analytical needs such as machine learning, data mining, and other artificial intelligence–enabled analytics? Probably not so much!

      So, why keep those aging data marts around? Redirect the data feeds from your source systems into your new data lake, and rebuild your analytics for accounting, your human resources (HR) organization, sales and marketing, and other parts of your enterprise within the data lake environment.
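As a rough sketch of what redirecting feeds might look like, assume each feed is described by a small record naming its source system and its current target. The names used here (`Feed`, `redirect_to_lake`, and the system and mart names) are hypothetical and chosen only for illustration; they don't come from any particular product.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Feed:
    """One data feed from a source system to an analytical target (hypothetical model)."""
    source: str
    target: str


def redirect_to_lake(feeds, retired_marts, lake="data_lake"):
    """Re-point every feed that targets a retired data mart at the data lake.

    Feeds aimed at marts that are not being retired are left unchanged.
    Duplicate feeds (same source, same target) produced by the redirect
    are collapsed, keeping the first occurrence.
    """
    redirected = []
    seen = set()
    for feed in feeds:
        new_feed = replace(feed, target=lake) if feed.target in retired_marts else feed
        if new_feed not in seen:
            seen.add(new_feed)
            redirected.append(new_feed)
    return redirected


# Illustrative feeds: two marts draw from the same source system.
feeds = [
    Feed("general_ledger", "accounting_mart"),
    Feed("hr_system", "hr_mart"),
    Feed("crm", "sales_mart"),
    Feed("general_ledger", "sales_mart"),
]
result = redirect_to_lake(
    feeds, retired_marts={"accounting_mart", "hr_mart", "sales_mart"}
)
for feed in result:
    print(f"{feed.source} -> {feed.target}")
```

In this sketch, a source system that used to feed several data marts ends up with a single feed into the data lake, and any mart left out of `retired_marts` keeps its feeds untouched.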

      Data mart isolation

      What if one of your existing data marts is an absolute work of genius? Suppose that three or four years ago, your company built a data mart to support your annual strategic planning cycle. Your strategic planning data mart has data feeds from numerous applications and systems around your enterprise. Do you really want to reinvent the wheel just because you’re now building a data lake?

      Great news: You don’t have to throw away your data mart baby along with the data lake water! (Okay, maybe not the best metaphor, but you get the idea.)


      FIGURE 2-4: Leaving a data mart intact and alongside your data lake.

      Data mart incorporation


      FIGURE 2-5: Incorporating a data mart into your data lake.

      

Even after getting your data mart proliferation under control as part of your data lake efforts, beware: History can easily repeat itself!

      Make no mistake about it: Just because you’re now in the data lake era rather than the earlier data warehouse era, business organizations will still likely want to create their own smaller-scale data marts for their specific analytics needs.

      Your data lake gives you a carrot-and-stick, one-two punch to help prevent the proliferation of future data marts.

      First the stick, and then the carrot.

      Establishing a blockade

      Your company’s top leadership needs to help you establish a blockade against new data marts springing into existence. Your chief information officer (CIO) needs to make this policy crystal clear, in concert with their counterparts on the business side: the chief operating officer (COO), chief financial officer (CFO), and others in your company’s executive ranks.

      

Ideally, even your chief executive officer (CEO) should sign a declaration that another round of data mart proliferation won’t be tolerated.

      Should a “no proliferation” edict be written in stone? Probably not. Some departments within your company will inevitably come up with some unique, time-is-of-the-essence analytical need that is better met through a stand-alone data mart than through the data lake.

However, the proponents of a new data mart should be required to prove their case and have their data mart project approved as an exception to the “no proliferation” rule. They need to declare the following:

       What the business imperative is for building a new stand-alone data mart (for example, to address some sort of business crisis or to take advantage of a market opportunity that must be addressed immediately)

       Why their analytical needs can’t be met using the data lake in the same time frame that it would take to build their new data mart

       Whether their planned data mart will be used only for a short period of time and then retired, or whether it will subsequently be incorporated into the data lake
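One way to picture the three declarations is as a simple request record that gets checked for completeness before anyone reviews it. This is only a minimal sketch: the class name, field names, and the `"retire"` / `"incorporate"` values are assumptions made up for illustration, not part of any prescribed approval process.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the two end states a proposed data mart can declare.
VALID_OUTCOMES = {"retire", "incorporate"}


@dataclass
class DataMartExceptionRequest:
    """The three declarations a data mart proponent must make (illustrative)."""
    business_imperative: str        # why a stand-alone data mart is needed now
    why_not_data_lake: str          # why the data lake can't meet the need in time
    planned_outcome: Optional[str]  # "retire" or "incorporate"


def missing_declarations(request):
    """Return the names of declarations that are blank or invalid."""
    missing = []
    if not request.business_imperative.strip():
        missing.append("business_imperative")
    if not request.why_not_data_lake.strip():
        missing.append("why_not_data_lake")
    if request.planned_outcome not in VALID_OUTCOMES:
        missing.append("planned_outcome")
    return missing


complete = DataMartExceptionRequest(
    business_imperative="Respond to a competitor's pricing change this quarter",
    why_not_data_lake="Required source feeds are not yet in the data lake",
    planned_outcome="incorporate",
)
incomplete = DataMartExceptionRequest(
    business_imperative="Respond to a competitor's pricing change this quarter",
    why_not_data_lake="   ",
    planned_outcome=None,
)

print(missing_declarations(complete))
print(missing_declarations(incomplete))
```

A request with any missing declaration would simply go back to its proponents before it is considered as an exception to the "no proliferation" rule.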

      Providing a path of least resistance

      Business users around your organization build new stand-alone data marts because that’s what they’ve done for a long, long time. They realize that the best way to bring data-driven insights into
