Enterprise AI For Dummies. Zachary Jarvinen


the total number of employees across industries, but there is more information buried in the data.

      FIGURE 3-1: Comparison: Total page visits by mean duration of visit.

      Distribution

      Relationship

      A relationship visualization reveals how two or more variables affect each other. You can show relationships through a variety of methods.

(Top) Bar chart depicting the composition of the workforce across industries (various categories of employees versus the number of employees). (Bottom) A donut chart breaking down the percentage of revenue per market segment.

      FIGURE 3-2: Composition: Employees per industry (top), revenue per market segment (bottom).

Map depicting the distribution of new companies (start-ups) per county that have been started in the last 10 years.

Chart depicting the relationship between call center wait times and customer satisfaction scores over time.

      FIGURE 3-4: Relationship: Call center wait time versus satisfaction score.
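      A relationship like the one in Figure 3-4 can also be summarized as a single number. The sketch below computes a Pearson correlation coefficient in plain Python. The wait times and satisfaction scores are made-up illustrative values, not data from the figure, and the function name `pearson` is an assumption for this example.

```python
from math import sqrt

# Made-up illustrative numbers, not data from the book or any real call center.
wait_minutes = [1, 2, 4, 6, 8, 10]
satisfaction = [9.1, 8.7, 7.9, 6.5, 5.8, 4.9]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: a constant sequence has zero variance and would divide by zero here.
    return cov / sqrt(var_x * var_y)


r = pearson(wait_minutes, satisfaction)
print(f"correlation between wait time and satisfaction: {r:.2f}")
```

      A value near -1 means satisfaction tends to fall as wait time rises, a value near +1 means the two rise together, and a value near 0 means no linear relationship. Correlation alone does not show that one variable causes the other.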

      Of the three pillars of AI (processing power, scalable storage, and big data), the third presents the biggest challenge: how to get it, how to validate it, and how to process it.

Pyramid of critical success factors for AI and analytics. Four of the six layers relate to data, focusing on relevance, accessibility, usability, completeness, and data-based conclusions.

      FIGURE 3-5: Pyramid of critical success factors for AI and analytics.

Element             Questions
AI                  How will you address analytical deployment, governance, and operations?
Experimentation ML  Does machine learning add business value? How do you define success?
BI / Analytics      What is the story your data is telling? What conclusions can you make from this information?
Explore and Enrich  Can the data be used meaningfully? Are you missing any data or features?
Data Access         Is the data accessible and usable (analysis-ready)? Is the data flow reliable?
Data Collection     Do you have data relevant to your business goals?

      Identifying data sources

      Before you start, you should perform a data audit to determine what data you already have and identify gaps in your data that you must fill to accomplish your business goals.
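      At its simplest, a data audit compares what each business goal needs with what you already hold. The following is a minimal sketch of that comparison. The goals, field names, and source systems are hypothetical and would come from your own audit.

```python
# Hypothetical example: goals, field names, and sources are illustrative only.

# Data each business goal needs (assumed for illustration).
required = {
    "reduce churn": {"customer_id", "support_tickets", "usage_logs"},
    "forecast demand": {"sales_history", "weather", "promotions"},
}

# Data the audit found, keyed by the system where it lives (assumed).
inventory = {
    "crm_database": {"customer_id", "support_tickets"},
    "erp_system": {"sales_history", "promotions"},
}


def audit(required, inventory):
    """Return, for each goal, the sorted list of required fields found in no source."""
    # Pool every field from every source into one set of available data.
    available = set().union(*inventory.values())
    return {goal: sorted(fields - available) for goal, fields in required.items()}


gaps = audit(required, inventory)
for goal, missing in gaps.items():
    print(f"{goal}: missing {missing if missing else 'nothing'}")
```

      Each gap this turns up is a candidate for one of the sources described next: internal data, data capture, or third-party data.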

      As mentioned in Chapter 1, for the enterprise, data falls into two categories: structured data (databases and spreadsheets) and unstructured data (email, text messages, voice mail, social media, connected sensors, and so on). Potential sources for data include:

       Internal data: The first place to look is the IT department, but depending on the organization, you may not find everything you need in one place. The most common challenges associated with big data aren’t analytics problems; they are information integration problems. To reap the benefits of big data, you must first slay the data silo dragon, from department-level tribal thinking down to that one app on that one computer in that one person’s office.

       Data capture: The second place to look is the data entering your organization. It can arrive in many forms, and you can use data extraction, metadata extraction, and categorization to supplement it. For example, you can run paper documents, whether handwritten or printed, through an optical character recognition (OCR) system to digitize them in preparation for processing. They can then join the rest of your digital data, such as emails, PDF files, Word documents, images, voice mail messages, and videos, to be classified and loaded into the data store that will feed your AI insights.

       Data as a service (DaaS): If gaps remain in the data you need, you can turn to third-party data, either commercial datasets for purchase such as AccuWeather or public datasets such as data.gov and Kaggle.com. Broadening your datasets can increase the insights lurking in your own data.
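      The categorization step mentioned under data capture can be as simple as routing incoming files by type before any deeper processing. The sketch below is one possible approach, not a prescribed method. The extension-to-route mapping and the route names (`structured`, `unstructured`, `needs_ocr`) are assumptions for illustration.

```python
from pathlib import PurePath

# Assumed mapping from file extension to handling route; adjust to your own formats.
ROUTES = {
    ".csv": "structured",
    ".xlsx": "structured",
    ".eml": "unstructured",
    ".docx": "unstructured",
    ".pdf": "unstructured",
    ".wav": "unstructured",
    ".mp4": "unstructured",
    ".tif": "needs_ocr",  # scanned paper documents (assumed convention)
    ".png": "needs_ocr",  # scanned images (assumed convention)
}


def route(filename):
    """Return the handling route for a file name, or 'unknown' if unmapped."""
    suffix = PurePath(filename).suffix.lower()
    return ROUTES.get(suffix, "unknown")


incoming = ["invoice_scan.TIF", "q3_sales.csv", "complaint.eml", "notes.xyz"]
for name in incoming:
    print(f"{name} -> {route(name)}")
```

      Files that land in the unknown route are worth reviewing by hand, since they often point to a data silo that the audit missed.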

      Cleaning
