Data and Knowledge Management CHAPTER 5. 5.1 Managing Data 5.2 The Database Approach 5.3 Database Management Systems 5.4 Data Warehouses and Data Marts.

1 Data and Knowledge Management CHAPTER 5

2 5.1 Managing Data 5.2 The Database Approach 5.3 Database Management Systems 5.4 Data Warehouses and Data Marts 5.5 Knowledge Management CHAPTER OUTLINE

3 DIFFICULTIES OF MANAGING DATA The amount of data is increasing exponentially. Data are scattered throughout organizations and collected by many individuals using various methods and devices. Data come from many sources. Data security, quality, and integrity are critical.

4 ANNUAL FLOOD OF DATA FROM... Credit card swipes; e-mails; digital video; online TV; RFID tags; blogs; digital video surveillance; radiology scans. Source: Media Bakery

5 ANNUAL FLOOD OF NEW DATA! In the zettabyte range A zettabyte is 1000 exabytes © Fanatic Studio/Age Fotostock America, Inc.
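To put the zettabyte in perspective, the following is a minimal sketch of the decimal (SI) storage units, in which each step up is a factor of 1000. The list `UNITS` and the helper `to_bytes` are illustrative names, not part of the presentation.

```python
# Decimal (SI) storage units: each step up is a factor of 1000.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte"]


def to_bytes(amount, unit):
    """Convert an amount expressed in the given SI unit to bytes."""
    return amount * 1000 ** UNITS.index(unit)


# One zettabyte expressed in exabytes and in bytes.
exabytes_per_zettabyte = to_bytes(1, "zettabyte") // to_bytes(1, "exabyte")
print(exabytes_per_zettabyte)      # 1000
print(to_bytes(1, "zettabyte"))    # 1000000000000000000000 (10**21)
```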

6 DATA GOVERNANCE See video: Data Governance

7 DATA GOVERNANCE See video: Master Data Management

8 DATA GOVERNANCE See video: Master Data

9 MASTER DATA MANAGEMENT John Stevens registers for Introduction to Management Information Systems (ISMN 3140) from 10 AM until 11 AM on Mondays and Wednesdays in Room 41 Smith Hall, taught by Professor Rainer.

Transaction Data                            Master Data
John Stevens                                Student
Intro to Management Information Systems     Course
ISMN 3140                                   Course No.
10 AM until 11 AM                           Time
Mondays and Wednesdays                      Weekday
Room 41 Smith Hall                          Location
Professor Rainer                            Instructor
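The split between master data and transaction data can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `MASTER_DATA` and `make_transaction` are ours, and the second registration (for "Jane Doe") is made up to show how a master-data category lets us aggregate several transactions.

```python
from collections import Counter

# Master data: the stable categories that apply to every registration.
MASTER_DATA = ("Student", "Course", "Course No.", "Time",
               "Weekday", "Location", "Instructor")


def make_transaction(*values):
    """Pair each transaction value with its master-data category."""
    if len(values) != len(MASTER_DATA):
        raise ValueError("one value is required per master-data category")
    return dict(zip(MASTER_DATA, values))


# The registration described on the slide.
stevens = make_transaction(
    "John Stevens", "Introduction to Management Information Systems",
    "ISMN 3140", "10 AM until 11 AM", "Mondays and Wednesdays",
    "Room 41 Smith Hall", "Professor Rainer")

# A hypothetical second registration for the same course (invented data).
doe = make_transaction(
    "Jane Doe", "Introduction to Management Information Systems",
    "ISMN 3140", "10 AM until 11 AM", "Mondays and Wednesdays",
    "Room 41 Smith Hall", "Professor Rainer")

# Master data categories let us aggregate the transaction data.
per_course = Counter(t["Course No."] for t in (stevens, doe))
print(per_course["ISMN 3140"])  # 2
```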

10 Defining Big Data: diverse, high-volume, high-velocity information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization. Big Data: exhibit variety; include structured, unstructured, and semi-structured data; are generated at high velocity with an uncertain pattern; do not fit neatly into traditional, structured, relational databases (discussed later in this chapter); and can be captured, processed, transformed, and analyzed in a reasonable amount of time only by sophisticated information systems. BIG DATA

11 When the Sloan Digital Sky Survey in New Mexico was launched in 2000, its telescope collected more data in its first few weeks than had been amassed in the entire history of astronomy. By 2013, the survey’s archive contained hundreds of terabytes of data. However, the Large Synoptic Survey Telescope in Chile, due to come online in 2016, will collect that quantity of data every five days. In 2013 Google was processing more than 24 petabytes of data every day. Facebook members upload more than 10 million new photos every hour. In addition, they click a “like” button or leave a comment nearly 3 billion times every day. The 800 million monthly users of Google’s YouTube service upload more than an hour of video every second. The number of messages on Twitter grows at 200 percent every year. By mid-2013 the volume exceeded 450 million tweets per day. EXAMPLES OF BIG DATA

12 Volume: We have noted the incredible volume of Big Data in this chapter. Although the sheer volume of Big Data presents data management problems, this volume also makes Big Data incredibly valuable. Irrespective of their source, structure, format, and frequency, data are always valuable. If certain types of data appear to have no value today, it is because we have not yet been able to analyze them effectively. For example, several years ago when Google began harnessing satellite imagery, capturing street views, and then sharing these geographical data for free, few people understood its value. Today, we recognize that such data are incredibly useful (e.g., consider the myriad of uses for Google Maps). CHARACTERISTICS OF BIG DATA

13 Velocity: The rate at which data flow into an organization is rapidly increasing. Velocity is critical because it increases the speed of the feedback loop between a company and its customers. For example, the Internet and mobile technology enable online retailers to compile histories not only on final sales, but on their customers’ every click and interaction. Companies that can quickly utilize that information, for example by recommending additional purchases, gain competitive advantage. CHARACTERISTICS OF BIG DATA

14 Variety: Traditional data formats tend to be structured and relatively well described, and they change slowly. Traditional data include financial market data, point-of-sale transactions, and much more. In contrast, Big Data formats change rapidly. They include satellite imagery, broadcast audio streams, digital music files, Web page content, scans of government documents, and comments posted on social networks. CHARACTERISTICS OF BIG DATA

15 The first step for many organizations toward managing Big Data was to integrate information silos into a database environment and then to develop data warehouses for decision making. After completing this step, many organizations turned their attention to the business of information management: making sense of their proliferating data. In recent years, Oracle, IBM, Microsoft, and SAP have spent billions of dollars purchasing software firms that specialize in data management and business intelligence. In addition, many organizations are turning to NoSQL databases (think of them as “not only SQL” databases) to process Big Data. These databases provide an alternative for firms that have more and different kinds of data (Big Data) in addition to the traditional, structured data that fit neatly into the rows and columns of relational databases. MANAGING BIG DATA

16 Organizations must do more than simply manage Big Data; they must also gain value from it. In general, there are six broadly applicable ways to leverage Big Data to gain value. Creating Transparency. Simply making Big Data easier for relevant stakeholders to access in a timely manner can create tremendous business value. In the public sector, for example, making relevant data more readily accessible across otherwise separate departments can sharply reduce search and processing times. In manufacturing, integrating data from R&D, engineering, and manufacturing units to enable concurrent engineering can significantly reduce time to market and improve quality. LEVERAGING BIG DATA

17 Enabling Experimentation. Experimentation allows organizations to discover needs and improve performance. As organizations create and store more data in digital form, they can collect more accurate and detailed performance data (in real or near-real time) on everything from product inventories to personnel sick days. IT enables organizations to set up controlled experiments. For example, Amazon constantly experiments by offering slightly different “looks” on its Web site. These experiments are called A/B experiments, because each experiment compares two versions, A and B. Here is how the experiment works: Hundreds of thousands of people who click on Amazon.com will see one version of the Web site, and hundreds of thousands of others will see the other version. One experiment might change the location of the “Buy” button on the Web page. Another might change the size of a particular font on the Web page. Amazon captures data on an assortment of variables from all of the clicks, including which pages users visited, the time they spent on each page, and whether the click led to a purchase. It then analyzes all of these data to “tweak” its Web site to provide the optimal user experience. LEVERAGING BIG DATA
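The mechanics of such an experiment can be sketched as follows. This is a simplified illustration, not Amazon's actual method: the function names `assign_variant` and `conversion_rates` are hypothetical, the visit data are invented, and a real experiment would also test whether the difference is statistically significant.

```python
import hashlib


def assign_variant(user_id):
    """Deterministically place a user in version A or B of the page."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"


def conversion_rates(visits):
    """Given (variant, purchased) pairs, return the purchase rate per variant."""
    totals = {"A": [0, 0], "B": [0, 0]}  # [visits, purchases]
    for variant, purchased in visits:
        totals[variant][0] += 1
        totals[variant][1] += int(purchased)
    return {v: (p / n if n else 0.0) for v, (n, p) in totals.items()}


# Invented click data: did each visit to version A or B lead to a purchase?
visits = [("A", True), ("A", False), ("A", False), ("A", False),
          ("B", True), ("B", True), ("B", False), ("B", False)]

rates = conversion_rates(visits)
print(rates)  # {'A': 0.25, 'B': 0.5}
```

Hashing the user ID, rather than choosing at random on every visit, keeps a returning user in the same version for the whole experiment.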

18 Segmenting Population to Customize Actions. Big Data allows organizations to create narrowly defined customer segmentations and to tailor products and services to precisely meet customer needs. For example, companies are able to perform micro-segmentation of customers in real time to precisely target promotions and advertising. Suppose, for instance, that a company knows you are in one of its stores, considering a particular product. (They can obtain this information from your smartphone, from in-store cameras, and from facial recognition software.) They can send a coupon directly to your phone offering 10 percent off if you buy the product within the next five minutes. LEVERAGING BIG DATA

19 Replacing/Supporting Human Decision Making with Automated Algorithms. Sophisticated analytics can substantially improve decision making, minimize risks, and unearth valuable insights. For example, tax agencies use automated risk-analysis software tools to identify tax returns that warrant further examination, and retailers can use algorithms to fine-tune inventories and pricing in response to real-time in-store and online sales. LEVERAGING BIG DATA

20 Innovating New Business Models, Products, and Services. Big Data enables companies to create new products and services, enhance existing ones, and invent entirely new business models. For example, manufacturers utilize data obtained from the use of actual products to improve the development of the next generation of products and to create innovative after-sales service offerings. The emergence of real-time location data has created an entirely new set of location-based services ranging from navigation to pricing property and casualty insurance based on where, and how, people drive their cars. LEVERAGING BIG DATA

21 Organizations Can Analyze Far More Data. In some cases, organizations can even process all the data relating to a particular phenomenon, meaning that they do not have to rely as much on sampling. Random sampling works well, but it is not as effective as analyzing an entire dataset. In addition, random sampling has some basic weaknesses. To begin with, its accuracy depends on ensuring randomness when collecting the sample data. However, achieving such randomness is tricky. Systematic biases in the process of data collection can cause the results to be highly inaccurate. For example, consider political polling using landline phones. This sample tends to exclude people who use only cell phones. This bias can seriously skew the results, because cell phone users are typically younger and more liberal than people who rely primarily on landline phones. LEVERAGING BIG DATA
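The landline polling bias can be illustrated with a small, deterministic sketch. All numbers below are invented for illustration; the point is only that a sample drawn from one subgroup can differ sharply from the full dataset.

```python
# Hypothetical population of 1,000 voters: (phone_type, supports_candidate).
population = (
    [("landline", True)] * 200 + [("landline", False)] * 300 +
    [("cell_only", True)] * 350 + [("cell_only", False)] * 150
)


def support_rate(people):
    """Fraction of the given people who support the candidate."""
    return sum(1 for _, supports in people if supports) / len(people)


# A landline-only poll never reaches the cell-only voters.
landline_sample = [p for p in population if p[0] == "landline"]

true_rate = support_rate(population)         # whole dataset
polled_rate = support_rate(landline_sample)  # biased sample

print(true_rate)    # 0.55
print(polled_rate)  # 0.4
```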

22 Database management systems (DBMSs) minimize the following problems: data redundancy, data isolation, and data inconsistency. 5.2 THE DATABASE APPROACH

23 DBMSs maximize the following: data security, data integrity, and data independence. DATABASE APPROACH (CONTINUED)

24 DATABASE MANAGEMENT SYSTEMS

25 Bit; byte; field; record; file (or table); database. DATA HIERARCHY

26 HIERARCHY OF DATA FOR A COMPUTER-BASED FILE

27 Bit (binary digit) Byte (eight bits) DATA HIERARCHY (CONTINUED)
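As a small illustration of the two lowest levels of the hierarchy, the sketch below shows the eight bits that make up the single byte used to store the letter "A" in ASCII. The variable names are ours.

```python
letter = "A"

# One ASCII character is stored in one byte.
encoded = letter.encode("ascii")
print(len(encoded))  # 1

# That byte is made of eight bits (binary digits).
bits = format(encoded[0], "08b")
print(bits)       # 01000001
print(len(bits))  # 8

# Eight bits can represent 2**8 different values.
values_per_byte = 2 ** 8
print(values_per_byte)  # 256
```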

28 Example of Field and Record DATA HIERARCHY (CONTINUED)

29 Example of Field and Record DATA HIERARCHY (CONTINUED)
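The upper levels of the hierarchy can be sketched with plain Python structures. This is an illustration only: the field names, the student IDs, the majors, and the student "Jane Doe" are invented, and a real database would be managed by a DBMS rather than held in dictionaries.

```python
# Field: one named value. Record: a group of related fields.
record_1 = {"student_id": 1, "name": "John Stevens", "major": "MIS"}
record_2 = {"student_id": 2, "name": "Jane Doe", "major": "Finance"}

# File (or table): a collection of records of the same type.
student_file = [record_1, record_2]
course_file = [
    {"course_no": "ISMN 3140",
     "title": "Introduction to Management Information Systems"},
]

# Database: a collection of related files.
database = {"students": student_file, "courses": course_file}

field_names = list(record_1)
print(field_names)                # ['student_id', 'name', 'major']
print(len(database["students"]))  # 2
```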

30 Data model; entity; attribute; primary key; secondary keys. DESIGNING THE DATABASE

31 Database designers plan the database design in a process called entity-relationship (ER) modeling. ER diagrams consist of entities, attributes, and relationships. Entity classes; instance; identifiers. ENTITY-RELATIONSHIP MODELING
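The ER vocabulary can be made concrete with a short sketch. Here each dataclass stands for an entity class, each object is an instance, and the first field of each class serves as its identifier. The class names, the `student_id` value, and the helper `courses_for` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:            # entity class
    student_id: int       # identifier
    name: str             # attribute


@dataclass(frozen=True)
class Course:             # entity class
    course_no: str        # identifier
    title: str            # attribute


@dataclass(frozen=True)
class Enrollment:         # relationship between a student and a course
    student_id: int
    course_no: str


# Instances of each entity class, indexed by identifier.
students = {1: Student(1, "John Stevens")}
courses = {"ISMN 3140": Course(
    "ISMN 3140", "Introduction to Management Information Systems")}
enrollments = [Enrollment(1, "ISMN 3140")]


def courses_for(student_id):
    """Follow the relationship from a student to the course titles."""
    return [courses[e.course_no].title
            for e in enrollments if e.student_id == student_id]


print(courses_for(1))
# ['Introduction to Management Information Systems']
```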

32 Database management system (DBMS); relational database model; Structured Query Language (SQL); Query by Example (QBE). 5.3 DATABASE MANAGEMENT SYSTEMS
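A minimal sketch of the relational model and SQL, using Python's built-in `sqlite3` module as a stand-in DBMS. The table layout, the index name, and all rows except the name John Stevens are invented for illustration. `student_id` acts as the primary key, and the index on `major` plays the role of a secondary key.

```python
import sqlite3

# An in-memory SQLite database serves as a small stand-in DBMS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,   -- primary key: unique per record
        name       TEXT NOT NULL,
        major      TEXT
    );
    -- Secondary key: helps locate records but need not be unique.
    CREATE INDEX idx_student_major ON student (major);

    INSERT INTO student VALUES (1, 'John Stevens', 'MIS');
    INSERT INTO student VALUES (2, 'Jane Doe', 'Finance');
    INSERT INTO student VALUES (3, 'Sam Lee', 'MIS');
""")

# An SQL query: which students major in MIS?
rows = conn.execute(
    "SELECT name FROM student WHERE major = ? ORDER BY name", ("MIS",)
).fetchall()
names = [row[0] for row in rows]
conn.close()

print(names)  # ['John Stevens', 'Sam Lee']
```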

33 STUDENT DATABASE EXAMPLE

34 Normalization: minimum redundancy; maximum data integrity; best processing performance. Data are normalized when the attributes in a table depend only on the primary key. NORMALIZATION
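The idea can be sketched with an invented, non-normalized order table in which customer details are repeated on every order. Splitting it into a customer table and an order table leaves each attribute dependent only on the primary key of its own table. The data and the function name `normalize` are hypothetical.

```python
# Non-normalized: the customer's name and city repeat on every order.
flat_orders = [
    {"order_id": 1, "customer_id": 10, "customer_name": "Ann",
     "customer_city": "Auburn", "total": 25.0},
    {"order_id": 2, "customer_id": 10, "customer_name": "Ann",
     "customer_city": "Auburn", "total": 40.0},
    {"order_id": 3, "customer_id": 11, "customer_name": "Bob",
     "customer_city": "Mobile", "total": 15.0},
]


def normalize(rows):
    """Split flat rows into a customer table and an order table."""
    customers, orders = {}, {}
    for row in rows:
        # Customer attributes depend only on customer_id.
        customers[row["customer_id"]] = {
            "name": row["customer_name"], "city": row["customer_city"]}
        # Order attributes depend only on order_id.
        orders[row["order_id"]] = {
            "customer_id": row["customer_id"], "total": row["total"]}
    return customers, orders


customers, orders = normalize(flat_orders)
print(len(customers), len(orders))  # 2 3
print(customers[10]["city"])        # Auburn
```

After the split, a change to Ann's city is made in one place instead of on every order, which is the redundancy and integrity benefit the slide lists.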

35 NON-NORMALIZED RELATION

36 NORMALIZING THE DATABASE (PART A)

37 NORMALIZING THE DATABASE (PART B)

38 NORMALIZATION PRODUCES ORDER

39 Data warehouses and data marts: organized by business dimension or subject; multidimensional; historical; use online analytical processing. 5.4 DATA WAREHOUSING
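A rough sketch of what "multidimensional" means in practice: each sales fact is described by several business dimensions, and analysis consists of summarizing the facts along whichever dimensions are of interest. The sales figures and the function name `roll_up` are invented; real warehouses use dedicated online analytical processing tools rather than a loop like this.

```python
from collections import defaultdict

# Invented historical sales facts with three dimensions.
sales = [
    {"product": "nuts",  "region": "East", "year": 2012, "amount": 50},
    {"product": "nuts",  "region": "West", "year": 2012, "amount": 60},
    {"product": "bolts", "region": "East", "year": 2012, "amount": 40},
    {"product": "nuts",  "region": "East", "year": 2013, "amount": 70},
    {"product": "bolts", "region": "West", "year": 2013, "amount": 30},
]


def roll_up(facts, *dimensions):
    """Total the sales amount for each combination of the given dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact["amount"]
    return dict(totals)


by_product = roll_up(sales, "product")
by_region_year = roll_up(sales, "region", "year")

print(by_product)                      # {('nuts',): 180, ('bolts',): 70}
print(by_region_year[("East", 2012)])  # 90
```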

40 DATA WAREHOUSE FRAMEWORK & VIEWS

41 End users can access data quickly and easily via Web browsers because the data are located in one place. End users can conduct extensive analysis with data in ways that may not have been possible before. End users have a consolidated view of organizational data. BENEFITS OF DATA WAREHOUSING

42 Knowledge management (KM); knowledge; intellectual capital (or intellectual assets). 5.5 KNOWLEDGE MANAGEMENT © Peter Eggermann/Age Fotostock America, Inc.

43 KNOWLEDGE MANAGEMENT (CONTINUED) Tacit knowledge (below the waterline); explicit knowledge (above the waterline). © Ina Penning/Age Fotostock America, Inc.

44 Knowledge management systems (KMSs); best practices. KNOWLEDGE MANAGEMENT (CONTINUED) © Peter Eggermann/Age Fotostock America, Inc.

45 Create knowledge; capture knowledge; refine knowledge; store knowledge; manage knowledge; disseminate knowledge. KNOWLEDGE MANAGEMENT SYSTEM CYCLE

47 HOMEWORK Answer the questions in the closing case «Can Organizations Have Too Much Data».

