Composed by DUONG TO DUNG, FEB 2019

1 MIS COURSE: CHAPTER 6
FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT

2 CONTENT
+ What are the problems of managing data resources in a traditional environment?
+ What are the major capabilities of database management systems (DBMS) and why is a relational DBMS so powerful?
+ What are the principal tools and technologies for accessing information from databases to improve business performance and decision making?
+ Why are information policy, data administration, and data quality assurance essential for managing the firm’s data resources?

3 What are the problems of managing data resources in a traditional environment?
PROBLEMS OF MANAGING DATA RESOURCES IN A TRADITIONAL FILE ENVIRONMENT: how to search, sort, and find data, and how to apply conditions when searching.
FILE ORGANIZATION TERMS AND CONCEPTS (p.242)
+ Bit: a single binary digit, 1 or 0
+ Byte: a group of bits, representing a single character
+ Field: a group of characters forming a word, a group of words, or a number, expressing one item of data
+ Record: a group of related fields, such as name, age, gender, height
+ Entity: a person, a place, an event, or a thing about which data are stored
+ Attribute: a characteristic of an entity

4 What are the problems of managing data resources in a traditional environment?
Data redundancy is the presence of duplicate data in multiple data files. It leads to data inconsistency, because the same data can have different values or meanings in different files.
Program-data dependence is the tight coupling between data stored in files and the specific programs required to update and maintain those files. This dependence is inefficient: when a common piece of data changes, such as the size of the zip code field, many programs must be changed.
Lack of flexibility means that it is very difficult to create new reports from the data when needed. Ad hoc reports are impractical to generate; a new report could require several weeks of work by more than one programmer and the creation of intermediate files to combine data from disparate files.
Poor security results from the lack of control over data.
Data sharing is virtually impossible because the data are distributed across so many different files around the organization.

5 What are the major capabilities of DBMS and why is a relational DBMS so powerful?
A database is a collection of data organized to service many applications efficiently by storing and managing data so that they appear to be in one location. It also minimizes redundant data.
A database management system (DBMS) is special software that permits an organization to centralize data, manage them efficiently, and provide access to the stored data by application programs.
A DBMS can reduce the complexity of the information systems environment, reduce data redundancy and inconsistency, eliminate data confusion, create program-data independence, reduce program development and maintenance costs, enhance flexibility, enable the ad hoc retrieval of information, improve access and availability of information, and allow for the centralized management of data, their use, and security.

6 What are the major capabilities of DBMS and why is a relational DBMS so powerful?
Relational DBMS: The relational database is the primary method for organizing and maintaining data in information systems. It organizes data in two-dimensional tables, called relations, made up of rows and columns. Each table contains data about an entity and its attributes. Each row represents a record and each column represents an attribute or field. Each table also contains a key field (the primary key) that uniquely identifies each record for retrieval or manipulation.
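As a minimal sketch of the key-field idea, the snippet below uses Python's built-in sqlite3 module. The SUPPLIER table, its column names, and the sample rows are made up for illustration and do not come from the slides. It shows that a relational DBMS refuses a second record with an existing key, so the key always identifies exactly one row.

```python
import sqlite3

# Hypothetical SUPPLIER table: names and sample data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE supplier (
        supplier_number INTEGER PRIMARY KEY,  -- key field: unique per record
        supplier_name   TEXT NOT NULL,        -- attribute
        supplier_city   TEXT                  -- attribute
    )
    """
)
conn.execute("INSERT INTO supplier VALUES (8259, 'Acme Parts', 'Dayton')")
conn.execute("INSERT INTO supplier VALUES (8263, 'Bolt Works', 'Cleveland')")


def insert_duplicate_key(connection):
    """Try to reuse an existing key; return True if the DBMS rejects it."""
    try:
        connection.execute(
            "INSERT INTO supplier VALUES (8259, 'Other Co.', 'Akron')"
        )
    except sqlite3.IntegrityError:
        return True
    return False


rejected = insert_duplicate_key(conn)

# The key retrieves exactly one record (one row of the table).
row = conn.execute(
    "SELECT supplier_name FROM supplier WHERE supplier_number = ?", (8259,)
).fetchone()
```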

7 Relational Database Tables

8 What are the major capabilities of DBMS and why is a relational DBMS so powerful?
In a relational database, three basic operations are used to develop useful sets of data: select, project, and join.
+ Select creates a subset consisting of all records (rows) in the table that meet stated criteria.
+ Join combines relational tables to provide the user with more information than is available in the individual tables.
+ Project creates a subset consisting of columns in a table, permitting the user to create new tables that contain only the information required.
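The three operations can be sketched in plain Python over lists of dictionaries, where each dictionary stands for one row. The function names, the PART and SUPPLIER sample data, and the column names are assumptions chosen for illustration; a real DBMS performs these operations internally, usually in response to SQL.

```python
def select(rows, predicate):
    """SELECT: keep only the rows that meet the stated criteria."""
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    """PROJECT: keep only the named columns of every row."""
    return [{col: row[col] for col in columns} for row in rows]


def join(left, right, key):
    """JOIN: combine rows from two tables that share the same key value."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Hypothetical sample tables.
parts = [
    {"part_number": 137, "part_name": "Door latch", "supplier_number": 8259},
    {"part_number": 150, "part_name": "Door seal", "supplier_number": 8263},
    {"part_number": 152, "part_name": "Compressor", "supplier_number": 8259},
]
suppliers = [
    {"supplier_number": 8259, "supplier_name": "Acme Parts"},
    {"supplier_number": 8263, "supplier_name": "Bolt Works"},
]

wanted = select(parts, lambda r: r["part_number"] in (137, 150))
joined = join(wanted, suppliers, "supplier_number")
report = project(joined, ["part_number", "part_name", "supplier_name"])
```

Chaining the three calls produces a small report table that neither original table contains on its own.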

9 What are the major capabilities of DBMS and why is a relational DBMS so powerful?
CAPABILITIES OF DBMS
+ Organizing, managing, and accessing the data in the DB.
+ Data definition: a DBMS has a data definition capability to specify the structure of the content of the DB. It is used to create DB tables and to define the characteristics of the fields in each table. This info. about the DB is documented in a data dictionary.
+ Data dictionary: an automated or manual file that stores definitions of data elements and their characteristics.
+ Querying and reporting: the most prominent data manipulation language today is SQL (Structured Query Language).
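The capabilities listed above can be illustrated with Python's built-in sqlite3 module. This is only a sketch: the CUSTOMER table and its rows are invented, and the helper data_dictionary is a hypothetical name. It relies on SQLite's PRAGMA table_info, which reports the stored definition of each field, as a stand-in for a data dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: specify the table structure and field characteristics.
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)


def data_dictionary(connection, table):
    """Return field definitions as recorded by the DBMS itself.

    PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    The table name is interpolated directly, so it must be trusted input.
    """
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    return {
        name: {"type": col_type, "key": bool(pk)}
        for _, name, col_type, _, _, pk in rows
    }


dictionary = data_dictionary(conn, "customer")

# Data manipulation: add records, then query them with SQL.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Ana", "Hanoi"), (2, "Binh", "Hue"), (3, "Chi", "Hanoi")],
)
names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customer WHERE city = ? ORDER BY name", ("Hanoi",)
    )
]
```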

10 What are the major capabilities of DBMS and why is a relational DBMS so powerful?
DESIGNING DATABASES To create a DB, you must understand the relationships among the data, the type of data that will be maintained in the DB, how the data will be used, and how the org. will need to change to manage data from a company-wide perspective. The DB requires both a conceptual design and a physical design. The conceptual, or logical, design of a DB is an abstract model of the DB from a business perspective; whereas the physical design shows how the DB is actually arranged on direct-access storage devices.

11 What are the major capabilities of DBMS and why is a relational DBMS so powerful?
NORMALIZATION AND ENTITY-RELATIONSHIP DIAGRAMS
+ NORMALIZATION: the process of creating small, stable, yet flexible and adaptive data structures from complex groups of data.
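A rough sketch of what normalization achieves, assuming a hypothetical flat ORDER file in which part and supplier details are repeated on every line. The field names and sample data are invented. The function splits the flat records into three smaller structures so that each fact is stored once.

```python
# Hypothetical unnormalized records: supplier and part details repeat.
flat_orders = [
    {"order_number": 1, "part_number": 137, "part_name": "Door latch",
     "supplier_number": 8259, "supplier_name": "Acme Parts"},
    {"order_number": 1, "part_number": 150, "part_name": "Door seal",
     "supplier_number": 8263, "supplier_name": "Bolt Works"},
    {"order_number": 2, "part_number": 137, "part_name": "Door latch",
     "supplier_number": 8259, "supplier_name": "Acme Parts"},
]


def normalize(rows):
    """Split flat rows into SUPPLIER, PART, and LINE_ITEM structures."""
    suppliers = {}   # supplier_number -> supplier_name
    parts = {}       # part_number -> part details, with a link to the supplier
    line_items = []  # (order_number, part_number) pairs
    for row in rows:
        suppliers[row["supplier_number"]] = row["supplier_name"]
        parts[row["part_number"]] = {
            "part_name": row["part_name"],
            "supplier_number": row["supplier_number"],
        }
        line_items.append((row["order_number"], row["part_number"]))
    return suppliers, parts, line_items


suppliers, parts, line_items = normalize(flat_orders)
```

After the split, renaming a supplier means changing one entry in suppliers rather than every order line that mentions it.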

12 What are the major capabilities of DBMS and why is a relational DBMS so powerful?
NORMALIZATION AND ENTITY-RELATIONSHIP DIAGRAMS + ENTITY-RELATIONSHIP DIAGRAMS

13 What are the major capabilities of DBMS and why is a relational DBMS so powerful?
NON-RELATIONAL DB & DB IN THE CLOUD
Four main reasons explain the rise of non-relational databases: cloud computing, unprecedented data volumes, massive workloads for Web services, and the need to store new types of data.
These systems use more flexible data models and are designed for managing large data sets across distributed computing networks. They are easy to scale up and down based on computing needs. They can process structured and unstructured data captured from Web sites, social media, and graphics. Traditional relational databases are poorly suited to processing data from most of those sources.
Non-relational databases can also accelerate simple queries against large volumes of structured and unstructured data. There is no need to predefine a formal database structure, or to change that definition if new data are added later.
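The point about not predefining a structure can be sketched with a toy in-memory document store. DocumentStore and its methods are hypothetical names invented for this example and do not reflect the API of any real NoSQL product. Each record is a free-form dictionary, so records of different shapes sit side by side.

```python
class DocumentStore:
    """Toy schema-less store: every document may have different fields."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert(self, document):
        """Store a copy of the document and return its generated id."""
        doc_id = self._next_id
        self._docs[doc_id] = dict(document)
        self._next_id += 1
        return doc_id

    def find(self, **criteria):
        """Return documents whose fields match all given criteria."""
        return [
            doc
            for doc in self._docs.values()
            if all(doc.get(k) == v for k, v in criteria.items())
        ]


store = DocumentStore()
# No table definition is needed before inserting differently shaped data.
store.insert({"type": "tweet", "text": "New store open!", "hashtags": ["sale"]})
store.insert({"type": "sensor", "temp_c": 21.5})
store.insert({"type": "tweet", "text": "Thanks for visiting"})

tweets = store.find(type="tweet")
readings = store.find(type="sensor")
```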

14 What are the major capabilities of DBMS and why is a relational DBMS so powerful?
NON-RELATIONAL DB & DB IN THE CLOUD
There are several different kinds of NoSQL DB, each with its own technical features and behavior.
+ Oracle NoSQL DB
+ Amazon’s SimpleDB: no need to predefine a formal DB structure
+ MongoDB
A NoSQL DB is able to use structured, semi-structured, and unstructured info. w/o requiring tedious, expensive, and time-consuming DB mapping.
Cloud DB: offered by Amazon, Oracle, and Microsoft. Pricing is based on usage.

15 What are the principal tools and technologies for accessing info. from DB to improve business performance and decision making?
CHALLENGES OF BIG DATA
Traditional databases rely on content neatly organized into rows and columns. Much of the data collected nowadays by companies don’t fit into that mold.
Big data describes datasets with volumes so huge that they are beyond the ability of a typical database management system to capture, store, and analyze. The term doesn’t refer to any specific quantity of data, but it’s usually measured in the petabyte and exabyte range. It includes structured and unstructured data captured from Web traffic, messages, and social media content such as tweets and status messages. It also includes machine-generated data from sensors.
Big data contains more patterns and interesting anomalies than smaller data sets. That creates the potential to gain new insights into customer behavior, weather patterns, financial market activity, and other phenomena.

16 What are the principal tools and technologies for accessing info. from DB to improve business performance and decision making?
BUSINESS INTELLIGENCE INFRASTRUCTURE
+ Data Warehouse (DW): a DB that stores current and historical data of potential interest to decision makers throughout the company. Data are extracted from multiple operational systems inside the org. and combined with data from external sources. Before being loaded into the DW, the data are transformed by correcting inaccurate and incomplete data and by restructuring them for management reporting and analysis. Once loaded, the data in the DW cannot be altered. The DW also provides a range of ad hoc and standardized query tools, analytical tools, and graphical reporting facilities.
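The extract, transform, and load sequence described for the data warehouse can be sketched as three small Python functions. Everything here is an assumption made for illustration: the two source systems, the field names, and the cleaning rules (tidy customer names, convert amounts to numbers, and skip records that cannot be repaired).

```python
def extract(*sources):
    """Pull records from several operational or external sources."""
    for source in sources:
        yield from source


def transform(records):
    """Correct what can be corrected and restructure for reporting."""
    cleaned = []
    for rec in records:
        name = (rec.get("customer") or "").strip().title()
        amount = rec.get("amount")
        if not name or amount is None:
            continue  # too incomplete to repair in this sketch
        cleaned.append({"customer": name, "amount": round(float(amount), 2)})
    return cleaned


def load(warehouse, records):
    """Append the transformed records to the warehouse."""
    warehouse.extend(records)
    return len(records)


# Hypothetical source systems with inconsistent formatting.
sales_system = [{"customer": " ana ", "amount": "10.50"},
                {"customer": "", "amount": "3"}]
web_orders = [{"customer": "BINH", "amount": 7},
              {"customer": "chi", "amount": None}]

warehouse = []
loaded = load(warehouse, transform(extract(sales_system, web_orders)))
```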

17 What are the principal tools and technologies for accessing info. from DB to improve business performance and decision making?
BUSINESS INTELLIGENCE INFRASTRUCTURE
+ Data Mart: a subset of a DW in which a summarized or highly focused portion of the org.’s data is placed in a separate DB for a specific population of users.
+ Hadoop: an open-source software framework that enables distributed parallel processing of huge amounts of data across inexpensive computers. The software breaks huge problems into smaller ones, processes each one on a distributed network of smaller computers, and then combines the results into a smaller data set that is easier to analyze. It uses non-relational database processing and can handle structured, semistructured, and unstructured data.
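The break-apart, process, and combine pattern attributed to Hadoop can be imitated on a single machine. The sketch below is not Hadoop code and uses none of its APIs; the function names and the word-count task are assumptions chosen for illustration. Here the chunks are processed one after another, whereas a real cluster would hand each chunk to a different computer.

```python
from collections import Counter


def split_into_chunks(records, chunk_size):
    """Break one large job into smaller, independent pieces."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def count_chunk(chunk):
    """Process a single piece: count the words in its lines."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def combine(partials):
    """Merge the partial results into one smaller summary data set."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


lines = ["big data", "Big rewards", "data data"]
chunks = split_into_chunks(lines, 2)
partials = [count_chunk(chunk) for chunk in chunks]
total = combine(partials)
```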

18 What are the principal tools and technologies for accessing info. from DB to improve business performance and decision making?
BUSINESS INTELLIGENCE INFRASTRUCTURE
+ In-memory Computing: rather than using disk-based database SW platforms, this technology relies primarily on a computer’s main memory for data storage. It eliminates bottlenecks that result from retrieving and reading data in a traditional disk-based DB and shortens query response times. Advances in contemporary computer HW technology make in-memory processing possible.
+ Analytic Platforms: these use both relational and non-relational technology that is optimized for analyzing large datasets. They feature preconfigured HW-SW systems designed for query processing and analytics.

19 CONTEMPORARY BUSINESS INTELLIGENCE INFRASTRUCTURE

20 What are the principal tools and technologies for accessing info. from DB to improve business performance and decision making?
ANALYTICAL TOOLS: RELATIONSHIPS, PATTERNS, AND TRENDS
+ OLAP (ONLINE ANALYTICAL PROCESSING)
A DW supports multidimensional data analysis, which enables users to view the same data in different ways using multiple dimensions. Each aspect of information represents a different dimension. OLAP represents relationships among data as a multidimensional structure, which can be visualized as cubes of data and cubes within cubes of data, enabling more sophisticated data analysis.
OLAP enables users to obtain online answers to ad hoc questions fairly rapidly, even when the data are stored in very large databases. OLAP and data mining enable the manipulation and analysis of large volumes of data from many perspectives (for example, sales by item, by department, by store, by region) in order to find patterns in the data. Such patterns are difficult to find with normal database methods, which is why OLAP and data mining are usually used together with a DW.
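Viewing the same data along different dimensions can be sketched as a single roll-up function over a list of sales facts. The sample facts, the dimension names, and the function roll_up are invented for illustration. Real OLAP tools precompute and store such aggregates rather than scanning the facts each time.

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions and one measure.
sales = [
    {"product": "Nuts", "region": "East", "quarter": "Q1", "units": 50},
    {"product": "Nuts", "region": "West", "quarter": "Q1", "units": 60},
    {"product": "Bolts", "region": "East", "quarter": "Q1", "units": 80},
    {"product": "Bolts", "region": "East", "quarter": "Q2", "units": 40},
]


def roll_up(facts, dimensions, measure="units"):
    """Total the measure for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact[measure]
    return dict(totals)


by_product = roll_up(sales, ["product"])
by_region = roll_up(sales, ["region"])
by_product_and_region = roll_up(sales, ["product", "region"])
```

Each call answers a different question from the same facts, which is the idea behind turning a cube to look at another face.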

21 EXAMPLE OF OLAP

22 What are the principal tools and technologies for accessing info. from DB to improve business performance and decision making?
ANALYTICAL TOOLS: RELATIONSHIPS, PATTERNS, AND TRENDS
+ DATA MINING
Data mining provides insights into corporate data that cannot be obtained with OLAP by finding hidden patterns and relationships in large databases and inferring rules from them to predict future behavior. The patterns and rules are used to guide decision making and forecast the effect of those decisions.
The types of information obtained from data mining include associations, sequences, classifications, clusters, and forecasts.
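Of the information types listed, associations are the simplest to sketch. The snippet below counts how often pairs of items appear together in hypothetical shopping baskets and reports each pair's support, meaning the share of baskets that contain both items. The baskets and the function name are assumptions; production data mining tools use far more scalable algorithms. The function assumes at least one basket.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Return the fraction of baskets in which each item pair occurs."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair has one canonical ordering.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


# Hypothetical purchase data.
baskets = [
    ["chips", "cola"],
    ["chips", "cola", "salsa"],
    ["cola", "bread"],
    ["chips", "salsa"],
]
support = pair_support(baskets)
```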

23 What are the principal tools and technologies for accessing info. from DB to improve business performance and decision making?
ANALYTICAL TOOLS: RELATIONSHIPS, PATTERNS, AND TRENDS
+ TEXT MINING, WEB MINING
Conventional data mining focuses on data that have been structured in DB and files. Text mining concentrates on finding patterns and trends in unstructured data contained in text files. The data may be in memos, call center transcripts, survey responses, legal cases, patent descriptions, and service reports. Text mining tools extract key elements from large unstructured data sets, discover patterns and relationships, and summarize the information.
Web mining helps businesses understand customer behavior, evaluate the effectiveness of a particular Web site, or quantify the success of a marketing campaign. Web mining looks for patterns in data through:
+ Web content mining
+ Web structure mining
+ Web usage mining

24 What are the principal tools and technologies for accessing info. from DB to improve business performance and decision making?
DATABASES & THE WEB
Conventional DB can be linked via middleware to the Web or a Web interface to facilitate user access to an org.’s internal data. Web browser SW on a client PC is used to access a corporate Web site over the Internet. The browser requests data from the org.’s DB by sending a request to the Web server. Because many back-end DB cannot interpret Web requests directly, the Web server passes these requests to special middleware SW, which translates them into SQL so that they can be processed by the DBMS working with the DB.
The DBMS receives the SQL requests and provides the required data. The middleware transfers the info. from the org.’s internal DB back to the Web server for delivery to the user in the form of a Web page. The SW working between the Web server and the DBMS can be an application server, a custom program, or a series of SW scripts.
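The middleware role can be sketched as one function that takes the parameters of a Web request, turns them into an SQL query, and wraps the DBMS's answer in a Web page. This is an assumption-laden illustration: handle_request, the CUSTOMER table, and the city parameter are invented, and SQLite stands in for the back-end DBMS. The query uses a placeholder so that request text is never pasted into the SQL itself.

```python
import sqlite3
from html import escape

# Hypothetical back-end database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Ana", "Hanoi"), ("Binh", "Hue"), ("Chi", "Hanoi")],
)


def handle_request(connection, params):
    """Translate request parameters into SQL and return an HTML page."""
    city = params.get("city", "")
    rows = connection.execute(
        "SELECT name FROM customer WHERE city = ? ORDER BY name", (city,)
    ).fetchall()
    items = "".join(f"<li>{escape(name)}</li>" for (name,) in rows)
    return f"<html><body><ul>{items}</ul></body></html>"


page = handle_request(conn, {"city": "Hanoi"})
# A malicious-looking value is treated as plain data and matches nothing.
empty_page = handle_request(conn, {"city": "' OR 1=1 --"})
```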

25 What are the principal tools and technologies for accessing info. from DB to improve business performance and decision making?
DATABASES & THE WEB

26 Why are info. policy, data administration, and data quality assurance essential for managing the firm’s data resources?
An information policy specifies the org.’s rules for sharing, disseminating, acquiring, standardizing, classifying, and inventorying info. It lays out specific procedures and accountabilities, identifying which users and organizational units can share info., where info. can be distributed, and who is responsible for updating and maintaining the information.
Data administration is responsible for the specific policies and procedures through which data can be managed as an organizational resource. These responsibilities include developing info. policy, planning for data, overseeing logical DB design and data dictionary development, and monitoring how IS specialists and end-user groups use data.
In large corporations, a formal data administration function is responsible for info. policy, as well as for data planning, data dictionary development, and monitoring data usage in the firm.

27 Why are info. policy, data administration, and data quality assurance essential for managing the firm’s data resources?
Data that are inaccurate, incomplete, or inconsistent create serious operational and financial problems for businesses. Firms must take special steps to make sure they have a high level of data quality. These include using enterprise-wide data standards, databases designed to minimize inconsistent and redundant data, data quality audits, and data cleansing software.
A data quality audit is a structured survey of the accuracy and level of completeness of the data in an IS. It can be performed by surveying entire data files, surveying samples from data files, or surveying end users for their perceptions of data quality.
Data cleansing consists of activities for detecting and correcting data in a DB that are incorrect, incomplete, improperly formatted, or redundant. Data cleansing not only corrects data but also enforces consistency among different sets of data that originated in separate IS.
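A data quality audit and a cleansing pass can be sketched over a handful of customer records. The record layout, the rule that a zip code must be exactly five digits, and both function names are assumptions made for this example. A real audit would apply the firm's own data standards.

```python
import re


def audit(records, required=("customer_id", "name", "zip_code")):
    """Count records that are incomplete, badly formatted, or redundant."""
    issues = {"incomplete": 0, "bad_format": 0, "duplicate": 0}
    seen = set()
    for rec in records:
        if any(rec.get(field) in (None, "") for field in required):
            issues["incomplete"] += 1
        elif not re.fullmatch(r"\d{5}", str(rec["zip_code"]).strip()):
            issues["bad_format"] += 1
        if rec.get("customer_id") in seen:
            issues["duplicate"] += 1
        seen.add(rec.get("customer_id"))
    return issues


def cleanse(records):
    """Drop redundant records and apply consistent formatting."""
    cleaned, seen = [], set()
    for rec in records:
        customer_id = rec.get("customer_id")
        if customer_id in seen:
            continue  # redundant copy of an earlier record
        seen.add(customer_id)
        cleaned.append({
            **rec,
            "name": (rec.get("name") or "").strip().title(),
            "zip_code": str(rec.get("zip_code") or "").strip(),
        })
    return cleaned


# Hypothetical records merged from two separate systems.
records = [
    {"customer_id": 1, "name": "ana tran", "zip_code": " 10001"},
    {"customer_id": 2, "name": "", "zip_code": "94105"},
    {"customer_id": 3, "name": "Chi Le", "zip_code": "9410"},
    {"customer_id": 1, "name": "Ana Tran", "zip_code": "10001"},
]
report = audit(records)
cleaned = cleanse(records)
```

The audit only measures the problems; the cleansing step changes the data, which is why the two are listed as separate activities.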

28 REVIEW SUMMARY
+ What are the problems of managing data resources in a traditional environment?
+ What are the major capabilities of database management systems (DBMS) and why is a relational DBMS so powerful?
+ What are the principal tools and technologies for accessing information from databases to improve business performance and decision making?
+ Why are information policy, data administration, and data quality assurance essential for managing the firm’s data resources?

29 REVIEW QUESTIONS
LET’S GO THROUGH THEM TOGETHER!
DOES BIG DATA BRING BIG REWARDS?
END!

30 What are the major capabilities of DBMS and why is a relational DBMS so powerful?
NON-RELATIONAL DB & DB IN THE CLOUD
Online analytical processing, or OLAP, is an approach to answering multi-dimensional analytical (MDA) queries swiftly in computing. OLAP is part of the broader category of business intelligence, which also encompasses relational databases, report writing, and data mining.

31 What are the major capabilities of DBMS and why is a relational DBMS so powerful?
NON-RELATIONAL DB & DB IN THE CLOUD

