Managing Data for Maximum Utility From Tables and Spreadsheets to Relational Databases and XML Caryn Anderson Simmons College Boston, MA - 22 April 2006.


1 Managing Data for Maximum Utility From Tables and Spreadsheets to Relational Databases and XML Caryn Anderson Simmons College Boston, MA - 22 April 2006

2 Managing Data for Maximum Utility What are Data? What matters when managing Data? What tools can you use? Where can you learn more?

3 Managing Data for Maximum Utility What are Data? What matters when managing Data? What tools can you use? Where can you learn more? - Data, Information, Knowledge, Intelligence - Data personality - Data Evolution Continuum - People - Preservation - Tables - Spreadsheets - Relational Databases - XML - Tools - Information Visualization

4 Managing Data for Maximum Utility Data, Information, Knowledge & Intelligence What are the distinctions? What are Data?

5 Managing Data for Maximum Utility Data, Information, Knowledge & Intelligence “Dictionaries define data as factual information (measurements or statistics) used as a basis for reasoning, discussion, or calculation; information as the communication or reception of knowledge or intelligence; knowledge as the condition of knowing something gained through experience or the condition of apprehending truth or fact through reasoning, and intelligence as the ability to understand and to apply knowledge.” (Bouthillier & Shearer, 2002) What are Data?

6 Managing Data for Maximum Utility Data, Information, Knowledge & Intelligence “Knowledge differs from information in that it is predictive and can be used to guide action while information merely is data in context. For example, if the raw data is –10 degrees, then information would be it is –10 degrees outside, and the knowledge would be that –10 degrees is cold and one must dress warmly. In other words, knowledge is closer to action while information could be seen as documentation of any of pieces of knowledge.” (Bouthillier & Shearer, 2002) Bouthillier, F., and Shearer, K. (2002, October). Understanding knowledge management and information management: the need for an empirical perspective. Information Research. 8(1). Retrieved April 17, 2006, from Information Research Web site: http://informationr.net/ir/8-1/paper141.html What are Data?

7 Managing Data for Maximum Utility Data Personality type – text, numbers, digital objects? … context – financial, scientific, personnel, health, inventory, operational, scheduling, reference, assessment, … confidentiality – choice, law, … volume – small, large, potential for growth, … What are Data?

8 Managing Data for Maximum Utility What are Data? Generation types source info confidentiality clean or dirty Data Evolution Continuum

9 Managing Data for Maximum Utility What are Data? Generation Storage types source info confidentiality clean or dirty space formats metadata deposit/input updates Data Evolution Continuum

10 Managing Data for Maximum Utility What are Data? Generation Storage Interaction types source info confidentiality clean or dirty space formats metadata deposit/input updates sharing context analysis meaning more data Data Evolution Continuum

11 Managing Data for Maximum Utility What are Data? Generation Storage Interaction Presentation types source info confidentiality clean or dirty space formats metadata deposit/input updates sharing context analysis meaning more data summative predictive dynamic Data Evolution Continuum

12 Managing Data for Maximum Utility What are Data? Generation Storage Interaction Presentation Data Information Knowledge measurements statistics facts add context meaning summarize predict help decisions spur action Data Evolution Continuum

13 Managing Data for Maximum Utility What are Data? Generation Storage Interaction Presentation Data Information Knowledge measurements statistics facts add context meaning summarize predict help decisions spur action often an iterative process Data Evolution Continuum

14 Managing Data for Maximum Utility What are Data? Generation Storage Interaction Presentation Data Information Knowledge measurements statistics facts add context meaning summarize predict help decisions spur action frequently requiring more data and/or more analysis Data Evolution Continuum

15 Managing Data for Maximum Utility What are Data? Generation Storage Interaction Presentation Data Information Knowledge measurements statistics facts add context meaning summarize predict help decisions spur action Data Evolution Continuum

16 Managing Data for Maximum Utility Frequently, poor data management decisions are the result of the exclusive consideration of: the types of data involved, and the tools that the person responsible is familiar with. “When the only tool you have is a hammer, you tend to treat everything as if it were a nail.” -Abraham Maslow What matters when managing Data?

17 Managing Data for Maximum Utility Two most important factors People Data should be easy to engage with and easy to understand by everyone that encounters it. Preservation The security, privacy and integrity risks increase as the handling of data increases. What matters when managing Data?

18 Managing Data for Maximum Utility People Who will enter/interact/view the data? affiliation level of personal investment intellectual competence technical competence What will they do with it? record analyze make decisions How will they access it? information retrieval information visualization What matters when managing Data?

19 Managing Data for Maximum Utility Preservation What are the data personalities? type context confidentiality volume Where do they hang out? home work/play wandering When is there the greatest risk? risk areas legal obligations risk agents What matters when managing Data?

20 Managing Data for Maximum Utility Tables Tables 101 Pros Cons Examples OVDLT Literature Review MLIP Cohort Meeting Schedule Leadership Model What tools can you use?

21 Managing Data for Maximum Utility Tables Pros easy to learn and use in word documents reasonable control of presentation some sorting allowed good for one-time only, non-duplicating, non-interacting data Cons duplication of data no calculations no re-use of data easily only simple alignment other… What tools can you use?

22 Managing Data for Maximum Utility Tables Examples OVDLT - Open Video Digital Library Toolkit Literature Review MLIP - Managerial Leadership in the Information Professions Cohort Meeting Schedule MLIP Leadership Model What tools can you use?

23 Managing Data for Maximum Utility Spreadsheets Spreadsheets 101 Pros Cons Examples NEASIS&T Registration Simmons ERM JMA reports What tools can you use?

24 Managing Data for Maximum Utility Spreadsheets Pros best for numbers and currency sophisticated sorting and calculating visualization of information (charts) can feed mail merge with word documents Cons duplication of data for multi-faceted relationships selection of portions of data complicated re-use of data requires complex, error-prone, manual formulation formulas contained within discrete cells – no calculation on the fly other… What tools can you use?

25 Managing Data for Maximum Utility Spreadsheets Examples NEASIS&T - New England chapter of the American Society of Information Science & Technology Event Registration Simmons ERM - Electronic Resource Management JMA - John More Association Reports What tools can you use?

26 Managing Data for Maximum Utility Relational Databases Databases 101 Pros Cons Examples I2S Network (Access) ERUS (and ERMI guidelines) Open Video Backpackit del.icio.us blogs What tools can you use?

27 Managing Data for Maximum Utility Database Development Clarify Visualize Specify Build Test Adjust What tools can you use?

28 Managing Data for Maximum Utility Database Development Clarify your thinking about data and user scenarios Visualize the relationships between the entities Specify the details of the entities and relationships Build the database according to specifications Test the database against user scenarios Adjust the structure, editing or reports functions What tools can you use?

29 Managing Data for Maximum Utility Clarify Introduction / History Collection Description Users User Activities (Needs) User Personas with specific use scenarios What tools can you use?

30 Managing Data for Maximum Utility Visualize Entities attributes Relationships one to one one to many many to many recursive What tools can you use?
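The four relationship types on this slide can be sketched as a small schema. This is only an illustration using Python's built-in sqlite3 module; the tables (organization, person, badge, event, registration) and the sample rows are hypothetical and do not come from the presentation.

```python
import sqlite3

# Hypothetical entities for illustration only; none of these names come from the slides.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE organization (
    org_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL
);
-- one to many: an organization has many people, each person has one organization
-- recursive: a person may report to another person in the same table
CREATE TABLE person (
    person_id  INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    org_id     INTEGER REFERENCES organization(org_id),
    manager_id INTEGER REFERENCES person(person_id)
);
-- one to one: the UNIQUE foreign key allows at most one badge per person
CREATE TABLE badge (
    badge_id  INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL UNIQUE REFERENCES person(person_id)
);
CREATE TABLE event (
    event_id INTEGER PRIMARY KEY,
    title    TEXT NOT NULL
);
-- many to many: a junction table links people and events
CREATE TABLE registration (
    person_id INTEGER NOT NULL REFERENCES person(person_id),
    event_id  INTEGER NOT NULL REFERENCES event(event_id),
    PRIMARY KEY (person_id, event_id)
);
""")

conn.execute("INSERT INTO organization VALUES (1, 'Example College')")
conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?)",
                 [(1, "Ana", 1, None), (2, "Ben", 1, 1)])
conn.execute("INSERT INTO badge VALUES (1, 1)")
conn.executemany("INSERT INTO event VALUES (?, ?)",
                 [(1, "XML Workshop"), (2, "RSS Clinic")])
conn.executemany("INSERT INTO registration VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1)])

# Many to many: who attends which event?
attendees = conn.execute("""
    SELECT e.title, p.name
    FROM registration r
    JOIN person p ON p.person_id = r.person_id
    JOIN event  e ON e.event_id  = r.event_id
    ORDER BY e.title, p.name
""").fetchall()

# Recursive: who reports to whom?
managers = conn.execute("""
    SELECT p.name, m.name
    FROM person p JOIN person m ON m.person_id = p.manager_id
""").fetchall()

print(attendees)
print(managers)
```

The junction table is the usual way to store a many to many relationship without duplicating person or event details, which is the duplication problem the tables and spreadsheets slides list as a con.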

31 Managing Data for Maximum Utility Specify Data Dictionary Relational Schema Attributes Relationships What tools can you use?
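One way to keep a data dictionary and the relational schema in step is to write the dictionary down as data and generate the table definition from it. The sketch below assumes a single hypothetical "person" entity; the Attribute class, the field names and the create_table_sql helper are all invented for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """One data dictionary entry (hypothetical structure)."""
    name: str
    sql_type: str
    description: str
    required: bool = True


# Hypothetical data dictionary for one entity.
PERSON = [
    Attribute("person_id", "INTEGER PRIMARY KEY", "Surrogate key for the person"),
    Attribute("name", "TEXT", "Full name as it should appear on reports"),
    Attribute("email", "TEXT", "Preferred contact address", required=False),
]


def create_table_sql(table, attributes):
    """Turn data dictionary entries into a CREATE TABLE statement."""
    columns = []
    for attr in attributes:
        is_key = "PRIMARY KEY" in attr.sql_type
        null_rule = " NOT NULL" if attr.required and not is_key else ""
        columns.append(f"    {attr.name} {attr.sql_type}{null_rule}")
    return f"CREATE TABLE {table} (\n" + ",\n".join(columns) + "\n);"


ddl = create_table_sql("person", PERSON)
print(ddl)

# Running the statement against an in-memory database checks that it is valid SQL.
sqlite3.connect(":memory:").execute(ddl)
```

The description field is not used in the generated SQL; it is there so the same list can also be printed as documentation for the people who will enter or read the data.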

32 Managing Data for Maximum Utility Build Select a data management tool Build according to specs Test Test database against user scenarios Adjust Adjust structures, interfaces and/or reporting functions What tools can you use?

33 Managing Data for Maximum Utility Relational Databases Pros handles complex multi-faceted relationships analysis easily customized easy selection of sub-groups of data scalable for large amounts of data accessible from web interfaces (though with some work) Cons learning curve steeper than for tables and spreadsheets difficult to see all data at a glance if not web-accessible, partners must have same application other… What tools can you use?

34 Managing Data for Maximum Utility Relational Databases Examples I2S - Integration and Information Sciences Network (Access) ERUS - Electronic Resource Usage Statistics Open Video Backpackit del.icio.us Blogs What tools can you use?

35 Managing Data for Maximum Utility XML XML 101 Pros Cons Examples OAI PMH EAD / MODS ICISC RSS SUSHI Bloglines What tools can you use?

36 Managing Data for Maximum Utility My First XML Chapter 1: Introduction to XML What is HTML? What is XML? Chapter 2: XML Syntax Elements must have a closing tag Elements must be properly nested What tools can you use? original slide content courtesy of Shaoping Moss The Display of the Document

37 Managing Data for Maximum Utility … My First XML Introduction to XML What is HTML? What is XML? XML Syntax Elements must have a closing tag. Elements must be properly nested. … What tools can you use? An HTML Document An HTML document describes the book: original slide content courtesy of Shaoping Moss

38 Managing Data for Maximum Utility … My First XML Introduction to XML What is HTML? What is XML? XML Syntax Elements must have a closing tag. Elements must be properly nested. … What tools can you use? An XML Document An XML document describes the book: original slide content courtesy of Shaoping Moss

39 Managing Data for Maximum Utility What tools can you use? HTML Elements/Tags An HTML document describes the book: … My First XML Introduction to XML What is HTML? What is XML? XML Syntax Elements must have a closing tag. Elements must be properly nested. … HTML elements/tags are: defined by the HTML standard; always the same; can be used in any order. original slide content courtesy of Shaoping Moss

40 Managing Data for Maximum Utility What tools can you use? XML Elements/Tags An XML document describes the book: … My First XML Introduction to XML What is HTML? What is XML? XML Syntax Elements must have a closing tag. Elements must be properly nested. … XML elements/tags are: defined by users/groups (DTD/Schema); different for each DTD/Schema; hierarchical (tree structure). original slide content courtesy of Shaoping Moss

41 Managing Data for Maximum Utility What tools can you use? XML is flexible and extensible An XML document describes the book for a different user group: … My First XML Introduction to XML What is HTML? What is XML? XML Syntax Element Rules Elements must have a closing tag. Elements must be properly nested. … Callouts: Instead of “book”. Extend to accommodate greater detail of “part”, “section” AND “paragraph”. original slide content courtesy of Shaoping Moss

42 Managing Data for Maximum Utility What tools can you use? original slide content courtesy of Shaoping Moss Differences between HTML and XML XML is not a replacement for HTML. XML and HTML were designed with different goals. - XML was designed to describe data and to focus on what data is. - HTML was designed to display data and to focus on how data looks. HTML structure and tags are very loose while XML structure and tags are strict: - XML documents must be well-formed. - XML elements must be properly nested. - All XML elements must be closed. - Tag names must be case consistent.
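The strictness rules listed on this slide can be checked mechanically, because an XML parser refuses any document that is not well-formed. A minimal sketch using Python's standard xml.etree.ElementTree module; the sample snippets and the is_well_formed helper are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical snippets, one per rule on the slide.
samples = {
    "well-formed":   "<book><title>My First XML</title></book>",
    "badly nested":  "<book><title>My First XML</book></title>",
    "unclosed":      "<book><title>My First XML</title>",
    "case mismatch": "<book><Title>My First XML</title></book>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Well-formedness is only the first check. Conforming to a particular DTD or Schema (validation) is a separate step that this sketch does not perform.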

43 Managing Data for Maximum Utility What tools can you use? original slide content courtesy of Shaoping Moss Differences between HTML and XML
Content - HTML: held in generic containers. XML: held in specific containers that describe what the data is.
Format - HTML: in the default format of the content tag, OR as defined by a Cascading Style Sheet (internal or external). XML: XSLT files define the formats of each section (e.g. font, color, size); multiple XSLTs for the same XML.
Selection & Organization - HTML: all content always included (no option to easily select or suppress content; you must manually change the document); content only displayed in the order written (to change the order you must manually change the document). XML: XSLT selects content and determines its order of display; multiple XSLTs for the same XML (one to produce just a book title list, one to display full text, one for citations, etc.).

44 Managing Data for Maximum Utility What tools can you use? original slide content courtesy of Shaoping Moss Differences between HTML and XML
Analogy - HTML: address list in a plain Word document. XML: address list in a database or mail merge data file.
What you can get - HTML: one document of your list of contacts with all the information that you have for each person, in the order you typed it. XML: Friends & Family with full addresses for holiday cards; e-mail list of just professional contacts for announcing a new product; special formatting of the whole list for better display on a PDA; etc., all from the SAME XML document.
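The address list analogy can be sketched in a few lines: one XML document, several outputs. The presentation does this with XSLT; the sketch below uses Python's standard xml.etree.ElementTree as a stand-in, and every element name, attribute and contact in it is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical contact data; the element and attribute names are assumptions.
CONTACTS_XML = """
<contacts>
  <contact group="family">
    <name>Pat Example</name>
    <email>pat@example.org</email>
    <address>1 Sample St, Boston, MA</address>
  </contact>
  <contact group="professional">
    <name>Lee Sample</name>
    <email>lee@example.com</email>
    <address>2 Test Ave, Boston, MA</address>
  </contact>
</contacts>
"""

root = ET.fromstring(CONTACTS_XML)


def holiday_labels(contacts):
    """View 1: name plus full address for family contacts."""
    family = contacts.findall("contact[@group='family']")
    return [f"{c.findtext('name')}\n{c.findtext('address')}" for c in family]


def professional_emails(contacts):
    """View 2: e-mail addresses of professional contacts only."""
    professional = contacts.findall("contact[@group='professional']")
    return [c.findtext("email") for c in professional]


print(holiday_labels(root))
print(professional_emails(root))
```

Neither view changes the source document, which is the point of the analogy: selection and ordering live outside the data.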

45 Managing Data for Maximum Utility What tools can you use? original slide content courtesy of Shaoping Moss How to Build an XML file family 1. Establish the Document Type Definition (DTD) or Schema 2. Write a well-formed XML document that holds your data in the containers established by your DTD/Schema 3. Validate your XML document to make sure you conformed to your DTD/Schema 4. Build as many different XSL documents as you need to select data from your XML file, organize it the way you want it to appear, and format it so it looks the way you want. Now you can link your XML file to whatever XSL you want to get the kind of display you want at any given time.

46 Managing Data for Maximum Utility What tools can you use? The XML family unit of files and languages
DTD or Schema (file types: .dtd, .xml for schemas): the organizational chart for the data; used for validation during creation.
XML (file type: .xml): where the data is held.
XSL (file type: .xsl): instructions for using XML data and displaying it. Languages used in XSLT documents during creation: XSLT to select data from the .xml file and format it; XPath to access certain spots in the .xml file; XSL-FO for specifying formatting semantics (?); HTML for formatting.
Web page (http://www.mysite.org/myfile.xml): 1. Calls the .xml file 2. Calls .xsl for display instructions 3. Looks in .xml for content 4. Returns content to .xsl 5. Displays content to browser

47 Managing Data for Maximum Utility What tools can you use? original slide content courtesy of Shaoping Moss The DTD or Schema The DTD establishes the hierarchy of elements/tags. Example declaration: <!ELEMENT booktitle (#PCDATA)> In a content model, + means one or more: there can be as many of this element as you want, but at least one.

48 Managing Data for Maximum Utility What tools can you use? original slide content courtesy of Shaoping Moss The XML Document
Callouts on the document header: This is what DTD is being used. This is what XSL is being used.
Book 1: HTML and XHTML: The Definitive Guide; Chuck Musciano, Bill Kennedy; USA; O’Reilly; 19.95; 2000
Book 2: XHTML 1.0 Language Sourcebook; Ian S. Graham; USA; John Wiley and Sons; 30.00; 2000
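The markup of the slide's XML document did not survive in this transcript, so the following is only a sketch of how the two book records could be held in XML. The names "booklist", "book" and "booktitle" appear elsewhere in the deck; "author", "country", "publisher", "price" and "year" are assumptions, as is the build_booklist helper.

```python
import xml.etree.ElementTree as ET

# Book data as listed on the slide; the field names other than
# "booktitle" are assumed, not taken from the original DTD.
BOOKS = [
    {
        "booktitle": "HTML and XHTML: The Definitive Guide",
        "author": ["Chuck Musciano", "Bill Kennedy"],
        "country": "USA",
        "publisher": "O'Reilly",
        "price": "19.95",
        "year": "2000",
    },
    {
        "booktitle": "XHTML 1.0 Language Sourcebook",
        "author": ["Ian S. Graham"],
        "country": "USA",
        "publisher": "John Wiley and Sons",
        "price": "30.00",
        "year": "2000",
    },
]


def build_booklist(books):
    """Build a <booklist> tree with one <book> element per record."""
    root = ET.Element("booklist")
    for record in books:
        book = ET.SubElement(root, "book")
        for field, value in record.items():
            # A repeatable field (e.g. several authors) becomes repeated elements.
            values = value if isinstance(value, list) else [value]
            for item in values:
                ET.SubElement(book, field).text = item
    return root


booklist = build_booklist(BOOKS)
xml_text = ET.tostring(booklist, encoding="unicode")
print(xml_text)
```

Generating the file from code rather than typing tags by hand means the closing and nesting rules are satisfied automatically. The output still needs a DTD reference in its header before it can be validated.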

49 Managing Data for Maximum Utility What tools can you use? Validate your XML file Upload your XML file to this validator: http://www.stg.brown.edu/service/xmlvalid/ You will either need to place your DTD on a web server so that the validator can find it (and put the right URL in the header of the XML), or you can put the DTD lines inside of your XML file (at the top). The validation service has a FAQ, but if you are getting stuck, it might be a good time for some remedial XML at W3Schools: http://www.w3schools.com/

50 Managing Data for Maximum Utility What tools can you use? The XSL Document
Template output: a page headed “My Book Collection” with a table whose columns are Title, Author, Publisher, Country and Price.
- “xsl:template” is XSLT for “use the template below”
- “match” is XPath for “link to” or “start with” and “/” means the root element (“booklist” in this case)
- “xsl:for-each” with the “select” instruction is XSLT for “select from each of the books in the booklist”
- “xsl:sort” with the “select” instruction is XSLT for “sort by publisher”
- “xsl:if” with the “test” instruction is XSLT for “only those books when the year is later than 1995”
- “xsl:value-of” with the “select” instruction is XSLT for “use the data from this element”
- The rest is basic HTML for the template
- You must close your XSLT commands
- You must close the HTML tags of your template
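The logic of the stylesheet (start at the root, loop over the books, keep those later than 1995, sort by publisher, pull out values) can be mirrored step by step in Python. This is an analogy, not XSLT: it uses the standard xml.etree.ElementTree module, the element names other than "booklist", "book" and "booktitle" are assumed, and the third book is a made-up placeholder added only so the year test has something to exclude.

```python
import xml.etree.ElementTree as ET
from html import escape

BOOKLIST_XML = """<booklist>
  <book>
    <booktitle>HTML and XHTML: The Definitive Guide</booktitle>
    <publisher>O'Reilly</publisher>
    <year>2000</year>
  </book>
  <book>
    <booktitle>XHTML 1.0 Language Sourcebook</booktitle>
    <publisher>John Wiley and Sons</publisher>
    <year>2000</year>
  </book>
  <book>
    <booktitle>Placeholder Older Title</booktitle>
    <publisher>Example Press</publisher>
    <year>1994</year>
  </book>
</booklist>"""

root = ET.fromstring(BOOKLIST_XML)                    # match="/": start at the root
books = root.findall("book")                          # xsl:for-each over each book
books = [b for b in books
         if int(b.findtext("year")) > 1995]           # xsl:if: year later than 1995
books.sort(key=lambda b: b.findtext("publisher"))     # xsl:sort by publisher

titles = [b.findtext("booktitle") for b in books]     # xsl:value-of

# Basic HTML for the template, with every tag closed.
rows = "".join(
    "<tr><td>{}</td><td>{}</td></tr>".format(
        escape(b.findtext("booktitle"), quote=False),
        escape(b.findtext("publisher"), quote=False),
    )
    for b in books
)
html_page = (
    "<html><body><h2>My Book Collection</h2>"
    "<table><tr><th>Title</th><th>Publisher</th></tr>" + rows + "</table>"
    "</body></html>"
)
print(html_page)
```

A real XSLT processor applies the same selection declaratively, so swapping the stylesheet changes the output without touching either the data or any program code.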

51 Managing Data for Maximum Utility What tools can you use? original slide content courtesy of Shaoping Moss The Web Page

52 Managing Data for Maximum Utility What tools can you use? original slide content courtesy of Shaoping Moss DONE! – not so hard… Logical Flexible Extensible Interoperable!!

53 Managing Data for Maximum Utility XML Pros enables easy sharing (particularly over http) – platform independent enables customized presentation highly structured for easy translation – data definitions included highly customizable within structure easy selection within subsets Cons calculations and analysis can be complicated steep learning curve parameters extremely strict (closing tags) requires translation by sharing partner other… What tools can you use?

54 Managing Data for Maximum Utility XML Examples OAI PMH – Open Archives Initiative Protocol for Metadata Harvesting EAD / MODS – Encoded Archival Description / Metadata Object Description Schema ICISC RSS – International Calendar of Information Science Conferences RSS feed SUSHI – Standardized Usage Statistics Harvesting Initiative Bloglines What tools can you use?

55 Managing Data for Maximum Utility What tools can you use? original slide content courtesy of Shaoping Moss Want to try some RSS? XML without the DTDs or XSLs Easy way to broadcast news & information Build an XML file, place on web server Use javascript RSS converters to quickly repurpose your own feed as content on your HTML web page.
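The slide suggests JavaScript converters for repurposing a feed on a web page. The same idea can be sketched in Python with the standard library: read the feed's items and emit an HTML list. The feed text, URLs and the feed_to_html helper below are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical RSS 2.0 feed used only for illustration.
FEED = """<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://example.org/</link>
    <description>Hypothetical feed used only for illustration.</description>
    <item>
      <title>First item</title>
      <link>http://example.org/1</link>
    </item>
    <item>
      <title>Second &amp; last item</title>
      <link>http://example.org/2</link>
    </item>
  </channel>
</rss>"""


def feed_to_html(feed_xml):
    """Turn the items of an RSS feed into an HTML heading plus a linked list."""
    channel = ET.fromstring(feed_xml).find("channel")
    lines = ["<h3>" + escape(channel.findtext("title")) + "</h3>", "<ul>"]
    for item in channel.findall("item"):
        link = escape(item.findtext("link"))
        title = escape(item.findtext("title"))
        lines.append('<li><a href="' + link + '">' + title + "</a></li>")
    lines.append("</ul>")
    return "\n".join(lines)


html_snippet = feed_to_html(FEED)
print(html_snippet)
```

Escaping the titles and links before writing them into HTML matters because feed text can contain characters such as & and <.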

56 Managing Data for Maximum Utility What tools can you use? RSS – Build the .xml file
Feed title: International Calendar of Information Science Conferences
Feed description: A collaboration between the International Information Issues SIG and the European and New England chapters of ASIS&T (American Society for Information Science and Technology), this master calendar of relevant conferences is offered to help connect and support like-minded professionals working on information problems around the world.
Feed link: http://icisc.neasist.org/
Item title: Conferences Added - Weeks Ending 7 April 2006
Item description, wrapped in <![CDATA[ … ]]>: September 2006 (3-6 September 2006) BRASIL, Rio de Janeiro: 22a. Conferencia Mundial de educacao a distancia do ICDE [22nd ICDE World Conference on Distance Education] (6-8 September 2006) NETHERLANDS, Leiden: Bridging the North-South Divide in Scholarly Communication on Africa (11-13 September 2006) EGYPT, Cairo: 5th International Internet Education Conference and Exhibition (18-22 September 2006) INDIA, Vadodara: Digital Libraries
Item link: http://icisc.neasist.org/quickcalendar.html#month
Annotations:
- By declaring the RSS version, you don’t need a DTD or XSL.
- Open the channel of communication.
- Title of your feed. Describe your feed. Link to your web page.
- Title of your entry. Direct link to this item on your web page.
- If you don’t use the CDATA tags, none of the HTML formatting inside will work (p, ul, strong, etc.)
- Always close your tags!
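A feed with the structure described on this slide can also be generated rather than typed. The sketch below uses Python's standard xml.dom.minidom, which can write CDATA sections. The titles and links come from the slide; the shortened feed description and the HTML inside the CDATA section are assumptions, since the original markup did not survive in this transcript.

```python
from xml.dom.minidom import Document


def text_element(doc, name, text):
    """Create <name>text</name>."""
    node = doc.createElement(name)
    node.appendChild(doc.createTextNode(text))
    return node


def build_feed():
    doc = Document()

    # Declaring the RSS version on the root element.
    rss = doc.createElement("rss")
    rss.setAttribute("version", "2.0")
    doc.appendChild(rss)

    # Open the channel, then give the feed a title, description and link.
    channel = doc.createElement("channel")
    rss.appendChild(channel)
    channel.appendChild(text_element(
        doc, "title", "International Calendar of Information Science Conferences"))
    channel.appendChild(text_element(
        doc, "description", "Master calendar of relevant conferences."))  # shortened
    channel.appendChild(text_element(doc, "link", "http://icisc.neasist.org/"))

    # One entry with its own title and direct link.
    item = doc.createElement("item")
    channel.appendChild(item)
    item.appendChild(text_element(
        doc, "title", "Conferences Added - Weeks Ending 7 April 2006"))
    item.appendChild(text_element(
        doc, "link", "http://icisc.neasist.org/quickcalendar.html#month"))

    # HTML formatting goes inside a CDATA section (markup here is assumed).
    description = doc.createElement("description")
    description.appendChild(doc.createCDATASection(
        "<p><strong>September 2006</strong></p>"
        "<ul><li>(18-22 September 2006) INDIA, Vadodara: Digital Libraries</li></ul>"
    ))
    item.appendChild(description)

    return doc.toxml()


feed_xml = build_feed()
print(feed_xml)
```

Because the document is built as a tree, every tag is closed automatically when it is serialized.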

57 Managing Data for Maximum Utility RSS – Making, Subscribing, Repurposing Some quick tutorials to get yourself going with RSS fast: Sullivan, Danny. (2003). “Making an RSS Feed.” SearchEngineWatch: http://searchenginewatch.com/sereport/article.php/2175271 Subscribing to RSS Feeds with feed readers or browsers: http://icisc.neasist.org/rssinstructions.html Use RSSExpress Lite to create javascript for the feed you want to use on your HTML page: http://rssxpress.ukoln.ac.uk/lite/include/ See the .xml file from the previous slide (the most current version): http://icisc.neasist.org/iciscfeed.xml See the .xml from the previous slide repurposed on an HTML page: http://www.utip.info/gsliscedata.html What tools can you use?

58 Managing Data for Maximum Utility EXAMPLES Now let's try it with your data! Discussion Problem-Solving Database Development Questions

59 Managing Data for Maximum Utility What are Data? Generation Storage Interaction Presentation types source info confidentiality clean or dirty space formats metadata deposit/input updates sharing context analysis meaning more data summative predictive dynamic Data Evolution Continuum

60 Managing Data for Maximum Utility What are Data? Generation Storage Interaction Presentation Data Information Knowledge measurements statistics facts add context meaning summarize predict help decisions spur action Data Evolution Continuum

61 Managing Data for Maximum Utility People Who will enter/interact/view the data? affiliation level of personal investment intellectual competence technical competence What will they do with it? record analyze make decisions How will they access it? information retrieval information visualization What matters when managing Data?

62 Managing Data for Maximum Utility Preservation What are the data personalities? type context confidentiality volume Where do they hang out? home work/play wandering When is there the greatest risk? risk areas legal obligations risk agents What matters when managing Data?

63 Managing Data for Maximum Utility Tools Spreadsheets Nelson, Stephen L. (2002). Excel Data Analysis for Dummies. For Dummies: http://www.dummies.com/WileyCDA/DummiesTitle/productCd-0764516612.html Relational Databases Greenspan, Jay. “Your First Database.” Webmonkey.com: http://www.webmonkey.com/webmonkey/backend/tutorials/tutorial3.html Merrall, Graeme. “MySQL/PHP tutorial.” Webmonkey.com: http://www.webmonkey.com/webmonkey/programming/php/tutorials/tutorial4.html Publicly Available Database applications. Association for Computing Machinery (SIG MOD - Mgt. of Data): http://www.sigmod.org/databaseSoftware/ Hernandez, Michael J. (2003). Database Design for Mere Mortals: A Hands-On Guide to Relational Database Design. Reading, MA: Addison-Wesley. Where can you learn more?

64 Managing Data for Maximum Utility Tools Relational Databases (Examples) ERUS: http://focus.ischool.utexas.edu/projects/erus/ Backpackit: http://www.backpackit.com Open Video: http://www.openvideo.org del.icio.us: http://del.icio.us Bloglines: http://www.bloglines.com Where can you learn more?

65 Managing Data for Maximum Utility Tools XML W3Schools – tutorials on all kinds of web-based languages and tools (an independent site, not affiliated with the World Wide Web Consortium): http://www.w3schools.com/ Library of Congress standards – descriptions and DTDs for MARCXML, MODS, MADS, EAD, etc.: http://www.loc.gov/standards/ (FYI – this stuff is a little hard to wade through to find what you need. Look for “tag library” in EAD and “elements and attributes” in MODS to get a sense of what the tags are.) OAI-PMH – tutorial and introduction from the Open Archives Initiative: http://www.oaforum.org/tutorial/index.php SUSHI: http://www.niso.org/committees/SUSHI/SUSHI_comm.html ViDe User’s Guide: Dublin Core Application Profile for Digital Video – video specific, but good XML examples: http://www.vide.net/workgroups/videoaccess/resources/vide_dc_userguide_20010909.pdf Where can you learn more?

66 Managing Data for Maximum Utility Tools RSS Sullivan, Danny. (2003). “Making an RSS Feed.” SearchEngineWatch: http://searchenginewatch.com/sereport/article.php/2175271 Subscribing to RSS Feeds with feed readers or browsers: http://icisc.neasist.org/rssinstructions.html Use RSSExpress Lite to create javascript for the feed you want to use on your HTML page: http://rssxpress.ukoln.ac.uk/lite/include/ Where can you learn more?

67 Managing Data for Maximum Utility Information Visualization Wikipedia – good overview of areas and some good references: http://en.wikipedia.org/wiki/Information_visualization Edward Tufte – site includes a variety of topical forums: http://www.edwardtufte.com/tufte/ Sparkline Wiki – PHP library; examples of use on the web: http://sparkline.wikispaces.com/Examples GIS: http://www.gis.com/ iDashboards: http://www.oracle.com/technology/partners/pdf/iDashboards%204.0%20Whitepaper.pdf OLIVE: On-Line Library of Information Visualization Environments – taxonomy of information visualization summarizing work of Ben Shneiderman and others: http://www.otal.umd.edu/Olive/ Where can you learn more?

68 Managing Data for Maximum Utility Other Reber, Elaine. “Creating Confidentiality Agreements that Protect Data and Privacy: The Challenge of Keeping up With Changes in State and Federal Regulations” (PPS): http://www.educause.edu/ir/library/powerpoint/SPC0551.pps Where can you learn more?

69 Managing Data for Maximum Utility Contact Information Caryn Anderson Program Coordinator Ph.D./Managerial Leadership in the Information Professions Simmons College GSLIS 300 The Fenway Boston, MA 02115 617.521.2829 caryn.anderson@simmons.edu

