
1 Midterm Examination

2 General Observations
The examination was too long!
Most people submitted by email.

3 Code of Conduct
• Computing is a collaborative activity. You are encouraged to work together, but...
• Some tasks may require individual work.
• Always give credit to your sources and collaborators.
Good professional practice: To make use of the expertise of others and to build on previous work, with proper attribution.
Unethical and academic plagiarism: To use the efforts of others without attribution.

4 Question 1
Consider the following scenario: [Maple Leaf Rag example]
(a) Using the IFLA definitions, what are the work(s), expression(s), manifestation(s) and item(s)? What are the genre(s)? Explain your choices.
(b) If you were including this material in a digital library, what digital objects would you use? What would be their structural types? Explain the reasoning behind your design.

5 IFLA Model
Work
A work is the underlying abstraction, e.g.,
• The Iliad
• The Computer Science departmental web site
• Beethoven's Fifth Symphony
• Unix operating system
• The 1996 U.S. census
This is roughly equivalent to the concept of "literary work" used in copyright law.

6 IFLA Model
Expression
A work is realized through an expression, e.g.,
• The Iliad has oral expressions and written expressions.
• A musical work has score and performance(s).
• Software has source code and machine code.
Many works have only a single expression, e.g., a web page or a book.

7 IFLA Model
Manifestation
An expression is given form in one or more manifestations, e.g.,
• The text of The Iliad has been manifest in numerous manuscripts and printed books.
• A musical performance can be distributed on CD, or broadcast on television.
• Software is manifest as files, which may be stored or transmitted in any digital medium.

8 IFLA Model
Item
When many copies are made of a manifestation, each is a separate item, e.g.,
• a specific copy of a book
• a specific copy of a computer file
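The four IFLA levels nest naturally, so they can be sketched as a small hierarchy of record types. This is only an illustrative sketch: the class and field names (`Work`, `Expression`, `Manifestation`, `Item`, `count_items`) and the sample copies are assumptions made for the example, not part of the IFLA model text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """One specific copy of a manifestation."""
    description: str


@dataclass
class Manifestation:
    """The form an expression is given, e.g. a printed book."""
    form: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A realization of a work, e.g. oral or written."""
    kind: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The underlying abstraction, e.g. The Iliad."""
    title: str
    expressions: List[Expression] = field(default_factory=list)


def count_items(work: Work) -> int:
    """Count every individual copy reachable from a work."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)


# Hypothetical sample data, loosely following the Iliad example.
iliad = Work("The Iliad", [
    Expression("oral"),
    Expression("written", [
        Manifestation("manuscript", [Item("a specific manuscript")]),
        Manifestation("printed book", [Item("copy 1"), Item("copy 2")]),
    ]),
])
```

Here `count_items(iliad)` walks work, expressions, manifestations and items in that order, which is the containment the slides describe.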

9 Structural Types
Genre: Describes category of content, e.g.,
• jazz, blues, rap, rock, ...
• painting, fresco, mural, ...
• operating system, compiler, interpreter, ...
Structural type: Describes structure of computer representation, e.g.,
• scanned image
• web page
• marked-up text
• digitized audio

10 Object Models
Digital object: An item as stored in a digital library, consisting of data, metadata, and an identifier.
Object model: The relationship between digital objects and the content that they represent.

11 Data Structure
[Figure: data structure of a digital object with identifier loc.ndlp/amrlp.13579. Labels: Identifier, Data, Metadata; page 1 gif, page 2 gif, page 3 gif; doc1; page map; object-md.]
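A digital object as defined above (identifier, data, metadata) might be sketched as follows. Only the identifier string and the idea of three page images with a page map come from the slide; the file names, the byte placeholders, the `page_map` layout and the `page` helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DigitalObject:
    identifier: str                       # name of the object in the library
    data: Dict[str, bytes]                # named byte streams (here: page images)
    metadata: Dict[str, object] = field(default_factory=dict)


def page(obj: DigitalObject, number: int) -> bytes:
    """Look up a page image through the page map held in the metadata."""
    file_name = obj.metadata["page_map"][number]
    return obj.data[file_name]


# Hypothetical contents: real objects would hold actual GIF bytes.
scanned_doc = DigitalObject(
    identifier="loc.ndlp/amrlp.13579",
    data={
        "page1.gif": b"<gif bytes for page 1>",
        "page2.gif": b"<gif bytes for page 2>",
        "page3.gif": b"<gif bytes for page 3>",
    },
    metadata={"page_map": {1: "page1.gif", 2: "page2.gif", 3: "page3.gif"}},
)
```

Keeping the page map in the metadata, rather than relying on file order, is one way to express the structural type "scanned image" without touching the image data itself.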

12 Question 2
A printed text document can be converted to digital formats by a choice of methods:
(i) digitization by scanning
(ii) digitization plus optical character recognition
(iii) retyping with SGML markup
(a) What are the advantages and disadvantages of each of these three methods?
(b) Under what circumstances would a user be unsatisfied with all three digital manifestations and want to use the original printed copy?

13 Part (a)
What impact does the method of conversion have on each of the following?
• Retaining the appearance of the document
• Manipulation and searching of the converted object
• Cost

14 Question 3
The diagram shows a system that can be used for reference linking between journal articles. [Diagram]
(a) In this system, describe the execution steps that occur at run time to resolve a reference link and obtain the required material.
(b) What is the problem of selective resolution? Suggest one way that this system might be enhanced to support selective resolution.

15 The General Model
[Diagram: Publisher, Reference database, Location database, Content, and Client. Caption: Publisher places information in databases.]

16 The General Model
[Diagram: same components. Labels: Citation, Identifiers.]

17 The General Model
[Diagram: same components. Labels: Identifier, URLs.]

18 The General Model
[Diagram: same components. Labels: URL, Content.]

19 The General Model
[Diagram: same components. Labels: Citation, Identifiers, URLs, Identifier, URL, Content.]
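Read in sequence, the diagram labels suggest three lookups at run time: citation to identifiers, identifier to URLs, URL to content. The sketch below assumes that reading and stands in for the two databases and the content store with in-memory dictionaries. The citation string, identifier, and URLs are invented for illustration.

```python
from typing import Optional

# Hypothetical stand-ins for the three services in the diagram.
REFERENCE_DB = {
    "Smith 1999, J. Example 12(3)": ["id:example/123"],
}
LOCATION_DB = {
    "id:example/123": [
        "http://publisher.example/123",
        "http://mirror.example/123",
    ],
}
CONTENT = {
    "http://mirror.example/123": "full text of the cited article",
}


def resolve_reference(citation: str) -> Optional[str]:
    """Follow citation -> identifiers -> URLs -> content; None if unresolved."""
    for identifier in REFERENCE_DB.get(citation, []):      # step 1: reference database
        for url in LOCATION_DB.get(identifier, []):        # step 2: location database
            if url in CONTENT:                             # step 3: fetch the content
                return CONTENT[url]
    return None
```

This version simply takes the first URL that answers. Choosing deliberately among several copies is the selective resolution problem raised in part (b).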

20 Resolution of Identifier
Choice of resolver (distributed resolution)
– Simple model: identifier determines resolver
Selection from multiple copies (selective resolution)
– Performance criteria
– Economic and related criteria
– User requirements
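One way to picture selective resolution is as a filter followed by a ranking over the candidate copies. The sketch below is an assumption-laden illustration, not the system in the diagram: the `Copy` fields, the `select_copy` function, and the particular policy (filter on format and cost, then pick the lowest latency) are all hypothetical choices standing in for the three kinds of criteria listed above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Copy:
    url: str
    latency_ms: float   # performance criterion
    cost: float         # economic criterion
    fmt: str            # user requirement, e.g. preferred format


def select_copy(copies: List[Copy], wanted_format: str, max_cost: float) -> Optional[Copy]:
    """Keep copies that meet the user's format and budget, then take the fastest."""
    eligible = [c for c in copies if c.fmt == wanted_format and c.cost <= max_cost]
    if not eligible:
        return None
    return min(eligible, key=lambda c: c.latency_ms)


# Hypothetical copies of one identified article.
copies = [
    Copy("http://a.example/doc.pdf", latency_ms=120.0, cost=0.0, fmt="pdf"),
    Copy("http://b.example/doc.pdf", latency_ms=40.0, cost=5.0, fmt="pdf"),
    Copy("http://c.example/doc.html", latency_ms=10.0, cost=0.0, fmt="html"),
]
```

With a budget of zero the free PDF is chosen even though it is slower; raising the budget lets the faster paid copy win.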

21 Question 4
(a) With MathML, what is the distinction between presentation markup and content markup?
(b) You are asked to represent the following expression in MathML:
$\int_{2}^{5} \frac{1 + x}{x^{2}}\,dx$
Give the content markup for this expression.
(c) Suppose that you have the representation of this expression in TeX. How would you link this to the MathML markup?

22 Part (b)
<cn> is a number
<ci> is an identifier
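As a rough illustration of content markup, the sketch below builds a MathML tree with Python's standard `xml.etree.ElementTree`. It assumes that the expression in part (b) reads as the integral from 2 to 5 of (1 + x)/x^2 with respect to x. The helper names (`cn`, `ci`, `wrap`, `apply_op`) are invented for the example, while the element names (`apply`, `int`, `bvar`, `lowlimit`, `uplimit`, `divide`, `plus`, `power`) are standard MathML content elements.

```python
import xml.etree.ElementTree as ET


def cn(value):
    """A number."""
    node = ET.Element("cn")
    node.text = str(value)
    return node


def ci(name):
    """An identifier."""
    node = ET.Element("ci")
    node.text = name
    return node


def wrap(tag, child):
    """Put one child inside a qualifier element such as bvar or lowlimit."""
    node = ET.Element(tag)
    node.append(child)
    return node


def apply_op(operator, *args):
    """<apply> with the operator as first child, followed by its arguments."""
    node = ET.Element("apply")
    ET.SubElement(node, operator)
    node.extend(args)
    return node


# Assumed integrand: (1 + x) / x^2
integrand = apply_op(
    "divide",
    apply_op("plus", cn(1), ci("x")),
    apply_op("power", ci("x"), cn(2)),
)
integral = apply_op(
    "int",
    wrap("bvar", ci("x")),       # variable of integration
    wrap("lowlimit", cn(2)),
    wrap("uplimit", cn(5)),
    integrand,
)

math_root = ET.Element("math", xmlns="http://www.w3.org/1998/Math/MathML")
math_root.append(integral)
markup = ET.tostring(math_root, encoding="unicode")
```

The tree records what is being computed (an integral of a quotient) and says nothing about how it should be drawn, which is the distinction part (a) asks about.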

23 Annotations
[Markup example. Labels: Content encoding, Presentation encoding.]

24 Annotations
[Markup example. Labels: Content encoding, TeX encoding.]
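MathML's `semantics` element can carry an expression together with alternative encodings in `annotation` children, which is one plausible answer to part (c). The sketch below assumes that mechanism and uses a deliberately small expression, 1 + x. The function name `with_tex_annotation` is hypothetical, and the attribute value `encoding="TeX"` is an assumption: other labels for TeX are also in use.

```python
import xml.etree.ElementTree as ET


def with_tex_annotation(content_markup: ET.Element, tex: str) -> ET.Element:
    """Wrap content markup in <semantics> and attach the TeX source as an annotation."""
    semantics = ET.Element("semantics")
    semantics.append(content_markup)                     # first child: the expression itself
    annotation = ET.SubElement(semantics, "annotation", encoding="TeX")
    annotation.text = tex                                # alternative encoding as plain text
    return semantics


# Content markup for the small illustrative expression 1 + x.
expr = ET.fromstring("<apply><plus/><cn>1</cn><ci>x</ci></apply>")
linked = with_tex_annotation(expr, "1 + x")
markup = ET.tostring(linked, encoding="unicode")
```

A presentation encoding could be attached the same way, using an `annotation-xml` child because its payload is markup rather than text.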

25 Question 5
(a) You have a query, Q, that you wish to search against a set of documents D1, D2, ..., Dn. Explain how vector space methods can be used to rank how closely the documents match the query.
(b) The query is: [query]
The set of documents is: [documents]
Calculate the ranking of these four documents against this query.

26 Vector Space Methods: Concept
n-dimensional space, where n is the total number of different words in the set of documents.
Each document is represented by a vector, with magnitude in each dimension equal to the number of times that the corresponding word appears in the document.
Similarity between two documents is measured by the angle between their vectors: the smaller the angle, the more similar the documents.

27 Example
D1 -> ant ant bee
D2 -> bee hog ant dog
D3 -> cat gnu dog eel fox

      ant  bee  cat  dog  eel  fox  gnu  hog  length
D1      2    1                                √5
D2      1    1         1                   1  √4
D3                1    1    1    1    1       √5
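The example can be checked with a few lines of code. The sketch below builds term-count vectors for D1, D2 and D3 and compares them by the cosine of the angle between them, a common way to turn "angle between vectors" into a score. The function names are chosen for illustration, and the choice of cosine is an assumption, since the slides name only the angle.

```python
import math
from collections import Counter

docs = {
    "D1": "ant ant bee",
    "D2": "bee hog ant dog",
    "D3": "cat gnu dog eel fox",
}


def term_vector(text: str) -> Counter:
    """Count how many times each word appears in a document."""
    return Counter(text.split())


def length(vector: Counter) -> float:
    """Euclidean length of a term-count vector."""
    return math.sqrt(sum(count * count for count in vector.values()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two vectors: 1 = same direction, 0 = no shared words."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    return dot / (length(a) * length(b))


vectors = {name: term_vector(text) for name, text in docs.items()}

for first, second in [("D1", "D2"), ("D1", "D3"), ("D2", "D3")]:
    print(first, second, round(cosine(vectors[first], vectors[second]), 3))
```

A query can be ranked against the documents in the same way by treating the query text as one more vector and sorting the documents by their cosine with it.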

