ATTRIBUTION
Abdul Saboor
Department of Computer Science, Software Engineering Research Group, Berlin, Germany
Welcome to the presentation
Slide 2: Main Agenda for the Presentation
AGSE MEQ 2013 Attribution of Open Data
- What are attribution and data attribution?
- The importance of attribution
- Why should attribution be used?
- Main reasons for data attribution
- Key elements of attribution
- Approaches used for attribution
- Issues and challenges in attribution
- Summary of the presentation
- References
Slide 3: What is Attribution?
Definitions
- Attribution often involves identifying the author or source of written material or a work of art.
- Attribution is the acknowledgement of the use of someone else's information, data or other work.
- Attribution is about crediting a copyright holder according to the terms of a copyright licence, usually for creative work such as music, fiction, video and photography.
- The act of attribution establishes a relationship between the users and the creator(s) of a work.
- Citing other authors means referring to them or their work.
Slide 4: Data Attribution?
Data attribution acknowledges the data creators and indicates the availability of the data.
- Granularity: which features inside the dataset are being referred to.
- Versioning: for dynamic or regularly updated data, which version needs to be attributed.
- Location of data: a persistent link such as a Digital Object Identifier (DOI).
- Acknowledging creators: ensure that credit is given to the authors who deserve it.
Slide 5: Data From Various Sectors
Adapted from Christine L. Borgman, UCLA, Developing Data Attribution and Citation Practices and Standards
Slide 6: Importance of Attribution – 1
Need for data attribution
- Social expectations/requirements
- Legal responsibility
- The identifier(s) must be specified
Purpose of data attribution
- Social practice
- Data reusability
- Reproducing research
- Replicating findings (facts and figures)
Usability of attributed objects
- Identify the form and content
- Interpret, evaluate, open, read, combine, describe, reuse, compute upon, annotate
Slide 7: Importance of Attribution – 2
Identity and Persistence of Digital Objects
Identity
- Identifier
  ✬ DOI, URI, URL
- Naming and namespaces
  ✬ Authors/creators – ORCID (FUB, TUB, ...), ISNI (people, legal entities, ...)
  ✬ Generic/specific – registry number ...
- Description
  ✬ Self-description
  ✬ Metadata augmentation description
Persistence
  ✬ Permanent
  ✬ Long-lived
  ✬ Scratch spaces
Slide 8: Importance of Attribution – 3
Discoverability
- Identify: establish the existence of data objects with specified characteristics (data creators, data creation date, data creation method, ...).
- Locate: depends on the description and representation of the data and on the tools and services available to search for data objects.
- Retrieve: a variety of approaches to discover and reach data descriptions via standard web protocols, such as Semantic Web technologies, web crawlers and search engines.
Provenance
- The chain of keeping and using data
- The transformations from the original state of the datasets
Relationships
- Identification of units
- The links between various units
- Actions on relationships
Slide 9: Importance of Attribution – 4
Intellectual Property
Policy for digital objects
- Whose policy? Data repositories, publishers, universities, investigators, funding agencies
- What rights are associated?
  ✬ Reuse
  ✬ Reproduce
  ✬ Attribute
- Who owns the data rights?
- How open are the data?
  ✬ Open Data
  ✬ Open bibliography
Types of policy
- What to release?
- What kind of description?
- What attribution?
- What citation?
- Who can describe, annotate, ...?
Slide 10: The Importance of Metadata
Metadata
- The main question is how to create durable links.
- Metadata play a prominent role: they are the documentation necessary to understand the data. Questionnaires, user guides, methodology descriptions and record layouts are also provided.
- Metadata are heterogeneous in format and are often the most unstructured data.
- The Data Documentation Initiative (DDI) aims to provide a structured metadata standard.
Adapted from Mary Vardigan, Inter-University Consortium for Political and Social Research (ICPSR)
Slide 11: Main Reasons for Data Attribution
Data quality
- Proper sources of datasets
- Accuracy or correctness of datasets
- Completeness of datasets
Previous work for verification and reuse
- Allowing others to access the underlying data
- Allowing researchers to check for mistakes and inconsistencies
Maintaining the research record
- Understanding what has been done before
- Attribution of existing work
- Understanding how a subject has changed over time
Slide 12: The Elements for Data Attribution - 1
Element | Description
Dataset Name | A specific name for each dataset, representing it for an organization, e.g. EU Coral Reef dataset.
Authors' Names and Contact Details | The name(s) of the data author(s) and their contact details, e.g. organization name and address, telephone number, e-mail address.
Data Description | An accurate description of the contents of the dataset.
Data Formats | The supported data formats, such as XML, RDF, N-Triples, Turtle, CSV, XLS.
Data Handling Rules | The data handling rules or policies that apply to the data and must be followed, such as Creative Commons CC0 1.0.
Data Access Methods | How the data can be accessed: via a URL or an API (SOAP or REST web service).
Slide 13: The Elements for Data Attribution - 2
Element | Description
Dataset Size | An estimate of the dataset size, e.g. less than 10 MB, or more than 100 MB.
Data Time Period | The time period that the data cover, e.g. 2005-2010.
Data Status | How often the dataset is updated: weekly, monthly or annually.
Data Factors | The names of the factors in the dataset, e.g. time, year, square metre.
Data Availability | Whether the data already exist and are available to users and, if not, how they will become available on the web.
Language of Data | Whether the data are available in one language or in several.
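The elements listed on slides 12 and 13 can be collected into one record per dataset. The following is a minimal Python sketch, not part of the original slides: the class name `AttributionRecord`, its field names, and all sample values (including the example.org URL and contact address) are illustrative assumptions.

```python
from dataclasses import dataclass, field, fields


@dataclass
class AttributionRecord:
    """One field per attribution element named on slides 12 and 13."""
    dataset_name: str = ""
    authors_and_contact: str = ""
    description: str = ""
    formats: list = field(default_factory=list)
    handling_rules: str = ""
    access_method: str = ""
    size: str = ""
    time_period: str = ""
    status: str = ""
    factors: list = field(default_factory=list)
    availability: str = ""
    languages: list = field(default_factory=list)

    def missing_elements(self):
        """Return the names of elements that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


record = AttributionRecord(
    dataset_name="EU Coral Reef dataset",
    authors_and_contact="Example Organization, data@example.org",  # hypothetical
    description="Hypothetical description of the dataset contents",
    formats=["csv", "rdf"],
    handling_rules="Creative Commons CC0 1.0",
    access_method="https://example.org/datasets/coral-reef",  # hypothetical URL
    time_period="2005-2010",
    status="updated monthly",
)
print(record.missing_elements())
```

A check such as `missing_elements()` is one possible way to spot an incomplete attribution before a dataset is published; which elements are mandatory is a policy decision that the slides leave open.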
Slide 14: Vocabularies that Support Attribution - 1
Element | Description
dcterms:creator | An entity primarily responsible for making the resource. This property can be used to obtain information about the creators of a data item.
dcterms:source | A related resource from which the described resource is derived. This makes it possible to create provenance elements that are associated, as source data, with a data creation element.
dcterms:modified | The date on which the resource was changed. The modification of a data item can be treated as a data creation that produces a new, modified version of the original item.
dcterms:publisher | An entity responsible for making the resource available. This property can be used to obtain information about the provider of an information resource when the actual provider remains uncertain.
dcterms:provenance | A statement of any changes in ownership and custody of the resource since its creation that are significant for its authenticity, integrity, etc.
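Because dcterms:source points from a derived item to the item it came from, the creators who should be credited can be gathered by following those links. The sketch below illustrates the idea with plain dictionaries; the `ex:` identifiers, the names and the dates are made up, and the function `creators_to_credit` is a hypothetical helper, not part of any Dublin Core tooling.

```python
# Metadata keyed by (hypothetical) dataset identifier; the property names are
# the DCMI terms from the table above, all values are made up for illustration.
METADATA = {
    "ex:reef-2010-cleaned": {
        "dcterms:creator": "Analyst B",
        "dcterms:source": "ex:reef-2010-raw",
        "dcterms:modified": "2011-03-01",
    },
    "ex:reef-2010-raw": {
        "dcterms:creator": "Survey Team A",
        "dcterms:publisher": "Example Repository",
    },
}


def creators_to_credit(item, metadata):
    """Follow dcterms:source links and collect every creator along the chain."""
    credited, seen = [], set()
    while item in metadata and item not in seen:
        seen.add(item)  # guards against circular source links
        creator = metadata[item].get("dcterms:creator")
        if creator and creator not in credited:
            credited.append(creator)
        item = metadata[item].get("dcterms:source")
    return credited


print(creators_to_credit("ex:reef-2010-cleaned", METADATA))
```

The sketch assumes a single source per item; a resource derived from several sources would need a list of dcterms:source values and a graph traversal instead of a simple chain.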
Slide 15: Vocabularies that Support Attribution - 2
- SIOC properties: sioc:has_creator, sioc:creator_of, sioc:has_modifier, sioc:modifier_of, sioc:has_owner, sioc:owner_of, sioc:earlier_version, sioc:later_version, sioc:next_version, sioc:previous_version
- The Friend of a Friend vocabulary (FOAF)
- The Semantic Web Publishing vocabulary (SWP)
- The Web of Trust vocabulary (WOT)
- The Ontology Metadata Vocabulary (OMV)
- The Changeset Vocabulary
Slide 16: Approaches for Attribution - 1
Several approaches are used to support the attribution of data.
Dublin Core Vocabulary
- Dublin Core (DC) provides a vocabulary for describing resources.
- DC relies on shared usage across different repositories and organizations.
- Distributed applications use DC terms to communicate about resources.
- Dublin Core consists of a core set of metadata elements and a set of qualifiers, which make it possible to interpret the elements semantically.
- In the context of attribution, a subset of the elements and qualifiers can be employed: for example, there are terms for the creator of a resource, for its publisher, and for the date of its publication.
A typical metadata statement consists of:
- an identifier for the resource being described
- a term from the Dublin Core vocabulary
- the annotation value
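The three-part metadata statement described on this slide maps naturally onto a triple. The following sketch is an illustration only: the `Statement` type, the `to_ntriples` helper, the choice of `issued` as the publication-date term, and the example.org identifiers are assumptions made for the example. The namespace `http://purl.org/dc/terms/` is the published DCMI terms namespace.

```python
from typing import NamedTuple

DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core terms namespace

# Subset of terms relevant to attribution (creator, publisher, publication date).
ATTRIBUTION_TERMS = {"creator", "publisher", "issued"}


class Statement(NamedTuple):
    resource: str   # identifier of the resource being described
    term: str       # local name of a Dublin Core term
    value: str      # the annotation value


def to_ntriples(stmt):
    """Render one statement as an N-Triples line with a plain literal value."""
    escaped = stmt.value.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{stmt.resource}> <{DCTERMS}{stmt.term}> "{escaped}" .'


statements = [
    Statement("https://example.org/dataset/1", "creator", "Survey Team A"),
    Statement("https://example.org/dataset/1", "issued", "2010-06-01"),
    Statement("https://example.org/dataset/1", "format", "text/csv"),
]

# Keep only the statements that belong to the attribution subset.
for s in statements:
    if s.term in ATTRIBUTION_TERMS:
        print(to_ntriples(s))
```

Only the creator and issued statements are printed; the format statement is valid Dublin Core but falls outside the subset chosen here for attribution.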
Slide 17: Approaches for Attribution - 2
Open Provenance Model
- The Open Provenance Model (OPM) describes the processes by which data are produced or transformed into a new state; it can represent the provenance of one or more data items from an old state to a new state.
- The OPM graph model describes provenance as a graph whose edges denote the primary relationships between occurrences, which are represented by nodes.
- An OPM graph explains how multiple events led to the production of some data and shows how one piece of data was derived from another.
OPM classifies nodes into three types:
- Artifacts: pieces of data with a fixed value and context that represent an entity in a given state.
- Processes: actions performed on artifacts in order to produce other artifacts.
- Agents: the entities that control the processes, such as users.
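A provenance graph of this kind can be sketched with a node table and an edge list. In the example below, the node names are hypothetical and the traversal helper `upstream` is an illustration, not an OPM API. The relation names `used`, `wasGeneratedBy` and `wasControlledBy` follow the edge types defined by OPM.

```python
# Node kinds follow the three OPM categories named on the slide.
NODES = {
    "raw_survey": "artifact",
    "clean_table": "artifact",
    "cleaning_run": "process",
    "analyst": "agent",
}

# Edges as (source, relation, target); relation names follow OPM usage,
# the node names are hypothetical.
EDGES = [
    ("cleaning_run", "used", "raw_survey"),
    ("clean_table", "wasGeneratedBy", "cleaning_run"),
    ("cleaning_run", "wasControlledBy", "analyst"),
]


def upstream(node, edges):
    """Return every node that `node` depends on, directly or indirectly."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for src, _relation, dst in edges:
            if src == current and dst not in found:
                found.add(dst)
                stack.append(dst)
    return found


deps = upstream("clean_table", EDGES)
print(sorted(deps))                                     # everything clean_table depends on
print(sorted(n for n in deps if NODES[n] == "agent"))   # agents to credit
```

Filtering the upstream nodes by kind gives the agents who controlled the processes behind an artifact, which is the information an attribution needs.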
Slide 18: Current Issues and Challenges in Attribution
Several issues need to be considered to make the attribution process better suited to tracking data. Data attribution is a key factor in the successful adoption of data sharing, so the following issues need to be addressed when implementing it.
Granularity
◎ A dataset can consist of several files; each file contains many tables, records and data points.
◎ Additional subsets, such as features and parameters, are also used.
◎ A practical solution is to list the dataset at whatever level of granularity the host repository has chosen for assigning identifiers.
◎ If the repository provides identifiers at several levels of granularity, the finest-grained level that fulfils the requirements of the attribution should be used.
Contributor Identifiers
◎ Each contributor should be uniquely identifiable: institutions assign each contributor a unique identifier, to be used in connection with data contributions.
Two schemes are used for attribution:
◎ The Open Researcher and Contributor ID (ORCID) is a scheme used specifically for academic authors.
◎ The International Standard Name Identifier (ISNI) is a standard for registering the public identities of parties, such as people and legal entities, involved in the creation or distribution of intellectual property.
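The granularity rule on this slide can be read as a small selection procedure. The sketch below is one possible interpretation, stated as an assumption: the attribution should use the finest level that has an identifier and still contains the part of the data that was actually used. The level names, the function `choose_identifier` and the identifiers are hypothetical.

```python
# Levels ordered from coarse to fine; the names are illustrative assumptions.
LEVELS = ["dataset", "file", "table", "record"]


def choose_identifier(identifiers, needed_level):
    """Pick the finest identified level that still contains the cited part.

    `identifiers` maps a level name to the identifier the repository assigned
    at that level; `needed_level` is the level of the part actually used.
    """
    limit = LEVELS.index(needed_level)
    # Walk from the needed level towards coarser levels until one has an identifier.
    for level in reversed(LEVELS[: limit + 1]):
        if level in identifiers:
            return level, identifiers[level]
    raise LookupError("repository assigns no identifier covering this level")


repo_ids = {"dataset": "id:ds-1", "file": "id:ds-1/f3"}  # hypothetical identifiers
print(choose_identifier(repo_ids, "table"))
```

Here the author used a single table, but the repository identifies only datasets and files, so the file-level identifier is chosen.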
Slide 19: Current Issues and Challenges in Attribution
Micro-attribution
◎ Contributors are credited in a more compact way in order to keep the process manageable.
◎ It is used to credit people or organizations whose contributions do not fit the roles of data creator or compiler.
◎ Standard identifiers for both contributors and contributions are used to abbreviate the entries, and a table is included in the document's supplementary data.
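The supplementary table mentioned above could be generated from identifier pairs. The following sketch is illustrative only: the identifier formats `contrib:…` and `item:…` and the function `micro_attribution_table` are assumptions, not a standard.

```python
# (contributor identifier, contribution identifier) pairs; all values hypothetical.
CONTRIBUTIONS = [
    ("contrib:0002", "item:17"),
    ("contrib:0001", "item:03"),
    ("contrib:0002", "item:21"),
]


def micro_attribution_table(pairs):
    """Group contribution identifiers by contributor and render a text table."""
    grouped = {}
    for contributor, contribution in pairs:
        grouped.setdefault(contributor, []).append(contribution)
    lines = ["Contributor    Contributions"]
    for contributor in sorted(grouped):
        items = ", ".join(sorted(grouped[contributor]))
        lines.append(f"{contributor:<15}{items}")
    return "\n".join(lines)


print(micro_attribution_table(CONTRIBUTIONS))
```

In practice the contributor column would hold identifiers from a scheme such as ORCID or ISNI, so that each row stays short while remaining unambiguous.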
Slide 20: Current Implementation Issues in Attribution
There are a couple of issues concerning data repositories:
Manual and Automatic Use of Attribution
◎ The URL given in a data attribution should lead to a landing page for the dataset rather than to a direct download.
◎ The landing page lets users confirm that they have located the right dataset. It also gives a more consistent user experience between datasets available through direct access and those available through referred access.
◎ Deep linking provides direct access to specific datasets through the hierarchical structure of the website.
◎ Data are also processed by software tools, which support the reader: they can download selectively with regard to versions and formats, select particular files or datasets, and avoid data with licence restrictions.
Versioning
◎ An important feature of an attribution system is that a reader can identify and retrieve exactly the same resource that the author used.
◎ Several versions may be available to choose from, since data from various stages of processing can be made available as different versions.
◎ Data repositories should ensure that different versions are attributed independently, each with its own identifier.
◎ Problems arise when repositories have to deal with rapidly changing datasets. Multiple versions can be managed through time slices and snapshots.
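The snapshot approach to versioning can be sketched as a lookup from an access date to the snapshot that was current on that date. The snapshot dates, the identifiers and the function `version_in_force` below are hypothetical; the slides do not prescribe a particular mechanism.

```python
from bisect import bisect_right
from datetime import date

# Each snapshot has its own identifier; dates and identifiers are hypothetical.
SNAPSHOTS = [
    (date(2012, 1, 1), "id:reef/v1"),
    (date(2012, 7, 1), "id:reef/v2"),
    (date(2013, 1, 1), "id:reef/v3"),
]


def version_in_force(snapshots, accessed_on):
    """Return the identifier of the latest snapshot taken on or before a date."""
    dates = [taken for taken, _ in snapshots]  # must be sorted ascending
    index = bisect_right(dates, accessed_on)
    if index == 0:
        raise LookupError("no snapshot existed on that date")
    return snapshots[index - 1][1]


print(version_in_force(SNAPSHOTS, date(2012, 9, 15)))
```

An author who accessed the data in September 2012 would cite the second snapshot, and a later reader resolving that identifier retrieves the same state of the data even after further updates.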
Slide 21: Conclusion
- Attribution is the process of giving credit to the original creator(s) of a dataset.
- Attribution helps to make the research process more transparent and authenticated.
- The attribution process maintains data quality and integrity, allows previous work to be verified and reused, and maintains a proper research record.
- Various elements are used to make an attribution, and several approaches are used to perform it.
- Various issues still need to be resolved to make attribution processes more convenient.
Slide 22: Thank you for your attention! Any questions?
Slide 23: References
1. Tony Rogers, Attribution Definition, How to use attribution in a news story. http://www.vocabulary.com/dictionary/attribution and http://journalism.about.com/od/writing/a/attribution.htm.
2. The Mind Wobbles, Attribution vs Citation: Do you know the difference? http://themindwobbles.wordpress.com/2009/07/10/attribution-vs-citation-do-you-know-the-difference/, July 2009.
3. Christine L. Borgman, Why are the attribution and citation of scientific data important? Report from Developing Data Attribution and Citation Practices and Standards: An International Symposium and Workshop, January 2012.
4. W3C Website, What is provenance? http://www.w3.org/2005/Incubator/prov/wiki/What_Is_Provenance, modified November 2010.
5. W3C Website, A Working Definition of Provenance. http://www.w3.org/2005/Incubator/prov/wiki/What_Is_Provenance#A_Working_Definition_of_Provenance, modified November 2010.
6. W3C Website, Provenance, Metadata, and Trust. http://www.w3.org/2005/Incubator/prov/wiki/What_Is_Provenance#Provenance.2C_Metadata.2C_and_Trust, modified November 2010.
7. Edzard Hofig, Jens Klessmann, Nils Barnickel (Fraunhofer), Open Innovation mechanism in Smart Cities, Revision: A, v1.6, July 2011.
8. Alex Ball and Monica Duke, How to Cite Datasets and Link to Publications, revised June 2012.
9. D.G. Campbell, The use of Dublin Core in web annotation programs. In Proceedings of the International Conference on Dublin Core and Metadata Applications, Florence, Italy, pp. 105-110, 2002.
10. Simon Miles, Mapping Attribution Metadata to the Open Provenance Model. Future Generation Computer Systems 27 (6), King's College London, UK, pp. 806-811, 2011.
11. Dublin Core Metadata Initiative Usage Board, DCMI Metadata Terms. http://dublincore.org/documents/dcmi-terms/, January 2008.
12. Olaf Hartig, Provenance information in the Web of Data, Humboldt University zu Berlin. In Proceedings of the 2nd Workshop on Linked Data on the Web (LDOW2009), April 2009.
13. D. Brickley and L. Miller, FOAF Vocabulary Specification. http://xmlns.com/foaf/spec/, November 2007.
14. U. Bojars and J. G. Breslin, SIOC Core Ontology Specification, Revision 1.30. http://rdfs.org/sioc/spec/, January 2009.
15. J. J. Carroll, C. Bizer, P. Hayes, and P. Stickler, Named Graphs, Provenance and Trust. In Proceedings of the 14th International World Wide Web Conference, ACM Press, pp. 613-622, May 2005.
16. D. Brickley, Web of Trust RDF Ontology. http://www.w3.org/tr/rdf-schema/, February 2004.
17. R. Palma, J. Hartmann, and P. Haase, OMV - Ontology Metadata Vocabulary for the Semantic Web, v2.4. http://omv2.sourceforge.net/, January 2008.
18. S. Tunnicliffe and I. Davis, Changeset Vocabulary. http://vocab.org/changeset/schema.html, March 2006.
19. Li Ding, James Michaelis, Jim McCusker, and Deborah L. McGuinness, Linked Provenance Data: A Semantic Web-based approach to interoperable workflow traces. Future Generation Computer Systems, Vol. 27, Elsevier, October 2010.
20. Y. Simmhan, B. Plale, and D. Gannon, A Survey of Data Provenance in e-Science. SIGMOD Record, Vol. 34, No. 3, pp. 31-36, ACM, Computer Science Department, Indiana University, September 2005.
21. P. Buneman, S. Khanna, and W. C. Tan, Data Provenance: Some Basic Issues. In Proceedings of the 20th Conference on Foundations of Software Technology and Theoretical Computer Science (FST TCS), pp. 87-93, Springer, December 2000.
22. M. Hausenblas, W. Slany, and D. Ayers, A Performance and Scalability Metric for Virtual RDF Graphs. In Proceedings of the 3rd Workshop on Scripting for the Semantic Web (SFSW) at ESWC, June 2007.