Realities in Science Data and Information - Let's go for translucency
AGU FM10 IN13B-02
Peter Fox (RPI), Tetherless World Constellation
And the reality?
It’s about the questions that are being asked, e.g.
–When was the last sensor calibration, who did it, why was it done, and where are the results?
–Exactly what physics routines went into this model run, and how do I know this is the actual output it generated (and that it has not been altered)?
The ecosystem?
These are what enable scientists or anyone to explore/confirm/deny their ‘hunches’ or get answers to direct questions…
–Accountability
–Proof
–Explanation
–Justification
–Verifiability
–‘Transparency’ (the illusion of it)
–Trust
–Provenance - Internal/External
–Identity
Why an illusion?
It’s not that the word transparency is wrong; it is what it is being used for:
–“If I let you see everything, you can get answers to your questions”
Nope, not unless you are very lucky…
It depends on
–Who is asking the question (and why)
–What the answer will be used for
–CONTEXT and ROLE (poorly represented)
But back to reality
–Fragmentation
–Disconnection
–Encapsulation
–Data as service
… all are bad for the questions that are being asked
So … translucency
See just what is necessary and sufficient
Practical definition
–As close as possible to the relevant data, information and knowledge artifacts, presented in an appropriate form
–Damn, yes, I mean curation
Methodological means
–Lenses (with filters, roles if possible)
–Bags
–Logic, i.e. rules
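A lens with a filter and roles can be sketched in a few lines. This is only an illustration under stated assumptions: the `Lens` class, the record fields, and the role names are all hypothetical and do not come from the talk or from any existing tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Lens:
    """Hypothetical lens: a record filter plus the fields each role may see."""
    keep: Callable[[dict], bool]
    fields_by_role: Dict[str, Set[str]]

    def view(self, records: List[dict], role: str) -> List[dict]:
        # Unknown roles see nothing rather than everything.
        visible = self.fields_by_role.get(role, set())
        if not visible:
            return []
        return [
            {key: value for key, value in record.items() if key in visible}
            for record in records
            if self.keep(record)
        ]


# Made-up records, for illustration only.
records = [
    {"kind": "calibration", "sensor": "S1", "when": "2010-11-02",
     "who": "operator-7", "raw_log": "..."},
    {"kind": "model_run", "sensor": None, "when": "2010-11-05",
     "who": "batch", "raw_log": "..."},
]

calibration_lens = Lens(
    keep=lambda record: record["kind"] == "calibration",
    fields_by_role={
        "scientist": {"sensor", "when", "who"},
        "public": {"sensor", "when"},
    },
)

scientist_view = calibration_lens.view(records, "scientist")
public_view = calibration_lens.view(records, "public")
```

The point of the sketch is that the same records yield different, smaller views depending on who is asking, which is the opposite of showing everything.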
Some of this is, er…
Provenance - Origin or source from which something comes, intention for use, who/what it was generated for, manner of manufacture, history of subsequent owners, sense of place and time of manufacture, production or discovery, documented in detail sufficient to allow reproducibility
Knowledge provenance: enrich with semantics (especially the relations between concepts previously isolated, and retaining context) and semantically-aware tools
Complexity (see IN43C-05)
And some …
Identity
–YOUR identity
–Friends, organizations
–Communities
–Peer and non-peer relations
Accountability
–By whom, to whom
–When and how often
Documentation – are you happy Ted?
We need a Knowledge Base
–Knowledge provenance
–Descriptions of the artifacts
–Domain-specific terms/language
[Diagram labels: Questions (Who; What/when/why/how); Answer]
Access Control
Essential for establishing trust
–Licensing
–Intellectual property
–Security/defence
–Endangered species
–Sensitive data/information
–Defining authorized access
Proof Markup Language (PML)
Justification
–Explanation
–Causality graph
Provenance
–Conclusion
–Source
–Engine
–Rule
Trust
–Trust/Belief metrics
[Diagram: chained NodeSets, each with a Justification and a Conclusion; other nodes are Engine, Rule, SourceUsage, Source, DateTime; relations are hasAntecedentList, hasSourceUsage, hasInferenceRule, hasInferenceEngine]
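The NodeSet/Justification chain in the diagram can be read as a small recursive data structure. The sketch below borrows the names shown on the slide but is not the PML schema: the Python classes, the string-valued fields, and the `explain` helper are assumptions made for illustration, and the example data is invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Justification:
    """One way a conclusion was reached (names follow the slide, loosely)."""
    inference_rule: str                  # cf. hasInferenceRule
    inference_engine: str                # cf. hasInferenceEngine
    antecedents: List["NodeSet"] = field(default_factory=list)  # cf. hasAntecedentList
    source_usage: Optional[str] = None   # cf. hasSourceUsage, simplified to a string


@dataclass
class NodeSet:
    """A conclusion together with its justifications."""
    conclusion: str
    justifications: List[Justification] = field(default_factory=list)


def explain(node: NodeSet, depth: int = 0) -> List[str]:
    """Walk the justification graph depth-first and return indented lines."""
    pad = "  " * depth
    lines = [f"{pad}{node.conclusion}"]
    for just in node.justifications:
        detail = f"{pad}  by {just.inference_rule} ({just.inference_engine})"
        if just.source_usage:
            detail += f", source: {just.source_usage}"
        lines.append(detail)
        for antecedent in just.antecedents:
            lines.extend(explain(antecedent, depth + 2))
    return lines


# Invented example: a calibrated product justified by raw counts.
raw = NodeSet(
    "raw sensor counts",
    [Justification("direct assertion", "instrument",
                   source_usage="sensor log, 2010-11-02")],
)
calibrated = NodeSet(
    "calibrated values",
    [Justification("apply calibration", "pipeline", antecedents=[raw])],
)
explanation = explain(calibrated)
```

Printing `"\n".join(explanation)` gives a readable trace from the conclusion back to its source, which is the kind of explanation the Justification module is meant to support.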
Open Provenance Model
Agents
–Catalyst and controlling entity of a process
Processes
–Action or series of actions performed resulting in new artifacts
Artifacts
–Immutable piece of state
Roles
–Non-semantic flat tags used to provide context in relations
[Diagram: Artifact, Process and Agent nodes linked by used(Role), wasGeneratedBy(Role), wasControlledBy(Role), wasDerivedFrom(Role), wasTriggeredBy(Role)]
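The five relations in the diagram each connect a fixed pair of node kinds, so a provenance graph can be checked as it is built. The following is a minimal sketch, not an OPM library: `ProvenanceGraph`, its methods, and the node names in the example are hypothetical, while the edge directions follow the Open Provenance Model (effect points to cause).

```python
from typing import Dict, List, Set, Tuple

# Edge type -> (kind of the effect node, kind of the cause node).
EDGE_TYPES: Dict[str, Tuple[str, str]] = {
    "used": ("process", "artifact"),
    "wasGeneratedBy": ("artifact", "process"),
    "wasControlledBy": ("process", "agent"),
    "wasTriggeredBy": ("process", "process"),
    "wasDerivedFrom": ("artifact", "artifact"),
}
NODE_KINDS = {"artifact", "process", "agent"}


class ProvenanceGraph:
    """Hypothetical in-memory graph that enforces the edge typing above."""

    def __init__(self) -> None:
        self.kinds: Dict[str, str] = {}
        # Each edge is (effect, edge_type, cause, role).
        self.edges: List[Tuple[str, str, str, str]] = []

    def add_node(self, name: str, kind: str) -> None:
        if kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.kinds[name] = kind

    def add_edge(self, effect: str, edge_type: str, cause: str, role: str = "") -> None:
        expected = EDGE_TYPES[edge_type]
        actual = (self.kinds[effect], self.kinds[cause])
        if actual != expected:
            raise ValueError(f"{edge_type} needs {expected}, got {actual}")
        self.edges.append((effect, edge_type, cause, role))

    def upstream(self, name: str) -> Set[str]:
        """Everything reachable by following edges from effect to cause."""
        seen: Set[str] = set()
        stack = [name]
        while stack:
            current = stack.pop()
            for effect, _, cause, _ in self.edges:
                if effect == current and cause not in seen:
                    seen.add(cause)
                    stack.append(cause)
        return seen


# Invented example: a technician controls a calibration step.
graph = ProvenanceGraph()
graph.add_node("raw", "artifact")
graph.add_node("calibrated", "artifact")
graph.add_node("calibrate", "process")
graph.add_node("technician", "agent")
graph.add_edge("calibrate", "used", "raw", role="input")
graph.add_edge("calibrated", "wasGeneratedBy", "calibrate", role="output")
graph.add_edge("calibrate", "wasControlledBy", "technician", role="operator")
lineage = graph.upstream("calibrated")
```

Here `lineage` answers "what and who is behind this artifact?" by collecting every cause reachable from it. Roles are kept as plain strings, matching the slide's description of them as flat tags.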
E.g. Knowledge Base – see Zednik et al. IN43C-06
My suggestion(s)
–Accommodation of dynamic content in an open (web) environment (distrust)
–Filter/lens models and implementations in tools/applications
–Declarative semantics to formalize the meaning/terms and relations - progress
–Rules to define the combinations of evidence required - starting
–“In their face” end-user modeling – getting real use cases for presentation of ‘facts’
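One way to picture "rules to define the combinations of evidence required" is a table of alternative evidence sets per question. This is a sketch under assumptions: the question names, evidence labels, and the `missing_evidence` function are invented for illustration and are not part of the talk or of any rule language.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical rules: each question lists alternative combinations of
# evidence, any one of which is treated as sufficient.
RULES: Dict[str, List[FrozenSet[str]]] = {
    "output_is_unaltered": [
        frozenset({"checksum", "run_log"}),
        frozenset({"digital_signature"}),
    ],
    "sensor_was_calibrated": [
        frozenset({"calibration_record", "operator_identity"}),
    ],
}


def missing_evidence(question: str, available: Set[str]) -> List[Set[str]]:
    """Return [] if some combination is satisfied, else what each one lacks."""
    gaps = [set(combo - available) for combo in RULES[question]]
    return [] if any(not gap for gap in gaps) else gaps


satisfied = missing_evidence("output_is_unaltered", {"digital_signature"})
unsatisfied = missing_evidence("output_is_unaltered", {"checksum"})
```

Returning the gaps rather than a bare yes/no lets a tool tell the user which extra evidence would settle the question.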