Published by Deshawn Boldrey. Modified over 3 years ago.

1
Selectivity Estimation Example
Mohammad Farhan Husain

2
Example Data

Subject   Predicate   Object
R1        P1          L1
R2        P1          L2
R3        P1          R4
R5        P1          R2
R6        P1          L3
R7        P2          L4
R8        P2          R1
R3        P2          L5

R1, R2, …, R8 are resources, i.e. URIs.
P1 and P2 are predicates, also URIs.
L1, L2, …, L5 are literals.

R = total number of unique resources = 8
T = total number of triples = 8
T_P1 = total number of triples having predicate P1 = 5
T_P2 = total number of triples having predicate P2 = 3

For any query:
Selectivity of a bound subject s = sel(s) = 1 / R = 1 / 8 = 0.125
Selectivity of predicate P1 = sel(P1) = T_P1 / T = 5 / 8 = 0.625
Selectivity of predicate P2 = sel(P2) = T_P2 / T = 3 / 8 = 0.375
Selectivity of an unbound subject, predicate, or object = 1.0
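The counts above can be re-derived mechanically from the eight triples. The following is a minimal Python sketch, not part of the original slides; it assumes, purely for this toy data, that a term is a resource when its label starts with "R", and all variable names are illustrative.

```python
from collections import Counter

# The eight example triples as (subject, predicate, object).
triples = [
    ("R1", "P1", "L1"), ("R2", "P1", "L2"), ("R3", "P1", "R4"),
    ("R5", "P1", "R2"), ("R6", "P1", "L3"), ("R7", "P2", "L4"),
    ("R8", "P2", "R1"), ("R3", "P2", "L5"),
]

# Assumption for this toy data only: labels starting with "R" are resources.
resources = {s for s, _, _ in triples} | {
    o for _, _, o in triples if o.startswith("R")
}

R = len(resources)                           # unique resources
T = len(triples)                             # total triples
T_p = Counter(p for _, p, _ in triples)      # triples per predicate

sel_bound_subject = 1 / R
sel_predicate = {p: count / T for p, count in T_p.items()}

print(R, T, dict(T_p))        # 8 8 {'P1': 5, 'P2': 3}
print(sel_bound_subject)      # 0.125
print(sel_predicate)          # {'P1': 0.625, 'P2': 0.375}
```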

3
Example Histogram for P1

Suppose there is a hash function which assigns the object values of triples having predicate P1 to two bins in the following manner:
Bin 1 contains: L1, L2 and R2
Bin 2 contains: R4 and L3

4
Example Histogram for P2

Suppose the same hash function assigns the object values of triples having predicate P2 to two bins in the following manner:
Bin 1 contains: L5
Bin 2 contains: L4 and R1
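The two histograms can be built by counting, per predicate, how many object values land in each bin. The slides do not specify the hash function, so the sketch below uses a hypothetical lookup table (BIN_OF) that simply reproduces the bin assignments stated for P1 and P2; it is an illustration, not the author's implementation.

```python
from collections import Counter, defaultdict

# The eight example triples as (subject, predicate, object).
triples = [
    ("R1", "P1", "L1"), ("R2", "P1", "L2"), ("R3", "P1", "R4"),
    ("R5", "P1", "R2"), ("R6", "P1", "L3"), ("R7", "P2", "L4"),
    ("R8", "P2", "R1"), ("R3", "P2", "L5"),
]

# Hypothetical stand-in for the unspecified hash function:
# object value -> bin number, copied from the two histogram slides.
BIN_OF = {
    "L1": 1, "L2": 1, "R2": 1, "R4": 2, "L3": 2,   # objects seen with P1
    "L5": 1, "L4": 2, "R1": 2,                     # objects seen with P2
}

# histograms[predicate][bin] = number of triples whose object falls in bin.
histograms = defaultdict(Counter)
for _, predicate, obj in triples:
    histograms[predicate][BIN_OF[obj]] += 1

for predicate in sorted(histograms):
    print(predicate, dict(sorted(histograms[predicate].items())))
# P1 {1: 3, 2: 2}
# P2 {1: 1, 2: 2}
```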

5
Estimation Approach – Base Equations

Equation                             Notes
sel(t) = sel(s) * sel(p) * sel(o)    t refers to a triple pattern
sel(s) = 1/R                         R: number of unique resources in the knowledge store
sel(p) = T_p/T                       T: total number of triples; T_p: number of triples matching predicate p
sel(o) = h_c(p, o_c)/T_p             where (p, o_c) represents the class of the histogram for predicate p in which object o falls
sel(?a) = 1                          when ?a is an unbound subject, predicate, or object

6
Selectivity Estimation for Triple Pattern
Example with Bound Predicate

Triple pattern: ?s P1 L2

Estimated selectivity
= sel(s) x sel(P1) x sel(L2)
= 1.0 x 0.625 x sel(P1, L2)
= 1.0 x 0.625 x (h_1(P1, L2) / T_P1)
= 1.0 x 0.625 x (Height of Bin 1 / T_P1)
= 1.0 x 0.625 x (3 / 5)
= 0.375

Here, h_1(P1, L2) denotes the height of the bin of the histogram of predicate P1 into which the hash function puts L2.
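The bound-predicate calculation can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the statistics are hard-coded from the example data, BIN_OF is a hypothetical lookup table standing in for the unspecified hash function, and the function name sel_bound_predicate is not from the slides.

```python
# Statistics hard-coded from the example data and histograms.
R = 8                                   # unique resources
T = 8                                   # total triples
T_P = {"P1": 5, "P2": 3}                # triples per predicate
HEIGHTS = {"P1": {1: 3, 2: 2},          # histogram bin heights per predicate
           "P2": {1: 1, 2: 2}}
# Hypothetical stand-in for the hash function: object value -> bin number.
BIN_OF = {"L1": 1, "L2": 1, "R2": 1, "R4": 2, "L3": 2,
          "L5": 1, "L4": 2, "R1": 2}


def is_variable(term):
    """A term such as ?s or ?o is unbound."""
    return term.startswith("?")


def sel_bound_predicate(s, p, o):
    """Estimate sel(t) = sel(s) * sel(p) * sel(o) for a bound predicate p."""
    if is_variable(p):
        raise ValueError("this sketch only handles a bound predicate")
    sel_s = 1.0 if is_variable(s) else 1 / R
    sel_p = T_P[p] / T
    sel_o = 1.0 if is_variable(o) else HEIGHTS[p][BIN_OF[o]] / T_P[p]
    return sel_s * sel_p * sel_o


print(round(sel_bound_predicate("?s", "P1", "L2"), 3))  # 0.375
```

A lookup table is used here only because the slides fix the bin assignments by example; a real store would apply its own hash function to the object value.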

7
Selectivity Estimation for Triple Pattern
Example with Unbound Predicate

Triple pattern: ?s ?p L2

Estimated selectivity
= sel(s) x sel(p) x sel(L2)
= 1.0 x 1.0 x {∑_{Pi ∈ P} sel(Pi, L2)}
= 1.0 x 1.0 x {sel(P1, L2) + sel(P2, L2)}
= 1.0 x 1.0 x {h_1(P1, L2) / T_P1 + h_1(P2, L2) / T_P2}
= 1.0 x 1.0 x {Height of Bin 1 of P1 Histogram / T_P1 + Height of Bin 1 of P2 Histogram / T_P2}
= 1.0 x 1.0 x {3 / 5 + 1 / 3}
≈ 0.933

Note that the hash function always puts the value L2 into Bin 1. That is why we pick the height of Bin 1 of the histogram for P2, even though no triple with predicate P2 has L2 as its object.
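The sum over predicates in the unbound-predicate case can be checked with a few lines of Python. As before, this is only a sketch: the statistics are hard-coded from the example, BIN_OF is a hypothetical lookup table in place of the unspecified hash function, and the function name is illustrative.

```python
# Statistics hard-coded from the example data and histograms.
T_P = {"P1": 5, "P2": 3}                # triples per predicate
HEIGHTS = {"P1": {1: 3, 2: 2},          # histogram bin heights per predicate
           "P2": {1: 1, 2: 2}}
# Hypothetical stand-in for the hash function: object value -> bin number.
BIN_OF = {"L1": 1, "L2": 1, "R2": 1, "R4": 2, "L3": 2,
          "L5": 1, "L4": 2, "R1": 2}


def sel_object_unbound_predicate(obj):
    """Sum sel(Pi, obj) over every predicate Pi, using obj's bin in each histogram."""
    bin_number = BIN_OF[obj]   # the same bin is used for every predicate
    return sum(HEIGHTS[p][bin_number] / T_P[p] for p in T_P)


# ?s and ?p are unbound, so each contributes a factor of 1.0.
estimate = 1.0 * 1.0 * sel_object_unbound_predicate("L2")
print(round(estimate, 3))  # 0.933
```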

8
Selectivity Estimation for Triple Pattern
Example with Unbound Object

Triple pattern: ?s P1 ?o

Estimated selectivity
= sel(s) x sel(P1) x sel(o)
= 1.0 x 0.625 x 1.0
= 0.625
