Recent spatial work by Jim Gray and Alex Szalay Bob Mann.


1 Recent spatial work by Jim Gray and Alex Szalay
Bob Mann

2 Recent work on spatial querying by Jim Gray & Alex Szalay
SkyServer HTM approach works, but precomputing the neighbours table takes a long time
– Partially solved by a new version of the HTM library (soon): >10 times faster than the current version
– C# interface to the next version of SQL Server
Want a purely relational approach
– vendor-neutral
– no “impedance mismatch” with external calls
So Gray and Szalay are working on a more geometrical approach

3 Relational method for computing neighbours on sphere
Problem:
– fGetNearbyObjEq() function can compute neighbours of ~70 objects/sec (GHz PC)
– Scaling to full SDSS: 2 months of computation!
One solution:
– Don’t use fGetNearbyObjEq/HTM index for computing neighbours
– Subdivide sphere into Dec zones, then cut by RA
[Following slides stolen/adapted from Jim Gray]
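As a rough check on the figures above, this Python sketch works out how many objects a two-month run at ~70 objects/sec would cover. The 60-day reading of "2 months" and the variable names are assumptions for illustration; the slide does not state the object count.

```python
# Back-of-envelope check of the slide's numbers.
RATE_OBJECTS_PER_SEC = 70        # from the slide: ~70 objects/sec on a GHz PC
SECONDS_PER_DAY = 24 * 60 * 60
ASSUMED_DAYS = 60                # assumption: "2 months" taken as 60 days

# Number of objects whose neighbours could be computed in that time.
objects_covered = RATE_OBJECTS_PER_SEC * SECONDS_PER_DAY * ASSUMED_DAYS
print(f"{objects_covered:,} objects in {ASSUMED_DAYS} days")
```

Under that assumption the run covers a few hundred million objects, which is the scale at which a faster method starts to matter.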

4 Zone Based Spatial Join
Divide space into zones
Key points by zone (by Dec) and offset (RA) (on the sphere this needs a wrap-around margin)
Point search looks in a few zones at a limited offset: ra ± r, a bounding box that has 1-π/4 false positives
All inside the relational engine
Avoids “impedance mismatch”
Can “batch” all-all comparisons
32x faster and parallel
[Figure: search circle of radius r against a zone boundary; labels: r, ra-zoneMax, √(r² +(ra-zoneMax)²), cos(radians(zoneMax)), zoneMax, x, ra ± x]
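A minimal Python sketch of the zone bookkeeping described on this slide. The function names and the clamping at the poles are assumptions for illustration, not taken from the slides; the zone formula itself matches the `floor((dec+90)/zoneHeight)` expression used on the next slide.

```python
import math

def zone_id(dec_deg, zone_height_deg):
    """Zone number for a declination, counting up from the south pole (dec = -90)."""
    return math.floor((dec_deg + 90.0) / zone_height_deg)

def zones_to_search(dec_deg, r_deg, zone_height_deg):
    """Inclusive (lowest, highest) zone numbers a circle of radius r can touch."""
    lowest = zone_id(max(dec_deg - r_deg, -90.0), zone_height_deg)   # clamp at south pole
    highest = zone_id(min(dec_deg + r_deg, 90.0), zone_height_deg)   # clamp at north pole
    return lowest, highest

# Fraction of a square bounding box of side 2r lying outside the inscribed circle:
# these are the candidates the quick filter lets through but the distance test rejects.
BOX_FALSE_POSITIVE_FRACTION = 1.0 - math.pi / 4.0

print(zones_to_search(10.0, 0.6, 0.5))
print(round(BOX_FALSE_POSITIVE_FRACTION, 3))
```

The false-positive fraction is about 21%, so the cheap bounding-box filter already discards most non-neighbours before any distance is computed.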

5 In SQL: points near point
create table zone (
    zone int, objID bigint, ra float, dec float,
    x float, y float, z float,
    primary key (zone, ra, objID))
insert into zone
    select floor((dec+90)/@zoneHeight), objID, ra, dec, cx, cy, cz
    from PhotoObj
select o1.objID -- find objects
from zone o1 -- in the zoned table
where o1.zone between -- where zone #
        floor((@dec-@r+90)/@zoneHeight) and -- overlaps the circle
        ceiling((@dec+@r+90)/@zoneHeight)
  and o1.ra between @ra-@r and @ra+@r -- quick filter on ra
  and o1.dec between @dec-@r and @dec+@r -- quick filter on dec
  and sqrt(power(o1.x-@cx,2)+power(o1.y-@cy,2)+power(o1.z-@cz,2))
        < 2*sin(radians(@r)/2) -- careful filter on distance (chord length; @r in degrees)
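The query above can be tried out on a toy table. The sketch below uses Python's built-in `sqlite3` rather than SQL Server, so it is an approximation of the slide's approach, not the SkyServer code: the zone height, the sample objects and the helper names are all made up for the demo, zones are computed in Python, and squared chord lengths are compared so that no SQL math functions are needed. Like the slide, it ignores RA compression and wrap-around.

```python
import math
import sqlite3

ZONE_HEIGHT = 0.5  # degrees; hypothetical value chosen for the demo

def unit_vector(ra_deg, dec_deg):
    """Cartesian unit vector for a position given in degrees."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))

def build_zone_table(conn, objects):
    """objects: iterable of (objID, ra, dec) in degrees."""
    conn.execute("""create table zone (zone int, objID int, ra real, dec real,
                    x real, y real, z real, primary key (zone, ra, objID))""")
    for obj_id, ra, dec in objects:
        zone = math.floor((dec + 90.0) / ZONE_HEIGHT)
        conn.execute("insert into zone values (?,?,?,?,?,?,?)",
                     (zone, obj_id, ra, dec, *unit_vector(ra, dec)))

def nearby(conn, ra, dec, r):
    """objIDs within r degrees of (ra, dec): zone + box filters, then chord test."""
    cx, cy, cz = unit_vector(ra, dec)
    chord = 2.0 * math.sin(math.radians(r) / 2.0)  # chord length for angle r
    rows = conn.execute("""
        select objID from zone
        where zone between ? and ?
          and ra between ? and ?
          and dec between ? and ?
          and (x-?)*(x-?) + (y-?)*(y-?) + (z-?)*(z-?) < ?
        order by objID""",
        (math.floor((dec - r + 90.0) / ZONE_HEIGHT),
         math.floor((dec + r + 90.0) / ZONE_HEIGHT),
         ra - r, ra + r, dec - r, dec + r,
         cx, cx, cy, cy, cz, cz, chord * chord)).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
build_zone_table(conn, [
    (1, 180.00, 0.00),   # at the search centre
    (2, 180.05, 0.02),   # inside the circle
    (3, 180.50, 0.00),   # rejected by the quick RA filter
    (4, 180.09, 0.09),   # inside the box, outside the circle
    (5, 10.00, 40.00),   # far away, different zone
])
print(nearby(conn, 180.0, 0.0, 0.1))
```

Object 4 is the interesting case: it passes both quick filters and is only rejected by the careful distance test.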

6 Some details to add
Correct for “compression” of RA by cos(dec) and avoid division by zero at poles
– o.ra between ra - R/(cos(dec)+ε) and ra + R/(cos(dec)+ε)
Include wrap-around in RA (RA = 0° & 360°)
– Insert margins into zone table (duplicate entries)
Result: 32x speed-up over use of fGetNearbyObjEq
– More by parallelisation
Neighbour precomputation looks OK for WSA/VSA!
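The two corrections on this slide can be sketched in Python as follows. The function names, the default value of epsilon and the fixed-width margin are assumptions for illustration; a real margin would probably also need to grow with 1/cos(dec).

```python
import math

def ra_window(ra_deg, dec_deg, r_deg, epsilon=1e-9):
    """RA limits for the quick filter, widened by 1/cos(dec) away from the equator.

    epsilon (hypothetical default) keeps the division finite at the poles.
    """
    half_width = r_deg / (math.cos(math.radians(abs(dec_deg))) + epsilon)
    return ra_deg - half_width, ra_deg + half_width

def with_margins(objects, margin_deg):
    """Duplicate objects near RA = 0 or 360 so a search window never has to wrap.

    objects: iterable of (objID, ra, dec) in degrees.
    """
    rows = []
    for obj_id, ra, dec in objects:
        rows.append((obj_id, ra, dec))
        if ra < margin_deg:              # also store a copy just past 360
            rows.append((obj_id, ra + 360.0, dec))
        if ra > 360.0 - margin_deg:      # also store a copy just below 0
            rows.append((obj_id, ra - 360.0, dec))
    return rows

print(ra_window(10.0, 60.0, 1.0))
print(with_margins([(1, 0.5, 0.0), (2, 359.8, 0.0), (3, 180.0, 0.0)], 1.0))
```

At dec = 60° the RA window is about twice as wide as at the equator, since cos(60°) = 0.5.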

7 “Relational approach to testing point-in-polygon containment”
Think geometrically
– Point within circular radius on sphere translated to half-space cut in 3D Cartesian space
Combine spatial cuts as half-space constraints
Store geometric constraint info in DB & query
[Graphic from Alex Szalay]
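To illustrate the translation from a circle on the sphere to a half-space cut: a point with unit vector p lies within angle r of a centre with unit vector n exactly when n·p > cos(r). The Python sketch below encodes that; the function names and the 4-tuple representation are assumptions for illustration.

```python
import math

def unit_vector(ra_deg, dec_deg):
    """Cartesian unit vector for a position given in degrees."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))

def cap_half_space(ra_deg, dec_deg, r_deg):
    """Half-space (a, b, c, d) for the circle of radius r around (ra, dec).

    A unit vector (x, y, z) is inside the circle when a*x + b*y + c*z > d.
    """
    nx, ny, nz = unit_vector(ra_deg, dec_deg)
    return (nx, ny, nz, math.cos(math.radians(r_deg)))

def inside(half_space, ra_deg, dec_deg):
    """Test a sky position against one half-space constraint."""
    a, b, c, d = half_space
    x, y, z = unit_vector(ra_deg, dec_deg)
    return a * x + b * y + c * z > d

cap = cap_half_space(180.0, 0.0, 1.0)
print(inside(cap, 180.5, 0.5), inside(cap, 181.5, 0.0))
```

Because the test is a single linear inequality in (x, y, z), several such cuts can be stored as rows and combined, which is what the following slides build on.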

8 The Idea: Equations Define Subspaces
For (x,y) above the line: ax + by > c
Reverse the space by: -ax + -by > -c
Intersect 3 half-spaces:
a_1 x + b_1 y > c_1
a_2 x + b_2 y > c_2
a_3 x + b_3 y > c_3
[Figure: the line ax + by = c in the (x, y) plane, with intercepts x = c/a and y = c/b]

9 Some terminology
A half-space, H, of the N-dimensional space S can be expressed as
H = {x in S | f(x) > 0}
The intersection of a set of half-spaces {H_i} is a convex hull of points, C
C = {x in S | x in H_i for all i}
A domain, D, is the union of a set of convex hulls {C_i}
D = {x in S | x in C_i for some i}
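These three definitions map directly onto nested "all" and "any" tests. A minimal 2D Python sketch, assuming linear half-spaces stored as (a, b, c) tuples as on the previous slide; the function names and the example triangles are made up for illustration.

```python
def in_half_space(point, half_space):
    """half_space is (a, b, c); the point (x, y) is inside when a*x + b*y > c."""
    x, y = point
    a, b, c = half_space
    return a * x + b * y > c

def in_convex(point, convex):
    """A convex is a list of half-spaces; inside means inside every one of them."""
    return all(in_half_space(point, h) for h in convex)

def in_domain(point, domain):
    """A domain is a list of convexes; inside means inside at least one of them."""
    return any(in_convex(point, c) for c in domain)

# Triangle with vertices (0,0), (1,0), (0,1): x > 0, y > 0, x + y < 1.
triangle = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, -1.0, -1.0)]
# The same triangle shifted 2 units to the right: x > 2, y > 0, x + y < 3.
shifted = [(1.0, 0.0, 2.0), (0.0, 1.0, 0.0), (-1.0, -1.0, -3.0)]
domain = [triangle, shifted]

print(in_convex((0.25, 0.25), triangle), in_domain((1.5, 0.25), domain))
```

The point (1.5, 0.25) falls in the gap between the two triangles, so it is in neither convex and therefore not in the domain.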

10 The Idea: Equations Define Subspaces
a_1 x + b_1 y > c_1
a_2 x + b_2 y > c_2
a_3 x + b_3 y > c_3
select count(*) from convex where a*@x + b*@y < c
select count(*) from convex where a*@x + b*@y > c
[Figure: two panels of the (x, y) plane cut into seven regions by the three lines, each region labelled with the count returned by a query; labels 3, 2, 2, 2, 1, 1, 1 and 0, 1, 1, 1, 2, 2, 2]

11 Domain is Union of Convex Hulls
Simple volumes are unions of convex hulls.
Higher order curves also work
Complex volumes have holes and their holes have holes (that is harder).
[Figure label: Not a convex hull]

12 Now in Relational Terms
create table HalfSpace (
    domainID int not null -- domain name
        foreign key references Domain(domainID),
    convexID int not null, -- grouping a set of ½ spaces
    halfSpaceID int identity, -- a particular ½ space
    a float not null, -- the (a,b,..) parameters
    b float not null, -- defining the ½ space
    c float not null, -- the constraint (“c” above)
    primary key (domainID, convexID, halfSpaceID))
(x,y) inside a convex if it is inside all lines of the convex
(x,y) inside a convex if it is NOT OUTSIDE ANY line of the convex
Convexes containing point (@x,@y):
select convexID -- return the convex hulls
from HalfSpace -- from the constraints
where (@x * a + @y * b) < c -- point outside the line?
group by all convexID -- insist no line of convex
having count(*) = 0 -- is outside (count outside == 0)

13 All Domains Containing this Point
The group by is supported by the domain/convex index, so it’s a sequential scan (pre-sorted!).
select distinct domainID -- return domains
from HalfSpace -- from constraints
where (@x * a + @y * b) < c -- point outside
group by all domainID, convexID -- never happens
having count(*) = 0 -- count outside == 0
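The containment query can be exercised on a toy table with Python's built-in `sqlite3`. This is a sketch under stated assumptions, not the slides' SQL Server code: SQLite has no `group by all`, so the "outside" test is moved into a conditional sum inside `having`, which should give the same answer; the primary key is simplified to an autoincrement column; and the sample domains and helper names are made up for the demo.

```python
import sqlite3

def make_db(half_spaces):
    """half_spaces: iterable of (domainID, convexID, a, b, c) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""create table HalfSpace (
                        domainID int not null, convexID int not null,
                        halfSpaceID integer primary key autoincrement,
                        a real not null, b real not null, c real not null)""")
    conn.executemany(
        "insert into HalfSpace (domainID, convexID, a, b, c) values (?,?,?,?,?)",
        half_spaces)
    return conn

def domains_containing(conn, x, y):
    """Domains with at least one convex for which the point is outside no line."""
    rows = conn.execute("""
        select distinct domainID from HalfSpace
        group by domainID, convexID
        having sum(case when (? * a + ? * b) < c then 1 else 0 end) = 0
        order by domainID""", (x, y)).fetchall()
    return [r[0] for r in rows]

conn = make_db([
    # domain 1, convex 1: triangle x > 0, y > 0, x + y < 1
    (1, 1, 1.0, 0.0, 0.0), (1, 1, 0.0, 1.0, 0.0), (1, 1, -1.0, -1.0, -1.0),
    # domain 1, convex 2: triangle x > 2, y > 0, x + y < 3
    (1, 2, 1.0, 0.0, 2.0), (1, 2, 0.0, 1.0, 0.0), (1, 2, -1.0, -1.0, -3.0),
    # domain 2, convex 1: half-plane x > 1
    (2, 1, 1.0, 0.0, 1.0),
])
print(domains_containing(conn, 2.25, 0.25))
```

Grouping by (domainID, convexID) keeps each convex separate, and `distinct` collapses a domain that matches through more than one of its convexes.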

14 The Algebra is Simple (Boolean)
@domainID = spDomainNew (@type varchar(16), @comment varchar(8000))
@convexID = spDomainNewConvex (@domainID int)
@halfSpaceID = spDomainNewConvexConstraint (@domainID int, @convexID int,
    @a float, @b float, @c float)
@returnCode = spDomainDrop(@domainID)
select * from fDomainsContainPoint(@x float, @y float)
Once constructed, domains can be manipulated with the Boolean operations.
@domainID = spDomainOr (@domainID1 int, @domainID2 int,
    @type varchar(16), @comment varchar(8000))
@domainID = spDomainAnd (@domainID1 int, @domainID2 int,
    @type varchar(16), @comment varchar(8000))
@domainID = spDomainNot (@domainID1 int,
    @type varchar(16), @comment varchar(8000))
Very different approach to spatial querying – “constraint DB”
Reference: Representing Polygon Areas and Testing Point-in-Polygon Containment in a Relational Database
http://research.microsoft.com/~Gray/papers/Polygon.doc
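One way to see why the algebra is simple is to model a domain as a list of convexes, each a list of (a, b, c) half-spaces. The Python sketch below shows Boolean operations on that in-memory model; it is an illustration only and not necessarily how `spDomainOr`, `spDomainAnd` and `spDomainNot` are implemented. All function names here are hypothetical, and the NOT construction can grow quickly in size because it expands products of unions.

```python
def contains(domain, point):
    """Point is in the domain if some convex has it inside all its half-spaces."""
    x, y = point
    return any(all(a * x + b * y > c for a, b, c in convex) for convex in domain)

def domain_or(d1, d2):
    """Union: keep the convexes of both domains."""
    return d1 + d2

def domain_and(d1, d2):
    """Intersection: pair each convex of d1 with each convex of d2 and
    concatenate their half-space lists."""
    return [c1 + c2 for c1 in d1 for c2 in d2]

def domain_not(d):
    """Complement by De Morgan: intersect, over all convexes, the union of
    the reversed half-spaces (-a, -b, -c) of that convex."""
    result = [[]]  # one convex with no constraints, i.e. the whole plane
    for convex in d:
        negated = [[(-a, -b, -c)] for a, b, c in convex]
        result = domain_and(result, negated)
    return result

# Square A: 0 < x < 2, 0 < y < 2.  Square B: 1 < x < 3, 0 < y < 2.
square_a = [[(1.0, 0.0, 0.0), (-1.0, 0.0, -2.0), (0.0, 1.0, 0.0), (0.0, -1.0, -2.0)]]
square_b = [[(1.0, 0.0, 1.0), (-1.0, 0.0, -3.0), (0.0, 1.0, 0.0), (0.0, -1.0, -2.0)]]

print(contains(domain_and(square_a, square_b), (1.5, 1.0)))
print(contains(domain_not(square_a), (1.0, 1.0)))
```

Because the inequalities are strict, points exactly on a boundary line belong to neither a domain nor its complement in this sketch.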

