Spatial Data Types in SQL Server and a SQL Example
Posted by Stefan Zvonar on August 7, 2012
SQL Server 2008+ introduced spatial data types. This article will give a brief overview of these, what they are and how you can use them.
Spatial Data Types
- Geography data type:
- Geodetic data – data that takes into consideration the curvature of the earth’s surface. That is, ellipsoidal data. This data type is good for when you would like to have accurate results over long distances. A straight line in the Geography data type would look like a curve when projected to a flat geometric data type. As an example, if you were calculating flight paths over the globe or planning a journey over the ocean, you would probably use this data type as the spherical shape of the earth is important in such a case. Note, a lot of web services will return geocoded information with traditional latitude and longitude coordinates, which inherently is a geographical data type. The default Spatial Reference ID (explained later) for geography types is 4326.
- There is currently a restriction with SQL 2008 for the geography data type and that is that it can only ever encapsulate one half of the globe (hemisphere). SQL 2012 allows the spanning of coordinates across the entire globe by instantiating a geography type using the ‘FULLGLOBE’ option.
- Geometry data type:
- Geometric / Euclidean – data that has been projected to a flat 2 dimensional plane (X,Y) or shapes that need to be represented in 3 dimensional space (X,Y,Z). This is good for data that does not need to consider the curvature of the Earth. As an example, if you were calculating areas of cadastral properties, you would probably use this data type as it is much more meaningful and easier to calculate. The radius of the Earth is not going to impact much on calculating how long your backyard fence should be (unless you have a mighty big backyard)! Paths are displayed as straight lines on a flat screen (rather than geography lines which display as curved in a 2D representation). The default Spatial Reference ID (explained later) for geometry types is 0, since these types can exist in an undefined planar space (not related to being on a region on Earth).
These spatial types are essentially .NET CLR classes, each with their own properties and set of static / instance methods, explained officially here. http://msdn.microsoft.com/en-us/library/ff848797 These methods are based on the Open Geospatial Consortium (OGC) standards, and you will find very similar methods used in other GIS data software as well, if you get the chance to use them.
When storing spatial information in your database, it is a requirement for SQL Server to know what Spatial Reference the coordinates are based on. Actually, it is not just SQL Server, but almost all spatial systems need this. You cannot perform calculations between spatial objects if they are ‘cooked’ with different coordinate systems, otherwise there would be no accuracy with your results. Many of the methods on these data types require an SRID (Spatial Reference System Identifier) parameter, so it is important to know what spatial reference system you are using. Also, if you ever did try and perform calculations with objects of different SRIDs, then the result is usually a NULL, so be aware of that. If you need to, you can transform a spatial object to a different spatial reference system, but that is a separate topic
If you have a quick look at the below spatial reference systems table, by running the below SQL, you can see all the spatial reference systems that SQL Server supports.
SELECT * FROM SYS.SPATIAL_REFERENCE_SYSTEMS
Take a look at the [unit_of_measure] column. You will notice that most of these systems use metres. This is important when performing spatial calculations (for example, you may need to perform extra calculations if you want to change results for the Imperial system). Any SQL calculations done on these spatial data types will return results in this unit of measure as well.
Now take a look at the [well_known_text] column. This gives a standard description for the Spatial Reference Ids that SQL Server supports. This is actually a standard piece of text known and recognised by all types of Geographic Information Systems (defined by the Open Geospatial Consortium) and it describes your unique SRID.
Brief background on SRIDs, Projections, Datums and Ellipsoids
A spatial reference identifier uniquely defines the coordinate system you are using. For example, if you specify that your dataset is using SRID 4326, there is absolutely no confusion with what ‘system’ your coordinates are using (and this holds true for other GIS software).
A coordinate system can be a projected coordinate system (essentially, dealing with an area on the Earth as if it was a flat 2D piece of paper where coordinates are usually measured in metres) or an unprojected coordinate system (on the globe, incorporating the curve and radius of the Earth where coordinates would be measured in degrees).
A spatial reference system gets a little more involved though, by utilising a particular ‘Datum’. In the world of maps, a ‘datum’ is a known surface model of the Earth that is very accurate for a particular region. It is based on a set of reference points linked to very well known locations surrounding a particular area (for example, an entire country) and often using a specific reference ellipsoid (explained soon). Measurements and coordinates can then be mapped using this area with relative accuracy.
Each datum is different though, as the world’s surface is very bumpy and squashed and not a perfect sphere (it’s actually a geoid). If you look at the Well Known Text for your SRID, you will see that it also references an Ellipsoid. In simplistic terms, an Ellipsoid is a model of the round globe that uses a different radius of the Earth on purpose so that the ‘area of interest’ is represented correctly. Therefore, an Ellipsoid (and Datum referencing that Ellipsoid) for Australia would not be very accurate if you needed to map coordinates on the other side of the Earth, such as in India or Africa.
Traditionally, the European Petroleum Survey Group (EPSG) was the main authority on maintaining a list of ‘standard’ spatial reference systems, but it is now looked after by the Oil and Gas Producers (OGP) Surveying and Positioning Committee. You may sometimes see the SRID be referred to as the EPSG number.
Therefore, by using a spatial reference system, mapping coordinates and calculations can be very accurate for a particular region on the Earth. A spatial reference is selected based on the region where most of the calculations will be performed. For example, if you are performing calculations in Australia, you may very well use SRID 4283 which uses the Geocentric Datum of Australia (GDA).
Now that you know what spatial reference system to use, and the spatial data type to use, how can we use these types? Below I will show a quick demo of using the geography data type, using some global coordinates using the WGS84 coordinate system (a global system that can be used for GPS).
First, create a table to store your points of interest. Here, I like to store the SRID along with the coordinates, so that we can easily know in the future what coordinate system the coordinates are using. In addition, I create a Point geography object on the fly for each coordinate as well.
CREATE TABLE [dbo].[tblLocation]( [LocationID] [int] IDENTITY(1,1) NOT NULL, [LocationDescription] [nvarchar](100) NOT NULL, [Address1] [nvarchar](50) NOT NULL, [Address2] [nvarchar](50) NULL, [Town] [nvarchar](50) NOT NULL, [County] [nvarchar](50) NULL, [Postcode] [nvarchar](10) NOT NULL, [Country] [nvarchar](50) NOT NULL, [Latitude] [decimal](10, 7) NOT NULL, [Longitude] [decimal](10, 7) NOT NULL, [SRID] [int] NOT NULL, -- Note: In SQL Server 2012, you can have the below computed column as PERSISTED, for faster retrieval and calculations. In 2008, you can not persist spatial computed columns. [Point] AS ([geography]::Point([Latitude],[Longitude],[SRID])), CONSTRAINT [PK_tblLocation] PRIMARY KEY CLUSTERED ( [LocationID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY] ) ON [PRIMARY] GO ALTER TABLE [dbo].[tblLocation] ADD CONSTRAINT [DF_tblLocation_Latitude] DEFAULT ((0)) FOR [Latitude] GO ALTER TABLE [dbo].[tblLocation] ADD CONSTRAINT [DF_tblLocation_Longitude] DEFAULT ((0)) FOR [Longitude] GO
And let’s populate this table with some data. Here we add some points using the WGS84 coordinate system (SRID of 4326), which is compatible with GPS and web mapping tools like Google / Bing Maps.
insert into dbo.tblLocation(LocationDescription, Address1, Address2, Town, County, Postcode, Country, Latitude, Longitude, SRID) values('CBD Mall', 'Queen Street Mall', 'Queen Street', 'Brisbane', 'Queensland', '4000', 'Australia', -27.4709331, 153.02350239999998, 4326) insert into dbo.tblLocation(LocationDescription, Address1, Address2, Town, County, Postcode, Country, Latitude, Longitude, SRID) values('Valley Mall', 'Brunswick Street Mall', 'Brunswick Street', 'Fortitude Valley', 'Queensland', '4006', 'Australia', -27.4568523, 153.03294370000003, 4326) insert into dbo.tblLocation(LocationDescription, Address1, Address2, Town, County, Postcode, Country, Latitude, Longitude, SRID) values('Birsbane Airport', 'Brisbane Domestic Terminal', 'Brisbane Airport', 'Brisbane', 'Queensland', '4009', 'Australia', -27.3849801, 153.12056069999994, 4326)
Now you will have a table with some locations, coordinates, and a point geography instance for each record. You can now use methods and calculations on this geography objects. A typical example is that shown below, to calculate points of interest that are close to another location:
CREATE PROCEDURE [dbo].[proc_Location_List_By_Geography] ( @dmlLat decimal(10, 7), @dmlLng decimal(10, 7), @intRadiusKm int ) As SET NOCOUNT ON -- Assuming a WGS84 projection, but change 4326 to the appropriate spatial reference ID for your stored coordinates DECLARE @SearchPoint as geography SET @SearchPoint = geography::Point(@dmlLat, @dmlLng, 4326) -- The STDistance instance method returns a measurment based on the [unit_of_measure] in [sys].[spatial_reference_systems] table for the appropriate spatial reference. -- In most cases, this is in metres. SELECT LocationID, LocationDescription, Address1, Address2, Town, County, Postcode, Latitude, Longitude, SRID, (Point.STDistance(@SearchPoint)/1000) AS DistanceKm FROM tblLocation WHERE Point.STDistance(@SearchPoint) <= (@intRadiusKm * 1000) SET NOCOUNT OFF GO
You can now execute this stored procedure, passing it different radius values and returning only those locations near the specified search location and within the distance specified.
Hope this helps,
stefanzvonar.com – A collection of articles on computer programming, software development, data analysis and other technical tidbits.