Loading spatial data and cleansing-converting with SQL Server 2012

Someone asked, on Twitter, if it were possible to get a set of spatial data that included cities, counties (administrative regions) and countries in database table format for SQL Server spatial. I'd come across just these datasets (and more) as shapefiles at the CDC website and pointed them out. Then came the question about how to load these shapefiles into SQL Server.

I've always preferred FME by Safe Software to load spatial data into SQL Server. They have a variety of products  for SQL Server, including a low-priced option that more than suffices for loading shapefiles. But what if you can't afford even Safe's low-priced option (you're doing it for proof-of-concept, for example). I'd then recommend Shape2SQL, a free tool that's been around since SQL Server Spatial's inception back in SQL Server 2008.

The first time I'd loaded these shapefiles let's use the cntry00 (countries) file as an example, I'd run into some problems loading them as geography data type and had to bypass some records in the shapefile, leaving me with a map that was missing some countries. I'd always assumed that this was a ring-order problem that produced geographies large than a hemisphere. SQL Server Spatial in 2012 fixed the "larger than a hemisphere" limitation, so I thought I'd try loading as a geography again. Unfortunately I got the same errors. So I decided to reload the data as geometry, cleanse the data and convert to geography. What follows is a series of steps to do just that. It relies on some SQL Server 2012-specific functionality.

After trying to summarize this on Twitter as 140-character messages (I always refer to Twitter as "my 140-character lobotomy") someone suggested this would be a good blog entry. So here it is.

So, acquire the shapefile data. What you're looking for is a .shp file (input into Shape2SQL) and possibly the .prj file to determine the SRID to use. In this case, I'll go with SRID 4326 eventually for geography. You can use an SRID with geometry too, but it doesn't effect the spatial library's computation like it does with geography.

1. Unzip the zipfile
2. Pointed Shape2SQL at the .shp file choosing the following options:
   a. Planar Geometry
   b. Set SRID 4326
   c. Create spatial index – no (checkbox cleared)
   d. spatial column name – geom (default)

Now all the data will load. So let's run some queries to clean things up and convert to geography.

— First, look for Invalid geometries and get the reasons why they are invalid
— IsValidDetailed is SQL Server 2012-specific
— BTW, it wasn't a ring order problem after all.

— Error: 24411
select geom.IsValidDetailed(), * from cntry00
where geom.STIsValid() = 0;

— Make them valid, noting the row ids for your reference
— rows 21, 42, 252
update cntry00
set geom = geom.MakeValid()
output inserted.id
where geom.STIsValid() = 0;

— add geography column
alter table cntry00 add geog geography;

— vacuous conversion to geography
update cntry00
set geog = geography::STGeomFromWKB(geom.STAsBinary(), 4326);

— see if any invalid now… and the reason
— 24413, 24414 near the poles (3 rows)
select geog.IsValidDetailed(), * from cntry00
where geog.STIsValid() = 0;

— validate again
— rows 1,181,215
update cntry00
set geog = geog.MakeValid()
output inserted.id
where geog.STIsValid() = 0;

— check for more than a hemisphere, none found
select * from cntry00
where geog.EnvelopeAngle() > 90;

— if there are geographies greater than a hemisphere
— because of ring order problems fix 'em
— SQL Server 2012-specific
update cntry00
 set geog = geog.ReorientObject()
 output inserted.id
 where geog.EnvelopeAngle() > 90;

And there you have it. You might think that you can use the spatial results tab in SSMS to compare your data before and after validation to see if validation has left artifacts or changed the shape. Don't bother, though; the spatial results tab called MakeValid before displaying the data.

Hope this helps,


Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.