Welcome to Planet PostGIS

August 03, 2015

Postgres OnLine Journal (Leo Hsu, Regina Obe)

PostgreSQL 9.5 Grouping Sets with PostGIS spatial aggregates

One of the features coming in PostgreSQL 9.5 is the triumvirate of GROUPING SETS, CUBE, and ROLLUP, nicely covered in Bruce's recent slide deck. The neatest thing about PostgreSQL development is that when improvements happen, they don't just affect the core; extensions can take advantage of them without lifting a finger. Such is the case with these features.

One of the things I was curious about with this new set of predicates was: would they work with any aggregate function? I assumed they would, so I decided to put it to the test by using them with the PostGIS ST_Union function (using PostGIS 2.2.0 development). This feature was not something the PostGIS development group planned on supporting, but by the magic of PostgreSQL, PostGIS accidentally supports it. The grouping sets feature is particularly useful if you want to aggregate the same dataset multiple times, perhaps for display. It lets you do in a single query what would require a UNION query in earlier PostgreSQL versions. This is a rather boring example, but hopefully you get the idea.
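
As a minimal sketch of the idea (the parcels table and its city and zoning columns are hypothetical, not from the post), a single GROUPING SETS query can produce per-city unions, per-city-and-zoning unions, and one overall union in one pass; on earlier PostgreSQL versions you would UNION three separate GROUP BY queries to get the same result:

-- requires PostgreSQL 9.5+; ST_Union is the PostGIS spatial aggregate
SELECT city, zoning, ST_Union(geom) AS geom
FROM parcels
GROUP BY GROUPING SETS ((city), (city, zoning), ());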


Continue reading "PostgreSQL 9.5 Grouping Sets with PostGIS spatial aggregates"

by Leo Hsu and Regina Obe (nospam@example.com) at August 03, 2015 07:01 AM

July 14, 2015

Boston GIS (Regina Obe, Leo Hsu)

Bring FOSS4G 2017 to Boston

If you are in Boston or wanna visit Boston for a FOSS4G conference, please join our effort in bringing the FOSS4G 2017 international conference to Boston.

For details check out Help Bring FOSS4G to Boston. Since we are Bostonians we'll be submitting talks and workshops for PostGIS and pgRouting if the event is hosted in Boston.

by Regina Obe (nospam@example.com) at July 14, 2015 10:09 PM

July 07, 2015

PostGIS Development

2.1.8 Released

Due to a number of bugs capable of crashing a back-end, we have released 2.1.8. If you are running an earlier version on a public site, we recommend that you update as soon as possible.

http://download.osgeo.org/postgis/source/postgis-2.1.8.tar.gz

View all closed tickets for 2.1.8.

by Paul Ramsey at July 07, 2015 12:00 AM

June 29, 2015

Postgres OnLine Journal (Leo Hsu, Regina Obe)

PostgreSQL OGR FDW update and PostGIS 2.2 news

PostGIS 2.2 is planned to reach feature freeze June 30th 2015 so we can make the September PostgreSQL 9.5 curtain call with confidence. There are great KNN enhancements for PostgreSQL 9.5-only users. I've been busy getting all my ducks lined up: a lot on the tiger geocoder and the address standardizer extension to be shipped with the Windows builds, a story for later. One other feature we plan to ship with the Windows PostGIS 2.2 builds is the ogr_fdw foreign data wrapper extension. I've been nagging Paul Ramsey a lot about issues with it, this one in particular: https://github.com/pramsey/pgsql-ogr-fdw/issues/25, and after some prodding he finally put his nose in and fixed them, and pinged Even Rouault for some help on a GDAL-specific item.

Needless to say, I've been super happy with the progress and support I've gotten with ogr_fdw development, and I'm really enjoying my ogr_fdw use. Reading an XLSX file that was saved after the connection was opened required a fix in the GDAL 2.0 branch (which missed the GDAL 2.0.0 release), so this new package contains a GDAL 2.0.1ish library. Hopefully GDAL 2.0.1 will be out before PostGIS 2.2.0 comes out so I can release without guilt with this fix.
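
For readers who haven't tried ogr_fdw yet, the basic setup looks roughly like this (a sketch: the file path, server name, table name and sheet name are made up for illustration):

CREATE EXTENSION ogr_fdw;

-- a server points at any OGR-readable datasource; here, a spreadsheet
CREATE SERVER books_srv
  FOREIGN DATA WRAPPER ogr_fdw
  OPTIONS (datasource '/data/books.xlsx', format 'XLSX');

-- each layer (a worksheet, in this case) can be exposed as a foreign table
CREATE FOREIGN TABLE books (
  title varchar,
  price numeric
) SERVER books_srv OPTIONS (layer 'Sheet1');

SELECT title, price FROM books LIMIT 5;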


Continue reading "PostgreSQL OGR FDW update and PostGIS 2.2 news"

by Leo Hsu and Regina Obe (nospam@example.com) at June 29, 2015 09:24 AM

May 23, 2015

Postgres OnLine Journal (Leo Hsu, Regina Obe)

PostGIS 2.2 leveraging power of PostgreSQL 9.5

Things are shaping up nicely in PostGIS 2.2 development. We are going to hit feature freeze around June 30th 2015, and plan to ship late August or early September to be in line with the PostgreSQL 9.5 release. So far we have committed a number of neat features, most of them itemized in PostGIS 2.2 New Functions. Many of the really sought-after ones will require PostgreSQL 9.5 and GEOS 3.5. The geography measurement enhancements will require Proj 4.9.0+. Here are the things I'd like to highlight now; I'll dedicate full-length articles to them later in our BostonGIS Waiting for PostGIS 2.2 series once they've been stress tested.


Continue reading "PostGIS 2.2 leveraging power of PostgreSQL 9.5"

by Leo Hsu and Regina Obe (nospam@example.com) at May 23, 2015 07:11 AM

May 12, 2015

Oslandia

Cesium Buildings


Cesium Buildings started at the OSGeo code sprint in Philadelphia. It is still at an early stage but already allows displaying textured buildings on the Cesium globe. After our Cuardo R&D project based on Three.js, we wanted to apply the experience gained to this popular framework as an alternative for the top of Oslandia's 3D stack.



The implementation is based on Cesium.QuadtreePrimitive, which calls a tile provider when a terrain tile comes into view. We implemented a tile provider that delegates the actual loading of the geometries on the tile to a worker thread, to keep the main thread responsive. Our tile provider is also responsible for caching tile geometries, both to avoid unnecessary strain on the WFS server and to speed up display. The geometry caching mechanism takes advantage of the quad-tree nature of the requests and of the vector nature of the data. Texture loading and caching is already handled by Cesium.


The worker thread receives a URL with the WFS request as input and returns several arrays (ArrayBuffer) that are transferable objects, so that the communication cost between the worker and the main thread is kept to a minimum. The arrays describe the vertex positions in Cesium's Cartesian reference frame and the triangle connectivity, along with normals, tangents and bitangents (the latter two are not used by our tile provider for the moment).


A difficulty arises when you try to implement such a worker: it is not possible to importScripts(Cesium.js). Cesium is, for the moment, quite monolithic and is meant to be used in the main thread (e.g. it depends on the window object, among other things). As a result, some Cesium code needed for coordinate transformations had to be copied into the worker code.


At this early stage, the plugin has a number of limitations that should be addressed in the future:
  • The geometries in the tile are not packed (i.e. there is one geometry per feature), because of the difficulty of packing the textures.
  • The worker is only able to transform WGS84 coordinates (EPSG:4326), so a transformation may be required on the server side (see the SQL sketch after this list).
  • The geometry caching implementation does not provide a cache emptying strategy, therefore memory consumption will increase as the user explores the terrain and care should be taken with big datasets.
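
If the WFS happens to be backed by PostGIS, the server-side transformation mentioned above can be as simple as reprojecting on the fly (a sketch; the buildings table and its source SRID are hypothetical):

-- reproject from a projected CRS (assumed here to be EPSG:3946)
-- to WGS84 (EPSG:4326) so the worker can consume the coordinates directly
SELECT id, ST_Transform(geom, 4326) AS geom
FROM buildings;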

by Vincent Mora at May 12, 2015 08:00 AM

May 02, 2015

PostGIS Development

Enabling PostGIS raster drivers and out-of-db raster support

I’ve been asked this question in some shape or form at least 3 times, mostly from people puzzled why they get this error. The last iteration went something like this:

I can’t use ST_AsPNG when doing something like

SELECT ST_AsPNG(rast) 
    FROM  sometable;
 

Gives error: Warning: pg_query(): Query failed: ERROR: rt_raster_to_gdal: Could not load the output GDAL driver.
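
The short version of the answer: since the 2.1-era security releases, PostGIS ships with all GDAL raster drivers disabled by default, so out of the box ST_AsPNG (and friends) have no driver to write with. On PostGIS 2.1 you enable drivers by setting the POSTGIS_GDAL_ENABLED_DRIVERS environment variable (e.g. to 'GTiff PNG JPEG') in the PostgreSQL server's environment and restarting; out-of-db raster access is gated similarly via POSTGIS_ENABLE_OUTDB_RASTERS. A sketch of the PostGIS 2.2+ equivalent, which exposes the setting as a GUC ('ENABLE_ALL' is the permissive choice; production setups may prefer an explicit driver list):

SET postgis.gdal_enabled_drivers = 'ENABLE_ALL';

-- after which the failing query from above should succeed
SELECT ST_AsPNG(rast)
    FROM sometable;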

Continue Reading by clicking title hyperlink ..

by Regina Obe at May 02, 2015 12:00 AM

April 24, 2015

Boston GIS (Regina Obe, Leo Hsu)

PostGIS In Action 2nd Edition Fresh off the presses

Just got our shipment of PostGIS In Action 2nd Edition. Here is one.

It's a wee bit fatter than the first edition (by about 100 pages). I'm really surprised the page count didn't go over 600 pages given how much additional ground this edition covers. It covers a lot more raster than the first edition and has chapters dedicated to PostGIS topology and the PostGIS tiger geocoder.

by Regina Obe (nospam@example.com) at April 24, 2015 09:08 PM

April 06, 2015

PostGIS Development

PostGIS 2.0.7 and 2.1.7 Released

Due to a critical bug in GeoJSON ingestion we have made an early release of versions 2.0.7 and 2.1.7. If you are running an earlier version on a public site and accepting incoming GeoJSON, we recommend that you update as soon as possible.

http://download.osgeo.org/postgis/source/postgis-2.1.7.tar.gz

http://download.osgeo.org/postgis/source/postgis-2.0.7.tar.gz

View all closed tickets for 2.0.7.

by Paul Ramsey at April 06, 2015 12:00 AM

March 31, 2015

Boston GIS (Regina Obe, Leo Hsu)

PostgreSQL 9.5 features of interest to PostGIS Users

We got back from PostgreSQL Conf 2015, which was in New York City. It's probably the most educational conference we've been to in a while, mostly because we stayed thru the whole conference. One of the key takeaways from the conference was Magnus Hagander's showcase of 9.5 features, with slides at http://www.hagander.net/talks/postgresql95.pdf. Two of the features I'm looking forward to are along the lines of parallelization; a third I'm not sure is of much benefit, but it could be.

Inheritance support for Foreign Tables (aka sharding with FDWs)

The thing I was most excited about was foreign table inheritance (and by the way, you can now add CHECK constraints to foreign tables), which means you can use PostgreSQL's constraint exclusion feature to query across federated PostgreSQL servers. What excited me about this is that I had a perfect use case I was itching to try, which I have on my list. The PostGIS tiger geocoder utilizes an inheritance hierarchy to partition tables. When I took on maintenance of the geocoder and changed it from a single-table structure to table inheritance, I was not so much concerned about speed as about being able to easily spit out a backup of a set of states to create a smaller database I could hand off to a user or specific client. Since it is already naturally sharded and has that built into its query logic, it seems a perfect fit for testing whether parallelization could work in this fashion and improve speed for similar setups. Michael Paquier has a simple example of this feature in PostgreSQL 9.5 feature highlight: Scale-out with Foreign Tables now part of Inheritance Trees
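
A bare-bones sketch of what that sharded setup can look like (all names here are hypothetical, and the remote shard is assumed to be defined with postgres_fdw):

-- local parent table; the children live on remote shards
CREATE TABLE tabblock (statefp text, the_geom geometry);

-- a foreign child per shard, inheriting the parent's columns (new in 9.5)
CREATE FOREIGN TABLE tabblock_ma ()
  INHERITS (tabblock)
  SERVER ma_shard OPTIONS (schema_name 'tiger_data', table_name 'tabblock');

-- the CHECK constraint is what lets constraint exclusion skip shards
ALTER TABLE tabblock_ma ADD CONSTRAINT tabblock_ma_chk CHECK (statefp = '25');

-- this should now touch only the Massachusetts shard
SELECT count(*) FROM tabblock WHERE statefp = '25';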

Parallel sequential scans

This one isn't quite in yet, but it is in the queue of likely candidates. So we won't have parallelization for all query methods, but this is the first step to getting there, and one that doesn't require any changes on your part to take advantage of.

GiST index only scan

The big gotcha here is that it is currently only supported for the built-in PostgreSQL box and point types. The bigger question is whether it is really worthwhile to implement this in PostGIS 2.2 for PostGIS types. I have to say this was a bit of a yawn for me, but I point it out because it was the only time Magnus mentioned PostGIS - "PostGIS folks in crowd, can PostGIS use this?". I doubt it will provide much benefit since the GiST index we have just contains the box and requires a RECHECK anyway. Conceivably it could be useful for counting points, or as a very fuzzy estimate of the count of objects in a window, but that's all I can think it could be useful for. How much effort it would take to build the logic into PostGIS 2.2, I don't know. I leave that as an exercise for someone smarter than me.
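
If you want to see the feature itself with the built-in types, here is a sketch against a made-up table (whether the planner actually chooses the index-only scan depends on statistics and visibility-map state):

CREATE TABLE pts AS
  SELECT point(random() * 100, random() * 100) AS p
  FROM generate_series(1, 100000);
CREATE INDEX pts_p_idx ON pts USING gist (p);
VACUUM ANALYZE pts;

-- on 9.5 this can show "Index Only Scan using pts_p_idx"
EXPLAIN SELECT p FROM pts WHERE p <@ box '((0,0),(10,10))';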

by Regina Obe (nospam@example.com) at March 31, 2015 05:14 AM

March 30, 2015

Boundless Geo

Pinpointing Disasters with OpenGeo Suite

Boundless’ Victoria, British Columbia office sits at the southern tip of Vancouver Island, a region that is used to mild earthquake activity. So when a colleague back east asked if we’d felt “the earthquake near Port Hardy”, we weren’t particularly surprised that there had been one, or that it had gone unnoticed locally.

We were a little surprised, however, when we saw that the epicentre was well off the west coast of the island, while Port Hardy sits on the east coast. Looking more closely at the USGS map showing the location of the earthquake, one wonders why Port Hardy was chosen as a reference point in news reports and not, say, Tofino, which is roughly due east of the epicentre.
[image: USGS map of the earthquake location]
Like many OpenGeo Suite users, I reach for PostGIS when I need to do some quick spatial analysis, so I loaded the Natural Earth populated places data set into a fresh database and whipped up a quick query to see what was happening within 250 kilometres of the quake.

$ shp2pgsql -W LATIN1 -I -s 4326 ne_10m_populated_places.shp | psql earthquakes

WITH constants AS 
(SELECT ST_SetSRID(ST_MakePoint(-128.15, 49.42), 4326)::geography AS quake)
(
  SELECT name, pop_max, ST_Distance(geom::geography, quake) / 1000 AS distance
  FROM ne_10m_populated_places, constants
  WHERE ST_DWithin(quake, geom::geography, 250*1000)
  ORDER BY DISTANCE
);

And sure enough:

      name      | pop_max |     distance
----------------+---------+------------------
 Port Hardy     |    2295 | 151.591959991648
 Tofino         |    1655 | 170.322296453086
 Campbell River |   33430 | 219.404018781354
 Courtenay      |   32793 | 229.792897985687

Port Hardy just edges out Tofino as the closest settlement in my data set. So do the USGS and other organisations simply use a naive algorithm to find a reference point for earthquakes? It sure looks like it!

[Note: if you’re wondering about the WITH clause above, that’s just an easy way to effectively define a constant that can be referenced throughout the main part of the query. Remember this syntax because we’ll be using it again below.]

Google went with Port Hardy, based on USGS data (although they calculate the distance differently than their source):
[image: Google's earthquake report]

Natural Resources Canada’s website, on the other hand, referenced the small village of Port Alice, which is closer to the action but wouldn’t meet the population threshold in most data sets:
[image: Natural Resources Canada's earthquake report]

Writing a better algorithm

If we agree that Port Hardy probably isn’t the best reference point for an event in the Pacific Ocean, then we are left with the question: can we design a better algorithm? A simple but effective improvement would be to calculate the distance from nearby settlements to the epicentre as before, but to double the distance for the bits that cross over land.

So from Port Hardy to the epicentre is about 150 kilometres, but we’ll need to add about 75 km extra because about half of that is overland. From Tofino, however, it’s 170 km from point to point and only a smidgen extra for crossing Vargas Island on the way out to sea. That’s 225 km to 175: Tofino wins!

[image: map comparing the Port Hardy and Tofino paths to the epicentre]

We’re going to build up our query in three parts, something which isn’t strictly necessary but does make things a little easier to read and maintain.

The first step is to get a list of populated places that are within a reasonable range of the earthquake (we’ll use 250 km again) and which have more than 1000 inhabitants. Additionally, we’ll use PostGIS to create a line from that point out to the quake and pass up all this information to the next step in our query.

SELECT
  name, pop_max, ST_MakeLine(geom, quake) AS line
FROM constants, ne_10m_populated_places
WHERE
  pop_max > 1000 AND
  ST_DWithin(quake::geography, geom::geography, 250*1000)

The sub-query above will be referenced as the places table in the query below, which is where most of the hard work actually happens.

We take the results from the places query and join them to another Natural Earth data set which contains states, provinces and other subdivisions of countries around the world (you can load it with the same shp2pgsql command I used above). Basically, this new table tells us what parts of the world are covered by land and what parts are not. By finding the intersection between our line and any land polygons, we can calculate a new land_line geometry for each of the places we found above.

SELECT
  places.name, places.pop_max, places.line,
  ST_Collect(ST_Intersection(places.line, land.geom)) AS line_land,
  ST_Length(line::geography) / 1000 AS distance
FROM
  ne_10m_admin_1_states_provinces_shp AS land, places
WHERE ST_Intersects(places.line, land.geom)
GROUP BY places.name, places.pop_max, places.line

We’ll add this new geometry and its length to the data we’ve collected about each place and pass it all up to the final part of our query, referring to this as the places_land table.

SELECT
  name, 
  pop_max,
  distance,
  distance + ST_Length(line_land::geography) / 1000 AS weighted_distance
FROM places_land
ORDER BY weighted_distance

This is where we wrap everything up by calculating the weighted_distance, which is just the regular distance plus the distance of the part that crossed over land (dividing by 1000 since the length of the PostGIS geography data type is measured in meters).

Pulling these together we get this final, three-step query:

WITH constants AS 
(SELECT ST_SetSRID(ST_MakePoint(-128.15, 49.42), 4326) AS quake)
(
  SELECT 
    name, 
    pop_max, 
    distance, 
    distance + ST_Length(line_land::geography) / 1000 AS weighted_distance
  FROM
  (
    SELECT
      places.name, 
      places.pop_max,
      places.line,
      ST_Collect(ST_Intersection(places.line, land.geom)) AS line_land,
      ST_Length(line::geography) / 1000 AS distance
    FROM
      ne_10m_admin_1_states_provinces_shp AS land,
    (
    SELECT
      name, pop_max, ST_MakeLine(geom, quake) AS line
    FROM constants, ne_10m_populated_places
    WHERE
      pop_max > 1000 AND
      ST_DWithin(quake::geography, geom::geography, 250*1000)
    ) AS places
    WHERE ST_Intersects(places.line, land.geom)
    GROUP BY places.name, places.pop_max, places.line
  ) AS places_land
  ORDER BY weighted_distance
);

All that’s left is to run the query and see what we get:

      name      | pop_max |     distance     | weighted_distance
----------------+---------+------------------+-------------------
 Tofino         |    1655 | 170.322296453086 |  170.996532624216
 Port Hardy     |    2295 | 151.591959991648 |  213.424215448539
 Campbell River |   33430 | 219.404018781354 |  336.005258086551
 Courtenay      |   32793 | 229.792897985687 |  344.417265701763

It works: this earthquake is best described as being 170 km from Tofino by our reckoning!

The query above is only really suitable for points at sea, but you can adapt this code for cases where points are on land as well … and of course the exercise is not limited to earthquakes, but can be applied to any kind of disaster or event. With some additional creativity, we could also tune our algorithm to prefer places with more inhabitants over those with fewer. And of course, you can always change the search radius of 250 km or the population threshold of 1000 inhabitants.

Finally, if you want to pack this all up and create an application with OpenGeo Suite, I suggest checking out our recent blog post on publishing SQL queries like this in GeoServer and passing parameters to make an interactive and dynamic layer to show on a map!

 

The post Pinpointing Disasters with OpenGeo Suite appeared first on Boundless.

by Benjamin Trigona-Harany at March 30, 2015 01:41 PM

March 21, 2015

Paul Ramsey

Magical PostGIS

I did a new PostGIS talk for FOSS4G North America 2015, an exploration of some of the tidbits I've learned over the past six months about using PostgreSQL and PostGIS together to make "magic" (any sufficiently advanced technology...)

 

by Paul Ramsey (noreply@blogger.com) at March 21, 2015 04:16 PM

March 20, 2015

Paul Ramsey

Making Lines from Points

Somehow I've gotten through 10 years of SQL without ever learning this construction, which I found while proof-reading a colleague's blog post and looked so unlikely that I had to test it before I believed it actually worked. Just goes to show, there's always something new to learn.

Suppose you have a GPS location table:

  • gps_id: integer
  • geom: geometry
  • gps_time: timestamp
  • gps_track_id: integer

You can get a correct set of lines from this collection of points with just this SQL:


SELECT
  gps_track_id,
  ST_MakeLine(geom ORDER BY gps_time ASC) AS geom
FROM gps_points
GROUP BY gps_track_id;

Those of you who already knew about placing ORDER BY within an aggregate function are going "duh", and the rest of you are, like me, going "whaaaaaa?"

Prior to this, I would solve the problem by ordering all the groups in a CTE or sub-query first, and only then passing them to the aggregate make-line function. This, is, so, much, nicer.
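
For comparison, the pre-sort approach looks something like this (a sketch; relying on sub-query ordering to survive into the aggregate works in practice in PostgreSQL, though the SQL standard doesn't guarantee it):

SELECT
  gps_track_id,
  ST_MakeLine(geom) AS geom
FROM (
  SELECT gps_track_id, geom
  FROM gps_points
  ORDER BY gps_track_id, gps_time
) AS ordered
GROUP BY gps_track_id;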

by Paul Ramsey (noreply@blogger.com) at March 20, 2015 08:07 PM

PostGIS Development

PostGIS 2.1.6 Released

The 2.1.6 release of PostGIS is now available.

The PostGIS development team is happy to release a patch for PostGIS 2.1, the 2.1.6 release. As befits a patch release, the focus is on bugs, breakages, and performance issues. Users with large tables of points will want to prioritize this patch, for substantial (~50%) disk space savings.

http://download.osgeo.org/postgis/source/postgis-2.1.6.tar.gz

Continue Reading by clicking title hyperlink ..

by Paul Ramsey at March 20, 2015 12:00 AM

March 10, 2015

Dimensional Edge

Quadgrid Generation Using Recursive SQL Function

For something a little different, here is a PostGIS recursive SQL quadgrid function which has been in my toolbox for some time now. The inspiration came in 2010 when reading "Open Source GIS: A GRASS GIS Approach" by Markus Neteler and Helena Mitasova (3rd edition, p. 244). The quadgrid function works by recursively subdividing tiles into smaller tiles (quadcells), up to a maximum depth, whenever the number of intersecting points (or some other feature criterion) exceeds a certain threshold.

Quadgrids have many applications, but I’ve found them useful for mapping urban phenomena (such as population density), both in 2D and 3D. Since large cities tend to be packed with more people, the result is more quadcells packed into a given land area. Quadgrids are also computationally efficient, as a finer grid cell resolution is only used in the more populous areas.

[GIF animation: quadgrid subdivision]

Below is an example of how I’ve used a population density quadgrid to create a 3D city skyline of Sydney. The taller the quadcells, the greater the number of people per square meter – no different to the actual built form of our cities where, if land is scarce, we build skyward.
[image: sydney_skyline, a 3D population density quadgrid of Sydney]

The recursive quadgrid function is available on github. It requires PostgreSQL 9.3 or above.

https://github.com/dimensionaledge/cf_public/blob/master/lattices/DE_RegularQuadGrid.sql

The function is called in the same way as any other pl/pgsql function. For instance:

CREATE TABLE quadgrid AS
SELECT depth::integer, the_geom::geometry(Polygon, 3577) as wkb_geometry, cell_value::float8
FROM DE_RegularQuadGrid((SELECT wkb_geometry FROM abs_aus11_tiles_32k WHERE tid IN (17864)), 'tutorials.abs_mb11_points', 'wkb_geometry', 10, 1000);

The arguments of the quadgrid function are: the parent tile geometry, the name of the points feature table along with its geometry column name, the maximum depth of recursion, and the threshold number of points per cell above which a cell will sub-divide.

The function uses two helper functions, namely DE_MakeRegularQuadCells() and DE_MakeSquare(). They work in tandem by taking a square geometry and subdividing it into four child geometries.

https://github.com/dimensionaledge/cf_public/tree/master/shapes

I’ve also just refactored the quadgrid function to incorporate a new, highly efficient LATERAL ‘Point in Polygon’ code pattern which avoids the use of GROUP BY when counting points in polygons. The result is a 75% reduction in query times compared to the conventional CROSS JOIN and GROUP BY approach.

SELECT r.the_geom, r.pcount FROM
(SELECT DE_MakeRegularQuadCells(wkb_geometry) as the_geom FROM abs_aus11_tiles_32k WHERE tid IN (17864)) l,
LATERAL
(SELECT l.the_geom, count(*) as pcount FROM tutorials.abs_mb11_points WHERE ST_Intersects(l.the_geom, wkb_geometry) AND l.the_geom && wkb_geometry) r;

Quadgrid run times depend on the number of underlying point features and the depth of recursion. In the above GIF animation, there are 1.86 million points that intersect the parent tile, making it one of the most populous areas in Australia. For this exercise, quadgridding this one tile to a depth of 10 took 19.9 seconds. Less populated tiles tend to take only fractions of a second.

DE_MakeRegularQuadGrid.sql
---------------------------------------------------------------------------
-- Code Description:
---------------------------------------------------------------------------
-- PostgreSQL/PostGIS custom function for generating quadcells recursively from a given starting geometry and intersecting reference table, to a maximum number of iteration levels or threshold value per cell
-- Dependencies: DE_MakeSquare(), DE_MakeRegularQuadCells()
-- Developed by: mark[at]dimensionaledge[dot]com
-- Licence: GNU GPL version 3.0
---------------------------------------------------------------------------

DROP FUNCTION IF EXISTS DE_RegularQuadGrid(geometry, text, text, integer, double precision);
CREATE OR REPLACE FUNCTION DE_RegularQuadGrid(parent_geom geometry, reference_table text, reference_geom_col text, max_depth integer, threshold_value double precision)
RETURNS TABLE (depth integer, the_geom GEOMETRY, cell_value double precision)
AS $$
DECLARE
reference_geom_type text;

BEGIN
EXECUTE 'SELECT GeometryType('|| reference_geom_col ||') FROM '|| reference_table ||' LIMIT 1' INTO reference_geom_type;

IF reference_geom_type NOT IN ('POINT') THEN
RAISE EXCEPTION 'Reference table is not a valid geometry type';
END IF;

RETURN QUERY EXECUTE
'WITH RECURSIVE quadcells (depth, the_geom, cell_value) AS (
--SEED THE PARENT GEOMETRY AND CELL VALUE
SELECT 1, l.the_geom, r.pcount
FROM (SELECT ST_GeomFromEWKT(ST_AsEWKT('|| quote_literal(CAST(parent_geom as text)) ||')) as the_geom) l,
LATERAL
(SELECT count(*) as pcount, l.the_geom FROM '|| reference_table ||' WHERE ST_Intersects(l.the_geom, '|| reference_geom_col ||') AND l.the_geom && '|| reference_geom_col ||') r
--RECURSIVE PART
UNION ALL
SELECT t.depth, t.the_geom, t.pcount
FROM
--TERMINAL CONDITION SUBQUERY LOOPS UNTIL THE CONDITIONS ARE NO LONGER MET - NOTE THE RECURSIVE ELEMENT CAN ONLY BE EXPLICITLY REFERRED TO ONCE, HENCE THE USE OF CTE
(
WITH a AS (SELECT * FROM quadcells WHERE the_geom IS NOT NULL AND depth < '|| max_depth ||' AND cell_value > '|| threshold_value ||'),
b AS (SELECT max(depth) as previous FROM a),
c AS (SELECT a.* FROM a,b WHERE a.depth = b.previous),
d AS (SELECT r.the_geom, r.pcount FROM (SELECT DE_MakeRegularQuadCells(the_geom) as the_geom FROM c) l, LATERAL (SELECT count(*) as pcount, l.the_geom FROM '|| reference_table ||' WHERE ST_Intersects(l.the_geom, '|| reference_geom_col ||') AND l.the_geom && '|| reference_geom_col ||') r)
SELECT b.previous+1 as depth, d.the_geom, d.pcount FROM b, d
) t
)
SELECT depth, the_geom, cell_value::float8 FROM quadcells WHERE ST_IsEmpty(the_geom)=false AND (cell_value <= '|| threshold_value ||' OR (cell_value > '|| threshold_value ||' AND depth = '|| max_depth||'))';

END;
$$ LANGUAGE 'plpgsql' VOLATILE;
DE_MakeRegularQuadCells.sql
---------------------------------------------------------------------------
-- Code Description:
---------------------------------------------------------------------------
-- PostgreSQL/PostGIS custom function for subdividing a square polygon into four child polygons
-- Dependencies: DE_MakeSquare()
-- Developed by: mark[at]dimensionaledge[dot]com
-- Licence: GNU GPL version 3.0
---------------------------------------------------------------------------
-- Usage Example:
---------------------------------------------------------------------------
-- SELECT DE_MakeRegularQuadCells(DE_MakeSquare(ST_MakePoint(0,0),1));
---------------------------------------------------------------------------

CREATE OR REPLACE FUNCTION DE_MakeRegularQuadCells(parent GEOMETRY)
RETURNS SETOF GEOMETRY
AS $$
DECLARE
halfside float8;
i INTEGER DEFAULT 1;
srid INTEGER;
centerpoint GEOMETRY;
centersquare GEOMETRY;
quadcell GEOMETRY;

BEGIN
srid := ST_SRID(parent);
centerpoint := ST_Centroid(parent);
halfside := abs(ST_Xmax(parent) - ST_Xmin(parent))/2;
centersquare := ST_ExteriorRing(DE_MakeSquare(centerpoint, halfside));

WHILE i < 5 LOOP
quadcell := DE_MakeSquare(ST_PointN(centersquare, i), halfside);
RETURN NEXT quadcell;
i := i + 1;
END LOOP;

RETURN;
END
$$ LANGUAGE 'plpgsql' IMMUTABLE;
DE_MakeSquare.sql
---------------------------------------------------------------------------
-- Code Description:
---------------------------------------------------------------------------
-- PostgreSQL/PostGIS custom function for generating a square polygon of a specified size
-- Dependencies: nil
-- Developed by: mark[at]dimensionaledge[dot]com
-- Licence: GNU GPL version 3.0
---------------------------------------------------------------------------
-- Usage Example:
---------------------------------------------------------------------------
-- SELECT DE_MakeSquare(ST_MakePoint(0,0),1);
---------------------------------------------------------------------------

CREATE OR REPLACE FUNCTION DE_MakeSquare(centerpoint GEOMETRY, side FLOAT8)
RETURNS GEOMETRY
AS $$
SELECT ST_SetSRID(ST_MakePolygon(ST_MakeLine(
ARRAY[
st_makepoint(ST_X(centerpoint)-0.5*side, ST_Y(centerpoint)+0.5*side),
st_makepoint(ST_X(centerpoint)+0.5*side, ST_Y(centerpoint)+0.5*side),
st_makepoint(ST_X(centerpoint)+0.5*side, ST_Y(centerpoint)-0.5*side),
st_makepoint(ST_X(centerpoint)-0.5*side, ST_Y(centerpoint)-0.5*side),
st_makepoint(ST_X(centerpoint)-0.5*side, ST_Y(centerpoint)+0.5*side)
]
)),ST_SRID(centerpoint));
$$ LANGUAGE 'sql' IMMUTABLE STRICT;

by mark wynter at March 10, 2015 05:10 AM

March 05, 2015

Dimensional Edge

Bezier Curves – A More Flexible Alternative to Great Circles

This tutorial introduces a flexible method for generating flight paths in PostGIS using a custom bezier curve function. Unlike Great Circles, bezier curves provide the user the ability to set the amount of ‘flex’ or ‘bend’ exhibited by the curve, the number of vertices (useful for timeseries animations), and a break value at a predefined meridian to ensure the visual integrity of the curve is maintained when plotting in your favourite projection. I also have a 3D version of the bezier curve function which I’ll share at a later date.

The full code for this tutorial, including the QGIS technique for centering a world map in the Pacific Ocean, is available on github here.

[image: flightpaths]

I won’t repeat everything which is already documented in the github tutorial code and custom bezier curve function, except to mention the five arguments used by the bezier curve function itself.

DE_BezierCurve2D(origin geometry, destination geometry, theta numeric, num_points integer, breakval numeric)

  1. origin geometry
  2. destination geometry
  3. theta, which is the maximum bend in the East-West plane
  4. the number of Linestring points or vertices
  5. the break value which splits the Linestring into a Multilinestring at a specified meridian

Example: DE_BezierCurve2D(o_geom, d_geom, 25, 1000, 30)

Origin and destination geometries should be self-explanatory.

Theta defines the maximum degree of ‘flex’ or ‘bend’ in the East-West plane, to which a SIN function is applied to produce the effect of straighter lines in the North-South plane. If you prefer that all curves exhibit the same degree of bend irrespective of the azimuth between their origins and destinations, then modify the bezier function by replacing SIN(ST_Azimuth(ST_Point(o_x, o_y), ST_Point(d_x, d_y))) with the value ‘1’. Or if you want to stipulate a minimum bend in the North-South plane, then enclose the associated formula within a PostgreSQL greatest() function.

The number of Linestring points or vertices is a really handy feature for animating flight paths where you need to synchronise movements along flight paths using a clock. To do this, set the number of points equal to the flight duration in minutes between each origin-destination pair: a 60 minute flight gets 60 points or vertices, a 120 minute flight gets 120. Then dump the flight path vertices into a points table which you can index against a flight timetable.
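
Putting that together (a sketch: the flights table and its columns are hypothetical, and the theta and break values are simply the ones from the example above):

-- one vertex per minute of flight time, for clock-driven animation
SELECT flight_id,
       DE_BezierCurve2D(o_geom, d_geom, 25, duration_mins, 30) AS path
FROM flights;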

The last argument is the break value, which is the meridian at which the flight path will be broken into two Linestrings (thus becoming a Multilinestring). The break value should equal the PROJ4 '+lon_0' value associated with the CRS in which you intend to view the data, plus or minus 180 degrees. An example is given in the github tutorial file.

At some point I’ll release my 3D version of this function which is very cool for making 3D flight path animations.

Enjoy!

by mark wynter at March 05, 2015 02:55 AM

March 02, 2015

Dimensional Edge

From Days to Minutes – Geoprocessing of Alberta Land Use Data

When writing my first blog post "Eating the Elephant In Small Bites", a question on the PostGIS mailing list caught my eye because the challenge posed was not dissimilar to something I once encountered when working with Australian land use data. This time the dataset related to land cover in Alberta, Canada, and the PostGIS query using ST_Intersection() was reportedly taking many, many days to process one particular class of land cover. Fortunately the Alberta dataset is freely available, so I thought to test my PostGIS vector tiling and Map Reduce code pattern – and with great effect! This post describes how I addressed the Alberta land class problem, building upon the concepts and practices introduced in my first blog post to deliver query times of just minutes. As with the "Eating the Elephant in Small Bites" tutorial, our machine setup for this exercise is a CentOS 6.5 Linux server instance running on AWS, with 16 vCPUs and 30GB of memory.

The full code for this solution can be downloaded from github here

Data Understanding

Upon downloading and ingesting the Alberta land cover dataset into PostGIS, the problem becomes self-evident. The land class 'lc_class 34' comprises the fewest geometries, but averages 119 rings per polygon.

SELECT lc_class,
       SUM(ST_NumGeometries(wkb_geometry)) as num_geoms,
       SUM(ST_NPoints(wkb_geometry)) as num_points,
       round(SUM(ST_NRings(wkb_geometry))::numeric/SUM(ST_NumGeometries(wkb_geometry))::numeric,3) as rings_per_geom
FROM lancover_polygons_2010
GROUP BY lc_class
ORDER BY 1;

 lc_class | num_geoms | num_points | rings_per_geom 
----------+-----------+------------+----------------
       20 |     83793 |    1306686 |          1.029
       31 |      1262 |      33366 |          1.115
       32 |      5563 |     291666 |          1.560
       33 |     12231 |     198385 |          1.023
       34 |       366 |    2646625 |        119.046
       50 |    196681 |    4750590 |          1.165
      110 |    154816 |    3137196 |          1.078
      120 |     83069 |    2833293 |          1.282
      210 |    150550 |    6260788 |          1.522
      220 |    197666 |    4793592 |          1.150
      230 |    112034 |    2419651 |          1.045
(11 rows)

Visually, this is how 'lc_class 34' appears in QGIS with green fill.

[image: lc_class34]

[image: lc_class34_zoom]

Upon closer inspection, we see 'lc_class 34' has a trellis-like structure where each contiguous block of white space (i.e. the "negative space") is a polygon inner ring. And therein lies the problem. As with multipolygons, complex polygons with inner rings are stored as arrays, meaning queries are often slowed by the nature of array operations, and indexes are less effective than if each array element is dumped and stored as its own row. But luckily for us, there are standard PostGIS functions we can use to address this.

Solution

I will describe the solution to the Alberta 'lc_class 34' problem sequentially, starting with how to prepare the data by dumping the multipolygons and polygon rings, followed by a description of the fn_worker SQL code pattern. Finally, we parallelise the code across all available cores.

Data preparation involves a two-step "dumping" process, starting with ST_Dump() of the multipolygons, followed by ST_DumpRings() of the polygons dumped in step 1. Note that we record the path number of each polygon ring that is dumped: exterior rings by default have a path value of 0, whilst interior rings have a path value > 0 signifying the interior ring number.

# Create PostGIS tables (with indexes) for the dumped polygons and their rings
SQL=$(cat<<EOF 	 	
-------------------------	 	 
------- SQL BLOCK -------	 	 
-------------------------	 	 
DROP TABLE IF EXISTS landcover_dumped_34;	 	 
CREATE TABLE landcover_dumped_34 (
fid serial,
ogc_fid integer,
wkb_geometry geometry(Polygon, 3400),
lc_class integer,
mod_ty text,
shape_length float8,
shape_area float8
);

INSERT INTO landcover_dumped_34
SELECT
NEXTVAL('landcover_dumped_34_fid_seq'),
ogc_fid,
(ST_Dump(wkb_geometry)).geom,
lc_class,
mod_ty,
shape_length,
shape_area
FROM lancover_polygons_2010 WHERE lc_class = 34;

DROP TABLE IF EXISTS landcover_dumped_34_rings;	 	 
CREATE TABLE landcover_dumped_34_rings (
fid serial,
ogc_fid integer,
path integer,
wkb_geometry geometry(Polygon, 3400)
);

INSERT INTO landcover_dumped_34_rings
WITH s AS (SELECT ogc_fid, (ST_DumpRings(wkb_geometry)).path as path, ST_Buffer(ST_SnapToGrid((ST_DumpRings(wkb_geometry)).geom,0.1)) as the_geom FROM  landcover_dumped_34)  --SNAP and BUFFER TO MAKEVALID
SELECT
NEXTVAL('landcover_dumped_34_rings_fid_seq'),
ogc_fid,
path[1],
the_geom
FROM s;

CREATE INDEX landcover_dumped_34_rings_wkb_geometry_idx ON landcover_dumped_34_rings USING GIST(wkb_geometry);
ANALYZE landcover_dumped_34_rings;

-------------------------	 	 
EOF
)
echo "$SQL"  # comment to suppress printing
# execute SQL STATEMENT or comment # to skip		 	 
psql -U $username -d $dbname -c "$SQL"  ### alternatively, comment out line with single '#' to skip this step

Next we generate a grid layout to serve as the basis for our tiles. For this exercise, we will use a 2km x 2km tile size, as we wish to keep our "chopped up" geometries fairly small to enable fast queries of the 'lc_class 34' tiled geometries at a later date.

#######################################
#########  GRID CREATION  ##########
#######################################
# bash function for creating a regular vector grid in PostGIS
fn_generategrid() {
# get sql custom functions
wget https://raw.githubusercontent.com/dimensionaledge/cf_public/master/lattices/DE_RegularGrid.sql -O DE_RegularGrid.sql
wget https://raw.githubusercontent.com/dimensionaledge/cf_public/master/shapes/DE_MakeSquare.sql -O DE_MakeSquare.sql
# load sql custom functions
for i in *.sql; do
psql -U $username -d $dbname -f $i
done

SQL=$(cat<<EOF 	 	
-------------------------	 	 
------- SQL BLOCK -------	 	 
-------------------------
DROP TABLE IF EXISTS regular_grid_2k;
CREATE TABLE regular_grid_2k AS (
WITH s AS (SELECT DE_RegularGrid(ST_Envelope(ST_Collect(wkb_geometry)),2000) as wkb_geometry FROM abmiw2wlcv_48tiles)
SELECT row_number() over() as tid, wkb_geometry::geometry(Polygon, 3400) FROM s);

-------------------------	 	 
EOF
)
echo "$SQL"  # comment to suppress printing
# execute SQL STATEMENT or comment # to skip	 	
psql -U $username -d $dbname -c "$SQL"
}
# end of bash function

# call the bash function or comment # to skip
fn_generategrid 

#######################################

We also create a vector tile table to hold the outputs of our worker function. Note that the worker function tiles the exterior rings separately from the interior rings. The union of each is taken before applying ST_Difference to remove the "negative space" of the interior rings from each tile.

#######################################
#####  DEFINE WORKER FUNCTION  ######
#######################################
# define the worker function to be executed across all cores
fn_worker (){
source $dbsettings
SQL=$(cat<<EOF
-------------------------
----- SQL STATEMENT -----
-------------------------
INSERT INTO vector_tiles
WITH
f0 AS (
		SELECT
		tid,
		wkb_geometry as the_geom
		FROM regular_grid_2k
		WHERE tid >= $1 AND tid < $2
		),
f1_p0 AS (
		SELECT
		f0.tid,
		CASE WHEN ST_Within(f0.the_geom,rt.wkb_geometry) THEN f0.the_geom
		ELSE ST_Intersection(f0.the_geom,rt.wkb_geometry) END as the_geom
		FROM f0, $3 as rt
		WHERE ST_Intersects(f0.the_geom, rt.wkb_geometry) AND f0.the_geom && rt.wkb_geometry AND rt.path = 0
		),
f1_p0u AS (
		SELECT
		tid,
		ST_Union(the_geom) as the_geom
		FROM f1_p0
		GROUP BY tid
		),
f1_p1 AS (
		SELECT
		f0.tid,
		CASE WHEN ST_Within(f0.the_geom,rt.wkb_geometry) THEN f0.the_geom
		ELSE ST_Intersection(f0.the_geom,rt.wkb_geometry)
		END as the_geom
		FROM f0, $3 as rt
		WHERE ST_Intersects(f0.the_geom, rt.wkb_geometry) AND f0.the_geom && rt.wkb_geometry AND rt.path > 0
		),
f1_p1u AS (
		SELECT
		tid,
		ST_Union(the_geom) as the_geom
		FROM f1_p1
		GROUP BY tid
		),
f2 AS (
		SELECT
		f1_p0u.tid,
		CASE WHEN  f1_p1u.tid IS NULL THEN f1_p0u.the_geom
		WHEN ST_IsEmpty(ST_Difference(f1_p0u.the_geom,f1_p1u.the_geom)) THEN NULL
		ELSE ST_Difference(f1_p0u.the_geom,f1_p1u.the_geom)
		END as the_geom
		FROM f1_p0u LEFT JOIN  f1_p1u
		ON f1_p0u.tid = f1_p1u.tid
		)
------------------------
---- result ----
------------------------
SELECT
NEXTVAL('vector_tiles_fid_seq'),
tid,
ST_Multi(the_geom),
1
FROM f2
WHERE the_geom IS NOT NULL;
-------------------------
EOF
)
echo "$SQL"  # comment to suppress printing
# execute SQL STATEMENT
psql -U $username -d $dbname -c "$SQL"
}
# end of worker function

# make worker function visible to GNU Parallel across all cores
export -f fn_worker

#######################################

Next we create a job list which we proceed to execute in parallel. And voilà!

#######################################
##########  CREATE JOB LIST  ###########
#######################################
# create job list to feed GNU Parallel.
SQL=$(cat<<EOF 	 	
-------------------------	 	 
------- SQL BLOCK -------	 	 
-------------------------
-- create joblist where block size = 1000 tiles (i.e. tiles processed in batches of 1000)
COPY (SELECT i as lower, i+1000 as upper FROM generate_series(1,250000,1000) i) TO STDOUT WITH CSV; 	 
-------------------------
EOF
)	 	 
echo "$SQL"  # comment to suppress printing
# execute SQL STATEMENT 	
psql -U $username -d $dbname -c "$SQL" > joblist.csv

#######################################

#######################################
##########  EXECUTE JOBS  ###########
#######################################
cat joblist.csv | parallel --colsep ',' fn_worker {1} {2} landcover_dumped_34_rings
wait
#######################################

Results of Query Times

The results show the impact of a larger tile size on query run times, and on the complexity of the tiled geometries produced (measured by rings_per_geom). Whilst quadrupling the area of each tile (from 2x2 km to 4x4 km) reduces the run time from 779 seconds to 282 seconds, the rings_per_geom increases, which is understandable given the trellis-like structure of the source data. Both scenarios offer a vast improvement on the many, many days it reportedly took to process the data using a standard ST_Intersection query. As for the acceptability of the tile-size trade-off, it really depends on how critical query times are to the downstream processes that will ultimately consume the tiled geometries. It is a matter of judgement.

2k x 2k tiles
TOTAL SCRIPT TIME: 779 seconds (batch size = 1000 tiles)
 num_geoms | num_points | rings_per_geom 
-----------+------------+----------------
    144799 |    3471051 |          1.055

4k x 4k tiles
TOTAL SCRIPT TIME: 282 seconds (batch size = 250 tiles)
 num_geoms | num_points | rings_per_geom 
-----------+------------+----------------
     57775 |    2966397 |          1.263
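
The complexity figures above come from the same kind of summary query used in the data understanding step, pointed at the tiled output (a sketch; I'm assuming the vector_tiles geometry column is called the_geom, per the worker function):

SELECT SUM(ST_NumGeometries(the_geom)) AS num_geoms,
       SUM(ST_NPoints(the_geom)) AS num_points,
       round(SUM(ST_NRings(the_geom))::numeric / SUM(ST_NumGeometries(the_geom))::numeric, 3) AS rings_per_geom
FROM vector_tiles;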

[image: lc_class34_tiled]

Conclusions

The Alberta land cover problem highlights a few of the challenges analysts typically encounter when working with large, complex datasets. The solution presented here is a practical example of how vector tiling and Map Reduce concepts can deliver real geo-processing efficiency improvements. However, the real business value of reducing query times from days to minutes is to accelerate the 'business time to insight' and to amplify the speed and volume of iterations the analyst can consider as part of the geospatial 'value discovery' process. As this exercise attests, the impact that Spatial IT can make over traditional desktop GIS approaches, in the context of ever-shortening decision windows, is nothing short of 'game changing'. Welcome to the emerging world of Big Data.

by mark wynter at March 02, 2015 06:46 AM

February 27, 2015

Boston GIS (Regina Obe, Leo Hsu)

PostGIS workshops at FOSS4G NA 2015 San Francisco

There will be two PostGIS workshops at FOSS4G NA 2015 on March 9th. Sign up if you haven't already.

  • PostGIS Up and Running 9AM - 12 PM. This is an introductory workshop where we'll cover the basics of configuring PostGIS and using the PostGIS geometry and geography types. We also plan to demonstrate some new features coming in PostGIS 2.2, particularly of the 3D kind. Time permitting, we'll do a quick tour of pgRouting as well.

    Someone asked on IRC if we will be handing out certificates of completion to folks who complete the workshop. Some people need this because they are allowed to attend workshops on company time, but not conferences. The thought hadn't crossed our mind, but we like the idea a lot. So yes, you can have a certificate, complete with Regina and Leo's seal of approval, if you stay thru the whole session. We might even have some door prizes.

  • Advanced spatial analysis with PostGIS. Pierre Racine will be leading this workshop. Expect to be blown away by images of rasters dancing on legs of geometries. He'll also have some other cool advanced spatial analysis stuff beyond raster. Expect a lot of geometry processing tricks in this one.

Sadly I think our PostGIS In Action 2nd Edition is going to be released a little after conference time and probably won't be ready until mid March, so probably just a wee bit too late for FOSS4G NA 2015, but just in time for PGConf.US New York 2015, March 25th-26th. Final book proofing is like getting our teeth pulled. I really hope it's worth the wait. We'll have coupons but no book. We will have some copies of our PostgreSQL: Up and Running 2nd Edition available though. If you've already bought one of our books and want it autographed, bring it along on your trip.

by Regina Obe (nospam@example.com) at February 27, 2015 12:04 AM

February 23, 2015

Dimensional Edge

Eating the Elephant in Small Bites – An Introduction to Vector Tiling and Map Reduce in PostGIS

“Stop trying to eat the elephant with one bite”

[image: elephant]

This tutorial introduces two concepts which I've found invaluable when working with "large" datasets in PostGIS. The first concept is vector tiling, a method whereby large geometries are chunked down into smaller geometries using a regular grid or tile structure, to accelerate the processing times of computationally intensive queries. The second concept is Map Reduce, a scalable programming paradigm for the processing and analysis of large datasets with parallel, distributed algorithms on multi-core platforms.
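
Stripped of the batching machinery that follows, the essence of the vector tiling move is a clip of features against a regular grid (a sketch with hypothetical table and column names):

-- clip each feature to the grid cells it overlaps; many small pieces
-- index and process far faster than one continent-sized multipolygon
SELECT g.tid,
       ST_Intersection(f.geom, g.geom) AS geom
FROM features f
JOIN grid g ON ST_Intersects(f.geom, g.geom);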

In this tutorial, we apply these techniques to variations of a dataset containing large multipolygons. We compare and contrast different scenarios for producing the tiles in order to measure the impact of different strategies on query times. Our findings show query times can be substantially reduced, in this case by a factor of 7, by first “dumping” each member of the polygon collection into its own row. A further 7-fold reduction in query times can be obtained by parallelising the query in batches of 500 tiles across all available cores.

We believe that the on-going development of parallel processing techniques for PostgreSQL/PostGIS can only serve to enhance the power of these tools in their dual capacity as a database backend and as a stand-alone geo-processing engine.

Our machine setup for this tutorial is a CentOS 6.5 Linux instance running on AWS, with 16 vCPUs and 30GB of memory. For Windows users and Linux users who prefer not to script in bash, the parallel execution component of this tutorial can be achieved using Python.

The full code for this tutorial can be downloaded from github here

Data Modelling Objectives

Vector tiles are a recent phenomenon, often applied within the context of web mapping. Much like raster tiles, vector tiles are simply vector representations of geographic data whereby vector features are clipped to the boundary of each tile. This makes the downstream processing or consumption of these features, whether by a web mapping application or, as is the focus of this tutorial, by a PostGIS query, more efficient.

We will demonstrate a process for generating vector tiles in PostGIS using a dataset representing the Australian continent – but with a twist. We also require vector features representing marine waters. Put succinctly, our data modelling objectives are to create:

  • unclipped tile geometries
  • tiled features representing land (i.e. clipped to the coast)
  • tiled features representing marine waters (i.e. the difference not represented by land)
Data Preparation

The source dataset is a shapefile containing the SA2 statistical boundaries of Australia. The shapefile is freely available for download here. The steps for downloading and ingesting the data into PostgreSQL are described below. Note that our database connection settings are stored as variables in a shell script called "current.sh", which is sourced at the top of the code block. I have functionalised parts of the code so that one-off steps (e.g. loading the data) can easily be skipped if the script is run multiple times. You will also see that within the SQL code block we create three variations of the source table, to let us measure the incremental performance gain from "dumping" the multipolygon elements before tiling.

#!/bin/bash
# A tutorial that introduces the concepts of vector tiling and Map Reduce in PostGIS
# Written by: mark[at]dimensionaledge[dot]com
# Licence: GNU GPL version 3.0
# Start timer
START_T1=$(date +%s)

# DB Connection Settings
dbsettings="/mnt/data/common/repos/cf_private/settings/current.sh"
export dbsettings
source $dbsettings

#######################################
######### DATA PREPARATION ##########
#######################################
# bash function for downloading and ingesting required data into PostgreSQL
fn_getdata() {
wget "http://www.abs.gov.au/ausstats/subscriber.nsf/log?openagent&1270055001_sa2_2011_aust_shape.zip&1270.0.55.001&Data%20Cubes&7130A5514535C5FCCA257801000D3FBD&0&July%202011&23.12.2010&Latest" -O sa211.zip
unzip sa211.zip
# Load dataset into PostgreSQL
ogr2ogr -f "PostgreSQL" PG:"host=$host user=$username password=$password dbname=$dbname" SA2_2011_AUST.shp -nln  abs_sa211_multi -s_srs EPSG:4283 -t_srs EPSG:3577 -a_srs EPSG:3577 -nlt MULTIPOLYGON -overwrite
}
# end of bash function
# call the bash function or comment # to skip
fn_getdata 

# Create three additional PostGIS tables (with indexes) representing the union and dumped constituents of all SA2 geometries
SQL=$(cat<<EOF 	 	
-------------------------	 	 
------- SQL BLOCK -------	 	 
-------------------------	 	 
DROP TABLE IF EXISTS abs_sa211_dumped;	 	 
CREATE TABLE abs_sa211_dumped AS	 	 
SELECT row_number() over () as ogc_fid, sa2_main11::text as poly_id, (ST_Dump(wkb_geometry)).geom::geometry(Polygon, 3577) as wkb_geometry FROM abs_sa211_multi;	 	 
CREATE INDEX abs_sa211_dumped_geom_idx ON abs_sa211_dumped USING GIST(wkb_geometry);	 	 

DROP TABLE IF EXISTS abs_aus11_multi;	 	 
CREATE TABLE abs_aus11_multi AS	 	 
SELECT 1 as ogc_fid, ST_Multi(ST_Union(wkb_geometry))::geometry(Multipolygon, 3577) as wkb_geometry FROM abs_sa211_multi;
CREATE INDEX abs_aus11_multi_geom_idx ON abs_aus11_multi USING GIST(wkb_geometry);	 	 

DROP TABLE IF EXISTS abs_aus11_dumped;	 	 
CREATE TABLE abs_aus11_dumped AS	 	 
SELECT row_number() over () as ogc_fid, (ST_Dump(wkb_geometry)).geom::geometry(Polygon, 3577) as wkb_geometry FROM abs_aus11_multi;	 	 
CREATE INDEX abs_aus11_dumped_geom_idx ON abs_aus11_dumped USING GIST(wkb_geometry);	 	 
-------------------------	 	 
EOF
)
echo "$SQL"  # comment to suppress printing
# execute SQL STATEMENT or comment # to skip		 	 
#psql -U $username -d $dbname -c "$SQL"  ### alternatively, comment out line with single '#' to skip this step

#######################################


If we examine the underlying structure of our source tables, we find that each table looks fundamentally different in terms of the number of rows, geometries and points. The extreme case is the abs_aus11_multi table, which comprises one (1) row representing the entire Australian continent. The relevance of querying large multipolygons as well as other key differences will become clearer when we compare the query times for producing our vector tiles.

--abs_sa211_multi
SELECT COUNT(ogc_fid) as num_rows, SUM(ST_NumGeometries(wkb_geometry)) as num_geoms, SUM(ST_NPoints(wkb_geometry)) as num_points FROM abs_sa211_multi;
 num_rows | num_geoms | num_points 
----------+-----------+------------
     2214 |      8917 |    4164582

--abs_sa211_dumped
SELECT COUNT(ogc_fid) as num_rows, SUM(ST_NumGeometries(wkb_geometry)) as num_geoms, SUM(ST_NPoints(wkb_geometry)) as num_points FROM abs_sa211_dumped;
 num_rows | num_geoms | num_points 
----------+-----------+------------
     8917 |      8917 |    4164582

--abs_aus11_multi
SELECT COUNT(ogc_fid) as num_rows, SUM(ST_NumGeometries(wkb_geometry)) as num_geoms, SUM(ST_NPoints(wkb_geometry)) as num_points FROM abs_aus11_multi;
 num_rows | num_geoms | num_points 
----------+-----------+------------
        1 |      6718 |    1622598

--abs_aus11_dumped
SELECT COUNT(ogc_fid) as num_rows, SUM(ST_NumGeometries(wkb_geometry)) as num_geoms, SUM(ST_NPoints(wkb_geometry)) as num_points FROM abs_aus11_dumped;
 num_rows | num_geoms | num_points 
----------+-----------+------------
     6718 |      6718 |    1622598

 

Grid Creation

Next we create a 32km x 32km regular grid or lattice structure that becomes the stencil for each tile. There are many ways to create a lattice structure in PostGIS, but since we work with grids on an almost daily basis, we utilise two custom SQL functions that can also be freely downloaded from our public Github repository. We create a table "regular_grid_32k" to hold the geometries returned by the custom function DE_RegularGrid(), together with a tile_id (which we refer to as a "tid").
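If you would rather not pull in the custom functions, a comparable fixed-extent grid can be sketched in plain SQL with generate_series and ST_MakeEnvelope (the origin and extent values below are placeholders, not the tutorial's actual bounds; DE_RegularGrid() does the equivalent while deriving the extent from the envelope you pass it):

-- a 32km grid over a placeholder extent in EPSG:3577
SELECT row_number() OVER () AS tid,
       ST_MakeEnvelope(x, y, x + 32000, y + 32000, 3577)::geometry(Polygon, 3577) AS wkb_geometry
FROM generate_series(-2000000, 2068000, 32000) AS x,
     generate_series(-5000000, -1032000, 32000) AS y;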

#######################################
#########  GRID CREATION  ##########
#######################################
# bash function for creating a regular vector grid in PostGIS
fn_generategrid() {
# get sql custom functions
wget https://raw.githubusercontent.com/dimensionaledge/cf_public/master/lattices/DE_RegularGrid.sql -O DE_RegularGrid.sql
wget https://raw.githubusercontent.com/dimensionaledge/cf_public/master/shapes/DE_MakeSquare.sql -O DE_MakeSquare.sql
# load sql custom functions
for i in *.sql; do
psql -U $username -d $dbname -f $i
done
SQL=$(cat<<EOF 	 	
-------------------------	 	 
------- SQL BLOCK -------	 	 
-------------------------
DROP TABLE IF EXISTS regular_grid_32k;
CREATE TABLE regular_grid_32k AS (
WITH s AS (SELECT DE_RegularGrid(ST_Envelope(wkb_geometry),32000) as wkb_geometry FROM abs_aus11_multi)
SELECT row_number() over() as tid, wkb_geometry::geometry(Polygon, 3577) FROM s);
-------------------------	 	 
EOF
)
echo "$SQL"  # comment to suppress printing
# execute SQL STATEMENT or comment # to skip	 	
psql -U $username -d $dbname -c "$SQL"
}
# end of bash function

# call the bash function or comment # to skip
#fn_generategrid 

#######################################

(Figure: the 32km x 32km regular grid covering the envelope of the Australian continent.)

Tile Processing and Poor Man's Map Reduce

We now create an output table for the tile features to be produced in accordance with our modelling objectives. Each tile feature generated will be written to a single PostGIS table, referenced by a unique identifier (fid) and a common tile_id (tid), plus a feature category 'fcat', which in this instance takes a value of 0, 1 or 2 to denote the feature type (unclipped tile, land, or water).

#######################################
#########  TILE OUTPUT TABLE  #########
#######################################
SQL=$(cat<<EOF 	 	
-------------------------	 	 
------- SQL BLOCK -------	 	 
-------------------------
DROP TABLE IF EXISTS vector_tiles;
CREATE TABLE vector_tiles (
fid serial,
tid integer,
wkb_geometry geometry(Multipolygon, 3577),
fcat integer
);
------------------------- 
EOF
)
echo "$SQL"  # comment to suppress printing
# execute SQL STATEMENT or comment # to skip		  	  	
psql -U $username -d $dbname -c "$SQL"

#######################################

To populate the table, we define a bash function "fn_worker" which contains a versatile SQL query pattern (a Poor Man's Map Reduce algorithm) that can be executed either as a single process or in parallel, without any need to modify the code. We've written the SQL query using Common Table Expressions to enable subqueries to be used multiple times, as well as to assist overall readability.

Three bash variable substitutions are made at script run time: $1 and $2 define the limits of the tile blocks to be processed, whilst $3 is the name of the source table from which we will generate the fcat1 features (recall we wish to compare how the underlying structure of the four source tables affects computation times).

Again we use psql to execute the SQL statement. Note that the worker function is not limited to the execution of a single SQL statement. Any command line statement can be included in the function, which means varied and sophisticated geo-processing pipelines can be incorporated, combining tools such as GDAL, GRASS and R.

#######################################
#####  DEFINE WORKER FUNCTION  ######
#######################################
# define the worker function to be executed across all cores
fn_worker (){
source $dbsettings
SQL=$(cat<<EOF
-------------------------
----- SQL STATEMENT -----
-------------------------
INSERT INTO vector_tiles
WITH
fcat0 AS (
		SELECT
		tid,
		wkb_geometry as the_geom
		FROM regular_grid_32k
		WHERE tid >= $1 AND tid < $2
		),
fcat1 AS (
		SELECT
		fcat0.tid,
		CASE WHEN ST_Within(fcat0.the_geom,rt.wkb_geometry) THEN fcat0.the_geom
		ELSE ST_Intersection(fcat0.the_geom,rt.wkb_geometry) END as the_geom
		FROM fcat0, $3 as rt
		WHERE ST_Intersects(fcat0.the_geom, rt.wkb_geometry) AND fcat0.the_geom && rt.wkb_geometry
		),
fcat1u AS (
		SELECT
		tid,
		ST_Union(the_geom) as the_geom
		FROM fcat1
		GROUP BY tid
		),
fcat2 AS (
		SELECT
		fcat0.tid,
		CASE WHEN ST_IsEmpty(ST_Difference(fcat0.the_geom,fcat1u.the_geom)) THEN NULL
		ELSE ST_Difference(fcat0.the_geom,fcat1u.the_geom) END as the_geom
		FROM fcat0, fcat1u
		WHERE fcat0.tid = fcat1u.tid
		)
------------------------
---- unclipped tile ----
------------------------
SELECT
NEXTVAL('vector_tiles_fid_seq'),
tid,
ST_Multi(the_geom),
0
FROM fcat0
WHERE the_geom IS NOT NULL
-------------------------
UNION ALL
-------------------------
----  land features  ----
-------------------------
SELECT
NEXTVAL('vector_tiles_fid_seq'),
tid,
ST_Multi(the_geom),
1
FROM fcat1u
WHERE the_geom IS NOT NULL
-------------------------
UNION ALL
-------------------------
---- water features  ----
-------------------------
SELECT
NEXTVAL('vector_tiles_fid_seq'),
tid,
ST_Multi(the_geom),
2
FROM fcat2
WHERE the_geom IS NOT NULL;
-------------------------
EOF
)
echo "$SQL"  # comment to suppress printing
# execute SQL STATEMENT
psql -U $username -d $dbname -c "$SQL"
}
# end of worker function

# make worker function visible to GNU Parallel across all cores
export -f fn_worker

#######################################

We now invoke GNU Parallel, a shell tool for executing jobs in parallel (or sequentially if needed). I prefer GNU Parallel over Python because of the pure simplicity with which one can apply Map Reduce concepts from the command line or bash.

To launch GNU Parallel, we must first construct a job list. My preferred method is to use a CSV file which can be generated dynamically using psql. A CSV file also provides us with an easy way to feed multiple arguments into each job (which the worker function will substitute accordingly).

In our case, we use the generate_series function in PostgreSQL to define how the tiles are to be chunked and processed. The SQL code block shows 3 variants, so be sure to leave only one line uncommented.

#######################################
##########  CREATE JOB LIST  ###########
#######################################
# create job list to feed GNU Parallel.  The examples below show different ways of chunking the work.
SQL=$(cat<<EOF 	 	
-------------------------	 	 
------- SQL BLOCK -------	 	 
-------------------------
-- generate a single job comprising 24321 tiles (i.e. entire table processed on a single core)
COPY (SELECT i as lower, i+24321 as upper FROM generate_series(1,24321,24321) i) TO STDOUT WITH CSV;

-- generate multiple jobs, each comprising 1 tile (i.e. 24321 jobs in total)
--COPY (SELECT i as lower, i+1 as upper FROM generate_series(1,24321,1) i) TO STDOUT WITH CSV;

-- generate multiple jobs each comprising 100 tiles (i.e. tiles processed in batches of 100)
-- COPY (SELECT i as lower, i+100 as upper FROM generate_series(1,24321,100) i) TO STDOUT WITH CSV; 	 
-------------------------
EOF
)	 	 
echo "$SQL"  # comment to suppress printing
# execute SQL STATEMENT 	
psql -U $username -d $dbname -c "$SQL" > joblist.csv

#######################################

The final step is to execute the job list. The syntax for GNU Parallel is fairly straightforward. The left-hand side of the "|" is the name of the job list file, which feeds into the "parallel" command. --colsep ',' tells GNU Parallel that each column of the job list CSV file is delimited by a comma. Next we instruct GNU Parallel to call the worker function (in this case fn_worker). The {n} placeholders are the job list CSV column numbers that tell the worker function what values to substitute for $1 and $2 when launching a process. Finally, "abs_sa211_multi" is the name of the source table that the worker function will substitute for $3 (swap in any of the four source tables to compare timings). And that is it.

#######################################
##########  EXECUTE JOBS  ###########
#######################################
cat joblist.csv | parallel --colsep ',' fn_worker {1} {2} abs_sa211_multi
wait
#######################################

# stop timer
END_T1=$(date +%s)
TOTAL_DIFF=$(( $END_T1 - $START_T1 ))
echo "TOTAL SCRIPT TIME: $TOTAL_DIFF"

# end of script
#######################################

 

Results of Query Times

 
 
(Results table: query times for each source table, comparing a single process against GNU Parallel at various batch sizes.)

We can draw several insights from our results.

  1. The times to vector tile the two abs_sa211 tables are orders of magnitude faster than for the abs_aus11 tables. The reason is that the abs_sa211 tables consist of smaller polygon geometries than the abs_aus11 tables, and the ST_Intersection and union of small features is much faster than the ST_Intersection of big features: the "eating the elephant in small bites" principle. The very fast "single process" query times for the abs_sa211 tables also suggest there is little advantage to be gained by parallelising their tiling with GNU Parallel.
  2. Regardless of whether you use a single process or GNU Parallel, dumping multipolygon elements using ST_Dump is good practice that can produce significant performance gains. When tiling the abs_aus11 tables, the "dumping" step alone gave in excess of a 7x performance improvement. Multipolygon elements are stored as arrays and are represented by one large bounding box, so indexes on large multipolygons are less effective than when each polygon element is dumped and stored as its own row.
  3. A single query process can at times be faster than a GNU Parallel process that treats each tile as a separate job (i.e. batch size = 1). However, this does not always hold true, as with abs_aus11_dumped.
  4. Tile batch size is an important pipeline tuning parameter. Rather than treating each tile as a separate job, it is faster to process the tiles in batches. For this tutorial, 500 tiles per batch produced the fastest overall query time (the job list for this batch size is sketched just after this list). However, this may not hold true for every use case. When processing big, computationally complex queries and data pipelines (such as producing OpenStreetMap routable vector tiles), we find processing individual tiles as separate jobs to be much faster and easier than batching.
  5. Another tuning parameter which we haven't mentioned is the tile size itself. This tutorial assumed a 32km x 32km tile size. Generally speaking, smaller tile sizes will take longer to process than larger ones, which is where GNU Parallel and multiple cores become advantageous.
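To make the batch-size knob in point 4 concrete, the 500-tile job list is just another variation of the generate_series pattern used earlier (24321 is the tutorial's own tile count):

-- generate multiple jobs, each comprising 500 tiles
COPY (SELECT i as lower, i+500 as upper FROM generate_series(1,24321,500) i) TO STDOUT WITH CSV;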
Conclusions

PostGIS and the PostGIS community provide us with an amazing tool and support base that we'd be at a loss without. When learning PostGIS, we tend to invest in great books and incrementally read up on the spatial functions relevant to each use case. But in the emerging era of big data, a topic I believe deserves more attention is query "scalability". Queries that perform okay on small datasets or "pilots" do not always scale well on big datasets, in which case the query patterns themselves can act as "pipeline bottlenecks" in technical speak, or "project showstoppers" in business speak.

This tutorial introduced several techniques for efficiently working with big datasets in PostGIS that lend themselves easily to "scaling out" on a multicore platform. However, understanding the performance of different query designs and patterns is a critical precursor to the scaling phase, where small and "acceptable" inefficiencies can take on ugly new complexions.

Personally I find "Eating the Elephant in Small Bites" an apt and memorable adage that quintessentially captures the essence of vector tiling and query parallelisation in PostGIS. There are many additional factors and techniques to consider when applying these methods in practice (such as substituting the regular grid with a quadgrid function that recursively subdivides the vector tiles to a given depth or feature count). But I took my own advice and decided not to "try to eat the elephant in one bite"; the topic is simply too broad to do the discussion justice in one post. I nevertheless hope this tutorial will stimulate thinking whilst raising awareness among PostGIS community members of the possibilities offered by vector tiling and Map Reduce concepts.

In my next post, I will apply the techniques introduced in this tutorial to efficiently vector tile an Alberta Land Use dataset that is beset with large, highly complex polygons, reducing a query run time from many days to several minutes.

by mark wynter at February 23, 2015 12:05 AM

February 15, 2015

Oslandia

New Oslandia training catalog for 2015

Oslandia publishes its new training catalog for 2015


Oslandia, a training provider, has just published its new training catalog for 2015. It took us a little while, as there are many changes, with a lot of new trainings in various domains.


All the trainings in the catalog are available as in-company sessions (i.e. directly on your premises, at the dates of your choice), and the most frequently run ones are also offered as inter-company sessions on set dates (see the individual training sheets).


In this catalog you will find Oslandia's classics: the PostGIS trainings, which will suit beginners and let the most curious dive deep into the internals of this ever-evolving spatial DBMS. Among our flagship trainings are also the QGIS sessions and GIS web services.


These subjects have been extended and it is now possible to go even further, for example with QGIS processing plugin development using the Processing framework, or coupling QGIS with simulation tools, which opens up a whole new range of applications.


On the map server side, the MapServer training has been updated for version 7, released recently at the Philadelphia code sprint. We have also added a Mapnik training to our catalog.


Among the new topics, we are (finally) opening web development trainings, with 2D web mapping libraries, Leaflet or OpenLayers 3. Most notable, though, is the arrival of the 3D client side, with a session on building 3D GIS architectures with our Cuardo tool ( SIG3D ) or with the Cesium WebGL globe ( CES ).


Another domain making a serious appearance is point cloud management. This kind of data is more and more present in geographic information systems, and the open-source tools to exploit it now exist. With us you can cover the storage and analysis of point clouds, visualization, photogrammetric data processing, or a session dedicated to working with UAV data ( UAV ).


We are also enriching our data processing trainings, notably with a session devoted solely to the multimodal routing engine Tempus ( TEM ).


The full list is in the catalog: http://www.oslandia.com/formations


Upcoming inter-company sessions:

  • May 18-20, 2015: PostGIS implementation
  • May 27-29, 2015: PostGIS advanced features
  • May 29 - June 3, 2015: QGIS plugin development in Python

Do not hesitate to contact us for any further information, to get a quote, or to book a session: formation@oslandia.com

by Vincent Picavet at February 15, 2015 06:00 PM

Oslandia

New Training Catalog at Oslandia

Oslandia publishes its new training catalog for 2015


Oslandia, as a training center, has just published its new training catalog for 2015. It took us a little while, since there are numerous changes and a lot of new trainings in various domains.


Our training sessions are available as in-company training sessions (directly on your premises, at your desired dates). We can offer training in French, English, Spanish and Klingon (additional cost may be charged).


In this catalog, you will find all of Oslandia's classics: the PostGIS sessions will fit beginners' needs, and allow the most curious of you to get their hands dirty with PostGIS internals. Among our favorite topics, find also the QGIS sessions, and everything about GIS web services.


These subjects have been extended and it is now possible to go even further, notably with QGIS Processing plugin development, or coupling QGIS with simulation tools, which opens up a whole new domain of applications.


On the GIS server side, the MapServer training has been upgraded to the most recent version 7, just released during the Philadelphia OSGeo codesprint. We also added a Mapnik training session.


As for completely new domains, we are (at last) opening web development training, with 2D web cartography libraries, Leaflet or OpenLayers 3. But most important are the 3D client classes, with a session on 3D GIS architecture using our Cuardo tool ( SIG3D ) or the Cesium WebGL globe ( CES ).


Another topic that has just appeared in our catalog is Point Cloud data management. This kind of data is more and more present in modern GIS, and open-source software to exploit it now exists. You will be able to learn about the storage and analysis of Point Cloud data, visualization, and photogrammetry data processing, or attend a session dedicated to UAV data management ( UAV ).


Our data processing trainings have also been enhanced, e.g. with a session dedicated to Tempus, our multimodal routing framework ( TEM ).


The full list of trainings is available in our catalog: http://www.oslandia.com/formations



Do not hesitate to contact us for any additional information, to get a quote or to book a session: formation@oslandia.com

by Vincent Picavet at February 15, 2015 06:00 PM

February 13, 2015

Oslandia

Philadelphia OSGeo CodeSprint 2015

Gathering in Philadelphia

Last week, the OSGeo "C tribe" gathered in Philadelphia for its annual codesprint event. Around 40 developers joined to work on their favorite open-source geospatial projects.


The event, perfectly organized by the local folks from Azavea, took place at the Friends Center and ran from Tuesday to Friday, alternating between serious coding time and informal, fun sessions in the evenings.


This is definitely a good time to thank all the sponsors who made the event possible. LocationTech was a gold sponsor; Airborne Interactive and Boundless contributed at the silver level; and bronze sponsors included AGI, CartoDB, Coordinate Solutions, Hobu, Mapzen, Mobile Geographics, OpenSCG and Typesafe. The FOSSGIS.de local chapter also contributed, and Azavea organized everything.


You can have a look at Jody's pictures on Flickr.

Some Results

Good work was accomplished during these 4 days, and here is a little excerpt of some advancements you will soon see in the open-source GIS world.

MapServer Suite 7

The MapServer team has released version 7, a new major version, with the following improvements:

  • Layer Compositing pipeline
  • Optimized runtime substitutions
  • Dynamic Heatmap layers
  • Support of Geomtransform and Styleitem Javascript plugins
  • WFS 2.0 (server side)
  • UTF Grid support
  • Removed GD renderer support
  • ... and some more

As a nice transition to the next subject, have a look at the 3D MapServer team!


Point Clouds

This codesprint saw major efforts on the point cloud data side. Among the projects: PDAL, Plas.io, poTree, GreyHound… This is a really active domain.


But the main point (!) was collaboration and the sharing of ideas, concepts and software on how to use point cloud data on the web. Howard Butler, after a "name the thing" brainstorming session, launched the PointDown initiative, so that the community can collaborate on creating specs for better interoperability.


The process is fully open, and follows the way GeoJSON emerged as a specification. You can read the PointDown Charter and follow the GitHub repository, but the most important thing is to join the discussion.

Oslandia's participation

Vincent Mora and Vincent Picavet took part in the event, and worked mainly on two subjects:

  • Docker-pggis : a docker container for GIS data management around PostGIS
  • Integrating Cesium as a client into our 3D GIS software stack

We worked hard and achieved good results, as you can see below.

docker-pggis

Docker PGGIS is a Docker container including PostgreSQL and all GIS-related components on top of it. Vincent Picavet worked on this project and released version 1.6, including the following enhancements.


  • Worked around a nasty AUFS bug preventing PostgreSQL from launching (nicknamed "the dot effect")
  • Updated PostgreSQL to 9.4 release
  • Updated PostGIS to latest version 2.1.5
  • Updated SFCGAL to latest master version, featuring volume computation, union and difference of 3D geometries
  • Updated pg_routing, pg_pointcloud and PDAL to latest master versions

You can now directly benefit from all of this with a single command line :

sudo docker.io run --rm -P --name pggis_test oslandia/pggis /sbin/my_init

Reminder: this is a tool for development, do not use it in production!

3D buildings in Cesium

Vincent Mora worked on using Cesium as a 3D web client for our 3D GIS software stack. We had already published a demo of Cuardo for visualization of 3D buildings with textures and more.


Now we can have the following architecture in place :

  • 3D data in PostGIS 3D (with textures), allowing us to analyze and mix different kinds of data
  • Serving the 3D data with TinyOWS (a WFS-T server)
  • Visualizing the 3D data in Cesium, including textures or specific symbology

The video below displays a short demo of the current working code.



There is still work needed to clean the code, add tiling, caching and better LOD, and improve performance on the client and server sides, but this achievement proves that we are on the right track for the future of 3D open-source GIS.


Should you be interested in this work, or want to help and fund it, do not hesitate to contact us at infos+3d@oslandia.com. We are really interested in hearing your use cases and requirements.

by Vincent Picavet at February 13, 2015 11:00 AM

Stephen Mather

Normalizing tables in PostgreSQL: experiments with Fulcrum

Fulcrum by Spatial Networks is an interesting and very useful hosted service for collecting geo data in the field with whatever smartphone or tablet you already own.  It works well, so long as you just want to collect points, and the field building capabilities are quite extraordinary. Also, for many applications, it’s cheaper than custom solutions, so it hits that sweet spot well. If you’ve been paying attention, I prefer my hosted solutions to be (nearly) purely open source, so as to avoid vendor lock-in. Fulcrum is sufficiently compelling, we are using it carefully and with a bit of joy, in spite of not meeting my normal preferences.

Screen shot of Fulcrum's form builder

So, I thought I’d share some table normalization we are doing on our wetland inventory data that we’ll be collecting with Fulcrum. The State of Ohio specifies a rapid assessment method for classifying wetlands called Ohio Rapid Assessment Method for Wetlands (ORAM) (warning– pdf). This assessment method has force of law, and on paper is a one-page, two-sided form with a range of physical, water regime, land cover and other characterizations, and to date we’ve always filled it out on paper, transcribed that at the office, and then populated spreadsheets, or more recently a PostGIS/PostgreSQL database.

Where we run into some difficulty is that some of our form values can be multiple values. Fulcrum returns a single field for these, with comma delimited contents:

(Screenshot: a multi-value field returned by Fulcrum as comma-delimited text.)

Since we need to do lookups and calculations on these fields, we really need some normalization. Enter regexp_split_to_table.

CREATE TABLE nr.oram_metrics AS
  SELECT oram_id, 'm1_wetland_area'::text AS metric, regexp_split_to_table(m1_wetland_area, ',')::text AS selection
    FROM nr.cm_oram_data;

Now in our case, we have one lookup table for ORAM. Perhaps it should be multiple tables, but it’s just one discrete (ht. Brian Timoney for grammar fix) thing, this ORAM form, so we keep it simple. This means for each of our columns that should be normalized, we can throw them all in one table. We use UNION ALL to accomplish this:

CREATE TABLE nr.oram_metrics AS
  SELECT oram_id, 'm1_wetland_area'::text AS metric, regexp_split_to_table(m1_wetland_area, ',')::text AS selection
    FROM nr.cm_oram_data

UNION ALL

  SELECT oram_id, 'm2a_upland_buffer_width'::text AS metric, regexp_split_to_table(m2a_upland_buffer_width, ',')::text AS selection
    FROM nr.cm_oram_data;

Finally, we can use a FOREIGN KEY constraint with a table that has all our metrics and selections plus our lookup value to complete the exercise:

CREATE TABLE nr.oram_metrics AS
  SELECT oram_id, 'm1_wetland_area'::text AS metric, regexp_split_to_table(m1_wetland_area, ',')::text AS selection
    FROM nr.cm_oram_data

UNION ALL

  SELECT oram_id, 'm2a_upland_buffer_width'::text AS metric, regexp_split_to_table(m2a_upland_buffer_width, ',')::text AS selection
    FROM nr.cm_oram_data;

ALTER TABLE nr.oram_metrics ADD CONSTRAINT "metric_constraint"
  FOREIGN KEY (metric, selection) REFERENCES nr.oram_score_lookup_all (metric, selection);

by smathermather at February 13, 2015 03:15 AM

February 10, 2015

Postgres OnLine Journal (Leo Hsu, Regina Obe)

Querying MS Access and other ODBC data sources with OGR_FDW

If you have the OGR_FDW we discussed in OGR FDW Windows first taste built with ODBC support, then you can access most any ODBC data source from PostgreSQL. This is especially useful for Windows users. Two of the data sources I've been experimenting with are SQL Server and MS Access. In this article, I'll demonstrate how to connect to MS Access from PostgreSQL running on a Windows box. I think there is an Access driver for Unix/Linux, and the most robust utilizes Java, but I won't go there.
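A minimal sketch of the kind of setup the article walks through, with a hypothetical .mdb path, table and columns (only the datasource, format and layer option names come from ogr_fdw itself):

CREATE EXTENSION IF NOT EXISTS ogr_fdw;

-- point a server at the Access file via OGR's ODBC driver
CREATE SERVER svr_access FOREIGN DATA WRAPPER ogr_fdw
  OPTIONS (datasource 'C:\fdw_data\northwind.mdb', format 'ODBC');

-- expose one Access table (hypothetical columns)
CREATE FOREIGN TABLE access_orders (
  order_id integer,
  customer_id varchar,
  order_date timestamp
) SERVER svr_access OPTIONS (layer 'Orders');

SELECT * FROM access_orders LIMIT 10;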


Continue reading "Querying MS Access and other ODBC data sources with OGR_FDW"

by Leo Hsu and Regina Obe (nospam@example.com) at February 10, 2015 06:30 AM

February 08, 2015

Simon Greener

Morton key function for PostgreSQL/PostGIS [1]

This article shows a Morton key function for PostGIS and how to use it.
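The general shape of a Morton (Z-order) key, interleaving the bits of a cell's x and y indexes into one sortable integer, can be sketched like this (a toy version, not Simon's function):

-- interleave the low 16 bits of x and y into a single sortable key
CREATE OR REPLACE FUNCTION morton_key(x int, y int) RETURNS bigint AS $$
  SELECT bit_or( (((x >> i) & 1)::bigint << (2 * i))
               | (((y >> i) & 1)::bigint << (2 * i + 1)) )
  FROM generate_series(0, 15) AS i;
$$ LANGUAGE sql IMMUTABLE;

SELECT morton_key(5, 3);  -- 27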

by Simon Greener at February 08, 2015 10:55 PM

February 06, 2015

Paul Ramsey

Breaking a Linestring into Segments

Like doing a sudoku, solving a "simple yet tricky" problem in spatial SQL can grab one's mind and hold it for a period. Someone on the PostGIS IRC channel was trying to "convert a linestring into a set of two-point segments", using an external C++ program, and I thought: "hm, I'm sure that's doable in SQL".

And sure enough, it is, though the syntax for referencing out the parts of the dump objects makes it look a little ugly.
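One way to write it (a sketch along those lines, not necessarily Paul's exact query) is to dump the vertices with ST_DumpPoints and join each point to its successor:

-- break a linestring into its two-point segments
WITH pts AS (
  SELECT (dp).path[1] AS n, (dp).geom AS geom
  FROM (SELECT ST_DumpPoints('LINESTRING(0 0, 1 1, 2 2)'::geometry) AS dp) AS d
)
SELECT ST_AsText(ST_MakeLine(a.geom, b.geom)) AS segment
FROM pts a
JOIN pts b ON b.n = a.n + 1;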

by Paul Ramsey (noreply@blogger.com) at February 06, 2015 06:50 PM

January 28, 2015

Postgres OnLine Journal (Leo Hsu, Regina Obe)

Import Foreign Schema hack with OGR_FDW and reading LibreOffice calc workbooks

PostgreSQL 9.4 and below don't support importing a whole set of tables from a FOREIGN server, but PostgreSQL 9.5 does with the upcoming IMPORT FOREIGN SCHEMA. Using it will require FDW designers to be aware of this feature and use the plumbing in their wrappers. IMPORT FOREIGN SCHEMA support in ogr_fdw, come the PostgreSQL 9.5 release, is on the features ticket list. ogr_fdw comes with a to-die-for command-line utility called ogr_fdw_info that generates the table structures for you, and will also list all the tables in the foreign data source if you don't give it a specific table name. So with this utility, I wrote a little hack that uses PostgreSQL's COPY FROM PROGRAM feature to call out to the ogr_fdw_info command-line tool to figure out the table names, plus some DO magic to create the tables.

Though ogr_fdw is designed to be a spatial foreign data wrapper, it's turning out to be a pretty nice non-spatial FDW as well, especially for reading spreadsheets, which we seem to get a lot of. The hack I am about to demonstrate uses a LibreOffice/OpenOffice workbook, but it works equally well with Excel workbooks and most any data source that OGR supports.
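A rough sketch of that plumbing (the spreadsheet path and sheet name are hypothetical; COPY FROM PROGRAM requires superuser):

-- capture ogr_fdw_info's generated DDL, then execute it
CREATE TABLE fdw_ddl(id serial, ddl text);
COPY fdw_ddl(ddl) FROM PROGRAM 'ogr_fdw_info -s /fdw_data/budget.ods -l Sheet1';
DO $$
BEGIN
  EXECUTE (SELECT string_agg(ddl, E'\n' ORDER BY id) FROM fdw_ddl);
END
$$;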


Continue reading "Import Foreign Schema hack with OGR_FDW and reading LibreOffice calc workbooks"

by Leo Hsu and Regina Obe (nospam@example.com) at January 28, 2015 04:57 AM

January 23, 2015

Postgres OnLine Journal (Leo Hsu, Regina Obe)

Installing PostGIS packaged address_standardizer on Ubuntu

One of the changes coming to you in PostGIS 2.2 is additional extensions. Two close to my heart are the address_standardizer (which was a separate project before being folded into the upcoming PostGIS 2.2) and the SFCGAL extension for doing very advanced 3D stuff (which was just an SQL script in older versions, but is an extension in 2.2 with new functions added). We had a need to have the address standardizer running on our Ubuntu box, but since PostGIS 2.2 isn't released yet, you can't get it without some compiling. Luckily the steps are fairly trivial if you are already running PostGIS 2.1. In this article, I'll walk through just building and installing the address_standardizer extension from the PostGIS 2.2 code base. Though I'm doing this on Ubuntu, the instructions are pretty much the same on any Linux, just substituting your Linux package manager.
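Once built and installed, using it is nearly a one-liner (a sketch assuming the sample us_lex, us_gaz and us_rules tables, packaged in 2.2 as the address_standardizer_data_us extension, are loaded):

CREATE EXTENSION address_standardizer;
SELECT house_num, name, suftype, city, state
FROM standardize_address('us_lex', 'us_gaz', 'us_rules',
       '1 Devonshire Place, Boston, MA 02109');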


Continue reading "Installing PostGIS packaged address_standardizer on Ubuntu"

by Leo Hsu and Regina Obe (nospam@example.com) at January 23, 2015 02:54 AM

January 20, 2015

Safe Software

6 Google Maps Engine Alternatives

Google has announced that they’re ending support for Google Maps Engine. If you’ve used Google Maps Engine to store your vector and raster data, you should be aware that your GME data will disappear on January 29, 2016.

No better time than the present to migrate your geographic data to a replacement Maps Engine solution. But where to go? What technology can give you everything GME did?

  • A place to store your geographic data in a web-based environment.
  • Powerful visuals for inspecting and analyzing your data.
  • Easy collaboration on maps.
  • Real-time capabilities like dynamic rendering, alerts and notifications.

Let’s look at a few of the alternatives we’re aware of, and how you can migrate your data there without losing anything important. Please add your own suggestions in the comments.

You might find that one solution suits your requirements better than the others. I’d suggest testing out as many alternatives as you can before making your own decision about which is best.

Check out my follow-up post, How to Migrate your Google Maps Engine Data, to see how to get your data into whatever system you choose.

1. ArcGIS Online

If you want everything you had in Maps Engine, then ArcGIS Online is probably your solution of choice. It’s made for creating interactive web maps and apps, and comes complete with all the analytical power and authoritative data you’d expect from an Esri product. Like GME, AGOL renders maps dynamically, which means it offers better visualization for data that’s constantly being updated.

The ‘apps’ component is interesting: you can turn any of your feature services into apps, adding a new level of power to your data. AGOL also has no quotas or rate limits, offering more to organizations with high data volumes. They also have ArcGIS Open Data, which, if you’ve been using Google Map Gallery with Maps Engine, is an equivalent public dataset collection and distribution platform.

Getting your data out of Google Maps Engine and into ArcGIS Online is a one-step process using FME. Note you don’t need to have ArcGIS Desktop in order to use ArcGIS Online.

2. CartoDB

CartoDB is also a full solution that would work nicely as a GME replacement. In fact, they’ve just published a blog comparing the two and describing how easy it is to move your data from one to the other.

In CartoDB, the web-based environment and spatially aware database components are handled by PostGIS in the cloud, while the visualization, collaboration, and analysis aspects are presented simply and beautifully in a web interface.

I need to stress the ease-of-use here. Last week I needed CartoDB for a demo video, and was apprehensive about needing to learn how to use it first. I created an account, logged in, and—that was it. I already knew how to use it. The simplicity of importing data, creating a map, and choosing how to represent my data was immediately obvious. Kudos, CartoDB.

CartoDB also offers dynamic rendering and geo-temporal visualizations, giving you real-time maps and the ability to see data as it happens.

Again, you can extract your data from Google Maps Engine and load it into CartoDB in one step using FME.

3. iSpatial

iSpatial is a complete map creation, visualization, storage, collaboration, and analysis solution. From their website: “iSpatial is a web-based collaborative framework that leverages Google Earth and Maps in a flexible, task-based approach to solving complex problems.”

In addition to location-based inspection and analysis, it’s set up for real-time reporting and alerts management.

This technology is based on Google Earth/Maps, so if it’s the visual Google experience you’re after, this could be your solution. It also leverages Postgres/PostGIS in the back end.

And, yep, FME can extract your data from Google Maps Engine and put it into PostGIS in one step too.

4. A combination of Google products

Google, of course, suggests replacing GME with a combination of their own solutions. Google Maps/Earth is obviously great for visualizing and creating maps, but what about the rest of what Maps Engine offered?

My Maps is basically the map creation, visualization, and collaboration aspect of GME (up until a few months ago, it was actually called Google Maps Engine Lite). Storage, then, can be handled by migrating your data to Google Cloud SQL—a MySQL database in the cloud. Great, so now we have the web-based environment and database storage aspects.

Next, to bridge the two. Once you get your data from Google Maps Engine into the cloud-hosted MySQL database, you’ll have to follow their guide to build a spatial app using the Google Maps API. They’ve also provided an example vector data app and self-serve raster app for reference. Building stylized maps then involves a bit of JavaScript programming. Not an impossible migration, but not as straightforward as some of the solutions above, either.

If this is your solution, then guess what? You can extract your data from Google Maps Engine and put it into Google Cloud SQL in one step using FME. Google even published a page describing how.

5. Mapbox

Mapbox is a great way to publish your geographic data. It’s scalable, cheap, and open source. This, however, is more of a “partial” GME replacement, and will need to be combined with other technology if you want to have everything you knew and loved about Maps Engine. For instance, it lacks some of the powerful data management and analysis tools found in the solutions above.

Storage of your map tiles is handled by a SQLite database (which FME also migrates to). On the front end, Mapbox uses TileMill for creating interactive maps—a design environment that leverages the stunning cartographic abilities of Mapnik. Creating web map tiles with FME is a common scenario, and MBTiles is on our radar for an upcoming release (potentially FME 2016).

6. Your own custom integration

In addition to those above, FME supports writing to many other cloud spatial data storage systems. If you're not afraid of writing a bit of JavaScript, you can pick your favourite storage system (e.g. Amazon RDS or Aurora), your favourite map creation library/framework, and bridge the two using JavaScript.

Regardless of which technology you choose, rest assured you can easily migrate your data without losing anything. Google provided this page to walk you through exporting your stuff, where FME is one of the recommended migration strategies.

Extracting your data from Maps Engine with FME is as simple as dropping down a reader and entering your credentials. From there, you can move it to any of 335+ alternate systems, including those listed above. FME also offers hundreds of data transformation and quality control options if you need to manipulate the content or structure before loading it into another system.

As always, don’t hesitate to contact our most excellent support team if you need help getting your data wherever it needs to go.

If you’ve been using Google Maps Engine for your geographical data, please share: What alternative solution do you plan on using? Can you suggest any others?

See also: How to Migrate your Google Maps Engine Data

The post 6 Google Maps Engine Alternatives appeared first on Safe Software Blog.

by Tiana Warner at January 20, 2015 10:56 PM

Oslandia

Happy New Year 2015!

The whole team at Oslandia joins us to wish you all the best for 2015, full of professional success and joy in your personal life.


We start this year with renewed dynamism and enthusiasm. Oslandia just turned 5, and this maturity makes us confident in our growth.


We plan for new team members this year, as well as new competences. Our research and development work on 3D goes on, and should allow for new applications. Software components for transportation, water management or simulation are also in our scope for 2015.


Our new training catalog should be available soon, including exciting new topics. We will also organize a set of free workshops in 2015 to present our products and services.


Keep in touch with us; we are looking forward to collaborating with you on your projects. 2015 promises to be a fascinating year!

For any information or enquiry : infos@oslandia.com


See you soon,

Vincent Picavet, Olivier Courtin and the Oslandia team

by Vincent Picavet at January 20, 2015 11:00 AM

January 12, 2015

Boston GIS (Regina Obe, Leo Hsu)

PostGIS funding

A few people have asked how they can contribute to PostGIS work and sadly PostGIS is very disorganized in this area. Part of the problem is no one knows if setting up the work to collect the funds would be worthwhile, whose job it would be to do it, how that person would be reimbursed for the accounting work needed, and how that money would be distributed.

That said, I have decided to start on my own to ask for funding for the recurring PostGIS maintenance work that I do and to treat it as consulting income. This is partly to test the waters to see if getting a more formal funding setup is worthwhile and also so I can devote more time to PostGIS maintenance without becoming a pauper. If you are interested in funding PostGIS maintenance work, refer to this page Paragon PostGIS funding.

It is my hope that eventually we'll have a specific PostGIS fund managed by the PostGIS PSC that can be used to support the work of package maintainers and subsidize travel expenses of PostGIS developers. The PostGIS developer group is very global and getting people together or to go to conferences is expensive. Sadly many of the developers on the PostGIS team who would love to attend conferences on PostGIS behalf are not able to do so. I almost want to cry every time I hear this. Perhaps one day the PostGIS development group can afford to be in the same room together.

by Regina Obe (nospam@example.com) at January 12, 2015 09:40 AM

January 03, 2015

Postgres OnLine Journal (Leo Hsu, Regina Obe)

Updated Foreign Data Wrappers for PostgreSQL 9.3 Windows

As stated in the last article, I've packaged FDW binaries for PostgreSQL 9.3 Windows 32-bit and 64-bit and added in the ogr_fdw one. We've tested these with the standard EDB VC++-built PostgreSQL Windows installs, and they work fine with those.

This package is an updated set of the ones we've distributed before; it includes ogr_fdw and has been recompiled with the latest source from www_fdw and file_textarray.


Continue reading "Updated Foreign Data Wrappers for PostgreSQL 9.3 Windows"

by Leo Hsu and Regina Obe (nospam@example.com) at January 03, 2015 10:05 PM

Boston GIS (Regina Obe, Leo Hsu)

Building GDAL under mingw-w64 with native windows ODBC support

When ogr_fdw came out, I was very excited to try it out on Windows. To start with, I used the default GDAL that we package with PostGIS, which is built under mingw-w64 (both 32-bit and 64-bit versions). Then I thought: imagine how much more I could get by compiling with more drivers. A big one on my list was ODBC. Sadly, this did not work out of the box under mingw-w64.

For Windows folks who want to try out ogr_fdw, I have binaries for PostgreSQL 9.4 FDWs, which include ogr_fdw and some of my other favorites. I do have the 9.3s as well, but haven't written up an article about them until I test them on my production instances; the links to those are the same as the 9.4 ones, just replace 94 with 93 in the download link. These include ODBC support, which should allow you to query a lot of ODBC data sources.

Normally when I try to compile something that depends on ODBC using mingw, the configure scripts or make files are set up to assume you are using unixODBC (which works, but seems to use a different ODBC manager and drags in extra dependencies). The way I normally fix this so it uses the native ODBC support is to replace all references to -lodbc in the configure script with -lodbc32. Yes, even the Windows 64-bit ODBC library is called odbc32; it is put in the System32 folder for 64-bit and SysWOW64 for 32-bit. I imagine Microsoft came up with this confusing convention for backward compatibility. So I did that for GDAL, and to my horror it did not work.

What was the problem? GDAL had an extra dependency on the odbcinst library. Although the odbcinst header is packaged with mingw-w64, there is no associated library. I searched through my Windows/System32 folder and my mingw-w64 libodbc*.a files and couldn't find such a thing. So I removed it from the configure script, and GDAL at least got past configure and happily started compiling, then failed when linking. So I searched for the function it couldn't find, and discovered it lives in a library called odbccp32. So, to compile with ODBC support, as noted in my ogr_fdw_build gist:

  1. Edit the configure script in the GDAL source tree and replace all references to -lodbc with -lodbc32, and -lodbcinst with -lodbccp32.
  2. Then add to your configure invocation --with-odbc=/mingw/${MINGHOST}, where MINGHOST is set to x86_64-w64-mingw32 for the 64-bit mingw-w64 chain and i686-w64-mingw32 for the 32-bit chain.

With this I was able to build libgdal with native Windows ODBC support that depends only on the ODBC system DLLs packaged with Windows (so no extra dependencies). I tested by querying one of my clients' SQL Server databases via ogr_fdw. One issue I ran into is that it didn't handle SQL Server datetime right and kept giving the error ERROR: timestamp(62258384) precision must be between 0 and 6. So I had to change the ogr_fdw_info generated definition to bring in datetimes as varchar instead of timestamp and then just cast them to timestamp as part of my PostgreSQL query. This may be an idiosyncrasy with how lengths have changed in Windows ODBC that I have to patch, or something with ogr_fdw. I haven't tried the MSSpatial driver, but it suddenly showed up as an option after compiling with ODBC support. I also had some issues with UTF encoding, which I was able to work through by stripping high-byte characters in my PostgreSQL query.
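To illustrate the datetime workaround, here is a minimal sketch. The server and table names are hypothetical; the point is simply to map the SQL Server datetime column as varchar and cast it on the PostgreSQL side:

-- Hypothetical foreign table against a SQL Server ODBC source;
-- order_date is a SQL Server datetime, brought in as varchar
-- to dodge the timestamp precision error.
CREATE FOREIGN TABLE mssql_orders (
  fid integer,
  order_date varchar )
  SERVER mssql_svr
  OPTIONS ( layer 'orders' );

-- Cast back to timestamp in the query itself.
SELECT fid, order_date::timestamp AS order_ts
FROM mssql_orders;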

One of my plans for the coming PostGIS 2.2 is to package a GDAL with more drivers -- next on my list being the SQLite family. I also need to figure out a way to make the PostGIS GDAL swappable with the Visual C++-built ones. I can do that with, for example, curl and libxml, but the VC++ GDAL, last I looked, seemed to expose weirdly named symbols, so it never hooks up. If I can get this going, then people can just swap out the GDAL we package with PostGIS for their own to get proprietary drivers like MrSID and so forth. Why don't I just try to build it with VC++? Because then I've got to ship a VC++ runtime if I don't build with the same one EDB is using, and they use different flavors for each version of PostgreSQL. I also feel uncomfortable around Visual Studio for anything other than web development.

by Regina Obe (nospam@example.com) at January 03, 2015 12:19 AM

December 28, 2014

Postgres OnLine Journal (Leo Hsu, Regina Obe)

Foreign Data Wrappers for PostgreSQL 9.4 Windows

As stated in the last article, I've packaged FDW binaries for PostgreSQL 9.4 Windows 32-bit and 64-bit and added in the ogr_fdw one. We've tested these with the standard EDB VS-built PostgreSQL Windows installs, and they work fine with those.


Continue reading "Foreign Data Wrappers for PostgreSQL 9.4 Windows"

by Leo Hsu and Regina Obe (nospam@example.com) at December 28, 2014 11:02 AM

December 27, 2014

Postgres OnLine Journal (Leo Hsu, Regina Obe)

OGR foreign data wrapper on Windows first taste

This Christmas I received something very special from Paul Ramsey and Even Rouault, as detailed in Foreign Data Wrappers for PostGIS. It's something I've been patiently waiting for for 4 years. I think it has a few issues I'm working to replicate, but overall it's much faster than I expected and pretty slick.

So why is ogr_fdw so special? Because GDAL/OGR is an avenue to many data sources, NOT JUST GEOSPATIAL ones. It's the NOT JUST that I am most excited about. Though the focus is geospatial, you can use it with non-geospatial data sources, as we described a long time ago in OGR2OGR for data loading.


Continue reading "OGR foreign data wrapper on Windows first taste"

by Leo Hsu and Regina Obe (nospam@example.com) at December 27, 2014 09:26 PM

December 19, 2014

Boundless Geo

OGR FDW FTW!

Merry Christmas, nerds! Have you ever wanted to connect your PostGIS database directly to an existing GIS file or database and read from it without importing the data? I have good news; repeat after me: OGR FDW FTW!

(How did this happen? Boundless provides its engineers with “innovation time” to pursue personal technical projects, and this year I chose to blow all my time in one go on this project. Like the idea of innovation time? Join us!)

OGR is the vector subsystem of GDAL, the open source format library. The OGR API lets applications read and write many different formats (Shapefile, Oracle, SDE, MapInfo, PostGIS, FGDB, GML, etc.) without having to import them first.

FDW is the PostgreSQL API for implementing “foreign data wrappers”: virtual tables in the database that are actually connections to remote data files, repositories, and servers. There are FDW implementations to connect to MySQL, Oracle, SQLite, and even flat text files!

FTW is “for the win”! Because the OGR API is a simple table-oriented read-write library that is practically begging to be hooked up to the PostgreSQL FDW system to expose OGR data sources as tables in PostgreSQL.

Here’s how it works.

First, go to the source code repository, build and install the ogr_fdw extension.

Next, create the ogr_fdw extension and the postgis extension.

CREATE EXTENSION postgis;
CREATE EXTENSION ogr_fdw;

Now create a “server”. For a database FDW, the server would be an actual database server somewhere. For the OGR FDW, a server is an OGR connection string: it could be a database server, a directory full of files, or (as in this case) a web service:

CREATE SERVER opengeo
  FOREIGN DATA WRAPPER ogr_fdw
  OPTIONS (
    datasource 'WFS:http://demo.opengeo.org/geoserver/wfs',
    format 'WFS' );

Now create a “foreign table”. This will look just like a table to the database, but accessing it will actually trigger a query against the remote server.

CREATE FOREIGN TABLE topp_states (
  fid integer,
  geom geometry,
  gml_id varchar,
  state_name varchar,
  state_fips varchar,
  sub_region varchar,
  state_abbr varchar,
  land_km real,
  water_km real )
  SERVER opengeo
  OPTIONS ( layer 'topp:states' );

Now, treat the table like you would any other PostGIS table, and ask it a question in SQL:

SELECT st_area(geom::geography) 
FROM topp_states 
WHERE state_name = 'Missouri';

And the answer comes back: 180863 sq km.

How does it work? The PostgreSQL query fires off an OGR query to the server (in this case, the OpenGeo Suite demo server) which pulls the table down, and it is then filtered and calculated upon locally in PostgreSQL.

Could it be better? Sure!

It could push SQL restriction clauses down to the OGR driver, reducing the quantity of data pulled back into PostgreSQL. For big tables, this will be very important.

It could restrict the number of columns it returns to just the ones needed for the query. This will make things a little faster.

It could allow read/write access to the table, so that INSERT, UPDATE and DELETE queries can also be run. This opens up a whole world of interoperability possibilities: imagine your database being able to directly edit a File Geodatabase on the file system? Or directly edit an ArcSDE server in a workgroup?

The biggest limitation of the PostgreSQL FDW system is that it requires a table definition before it can work, so you need a priori knowledge of the table structure to set up your tables. Because this just creates busywork, I've also bundled a utility program with the ogr_fdw extension: ogr_fdw_info. The utility reads an OGR data source and layer and returns the SQL you need to create an FDW server and table for reading that layer.
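For example, pointing the utility at the WFS source used above would emit SQL along these lines (sketched here; the exact server name and column list depend on what OGR reports):

-- ogr_fdw_info -s "WFS:http://demo.opengeo.org/geoserver/wfs" -l "topp:states"
CREATE SERVER myserver
  FOREIGN DATA WRAPPER ogr_fdw
  OPTIONS (
    datasource 'WFS:http://demo.opengeo.org/geoserver/wfs',
    format 'WFS' );

CREATE FOREIGN TABLE topp_states (
  fid integer,
  geom geometry,
  gml_id varchar,
  state_name varchar
  -- ... remaining columns as in the definition above
  )
  SERVER myserver
  OPTIONS ( layer 'topp:states' );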

Enjoy wrapping your foreign tables, and enjoy the holidays!

The post OGR FDW FTW! appeared first on Boundless.

by Paul Ramsey at December 19, 2014 05:11 PM

PostGIS Development

Foreign Data Wrappers for PostGIS

The last couple weeks have seen two interesting updates in the world of PostgreSQL “foreign data wrappers” (FDW). Foreign data wrappers allow you to access remote data inside your database, exactly like other tables. PostgreSQL ships with two example implementations, one for accessing text files, and the other for accessing remote PostgreSQL servers.
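For instance, the shipped file_fdw can expose a CSV file as a table. A minimal sketch, with a made-up file path and columns:

CREATE EXTENSION file_fdw;

CREATE SERVER file_server
  FOREIGN DATA WRAPPER file_fdw;

-- Hypothetical CSV of city names and populations
CREATE FOREIGN TABLE cities (
  name varchar,
  pop integer )
  SERVER file_server
  OPTIONS ( filename '/tmp/cities.csv', format 'csv', header 'true' );

SELECT name FROM cities WHERE pop > 100000;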

The two updates of interest to PostGIS users are:

  • The Oracle FDW implementation was recently enhanced to support spatial columns, so an Oracle table with SDO_GEOMETRY columns can be exposed in PostgreSQL as a table with PostGIS geometry columns.
  • A new OGR FDW implementation was released that supports exposing any OGR data source as a table with PostGIS geometry columns.

Now you can access your spatial data from PostGIS without even going to the trouble of importing it first!

by Paul Ramsey at December 19, 2014 12:00 AM

December 18, 2014

Oslandia

New version of "Tempus", a framework for multimodal route planning


Oslandia and IFSTTAR are pleased to announce the release of a new version of their multimodal route planning framework: Tempus



Tempus is an open source C++/Python framework for using, developing, and testing route planning algorithms, with a particular focus on multimodal routing and multi-objective requests, where every possible transport mode is taken into consideration for a trip: private cars, public transport, shared vehicles, bikes, etc.



Tempus features:

  • a C++ core based on Boost::graph for efficient and generic multimodal graph manipulations
  • a (dynamically loadable) plugin architecture for routing algorithms
  • exposure of its API as a web service through a FastCGI WPS server that can respond to simultaneous requests
  • a client for QGIS
  • a loader script that can load road network data from Navteq, TomTom and OpenStreetMap, public transport data from GTFS, and points of interest from CSV or shapefiles
  • a Python request API

Tempus is distributed under the terms of the GNU LGPLv2+.


Tempus is on GitHub; get it, fork it, PR it: https://github.com/ifsttar/Tempus


The focus of Tempus is both on delivering an engine able to run efficient state-of-the-art routing algorithms and on offering a complete development environment to designers of routing algorithms. A complete set of data from different sources can be integrated into a main (PostGIS) database thanks to the loader component. The modular architecture allows algorithms to be developed as plugins for the core. Some mechanisms are dedicated to helping during development: user variables for plugins, the possibility to output parts of the traversed graph, user metrics, etc.


This 1.2.0 release brings:

  • a routing algorithm able to deal with turning movement restrictions for each transport mode
  • a reverse graph adaptor that allows the graph to be seen reversed, so requests with an 'arrive before' constraint can be processed
  • a tool (in the QGIS plugin) to select a subset of the graph to work on
  • support for speed profiles on roads
  • an implementation of the A* algorithm

As with previous releases, documentation is available, including notes on compilation and installation.


A Windows installer is also available. It can be used to install Tempus on Windows, as well as (if necessary) QGIS 2.4 and PostGIS 2!


Development is still ongoing to include state-of-the-art algorithms, such as contraction hierarchies or Raptor.


Please let us know if you are interested in development, training or consulting around these pieces of technology. Do not hesitate to contact us at infos@oslandia.com .

by Hugo Mercier at December 18, 2014 09:00 AM

PostGIS Development

PostGIS 2.1.5 Released

The 2.1.5 release of PostGIS is now available.

The PostGIS development team is happy to release a patch for PostGIS 2.1: the 2.1.5 release. As befits a patch release, the focus is on bugs, breakages, and performance issues.

http://download.osgeo.org/postgis/source/postgis-2.1.5.tar.gz

Continue reading by clicking the title hyperlink...

by Paul Ramsey at December 18, 2014 12:00 AM

December 12, 2014

Boston GIS (Regina Obe, Leo Hsu)

AvidGeo LocationTech Tour Boston 2014, Synopsis

This past Monday we gave an introductory workshop on PostGIS. Our slides are here: PostGIS tutorial. Overall the event was very educational. I did have a butterfly feeling throughout, though, since we were the last to speak.

Talk slides

The talks were good, but most of the topics were not new to me, so they were harder to appreciate. The best talk, I thought, was given by Steve Gifford on Geospatial on Mobile. Two things I liked about it: it covered a topic I didn't know much about (developing native 3D globe map apps for iOS and Android), and Steve was a very funny speaker, with that kind of funniness that looks unplanned and natural. I wish it had a bit more Android content, though. The full list of talks and associated slides is below.

Workshops

The workshops, I think, were the best part of the event. I particularly enjoyed Andrew Hill's CartoDB talk (PostGIS without the pain but with all the fun), mostly because I hadn't used CartoDB before, so it was the first time I'd seen it in action. Being able to drag and drop a zip file from your local computer, or point at a URL of data from some site, and have a table ready to query I thought was pretty cool. You could also schedule it to check for updates if it was a URL to a shapefile zip or something. Of course, being able to write raw PostGIS spatial queries and build a map app in 5 minutes was pretty slick.

Ours came after, and unfortunately I think it was pretty dry and too technical for most GIS folks. Plus, we hadn't told people to download the files beforehand, so it was next to impossible to follow along. We should have called it PostGIS with the pain and on a shoestring budget.

by Regina Obe (nospam@example.com) at December 12, 2014 10:11 PM

December 06, 2014

Postgres OnLine Journal (Leo Hsu, Regina Obe)

Oracle FDW 1.1.0 with SDO_Geometry PostGIS spatial support

Oracle FDW is a foreign data wrapper PostgreSQL extension that allows you to read and write to Oracle database tables from a PostgreSQL database. You can get it via the PGXN network or the main website http://laurenz.github.io/oracle_fdw/.

What is new in the latest 1.1.0 release is support for the Oracle SDO_GEOMETRY type, which allows you to map the most common geometry types POINT, LINE, POLYGON, MULTIPOINT, MULTILINE and MULTIPOLYGON to the PostGIS geometry type. Much of the spatial plumbing work was done by Vincent Mora of Oslandia. If we have any Windows Oracle users out there: yes, there are binaries available for Windows for PostgreSQL 9.1 through 9.4, both 32-bit and 64-bit. The FDW does have a dependency on OCI.dll, which I think comes shipped with Oracle products. Unfortunately, we are not Oracle users, so we can't kick the tires.
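To give a flavor of how the mapping looks, here is a minimal sketch; the connection string, credentials, and table are invented, so check the oracle_fdw documentation for the exact options:

CREATE EXTENSION oracle_fdw;

CREATE SERVER oradb
  FOREIGN DATA WRAPPER oracle_fdw
  OPTIONS ( dbserver '//oracle.example.com:1521/ORCL' );

CREATE USER MAPPING FOR CURRENT_USER
  SERVER oradb
  OPTIONS ( user 'scott', password 'tiger' );

-- An SDO_GEOMETRY column on the Oracle side comes through as PostGIS geometry
CREATE FOREIGN TABLE ora_parcels (
  id integer,
  name varchar,
  geom geometry )
  SERVER oradb
  OPTIONS ( schema 'SCOTT', table 'PARCELS' );

SELECT id, ST_Area(geom) FROM ora_parcels;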

by Leo Hsu and Regina Obe (nospam@example.com) at December 06, 2014 05:33 PM

December 02, 2014

Paul Ramsey

The Tyranny of Environment

Most users of PostGIS are safely ensconced in the world of Linux, and their build/deploy environments are pretty similar to the ones used by the developers, so any problems they might experience are quickly found and removed early in development.

Some users are on Windows; in fact, they are our most numerous user base, so we at least test that platform preemptively before release and make sure it is as good as we can make it.

And then there's the rest. We've had a passel of FreeBSD bugs lately, I've found myself doing Solaris builds for customers, and don't get me started on the poor buggers running AIX. One of the annoyances of trying to fix a problem for a "weird platform" user is just getting the platform set up and running in the first place.

So, having recently learned a bit about Vagrant, and seeing that some of the "weird" platforms have boxes already, I thought I would whip off a couple of Vagrant configurations so it's easy in the future to throw up a Solaris or FreeBSD box, or even a quick CentOS box, for debugging purposes.

I've just been setting up my Solaris Vagrantfile and using my favourite Solaris crutch: the OpenCSW software repository. But as I use it, I'm not just adding the "things I need", I'm implicitly choosing an environment:

  • my libxml2 is from OpenCSW
  • so is my gcc, which is version 4, not version 3
  • so is my postgres

This is convenient for me, but what are the chances that it'll be the environment used by someone on Solaris having problems? They might be compiling against libraries from /usr/sfw/bin, or using the Solaris gcc-3 package, or any number of other variants. At the end of the day, when testing on such a Solaris environment, will I be testing against a real situation, or a fantasyland of my own making?

For platforms like Ubuntu (apt) or Red Hat (yum) or FreeBSD (port) where there is One True Way to get software, the difficulties are less, but even then there is no easy way to get a "standard environment", or to quickly replicate the combinations of versions a user might have run into that is causing problems (libjson is a current source of pain). DLL hell has never really gone away, it has just found new ways to express itself.

(I will send a psychic wedgie to anyone who says "docker", I'm not kidding.)

by Paul Ramsey (noreply@blogger.com) at December 02, 2014 09:22 PM

November 30, 2014

Archaeogeek (Jo Cook)

PostGIS Day 2014

On 20th November this year we held the first PostGIS “Show-and-Tell” Day at the HQ of the British Computer Society. This was the first “official” OSGeo:UK event since hosting FOSS4G 2013 in Nottingham last year; it took us that long to recover from the stress!

We deliberately chose to make the event extremely light and informal, with the minimum of organisation, and this seemed to work pretty well. We had a range of speakers doing both longer talks and lightning talks, and then a session at the end where we discussed useful tools and add-ons for working with PostGIS.

The only caveats we placed on speakers were that the talk should (obviously) include PostGIS or PostgreSQL in some way, and that overt sales pitches would be considered uncool. Luckily our speakers obliged! There was also some really good audience interaction.

Highlights for me included Foreign Data Wrappers, for getting at information stored in other databases or behind other APIs through PostgreSQL, and raster processing. I was particularly excited by the Twitter FDW, but it turns out that it doesn’t work with the latest version of the Twitter API, and it doesn’t look like a particularly well-maintained piece of code. You can see links to all the talks, and notes from the discussion we had at the end, at our OSGeo:UK GitHub repository for the event: http://osgeouk.github.io/pgday/.

It transpired that there were a number of other events globally, also celebrating #PostGISDay, and perhaps next year it would be good to coordinate these better.

From an OSGeo:UK perspective, we have plans to hold more of these light-touch events on a range of subjects (ideas gratefully received). Personally, I would like to have charged a nominal fee for attending the event, not to make a profit but so that we could make a small donation back to PostGIS, but charging money makes the whole process more complicated! We were extremely lucky in this case to get the venue for free, courtesy of the BCS Location Information Group, and to get the food sponsored by Astun Technology.

All in all, a really fun event, both to organise and to attend- thanks to all the speakers and attendees, and here’s to another event quite soon.

November 30, 2014 08:00 AM

November 27, 2014

Boston GIS (Regina Obe, Leo Hsu)

osm2pgrouting for Windows

For loading OSM data in a format ready for pgRouting queries, the two common tools I've seen used are osm2po and osm2pgrouting. Unfortunately, osm2pgrouting has long been a Unix-only tool, mostly because no one tried to compile it for Windows or test whether it works there. This meant that Windows users who wanted a quick road to pgRouting nirvana had to use osm2po. It turns out the osm2pgrouting code compiles fine on Windows (at least under the mingw-w64 chain) and seems to work fine, so we should start making it available for Windows.

This is still experimental, but if you are a Windows user (or have a pgRouting Windows-user friend) and want to kick the tires a bit, you'll find osm2pgrouting for Windows (both 32-bit and 64-bit binaries) on the PostGIS windows download page in the Unreleased section.

I compiled osm2pgrouting (for pgRouting 2.0) against PostgreSQL 9.4 using the same mingw-w64 chain I use to build PostGIS and pgRouting for Windows, clumsily detailed in my gist shell scripts. Though it depends on libpq, it seems to work fine if you use it against the PostgreSQL 9.3 libpq. So you just need to copy the additional files (osm2pgrouting and expat) into your PostgreSQL bin folder and you are good to go loading data into a pgRouting-enabled database. It is assumed you already have PostGIS 2.1+ and pgRouting 2.0 installed, which come bundled together as part of the PostGIS 2.1+ Windows installers and are also available on the PostGIS windows download page.
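Once loaded, a quick sanity check against the generated ways table might look like the following. This is a sketch assuming the usual osm2pgrouting column names (gid, source, target, cost); adjust to match your load:

-- Shortest path between two vertex ids with pgRouting 2.0's pgr_dijkstra
SELECT seq, id1 AS node, id2 AS edge, cost
FROM pgr_dijkstra(
  'SELECT gid AS id, source, target, cost FROM ways',
  1,       -- source vertex id
  100,     -- target vertex id
  false,   -- directed graph?
  false);  -- use reverse_cost?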

Which should you use: osm2pgrouting or osm2po?

In many ways osm2po is superior to osm2pgrouting and has had more time invested in it. It can handle larger files on smaller machines, can work with the OSM PBF format, and supports both Unix-like and Windows platforms. That said, osm2po is not absolutely superior to osm2pgrouting; it is inferior in a few ways that are important to me.

  • It's not open source; it's freeware -- which means you can't kick the code around and learn from it or inspect it. It also means you can't change it and contribute back your changes.
  • The primary purpose of osm2po is not to be a loading tool for pgRouting; osm2po is a complete routing engine with a web server that doesn't even rely on PostgreSQL. So if the maintainer decides one day that providing a pgRouting export is no longer beneficial to osm2po's development path, or the pgRouting devs need to tweak it a bit to work with a newer pgRouting, it will be difficult.
  • It's written in Java. This is my own personal dislike. Not that anything is wrong with Java; it's just sometimes a bit of a nuisance to get the JVM going on a server I happen to be stuck on where I can't install anything I want. osm2pgrouting is much lighter weight, not requiring a JVM, though it does require PostgreSQL, PostGIS, and pgRouting -- which you should have already.
  • osm2po generates an intermediary SQL file which you then have to load via psql. I prefer the direct-load approach of osm2pgrouting over having a big SQL file to contend with afterward.

That said, osm2po is great in many ways, but alternatives are always nice to have. The only annoying thing about having alternatives is deciding which one is best for what you are doing.

by Regina Obe (nospam@example.com) at November 27, 2014 11:34 PM

November 21, 2014

Postgres OnLine Journal (Leo Hsu, Regina Obe)

PostGIS Day synopsis

Yesterday was PostGIS Day, or as some may call it, Post GIS Day, and a couple of interesting things happened:

  • PostgreSQL 9.4 RC1 came out.
  • There were parties and unconferences, many summarized on http://2014.postgisday.rocks
  • I managed to entertain myself with Conway's Game of Life, PostGIS raster map algebra style, and pondered how wonderful PostGIS would be if it could generate animated GIFs with some sort of aggregate function; for which I was politely called crazy by some of my fellow PSC friends.

But what was greatest of all and took the cake were these pictures:


Continue reading "PostGIS Day synopsis"

by Leo Hsu and Regina Obe (nospam@example.com) at November 21, 2014 09:16 PM

November 20, 2014

Boundless Geo

Happy PostGIS Day!

Yesterday was GIS Day, which means that today is PostGIS Day — get it? Post-GIS Day! To celebrate, we’re giving a 50% discount on online PostGIS training through the end of the week! Visit our training catalog and use promo code “postgis2014” to take advantage of this offer.

A lot has happened since last year, when I joined Stephen Mather, Josh Berkus, and Bborie Park in a PostGIS all-star hangout with James Fee.

In case you missed them, here are some features from our blog and elsewhere that highlight what’s possible with PostGIS:

And, as always, be sure to check out our workshops for a slew of PostGIS-related courses, including Introduction to PostGIS and our Spatial Database Tips & Tricks.

Interested in learning about or deploying PostGIS? Boundless provides support, training, and maintenance for installations of PostGIS. Contact us to learn more.

The post Happy PostGIS Day! appeared first on Boundless.

by Paul Ramsey at November 20, 2014 01:54 PM

Boston GIS (Regina Obe, Leo Hsu)

PostGIS Day Game of Life celebration

For this year's PostGIS Day, I decided to celebrate with a little Conway's Game of Life fun, inspired by Anita Graser's recent blog series Experiments with Game of Life. The path I chose to simulate the Game of Life is a little different from Anita's. This variant exercises PostGIS 2.1+ raster map algebra and PostgreSQL 9.1+ recursive queries. Although you can do this with PostGIS 2.0, the map algebra syntax I am using is only supported in PostGIS 2.1+. My main disappointment is that, because PostGIS does not yet support direct generation of animated GIFs, I had to settle for a comic strip built by unioning frames of rasters instead of a motion picture. Hopefully some day my PostGIS animated GIF dream will come true.
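To sketch the shape of the approach (not the actual code from the post): a recursive CTE can carry a raster forward one generation per step, with the Game of Life rules applied by a map algebra wrapper. Here life_step(raster) and the life_seed table are hypothetical stand-ins:

-- Skeleton only: life_step(raster) would wrap ST_MapAlgebra with a
-- callback implementing the birth/survival rules (not shown here).
WITH RECURSIVE gens(n, rast) AS (
  SELECT 0, rast FROM life_seed        -- generation zero
  UNION ALL
  SELECT n + 1, life_step(rast)
  FROM gens
  WHERE n < 8                          -- number of frames to build
)
SELECT n, rast FROM gens;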


Continue reading "PostGIS Day Game of Life celebration"

by Regina Obe (nospam@example.com) at November 20, 2014 07:04 AM

November 18, 2014

Postgres OnLine Journal (Leo Hsu, Regina Obe)

FOSS4GNA 2015 and PGDay San Francisco March 2015

PGDay 2015 San Francisco will be held March 10th, 2015 at the Hyatt San Francisco Airport in Burlingame, CA (just outside of San Francisco). This year PGDay will be hosted alongside the Free and Open Source Geospatial North America (FOSS4GNA) conference 2015, which runs March 9th-12th, and EclipseCon NA 2015. Speaker submissions for FOSS4GNA 2015 and EclipseCon NA 2015 end this Monday, November 17th, 2014.


Continue reading "FOSS4GNA 2015 and PGDay San Francisco March 2015"

by Leo Hsu and Regina Obe (nospam@example.com) at November 18, 2014 03:46 AM

November 03, 2014

Boston GIS (Regina Obe, Leo Hsu)

AvidGeo Conference: LocationTech Tour Boston, Dec 8 2014

We'll be giving a PostGIS Introduction tutorial at AvidGeo Conference: LocationTech Tour Boston which will be held 8 December 2014 from 8:30 AM to 4:30 PM at:

Hack / Reduce
275 Third St
Cambridge, MA, 02142

Tickets are $40 for admission.

In addition to our PostGIS Intro tutorial, there is a great line-up of talks from visionaries such as Adena Schutzberg, speaking on the State of the Geospatial Industry, and Raj Singh, on The NoSQL Geo Landscape. Also check out the other tutorials, on CartoDB and QGIS.

by Regina Obe (nospam@example.com) at November 03, 2014 06:40 AM

October 26, 2014

Stephen Mather

Airspace — A deep rabbit hole

In previous maps we looked at Class B, C, and D airspace. Let’s add in Class E0 and E5… (not yet in 3D):

(Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under ODbL)

Map showing Class B, C, D, E0, and E5 airspace

Previous posts:

https://smathermather.wordpress.com/2014/10/25/airspace-is-complicated-and-so-i-abuse-postgis-once-again/

and

https://smathermather.wordpress.com/2014/10/25/airspace-is-complicated-and-so-i-abuse-postgis-once-again-reprise/


by smathermather at October 26, 2014 02:16 AM