Welcome to Planet PostGIS

June 22, 2017

BigSQL Holly

pgAdmin3 LTS Now Supports PostgreSQL 8.4 – 10

We are pleased to announce pgAdmin3 LTS is now available on OSX and Windows for PostgreSQL 10!

When v1.0 of pgAdmin4 was released in September 2016, the pgAdmin Development Team decided to stop supporting pgAdmin III with further fixes. We believe pgAdmin3 usage should fade out over time, and have forked the code to include basic support for PostgreSQL 9.6 and later.

Try it out!

Install pgAdmin3 LTS by BigSQL

If you haven’t already installed PostgreSQL, start here.

Via Command Line

Windows users should omit the ./ prefix on the pgc commands shown below:

cd <directory of installation>
./pgc install pgadmin3
./pgc start pgadmin3


Via pgDevOps

If you haven’t already, install pgDevOps.

In pgDevOps, navigate to Package Manager, select the pgAdmin3 component, and install.

Already have pgAdmin3 installed?

cd <directory of installation>
./pgc update
./pgc upgrade pgadmin3

Wanna delete pgAdmin3?

cd <directory of installation>
./pgc remove pgadmin3

As always, give us your feedback so that we can continue to support and respond to the needs of the PostgreSQL community.

by Holly Orr at June 22, 2017 06:15 PM

June 19, 2017

BigSQL Holly

GDAL New Release 2.2 – BigSQL Makes it Easy to Install

The GDAL community just announced the release of GDAL 2.2… and BigSQL just made it easier to implement!

Install it Now

We support Diegos and Yodas with 2 ways to install:

Option 1: Install with the pgDevOps GUI

For GIS analysts (scientists, statisticians, analysts, etc) who find themselves as the de facto DBA (among other things).

“I don’t have time for this! I’m saving this sea turtle. Just give me a GUI!”

1. Go to www.bigsql.org/postgresql/installers and download and run the installer for your operating system.

2. Make sure to check the box for including the pgDevOps component with your install.

3. Open pgDevOps in your web browser:

    <ip of machine>:8051

If installing on your local machine, use localhost:8051.

4. In the pgDevOps dashboard click the Package Manager icon.

5. Click the PostgreSQL installation you want to install on.

6. Click on the PostGIS icon and follow the instructions to install.

That’s it!

Option 2: Install via Command Line and PGC

For DBAs who find themselves supporting these strange spatially enabled databases.

“Real men only use command line! And what the hell is GIS?”

  1. Install pgc via command line to a sandbox.
    *Note: this command installs the BigSQL distribution into your current working directory.

    MAC / Linux:

    python -c "$(curl -fsSL http://s3.amazonaws.com/pgcentral/install.py)" 


    Windows:

    @powershell -NoProfile -ExecutionPolicy unrestricted -Command "iex ((new-object net.webclient).DownloadString('http://s3.amazonaws.com/pgcentral/install.ps1'))"
  2. Get into the product home directory by navigating via command line to the bigsql directory. Windows users should omit the ./ prefix on the pgc commands in the examples below:

    cd bigsql
  3. Using the pgc command line tool, run the update command to get the latest distribution:

    ./pgc update
  4. Then run the install, init, and start commands on pg96:

    ./pgc install pg96
    ./pgc init pg96
    ./pgc start pg96
  5. Next, install postgis:

    ./pgc install postgis23-pg96

That’s it!

The GDAL_DATA Directory Path

In order for all your PostGIS extensions to work properly, you will need to set the GDAL_DATA environment variable to the location of the gdal directory.

Lucky for you, with the BigSQL distribution this is as easy as running the following commands via a command line tool.

Linux / OSX:

 cd <directory of installation>/pg96
 source pg96.env


 Windows:

 cd <directory of installation>\pg96

Both of these commands set environment variables that will live during your current session. Note that if you close your terminal or command prompt, these variables are removed from memory.

It’s also possible to set GDAL_DATA as a persistent environment variable, but that is beyond the scope of this tutorial.

To check to see what version of GDAL you have installed:

gdalinfo --version


In 1998, Frank Warmerdam began developing GDAL, which stands for Geospatial Data Abstraction Library. The official definition, found on gdal.org, is as follows:

GDAL is a translator library for raster and vector geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single raster abstract data model and single vector abstract data model to the calling application for all supported formats. It also comes with a variety of useful command line utilities for data translation and processing.

Clear as mud, right? Let me refer you to a better definition written by Planet Labs’ Robert Simmon in the post, A Gentle Introduction to GDAL, Part 1. I highly recommend reading the entire article:

Even the description is complicated. Put another way, GDAL (pronounced ‘gee-dal’ not ‘goodle’ (too close to Google!)) is a collection of software that helps with the translation of data from different file formats, data types, and map projections. It’s used in free software like QGIS and GRASS, and even some commercial applications like ESRI ArcGIS and Avenza Geographic Imager/MAPublisher.


To make things even a bit more confusing, OGR (which I discussed in an earlier post) is part of GDAL:

In theory it is separate from GDAL, but currently they reside in the same source tree and are somewhat entangled.

So what’s the difference between GDAL and OGR?

The GDAL library is historically for raster data extract, transform, load (ETL) processes, while OGR supports vector data. Sometimes people will interchange the names GDAL, OGR, and GDAL/OGR to refer to the same thing. I know… geeks and their esoteric acronyms, right?

Regardless of the complexities, I can’t overemphasize how essential it is to add GDAL/OGR libraries to your spatial data management toolkit!

by Holly Orr at June 19, 2017 04:44 PM

June 17, 2017

Postgres OnLine Journal (Leo Hsu, Regina Obe)

Dollar-quoting for escaping single quotes

PostgreSQL has a feature called dollar-quoting, which allows you to include a body of text without escaping the single quotes. This feature has existed for quite some time. You've probably seen this in action when defining functions for example:

CREATE OR REPLACE FUNCTION hello_world(param_your_name text)
RETURNS text AS
$$
SELECT 'Hello world. My name is ' || param_your_name || '.';
$$
language sql STRICT;

Which is easier to read than the equivalent escape-quoted function:

CREATE OR REPLACE FUNCTION hello_world(param_your_name text)
RETURNS text AS '
SELECT ''Hello world. My name is '' || param_your_name || ''.'';
'
language sql STRICT;

Continue reading "Dollar-quoting for escaping single quotes"

by Leo Hsu and Regina Obe (nospam@example.com) at June 17, 2017 11:54 PM

June 09, 2017

Bill Dollins

Refreshing a PostGIS Materialized View in FME

I am following up my previous post with an extremely simple example using FME to kick off the refresh of a materialized view (matview) after a data import. I had never used FME prior to coming to Spatial Networks, but now I’m hooked. I’m having a really hard time finding things it can’t do.

As I mentioned in my last post, it’s really easy to refresh a matview in PostgreSQL using the REFRESH MATERIALIZED VIEW statement. This leaves open the possibility of automating the refresh as appropriate in an application or other process.

I decided to illustrate this using a basic FME example. Using the cellular tower data set from my past post, I extracted a table containing only the records for the state of Maryland. The towers data set contains the two letter abbreviation for the state, but not the full state name. So, I built a matview to join the state name to a subset of columns from the towers data set. The SQL for that matview is here:

View the code on Gist.

I will use FME to append the records for the state of Virginia from a GeoJSON file to the PostGIS table containing the records for Maryland.

Disclaimer: This example wastes the power of FME. If this were all I needed to do, I’d probably just use OGR. I kept it simple for this post.

Here is a screenshot of my import process from FME workbench:

As can be seen, 656 records from the GeoJSON file will be appended to the PostGIS table.

To make FME run the REFRESH statement after the import, I just put the SQL into the advanced parameters for the PostGIS writer.

For demonstration purposes, I’ll run the import without the REFRESH statement. This will result in the source table having a different record count than the matview, as shown here.

Now, I’ll clear out the Virginia records from the base table, add the REFRESH statement back into FME, and rerun the process. This time, the table and the matview are in sync, as seen here.

This basic example illustrates one way to automate the refresh of matviews as part of a data update process. Whether using FME, GeoKettle, or your own application code, it’s very easy to keep matviews in sync.

It’s important to remember however, that every refresh is writing data to a new table. Care should be taken to tune how often matviews are refreshed to ensure that performance doesn’t suffer. They aren’t appropriate for high-velocity data sets, but can work nicely for data that is updated less frequently.

by Bill Dollins at June 09, 2017 06:04 PM

June 06, 2017

Bill Dollins

Working with Materialized Views in PostGIS

It’s been a few months since I’ve posted, owing mainly to getting my feet under me at Spatial Networks. About a month after I started, the company re-merged with Fulcrum, which had previously been spun off as a separate company. As a result, I’ve gotten to know the Fulcrum engineering team and have gotten to peer under the hood of the product.

Of course, Spatial Networks is also a data company. What had originally attracted me was the opportunity to help streamline the delivery of their data products, and this remains a pressing issue. This has kept me elbow-deep in PostGIS, and has led me to delve into using materialized views more than I have before.

What is a materialized view? If you are familiar with relational databases, then you are familiar with views, which are saved queries that are stored in the database. Similar to tables, you can select data from a view; but, rather than directly selecting physically stored data, you are executing the SQL that defines the view, which will stitch together data at the time of execution.

A materialized view is a useful hybrid of a table and a view. It is technically a table, because it is physically stored on disk, but it is generated from a SQL statement like a view. This can be useful for increasing performance because costly joins and functions (ahem, spatial) are not executed every time the data is accessed. The source SQL is executed and the result written to disk. As a result, the data can be indexed and queries can be executed quickly. It could reasonably be thought of as a more streamlined SELECT INTO.

So why use materialized views? If properly designed, a relational database (spatial or otherwise) is built using normalization to optimize storage, reduce data redundancy, provide constraints and enhance data quality. This is excellent for storage, but not always ideal for analytics, which is why views exist.

A materialized view provides the ability to prepare a persisted version of data that is better suited for analysis and/or human readability. In a spatial database such as PostGIS, it also provides the ability to pre-process spatial analysis to enhance database and application performance. That is what got me interested in them.

From here, it’s probably best to discuss materialized views in terms of a scenario. I’ll discuss a fairly straightforward example, data binning, which touches upon the advantages of materialized views for me.

I’ll work with four data sets:

  1. An extract of cellular towers from the FCC downloaded from ArcGIS Online
  2. US counties extracted from GADM
  3. 0.5-degree rectangular grid created in QGIS
  4. 0.5-degree hexagonal grid created in QGIS

I imported all the above into PostGIS.

My goal is to calculate the number of cellular towers in each grid square, hexagon, and county. I’ll use a fairly simple bit of SQL to do that.

View the code on Gist.

The SQL above generates a count of the towers that fall within each hex, as governed by the st_contains function and the GROUP BY clause. In a traditional view, this spatial query would be executed every time I issued a SELECT statement against the view.

You will notice that I don’t return the hexagon geometry. I will create a second view that joins the geometry to the results of this query. I did that strictly to keep the SQL clean for this post, but it is perfectly fine to get all of the data in one step. The SQL that joins the hexagons is:

View the code on Gist.

Notice the left join so that all of the hexagons are returned regardless of whether or not they contain towers. I included the full CREATE statement this time to highlight the contrast with the next example.

It’s now time to create the materialized view, which calls upon the view created in the previous step. This is the SQL to create the materialized view:

View the code on Gist.
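The actual SQL lives in the Gist, but a hedged sketch of what such a materialized view with its spatial index can look like is below; all table and column names are illustrative, not the author’s:

```sql
-- Illustrative only: bin tower points into hexes and persist the result
CREATE MATERIALIZED VIEW mvw_tower_count_hex AS
SELECT h.id,
       h.geom,
       count(t.gid) AS tower_count
FROM hex_grid h
LEFT JOIN cell_towers t ON ST_Contains(h.geom, t.geom)
GROUP BY h.id, h.geom;

-- The matview is a new physical table, so it needs its own spatial index
CREATE INDEX mvw_tower_count_hex_geom_idx
    ON mvw_tower_count_hex USING GIST (geom);
```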

Note the creation of the spatial index. This is important because a materialized view creates a new data set, so the data is divorced from any indices in the source data.

The SQL above physically writes the data to the database, similar to a SELECT INTO statement. Unlike that approach however, we can easily refresh the data by issuing the following command:

REFRESH MATERIALIZED VIEW mvw_cellular_count_geom_hex;

This enables a user or application to automatically update the stored data whenever the underlying source data changes.
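On PostgreSQL 9.4 and later, the refresh can also run without locking out readers via REFRESH MATERIALIZED VIEW CONCURRENTLY, at the cost of requiring a unique index on the matview. A sketch, with an illustrative column name (uid):

```sql
-- CONCURRENTLY requires a unique index; 'uid' is an illustrative column
CREATE UNIQUE INDEX mvw_cellular_count_geom_hex_uid_idx
    ON mvw_cellular_count_geom_hex (uid);

REFRESH MATERIALIZED VIEW CONCURRENTLY mvw_cellular_count_geom_hex;
```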

Notice in the SQL above, I am calculating a UUID column. This is being done to aid visualization in QGIS. To add a view of any kind to a QGIS map, you need to tell it what the unique identifier is for the data set. I tried numerous different approaches, but found using a UUID to work most consistently. I will keep exploring this. The binned hex grid layer (with a bit of styling) looks like this in QGIS:

I mentioned earlier that materialized views can help performance. This really becomes apparent with complex geometries. There is minimal difference between the standard view and the materialized view when working with squares or hexes. But, to produce the county map shown at the top of this post, the standard view took 4.2 seconds to run on a Linux machine with quad-core, SSD, and 64GB of RAM. The materialized view returned in 292 milliseconds. This is where not having to re-run spatial queries against the detailed GADM polygons really pays off.

Materialized views are not a panacea. If you have rapidly updating data, the refresh process will probably introduce too much latency. You are also storing data, such as geometries, twice. But, depending on your needs, the benefits may outstrip the drawbacks of using materialized views in spatial applications.

by Bill Dollins at June 06, 2017 09:38 PM


OSGeo CS 2016 report : PostGIS

The annual OSGeo Codesprint took place from the 23rd to the 26th of February in Paris, at the Mozilla Foundation’s offices. Here is a summary of some of the achievements of the PostGIS team during this event.


A big team of PostGIS developers worked together during the sprint on numerous issues.

New functions and dev guide

Remi Cura from IGN had prepared a full list of improvements to existing PostGIS functions and new features to develop. Together with Marc Ducobu (Champs Libres), they went all the way from being simple PostGIS users to PostGIS developers: they dove into the code and wrote clean patches for functions and documentation.

They documented their work and created a step-by-step guide for new PostGIS contributors.

New clustering features

New features have been introduced for PostGIS 2.3, among them clustering methods: ST_ClusterKMeans() and ST_ClusterDBSCAN(), by Paul Ramsey and Dan Baston. These are really nice window functions for PostGIS!
On a side note, the documentation for PostGIS window functions has been improved too (R. Obe).
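For a flavor of how these window functions are called (table and column names here are illustrative, not from the sprint report):

```sql
-- KMeans: partition the points into 5 clusters
SELECT id, ST_ClusterKMeans(geom, 5) OVER () AS cluster_id
FROM towers;

-- DBSCAN: density-based clusters with search distance eps and a minimum of 5 points
SELECT id, ST_ClusterDBSCAN(geom, eps := 0.5, minpoints := 5) OVER () AS cluster_id
FROM towers;
```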


Regina Obe managed to travel to Paris, and she put effort into analyzing why the ODBC performance of the ogr_fdw PostGIS foreign data wrapper was not as good as expected. This should lead to performance improvements later on. She also worked on problems encountered when restoring PostGIS data (especially rasters), related to how the search_path is handled. There is still some experimenting to do with EVENT TRIGGERS, but the issue is on a good path to being solved.

3D in PostGIS : SFCGAL


Our team at Oslandia has been working on SFCGAL, the 3D library behind PostGIS 3D features. Vincent Mora and Hugo Mercier teamed up with Mickael Borne (IGN) and Sébastien Loriot (GeometryFactory) to break things and rebuild them, better. They focused mainly on performance:

  • using the different computation and precision kernels in CGAL efficiently, to be much faster
  • setting a flag on valid geometries, so validity is not re-tested in every operation
  • lowering serialization overhead when passing data between PostGIS, CGAL and SFCGAL

This effort led to significant improvements in speed. Preliminary tests on 3D intersection showed an improvement from 15 seconds down to 100 milliseconds, which is an impressive result.
A lot of refactoring of the code remains to be done, and this work also started, to simplify and ease the use of SFCGAL.

New indexes for big data

Another significant contribution is the work on BRIN indexes for PostGIS. At PGConf Europe, we already spoke with the team at Dalibo about the new index type in PostgreSQL 9.5 : BRIN use cases and implementation in a GIS context. Some time before OSGeo code sprint, we realized that Giuseppe Broccolo from 2ndQuadrant had started a prototype implementation, so we invited him to join. Giuseppe teamed with Ronan Dunklau and Julien Rouhaud from Dalibo, and together they managed to have a working implementation of this new type of indexes for PostGIS.

Having the opportunity for PostgreSQL developers to meet in this geospatial environment was the ideal way to get things done efficiently.

PostGIS BRIN will be detailed after some code consolidation and benchmarks, but here is the idea. BRIN indexes are partial indexes: they deal with data blocks (a given number of pages) and not rows. You can set the number of pages per block. This makes the indexes much faster to build, and a lot smaller than classic indexes (GiST for PostGIS). Of course, the tradeoff is that they are slower to query. And since they group data in blocks, they lose their efficiency if the data is not clustered on disk.

Some of our use cases, with tables of billions of points (using PgPointCloud), are a good match for BRIN indexes: the data is naturally clustered spatially (patches of points are loaded sequentially), and sometimes the data is too big for a table’s index to fit entirely in memory. E.g. we have a 3TB table, resulting in a 20GB GiST index.

A BRIN index can be a really good compromise here: preliminary tests on a medium-size dataset show a 5x degradation in query performance, while the index is 1000x smaller. And this tradeoff is adjustable, according to your dataset size and hardware configuration.
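The feature later shipped with PostGIS 2.3; a sketch of the syntax it uses, with an illustrative table name:

```sql
-- BRIN index on a big points table; pages_per_range tunes the block size
CREATE INDEX lidar_points_geom_brin
    ON lidar_points USING BRIN (geom) WITH (pages_per_range = 128);
```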

And more

Other topics which have been worked on (mainly by Paul, see his blog post) :

  • Expanded object header, to avoid serialization / deserialization steps
  • A nasty upgrade bug ( 2.2 blocker )

You will find more details in the following reports :

  • Paul Ramsey – http://blog.cleverelephant.ca/2016/03/paris-code-sprint-postgis-recap.html
  • Giuseppe Broccolo – http://blog.2ndquadrant.com/brin-postgis-codesprint2016-paris/
  • Regina Obe (PostgresOnLine) – http://www.postgresonline.com/journal/archives/363-Paris-OSGeo-Code-Sprint-2016-Highlights.html
  • Regina Obe (BostonGIS) – http://www.bostongis.com/blog/index.php?/archives/252-Paris-OSGeo-Code-Sprint-2016-PostGIS-Highlights.html

Thanks to the PostGIS team for such hard work and for their reports !

Figure credits: Openstreetmap’s Soup, OpenScienceMap, Wikipedia


by oslandia at June 06, 2017 09:54 AM

June 05, 2017

PostGIS Development

Causes for 'postgis.backend' is already set

The error ‘postgis.backend’ is already set comes up every so often on the PostGIS mailing list. The issue often arises during or after an upgrade. I’ll go over the causes I am aware of and how to fix them.

The question goes something like this:

After upgrading to PostGIS 2.3 from 2.1, my server log is filled with these messages:

“WARNING ‘postgis.backend’ is already set and cannot be changed until you reconnect”

Continue Reading by clicking title hyperlink ..

by Regina Obe at June 05, 2017 12:00 AM

May 25, 2017

BigSQL Holly

Connecting to Remote Spatial Data Sources with ogr_fdw

Shout out goes to Regina Obe and Paul Ramsey for their work and documentation on ogr_fdw. The ogr_fdw project can be found here: https://github.com/pramsey/pgsql-ogr-fdw.

ogr+fdw = ogr_fdw

In an earlier post, I showed you how to use the ogr2ogr (or as my geogeek friends like to call it “ogre to ogre”) command line tool to import features in different data formats to Postgres (e.g. shapefiles or geojson).

Foreign Data Wrappers (FDWs) allow you to connect to remote data sources from within Postgres. From there you can query them with SQL, join across disparate data sets, or join across different systems. There are FDW implementations to connect to MySQL, Oracle, SQLite, as well as flat files.

ogr_fdw allows you to connect your PostGIS database directly to an existing GIS file or database and read from it without importing the data.

It’s like getting a hug from an ogre!

Exercise: Intro to ogr_fdw

Exercise Prerequisites

ogr_fdw is packaged with BigSQL’s PostgreSQL installers. If you have not installed the BigSQL PostgreSQL distribution and setup your PostGIS database, start here.

The GDAL_DATA Directory Path

In order for ogr_fdw to work properly, you will need to set the GDAL_DATA environment variable to the location of the gdal directory (if you haven’t already).

Lucky for you, with the BigSQL distribution this is as easy as running the following commands via a command line tool.

Linux / OSX:

 cd <directory of installation>/pg96
 source pg96.env


 Windows:

 cd <directory of installation>\pg96

Both of these commands set environment variables that will live during your current session. Note that if you close your terminal or command prompt, these variables are removed from memory.

It’s also possible to set GDAL_DATA as a persistent environment variable, but that is beyond the scope of this tutorial.

Download the data

For this exercise, I downloaded the NYC Subway Lines shapefile available at the New York City Open Data site.

To download this data click here: nyc_subway_lines shapefile

ogr_fdw_info Commands

Begin by running the ogr_fdw_info command to show a list of supported formats:

ogr_fdw_info -f

Supported Formats:
  -> "PCIDSK" (read/write)
  -> "netCDF" (read/write)
  -> "HTTP" (readonly)

Navigate to the nyc_subway_lines directory you downloaded and run the following command to see the layers in the directory:

ogr_fdw_info -s <nyc_subway_lines directory>
    example OSX/Linux: ogr_fdw_info -s /Users/hollyorr/gis_data/nyc_subway_lines
    example Windows: ogr_fdw_info -s C:\gis_data\nyc_subway_lines 


Now use the following command to read an OGR data source and output a server and table definition for this particular layer:

ogr_fdw_info -s <nyc_subway_lines directory> -l nyc_subway
    example OSX: ogr_fdw_info -s /Users/hollyorr/gis_data/nyc_subway_lines -l nyc_subway
    example Windows: ogr_fdw_info -s C:\gis_data\nyc_subway_lines -l nyc_subway


CREATE SERVER myserver
  FOREIGN DATA WRAPPER ogr_fdw
  OPTIONS (
    datasource 'nyc_subway_lines',
    format 'ESRI Shapefile' );

CREATE FOREIGN TABLE nyc_subway (
  fid bigint,
  geom Geometry(LineString,4326),
  objectid real,
  shape_len real,
  url varchar,
  name varchar,
  rt_symbol varchar,
  id real
) SERVER myserver
OPTIONS (layer 'nyc_subway');

You can run the commands generated as output in pgAdmin or on the command line. In this exercise we will run the SQL statements on the command line with psql.

Connect to psql:

Linux / OSX:

 cd <directory of installation>/pg96
 source pg96.env    


 Windows:

 cd <directory of installation>\pg96\bin

Open psql and connect to postgis database:

$ psql
postgres=# \connect postgis_sample
    You are now connected to database "postgis_sample" as user "postgres".

Create the remote server and foreign table:

CREATE SERVER myserver FOREIGN DATA WRAPPER ogr_fdw OPTIONS (datasource '<location of nyc_subway_lines directory>', format 'ESRI Shapefile');

CREATE FOREIGN TABLE nyc_subway (fid bigint, geom Geometry(LineString,4326), objectid real, shape_len real, url varchar, name varchar, rt_symbol varchar, id real) SERVER myserver OPTIONS (layer 'nyc_subway');

Now open pgDevOps by BigSQL and switch to the pgAdmin4 Web view.

Check to see that your remote server was successfully created:

In the pgAdmin4 browser, navigate to:

Servers >> pg96 >> Databases >> postgis_sample >> Foreign Data Wrappers >> ogr_fdw >> Foreign Servers >> myserver

Check to see that your foreign table was successfully created:

In the pgAdmin4 browser, navigate to:

Servers >> pg96 >> Databases >> postgis_sample >> Schemas >> public >> Foreign Tables >> nyc_subway

Finally, query the nyc_subway table to see your remote data through ogr_fdw:

SELECT * FROM public.nyc_subway LIMIT 100;
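Once the foreign table exists, it behaves like any other relation in queries. For instance, a hedged sketch using the geometry column defined above (casting to geography to get lengths in meters):

```sql
-- Five longest subway line features, in meters
SELECT name, ST_Length(geom::geography) AS length_m
FROM public.nyc_subway
ORDER BY length_m DESC
LIMIT 5;
```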

by Holly Orr at May 25, 2017 07:39 PM

May 21, 2017

Boston GIS (Regina Obe, Leo Hsu)

PostGIS 2.4.0, Code Sprints and other extensions to try with PostgreSQL 10 beta1

So PostgreSQL 10beta1 came out recently as Holly mentioned. When first mention of beta hits, things start getting serious for me. I have to make sure that PostGIS compiles against said distribution to make sure eager testers aren't held back.

As with other releases, PostGIS didn't compile against the new PostgreSQL version without some nurturing. We've still got one regress failure, but at least PostGIS 2.4 now compiles cleanly against PostgreSQL 10beta1. I'm hoping that we can release PostGIS 2.4.0 just in time for PostgreSQL 10 planned release in September so I don't have to backport PostgreSQL 10 patches I made to lower PostGIS versions.

For PostGIS 2.4 the main focus will be cleaning up the parallel work so that all aggregate functions can enjoy use of parallel optimization. This is even more important with PostgreSQL 10 now that more kinds of queries can benefit from parallelization work. I'm also hoping to focus more energy on the raster side of things.

Continue reading "PostGIS 2.4.0, Code Sprints and other extensions to try with PostgreSQL 10 beta1"

by Regina Obe (nospam@example.com) at May 21, 2017 07:37 AM

April 24, 2017

BigSQL Holly

ETL (Extract, Transform, Load) PostGIS Data with ogr2ogr

In an earlier post, I showed you how to use the shp2pgsql tool to import shapefiles into PostGIS. But what if you need to import other geodata formats (e.g. GeoJSON, MapInfo, KML, CSV, etc.)?

ogr2ogr – ETL for PostGIS


Lucky for you, the BigSQL PostGIS distribution comes bundled with GDAL and the ogr2ogr (or as my geogeek friends like to call it “ogre to ogre”) command line tool.

From gdal.org/ogr2ogr:

This program can be used to convert simple features data between file formats performing various operations during the process such as spatial or attribute selections, reducing the set of attributes, setting the output coordinate system or even reprojecting the features during translation.

In other words, ogr2ogr is your go to ETL tool for importing and exporting PostGIS data.
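Beyond straight imports, ogr2ogr can reproject and subset during translation. A hedged sketch of a few commonly used flags (file, table, and column names are illustrative):

```shell
# Reproject to WGS84, keep two attributes, and filter rows during import
ogr2ogr -f "PostgreSQL" PG:"dbname=postgis_sample user=postgres" \
    roads.shp -nln roads_wgs84 \
    -t_srs EPSG:4326 \
    -select name,road_type \
    -where "road_type = 'highway'"
```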

Exercise: Intro to ogr2ogr

In this exercise, we will import and export GeoJSON files to and from our postgis_sample database using ogr2ogr commands.

Exercise Prerequisites

The ogr2ogr command line tool is packaged with BigSQL’s PostgreSQL installers. If you have not installed the BigSQL PostgreSQL distribution and setup your PostGIS database, start here.


GeoJSON is an extension of the JSON open standard format, designed for representing simple geographical features along with their non-spatial attributes. A simple Feature looks like this:


{
  "type": "Feature",
  "geometry": {
    "type": "Point",
    "coordinates": [125.6, 10.1]
  },
  "properties": {
    "name": "Dinagat Islands"
  }
}

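On the database side, PostGIS can parse a GeoJSON geometry fragment directly with ST_GeomFromGeoJSON (available since PostGIS 2.0), for example:

```sql
-- Convert a GeoJSON geometry fragment into a PostGIS geometry
SELECT ST_AsText(ST_GeomFromGeoJSON(
    '{"type": "Point", "coordinates": [125.6, 10.1]}'
));
-- POINT(125.6 10.1)
```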
Getting to know ogr2ogr Command Line

Open your command line utility and enter the this help command to see ogr2ogr parameters and usage:

ogr2ogr --long-usage

You should see the following parameter and under it a list of all the data formats your installation of ogr2ogr supports:

-f format_name

Make sure that these two formats are included in the list:

-f "GeoJSON"
-f "PostgreSQL"

Import GeoJSON to PostGIS

For this exercise, I downloaded the NYC Wi-Fi Hotspot Locations GeoJSON file available at the New York City Open Data site.

To download this data click here: nyc_hotspots.geojson

To import a GeoJSON file into your PostGIS database, use the following command:

ogr2ogr -f "PostgreSQL" PG:"dbname=my_database user=postgres" "source_data.json" -nln new_table_name

If you want to append the new data to already existing records, use the -append flag:

ogr2ogr -f "PostgreSQL" PG:"dbname=my_database user=postgres" "source_data.geojson" -nln destination_table -append

So, for our exercise run the following commands:

cd <location of nyc_hotspots.geojson>
ogr2ogr -f "PostgreSQL" PG:"dbname=postgis_sample user=postgres" "nyc_hotspots.geojson" -nln nyc_hotspots

Now, run the following sql query on your postgis_sample database to see the table’s attributes:

select * from public.nyc_hotspots;

Export from PostGIS to GeoJSON

To generate a geojson file from one of your PostGIS tables, run the following command.

ogr2ogr -f "GeoJSON" out.geojson PG:"host=localhost dbname=postgis_sample user=postgres password=<your_password>" -sql "SELECT * from nyc_hotspots"

*Note: Although you can use ogr2ogr to import and export the shapefile format to and from PostGIS, it is recommended that you use the shp2pgsql tool I discussed in an earlier post.

by Holly Orr at April 24, 2017 12:37 PM

April 05, 2017

Boston GIS (Regina Obe, Leo Hsu)

Slides from PGConfUS 2017 PostGIS talk

At PGConfUS 2017 last week, we presented a talk: Top 10 Problems Solved by PostGIS. The slides for the talk in HTML format and PDF format. The pgRouting examples at the end of the talk seemed to be the most popular.

We'll be giving a pgRouting workshop at FOSS4G Boston 2017 Aug 14-19th where we'll go into much more depth about topics like drive time analysis and Vehicle Route schedule planning (VRP). We also hope to give a talk at FOSS4G 2017 on PostGIS spatial tricks.

by Regina Obe (nospam@example.com) at April 05, 2017 07:23 AM

Boston GIS (Regina Obe, Leo Hsu)

Google Cloud SQL for PostgreSQL supports PostGIS

Google recently put Google Cloud SQL for PostgreSQL into beta. It targets PostgreSQL 9.6, and it seems they plan to stay in line with PostgreSQL releases, similar to the Amazon PostgreSQL RDS and Aurora offerings. I was curious to see what kind of extensions they support. In the Google Cloud blog article Cloud SQL for PostgreSQL for your mobile and geospatial applications in Google Cloud, PostGIS is explicitly mentioned as supported, presumably because Descartes is one of their case studies, which I can only guess uses PostGIS, based on this quote from Tim Kelton, Descartes co-founder:

"Our individual teams, who are building micro services, can quickly provision a database on Cloud SQL. They don't need to bother compiling Geos, Proj4, GDAL, and Lib2xml to leverage PostGIS. And when PostGIS isn’t needed, our teams use PostgreSQL without extensions or MySQL, also supported by Cloud SQL."

Excerpted from their article:
Geospatial support: Easily enable the popular PostGIS extension for geospatial objects in Postgres.

It's unclear from the article what other extensions they support though. Maybe I'll have to try it.

by Regina Obe (nospam@example.com) at April 05, 2017 06:18 AM

March 26, 2017

Boston GIS (Regina Obe, Leo Hsu)

PGConfUS 2017 Getting Stuff done in PostGIS

A reminder, PGConfUS 2017 conference is just days away, and we'll be giving a training March 28th 2017, Jersey City, NJ at 1 PM. If you are coming, keep an eye on this page PGConf 2017 US Getting Stuff done with PostGIS materials.

If you haven't signed up already, there are still spots, make sure to buy your tickets at http://pgconf.us/conferences/2017#registration.

Continue reading "PGConfUS 2017 Getting Stuff done in PostGIS"

by Regina Obe (nospam@example.com) at March 26, 2017 04:42 AM

March 23, 2017

BigSQL Holly

Importing Shapefiles into PostGIS with shp2pgsql

GeoJSON may be catching up, but by far the most popular data format for sharing geospatial data is (and has been since the early 1990s) the shapefile. Although a shapefile is a simple way to share data that can be easily consumed by almost any GIS system, it comes with many limitations and is not suitable for your enterprise data needs.

My personal least favorite limitation is that shapefiles truncate column names greater than 10 characters – ouch!

Luckily, the PostGIS community has bundled with PostGIS distributions a great tool that simplifies the task of importing shapefiles into your PostgreSQL database: shp2pgsql.

Shapefile Definitions:

Vector Data: Points, lines, or polygons that are used to represent real-world features on a map. Used for data that has discrete boundaries like administrative boundaries (e.g. country borders), physical features (e.g. lakes or streams) or infrastructure (e.g. streets). Attributes in a table are associated with each vector feature.

Shapefile: The most popular geospatial vector data format, first developed by ESRI. Because of its open specification it is now predominantly used to share GIS data between open source and proprietary software. A shapefile actually consists of 3 mandatory files:

  • .shp — shape format; the feature geometry itself
  • .shx — shape index format; a positional index of the feature geometry to allow seeking forwards and backwards quickly
  • .dbf — attribute format; columnar attributes for each shape, in dBase IV format

Exercise: Shapefiles to PostGIS with shp2pgsql

Prerequisite: Before you begin, create a spatial database by following the instructions laid out in our previous post, PostGIS – How to create a Spatial Database with PGC Command Line.

Bundled with your BigSQL PostGIS distribution is a tool called shp2pgsql. In this exercise, we will download a shapefile, find the correct SRID code, and use the command line tool to upload it to our postgis_sample database.

Download the data

Over the past 10 years, most U.S. municipalities have enacted open data policies. This means you can download lots of geospatial data from government web sites. Most geospatial data is provided as a shapefile or GeoJSON file.

For this exercise, I downloaded NYC borough boundaries shapefile available at the New York City Open Data site.

To download this data click here: nybb_17a

After downloading, unzip the files and put them somewhere on your drive that you will remember.

Navigate to shp2pgsql

Before you begin, make sure shp2pgsql is installed in your bin directory:

cd <directory of installation>/pg96/bin 

shp2pgsql syntax

Syntax for running command line:

shp2pgsql -I -s <SRID> <PATH/TO/SHAPEFILE> <SCHEMA>.<DBTABLE> | psql -U postgres -d <DBNAME>


<SRID> – The Spatial Reference System Identifier (SRID, often given as an EPSG code) is a unique value that represents the data’s coordinate system.

Finding the SRID/EPSG code:

All geospatial data has a coordinate system it is projected in. The topic of projections is beyond the scope of this posting, but you can learn more here.

Navigate to the shapefile’s .prj file which contains information about the shapefile’s coordinate system.

  cd <location of unzipped nybb_17a directory> 

Option 1:
Upload the .prj file to prj2epsg.org

Option 2:
Open the .prj file in your favorite editor and put the values of the first parameter, PROJCS, into the following SQL statement that you will run against the postgis_sample database:

nybb.prj parameter:


SQL query:

    SELECT srid, srtext,proj4text FROM spatial_ref_sys 
    WHERE srtext ILIKE '%New_York_Long_Island%' 
    AND proj4text ILIKE '%nad83%' AND proj4text ILIKE '%ft%';  


The SRID for the nybb data is 2263.

<PATH/TO/SHAPEFILE> – The full path to the shapefile:

<location of unzipped nybb_17a directory>  

<SCHEMA> — Target schema where the new table will be created (optional).

<DBTABLE> — New database table to be created.

<DATABASE> — Target database where the table will be created.


For a complete listing of the shp2pgsql command line parameters go here.

Putting it all together

So, putting this all together for this exercise you would run the following command:

Note: before running the command make sure the PostgreSQL server is running.

cd <location of unzipped nybb_17a directory>
shp2pgsql -I -s 2263 nybb.shp nybb | psql -U postgres -d postgis_sample

Now, run the following sql query on your postgis_sample database:

select * from public.nybb;

And you will see the nybb table’s attributes:
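As a further sanity check, you can confirm the declared SRID and run a first spatial query. A sketch, assuming shp2pgsql created the default geom column and lowercased the shapefile's BoroName attribute:

```sql
-- Verify the SRID recorded for the new table (should match the 2263 we used).
SELECT Find_SRID('public', 'nybb', 'geom');

-- A first spatial query: area of each borough.
-- EPSG:2263 uses feet, so ST_Area returns square feet here.
SELECT boroname, ST_Area(geom) AS area_sq_ft
FROM public.nybb
ORDER BY area_sq_ft DESC;
```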


by Holly Orr at March 23, 2017 01:51 PM

February 21, 2017

Stephen Mather

Taking Slices from LiDAR data: Part IX

Part 9 of N… , see e.g. my previous post on the topic.

We’ve been working to reduce the effect of overlapping samples on statistics we run on LiDAR data, and to do so, we’ve been using PDAL’s filters.sample approach. One catch: this handles the horizontal sampling problem well, but we might want to intentionally retain samples from high locations — after all, I want to see the trees for the forest and vice versa. So, it might behoove us to sample within each of our desired height classes to retain as much vertical information as possible.

by smathermather at February 21, 2017 03:26 AM

February 19, 2017

Stephen Mather

Taking Slices from LiDAR data: Part VIII

Part 8 of N… , see e.g. my previous post on the topic.

I didn’t think my explanation of sampling problems with LiDAR data in my previous post was adequate. Here are a couple more figures for clarification.

We can take this dataset over trees, water, fences, and buildings that is heavily sampled in some areas and sparsely sampled in others and use PDAL’s filters.sample (Poisson dart-throwing) to create an evenly sampled version of the dataset.

Figure showing overlap of LiDAR scanlines

Figure showing data resampled for evenness

An extra special thanks to the PDAL team for not only building such cool software, but being so responsive to questions!

by smathermather at February 19, 2017 01:04 AM

February 16, 2017

Stephen Mather

Taking Slices from LiDAR data: Part VII

Part 7 of N… , see e.g. my previous post on the topic.

More work on taking LiDAR slices. This time, the blog post is all about data preparation. LiDAR data, in its raw form, often has scan line effects when we look at density of points.


This can affect statistics we are running, as our sampling effort is not even. To ameliorate this effect a bit, we can decimate our point cloud before doing further work with it. In PDAL, we have three choices for decimation: filters.decimation, which samples every Nth point from the point cloud; filters.voxelgrid, which does volumetric pixel based resampling; and filters.sample, or “Poisson sampling via ‘Dart Throwing’”.

filters.decimation won’t help us with the above problem. Voxelgrid sampling could help, but it’s very regular, so I reject this on beauty grounds alone. This leaves filters.sample.

The nice thing about both the voxelgrid and the poisson sampling is that they retain much of the shape of the point cloud while down sampling the data:



We will execute the poisson sampling in PDAL. As many things in PDAL are best done with a (json) pipeline file, we construct a pipeline file describing the filtering we want to do, and then call that from the command line:

We can slice our data up similar to previous posts, and then look at the point density per slice. R-code for doing this forthcoming (thanks to Chris Tracey at Western Pennsylvania Conservancy and the LidR project), but below is a graphic as a teaser. For the record, we will probably pursue a fully PDAL solution in the end, but really interesting results in the interim:


More to come. Stay tuned.

by smathermather at February 16, 2017 04:46 AM

February 10, 2017

BigSQL Holly

PostGIS – How to create a Spatial Database with PGC Command Line

We are excited to announce that BigSQL distributions of PostGIS now come bundled with several popular community spatial extensions:

This is the only community distribution to offer PostGIS with these extensions across ALL platforms (Mac, Linux, and Windows)! So, for all you GeoGeeks using Linux / Mac – spatially enabling your PostgreSQL database with PostGIS functionality just got a lot easier.

In addition, we’ve included a sample script to help guide you through creating your first PostGIS database.

In the upcoming weeks, we will offer guidance for both GUI and command line work flows (see Are you a Yoda or a Diego?) and blogging about what you can do with spatially enabled data.

So, let’s get started…

What is PostGIS?

PostGIS is a powerful spatial database extender for PostgreSQL. It adds support for geographic objects, allowing location queries to be run in SQL. For a full list of supported features look here: http://postgis.net/features

How to install the BigSQL distribution

If you have installed the BigSQL distribution, skip to the next section, “Getting started with BigSQL PGCLI.”

Getting started with BigSQL PGCLI

PGCLI (Pretty Good Command Line Interface) is a package and update manager, similar to yum or apt-get, specifically for PostgreSQL and related community projects. PGC enables you to easily find and install packages, configure multiple versions of PostgreSQL, update existing applications, and control services by starting and stopping from the command line.

You can run pgc commands with the following call:

 cd <directory of installation>
 ./pgc (Linux and Mac)
 pgc (Windows)

For the purpose of this tutorial, we will be using the Linux/Mac syntax: ./pgc

To learn more about PGC, go to the BigSQL PGCLI tutorial.

Update components to latest versions and install PostGIS

  1. Open a terminal and navigate to the installation directory.

  2. On Windows it could be C:\Postgres; on Linux and Mac it could be any directory based on the installation method. You will be able to locate the “pgc” bash script OR “pgc.bat” file in this directory.

     cd <directory of installation>
  3. Run the pgc status command to show your current Postgres server status.

    In this example Postgres 9.6 is running on the default port:

     ./pgc status
        pg96 running on port 5432

    If your Postgres instance is not running, start it with the following pgc command:

     ./pgc start
  4. Run the pgc update command to get the latest versions:

     ./pgc update
  5. Install PostGIS:

     ./pgc install postgis23-pg96

Create a PostGIS sample database

We have provided a SQL script that will create a PostGIS-enabled database including the following extensions:

Before you begin, navigate to the directory that contains the SQL script you will run to create a PostGIS database and check that the file create_postgis_sample_db.sql was downloaded with your install. You can also take a look at what is in the script using your favorite editor:

 cd <directory of installation>/pg96/share/doc/postgresql/extension 

 vi create_postgis_sample_db.sql
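For orientation, such a script typically boils down to a handful of statements. The following is an illustrative sketch run through psql, not the actual contents of create_postgis_sample_db.sql (check the real script with the vi command above):

```sql
-- Create the sample database and enable the bundled spatial extensions
-- (illustrative; the bundled script may differ in detail).
CREATE DATABASE postgis_sample;
\connect postgis_sample
CREATE EXTENSION postgis;
CREATE EXTENSION postgis_topology;
CREATE EXTENSION fuzzystrmatch;
CREATE EXTENSION postgis_tiger_geocoder;
CREATE EXTENSION ogr_fdw;
```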

The GDAL_DATA Directory Path

In order for all your PostGIS extensions to properly work, you will need to set the GDAL_DATA environment variable to the location of the directory gdal.

Lucky for you, with the BigSQL distribution this is as easy as running the following commands via a command line tool.

Linux / OSX:

 cd <directory of installation>/pg96
 source pg96.env


Windows:

 cd <directory of installation>\pg96

Both of these commands set environment variables that will live during your current session. Note that if you close your terminal or command prompt, these variables are removed from memory.

It’s also possible to set GDAL_DATA as a persistent environment variable, but that is beyond the scope of this tutorial.

Run the script using the psql terminal.

Linux / OSX:

 psql -f share/doc/postgresql/extension/create_postgis_sample_db.sql


Windows:

 psql -f ..\share\doc\postgresql\extension\create_postgis_sample_db.sql

Check to see that the database and extensions were successfully created:

 psql (if using Windows, you may need to include -U postgres with the psql command)
 postgres=# \connect postgis_sample;
 You are now connected to database "postgis_sample" as user "postgres".

                 List of installed extensions
              Name              | Version |   Schema
 -------------------------------+---------+------------
  address_standardizer_data_us  | 2.3.2   | public
  fuzzystrmatch                 | 1.1     | public
  ogr_fdw                       | 1.0     | public
  plpgsql                       | 1.0     | pg_catalog
  postgis                       | 2.3.2   | public
  postgis_tiger_geocoder        | 2.3.2   | tiger
  postgis_topology              | 2.3.2   | topology
 (8 rows)

Exit psql:

 postgres=# \q

Install pgadmin3 if it is not already installed:

 cd <directory of installation>  
 ./pgc install pgadmin3

Launch pgadmin3 (the GUI should popup when you run this command):

 ./pgc start pgadmin3    

Output should look something like this:


Get me outta here! (CLI edition):

You can always uninstall your BigSQL Postgres distribution at any time with the following commands:

 cd <directory of installation>
 ./pgc stop
 cd uninstall
 open uninstall.app

Finally, cd to the location of your PostgreSQL directory and delete the PostgreSQL directory and its contents:

 rm -r <directory of installation> (Linux or Mac)
 rmdir <directory of installation> /s (Windows)

by Holly Orr at February 10, 2017 02:18 PM

February 08, 2017

BigSQL Holly

Are you a Yoda or a Diego?

In the geospatial world there are usually 2 types of people supporting spatially enabled databases:

1. GIS analysts (scientists, statisticians, analysts, etc.) who find themselves as the de facto DBA (among other things). “I don’t have time for this! I’m saving this sea turtle. Just give me a GUI!”

2. DBAs who find themselves supporting these strange spatially enabled databases. “Real men only use command line! And what the hell is GIS?”

Of course, some of you old school GISers may really love to geek out on the CLI (ARC/INFO anyone?).

But GIS has come a long way since 1982. For one thing, the learning curve for running GIS analysis is becoming less intimidating thanks to new SaaS geospatial companies like our friends at Carto. For another, open source (FREE!) server and desktop software has caught up with (and in some cases surpassed) its proprietary counterparts (see our post “Why are you still paying for your GIS“).

At BigSQL, we want you to know we love you for who you are. And in that spirit, just in time for the release of PostgreSQL 9.6.2 and PostGIS 2.3.2, we are introducing tools to help you build and monitor your spatially enabled databases that will satisfy whichever tribe you belong to: the GUIs or the CLIs.

In our next post we will help Tribe Yoda (aka CLIs) leverage the PGCLI: Pretty Good Command Line Interface to make installing, updating, configuring, and monitoring PostgreSQL/PostGIS simple and FREE!

by Holly Orr at February 08, 2017 08:25 PM

January 31, 2017

PostGIS Development

PostGIS 2.3.2 Released

The PostGIS development team is pleased to announce the release of PostGIS 2.3.2 As befits a patch release, the focus is on bugs and breakages. Best served with PostgreSQL 9.6.1+ and pgRouting 2.3.2.

Continue Reading by clicking title hyperlink ..

by Regina Obe at January 31, 2017 12:00 AM

January 30, 2017

PostGIS Development

PostGIS 2.2.5 Released

The PostGIS development team is pleased to announce the release of PostGIS 2.2.5 As befits a patch release, the focus is on bugs and breakages.

Continue Reading by clicking title hyperlink ..

by Regina Obe at January 30, 2017 12:00 AM

January 24, 2017


QGIS versioning plugin

We developed a tool to manage data history and branches, and to work offline with your PostGIS-stored data and QGIS. Read on for an overview of the QGIS versioning plugin.

The QGIS plugin is available in the QGIS plugin repository, and you can fork it on GitHub too: https://github.com/Oslandia/qgis-versioning !


Although the need for data versioning often arises, no standard solution exists for databases.

The GeoGit project proposes a solution to store versioned geospatial data. There is also an existing plugin for QGIS, pgversion, which uses views and triggers to version a PostGIS database. Unfortunately, those solutions were not adapted to the specific constraints of this project, namely: using a PostGIS database as the main repository (which excludes GeoGit) and the ability to work offline (which excludes pgversion). The QGIS/PostGIS versioning project we developed looks like the following.



The database is stored in a PostGIS schema, the complete schema is versioned (i.e. not individual tables). Revisions are identified by a revision number. A revision table in the versioned schema, called ‘revisions’, keeps track of the date, author, commit message and branch of all revisions.

Once a table structure is defined, three operations can be performed on rows: INSERT, DELETE and UPDATE. To be able to track history, every row is kept in the tables. Deleted rows are marked as such, and updated rows are a combined insertion-deletion where the deleted and added rows are linked to one another as parent and child.

A total of five columns are needed for versioning the first branch:

  • a unique identifier across the table
  • the revision when this record was inserted
  • the last revision for which this record exists (i.e. the revision when it was deleted, minus one)
  • in case the row has been inserted as the result of an update, this field stores the hid of the row that was updated
  • in case the row has been marked as deleted as the result of an update, this field stores the hid of the row that was inserted in its place
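Concretely, a freshly versioned table could look like the following. This is a sketch with illustrative table and column names (the actual names used by the plugin may differ):

```sql
-- Versioned table layout for the first branch; names are illustrative.
CREATE TABLE mytable (
  hid serial PRIMARY KEY,    -- unique identifier across the table
  branch_rev_begin integer,  -- revision when this record was inserted
  branch_rev_end integer,    -- last revision for which this record exists
  branch_parent integer,     -- hid of the row this one replaced (on update)
  branch_child integer,      -- hid of the row that replaced this one (on update)
  geom geometry
);
```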

For each additional branch, four additional columns are needed (the ones with the prefix branch_).

If the branch_rev_begin is null, it means that a row belongs to another branch.

SQL views are used to see the database at a given revision number. If we call ‘rev’ the revision we want to see, then for each table the condition for a row to be present in the view is:

(branch_rev_end IS NULL OR branch_rev_end >= rev) AND branch_rev_begin <= rev

In the special case of the current revision, or head revision, the condition reads:

branch_rev_end IS NULL AND branch_rev_begin IS NOT NULL
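Putting the two conditions together, the views can be sketched as follows. Column names are those used in the conditions above; mytable and revision 42 are illustrative:

```sql
-- View of mytable as of revision 42.
CREATE VIEW mytable_rev_42 AS
SELECT * FROM mytable
WHERE (branch_rev_end IS NULL OR branch_rev_end >= 42)
  AND branch_rev_begin <= 42;

-- View of mytable at the head revision.
CREATE VIEW mytable_head AS
SELECT * FROM mytable
WHERE branch_rev_end IS NULL AND branch_rev_begin IS NOT NULL;
```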

Since elements are not deleted (but merely marked as such) from a historized table, care must be taken with the definition of constraints, in particular the conceptual uniqueness of field values.

Within the PostGIS database, the views on revisions must be read-only and historized tables should not be edited directly. This is a basic principle of version control: edits must be made to working copies and then committed to the database. Please note that by default PostgreSQL 9.3 creates updatable views.

Workflow schema

This setup allows for multiple users to use and edit data offline from a central repository, and commit their modifications concurrently.

Working copies

Two kinds of working copies are available:

SpatiaLite working copies
They are meant to be used off-line. They consist of the versioned tables of a given versioned database (i.e. PostGIS schema) or any subset. For each table, only the elements that have not been marked as deleted in the head revision need to be present. Furthermore, only a subset of the elements the user needs to edit can be selected (e.g. a spatial extent). To create a working copy (i.e. to checkout), tables from the versioned schema (or the aforementioned subsets) are converted to a SpatiaLite database using ogr2ogr.

PostGIS working copies
They are meant to be used when the connection to the original database will remain available. They are quite similar to pgversion working copies since they only store differences from a given revision (the one checked out).

The following description is aimed at understanding the inner workings of the qgis versioning plugin. The user does not need to perform the described operations manually.

For each versioned table in the working copy, a view is created with the suffix _view (e.g. mytable_view). Those views typically filter out the historization columns and show the head revision. A set of triggers is defined to allow updating through those views (DELETE, UPDATE and INSERT).

The DELETE trigger simply marks the end revision of a given record.

The INSERT trigger creates a new record and fills the branch_rev_begin field.

The UPDATE trigger creates a new record and fills the branch_rev_begin and branch_parent fields. It then marks the parent record as deleted, and fills the branch_rev_end and branch_child fields.
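As an illustration of the simplest of the three, the DELETE behaviour can be sketched like this. This is a hedged sketch, not the plugin's actual code; it assumes the mytable/mytable_view naming used above and a revisions table with a rev column:

```sql
-- On DELETE against the view, mark the row's end revision instead of removing it.
CREATE OR REPLACE FUNCTION mytable_delete() RETURNS trigger AS $$
BEGIN
  UPDATE mytable
  SET branch_rev_end = (SELECT max(rev) FROM revisions)
  WHERE hid = OLD.hid;
  RETURN OLD;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mytable_view_delete
INSTEAD OF DELETE ON mytable_view
FOR EACH ROW EXECUTE PROCEDURE mytable_delete();
```

INSTEAD OF triggers on views are standard PostgreSQL (9.1+), which is what makes the editable-view approach possible.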

Updating the working copy

Changes can be made to the database while the working copy is being edited. In order to reconcile those edits, the user needs to update the working copy.

When updating, a set of records can be in conflict: the records for which the end revision has been set since the initial checkout or the last update, if any.

Multiple edits can be made to the same record. Therefore the child relation must be followed to the last child in order to present the user with the latest state of a given conflicting feature.

Conflicts are stored in a table and identified with a conflict id and the tag ‘theirs’ or ‘mine’. A DELETE trigger on this table is used for conflict resolution. On deletion of ‘mine’, the working copy edit is discarded; on deletion of ‘theirs’, the working copy edit is appended to the feature history (i.e. the working copy feature becomes a child of the last state of the feature in the historized database).

Committing the edits to the versioned database

If a working copy is up to date, the edits can be integrated into the versioned database. This operation simply consists of inserting a record into the revisions table and, for each versioned table, updating the rows that differ and inserting the rows that are not present.
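The first half of that commit step can be pictured as follows. Column names for the revisions table are illustrative (the text above only says it tracks date, author, commit message and branch); the plugin performs the equivalent automatically:

```sql
-- Record the new revision; the plugin then merges changed/new rows per table.
INSERT INTO revisions (rev, commit_msg, branch, date, author)
VALUES (43, 'updated parcel boundaries', 'trunk', now(), 'alice');
```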


A branch can be created from any revision by adding the four history columns and setting the branch_rev_begin field of the features that are present at that revision.

Plugin interface tutorial

Groups are used for all versioning operations in QGIS since the versions are for a complete PostGIS schema or SpatiaLite database.

The versioning toolbar will change depending on the group selected in the QGIS legend.

The group elements must share the same connection information (i.e. the same database and schema for PostGIS working copies and revision views, or the same SpatiaLite database for SpatiaLite working copies).

Versioning a PostGIS schema

Starting with an unversioned database, import into a QGIS project a number of layers from the schema that needs to be versioned.

Once the layers are imported, they must be grouped together.

Selecting the newly created group will cause the versioning toolbar to display the historize button (green V). On click, a confirmation is requested before versioning the database schema.

The versioned layers are imported in a new group and the original layers are removed from the project.

The symbology is not kept in the process.

Working with a versioned PostGIS schema

Versioned layers can be imported in QGIS. The layers must be from a head revision or a view on any revision.


Once the layers are in QGIS, they must be grouped.


For PostGIS groups at head revision, the versioning plugin allows the user to create a SpatiaLite or a PostGIS working copy, create a view on a given revision or create a branch. A corresponding group will be imported in QGIS.

If the user chooses to create a SpatiaLite working copy, he will be asked to select a file to store the working copy.


Committing changes

Once the working copy is imported in QGIS, the user can start editing its layers. With SpatiaLite working copies, this editing can be done off-line.

When the user is done editing, he can commit the changes to the database. If the commit is feasible (i.e. the working copy is up to date with the versioned database), he will be prompted for a commit message and subsequently informed of the revision number he committed.


If the commit is not feasible, the user will be informed that he must update his working copy prior to commit.

Resolving conflicts

Conflicts are detected during update; the user is informed, and conflict layers are imported into QGIS.

To resolve conflicts, the user can open the conflict layer’s attribute table. Selected entries are also selected on the map canvas and the user can decide which version, either his or the database’s, he wants to keep. User version is tagged with ‘mine’ and database version with ‘theirs’. The conflict is resolved by deleting the unwanted entry in the conflict layer.

On deletion of one conflict entry, both entries are removed (by a trigger) but the attribute table (and canvas) are not refreshed. As a workaround, the user can close and re-open the attribute table to see the actual state of the conflict table.

Once the conflict table is empty, the commit can be done.


Due to design choices and tools used for conversion to SpatiaLite, a number of restrictions apply to the versioned database:

  • schemas, tables and branch names should not contain spaces, capital letters or quotes
  • tables must have primary keys
  • columns are lowercase (because of the conversion to SpatiaLite) but can have spaces (not that it’s recommended)
  • the geometry column is named geom in PostGIS and GEOMETRY in SpatiaLite

Do not edit OGC_FID or ROWID

The constraints on the tables are lost in the PostGIS to SpatiaLite conversion.

Known bug

The conflict layer won’t be loaded automatically if it has no geometry. The user will have to load it manually.

by oslandia at January 24, 2017 02:40 PM


QGIS plugin for water management

Oslandia releases today a new plugin for the QGIS processing framework, allowing for water distribution network simulation. It integrates the opensource EPANET simulation software. EPANET models water distribution networks. It’s a widely used public-domain simulation software developed by the US Environmental Protection Agency.

Hydraulic simulation is used to understand water distribution in a distribution network, to forecast the impact of network alterations, to dimension network elements, or to study extreme-case scenarios (e.g. heavy demand for firefighting, pipe breakages, interruptions in supply).

QGIS provides a graphical user interface that can be used to import/edit/export hydraulic model elements and simulation parameters from various sources, launch simulation and visualize results directly inside QGIS.

Hydraulic model

A hydraulic model consists of junctions (POINT) and pipes (LINESTRING) along with various other elements like tanks, pumps and valves. Those elements can be stored as features in a spatially enabled database. Features attributes can be simple (e.g. pipe diameter) or complex (e.g. pumps characteristic curves or water consumption). Complex attributes are stored via a foreign key in other alphanumeric tables.

This is the kind of data QGIS is designed to handle. It can import/export them from/to a variety of sources and also display and edit them.

Simulation parameters

Simulation parameters and options (e.g. simulation time step or accuracy) are key-value pairs. The values can be stored in a table whose columns are the keys. Each set of simulation parameters is then a record in this table. This kind of table can be loaded in QGIS as a vector layer without geometry.
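For example, such a parameters table might look like this. The key names are illustrative, not the plugin's actual schema:

```sql
-- One record = one set of simulation parameters; one column per key.
CREATE TABLE simulation_parameters (
  id serial PRIMARY KEY,
  duration_hours integer,    -- total simulated duration
  timestep_minutes integer,  -- hydraulic time step
  accuracy numeric           -- solver convergence accuracy
);

INSERT INTO simulation_parameters (duration_hours, timestep_minutes, accuracy)
VALUES (24, 15, 0.001);
```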

Integration in the processing framework

Once the hydraulic model and simulation parameters are loaded in QGIS, the simulation can be launched through the Processing toolbox. The plugin uses the standalone command line interface (CLI) of EPANET, whose path needs to be specified in Processing’s options and configuration.

The plugin assembles an EPANET input file, runs EPANET and parses its output to generate result layers.

One interesting aspect of processing modules is that they can be chained: the user can use other modules to further transform the simulation results, such as feeding them into another simulation model.

Result visualization

Simulation results are water pressure and velocity at all points in the network, along with the state of network elements (e.g. volume in tanks, power of pumps), for all simulation time steps. This represents a huge amount of data, usually displayed either as time-plots or as map-plots of time-aggregated data (e.g. max and min during the simulation).

Results of particular interest are:

  • time-plots of:
    • volume in reservoirs
    • flow at pumps
    • pressure in pipes and at junctions
  • map-plots of:
    • low speed (stagnation)
    • high and low pressure (risk of breakage, unhappy consumer)
    • lack of level variation in reservoirs (stagnation)
    • empty reservoir
    • reservoir overflow
    • abnormal pressure (typical of error in the altitude of a node in the model)
    • flow direction

QGIS is naturally suited for map-plots. Time-aggregated simulation results are automatically joined to map layers when the result table is added to the map. Rule-based symbology is used to highlight zones of concern (e.g. low water velocity or empty reservoirs).

The matplotlib library provides 2D plotting facilities in python and QGIS provides an extensive set of selection tools (on the map or in tables). The plugin’s button plots the appropriate value depending on the selected feature type (e.g. water level for tanks, pressure for junctions).


For a full demo of this plugin, see the following video :


Where and who

The plugin is available on GitHub and should be available soon on QGIS plugin repository : https://github.com/Oslandia/qgis-epanet

This work has been funded by European Funds. Many thanks to the GIS Office of Apavil, Valcea County (Romania). Oslandia has developped this plugin, and provides support and development around QGIS, PostGIS and this plugin. Get in touch if you need more : infos@oslandia.com

We are looking for a free dataset with full informations (pumps, tanks, valves, pipes and their characteristics…) to distribute with this plugin as a test case and demonstration. If you can provide this, mail us !

We also are implementing a Processing plugin for SWMM, the public domain Waste-water simulation tool. If you are interested to participate to the development, please contact us.

by oslandia at January 24, 2017 01:53 PM


PostGIS 3D – Foss4g video and workshop

The latest PostGIS and QGIS 3D enhancements presented at FOSS4G by Oslandia are available online. We suggest you have a look at our PostGIS 3D / QGIS 3D video demonstration using the SFCGAL library and the QGIS Horao plugin.

A step-by-step workshop (really close to the video workflow) is also available online: https://github.com/Oslandia/Workshops/tree/master/FOSS4G_2013_PostGIS_3D

We can provide the full virtual machine on demand, with the proper software environment (6 GB VirtualBox image).

We would be really interested in having your advice on these new 3D features, and the use cases you could be interested in. Do not hesitate to get in touch.

Contact us at infos+foss4g@oslandia.com for any information.

by oslandia at January 24, 2017 01:12 PM


QGIS Community meeting in Brighton

Developers and contributors from the QGIS project gather in person twice a year in different countries. Such an event allows people to synchronize their efforts and discuss possible new developments. The latest QGIS community meeting took place in Brighton from the 12th to the 16th of September, just before the FOSS4G event. It was the biggest community meeting organized so far, with close to 50 people attending! Everything went smoothly thanks to the perfect organization by Lutra Consulting.

This session was of particular interest in the project’s history, since it was dedicated to the release of the eagerly-awaited new 2.0 version of QGIS.

Oslandia regularly takes part in the event and even organized the March 2012 session in Lyon.


Although the event is primarily oriented toward code and translations, some presentations took place. Some of them were video recorded, some were not. Hereafter is a subset of them.

A new website

In parallel with the release of the 2.0 version, the QGIS website has been updated: not only its look and feel, but also the way it is built. Richard Duivenvoorde presented the efforts put into support for multiple languages, adaptation to mobile devices, and the reuse of the tools used for building the project's documentation. The new website is now online.

Richard presenting the new website


Presentation of the new website: http://www.ustream.tv/recorded/38687971

Constraints on attributes

Some more developer-oriented presentations and discussions also took place. Matthias Kuhn and Nathan Woodrow presented an idea about extending the way attributes are handled by QGIS. In particular, the concept of constrained attributes emerged. The idea is to be able to express, manipulate and edit constraints on attributes (a possible range of values, for instance) as is found in databases. This could then be used to constrain user editing of layers, presenting to the user an appropriate widget (a combo box for an enumeration, for instance), especially for layers that do not have native support for these constraints.
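The database-side constraints in question can be sketched in SQL (a hypothetical table; the idea being that QGIS could read such constraints and present a combo box for the enumeration and a range-checked input for the numeric field):

```sql
-- Hypothetical layer table whose constraints an editing widget could reflect.
CREATE TYPE road_class AS ENUM ('motorway', 'primary', 'residential');

CREATE TABLE roads (
    id    serial PRIMARY KEY,
    class road_class NOT NULL,                   -- enumeration: combo box
    lanes integer CHECK (lanes BETWEEN 1 AND 8)  -- range: constrained spin box
);
```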

QGIS for Android tablets

RealworldSystems presented their work on what they called the “QGIS Mobility framework”, based on previous work by Marco Bernasocchi on QGIS for Android. It is dedicated to the design of custom QGIS applications for deployment on Android tablets (for on-the-field editing campaigns, for instance). It looks promising and has already been used in a real-world application for gas pipeline inspection. The framework can be found on GitHub.

QGIS webserver

Andreas Neumann presented evolutions of QGIS webserver and webclient. More can be found in the corresponding video.

Andreas presenting the work on QGIS webserver and webclient

Video 1 http://www.ustream.tv/recorded/38741015

Evolution of the Globe plugin

Matthias Kuhn presented evolutions he made to the Globe plugin, which displays a 3D earth with different kinds of data on it. Many osgEarth features are now integrated into the Globe plugin (in particular, support for 2D vector layers).

Matthias presenting his work on the Globe plugin

Video 2 http://www.ustream.tv/recorded/38737991

Visualisation of 3D data

Oslandia also presented its ongoing work on PostGIS 3D integration. After a thorough evaluation of osgEarth, which is the base of the Globe plugin, we decided to develop our own 3D visualisation stack directly on top of OpenSceneGraph.

A QGIS plugin has also been developed in order to be able to view QGIS layers in 3D.

With this new 3D visualisation stack we are able to display and manipulate a whole city's data at between 20 and 60 frames per second on a laptop (the demo was designed on data from the city of Lyon), whereas we were hardly able to display a small city quarter with Globe.

Oslandia presenting its work on its 3D visualisation stack

Video 3 http://www.ustream.tv/recorded/38738897

Slides https://github.com/Oslandia/presentations/tree/master/qgis_hf_2013

QGIS 2.0

All the work done during this community meeting helped polish the 2.0 version of QGIS, which was publicly announced by Tim Sutton at FOSS4G in Nottingham.
Now waiting for the 2.1 release 🙂

by oslandia at January 24, 2017 11:48 AM


Oslandia and Mappy towards OpenSource

For more than two years, Oslandia has been working with Mappy for its transition towards OpenSource. The first step is represented by the database migration project of the cartographic backoffice, from Oracle to PostGIS. This has just been finalized and is currently running in production.

Mappy offers routing and cartography services. Supporting several billion requests per month, the LBS (Location Based Services) platform is at the heart of Mappy's activity and serves as a base for new products such as Web To Store. Oslandia works together with Mappy to migrate this cartographic platform from a solution developed in-house to a new solution based entirely on free and open source software, in order to prepare for forthcoming technical and operational challenges.

The migration

Data preparation and cartographic map creation backoffices have been migrated successfully.
This software stack was initially based on Oracle Spatial, SQL Server, and other non-standard internal tools to create the map. It now runs only open source components. Oracle and SQL Server have been completely replaced by PostgreSQL and PostGIS, the key component for spatial data storage and preprocessing. Mapnik, Python, Tornado, Varnish, MemCached and Debian are other open source components used in this architecture.

The migration to open source software allowed Mappy to rationalize and optimize the architecture of their “map” component:

  • Reduced data processing time
  • API and data format standardization
  • Strong reduction of technical debt
  • Fewer lines of code
  • Cost reduction for the platform, and cost-effective scaling
  • Improved technical skills and stronger motivation for the team
  • Full technical mastery of their tools

Oslandia helped Mappy integrate the open source culture and methods into their teams, and provided them with the technical skills required to implement this high-load architecture. Oslandia's competence in geographical information systems and its deep PostGIS expertise made the project a success.

Mappy benefited from the latest OpenSource technologies, at the state of the art, and was also able to contribute back to some free software, like Mapnik.

Migration steps

The migration has been achieved with several steps, so as to keep the project under control, validating the migrated components through early tests.

The initial platform was the following one :

The target platform, with all components migrated to OpenSource, is now the following one :

Thanks to the availability of open source components, a prototype was developed within two weeks at the beginning of the project, so as to validate architectural decisions. Then the real migration work began.

As noted, the choice of open source components was made quickly. The target architecture could therefore be set up early, and the development and migration processes were established once this first step was validated.

This migration gave the opportunity for Mappy to contribute to Mapnik. A 25% performance gain was achieved when reading data from PostGIS. Mappy also opensourced Pycnik, a Python tool to generate styles for Mapnik.


PostGIS, the database supporting the whole infrastructure, delivers very high performance and a rich feature set. The database (75 GB) benefits from the latest enhancements of PostgreSQL and PostGIS, such as streaming replication, new PostGIS geographical functions, recursive CTEs, native JSON support and much more.

Audrey Malherbe ( @AudreyMalherbe ), project manager at Mappy, stresses that choosing PostgreSQL and PostGIS was obvious:

“We wanted to dive into open source, leaning on renowned, high-performance technologies like PostGIS. It was important for us to contribute to open source, and Oslandia's expertise and its involvement in the community allowed us to rush into this adventure with full confidence.”

Some facts

Most geeks like to see these kinds of numbers:

  • for 85% of all Mappy traffic
  • 14 servers (4 PostGIS, 8 Tornik, 2 Varnish)
  • 240 req/s peak without cache
  • 2500 req/s peak with cache
  • 94% Cache Hit Ratio
  • 2M objects in cache warming
  • 75 GB for PostGIS database
  • 300 GB for the relief database

Next steps

More technical information on this migration is available in Audrey's presentation at FOSS4G 2013.

Mappy intends to go on with this OpenSource orientation and extend this migration method to the other services of the cartographic platform.


About Mappy

A specialist in routing and cartographic services, Mappy is renowned as the leader in map-based local search on the web, tablets, mobiles and GPS.

Mappy offers its users three kinds of search: map search, which allows users to visualize a neighborhood, experience city immersion thanks to 360° views in 320 French cities, and open the doors of several thousand shops; the routing service, available for cars, public transportation, bicycles and pedestrians; and product search, allowing users to locate a specific product in a given geographical zone and to know its price and availability.

As a major actor in urban transportation, Mappy offers advertisers a geolocated solution covering the whole country, facilitating web-to-store applications and traffic generation to their retail centers.
Mappy currently counts more than 10 million users on the web, tablets and mobile (Mappy and MappyGPS Free).

Mappy is a subsidiary of the Solocal Group.

by oslandia at January 24, 2017 10:51 AM

January 23, 2017


Full Spatial database power in 2 lines

This post explains how to setup a powerful spatial data store with a wide range of features on Ubuntu 14.04 (Trusty) in 2 command lines. And how it works.


Run this on Ubuntu 14.04 Trusty Tahr, and enjoy:

sudo apt install docker.io
sudo docker.io run --rm -P --name pggis_test oslandia/pggis /sbin/my_init

Database credentials: pggis/pggis/pggis (database/user/password)


More info

You want to store and manipulate spatial data. Lots of spatial data. Various spatial data. This Docker image lets you install the latest components based on the most powerful opensource database (PostgreSQL, in case you wonder) and benefit from a wide range of features.
Docker is a Linux container based technology. See containers like something between the good old chroot and a Virtual Machine. Fast, efficient and painless. See Docker as a mix between a meta-package manager, git and a whale. Docker is available natively on Ubuntu 14.04 (and named docker.io), and can also run on other Linux flavors, or in a minimalist Virtual Machine with Boot2docker for other environments (Windows, MacOSX).

This image focuses on installing and running the following components:

  • PostgreSQL 9.3
  • PostGIS 2.1.3 with SFCGAL support
  • PgRouting
  • PostgreSQL PointCloud
  • PDAL

PostgreSQL is the advanced database providing a lot of features for data store and management. Version 9.3 is the latest stable version (9.4 is just around the corner), providing support for streaming replication, JSON, CTE, window queries and much, much more.

PostGIS is a PostgreSQL extension, enabling spatial support. It features new data types, such as 2D (and 2.5D) vector data, be it projected or in lat/lon. It also offers raster data support, to store gridded image data, and topological vector data storage.

SFCGAL support for PostGIS adds 3D features to the database. You can store 3D meshes, TIN, and operate on those objects.

PgRouting is a PostgreSQL/PostGIS extension allowing you to perform routing in the database on topological spatial data. It can be used to dynamically compute shortest paths, driving distances and more.
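As a hedged illustration (the edge table and its columns are hypothetical, and the call follows the pgRouting 2.x signature), a shortest-path query looks roughly like:

```sql
-- Sketch: shortest path between two vertices over a hypothetical
-- edge_table(id, source, target, cost) built from road topology.
SELECT *
FROM pgr_dijkstra(
       'SELECT id, source, target, cost FROM edge_table',
       1,      -- start vertex id
       42,     -- end vertex id
       false   -- treat the graph as undirected
     );
```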

PostgreSQL PointCloud is a PostgreSQL/PostGIS extension which lets you deal with huge amounts of points, e.g. LIDAR data. PDAL is a library and set of tools allowing you to load and extract data from and to PointCloud (among other things).
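Inside the container's database, the stack is exposed through extensions, roughly as below (a sketch; extension names as packaged in recent releases — note that with PostGIS 2.1 the SFCGAL functions ship with the core build, the separate postgis_sfcgal extension only appearing in 2.2):

```sql
CREATE EXTENSION postgis;            -- vector, raster and spatial functions
CREATE EXTENSION pgrouting;          -- routing on topological data
CREATE EXTENSION pointcloud;         -- point cloud types (pcpoint, pcpatch)
CREATE EXTENSION pointcloud_postgis; -- casts between pointcloud and PostGIS geometries
```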

All of this makes the best platform for spatial data crushing. Loads of spatial data.

Using it

Below is a longer full explanation to install this image and use it.

Install Docker

You first need to install the Docker tools on your Ubuntu system. You can install Docker on a lot of recent Linux flavors or in a Linux virtual machine (with Boot2docker, for example).

sudo apt install docker.io

Get the image and run the container

The following commands use Docker to download the image from the Docker registry (https://index.docker.io/u/oslandia/pggis/, ~1 GB) and then run it in the foreground. If you omit the pull command and try to run it directly, Docker will look for the image on the Docker registry if it is not already present locally.

sudo docker.io pull oslandia/pggis
sudo docker.io run --rm -P --name pggis_test oslandia/pggis /sbin/my_init

Your container is initiated from the oslandia/pggis image. It is named pggis_test, runs in the foreground (the default), redirects all exposed ports to host ports (-P), and deletes the created filesystem and container when it exits (--rm). It runs the /sbin/my_init startup script of baseimage to launch all necessary processes inside the container.

Now PostgreSQL runs in your container with all extensions activated, and a new database created just for you.

Connect to the database

Assuming you have the postgresql-client installed on your host, you can use the host-mapped port to connect to the database. You need to use docker.io ps to find out what local host port the container is mapped to first:

$ sudo docker.io ps
CONTAINER ID        IMAGE                   COMMAND                CREATED             STATUS              PORTS                     NAMES
75fec271dc5e        oslandia/pggis:latest   /sbin/my_init       51 seconds ago      Up 50 seconds>5432/tcp   pggis_test

You can see that the container's port 5432 has been mapped to the host's port 49154 in this case (yours may differ). We can now connect to the database server through this host port. A pggis user has been created (with pggis as password) and a corresponding database with extensions activated.

$ psql -h localhost -p 49154 -d pggis -U pggis --password

Enjoy GIS features

You are now ready to use the database. Inside the psql console, you can check that everything is in order with queries like below.

pggis=# SELECT postgis_version();
(1 row)

pggis=# SELECT st_astext(st_makepoint(random() * 100, random() * 100)) from generate_series(1, 10);
POINT(90.1295741088688 97.7623191196471)
POINT(79.2798819020391 0.146342813968658)
POINT(56.9643571972847 50.8249804843217)
POINT(14.1903728246689 26.436754828319)
POINT(46.5733857825398 39.7641783114523)
POINT(89.1805129591376 36.6665146779269)
POINT(17.3397121019661 20.6000158563256)
POINT(11.5902675781399 93.6640297528356)
POINT(96.0820932406932 53.891480108723)
POINT(88.9889035373926 15.435611223802)
(10 rows)

Describing all features of installed components would require a lot more posts, coming later. If you are interested in learning more, Oslandia can provide training, assistance and support.


Do not use this in production !

The PostgreSQL configuration in this image opens the database connection without any IP restriction for the pggis super-user. The default pggis password is very weak too. This should NOT BE USED IN PRODUCTION, and you should adapt the configuration to your specific setup and security level. See below to rebuild the image with a different configuration. In a production environment, you would also want to run the container in the background (-d option) and keep the container filesystem and images after running (no --rm option).

How it works

This image is a Docker image, which uses Linux Containers (LXC). The base image is Phusion baseimage (http://phusion.github.io/baseimage-docker), which is itself an Ubuntu 14.04 (Trusty). It is built using a Dockerfile (see the source on GitHub). This Dockerfile is used by Docker to build the pggis image, processing all installation, download, compilation and setup of the various components. The result is a Docker image, which is then published on docker.io.

You can either directly download and use the image from docker.io as explained above, or rebuild it from the Dockerfile in the git repository like below.

sudo apt-get install docker.io

Then clone the GitHub repository to get the Dockerfile :

git clone https://github.com/vpicavet/docker-pggis.git

cd docker-pggis

Then you can build the image, using the Dockerfile in the current repository.

sudo docker.io build -t oslandia/pggis .

Note that building the image requires downloading quite a lot of data from the internet, as well as enough RAM and CPU. Compilation happens along the way with make -j3, enabling parallel builds; you should adapt this value to the number of CPUs you have.

Once you have built the image, you can run a container based on it just like above.

sudo docker.io run --rm -P --name pggis_test oslandia/pggis /sbin/my_init

If you want to change how the container works, or the default setup, you can edit the Dockerfile and rebuild the image. Docker caches every step of the build process (after each RUN command), so rebuilding can be very fast.

If you have a docker.io account, you can upload the image to the image registry like this:

    sudo docker.io login
    sudo docker.io push oslandia/pggis


This is a first dive into Docker, with a spatial GIS database container. There is a lot more you can do with Docker, and plenty of use cases for this spatial database setup. This Docker image, the setup and the way it is built can be improved a lot as well, and will certainly change along with package availability, PPA repositories, next versions of components, and better Docker practices and versions, but it is already a good and efficient way to get the stack up quickly. Do not hesitate to fork the GitHub repository and send pull requests.

You can also report any issue on GitHub https://github.com/vpicavet/docker-pggis/issues

If you want to go further, or if you have any remark concerning the tools covered in this blog post, do not hesitate to contact us : infos+pggis@oslandia.com

by oslandia at January 23, 2017 03:41 PM

December 11, 2016

Paul Ramsey

PostgreSQL "Compatible" Aurora


While I know full well that Amazon’s marketing department doesn’t need my help, I cannot resist flagging this new development from the elves in Santa’s AWS workshop:

Today we are launching a preview of Amazon Aurora PostgreSQL-Compatible Edition. It offers … high durability, high availability, and the ability to quickly create and deploy read replicas. Here are some of the things you will love about it:

Performance – Aurora delivers up to 2x the performance of PostgreSQL running in traditional environments.

Compatibility – Aurora is fully compatible with the open source version of PostgreSQL (version 9.6.1). On the stored procedure side, we are planning to support Perl, pgSQL, Tcl, and JavaScript (via the V8 JavaScript engine). We are also planning to support all of the PostgreSQL features and extensions that are supported in Amazon RDS for PostgreSQL.

Cloud Native – Aurora takes full advantage of the fact that it is running within AWS.

The language Amazon uses around Aurora is really weird – they talk about “MySQL compatibility” and “PostgreSQL compatibility”. At an extreme, one might interpret that to mean that Aurora is a net-new database providing wire- and function-level compatibility with the target databases. However, in the PostgreSQL case, the fact that they are additionally supporting PostGIS, the server-side languages, really the whole database environment, hints strongly that most of the code is actually PostgreSQL code.

There is not a lot of reference material about what’s going on behind the scenes, but this talk from re:Invent shows that most of the action is in the storage layer. For MySQL, since storage back-ends are pluggable, it’s possible that AWS has added their own back-end. Alternately, they may be running a hacked up version of the InnoDB engine.

For PostgreSQL, with only one storage back-end, it’s pretty much a foregone conclusion that AWS have taken a fork and added some secret sauce to it. However, the fact that they are tracking the community version almost exactly (they currently offer 9.6.1) indicates that maybe their fork isn’t particularly invasive.

I’d want to wait a while before trusting a production system of record to Aurora PgSQL, but the idea of the cloud native PostgreSQL, with full PostGIS support, excites me to no end. RDS is already very very convenient, so RDS-with-better-performance and integration is just icing on the cake for me.

I, for one, welcome our new cloud database overlords.

December 11, 2016 04:09 PM

December 09, 2016

Boundless Geo

LocationTech Tour 2016

If you caught our update on FOSS4G 2016 you know we enjoy taking part in community events and open source advocacy. The LocationTech Tour is the second big open source GIS mapping conference & community event we take part in; …

The post LocationTech Tour 2016 appeared first on Boundless Geo.

by Jody Garnett at December 09, 2016 05:13 PM


November 28, 2016

PostGIS Development

PostGIS 2.3.1 Released

The PostGIS development team is pleased to announce the release of PostGIS 2.3.1. Best served with pgRouting 2.3.1 and PostgreSQL 9.6.1.

As befits a patch release, the focus is on bugs and breakages.

Continue Reading by clicking title hyperlink ..

by Regina Obe at November 28, 2016 12:00 AM

November 26, 2016

PostGIS Development

PostGIS 2.2.4 Released

The PostGIS development team is pleased to announce the release of PostGIS 2.2.4 As befits a patch release, the focus is on bugs and breakages.

Continue Reading by clicking title hyperlink ..

by Regina Obe at November 26, 2016 12:00 AM

November 23, 2016

Postgres OnLine Journal (Leo Hsu, Regina Obe)

ODBC FDW now supports 9.5 and 9.6

A while ago, when foreign data wrappers in PostgreSQL were a fairly new thing, we talked about the ODBC_FDW foreign data wrapper. Since then, people have been asking us how to get the ODBC FDW to work on newer PostgreSQL. Sadly, ODBC_FDW was stuck in time, not having been updated to the newer FDW API standards. Our recommendation was just to use OGR_FDW, which many distributions, both Linux and Windows, have compiled with ODBC support. True, OGR_FDW is billed as a spatial data wrapper, but the reality is spatial data rarely lives apart from regular attribute data, so a good spatial vector driver supports both vector data and bread-and-butter data types. OGR_FDW is still our go-to for working with spreadsheets and folders of CSV files.

Recently the fine folks at Carto patched the ODBC FDW to work with PostgreSQL 9.5. I do hope they accept my modest patch to make it work with PostgreSQL 9.6 as well. So now there are 2 FDWs to choose from for connecting to ODBC data sources. Which one is better? The answer, as almost always, is IT DEPENDS.
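For the folders-of-CSV-files case, a minimal OGR_FDW setup looks roughly like this (the datasource path and target schema are hypothetical; IMPORT FOREIGN SCHEMA requires PostgreSQL 9.5+):

```sql
CREATE EXTENSION ogr_fdw;

-- Point a server at a directory of CSV files via OGR's CSV driver.
CREATE SERVER csv_source
  FOREIGN DATA WRAPPER ogr_fdw
  OPTIONS (datasource '/data/csv_folder', format 'CSV');

-- Expose every file in the folder as a foreign table in schema "staging".
IMPORT FOREIGN SCHEMA ogr_all
  FROM SERVER csv_source INTO staging;
```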

Continue reading "ODBC FDW now supports 9.5 and 9.6"

by Leo Hsu and Regina Obe (nospam@example.com) at November 23, 2016 05:39 AM

October 17, 2016

BigSQL Holly

Two great things that taste great together: PostGIS and PostgreSQL

In my past life as a geospatial DBA, I had to navigate users and managers who belonged to different database teams (PostgreSQL, MySQL, SQL Server, Oracle, etc). When I was lucky enough to work in a shop that supported open source solutions, I was often asked by members of team MySQL, “Why should we use PostgreSQL as our geospatial database when MySQL has spatial data types?”

The answer: PostGIS.

Sure, MySQL has the ability to run spatial analysis (with some prodding). But PostgreSQL + PostGIS wins in:

  • functionality
  • performance
  • adoption by 3rd party solutions (QGIS, ArcGIS Server, GeoServer…)
So if you like to do things the free and easy way, go with PostgreSQL and PostGIS.

    by Holly Orr at October 17, 2016 02:59 PM

    October 10, 2016

    UpStats blog (Stefan Petrea)

    Off the streets, Land subdivision in PostGIS - Part 2


    In a previous post, we've seen how PostGIS and OpenStreetMap can be used to leverage geographical data.

    A common task in GIS (in particular land-use planning) is land subdivision. This involves taking different shapes, usually polygons, and cutting them into smaller polygons.

    This post will focus on describing an algorithm for partitioning a land polygon into parcels of a given area. Parcels will be cut off from the initial polygon until no more parcels can be formed.

    This is part 2 of a series.

    Algorithm description

    We have a polygon P and the nearest road R. We get the bounding box B for P and all our searches for cutting points/lines will be confined to B. We compute the extreme points on the bounding box and we label them with the cardinal directions they represent.

    Our goal is to cut a corner C (also called subdivision) from P such that it contains the nearest boundary point to a road. The cut will be done using two lines, one horizontal and one vertical. We want C to be of a given area A.

    Now in order to find the corner to cut, we look at the extreme points and check which one of them is closest to a road.

    We'll use sweeping-lines to find the parcel we need. Any sweeping-lines mentioned will be moving away from the corner (in other words, away from the closest road point).

    In what follows, we assume the north-west corner needs to be cut.

    We place an inset (a horizontal line) that will be located sqrt(A) to the south (relative to the north edge). The inset is positioned there because we anticipate the target area to be in the form of a square.

    If the area above the inset (the one we aim for) is larger than our target, we split the polygon, take the upper half and use another sweeping line that goes from west to east, to find another cutting line that allows us to find a parcel with the required area.

    If the area above the inset is insufficient (below the target area), we search for a better position for it, using binary search, along the north-south direction.
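The quantity that binary search evaluates at each step, namely the area of the polygon above a candidate horizontal cut, can be expressed in PostGIS terms roughly as follows (a sketch only: poly, the bounding-box coordinates and the SRID are assumed to be supplied by the surrounding code):

```sql
-- Sketch: area of the part of the polygon north of a candidate cut at
-- latitude y, clipped to the bounding box; all identifiers are assumed.
SELECT ST_Area(
         ST_Intersection(
           poly,
           ST_MakeEnvelope(xmin, y, xmax, ymax, srid)
         )
       ) AS area_above_cut;
```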

    Additional details: The way the cut search works, using the inset, is such that we avoid getting thin horizontal strips when our initial polygon is a square/rectangle (and it is expected to be a square in the vast majority of cases).

    Details about corner cases (other than NW which was covered above):

    • NE corner: horizontal goes north->south and vertical goes east->west
    • SE corner: horizontal goes south->north and vertical goes east->west
    • SW corner: horizontal goes south->north and vertical goes west->east

    So the sweep lines always move away from the corner.

    After the parcel with target area was found, it will be cut off from the polygon, the original polygon in the GIS database will be updated and the new parcel will be inserted (separate from the original polygon that we split).


    OSM was used to extract test data. Specifically, in the run below, data for the Herastrau Park in Bucharest was used. In OSM, the park is represented as a series of polygons, and one of the polygons was partitioned in parcels, each measuring 8000 square meters.

    On the left side you can see the actual partition. On the right side, the same partition is displayed, except we also have some of the objects used throughout development, including bounding boxes, and extreme boundary points.

    And to visualize how the algorithm works, here's an animation of how it partitions the polygon, parcel by parcel:


    • While working on this algorithm implementation, at one point it was required to find the extreme boundary points and assign them cardinal orientation labels. A first approach to determining the cardinal direction of the boundary points was to order them by ST_Azimuth relative to the centroid of the polygon (as the circle center passed to ST_Azimuth). This is a pitfall because the clockwise order doesn't in any way guarantee membership in the N, E, S, W edges of the bounding box. It's perfectly possible that two such extreme points belong to the north edge, one belongs to the west edge, and one to the south edge.
    • As it turns out, splitting the polygon multiple times and using ST_Union to glue the remaining parts back together will create MULTIPOLYGON structures, which ST_ExteriorRing is not compatible with. So a convex hull was used in order to get one single polygon and compute the bounding box for it. This problem will surface after repeatedly cutting the input polygon.
    • Due to lack of knowledge about QGIS, some logic for an SVG-based visualization was written in order to check the position of the cuts and the shape of the polygon after the cuts were made. Although QGIS has a virtual layer feature, it was quite hard to use.
    • Early on, there was a need to visualize some objects from OSM in order to assess if they are fit for test data. It was easier to create a VIEW of those objects in PostgreSQL and then visualize that in QGIS (it was a bit hard to deal with all the menus in QGIS)
    • It would've been nice to have some automatically generated test polygons for each position of the road relative to the polygon, and some polygons that would be thinner near the top and wider near the bottom. However, the polygons extracted from OSM were used instead.
    • To get the extreme points of the polygon, the first approach was to intersect the exterior ring of the polygon with the exterior ring of the polygon's envelope (envelope meaning bounding box). This proved to be complicated, and would add complexity if one of the polygon's edges were axis-parallel, as the result of the intersection would then be a line, not a point. The second approach was much simpler: just decomposing the polygon into its vertices (using ST_DumpPoints) and then picking the northmost, eastmost, westmost and southmost vertices. While this works, it only works for POLYGON and MULTIPOLYGON; it's not expected to work for CURVEPOLYGON.
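The vertex-decomposition approach in the last point can be sketched like this (table and column names are assumed):

```sql
-- Northmost vertex of a polygon: dump its vertices and order by Y.
SELECT (dp).geom AS northmost
FROM (
  SELECT ST_DumpPoints(geom) AS dp
  FROM land_polygons
  WHERE id = 1
) AS vertices
ORDER BY ST_Y((dp).geom) DESC
LIMIT 1;
```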


    While implementing the logic for each corner cut, it became obvious that the logic for one corner would be enough: the entire drawing could be rotated, the cut made, and the resulting pieces rotated back. This would have decreased the size of the implementation.

    No strict constraints are made on the shape of the parcels. Most of them are square, but some diverge from that shape quite a lot.

    Some polygons may have U-shapes and if the closest-point on the boundary is one of the two tips of the U shape, the parcel that gets cut could turn out to be one made up of disconnected pieces.

    Another gap that was not covered by the initial spec and is not currently handled is the row excess which can form at the end of a row. This excess is another factor that can generate thin parcels because, once formed, the next split will use it and add a very thin portion beneath it to form the new parcel. This is not very helpful, but it's a known issue.

    Someone pointed out that there should also be access paths/roads, but that wasn't part of the problem statement. Surface inclination was also not taken into consideration; the subdivided area is assumed to be flat.


    In this post we've seen how PostGIS can be leveraged to build a land subdivision algorithm. The implementation is available on GitHub under the MIT license.



    Satoshi Koda has a great blog post about this, and that was very useful reading while writing the SVG visualization for this project. His new blog is hosted here.

    October 10, 2016 09:00 PM

    October 06, 2016

    PostGIS Development

    PostGIS 2.2.3 Released

    The PostGIS development team is pleased to announce the release of PostGIS 2.2.3 As befits a patch release, the focus is on bugs and breakages.

    Continue Reading by clicking title hyperlink ..

    by Regina Obe at October 06, 2016 12:00 AM

    October 04, 2016

    BigSQL Holly

    Why are you still paying for your GIS when you can get it for free?

    If you haven’t switched to open source solutions (FREE!), then you have probably fallen for some common misconceptions/myths:

    MYTH: Open source tools are buggy.

    Does software have bugs? Of course. It was made by humans! But open source has the benefit of a strong community that can crowdsource fixes because the code base is not proprietary (open source!). And, by the way, if you have been using the most popular proprietary GIS software (wink, wink) for more than 5 years, you know all about bugs.

    MYTH: Free GIS software has a limited toolbox for analysis.

    Well, this depends. Let's refer to the 80/20 rule here: applied to GIS, 80% of your analysis can be processed with 20% of the tools/algorithms available for a spatially enabled database. If you are a bada$$ spatial stats expert and love your proprietary tools, then by all means stick with that expensive stack (but also take a look at PySAL). If you are like most of us (roughly 80%), though, you can accomplish your analysis with the FREE stack.

    MYTH: Open source tools are impossible to install and administer.

    Granted, this has been true in the past. But the open source community has made great strides in creating tools that don't require an engineering degree to stand up a fully functioning GIS stack. And because the community is large (and committed), you can find a ton of tutorials and documentation on the web.

    MYTH: PostGIS is fine for hobbyists, but it can’t support corporate needs.

    Actually, more and more companies and government agencies are turning to PostgreSQL/PostGIS for their geospatial needs: Uber, FourSquare, NOAA, and Carto just to name a few.

    In upcoming posts we will show you how to install new open source (FREE!) tools from BigSQL (powered by OpenSCG) to help you build your open source GIS stack.

    Just think of all the cool stuff you can buy from the Public Lab with the money you will save…

    by Holly Orr at October 04, 2016 02:26 PM

    September 26, 2016

    PostGIS Development

    PostGIS 2.3.0 Released

    The PostGIS development team is pleased to announce the release of PostGIS 2.3.0.
    This is the first version to utilize the parallel support functionality introduced in PostgreSQL 9.6. As such, if you are using PostgreSQL 9.6, we strongly encourage you to use this version.

    Parallel support will make many queries using PostGIS relationship operators and functions faster. In order to take advantage of parallel query support, make sure max_parallel_workers_per_gather is set to something greater than 0, as noted in the PostgreSQL runtime configuration docs.
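    As a sketch, enabling parallel workers for a session and checking the plan of a PostGIS-heavy query might look like this (the parcels and roads tables are hypothetical):

    ```sql
    -- Allow up to 4 parallel workers per Gather node for this session
    SET max_parallel_workers_per_gather = 4;

    -- Relationship functions such as ST_Intersects can now be
    -- evaluated in parallel workers; EXPLAIN shows a Gather node
    EXPLAIN ANALYZE
    SELECT count(*)
    FROM parcels a
    JOIN roads r ON ST_Intersects(a.geom, r.geom);
    ```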

    Best served with PostgreSQL 9.6+, which is due out this week, and pgRouting 2.3.0, which also just got released.

    Packages from maintainers will be out in the coming days and weeks.

    Continue Reading by clicking title hyperlink ..

    by Regina Obe at September 26, 2016 12:00 AM

    September 19, 2016

    PostGIS Development

    PostGIS 2.3.0rc1 Released

    PostGIS 2.3.0rc1 is feature complete, so we’re looking for testing and feedback! Best served with PostgreSQL 9.6rc1 and pgRouting 2.3.0-rc1

    Please give this release candidate a try and report back any issues you encounter. New things since 2.3.0beta1 release

    Please report bugs that you find in this release.

    Important / Breaking Changes

    • 3466, Casting from box3d to geometry now returns a 3D geometry (Julien Rouhaud of Dalibo)

    • 3604, pgcommon/Makefile.in orders CFLAGS incorrectly leading to wrong liblwgeom.h (Greg Troxel)

    • 3396, ST_EstimatedExtent, now throws WARNING instead of ERROR (Regina Obe)

      New Features and Performance Enhancements

    • Add support for custom TOC in postgis_restore.pl (Christoph Moench-Tegeder)

    • Add support for negative indexing in ST_PointN and ST_SetPoint (Rémi Cura)
    • Numerous new function additions and enhancements: New Functions and Enhancements

    • 3549, Support PgSQL 9.6 parallel query mode, as far as possible (Paul Ramsey, Regina Obe)

    • 3557, Geometry function costs based on query stats (Paul Norman)
    • 3591, Add support for BRIN indexes (Giuseppe Broccolo of 2nd Quadrant, Julien Rouhaud and Ronan Dunklau of Dalibo)
    • 3496, Make postgis non-relocateable (for extension install), schema qualify calls in functions (Regina Obe) Should resolve once and for all for extensions #3494, #3486, #3076

    • 3547, Update tiger geocoder to support TIGER 2016 and use http or ftp (Regina Obe)

    See the full list of changes in the news file and please report bugs that you find in the release. Binary packages will appear in repositories over the coming weeks as packagers roll out builds.

    View all closed tickets for 2.3.0.

    by Regina Obe at September 19, 2016 12:00 AM

    September 18, 2016

    Anita Graser (Underdark)

    Movement data in GIS: issues & ideas

    Since I’ve started working, transport and movement data have been at the core of many of my projects. The spatial nature of movement data makes it interesting for GIScience but typical GIS tools are not a particularly good match.

    Dealing with the temporal dynamics of geographic processes is one of the grand challenges for Geographic Information Science. Geographic Information Systems (GIS) and related spatial analysis methods are quite adept at handling spatial dimensions of patterns and processes, but the temporal and coupled space-time attributes of phenomena are difficult to represent and examine with contemporary GIS. (Dr. Paul M. Torrens, Center for Urban Science + Progress, New York University)

    It’s still a hot topic right now, as the variety of related publications and events illustrates. For example, just this month, there is an Animove two-week professional training course (18–30 September 2016, Max-Planck Institute for Ornithology, Lake Konstanz) as well as the GIScience 2016 Workshop on Analysis of Movement Data (27 September 2016, Montreal, Canada).

    Space-time cubes and animations are classics when it comes to visualizing movement data in GIS. They can be used for some visual analysis but have their limitations, particularly when it comes to working with and trying to understand lots of data. Visualization and analysis of spatio-temporal data in GIS is further complicated by the fact that the temporal information is not standardized in most GIS data formats. (Some notable exceptions of formats that do support time by design are GPX and NetCDF but those aren’t really first-class citizens in current desktop GIS.)

    Most commonly, movement data is modeled as points (x,y, and optionally z) with a timestamp, object or tracker id, and potential additional info, such as speed, status, heading, and so on. With this data model, even simple questions like “Find all tracks that start in area A and end in area B” can become a real pain in “vanilla” desktop GIS. Even if the points come with a sequence number, which makes it easy to identify the start point, getting the end point is tricky without some custom code or queries. That’s why I have been storing the points in databases in order to at least have the powers of SQL to deal with the data. Even so, most queries were still painfully complex and performance unsatisfactory.
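    To illustrate the pain: one way to answer the "starts in A, ends in B" question against a plain point table is a pair of DISTINCT ON subqueries (the trackpoints and areas tables here are hypothetical):

    ```sql
    -- trackpoints(track_id, t, geom) holds timestamped points;
    -- areas(name, geom) holds the query polygons A and B
    SELECT s.track_id
    FROM (SELECT DISTINCT ON (track_id) track_id, geom
          FROM trackpoints ORDER BY track_id, t ASC) s      -- start point of each track
    JOIN (SELECT DISTINCT ON (track_id) track_id, geom
          FROM trackpoints ORDER BY track_id, t DESC) e     -- end point of each track
      ON s.track_id = e.track_id
    WHERE ST_Within(s.geom, (SELECT geom FROM areas WHERE name = 'A'))
      AND ST_Within(e.geom, (SELECT geom FROM areas WHERE name = 'B'));
    ```

    Workable, but two full sorts of the point table just to find endpoints shows why a trajectory-native representation is attractive.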

    So I reached out to the Twitterverse asking for pointers towards moving objects database extensions for PostGIS and @bitnerd, @pwramsey, @hruske, and others replied. Amongst other useful tips, they pointed me towards the new temporal support, which ships with PostGIS 2.2. It includes the following neat functions:

    • ST_IsValidTrajectory — Returns true if the geometry is a valid trajectory.
    • ST_ClosestPointOfApproach — Returns the measure at which points interpolated along two lines are closest.
    • ST_DistanceCPA — Returns the distance between closest points of approach in two trajectories.
    • ST_CPAWithin — Returns true if the trajectories’ closest points of approach are within the specified distance.

    Instead of points, these functions expect trajectories stored as LinestringM (or LinestringZM), where M is the time dimension. This approach makes many analyses considerably easier to handle. For example, clustering trajectory start and end locations and identifying the most common connections:


    (data credits: GeoLife project)

    Overall, it’s an interesting and promising approach but there are still some open questions I’ll have to look into, such as: Is there an efficient way to store additional info for each location along the trajectory (e.g. instantaneous speed or other status)? How well do desktop GIS play with LinestringM data and what’s the overhead of dealing with it?

    This is the first part of a series of posts, read more:

    by underdark at September 18, 2016 03:11 PM

    September 11, 2016

    Jorge Arévalo

    Querying raster data stored in your Carto account

    Over the years, people have kept asking me the same question: can you store (and query) raster data in your Carto (formerly CartoDB) account? How?

    Well, the answer is yes. And this is more a PostGIS Raster issue than a Carto issue. Let’s go.

    Disclaimer: this code was written a couple of years ago. It may be outdated, but it still works.

    Store raster data

    First thing you need to do is easy: just drop your georeferenced raster file into the Dataset page of your account. The raster will be imported, and you'll see something like this:

    Do you see that foto_pnoa table with the grayed-out name? That means you cannot click on it to see the table's data. But don't worry: your data is safe in the table. Don't trust me? Just go to another table and run this query (using your own table's name, of course):

    select st_summarystats(the_raster_webmercator, 1) as stats from foto_pnoa

    You should see the stats of your raster data now.

    View your raster data over a map

    Now, to the fun part. We’ll code a bit of JavaScript to put that data over a Carto map. We’ll even allow some calculation with the data (average raster value within a region). So, go for it.

    The HTML

    Pretty simple and straightforward. This is the relevant part of the HTML. But don’t worry, I’ll provide a link to the complete example at the end of the post.

    <div class="header">
        <h1>PostGIS Raster test</h1>
        <h2>Draw a figure and click on it to see the avg raster value</h2>
    </div>
    <div id="map"></div>

    The JS

    We’ll create a Leaflet map with drawing controls.

    Relevant parts of the code are below (again, you’ll get the complete code later).

    Create the map and the controls

    We create a Leaflet map and the draw controls

    // Create map
    var map = new L.Map('map', {
        zoomControl: true,
        center: [37.383333, -5.983333],
        zoom: 11
    });

    // Add CartoDB basemaps
    L.tileLayer('http://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}.png', {
        attribution: '<a href="http://cartodb.com">CartoDB</a> © 2014',
        maxZoom: 18
    }).addTo(map);

    // Add drawn controls
    var drawnItems = new L.FeatureGroup();
    map.addLayer(drawnItems);

    var drawControl = new L.Control.Draw({
        position: 'bottomleft',
        draw: {
            polyline: false, // Turns off this drawing tool
            marker: false,
            polygon: false,
            rectangle: {
                shapeOptions: { color: '#a63b55' },
                showArea: true
            },
            circle: {
                shapeOptions: { color: '#662d91' },
                showArea: true
            }
        },
        edit: {
            featureGroup: drawnItems
        }
    });
    map.addControl(drawControl);

    What can I do with those controls? Let’s see

    Handle draw actions

    Whenever we draw a figure:

    1. Using the figure’s coords, we build a PostGIS geometry by calling ST_MakeBox2D if we drew a rectangle, or ST_Buffer if we drew a circle.
    2. Run a query to get the average raster value within that geometry, and show the value.

    Check it out in the next snippet

    map.on('draw:created', function (e) {
        var type = e.layerType,
            layer = e.layer;
        var pol_pgis = null;

        switch (type) {
        // Create a Rectangle geometry in PostGIS
        case 'rectangle':
            var coords = layer.getLatLngs();
            pol_pgis = "st_transform(ST_SetSRID(ST_MakeBox2D(ST_Point(" +
                coords[1].lng + ", " + coords[1].lat + "), ST_Point(" +
                coords[3].lng + ", " + coords[3].lat + ")), 4326), 3857)";
            break;

        // Create a circle geometry in PostGIS
        case 'circle':
            var center = layer.getLatLng();
            pol_pgis = "st_transform(geometry(st_buffer(geography(st_setsrid(st_point(" +
                center.lng + ", " + center.lat + "), 4326)), " + layer.getRadius() + ")), 3857)";
            break;
        }

        if (pol_pgis) {
            var q = "SELECT avg((stats).mean) as m from (select st_summarystats(the_raster_webmercator, 1) as stats from foto_pnoa where st_intersects(the_raster_webmercator, " + pol_pgis + ")) as foo";
            console.log("QUERY: " + q);

            var sql = new cartodb.SQL({user: 'libregis'});
            sql.execute(q)
                .done(function(data) {
                    if (data.rows && data.rows.length > 0) {
                        layer.bindPopup("Average raster value inside the " + type + ": " + data.rows[0].m);
                    } else {
                        layer.bindPopup("Could not get avg value!");
                    }
                })
                .error(function(errors) {
                    layer.bindPopup("Could not get avg value!");
                });
        } else {
            layer.bindPopup("Could not get avg value!");
        }

        drawnItems.addLayer(layer);
    });

    Show the raster tiles

    The final touch: we put the raster tiles over the map using the Maps API, via AJAX (the old-fashioned way, I know…).

    var config = {
        "version": "1.3.1",
        "layers": [{
            "type": "cartodb",
            "options": {
                "sql": "select * from foto_pnoa",
                "cartocss": "#foto_pnoa {raster-opacity: 0.5;}",
                "cartocss_version": "2.3.0",
                "geom_column": "the_raster_webmercator",
                "geom_type": "raster"
            }
        }]
    };

    var request = new XMLHttpRequest();
    request.open('POST', currentEndpoint(), true);
    request.setRequestHeader('Content-Type', 'application/json; charset=UTF-8');
    request.onload = function() {
        if (this.status >= 200 && this.status < 400) {
            var layergroup = JSON.parse(this.response);
            var tilesEndpoint = currentEndpoint() + '/' + layergroup.layergroupid + '/{z}/{x}/{y}.png';
            var protocol = 'https:' == document.location.protocol ? 'https' : 'http';
            if (layergroup.cdn_url && layergroup.cdn_url[protocol]) {
                var domain = layergroup.cdn_url[protocol];
                if ('http' === protocol) {
                    domain = '{s}.' + domain;
                }
                tilesEndpoint = protocol + '://' + domain + '/' + currentUser() + '/api/v1/map/' + layergroup.layergroupid + '/{z}/{x}/{y}.png';
            }
            rasterLayer = L.tileLayer(tilesEndpoint, {
                maxZoom: 18
            }).addTo(map);
        } else {
            throw 'Error calling server: Error ' + this.status + ' -> ' + this.response;
        }
    };
    request.send(JSON.stringify(config));

    That’s it. You can check the final result in the next codepen

    You can grab the code here


    The very first version of the code was made by Raul Ochoa

    by Jorge Arévalo at September 11, 2016 11:38 AM

    September 06, 2016

    PostGIS Development

    PostGIS 2.3.0beta1 Released

    PostGIS 2.3 is feature complete, so we’re looking for testing and feedback! Best served with PostgreSQL 9.6.

    Please give this beta a try and report back any issues you encounter.

    Please report bugs that you find in this release.

    Important / Breaking Changes

    • 3466, Casting from box3d to geometry now returns a 3D geometry (Julien Rouhaud of Dalibo)

    • 3604, pgcommon/Makefile.in orders CFLAGS incorrectly leading to wrong liblwgeom.h (Greg Troxel)

      New Features and Performance Enhancements

    • Add support for custom TOC in postgis_restore.pl (Christoph Moench-Tegeder)

    • Add support for negative indexing in ST_PointN and ST_SetPoint (Rémi Cura)
    • Numerous new function additions and enhancements: New Functions and Enhancements

    • 3549, Support PgSQL 9.6 parallel query mode, as far as possible (Paul Ramsey, Regina Obe)

    • 3557, Geometry function costs based on query stats (Paul Norman)
    • 3591, Add support for BRIN indexes (Giuseppe Broccolo of 2nd Quadrant, Julien Rouhaud and Ronan Dunklau of Dalibo)
    • 3496, Make postgis non-relocateable (for extension install), schema qualify calls in functions (Regina Obe) Should resolve once and for all for extensions #3494, #3486, #3076

    • 3547, Update tiger geocoder to support TIGER 2016 and use http or ftp (Regina Obe)

    See the full list of changes in the news file and please report bugs that you find in the release. Binary packages will appear in repositories over the coming weeks as packagers roll out builds.

    View all closed tickets for 2.3.0.

    by Regina Obe at September 06, 2016 12:00 AM

    August 25, 2016

    Paul Ramsey

    PgSQL Indexes and "LIKE"

    Do you write queries like this:

    SELECT * FROM users 
    WHERE name LIKE 'G%'

    Are your queries unexpectedly slow in PostgreSQL? Is the index not doing what you expect? Surprise! You’ve just discovered a PostgreSQL quirk.

    TL;DR: If you are running a locale other than “C” (show LC_COLLATE to check) you need to create a special index to support pattern searching with the LIKE operator: CREATE INDEX myindex ON mytable (mytextcolumn text_pattern_ops). Note the specification of the text_pattern_ops operator class after the column name.

    As a beginner SQL student, you might have asked “will the index make my ‘like’ query fast” and been answered “as long as the wildcard character is at the end of the string, it will.”

    PgSQL Indexes and "LIKE"

    That statement is only true in general if your database is initialized using the “C” locale (the North America/English-friendly UNIX default). Running with “C” used to be extremely common, but is less and less so, as modern operating systems automagically choose appropriate regional locales to provide appropriate time and formatting for end users.

    For example, I run Mac OSX and I live in British Columbia, an English-speaking chunk of North America. I could use “C” just fine, but when I check my database locale (via my collation), I see this:

    pramsey=# show LC_COLLATE;
    (1 row)

    It’s a good choice, it’s where I live, it supports lots of characters via UTF-8. However, it’s not “C”, so there are some quirks.

    I have a big table of data linked to postal codes, this is what the table looks like:

                  Table "gis.postal_segments"
          Column       |     Type     | Modifiers 
    -------------------+--------------+-----------
     postal_code       | text         | not null
     segment           | character(4) | 
    Indexes:
        "postal_segments_pkey" PRIMARY KEY, btree (postal_code)

    Note the index on the postal code, a standard btree.

    I want to search rows based on a postal code prefix string, so I run:

    SELECT * FROM postal_segments 
    WHERE postal_code LIKE 'V8V1X%';
                                                  QUERY PLAN                                              
    ------------------------------------------------------------------------------------------------------
     Seq Scan on postal_segments  (cost=0.00..2496.85 rows=10 width=68) (actual time=30.320..34.219 rows=4 loops=1)
       Filter: (postal_code ~~ 'V8V1X%'::text)
       Rows Removed by Filter: 100144
     Planning time: 0.250 ms
     Execution time: 34.263 ms
    (5 rows)

    Ruh roh!

    I have an index on the postal code, so why am I getting a sequence scan?!?! Because my index is no good for doing pattern matching in any collation other than “C”. I need a special index for that, which I create like this.

    CREATE INDEX postal_segments_text_x 
      ON postal_segments (postal_code text_pattern_ops);

    The magic part is at the end, invoking text_pattern_ops as the opclass for this index. Now my query works as expected:

    SELECT * FROM postal_segments 
    WHERE postal_code LIKE 'V8V1X%';
                                                               QUERY PLAN                                                           
    --------------------------------------------------------------------------------------------------------------------------------
     Index Scan using postal_segments_text_x on postal_segments  (cost=0.29..8.31 rows=10 width=68) (actual time=0.067..0.073 rows=4 loops=1)
       Index Cond: ((postal_code ~>=~ 'V8V1X'::text) AND (postal_code ~<~ 'V8V1Y'::text))
       Filter: (postal_code ~~ 'V8V1X%'::text)
     Planning time: 0.532 ms
     Execution time: 0.117 ms
    (5 rows)

    I have gotten so used to PostgreSQL doing exactly the right thing automatically that it took quite a long time to track down this quirk when I ran into it. I hope this page helps others save some time!

    August 25, 2016 09:05 AM

    August 10, 2016

    Paul Ramsey

    Your Broken PostGIS Upgrade

    Since the Dawn of Time, people have found PostGIS upgrades difficult and confusing, and this is entirely to be expected, because a PostGIS upgrade consists of a number of interlocking parts. Sometimes, they “upgrade” their version of PostGIS and find out they’ve bricked their system. What gives?

    Your Broken PostGIS Upgrade

    What Makes PostGIS Work?

    Before talking about upgrades, it’s important to understand how PostGIS works at all, because that understanding is key to seeing how upgrade scenarios go bad.

    PostGIS is a “run-time loadable library” for PostgreSQL. That means we have a block of C code that is added to a running PostgreSQL database. That C code sits in a “library file” which is named (for the current 2.2 version): postgis-2.2.so.

    Just to add to the confusion: for Windows, the name of the library file is postgis-2.2.dll. For every rule, there must be an exception. For users of Apple OSX, yes, there’s a further exception for you: even though most dynamic libraries on OSX are suffixed .dylib, the PostgreSQL modules on OSX are suffixed .so, just like their Linux counterparts.

    The location of the postgis-2.2.so file will vary from system to system.

    The presence of the postgis-2.2.so alone is not sufficient to “PostGIS enable” a database. PostGIS consists of a large collection of SQL functions in the database.

    The SQL functions are created when you run the CREATE EXTENSION postgis command. Until that time your database knows nothing about the existence or definition of the PostGIS functions.

    Once the extension is installed, you can see the definitions of the PostGIS functions in the system tables.

    The use of dynamic function and type management catalogs is one of the things that makes PostgreSQL so incredibly flexible for extensions like PostGIS:

    SELECT * 
      FROM pg_proc 
      WHERE proname = 'st_pointonsurface';
    -[ RECORD 1 ]---+--------------------
    proname         | st_pointonsurface
    pronamespace    | 2200
    proowner        | 10
    prolang         | 13
    procost         | 100
    prorows         | 0
    provariadic     | 0
    protransform    | -
    proisagg        | f
    proiswindow     | f
    prosecdef       | f
    proleakproof    | f
    proisstrict     | t
    proretset       | f
    provolatile     | i
    pronargs        | 1
    pronargdefaults | 0
    prorettype      | 667466
    proargtypes     | 667466
    proallargtypes  | 
    proargmodes     | 
    proargnames     | 
    proargdefaults  | 
    prosrc          | pointonsurface
    probin          | $libdir/postgis-2.2
    proconfig       | 
    proacl          | 

    Lots to see here, but most important bit is the entry for the probin column: $libdir/postgis-2.2. This function (like all the other PostGIS functions) is bound to a particular version of the PostGIS C library.

    Those of you thinking forward can now begin to see where upgrades could potentially go wrong.

    How Things Go Wrong

    Package Managers

    The most common way for things to go wrong is to upgrade the library on the system without upgrading the database.

    So, in Red Hat Linux terms, perhaps running:

    yum upgrade postgresql94-postgis

    This seems straight-forward, but think about what a package manager does during an upgrade:

    • Downloads a new version of the software
    • Removes the old version
    • Copies in the new version

    So, if we had PostGIS 2.1.3 installed, and the latest version is 2.2.2, what has happened?

    • The postgis-2.1.so file has been removed
    • The postgis-2.2.so file has been added
    • So, the pg_proc entries in every PostGIS-enabled database now point to a library file that does not exist

    Fortunately this mismatch between the pg_proc entries and the system state is usually solved during the very next step of the upgrade. But it’s a manual step, and if the DBA and system administrator are different people with different schedules, it might not happen.

    Your next step should be to go and update the SQL function definitions by running an extension update on all your databases:

    ALTER EXTENSION postgis UPDATE TO '2.2.2';

    If you don’t, you’ll find that none of the PostGIS functions work; in fact, you cannot even dump your database. The very act of outputting a representation of the geometry data requires the PostGIS C library file, and until you run ALTER EXTENSION the database doesn’t know where the new library file is.
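    A quick way to spot this state is to compare the library version with the installed SQL script version, something like:

    ```sql
    -- If these two disagree, the C library was upgraded but the SQL
    -- definitions were not: time for ALTER EXTENSION postgis UPDATE
    SELECT postgis_lib_version()       AS c_library,
           postgis_scripts_installed() AS sql_scripts;

    -- postgis_full_version() also calls out the mismatch in its output
    SELECT postgis_full_version();
    ```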


    Since the use of CREATE EXTENSION postgis (available since PostgreSQL 9.1+ and PostGIS 2.0+) became commonplace, migrations now almost always “just work”, which is excellent news.

    • When you dump a modern PostGIS-enabled database, that was created using the CREATE EXTENSION postgis command, the dump file just includes a CREATE EXTENSION postgis command of its own at the top.
    • When you load the dump file into a new version of PostgreSQL even with a new version of PostGIS, the extension is created and the data magically loads.

    However, there are still some old databases around that were created before the PostgreSQL extension system was invented, and when you dump them you get not only the data, but all the “custom” function and type definitions, including the definitions for PostGIS. A function definition looks like this:

    CREATE OR REPLACE FUNCTION ST_PointOnSurface(geometry)
        RETURNS geometry
        AS '$libdir/postgis-2.2', 'pointonsurface'

    And look what is hiding inside of it: a reference to a particular version of the PostGIS library! So you cannot simply dump your old PostGIS 1.5 database on PostgreSQL 8.4 and load it into a fresh PostGIS 2.2 database on PostgreSQL 9.5: the function definitions won’t reference the right library file.

    The best bet for a really old database that was created without the extension mechanism is to use the “hard upgrade” process. The hard upgrade works by:

    • Taking a special “custom-format” back-up that includes an object catalog;
    • Filtering the back-up to clean out all the PostGIS-specific function and object definitions; and then
    • Loading the “cleaned” back-up into a new database with the desired version of PostGIS already installed (using CREATE EXTENSION postgis this time, so you never have to hard upgrade again).


    In the case of upgrades that change out the underlying library and other situations that result in a mismatch between the SQL definitions in the database and the state of the system, there are a couple hacks that provide short-term fixes for emergencies:

    • Symlink the library name the database is looking for to the library name you have. So if your database wants postgis-2.1.so and all you have is postgis-2.2.so, you can ln -s postgis-2.2.so postgis-2.1.so and your database will “work” again.
    • Update the PostgreSQL catalog definitions for the functions. As a super-user, you can do all kinds of dangerous things, and one of them is to just UPDATE pg_proc SET probin = '$libdir/postgis-2.2' WHERE probin ~ 'postgis-2.1'

    Both hacks “work” because the PostGIS project doesn’t change underlying function names often, and inter-version changes mostly involve adding functions to the C library, not removing old ones.

    However, there’s no guarantee that an underlying function name hasn’t changed between versions; it’s just unlikely. In the worst case, the function name hasn’t changed but the parameters have, so calling the function could crash your database.

    All this to say: linking and SQL catalogue hacks should only be used temporarily, until you can properly upgrade your database using a hard upgrade.

    August 10, 2016 04:05 PM

    August 02, 2016

    Boston GIS (Regina Obe, Leo Hsu)

    GeoHipster Interview with Regina Obe

    GeoHipster interview with me came out today. Covers how I stumbled into database programming and my work on PostGIS, PostgreSQL and pgRouting. Interview with Regina Obe

    by Regina Obe (nospam@example.com) at August 02, 2016 03:45 AM

    July 28, 2016

    Bill Dollins

    Personal Geospatial Workflows, July 2016 Edition

    It’s hard to believe, but I last touched upon this topic over two years ago, when my family and I were living in our between-houses rental. One of the goals I had when building our current house was to create a space where I could more effectively work from home. To that end, I have a dedicated office that I’ve been working toward optimizing for my technical work.

    One advantage of a dedicated space, which I did not anticipate at the time, is compartmentalization. One of the dangers of working at home is the blurring of the boundary between work time and personal/family time. In our old house, I definitely felt that as I worked from the dining room table. Now, I can more effectively shut the door and step away. I’m not perfect at doing that yet, but I am getting better.


    As a consultant doing federal work, I don’t get to work off-site all the time. I’ve been fortunate, however, to have worked a few projects over the past couple of years that have allowed it, so I’ve taken advantage of it as much as possible.

    In the past two years, the nature of my work has shifted dramatically. I am still doing my core federal work, but I have diversified the smaller projects I take on independently. This has been primarily to keep my technical skills as diverse as possible. I still do a lot of .Net work for federal customers but, on the other hand, I am also doing a fair amount of geospatial web development these days, centered around a core stack of PostGIS, Node, and Leaflet. Other tools get tossed in based on requirements, but those core technologies are common.

    As a result, I found myself doing development in a number of different environments with a variety of tools. (Due to customer preferences, for example, I have projects that use Visual Studio 2010, 2012, and 2015.) While PostgreSQL is my mainstay, I also work in SQL Server (differing versions there, too).

    I had two machines, one company-provided and one personal, to attack these various configurations. Both were severely under-powered for what I was doing, so I recently purchased a new system to unite all of my project work in one place, and allow myself to make use of virtualization in order to segment off the various development and deployment configurations I support.

    I knew I wanted an Ubuntu Linux host machine, primarily because it’s what I prefer on a daily basis. I opted to buy a pre-built system from System76 versus building one myself. I’ve done that a few times, but simply don’t have the time for it anymore and was attracted to the convenience of having someone else do it for me.

    I customized the System76 Wild Dog Pro system, expanding the RAM to 64GB, the system drive to a 250GB M.2 SSD, and adding a 4TB hard drive for additional storage. This configuration, especially the RAM, allows me a lot of flexibility for configuring and running project VMs in VirtualBox, while not taxing the performance of the host machine.

    My plan is to keep the host system relatively clean. I have installed the following directly to the host:

    • PostgreSQL/PostGIS/pgAdmin III
    • QGIS
    • GDAL/OGR
    • Chrome
    • Chrome Remote Desktop (more on that later)

    I use PostGIS so commonly that I decided to install it directly onto the host and allow project VMs to access it over the network. This setup actually emulates a typical deployment pattern pretty well, in addition to giving me just one installation to manage. I have set up a tablespace to direct the data storage to the 4TB hard drive.

    That’s it. Everything else will live in a VM. As a result, I’m storing a lot of ISO images of operating systems and application installers, but they live on a USB 3 external drive. I’m currently in the process of creating my baseline VMs, including the one that will be my primary ArcGIS machine. (The image above shows that I’ve already built a Windows 10 baseline.) I have also imaged my current development laptop to a VM using the VMware Infrastructure Client. That will help with the transition of my development environment into this new setup.

    You probably noticed that I purchased a desktop workstation, but that I am currently using a laptop. I have always been a laptop guy because I liked the mobility and was willing to sacrifice some power for it. Two big changes for me are that I don’t move around as much and that I have had a need for more power. I had been thinking of making the switch for a couple of years and finally decided to make the jump.

    What about the mobility? I still need it occasionally, which is why Chrome Remote Desktop is in the mix. I’ve tried it a few times and it seems to work well. In my opinion, it works much better than VNC, giving me a Windows RDP type of experience. I’ll be putting it to the test very soon as I attempt to work remotely from 1,100 miles away. I don’t have the world’s greatest ISP, and I don’t have a UPS yet, so this could fail spectacularly, but I’ll give it a go.

    I expect it to take a couple of weeks to get my feet back under me after this transition, but I’ll post an update soon.

    by Bill Dollins at July 28, 2016 05:02 PM

    July 05, 2016

    UpStats blog (Stefan Petrea)

    Geolocation using multiple services

    UPDATE: A Chinese version of this post is available here.


    In a previous post I wrote about PostGIS and ways of querying geographical data.

    This post will focus on building a system that queries free geolocation services 1 and aggregates their results.


    In summary, we're making requests to different web services (or APIs), we're doing reverse geocoding on the results and then we'll aggregate them.

    Comparing geonames and openstreetmap

    To relate to the previous post, here are some differences between geonames and openstreetmap:

    criterion                          OSM                 geonames
    size                               50 GB compressed    309 MB compressed
    entities                           3.25 billion        11 million
    has administrative area data       yes                 yes
    has lat/long city data             yes                 yes
    has neighbourhood/district data    yes                 yes
    has region/area polygonal areas    yes                 no
    has intracity-level metadata       yes                 no
    has terrain metadata               yes                 no
    They are meant for different purposes. Geonames is meant for city/administrative area/country data and can be used for geocoding. Openstreetmap has much more detailed data (one could probably extract the geonames data from openstreetmap) and can be used for geocoding, route planning and more.

    Asynchronous requests to geolocation services

    We're using the gevent library to make asynchronous requests to the geolocation services.

    import urllib2
    import gevent
    import gevent.greenlet
    from gevent import monkey; gevent.monkey.patch_all()

    geoip_service_urls = [
            ['geoplugin'    , 'http://www.geoplugin.net/json.gp?ip={ip}' ],
            ['ip-api'       , 'http://ip-api.com/json/{ip}'              ],
            ['nekudo'       , 'https://geoip.nekudo.com/api/{ip}'        ],
            ['geoiplookup'  , 'http://api.geoiplookup.net/?query={ip}'   ],
    ]

    # fetch url in asynchronous mode (makes use of gevent)
    def fetch_url_async(url, tag, timeout=2.0):
        data = None
        try:
            opener = urllib2.build_opener(urllib2.HTTPSHandler())
            opener.addheaders = [('User-agent', 'Mozilla/')]
            data = urllib2.urlopen(url, timeout=timeout).read()
        except Exception, e:
            pass
        return [tag, data]

    # expects req_data to be in this format: [ ['tag', url], ['tag', url], .. ]
    def fetch_multiple_urls_async(req_data):
        # start the threads (greenlets)
        threads_ = []
        for u in req_data:
            (tag, url) = u
            new_thread = gevent.spawn(fetch_url_async, url, tag)
            threads_.append(new_thread)
        # wait for threads to finish
        gevent.joinall(threads_)
        # retrieve threads return values
        results = []
        for t in threads_:
            results.append(t.get(block=True, timeout=5.0))
        return results

    def process_service_answers(location_data):
        # 1) extract lat/long data from responses
        # 2) reverse geocoding using geonames
        # 3) aggregate location data
        #    (for example, one way of doing this would
        #     be to choose the location that most services
        #     agree on)
        pass

    def geolocate_ip(ip):
        urls = []
        for grp in geoip_service_urls:
            tag, url = grp
            urls.append([tag, url.format(ip=ip)])
        results = fetch_multiple_urls_async(urls)
        answer = process_service_answers(results)
        return answer
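    One simple way to fill in the aggregation step that process_service_answers outlines is a majority vote over normalized results. The sketch below is illustrative only: the aggregate_locations helper and the (city, region, country) tuple format are assumptions, not part of the original code.

```python
from collections import Counter

def aggregate_locations(locations):
    """Pick the location that most services agree on.

    `locations` holds one (city, region, country) tuple per
    service (None for a service that failed); ties are broken
    arbitrarily.
    """
    counts = Counter(loc for loc in locations if loc is not None)
    if not counts:
        return None
    best, _votes = counts.most_common(1)[0]
    return best

# Three of four services agree on (Clinton, MI, US):
result = aggregate_locations([
    ('Clinton', 'MI', 'US'),
    ('Clinton', 'MI', 'US'),
    ('Clinton', 'NY', 'US'),
    None,
])
```

    A weighted vote (trusting some services more than others) would drop in the same way, by replacing the Counter with per-service weights.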

    City name ambiguity

    Cities with the same name within the same country

    There are many cities with the same name within a country, in different states/administrative regions. There's also cities with the same name in different countries.

    For example, according to Geonames, there are 24 cities named Clinton in the US (in 23 different states, with two cities named Clinton in the same state of Michigan).

    WITH duplicate_data AS (
        SELECT city_name,
        array_agg(ROW(country_code, region_code)) AS dupes
        FROM city_region_data
        WHERE country_code = 'US'
        GROUP BY city_name, country_code
        ORDER BY COUNT(ROW(country_code, region_code)) DESC
    )
    SELECT city_name,
    ARRAY_LENGTH(dupes, 1) AS duplicity,
    ( CASE WHEN ARRAY_LENGTH(dupes,1) > 9
      THEN CONCAT(SUBSTRING(ARRAY_TO_STRING(dupes,','), 1, 50), '...')
      ELSE ARRAY_TO_STRING(dupes,',') END
    ) AS sample
    FROM duplicate_data
    LIMIT 5;
    city_name    duplicity  sample
    Clinton      24         (US,NY),(US,AR),(US,NC),(US,MA),(US,MD),(US,OH),(U…
    Franklin     19         (US,ME),(US,MA),(US,NC),(US,TX),(US,NC),(US,LA),(U…
    Springfield  19         (US,MN),(US,KY),(US,SD),(US,MI),(US,VA),(US,IL),(U…
    Madison      18         (US,CT),(US,MN),(US,NJ),(US,ME),(US,SD),(US,FL),(U…
    Greenville   18         (US,NC),(US,SC),(US,MS),(US,KY),(US,RI),(US,ME),(U…

    Cities with the same name in the same country and region

    Worldwide, even in the same region of a country, there can be multiple cities with the exact same name.

    Take for example Georgetown, in Indiana. Geonames says there are 3 towns with that name in Indiana. Wikipedia says there are even more:

    WITH duplicate_data AS (
        SELECT city_name,
        array_agg(ROW(country_code, region_code)) AS dupes
        FROM city_region_data
        WHERE country_code = 'US'
        GROUP BY city_name, region_code, country_code
        ORDER BY COUNT(ROW(country_code, region_code)) DESC
    )
    SELECT city_name,
    ARRAY_LENGTH(dupes, 1) AS duplicity,
    ( CASE WHEN ARRAY_LENGTH(dupes,1) > 9
      THEN CONCAT(SUBSTRING(ARRAY_TO_STRING(dupes,','), 1, 50), '...')
      ELSE ARRAY_TO_STRING(dupes,',') END
    ) AS sample
    FROM duplicate_data
    LIMIT 4;
    city_name      duplicity  sample
    Plantation     3          (US,FL),(US,FL),(US,FL)
    Georgetown     3          (US,IN),(US,IN),(US,IN)
    Robinwood      3          (US,MD),(US,MD),(US,MD)
    Prospect Park  2          (US,NJ),(US,NJ)

    Reverse geocoding

    Both (city_name, country_code) and (city_name, country_code, region_name) tuples have failed as candidates to uniquely identify a location.

    We would have the option of using zip or postal codes to disambiguate, but most geolocation services don't return them.

    But most geolocation services do offer longitude and latitude, and we can use those to eliminate ambiguity.

    Geometric data types in PostgreSQL

    I looked further into the PostgreSQL docs and found that it also has geometric data types and functions for 2D geometry. Out of the box you can model points, boxes, paths, polygons, circles, you can store them and query them.

    PostgreSQL has some additional extensions in the contrib directory. They are available out of the box with most Postgres installs.

    In this situation we're interested in the cube and earthdistance extensions 2. The cube extension allows you to model n-dimensional vectors, and the earthdistance extension uses 3-cubes to store vectors and represent points on the surface of the Earth.
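    To build some intuition for that representation, here is a rough pure-Python sketch of the mapping ll_to_earth performs, assuming a spherical Earth (earthdistance's default radius of 6378168 m is used below; the function name ll_to_earth_py is ours, for illustration only):

```python
import math

EARTH_RADIUS_M = 6378168.0  # earthdistance's default earth() radius, in metres

def ll_to_earth_py(lat, lon):
    """Approximate Python analogue of earthdistance's ll_to_earth():
    map latitude/longitude in degrees to a 3-D point on the sphere."""
    lat_r, lon_r = math.radians(lat), math.radians(lon)
    return (EARTH_RADIUS_M * math.cos(lat_r) * math.cos(lon_r),
            EARTH_RADIUS_M * math.cos(lat_r) * math.sin(lon_r),
            EARTH_RADIUS_M * math.sin(lat_r))

# The north pole maps (up to floating-point rounding) to (0, 0, R):
pole = ll_to_earth_py(90.0, 0.0)
```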

    We'll be using the following:

    • the earth_distance function, which computes the great-circle distance between two points
    • the earth_box function, which checks whether a point lies within a certain distance of a reference point
    • a gist expression index on the expression ll_to_earth(lat, long), to make spatial queries fast and find nearby points
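    As a sketch of what earth_distance computes, here is the haversine great-circle formula in pure Python, under the same spherical-Earth assumption (this is our own illustration, not the extension's implementation):

```python
import math

EARTH_RADIUS_M = 6378168.0  # same radius as earthdistance's default

def great_circle_m(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in metres, analogous to
    earth_distance(ll_to_earth(lat1,lon1), ll_to_earth(lat2,lon2))."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# Roughly 300 km from Boston to New York:
d = great_circle_m(42.3601, -71.0589, 40.7128, -74.0060)
```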

    Designing a view for city & region data

    Geonames data was imported into 3 tables: geo_geoname, geo_admin1, and geo_countryinfo.

    Then we create a view that pulls everything together 3. We now have population data, city/region/country data, and lat/long data, all in one place.

    CREATE OR REPLACE VIEW city_region_data AS (
        SELECT
            b.country AS country_code,
            b.asciiname AS city_name,
            a.name AS region_name,
            b.region_code AS region_code,
            b.population AS population,
            b.latitude AS city_lat,
            b.longitude AS city_long,
            c.name    AS country_name
        FROM geo_admin1 a
        JOIN (
            SELECT *, (country || '.' || admin1) AS country_region, admin1 AS region_code
            FROM geo_geoname
            WHERE fclass = 'P'
        ) b ON a.code = b.country_region
        JOIN geo_countryinfo c ON b.country = c.iso_alpha2
    );

    Designing a nearby-city query and function

    In the most nested SELECT, we're only keeping the cities in a 23km radius around the reference point, then we're applying a country filter and city pattern filter (these two filters are optional), and we're only getting the closest 50 results to the reference point.

    Next, we're reordering by population, because geonames sometimes has districts and neighbourhoods around bigger cities 4 and does not mark them in a specific way, so we want to select the larger city rather than a district. (For example, if the geolocation service returned a lat/long that resolves to one district of a larger metropolitan area, I'd like to resolve it to the larger city it's associated with.)

    We're also creating a gist index (the @> operator will make use of the gist index) which we're using to find points within a radius of a reference point.

    This function takes a point (using latitude and longitude) and returns the city, region and country that is associated with that point.

    CREATE INDEX geo_geoname_latlong_idx ON geo_geoname USING gist(ll_to_earth(latitude,longitude));

    CREATE OR REPLACE FUNCTION geo_find_nearest_city_and_region(
        latitude double precision,
        longitude double precision,
        filter_countries_arr varchar[],
        filter_city_pattern  varchar
        ) RETURNS TABLE(
        country_code varchar,
        city_name varchar,
        region_name varchar,
        region_code varchar,
        population bigint,
        _lat double precision,
        _long double precision,
        country_name varchar,
        distance numeric
        ) AS $$
    BEGIN
        RETURN QUERY
        SELECT *
        FROM (
            SELECT *
            FROM (
                SELECT c.*,
                       ROUND(earth_distance(
                           ll_to_earth(c.city_lat, c.city_long),
                           ll_to_earth(latitude, longitude)
                          )::numeric, 3) AS distance_
                FROM city_region_data c
                WHERE earth_box(ll_to_earth(latitude, longitude), 23000) @> ll_to_earth(c.city_lat, c.city_long) AND
                      (filter_countries_arr IS NULL OR c.country_code=ANY(filter_countries_arr)) AND
                      (filter_city_pattern  IS NULL OR c.city_name LIKE filter_city_pattern)
                ORDER BY distance_ ASC
                LIMIT 50
            ) d
            ORDER BY d.population DESC
        ) e
        LIMIT 1;
    END;
    $$ LANGUAGE plpgsql;


    We started from the design of a system that queries multiple geoip services, gathers the data, and then aggregates it to get a more reliable result.

    We first looked at some ways of uniquely identifying locations.

    We've then picked a way that would eliminate ambiguity in identifying them. In the second half, we've looked at different ways of structuring, storing and querying geographical data in PostgreSQL.

    Then we've built a view and a function to find cities near a reference point which allowed us to do reverse geocoding.



    By using multiple services (and assuming they use different data sources internally) after aggregation, we can have a more reliable answer than if we were using just one.

    Another advantage here is that we're using free services: no setup is required, and we don't have to take care of updates, since these services are maintained by their owners.

    However, querying all these web services will be slower than querying a local geoip database. City/country/region geolocation databases are available for local use; examples include GeoIP2 from MaxMind, IP2Location, and DB-IP.


    There's a nice post here using the earthdistance module to compute distances to nearby or far away pubs.


    Geonames has geonameIds as well, which are geonames-specific ids we can use to accurately refer to locations.


    Geonames does not have polygonal data about cities/neighbourhoods or metadata about the type of urban area (as openstreetmap does), so you can't query for the city polygons (as opposed to districts/neighbourhoods) that contain a given point.

    July 05, 2016 09:00 PM

    June 18, 2016

    Stephen Mather

    Using PostGIS for Hydrologic Modeling (reblog)

    The Problem We have to filter out the roads and ditches without removing streams that cross roads or follow them closely. I’m going to use PostGIS to find the intersection of the streams lines data with a buffered roads polygon. If the intersected line is less than 50% of the length of the stream line, […]

    via Filtering Roads from Extracted Streams Data — GeoKota

    by smathermather at June 18, 2016 01:57 AM

    June 04, 2016

    Boston GIS (Regina Obe, Leo Hsu)

    FOSS4GNA 2016: pgRouting - A Crash course video is out

    Leo's pgRouting : a Crash Course video made it thru great. Better than mine. Leo doesn't believe in slides, so this is all live demo stuff. The data he used in the video is part of our code/data download for pgRouting: A Practical Guide.

    Continue reading "FOSS4GNA 2016: pgRouting - A Crash course video is out"

    by Regina Obe (nospam@example.com) at June 04, 2016 07:05 AM

    May 31, 2016

    Boston GIS (Regina Obe, Leo Hsu)

    FOSS4GNA 2016 PostGIS Spatial Tricks video is out

    The videos for FOSS4G NA 2016 have started coming out. Recently Andrea Ross posted PostGIS Spatial Tricks talk video. I'm happy to say it looks pretty good and I didn't suck as badly as I worried I would. Thank you very much Andrea. Some talks unfortunately did not come thru. I'm hoping Leo's pgRouting : a Crash Course video made it thru okay as well, and will post that later if it does.

    Only small little nit-picks is the first 2-5 minutes or so didn't make it thru and the blue colors on the slides got a little drowned out, but here are the slides if you need full resolution.

    Continue reading "FOSS4GNA 2016 PostGIS Spatial Tricks video is out"

    by Regina Obe (nospam@example.com) at May 31, 2016 08:36 PM