Welcome to Planet PostGIS

October 17, 2016

BigSQL Holly

Two great things that taste great together: PostGIS and PostgreSQL


In my past life as a geospatial DBA, I had to navigate users and managers who belonged to different database teams (PostgreSQL, MySQL, SQL Server, Oracle, etc.). When I was lucky enough to work in a shop that supported open source solutions, I was often asked by members of team MySQL, “Why should we use PostgreSQL as our geospatial database when MySQL has spatial data types?”

The answer: PostGIS.

Sure, MySQL has the ability to run spatial analysis (with some prodding). But PostgreSQL + PostGIS wins in:

  • functionality
  • performance
  • adoption by 3rd party solutions (QGIS, ArcGIS Server, GeoServer…)

So if you like to do things the free and easy way, go with PostgreSQL and PostGIS.

by Holly Orr at October 17, 2016 02:59 PM

October 10, 2016

UpStats blog (Stefan Petrea)

Off the streets, Land subdivision in PostGIS - Part 2


In a previous post, we've seen how PostGIS and OpenStreetMap can be used to leverage geographical data.

A common task in GIS is land subdivision. This involves taking different shapes, polygons usually, and cutting them into smaller polygons. This is often used in urban planning.

This post will focus on describing an algorithm for partitioning a land polygon into parcels of a given area. Parcels will be cut off from the initial polygon until no more parcels can be formed.

This is part 2 of a series.

Algorithm description

We have a polygon P and the nearest road R. We get the bounding box B for P and all our searches for cutting points/lines will be confined to B. We compute the extreme points on the bounding box and we label them with the cardinal directions they represent.

Our goal is to cut a corner C (also called subdivision) from P such that it contains the nearest boundary point to a road. The cut will be done using two lines, one horizontal and one vertical. We want C to be of a given area A.

Now in order to find the corner to cut, we look at the extreme points and check which one of them is closest to a road.

We'll use sweeping-lines to find the parcel we need. Any sweeping-lines mentioned will be moving away from the corner (in other words, away from the closest road point).

In what follows, we assume the north-west corner needs to be cut.

We place an inset (a horizontal line) that will be located sqrt(A) to the south (relative to the north edge). The inset is positioned there because we anticipate the target area to be in the form of a square.

If the area above the inset (the one we aim for) is larger than our target, we split the polygon, take the upper half and use another sweeping line that goes from west to east, to find another cutting line that allows us to find a parcel with the required area.

If the area above the inset is insufficient (below the target area), we search for a better position for it, using binary search, along the north-south direction.

Additional details: the cut search using the inset is designed so that we avoid getting thin horizontal strips when the initial polygon is a square or rectangle (and it is expected to be a square in the vast majority of cases). This simulates a parcel shape appropriate for building a house.
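
To make that inset step concrete, here is a minimal sketch (the land table, its poly column and the id filter are illustrative names, and the 8000 m² target matches the test run described later): it places a horizontal cut sqrt(A) south of the north edge and measures the area of the piece above it.

-- Illustrative sketch: measure the area above a horizontal inset placed
-- sqrt(A) south of the north edge of a planar-projected polygon.
WITH params AS (
    SELECT poly,
           ST_XMin(poly) AS xmin,
           ST_XMax(poly) AS xmax,
           ST_YMax(poly) - sqrt(8000.0) AS y_cut   -- A = 8000 square meters
    FROM land
    WHERE id = 1
), pieces AS (
    SELECT (ST_Dump(ST_Split(poly,
               ST_MakeLine(ST_MakePoint(xmin - 1, y_cut),
                           ST_MakePoint(xmax + 1, y_cut))))).geom AS piece
    FROM params
)
SELECT piece, ST_Area(piece) AS area
FROM pieces
ORDER BY ST_YMax(piece) DESC;   -- the first row is the part above the inset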

Details about corner cases (other than NW which was covered above):

  • NE corner: horizontal goes north->south and vertical goes east->west
  • SE corner: horizontal goes south->north and vertical goes east->west
  • SW corner: horizontal goes south->north and vertical goes west->east

So the sweep lines always move away from the corner.

After a parcel with the target area is found, it is cut off from the polygon, the original polygon in the GIS database is updated, and the new parcel is inserted separately from the original polygon that was split.


OSM was used to extract test data. Specifically, in the run below, data for Herastrau Park in Bucharest was used. In OSM, the park is represented as a series of polygons, and one of these polygons was partitioned into parcels, each measuring 8000 square meters.

On the left side you can see the actual partition. On the right side, the same partition is displayed, except we also have some of the objects used throughout development, including bounding boxes, and extreme boundary points.

And to visualize how the algorithm works, here's an animation of how it partitions the polygon, parcel by parcel:


  • While working on this algorithm implementation, at one point it was required to find the extreme boundary points and assign them cardinal orientation labels. A first approach to determining the cardinal direction of the boundary points was to order them by ST_Azimuth relative to the centroid of the polygon (with the centroid passed to ST_Azimuth as the circle center). This is a pitfall, because clockwise order in no way guarantees membership to the N, E, S, W edges of the bounding box: it's perfectly possible for two such extreme points to belong to the north edge, one to the west edge, and one to the south edge.
  • As it turns out, splitting the polygon multiple times and using ST_Union to glue the remaining parts back together will create MULTIPOLYGON structures, which ST_ExteriorRing is not compatible with. So a convex hull was used in order to get one single polygon and compute the bounding box for it. This problem only surfaces after repeatedly cutting the input polygon.
  • Due to lack of knowledge about QGIS, some logic for an SVG-based visualization was written in order to check the position of the cuts and the shape of the polygon after the cuts were made. Although QGIS has a virtual layer feature, that was quite hard to use.
  • Early on, there was a need to visualize some objects from OSM in order to assess whether they were fit for test data. It was easier to create a VIEW of those objects in PostgreSQL and then visualize that in QGIS (it was a bit hard to deal with all the menus in QGIS).
  • It would've been nice to have some automatically generated test polygons for each position of the road relative to the polygon, and some polygons that are thinner near the top and wider near the bottom. However, the polygons extracted from OSM were used instead.
  • To get the extreme points of the polygon, the first approach was to intersect the exterior ring of the polygon with the exterior ring of the polygon's envelope (its bounding box). This proved to be complicated, and would add complexity if one of the polygon's edges were axis-parallel, since the result of the intersection would then be a line and not a point. The second approach was much simpler: decompose the polygon into its vertices (using ST_DumpPoints) and pick the northmost, eastmost, westmost and southmost vertices, as sketched below. While this works, it only works for POLYGON and MULTIPOLYGON; it's not expected to work for CURVEPOLYGON.
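
A compact version of that second approach could look like this (again, land and poly are illustrative names):

-- Decompose the polygon into vertices and pick the four extreme ones.
WITH pts AS (
    SELECT (ST_DumpPoints(poly)).geom AS pt
    FROM land
    WHERE id = 1
)
SELECT
    (SELECT pt FROM pts ORDER BY ST_Y(pt) DESC LIMIT 1) AS north_pt,
    (SELECT pt FROM pts ORDER BY ST_Y(pt) ASC  LIMIT 1) AS south_pt,
    (SELECT pt FROM pts ORDER BY ST_X(pt) DESC LIMIT 1) AS east_pt,
    (SELECT pt FROM pts ORDER BY ST_X(pt) ASC  LIMIT 1) AS west_pt;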


While implementing the logic for each corner-cut, it became more obvious that the logic for one corner would be enough, and the entire drawing could be rotated, the cut made, and then the resulting pieces rotated back. This could have decreased the size of the implementation.

On a different note, some of the parcels created have a shape that is not exactly square, so there is room for improvement there too.

Some polygons may have U-shapes and if the closest-point on the boundary is one of the two tips of the U shape, the parcel that gets cut could turn out to be one made up of disconnected pieces.

Someone pointed out that there should also be access paths/roads, but that wasn't part of the problem statement. Another aspect that was not taken into consideration is surface inclination; it's assumed that the subdivided area is flat.


In this post we've seen how PostGIS can be leveraged to build a land subdivision algorithm. The implementation of this algorithm is available on GitHub under the MIT license.



Satoshi Koda has a great blog post about this, and that was very useful reading while writing the SVG visualization for this project. His new blog is hosted here.

October 10, 2016 09:00 PM

October 06, 2016

PostGIS Development

PostGIS 2.2.3 Released

The PostGIS development team is pleased to announce the release of PostGIS 2.2.3. As befits a patch release, the focus is on bugs and breakages.

Continue Reading by clicking title hyperlink ..

by Regina Obe at October 06, 2016 12:00 AM

October 04, 2016

BigSQL Holly

Why are you still paying for your GIS when you can get it for free?

If you haven’t switched to open source solutions (FREE!), then you have probably fallen for some common misconceptions/myths:


MYTH: Open source tools are buggy. 

Does software have bugs? Of course. It was made by humans! But open source has the benefit of a strong community that can crowdsource fixes because the code base is not proprietary (open source!). And, by the way, if you have been using the most popular proprietary GIS software (wink, wink) for more than 5 years, you know all about bugs.

MYTH: Free GIS software has a limited toolbox for analysis. 

Well, this depends. Let's refer to the 80/20 rule here. When applied to GIS, 80% of your analysis can be processed with 20% of the tools/algorithms available for a spatially enabled database. If you are a bada$$ spatial stats expert and love your proprietary tools, then by all means stick with that expensive stack (but also take a look at PySAL). But if you are like most of us (roughly 80%), you can accomplish your analysis with the FREE stack.

MYTH: Open source tools are impossible to install and administer.

Granted, this has been true in the past. But the open source community has made great strides in creating tools that don't require you to have an engineering degree to stand up a fully functioning GIS stack. And, because the community is large (and committed), you can find a ton of tutorials and documentation on the web.

MYTH: PostGIS is fine for hobbyists, but it can’t support corporate needs.

Actually, more and more companies and government agencies are turning to PostgreSQL/PostGIS for their geospatial needs: Uber, FourSquare, NOAA, and Carto just to name a few.

In upcoming posts we will show you how to install new open source (FREE!) tools from BigSQL (powered by OpenSCG) to help you build your open source GIS stack.

Just think of all the cool stuff you can buy from the Public Lab with the money you will save…

by Holly Orr at October 04, 2016 02:26 PM

September 26, 2016

PostGIS Development

PostGIS 2.3.0 Released

The PostGIS development team is pleased to announce the release of PostGIS 2.3.0.
This is the first version to utilize the parallel support functionality introduced in PostgreSQL 9.6. As such, if you are using PostgreSQL 9.6, we strongly encourage you to use this version.

Parallel support will make many queries using PostGIS relationship operators and functions faster. In order to take advantage of parallel query support, make sure to set max_parallel_workers_per_gather to something greater than 0, as noted in the PostgreSQL runtime configuration documentation for max_parallel_workers_per_gather.
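
For example, a quick way to check whether a spatial join picks up a parallel plan; the table names below are illustrative and the worker count should be tuned to your hardware:

-- Allow up to 4 parallel workers for this session, then look for
-- "Gather" nodes in the plan of a PostGIS-heavy query.
SET max_parallel_workers_per_gather = 4;

EXPLAIN ANALYZE
SELECT p.id, b.id
FROM parcels p
JOIN buildings b ON ST_Intersects(p.geom, b.geom);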

Best served with PostgreSQL 9.6+, which is due out this week, and pgRouting 2.3.0, which also just got released.

Packages from maintainers will be out in the coming days and weeks.

Continue Reading by clicking title hyperlink ..

by Regina Obe at September 26, 2016 12:00 AM

September 21, 2016


Back from FOSS4G 2016 - part 2

FOSS4G, the main international OSGeo event, takes place once a year and gathers all the folks involved in open source geospatial technologies at large.

This year, the conference took place in Bonn, Germany, and gathered around 900-1000 people. This edition was really good, as always for this conference. A lot of social events allowed direct talks with passionate people from all over the world, and the presentations were diverse and of high quality.

Oslandia at FOSS4G

Oslandia is a full player in the OpenSource GIS field, and we participated this year with various interventions.

QGIS 3: plans, wishes and challenges

QGIS is a very powerful GIS environment. More and more features have kept coming in the 2.0 branch, thanks to a growing number of users, developers and funders. But it is occasionally time to look up and envision the future to make sure this growth of energy is used at its full potential, especially to make sure new features are not added to a base that will become hard to maintain or evolve. Discussions and active work have already been done about how to transition away from the obsolescence of Python 2 and Qt 4. Some API breaks will have to occur and this is an opportunity to include major changes, both for users and for developers. This talk will present some of the changes that are planned or wished for the 3.0 version of QGIS and will detail challenges that remain to see them exist, from a technical, organisational or economical point of view.

Slides: qgis_3.pdf.

Using PostGIS in a real advanced way !

A lot of people use PostGIS as a basic GIS toolbox, but very few use it in a real advanced way.

To progress towards full PostGIS power, we can first make use of advanced native PostGIS functions, and of some extensions related to PostGIS, such as SFCGAL (for 3D data management), PostGIS Raster, PgPointCloud or even the latest pgsql-postal (for address normalization).

Then we can mix PostGIS functions with advanced standardized SQL features provided by PostgreSQL 9.x itself (CTE, Window functions, FDW, join and aggregate pushdowns…).

Even better, use PostgreSQL bindings for data analysis languages such as R or Python to create your own dedicated function set, and integrate them into your SQL queries.
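
As a small illustration of that kind of mix, here is a sketch that combines a CTE, a window function and PostGIS predicates to keep, for each city, its three closest points of interest (the cities and pois tables and a metric projection are assumptions):

-- For each city, rank nearby points of interest by distance and keep the top 3.
WITH candidates AS (
    SELECT c.name AS city,
           p.name AS poi,
           ST_Distance(c.geom, p.geom) AS dist,
           row_number() OVER (PARTITION BY c.name
                              ORDER BY ST_Distance(c.geom, p.geom)) AS rnk
    FROM cities c
    JOIN pois p ON ST_DWithin(c.geom, p.geom, 5000)   -- 5 km, assuming meter units
)
SELECT city, poi, dist
FROM candidates
WHERE rnk <= 3
ORDER BY city, dist;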

Slides are available on github.

iTowns, a new framework for 3D web visualization

We present iTowns, a web framework developed in JavaScript / WebGL for 3D geospatial data visualization, with capabilities for precise measurement directly in the browser. The first use case of iTowns is street-view data visualization: immersive images, but also terrestrial LIDAR point cloud data. iTowns now supports many more data types:

  • Oriented images
  • Panoramic images
  • Point Clouds
  • 3D textured models
  • WFS vector data

iTowns OpenSource is the descendant of the initial iTowns software developed at the MATIS research laboratory of the French National Mapping Agency (IGN). iTowns OpenSource version 1.0 was released in February 2016.

The framework allows you to:

  • Visualize projected images on a mesh (cube, 3D model)
  • Visualize panoramic images
  • Display depth panoramic images
  • Display extruded buildings (from WFS and other sources)
  • Navigate in 3D (click & go)
  • Display point clouds
  • Visualize textured 3D models (B3D, 3DS)
  • Use a simple API

We detail iTowns features with videos. The data showcased was acquired by IGN's Stereopolis car. Aside from presenting the software, its present state and the future 2.0 version, we also explain the project history, which is an interesting case of technology transfer from research to industry.


OpenSource tools for water network management

This presentation details some OpenSource tools dedicated to water network management, be it for water distribution or wastewater networks.

The qWAT project is a specific tool based on QGIS and PostGIS. It aims at managing water distribution networks. The data model is part of the project and covers most use cases for this kind of asset. The qWAT project is strongly linked to QGIS, and tries to contribute to the core of QGIS so as to mutualize developments and features among other QGIS-based applications.

Similarly, the QGEP project is dedicated to wastewater networks. We also present a use case for an implementation of a wastewater information system in France, based on QGIS and PostGIS.

Furthermore, we show how PostGIS-based projects make it possible to do network and graph analysis, so as to extract meaningful information for decision-making and planning.

QGIS-Epanet and QGIS-SWMM are two QGIS Processing extensions integrating simulation features for water distribution and wastewater networks. They let the user run simulations to analyze the network, check its dimensioning, and identify specific issues.

This set of tools shows that open source GIS now tends to fulfill use cases for specific fields of application, and water management is among them.


Want to see more videos? The whole conference is available here.


by Audrey Malherbe at September 21, 2016 04:00 PM

September 20, 2016


Back from FOSS4G 2016 - part 1

FOSS4G, the main international OSGeo event, takes place once a year and gathers all the folks involved in open source geospatial technologies at large.

This year, the conference took place in Bonn, Germany, and gathered around 900-1000 people. This edition was really good, as always for this conference. A lot of social events allowed direct talks with passionate people from all over the world, and the presentations were diverse and of high quality.

Oslandia at FOSS4G

Oslandia is a full player in the OpenSource GIS field, and we participated this year with various interventions.

Workshop on Point Cloud data

This workshop featured the use of large amounts of point cloud data. We talked about databases and taught how to use PostgreSQL, PostGIS, PgPointCloud and PDAL at their best to manage your point cloud data. Workshop steps included:

  • Presentation of the components and their principles
  • Getting the components ready
  • Loading the point cloud data inside the database
  • Data manipulation with PDAL
  • Querying the point cloud data in the database
  • Mixing point cloud data with other data type ( 2D, 3D)
  • Performance issues & indexing
  • Visualizing your data

Workshop materials are available on github.

Workshop on 3D data

This workshop was an introduction to 3D geospatial infrastructure. It walks you through serving 3D data from a spatial database to a 3D web visualization client. The open source components used in this workshop: PostGIS, to store and manipulate 3D data (buildings); building-server, a web server streaming 3D data; and iTowns, the 3D web visualization framework.

This video was recorded during the workshop.

Workshop materials are available on github.

A post to follow will give you some information about our presentations.

by Audrey Malherbe at September 20, 2016 04:00 PM

September 19, 2016

PostGIS Development

PostGIS 2.3.0rc1 Released

PostGIS 2.3.0rc1 is feature complete, so we're looking for testing and feedback! Best served with PostgreSQL 9.6rc1 and pgRouting 2.3.0-rc1.

Please give this release candidate a try and report back any issues you encounter. New things since the 2.3.0beta1 release are listed below.

Please report bugs that you find in this release.

Important / Breaking Changes

  • 3466, Casting from box3d to geometry now returns a 3D geometry (Julien Rouhaud of Dalibo)

  • 3604, pgcommon/Makefile.in orders CFLAGS incorrectly leading to wrong liblwgeom.h (Greg Troxel)

  • 3396, ST_EstimatedExtent now throws WARNING instead of ERROR (Regina Obe)

New Features and Performance Enhancements

  • Add support for custom TOC in postgis_restore.pl (Christoph Moench-Tegeder)

  • Add support for negative indexing in ST_PointN and ST_SetPoint (Rémi Cura)
  • Numerous new function additions and enhancements: New Functions and Enhancements

  • 3549, Support PgSQL 9.6 parallel query mode, as far as possible (Paul Ramsey, Regina Obe)

  • 3557, Geometry function costs based on query stats (Paul Norman)
  • 3591, Add support for BRIN indexes (Giuseppe Broccolo of 2nd Quadrant, Julien Rouhaud and Ronan Dunklau of Dalibo)
  • 3496, Make postgis non-relocatable (for extension install), schema qualify calls in functions (Regina Obe). Should resolve once and for all extension issues #3494, #3486, #3076

  • 3547, Update tiger geocoder to support TIGER 2016 and use http or ftp (Regina Obe)

See the full list of changes in the news file and please report bugs that you find in the release. Binary packages will appear in repositories over the coming weeks as packagers roll out builds.

View all closed tickets for 2.3.0.

by Regina Obe at September 19, 2016 12:00 AM

September 18, 2016

Anita Graser (Underdark)

Movement data in GIS: issues & ideas

Since I’ve started working, transport and movement data have been at the core of many of my projects. The spatial nature of movement data makes it interesting for GIScience but typical GIS tools are not a particularly good match.

Dealing with the temporal dynamics of geographic processes is one of the grand challenges for Geographic Information Science. Geographic Information Systems (GIS) and related spatial analysis methods are quite adept at handling spatial dimensions of patterns and processes, but the temporal and coupled space-time attributes of phenomena are difficult to represent and examine with contemporary GIS. (Dr. Paul M. Torrens, Center for Urban Science + Progress, New York University)

It’s still a hot topic right now, as the variety of related publications and events illustrates. For example, just this month, there is an Animove two-week professional training course (18–30 September 2016, Max-Planck Institute for Ornithology, Lake Konstanz) as well as the GIScience 2016 Workshop on Analysis of Movement Data (27 September 2016, Montreal, Canada).

Space-time cubes and animations are classics when it comes to visualizing movement data in GIS. They can be used for some visual analysis but have their limitations, particularly when it comes to working with and trying to understand lots of data. Visualization and analysis of spatio-temporal data in GIS is further complicated by the fact that the temporal information is not standardized in most GIS data formats. (Some notable exceptions of formats that do support time by design are GPX and NetCDF but those aren’t really first-class citizens in current desktop GIS.)

Most commonly, movement data is modeled as points (x,y, and optionally z) with a timestamp, object or tracker id, and potential additional info, such as speed, status, heading, and so on. With this data model, even simple questions like “Find all tracks that start in area A and end in area B” can become a real pain in “vanilla” desktop GIS. Even if the points come with a sequence number, which makes it easy to identify the start point, getting the end point is tricky without some custom code or queries. That’s why I have been storing the points in databases in order to at least have the powers of SQL to deal with the data. Even so, most queries were still painfully complex and performance unsatisfactory.

So I reached out to the Twitterverse asking for pointers towards moving objects database extensions for PostGIS and @bitnerd, @pwramsey, @hruske, and others replied. Amongst other useful tips, they pointed me towards the new temporal support, which ships with PostGIS 2.2. It includes the following neat functions:

  • ST_IsValidTrajectory — Returns true if the geometry is a valid trajectory.
  • ST_ClosestPointOfApproach — Returns the measure at which points interpolated along two lines are closest.
  • ST_DistanceCPA — Returns the distance between closest points of approach in two trajectories.
  • ST_CPAWithin — Returns true if the trajectories’ closest points of approach are within the specified distance.

Instead of points, these functions expect trajectories that are stored as LinestringM (or LinestringZM) where M is the time dimension. This approach makes many analyses considerably easier to handle. For example, clustering trajectory start and end locations and identifying the most common connections:


(data credits: GeoLife project)
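
To make the model itself concrete, here is a tiny hand-built example (coordinates and epoch timestamps are made up) exercising some of the functions listed above:

-- Two LinestringM trajectories; M carries the (strictly increasing) epoch timestamp.
WITH t AS (
    SELECT ST_GeomFromText('LINESTRING M (0 0 1473700000, 10 0 1473700060)') AS traj_a,
           ST_GeomFromText('LINESTRING M (0 5 1473700000, 10 5 1473700060)') AS traj_b
)
SELECT ST_IsValidTrajectory(traj_a)              AS a_is_valid,
       ST_ClosestPointOfApproach(traj_a, traj_b) AS cpa_measure,
       ST_DistanceCPA(traj_a, traj_b)            AS cpa_distance
FROM t;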

Overall, it’s an interesting and promising approach but there are still some open questions I’ll have to look into, such as: Is there an efficient way to store additional info for each location along the trajectory (e.g. instantaneous speed or other status)? How well do desktop GIS play with LinestringM data and what’s the overhead of dealing with it?

by underdark at September 18, 2016 03:11 PM

September 11, 2016

Jorge Arévalo

Querying raster data stored in your Carto account

Over the years, people have kept asking me the same question: can you store (and query) raster data in your Carto (formerly CartoDB) account? How?

Well, the answer is yes. And this is more a PostGIS Raster issue than a Carto issue. Let’s go.

Disclaimer: this code was written a couple of years ago. It may be outdated, but it still works.

Store raster data

First thing you need to do is easy: just drop your georeferenced raster file into the Dataset page of your account. The raster will be imported, and you'll see something like this:

Do you see that foto_pnoa table with the grayed-out name? That means you cannot click on it to see the table's data. But don't worry: your data is safe in the table. Don't trust me? Just enter another table and run this query (of course, using your own table's name):

select st_summarystats(the_raster_webmercator, 1) as stats from foto_pnoa

You should see the stats of your raster data now.

View your raster data over a map

Now, to the fun part. We’ll code a bit of JavaScript to put that data over a Carto map. We’ll even allow some calculation with the data (average raster value within a region). So, go for it.


Pretty simple and straightforward. This is the relevant part of the HTML. But don’t worry, I’ll provide a link to the complete example at the end of the post.

<div class="header">
<h1>PostGIS Raster test</h1>
<h2>Draw a figure and click on it to see the avg raster value</h2>
</div>
<div id="map"></div>

The JS

We’ll create a Leaflet map, using:

Relevant parts of the code here (again, you’ll get the complete code later)

Create the map and the controls

We create a Leaflet map and the draw controls

// Create map
var map = new L.Map('map', {
    zoomControl: true,
    center: [37.383333, -5.983333],
    zoom: 11
});

// Add CartoDB basemaps
L.tileLayer('http://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}.png', {
    attribution: '<a href="http://cartodb.com">CartoDB</a> © 2014',
    maxZoom: 18
}).addTo(map);

// Add drawn controls
var drawnItems = new L.FeatureGroup();
map.addLayer(drawnItems);

var drawControl = new L.Control.Draw({
    position: 'bottomleft',
    draw: {
        polyline: false, // Turns off this drawing tool
        marker: false,
        polygon: false,
        rectangle: {
            shapeOptions: {
                color: '#a63b55'
            },
            showArea: true
        },
        circle: {
            shapeOptions: {
                color: '#662d91'
            },
            showArea: true
        }
    },
    edit: {
        featureGroup: drawnItems
    }
});
map.addControl(drawControl);

What can I do with those controls? Let’s see

Handle draw actions

Whenever we draw a figure:

  1. Using the figure's coords, we build a PostGIS geometry by calling ST_MakeBox2D if we drew a rectangle, or ST_Buffer if we drew a circle.
  2. Run a query to check the average raster value within that geometry and show the value

Check it out in the next snippet

map.on('draw:created', function (e) {
    var type = e.layerType,
        layer = e.layer;

    var pol_pgis = null;

    switch(type) {

        // Create a Rectangle geometry in PostGIS
        case 'rectangle':
            var coords = layer.getLatLngs();

            pol_pgis = "st_transform(ST_SetSRID(ST_MakeBox2D(ST_Point(" +
                coords[1].lng + ", " + coords[1].lat + "),ST_Point(" +
                coords[3].lng + "," + coords[3].lat + ")),4326), 3857)";

            break;

        // Create a circle geometry in PostGIS
        case 'circle':
            var center = layer.getLatLng();

            pol_pgis = "st_transform(geometry(st_buffer(geography(st_setsrid(st_point(" +
                center.lng + ", " + center.lat + "), 4326)), " + layer.getRadius() + ")),3857)";

            break;

        case 'polygon':
            break;
    }

    if (pol_pgis) {
        var q = "SELECT avg((stats).mean) as m from (select st_summarystats(the_raster_webmercator, 1) as stats from foto_pnoa where st_intersects(the_raster_webmercator, " + pol_pgis + ")) as foo";

        console.log("QUERY: " + q);

        var sql = new cartodb.SQL({user: 'libregis'});

        sql.execute(q)
            .done(function(data) {
                if (data.rows && data.rows.length > 0)
                    layer.bindPopup("Average raster value inside the " + type + ": " + data.rows[0].m);
                else
                    layer.bindPopup("Could not get avg value!");
            })
            .error(function(errors) {
                layer.bindPopup("Could not get avg value!");
            });
    } else {
        layer.bindPopup("Could not get avg value!");
    }

    // Add the drawn shape to the map so its popup can be opened
    drawnItems.addLayer(layer);
});


Show the raster tiles

The final touch. We put the raster tiles over the map using the Maps API, via AJAX (the old-fashioned way, I know…)

var config = {
    "version": "1.3.1",
    "layers": [{
        "type": "cartodb",
        "options": {
            "sql": "select * from foto_pnoa",
            "cartocss": "#foto_pnoa {raster-opacity: 0.5;}",
            "cartocss_version": "2.3.0",
            "geom_column": "the_raster_webmercator",
            "geom_type": "raster"
        }
    }]
};

var request = new XMLHttpRequest();
request.open('POST', currentEndpoint(), true);
request.setRequestHeader('Content-Type', 'application/json; charset=UTF-8');
request.onload = function() {
    if (this.status >= 200 && this.status < 400) {
        var layergroup = JSON.parse(this.response);
        var tilesEndpoint = currentEndpoint() + '/' + layergroup.layergroupid + '/{z}/{x}/{y}.png';
        var protocol = 'https:' == document.location.protocol ? 'https' : 'http';
        if (layergroup.cdn_url && layergroup.cdn_url[protocol]) {
            var domain = layergroup.cdn_url[protocol];
            if ('http' === protocol) {
                domain = '{s}.' + domain;
            }
            tilesEndpoint = protocol + '://' + domain + '/' + currentUser() + '/api/v1/map/' + layergroup.layergroupid + '/{z}/{x}/{y}.png';
        }
        rasterLayer = L.tileLayer(tilesEndpoint, {
            maxZoom: 18
        }).addTo(map);
    } else {
        throw 'Error calling server: Error ' + this.status + ' -> ' + this.response;
    }
};
// Send the Maps API request with the layer configuration
request.send(JSON.stringify(config));

That’s it. You can check the final result in the next codepen

You can grab the code here


The very first version of the code was made by Raul Ochoa

by Jorge Arévalo at September 11, 2016 11:38 AM

September 06, 2016

PostGIS Development

PostGIS 2.3.0beta1 Released

PostGIS 2.3 is feature complete, so we’re looking for testing and feedback! Best served with PostgreSQL 9.6.

Please give this beta a try and report back any issues you encounter.

Please report bugs that you find in this release.

Important / Breaking Changes

  • 3466, Casting from box3d to geometry now returns a 3D geometry (Julien Rouhaud of Dalibo)

  • 3604, pgcommon/Makefile.in orders CFLAGS incorrectly leading to wrong liblwgeom.h (Greg Troxel)

New Features and Performance Enhancements

  • Add support for custom TOC in postgis_restore.pl (Christoph Moench-Tegeder)

  • Add support for negative indexing in ST_PointN and ST_SetPoint (Rémi Cura)
  • Numerous new function additions and enhancements: New Functions and Enhancements

  • 3549, Support PgSQL 9.6 parallel query mode, as far as possible (Paul Ramsey, Regina Obe)

  • 3557, Geometry function costs based on query stats (Paul Norman)
  • 3591, Add support for BRIN indexes (Giuseppe Broccolo of 2nd Quadrant, Julien Rouhaud and Ronan Dunklau of Dalibo)
  • 3496, Make postgis non-relocatable (for extension install), schema qualify calls in functions (Regina Obe). Should resolve once and for all extension issues #3494, #3486, #3076

  • 3547, Update tiger geocoder to support TIGER 2016 and use http or ftp (Regina Obe)

See the full list of changes in the news file and please report bugs that you find in the release. Binary packages will appear in repositories over the coming weeks as packagers roll out builds.

View all closed tickets for 2.3.0.

by Regina Obe at September 06, 2016 12:00 AM

August 25, 2016

Paul Ramsey

PgSQL Indexes and "LIKE"

Do you write queries like this:

SELECT * FROM users 
WHERE name LIKE 'G%'

Are your queries unexpectedly slow in PostgreSQL? Is the index not doing what you expect? Surprise! You’ve just discovered a PostgreSQL quirk.

TL;DR: If you are running a locale other than “C” (show LC_COLLATE to check) you need to create a special index to support pattern searching with the LIKE operator: CREATE INDEX myindex ON mytable (mytextcolumn text_pattern_ops). Note the specification of the text_pattern_ops operator class after the column name.

As a beginner SQL student, you might have asked “will the index make my ‘like’ query fast” and been answered “as long as the wildcard character is at the end of the string, it will.”


That statement is only true in general if your database is initialized using the "C" locale (the North America/English-friendly UNIX default). Running with "C" used to be extremely common, but is less and less so, as modern operating systems automagically choose appropriate regional locales to provide appropriate time and formatting for end users.

For example, I run Mac OSX and I live in British Columbia, an English-speaking chunk of North America. I could use “C” just fine, but when I check my database locale (via my collation), I see this:

pramsey=# show LC_COLLATE;
 lc_collate  
-------------
 en_CA.UTF-8
(1 row)

It’s a good choice, it’s where I live, it supports lots of characters via UTF-8. However, it’s not “C”, so there are some quirks.

I have a big table of data linked to postal codes; this is what the table looks like:

              Table "gis.postal_segments"
      Column       |     Type     | Modifiers 
 postal_code       | text         | not null
 segment           | character(4) | 
    "postal_segments_pkey" PRIMARY KEY, btree (postal_code)

Note the index on the postal code, a standard btree.

I want to search rows based on a postal code prefix string, so I run:

SELECT * FROM postal_segments 
WHERE postal_code LIKE 'V8V1X%';
                                              QUERY PLAN                                              
 Seq Scan on postal_segments  (cost=0.00..2496.85 rows=10 width=68) (actual time=30.320..34.219 rows=4 loops=1)
   Filter: (postal_code ~~ 'V8V1X%'::text)
   Rows Removed by Filter: 100144
 Planning time: 0.250 ms
 Execution time: 34.263 ms
(5 rows)

Ruh roh!

I have an index on the postal code, so why am I getting a sequence scan?!?! Because my index is no good for doing pattern matching in any collation other than “C”. I need a special index for that, which I create like this.

CREATE INDEX postal_segments_text_x 
  ON postal_segments (postal_code text_pattern_ops);

The magic part is at the end, invoking text_pattern_ops as the opclass for this index. Now my query works as expected:

SELECT * FROM postal_segments 
WHERE postal_code LIKE 'V8V1X%';
                                                           QUERY PLAN                                                           
 Index Scan using postal_segments_text_x on postal_segments  (cost=0.29..8.31 rows=10 width=68) (actual time=0.067..0.073 rows=4 loops=1)
   Index Cond: ((postal_code ~>=~ 'V8V1X'::text) AND (postal_code ~<~ 'V8V1Y'::text))
   Filter: (postal_code ~~ 'V8V1X%'::text)
 Planning time: 0.532 ms
 Execution time: 0.117 ms
(5 rows)

I have gotten so used to PostgreSQL doing exactly the right thing automatically that it took quite a long time to track down this quirk when I ran into it. I hope this page helps others save some time!

August 25, 2016 09:05 AM

August 16, 2016


Oslandia at FOSS4G 2016 in Bonn

We are happy to announce that Oslandia is a Bronze sponsor at FOSS4G 2016 conference in Bonn (22-26 August). FOSS4G is the annual global event of the Open Source Geospatial Foundation (OSGeo). It's the largest technical geospatial Open Source conference.

Oslandia is participating with 2 hands-on workshops and 4 presentations covering 3D, PostGIS, point clouds and network management. We invite you to visit our booth for more information, technical questions, and to meet members of our team in person.


Hope to see you there!

About FOSS4G 2016

The key topics in 2016 are :

  • Open Data
  • Remote Sensing for Earth Observation
  • Land information
  • Disaster Management

The theme of the conference is "building bridges". This supports the objective of bringing together attendees across domains and communities. The Open Source idea has a lot in common with the emerging trend to publish more digital assets as Open Data. At the same time the geospatial technology, the core of FOSS4G, is perfectly applicable to making all that great Open Data more accessible. The "Bonn Geo Summer" also brings together the geospatial and remote sensing worlds which have historically existed somewhat in parallel. Land Information in its broadest sense, be it cadastre, tenure and city planning, has a growing need for geospatial technology to help alleviate the pressure on urban areas and make the constant migration pressure into cities and large scale refugee events manageable. The fourth thematic focus lies on emergency management which also heavily relies on geospatial technology and increasingly also on geospatial Open Data.

by Audrey Malherbe at August 16, 2016 05:00 PM

August 10, 2016

Paul Ramsey

Your Broken PostGIS Upgrade

Since the Dawn of Time, people have found PostGIS upgrades difficult and confusing, and this is entirely to be expected, because a PostGIS upgrade consists of a number of interlocking parts. Sometimes, they “upgrade” their version of PostGIS and find out they’ve bricked their system. What gives?


What Makes PostGIS Work?

Before talking about upgrades, it’s important to understand how PostGIS works at all, because that understanding is key to seeing how upgrade scenarios go bad.

PostGIS is a “run-time loadable library” for PostgreSQL. That means we have a block of C code that is added to a running PostgreSQL database. That C code sits in a “library file” which is named (for the current 2.2 version): postgis-2.2.so.

Just to add to the confusion: for Windows, the name of the library file is postgis-2.2.dll. For every rule, there must be an exception. For users of Apple OSX, yes, there’s a further exception for you: even though most dynamic libraries on OSX are suffixed .dylib, the PostgreSQL modules on OSX are suffixed .so, just like their Linux counterparts.

The location of the postgis-2.2.so file will vary from system to system.

The presence of the postgis-2.2.so alone is not sufficient to “PostGIS enable” a database. PostGIS consists of a large collection of SQL functions in the database.

The SQL functions are created when you run the CREATE EXTENSION postgis command. Until that time your database knows nothing about the existence or definition of the PostGIS functions.

Once the extension is installed, you can see the definitions of the PostGIS functions in the system tables.

The use of dynamic function and type management catalogs is one of the things that makes PostgreSQL so incredibly flexible for extensions like PostGIS:

SELECT *
  FROM pg_proc 
  WHERE proname = 'st_pointonsurface';
-[ RECORD 1 ]---+--------------------
proname         | st_pointonsurface
pronamespace    | 2200
proowner        | 10
prolang         | 13
procost         | 100
prorows         | 0
provariadic     | 0
protransform    | -
proisagg        | f
proiswindow     | f
prosecdef       | f
proleakproof    | f
proisstrict     | t
proretset       | f
provolatile     | i
pronargs        | 1
pronargdefaults | 0
prorettype      | 667466
proargtypes     | 667466
proallargtypes  | 
proargmodes     | 
proargnames     | 
proargdefaults  | 
prosrc          | pointonsurface
probin          | $libdir/postgis-2.2
proconfig       | 
proacl          | 

Lots to see here, but most important bit is the entry for the probin column: $libdir/postgis-2.2. This function (like all the other PostGIS functions) is bound to a particular version of the PostGIS C library.

Those of you thinking forward can now begin to see where upgrades could potentially go wrong.

How Things Go Wrong

Package Managers

The most common way for things to go wrong is to upgrade the library on the system without upgrading the database.

So, in Red Hat Linux terms, perhaps running:

yum upgrade postgresql94-postgis

This seems straight-forward, but think about what a package manager does during an upgrade:

  • Downloads a new version of the software
  • Removes the old version
  • Copies in the new version

So, if we had PostGIS 2.1.3 installed, and the latest version is 2.2.2, what has happened?

  • The postgis-2.1.so file has been removed
  • The postgis-2.2.so file has been added
  • So, the pg_proc entries in every PostGIS-enabled database now point to a library file that does not exist

Fortunately this mismatch between the pg_proc entries and the system state is usually solved during the very next step of the upgrade. But it’s a manual step, and if the DBA and system administrator are different people with different schedules, it might not happen.

Your next step should be to go and update the SQL function definitions by running an extension update on all your databases:

ALTER EXTENSION postgis UPDATE TO '2.2.2';

If you don’t, you’ll find that none of the PostGIS functions work. That, in fact, you cannot even dump your database. The very act of outputting a representation of the geometry data is something that requires the PostGIS C library file, and until you run ALTER EXTENSION the database doesn’t know where the new library file is.


Since the use of CREATE EXTENSION postgis (available since PostgreSQL 9.1+ and PostGIS 2.0+) became commonplace, migrations now almost always “just work”, which is excellent news.

  • When you dump a modern PostGIS-enabled database, that was created using the CREATE EXTENSION postgis command, the dump file just includes a CREATE EXTENSION postgis command of its own at the top.
  • When you load the dump file into a new version of PostgreSQL even with a new version of PostGIS, the extension is created and the data magically loads.

However, there are still some old databases around that were created before the PostgreSQL extension system was invented, and when you dump them you get not only the data, but all the "custom" function and type definitions, including the definitions for PostGIS. A function definition looks like this:

CREATE OR REPLACE FUNCTION st_pointonsurface(geometry)
    RETURNS geometry
    AS '$libdir/postgis-2.2', 'pointonsurface'
    LANGUAGE 'c' IMMUTABLE STRICT;

And look what is hiding inside of it: a reference to a particular version of the PostGIS library! So you cannot simply dump your old PostGIS 1.5 database on PostgreSQL 8.4 and load it into a fresh PostGIS 2.2 database on PostgreSQL 9.5: the function definitions won’t reference the right library file.

The best bet for a really old database that was created without the extension mechanism is to use the “hard upgrade” process. The hard upgrade works by:

  • Taking a special “custom-format” back-up that includes an object catalog;
  • Filtering the back-up to clean out all the PostGIS-specific function and object definitions; and then
  • Loading the “cleaned” back-up into a new database with the desired version of PostGIS already installed (using CREATE EXTENSION postgis this time, so you never have to hard upgrade again).


In the case of upgrades that change out the underlying library and other situations that result in a mismatch between the SQL definitions in the database and the state of the system, there are a couple hacks that provide short-term fixes for emergencies:

  • Symlink the library name the database is looking for to the library name you have. So if your database wants postgis-2.1.so and all you have is postgis-2.2.so, you can ln -s postgis-2.2.so postgis-2.1.so and your database will “work” again.
  • Update the PostgreSQL catalog definitions for the functions. As a super-user, you can do all kinds of dangerous things, and one of them is to just UPDATE pg_proc SET probin = '$libdir/postgis-2.2' WHERE probin ~ 'postgis-2.1'

Both hacks “work” because the PostGIS project doesn’t change underlying function names often, and inter-version changes mostly involve adding functions to the C library, not removing old ones.

However, there’s no guarantee that an underlying function name hasn’t change between versions, it’s just unlikely. In the worst case, the function name hasn’t changed, but the parameters have, so it’s now possible that calling the function will crash your database.

All this to say: linking and SQL catalogue hacks should be used temporarily only until you can properly upgrade your database using a hard upgrade.

August 10, 2016 04:05 PM

August 02, 2016

Boston GIS (Regina Obe, Leo Hsu)

GeoHipster Interview with Regina Obe

GeoHipster interview with me came out today. Covers how I stumbled into database programming and my work on PostGIS, PostgreSQL and pgRouting. Interview with Regina Obe

by Regina Obe (nospam@example.com) at August 02, 2016 03:45 AM

July 05, 2016

UpStats blog (Stefan Petrea)

Geolocation using multiple services

UPDATE: A Chinese version of this post is available here.


In a previous post I wrote about PostGIS and ways of querying geographical data.

This post will focus on building a system that queries free geolocation services 1 and aggregates their results.


In summary, we're making requests to different web services (or APIs), we're doing reverse geocoding on the results and then we'll aggregate them.

Comparing geonames and openstreetmap

To relate to the previous post, here are some differences between geonames and openstreetmap:

criterion                         OSM                 geonames
--------------------------------  ------------------  ------------------
size                              50 GB compressed    309 MB compressed
entities                          3.25 billion        11 million
has administrative area data      yes                 yes
has lat/long city data            yes                 yes
has neighbourhood/district data   yes                 yes
has region/area polygonal areas   yes                 no
has intracity-level metadata      yes                 no
has terrain metadata              yes                 no

They are meant for different purposes. Geonames is meant for city/administrative area/country data and can be used for geocoding. Openstreetmap has much more detailed data (one could probably extract the geonames data using openstreetmap) and can be used for geocoding, route planning and more.

Asynchronous requests to geolocation services

We're using the gevent library to make asynchronous requests to the geolocation services.

import urllib2

import gevent
import gevent.greenlet
from gevent import monkey; gevent.monkey.patch_all()

geoip_service_urls = [
        ['geoplugin'    , 'http://www.geoplugin.net/json.gp?ip={ip}' ],
        ['ip-api'       , 'http://ip-api.com/json/{ip}'              ],
        ['nekudo'       , 'https://geoip.nekudo.com/api/{ip}'        ],
        ['geoiplookup'  , 'http://api.geoiplookup.net/?query={ip}'   ],
]

# fetch url in asynchronous mode (makes use of gevent)
def fetch_url_async(url, tag, timeout=2.0):
    data = None
    try:
        opener = urllib2.build_opener(urllib2.HTTPSHandler())
        opener.addheaders = [('User-agent', 'Mozilla/')]
        data = urllib2.urlopen(url, timeout=timeout).read()
    except Exception, e:
        pass

    return [tag, data]

# expects req_data to be in this format: [ ['tag', url], ['tag', url], .. ]
def fetch_multiple_urls_async(req_data):

    # start the threads (greenlets)
    threads_ = []
    for u in req_data:
        (tag, url) = u
        new_thread = gevent.spawn(fetch_url_async, url, tag)
        threads_.append(new_thread)

    # wait for threads to finish
    gevent.joinall(threads_)

    # retrieve threads return values
    results = []
    for t in threads_:
        results.append(t.get(block=True, timeout=5.0))

    return results

def process_service_answers(location_data):
    # 1) extract lat/long data from responses
    # 2) reverse geocoding using geonames
    # 3) aggregate location data
    #    (for example, one way of doing this would
    #     be to choose the location that most services
    #     agree on)
    pass

def geolocate_ip(ip):
    urls = []
    for grp in geoip_service_urls:
        tag, url = grp
        urls.append([tag, url.format(ip=ip)])
    results = fetch_multiple_urls_async(urls)
    answer = process_service_answers(results)
    return answer

City name ambiguity

Cities with the same name within the same country

There are many cities with the same name within a country, in different states/administrative regions. There's also cities with the same name in different countries.

For example, according to Geonames, there are 24 cities named Clinton in the US (in 23 different states, with two cities named Clinton in the same state of Michigan).

WITH duplicate_data AS (
    SELECT city_name,
    array_agg(ROW(country_code, region_code)) AS dupes
    FROM city_region_data
    WHERE country_code = 'US'
    GROUP BY city_name, country_code
    ORDER BY COUNT(ROW(country_code, region_code)) DESC
)
SELECT city_name,
ARRAY_LENGTH(dupes, 1) AS duplicity,
( CASE WHEN ARRAY_LENGTH(dupes,1) > 9 
  THEN CONCAT(SUBSTRING(ARRAY_TO_STRING(dupes,','), 1, 50), '...')
  ELSE ARRAY_TO_STRING(dupes,',')
  END
) AS sample
FROM duplicate_data
LIMIT 5;

 city_name   | duplicity | sample
-------------+-----------+-----------------------------------------------------
 Clinton     |        24 | (US,NY),(US,AR),(US,NC),(US,MA),(US,MD),(US,OH),(U…
 Franklin    |        19 | (US,ME),(US,MA),(US,NC),(US,TX),(US,NC),(US,LA),(U…
 Springfield |        19 | (US,MN),(US,KY),(US,SD),(US,MI),(US,VA),(US,IL),(U…
 Madison     |        18 | (US,CT),(US,MN),(US,NJ),(US,ME),(US,SD),(US,FL),(U…
 Greenville  |        18 | (US,NC),(US,SC),(US,MS),(US,KY),(US,RI),(US,ME),(U…

Cities with the same name in the same country and region

Worldwide, even in the same region of a country, there can be multiple cities with the exact same name.

Take for example Georgetown, in Indiana. Geonames says there are 3 towns with that name in Indiana. Wikipedia says there are even more:

WITH duplicate_data AS (
    SELECT city_name,
    array_agg(ROW(country_code, region_code)) AS dupes
    FROM city_region_data
    WHERE country_code = 'US'
    GROUP BY city_name, region_code, country_code
    ORDER BY COUNT(ROW(country_code, region_code)) DESC
)
SELECT city_name,
ARRAY_LENGTH(dupes, 1) AS duplicity,
( CASE WHEN ARRAY_LENGTH(dupes,1) > 9 
  THEN CONCAT(SUBSTRING(ARRAY_TO_STRING(dupes,','), 1, 50), '...')
  ELSE ARRAY_TO_STRING(dupes,',')
  END
) AS sample
FROM duplicate_data
LIMIT 4;

 city_name     | duplicity | sample
---------------+-----------+-------------------------
 Plantation    |         3 | (US,FL),(US,FL),(US,FL)
 Georgetown    |         3 | (US,IN),(US,IN),(US,IN)
 Robinwood     |         3 | (US,MD),(US,MD),(US,MD)
 Prospect Park |         2 | (US,NJ),(US,NJ)

Reverse geocoding

Both (city_name, country_code) and (city_name, country_code, region_name) tuples have failed as candidates to uniquely identify a location.

We would have the option of using zip codes or postal codes, except most geolocation services don't offer those.

But most geolocation services do offer longitude and latitude, and we can use those to eliminate ambiguity.

Geometric data types in PostgreSQL

I looked further into the PostgreSQL docs and found that it also has geometric data types and functions for 2D geometry. Out of the box you can model points, boxes, paths, polygons, circles, you can store them and query them.

PostgreSQL has some additional extensions in the contrib directory. They are available out of the box with most Postgres installs.

In this situation we're interested in the cube and earthdistance extensions 2. The cube extension allows you to model n-dimensional vectors, and the earthdistance extension uses 3-cubes to store vectors and represent points on the surface of the Earth.

We'll be using the following (a minimal example follows this list):

  • the earth_distance function is available, and it allows you to compute the great-circle distance between two points
  • the earth_box function to check if a point is within a certain distance of a reference point
  • a gist expression index on the expression ll_to_earth(lat, long) to make fast spatial queries and find nearby points
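
A minimal sanity check of those pieces looks like this (the coordinates are arbitrary):

CREATE EXTENSION IF NOT EXISTS cube;
CREATE EXTENSION IF NOT EXISTS earthdistance;

-- great-circle distance in meters between two lat/long points
SELECT earth_distance(ll_to_earth(44.4268, 26.1025),
                      ll_to_earth(45.6580, 25.6012)) AS meters;

-- is the second point within 200 km of the first?
SELECT earth_box(ll_to_earth(44.4268, 26.1025), 200000)
       @> ll_to_earth(45.6580, 25.6012) AS within_200km;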

Designing a view for city & region data

Geonames data was imported into 3 tables: geo_geoname, geo_admin1 and geo_countryinfo (the tables joined in the view below).

Then we create a view that pulls everything together 3. We now have population data, city/region/country data, and lat/long data, all in one place.

CREATE OR REPLACE VIEW city_region_data AS ( 
    SELECT
        b.country AS country_code,
        b.asciiname AS city_name,
        a.name AS region_name,
        b.region_code AS region_code,
        b.population AS population,
        b.latitude AS city_lat,
        b.longitude AS city_long,
        c.name    AS country_name
    FROM geo_admin1 a
    JOIN (
        SELECT *, (country || '.' || admin1) AS country_region, admin1 AS region_code
        FROM geo_geoname
        WHERE fclass = 'P'
    ) b ON a.code = b.country_region
    JOIN geo_countryinfo c ON b.country = c.iso_alpha2
);

Designing a nearby-city query and function

In the most nested SELECT, we're only keeping the cities in a 23km radius around the reference point, then we're applying a country filter and city pattern filter (these two filters are optional), and we're only getting the closest 50 results to the reference point.

Next, we're reordering by population because geonames sometimes has districts and neighbourhoods around bigger cities 4, and it does not mark them in a specific way, so we just want to select the larger city and not a district (for example let's say the geolocation service returned a lat/long that would resolve to one district of a larger metropolitan area. In my case, I'd like to resolve this to the larger city it's associated with)

We're also creating a gist index (the @> operator will make use of the gist index) which we're using to find points within a radius of a reference point.

This function takes a point (using latitude and longitude) and returns the city, region and country that is associated with that point.

CREATE INDEX geo_geoname_latlong_idx ON geo_geoname USING gist(ll_to_earth(latitude,longitude));

CREATE OR REPLACE FUNCTION geo_find_nearest_city_and_region(
    latitude double precision,
    longitude double precision,
    filter_countries_arr varchar[],
    filter_city_pattern  varchar
    ) RETURNS TABLE(
    country_code varchar,
    city_name varchar,
    region_name varchar,
    region_code varchar,
    population bigint,
    _lat double precision,
    _long double precision,
    country_name varchar,
    distance numeric
    ) AS $$
BEGIN
    RETURN QUERY
    SELECT *
    FROM (
        SELECT *
        FROM (
            SELECT c.country_code, c.city_name, c.region_name, c.region_code,
                   c.population, c.city_lat, c.city_long, c.country_name,
                   ROUND(earth_distance(
                    ll_to_earth(c.city_lat, c.city_long),
                    ll_to_earth(latitude, longitude)
                   )::numeric, 3) AS distance_
            FROM city_region_data c
            WHERE earth_box(ll_to_earth(latitude, longitude), 23000) @> ll_to_earth(c.city_lat, c.city_long) AND
                  (filter_countries_arr IS NULL OR c.country_code=ANY(filter_countries_arr)) AND
                  (filter_city_pattern  IS NULL OR c.city_name LIKE filter_city_pattern)
            ORDER BY distance_ ASC
            LIMIT 50
        ) d
        ORDER BY d.population DESC
    ) e
    LIMIT 1;
END;
$$
LANGUAGE plpgsql;


We've started from the design of a system that would query multiple geoip services, would gather the data and would then aggregate it to get a more reliable result.

We first looked at some ways of uniquely identifying locations.

We've then picked a way that would eliminate ambiguity in identifying them. In the second half, we've looked at different ways of structuring, storing and querying geographical data in PostgreSQL.

Then we've built a view and a function to find cities near a reference point which allowed us to do reverse geocoding.



By using multiple services (and assuming they use different data sources internally) after aggregation, we can have a more reliable answer than if we were using just one.

Another advantage here is that we're using free services, no setup is required, we don't have to take care of updates, since these services are maintained by their owners.

However, querying all these web services will be slower than querying a local geoip data structure. But, city/country/region geolocation databases are available, some examples include geoip2 from maxmind, ip2location or db-ip.


There's a nice post here using the earthdistance module to compute distances to nearby or far away pubs.


Geonames has geonameIds as well, which are geonames-specific ids we can use to accurately refer to locations.


geonames does not have polygonal data about cities/neighbourhoods or metadata about the type of urban area (like openstreetmap does) so you can't query all city polygons (not districts/neighbourhoods) that contain that point.

July 05, 2016 09:00 PM

June 18, 2016

Stephen Mather

Using PostGIS for Hydrologic Modeling (reblog)

The Problem We have to filter out the roads and ditches without removing streams that cross roads or follow them closely. I’m going to use PostGIS to find the intersection of the streams lines data with a buffered roads polygon. If the intersected line is less than 50% of the length of the stream line, […]

via Filtering Roads from Extracted Streams Data — GeoKota

by smathermather at June 18, 2016 01:57 AM

June 04, 2016

Boston GIS (Regina Obe, Leo Hsu)

FOSS4GNA 2016: pgRouting - A Crash course video is out

Leo's pgRouting : a Crash Course video made it thru great. Better than mine. Leo doesn't believe in slides, so this is all live demo stuff. The data he used in the video is part of our code/data download for pgRouting: A Practical Guide.

Continue reading "FOSS4GNA 2016: pgRouting - A Crash course video is out"

by Regina Obe (nospam@example.com) at June 04, 2016 07:05 AM

May 31, 2016

Boston GIS (Regina Obe, Leo Hsu)

FOSS4GNA 2016 PostGIS Spatial Tricks video is out

The videos for FOSS4G NA 2016 have started coming out. Recently Andrea Ross posted PostGIS Spatial Tricks talk video. I'm happy to say it looks pretty good and I didn't suck as badly as I worried I would. Thank you very much Andrea. Some talks unfortunately did not come thru. I'm hoping Leo's pgRouting : a Crash Course video made it thru okay as well, and will post that later if it does.

The only small nit-picks are that the first 2-5 minutes or so didn't make it thru and the blue colors on the slides got a little drowned out, but here are the slides if you need full resolution.

Continue reading "FOSS4GNA 2016 PostGIS Spatial Tricks video is out"

by Regina Obe (nospam@example.com) at May 31, 2016 08:36 PM

May 30, 2016

Stephen Mather

Using foreign data wrapper to use PostGIS with SQLServer

Here was the problem that needed to be solved last week (we have a few similar problems in upcoming projects, so this was an exciting thing to try): we needed to use PostGIS to access data in a SQLServer database. The SQLServer database backs the web site in question, the underlying content management system, etc., so no– removing SQLServer isn’t really an option at this stage. Obviously PostGIS is a requirement too… .

Before I go further, I used tds_fdw as the foreign data wrapper. The limitations here are as follows: it is a read-only connection, and only works with Sybase and SQLServer, as it uses tabular data stream protocol for communicating between client and server. This is not as generic a solution as we can use. Next time I’ll try ogr_fdw which is more generic as it can connect with other databases and other data types. Another advantage to ogr_fdw is we can use IMPORT FOREIGN SCHEMA. Regina Obe and Leo Hsu warn though to limit this to 150 tables or so for performance reasons.
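For reference, an ogr_fdw setup would look roughly like the sketch below; the server name, datasource string and driver are placeholders, and whether the MSSQLSpatial driver is available depends on how your GDAL was built:

CREATE EXTENSION ogr_fdw;

-- An OGR connection string pointing at the SQL Server database
CREATE SERVER beach_ogr_fdw
  FOREIGN DATA WRAPPER ogr_fdw
  OPTIONS (datasource 'MSSQL:server=name_or_ip;database=non-spatial-database',
           format 'MSSQLSpatial');

-- Pull all matching layers in as foreign tables in one shot
IMPORT FOREIGN SCHEMA ogr_all
  FROM SERVER beach_ogr_fdw
  INTO public;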

With the limitations listed above, this is how we built the thang:


-- Create the server connection with FOREIGN DATA WRAPPER
CREATE SERVER beach_fdw
FOREIGN DATA WRAPPER tds_fdw
OPTIONS (servername 'name_or_ip', port '1433', database 'non-spatial-database', tds_version '7.1', msg_handler 'notice');

-- We map the postgres user to the user that can read the table we're interested in
CREATE USER MAPPING FOR postgres
SERVER beach_fdw
OPTIONS (username 'user', password 'password');

-- Create the actual foreign table connection
CREATE FOREIGN TABLE beach_closures (
AutoNumber int NOT NULL,
Title varchar NOT NULL,
StartDate timestamp without time zone NOT NULL,
WaterQuality varchar NOT NULL,
Latitude varchar NOT NULL,
Longitude varchar NOT NULL,
BeachStatus varchar NOT NULL,
ClosureColor varchar NOT NULL)
SERVER beach_fdw
OPTIONS (schema_name 'schema_name', table_name 'vw_CMPBeachClosures');

-- Now we create a spatial view using our longitude and latitude
CREATE VIEW v_beach_closures AS
  SELECT
    AutoNumber, Title, StartDate, WaterQuality, Latitude,
    Longitude, BeachStatus, ClosureColor,
    ST_SetSRID(ST_MakePoint(Longitude::numeric, Latitude::numeric), 4326) AS geom
  FROM beach_closures;

Voila! A nice little PostGIS enabled view of a SQLServer view or table!
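As a quick sanity check, the view can now be used like any other PostGIS layer (the coordinates here are made up for illustration):

-- Beach closures within 5 km of a point of interest
SELECT Title, BeachStatus, ClosureColor
FROM v_beach_closures
WHERE ST_DWithin(geom::geography,
                 ST_SetSRID(ST_MakePoint(-81.0, 41.5), 4326)::geography,
                 5000);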

by smathermather at May 30, 2016 02:34 AM

May 25, 2016

UpStats blog (Stefan Petrea)

Redesigning a notification system


As previously mentioned, UpStatsBot is part of UpStats. It's a notification system in the form of a bot. You can get notifications about jobs while you're on the street or in a cafe.

This bot has been running for ~7 months now, and it was running quite well except for one thing: the Postgres logs were growing a lot. Not long ago, I've analyzed PostgreSQL logs for 1 week using pgbadger. The results were startling:

As the image shows, the bot was responsible for more than 40% of all queries to that database. In addition, the Pg logs were growing quite a lot.

The bot would run queries to poll for new items every 10 minutes; those queries would run regardless of whether the collector had brought in new data since the last time they were run.

This blog post will describe how I fixed that issue using PostgreSQL's LISTEN/NOTIFY feature 1 , 2.

The main advantage of LISTEN/NOTIFY is the ability to receive near-realtime notifications with a fraction of the queries.
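At the SQL level, the feature boils down to two statements (plus the pg_notify() helper), which you can try from two separate psql sessions:

-- session 1: subscribe to a channel
LISTEN botchan;

-- session 2: publish a payload on that channel
NOTIFY botchan, 'new jobs available';
-- or, equivalently, with a function call that quotes the payload for you
SELECT pg_notify('botchan', 'new jobs available');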

Overview of the UpStats project

The diagram above summarizes how UpStats and UpStatsBot work and how the system is composed:

  • A data collector
  • A PostgreSQL db
  • A web app
  • Metrics (that are recomputed periodically)
  • A notification system

A user can access it from the Telegram bot or from the web app. The data is collected from the UpWork API and placed in a PostgreSQL database. Metrics can be computed (via more complex SQL queries) and notifications dispatched using the Telegram API.

We aim to move the logic responsible for notifications computation from the bot into the collector in order to realize near real-time dispatch (i.e. whenever new data becomes available).

Tables involved in notifications

The relevant tables here are:

  • odesk_job
  • odesk_search_job
  • odesk_telegram

In summary, we use search keywords from odesk_telegram and search for them in odesk_job via odesk_search_job. The odesk_search_job table holds full-text indexes for the jobs. The odesk_telegram table holds search keywords and active search streams for each subscribed user.
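The exact schema isn't shown here, but based on the columns used in the queries below, odesk_telegram looks roughly like this sketch (column types are assumptions):

-- Rough shape of the search-stream table, inferred from the queries that follow
CREATE TABLE odesk_telegram (
    id          serial PRIMARY KEY,
    chat_id     bigint  NOT NULL,  -- Telegram chat to notify
    search      text    NOT NULL,  -- comma/space separated keywords
    last_job_ts integer NOT NULL,  -- epoch of the newest job already sent
    paused      boolean NOT NULL DEFAULT false,
    deleted     boolean NOT NULL DEFAULT false
);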

PostgreSQL's LISTEN/NOTIFY in Python

To offer some background, some use-cases of LISTEN/NOTIFY include:

  • using it in conjunction with websockets to build chat systems 3
  • building asynchronous and trigger-based replication systems 4
  • keeping caches in sync with a PostgreSQL database 5

We're using the psycopg2 connector 6 for PostgreSQL. The connector uses a socket to talk to the Postgres database server, and that socket has a file descriptor. That descriptor is used in the select call. Select checks if the descriptor is ready for reading 7 .

In order to exemplify this, we'll write a simple Python class that allows to send and listen to notifications 8 .

import select
import psycopg2
import psycopg2.extensions
from psycopg2.extensions import QuotedString
import json
import time

class PgQueue:
    """
    This class is used to create an easy to use queue
    mechanism where you can send and listen to messages.
    """
    dbuser = None
    dbpass = None
    dbname = None

    conn = None
    curs = None
    channel = None

    continue_recv = True

    def __init__(self, channel, dbname=None, dbuser=None, dbpass=None):
        """
        Connect to the database.
        If one of dbname, dbuser or dbpass are not provided,
        the responsibility of providing (and setting a connection on
        this object) will fall on the calling code.

        Otherwise, this will create a connection to the database.
        """
        self.dbname = dbname
        self.dbuser = dbuser
        self.dbpass = dbpass
        self.channel = channel

        if not channel:
            raise Exception('No channel provided')

        if dbname and dbuser and dbpass:
            # store connection
            self.conn = psycopg2.connect(
                    'dbname={dbname} user={dbuser} password={dbpass} host='.format(
                        dbname=dbname, dbuser=dbuser, dbpass=dbpass))
            # autocommit is required mostly by the NOTIFY statement because it has
            # to commit after the query has been executed
            self.conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

    def recvLoop(self):
        """Loop that's concerned with receiving notifications"""
        self.curs = self.conn.cursor()
        self.curs.execute("LISTEN {0};".format(self.channel))

        conn = self.conn
        curs = self.curs

        while self.continue_recv:
            if select.select([conn], [], [], 6) == ([], [], []):
                print "consumer: timeout"
            else:
                # pull any pending notifications off the connection's socket
                conn.poll()
                print "consumer: received messages"
                while conn.notifies:
                    notif = conn.notifies.pop(0)
                    # print "Got NOTIFY:", notif.pid, notif.channel, notif.payload
                    self.recvCallback(notif)

    def recvCallback(self, notification):
        """Needs to be implemented with notification handling logic"""
        pass

    def send(self, data):
        """Send a notification"""
        curs = self.conn.cursor()

        message = {}
        print "producer: sending.."
        # equip the message object with a timestamp
        message['time'] = time.time()
        message['data'] = data
        messageJson = json.dumps(message)
        messagePg = QuotedString(messageJson).getquoted()

        query = 'NOTIFY {0}, {1};'.format(self.channel, messagePg)
        print query
        curs.execute(query)

Now that we've implemented the class we can use it. The producer will be quite simple, and the consumer will need to either patch the recvCallback method or subclass PgQueue to override that method. We'll use the former approach and patch the method. We'll run the producer in one thread and the consumer in another.

def sample_producer_thread():
    q = PgQueue('botchan', dbname='dbname', dbuser='username', dbpass='password')

    # send a test message every few seconds
    while True:
        message = {}
        message['test'] = "value"
        q.send(message)
        time.sleep(4)

def sample_consumer_thread():
    q = PgQueue('botchan', dbname='dbname', dbuser='username', dbpass='password')

    def newCallback(m):
        if m.payload:
            payload = m.payload
            print "callback: ", payload

    # replace the receiver callback
    q.recvCallback = newCallback
    # block and wait for incoming notifications
    q.recvLoop()

if __name__ == '__main__':
    import signal
    from threading import Thread

    thread_producer = Thread(target=sample_producer_thread)
    thread_consumer = Thread(target=sample_consumer_thread)
    thread_producer.start()
    thread_consumer.start()
    thread_producer.join()
    thread_consumer.join()

Putting together user notifications

The regexes below are creating tsquery-compatible strings. Then those strings are used to run full-text searches on the job table. This way we can build notifications for each user and for each of their active search streams.

The last_job_ts is used to make sure we limit our searches to the new data.

We make use of wCTE (WITH common table expressions) because they're easy to work with and allow for gradually refining results of previous queries until the desired data can be extracted.

Near the end of the query we neatly pack all the data using PostgreSQL's JSON functions.

WITH user_notifs AS (
    SELECT id, chat_id, search, last_job_ts,
           regexp_replace(
               regexp_replace(
                   rtrim(ltrim(search,' '),' '),
                   '\s+',' ','g'
               ),
               '(\s*,\s*|\s)' , ' & ', 'g'
           ) AS fts_query
    FROM odesk_telegram
    WHERE paused = false AND deleted = false
), jobs AS (
    SELECT A.job_id, A.tsv_basic, B.job_title, B.date_created
    FROM odesk_search_job A
    JOIN odesk_job B ON A.job_id = B.job_id
    WHERE B.date_created > EXTRACT(epoch FROM (NOW() - INTERVAL '6 HOURS'))::int
), new AS (
    SELECT A.id, A.chat_id, A.search, B.job_id, B.job_title, B.date_created
    FROM user_notifs AS A
    JOIN jobs B ON (
        B.tsv_basic @@ A.fts_query::tsquery AND
        B.date_created > A.last_job_ts
    )
), json_packed AS (
    SELECT A.id, A.chat_id, A.search,
           json_agg(json_build_object(
               'job_id', A.job_id,
               'job_title', A.job_title,
               'date_created', A.date_created
           )) AS j
    FROM new A
    GROUP BY A.id, A.search, A.chat_id
)
SELECT * FROM json_packed;
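For example, the regexp_replace chain turns a stored search stream like ' php, mysql ' into the tsquery 'php' & 'mysql'; a quick way to check it in psql:

SELECT regexp_replace(
           regexp_replace(rtrim(ltrim(' php, mysql ',' '),' '), '\s+', ' ', 'g'),
           '(\s*,\s*|\s)', ' & ', 'g'
       )::tsquery;
-- 'php' & 'mysql'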

Tightening the constraints

Near the end of the collector program we compute the notifications; an expensive search query needs to be run in order to find out what to send and whom to send it to.

However, every time this query runs we can store the latest timestamp for each search stream, so that the next time we compute notifications the search constraints are tighter and we only look at jobs newer than that timestamp.

In order to do this, the search streams' last_job_ts needs to be updated:

UPDATE odesk_telegram SET last_job_ts = %(new_ts)d WHERE id = %(id)d;

For the active search streams that had new jobs, the earliest timestamp can be passed as parameter to this query.

Even for the active search streams that have seen no new jobs, we still have to tighten the search by updating their last_job_ts to the time when the collector started (we can only go so far, any later than this and we might miss jobs that were posted while the collector was running).

Optimizing the search streams

If enough data and enough users are present, we could craft a better query for this. For example, the search keywords could be organized in a tree structure.

This particular tree would store keywords based on the number of search results they're present in; in other words, the more queries a keyword exists in, the closer to the root that keyword will be.

A search stream corresponds to a path in this tree.

For example, in the tree below, php, wordpress is a search stream and user3 has registered for that particular search stream. Accordingly, user3 will receive job ads that match the words php and wordpress. Given the logic described above, php will match more jobs than wordpress.9

A temporary table can be created for the high-volume top-level search streams. To get to the more specific search streams, a JOIN on this table followed by the conditions for the lower-volume keywords would be enough.

For example, there are two search streams php,wordpress (for user3) and php,mysql (for user2). We could cache the ids of the notifications for the larger stream php and then refine it in order to get the two streams we're interested in.

This would be particularly interesting for a situation with a large number of subscribed users and a lot of active search streams as the tree would expand, preferably in depth rather than breadth.
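A sketch of that caching idea, using the table names from the queries above (this is just an illustration of the idea, not something that's implemented):

-- Cache the job ids matching the high-volume 'php' stream once...
CREATE TEMPORARY TABLE php_stream AS
SELECT job_id, tsv_basic, job_title
FROM odesk_search_job
JOIN odesk_job USING (job_id)
WHERE tsv_basic @@ 'php'::tsquery;

-- ...then refine it for the two narrower streams
SELECT job_id, job_title FROM php_stream WHERE tsv_basic @@ 'wordpress'::tsquery; -- user3
SELECT job_id, job_title FROM php_stream WHERE tsv_basic @@ 'mysql'::tsquery;     -- user2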


This blog post describes the redesign of a notification system and some ideas about improving its performance.



Notifications in PostgreSQL have been around for a long time, but starting with version 9.0 they are equipped with a payload.


The message passing described in this blog post is a simple one.

There are situations in which message delivery is more critical than the method described here.

For that purpose, this article provides a better approach that handles delivery failures. It uses a queue table and two triggers. The queue table is used to persist the messages that are sent.

One trigger will be placed on the table whose changes are of interest.

The other trigger will be placed on the queue table. So the message will be the "cascading" result of modifying the actual table. The advantage here is persistence (among other advantages that you can read in that article).

What's more, for the first trigger, the function row_to_json offers a way to serialize the changes in a structure-agnostic way.

Here's another article describing a queue-table-centric approach (without any LISTEN/NOTIFY). There's an emphasis put on locking and updating the 'processed' state of each item in the queue table, and different approaches for that. This would be more in line with a distributed queue.


For example this presentation in which the author explains how Postgres sends notifications to a Python backend via LISTEN/NOTIFY, which are then forwarded to the browser via websockets. The presentation is also available on youtube here.


This article describes a scenario where a PostgreSQL database updates a cache by broadcasting changes to it.


Although the code here is in Python, you may certainly use PostgreSQL's notifications in other languages (it's a well-supported feature) including the following:


More details about the conn.poll() statement. The poll() method comes from psycopg2 (it's called conn_poll in the C code) and it reads all the notifications from PQsocket (the ones we have issued a LISTEN statement for). There are 5 functions involved:

  • conn_poll (which in turn calls _conn_poll_query)
  • _conn_poll_query (in turn, calls pq_is_busy)
  • pq_is_busy (which calls conn_notifies_process and conn_notice_process)
  • conn_notifies_process (reads them from the PQsocket and populates C data structures)
  • conn_notice_process (turns the available notifications into Python data structures)

Keep in mind that the payloads are limited to 8000 bytes.


This section about optimizing and caching search results is just an idea at this point. Details such as how to keep the tree updated, which search streams should be cached in the temporary table, and how to represent it are not yet worked out. It's not yet implemented, it will probably be used later on.

May 25, 2016 04:00 AM

May 21, 2016

Boston GIS (Regina Obe, Leo Hsu)

pgRouting 2.2.3 released with support for PostgreSQL 9.6beta1

pgRouting 2.2.3 was released last week. Main change is this version now supports PostgreSQL 9.6. Many thanks to Vicky Vergara for working thru the issues with PostgreSQL 9.6 and getting it to work. Vicky has also been doing a good chunk of the coding (a lot of Boost refactoring and integrating more Boost features), testing, and documentation in pgRouting, osm2pgrouting, and QGIS pgRoutingLayer in general for pgRouting 2.1, 2.2, and upcoming 2.3. We are very indebted to her for her hard work.

If you are a windows user testing the waters of PostgreSQL 9.6beta1, we have pgRouting 2.2.3 binaries and PostGIS 2.3.0dev binaries at http://postgis.net/windows_downloads.

Continue reading "pgRouting 2.2.3 released with support for PostgreSQL 9.6beta1"

by Regina Obe (nospam@example.com) at May 21, 2016 04:24 PM

May 13, 2016

Postgres OnLine Journal (Leo Hsu, Regina Obe)

PLV8 binaries for PostgreSQL 9.6beta1 windows both 32-bit and 64-bit

To celebrate the recent release of PostgreSQL 9.6beta1, we've started to experiment with our favorite extensions. For starters, PLV8 (aka PL/JavaScript) binaries are listed below, with upcoming PostGIS 2.3.0 and ogr_fdw detailed here

Continue reading "PLV8 binaries for PostgreSQL 9.6beta1 windows both 32-bit and 64-bit"

by Leo Hsu and Regina Obe (nospam@example.com) at May 13, 2016 10:56 PM

Boston GIS (Regina Obe, Leo Hsu)

PostgreSQL 9.6beta1 out, help test PostGIS 2.3 windows binaries available

PostgreSQL 9.6beta1 came out yesterday. It is the first version of PostgreSQL that will have parallel query support and PostGIS 2.3 will be the first PostGIS to support parallelism in queries. Although work is not yet committed in PostGIS repo to support this, you can expect to see this soon (currently here - https://github.com/pramsey/postgis/tree/parallel , waiting for someone to you know who you are do something about it.)

UPDATE: Some parallel support has been committed. More to come. pgRouting 9.6 issues resolved, many thanks to Vicky Vergara. Now pgRouting 2.2.3 windows binaries are available for PostgreSQL 9.6beta1. Also recently tested these with BigSQL 9.6beta1 Windows 64-bit (at least the PostGIS ones), and they seem to work fine. BigSQL paths are a little different from EDB's, so be careful to copy into the right folder; the zip structure follows the EDB structure. Also make sure not to overwrite files that came packaged with BigSQL, since there is more file overlap (we both use mingw64, but their gcc is newer).

Because of the newness of the parallelization feature, there are some caveats. As with all big things, we expect there to be a lot of bugs, and the more eyeballs on those and real workloads we've got hammering on them, the sweeter the PostGIS 2.3.0 and PostgreSQL 9.6 release will be.

Binaries for Windows users

For windows users, winnie the PostGIS windows buildbot is now building PostGIS for 9.6. Get PostgreSQL 9.6 binaries and installers from PostgreSQL 9.6beta1 for windows.

Once you have that, just copy the contents of the respective PostGIS 2.3 9.6 binaries listed here - http://postgis.net/windows_downloads/ into your install folder.

In the extras folder, you'll also find ogr_fdw foreign data wrapper latest development version which we covered extensively in FOSS4GNA2016 PostGIS Spatial Tricks. Talk also covered some new PostGIS 2.3.0 stuff.

We don't have pgRouting binaries available yet. pgRouting team is working out some compatibility issues with PostgreSQL 9.6. Once those are resolved, we will publish pgRouting binaries as well.

by Regina Obe (nospam@example.com) at May 13, 2016 09:02 PM

May 10, 2016

UpStats blog (Stefan Petrea)

Parallel testing multiple versions of ImageMagick

While investigating some test failures on a project involving ImageMagick (a parser for the metadata that ImageMagick extracts out of documents), I noticed different versions of ImageMagick were producing outputs in different formats. Not only the formats were differing but also some of the data was missing, or the values were different.

I only knew about two versions that were affected:

  • 6.7.7-10 2014-03-06 Q16
  • 6.9.2-4 Q16 x86_64 2015-11-27

On the newer one I got failures, so I quickly solved them. But I went a bit further, because I was interested to know what other versions in between these two were failing any tests.

I found the ImageMagick git repository for Debian. I cloned it, and looked at the branches. There were multiple branches, but the ones I was interested in were origin/debian/*.

Next I wrote a bash script to build each of these versions:

# ...

# Clone the debian repository
git clone $GIT_REPO imagemagick

# We're only interested in the identify binary so we aim
# at making a custom build for each branch. We're not
# building any documentation, we're also not going to let
# the IM build run any tests, we only want the binaries.
for BRANCH in "${GIT_BRANCHES[@]}"; do
    BRANCH_VER=$(echo "$BRANCH" | perl -ne 'm{(\d+(?:\.\d+)+.*)$} && print "$1"')
    # install each build under ../custom-build/<version>
    IM_VER="../custom-build/$BRANCH_VER"
    pushd .
    cd imagemagick
    # remove any unversioned files created by the previous build
    git clean -dfx .
    # undo any changes to versioned files
    git reset --hard HEAD
    # checkout a new branch for the build
    git checkout $BRANCH
    # building a minimal imagemagick for testing purposes
    autoreconf -i
    ./configure --without-perl --without-magick-plus-plus --disable-docs --prefix="$PWD/$IM_VER"
    # speed up the build with 4 parallel jobs
    make -j4
    mkdir -p "$IM_VER"
    make install
    popd
done
It took one hour to build 57 different 1 versions of ImageMagick, resulting in a directory (4.2 GB) with each version's binaries.

Now I wanted to know on which versions of ImageMagick, the unit-tests were failing. So I wrote another bash script to run the tests in parallel for each version of ImageMagick. To parallelize the unit tests I've only made use of xargs -P which I read about some time ago 2 (there are other ways to do that too, but I found this one to be the easiest in this situation).

BUILD_PATHS=$(find custom-build/ -maxdepth 1 -mindepth 1)
OUT_DIR=$(mktemp -d)
for P in $BUILD_PATHS; do
    # extract IM version from the directory
    DIR_VERSION=$(echo "$P" | perl -ne 'm{(\d+(?:\.\d+)+[^\/]*)\/?} && print "$1"')
    # build output path
    OUT_FILE="$OUT_DIR/report-$DIR_VERSION.txt"
    # run the tests using a custom built version of imagemagick
    # (prepend custom built path so it can be used by the tests)
    NEW_PATH="$PWD/$P/bin:$PATH"
    CMD="PATH=\"$NEW_PATH\" ./run-tests.sh >$OUT_FILE 2>&1;"
    echo "$CMD"
done | xargs -I% -P 6 /bin/bash -c '%';

echo "$OUT_DIR"
echo "Test reports with failures:"
grep -lir FAIL "$OUT_DIR"

It took under 33 seconds to run through all the versions, and I got this result

Test reports with failures:

Now I was down to 4 ImageMagick versions that I had to look into. Looking further into those failed tests, they were testing properties of a tree built from PDF metadata. For those versions, IM only outputs data for the first page of the PDF, not for each page of the PDF (as is the case with most other versions), which is why they were failing.

I then used unittest.skipIf to skip those tests if the IM version used was one of the four ones listed above.

This made for an interesting analysis, and I was happy to learn more about the way IM works.



There's a more complicated scenario in which all dependencies (libraries IM depends on) would be taken into consideration, not just IM's source code. The assumption here is that IM alone is responsible for the output of the identify utility. And that's a fair assumption, at least for the format of the output.

May 10, 2016 04:00 AM

April 29, 2016

Paul Ramsey

OGR FDW Update

I’ve had a productive couple of weeks here, despite the intermittently lovely weather and the beginning of Little League baseball season (not coaching, just supporting my pitcher-in-training).

13 Days

The focus of my energies has been a long-awaited (by me) update to the OGR FDW extension for PostgreSQL. By binding the multi-format OGR library into PostgreSQL, we get access to the many formats supported by OGR, all with just one piece of extension code.

As usual, the hardest part of the coding was remembering how things worked in the first place! But after getting my head back in the game the new code flowed out and now I can reveal the new improved OGR FDW!

OGR FDW Update

The new features are:

  • Column name mapping between OGR layers and PgSQL tables is now completely configurable. The extension will attempt to guess mapping automagically, using names and type consistency, but you can over-ride mappings using the table-level column_name option (see the sketch after this list).
  • Foreign tables are now updateable! That means, for OGR sources that support it, you can run INSERT, UPDATE and DELETE commands on your OGR FDW tables and the changes will be applied to the source.

    • You can control which tables and foreign servers are updateable by setting the UPDATEABLE option on the foreign server and foreign table definitions.
  • PostgreSQL 9.6 is supported. It’s not released yet, but we can now build against it.
  • Geometry type and spatial reference system are propagated from OGR. If your OGR source defines a geometry type and spatial reference identifier, the FDW tables in PostGIS will now reflect that, for easier integration with your local geometry data.
  • GDAL2 and GDAL1 are supported. Use of GDAL2 syntax has been made the default in the code-base, with mappings back to GDAL1 for compatibility, so the code is now future-ready.
  • Regression tests and continuous integration are in place, for improved code reliability. Thanks to help from Even Rouault, we are now using Travis-CI for integration testing, and I’ve enabled a growing number of integration tests.
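A rough sketch of what the column mapping and updateable options look like in SQL (the datasource, layer and column names here are invented for illustration; consult the ogr_fdw documentation for the authoritative option names):

-- An updateable foreign server backed by a GeoPackage file
CREATE SERVER fdw_gpkg
  FOREIGN DATA WRAPPER ogr_fdw
  OPTIONS (datasource '/data/parks.gpkg', format 'GPKG', updateable 'true');

-- Explicitly map an awkward source column name onto a friendlier one
CREATE FOREIGN TABLE parks (
  fid bigint,
  park_name varchar OPTIONS (column_name 'PARK_NAME_1'),
  geom geometry(MultiPolygon, 4326)
) SERVER fdw_gpkg
  OPTIONS (layer 'parks');

-- With updateable sources, normal DML just works
UPDATE parks SET park_name = 'Beacon Hill Park' WHERE fid = 1;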

As usual, I’m in debt to Regina Obe for her timely feedback and willingness to torture-test very fresh code.

For now, early adopters can get the code by cloning and building the project master branch, but I will be releasing a numbered version in a week or two when any obvious bugs have been shaken out.

April 29, 2016 06:00 PM

April 21, 2016

Postgres OnLine Journal (Leo Hsu, Regina Obe)

PGConfUS 2016 PostGIS slides and tutorial material

We gave a PostGIS Intro Training and a PostGIS talk at PGConfUS 2016 in Brooklyn, New York and just got back. A number of people asked if we'd make the slides and material available. We have these posted on our presentation page: http://www.postgis.us/presentations and will be putting on the PostgreSQL Wiki as well in due time. There will be a video coming along for the talk, but the training was not recorded.

We also have two more talks coming up in North Carolina in Early May at FOSS4G NA 2016 - one on PostGIS Spatial Tricks which has more of a GIS specialist focus than the top 10 talk we gave, but there will be some overlap. The other talk is a topic a couple of people asked us in training and after our talk, on routing along constrained paths. If you are attending FOSS4G NA 2016, you won't want to miss our talk pgRouting: A Crash Course which is also the topic of our upcoming book.

Just like FOSS4G NA 2015, there is a pgDay track which is PostgreSQL specific material, useful to a spatial crowd, but not PostGIS focused.

Continue reading "PGConfUS 2016 PostGIS slides and tutorial material"

by Leo Hsu and Regina Obe (nospam@example.com) at April 21, 2016 05:04 PM

April 19, 2016

Archaeogeek (Jo Cook)

PortableGIS 5.6

I’m pleased to announce the latest release of Portable GIS. This version (v5.6) has the following changes:

  • QGIS 2.14.1 LTR
  • By popular demand: Geoserver 2.8

You can download the setup exe and the md5 checksum here.

Older versions are still available but have been archived to avoid confusion.

As always, please let me know of any problems via the Portable GIS google group.

Note that I will shortly be publicising a GitLabs repository for the changed files, along with developer and user documentation, to allow people to roll their own versions or contribute to development. This work is nearly complete, so watch this space!

April 19, 2016 10:20 AM

April 15, 2016

Postgres OnLine Journal (Leo Hsu, Regina Obe)

First Look at pgAdmin 4

When David Page announced pgAdmin 4, I was really excited to try it out. I was impressed I could compile it so easily on windows. I had a few bumps, but not too bad.

One of the reasons I'm excited about it is that it's built on Python and a web framework, and there is a large Python and web developer following in the GIS community, so I suspect someone will step up to the plate to add a mapviewer plugin to this so I can have a seamless PostGIS experience.

The interface is also very slick and pretty and I love the sorting and paging capability now in the query window. Check this sampling from our workshop database.

Continue reading "First Look at pgAdmin 4 "

by Leo Hsu and Regina Obe (nospam@example.com) at April 15, 2016 11:16 PM

April 05, 2016

PostGIS Development

Nautilytics Case Study

Nautilytics is a small data visualization and GIS startup based out of Boston, MA. We use PostGIS and PostgreSQL, among other open-source tools to build powerful web applications for US government organizations, public, and private sector companies.

Continue Reading by clicking title hyperlink ..

by Christopher Lanoue at April 05, 2016 12:00 AM

April 04, 2016

PostGIS Development

PostGIS for fast prototyping and Research

I used extensively postgis (+ecosystem) for my phd thesis, in several ways. The first is that PostGIS is a good steady horse (elephant?): a database is the perfect place to store a lot of very different information in the same place and put them in relation. For geospatial data, postgis means you always have a way to put data in relation (are they at the same place?).

Continue Reading by clicking title hyperlink ..

by Rémi Cura at April 04, 2016 12:00 AM

April 02, 2016

Postgres OnLine Journal (Leo Hsu, Regina Obe)

PostGIS 2.2 Windows users hold off on installing latest PostgreSQL patch release

Someone reported recently on the PostGIS mailing list that they were unable to install the PostGIS 2.2.1 bundle or PostGIS 2.2.2 binaries on a clean PostgreSQL 9.5.2 install. Someone also complained about PostgreSQL 9.3 (the exact version wasn't clear), so it's not certain whether that is a separate issue or the same one. I have tested on PostgreSQL 9.5.2 Windows 64-bit and confirmed the issue. The issue does not affect PostgreSQL 9.5.1 and older. I haven't confirmed it's an issue with the 32-bit installs, but I suspect so. This issue will affect OGR_FDW users and people who used our compiled WWW_FDW.

Continue reading "PostGIS 2.2 Windows users hold off on installing latest PostgreSQL patch release"

by Leo Hsu and Regina Obe (nospam@example.com) at April 02, 2016 03:34 PM

March 31, 2016

Paul Ramsey

Parallel PostGIS Joins

In my earlier post on new parallel query support coming in PostgreSQL 9.6 I was unable to come up with a parallel join query, despite much kicking and thumping the configuration and query.

Parallel PostGIS Joins

It turns out, I didn’t have all the components of my query marked as PARALLEL SAFE, which is required for the planner to attempt a parallel plan. My query was this:

 SELECT Count(*) 
  FROM pd 
  JOIN pts 
  ON pd.geom && pts.geom
  AND _ST_Intersects(pd.geom, pts.geom);

And _ST_Intersects() was marked as safe, but I neglected to mark the function behind the && operator – geometry_overlaps – as safe. With both functions marked as safe, and assigned a hefty function cost of 1000, I get this query:

 Nested Loop  
 (cost=0.28..1264886.46 rows=21097041 width=2552) 
 (actual time=0.119..13876.668 rows=69534 loops=1)
   ->  Seq Scan on pd  
   (cost=0.00..14271.34 rows=69534 width=2512) 
   (actual time=0.018..89.653 rows=69534 loops=1)
   ->  Index Scan using pts_gix on pts  
   (cost=0.28..17.97 rows=2 width=40) 
   (actual time=0.147..0.190 rows=1 loops=69534)
         Index Cond: (pd.geom && geom)
         Filter: _st_intersects(pd.geom, geom)
         Rows Removed by Filter: 2
 Planning time: 8.365 ms
 Execution time: 13885.837 ms

Hey wait! That’s not parallel either!

It turns out that parallel query involves a secret configuration sauce, just like parallel sequence scan and parallel aggregate, and naturally it’s different from the other modes (gah!)

The default parallel_tuple_cost is 0.1. If we reduce that by an order of magnitude, to 0.01, we get this plan instead:

 Gather  
 (cost=1000.28..629194.94 rows=21097041 width=2552) 
 (actual time=0.950..6931.224 rows=69534 loops=1)
   Number of Workers: 3
   ->  Nested Loop  
   (cost=0.28..628194.94 rows=21097041 width=2552) 
   (actual time=0.303..6675.184 rows=17384 loops=4)
         ->  Parallel Seq Scan on pd  
         (cost=0.00..13800.30 rows=22430 width=2512) 
         (actual time=0.045..46.769 rows=17384 loops=4)
         ->  Index Scan using pts_gix on pts  
         (cost=0.28..17.97 rows=2 width=40) 
         (actual time=0.285..0.366 rows=1 loops=69534)
               Index Cond: (pd.geom && geom)
               Filter: _st_intersects(pd.geom, geom)
               Rows Removed by Filter: 2
 Planning time: 8.469 ms
 Execution time: 6945.400 ms

Ta da! A parallel plan, and executing almost twice as fast, just like the doctor ordered.


Mostly the parallel support in core “just works” as advertised. PostGIS does need to mark our functions as quite costly, but that’s reasonable since they actually are quite costly. What is not good is the need to tweak the configuration once the functions are properly costed:

  • Having to cut parallel_tuple_cost by a factor of 10 for the join case is not any good. No amount of COST increases seemed to have an effect, only changing the core parameter did.
  • Having to increase the cost of functions used in aggregates by a factor of 100 over cost of functions used in sequence filters is also not any good.

So, with a few more changes to PostGIS, we are quite close, but the planner for parallel cases needs to make more rational use of function costs before we have achieved parallel processing nirvana for PostGIS.

March 31, 2016 06:00 PM

March 29, 2016

PostGIS Development

Plenario (University of Chicago)

The Urban Center for Computation and Data (UrbanCCD) is a research initiative of Argonne National Laboratory and the Computation Institute of the University of Chicago. We create computational tools to better understand cities. One of these is Plenario, our hub for open geospatial data. PostGIS makes the spatial operations at the heart of Plenario possible.

Continue Reading by clicking title hyperlink ..

by William Engler at March 29, 2016 12:00 AM

March 26, 2016

Paul Ramsey

Parallel PostGIS

Parallel query support in PostgreSQL in the upcoming 9.6 release will be available for a number of query types: sequence scans, aggregates and joins. Because PostGIS tends to involve CPU-intensive calculations on geometries, support for parallel query has been at the top of our request list to the core team for a long time. Now that it is finally arriving the question is: does it really help?

Parallel PostGIS


  • With some adjustments to function COST both parallel sequence scan and parallel aggregation deliver very good parallel performance results.
  • The cost adjustments for sequence scan and aggregate scan are not consistent in magnitude.
  • Parallel join does not seem to work for PostGIS indexes yet, but perhaps there is some magic to learn from PostgreSQL core on that.


In order to run these tests yourself, you will need to check out and build:

For testing, I used the set of 69534 polling divisions defined by Elections Canada.

shp2pgsql -s 3347 -I -D -W latin1 PD_A.shp pd | psql parallel

It’s worth noting that this data set is, in terms of number of rows, very very small as databases go. This will become important as we explore the behaviour of the parallel processing, because the assumptions of the PostgreSQL developers about what constitutes a “parallelizable load” might not match our assumptions in the GIS world.

With the data loaded, we can do some tests on parallel query. Note that there are some new configuration options for parallel behaviour that will be useful during testing:

  • max_parallel_degree sets the maximum degree of parallelism for an individual parallel operation. Default 0.
  • parallel_tuple_cost sets the planner’s estimate of the cost of transferring a tuple from a parallel worker process to another process. The default is 0.1.
  • parallel_setup_cost sets the planner’s estimate of the cost of launching parallel worker processes. The default is 1000.
  • force_parallel_mode allows the use of parallel queries for testing purposes even in cases where no performance benefit is expected. Default ‘off’.

Parallel Sequence Scan

Before we can test parallelism, we need to turn it on! The default max_parallel_degree is zero, so we need a non-zero value. For my tests, I’m using a 2-core laptop, so:

SET max_parallel_degree=2;

Now we are ready to run a query with a spatial filter. Using EXPLAIN ANALYZE suppressed the actual answer in favour of returning the query plan and the observed execution time:

  SELECT Count(*) 
    FROM pd 
    WHERE ST_Area(geom) > 10000;

And the answer we get back is:

 Aggregate  
 (cost=14676.95..14676.97 rows=1 width=8) 
 (actual time=757.489..757.489 rows=1 loops=1)
   ->  Seq Scan on pd  
   (cost=0.00..14619.01 rows=23178 width=0) 
   (actual time=0.160..747.161 rows=62158 loops=1)
         Filter: (st_area(geom) > 10000)
         Rows Removed by Filter: 7376
 Planning time: 0.137 ms
 Execution time: 757.553 ms

Two things we can learn here:

  • There is no parallelism going on here, the query plan is just a single-threaded one.
  • The single-threaded execution time is about 750ms.

Now we have a number of options to fix this problem:

  • We can force parallelism using force_parallel_mode, or
  • We can force parallelism by decreasing the parallel_setup_cost, or
  • We can adjust the cost of ST_Area() to try and get the planner to do the right thing automatically.

It turns out that the current definition of ST_Area() has a default COST setting, so it is considered to be no more or less expensive than something like addition or substraction. Since calculating area involves multiple floating point operations per polygon segment, that’s a stupid cost.

In general, all PostGIS functions are going to have to be reviewed and costed to work better with parallelism.

If we redefine ST_Area() with a big juicy cost, things might get better.

CREATE OR REPLACE FUNCTION ST_Area(geometry)
  RETURNS float8
  AS '$libdir/postgis-2.3','area'
  LANGUAGE 'c' IMMUTABLE STRICT PARALLEL SAFE
  COST 100;

Now the query plan for our filter is much improved:

Finalize Aggregate  
(cost=20482.97..20482.98 rows=1 width=8) 
(actual time=345.855..345.856 rows=1 loops=1)
->  Gather  
   (cost=20482.65..20482.96 rows=3 width=8) 
   (actual time=345.674..345.846 rows=4 loops=1)
     Number of Workers: 3
     ->  Partial Aggregate  
         (cost=19482.65..19482.66 rows=1 width=8) 
         (actual time=336.663..336.664 rows=1 loops=4)
           ->  Parallel Seq Scan on pd  
               (cost=0.00..19463.96 rows=7477 width=0) 
               (actual time=0.154..331.815 rows=15540 loops=4)
                 Filter: (st_area(geom) > 10000)
                 Rows Removed by Filter: 1844
Planning time: 0.145 ms
Execution time: 349.345 ms

Three important things to note:

  • We have a parallel query plan!
  • Some of the execution results output are wrong! They say that only 1844 rows were removed by the filter, but in fact 7376 were (as we can confirm by running the queries without the EXPLAIN ANALYZE). This is a known limitation, reporting on the results of only one parallel worker, which (should) maybe, hopefully be fixed before 9.6 comes out.
  • The execution time has been halved, just as we would hope for a 2-core machine!

Now for the disappointing part, try this:

  SELECT ST_Area(geom) 
    FROM pd;

Even though the work being carried out (run ST_Area() on 70K polygons) is exactly the same as in our first example, the planner does not parallelize it, because the work is not in the filter.

 Seq Scan on pd  
 (cost=0.00..31654.84 rows=69534 width=8) 
 (actual time=0.130..722.286 rows=69534 loops=1)
 Planning time: 0.078 ms
 Execution time: 727.344 ms

For geospatial folks, who tend to do a fair amount of expensive calculation in the SELECT parameters, this is a bit disappointing. However, we still get impressive parallelism on the filter!

Parallel Aggregation

The aggregate most PostGIS users would like to see parallelized is ST_Union() so it’s worth explaining why that’s actually a little hard.

PostgreSQL Aggregates

All aggregate functions in PostgreSQL consist of at least two functions:

  • A “transfer function” that takes in a value and a transfer state, and adds the value to the state. For example, the Avg() aggregate has a transfer state consisting of the sum of all values seen so far, and the count of all values processed.
  • A “final function” that takes in a transfer state and converts it to the final aggregate output value. For example, the Avg() aggregate final function divides the sum of all values by the count of all values and returns that number.

For parallel processing, PostgreSQL adds a third kind of function:

  • A “combine” function, that takes in two transfer states and outputs a single combined state. For the Avg() aggregate, this would add the sums from each state and counts from each state and return that as the new combined state.

So, in order to get parallel processing in an aggregate, we need to define “combine functions” for all the aggregates we want parallelized. That way the master process can take the completed transfer states of all parallel workers, combine them, and then hand that final state to the final function for output.

To sum up, in parallel aggregation:

  • Each worker runs “transfer functions” on the records it is responsible for, generating a partial “transfer state”.
  • The master takes all those partial “transfer states” and “combines” them into a “final state”.
  • The master then runs the “final function” on the “final state” to get the completed aggregate.

Note where the work occurs: the workers only run the transfer functions, and the master runs both the combine and final functions.

PostGIS ST_Union Aggregate

One of the things we are proud of in PostGIS is the performance of our ST_Union() implementation, which gains performance from the use of a cascaded union algorithm.

Cascaded union involves the following steps:

  • Collects all the geometries of interest into an array (aggregate transfer function), then
  • Builds a tree on those geometries and unions them from the leaves of the tree upwards (aggregate final function).

Note that all the hard work happens in the final step. The transfer function (which is what would be run on the workers) does very little work, just gathering geometries into an array.

Converting this process into a parallel one by adding a combine function that does the union would not make things any faster, because the combine step also happens on the master. What we need is an approach that does more work during the transfer function step.

PostGIS ST_MemUnion Aggregate

“Fortunately” we have such an aggregate, the old union implementation from before we added “cascaded union”. The “memory friendly” union saves memory by not building up the array of geometries in memory, at the cost of spending lots of CPU unioning each input geometry into the transfer state.

In that respect, it is the perfect example to use for testing parallel aggregate.

The non-parallel definition of ST_MemUnion() is this:

CREATE AGGREGATE ST_MemUnion (
  basetype = geometry,
  sfunc = ST_Union,
  stype = geometry
);

No special types or functions required: the transfer state is a geometry, and as each new value comes in the two-parameter version of the ST_Union() function is called to union it onto the state. There is no final function because the transfer state is the output value. Making the parallel version is as simple as adding a combine function that also uses ST_Union() to merge the partial states:

CREATE AGGREGATE ST_MemUnion (
  basetype = geometry,
  sfunc = ST_Union,
  combinefunc = ST_Union,
  stype = geometry
);

Now we can run an aggregation using ST_MemUnion() to see the results. We will union the polling districts of just one riding, so 169 polygons:

  SELECT ST_Area(ST_MemUnion(geom)) 
    FROM pd 
    WHERE fed_num = 47005;

Hm, no parallelism in the plan, and an execution time of 3.7 seconds:

 Aggregate  
 (cost=14494.92..14495.18 rows=1 width=8) 
 (actual time=3784.781..3784.782 rows=1 loops=1)
   ->  Seq Scan on pd  
   (cost=0.00..14445.17 rows=199 width=2311) 
   (actual time=0.078..49.605 rows=169 loops=1)
         Filter: (fed_num = 47005)
         Rows Removed by Filter: 69365
 Planning time: 0.207 ms
 Execution time: 3784.997 ms

We have to bump the cost of the two parameter version of ST_Union() up to 10000 before parallelism kicks in:

CREATE OR REPLACE FUNCTION ST_Union(geom1 geometry, geom2 geometry)
  RETURNS geometry
  AS '$libdir/postgis-2.3','geomunion'
  LANGUAGE 'c' IMMUTABLE STRICT PARALLEL SAFE
  COST 10000;

Now we get a parallel execution! And the time drops down substantially, though not quite a 50% reduction.

 Finalize Aggregate  
 (cost=16536.53..16536.79 rows=1 width=8) 
 (actual time=2263.638..2263.639 rows=1 loops=1)
   ->  Gather  
   (cost=16461.22..16461.53 rows=3 width=32) 
   (actual time=754.309..757.204 rows=4 loops=1)
         Number of Workers: 3
         ->  Partial Aggregate  
         (cost=15461.22..15461.23 rows=1 width=32) 
         (actual time=676.738..676.739 rows=1 loops=4)
               ->  Parallel Seq Scan on pd  
               (cost=0.00..13856.38 rows=64 width=2311) 
               (actual time=3.009..27.321 rows=42 loops=4)
                     Filter: (fed_num = 47005)
                     Rows Removed by Filter: 17341
 Planning time: 0.219 ms
 Execution time: 2264.684 ms

The punchline though, is what happens when we run the query using a single-threaded ST_Union() with cascaded union:

  SELECT ST_Area(ST_Union(geom)) 
    FROM pd 
    WHERE fed_num = 47005;

Good algorithms beat brute force still:

 Aggregate  
 (cost=14445.67..14445.93 rows=1 width=8) 
 (actual time=2031.230..2031.231 rows=1 loops=1)
   ->  Seq Scan on pd  
   (cost=0.00..14445.17 rows=199 width=2311) 
   (actual time=0.124..66.835 rows=169 loops=1)
         Filter: (fed_num = 47005)
         Rows Removed by Filter: 69365
 Planning time: 0.278 ms
 Execution time: 2031.887 ms

The open question is, “can we combine the subtlety of the cascaded union algorithm with the brute force of parallel execution”?

Maybe, but it seems to involve magic numbers: if the transfer function paused every N rows (magic number) and used cascaded union to combine the geometries received thus far, it could possibly milk performance from both smart evaluation and multiple CPUs. The use of a magic number is concerning however, and the approach would be very sensitive to the order in which rows arrived at the transfer functions.

Parallel Join

To test parallel join, we’ll build a synthetic set of points, such that each point falls into one polling division polygon:

CREATE TABLE pts AS
SELECT
  ST_PointOnSurface(geom)::Geometry(point, 3347) AS geom, 
  gid, fed_num 
FROM pd;

CREATE INDEX pts_gix
  ON pts USING GIST (geom);

Points and Polling Divisions

A simple join query looks like this:

 SELECT Count(*) 
  FROM pd 
  JOIN pts 
  ON ST_Intersects(pd.geom, pts.geom);

But the query plan has no parallel elements! Uh oh!

 Aggregate  
(cost=222468.56..222468.57 rows=1 width=8) 
(actual time=13830.361..13830.362 rows=1 loops=1)
   ->  Nested Loop  
   (cost=0.28..169725.95 rows=21097041 width=0) 
   (actual time=0.703..13815.008 rows=69534 loops=1)
         ->  Seq Scan on pd  
         (cost=0.00..14271.34 rows=69534 width=2311) 
         (actual time=0.086..90.498 rows=69534 loops=1)
         ->  Index Scan using pts_gix on pts  
         (cost=0.28..2.22 rows=2 width=32) 
         (actual time=0.146..0.189 rows=1 loops=69534)
               Index Cond: (pd.geom && geom)
               Filter: _st_intersects(pd.geom, geom)
               Rows Removed by Filter: 2
 Planning time: 6.348 ms
 Execution time: 13843.946 ms

The plan does involve a nested loop, so there should be an opportunity for parallel join to work magic. Unfortunately no variation of the query or the parallel configuration variables, or the function costs will change the situation: the query refuses to parallelize!

SET parallel_tuple_cost=0.001;
SET force_parallel_mode=on;
SET parallel_setup_cost=1;

The ST_Intersects() function is actually a SQL wrapper on top of the && operator and the _ST_Intersects() function, but unwrapping it and using the components directly also has no effect.

 SELECT Count(*) 
  FROM pd 
  JOIN pts 
  ON pd.geom && pts.geom
  AND _ST_Intersects(pd.geom, pts.geom);

The only variant I could get to parallelize omitted the && index operator.

 SELECT *        
  FROM pd 
  JOIN pts 
  ON _ST_Intersects(pd.geom, pts.geom);

Unfortunately without the index operator the query is so inefficient it doesn’t matter that it’s being run in parallel, it will take days to run to completion.

 Gather  
 (cost=1000.00..721919734.88 rows=1611658891 width=2552)
   Number of Workers: 2
   ->  Nested Loop  
   (cost=0.00..576869434.69 rows=1611658891 width=2552)
         Join Filter: _st_intersects(pd.geom, pts.geom)
         ->  Parallel Seq Scan on pd  
         (cost=0.00..13865.73 rows=28972 width=2512)
         ->  Seq Scan on pts  
         (cost=0.00..1275.34 rows=69534 width=40)

So, thus far, parallel query seems to be a bit of a damp squib for PostGIS, though I hope with some help from PostgreSQL core we can figure out where the problem lies.

UPDATE: Marking the geometry_overlaps function which is bound to the && operator as PARALLEL SAFE allows PostgreSQL to generate parallel join plans when the index is in place.
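For anyone trying this on a 9.6 install, one way to apply that marking (together with the function costs discussed above) is with ALTER FUNCTION; the cost values below are just the ones used in this post, adjust to taste:

ALTER FUNCTION geometry_overlaps(geometry, geometry) COST 1000 PARALLEL SAFE;
ALTER FUNCTION _st_intersects(geometry, geometry) COST 1000 PARALLEL SAFE;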


While it is tempting to think “yay, parallelism! all my queries will run $ncores times faster!” in fact parallelism still only applies in a limited number of cases:

  • When there is a sequence scan large (costly) enough to be worth parallelizing.
  • When there is an aggregate large (costly) enough to be worth parallelizing, and the aggregate function can actually parallelize the work effectively.
  • (Theoretically) when there is a (nested loop) join large (costly) enough to be worth parallelizing.

Additionally there is still work to be done on PostGIS for optimal use of the parallel features we have available:

  • Every function is going to need a cost, and those costs may have to be quite high to signal to the planner that we are not garden variety computations.
  • Differences in COST adjustments for different modes need to be explored: why was a 10000 cost needed to kick the aggregation into action, while a 100 cost sufficed for sequence scan?
  • Aggregation functions that currently back-load work to the final function may have to be re-thought to do more work in the transfer stage.
  • Whatever issue is preventing our joins from parallelizing needs to be tracked down.

All that being said, the potential is to see a large number of queries get $ncores faster, so this promises to be the most important core development we’ve seen since the extension framework arrived back in PostgreSQL 9.1.

March 26, 2016 08:00 PM

Boston GIS (Regina Obe, Leo Hsu)

FOSS4G NA 2016 Early Bird ending soon and come see us

FOSS4G NA 2016 in North Carolina runs May 2nd-5th (May 2nd is workshop day), and early bird registration will be ending this month. This means that after this month, ticket prices will go up by as much as $200 depending on what kind of pass you need. If you are a speaker you should register now so we have a good head-count. Hotel prices will go up after this coming week too, since the conference room block will expire, so all the more reason to register now.

On a related note, both Leo and I will be giving talks.

There will be other PostGIS talks and PostgreSQL day (May 3rd) talks for PostGIS and PostgreSQL users.

by Regina Obe (nospam@example.com) at March 26, 2016 07:26 PM

March 22, 2016

PostGIS Development

PostGIS 2.2.2 Released

The PostGIS development team is happy to release patch for PostGIS 2.2, the 2.2.2 release. As befits a patch release, the focus is on bugs and breakages.

Bug Fixes and Improvements

  • #3463, Fix crash on face-collapsing edge change
  • #3422, Improve ST_Split robustness on standard precision double systems (arm64, ppc64el, s390c, powerpc, …)
  • #3427, Update spatial_ref_sys to EPSG version 8.8
  • #3433, ST_ClusterIntersecting incorrect for MultiPoints
  • #3435, ST_AsX3D fix rendering of concave geometries
  • #3436, memory handling mistake in ptarrayclonedeep
  • #3437, ST_Intersects incorrect for MultiPoints
  • #3461, ST_GeomFromKML crashes Postgres when there are innerBoundaryIs and no outerBoundaryIs
  • #3429, upgrading to 2.3 or from 2.1 can cause loop/hang on some platforms
  • #3460, ST_ClusterWithin “Tolerance not defined” error after upgrade
  • #3490, Raster data restore issues, materialized views. Added scripts postgis_proc_set_search_path.sql, rtpostgis_proc_set_search_path.sql
  • #3426, failing POINT EMPTY tests on fun architectures

See the full list of changes in the news file and please report bugs that you find in the release.

Binary packages will appear in repositories over the coming weeks as packagers roll out builds.

View all closed tickets for 2.2.2.

by Paul Ramsey at March 22, 2016 12:00 AM

March 20, 2016

Stephen Mather

Taking Slices from LiDAR data: Part VI

I finally got PDAL properly compiled with Point Cloud Library (PCL) baked in. Word to the wise — CLANG is what the makers are using to compile. The PDAL crew were kind enough to revert the commit which broke GCC support, but why swim upstream? If you are compiling PDAL yourself, use CLANG. (Side note, the revert to support GCC was really helpful for ensuring we could embed PDAL into OpenDroneMap without any compiler changes for that project.)

With a compiled version of PDAL with the PCL dependencies built in, I can bypass using the docker instance. When I was spawning tens of threads of Docker and then killing them, recovery was a problem (it would often hose my docker install completely). I’m sure there’s some bug to report there, or perhaps spawning 40 docker threads is ill advised for some grander reason, but regardless, running PDAL outside a container has many benefits, including simpler code. If you recall our objectives with this script, we want to:

  • Calculate relative height of LiDAR data
  • Slice that data into bands of heights
  • Load the data into a PostgreSQL/PostGIS/pgPointCloud database.

The control script without docker becomes as follows:


# readlink gets us the full path to the file. This is necessary for docker
readlinker=`readlink -f $1`
# returns just the directory name
pathname=`dirname $readlinker`
# basename will strip off the directory name and the extension
name=`basename $1 .las`

# PDAL must be built with PCL.
# See http://www.pdal.io/tutorial/calculating-normalized-heights.html

pdal translate "$name".las "$name".bpf height --writers.bpf.output_dims="X,Y,Z,Intensity,ReturnNumber,NumberOfReturns,ScanDirectionFlag,EdgeOfFlightLine,Classification,ScanAngleRank,UserData,PointSourceId,HeightAboveGround"

# Now we split the lidar data into slices of heights, from 0-1.5 ft, etc.
# on up to 200 feet. We're working in the Midwest, so we don't anticipate
# trees much taller than ~190 feet
for START in 0:1.5 1.5:3 3:6 6:15 15:30 30:45 45:60 60:105 105:150 150:200
do
        # We'll use the height classes to name our output files and tablename.
        # A little cleanup is necessary, so we're removing the colon ":".
        nameend=`echo $START | sed s/:/-/g`

        # Name our output
        bpfname=$name"_heightasz_"$nameend.bpf

        # Implement the height range filter
        pdal translate $name.bpf $bpfname -f range --filters.range.limits="HeightAboveGround[$START)"

        # Now we put our data in the PostgreSQL database.
        pdal pipeline -i pipeline.xml --writers.pgpointcloud.table='pa_layer_'$nameend --readers.bpf.filename=$bpfname --writers.pgpointcloud.overwrite='false'
done

We still require our pipeline xml in order to set our default options as follows:

<?xml version="1.0" encoding="utf-8"?>
<Pipeline version="1.0">
  <Writer type="writers.pgpointcloud">
    <Option name="connection">
      host='localhost' dbname='user' user='user' password='password'
    </Option>
    <Option name="table">54001640PAN_heightasz_0-1.5</Option>
    <Option name="compression">dimensional</Option>
    <Filter type="filters.chipper">
      <Option name="capacity">400</Option>
      <Reader type="readers.bpf">
        <Option name="filename">54001640PAN_heightasz_0-1.5.bpf</Option>
      </Reader>
    </Filter>
  </Writer>
</Pipeline>

And as before, we can use parallel to make this run a lot faster:

find . -name '*.las' | parallel -j20 ./pdal_processor.sh

For the record, I found out through testing that my underlying host only has 20 processors (though more cores), so there is no point in running more processes than that.

So, when point clouds get loaded, they’re broken up into “chips”, or collections of points. How many chips do we have so far?

user=# SELECT COUNT(*) FROM "pa_layer_0-1.5";
(1 row)

Now, how many rows is too many in a PostgreSQL database? Answer:

In other words, your typical state full of LiDAR (Pennsylvania or Ohio for example) is not too large to store, retrieve, and analyze. If you’re in California or Texas, or have super dense stuff that’s been flown recently, you will have to provide some structure in the form of partitioning your data into separate tables based on e.g. geography. You could also modify your “chipper” size in the XML file. I have used the default 400 points per patch (for about 25,765,414,000 points total), which is fine for my use case, as I then do not exceed 100 million rows once the points are chipped:

      <Option name="capacity">400</Option>

by smathermather at March 20, 2016 02:43 AM

March 17, 2016


OSGeo CS 2016 report : Point Clouds


The annual OSGeo Codesprint took place from the 23rd to the 26th of February in Paris, at the Mozilla Foundation's offices. Here is a summary of some of the achievements from this event by developers working on Point Cloud data.

Point Clouds

Point Cloud, as seen in the famous musical masterpiece, is a data type growing in popularity lately. The data often comes from LIDAR sensors, which these days generate 300k to 500k points per second. It can also come out of a Structure-from-Motion computation based on plain pictures.

Data volumes, as well as some specific aspects of this data type (e.g. no fixed schema), lead to new challenges in the software needed to handle Point Clouds. A growing number of people work on these challenges, especially in the OSGeo community.

This post summarizes some of the achievements in this field from the OSGeo Codesprint in Paris.

Greyhound & Entwine

Hobu's team was present at the codesprint, with Howard Butler, Connor Manning and Andrew Bell. They presented two technologies they have been working on over the last few months: Greyhound and Entwine.

Entwine is a Point Cloud data preparation library that indexes large amounts of data in distributed file systems. Connor Manning improved the performance of this new software. Greyhound is a Point Cloud streaming server. It relies on Entwine for data access and answers queries by streaming the corresponding points, given a location, a level of detail (LOD) and other parameters. Greyhound and Entwine focus on serving large amounts of data with high performance.

Greyhound was originally bundled with Plas.io, a WebGL client for visualizing point cloud data served by Greyhound. During the codesprint, Hobu's collaborators teamed up with Maarten van Meersbergen and Oscar Martinez Rubi and managed to plug Potree into Greyhound. This is a really interesting improvement, since Potree is currently the most advanced open source point cloud visualization tool. Furthermore, it is a step towards interoperability: with the Open Geospatial Consortium currently interested in Point Cloud formats and protocol standardization, the work achieved during the codesprint provides a pragmatic implementation on which to base standardization discussions.

There are now online demonstrations of the Entwine-Greyhound-Potree coupling:


The iTowns project is a 3D WebGL geospatial visualization solution, developed by the French national mapping agency (IGN), Oslandia and AtolCD. We are currently working on the future version of iTowns, improving the code base and adding new features.

During the codesprint, Quoc worked on integrating Potree inside iTowns for massive Point Cloud data visualization. Both projects are based on THREE.js, so the integration makes sense. He managed to get a working prototype displaying Point Clouds from a Potree data source (flat file structure) inside iTowns. This work currently lives in a dedicated branch, as it required some modifications on the Potree side. The next step is to propose a modularization refactoring to Potree so that integration into other frameworks can be done easily.

Our PhD student, Jeremy Gaillard, spent some time improving his building server, a piece of software that serves 3D objects, namely buildings, from a PostGIS database to a web client. Jeremy enhanced the indexing mechanism, which lets the client load data according to priorities given by a cost function. This makes it possible, for example, to visualize specific landmarks first, or large buildings, or any other characteristic used as a priority.

And more

You will find more details in Howard Butler's post :

by Vincent Picavet at March 17, 2016 11:00 AM

PostGIS Development

Vanguard Appraisals

Vanguard Appraisals is new to the GIS world. In fact, we aren’t really in the GIS world; we just kind of brush up against it. We do mass property appraisal for entire county and city jurisdictions, and we develop software to collect, price and maintain values. We also host assessment data online so that homeowners can search and find property information much more easily from the comfort of their own homes. Our software and websites are used in 7 states (IA, IL, MN, MO, NE, ND, SD).

Continue Reading by clicking title hyperlink ..

by Andy Colson at March 17, 2016 12:00 AM

PostGIS Development

Clever Maps

Clever°Maps is a three-year-old startup based in the Czech Republic. We create web apps for four distinct market segments:

  1. business/location intelligence - helping companies to make decisions based on data, not on feelings
  2. farming - simplifying agenda and everyday work
  3. road infrastructure administration - settlement of land property rights, treaty evidence, speeding up the whole administrative process
  4. assets administration - treaty management, land purchases

Continue Reading by clicking title hyperlink ..

by Michal Zimmermann at March 17, 2016 12:00 AM

March 16, 2016


OSGeo CS 2016 report : PostGIS


The annual OSGeo Codesprint took place from the 23rd to the 26th of February in Paris, at the Mozilla Foundation's offices. Here is a summary of some of the achievements from this event by the PostGIS team.


A big team of PostGIS developers worked together during the sprint on numerous issues.

New functions and dev guide

Remi Cura from IGN had prepared a full list of improvements to existing PostGIS functions and new features to develop. Together with Marc Ducobu (Champs Libres), they went all the way from simple PostGIS users to PostGIS developers. They dived into the code and managed to write clean patches for functions and documentation.

They documented their work and created a step-by-step guide for new PostGIS contributors.

New clustering features

New features have been introduced for PostGIS 2.3, among them the clustering methods ST_ClusterKMeans() and ST_ClusterDBSCAN() by Paul Ramsey and Dan Baston. These are really nice window functions for PostGIS!

On a side note, the documentation for PostGIS window functions has been improved too (R. Obe).
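
As a rough illustration of how these window functions are used, here is a minimal sketch; the "parcels" table and its columns are made up, and the DBSCAN distance and minpoints values are arbitrary:

-- Assign each geometry to one of 5 k-means clusters, and to a DBSCAN cluster
-- using a search radius of 100 (map units) and a minimum of 5 neighbours.
SELECT id,
       ST_ClusterKMeans(geom, 5) OVER ()      AS kmeans_cluster,
       ST_ClusterDBSCAN(geom, 100, 5) OVER () AS dbscan_cluster
FROM parcels;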


Regina Obe managed to travel to Paris, and she put effort into analyzing why the ODBC performance of the ogr_fdw PostGIS foreign data wrapper was not as good as expected. This should lead to performance improvements later on. She also worked on problems encountered when restoring PostGIS data (especially rasters), related to how the search_path is handled. Some experimentation with EVENT TRIGGERS remains to be done, but the issue is on a good path to being solved.

3D in PostGIS : SFCGAL

Our team at Oslandia has been working on SFCGAL, the 3D library behind PostGIS 3D features. Vincent Mora and Hugo Mercier teamed up with Mickael Borne (IGN) and Sébastien Loriot (GeometryFactory) to break things and rebuild them, better. They focused mainly on performance:

  • different computation and precision kernels in CGAL, to use them efficiently and be much faster
  • set a flag for valid geometries, so as not to re-test for validity in every operation
  • lower serialization overhead when passing data between PostGIS, CGAL and SFCGAL

This effort led to significant improvements in speed. Preliminary tests on 3D intersection went from 15 seconds down to 100 milliseconds, which is an impressive result.

A lot of code refactoring remains to be done to simplify SFCGAL and ease its use, and this work has also started.

New indexes for big data

Another significant contribution is the work on BRIN indexes for PostGIS. At PGConf Europe, we had already spoken with the team at Dalibo about the new index type in PostgreSQL 9.5: BRIN use cases and implementation in a GIS context. Some time before the OSGeo code sprint, we realized that Giuseppe Broccolo from 2ndQuadrant had started a prototype implementation, so we invited him to join. Giuseppe teamed up with Ronan Dunklau and Julien Rouhaud from Dalibo, and together they managed to produce a working implementation of this new type of index for PostGIS.

Having the opportunity for PostgreSQL developers to meet in this geospatial environment was the ideal way to get things done efficiently.

PostGIS BRIN will be detailed after some code consolidation and benchmarking, but here is the idea. BRIN indexes are block range indexes: they summarize ranges of data blocks (a given number of pages) rather than individual rows, and you can set the number of pages per block range. This makes the indexes much faster to build and a lot smaller than classic indexes (GiST for PostGIS). Of course, the tradeoff is that they are slower to query, and since they summarize whole blocks, they lose their efficiency if the data is not clustered on disk.

Some of our use cases, with tables holding billions of points (using pgPointCloud), are a good match for BRIN indexes: the data is naturally clustered spatially (patches of points are loaded sequentially), and sometimes the data is too big for a table's index to fit entirely in memory.

  • E.g. we have a 3TB table, resulting in a 20GB GiST index.

A BRIN index can be a really good compromise here: preliminary tests on a medium-size dataset show a 5x degradation in query performance, while the index is 1000x smaller. And this tradeoff is adjustable according to your dataset size and hardware configuration.
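
To make the idea concrete, here is a minimal sketch of building and querying such an index with the PostGIS 2.3 BRIN support described above; the table and column names are made up, the pages_per_range value is only an illustration, and the query envelope/SRID are arbitrary:

-- A tiny, cheap-to-build index summarizing block ranges of a large table.
CREATE INDEX patches_geom_brin_idx
    ON patches
    USING BRIN (geom)
    WITH (pages_per_range = 128);

-- Bounding-box searches can then skip whole block ranges whose summary
-- does not intersect the query window.
SELECT Count(*)
FROM patches
WHERE geom && ST_MakeEnvelope(650000, 6860000, 651000, 6861000, 2154);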

And more

Other topics that were worked on (mainly by Paul, see his blog post):

  • Expanded object header, to avoid serialization / deserialization steps
  • A nasty upgrade bug ( 2.2 blocker )

You will find more details in the following reports :

Thanks to the PostGIS team for such hard work and for their reports !

Figures credits : Openstreetmap's Soup, OpenScienceMap, Wikipedia

by Vincent Picavet at March 16, 2016 11:00 AM

March 15, 2016


OSGeo CS 2016 report : some achievements


The annual OSGeo Codesprint took place from the 23rd to the 26th of February in Paris, at the Mozilla Foundation's offices. Here is a summary of some of the achievements from this event.

OSGeo projects at the codesprint

A lot of OSGeo projects were represented at the codesprint. Each had set up its plans with tasks to work on, goals to achieve and issues to discuss. Have a look at the projects and their initial goals:

A big part of the plans has been achieved, and most participants were happy with their results.

Some reports

Some projects have already reported on what they achieved during the codesprint. Here are some of them; the next articles on this blog will give more insight into PostGIS as well as Point Cloud technologies.

Orfeo ToolBox

Victor Poughon explains the work done on OTB during the sprint. They worked on QGIS / OTB integration, so OTB algorithms are now available through QGIS Processing. The OSGeo Live OTB documentation has been improved, OGR integration in OTB has been analyzed, and bash completion for GDAL and OTB has been added. Some progress on ITK pipeline optimization has been made too.

You can read the full report here :


Bruno Friedmann, an openSUSE Linux packager, was present and worked with Angelos Kalkas on improving geospatial packages in openSUSE. GDAL, PostGIS, pgRouting, pgPointCloud, PDAL and MapServer all received updates and improvements.

Read more on this report :


Even Rouault, GDAL maintainer, worked on various features and bug fixes. Support for the M dimension of geometries was one of the issues, and not an easy one to solve.

Rob Emmanuele put effort into a better error reporting system for some GDAL drivers. The new design should be in GDAL 2.1.

It is not a joke: Yann Chemin explored how to use GDAL for other planets and satellites. That is spatial geospatial development...

Among other improvements to GDAL: the OGR ODBC driver (Regina Obe) and bash completion for CLI tools (Guillaume Pasero).

You can read Even's complete report here :


The MapServer project released new versions of MapServer with bugfixes. The 7.0.1 and 6.4.3 maintenance versions were released on the 25th of February. You can see the changelog for 7.0.1 or for 6.4.3. A new version of MapCache, 1.4.1, has also been released, and all three components are available for download.

The MapServer team also closed 430 tickets during the codesprint, flooding 80 happy watchers' inboxes in the process thanks to an automated script from Stephan Meißl.

Daniel Morissette added a service providers page to the MapServer website. Mike fixed a few bugs and added some Docker container information for MapServer.

A new demo and tutorial have been set up for MapServer. This is work from Lars, Seth and Julien and is available here (RFC 115):


Proj.4 now has a new website, thanks to the work of Howard Butler!

The next article will present what the PostGIS team achieved...

by Vincent Picavet at March 15, 2016 11:00 AM

March 14, 2016


OSGeo CodeSprint 2016 in Paris

C Tribe OSGeo Code Sprint, 2016 Paris edition

At the end of February, Mozilla Paris hosted the annual OSGeo "C tribe" Code Sprint event. Around 50 developers from several Open Source GIS projects gathered to work together.

The "C tribe" code sprint was originally born in North America, almost a decade ago. It took place in Europe for the first time in 2014 (Vienna edition). Vienna was a successful event, and an incitation to alternate the location between the old and the new continent. This allows for more local developers to attend, and new places to visit. This is why we landed in Paris.

This event can be seen as a collection of several smaller sprints (several projects involved), each with its own agenda :

But at the same time, each attendee benefits from all the resources gathered in one place. This really makes a difference on some issues...

During a code sprint, coding obviously takes a central place, but discussions about project direction and software design are just as important. And most important of all are the informal exchanges between participants.

You can have a look at Vincent's pictures on Flickr.

A special event

This event had its own special taste and color, thanks to some exceptional aspects:

  • The venue itself really helped! Located in the very center of Paris, the main room is magnificent; a perfect mix between old-fashioned style à la Versailles (the place used to be an embassy) and the latest high-tech facilities. From a symbolic point of view, the Mozilla Paris offices could be seen as an open source temple, borrowing the codes of the old aristocratic powers. It also conveys the message that open source is no longer written only in underground places.
  • Each project was encouraged to define its own agenda before the sprint (as linked above). This is a good bootstrap for the first day.
  • I also wanted to broaden the attendee list to people not used to attending OSGeo code sprints, so we reached out to other communities and organizations and invited them to join. All of them apparently enjoyed the experience a lot. We can be proud that the OSGeo community is able to welcome new energy, faces and skills.
  • In a similar way, during the second evening we opened the doors to external people curious to see what a code sprint looks like. Participants had the opportunity to briefly present what they were working on in front of the whole audience. Getting a chance to share the knowledge, and the buffet that followed, made for a really nice moment.
  • Strong attention was paid to the locations for the evening events, as well as to the quality of food and drinks. This may be the French way of life, but it really improves the social side of the event.

It apparently paid off, as the sprinters seem enthusiastic:

Yes, it was worth it !

Gathering around 50 developers for almost a week represents more or less one person-year of effort, plus extra resources (travel fees and more). Was it worth it? As an organizer, I obviously think it was!

A code sprint really changes the way developers interact with each other. Tough discussions are much easier face to face than on IRC or a mailing list, and decision-making is greatly improved. Spending a week close to the others changes the way you interact afterwards and eases communication. Moreover, having highly skilled people in the same room makes problem solving simple, as someone almost always has the right answer to any tricky issue.

This sprint helped create many bridges, be they between individuals, organizations or projects. Most of them will grow into new interactions in the coming weeks and months. The event was also a good way to inject new energy into the community and to welcome newcomers into this growing group of people, for the good of open source geospatial technologies.

Therefore, a lot of "Mercis" to all who contributed to this event :

by Olivier Courtin at March 14, 2016 11:00 AM

March 13, 2016

PostGIS Development

Selecting only pixels of particular range of values with ST_Reclass

This raster question comes up quite a bit on the PostGIS mailing lists and Stack Overflow, and the best answer often involves the often-forgotten ST_Reclass function, which has existed since PostGIS 2.0.
People often resort to the much slower though more flexible ST_MapAlgebra, or to dumping out their rasters as pixel-valued polygons that they then filter with WHERE val > 90, when ST_Reclass does the same thing orders of magnitude faster.
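
The gist of the trick, as a hedged sketch (the table and column names are made up, and an 8-bit unsigned band is assumed), is to reclassify everything at or below the threshold to 0, flag 0 as nodata, and let values above the threshold pass through unchanged:

-- Keep only band 1 pixels with values above 90: everything in [0-90] becomes 0,
-- values in (90-255] map to themselves, and 0 is marked as the nodata value.
SELECT ST_Reclass(rast, 1, '[0-90]:0, (90-255]:(90-255]', '8BUI', 0) AS rast
FROM my_rasters;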

Continue Reading by clicking title hyperlink ..

by Regina Obe at March 13, 2016 12:00 AM

March 12, 2016

Paul Ramsey

libpostal for PostgreSQL

Dealing with addresses is a common problem in information systems: people live and work in buildings which are addressed using “standard” postal systems. The trouble is, the postal address systems and the way people use them aren’t really all that standard.

libpostal for PostgreSQL

Postal addressing systems are just standard enough that your average programmer can whip up a script to handle 80% of the cases correctly. Or a good programmer can handle 90%. Which just leaves all the rest of the cases. And also all the cases from countries where the programmer doesn’t live.

The classic resource on postal addressing is called Falsehoods programmers believe about addresses and includes such gems as:

An address will start with, or at least include, a building number.

Counterexample: Royal Opera House, Covent Garden, London, WC2E 9DD, United Kingdom.

No buildings are numbered zero

Counterexample: 0 Egmont Road, Middlesbrough, TS4 2HT

A street name won’t include a number

Counterexample: 8 Seven Gardens Burgh, WOODBRIDGE, IP13 6SU (pointed out by Raphael Mankin)

Most solutions to address parsing and normalization have used rules, hand-coded by programmers. These solutions can take years to write, as special cases are handled as they are uncovered, and are generally restricted in the language/country domains they cover.

There’s now an open source, empirical approach to address parsing and normalization: libpostal.

Libpostal is built using machine learning techniques on top of OpenStreetMap input data to produce parsed and normalized addresses from arbitrary input strings. It has bindings for lots of languages: Perl, PHP, Python, Ruby and more.

And now, it also has a binding for PostgreSQL: pgsql-postal.

You can do the same things with the PostgreSQL binding as you can with the other languages: convert raw strings into normalized or parsed addresses. The normalization function returns an array of possible normalized forms:

SELECT unnest(
  postal_normalize('412 first ave, victoria, bc'));

 412 1st avenue victoria british columbia
 412 1st avenue victoria bc
(2 rows)

The parsing function returns a jsonb object holding the various parse components:

SELECT postal_parse('412 first ave, victoria, bc');
 {"city": "victoria", 
  "road": "first ave", 
  "state": "bc", 
  "house_number": "412"}
(1 row)

The core library is very fast once it has been initialized, and the binding has been shown to be acceptably fast, despite some unfortunate implementation tradeoffs.

Thanks to Darrell Fuhriman for motivating this work!

March 12, 2016 11:00 AM

March 08, 2016

Postgres OnLine Journal (Leo Hsu, Regina Obe)

Paris OSGeo Code Sprint 2016 Highlights

Leo and I attended the Paris OSGeo Code Sprint at the Mozilla Foundation, put together by Oslandia and funded by several companies. It was a great event, with quite a bit of PostGIS-related hacking by many new faces. We have detailed some of the more specific PostGIS hacking highlights at BostonGIS: OSGeo Code Sprint 2016 highlights. Giuseppe Broccolo of 2nd Quadrant has already written about BRIN for PostGIS: my story at the Code Sprint 2016 in Paris.

Continue reading "Paris OSGeo Code Sprint 2016 Highlights"

by Leo Hsu and Regina Obe (nospam@example.com) at March 08, 2016 01:01 AM

Boston GIS (Regina Obe, Leo Hsu)

Paris OSGeo Code Sprint 2016 PostGIS Highlights

We'd like to give a big thanks to Olivier Courtin and Oslandia for organizing this year's OSGeo Code Sprint in Paris. It was quite memorable and I walked away feeling energized.

Leo and I attended the Paris OSGeo Code Sprint for 1.5 days. I'm really glad we were able to attend at least part of it, though it would have been nicer to attend all of it. While we didn't get too much done while there, we did have some interesting conversations and learned about what others were doing. I walked out with a TO DO list, some of which I'm happy to say I've now accomplished.

While we were there, Leo spent time cleaning up my Mingw64 compile scripts and starting to test PostGIS against PostgreSQL 9.6, in preparation for parallelization testing of the PostGIS 2.3 features specifically targeted at PostgreSQL 9.6.

Continue reading "Paris OSGeo Code Sprint 2016 PostGIS Highlights"

by Regina Obe (nospam@example.com) at March 08, 2016 01:01 AM

March 07, 2016

Paul Ramsey

Paris Code Sprint, PostGIS Recap

At the best of times, I find it hard to generate a lot of sympathy for my work-from-home lifestyle as an international coder-of-mystery. However, the last few weeks have been especially difficult, as I try to explain my week-long business trip to Paris, France to participate in an annual OSGeo Code Sprint.

Paris Code Sprint

Yes, really, I “had” to go to Paris for my work. Please, stop sobbing. Oh, that was light jealous retching? Sorry about that.

Anyhow, my (lovely, wonderful, superterrific) employer, CartoDB was an event sponsor, and sent me and my co-worker Paul Norman to the event, which we attended with about 40 other hackers on PDAL, GDAL, PostGIS, MapServer, QGIS, Proj4, PgPointCloud etc.

Paul Norman got set up to do PostGIS development and crunched through a number of feature enhancements. The feature enhancement ideas were courtesy of Remi Cura, who brought in some great power-user ideas for making the functions more useful. As developers, it is frequently hard to distinguish between features that are interesting to us and features that are useful to others, so having feedback from folks like Remi is invaluable.

The Oslandia team was there in force, naturally, as they were the organizers. Because they work a lot in the 3D/CGAL space, they were interested in making CGAL faster, which meant they were interested in some “expanded object header” experiments I did last month. Basically, the EOH code allows you to return an unserialized reference to a geometry on return from a function, instead of a flat serialization, so that calls that look like ST_Function(ST_Function(ST_Function())) don’t end up with a chain of three serialize/deserialize steps in them. When the deserialize step is expensive (as it is for their 3D objects) the benefit of this approach is actually measurable. For most other cases it’s not.

(The exception is in things like mutators called from within PL/PgSQL, for example doing array appends or insertions in a tight loop. Tom Lane wrote up this enhancement of PgSQL with examples for array manipulation and did find big improvements for that narrow use case. So we could make things like ST_SetPoint() called within PL/PgSQL much faster with this approach, but for most other operations the overhead of allocating our objects probably isn’t high enough to make it worthwhile.)
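
For the curious, the narrow case looks something like the following contrived sketch (not code from the actual experiments): a mutator hammered inside a PL/PgSQL loop, where every call currently pays a full deserialize/serialize round trip.

DO $$
DECLARE
    -- Build a 10,000-point line up front (the alias "s" avoids name clashes).
    g geometry := ST_MakeLine(ARRAY(
        SELECT ST_MakePoint(s, s) FROM generate_series(1, 10000) AS s));
BEGIN
    FOR i IN 1 .. 10000 LOOP
        -- Each ST_SetPoint call currently deserializes and reserializes the whole
        -- line; with expanded object headers it could stay unflattened across calls.
        g := ST_SetPoint(g, i - 1, ST_MakePoint(i, i * 2));
    END LOOP;
END
$$;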

There was also a team from Dalibo and 2nd Quadrant. They worked on a binding for geometry to the BRIN indexes (9.5+). I was pretty sceptical, since BRIN indexes require useful ordering, and spatial data is not necessarily well ordered, unlike something like time, for example. However, they got a prototype working, and it showed the usual good BRIN properties: indexes were extremely small and extremely cheap to build. For narrow range queries, they were about 5x slower than GIST-rtree; however, the differences were on the order of 25 ms vs 5 ms, so not completely unusable. They managed this result with presorted data, and with some data in its “natural” order, which worked because the “natural” order of GIS data is often fairly spatially autocorrelated.

I personally thought I would work on merging the back-log of GitHub pull-requests that have built up on the PostGIS git mirror, and did manage to merge several, both new ones from Remi’s group and some old ones. I merged in my ST_ClusterKMeans() clustering function, and Dan Baston merged in his ST_ClusterDBSCAN() one, so PostGIS 2.3 will have a couple of new clustering implementations.

However, in the end I spent probably 70% of my time on a blocker in 2.2, which was related to upgrade. Because the bug manifests during upgrade, when there are two copies of the postgis.so library floating in memory, and because it only showed up on particular Ubuntu platforms, it was hard to debug: but in the end we found the problem and put in a fix, so we are once again able to do upgrades on all platforms.

The other projects also got lots done, and there are more write-ups at the event feedback page. Thanks to Olivier Courtin from Oslandia for taking on the heavy weight of organizing such an amazing event!

March 07, 2016 08:00 PM

February 17, 2016


iTowns 1.0 release : 3D web visualization framework

Oslandia is pleased to announce the first release of iTowns, a new 3D geospatial data visualization web framework developed by the iTowns project, including people from the French IGN, Oslandia and AtolCD.

Contact for this project at Oslandia : infos+itowns@oslandia.com

iTowns is a web framework written in Javascript/WebGL for the visualisation of 3D geographic data, allowing precise measurements in 3D. Its first purpose is the visualisation of street view images and terrestrial lidar point clouds, though it now supports many more data types.

Version 1.0 is the first open source release of iTowns. It is the core of the original iTowns application developed over the last few years at the French IGN MATIS research laboratory. Oslandia worked closely with IGN during the last months to prepare this move from an in-house application to an open source project.

This version provides the following features:

  • Load and project Oriented Images on mesh (cube or city model)
  • Load and display Panoramic Images
  • Load Depth Panoramic Image and render in 3D
  • Load 2D multipolygons with heights (building footprints) from WFS or a local file and triangulate them to create building boxes. This mesh can then be used for texture projection.
  • Navigate through Image Data using click and go functions
  • Load and display Point Cloud from PLY files.
  • Load and display 3D textured models (B3D, 3DS).
  • Simple API interface.

You will find all information and a demo on iTowns website :


French IGN provided a sample dataset, which can be viewed on the iTowns demo repository :


You can download iTowns Version 1.0 here :



iTowns is an open source organization delivering software components for 3D geospatial data visualization on the web. The iTowns project is open to contributions from anyone in the community. You can fork the project on GitHub and send pull requests, and we will be pleased to integrate them. Do not hesitate to consult the iTowns wiki for general information on how to contribute.


On the 3rd and 4th of February 2016, the iTowns project held a codesprint hosted by IGNFab in Paris. It gathered around 15 developers and made it possible to finalize the 1.0 release, work on iTowns project organization, documentation and demos, and design and test new features for the next generation of the iTowns framework. A PSC has been formed, with Alexandre Devaux (IGN) as chair.

This collaboration has been fruitful, and is a good kick-off for all developers willing to work together on 3D geospatial data visualization.


iTowns is an original work from the French IGN MATIS research laboratory. It has been funded through various research programs involving the French National Research Agency, Cap Digital, UPMC, Mines ParisTech, CNRS and LCPC.

iTowns is currently maintained by IGN ( http://www.ign.fr ) and Oslandia ( http://www.oslandia.com )

Should you wish to have more information or need support, please contact infos+itowns@oslandia.com

by Vincent Picavet at February 17, 2016 10:30 AM