Welcome to Planet PostGIS

April 09, 2014

Boston GIS (Regina Obe, Leo Hsu)

MySQL 5.7 GIS building on Boost Geometry

Interesting times ahead. The upcoming MySQL 5.7 promises to provide great enhancements in the spatial space. They've scrapped much of the existing GIS plumbing seen in prior versions such as 5.6 and are replacing much of it with Boost Geometry. In addition, R-Tree indexes will be available for InnoDB. Details: Why Boost.Geometry in MySQL? and Making use of Boost Geometry in MySQL. Along the roadmap (not for 5.7) they plan to have geography support as well as projections, 3D, and the various output formats (e.g. GeoJSON) currently available in PostGIS.

Just when I thought the database spatial domain was getting boring. It will be interesting to see how the new MySQL Boost Geometry support stacks up against PostGIS's GEOS and 3D CGAL support. Might be time for a PostGIS/MySQL shoot-out soon.

by Regina Obe (nospam@example.com) at April 09, 2014 07:44 PM

April 01, 2014

Boundless Geo

Code Sprinting in Vienna

Over 50 people came together in Vienna last week for five days of working on open source geospatial software. This code sprint merged the annual “C tribe” sprint — PostGIS, MapServer, GDAL, PDAL, and others — with the regular sprints held by the QGIS and GRASS communities. As a result, the attendance was at a record level, and the opportunities for collaboration much greater.

PostGIS 2.1.2 & 2.0.5 Released

I attended to work on PostGIS development, as usual, and achieved my modest goal: closing all the remaining tickets and getting 2.1.2 and 2.0.5 released. These are both bug fix releases, so no new functionality is included, but there are some important issues in the 2.1 branch that were resolved with the 2.1.2 release.

More QGIS in Our Future

Wearing my OpenGeo Suite product manager hat, the sprint was an excellent opportunity to sit down and talk with leaders in the QGIS community about our plans for bundling QGIS with OpenGeo Suite. When OpenGeo Suite 4.1 is released later this year, it will include builds of QGIS for all our supported platforms (Windows, OSX, Linux). It will also have some specialized plugins for managing map composition and versioned editing within QGIS. Our own Victor Olaya is a QGIS community member and a developer of the Processing framework (formerly Sextante) in QGIS. We’re looking forward to being a part of the QGIS community, in the same way that we are a part of the OpenLayers, GeoServer and PostGIS communities.

Building Community

Five days is a long time to hunker down in a room slinging code, and there were a lot of valuable side meetings: developers showed off some of their latest work (I saw some great 3D work from PDAL’s Howard Butler and Oslandia’s Olivier Courtin); the QGIS community worked through important design decisions; and Tim Sutton and Anita Graser collected lots of interviews for the QGIS podcast.

Code Sprint Beer Mat

The “beer track” (not run concurrently with the other tracks) was also great, thanks to Stephan Meißl’s organizing: some onsite events, a walk and dinner in downtown Vienna, and a dinner in the local neighborhood. Most importantly, opportunities to get to know developers from other communities and bring the whole open source geo community closer together.

Next year, the sprint will once again combine as many communities as possible (hopefully adding some!), this time in a North American venue (likely Philadelphia). I’ll certainly be in attendance. If you write open source geo software, I hope you will too.

 

The post Code Sprinting in Vienna appeared first on Boundless.

by Paul Ramsey at April 01, 2014 01:44 PM

March 31, 2014

PostGIS Development

PostGIS 2.1.2 Released

The 2.1.2 release of PostGIS is now available.

The PostGIS development team is happy to release a patch for PostGIS 2.1, the 2.1.2 release. As befits a patch release, the focus is on bugs and breakages.

Best served with a bottle of GEOS 3.4.2.

http://download.osgeo.org/postgis/source/postgis-2.1.2.tar.gz

Continue Reading by clicking title hyperlink ..

by Paul Ramsey at March 31, 2014 12:00 AM

PostGIS Development

PostGIS 2.0.5 Released

The 2.0.5 release of PostGIS is now available.

The PostGIS development team is happy to release a patch for PostGIS 2.0, the 2.0.5 release. As befits a patch release, the focus is on bugs and breakages.

Best served with a bottle of GEOS 3.3.9.

http://download.osgeo.org/postgis/source/postgis-2.0.5.tar.gz

Doc downloads: html doc download | pdf doc download

Continue Reading by clicking title hyperlink ..

by Paul Ramsey at March 31, 2014 12:00 AM

March 26, 2014

Postgres OnLine Journal (Leo Hsu, Regina Obe)

An almost idiot's guide to install PostgreSQL 9.3, PostGIS 2.1 and pgRouting with Yum

In this exercise, we'll go through installing PostgreSQL 9.3 on a CentOS 6 64-bit box. We'll cover upgrading in a later article. For the rest of this article, we'll go over configuring yum to use the PostgreSQL PGDG Yum repository found at http://yum.postgresql.org, which has the latest and greatest of 9.3. It's been a while since we wrote step-by-step instructions for installing with Yum.

Note: The PostGIS 2.1.2 release is imminent, so you might want to wait until we release and Yum is updated before you install/upgrade.

Continue reading "An almost idiot's guide to install PostgreSQL 9.3, PostGIS 2.1 and pgRouting with Yum"

by Leo Hsu and Regina Obe (nospam@example.com) at March 26, 2014 05:38 AM

Stephen Mather

Packt Sale — 2 books for the price of one

Now I feel like a salesman. I feel obliged to mention the Packt buy one get one free sale (ends tomorrow):

http://www.packtpub.com/?utm_source=referral&utm_medium=marketingPR&utm_campaign=2000thTitle

If you want a postgresql title to go with the PostGIS Cookbook (just sayin’):

http://www.packtpub.com/search?keys=postgresql&sort=0&types=0&forthcoming=1&available=1&count=20&op=Go

Ok. Enough. Back to fun reading:

A map of Mount Pleasant

 


by smathermather at March 26, 2014 03:56 AM

March 21, 2014

Postgres OnLine Journal (Leo Hsu, Regina Obe)

New PostgreSQL releases from Packt Publishing

This past year has seen a fair number of PostgreSQL and PostGIS book releases from Packt Publishing. Packt is currently running a special e-book offer: buy any one of their 2000 titles and get one free. Visit http://bit.ly/1j26nPN to redeem the offer. Offer ends: 26th-Mar-2014.

Recent titles you'll want to get as a PostgreSQL / PostGIS user

  • PostGIS Cookbook Published Jan 2014. Though not directly related to PostGIS - Learning QGIS Cookbook Pub September 2013 is a fine companion to any PostGIS book.
  • PostgreSQL Server Programming Published Jun 2013
  • PostgreSQL Replication Published August 2013

Continue reading "New PostgreSQL releases from Packt Publishing"

by Leo Hsu and Regina Obe (nospam@example.com) at March 21, 2014 06:05 PM

March 14, 2014

PostGIS Development

Getting intersections the faster way

Doing an ST_Intersection is much slower than relation checks such as ST_Intersects, ST_CoveredBy, and ST_Within. In many situations you know the intersection of 2 geometries without actually computing an intersection. In these cases, you can skip the costly ST_Intersection call. Cases like this:

  1. geometry a is covered by geometry b -> intersection is geometry a
  2. geometry b is covered by geometry a -> intersection is geometry b
  3. geometry a does not intersect geometry b -> intersection is empty geometry

This kind of question comes up a lot, as discussed in the stackexchange question Acquiring ArcGIS speed in PostGIS.
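As a rough sketch of how those three shortcuts can be combined in a single query (the table and column names below are placeholders, not from the original post), a CASE expression only falls back to ST_Intersection when neither geometry covers the other, while the ST_Intersects join filter takes care of the non-intersecting pairs:

SELECT a.id AS a_id, b.id AS b_id,
       CASE
         WHEN ST_CoveredBy(a.geom, b.geom) THEN a.geom    -- case 1: intersection is a
         WHEN ST_CoveredBy(b.geom, a.geom) THEN b.geom    -- case 2: intersection is b
         ELSE ST_Intersection(a.geom, b.geom)             -- only computed when really needed
       END AS inter_geom
  FROM parcels a
  JOIN zones b ON ST_Intersects(a.geom, b.geom);          -- case 3: non-intersecting pairs never reach the CASE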

Continue Reading by clicking title hyperlink ..

by Regina Obe at March 14, 2014 12:00 AM

March 13, 2014

Geo Perspectives (Justin Lewis)

Exploding Polygons Into Random Points by Density with PostGIS

Visualizing geo-data by density is a common need but often leaves us with yet another choropleth map. Sometimes there is a need to make the data more visually interesting or to capture multiple characteristics of that density in one layer. This can be the case with race and ethnicity data, where visualizing race and density simultaneously can be very powerful. The University of Virginia’s Racial Dot Map is an excellent example of this. In this post I will demonstrate one way to convert polygons into points based on a related attribute for density using a PostGIS function. In other words, if you have a polygon with an associated value for population, you could use this technique to generate points for each person in that polygon.

Goals:

  1. Generate points from polygons
  2. Ensure point locations are random.
  3. Ensure point locations fall within the boundary of the polygon they originate from.

Data for this example can be downloaded from the DRCOG Regional Data Catalog.

The PostGIS function that generates random point locations representing people from Census Tracts:

-- Function: public.pop_tract_explode()
-- DROP FUNCTION public.pop_tract_explode();

CREATE OR REPLACE FUNCTION public.pop_tract_explode()
  RETURNS text AS
$BODY$
DECLARE
    record    RECORD;

    gid INTEGER;
    geom_2232 GEOMETRY;
    geoid10 BIGINT;

    -- Variables for explode
    total INTEGER;

    category_tag TEXT;   
    ct integer;
    skipped INTEGER := 0;

    -- Variables for randomize steps
    i INTEGER;
    maxiter INTEGER := 1000;
    x0 DOUBLE PRECISION;
    dx DOUBLE PRECISION; 
    y0 DOUBLE PRECISION;
    dy DOUBLE PRECISION;
    xp DOUBLE PRECISION;
    yp DOUBLE PRECISION;
    poly_geom GEOMETRY;
    randPoint Geometry;

BEGIN       

    -- Creating a destination table with a matching data model to the input table and adding needed fields
    EXECUTE 'DROP TABLE IF EXISTS pop_tract_explode';
    EXECUTE 'CREATE TABLE pop_tract_explode AS 
             SELECT "gid", "geoid10", "pop2010" FROM public.census_tracts_2010 LIMIT 1;';
    EXECUTE 'TRUNCATE TABLE pop_tract_explode;';
    EXECUTE 'ALTER TABLE pop_tract_explode ADD COLUMN "category_tag" character varying(100);';
    EXECUTE 'ALTER TABLE pop_tract_explode ADD COLUMN rand_point GEOMETRY;';

    -- Loop through all records of the input table
    FOR record IN EXECUTE 'SELECT gid, geoid10, geom_2232, "pop2010" FROM public.census_tracts_2010;'
	LOOP	
		-- Set the polygon geometry variable for the current record
		poly_geom := record.geom_2232;

		-- Determine the bounds of polygon geometry
		x0 = ST_XMin(poly_geom);
		dx = (ST_XMax(poly_geom) - x0);
		y0 = ST_YMin(poly_geom);
		dy = (ST_YMax(poly_geom) - y0);	

		-- Set the variables for the current record
		gid := record.gid::INTEGER;
		geoid10 := record.geoid10::BIGINT;	
	 	total := record.pop2010::INTEGER;		

		-- Loop through the population total, creating one random point per person
		ct := 0;
		category_tag := 'total';
		WHILE ct < total  -- you can make each point represent a specific number of occurrences by dividing by that value (e.g. total/100 makes 1 pt = 100 people)
		LOOP
			RAISE NOTICE 'Count: %', ct;

			-- Only try to place the point within the polygon if not exceeding the maximum iteration value
			i := 0;		
			WHILE i < maxiter 
			LOOP
				i = i + 1;
				xp = x0 + dx * random();
				yp = y0 + dy * random();
				randPoint = ST_SetSRID( ST_MakePoint( xp, yp ), ST_SRID(poly_geom) );

				EXIT WHEN ST_Within( randPoint, poly_geom ); 
			END LOOP;

		 	IF i >= maxiter THEN
				RAISE NOTICE 'RandomPoint: number of iterations exceeded %', maxiter;
				skipped = skipped + 1;
			ELSE
			-- Insert a new record into the destination table for the random point 					
			INSERT INTO pop_tract_explode VALUES ( gid,geoid10,total,category_tag,randPoint );
			END IF;

			ct = ct + 1;
		END LOOP;
	END LOOP;

	-- Create spatial index on the new spatial column
	EXECUTE 'CREATE INDEX pop_tract_explode_gist ON pop_tract_explode USING gist (rand_point)';

    RETURN skipped;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;
ALTER FUNCTION public.pop_tract_explode()
  OWNER TO postgres;

 

Run the function with this simple command:

SELECT pop_tract_explode();

 

This function can be improved in many ways.  One major improvement would be to allow the function to accept inputs for input/output table names as well as density field names so the function can be used with other data sets.  This would also be a start toward allowing the generation of points for different categories of density data so you could, for example, visualize population density by race category.
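As a rough sketch of that idea (the function and parameter names here are hypothetical and the body is left as a stub), a more general signature could accept the table and column names and assemble its SQL dynamically with format() and EXECUTE:

CREATE OR REPLACE FUNCTION public.explode_polygons(
    input_table    regclass,   -- e.g. 'public.census_tracts_2010'
    output_table   text,       -- e.g. 'pop_tract_explode'
    geom_column    text,       -- e.g. 'geom_2232'
    density_column text        -- e.g. 'pop2010'
)
RETURNS integer AS
$BODY$
DECLARE
    skipped integer := 0;
BEGIN
    -- Build the CREATE TABLE / SELECT statements with format() and quote_ident(),
    -- run them with EXECUTE, and reuse the random point placement loop from the
    -- pop_tract_explode() function above.
    RETURN skipped;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;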

by jmapping at March 13, 2014 06:15 AM

March 06, 2014

Stephen Mather

PostGIS Cookbook(s)

This is starting to feel real…

PostGIS Cookbook(s).  2904 pages.

6 PostGIS Cookbooks

Page count for a single book can be calculated as in the following code:

WITH distinction AS (
    SELECT DISTINCT(the_text) AS pages FROM postgis_cookbook
)
SELECT COUNT(pages) AS "Count of Pages" FROM distinction;

The preceding query will result in the following:

Count of Pages
--------------
484

Feeling real

A special thanks to my family, my co-authors, all the contributors to PostGIS, and to Regina Obe for helping me through the hardest bit of code in my part of the book. Check out Regina and Leo’s book, PostGIS in Action. Their second edition is already on pre-order. I got to read a bit of it last year, and it promises to be a really epic edition. I still reference the 1st edition myself. I wish I had had it when I was starting in PostGIS… .


by smathermather at March 06, 2014 03:25 AM

February 27, 2014

Geo Perspectives (Justin Lewis)

Scripted Geo-Data Processing Techniques – Getting/Accessing Data

This post is a follow up to my last post where I stated that there are three basic steps to geo-data processing.  I want to dig into the first of those steps (getting/accessing data) so as to get a better idea of some of the many ways you can approach this task.  Many of these examples overlap with the third basic step of geo-data processing (outputting data) as it is common to first move data to a processing environment.  I will cover outputting data in a future post.  This is not an exhaustive list of all the ways you can achieve this goal but hopefully it will help you get started with these tools.

It is common for geo-data tools to come with a minimalist command line interface, also commonly called the terminal.  While the terminal can be intimidating for newbies, it offers a great amount of versatility that is difficult to match with a GUI. The terminal can also be handy for executing commands from a script.  This is not always the best approach to automating a task, but it can be useful in certain situations.  While the terminal is often very useful, there are also great tools that allow you to work with data at a finer level from within a script.  For this post I will focus on some powerful command line tools, psycopg2, and Python.

Note:  You can execute any terminal command with Python using the OS module that ships by default with Python.  Wherever I demonstrate a terminal command in this post I also include the corresponding Python code to execute the command from a script.  

import os

os.system('command')

 

Copy data from a text file to PostgreSQL (PostGIS)

It is often the case that data is provided as a text file.  There are many flavors of text file, but probably the most common is the CSV (comma separated values).  While there is no official standard for the CSV format, there are generally agreed upon conventions that a CSV commonly follows.  The most basic structure requires values to be separated (delimited) by commas.  All this aside, the PostgreSQL COPY command is probably the fastest command available for copying data from a text file to PostgreSQL.  The hardest part is ensuring your destination table is set up correctly to accept your input data (from the text file).  Before running a copy command like this you must pre-build a table with a structure matching the text file, meaning the same number of columns with appropriate data types.
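For example, if the CSV held an id, a name, and a population count, a matching destination table (the column names and types here are just illustrative) could be created first:

CREATE TABLE YOUR_DESTINATION_TABLE (
    id integer,            -- matches the first CSV column
    name varchar(100),     -- matches the second CSV column
    pop2010 integer        -- matches the third CSV column
);

With that in place, the COPY itself is a one-liner: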

COPY YOUR_DESTINATION_TABLE FROM '/PATH/TO/YOUR/CSV' WITH DELIMITER AS ',';

Wrap this in Python using the OS module:

import os

os.system("psql -U admin -d demo -c 'COPY YOUR_DESTINATION_TABLE FROM '/PATH/TO/YOUR/CSV' WITH DELIMITER AS ',';")

An arguably less ideal option is to use ogr2ogr with something like this:

ogr2ogr --config PG_LIST_ALL_TABLES YES -update -append -skipfailures -f PostgreSQL PG:"port=5432 host=YOUR_HOST dbname=YOUR_DATABASE user=YOUR_USER password=YOUR_PASS" /PATH/TO/YOUR/FILE.csv -nln YOUR_OUTPUT_TABLE_NAME

 

Get data from a PostgreSQL database table with Python

It is common to need to access data from a PostgreSQL database for further processing in a script.  Psycopg2 can be handy in this situation by enabling you to execute SQL statements and return the results to your Python environment.  This is a very basic example but it demonstrates the general technique.

import psycopg2

#DB connection properties
conn = psycopg2.connect(dbname = 'YOUR_DATABASE', host= 'YOUR_HOST', port= 5432, user = 'YOUR_USER',password= 'YOUR_PASS')
cur = conn.cursor()  ## open a cursor

sql = 'SELECT * FROM your_table_name'

cur.execute(sql)  ## executes the sql command
result = cur.fetchall()  ## returns the result

print result

cur.close()
conn.close()

 

Copy data from an ESRI Shapefile to PostgreSQL (PostGIS)

Geo-data is commonly shared in the ESRI Shapefile format.  While not an open standard, the Shapefile is well known, and it is common for geo-based tools to have at least some level of compatibility with the format.  This is true for programming tools as well.

shp2pgsql terminal command:

shp2pgsql /PATH/TO/YOUR/SHAPEFILE.shp | psql -U YOUR_USER -d YOUR_DATABASE

import os

os.system("shp2pgsql /PATH/TO/YOUR/SHAPEFILE.shp | psql -U YOUR_USER -d YOUR_DATABASE")

ogr2ogr terminal command:

ogr2ogr -overwrite -skipfailures -progress -f PostgreSQL PG:"dbname='YOUR_DATABASE' host='YOUR_HOST' port='5432' user='YOUR_USER' password='YOUR_PASS'" PATH/TO/county_2013.shp -nln county

import os

os.system("ogr2ogr -overwrite -skipfailures -progress -f PostgreSQL PG:"dbname='YOUR_DATABASE' host='YOUR_HOST' port='5432' user='YOUR_USER' password='YOUR_PASS'" PATH/TO/county_2013.shp -nln county")

 

Copy data from a PostgreSQL (PostGIS) database to another PostGIS database

You might find yourself in a situation where you need to pull data from one PostGIS database to another in order to isolate the data.  There are many ways you can solve this problem but I will only cover one for simplicity.

ogr2ogr -overwrite -f PostgreSQL --config PG_USE_COPY YES  PG:"dbname='DESTINATION_DB' host='DESTINATION_HOST' port='5432' user='USER' password='PASS'" PG:"dbname='ORIGIN_DB' host='ORIGIN_HOST' port='5432' user='USER' password='PASS'" NAME_OF_TABLE_IN_ORIGIN_DB

import os

os.system("ogr2ogr -overwrite -f PostgreSQL --config PG_USE_COPY YES  PG:"dbname='DESTINATION_DB' host='DESTINATION_HOST' port='5432' user='USER' password='PASS'" PG:"dbname='ORIGIN_DB' host='ORIGIN_HOST' port='5432' user='USER' password='PASS'" NAME_OF_TABLE_IN_ORIGIN_DB")

TIP: You can copy more than one table in a single command by adding additional table names, separated by spaces, to the end of the command.

 

Get data from a GeoJSON Web-Service

With the increase of open data catalogs and data APIs comes the growing need (or option) to consume data directly from a web service.  These services often offer data in the common JSON or GeoJSON format.  This can be useful for consuming directly in a web browser but can also be useful for scripting data collection efforts.  For this post I will only demonstrate how to GET JSON data with Python and not how to process it afterwards, as that is a topic a bit beyond the scope of this post.

import urllib2, json

url = "http://gis.drcog.org/REST/v1/ws_geo_identify.php?distance=50&format=json&fields=name&geotables=muni_2010&srid=2232&x=3138704.521018773&y=1699213.268416112"

data = json.load(urllib2.urlopen(url))

print data

 

Copy data from a Web Page (scraping)

Scraping web pages is a useful way to get data that is not readily available in a machine-readable form.  It is a fairly challenging task that involves processing HTML elements.  HTML provides the structure for a web page, so you can scrape content from specific HTML elements in a variety of ways.  The easiest is to select elements based on their class or id.   Luckily there are many scraping tools like Beautiful Soup that help greatly in this process.

Important Note: Always review the usage limitations of the web-site before scraping any site.  Some web-sites do NOT allow sites to be scraped.

import urllib2, re, psycopg2

from bs4 import BeautifulSoup
from bs4 import SoupStrainer

class scrape:

    def get_soup(self, source_url):     
        url = urllib2.urlopen(source_url)
        content = url.read()
        soup = BeautifulSoup(content)

        paragraphs = soup.findAll("p")
        #print paragraphs

        return paragraphs

if __name__ == '__main__':

    #DB connection properties (not used below, but handy if you want to store the scraped content)
    conn = psycopg2.connect(dbname = 'YOUR_DATABASE', host= 'YOUR_HOST', port= 5432, user = 'YOUR_USER',password= 'YOUR_PASS')
    cur = conn.cursor()  ## open a cursor

    source_url = "http://jmapping-maplabs.rhcloud.com/about/"

    scraper = scrape()
    paragraphs = scraper.get_soup(source_url)

    print paragraphs

    cur.close()
    conn.close()

 

That’s all I have for now.  Let me know if you have more to add to this list.

by jmapping at February 27, 2014 02:53 AM

February 23, 2014

Geo Perspectives (Justin Lewis)

Getting Started With Scripted Geo-Data Processing (PostgreSQL, PostGIS, Python, and a little OGR)

Processing data is an important part of any agency that deals with data and/or data visualization.  While there are many amazing tools with handy interfaces to help process geo-data, it's often the case that scripting is needed or preferred.  Scripted data processing tasks are beneficial for a number of reasons, but arguably the most important are automation and repeatability.  That said, there are three basic steps to processing data.

Basic steps to a data processing task:

  1. Get/access the data
  2. Manipulate the data
  3. Put the data somewhere (if delivery outside of the processing environment is needed)

Simple enough… There are also different situations that will influence how to handle any one (or all) of these three steps.  Of these three, I feel the second step warrants the most attention, as it tends to be where the bulk of your effort will be spent.  That said, there are two general ways to approach the step of manipulating data, depending on a number of factors.  That list could be infinite but can be simplified to just a couple.

Simple factors influencing data manipulation tool/environment choice:

  1. Is a project database currently available that has the functionality you need (i.e. spatial functionality)?
  2. Is there a need to process data in a single processing environment with a minimal tool set?
    • This could be needed for simplicity, to prevent pushing data between environments/hardware, efficiency, or many other reasons.

For the remainder of this post we will take the approach of data manipulation using a database.  My favorite tools for this step (and all three for that matter) include a combination of PostgreSQL (with PostGIS) and Python (with psycopg2).  This relatively simple combination offers enough flexibility to do pretty much anything you will ever need to do in a data processing task.  Let's look at these tools a little more closely.

  • PostgreSQL is a mature and very robust open source relational database commonly used by organizations large and small.
  • PostGIS is a spatial extender to PostgreSQL. In other words, PostGIS makes PostgreSQL a spatial database.
  • Python is a very robust programming language with a rich history and large user base.
  • Psycopg2 is the most popular PostgreSQL adapter for the Python programming language.  Meaning psycopg2 allows you to interact with a PostgreSQL database with Python.

Once we have all this software installed we are basically left with two very powerful tools.

  1. A database providing storage, structure, and SQL driven functionality.
  2. A programming environment (external from a database) providing control over database query execution and the flexibility to work outside of the database.

NOTE: As was hinted at above, you could potentially eliminate either the database or the programming environment in this scenario depending on your needs.

Let's dive into some examples.  Here we download, unzip, and load a shapefile into a PostGIS-enabled database from a Python script.  The dependencies for this script are Python, psycopg2, and ogr2ogr.

#!/usr/bin/env python

import psycopg2, urllib, zipfile, os

#DB connection properties
conn = psycopg2.connect(dbname = 'YOUR_DATABASE', host= 'YOUR_HOST', port= 5432, user = 'YOUR_USER',password= 'YOUR_PASS')
cur = conn.cursor()  ## open a cursor

## Download the zipped shapefile
urllib.urlretrieve("http://gis.drcog.org/datacatalog/sites/default/files/shapefiles/county_2013_2.zip", "county.zip")

## Unzip the shapefile
fh = open('county.zip', 'rb')
z = zipfile.ZipFile(fh)
for name in z.namelist():
    outpath = "/YOUR/DESTINATION/PATH/"
    z.extract(name, outpath)
fh.close()

## Load the shapefile into the database
os.system("""ogr2ogr -overwrite -skipfailures -progress -f PostgreSQL PG:"dbname='YOUR_DATABASE' host='YOUR_HOST' port='5432' user='YOUR_USER' password='YOUR_PASS'" PATH/TO/county_2013.shp -nln county""")

## Not the prettiest way to rename the geometry field 
fixCountyGeomName = 'ALTER TABLE county RENAME COLUMN wkb_geometry TO the_geom;'
cur.execute(fixCountyGeomName)

## Download the zipped shapefile
urllib.urlretrieve("http://gis.drcog.org/datacatalog/sites/default/files/shapefiles/drcog_crash_2006_shapefile_0.zip", "crash.zip")

## Unzip the shapefile
fh = open('crash.zip', 'rb')
z = zipfile.ZipFile(fh)
for name in z.namelist():
    outpath = "/YOUR/DESTINATION/PATH/"
    z.extract(name, outpath) 
fh.close()

## Load the shapefile into the database
os.system("""ogr2ogr -overwrite -skipfailures -progress -f PostgreSQL PG:"dbname='YOUR_DATABASE' host='YOUR_HOST' port='5432' user='YOUR_USER' password='YOUR_PASS'" PATH/TO/drcog_crash_2006.shp -nln crash_pts""")

## Not the prettiest way to rename the geometry field 
fixCrashGeomName = 'ALTER TABLE crash_pts RENAME COLUMN wkb_geometry TO the_geom;'
cur.execute(fixCrashGeomName)

conn.commit() ## commits pending transactions to the db (making them persistent). This is needed for sql that modifies data or creates objects.

cur.close()
conn.close()

print "Data loaded into database!"

Now that the data is loaded into the database we can connect with psycopg2 and do some processing or analysis of the imported data.  This technique gives you the ability to utilize the full power of the database through SQL queries while also being able to pull data into your Python environment if needed.  This also provides control over the sequence of processing steps.   You could gain similar control through PostgreSQL functions but I will save that topic for another post.

Here we perform a PostGIS ST_Within() query to determine how many pedestrians were injured or killed within Denver in 2006, while also creating an additional database table of just the crashes where pedestrians were involved.

#!/usr/bin/env python

import psycopg2

#DB connection properties
conn = psycopg2.connect(dbname = 'YOUR_DB_NAME', host= 'YOUR_HOST', port= 5432, user = 'YOUR_USER',password= 'YOUR_PASS')
cur = conn.cursor()

getPedestrianCrashesWhithin = '''SELECT a.killed, a.injured FROM crash_pts a, county b WHERE (a.ped_act1 > 0 OR a.ped_act2 > 0 OR a.ped_act3 > 0) AND b.county = 'Denver' AND ST_Within(a.the_geom, b.the_geom);'''

createSubsetOfPedestrianCrashes = '''DROP TABLE IF EXISTS ped_crash_pts;
                                     CREATE TABLE ped_crash_pts AS
                                     SELECT a.* FROM crash_pts a, county b WHERE a.ped_act1 > 0 OR a.ped_act2 > 0 OR a.ped_act3 > 0;'''

cur.execute(createSubsetOfPedestrianCrashes)
conn.commit()

cur.execute(getPedestrianCrashesWhithin)  ## executes the sql command
pedestrianCrashes = cur.fetchall()  ## returns the result

totalKilled = 0
totalInjured = 0
for crash in pedestrianCrashes:
    killed = crash[0]
    injured = crash[1]

    if killed > 0:
        totalKilled += 1

    if injured > 0:
        totalInjured += 1

print "Total killed in Denver = ", totalKilled
print "Total injured in Denver = ", totalInjured

cur.close()
conn.close()

From techniques like this you can get simple text based results like this:

>>> 
Total killed in Denver =  13
Total injured in Denver =  186
>>>

Or map based results like this (example only):

Pedestrian Crashes (example)

 Important: This post was created for demo purposes only, so the results of the process should not be quoted with certainty of any accuracy. No time was spent testing the data for quality and completeness.

 

That’s about it! While this post was intended to give an overview of a strategy for scripted processing of geo-data, it by no means includes all the tools you could use. I hope it provoked ideas about how you could use techniques like this, or how to do this in a way that works better for you.  This is just one example of how to approach a geo-data processing task, and with the great selection of tools available you can easily take a completely different approach.

All the data used in this example can be downloaded from the DRCOG Regional Open Data Catalog

by jmapping at February 23, 2014 03:51 AM

February 19, 2014

Oslandia

QGIS plugin for water management

Oslandia releases today a new plugin for the QGIS processing framework, allowing for water distribution network simulation. It integrates the opensource EPANET simulation software.


EPANET models water distribution networks. It's a widely used public-domain simulation software developed by the US Environmental Protection Agency.


Hydraulic simulation is used to understand water distribution in a distribution network, to forecast the impact of network alterations, to dimension network elements, or to study extreme-case scenarios (e.g. high demand for firefighting, pipe breakages, interruption in supply).


QGIS provides a graphical user interface that can be used to import/edit/export hydraulic model elements and simulation parameters from various sources, launch simulations, and visualize results directly inside QGIS.

Hydraulic model

A hydraulic model consists of junctions (POINT) and pipes (LINESTRING), along with various other elements like tanks, pumps and valves. Those elements can be stored as features in a spatially enabled database. Feature attributes can be simple (e.g. pipe diameter) or complex (e.g. pump characteristic curves or water consumption). Complex attributes are stored via a foreign key in other alphanumeric tables.

This is the kind of data QGIS is designed to handle. It can import/export them from/to a variety of sources and also display and edit them.
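As a rough illustration of that layout (table and column names here are our own, not the plugin's actual schema, and the SRID is arbitrary), simple attributes can sit directly on the feature tables while a complex attribute such as a demand curve lives in a separate table referenced by a foreign key:

-- spatial feature tables with simple attributes inline
CREATE TABLE junctions (
    id serial PRIMARY KEY,
    elevation double precision,
    geom geometry(Point, 2154)
);

CREATE TABLE pipes (
    id serial PRIMARY KEY,
    diameter double precision,                      -- simple attribute
    start_node integer REFERENCES junctions(id),
    end_node integer REFERENCES junctions(id),
    geom geometry(LineString, 2154)
);

-- complex attribute (hourly water demand) in a separate alphanumeric table
CREATE TABLE junction_demand (
    junction_id integer REFERENCES junctions(id),   -- foreign key back to the feature
    hour integer,
    demand double precision
);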

Simulation parameters

Simulation parameters and options (e.g. simulation time step or accuracy) are key-value pairs. The values can be stored in a table whose columns are the keys. Each set of simulation parameters is then a record in this table. This kind of table can be loaded into QGIS as a vector layer without geometry.
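For example (again with made-up names, not the plugin's own schema), a table holding one complete parameter set per row could be as simple as:

-- each column is a parameter key, each row a full set of simulation options
CREATE TABLE simulation_parameters (
    title text,
    duration_hours integer,              -- e.g. 24
    hydraulic_timestep_minutes integer,  -- e.g. 60
    accuracy double precision            -- e.g. 0.001
);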

Integration in the processing framework

Once the hydraulic model and simulation parameters are loaded in QGIS, the simulation can be launched through the Processing toolbox. The plugin uses the standalone command line interface (CLI) of EPANET, whose path needs to be specified in the Processing options and configuration.

The plugin assembles an EPANET input file, runs EPANET and parses its output to generate result layers.



One interesting aspect of Processing modules is that they can be used for chained processing: the user can use other modules to perform additional transformations of the simulation results, such as feeding them into another simulation model.

Result visualization

Simulation results are water pressure and velocity at all points in the network, along with the state of network elements (e.g. volume in tanks, power of pumps), for all simulation time steps. This represents a huge amount of data, which is usually displayed either as time-plots or as map-plots of time-aggregated data (e.g. max and min during the simulation).

Results of particular interest are:

  • time-plots of:

    • volume in reservoirs
    • flow at pumps
    • pressure in pipes and at junctions
  • map-plots of:

    • low speed (stagnation)
    • high and low pressure (risk of breakage, unhappy consumer)
    • lack of level variation in reservoirs (stagnation)
    • empty reservoir
    • reservoir overflow
    • abnormal pressure (typical of error in the altitude of a node in the model)
    • flow direction

QGIS is naturally suited for map-plots. Time-aggregated simulation results are automatically joined to map layers when the result table is added to the map. Rule-based symbology is used to highlight zones of concern (e.g. low water velocity or empty reservoirs).



The matplotlib library provides 2D plotting facilities in python and QGIS provides an extensive set of selection tools (on the map or in tables). The plugin's time-plot button plots the appropriate value depending on the selected feature type (e.g. water level for tanks, pressure for junctions).


Screencast

For a full demo of this plugin, see the following video:



Where and who

The plugin is available on GitHub and should soon be available in the QGIS plugin repository:

https://github.com/Oslandia/qgis-epanet


This work has been funded by European Funds. Many thanks to the GIS Office of Apavil, Valcea County (Romania).


Oslandia has developed this plugin, and provides support and development around QGIS, PostGIS and this plugin. Get in touch if you need more: infos@oslandia.com


We are looking for a free dataset with full information (pumps, tanks, valves, pipes and their characteristics…) to distribute with this plugin as a test case and demonstration. If you can provide this, mail us!


We are also implementing a Processing plugin for SWMM, the public-domain waste-water simulation tool. If you are interested in participating in the development, please contact us.

by Vincent Mora at February 19, 2014 01:00 PM

February 18, 2014

Bill Dollins

A Little Deeper with Node and PostGIS

What does one do when presented with more snow days than expected? My friends in Colorado would probably do something outrageous like skiing, but I found it to be a great opportunity to catch up on some of my recreational coding. Specifically, I wanted to revisit the Node/PostGIS work I blogged about earlier.

As fun as that project was, it was fairly limited in scope and I wanted to spread my wings a little more with Node. So I decided to build a more general-purpose wrapper around PostGIS. Lately, I've become a bit obsessed with the idea that PostGIS may be the only GIS tool you really need in terms of data storage, management, and analytics. That's probably a topic for another post but exploring that concept was a perfect premise for my continued explorations with Node.

I have been circling back to Node over the past few months to continue building my comfort level with it. I tend to eschew frameworks when I am learning something new because I want to get comfortable with the core before I start layering on abstraction. That was my approach with the tile viewer tool I built a while back. For the recent post centered on Amazon RDS, I added Express into the mix, which has been a big help.

This time around, I wanted to dig a little deeper with the node-postgres module and also make the application more modular. I wanted to build around a few core principles:

  1. Keep it RESTful (or as close to it as I could)
  2. GeoJSON in/GeoJSON out (so....vector only for now)
  3. Let PostGIS do the heavy lifting

Getting Started

This time around, I elected to use my local PostgreSQL/PostGIS instance rather than Amazon RDS. This was mainly so I could keep my development isolated on one machine. I already had the basic infrastructure in place from my last time around, so I was able to quickly dive into the meat of things. I decided to scope my initial effort at the following:

  1. Return the contents of an entire table as GeoJSON, with the ability to choose between features (geometries and attributes) in a GeoJSON feature collection or just geometries in a GeoJSON geometry collection. This should support any table in the database.
  2. Return those records in a table that intersect a geometry passed in as a parameter. The input geometry would be in GeoJSON format.
  3. Return a JSON representation of a table's schema.
  4. Return a list of tables from the server. This is necessary in order to support the ability to query any available table.
  5. Implement some simple, but application-specific logic to demonstrate extensibility.

With these goals in mind, I decided to first tackle the issue of extensibility. I wanted to make it as painless as possible and the strategy described in this post seemed to fit the bill. I just had to add the following code block to my server.js (straight from the post):

snippet1.js
// dynamically include routes (Controller)
fs.readdirSync('./controllers').forEach(function (file) {
  if (file.substr(-3) == '.js') {
    route = require('./controllers/' + file);
    route.controller(app);
  }
});

This will load any .js file in the controllers directory into the application. If they are written to the pattern expected by Express, new resource paths are exposed to the application. The post above describes a simple MVC implementation. Astute readers will note that my take is all "C" without "M" or "V." I plan to refactor that later but it was easier for me to keep track of things on this pass with the code in one place.

Getting Data

With modularity out of the way, it was time to work on the basic structure for getting data from the database. In core.js, I defined a route with a URL template like '/vector/:schema/:table/:geom'. This would translate into something like http://localhost/vector/public/parcels/features, which would fetch a GeoJSON feature collection containing the contents of the parcels table. To do that, I need to know the name of the spatial column in the table, which the following helps me retrieve:

snippet2.js
var meta = client.query("select * from geometry_columns where f_table_name = '" + tablename + "' and f_table_schema = '" + schemaname + "';");

The next code block shows how I capture the name of the spatial column and structure the main query, depending on the choice of features or geometry:

snippet3.js
meta.on('row', function (row) {
    var query;
    var coll;
    spatialcol = row.f_geometry_column;
    if (geom == "features") {
        query = client.query("select st_asgeojson(st_transform(" + spatialcol + ",4326)) as geojson, * from " + fullname + ";");
        coll = {
            type : "FeatureCollection",
            features : []
        };
    } else if (geom == "geometry") {
        query = client.query("select st_asgeojson(st_transform(" + spatialcol + ",4326)) as geojson from " + fullname + ";");
        coll = {
            type : "GeometryCollection",
            geometries : []
        };
    }

//'meta' code block continues

As can be seen above, the query will transform the output geometry to WGS84 and convert it to GeoJSON for me. So I'm sticking to my third principle by leaning on PostGIS functions here. I plan to stick to GeoJSON's default spatial reference of WGS84 for now. To roll up the query results into the appropriate GeoJSON object and return it, I handled the 'row' and 'end' events.

snippet4.js
//roll up the results
query.on('row', function (result) {
    if (!result) {
        return res.send('No data found');
    } else {
        if (geom == "features") {
            coll.features.push(geojson.getFeatureResult(result, spatialcol)); //use helper function
        } else if (geom == "geometry") {
            var shape = JSON.parse(result.geojson);
            coll.geometries.push(shape);
        }
    }
});

//send the results
query.on('end', function (err, result) {
    res.setHeader('Content-Type', 'application/json');
    res.send(coll);
});

I wrote a helper function to roll up GeoJSON features:

snippet5.js
exports.getFeatureResult = function(result, spatialcol) {
    var props = new Object;
    var crsobj = {
        "type" : "name",
        "properties" : {
            "name" : "urn:ogc:def:crs:EPSG:6.3:4326"
        }
    };
    //builds feature properties from database columns
    for (var k in result) {
        if (result.hasOwnProperty(k)) {
            var nm = "" + k;
            if ((nm != "geojson") && nm != spatialcol) {
                props[nm] = result[k];
            }
        }
    }

    return {
        type : "Feature",
        crs : crsobj,
        geometry : JSON.parse(result.geojson),
        properties : props
    };
};

So that's basic data retrieval. How about that spatial intersect?

A Simple Spatial Query

One thing I failed to mention in the above section is that all of that is exposed through an HTTP GET request. For this next function, I'm going to use a POST. I went back and forth on that but came down on the side of POST due to the potential for a user to send a very verbose input shape. The function is designed to accept JSON as the body of the request, which would be done in curl like this:

snippet6.bat
curl -X POST -d "{ \"type\": \"Point\", \"coordinates\": [-98.35, 39.7] }" -H "Content-Type: application/json" http://localhost:3000/vector/public/states_gen/features/intersect

The above action returns the state of Kansas (I knew you were wondering). To make this happen, there are only three things that are different. First, the URL is defined as a POST and, second, the code needs to capture the input shape. The first few lines are:

snippet7.js
/**
 * retrieve all features that intersect the input GeoJSON geometry
 */
app.post('/vector/:schema/:table/:geom/intersect', function (req, res, next) {
    //console.log(JSON.stringify(req.body));
    var queryshape = JSON.stringify(req.body);
//continue with the rest of app.post

I stringify the JSON since I have to insert it into my SQL. This brings me to the third difference here, the query. This time, I am using ST_INTERSECTS to filter down the response. So, depending on the choice of features or geometry, the query will be similar to:

"select st_asgeojson(st_transform(" + spatialcol + ",4326)) as geojson, * from " + fullname + " where ST_INTERSECTS(" + spatialcol + ", ST_SetSRID(ST_GeomFromGeoJSON('" + queryshape + "'),4326));"

The rest of the process is similar to the basic query above. With a well-exercised data access pattern in place, querying table schema and layer lists become trivial. Since GeoJSON doesn't cover these topics, I had to roll my own. I won't detail the output but the queries are below.

snippet8.js
//SQL to retrieve schema
//var sql = "SELECT n.nspname as schemaname,c.relname as table_name,a.attname as column_name,format_type(a.atttypid, a.atttypmod) AS type,col_description(a.attrelid, a.attnum) as comments";
//sql = sql + " FROM pg_class c INNER JOIN pg_namespace n ON c.relnamespace = n.oid LEFT JOIN pg_attribute a ON a.attrelid = c.oid";
//sql = sql + " WHERE a.attnum > 0 and c.relname = '" + tablename + "' and n.nspname = '" + schemaname + "';";

//SQL to retrieve layer list
//sql = "SELECT 'geometry' AS geotype, * FROM geometry_columns UNION SELECT 'geography' as geotype, * FROM geography_columns;";

So this gives me everything I need for an all-purpose interface into PostGIS from Node. I could spend the rest of the year similarly wrapping the hundreds of spatial functions in PostGIS, but the real power of extensibility is the ability to tailor an application to your own needs, based upon a detailed understanding of your own data and logic.

Adding Some Customization

To do this, I fell back to the data for Leonardtown, Maryland that I have used in a couple of previous posts. I am simply going to expose the ability to query residential or commercial buildings from the data set. For this, all of the prep work is done at the top of the function by simply preparing a WHERE clause.

snippet9.js
app.get('/leonardtown/buildings/:geom', function (req, res, next) {
    var client = new pg.Client(app.conString);
    var geom = req.params.geom.toLowerCase();
    if ((geom != "features") && (geom != "geometry")) {
        res.status(404).send("Resource '" + geom + "' not found");
        return;
    }
    var tablename = "leonardtown_bldgs";
    var schemaname = "public";
    var fullname = schemaname + "." + tablename;
    var spatialcol = "";
    var proptype = req.query.type;
    var whereclause = ";";
    if (typeof proptype != "undefined") {
        if (proptype.toLowerCase() != "all") {
            whereclause = " where structure_ = '" + proptype + "';";
        }
    }
    var coll;
    var sql = "";
    //logic continues from here

The primary differences here are that I am using a GET with a query string, since I'm not concerned with data size, and that I'm building a WHERE clause on a specific column name. What's not shown is that, farther down, I don't need to query the name of the spatial column, so I can cut out that step. I can do this because I understand my own data and can be more succinct than if I were writing a more generic function. Using this approach I can also write more complex custom logic in my database, call it from Node, and send the response. In other words, standard web application behavior.
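As an illustration of that last point (this function is hypothetical and not part of the post's repository, and it assumes the geometry column is named the_geom), the buildings filter could be pushed into the database so the Node route is reduced to a single parameterized call:

-- hypothetical helper: returns GeoJSON geometries for a given structure type
CREATE OR REPLACE FUNCTION leonardtown_buildings_geojson(proptype text)
RETURNS SETOF json AS
$$
    SELECT ST_AsGeoJSON(ST_Transform(the_geom, 4326))::json
      FROM leonardtown_bldgs
     WHERE proptype = 'all' OR structure_ = proptype;
$$
LANGUAGE sql STABLE;

-- the Node route would then just run:
-- SELECT * FROM leonardtown_buildings_geojson($1);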

In order to expose this application-specific logic, I just needed to put it in a separate leonardtown.js file and drop it into the 'controllers' directory.

Wrapping Up

This post was a bit longer than usual, but there was a lot of ground to cover. I feel like I'm getting more comfortable with the Node ecosystem, though I'm still a bit wobbly. My next step is probably to dive a little deeper into the MVC side of things with something like Sails. Having a familiar face like PostGIS on the back end is helping me as I figure out how to perform more meaningful tasks with Node and its related tools.

If you want to check out the full code for this application, it is here: https://github.com/geobabbler/node-gis-server

February 18, 2014 06:44 PM

February 17, 2014

Oslandia

Oslandia and Mappy towards OpenSource

Note : the ATOM feed for english only Oslandia blog posts is : http://www.oslandia.com/feeds/newsen.atom.xml

The project

For more than two years, Oslandia has been working with Mappy on its transition towards OpenSource. The first step is represented by the migration of the cartographic back office databases from Oracle to PostGIS. This has just been finalized and is currently running in production.


mappy_logo


Mappy offers routing and cartography services. Supporting several billion requests per month, the LBS (Location Based Services) platform is at the heart of Mappy's activity and serves as a base for new products such as Web To Store.

Oslandia is working together with Mappy to migrate this cartographic platform from a solution developed in-house to a new solution based entirely on free and open source software, in order to prepare for forthcoming technical and operational challenges.


The migration

The data preparation and cartographic map creation back offices have been migrated successfully. This software stack was initially based on Oracle Spatial, SQL Server, and other non-standard internal tools used to create the map. It now runs only OpenSource components.

Oracle and SQL Server have been completely replaced by PostgreSQL and PostGIS, which is the key component for spatial data storage and preprocessing. Mapnik, Python, Tornado, Varnish, MemCached and Debian are the other OpenSource components used in this architecture.


The migration to OpenSource software allowed Mappy to rationalize and optimize the architecture of their "map" component:

  • Reduced data processing time
  • API and data format Standardization
  • Strong reduction of the technical debt
  • Reduced amount of lines of code
  • Cost reduction for the platform, and cost-effective scaling
  • Technical skills improvements for the team and stronger motivation
  • Full technical mastership of their tools

Oslandia helped Mappy integrate the OpenSource culture and methods into their teams, and provided them with the technical skills required to implement this high-load architecture. Oslandia's competence in geographical information systems and its deep expertise in PostGIS allowed the project to be completed successfully.


Mappy benefited from the latest, state-of-the-art OpenSource technologies, and was also able to contribute back to free software projects such as Mapnik.

Migration steps

The migration was carried out in several steps, so as to keep the project under control and to validate the migrated components through early tests.


The initial platform was the following one :



The target platform, with all components migrated to OpenSource, is now the following one :



Thanks to the availability of OpenSource components, a prototype was developed within two weeks at the beginning of the project to validate the architectural decisions. Then the real migration work began.


As mentioned above, the choice of OpenSource components was made quickly. The target architecture could therefore be set up quickly, and the development and migration processes were defined once this first step was validated.



This migration gave Mappy the opportunity to contribute to Mapnik: a 25% performance gain was achieved when reading data from PostGIS. Mappy also open sourced Pycnik, a Python tool to generate styles for Mapnik.

PostGIS

PostGIS, the database supporting the whole infrastructure, made it possible to reach very high performance and a rich functional level. The 75 GB database benefits from all the latest enhancements of PostgreSQL and PostGIS, such as streaming replication, new PostGIS geographical functions, recursive CTEs, native JSON support and much more.


Audrey Malherbe ( @AudreyMalherbe ), project manager at Mappy, stresses that choosing PostgreSQL and PostGIS was obvious: "We wanted to dive into OpenSource, leaning on renowned and performant technologies like PostGIS. It was important for us to contribute to OpenSource, and Oslandia's expertise and its involvement in the community allowed us to rush into this adventure with full confidence."

Some facts

Most geeks like to see these kinds of numbers:

  • 85% of all Mappy traffic
  • 14 servers (4 PostGIS, 8 Tornik, 2 Varnish)
  • 240 req/s peak without cache
  • 2500 req/s peak with cache
  • 94% Cache Hit Ratio
  • 2M objects in cache warming
  • 75 GB for PostGIS database
  • 300 GB for the relief database

Next steps

More technical information on this migration is available in Audrey's presentation at FOSS4G 2013.


Mappy intends to continue in this OpenSource direction and to extend this migration method to the other services of the cartographic platform.




About Mappy

A specialist in routing and cartographic services, Mappy is renowned as the leader in local search through maps, on the internet, tablets, mobiles and GPS.


Mappy offers its users three kinds of search: map search, which allows users to visualize a neighborhood, to experience city immersion thanks to 360° views in 320 French cities, and also to open the doors of several thousand shops; route search, available for cars, public transportation, bicycles and pedestrians; and product search, allowing users to locate a specific product in a given geographical zone and to know its price and availability.


As a major actor in urban transportation, Mappy offers advertisers a geolocated solution covering the whole country, facilitating web-to-store applications and traffic generation to their retail locations. Mappy currently counts more than 10 million users on the internet, tablets and mobile (Mappy and MappyGPS Free).


Mappy is a subsidiary of the Solocal Group.

http://www.mappy.com

by Vincent Picavet at February 17, 2014 09:00 AM

February 11, 2014

OpenShift by RedHat

Build your own Google Maps (and more) with GeoServer on OpenShift

[Screenshots: GeoServer opening page, password dialog, selecting previews, previews list, OpenLayers quick preview, PostGIS connection dialog]

Greetings Shifters! Today we are going to continue our spatial series and bring up Geoserver on OpenShift and connect it to our PostGIS database. By the end of the post you will have your own map tile server, KML source (to show on Google Earth), or remote GIS server.

The team at Geoserver has put together a nice short explanation of Geoserver and a really detailed list. If you want commercial support, Boundless will give you a commercial release and/or support for all your corporate needs. Today, though, I am only going to focus on the FOSS bits.

Getting started

There are two ways to run Geoserver. They ship a version that includes a Jetty container, so you can just unzip it and run it on your local machine.

read more

by scitronpousty at February 11, 2014 02:34 PM

February 10, 2014

Oslandia

Oslandia et Mappy vers l'OpenSource

Le projet

Depuis plus de deux ans, Oslandia accompagne Mappy dans sa transition vers l'OpenSource. La première étape est symbolisée par le projet de migration des bases de données Oracle du backoffice cartographique vers PostGIS. Celui-ci vient notamment d'être finalisé et fonctionne actuellement en production.


mappy_logo


Mappy propose des services de calcul d'itinéraire et de cartographie. Supportant plusieurs milliards de requêtes par mois, la plateforme LBS (Location Based Services) est au cœur de l'activité de Mappy et sert de socle aux nouveaux produits tel que le Web To Store.

Oslandia travaille conjointement avec Mappy pour migrer cette plateforme cartographique d'une solution développée en interne, vers une solution basée entièrement sur des logiciels libres, afin de préparer les défis techniques et opérationnels à venir.


La migration

Les backoffices de préparation des données et de création du plan cartographique ont été entièrement migrés avec succès. Cette pile applicative était initialement basée sur Oracle Spatial, SQL Server, et des outils internes non standards de création de carte. Elle n'utilise plus désormais que des composants OpenSource. Oracle et SQL Server ont été entièrement remplacés par PostgreSQL et PostGIS, qui constitue le socle de base de données géographique pour le stockage et le prétraitement des données géographiques. Mapnik, Python, Tornado, Varnish, MemCached, Debian sont les autres composants OpenSource utilisés.


La migration vers ces composants OpenSource a permis de rationnaliser et d'optimiser l'architecture du composant "carte":

  • Temps de traitement des données réduit
  • Standardisation des formats et API
  • Forte diminution de la dette technique
  • Nombre de lignes de code optimisé
  • Baisse du coût de la plateforme, et passage à l'échelle plus économique
  • Montée en compétence et motivation des équipes
  • Maîtrise complète des outils

Oslandia a permis à Mappy d'intégrer la culture et les méthodes de l'OpenSource dans ses équipes, et leur a fourni les compétences techniques nécessaires pour mettre en place cette architecture à forte charge. Les compétences d'Oslandia en systèmes d'information géographique, et son expertise unique en France sur PostGIS, ont permis de mener à bien ce projet.


Mappy a ainsi pu bénéficier des dernières technologies OpenSource à la pointe de l'état de l'art, et également pu contribuer à certains projets libres comme Mapnik.

Migration steps

The migration was carried out in several stages, so as to keep control over the project by validating the migrated components with tests as early as possible.


The initial platform was the following:



The target platform, with all components migrated to open source, is now the following:



Thanks to the availability of the open source components, a prototype could be developed in less than two weeks to validate the architecture choices. The migration proper could then begin.


The choice of open source components was made fairly quickly, as mentioned above. The target architecture was therefore put in place quite rapidly, and the development work and migration processes could be worked out once that step was complete.



The migration allowed Mappy to contribute to Mapnik, yielding among other things a 25% improvement in data read performance when using PostGIS, and to release Pycnik, a Python style-generation tool for Mapnik.

PostGIS

PostGIS, the geographic database underpinning the whole infrastructure, delivers very high performance and a high level of functionality. The 75 GB database benefits from the latest advances in PostgreSQL and PostGIS, such as streaming replication, the new PostGIS geography functions, recursive CTE queries, JSON support and much more.
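
Purely as an illustration of the kind of features mentioned here (the table, column names and coordinates are invented, not Mappy's actual schema), serving pre-processed geometries as JSON straight from PostGIS is a one-liner:

-- Hypothetical example: fetch nearby roads, simplified for display,
-- and return them as GeoJSON directly from the database.
SELECT id,
       ST_AsGeoJSON(ST_Simplify(geom, 10)) AS geojson
FROM roads  -- invented table name
WHERE ST_DWithin(geom, ST_SetSRID(ST_MakePoint(652000, 6862000), 2154), 5000);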


Audrey Malherbe ( @AudreyMalherbe ), project lead at Mappy, stresses that choosing PostgreSQL and PostGIS was self-evident: "We wanted to switch to open source while relying on proven, high-performance technologies such as PostGIS. Contributing to open source was important to us, and Oslandia's technical expertise and involvement in the community allowed us to embark on this adventure with complete confidence."

In figures

The geekiest among you like to see this kind of numbers:

  • 85% of Mappy's traffic
  • 14 servers (4 PostGIS, 8 Tornik, 2 Varnish)
  • 240 req/s at peak without caching
  • 2,500 req/s at peak with caching
  • 94% cache hit ratio
  • 2M objects pre-loaded to warm the cache
  • 75 GB for the PostGIS database
  • 300 GB for the elevation database

What's next...

More technical information on this migration is available in Audrey's presentation at FROG 2013.


Mappy intends to continue this move towards open source and to extend the migration approach to the other services of its cartographic platform.




About Mappy

A specialist in route planning and mapping services, Mappy is recognized as the French leader in map-based local search, on the web, tablets, mobile and GPS.


Mappy offers its users three types of search: map search, which lets you view a neighbourhood, immerse yourself in the city thanks to 360° views in 320 French cities, and even step inside several thousand shops; route search, available for trips by car, public transport, bicycle and on foot; and product search, which lets you locate a specific product in a given geographic area and find out its price and availability.


A major player in urban mobility, Mappy offers advertisers a geolocated solution covering the whole country, supporting web-to-store campaigns and driving traffic to their points of sale. Mappy currently has more than 10 million monthly users on the web, tablets and mobile (Mappy and MappyGPS Free).


Mappy is a wholly-owned subsidiary of Solocal Group.

http://www.mappy.com

by Vincent Picavet at February 10, 2014 06:00 PM

February 05, 2014

Safe Software

FME 2014 Special: Top 10 Reader and Writer Updates

Hello FME’ers,
By now you’ll have (I hope) upgraded to 2014 and taken advantage of some of the new features and functionality. From my point of view, updates can be classified into one of three areas and, having already covered Workbench improvements and Transformer updates, it’s time for me to look at the third of these: Readers and Writers.

As usual we’ve added a whole bunch of new formats – twenty-one (21) by my count – and these seem to reflect some modern trends in spatial data. There are many cloud formats and some important updates to 3D and BIM data formats too. But there are also some updates to older CAD and GIS formats, filling in the gaps I suppose you could say.

Anyway, I realize that not everyone is going to be interested in every format, so for this top-ten I’ve tried to pick out updates that I think will be of most interest to the most people, and made the points fairly brief so it’s easier to skip items that don’t interest you. I haven’t mentioned null support because we included it in so many formats – check out the list here if you want to know which.

10. AIXM 5

I wouldn’t blame you if you’ve never used – or even heard of – the Aeronautical Information Exchange Model. But if you ever journey by air then you’ll be grateful for this format because it’s the means by which navigational and other airspace information is passed between systems. For an interesting read on the subject check out Wikipedia first, then a site like the UK National Aeronautical Information Service.

For FME, the big update is that we now have a Writer for Version 5 (actually v5.1) of this format. Version 5 is an important update so it was vital FME supports it. Technically, it being an XML-based format, you could have written data with the XMLTemplater transformer, but a dedicated writer like this is way better.

9. JPEG

Did you know that FME doesn’t just read JPEG files, but also the Exif information that comes attached?

Exif (Exchangeable image file format) is metadata for a JPEG file, and includes location data from geotagged files. FME has been able to read these tags for a while; there’s a JpegGPSPointReplacer transformer in the FME Store, and Certified Professional Ulf Mansson blogged about how FME could read these Exif tags.

So Ulf will probably be happy to know that in FME 2014 we’ve expanded the JPEG Reader to add support for some additional tags:

jpeg_exif_contrast
jpeg_exif_exposureprogram
jpeg_exif_focallengthin35mmfilm
jpeg_exif_lightsource
jpeg_exif_saturation
jpeg_exif_sharpness
jpeg_exif_software

You can get at all these attributes in the Format Attributes tab in a feature type properties dialog.

8. Trimble SketchUp Reader and Writer

In case you didn’t know, SketchUp was acquired by Trimble in 2012, and shortly after an excellent new API was released. FME has had support for the Google version of SketchUp for a while, but 2014 introduces a whole new Reader and Writer that is based around the new Trimble API.

The new Reader/Writer provides support for versions back to Google v3 and up to the latest Trimble 2013 release. Besides this there are some useful improvements over the older R/W:

  • Coordinate System and Geo-location support. Data will be read in its true location, so with the FME Data Inspector you can see a background map (in 2D).
  • Support for Non-Affine Textures (those with a perspective correction)
  • Textures on a group feature

7. Esri Geodatabase

The Geodatabase update I wanted to highlight is support for GUID column types. GUID stands for Globally Unique Identifier: a 36-character field that uniquely identifies features or objects in a Geodatabase. FME used to read them as string attributes, but now we handle them as proper GUID fields and have an actual field type that can be selected in the Feature Type properties dialog:

Formats2014GeodbGUID

Besides this update we’ve also worked on improving the handling of Chinese attribute names, updated the non-ArcObjects API version used to v1.3, and added Raster Catalog support.

6. Google Earth KML

An update to the KML writer in FME 2014 came in response to a number of customer requests. The issue was this: when you view a KML dataset, the attribute information shows up inside a balloon. That balloon content can be formatted as HTML, but there was no way in FME to include auxiliary files such as images or scripts.

So, the update in FME 2014 was to provide a way to do this. The magic is carried out in a section of the KMLPropertySetter transformer’s parameters:

Formats2014KMLAddtnFiles

Now when I run this workspace, and write to KMZ, all of those files will be included with the HTML content.

5. MicroStation Design

Two nice updates here, filling in a couple of gaps. Firstly, we added the ability to write to multiple models. A MicroStation seed file can contain multiple models – basically the ability to apply different working units, global origins, etc. FME has long been able to read which model the data was coming from; now we can write to a specific model too. You simply need to set either igds_model_id or igds_model_name on the features to be written. The model must already exist in the seed file, as we can’t create it from scratch.

The second update is the ability to set draw order, aka “element priority”. MicroStation doesn’t have a method for setting the order in which layers are drawn one above the other. Instead, each element (feature) is given its own priority. FME now has the ability to read and write this priority using the format attribute igds_element_priority. This will work for 2D datasets, or 3D datasets where the features have common Z values.

So, good news if you are a MicroStation user needing this functionality.

4. AutoCAD

For AutoCAD, FME 2014 introduces the ability to read and write raster features with our AutoCAD formats. I didn’t realize that raster data could be added to an AutoCAD drawing, but apparently it can – the DWG file simply includes a reference to the raster file.

When FME reads the raster data it includes a set of attributes that define any clip boundaries that were set in AutoCAD, so you can clip to the same extents in FME. But if you send the data back to AutoCAD you don’t need to clip the data manually; it’s done automatically on writing. This way you can round-trip your AutoCAD raster data without any loss of information. Of course, you can also write raster features directly to AutoCAD.

Another AutoCAD update is the ability to read from one of multiple layouts in the paper space. I’m not an AutoCAD expert but to me this looks similar to the DGN models. Basically a paper space can contain multiple layouts, each with different page setup/layout settings. So FME now gives the user the option to read in only the active layout or all layouts.

Finally, I have to mention the Revit Reader that is new to FME 2014. To watch a recorded demo, click this link and skip to 56 minutes into the presentation. Basically, we’ve created a Revit extension (add-in) that lets you export data in a file format that FME is able to read using a new Revit Reader. The Reader includes a parameter called Element Type, which allows you to select the type of data to be read (2D Floorplan, 3D with Textures, Building Envelope, etc.). That way the data is partly pre-processed, and easier to use than a simple dump of all elements.

3. PostGIS

The simple update here is that the PostgreSQL-based Readers and Writers now include parameters for a SQL Statement to Execute Before Translation and a SQL Statement to Execute After Translation. Other database formats have had this capability for a while, so it was high time that PostgreSQL users got the same opportunity.

The other PostGIS item I wanted to mention is the ability to use an Amazon RDS hosted PostGIS database. Amazon RDS (Relational Database Service) is a web service that allows you to set up a database in the cloud. Their latest offering is the ability to run a cloud-based PostgreSQL/PostGIS database and, having tested it, we’re happy to report that FME 2014 works just fine reading and writing from such a database. And yes, that includes raster. You can read a bit more about FME support for this in Don’s blog post.
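
To make those parameters concrete, here is a purely illustrative pair of statements (table and index names are invented, not taken from the FME documentation) of the kind you might place in them:

-- Hypothetical "SQL Statement to Execute Before Translation": start from a clean staging table
TRUNCATE TABLE parcels_staging;

-- Hypothetical "SQL Statement to Execute After Translation": index and analyze the freshly loaded data
CREATE INDEX parcels_staging_geom_idx ON parcels_staging USING GIST (geom);
ANALYZE parcels_staging;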

2. Amazon

Speaking of Amazon, there are two new Amazon-related formats in FME 2014: Amazon Redshift and Amazon DynamoDB. Both are databases offered by Amazon as a cloud service.

Redshift is for massive (petabyte-scale) datasets. Basically it’s fast and it’s large and it’s supported by FME!

DynamoDB is a NoSQL (non-relational) database service. It’s schema-less, really fast, and – again – it’s supported by FME!

1. Google

As for Google, we added three new formats in FME 2014, two for cloud services and one for a raster file format.

The raster file format is Google WebP. Google says this format produces file sizes about 25% smaller than JPEG and is therefore useful in web applications. FME has both a Reader and Writer for this format.

Google Cloud SQL is (obviously) a cloud-based format. We have a Reader and Writer for both spatial and non-spatial versions of this online database.

We also have a new Reader and Writer for Google Maps Engine, so you can read data from your FME-supported format and push it out to your GME users. We also have a new “GME Compatibility” parameter on the KML writer, which you should set if you’re creating KML for loading into GME.

By my count we now have eight (8) Readers and Writers for various Google formats and products:

Formats2014Google

 


You might have noticed I didn’t include GML in my top-ten. That’s because the changes there are so significant, and their effects so wide-reaching, I’m going to create a single post all about that format. I also missed out IFC because there are some interesting updates still to come that I want to wait for.

Anyway, for a full list of new formats – and updates to formats – be sure to visit the What’s Great page here on this blog. And while I remember, the FME 2014 launch quiz is still on (until February 28th) so answer now and answer often (once a day) to win some great prizes.

Regards,

Mark

NewBlogSignature

The post FME 2014 Special: Top 10 Reader and Writer Updates appeared first on Safe Software Blog.

by Mark Ireland at February 05, 2014 06:53 PM

February 04, 2014

Paul Ramsey

Introspection Double-Shot

Davy Stevenson has a great post (everyone should write more, more often) on a small Twitter storm she precipitated and that I participated in. Like all sound-and-fury-signifying-nothing it was mostly about misunderstandings, so I'd like to add my own information about mental state to help clarify where I come from as a maintainer.

First, PostGIS is full of shortcomings. Have a look at the (never shrinking) ticket backlog. Sure, a lot of those are feature ideas and stuff in "future", but there's also lots of bugs. Fortunately, most of those bugs affect almost nobody, and are easily avoided (so people report them, then avoid them).

When we first come out with a “major” release (2.0 to 2.1, for example) I expect lots of bugs to shake out in the early days, as people try the new release in ways that are not anticipated by our regression tests. (It's worth noting that the ever-growing collection of regression tests provides a huge safety net under our work, allowing us to add features and speed without breaking things... for cases we test.)

My expectation is that the relative severity of bugs reported decreases as the time from initial release increases. Basically, people will always be finding bugs, but they will be for narrower and narrower use cases, data situations that come up extremely infrequently.

The bug Davy's team ran across broke my usual rules. It took quite a while for users to find and report, and yet was broad enough to affect moderately common use cases. However, it was also something the vast majority of PostGIS users would not run across (a minimal SQL illustration follows the list):

  • If you were using the Geography type, and
  • If you were storing polygons in your Geography column, and
  • If you queried your column with another Polygon, and
  • If the query polygon was fully contained in one of the column polygons, then
  • The distance reported between the polygons would be non-zero (when it should be zero!)
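
A minimal illustration of that pattern (the geometries here are invented; the real trigger also depended on the geography distance cache, so it only showed up when a query returned more than one row):

-- A small geography polygon fully contained in a larger one:
-- the distance should be zero, but under the bug it could be reported as non-zero.
WITH big AS (
  SELECT ST_Buffer(ST_MakePoint(-123.0, 49.0)::geography, 10000) AS g
), small AS (
  SELECT ST_Buffer(ST_MakePoint(-123.0, 49.0)::geography, 100) AS g
)
SELECT ST_Distance(big.g, small.g) AS dist  -- expected: 0
FROM big, small;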

It happens! It happened to Davy's team, and it happened to other folks (the ones who originally filed #2556) — I was actually working on the bug on the plane a couple weeks before. It was a tricky one to both find and to diagnose, because it was related to caching behaviour: you could not reproduce it using a query that returned a single record, it had to return more than one record.

If I was prickly about the report from Davy, and pricklier still about the less nuanced report of her colleague Jerry, that prickliness arose because, on the basis of a very particular and narrow (but real!) use case, they were tarring the whole release, which had been out and functioning perfectly well for thousands and thousands of users for months.

Also, I was feeling guilty for not addressing it earlier. PostGIS has gotten a lot bigger than me, and I don't even try to address raster or topology bugs, but in the vector space I take pride in knocking down real issues quickly. But this issue had dragged on for a couple months without resolution, despite the diligent sleuthing of Regina Obe, and a perfect reproduction case from "gekorob", the original reporter.

That's where I'm coming from.

I can also empathize with Jerry and others who ran across this issue. It's slippery, and it would eat up a non-trivial amount of time to isolate. Having had the time eaten, a normal emotional response would be "goddamn it, PostGIS, you've screwed me, I won't let you screw others!" Also, having eaten many of your personal hours, the bug would appear big, not narrow, worthy of a broadcast condemnation, not a modest warning.

Anyways, that's the tempest and teapot. I'm going to finish my morning by putting this case into the regression suite, so it'll never recur again. That's the best part of fixing a bug in PostGIS, locking the door behind you so it can never come out again.

by Paul Ramsey (noreply@blogger.com) at February 04, 2014 07:05 PM

January 28, 2014

Stephen Mather

Book Complete

For all following along at home, the PostGIS Cookbook is complete!  At 484 pages it’s a tome– but a great one (It’s been out for a few days, actually).

A shout out to my brilliant co-authors, Paolo, Bborie, and Tom. I could not have hoped for better.

http://www.packtpub.com/postgis-to-store-organize-manipulate-analyze-spatial-data-cookbook/book

And for any who wonder, the picture on the front is of the Maumee River, the largest tributary to the Great Lakes.

bridge over the maumee


by smathermather at January 28, 2014 03:29 AM

January 27, 2014

Postgres OnLine Journal (Leo Hsu, Regina Obe)

Writing PostGIS raster Map Algebra Callback Functions in PLV8

I've read from many sources that PL/V8 mathematical operations are generally faster than what you get with SQL functions and PL/pgSQL functions. One area where I thought this speed would be really useful was for writing map algebra callback functions. A PostGIS 2.1+ map algebra callback function signature looks like: func_name(double precision[][][] value, integer[][] pos, text[] VARIADIC userargs)
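
To make that signature concrete, here is a minimal sketch of what such a callback could look like (this is an illustration, not code from the article, and it assumes the plv8 extension is installed and exposes the 3-D value array as nested JavaScript arrays):

CREATE OR REPLACE FUNCTION plv8_mean_callback(
    value double precision[][][],  -- pixel neighbourhoods, one per input raster
    pos integer[][],               -- pixel positions
    VARIADIC userargs text[]       -- optional user arguments
) RETURNS double precision AS $$
    // Average the non-null pixels of the first raster's neighbourhood.
    var band = value[0], sum = 0, n = 0;
    for (var i = 0; i < band.length; i++) {
        for (var j = 0; j < band[i].length; j++) {
            if (band[i][j] !== null) { sum += band[i][j]; n++; }
        }
    }
    return n ? sum / n : null;
$$ LANGUAGE plv8 IMMUTABLE;

-- A callback like this would then be handed to ST_MapAlgebra as a regprocedure, e.g.
-- 'plv8_mean_callback(double precision[], integer[], text[])'::regprocedure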


Continue reading "Writing PostGIS raster Map Algebra Callback Functions in PLV8"

by Leo Hsu and Regina Obe (nospam@example.com) at January 27, 2014 05:58 AM

January 24, 2014

Paolo Corti

The PostGIS cookbook is here!

After 18 months of hard work in our spare time, the PostGIS Cookbook has finally been published! The book, in a friendly tutorial fashion, covers a plethora of PostGIS related topics such as:

  • Importing and exporting data
  • Vectorial and raster management and analysis functions
  • Using desktop clients such as QGIS, OpenJump, gvSIG and UDig
  • pgRouting and how to use the Nth dimension
  • Writing PostGIS programs with Python
  • Using PostGIS to do web GIS with web mapping engines and frameworks such as MapServer, GeoServer, OpenLayers, Leaflet and GeoDjango
  • Maintenance, optimization and performance tuning

Thanks to the other authors - Steve, Bborie and Thomas - for the huge effort done in...

January 24, 2014 05:30 PM

January 20, 2014

Archaeogeek (Jo Cook)

Portable GIS v4

Here’s a quick and overdue announcement to say that I’m making a new version of Portable GIS available today, including QGIS 2. Consider this one a beta release, since I really want to upgrade PostGIS and GDAL when I get time. Additional upgrades in this version: Astun Technology’s Loader has been upgraded to the latest version, and Psycopg2 is now included.

Before you click on the link, please take time to read the main Portable GIS page, and also do me the favour of reporting any problems that you find at the portable GIS google group, or via Twitter. If you can provide me with a screenshot of an error, and let me know the exact version of Windows that you’re using, then I’ll do my best to fix.

You can pick up the beta version here. This is a dropbox link, so if it’s not available, then I’ve exceeded my bandwidth for the moment, so please try again later.

I hope to get a release out fairly soon with PostGIS 2.1 and GDAL 1.10, but this is not my day job so bear with me!

January 20, 2014 10:40 AM

January 17, 2014

Stephen Mather

Final “clubbed” book. Last read through…

image


by smathermather at January 17, 2014 01:38 AM

January 15, 2014

OpenShift by RedHat

Add Map Navigation to Your App With pgRouting on OpenShift

Some folks trying to navigate

Introduction

Greetings Shifters! Today I am going to introduce you to another piece of AWESOME geospatial technology that lets you compute travel routes on a network - pgRouting. We have it available by default with PostGIS on OpenShift. It adds the capability to take a routable network, a start location, and an end location, and then compute the "least cost route" between the two points. There are actually quite a few concepts in that statement, so let's unpack it some more.

Routing is a function of the algorithms + the data you feed it. In terms of the data, you would be wise to heed the old maxim: garbage in = garbage out. I think a lot of people forget that good routing depends on having really good data.
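
As a rough sketch of what that looks like in SQL (assuming a routable edge table named ways with gid, source, target and cost columns; the names and vertex ids are made up for the example, and the pgRouting 2.x calling form is used):

-- Least-cost route between two network vertices.
SELECT seq, id1 AS node, id2 AS edge, cost
FROM pgr_dijkstra(
    'SELECT gid AS id, source, target, cost FROM ways',
    42,      -- start vertex
    137,     -- end vertex
    false,   -- directed?
    false    -- has reverse cost?
);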

read more

by scitronpousty at January 15, 2014 02:27 PM

January 13, 2014

Boundless Geo

Best of PostGIS in 2013

Last year was a very significant year for PostGIS: the 2.1 release made the project even more feature complete and bug free, and PostGIS Day is now an institution. Unsurprisingly, many of the most popular posts on our blog over the past year were also focused on PostGIS, including a few top posts from 2012 and a chart-topper from 2011:

  • Indexed Nearest Neighbour Search in PostGIS: This oldie from 2011 remains a perennial favorite and answers the age-old question of “how do I find the N nearest things to this point?” (a minimal example of the operator involved follows this list)
  • PostGIS 2.1, what you need to know: With lots of bugs fixed, features added, and corner cases filled, Paul Ramsey pulls out the highlights from this most recent release.
  • PostGIS 2.0 New Features: Typmod: In the first of several posts highlighting new features in PostGIS 2.0, Paul discusses one of his personal favorites, support for type modifiers on geometry objects.
  • PostGIS 2.0 New Features: 3D/4D Indexing: The second post in the series highlights the ability to index in higher dimensions than the usual two.
  • PostGIS 2.0 New Features: ST_MakeValid: And the third post describes dealing with invalid features, the bane of PostGIS users since time immemorial.
  • Manage LIDAR with PostGIS: Although only released a few weeks ago, this post and associated tutorial about our newly-developed support for storing and analyzing LIDAR data within a PostgreSQL database quickly became among our most popular.
  • Celebrate PostGIS Day with Paul Ramsey and James Fee (Part 1, Part 2) as we continue our tradition of contemplating the myriad ways geospatial is now embedded in the fabric of IT.
  • Using PostGIS on Amazon RDS: In this recent post, Paul shows how to deploy a backed-up and replication-ready database in just a few minutes with Amazon Web Services and OpenGeo Suite — no hardware required.
  • Why We Sprint describes Paul’s experience attending an annual code sprint for C-based open source geospatial projects in Boston. Join Paul at the next sprint in Vienna this March!
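
For anyone who has not seen it, the indexed nearest-neighbour trick from that first post boils down to the KNN ordering operator; a minimal example (table name and coordinates invented for illustration) looks like:

-- Ten closest points of interest to a location, answered from the spatial index.
SELECT name
FROM pois
ORDER BY geom <-> ST_SetSRID(ST_MakePoint(-71.06, 42.36), 4326)
LIMIT 10;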

Interested in using PostGIS in your enterprise? Boundless provides support, training, and maintenance for PostGIS. Contact us to learn more.

The post Best of PostGIS in 2013 appeared first on Boundless.

by Rolando Peñate at January 13, 2014 05:27 PM

January 10, 2014

Stephen Mather

Editing in progress

Edit:

Free book to the first person to tell me the name of the children’s book on the armchair in the foreground:

 

Book editing in progress…

image


by smathermather at January 10, 2014 04:58 AM

January 01, 2014

Oslandia

Oslandia wishes you a brilliant 2014

voeux2014

The whole team at Oslandia wishes you all the best for 2014. May this year be brilliant, and may your projects see the light of day, both personally and professionally.


2013 has been a busy year at Oslandia. We enlarged our team and made sustained contributions to the open source GIS ecosystem, notably our leadership on PostGIS 3D development and our multiple contributions to QGIS 2.0. 2013 was also an opportunity to unveil our QGIS support offer.

And, as usual, we promoted and took part in numerous open source events, both local and international.


The year to come will certainly be just as rich and full of novelties. We will of course continue the work begun last year. Furthermore, we foresee innovations on topics like water management, transportation systems and Smart Cities...


The projects we develop, and the expertise and services we offer, are all centered on our clients' needs, your needs. This is therefore an opportunity to thank you for contributing to the growth of free and open source GIS software. We hope that 2014 will see plenty of collaborations and new projects taking place.



One date worth noting in your 2014 calendar already: save 20-22 May for the FOSS4G-fr conference, organized by OSGeo-fr, in Marne La Vallée.


We hope to see you soon,

Olivier Courtin, Vincent Picavet

by Vincent Picavet at January 01, 2014 06:00 PM

December 24, 2013

Duncan Golicher

Adapting the census mapping application to use INEGI’s data for Mexico

INEGI, the Mexican institute for national statistics and geography, places large quantities of data in the public domain. Geographical data is now provided through an efficient map server. However, most socio-economic data is only available directly in the form of Excel spreadsheets from the descarga masiva site: http://www3.inegi.org.mx/sistemas/descarga/ These data vary in quality but…

by Duncan Golicher at December 24, 2013 04:29 PM

Stephen Mather

Drone Pointillism

Just an image today.


Image of UAS derived points using structure from motion.


by smathermather at December 24, 2013 03:04 AM

December 21, 2013

Stephen Mather

Voronoi in PostGIS

PostGIS has for a time had ST_DelaunayTriangles for Delaunay Triangulation, and since 2.1 apparently can even natively create a 2.5D TIN using said function, which is pretty cool. I think with SFCGAL, we will eventually have true 3D TINs as well.

Overlay of Delaunay and Voronoi

We’re still waiting on native Voronoi support for PostGIS though. According to http://trac.osgeo.org/postgis/wiki/UsersWikiPostgreSQLPostGIS

“GEOS 3.5 is still in development but will have Voronoi which will be a feature in 2.2 only if you have GEOS 3.5.”

Vishal Tiwari, through a Google Summer of Code project under the mentorship of Sandro Santilli, completed the port of Voronoi to GEOS from the JTS Topology Suite. Now we just need to wait for PostGIS 2.2….

In the mean time,

http://geogeek.garnix.org/2012/04/faster-voronoi-diagrams-in-postgis.html

One caveat– this Python function doesn’t always produce just Voronoi cells, but also some artifact polygons.

Voronoi polygons with some artifacts

Once we have a table with Voronoi polygons, we can keep just the true Voronoi cells by counting the number of original points we find within each, and returning only the polygons which contain a single point:

CREATE TABLE true_voronoi AS
WITH tvoronoi AS (
	SELECT COUNT(*) AS count, v.geom
		FROM voronoi v, input_points p
		WHERE ST_Intersects(v.geom, p.geom)
		GROUP BY v.geom
		)
SELECT geom FROM tvoronoi WHERE count = 1;

voronoi without extra polygons

But it’s still not a perfect solution. I can’t wait for PostGIS 2.2….


by smathermather at December 21, 2013 05:39 PM

December 19, 2013

OpenShift by RedHat

Set up Local Access to OpenShift Hosted Services with Port Forwarding

Cartridge connection information available in the OpenShift Console

As a follow-up to my recent post on building Instant Mapping Applications with PostGIS and Nodejs, I wanted to outline my process for setting up a local application environment for testing and reviewing changes during development.

I'm using a Fedora 19 system - so installing, configuring, and managing my own local Postgres database is definitely one way to meet this application's local service requirements. However, OpenShift provides an easier way to fulfill these dependencies - offering secured access to your hosted database via on-demand ssh tunnels.

In this post, I'll show you how I use the rhc port-forward command to deliver local access to your OpenShift-hosted cartridges.

read more

by rjarvine at December 19, 2013 02:18 PM

December 18, 2013

Stephen Mather

2.5D TINs in PostGIS

(edited: changed TIN to TIN Z)
(edited again — function already exists as a flag in ST_DelaunayTriangles… .)

Uh, I wrote a function for nothin’…
As Regina points out in the comments, this functionality was rolled into 2.1 with a flag. Her example:

SELECT
    ST_AsText(ST_DelaunayTriangles(
       ST_Union(ST_GeomFromText(
       'POLYGON((175 150, 20 40, 50 60, 125 100, 175 150))'),
       ST_Buffer(ST_GeomFromText('POINT(110 170)'), 20) ),0.001,2) )
    AS dtriag;

For the record, my code is about 2% slower on a dataset that takes ~50 minutes to triangulate.

——————————————
Original Post
——————————————

Ever since ST_Delaunay made its way into PostGIS with GEOS 3.4, there’s been the promise of TINs from geometries in PostGIS. The promise is for 2.5D TINs.  What are these, and why aren’t they 3D? It’s all about overhangs– imagine the edge of a building.  A 3D TIN can handle that building edge with a true vertical wall, or even overhang.  A 2.5D TIN is like a 2.5D raster– no overlaps allowed.

With those limitations in mind, you can have TINs today if you want:
https://github.com/smathermather/postgis-etc/edit/master/3D/AsTin.sql

-- A small function to convert ST_Delaunay (requires GEOS 3.4) to a 2.5D Tin
-- Uses the hackerish approach of converting to text, and doing string replacement
--- for format conversion.

-- A Delaunay triangulation does not by itself formally make a TIN.  To do this,
-- we will require some modification to the output format.  If we perform an ST_AsText
-- on our new geometries we will see that they are POLYGON Z's wrapped in a
-- GEOMETRYCOLLECTION.

-- Thus there are two ways in which this is not a TIN-- the internal POLYGON Z are
-- implicitly triangles, but explicitly POLYGON Zs.  In addition, the external wrapper
-- for the collection of triangles is a GEOMETRYCOLLECTION, not a TIN.
-- Once we have this geometry in text form, the easiest way to fix these two things
-- is with string replacement, then converting back to binary.

-- We'll go from e.g. this:
-- GEOMETRYCOLLECTION Z (POLYGON Z ((-1.14864 0.61002 -2.00108,-0.433998 -0.0288903 -2.13766, ...
-- to this:
-- TIN ( ((-1.14864 0.61002 -2.00108,-0.433998 -0.0288903 -2.13766, ...

CREATE OR REPLACE FUNCTION AsTIN(geometry)
  RETURNS geometry AS
$BODY$

WITH dt AS
(
SELECT ST_AsText(ST_DelaunayTriangles(ST_Collect($1))) AS atext
),
replacedt AS
(
-- Remove polygon z designation, as TINs don't require it.
SELECT replace(atext, 'POLYGON Z', '') as ttext
  FROM dt
),
replacegc AS
(
-- change leading declaration to TIN
SELECT replace(ttext, 'GEOMETRYCOLLECTION Z', 'TIN Z') AS tintext
  from replacedt
),
tingeom AS
(
-- Aaaand convert back to binary.  Voila!
SELECT ST_GeomFromEWKT(tintext) AS the_geom FROM replacegc
)

SELECT the_geom FROM tingeom

$BODY$
  LANGUAGE sql VOLATILE
  COST 100;

Here’s an example TIN derived from UAS (non-weaponized drone) imagery:
Image showing UAS derived TIN


by smathermather at December 18, 2013 08:36 AM

December 16, 2013

Stephen Mather

UAS (drone) Footprint Geometries Calculated in PostGIS with SFCGAL — for real this time

In my earlier post, I made a claim that SFCGAL was used in this figure:

Figure showing overlapping viewing footprints from images from UAS flight.

It dawned on me afterwards that, while I was using 3D, I hadn’t actually employed any of the analysis goodies that come with SFCGAL. Well, here it is– a footprint as calculated using the view angles and a real terrain model:

Map showing TIN of intersection of digital terrain model and viewing cone from UAS.

Here it is compared with the initial calculation:

Comparison of previous figure with original less correct estimate.


by smathermather at December 16, 2013 03:32 AM

December 15, 2013

Stephen Mather

UAS (drone) Footprint Geometries Calculated in PostGIS — Individual Footprint

UAS (drone) Footprint Geometries Calculated in PostGIS (viewed in QGIS nightly), taking into account relative elevation, bearing, pitch, and roll, this time just one:

Nadir view of viewing pyramid for individual drone  image


by smathermather at December 15, 2013 07:50 PM

Stephen Mather

Aaargh! No: geometry ST_RotateX(geometry geomA, float rotRadians, geometry pointOrigin)

Edit– Code may be flawed; testing, testing. Please wait to use…

In PostGIS, ST_RotateZ has a couple forms: a version that rotates around 0,0 and a version that rotates around a point of origin:

geometry ST_Rotate(geometry geomA, float rotRadians, geometry pointOrigin)

ST_RotateX and ST_RotateY have had no equivalents– until now. These equivalents are dumb versions– they use a transformation/rotation/reverse-transformation to do their magic, which is maybe not as efficient as using ST_Affine, but I’m not smart enough for ST_Affine. Not today anyway. Repo here: https://github.com/smathermather/postgis-etc/tree/master/3D

geometry ST_RotateX(geometry geomA, float rotRadians, geometry pointOrigin)

ST_RotateX version:

-- Function: st_rotatex(geometry, double precision, geometry)
CREATE OR REPLACE FUNCTION ST_RotateX(geomA geometry, rotRadians double precision, pointOrigin geometry)
  RETURNS geometry AS
$BODY$

----- Translate geometry to nullsville (0,0,0) so the rotation by rotRadians will take place around the pointOrigin
WITH transformed AS (
    SELECT ST_Translate(geomA, -1 * ST_X(pointOrigin), -1 * ST_Y(pointOrigin), -1 * ST_Z(pointOrigin)) AS the_geom
    ),
----- Rotate in place
rotated AS (
    SELECT ST_RotateX(the_geom, rotRadians) AS the_geom FROM transformed
    ),
----- Translate back home
rotTrans AS (
    SELECT ST_Translate(the_geom, ST_X(pointOrigin), ST_Y(pointOrigin), ST_Z(pointOrigin)) AS the_geom
	FROM rotated
    )
----- profit
SELECT the_geom from rotTrans

;

$BODY$
  LANGUAGE sql VOLATILE
  COST 100;
geometry ST_RotateY(geometry geomA, float rotRadians, geometry pointOrigin)

ST_RotateY version:

-- Function: ST_RotateY(geometry, double precision, geometry)
CREATE OR REPLACE FUNCTION ST_RotateY(geomA geometry, rotRadians double precision, pointOrigin geometry)
  RETURNS geometry AS
$BODY$

----- Translate geometry to nullsville (0,0,0) so the rotation by rotRadians will take place around the pointOrigin
WITH transformed AS (
    SELECT ST_Translate(geomA, -1 * ST_X(pointOrigin), -1 * ST_Y(pointOrigin), -1 * ST_Z(pointOrigin)) AS the_geom
    ),
----- Rotate in place
rotated AS (
    SELECT ST_RotateY(the_geom, rotRadians) AS the_geom FROM transformed
    ),
----- Translate back home
rotTrans AS (
    SELECT ST_Translate(the_geom, ST_X(pointOrigin), ST_Y(pointOrigin), ST_Z(pointOrigin)) AS the_geom
	FROM rotated
    )
----- profit
SELECT the_geom from rotTrans

;

$BODY$
  LANGUAGE sql VOLATILE
  COST 100;

by smathermather at December 15, 2013 01:49 AM

December 14, 2013

Stephen Mather

Roll, Pitch, Yaw, ST_Rotate

Thinking out loud (and hoping I’m right), mapping the classic flight dynamics concepts of roll, pitch, and yaw onto PostGIS concepts of rotation (a minimal SQL sketch follows the list):
Diagram of jet liner with roll, pitch, yaw

  • Pitch Axis = ST_RotateX
  • Bearing (instead of yaw) = ST_RotateZ, using 90 minus the angle to convert compass bearings to Cartesian angles
  • Roll = ST_RotateY
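
A minimal SQL sketch of that mapping (column and table names are invented; the built-in two-argument rotation functions rotate about the origin, so rotating about a camera position would still need the translate/rotate/translate approach from the ST_RotateX/ST_RotateY post above):

SELECT ST_RotateY(
         ST_RotateX(
           ST_RotateZ(geom, radians(90 - bearing)),  -- bearing -> Z
           radians(pitch)),                          -- pitch   -> X
         radians(roll)) AS oriented_geom             -- roll    -> Y
FROM camera_views;  -- hypothetical table: geom, bearing, pitch, roll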

by smathermather at December 14, 2013 08:24 PM

    December 12, 2013

    OpenShift by RedHat

    Instant Mapping Applications with PostGIS and Nodejs


    After my post on Open Source Mapping with PHP and MongoDB, I wanted to follow up with a rewrite of our basic instant mapping demo using PostGIS, a full-featured collection of spatial mapping extensions for PostgreSQL.

    This app also takes advantage of node-restify, Leaflet maps, and map tiles from Stamen to visualize the locations of major National Parks and Historic Sites.

    We have a growing collection of these example mapping applications.

    read more

    by rjarvine at December 12, 2013 01:34 PM

    December 11, 2013

    Bill Dollins

    Building a Simple Geodata Service with Node, PostGIS, and Amazon

    tl;dr

    This post describes the construction of a simple, lightweight geospatial data service using Node.JS, PostGIS and Amazon RDS. It is somewhat lengthy and includes a number of code snippets. The post is primarily targeted at users who may be interested in alternative strategies for publishing geospatial data but may not be familiar with the tools discussed here. This effort is ongoing and follow-up posts can be expected.

    Rationale

    I'm always looking for opportunities to experiment with new tools and the announcement of PostgreSQL/PostGIS support on Amazon RDS piqued my curiosity. Over the past six months, I have run into the repeated need on a couple of projects to be able to get the bounding box of various polygon features in order to drive dynamic mapping displays. Additionally, the required spatial references of these projects have varied beyond WGS84 and Web Mercator.

    With that, the seeds of a geodata service were born. I decided to build one that would, via a simple HTTP call, return the bounding box of a polygon or the polygon itself, in the spatial reference of my choice as a single GeoJSON feature.

    I knew I wanted to use PostGIS hosted on Amazon RDS to store my data. Here are the rest of the building blocks for this particular application:

    1. Node.js
    2. Express web application framework for Node
    3. PG module for accessing PostgreSQL with Node
    4. Natural Earth 1:10M country boundaries

    Setting up PostGIS on Amazon RDS

    Setting up the PostgreSQL instance on RDS was very easy. I simply followed the instructions here for doing it in the AWS Management Console. I also got a lot of use out of this post by Josh Berkus. Don't forget to also set up your security group to govern access to your database instance as described here. I prefer to grant access to specific IP addresses.

    Now that the Amazon configuration is done, your RDS instance essentially behaves the same as if you had set it up on a server in your server room. You can now access the instance using all of the standard PostgreSQL tools with which you are familiar. This is good because we need to do at least one more thing before we load our spatial data: we have to enable the PostGIS extension. I find that it is easiest to accomplish this at the command line:

    psql -U {username} -h {really long amazon instance host name} {database name}

    Once you've connected, issue the command to enable PostGIS in your database:

    CREATE EXTENSION postgis;

    You may also want to enable topology while you're here:

    CREATE EXTENSION postgis_topology;

    This should complete your setup. Now you are ready to load data.
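
    A small extra step worth doing (not in the original write-up) is to confirm the extension actually installed:

    SELECT postgis_full_version();  -- reports the PostGIS, GEOS and PROJ versions in use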

    Loading Spatial Data

    As I mentioned above, we are now dealing with a standard PostgreSQL server that happens to be running on Amazon RDS. You can use whatever workflow you prefer to load your spatial data.

    I downloaded the Natural Earth 1:10M country polygons for this effort. Once downloaded, I used the DB Manager extension to QGIS to import the data to PostgreSQL. I also did a test import with OGR. Both worked fine so it's really a matter of preference.

    Building the Application

    I chose to use Node.js because it is very lightweight and ideal for building targeted web applications. I decided to use the Express web framework for Node, mainly because it makes things very easy. To access PostgreSQL, I used the node-postgres module. I was planning to deploy the application in an Ubuntu instance on Amazon EC2, so I chose to do the development on Ubuntu. Theoretically, that shouldn't matter with Node but the node-postgres module builds a native library when it is installed so it was a factor here.

    After building the package.json file and using that to install Express, node-postgres, and their dependencies, I built a quick server script to act as the web interface for the application. This is where Express really excels, in that it makes it easy to define resource paths in an application.

    server.js

    var express = require('express'),
        geo = require('./routes/geo');

    var app = express();

    app.get('/countries/:id/bbox', geo.bbox);
    app.get('/countries/:id/bbox/:srid', geo.bboxSrid);
    app.get('/countries/:id/polygon', geo.polygon);
    app.get('/countries/:id/polygon/:srid', geo.polygonSrid);

    app.listen(3000);
    console.log('Listening on port 3000...');

    The four "app.get" statements above define calls to get either the bounding box or the actual polygon for a country. When the ":srid" parameter is not specified, the resulting feature is returned in the default spatial reference of WGS84. If a valid EPSG spatial reference code is supplied, then the resulting feature is transformed to that spatial reference. The ":id" parameter in all of the calls represents the ISO Alpha-3 code for a country. You will notice that the application listens on port 3000. More on that later.

    The next step is to define the route handlers. In this application, this where interaction with PostGIS will take place. Note that each of the exports correspond to the callback functions in the app.get statements above.

    geo.js

    var pg = require('pg');
    var conString = "postgres://username:password@hostname.rds.amazonaws.com:5432/database"; //TODO: point to RDS instance

    exports.bbox = function(req, res) {
        var client = new pg.Client(conString);
        client.connect();
        var crsobj = {"type": "name", "properties": {"name": "urn:ogc:def:crs:EPSG:6.3:4326"}};
        var idformat = "'" + req.params.id + "'";
        idformat = idformat.toUpperCase();
        var query = client.query("select st_asgeojson(st_envelope(shape)) as geojson from ne_countries where iso_a3 = " + idformat + ";");
        var retval = "no data";
        query.on('row', function(result) {
            client.end();
            if (!result) {
              return res.send('No data found');
            } else {
              res.setHeader('Content-Type', 'application/json');
              //build a GeoJSON feature and return it
              res.send({type: "feature", crs: crsobj, geometry: JSON.parse(result.geojson), properties: {"iso": req.params.id, "representation": "extent"}});
            }
          });
    };

    exports.bboxSrid = function(req, res) {
        var client = new pg.Client(conString);
        client.connect();
        var crsobj = {"type": "name", "properties": {"name": "urn:ogc:def:crs:EPSG:6.3:" + req.params.srid}};
        var idformat = "'" + req.params.id + "'";
        idformat = idformat.toUpperCase();
        var query = client.query("select st_asgeojson(st_envelope(st_transform(shape, " + req.params.srid + "))) as geojson from ne_countries where iso_a3 = " + idformat + ";");
        var retval = "no data";
        query.on('row', function(result) {
            client.end();
            if (!result) {
              return res.send('No data found');
            } else {
              res.setHeader('Content-Type', 'application/json');
              res.send({type: "feature", crs: crsobj, geometry: JSON.parse(result.geojson), properties: {"iso": req.params.id, "representation": "extent"}});
            }
          });
    };

    exports.polygon = function(req, res) {
        //TODO: Flesh this out. Logic will be similar to bounding box.
        var client = new pg.Client(conString);
        client.connect();
        var crsobj = {"type": "name", "properties": {"name": "urn:ogc:def:crs:EPSG:6.3:4326"}};
        var idformat = "'" + req.params.id + "'";
        idformat = idformat.toUpperCase();
        var query = client.query("select st_asgeojson(shape) as geojson from ne_countries where iso_a3 = " + idformat + ";");
        var retval = "no data";
        query.on('row', function(result) {
            client.end();
            if (!result) {
              return res.send('No data found');
            } else {
              res.setHeader('Content-Type', 'application/json');
              res.send({type: "feature", crs: crsobj, geometry: JSON.parse(result.geojson), properties: {"iso": req.params.id, "representation": "boundary"}});
            }
          });
    };

    exports.polygonSrid = function(req, res) {
        var client = new pg.Client(conString);
        client.connect();
        var crsobj = {"type": "name", "properties": {"name": "urn:ogc:def:crs:EPSG:6.3:" + req.params.srid}};
        var idformat = "'" + req.params.id + "'";
        idformat = idformat.toUpperCase();
        var query = client.query("select st_asgeojson(st_transform(shape, " + req.params.srid + ")) as geojson from ne_countries where iso_a3 = " + idformat + ";");
        var retval = "no data";
        query.on('row', function(result) {
            client.end();
            if (!result) {
              return res.send('No data found');
            } else {
              res.setHeader('Content-Type', 'application/json');
              res.send({type: "feature", crs: crsobj, geometry: JSON.parse(result.geojson), properties: {"iso": req.params.id, "representation": "boundary"}});
            }
          });
    };

    The PostGIS spatial SQL for each function is shown in the "client.query" calls in the code above. This approach is very similar to constructing SQL calls in a number of other application environments. You will notice that a coordinate reference system object is constructed and attached to each valid response, which is structured as a GeoJSON feature. The code currently assumes EPSG codes but that may be addressed in a future version.
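
    Stripped of the Node plumbing, those client.query calls are building plain PostGIS SQL. The bboxSrid route for Ireland in Web Mercator, for instance, boils down to something like:

    -- Equivalent of GET /countries/irl/bbox/3785
    SELECT ST_AsGeoJSON(ST_Envelope(ST_Transform(shape, 3785))) AS geojson
    FROM ne_countries
    WHERE iso_a3 = 'IRL';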

    The above modules do most of the heavy lifting. The full code for this sample is available here.

    To test the application, simply issue the following command in a terminal:

    node server.js (this assumes you are running from the same directory in which server.js resides; the file extension is optional)

    Your web application is now listening on port 3000. In a browser, visit the following URL:

    http://localhost:3000/countries/irl/bbox

    This should return a GeoJSON feature representing the bounding box of Ireland in WGS84. You can then test the other three calls with appropriate parameters. To get the bounding box in Web Mercator, go to:

    http://localhost:3000/countries/irl/bbox/3785

    Deploying the Application

    The application should now be ready to deploy. In my case, I created an Ubuntu EC2 instance (free tier). Using SSH, I made sure Node and git were installed on the machine. Additionally, I installed Forever which allows a Node application to run continuously (similar to a service on Windows). This can also be done using an upstart script but I chose Forever. I may switch to using PM2 at some point.

    Now, it's simply a matter of installing the application code to the instance via git, wget, or the method of your choice. Once installed, be sure to go to the folder containing the code and issue the "npm install" command. This will read the package.json and install Express, node-postgres, and the other dependencies. Since some native code is built in the process, you may need to issue the command under sudo.

    I mentioned above that the application listens on port 3000. The Ubuntu instance, by default, will not allow the application to listen on port 80. This can be mitigated in a number of ways but I issued the following command to redirect traffic from 80 to 3000. Since this instance is single-use, this approach is sufficient.

    sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 3000

    Once you are ready to go, you'll want to start the application with the following command:

    forever start server (again assuming you are executing from the directory containing server.js)

    A couple of Amazon notes: 1) You may want to assign an elastic IP to your instance for a persistent IP address and 2) you'll want to remember to configure your RDS security group to allow access from your instance's IP address.

    Conclusion

    If everything has gone correctly, you should be able to execute the above URLs (using your instance IP address) and get a response like the following, which you should be able to load directly into QGIS or another GeoJSON-literate client. Altogether, I was able to assemble this in one evening. This small collection of open-source tools, combined with the Amazon infrastructure, seems to provide a straightforward path to a hosted geodata service. This example is intentionally simple but PostGIS provides a robust collection of functions that can be exploited in a similar manner, leading to more advanced processing or analysis. I will continue my experimentation but am encouraged by what I have seen so far.

    Sample Response

    irl_bbox.json

    {
      "type": "feature",
      "crs": {
        "type": "name",
        "properties": {
          "name": "urn:ogc:def:crs:EPSG:6.3:4326"
        }
      },
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [-10.4781794909999, 51.4457054710001],
            [-10.4781794909999, 55.386379299],
            [-5.99351966099994, 55.386379299],
            [-5.99351966099994, 51.4457054710001],
            [-10.4781794909999, 51.4457054710001]
          ]
        ]
      },
      "properties": {
        "iso": "irl",
        "representation": "extent"
      }
    }

    December 11, 2013 08:18 PM

    December 10, 2013

    Boundless Geo

    Using PostGIS on Amazon RDS with OpenGeo Suite

    Earlier this month, Amazon announced the addition of PostgreSQL to its “relational database service”, generally known as “RDS”.

    RDS is database-in-the-cloud with all the annoying, repetitive administrative tasks taken care of: automatic backups, restore, replication, replication across availability zones, and I/O provisioning. You can build your own version of RDS using just vanilla cloud servers and your brain, but you’ll spend a lot of time on it, and it still might not work as well.

    The first version of RDS was MySQL-based, and Amazon has since added support for Oracle and SQL Server, but the arrival of Amazon RDS for PostgreSQL is a big deal, because it means PostGIS is now an RDS option too.

    Using Amazon RDS for PostgreSQL and OpenGeo Suite on AWS Marketplace, setting up a cloud web service is entirely a push-button affair.

    Launching Amazon RDS for PostgreSQL

    Getting the PostgreSQL instance up and running is almost comically easy.

    • Log into AWS (you’ll need to set up an account) and go to the RDS section of the console. Setting up an instance only requires a button click.
      Launch a DB Instance
    • Select a PostgreSQL instance type, and then choose details of the deployment. A big instance? Replicate across zones? Apply minor upgrades (do this always)?
      DB Instance Type
    • Set up security and database. For new users, the only tricky part of AWS servers and databases is security. “VPC”s are “virtual private clouds”, network segments that are kept isolated from the world by firewall rules. If you run your database and servers in the same VPC, they can easily connect to each other.
      Security Setup
    • And then the database starts up!
      DB Starting

    Configuring Security & Adding Data

    Pretty incredible, and particularly gratifying to get up-and-running with a fully backed-up system in just a handful of mouse clicks. But an empty running database holds limited interest… what can it do with data?!?

    In order to load data directly from your home/office computer to the RDS instance, you’ll have to open up connections into the virtual private cloud for your home network. Add a rule to the VPC “security group” assigned to your RDS instance, allowing your computer access to port 5432 (the PgSQL port).
    Security Group Rule
    Once the rule was changed, I could connect easily to the RDS instance using my favorite command-line SQL tool, psql:

    psql --host boundless.cyvm9oy2wq1b.us-west-2.rds.amazonaws.com
    --port 5432
    --username boundless
    --password
    --dbname gistest

    I loaded two tables of spatial data, an electoral districts (ed) file of 85 multipolygons and a census distribution area (da) file of 55529 multipolygons. Then I did a full spatial join of the two.

    SELECT ed.edname, Sum(da.popn) AS popn
    FROM ed
    JOIN da ON ST_Intersects(ed.geom, da.geom)
    WHERE ST_Contains(ed.geom, ST_Centroid(da.geom))
    GROUP BY ed.edname
    ORDER BY ed.edname;

    The RDS instance took about 13 seconds. My MacBook laptop (1.8GHz) took 28 seconds. So there's some decent CPU horsepower in these AWS instances.

    Since the RDS instance was accessible from my desktop computer, I decided to see if I could connect to it and view data from QGIS.

    QGIS View

    Yep, no problem!

    Connecting to OpenGeo Suite on AWS

    Finally, I wanted to see how OpenGeo Suite performed accessing the RDS instance. I started OpenGeo Suite on an AWS instance and ensured it was a member of the VPC my RDS was in.

    Suite connected to the RDS instance no problem, and I was able to quickly point and click to configure my big census table as a layer.

    Census Data in Suite

    Speed was great for zoomed-in rendering. For zoomed-out rendering, where the whole 50MB table needed to be scanned and rendered, things slowed down a bit, which is to be expected when the data and renderer are running on separate instances with a network interconnect.

    The future is certainly cloudy!

    With Amazon Web Services, I had a backed-up and replication-ready database up and running in under an hour, and OpenGeo Suite tied into it in another hour. With the knowledge I’ve gained, next time will take only a few minutes.

    The post Using PostGIS on Amazon RDS with OpenGeo Suite appeared first on Boundless.

    by Paul Ramsey at December 10, 2013 11:05 AM

    December 09, 2013

    Safe Software

    PostGIS on Amazon RDS: A Spatial Database in the Cloud

    An announcement at the AWS re:Invent 2013 conference that got me excited was the new support for PostgreSQL on Amazon Web Services' (AWS) Relational Database Service (RDS).

    I crossed my fingers as I read the announcement, hoping that PostGIS was part of this. “No way!” I thought when I read that PostGIS was indeed included. I knew I had to try it out.

    PostGIS on AWS RDS

    Image via Techcrunch article: http://fme.ly/techcrunch

    Running PostGIS on RDS

    AWS Cloud Logo

    I fired up an RDS instance of PostgreSQL to see for myself. After connecting with the standard PostgreSQL admin tool, pgAdmin, and executing the statement CREATE EXTENSION postgis;, I had a scalable spatial database in the cloud at my fingertips. With this there are now two spatial data solutions within RDS: SQL Server and PostGIS! Spatial data in the cloud just keeps getting easier.
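
    The whole enabling step comes down to two statements; a minimal sketch:

    -- Enable spatial support on the RDS PostgreSQL instance.
    CREATE EXTENSION postgis;

    -- Sanity check: report the PostGIS/GEOS/PROJ versions now available.
    SELECT postgis_full_version();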

    PostGIS Spatial Database Logo

    I used FME to read and write data to this PostGIS database just as I would with any other PostGIS database. Both raster and vector data worked flawlessly. This whole process, from creating the database to using it, took about 10 minutes. Wow!

    FME and PostGIS in the Cloud

    FME Cloud Data Integration Technology

    The implications of this are huge, especially when I think about our new FME Cloud iPaaS service. FME Cloud currently has a PostGIS database as part of its installation. Now with AWS RDS supporting PostGIS, it is simple for users to use this cloud-based PostGIS database instead.

    Doing this is a game changer.

    With PostGIS now supported by RDS, anyone can easily deploy a class-leading, fault-tolerant database that spans multiple data centers (or AZs, in Amazon speak) in minutes.  These databases can also be scaled up or down to meet changing requirements. And of course all of the management and backup issues are handled by Amazon. No hardware to buy or manage.

    Best thing is that all FME users, from FME Desktop to FME Server to FME Cloud, can take advantage of this today!

    The post PostGIS on Amazon RDS: A Spatial Database in the Cloud appeared first on Safe Software Blog.

    by Don Murray at December 09, 2013 10:24 PM

    December 02, 2013

    Duncan Golicher

    QuickFire PostGIS visualisation: Using SQL views with Geoserver

    One of the fastest ways of deploying the results of PostGIS queries on the web takes advantage of Geoserver SQL views. The process is very straightforward once you have installed and configured a Geoserver. The Geoserver layer from a PostGIS data store is just defined as an SQL query with placeholders for the variables…
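
    As an illustration, a parameterized SQL view in Geoserver might be defined with a query like the one below; the places table and the minpop parameter are made-up examples, and %minpop% is Geoserver's placeholder syntax:

    -- Hypothetical table and parameter; Geoserver substitutes %minpop% at request time.
    SELECT name, population, geom
    FROM places
    WHERE population > %minpop%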

    by Duncan Golicher at December 02, 2013 02:55 AM

    December 01, 2013

    Duncan Golicher

    Deforestation in Southern Mexico: Displaying PostGIS raster queries using Google maps

    The recent publication of Hansen and colleagues’ global deforestation map in a user friendly Google maps earth explorer format (including street view for Mexico) has received a great deal of publicity. So I placed the Landsat classification for Southern Mexico that our group published in Google maps using more or less the same format. Click…

    by Duncan Golicher at December 01, 2013 04:03 PM

    November 27, 2013

    Bill Dollins

    Consider the 'Alternative'

    When I was in college, I had a psychology professor who posited that you could train a cat (a dodgy proposition at best) to take a circuitous route to its food bowl by only rewarding that behavior. He was clearly a behaviorist and was convinced that you could completely condition the instinct to go straight to the food bowl out of the cat. To my knowledge, this professor did not own a cat and never attempted to test his assertion.

    I was reminded of this after reading my friend Atanas Entchev's post in reaction to the PostGIS Day hangout panel discussion. In his post, Atanas describes difficulty in convincing customers to consider open-source geospatial tools. These customers and prospects are comfortable with their proprietary tools and associated workflows and are reluctant to consider switching. I have encountered this attitude many times myself so I take no issue with the observation. Barriers to exit are real considerations, regardless of the new technology being considered. Organizations align themselves around their tools to achieve maximum efficiency with them. I discussed these issues at a talk I gave last year to the New Jersey Geospatial Forum about how organizations can extend their existing geospatial technology investments with open-source technologies. These issues are very real for any organization with a mature, extended investment in a particular technology stack.

    Atanas went on to liken the attitude to that with which some people view alternative medicine and I can see his point. Traditional GIS has set itself apart from the rest of the technology world for so long that users are generally conditioned to believe that GIS workflows should involve a series of Rube Goldberg machinations involving file-based data sets, some proprietary scripting, and possibly some application-level business logic to relate and/or join data as necessary. This has taken various forms over the years but diagrams of those workflows tend to look the same.

    Standing in contrast to such things, PostGIS looks alien, or "alternative." In truth, it is not "alternative" but rather "standard." As an example, here is a map I produced a few weeks ago showing the average ages of bridges by county. (I am not a cartographer.) It is a simple aggregation of the National Bridge Inventory (tens of thousands of records) down to the county level (3,100-ish counties). All of the data processing was done in PostgreSQL/PostGIS using nothing more exotic than SQL aggregate functions and some joins. None of the operations took longer than 6 seconds on my very pedestrian laptop. When I was done, I used QGIS to play with visualization and then dumped out the static GeoJSON for use in Leaflet.
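
    The aggregation itself needs nothing beyond a join and an aggregate; a hedged sketch, in which the table and column names (nbi, counties, year_built, fips) are assumptions:

    -- Average bridge age per county; table and column names are assumptions.
    SELECT c.fips,
           c.name,
           avg(2013 - b.year_built) AS avg_bridge_age
    FROM counties c
    JOIN nbi b ON b.county_fips = c.fips
    GROUP BY c.fips, c.name
    ORDER BY avg_bridge_age DESC;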

    For my many friends who are regular users of PostGIS, this is nothing exotic. For some of my friends who regularly use commercial tools, this is interesting but not earth-shattering. But for a large portion of my friends who are comfortable with traditional tools and workflows, the time-to-market for this effort (35 minutes from the time I downloaded the NBI to the time I pushed the map to GitHub) has them taking notice. This entire workflow involved SQL extended with OGC-compliant spatial objects. (Side note: I have been hard on OGC's web efforts but the Simple Features Specification has been a quiet workhorse across the geospatial industry for over a decade. It's a good example of the benefit that well-designed standards can provide.) The map is being served from static content over simple HTTP with some client-side Javascript handling visualization. No heavy APIs or middleware involved or needed. The QGIS part was really necessitated by my own cartographic limitations, but I could have fully generated the GeoJSON from SQL as well.
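
    Generating the GeoJSON straight from SQL is essentially one function call; a minimal sketch, where the county_bridge_age table and its columns are assumptions (assembling a full FeatureCollection takes a little more JSON plumbing with row_to_json/array_to_json):

    -- Emit each county's geometry as GeoJSON directly from the database;
    -- the table and column names here are assumptions.
    SELECT name,
           avg_bridge_age,
           ST_AsGeoJSON(geom, 6) AS geometry
    FROM county_bridge_age;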

    This example is fairly simplistic but I have good friends that are using PostGIS, and nothing more, to perform analyses and produce results for decision makers while sitting in meetings. This type of turnaround is standard in other market segments and the geospatial industry should expect nothing less. It requires nothing more than a strong foundation in SQL, mastery of spatial processes, and detailed understanding of your own data.

    So I have come to realize that the mainstream GIS community has become very much like my professor's theoretical cat; conditioned to take the long way to the end result when more direct paths are clearly available. What's more, they have become conditioned to think of such approaches as normal. Geospatial analytical operations can be very complex and the approaches to performing them were, in the past, necessarily convoluted due to the lack of understanding of spatial data types and operations within mainstream platforms. Those barriers have been rapidly disappearing over the past decade or so, but the user community has been slow to let go of its comfort with familiar tools and convoluted approaches. As I stated above, organizational barriers to exit are real considerations, but the inherent efficiencies available in modern geospatial tools such as PostGIS make the transition worth the effort.

    November 27, 2013 06:01 PM

    November 21, 2013

    Boundless Geo

    Manage LIDAR with OpenGeo Suite

    We live in an age of exploding geospatial data volumes, as sensors get cheaper and more universally accessible. LIDAR data is no exception: high quality data that was once only affordable for narrow building and infrastructure sites is now routinely collected over entire regions or states. The tragedy of bulk data collection is that often it is used once for a project purpose, and then archived in a desk drawer. The challenge is to store data so that it remains useful and accessible while not creating a data management burden.

    Presidential Palace in Haiti from LIDAR data

    New LIDAR Features in OpenGeo Suite

    OpenGeo Suite 4.0 integrates our newly-developed support for storing and analyzing LIDAR data within a PostgreSQL database. The new Pointcloud extension provides the following features:

    • Random access of small working areas from large LIDAR collections.
    • Aggregation of pointcloud rows into large return blocks.
    • Filtering of pointcloud data using any dimension as a filter condition.
    • Lossless compression of LIDAR data with compression ratios between 3:1 and 4:1.

    In combination with the PostGIS extension, even more features are available:

    • Spatial indexing for high speed spatial search and retrieval.
    • Clipping and masking of pointclouds to arbitrary clipping polygons.
    • Conversion of pointcloud data to and from PostGIS geometry types for advanced spatial analysis.
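
    As a rough illustration of how the Pointcloud and PostGIS pieces combine, here is a minimal SQL sketch using the pgpointcloud functions, with an assumed patch table lidar(id, pa pcpatch) and polygon table aoi(geom geometry):

    -- Clip patches to an area of interest; table and column names are assumptions.
    SELECT PC_Intersection(l.pa, a.geom) AS clipped
    FROM lidar l
    JOIN aoi a ON PC_Intersects(l.pa, a.geom);

    -- Explode patches into points and read one dimension from each
    -- (the dimension name depends on the pointcloud schema in use).
    SELECT PC_Get(pt, 'Z') AS z
    FROM (SELECT PC_Explode(pa) AS pt FROM lidar) AS pts;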

    To efficiently load LIDAR and other pointcloud data into the database, OpenGeo Suite 4.0 includes the PDAL pointcloud processing tools. PDAL supports reading and writing multiple formats, such as LAS, LASzip, Oracle Pointcloud, PostgreSQL Pointcloud and text/CSV. In addition, PDAL allows for in-translation processing of data, including reprojection, rescaling, calculation of new dimensions, removal of dimensions, and raster gridding.

    Learn more with course materials from Boundless

    Try these new features yourself with our LIDAR analysis and visualization tutorial, which walks through loading and analyzing LIDAR data and visualizing it in 3D using KML and Google Earth.

    Interested in working with Paul? We’re looking for interns at our Victoria, BC office!

    The post Manage LIDAR with OpenGeo Suite appeared first on Boundless.

    by Paul Ramsey at November 21, 2013 02:53 PM

    PostGIS Development

    PostGIS Day Hang-out Panel discussion

    November 21st, 2013 was PostGIS day and to celebrate we held a panel discussion, hosted by James Fee, with panel members Paul Ramsey, Bborie Park, Stephen Mather, and Josh Berkus.

    Video here


    by Regina Obe at November 21, 2013 12:00 AM

    November 19, 2013

    Safe Software

    FME 2014 Sneak Peek: More Ado About Nothing

    Hi FME’ers,
    In a previous Evangelist post (from 2012) I talked about nulls in data and how FME handled them. In 2014 we have exciting new functionality that deals with null values in a proper way.

    Because we’ve changed how FME behaves, if you think your source data may contain nulls then it’s important you read this article.

    EvangelistBanner7

    You were nothing to me once, and I was contented; you are now nothing to me again. But how different the second nothing is from the first! – Thomas Hardy in Far From the Madding Crowd critiques our new functionality!

    EvangelistBanner7

    Nulls in General
    First let’s recap what we mean by null. In general there are three types of ‘nothing’ values:

    • The attribute exists and is an empty value (Empty)
    • The attribute exists and is NULL (NULL)
    • The attribute doesn’t exist (Non-existent)

    I say ‘three’ types, but really there are many more. A numeric attribute with a value of zero could be taken to mean nothing. Also there are NaN (Not-A-Number) values, nils in XML, Nodata values in raster, and many others (did you know Excel has its own special version of ‘nothing’?) – in fact I’m told by our developers that they found fifteen (15) types of nothing; but here we’re looking at the three primary types of nothing and why null is special.

    NullsWarpingData

    Yes, fifteen types of nothing! When you get all of them in a single dataset it causes a vortex in space-time! True.

    But for most people nulls will occur in a database. A true null means a field has been deliberately set to ‘nothing’. It’s not the same as an empty value at all. The closest non-IT analogy I can think of is when you fill in your name on a questionnaire. You could fill in your middle name (in which case it has a value) or you could write “n/a” or cross the field out. That would be equivalent to an empty field; it’s a positive value and we know you have no middle name. However, if you left that field blank, that would be a null; we can’t tell whether or not you have a middle name. It’s a value that signifies “unknown”, sort of a placeholder for real data.
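
    In SQL terms the distinction is easy to see; a tiny sketch with a made-up people table:

    -- Middle name known to be empty: an empty string.
    SELECT count(*) FROM people WHERE middle_name = '';

    -- Middle name unknown: a true NULL, which must be tested with IS NULL.
    SELECT count(*) FROM people WHERE middle_name IS NULL;

    -- Note that "middle_name = NULL" matches nothing, because NULL compares as unknown.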

    EvangelistBanner7

    Nulls and FME
    Up until now FME did not have explicit support for nulls; we supported empty values and non-existent attributes (plus other forms of nothing, like NaN) but null wasn’t a concept that our internal data model understood. If an attribute was null then it was treated as if it didn’t exist. However, in 2014 we’ve completely overhauled that data model to incorporate null value support. So FME can now read null values, write null values, and carry out transformations on them too.

    In terms of databases, it means you can now much more easily identify values that are null and write nulls to new rows, plus change existing rows to be null (which you couldn’t do before).

    DifferentFormatsDifferentNulls

    But, like people, formats come in all shapes and sizes and no two are the same. Interestingly, some formats will support all three of our primary nothing types; some will support just one or two. For example, Shape supports empty values, but not nulls, whereas GeoMedia supports nulls, but doesn’t recognize empty values. Databases generally store both null and empty values, and let you query them too. JSON nicely supports the concept of all three, whereas VPF has at least five types of nothing, maybe six depending on the field type (getting close to a space-time ripple there)! So even with null support, the behaviour of FME’s Readers and Writers will vary according to format, and you have to be aware of that.

    EvangelistBanner7

    Reading and Writing Nulls
    If your source data includes null values the FME Reader will now emit null values. You don’t need to update or replace the Reader in your workspace, it will happen automatically. When you write the data out then the Writer will write null values too. If the format you are writing doesn’t support nulls, then the data will be handled appropriately – replaced with zero, or an empty value, or whatever that format requires.

    So a translation involving nulls should just work. You don’t need to do anything special to your workspace, even if it’s an older one.

    Which formats does this apply to? I’ve created a list at the bottom of the article for you to check, and you can find that list on FMEpedia too. We cover all the major databases, but if there’s a format you think we’ve missed then please do let us know.

    EvangelistBanner7

    Data Inspection
    So now that nulls have been incorporated into the FME data model, we need a way to inspect the data, and a way to distinguish true nulls from other forms of “nothing”.

    When you use the FME Data Inspector to examine your data before translation (you are inspecting your data before translation, aren’t you?) you’ll see null values displayed in the table view:

    NullTableView

    …and in the Feature Information window:

    NullFeatureInfo

    When an attribute exists, but is empty, the Table View window will show it as an empty field (i.e. the cell in the table is blank) but when the attribute is missing completely (i.e. just doesn’t exist for that feature) then it flags it as such:

    MissingValueTableView

    Here TaxCoord is part of the source schema, but these features don’t have it at all. So now you can totally identify which attributes are null, which are empty, and which are missing.

    EvangelistBanner7

    Nulls and Transformers
    OK, so if we’re reading this data, and have an internal representation, then we also need a way for you to handle it in Workbench. Therefore we’ve updated several transformers to specifically handle nulls, updated many others to simply cope with null data, and added a new transformer to carry out new null functionality.

    The list of transformers with null support is, like the formats, listed as an appendix below.

    To give you some examples, though, let’s first look at the AttributeCreator.

    AttrCreatorNulls

    Notice how you can set (or create) a null attribute using the new entry on the drop-down list, or by typing “<null>” into that field. You’ll find you can also do this with other attribute-setting transformers, such as the AttributeValueMapper:

    AttrValueMapperNulls

    The Tester transformer doesn’t look much different to before, except for rewording the operators to be more precise, but now when you choose “Attribute is Null” it really is checking for a true null value:

    TesterNullHandling

    Other examples would be the ChangeDetector and the Matcher. The ChangeDetector and Matcher will now truly compare nulls so that “null” is not the same as “missing” as it was before. The change in the parameters here is just the option to turn this behaviour on or off – that way your workspace can be made to run the way it always has (and in fact – for backwards compatibility reasons – it will default to doing so):

    MatcherNullHandling

    Like I mentioned, other transformers have been updated to better cope with nulls, even if their parameters and outward appearance haven’t changed. For example, the StringPadder won’t attempt to pad a null string, because that would change it to no longer being null. It will still let you pad empty strings, because that won’t affect the attribute’s status.

    And then we’ve added a new transformer to handle nulls, called the NullAttributeMapper.

    The NullAttributeMapper is a replacement for the NullAttributeReplacer transformer. Basically the function here is to map attribute values to or from null, depending on the original attribute value:

    ParkNullExample

    For example here, if my ParkId attribute is missing, or is empty, or has a value of 9999, then I will replace it with null. I’d probably do this when I’m reading from a format that doesn’t support nulls, and writing to one that does. That way I get true null values in my output dataset. Similarly I could map null values to an empty field or a specific value:

    ParkNullExample2

    I probably wouldn’t need to map to an empty field just for a Writer, because if the format doesn’t support nulls, FME will automatically convert null values to something appropriate for the format. But I might do that if I deliberately wanted to give them a new value or make them empty to match other data.

    The other great things about this transformer are that it maps multiple attributes, and that it’s not just limited to nulls; so you could even use it instead of the AttributeValueMapper for some scenarios.

    EvangelistBanner7

    Benefits and Consequences
    I think the benefits of this are fairly obvious, so I won't say anything except to mention that we've also updated our APIs to include this functionality, ensuring developers get the benefits as much as regular users.

    However, you do need to be prepared for the effect of reading nulls as “real” attribute values, rather than as a missing attribute.

    Many transformers will ignore missing attributes, and you may have come to rely upon that behaviour. But that data might be read as a null now, and that’s going to be treated differently. For example, the ListElementCounter won’t include attributes that are missing. It will, however, include nulls – because these are real values. The two transformers that I think need most careful scrutiny are the Tester and the FeatureMerger.

    With the Tester you might check for an attribute that is empty (attr="") or that “doesn’t exist”. In FME 2014 that attribute might now be read as null instead; in that case the test would fail, where it passed before. Basically, the functionality is better in 2014, it’s just that you’ve come to rely on the older (less good) behaviour. In this scenario you could change the test to be “Attribute is Null” instead. Otherwise, the new NullAttributeMapper transformer can be inserted to change nulls to missing and so revert the behaviour.

    For the FeatureMerger, think of what happens when you merge attributes from one feature to another, particularly when a “requestor” attribute has the same name as a “supplier” attribute. The requestor attribute will be overwritten, except for where the supplier attribute is missing. But now, if your missing records are represented as nulls, that requestor would get overwritten as a null. Again it’s because null is a “real value”, and again the NullAttributeMapper can be used to workaround this.

    Yes, cases like this are going to be fairly rare and obscure, but it’s worth being aware of them. We’d like you to embrace the new behaviour, but if you want to revert then use the NullAttributeMapper. Basically if you put a NullAttributeMapper into your workspace, directly after your Readers, and map “Null” to “Missing” then the behaviour in the rest of the workspace would be the same as before (although it wouldn’t be as good!)

    For the latest info on null behaviour (and the formats/transformers supported) see our null support page on FMEpedia.

    EvangelistBanner7

    Conclusion
    I think this update is going to be oh so useful to many of our users. Our developers have done a super job implementing this big change (any time you update the core data model it’s a big change) and then incorporating it into all these different Readers, Writers, and transformers.

    If you notice any parts that could be improved (either functionality or user experience) then please do let us know about your experiences (good or bad).

    NewBlogSignature

    EvangelistBanner7

    Appendix 1: List of null-complete formats:

    • FFS (obviously)
    • ArcSDE formats
    • Autodesk SDF
    • DB2
    • Geodatabase API
    • Geodatabase ArcObjects and family
    • GeoMedia
    • GML
    • Informix
    • Ingres
    • MS Access and Excel
    • MSSQL Server (Windows Azure SQL Database)
    • MySQL (MariaDB) (Google Cloud SQL)
    • Netezza
    • ODBC2
    • Odata
    • Oracle and family
    • PostGIS and family (inc Redshift)
    • Teradata (JDBC and TPT formats)
    • TFS
    • VPF

    Appendix 2: List of null-support transformers (ones with exposed null parameters or functionality):

    • AttributeClassifier
    • AttributeCreator
    • AttributeFilter
    • AttributeRenamer
    • AttributeValueMapper
    • ChangeDetector
    • Matcher
    • NullAttributeMapper
    • Tester

    NB: Do be aware that many other transformers will now handle nulls without parameters; for example the Logger will log null values, the InlineQuerier can handle null values in its database, the ListHistogrammer will count the number of nulls, and so on.

    The post FME 2014 Sneak Peek: More Ado About Nothing appeared first on Safe Software Blog.

    by Mark Ireland at November 19, 2013 05:48 PM

    Bill Dollins

    GIS Day After

    It's the morning of November 21st, but not for long. You open one eye. Just one; it's best not to rush such things. Apparently, you finally came to rest in the ball pit you all made using the squishy globes from myriad conferences past. A cursory scan tells you the GIS lab is trashed. It starts to come back to you: the rousing game of "Pin the Certificate on the Khakis." Yes, there are your pleated khakis on the wall with everyone's training and GISP certificates stuck on or around them with pushpins. Someone won in what would have been a most painful way if the khakis had been on your body. The loin cloth fashioned from the old hard-copy topos (which you are still wearing). The fact that you let the intern talk you into finally opening a Twitter account and your glee at discovering you could attach photos to geocoded tweets with your BlackBerry.

    You look around the room at your coworkers strewn across the floor. There's the ArcObjects guy still sporting his "war paint" from the plotter toner. There's your lead analyst with a face tattoo of the town's land-use drawn in marker and, she proudly insisted, accurately projected in state plane. Slowly, you are able to account for everyone on your GIS staff. Good, no one has gone missing...except the intern.

    The door opens and you turn to see the intern standing there in all of her college-kid resilience, letting in far too much sunlight for your comfort. You're not sure it's possible to hate anyone more in this moment.

    "Oh, good," she says, "you're awake. Are you going to do the hangout?"

    "Hangout?" you mumble.

    "Yeah, James Fee is doing a PostGIS Day hangout with a panel of experts on PostGIS. I told you about this. I told you about PostGIS. Don't you remember? It starts in about an hour."

    It does sound familiar. You give it some thought. What better way to shake off the cobwebs from the bacchanalia of the "BEST GIS DAY EVAR!!!" (so says the hand-lettered Sharpie on the lab's wall) than a little geo-hair-of-the-dog? Maybe it's time to finally pay attention to this open-source stuff. You look around to survey the damage one more time. Besides, ditching the Oracle license might just about pay for this.

    "Yeah," you reply, "I'll be there in five minutes. Close the door, please."

    November 19, 2013 02:13 PM

    November 18, 2013

    Pierre Racine

    Launching the PostGIS Add-ons! A new PostGIS extension for pure PL/pgSQL contributions.

    I'm glad today to launch the PostGIS Add-ons. This new PostGIS extension aims to provide a quick, more agile way to release pure PL/pgSQL functions developed by the PostGIS user community. It includes 15 new functions. Among them are two functions to aggregate raster/vector overlay statistics, one to extract values from a vector coverage to a raster, one to generate random points inside a polygon, and two new aggregate functions that help remove overlaps in a table of geometries.

    You can download the PostGIS Add-ons from my GitHub account. I'm already at version 1.19. The second version number will increment with each significant addition. The major version number will increase when there is a major change to some function parameters, or when we stop supporting the current minimum required versions of PostgreSQL and PostGIS, which are now PostgreSQL 9.x and PostGIS 2.1.x.

    You can contribute to the PostGIS Add-ons with your own PL/pgSQL functions by following the Fork & Pull GitHub model. Just make sure that:

    • your function is written in pure PL/pgSQL (no C!),
    • your function is generic enough to be useful to other PostGIS users, 
    • you follow the function and variable naming and indentation conventions already in use in the files,
    • you document your function like the ones already provided, 
    • you provide some tests in the postgis_addons_test.sql file,
    • you add the necessary DROP statements in the postgis_addons_uninstall.sql file.

    So why not contribute your PL/pgSQL work directly to the core PostGIS project? Good question! It's easier to contribute pure PL/pgSQL functions to the PostGIS Add-ons for a number of reasons:

    • Because you can do it yourself without relying on one of the core PostGIS developers. No need to compile PostGIS, no need to write to the mailing list and explain the usefulness of your function, no need to write a ticket and wait for one of the main PostGIS developers to add it for you. Just make sure your code is OK and send a pull request to the GitHub project. It should be added pretty fast and trigger a new PostGIS Add-ons release. Your new function will be released in a matter of hours or days, not weeks or months.
    • Because your function does not fit well in the PostGIS SQL API. Sometimes PL/pgSQL functions are not as generic as the ones provided in the PostGIS SQL API, or do not perform as well as most core PostGIS functions, which are written in C. The PostGIS Add-ons will try to be very open to functions performing very high level tasks or which, because they are implemented in PL/pgSQL, do not perform as well as they would if implemented in C. Requesting a function to be implemented in C generally means you will not see it added to the PostGIS core for weeks or months. The PostGIS Add-ons let you release it NOW, making it easily accessible, documented, testable and changeable before it gets a more definitive signature and finds its way into the PostGIS core in C. Some PostGIS Add-ons functions will find their way into the core after being converted to C; some will not and will stay as they are.

    So the main goal is to provide casual PL/pgSQL developers, not necessarily skilled in C or not wanting to build PostGIS (on Windows, for example, which can take days), a channel to contribute to PostGIS. I'm sure there are dozens of PostGIS users out there who have developed quite useful stuff but never took the time to share it because they found it "too complicated". The PostGIS Add-ons is a first effort to give them an easy and consistent way to contribute to PostGIS.

    The PostGIS Add-ons tries to make things as simple as they can be. It consists of a single, easy-to-install postgis_addons.sql file, accompanied by an uninstall file and a test file. For the sake of simplicity, documentation for each function is embedded in the main file, just before each function declaration. A list of all the functions available, with a short description of each, is provided at the beginning of the file.
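
    Installation then comes down to one psql call per file; a minimal sketch, assuming the files sit in the current directory and that the target database (here called mydb, a placeholder) already has PostGIS installed:

    # Install the add-ons, run the bundled tests, and (if ever needed) uninstall them.
    psql -d mydb -f postgis_addons.sql
    psql -d mydb -f postgis_addons_test.sql
    psql -d mydb -f postgis_addons_uninstall.sql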

    So here is a list of what functions are available in this first release (in order of what I think are the most interesting contributions):

    ST_ExtractToRaster() - Computes a raster band by extracting values for the centroid or the footprint of each pixel from a global geometry coverage, using different methods like count, min, max, mean, value of biggest geometry, or area-weighted mean of values. This function uses ST_MapAlgebra() to compute stats about a geometry or a raster coverage for each pixel of a raster. Stats are computed using custom SQL queries that exploit the spatial index on the coverage being extracted. New stats queries can easily be added on demand. A list of possible stats is provided in objective FV.27 of the PostGIS raster working specifications. A number of them are already implemented.

    ST_SplitAgg() - Returns the first geometry as a set of geometries after being split by all the second geometries taking part in the aggregate. This function is a more robust and powerful alternative to the solution provided in the PostGIS wiki for overlaying two vector coverages. It is normally used to remove overlaps in a vector coverage by splitting overlapping polygons. It does not require unioning all the geometries, so it can work on very large geometry tables. A tolerance parameter allows removing slivers resulting from the splitting.

    ST_DifferenceAgg() - Returns the first geometry after having removed all the subsequent geometries in the aggregate. It is used to remove overlaps in a geometry table by erasing parts from all but one of many overlapping polygons.

    ST_RandomPoints() - Generates points located randomly inside a geometry.

    ST_GlobalRasterUnion() - Builds a new raster by extracting all the pixel values from a global raster coverage using different methods like count, min, max, mean, stddev and range. Similar to ST_Union, but slower and more flexible.

    ST_AreaWeightedSummaryStats() - Aggregate function computing statistics on a series of intersected values, weighted by the area of the corresponding geometry. The most notable statistic is the weighted mean, which is very useful when intersecting buffers with a raster coverage.

    ST_SummaryStatsAgg() - Aggregate function computing statistics on a series of rasters generally clipped by a geometry.

    ST_CreateIndexRaster() - Creates a new raster as an index grid. Many parameters allow creating grids indexed following any order.

    ST_BufferedSmooth() - Returns a smoothed version of the geometry. The smoothing is done by making a buffer around the geometry and removing it afterward. This technique, sometimes called dilation/erosion or inward/outward polygon offsetting, was described years ago by Paul Ramsey and more recently by "spluque" in the PostGIS wiki (with a nice animated GIF).

    ST_BufferedUnion() - Alternative to ST_Union(geometry) that makes a buffer around each geometry before unioning and removes it afterward. Used when ST_Union leaves undesirable internal vertices after a complex union, or when holes need to be removed from the resulting union. With some very complex and detailed geometry coverages, this function speeds up and simplifies unioning all the geometries into one.

    ST_DeleteBand() - Removes a band from a raster.

    ST_AddUniqueID() - Adds a column to a table and fills it with unique integers starting at 1.

    ST_NBiggestExteriorRings() - Returns the n biggest exterior rings of the provided geometry, based on their area or their number of vertices. Mostly useful to the ST_BufferedUnion() function.

    ST_TrimMulti() - Returns a multigeometry from which simple geometries having an area smaller than the tolerance parameter have been removed. Mostly useful to the ST_SplitAgg() function.

    ST_ColumnExists() - Returns true if a column exists in a table. Mostly useful to the ST_AddUniqueID() function.

    I hope the PostGIS Add-ons will be useful to some people and trigger a new wave of contributions to PostGIS.



    by Pierre Racine (noreply@blogger.com) at November 18, 2013 04:58 PM

    Paul Ramsey

    BC Electoral Redistribution and Population Dispersion

    2013/01/09 Update: I have converted this blog post into a document and submitted it as a comment on the white paper.



    I've been asked by my technical readers to stop writing so much about politics, but I cannot help myself, and this week I have the perfect opportunity to apply my technical skills to a local political topic.

    History and Background

    Like Britain and the USA (and very few other jurisdictions anymore), British Columbia has a first-past-the-post representative electoral system. The province is divided up into "electoral districts" (also known as "ridings" in the British tradition) and each district returns a single member to the Legislature. In each district, the candidate with the most votes wins the district, even if they only obtain a plurality. In the Legislature, the party with most seats forms government.

    In such a system, the geographical layout of the districts has a great deal of importance, because it is possible for a party to win a majority of votes in the province, but a minority of seats in the legislature, if the votes are concentrated in particular seats. This actually happened in British Columbia in 1996. It also happened in the USA in the 2000 Presidential election, since their presidential Electoral College effectively acts like a weighted version of a first-past-the-post Legislature.

    In a representative democracy, it's important that everyone's vote have the same weight, which means ensuring that each district has approximately the same number of people in it. As relative populations grow in some regions and shrink in others, districts can become unbalanced, and districts need to be redrawn. In the USA, this "redistricting" process is often driven by partisan considerations, and can lead to districts like this (try out this awesome gerrymandered districts puzzle game):

    Fortunately, since 1989 British Columbia's districts have been drawn every 10 years by non-partisan "Electoral Boundaries Commissions", and the primary consideration has been creating districts that are as equal in population as possible while allowing for effective representation.

    Effective Representation

    BC is a big place and a place of extremes. The smallest district in BC is in downtown Vancouver (Vancouver-West End), with an area of under 500 hectares: it takes less than 30 minutes to walk across it and 48,500 people live there.

    The largest district is in the north-west of the province (Stikine), with an area of almost 20,000,000 hectares: about the size of Ireland, Switzerland, Denmark and the Netherlands, combined. Just over 20,000 people live in it. When you're dealing with areas this sparsely populated "effectiveness of representation" begins to have some concrete meaning.

    On the other hand, the principle of "one person, one vote" is the corner-stone of democracy, and the goal of an electoral boundary re-distribution is to try to achieve it, as far as possible. There is a tension inherent in the process.

    2008 Commission Catastrophe

    Population growth in BC over the last generation has been concentrated in the south: mostly in Vancouver and its suburbs, with some on Vancouver Island and in Kelowna. Commissions have dealt with this growth through a combination of increasing the number of seats in the Legislature (to suppress the growth of the average seat size), and slowly increasing the size of the rural districts (to keep defensibly close to the average).

    In 2008, this process reached a tipping point, as the Commission recommended two new seats, and the transfer of three seats from rural areas to urban areas. Rural BC exploded in anger, and the government of the day rushed in legislation directing the Commission to add more seats than recommended and to avoid removing seats from certain rural areas.

    At this point, though the process remained non-partisan (both parties in the Legislature supported the new plan), it had become thoroughly politicized (the carefully considered deliberations of the Commission had been hastily overturned by politicians for public relations purposes).

    Formalization of Politicization

    No doubt remembering the tumult of the 2008 experience, the current government of BC has released a proposal for the rules governing the next Electoral Boundary commission. The proposal aims to avoid a messy politicization of the process at the end, by quietly politicizing it in advance:

    • The Commission may not recommend adding any further seats to the Legislature
    • The Commission may not remove seats from three protected regions: North, Columbia-Kootenay, and Cariboo-Thompson

    The protected regions look like this.

    Note that the Okanagan region is isolated from the rest of the "unprotected" areas of the province, making it impossible to juggle population into or out of the region. That means the Okanagan can either gain or lose a whole seat, but never lose "half" a seat by having population juggled in or out via boundary changes.

    Anyone with a passing familiarity with BC electoral geography will recognize that this proposal entrenches an already large and growing deviation from the principle of one-person-one-vote, but I want to calculate just how large, and also to measure the "fairness" of this particular proposal.

    Population

    The electoral district boundaries of BC are readily available as GIS files online, but do not have population information attached (and would be out of date if they did, since they pre-date the most recent census).

    Similarly, the StatsCan boundary files can be downloaded, and the attribute file giving the census 2011 population in each block is also available. There are about 500-800 blocks in each electoral district, making for a very fine-grained profile of where people are concentrated in each district.

    I loaded the GIS files into a PostGIS spatial database for analysis. Once the electoral districts (ed) and dissemination blocks (db) were loaded, calculating the electoral district population in PostGIS was a simple spatial join query:


    SELECT ed.edabbr, ed.edname, sum(db.popn) AS popn
    FROM ed, db
    WHERE ST_Intersects(ed.geom, db.geom)
    AND ST_Contains(ed.geom, ST_Centroid(db.geom))
    GROUP BY ed.edabbr, ed.edname;

    The results of this calculation and others in this article can be seen in the spreadsheet I've placed online.

    A quick summary of the population results shows that, among other things:

    • The current distribution is extremely lopsided, with the most heavily populated riding (Surrey-Cloverdale, 73042) having well over 3 times the population of the least populated (Stikine, 20238)
    • The current provincial average population is 51765 per riding
    • The average population in the 17 "protected" ridings is 35609, 31% less than the provincial average
    • The average population in the 68 "unprotected" ridings is 55804, 8% higher than the provincial average
    • A vote in the protected regions will be over 1.5 times more "powerful" than one in the unprotected regions
    • Of the 85 ridings, 26 are below average and 59 are above average, indicating that the problem of underpopulation is concentrated in a minority of ridings

    There is no doubt that the government proposal will enshrine the regional imbalance in representation, and further worsen it as continued migration into the south pushes the balance even further out of line.

    "Fair" Imbalances

    Legal challenges to imbalanced representation have resulted in court decisions that indicate that it is constitutional within limits, and with reasonable justification. The limits generally accepted by the courts are +/- 25% of the provincial average, and the starting point of this proposal already exceeds that on average--some individual ridings (like Stikine) will be much worse. Political commentators in BC are already musing about whether ridings built under this scheme would survive a court challenge.

    Of more analytical interest is whether the scheme of selecting "protected regions" is a good one for choosing which ridings should receive preferential treatment.

    "Representing" a riding involves being available to your constituents, meeting with other orders of government in your riding (cities, school boards), and attending local events. Representation is very much tied up with being where the people are.

    • If the people are all in one place, near together, then representing them is easy.
    • If the people are spread out, in many different localities, then representing them is hard.

    Can we measure the "spreadoutness" of people? Yes, we can!

    Calculating Dispersion

    Each riding contains several hundred census dissemination blocks, each of which has a population associated with it. Imagine measuring the distance between each block, and all the other blocks in the riding, and weighting that distance by the population at each end.

    For Vancouver-Fairview, the picture looks like this.

    The blocks are fairly regular, the population is all very close together, and the dispersion is not very high.

    For Skeena, the picture looks like this.

    The population is concentrated in two centers (Terrace and Kitimat) reasonably far apart, giving a much higher dispersion than the urban ridings.

    In mathematical terms, the dispersion score is the population-weighted mean distance between blocks:

    dispersion = sum_i sum_j ( pop_i * pop_j * dist(i,j) ) / sum_i sum_j ( pop_i * pop_j )

    In the database, after creating a table of census blocks that are coded by riding, the calculation looks like this.


    WITH interim AS (
    SELECT
    a.ed,
    max(a.edname) AS edname,
    sum(b.popn * a.popn * st_distance_sphere(a.pt, b.pt)) AS score,
    sum(b.popn * a.popn) AS p2
    FROM popn a, popn b
    WHERE a.ed = b.ed
    GROUP BY a.ed
    )
    SELECT ed, edname, score/p2/1000 AS score
    FROM interim;

    Taking the ratio of the distance scaled populations against the unscaled populations allows populations that are far apart to dominate ones that are close together. Scaling the final result down by 1000 just makes the numbers more readable.

    As before, the results of this calculation and others in this article can be seen in the spreadsheet.

    Is Regional Protection "Fair"

    Using the measure of dispersion allows us to evaluate the government proposal on its merits: does protecting the North, Kootenays, Cariboo and Thompson protect those ridings that are most difficult to represent?

    In short, no.

    The regional scheme protects some difficult ridings (Stikine) but leaves others (North Island) unprotected. It also protects ridings that are not particularly dispersed at all (Kamloops-South Thompson), while leaving more dispersed ridings (Powell River-Sunshine Coast, Boundary-Similkameen) unprotected.

    Among the larger ridings, Skeena is notable because even though it is the 10th largest riding by area, and 10th sparsest by population density, it's only the 17th most dispersed. There are many smaller ridings with more dispersion (Powell River-Sunshine Coast, Nelson-Creston, Boundary-Similkameen). This is because most of the people in Skeena live in Terrace and Kitimat, making it much easier to represent than, say, North Island. Despite that, Skeena's population is 43% below the average, while North Island's is 5% above.

    Kamloops-South Thompson is the least dispersed (score 15.2) protected riding, and it's worth comparing it to the similar, yet unprotected, Nanaimo-North Cowichan (score 16.2).

    Kamloops-South Thompson (protected) consists of a hunk of Kamloops, and a string of smaller communities laid out to the east for 50KM along Highway 1.

    Nanaimo-North Cowichan (unprotected) consists of a hunk of Nanaimo, and a string of smaller communities laid out to the south for 45KM along Highway 1 (and some settled islands).

    What is it about Kamloops-South Thompson that recommends it for protected status along with truly dispersed difficult ridings like Stikine? Nothing that I can determine.

    Let the Commission Work

    The intent of the government's proposal to amend the Redistribution Act is clearly to avoid the firestorm of protest that accompanied the 2008 Commission report, and it's good they are thinking ahead.

    They need to think even further ahead: the consequences of having the boundaries enacted, then reversed in court, will be far more disruptive than allowing the Commission to proceed with the necessary work of redistributing BC's districts to more fairly reflect our actual population distribution.

    The end result of an unconstrained Commission will be fair boundaries that still reflect the representation needs of dispersed ridings by giving them lower populations within the limits already acknowledged by the courts: +/- 25% with a handful of exceptions (I'm looking at you, Stikine).

    I encourage you to explore the data on dispersion, and how it relates to the regional "protection" scheme, in the spreadsheet.


    by Paul Ramsey (noreply@blogger.com) at November 18, 2013 08:30 AM