A community member on the postgis-users mailing list had a question recently:
A common spatial need is to find a polygon that accurately represents a set of points. The convex hull of the points often does not provide this, since it can enclose large areas which contain no points. What is required is a non-convex hull, often termed the concave hull.
A concave hull is generally considered to have some or all of the following properties:
- The hull is a simply connected polygon
- It contains all the input points
- The vertices in the hull polygon boundary are all input points
- The hull may or may not contain holes
- Simple geometric basis: this allows the user to understand the effect of the parameter and aids in determining an effective value
- Scale-free (dimensionless): this allows a single parameter value to be effective on varying sizes of geometry, which is essential for batch or automated processing
- Local (as opposed to global): A local property (such as edge length) gives the algorithm latitude to determine the concave shape of the points. A global property (such as area) over-constrains the possible result.
- Monotonic area: larger (or smaller) values produce a sequence of more concave areas
- Monotonic containment :the sequence of hulls produced are topologically nested
- Convex-bounded: an extremal value produces the convex hull
This is a well-studied problem, and many different approaches have been proposed. Some notable ones are:
- Alpha shapes - Edelsbrunner, Kirkpatrick, Seidel (1983): On the shape of a set of points in the plane
- KNN - Moreira and Santos (2007): CONCAVE HULL: A K-NEAREST NEIGHBOURS APPROACH FOR THE COMPUTATION OF THE REGION OCCUPIED BY A SET OF POINTS.
- Delaunay Erosion (Chi-shapes) - Duckham, Kulik, Worboys, Galton (2008) Efficient generation of simple polygons for characterizing the shape of a set of points in the plane
- Nearest-Point Digging - Park and Oh (2012) A New Concave Hull Algorithm and Concaveness Measure for n-dimensional Datasets
Of these, Delaunay Erosion (Chi-shapes) offers the best set of features. It is straightforward to code and is performant. It uses the control parameter of Edge Length Ratio, a fraction of the difference between the longest and shortest edges in the underlying Delaunay triangulation. This is easy to reason about, since it is scale-free and corresponds to a simple property of the point set (that of distance between vertices). It can be extended to support holes. And it has a track record of use, notably in Oracle Spatial.
Recently the Park-Oh algorithm has become popular, thanks to a high-quality implementation in Concaveman project (which has spawned numerous ports). However, it has some drawbacks. It can't support holes (and likely not disconnected regions and discarding outlier points). As the paper points out and experiment confirms, it produces rougher outlines than the Delaunay-based algorithm. Finally, the control parameter for Delaunay Erosion has a simpler geometrical basis which makes it easier to use.
Given these considerations, the new JTS ConcaveHull algorithm utilizes Delaunay Erosion. The algorithm ensures that the computed hull is simply connected, and contains all the input points. The Edge Length Ratio is used as the control parameter. A value of 1 produces the convex hull; 0 produces a concave hull of minimal size. Alternatively the maximum edge length can be specified directly. This allows alternative strategies to determine an appropriate length value; for instance, another possibility is to use a fraction of the longest edge in the Minimum Spanning Tree of the input points.
The recently-added Tri data structure provides a convenient basis for the implementation,. It operates as follows:
- The Delaunay Triangulation of the input points is computed and represented as a set of of Tris
- The Tris on the border of the triangulation are inserted in a priority queue, sorted by longest boundary edge
- While the queue is non-empty, the head Tri is popped from the queue. It is removed from the triangulation if it does not disconnect the area. Insert new border Tris into the queue if they have a boundary edge length than the target length
- The Tris left in the triangulation form the area of the Concave Hull
Thanks to the efficiency of the JTS Delaunay Triangulation the implementation is quite performant, approaching the performance of a Java port of Concaveman.
Optionally holes can be allowed to be present in the hull polygon (while maintaining a simply connected result). A classic demonstration of this is to generate hulls for text font glyphs:
- Disconnected Result - It is straightforward to extend the algorithm to allow a disconnected result (i.e. a MultiPolygon). This could be provided as an option.
- Outlier Points - It is also straightforward to support discarding "outlier" points.
- Polygon Concave Hull - computing a concave "outer hull" for a polygon can be used to simplify the polygon while guaranteeing the hull contains the original polygon. Additionally, an "inner hull" can be computed which is fully contained in the original. The implementation of a Polygon Concave Hull algorithm is well under way and will be released in JTS soon.