Releases: arctern-io/arctern

arctern-0.3.0

20 Jul 09:08
0294677

Arctern Release Notes

0.3.0

We have a number of improvements and updates in this release. Here are some highlights that you may be interested in:

  • arctern_spark - The arctern_spark package is the counterpart of arctern in the Apache Spark ecosystem. Built on Koalas, arctern_spark implements GeoSeries and GeoDataFrame classes whose APIs are almost the same as those of their counterparts in the arctern package. So, you can easily migrate your arctern codebase to arctern_spark when working with large datasets.

  • Improvements - Add 11 GeoSeries APIs implemented with C++ multi-threading, three GeoSeries APIs that read or write geometries, and two top-level functions (sjoin and read_file).

  • Plot - arctern.plot.choroplethmap now supports MultiPolygons. Functions suffixed with _layer generate PNG images encoded with base64. unique_value_choroplethmap and unique_value_choroplethmap_layer support specifying the color for each geometry according to its label.

  • GeoDataFrame - Starting with this release, Arctern begins to support GeoDataFrame. A GeoDataFrame object is a pandas.DataFrame that has columns with geometry.

  • RESTful Service - Introduce arctern-webserver, which is built on Apache Zeppelin, to provide a RESTful service. arctern-webserver provides interactive geospatial data analysis services that help data engineers, data analysts, and data scientists increase their productivity. You can use a web-based notebook to perform geospatial data analysis without using the command line or knowing the cluster details.
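
The Plot highlight above says that the _layer functions generate base64-encoded PNG images. As a minimal sketch of how such a string could be consumed, the snippet below decodes a base64 string and checks for the PNG signature. The payload here is a stand-in built inside the example, not real Arctern output, and decode_png_layer is a hypothetical helper name.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def decode_png_layer(encoded):
    """Decode a base64 string and verify that the result starts like a PNG."""
    raw = base64.b64decode(encoded)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    return raw


# Stand-in payload for illustration only: the PNG signature plus dummy bytes.
sample = base64.b64encode(PNG_SIGNATURE + b"dummy").decode("ascii")
png_bytes = decode_png_layer(sample)
```

The decoded bytes could then be written to a file or handed to an image library.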

New features

GeoSeries APIs
Name | Description
GeoSeries.is_empty | Tests whether each geometry in the GeoSeries is empty.
GeoSeries.boundary | Returns the closure of the combinatorial boundary of each geometry in the GeoSeries.
GeoSeries.union(other) | For each geometry in the GeoSeries and the corresponding geometry given in other, returns a geometry representing the union of the two geometries.
GeoSeries.exterior | For each geometry in the GeoSeries, returns a line string representing the exterior ring of the geometry.
GeoSeries.difference(other) | For each geometry in the GeoSeries and the corresponding geometry given in other, returns a geometry representing the part of the first geometry that does not intersect with the other.
GeoSeries.symmetric_difference(other) | For each geometry in the GeoSeries and the corresponding geometry given in other, returns a geometry representing the portions of the two geometries that do not intersect.
GeoSeries.scale(factor_x, factor_y[, origin]) | Scales all geometries in the GeoSeries to a new size.
GeoSeries.affine_transform(matrix) | Returns a GeoSeries with transformed geometries.
GeoSeries.translate(offset_x, offset_y) | Returns a GeoSeries with translated geometries shifted by offsets along each dimension.
GeoSeries.rotate(angle[, origin, use_radians]) | Returns a GeoSeries with rotated geometries on a 2D plane.
GeoSeries.disjoint(other) | For each geometry in the GeoSeries and the corresponding geometry given in other, tests whether the first geometry is disjoint from the other.
GeoSeries.from_file(fp[, bbox, mask, item]) | Constructs a GeoSeries from a file or OGR dataset.
GeoSeries.to_file(fp[, mode, driver]) | Writes a GeoSeries to a file or OGR dataset.
GeoSeries.to_json([show_bbox]) | Returns a GeoJSON string representation of the GeoSeries.
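
To make the transformation entries above more concrete, here is a minimal pure-Python sketch of a 2D affine transform, with translation, scaling, and rotation expressed as special cases. It works on plain coordinate tuples rather than a GeoSeries. The coefficient order (a, b, d, e, xoff, yoff) and the default origin of (0, 0) are assumptions made for this illustration, not confirmed details of Arctern's API.

```python
import math


def affine_transform(points, matrix):
    """Apply x' = a*x + b*y + xoff and y' = d*x + e*y + yoff to each point."""
    a, b, d, e, xoff, yoff = matrix  # assumed coefficient order
    return [(a * x + b * y + xoff, d * x + e * y + yoff) for x, y in points]


def translate(points, offset_x, offset_y):
    """Shift every point by the given offsets."""
    return affine_transform(points, (1, 0, 0, 1, offset_x, offset_y))


def scale(points, factor_x, factor_y, origin=(0.0, 0.0)):
    """Scale about origin; the offsets keep the origin point fixed."""
    ox, oy = origin
    matrix = (factor_x, 0, 0, factor_y, ox - factor_x * ox, oy - factor_y * oy)
    return affine_transform(points, matrix)


def rotate(points, angle, origin=(0.0, 0.0), use_radians=False):
    """Rotate counter-clockwise about origin."""
    theta = angle if use_radians else math.radians(angle)
    c, s = math.cos(theta), math.sin(theta)
    ox, oy = origin
    matrix = (c, -s, s, c, ox - c * ox + s * oy, oy - s * ox - c * oy)
    return affine_transform(points, matrix)


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
moved = translate(square, 2, 3)
stretched = scale(square, 2, 3)
turned = rotate([(1, 0)], 90)
```
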
GeoDataFrame APIs
Name | Description
GeoDataFrame.to_geopandas() | Transforms an arctern.GeoDataFrame object to a geopandas.GeoDataFrame object.
GeoDataFrame.from_geopandas(gdf) | Constructs an arctern.GeoDataFrame object from a geopandas.GeoDataFrame object.
GeoDataFrame.to_json([na, show_bbox, geometry]) | Returns a GeoJSON string representation of the GeoDataFrame.
GeoDataFrame.from_file(filename, **kwargs) | Constructs a GeoDataFrame from a file or URL.
GeoDataFrame.to_file(filename[, driver, ...]) | Writes a GeoDataFrame to a file.
GeoDataFrame.crs | The Coordinate Reference System (CRS) of the arctern.GeoDataFrame.
GeoDataFrame.set_geometry(col[, inplace, crs]) | Sets an existing column in the GeoDataFrame as a geometry column, which is used to perform geometric calculations later.
GeoDataFrame.dissolve([by, col, aggfunc, ...]) | Dissolves geometries within by into a single observation.
GeoDataFrame.merge(right[, how, on, ...]) | Merges two GeoDataFrame objects with a database-style join.
Top-level functions
Name | Description
arctern.sjoin(left_df, right_df, left_col, ...) | Spatially joins two GeoDataFrames.
arctern.read_file(*args, **kwargs) | Returns a GeoDataFrame from a file or URL.
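
As a rough illustration of what a spatial join computes, the sketch below pairs rows from two collections whenever their geometries satisfy a spatial predicate. It uses plain tuples (points and axis-aligned boxes) and a brute-force loop. The function names and the "point lies in box" predicate are hypothetical simplifications; they do not reflect how arctern.sjoin is implemented or which predicates it accepts.

```python
def point_in_box(point, box):
    """Return True if the point lies inside or on the edge of the box."""
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y


def spatial_join(points, boxes):
    """Return (point_index, box_index) pairs where the point lies in the box."""
    return [
        (i, j)
        for i, point in enumerate(points)
        for j, box in enumerate(boxes)
        if point_in_box(point, box)
    ]


points = [(1, 1), (5, 5), (9, 9), (3, 3)]
boxes = [(0, 0, 2, 2), (4, 4, 10, 10)]
pairs = spatial_join(points, boxes)
```

The point at index 3 matches no box, so it does not appear in the result, which mirrors an inner join.
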
Plot functions suffixed with _layer
Name | Description
arctern.plot.pointmap_layer(w, h, points[, ...]) | Plots a point map layer.
arctern.plot.weighted_pointmap_layer(w, h, ...) | Plots a weighted point map layer.
arctern.plot.heatmap_layer(w, h, points, ...) | Plots a heat map layer.
arctern.plot.choroplethmap_layer(w, h, ...) | Plots a choropleth map layer.
arctern.plot.iconviz_layer(w, h, points, ...) | Plots an icon map layer.
arctern.plot.fishnetmap_layer(w, h, points, ...) | Plots a fishnet map layer.
Plot functions with unique values
Name | Description
arctern.plot.unique_value_choroplethmap(ax, ...) | Plots a choropleth map in matplotlib with a set of unique colors.
arctern.plot.unique_value_choroplethmap_layer(w, ...) | Plots a choropleth map layer with a set of unique colors.

arctern-0.2.0

06 Jun 03:46
0056798

Arctern 0.2.0 Release Notes

The 0.2.0 release focuses on two things:

  • GeoSeries interfaces. Starting with this version, Arctern supports GeoPandas-like interfaces. In Arctern, we would like to equip these popular interfaces with efficient C++ multi-threaded implementations as well as GPU-accelerated implementations.
  • Enriched geospatial data pre-processing functionality. Three map-matching functions and one aggregation function are added.

GeoSeries interfaces

GeoSeries.is_valid | Check if each geometry is of valid geometry format.
GeoSeries.length | Calculate the length of each geometry.
GeoSeries.is_simple | Check whether each geometry is "simple".
GeoSeries.area | Calculate the 2D Cartesian (planar) area of each geometry.
GeoSeries.geometry_type | Get the geometry type of each geometry.
GeoSeries.centroid | Compute the centroid of each geometry.
GeoSeries.convex_hull | For each geometry, compute the smallest convex geometry that encloses it.
GeoSeries.npoints | Calculate the number of points in each geometry.
GeoSeries.envelope | Compute the double-precision minimum bounding box geometry for each geometry.
GeoSeries.point(x, y[, crs]) | Construct Point geometries according to the coordinates.
GeoSeries.polygon_from_envelope(min_x, ...) | Construct polygon (rectangle) geometries from the minimum and maximum x and y coordinates and the specified coordinate system.
GeoSeries.geom_from_geojson(json[, crs]) | Construct geometry from the GeoJSON representation string.
GeoSeries.as_geojson() | Transform each geometry to a GeoJSON-formatted string.
GeoSeries.to_wkt() | Transform each geometry to a WKT-formatted string.
GeoSeries.to_wkb() | Transform each geometry to a WKB-formatted bytes object.
GeoSeries.buffer(distance) | For each geometry, return a geometry that represents all points whose distance from this geometry is less than or equal to "distance".
GeoSeries.precision_reduce(precision) | For the coordinates of each geometry, reduce the number of significant digits to the given number.
GeoSeries.intersection(other) | Calculate the point set intersection between each geometry and other.
GeoSeries.make_valid() | Create a valid representation of each geometry without losing any of the input vertices.
GeoSeries.simplify_preserve_topology | Simplify each geometry with the Douglas-Peucker algorithm while preserving its topology.
GeoSeries.set_crs(crs) | Set the coordinate system for the GeoSeries.
GeoSeries.to_crs(crs) | Transform each geometry to a different coordinate reference system.
GeoSeries.curve_to_line() | Convert curves in each geometry to approximate linear representation, e.g., CIRCULAR STRING to regular LINESTRING, CURVEPOLYGON to POLYGON, and MULTISURFACE to MULTIPOLYGON.
GeoSeries.geom_equals(other) | Check whether each geometry is "spatially equal" to other.
GeoSeries.touches(other) | Check whether each geometry "touches" other.
GeoSeries.overlaps(other) | Check whether each geometry "spatially overlaps" other.
GeoSeries.crosses(other) | Check whether each geometry and other (elementwise) "spatially cross".
GeoSeries.contains(other) | Check whether each geometry contains other (elementwise).
GeoSeries.intersects(other) | Check whether each geometry intersects other (elementwise).
GeoSeries.within(other) | Check whether each geometry is within other (elementwise).
GeoSeries.distance_sphere(other) | Return minimum distance in meters between two lon/lat points.
GeoSeries.distance(other) | Calculates the minimum 2D Cartesian (planar) distance between each geometry and other.
GeoSeries.hausdorff_distance(other) | Returns the Hausdorff distance between each geometry and other.
GeoSeries.union_aggr | Combine all the geometries into one.
GeoSeries.envelope_aggr() | Compute the double-precision minimum bounding box geometry for the union of all geometries.
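
The distance_sphere entry above describes a distance in meters between two lon/lat points. One common way to compute such a distance is the haversine formula, sketched below on plain (lon, lat) tuples. This is an illustration only: the mean Earth radius used here is an assumption, and the formula and constant that Arctern actually uses may differ.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius in meters


def haversine_distance(p1, p2):
    """Great-circle distance in meters between two (lon, lat) points in degrees."""
    lon1, lat1 = map(math.radians, p1)
    lon2, lat2 = map(math.radians, p2)
    dlon, dlat = lon2 - lon1, lat2 - lat1
    h = (
        math.sin(dlat / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


# One degree of latitude along a meridian, roughly 111 km.
one_degree = haversine_distance((0.0, 0.0), (0.0, 1.0))
```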

Pre-processing functions

arctern.nearest_location_on_road(roads, points) | Compute the location on roads closest to each point in points. The points passed do not need to be part of a continuous path.
arctern.nearest_road(roads, points) | Compute the closest road for each point in points. The points passed do not need to be part of a continuous path.
arctern.near_road(roads, points[, distance]) | Check if there is a road within distance meters of each point.
arctern.within_which(left, right) | For each geometry in left, search right for geometries that satisfy the "within" relationship.
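
To illustrate the idea behind finding the nearest location on a road, the sketch below projects a point onto each segment of a polyline and keeps the closest projection. It assumes planar coordinates and a single road given as a list of vertices. The helper names are hypothetical, and this is not Arctern's implementation.

```python
def nearest_point_on_segment(p, a, b):
    """Return the point on segment a-b that is closest to p."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return (ax, ay)  # degenerate segment: both ends coincide
    # Projection parameter, clamped so the result stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def nearest_location_on_polyline(road, point):
    """Return the location on the polyline closest to the point."""
    best, best_dist_sq = None, float("inf")
    for a, b in zip(road, road[1:]):
        q = nearest_point_on_segment(point, a, b)
        dist_sq = (q[0] - point[0]) ** 2 + (q[1] - point[1]) ** 2
        if dist_sq < best_dist_sq:
            best, best_dist_sq = q, dist_sq
    return best


road = [(0, 0), (10, 0), (10, 10)]
snapped = nearest_location_on_polyline(road, (4, 3))
```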

arctern-0.2.0-preview

26 May 02:59
69f7c35
arctern-0.2.0-preview Pre-release
add verbose argument to arctern.version (#639)

* add verbose argument to arctern.version

* [skip ci] add doc string for arctern.version

* add verbose argument to arctern_pyspark.version

arctern-0.1.2

20 May 09:17
3acfaa6

Enhance documentation

arctern-0.1.2-preview

19 May 02:04
7561969
arctern-0.1.2-preview Pre-release

See Arctern 0.1.2 Release Notes.

arctern-0.1.1

15 May 07:34
cf9ce98

Arctern 0.1.1 Release Notes

This release introduces a new rendering function, the fishnet map, with both CPU-based and GPU-based implementations. In addition, std::vector<ArrowArray> is adopted as the input type of the C++ rendering and analytic libraries to support slicing of large inputs.

New feature

Rendering functions
  • fishnetmap: Draws a color-weighted fishnet map.

This function returns an image in PNG format. Currently, only square fishnet cells are supported.
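
As a rough sketch of the aggregation step behind a fishnet map, the snippet below assigns weighted points to square cells and accumulates one value per cell, which a renderer could then map to a color. Summing the weights and the fishnet_bins name are assumptions made for this illustration; the aggregation options of the actual rendering function are not shown here.

```python
import math
from collections import defaultdict


def fishnet_bins(points, weights, cell_size):
    """Sum the weights of the points that fall into each square cell."""
    sums = defaultdict(float)
    for (x, y), weight in zip(points, weights):
        # Cell index: column and row of the square that contains the point.
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        sums[cell] += weight
    return dict(sums)


points = [(0.5, 0.5), (1.5, 0.2), (0.9, 0.1), (3.2, 2.7)]
weights = [1, 2, 3, 4]
cells = fishnet_bins(points, weights, cell_size=1.0)
```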

Fixed issues

arctern-0.1.1-preview

11 May 01:54
ecd116a
arctern-0.1.1-preview Pre-release

See Arctern 0.1.1 Release Notes.

arctern-0.1.0

26 Apr 04:46
ce66e70

Arctern 0.1.0 Release Notes

This release mainly focuses on developing the geospatial library and integrating the analytic engine. So far, Arctern has introduced 38 geospatial analytic functions and five geospatial rendering functions. It provides Python bindings and PySpark SQL UDF integration for these functions. A set of RESTful APIs is also provided for accessing Arctern's backend.

Arctern 0.1.0 offers GPU acceleration for time-consuming function calls, involving six geospatial analytic functions and five geospatial rendering functions.

The CPU-based and GPU-based implementations share the same interface despite their underlying implementation differences.

These release notes discuss important aspects of Arctern v0.1.0, including APIs, experimental features, and limitations.

Pandas & Spark APIs

In this release, most geospatial analytic functions are CPU-based and built on GDAL with a batch of improvements, while six functions (ST_Point, ST_Area, ST_Envelope, ST_Length, ST_Distance, ST_Within) adopt GPU-accelerated implementations to improve computational performance. Compared with its counterparts, Arctern 0.1.0 shows promising results in terms of computing power and speed.

In the upcoming releases, we plan to add more functions to expand Arctern's analytic capabilities and to optimize its performance.

Constructor Functions

  • ST_Point: Builds a Point based on the given horizontal and vertical coordinates.
  • ST_PolygonFromEnvelope: Constructs a rectangular Geometry based on the given parameters.
  • ST_GeomFromGeoJSON: Constructs a Geometry from a GeoJSON string.
  • ST_PointFromText: Converts the given Point from WKT format to WKB format. (Spark only)
  • ST_PolygonFromText: Converts the given Polygon from WKT format to WKB format. (Spark only)
  • ST_LineStringFromText: Converts the given LineString from WKT format to WKB format. (Spark only)
  • ST_GeomFromText: Converts the given Geometry from WKT format to WKB format.
  • ST_GeomFromWKT: Converts the given Geometry from WKT format to WKB format. (Spark only)
  • ST_AsText: Converts the given Geometry from WKB format to WKT format.
  • ST_AsGeoJSON: Converts the given Geometry from WKB format to GeoJSON format.
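
Several of the functions above convert between WKT and WKB. For a 2D Point, the WKB layout is small enough to write by hand: one byte-order flag, a 4-byte geometry type (1 for Point), and two 8-byte doubles. The sketch below handles only simple "POINT (x y)" text and is meant to show the two formats side by side. The helper names are hypothetical, and this is not how Arctern, which builds on GDAL, performs the conversion.

```python
import struct


def point_wkt_to_wkb(wkt):
    """Convert 'POINT (x y)' text to little-endian WKB bytes."""
    text = wkt.strip()
    if not text.upper().startswith("POINT"):
        raise ValueError("only POINT geometries are handled in this sketch")
    coords = text[text.index("(") + 1 : text.rindex(")")].split()
    x, y = float(coords[0]), float(coords[1])
    # "<" = little-endian, B = byte-order flag (1), I = geometry type (1 = Point).
    return struct.pack("<BIdd", 1, 1, x, y)


def point_wkb_to_wkt(wkb):
    """Convert WKB bytes for a 2D Point back to WKT text."""
    endian = "<" if wkb[0] == 1 else ">"
    geom_type, x, y = struct.unpack(endian + "Idd", wkb[1:21])
    if geom_type != 1:
        raise ValueError("only POINT geometries are handled in this sketch")
    return f"POINT ({x:g} {y:g})"


wkb = point_wkt_to_wkb("POINT (1 2)")
wkt = point_wkb_to_wkt(wkb)
```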

Accessor Functions

  • ST_IsValid: Checks if the given Geometry is valid.
  • ST_IsSimple: Checks if the given Geometry is simple, which means it has no anomalous points such as self-intersections or self-tangencies.
  • ST_GeometryType: Returns a string representing the type of each Geometry in the input.
  • ST_NPoints: Counts the number of vertices/end points in a given Geometry.
  • ST_Envelope: Calculates the smallest rectangle containing the given Geometry.

Processing Functions

  • ST_Buffer: Returns a Geometry that represents all points whose distance from the given Geometry is not greater than the given distance.
  • ST_PrecisionReduce: Reduces the coordinate precision of the given Geometry based on the given number of significant digits.
  • ST_Intersection: Calculates the intersection of the two given Geometries.
  • ST_MakeValid: Constructs the given Geometry as a valid Geometry without removing any vertices.
  • ST_SimplifyPreserveTopology: Uses polylines to approximate curves in the given Geometry through the Douglas-Peucker algorithm.
  • ST_Centroid: Calculates the centroid of the given Geometry.
  • ST_ConvexHull: Calculates the smallest convex Geometry that contains the given Geometry.
  • ST_Transform: Maps the coordinates of the given Geometry from the "src_rs" spatial reference system (SRID) to the "dst_rs" spatial reference system.
  • ST_CurveToLine: Converts curves in the given Geometry to approximate linear representations, such as converting CIRCULAR STRING to LINESTRING, CURVEPOLYGON to POLYGON, and MULTISURFACE to MULTIPOLYGON.
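
The ST_SimplifyPreserveTopology entry above mentions the Douglas-Peucker algorithm. A minimal sketch of the plain algorithm on a list of vertices follows. Note that this basic version does not add the topology-preserving checks that the function name implies, and the tolerance parameter name is an assumption made for the example.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def douglas_peucker(points, tolerance):
    """Drop vertices that lie within tolerance of the simplified line."""
    if len(points) < 3:
        return list(points)
    start, end = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        dist = point_segment_distance(points[i], start, end)
        if dist > max_dist:
            max_dist, index = dist, i
    if max_dist <= tolerance:
        return [start, end]
    # Keep the farthest vertex and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
simplified = douglas_peucker(line, tolerance=0.5)
```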

Measurement Functions

  • ST_DistanceSphere: Calculates the spherical distance between two given Points on the earth's surface based on their latitude and longitude coordinates.
  • ST_Distance: Calculates the distance between the two given Geometries.
  • ST_Area: Calculates the area of the given Geometry.
  • ST_Length: Calculates the length of the given linear Geometry.
  • ST_HausdorffDistance: Returns the Hausdorff distance between the two given Geometries. The Hausdorff distance is used to measure the similarity between two Geometries.
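
To show what the Hausdorff distance measures, the sketch below computes a discrete version over the vertices of two geometries: for each vertex, find the nearest vertex of the other geometry, then take the largest of those nearest distances in either direction. Restricting the computation to vertices is a simplification assumed for this example; it is not necessarily how ST_HausdorffDistance evaluates the distance.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a vertex of a to its nearest vertex of b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff_distance(a, b):
    """Symmetric discrete Hausdorff distance between two vertex lists."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


line_a = [(0, 0), (1, 0)]
line_b = [(0, 1), (1, 1), (3, 1)]
h = hausdorff_distance(line_a, line_b)
```

A small value means that every part of each geometry lies close to the other one, which is why the measure is used as a similarity indicator.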

Relationship Functions

  • ST_Equals: Checks if the two given Geometries are equivalent, which means they represent the same Geometry.
  • ST_Touches: Checks if the two given Geometries are adjacent, which means they have common points on the boundary.
  • ST_Overlaps: Checks if the two given Geometries overlap each other, which means they intersect and neither of them completely contains the other.
  • ST_Crosses: Checks if the two given Geometries cross each other, which means they share some but not all of the internal points. The intersection of these two Geometries cannot be empty, and the dimension of the intersection is smaller than the largest dimension of the input Geometry.
  • ST_Contains: Checks if the Geometry geo1 contains the Geometry geo2, which means geo2 has no point outside geo1 and at least one point inside geo1.
  • ST_Intersects: Checks if the two given Geometries intersect, which means they share at least one common point.
  • ST_Within: Checks if the Geometry geo1 is inside the Geometry geo2, which means geo1 has no point outside geo2 and at least one point inside geo2.
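
ST_Contains and ST_Within describe the same relationship from opposite sides: geo1 contains geo2 exactly when geo2 is within geo1. The sketch below illustrates this for the simplest case, a point and a polygon ring, using the ray-casting test. The helper names are hypothetical, points exactly on the boundary are not treated specially, and this is not Arctern's implementation.

```python
def point_within_polygon(point, ring):
    """Ray-casting test: True if the point is strictly inside the ring."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_contains_point(ring, point):
    """Contains is the within test with the arguments swapped."""
    return point_within_polygon(point, ring)


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
inside = point_within_polygon((2, 2), square)
outside = point_within_polygon((5, 2), square)
```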

Aggregation Functions

  • ST_Union_Aggr: Returns a Geometry representing the given union set of Geometries.
  • ST_Envelope_Aggr: Calculates the minimum rectangle that contains the given set of Geometries.
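
The envelope aggregation above can be pictured as taking the bounding rectangle of each Geometry and then the bounding rectangle of all of those rectangles. A minimal sketch on plain vertex lists follows; the (min_x, min_y, max_x, max_y) tuple layout and the function names are assumptions made for this example.

```python
def envelope(coords):
    """Bounding rectangle of one geometry given as a list of (x, y) vertices."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def envelope_aggr(geometries):
    """Bounding rectangle that contains every geometry in the collection."""
    boxes = [envelope(coords) for coords in geometries]
    return (
        min(box[0] for box in boxes),
        min(box[1] for box in boxes),
        max(box[2] for box in boxes),
        max(box[3] for box in boxes),
    )


geometries = [[(0, 0), (2, 1)], [(-1, 3), (1, 4)]]
overall = envelope_aggr(geometries)
```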

Rendering Functions

This release supports the following five rendering functions, all of which have both CPU-based and GPU-based implementations:

  • pointmap: Draws a point map for WKB-formatted Points.
  • weighted_pointmap: Draws a weighted point map for WKB-formatted Points.
  • heatmap: Draws a heat map for WKB-formatted Points.
  • choroplethmap: Draws a choropleth map for WKB-formatted Points that form the contours of Polygons.
  • icon_viz: Draws an icon map for WKB-formatted Points.

Each function returns an image in PNG format. You can overlap these images to create stacked multi-layer effects.

PySpark SQL Integration

This release provides integration with PySpark SQL. All the 38 geospatial analytic functions mentioned above can be called as a SQL UDF (or nested UDFs). For more details, see Arctern 0.1.0 PySpark APIs.

Limitations

Due to the limitations of PySpark's UDF framework, the ST_Union_Aggr and ST_Envelope_Aggr functions may show poor performance when dealing with large data sets. This will be solved with the arrival of the DataFrame/Series interface in 0.2.0.

RESTful APIs

This release only supports setting PySpark as Arctern's RESTful backend. The currently supported RESTful APIs are as follows. For more details, see Arctern 0.1.0 RESTful APIs.

  • POST /scope
  • DELETE /scope/
  • POST /loadfile
  • POST /savefile
  • GET /table/schema?scope=scope1&session=spark&table=table1
  • POST /query
  • POST /pointmap
  • POST /weighted_pointmap
  • POST /heatmap
  • POST /choroplethmap
  • POST /command
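
The GET /table/schema entry above already shows its query parameters. As a minimal sketch, the request could be assembled with Python's standard library as follows. The host and port are hypothetical, the function name is made up for this example, and the snippet only builds the request object without sending anything.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # hypothetical host and port


def build_schema_request(scope, session, table):
    """Build (but do not send) a GET request for a table's schema."""
    query = urlencode({"scope": scope, "session": session, "table": table})
    return Request(f"{BASE_URL}/table/schema?{query}", method="GET")


request = build_schema_request("scope1", "spark", "table1")
```

Passing the request to urllib.request.urlopen would send it to a running server.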