Introduction

Function list

GeoSparkSQL supports the SQL/MM Part 3 Spatial SQL standard. It includes the four kinds of SQL operators listed below. All of these operators can be called directly through:

var myDataFrame = sparkSession.sql("YOUR_SQL")

  • Constructor: Construct a Geometry given an input string or coordinates
    • Example: ST_GeomFromWKT (string). Create a Geometry from a WKT String.
    • Documentation: Here
  • Function: Execute a function on the given column or columns
    • Example: ST_Distance (A, B). Given two Geometries A and B, return the Euclidean distance between them.
    • Documentation: Here
  • Aggregate function: Return a single aggregated value computed over the given column
    • Example: ST_Envelope_Aggr (Geometry column). Given a Geometry column, calculate the envelope boundary that encloses all geometries in this column.
    • Documentation: Here
  • Predicate: Evaluate a logical condition on the given columns and return true or false
    • Example: ST_Contains (A, B). Check whether A fully contains B. Return true if it does, false otherwise.
    • Documentation: Here

GeoSparkSQL supports the SparkSQL query optimizer. Its documentation is Here

Quick start

A detailed explanation is given in Write a SQL/DataFrame application.

  1. Add GeoSpark-core and GeoSparkSQL to your project's pom.xml or build.sbt
  2. Declare your Spark Session
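    import org.apache.spark.serializer.KryoSerializer
    import org.apache.spark.sql.SparkSession
    import org.datasyslab.geospark.serde.GeoSparkKryoRegistrator
    // Use Kryo with GeoSpark's registrator so geometries serialize efficiently
    var sparkSession = SparkSession.builder().
          config("spark.serializer",classOf[KryoSerializer].getName).
          config("spark.kryo.registrator", classOf[GeoSparkKryoRegistrator].getName).
          master("local[*]").appName("myGeoSparkSQLdemo").getOrCreate()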
    
  3. Add the following line after your SparkSession declaration:
    GeoSparkSQLRegistrator.registerAll(sparkSession)
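
Once the session is registered, each of the four operator kinds from the function list can be tried with a single SQL statement. The following is a minimal sketch rather than an official example. It assumes GeoSpark-core and GeoSparkSQL are on the classpath and that the package paths in the imports match your GeoSpark version. The object name OperatorKindsDemo, the view name points, and the sample geometries are made up for illustration.

```scala
import org.apache.spark.serializer.KryoSerializer
import org.apache.spark.sql.SparkSession
// Assumed package paths (as shipped in GeoSpark 1.x); adjust to your version.
import org.datasyslab.geospark.serde.GeoSparkKryoRegistrator
import org.datasyslab.geosparksql.utils.GeoSparkSQLRegistrator

// Hypothetical demo object, not part of GeoSpark.
object OperatorKindsDemo {
  // Returns (distance, contains, envelope as text) so callers can inspect the results.
  def run(): (Double, Boolean, String) = {
    val sparkSession = SparkSession.builder()
      .config("spark.serializer", classOf[KryoSerializer].getName)
      .config("spark.kryo.registrator", classOf[GeoSparkKryoRegistrator].getName)
      .master("local[*]").appName("operatorKindsDemo").getOrCreate()
    GeoSparkSQLRegistrator.registerAll(sparkSession)

    // Constructor: build a two-row Geometry column from WKT strings.
    sparkSession.sql(
      """SELECT ST_GeomFromWKT('POINT (0 0)') AS geom
        |UNION ALL
        |SELECT ST_GeomFromWKT('POINT (3 4)') AS geom""".stripMargin
    ).createOrReplaceTempView("points")

    // Function: Euclidean distance between two geometries.
    val distance = sparkSession.sql(
      "SELECT ST_Distance(ST_GeomFromWKT('POINT (0 0)'), ST_GeomFromWKT('POINT (3 4)'))"
    ).first().getDouble(0)

    // Aggregate function: envelope enclosing every geometry in the column.
    val envelope = sparkSession.sql("SELECT ST_Envelope_Aggr(geom) FROM points")
      .first().get(0).toString

    // Predicate: does the square fully contain the point?
    val contains = sparkSession.sql(
      """SELECT ST_Contains(
        |  ST_GeomFromWKT('POLYGON ((0 0, 0 10, 10 10, 10 0, 0 0))'),
        |  ST_GeomFromWKT('POINT (3 4)'))""".stripMargin
    ).first().getBoolean(0)

    sparkSession.stop()
    (distance, contains, envelope)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

The constructor calls are written inline as literals so that each statement can be read on its own. In a real application the geometries would more likely come from a column of WKT strings in an existing DataFrame.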