JupyterLab

Purpose: data exploration and engineering, analytics, and model development

Primary language: Python (PySpark)

Official documentation: https://jupyter.org

https://spark.apache.org

Points of note: 

  1. Jupyter notebooks are created with PySpark kernels for ease of use.

  2. Interaction with the Space data access layer is via PySpark (including SparkSQL).

  3. Core Python packages are pre-provisioned with the environment.

  4. The home area is shared across the collaboration; example and guidance notebooks may be provisioned within the environment.

  5. The Python and PySpark versions provisioned will evolve over time.

  6. Spark DataFrames can be created by running SparkSQL queries against the data access layer.

  7. DataFrames containing derived outputs can be written back to the Collaboration and Publish connections.

  8. Spark DataFrames can also be drawn down into pandas DataFrames for native Python interactions.
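
Points 5 to 8 can be illustrated with a minimal sketch. It assumes the PySpark kernel exposes a SparkSession (commonly named spark) and that the Collaboration and Publish connections are reachable as writable table names; neither detail is confirmed here. The helper names and any table names passed to them are hypothetical and for illustration only.

```python
from __future__ import annotations

import platform
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # type hints only; the kernel supplies PySpark and pandas at runtime
    import pandas as pd
    from pyspark.sql import DataFrame, SparkSession


def runtime_versions(spark: SparkSession) -> dict[str, str]:
    """Report the provisioned Python and Spark versions (point 5)."""
    return {"python": platform.python_version(), "spark": spark.version}


def load_rows(spark: SparkSession, table: str, limit: int = 1000) -> DataFrame:
    """Create a Spark DataFrame from a SparkSQL query (point 6)."""
    # `table` is a hypothetical name; use one exposed by the data access layer.
    return spark.sql(f"SELECT * FROM {table} LIMIT {int(limit)}")


def write_back(df: DataFrame, target_table: str) -> None:
    """Write a derived DataFrame back to a target table (point 7)."""
    # Assumption: the target connection is addressable as a table name.
    # mode("overwrite") replaces any existing contents of that table.
    df.write.mode("overwrite").saveAsTable(target_table)


def to_pandas(df: DataFrame, max_rows: int = 10_000) -> pd.DataFrame:
    """Draw a bounded number of rows down into pandas (point 8)."""
    # toPandas() collects all rows onto the driver, so cap the row count first.
    return df.limit(max_rows).toPandas()
```

In a notebook these might be called as load_rows(spark, "some_db.some_table"), followed by to_pandas(...) on the result, where the table name is a placeholder. Capping rows before converting keeps the pandas copy within driver memory, since pandas holds the whole frame on a single node.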