Geometry
========

The geometry package provides a foundation for planning methods by implementing
several commonly used geometric objects, e.g. locations, polygons, and routes.
Each of them comes in a polar-coordinate (i.e. latitude & longitude) and a
Cartesian-coordinate (i.e. local x- and y-axes on a tangent plane) variant.
The Cartesian ones are based on `Shapely <https://shapely.readthedocs.io/en/stable/project.html>`_
and the polar ones mimic their interface and functionality.
All coordinates are referenced to the widely used
`World Geodetic System (WGS84) <https://de.wikipedia.org/wiki/World_Geodetic_System_1984>`__.
In this inheritance diagram, :class:`~shapely.BaseGeometry` as well as classes inheriting directly from it
are provided by *Shapely*.
It shows that all geometric objects of *Pyrate* inherit from :class:`~pyrate.plan.geometry.geospatial.Geospatial`:

.. inheritance-diagram::
   pyrate.plan.geometry.location.CartesianLocation
   pyrate.plan.geometry.location.PolarLocation
   pyrate.plan.geometry.polygon.CartesianPolygon
   pyrate.plan.geometry.polygon.PolarPolygon
   pyrate.plan.geometry.route.CartesianRoute
   pyrate.plan.geometry.route.PolarRoute
   :parts: 1
   :top-classes: pyrate.plan.geometry.geospatial.Geospatial
See :ref:`geometry-plotting` on how to easily plot geometries like points, polygons, and routes.
See :ref:`design-decisions-local-projections` on how the implementation of the projections
between local and global coordinate systems has developed.

.. toctree::
   :maxdepth: 2
   :caption: Modules:

   geospatial
   location
   polygon
   route
   helpers
.. _geometry-plotting:

Geometry Plotting
-----------------
There are many ways to visualize geometries with Python. For simplicity, we chose not to provide
direct visualization methods, but to support `GeoJSON <https://geojson.org>`_ instead. This format can be read
easily by many programs, including the website `geojson.io <https://geojson.io>`_. You can simply
paste it there or use the convenient command-line tool `geojsonio <https://github.com/mapbox/geojsonio-cli>`_.
However, when objects become very large, other tools like `QGIS Desktop <https://www.qgis.org>`_ may be more appropriate.
The code below gives an example of how the
*GeoJSON* representation can be obtained. After that, a few interesting references are given.
Also, see :meth:`~pyrate.plan.geometry.geospatial.Geospatial.to_geo_json`.
.. code-block:: python

   from geojson import dumps, Feature

   from pyrate.plan.geometry import PolarPolygon

   # create a geometry object
   some_geometry = PolarPolygon(...)

   # then simply dump it to standard out
   print(some_geometry.to_geo_json())

   # or, more generally
   print(dumps(Feature(geometry=some_geometry)))
.. code-block:: bash

   echo '{"type": "Point", "coordinates": [30, 10]}' | geojsonio
   geojsonio some_geometry.json
   # see https://github.com/mapbox/geojsonio-cli#examples for more examples
This works for

- :class:`~pyrate.plan.geometry.location.PolarLocation`,
- :class:`~pyrate.plan.geometry.location.CartesianLocation`,
- :class:`~pyrate.plan.geometry.polygon.PolarPolygon`,
- :class:`~pyrate.plan.geometry.polygon.CartesianPolygon`,
- :class:`~pyrate.plan.geometry.route.PolarRoute`,
- :class:`~pyrate.plan.geometry.route.CartesianRoute`,
- and any object that provides a ``__geo_interface__`` attribute/property.
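The last point means that any custom class can participate by exposing a GeoJSON-like mapping.
A minimal sketch, where the ``Waypoint`` class is hypothetical and not part of *Pyrate*:

```python
import json


class Waypoint:
    """A hypothetical class participating in the ``__geo_interface__`` protocol."""

    def __init__(self, longitude: float, latitude: float) -> None:
        self.longitude = longitude
        self.latitude = latitude

    @property
    def __geo_interface__(self) -> dict:
        # a GeoJSON-like mapping, as required by the protocol
        return {"type": "Point", "coordinates": (self.longitude, self.latitude)}


# tools like geojson.dumps(Feature(geometry=...)) read this property;
# here we serialize it directly with the standard library
print(json.dumps(Waypoint(8.65, 49.87).__geo_interface__))
```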

Further References
~~~~~~~~~~~~~~~~~~

- The original `Gitlab issue #54 <https://gitlab.sailingteam.hg.tu-darmstadt.de/informatik/pyrate/-/issues/54>`_ that collected initial ideas
- `Interaktive Visualisierung von Geodaten in Jupyter Notebooks (Lightning Talk, FOSSGIS 2017) <https://tib.flowcenter.de/mfc/medialink/3/de387967965b98c17bd5dd552ac86e899179084e8c1b5aa6d578f5ad72c5eea5ea/Interaktive_Visualisierung_von_Geodaten_in_Jupyter_Notebooks_Lightning_Talk_2.pdf>`_
- Examples in the *Folium* library: `Quickstart - GeoJSON/TopoJSON Overlays <https://python-visualization.github.io/folium/quickstart.html#GeoJSON/TopoJSON-Overlays>`_
.. _design-decisions-local-projections:

Design decisions on the local projections
-----------------------------------------

This section documents our arguments for and against `Universal Transverse Mercator (UTM) <https://en.wikipedia.org/wiki/Universal_Transverse_Mercator_coordinate_system>`_ versus
`local tangent plane coordinates <https://en.wikipedia.org/wiki/Local_tangent_plane_coordinates>`_ based on freely chosen reference points,
as a means of `horizontal position representation <https://en.wikipedia.org/wiki/Horizontal_position_representation>`_.
A third approach would be to provide both.
This discussion was copied and adapted from `issue #40 <https://gitlab.sailingteam.hg.tu-darmstadt.de/informatik/pyrate/-/issues/40>`_, where it was initially collected.
Overview of the arguments
~~~~~~~~~~~~~~~~~~~~~~~~~

First, the three approaches are presented together with the arguments for using them.
Pro UTM
.......

1. A worldwide standard for navigation.
2. Data is easy to import/export to other teams/projects (can be important e.g. for the WRSC competition). However, WGS84 coordinates will probably suffice.
3. UTM locations can be pre-computed, while arbitrary projections constantly change. Example from the database (pseudo-SQL): ``SELECT obstacle WHERE obstacle.zone IN {boat_zone, boat_zone + 1, boat_zone - 1, ...}``. Compare this to the *local* approach, where each ``PolarLocation`` must be transformed into local coordinates and its distance computed before deciding whether to keep or drop it.
4. UTM errors are guaranteed to stay within ±1 m per 1 km inside a single zone, see for reference e.g. `here <https://www.e-education.psu.edu/natureofgeoinfo/c2_p22.html>`_.
5. UTM makes tiling the map easy. This might help to choose which obstacles to include while planning. However, a single UTM zone is also quite large.
6. Slicing can be done once, offline.
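To illustrate the zone-based pre-computation idea, the regular UTM longitudinal zone can be derived from the longitude alone. This sketch deliberately ignores the exceptions around Norway and Svalbard, which are exactly the irregularities discussed further down:

```python
def utm_zone(longitude: float) -> int:
    """Regular UTM longitudinal zone (1..60) for a longitude in degrees.

    Note: this ignores the special zones around Norway and Svalbard.
    """
    assert -180.0 <= longitude <= 180.0
    # zones are 6 degrees wide, starting with zone 1 at 180 degrees west
    return min(int((longitude + 180.0) // 6) + 1, 60)


print(utm_zone(8.65))   # 32 (Darmstadt, Germany)
print(utm_zone(-74.0))  # 18 (New York, USA)
```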
Pro local
.........

1. Better precision around the boat position and obstacles close to the boat. If we also use the Transverse Mercator projection like UTM does, we might even get a better resolution. However, this might come at some increased computational cost, since it cannot easily be done offline/beforehand.
2. No tiling needed: select the obstacles that are within a range of the boat and clip the non-relevant parts (already implemented in the *spatialite* database with polar coordinates).
3. No special cases due to UTM zones not being entirely uniform.
4. Could, in theory, allow for different projections for different needs (preserve the visual shape, preserve the area, etc.), though it might be too complicated and not worth the effort.
5. Works exactly the same, no matter where on the globe something is.
Pro both and therefore neutral
..............................

1. Tested and documented packages exist both for UTM (`utm <https://pypi.org/project/utm/>`_) and for arbitrary local transformations (`pyproj <https://pypi.org/project/pyproj/>`_).
2. Slicing polygons is provided by *Shapely* (either ``island.intersection(Point(x, y).buffer(radius))`` or ``island.intersection(Polygon([(0, 0), (max_x, 0), (max_x, max_y), (0, max_y)]))``).
3. Both approaches would provide sufficiently precise approximations of the earth's surface for our needs.
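The slicing calls from item 2 can be tried out directly. A self-contained sketch with made-up local coordinates, assuming *Shapely* is installed:

```python
from shapely.geometry import Point, Polygon

# a square "island" in local cartesian coordinates (meters)
island = Polygon([(0, 0), (1000, 0), (1000, 1000), (0, 1000)])

# clip to a circular neighborhood of radius 300 m around the boat at (0, 0) ...
around_boat = island.intersection(Point(0, 0).buffer(300))

# ... or to a rectangular tile
tile = island.intersection(Polygon([(0, 0), (500, 0), (500, 500), (0, 500)]))

print(round(around_boat.area))  # roughly the area of a quarter circle of radius 300
print(tile.area)                # exactly the 500 x 500 tile
```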
About implementing both
.......................

1. Would have the best of both worlds.
2. How would this complicate the implementation? (Too much, and it would spark discussions and incompatibilities.)
Decision
~~~~~~~~

In the end, the main argument against UTM zones was the handling of the cases near zone borders, and that there are some irregularities in the UTM zones that might complicate things.
However, using local projections was feared to have a huge performance impact on embedded computers, so we benchmarked a basic implementation.
The results of the scenario benchmarked below confirmed that using local projections is feasible on our embedded computers.
Thus, the local transformation approach was selected.
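For intuition, the core of such a local tangent plane projection can be sketched with a simple equirectangular approximation around a freely chosen reference point. The function names here are made up for this example, and *Pyrate*'s actual implementation is more sophisticated:

```python
import math

EARTH_RADIUS = 6_371_000.0  # mean earth radius in meters (spherical approximation)


def to_local(lat: float, lon: float, ref_lat: float, ref_lon: float) -> tuple[float, float]:
    """Project WGS84 degrees onto a local tangent plane (x east, y north, in meters).

    Equirectangular approximation around the reference point; only a sketch
    of the idea, not Pyrate's actual projection.
    """
    x = math.radians(lon - ref_lon) * math.cos(math.radians(ref_lat)) * EARTH_RADIUS
    y = math.radians(lat - ref_lat) * EARTH_RADIUS
    return x, y


def to_polar(x: float, y: float, ref_lat: float, ref_lon: float) -> tuple[float, float]:
    """Inverse of ``to_local``: local tangent plane meters back to WGS84 degrees."""
    lat = ref_lat + math.degrees(y / EARTH_RADIUS)
    lon = ref_lon + math.degrees(x / (EARTH_RADIUS * math.cos(math.radians(ref_lat))))
    return lat, lon


# round-trip a point roughly 1.1 km north-northeast of the reference
x, y = to_local(49.88, 8.66, ref_lat=49.87, ref_lon=8.65)
print(to_polar(x, y, ref_lat=49.87, ref_lon=8.65))
```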
.. _benchmarking-db-and-local-projections:

Benchmarking results of the custom local transformation approach
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The performance was initially tested on a *Raspberry Pi 4B* with 2GB RAM and a *SanDisk Extreme 64GB (Class 3)*.
A *Raspberry Pi* was chosen as it will likely be the actual computer used in many challenges.
The OS was *Raspberry Pi OS (32-bit)* in version *May 2020*, in the variant "with desktop and recommended software".
The overall performance was concluded to be very acceptable.

The benchmarking was performed with the chart database from the `data repository <https://gitlab.sailingteam.hg.tu-darmstadt.de/informatik/data>`__
on `commit 0abe9269026de87b7265f664d10a0b9599314313 <https://gitlab.sailingteam.hg.tu-darmstadt.de/informatik/data/-/commit/0abe9269026de87b7265f664d10a0b9599314313>`__.
It contained the entirety of North America as available from the (US) NOAA.
The benchmark script (and *Pyrate* code) was from `commit 0ae4c33e361369321b10d677067deeb07ed27493 <https://gitlab.sailingteam.hg.tu-darmstadt.de/informatik/pyrate/-/commit/0ae4c33e361369321b10d677067deeb07ed27493>`__.
See :ref:`script-benchmark_db_and_projections` for details on what is actually tested.

The following tests were carried out on an Intel(R) Core(TM) i5-6300U with a SATA SSD and plenty of RAM.
Results with realistic parameters: radius 100km
...............................................

.. code-block:: bash

   user@ubuntu:~/sailing/pyrate $ python scripts/benchmark_db_and_projections.py ../data/charts/noaa_vector/all_combined_simplified_25m.sqlite --iterations 10 --radius 100

   Information on the setting:
       number of rows/polygons in database: 648828
       sum of vertices of all rows/polygons in database: 13727653
       extracted number of polygons: 6266
       extracted total number of vertices: 120179

   Executed "query_database" 10 times:
       average: 2.977373 seconds
       std dev: 0.042802 seconds
       variance: 0.001832 seconds

   Executed "project_to_cartesian_and_back" 10 times:
       average: 1.465923 seconds
       std dev: 0.033850 seconds
       variance: 0.001146 seconds
Results with stress testing parameters: radius 999km
....................................................

.. code-block:: bash

   user@ubuntu:~/sailing/pyrate $ python scripts/benchmark_db_and_projections.py ../data/charts/noaa_vector/all_combined_simplified_25m.sqlite --iterations 10 --radius 999

   Information on the setting:
       number of rows/polygons in database: 648828
       sum of vertices of all rows/polygons in database: 13727653
       extracted number of polygons: 90539
       extracted total number of vertices: 2131078

   Executed "query_database" 10 times:
       average: 34.120787 seconds
       std dev: 0.499919 seconds
       variance: 0.249919 seconds

   Executed "project_to_cartesian_and_back" 10 times:
       average: 23.383787 seconds
       std dev: 0.224816 seconds
       variance: 0.050542 seconds
Notes and conclusions
.....................

Comparing the results with radius 100km and radius 999km, we can see that ``_project_to_cartesian_and_back()`` scales linearly, as expected: 12 μs/vertex (100km) vs. 11 μs/vertex (999km).
The ``_query_database()`` benchmark behaves even better (sub-linear in the number of vertices): 24 μs/vertex (100km) vs. 16 μs/vertex (999km).
Also note that having a lot of polygons outside of the relevant area seems to be unproblematic.
Here, the spatial index really shines, as ``_query_database()`` took *a lot* longer before its introduction.
About 66% of the projection time is spent reassembling the polygon after its coordinates were converted, so that is probably something we can improve if we eventually need to.
Alternatively, one could reduce the fidelity of the features through stronger simplification, or reduce the query radius.
Memory does not seem to be a problem either, although no precise measurements were made.