Use case 3

Alessandro Demo edited this page Jun 23, 2025 · 2 revisions

3️⃣ Complex Spatial Join (Proximity)

  • Goal: To test performance on computationally intensive join operations on large datasets.

  • Example task: Given the road network and buildings in Rome, count for each main road (highway = 'primary') how many buildings are within 20 meters.

  • Workflows to compare:

  1. DuckDB: The query with ST_DWithin, which earlier analysis already showed to be slow. The write-up should also discuss the optimization strategies found so far (e.g. batching).

  2. PostGIS: The same query; the planner uses a spatial index transparently, provided the index has been created first.

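  3. GeoPandas: gpd.sjoin_nearest(roads, buildings, max_distance=20). Note that sjoin_nearest returns only the nearest building(s) for each road, so counting every building within 20 meters likely needs a different join (for example, buffering the roads by 20 meters and using gpd.sjoin). max_distance is expressed in CRS units, so both layers must be in a projected CRS measured in meters.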
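
As a rough illustration of the SQL workflows, the sketch below builds the proximity query as a string and splits it into batches by road id, which is one possible reading of the "batching" strategy mentioned above. The table and column names (roads, buildings, id, highway, geom) and the helper batched_queries are hypothetical, and the geometries are assumed to be stored in a projected CRS in meters. Nothing here is executed against a database.

```python
# Hypothetical schema: roads(id, highway, geom), buildings(geom).
# Assumption: geometries use a projected CRS in meters, so the distance
# argument of ST_DWithin really means 20 m.
PROXIMITY_SQL = """
SELECT r.id, COUNT(b.geom) AS n_buildings
FROM roads AS r
LEFT JOIN buildings AS b
  ON ST_DWithin(r.geom, b.geom, {distance})
WHERE r.highway = 'primary'
  AND r.id BETWEEN {lo} AND {hi}
GROUP BY r.id
ORDER BY r.id
"""


def batched_queries(min_id, max_id, batch_size, distance=20):
    """Yield one SQL string per contiguous range of road ids."""
    for lo in range(min_id, max_id + 1, batch_size):
        hi = min(lo + batch_size - 1, max_id)
        yield PROXIMITY_SQL.format(distance=distance, lo=lo, hi=hi)


# Example: road ids 1..250 split into batches of 100 ids each.
queries = list(batched_queries(1, 250, 100))
```

The LEFT JOIN with COUNT(b.geom) keeps primary roads that have no nearby buildings, reporting them with a count of zero. Whether batching by id range actually helps would need to be measured for each engine.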

  • Metrics: Execution time, correctness of results, complexity of the code.
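
The execution-time and correctness metrics could be collected with a small harness like the one below. It is a minimal sketch, not part of any of the three workflows: count_within is a hypothetical brute-force reference, intended only for small samples, that treats each building as a single point (for example a centroid) and each road as a polyline with at least two vertices, all in projected coordinates in meters. Its counts can therefore differ slightly from a join on full building polygons.

```python
import math
import time


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment's line and clamp to the segment ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def count_within(roads, buildings, max_distance=20.0):
    """Brute-force count of building points within max_distance of each road."""
    counts = {}
    for road_id, line in roads.items():
        segments = list(zip(line, line[1:]))
        counts[road_id] = sum(
            1
            for p in buildings
            if min(point_segment_distance(p, a, b) for a, b in segments)
            <= max_distance
        )
    return counts


def best_time(fn, repeats=3):
    """Run fn several times; return the best wall-clock time and last result."""
    best = math.inf
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn()
        best = min(best, time.perf_counter() - start)
    return best, result


# Tiny made-up sample: two roads and four building points.
roads = {
    "r1": [(0.0, 0.0), (100.0, 0.0)],
    "r2": [(0.0, 500.0), (100.0, 500.0)],
}
buildings = [(50.0, 10.0), (50.0, 25.0), (120.0, 0.0), (110.0, 0.0)]

elapsed, counts = best_time(lambda: count_within(roads, buildings))
print(counts)  # {'r1': 3, 'r2': 0}
```

The same best_time wrapper can be put around each workflow's call, and the per-road counts compared against the reference on a small subset before timing the full Rome dataset.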