Proceedings of the 22nd ACM/IEEE Joint Conference on Digital Libraries
- Anthology ID: G22-191
- Year: 2022
- Venue: GWF
- Publisher: ACM
- URL: https://gwf-uwaterloo.github.io/gwf-publications/G22-191
Integration of text and geospatial search for hydrographic datasets using the Lucene search library
Matthew Y. R. Yang | Siwen Yang | Jimmy Lin
We present a hybrid text and geospatial search application for hydrographic datasets built on the open-source Lucene search library. Our goal is to demonstrate that it is possible to build custom GIS applications by integrating existing open-source components and data sources, which contrasts with existing approaches based on monolithic platforms such as ArcGIS and QGIS. Lucene provides rich index structures and search capabilities for free text and geometries; the former has already been integrated and exposed via our group's Anserini and Pyserini IR toolkits. In this work, we extend these toolkits to include geospatial capabilities. Combining knowledge extracted from Wikidata with the HydroSHEDS dataset, our application enables text and geospatial search of rivers worldwide.
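To make the idea of hybrid text and geospatial search concrete, here is a minimal sketch in plain Python. It does not use Lucene, Anserini, or Pyserini, and it is not the authors' implementation. It assumes each river is reduced to a name, a short description, and a bounding box, scores candidates by simple term overlap, and keeps only those whose bounding box intersects the query region. The `River` class, the `hybrid_search` function, and the three records are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Bounding box as (min_lon, min_lat, max_lon, max_lat), in degrees.
BBox = Tuple[float, float, float, float]


@dataclass(frozen=True)
class River:
    """Hypothetical record: text fields plus a bounding box."""
    name: str
    description: str
    bbox: BBox


def text_score(query: str, river: River) -> int:
    """Count the query terms that occur in the river's name or description."""
    tokens = set((river.name + " " + river.description).lower().split())
    return sum(1 for term in query.lower().split() if term in tokens)


def intersects(a: BBox, b: BBox) -> bool:
    """Return True when two axis-aligned bounding boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def hybrid_search(rivers: List[River], query: str, region: BBox) -> List[str]:
    """Filter by region, score by term overlap, and return names best-first."""
    hits = [(text_score(query, r), r) for r in rivers if intersects(r.bbox, region)]
    hits = [(score, r) for score, r in hits if score > 0]
    # Higher score first; ties are broken alphabetically by name.
    hits.sort(key=lambda hit: (-hit[0], hit[1].name))
    return [r.name for _, r in hits]


# Made-up records for illustration only; they do not come from HydroSHEDS or Wikidata.
RIVERS = [
    River("Alpha River", "long river flowing north through boreal forest",
          (-80.0, 43.0, -79.0, 44.0)),
    River("Beta River", "short river in an alpine valley",
          (7.0, 46.0, 8.0, 47.0)),
    River("Gamma Creek", "small creek feeding a boreal lake",
          (-79.5, 43.5, -79.2, 43.8)),
]

if __name__ == "__main__":
    print(hybrid_search(RIVERS, "boreal river", (-81.0, 42.0, -78.0, 45.0)))
```

A production system would replace the term-overlap score with a ranking function over an inverted index and the bounding-box test with a spatial index over full geometries, both of which Lucene provides.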