[Gsoc-orga] GSoC proposal: OSM Notes Profile
Andres Gomez Casanova
angoca at yahoo.com
Wed Mar 29 01:16:07 UTC 2023
Full name: Andres Gomez Casanova
OSM username: AngocA
Current occupation: Database administrator (DBA) at Scotiabank.
Studies: I am a systems engineer in Colombia.
GitHub profile: https://github.com/angoca
Blog: https://angocadb2.blogspot.com/
OSM involvement: I am a mapper in Colombia. Also, I am leading a team of OSM note solvers in LatAm.
Programming skills: I developed business applications in Java several years ago. I no longer do this professionally, but I still know the basics and occasionally write Java code for testing purposes. In my current job, I write Bash scripts to automate tasks related to the databases I administer. I have taken several courses on Bash scripting, which has helped me write robust scripts. In general terms, I have good programming skills.
Technical skills: I know SQL and databases, specifically Db2. I have also contributed to some open-source projects with small improvements and basic fixes, mostly on GitHub.
I have not applied for other projects in this GSoC.
Proposal
My proposal builds on a project I started some months ago: a website that shows user and country profiles based on open and closed OSM notes.
State of the art
I have seen a few tools for OSM notes; they offer simple solutions to help resolve notes in a particular country. The most common tool used for this purpose is ResultMaps by Pascal Neis; however, it only shows the list of notes per country and a ranking among them.
Some new note-related tools with different approaches have appeared in the last two years (OSM notes heatmap, Anton’s OSM note viewer). This reflects the importance of notes in the OpenStreetMap ecosystem: the community is convinced of their power to keep the map up to date with feedback from map users.
On the other hand, there are some OSM user profiles based on mapping work, but only one of them covers notes. There are also some country profiles.
· MapRoulette.
· Missing Maps.
· ResultMaps – How did you contribute (HDYC). The only one that includes notes for a user.
· ResultMaps – notes.
· HOT Tasking Manager.
· OSM-monitor.
Profiles like these encourage users to keep collaborating on the map and to discover new ways of contributing.
Origin
The project originated from the idea of recognizing the most productive mappers who resolve notes in LatAm: https://wiki.openstreetmap.org/wiki/ES:LatAm/Proyectos/Resoluci%C3%B3n_de_notas/Preparaci%C3%B3n_premios. However, that idea required complex queries over a basic database structure, and the data could not be reused for other purposes or updated with the most recent OSM note changes.
Structure
The project I am developing is divided into three parts:
1. Retrieve the information from OSM. This part downloads the most recent dump from the Planet to get the history of OSM notes. Once the database is populated with all historical notes, a near real-time mechanism retrieves note data via the OSM API. The result is a tool that continuously gathers every change to OSM notes worldwide, and an SQL database populated with those changes.
2. Transform the data. Once the data is in the database in a transactional data model, it should be transformed into another structure better suited to historical analysis. This puts the data in a better format for analytical queries, reducing the number of joins and improving concurrency without affecting the near real-time data ingestion. This part is currently being designed.
3. Present the data. Once the data is in the analytical format, it will be presented on a website with various kinds of reports that show the data in different ways. This will allow a community of OSM mappers in a given country to focus their note-resolution efforts by different criteria (location, age, text, author, contributor). It will also allow audits of the most recent operations on notes, to make sure there is no vandalism and that OSM notes are processed correctly.
As mentioned, the first part is already done and is working. It gathers the data from the API continuously. I have done several tests for this part of the project, and it is working as expected with reliable performance.
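As an illustration of how incoming note actions could be applied to a transactional table, here is a minimal sketch. It is not the project's actual code: the `notes` table, its columns, the `make_db` and `apply_note_action` helpers, and the use of SQLite as a stand-in database are all assumptions made for the example.

```python
import sqlite3

# Status a note ends up in after each action; a comment leaves it unchanged.
STATUS_AFTER = {"opened": "open", "reopened": "open", "closed": "closed"}


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical notes table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notes ("
        "note_id INTEGER PRIMARY KEY, status TEXT, updated_at TEXT)"
    )
    return conn


def apply_note_action(conn, note_id: int, action: str, happened_at: str) -> None:
    """Insert the note if it is new, otherwise update its status and timestamp."""
    if action != "commented" and action not in STATUS_AFTER:
        raise ValueError(f"unknown action: {action}")
    new_status = STATUS_AFTER.get(action)  # None for a comment
    row = conn.execute(
        "SELECT status FROM notes WHERE note_id = ?", (note_id,)
    ).fetchone()
    if row is None:
        # First time this note is seen: assume it is open unless told otherwise.
        conn.execute(
            "INSERT INTO notes (note_id, status, updated_at) VALUES (?, ?, ?)",
            (note_id, new_status or "open", happened_at),
        )
    else:
        conn.execute(
            "UPDATE notes SET status = ?, updated_at = ? WHERE note_id = ?",
            (new_status or row[0], happened_at, note_id),
        )


def note_status(conn, note_id: int) -> str:
    """Return the current status of a note."""
    return conn.execute(
        "SELECT status FROM notes WHERE note_id = ?", (note_id,)
    ).fetchone()[0]
```

Applying the actions in chronological order keeps the table consistent with the latest state of each note, which is the property the near real-time mechanism relies on.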
The second and third parts still need to be developed. The data conversion could be challenging because it must turn the data into something easily queryable, so that producing a report (a profile) does not require complex queries. Finally, the website will require data visualization techniques, a field in which I am still learning.
In the early stages of developing the second part, I found that the Note Solvers community needed to see the location of notes via a WMS. To meet this need, I added an extra step to the second part of the data processing that copies the data into another database schema. Thanks to this light schema, queries from the geoservice are processed quickly. Currently, GeoServer shows the location of open and closed notes, with a distinct color depending on the age of each note. This is useful for resolving OSM notes from applications like Vespucci, which cannot download notes for a vast area but can display a WMS layer.
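To make the idea of the data conversion more concrete, the following sketch pre-aggregates a transactional table of note events into a per-user summary that a profile report could read without joins. This part of the project is still being designed, so everything here is an assumption: the `note_events` and `user_profile` tables, their columns, the sample users, and SQLite as a stand-in database.

```python
import sqlite3


def build_user_profiles(conn: sqlite3.Connection) -> None:
    """Rebuild the hypothetical user_profile table from note_events."""
    conn.executescript(
        """
        DROP TABLE IF EXISTS user_profile;
        CREATE TABLE user_profile AS
        SELECT username,
               SUM(action = 'opened')    AS notes_opened,
               SUM(action = 'commented') AS notes_commented,
               SUM(action = 'closed')    AS notes_closed
        FROM note_events
        GROUP BY username;
        """
    )


def demo() -> list:
    """Load a few made-up events and return the aggregated profiles."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE note_events ("
        "note_id INTEGER, action TEXT, username TEXT, created_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO note_events VALUES (?, ?, ?, ?)",
        [
            (1, "opened", "alice", "2023-01-01"),
            (1, "closed", "bob", "2023-01-05"),
            (2, "opened", "alice", "2023-02-01"),
            (2, "commented", "bob", "2023-02-02"),
            (2, "closed", "bob", "2023-02-03"),
        ],
    )
    build_user_profiles(conn)
    return conn.execute(
        "SELECT username, notes_opened, notes_commented, notes_closed "
        "FROM user_profile ORDER BY username"
    ).fetchall()
```

A report then reads one row per user instead of scanning and grouping the event history on every request. The trade-off is that the summary must be refreshed as new events arrive.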
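The age-based coloring can be thought of as a simple bucketing rule. The sketch below shows one possible rule. The thresholds, the color names, and the `note_age_color` function are hypothetical, since the actual GeoServer styling is not described here.

```python
from datetime import date

# Hypothetical buckets: (maximum age in days, color). Not the real styling.
AGE_BUCKETS = [(30, "green"), (365, "orange")]
OLDEST_COLOR = "red"


def note_age_color(created: date, today: date) -> str:
    """Return the color of the first bucket whose age limit covers the note."""
    age_days = (today - created).days
    for max_days, color in AGE_BUCKETS:
        if age_days <= max_days:
            return color
    return OLDEST_COLOR
```

With these assumed buckets, a note created within the last month would be drawn in green, one up to a year old in orange, and anything older in red.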
Milestones
These are the milestones of the project:
· DONE – Retrieve the OSM planet notes and insert them into the database.
· DONE – Retrieve the areas of the countries from OSM and create a mechanism to find the country a note belongs to.
· DONE – Gather the recent changes in notes (opened, commented, closed, reopened) from the API and update the database accordingly.
· Convert the transactional data into an analytical data structure for the historical notes.
· Process the recent note modifications into the analytical data structure to keep it updated.
· Choose a data visualization tool and design a simple website for the reports.
· Design the user’s profile report. I have already thought of several values to include.
· Design the country’s profile report.
· Generate badges for outstanding mappers.
· Deploy the whole solution on persistent infrastructure using an infrastructure-as-code tool.
· DONE – Populate another database schema to be consumed from a geoservice for WMS visualization.
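The milestone about finding the country a note belongs to is essentially a point-in-polygon test against country boundaries. The sketch below uses the classic ray-casting algorithm on toy rectangular "countries". It is only an illustration: the project may well rely on the database's spatial functions instead, and the names `point_in_polygon` and `find_country` and the sample polygons are assumptions.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count how many edges a ray going right from the point crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_country(point: Point, countries: Dict[str, List[Point]]) -> Optional[str]:
    """Return the first country whose boundary contains the point, if any."""
    for name, polygon in countries.items():
        if point_in_polygon(point, polygon):
            return name
    return None


# Toy boundaries for illustration only; real boundaries come from OSM.
TOY_COUNTRIES = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
```

Real country boundaries are multipolygons with many vertices, so a production lookup would normally add a bounding-box pre-filter or a spatial index before running the exact test.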
The most complex part is the data transformation into the analytical data structure, because it could affect how the data is currently stored in the transactional model: some data could be missing, or some table structures could need to change. Also, the report queries must be fast to keep users engaged. Developing the reports could also prompt changes in the data transformation to include more data; however, the reports can start with simple values and be extended in later iterations.
Project hosting
The project is published at https://github.com/OSMLatam/OSM-Notes-profile, where I have included all parts and the required documentation.
Project Incentive
My incentive is to create a tool for note analysis reports, which the OSM community needs, especially the note solvers, a group I am involved in. At the same time, I want to learn about data transformation and visualization tools by building a tool that is extensible by other contributors and built on open-source tools.
Summer work
I will continue working at the bank this summer, but I can work on the project for at least 2 hours per working day (10 hours) and 8 hours on weekends, for a total of 18 hours per week. I can do this during July and August.
Andrés Gómez Casanova