Hiking is the Best Hobby for Research
Rating Inference for Custom Trips from Enriched GPS Traces
Lasaris Seminar, November 23, 2023
Mouzhi Ge
Deggendorf Institute of Technology, Germany
mouzhi.ge@th-deg.de
Agenda
• Motivation
• Definition and Background
• Problem statement and scope
• Similarity-based trip rating inference framework
• ML-based trip rating inference framework
• Experimental settings and results
• Key take-aways
The real motivation (hobby-driven research)
Research motivation
• GPS-enabled devices allow us to pinpoint our location and generate a large
amount of data that traces our movements along trips.
• Custom trips are designed to cater to travelers’ specific desires and user
preferences for personalized tourism experience.
• Since the custom trip is usually new in the system, no rating can be shown
to the user. As a result, the rating inference of custom trips has emerged as
an important feature in tourism applications and location-based services.
• This paper aims to determine which representation works best as input to
the machine learning algorithms and achieves the highest accuracy for rating
inference.
• Apart from trip recommendations, rating inference in this paper can be
considered a second opinion for custom trips defined by users.
Custom trip
• Closeness to POIs
• Closeness to places where users take pictures
Enriched GPS traces along with the trip
• Trip location
• Trip elevation
Multi-criteria ratings for trips
• Multi-criteria ratings consider different factors simultaneously. For
example, one hiking route would have various attributes for ratings,
such as Condition, Difficulty, Technique, Quality of Experience, and
Landscape.
Condition
Difficulty
Technique
Quality of Experience
Landscape
Problem statement and scope
• The user designs a custom trip
• This trip contains enriched GPS traces
• There are different rating criteria for this trip
• We want to infer/predict
• The rating of each criterion for this custom trip
Our similarity-based solution
Proposed in 2019
Theodoros Chondrogiannis, Mouzhi Ge: Inferring ratings for custom trips from rich GPS traces, LocalRec at
27th ACM SIGSPATIAL, Chicago, Illinois, USA, November 5, 2019.
Hiking routes
Hiking routes with overlaps
Recap and intuition
• Users design their own routes
• Applications
‣ hiking trails
‣ running/training routes
• Problem: what is the rating of such a route?
• Idea: Consider the ratings of overlapping routes to
infer a rating for the new route
Trip rating inference similarity-based framework
Map Matching
• Map all rated trips to segments of the underlying spatial network
(preprocessing)
• Map the unrated trip to segments of the underlying spatial network
(query processing)
GPS trace ⟨(x1, y1), …, (xn, yn)⟩ → list of edges ⟨e1, …, em⟩
Overlapping Trip Retrieval
• Overlap: how much of the query trip is overlapping with some already
rated trip
• Inverted index
‣ Edge e → List of trips that contain e
‣ Retrieval cost is linear in the size of the query trip
$$ Ol(t_i, t_j) = \frac{\sum_{\forall e \in p(t_i) \cap p(t_j)} \ell(e)}{\ell(p(t_j))} $$
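The retrieval step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that trips are already map-matched to lists of edge IDs, that the overlap is the shared edge length divided by the length of the query trip, and all names and toy values (`build_inverted_index`, `overlaps_with_query`, `edge_length`) are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(trips):
    """Map each edge ID to the set of rated trips whose path contains it."""
    index = defaultdict(set)
    for trip_id, edges in trips.items():
        for e in edges:
            index[e].add(trip_id)
    return index

def overlaps_with_query(query_edges, index, edge_length):
    """Return {trip_id: Ol(t_i, t_q)} for every rated trip sharing an edge with the query."""
    query_edges = set(query_edges)
    query_length = sum(edge_length[e] for e in query_edges)
    shared = defaultdict(float)
    for e in query_edges:                    # one index lookup per query edge
        for trip_id in index.get(e, ()):
            shared[trip_id] += edge_length[e]
    return {t: s / query_length for t, s in shared.items()}

# Hypothetical toy network: edge lengths in meters.
edge_length = {"e1": 100.0, "e2": 200.0, "e3": 100.0, "e4": 50.0}
rated_trips = {"t1": ["e1", "e2"], "t2": ["e3", "e4"]}

index = build_inverted_index(rated_trips)
overlaps = overlaps_with_query(["e1", "e2", "e3"], index, edge_length)
print(overlaps)
```

Because each query edge triggers exactly one index lookup, the number of lookups grows linearly with the number of edges in the query trip.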
Rating Inference (Step 1)
• The rating of an edge e depends on
‣ the rating of each trip that crosses e
‣ the overlap of each trip that crosses e with tq
• Edge rating inference
$$ Rt(e) = \sum_{\forall t_i \in T_e} r_i \cdot \frac{Ol(t_i, t_q)}{\sum_{\forall t_j \in T_e} Ol(t_j, t_q)} $$
‣ Te is the set of trips that cross e: $T_e = \{ t_i \mid t_i \in D \wedge e \in p(t_i) \}$
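As a rough sketch of Step 1, the edge rating can be computed as an overlap-weighted average of the ratings of the trips that cross the edge. The function name `edge_rating` and the toy ratings and overlap values are assumptions made for illustration only.

```python
def edge_rating(edge, trips, ratings, overlap):
    """Overlap-weighted average rating of the rated trips that cross `edge`."""
    crossing = [t for t, edges in trips.items() if edge in edges]   # T_e
    total = sum(overlap[t] for t in crossing)
    if total == 0:
        return None          # no rated trip with non-zero overlap crosses this edge
    return sum(ratings[t] * overlap[t] / total for t in crossing)

# Hypothetical data: two rated trips, their ratings r_i, and their overlaps Ol(t_i, t_q).
trips = {"t1": {"e1", "e2"}, "t2": {"e2", "e3"}}
ratings = {"t1": 5.0, "t2": 3.0}
overlap = {"t1": 0.75, "t2": 0.25}

r_e1 = edge_rating("e1", trips, ratings, overlap)   # only t1 crosses e1
r_e2 = edge_rating("e2", trips, ratings, overlap)   # both trips cross e2
r_e9 = edge_rating("e9", trips, ratings, overlap)   # no trip crosses e9
```

Here the trip with the larger overlap (t1) pulls the rating of the shared edge e2 toward its own rating.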
Rating Inference (Step 2)
• The rating of the trip is given by the weighted sum of the ratings of its
edges
‣ Eq is the set of edges that have been rated from the previous step
• Note: our approach considers only segments that overlap
with at least one existing trip.
$$ r_q = \sum_{\forall e \in p(t_q)} Rt(e) \cdot \frac{\ell(e)}{\ell(E_q)} $$
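Step 2 can be illustrated with a short Python sketch that weights each rated edge by its share of the total rated length. The names (`trip_rating`, `edge_ratings`) and the numbers are hypothetical, and edges without a rating from Step 1 are simply skipped, in line with the note above.

```python
def trip_rating(query_edges, edge_ratings, edge_length):
    """Length-weighted sum of edge ratings over the rated edges E_q of the query trip."""
    rated = [e for e in query_edges if edge_ratings.get(e) is not None]   # E_q
    rated_length = sum(edge_length[e] for e in rated)                     # l(E_q)
    if rated_length == 0:
        return None          # the query trip overlaps with no rated trip
    return sum(edge_ratings[e] * edge_length[e] for e in rated) / rated_length

# Hypothetical data: e5 overlaps with no rated trip and therefore has no rating.
edge_length = {"e1": 100.0, "e2": 200.0, "e3": 100.0, "e5": 300.0}
edge_ratings = {"e1": 5.0, "e2": 4.5, "e3": 3.0}

r_q = trip_rating(["e1", "e2", "e3", "e5"], edge_ratings, edge_length)
```

In this example the unrated edge e5 contributes neither to the weighted sum nor to the normalizing length.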
Outdooractive dataset
Evaluation Setup
• Hiking Trails from Outdooractive
• Five attributes, each rated on a scale from 1 to 6:
Condition, Difficulty, Landscape, Quality, Technique
network    nodes      edges      trips (all)   trips (hiking)
Swabia      491213     630094     544           353
Austria    2484861    3033885     516           260
NE Italy   1467754    1884450     696           419
Bavaria    3045179    3928652    1346           754
The average overlap of each unrated trip with already rated trips was 48.6% for Swabia, 14.1% for Austria, 22.4% for
NE Italy, and 15.5% for Bavaria.
Experimental results (MAE)
Experimental results (Accuracy)
Our machine learning-based solution
Proposed in 2023
Theodoros Chondrogiannis, Mouzhi Ge: Rating Inference for Custom Trips from Enriched
GPS Traces using Random Forests, LocalRec at 31st ACM SIGSPATIAL, Hamburg, Germany,
November 13, 2023.
Trip rating inference ML-based framework
We use machine learning for the rating inference. The focus of this work is not ML model selection but feature
engineering and encoding selection, because enriched GPS traces are complex.
Location Encoder
• We first impose an 𝑛 × 𝑛 grid over the space defined by the minimum
bounding rectangle of all traces.
• One-hot encoding
• 𝑍-order curve to assign an ID to each grid cell
• For the set of cell IDs that a trip crosses, a vector of basic statistics, i.e., the min, max, mean, and median values.
• Histogram of 𝑛 buckets
[Figure: example 2 × 2 grid with cell IDs 0 to 3; the example trip crosses cells 0 and 3]
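The location encodings listed above might look as follows in Python. This is a sketch under several assumptions that are not stated on the slide: 𝑛 is a power of two (so that Z-order IDs fall in the range 0 to 𝑛² − 1), a trip "crosses" a cell if at least one of its GPS points lies in it, and the histogram uses 𝑛 equal-width buckets over the ID range. All function names are hypothetical.

```python
from statistics import mean, median

def z_order_id(col, row, bits=16):
    """Interleave the bits of col (even positions) and row (odd positions)."""
    z = 0
    for i in range(bits):
        z |= ((col >> i) & 1) << (2 * i)
        z |= ((row >> i) & 1) << (2 * i + 1)
    return z

def crossed_cells(trace, bbox, n):
    """Z-order IDs of the grid cells that contain at least one point of the trace."""
    min_x, min_y, max_x, max_y = bbox
    cells = set()
    for x, y in trace:
        col = min(int((x - min_x) / (max_x - min_x) * n), n - 1)
        row = min(int((y - min_y) / (max_y - min_y) * n), n - 1)
        cells.add(z_order_id(col, row))
    return cells

def one_hot(cells, n):
    """Bit vector of length n*n (assumes n is a power of two)."""
    return [1 if i in cells else 0 for i in range(n * n)]

def id_statistics(cells):
    """Min, max, mean, and median of the crossed cell IDs."""
    ids = sorted(cells)
    return [min(ids), max(ids), mean(ids), median(ids)]

def id_histogram(cells, n):
    """Counts of crossed cells in n equal-width buckets over the IDs 0 .. n*n - 1."""
    hist = [0] * n
    for c in cells:
        hist[c // n] += 1
    return hist

# Hypothetical 2 x 2 grid over the bounding rectangle (0, 0) to (2, 2).
cells = crossed_cells([(0.5, 0.5), (1.5, 1.5)], (0.0, 0.0, 2.0, 2.0), 2)
encodings = (one_hot(cells, 2), id_statistics(cells), id_histogram(cells, 2))
```

The one-hot vector grows quadratically with 𝑛, whereas the statistics and histogram encodings stay compact, which is one reason to compare them.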
Altitude Encoder
• A vector ⟨total ascent, total descent, minimum altitude, maximum altitude,
standard deviation of the elevation profile⟩
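A minimal sketch of the altitude vector, assuming the elevation profile is a list of altitudes in meters sampled along the trip. The function name `altitude_vector` is hypothetical, and the use of the population standard deviation is an assumption, since the slide does not say which variant is used.

```python
from statistics import pstdev

def altitude_vector(elevations):
    """Return [total ascent, total descent, min, max, std] of an elevation profile."""
    ascent = descent = 0.0
    for prev, cur in zip(elevations, elevations[1:]):
        diff = cur - prev
        if diff > 0:
            ascent += diff       # climbing between consecutive samples
        else:
            descent -= diff      # descending (diff is negative or zero)
    return [ascent, descent, min(elevations), max(elevations), pstdev(elevations)]

# Hypothetical elevation profile in meters.
profile = [500.0, 520.0, 510.0, 560.0, 540.0]
vec = altitude_vector(profile)
```

For this profile the total ascent is 70 m and the total descent is 30 m, with altitudes between 500 m and 560 m.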
POI Distance Encoder
• A combination of two vectors
‣ Vector 1: Distances to all POIs
‣ Vector 2: Distances to a predefined set of POIs (𝑘 nearest POIs)
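One possible reading of this encoder is sketched below. It assumes that the distance between a trip and a POI is the minimum great-circle distance from any GPS point of the trace to the POI, and that the two vectors are simply concatenated. Both are assumptions, and the names `haversine_m` and `poi_distance_vector` are hypothetical.

```python
import math

def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))

def poi_distance_vector(trace, pois, k):
    """Distances from the trace to all POIs, followed by the k smallest of them."""
    all_dists = [min(haversine_m(pt, poi) for pt in trace) for poi in pois]
    nearest = sorted(all_dists)[:k]
    return all_dists + nearest

# Hypothetical trace and POIs as (lat, lon) pairs.
trace = [(48.40, 12.90), (48.41, 12.91), (48.42, 12.92)]
pois = [(48.41, 12.91), (48.50, 13.00), (48.39, 12.90)]
vec = poi_distance_vector(trace, pois, k=2)
```

With 181,185 POIs a brute-force scan like this would be slow, so a spatial index would likely be needed in practice.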
Geo-tagged Images Encoder
• A bit vector whose size equals the number of images in 𝐼
• We set the bit associated with an image to 1 if the minimum
distance between the trace and the image location is below a
predefined threshold, e.g., 20 meters.
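The bit vector can be sketched as follows. For simplicity the example assumes projected coordinates in meters and measures the distance from each image to the nearest GPS point of the trace rather than to the nearest point on a segment. The name `image_bit_vector` and the toy coordinates are hypothetical.

```python
import math

def image_bit_vector(trace, images, threshold_m=20.0):
    """One bit per image: 1 if the trace passes within threshold_m of the image location."""
    return [
        1 if min(math.dist(pt, img) for pt in trace) < threshold_m else 0
        for img in images
    ]

# Hypothetical planar coordinates in meters.
trace = [(0.0, 0.0), (100.0, 0.0)]
images = [(5.0, 10.0), (50.0, 50.0), (100.0, 15.0)]
bits = image_bit_vector(trace, images)
```

The first and third images lie within 20 meters of a trace point, so their bits are set. The second image is about 70 meters away and its bit stays 0.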
Datasets
• Trip data obtained from Outdooractive: www.outdooractive.com
• Elevation data for trips from Copernicus: www.copernicus.eu
• 181,185 POIs from www.kaggle.com/datasets/ehallmar/points-of-interest-poi-database
• 50,000 geotagged images from www.kaggle.com/datasets/habedi/large-dataset-of-geotagged-images
Encoding methods overview
ML Model and Evaluation Metric
• Random forest
• Our previous experiments demonstrated that Random Forest performs best in
several similar rating inference scenarios, so we used a Random Forest classifier
in this work.
• MAE, widely used for evaluating rating predictions, especially in
recommender system research.
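For reference, MAE is the average absolute difference between the predicted and the actual ratings. A minimal sketch that computes it separately for each rating criterion is shown below. The ratings are made-up values on the 1 to 6 scale, not results from the experiments.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted ratings."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical ratings for four trips and two of the five criteria.
actual = {"Condition": [4, 5, 3, 6], "Difficulty": [2, 3, 6, 1]}
predicted = {"Condition": [4, 4, 5, 5], "Difficulty": [2, 3, 4, 1]}

mae = {c: mean_absolute_error(actual[c], predicted[c]) for c in actual}
```

Reporting one MAE per criterion makes it possible to see which encoding works best for which rating criterion.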
Experimental Results
Key take-aways
• "One size fits all" encodings may lower the quality of multi-criteria
rating inferences.
• Different encodings might be dynamically used to infer different
rating criteria.
• The trip-oriented ratings are focused on the intrinsic features of the
trip. Thus, the encoding of trip profiles can offer higher-quality rating
inferences.
• User-oriented ratings focus on how users feel about the trip and user
satisfaction.
Summary and Future Research
• Scope of this research: encoding selection, not model selection.
• The model may consider more contextual factors. For example, the
context of a trip may include group dynamics, previous experiences,
and cultural factors.
• Users would often like to know how the inference is made. In turn,
users can be more confident in their trip decisions. Therefore,
developing transparent and explainable models may increase user
trust and satisfaction.
• Including user feedback to enhance user engagement is also critical.
User feedback can be used to improve the model training and provide
continuous improvement for implementing trip recommendations.
Thank you and questions
Contact details
• Prof. Dr. habil. Mouzhi Ge
Head of Data Science and Intelligent Systems Research Group
European Campus Rottal-Inn
Deggendorf Institute of Technology
Max-Breiherr-Straße 32
84347 Pfarrkirchen, Germany
• Email: mouzhi.ge@th-deg.de