
Authors

Ramiro Robles.
Nationality: Mexico/Portugal
DOB: 18/01/78
Email: ramiro_samano@yahoo.com
Mobile: 962615625

Catherine McTeigue
Nationality: British
Email: curlycat21@hotmail.com
DOB: 19/09/1980
Mobile:

Ricardo Garibay
Nationality: Mexico
Email: ricardo.garibay.m@gmail.com
DOB: 12/07/1984
Mobile: 918861553

Evolutionary multi-criteria optimization and decision-making for neural
networks and mobility models for human behaviour prediction with
imperfect context information
Summary: We present a solution for predicting human behaviour (particularly
focusing on tourists) that will help us understand their preferences, choices, habits,
and common interests. The objective of this proposal is to allow stakeholders of the
tourism industry (including food and wine producers, government bodies, etc.) to
know better what tourists prefer depending on their budget, nationality, region of
origin, as well as visited location, local weather and time of the day/year chosen for
vacations. Our approach will exploit, as far as possible, context information
coming from sources with different levels of reliability: social media (e.g.,
Facebook likes, tweets, TripAdvisor ratings, Instagram posts, comments and
hashtags), search engines (Google statistics), and online polls of both
tourists (i.e., external users of the application) and owners of tourist locations
(e.g., restaurants, museums and hotels). The design of this solution recognizes the
imperfect nature of the collected information, and therefore attempts to predict,
based on controlled error statistics, tourist behaviour by combining different
information sources in a strict mathematical framework known as evolutionary
multi-objective and multi-criteria decision making. These mathematical tools will
allow for the simultaneous optimization of different criteria that depend on a set of
functional (e.g., the number of likes of a Facebook page or the number of stars in
TripAdvisor) and non-functional inputs (e.g., the opinion of a restaurant owner). This
means that we attempt to combine mathematically all the information collected
from different types of sources, assigning each source a weight that reflects its
reliability (the weights are obtained via different mathematical tools), and to
create a self-learning algorithm based on neural networks that will allow us to
predict tourist behaviour mainly as a Bayesian conditional probability space. For
example, we will be able to estimate the probability that a tourist coming from
France, during October and with a budget of x euros, staying in Baixa, Lisbon, will
order a glass of Douro wine costing x euros combined with (conditional on
selecting) an octopus dish costing x euros. Conversely, we will also be able to
estimate the odds of a tourist choosing a squid or seabass dish when the customer
has a strong preference for an Alentejo red
wine. Our mathematical models attempt to follow as accurately as possible the
behaviour of a tourist by allowing the models to change over time (evolve) and
adapt according to different situations, such as online feedback, user information,
climate, time of the year and even current economic events in the different
countries. To predict tourist mobility we will focus on mobility models used in the
literature on cellular communications, which implement Markov models with
conditional decision probabilities. This means we attempt to predict (based on
probability measurements) the next decision of a tourist who has visited, for
example, the area of Belem before midday. We would like to know what the most
likely next destination will be given that initial decision: Sintra, Cascais, or
downtown Lisbon? Our approach will therefore combine information from different
sources, assigning different weights depending on the reliability of the source, the
estimated precision error, and the mathematical models used for
predicting such errors. The weights of this optimization problem will also be used to
update the coefficients of the neural network, a tool that attempts to
recreate the learning process of the human brain in order to adapt to different
circumstances and refine the results over time based on statistical sampling and
observations. Evolutionary algorithms will also allow us to replicate the
evolutionary nature of human behaviour over time and across geographical
locations. They will also allow us to reduce the complexity of information
extraction and optimization, and thus obtain a fast algorithm response, which will
translate into an efficient mobile application that can be accessed on
resource-constrained embedded devices with an internet connection.

1. Introduction and Motivations


Modelling and predicting human behaviour is of paramount importance in the
optimization of resources and revenue in the tourism industry. In Portugal,
tourism has become one of the pillars of economic development across
the country. Over the last few years, the number of tourists visiting Portugal
has been increasing for different reasons, one of them being economic crises
that keep tourists away from destinations outside mainland Europe. While tourism
in Portugal has been developing at a reasonable pace, there are many
important gaps to be filled. Digital systems have transformed tourism around
the world. Online bookings, destination reviews, blogs, newsletters, and
website advertising have made it easier for tourists to retrieve and exploit
information to make better decisions.
Big data and context-aware processing are two concepts that have become
hot topics in information system design. The data coming from
databases, sensors and social networks is a source of statistical information
that can be used to predict or estimate certain aspects of systems,
weather, traffic, and human behaviour. This information can be exploited for
improved offers, commercial plans, and accurate customer profiling. With
strong EU regulations, privacy and security can be reinforced, reducing threats
while at the same time enabling a wide set of improved applications that can
be translated into more effective business models and therefore increased
revenue.
The main motivation for our proposal is to answer a set of questions
such as the following: if one visits the Alentejo region during autumn, what is
the most popular dish to order for dinner? In addition, we would also
like to answer the following: what is the drink or beverage most commonly
recommended to have with that dish? Or vice versa, if one wishes
to try one of the excellent wines of the region, which dish or type of food is
best suited to such wines? The answers to these
basic questions may seem straightforward to a local restaurant owner or
customer. However, if we ask the same questions for a whole city, for a
region (county) or at the country level, the amount of potential information,
combinations and options becomes more and more complex. So, the idea of
this proposal (in the first stage of development) is to provide an answer to
these and increasingly more complex questions. For example, what is the
most common dish that French tourists from Toulouse try in the Douro region
in the middle of October, on a sunny day with an average temperature of 28
degrees, combined with a red Tawny wine also from the region, at lunch time,
with a budget of XX euros over a period of ten days in Portugal?
What if wine producers, restaurant owners, government and the whole tourism
industry could have access to this type of information? Producers could create
new strategies, focusing on the products preferred in different geographical
locations. Or perhaps local authorities could reinforce and promote products
that are currently unknown to foreign tourists. In our opinion, this information
is key for the development of this industry. Our application intends to provide
this information and, more importantly, a method to combine different sources of
information with different levels of reliability and provide the best estimator.
Our solution intends to use a variety of mathematical tools that in
combination will attempt to provide the most accurate prediction and
estimation of the different probabilities.
In later stages of the development, we intend to extend the analysis to more
behavioural aspects such as type of location visited, preferred geographical
area for night clubs, etc. Our focus will be on conditional Bayesian
probabilities, which means that we focus on answering questions such as the
following: once a German visitor has been to the area of Belem, what is his
most likely next destination, depending of course on the time of the day,
weather conditions and budget? The set of such conditional probabilities is
what we call the Bayesian conditional probability space.
The proposed solution combines different types of information source: it
will use social networks (counting the number of likes, retweets, stars in
TripAdvisor), search engines (counting the visits to websites or searches for key
terms over a period of time), and also traditional polling of tourists and
owners of tourist locations (via online polls or face-to-face). In this way we
attempt to overcome the potential estimation errors and thus increase the
quality of the estimators. To update the probability space, we will use
mathematical models for self-learning based on neural networks, and to
combine different sources we will use evolutionary multi-criteria optimization
and decision making. This will enable convenient automatic self-learning
of the weighting coefficients of the different sources of information that
minimize the estimation error, and it additionally reflects the evolutionary
aspect of human behaviour through time and space (geographical
location).
This proposal is organized as follows. Section 2 describes the
modules of the solution. Section 3 provides more details of the evolutionary
multi-criteria optimization functionality, while Sections 4, 5, and 6 deal with
neural networks, Markov mobility models, and Bayesian probability spaces,
respectively. Section 7 focuses on the example of food and wine, while
Section 8 provides conclusions.
2. Description
This proposal aims to combine (mathematically) different sources of context
information for estimating the Bayesian conditional probabilistic behaviour of
tourists. Social media, website information, search engines, blog posts, online
polls of tourists and location owners, and feedback from the application itself
will be considered. Each of these sources has a different level of accuracy or
reliability, so each will be assigned a different weight when
obtaining the final estimator. This weighted combination of criteria, which aims
to obtain an improved set of estimators that minimize the bias and estimation
error of the different sources of information, is known as multi-objective
optimization. The weight of each source of information will be updated by a
self-learning algorithm based on neural networks. Each time new information is
received, the learning algorithm will obtain the next best value of each
weighting factor using an evolutionary approach. Some of the information
sources cannot be directly mapped into a continuous real-valued
function. For these we will use an approach known as multi-criteria decision
making or goal scoring, in which experts or other relevant people provide a set of
criteria with different marks or scores that are combined via a mapping
function to obtain a real value that can be combined with other
real-valued functions. The proposed self-learning algorithm is also intended to
adapt to changing conditions and thus refine the estimates from the different
sources of information.
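As a minimal sketch of the weighted combination described above, the Python
snippet below merges several source estimates of the same probability. The
inverse-variance weighting rule, the source names and all numbers are
illustrative assumptions; in the proposal the weights would come from the
self-learning algorithm.

```python
def combine_estimates(estimates, variances):
    """Combine source estimates with weights proportional to 1/variance.

    Inverse-variance weighting is an assumed stand-in for the learned
    weights: unreliable (high-variance) sources get less influence.
    """
    if len(estimates) != len(variances) or not estimates:
        raise ValueError("need one variance per estimate")
    raw = [1.0 / v for v in variances]
    total = sum(raw)
    weights = [r / total for r in raw]
    combined = sum(w * e for w, e in zip(weights, estimates))
    return combined, weights


# Hypothetical estimates of P(tourist orders octopus) from three sources.
sources = ["social_media", "search_engine", "owner_poll"]
estimates = [0.30, 0.50, 0.40]
variances = [0.04, 0.02, 0.01]  # assumed error variance of each source

combined, weights = combine_estimates(estimates, variances)
```

With these assumed variances the owner poll receives the largest weight, so
the combined estimate sits closest to its value.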
The proposed application will follow a service-oriented architecture, which is
well suited to our needs and type of information. Each of the sources is
encapsulated by a software module in charge of periodically retrieving
relevant information. This information is transmitted to the central module via
web services based on the SOAP protocol. Information can be retrieved
periodically or requested on demand. The server hosts the algorithm that updates
the probabilistic information, which is then stored in a database. The app will
act both as a source of information and as the requesting end user.
The figure below shows the functional model of the algorithm used to
combine the different sources of information with different estimation errors.
Each module is implemented in software, and the modules communicate with one
another via web services.

3. Evolutionary multi-criteria optimization/decision making


The theory of multi-objective optimization provides a formal mathematical
framework that allows for the simultaneous optimization and trade-off
analysis of multiple performance metrics or objective functions. Since two or
more of the objective functions can conflict with
one another, in general there is no single point in the
solution space that simultaneously optimizes all the objective functions.
Instead, the concept of Pareto optimality is commonly used. A Pareto optimal
solution is one in which no objective function can be improved without
degrading at least one other, which means that it
is not dominated by any other solution in the solution space. In general,
the number of Pareto optimal solutions can be infinite; together they define a
Pareto optimal curve or surface, which is of great interest to system designers
as it provides the set of points that achieve the best trade-off between
performance metrics or objective functions. The Pareto
front allows engineers to answer in a mathematical and accurate way
the following question: how much is lost in terms of performance metric y
when maximizing performance metric x? As can be observed, multi-objective
optimization lies at the heart of engineering and designing complex
systems in a professional and accurate manner.
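The dominance test behind Pareto optimality can be sketched in a few lines of
Python. The candidate scores below are hypothetical, and both metrics are
assumed to be maximized.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere.

    All objectives are assumed to be maximized.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (metric x, metric y) scores of five candidate solutions.
candidates = [(1, 9), (3, 7), (2, 6), (5, 5), (4, 2)]
front = pareto_front(candidates)
```

Here the front is (1, 9), (3, 7) and (5, 5). Moving along it from (3, 7) to
(5, 5) gains two units of metric x at the cost of two units of metric y, which
is the kind of trade-off question posed above.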
When the objective functions cannot be expressed in a precise mathematical
form, which is typically the case in decision making and technology selection
processes for large companies or complex systems, useful abstractions
and methodologies are preferred to proceed with the multi-objective or
multi-criteria optimization. These tools are called multi-criteria decision
making (MCDM). MCDM consists of creating a set of abstraction models that
map abstract criteria into mathematical quantities, in an attempt to
construct a mathematical function that can represent the decision-making
problem. These abstractions are usually questions that are put to
experts in the subject and that assign different weights to different criteria.
The way these criteria, weights and collected data are created, handled and
processed varies between approaches.
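One simple mapping function of this kind is a weighted sum of expert marks.
The sketch below assumes that model; the criteria, their weights and the
marks are hypothetical, and other MCDM methods would map the same inputs
differently.

```python
def weighted_score(marks, weights, max_mark=5):
    """Map expert marks per criterion to a single value in [0, 1].

    A weighted-sum model is assumed; it is only one of several
    possible MCDM mapping functions.
    """
    total_weight = sum(weights[c] for c in marks)
    return sum(weights[c] * marks[c] / max_mark for c in marks) / total_weight


# Hypothetical criteria weights and a restaurant owner's marks (1 to 5).
weights = {"popularity": 0.5, "price_fit": 0.3, "seasonality": 0.2}
owner_marks = {
    "octopus": {"popularity": 5, "price_fit": 3, "seasonality": 4},
    "seabass": {"popularity": 4, "price_fit": 4, "seasonality": 2},
}

scores = {dish: weighted_score(m, weights) for dish, m in owner_marks.items()}
```

The resulting real values (0.84 for octopus and 0.72 for seabass under these
assumptions) can then be combined with the real-valued functional inputs.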
In evolutionary multi-objective optimization (MOO), the optimization problem is
initialized with a random solution that is modified (evolved) using rules of
genetic evolution. A number of iterations are necessary for the approximation to
converge to an appropriate Pareto frontier.
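The loop below is a deliberately small sketch of this idea, not the algorithm
the proposal will use. It assumes a population of random solutions, Gaussian
mutation only (no crossover), survival of non-dominated solutions, and two toy
objectives to be minimized; all parameter values are illustrative.

```python
import random


def objectives(x):
    """Two conflicting objectives to minimize (Schaffer's test problem)."""
    return (x * x, (x - 2.0) ** 2)


def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))


def non_dominated(population):
    """Keep the solutions whose objectives no other solution dominates."""
    scored = [(x, objectives(x)) for x in population]
    return [x for x, fx in scored if not any(dominates(fy, fx) for _, fy in scored)]


def evolve(generations=50, size=20, seed=0):
    """Evolve a random population towards the Pareto front."""
    rng = random.Random(seed)
    population = [rng.uniform(-10.0, 10.0) for _ in range(size)]
    for _ in range(generations):
        # Mutation: every parent produces one child perturbed by Gaussian noise.
        children = [x + rng.gauss(0.0, 0.5) for x in population]
        merged = population + children
        # Selection: non-dominated solutions survive first, others fill the gaps.
        front = non_dominated(merged)
        rest = [x for x in merged if x not in front]
        rng.shuffle(rest)
        population = (front + rest)[:size]
    return non_dominated(population)


front = evolve()
```

For this toy problem the Pareto set is the interval [0, 2], so the surviving
solutions are expected to cluster there. A real deployment would more likely
rely on an established method such as NSGA-II.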
4. Neural Networks
Neural networks are a family of mathematical tools that attempt to
recreate the cognitive self-learning behaviour of neurons in the human brain.
Neurons are connected via synapses, and knowledge is stored in these
synapses using different mathematical models. An algorithm
adapts the synaptic elements (weights) according to observations of
phenomena, which leads to the cognitive process of self-learning.
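The simplest instance of such an update is a single linear neuron trained with
the delta (least-mean-squares) rule, sketched below. The network structure,
learning rate and training data are assumptions made for illustration; the
proposal does not fix an architecture.

```python
def train_neuron(samples, targets, rate=0.1, epochs=200):
    """Fit one linear neuron with the delta (least-mean-squares) rule."""
    weights = [0.0] * len(samples[0])
    for _ in range(epochs):
        for inputs, target in zip(samples, targets):
            output = sum(w * x for w, x in zip(weights, inputs))
            error = target - output
            # Move each synaptic weight in proportion to its input and the error.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights


# Hypothetical training data: two source estimates per observation, with
# targets generated from assumed "true" source weights of 0.7 and 0.3.
samples = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.2)]
targets = [0.7 * a + 0.3 * b for a, b in samples]

learned = train_neuron(samples, targets)
```

After training, the learned weights approach the assumed values 0.7 and 0.3,
which illustrates how observations can drive the weighting of each source.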

5. Markovian Mobility models


Markov chains are mathematical models that represent the different states of
a system and the transition probabilities between them. This type of model is
suitable for human mobility and is commonly used in cellular
communications. It will allow us to assign probabilities of direction and
speed across a city based on preference, time of the day, itinerary, budget,
etc. Tourists will be modelled as being in a given state of the Markovian space,
and via calculations the most probable next state or states can be determined,
leading to accurate prediction of behaviour and itinerary. This
information can help the tourism industry to design strategies, plans, packages
and routes that better suit tourists' common habits.
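A minimal Python sketch of such a chain follows. The four locations are taken
from the summary, but the transition probabilities are invented for
illustration; in practice one matrix per context (time of day, weather,
budget) would be estimated from data.

```python
# Hypothetical transition probabilities; each row must sum to 1.
TRANSITIONS = {
    "Belem": {"Belem": 0.1, "Sintra": 0.2, "Cascais": 0.3, "Downtown": 0.4},
    "Sintra": {"Belem": 0.1, "Sintra": 0.2, "Cascais": 0.4, "Downtown": 0.3},
    "Cascais": {"Belem": 0.2, "Sintra": 0.2, "Cascais": 0.1, "Downtown": 0.5},
    "Downtown": {"Belem": 0.3, "Sintra": 0.2, "Cascais": 0.2, "Downtown": 0.3},
}


def most_likely_next(state):
    """Return the destination with the highest transition probability."""
    row = TRANSITIONS[state]
    return max(row, key=row.get)


def step(distribution):
    """Advance a probability distribution over states by one transition."""
    nxt = {state: 0.0 for state in TRANSITIONS}
    for state, prob in distribution.items():
        for target, p in TRANSITIONS[state].items():
            nxt[target] += prob * p
    return nxt


after_one = step({"Belem": 1.0})
after_two = step(after_one)
```

Under these assumed numbers, a tourist currently in Belem is most likely to go
downtown next, and `after_two` gives the distribution over locations two moves
later.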

6. Bayesian and conditional statistics


One of the central concepts of this proposal is that of conditional and Bayesian
probabilities. Our app will provide the user with the probability of the most
common selection of, for example, Dao wine conditional on the visitor having
selected seafood for dinner, and also vice versa: what are the most likely
dish selections if a visitor with a given profile prefers Tawny red wine for
dinner? Bayesian probabilities constitute a powerful tool in the derivation of
probability spaces and high-quality unbiased estimators.
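Both directions of conditioning can be estimated from joint counts, as the
following sketch shows. The order counts are hypothetical, and the function
names are introduced here only for illustration.

```python
from collections import Counter

# Hypothetical counts of observed (dish, wine) orders.
orders = Counter({
    ("seafood", "Dao"): 30,
    ("seafood", "Tawny"): 10,
    ("meat", "Dao"): 20,
    ("meat", "Tawny"): 40,
})


def probability(dish=None, wine=None):
    """Relative frequency of orders matching the given dish and/or wine."""
    total = sum(orders.values())
    matching = sum(
        count for (d, w), count in orders.items()
        if (dish is None or d == dish) and (wine is None or w == wine)
    )
    return matching / total


def wine_given_dish(wine, dish):
    """P(wine | dish) = P(dish and wine) / P(dish)."""
    return probability(dish, wine) / probability(dish=dish)


def dish_given_wine(dish, wine):
    """P(dish | wine) = P(dish and wine) / P(wine)."""
    return probability(dish, wine) / probability(wine=wine)


p_dao_given_seafood = wine_given_dish("Dao", "seafood")
p_seafood_given_dao = dish_given_wine("seafood", "Dao")
# Bayes' rule recovers the second quantity from the first.
via_bayes = p_dao_given_seafood * probability(dish="seafood") / probability(wine="Dao")
```

With these counts, P(Dao | seafood) is 0.75 and P(seafood | Dao) is 0.6, and
Bayes' rule links the two through the marginal frequencies of each dish and
wine.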

7. Example
We provide an example of how the app will look to the end user, highlighting
the different steps of the data flow and calculations of the proposed
algorithm. Let us assume that a German tourist visiting Porto accesses our app
to find out what the most common combination of food and
wine is in the area of Cais da Ribeira. The tourist will be presented with a
screen for selecting which information he wants to request. Let us assume that
he requests the most typical dish in the area. The app will proceed to
gather the following information to obtain the best answer or set of answers.
The algorithm will consider a set of websites about food in the area, counting the
dish with the most mentions. It will do the same with blogs and with the number of
searches on Google analytics. The algorithm will also retrieve the
information collected from restaurant owners in the area, who have provided a
ranking of the dishes most commonly requested by tourists from different
countries. The app will ask the tourist to provide feedback about his
selection, which will also be used for refining the rankings of subsequent
requests. Results will be stored in a database containing the different
conditional probability spaces. Every time the information is accessed via the
app or one of the information sources is updated, a self-learning algorithm will
update the weights assigned to each source of information, aiming to
reduce estimation errors and maximize the accuracy of the information
provided to the end user.
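The ranking step of this data flow could look roughly like the sketch below.
The mention counts, the source weights and the dish names are all hypothetical,
and the weighted sum of per-source shares is only one possible scoring rule.

```python
def rank_dishes(mentions_by_source, source_weights):
    """Rank dishes by the weighted sum of their share of mentions per source."""
    scores = {}
    for source, mentions in mentions_by_source.items():
        total = sum(mentions.values())
        for dish, count in mentions.items():
            share = count / total  # normalize so large sources do not dominate
            scores[dish] = scores.get(dish, 0.0) + source_weights[source] * share
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical mention counts per source and current source weights.
mentions_by_source = {
    "websites": {"octopus": 60, "seabass": 30, "squid": 10},
    "blogs": {"octopus": 20, "seabass": 25, "squid": 5},
    "owner_ranking": {"octopus": 50, "seabass": 40, "squid": 10},
}
source_weights = {"websites": 0.3, "blogs": 0.2, "owner_ranking": 0.5}

ranking = rank_dishes(mentions_by_source, source_weights)
```

User feedback would then adjust `source_weights` before the next request, so
that sources which agree with observed choices gain influence over time.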
