
Recommending Configurations based on Cases

with Unreliable Labels

Jack Stillwell1

Drexel University, Philadelphia PA 19104, USA

Abstract. This paper explores utilizing cases that present challenges to
the application of case-based reasoning (CBR) methodology. The cases in
question are player match records from SMITE by Titan Forge Games. In
SMITE, a player must select a character and up to 6 items for that char-
acter. Our goal is to recommend an item configuration which increases
a player’s chances of winning. The cases used as a knowledge base for
these recommendations are collected from recently played matches and
consist of individual players’ character choice, item choices, performance
statistics, and match outcome.
There are two primary challenges to the application of CBR. First, the
problem context available is extremely minimal, consisting of only the
character chosen. We cannot extract preferences or constraints beyond
the character choice to aid a match to a configuration of items. Second,
the match outcomes within each case are not dependent upon an indi-
vidual player, but on 5 players’ ability to work both individually and
together against another team of 5 to find victory. These factors, which
include the relative strength of both opponents and teammates, can be
considered random from the perspective of the data concerned.
To mitigate this inherent uncertainty we propose a seeded unsupervised
step, clustering performances represented through statistics to align with
wins and losses. By separating performances that are more aligned with
each outcome, we gain an understanding of which associated configura-
tions are likely to assist a player in finding victory. To generate configura-
tions we induce a decision tree trained to classify configuration data into
a cluster. Finally, we use Bernoulli Naive Bayes to rank the generated
configurations.

Keywords: Multiplayer Online Battle Arena (MOBA) · Explainable AI
(XAI) · Configuration · Classification · Uncertainty · Case-based Reason-
ing (CBR).

1 Introduction
In this paper we discuss a methodology for obtaining recommended configu-
rations from performance and configuration data in cases with unreliable labels.
Our implementation utilizes data gathered from SMITE, specifically player per-
formance data and player item data. In SMITE, items alter the statistics of a
player, a player’s allies, or even a player’s opponents. For the purposes of this
paper, we consider the player performance data to be characterized by the con-
figuration of their items. SMITE allows a player to choose up to 6 items over
the course of a game, and this choice of 6 from 250+ items is the core problem
we are attempting to solve through the application of our methodology.
During a review of the existing work in configuration generation, we were
unable to find any papers which overlapped with our proposed methodology of
configuration generation through classifier explanation. However, we were able
to find papers which discussed our proposed solution for reducing uncertainty
– a seeded clustering algorithm. In 2002, Basu et al. experimented with the
properties and potential benefits of seeding variations of the KMeans clustering
algorithm [2]. In their conclusion, they state that “seeding [KMeans] without
constraints is a robust semi-supervised method that is less sensitive to noise and
imperfections in ... supervised data” (Basu et al., 8).
The lack of previous work regarding the use of decision trees to generate con-
figurations is a clear opportunity to explore a niche application of explainable
case-based reasoning. Our work extends the scope of applications for explain-
able case-based reasoning, exploring new methods which expand the range of
problems that can be tackled. Specifically, our work enables the application of
explainable case-based reasoning to generate configurations which increase the
likelihood of a desired outcome given only minimal information to match to
previous cases containing performance and configuration data.
Our proposed methodology consists of three steps: uncertainty reduction
through seeded unsupervised clustering, configuration generation through su-
pervised classifier explanation, and configuration ranking through classification
by a secondary supervised classifier.
The domain renders our results difficult to evaluate, but the accuracy scores
of the applied classifiers and the results of our trials against a Neural Network
were encouraging.

2 The Game and its Data

SMITE is a Multiplayer Online Battle Arena, similar to the extremely popular
League of Legends and DOTA2. SMITE differentiates itself by its unique 3rd-
person perspective and heavier reliance on mechanical skill than other games
in its genre. SMITE currently contains 101 playable characters, called gods, and
264 items which can be purchased for those characters. A player chooses
1 god per match and may equip that god with up to 6 items over the course of
the match. Items are extremely impactful over the course of a match, altering
character’s base statistics as well as providing other effects which affect the
player, their teammates, and sometimes even their opponents. Many items have
effects which synergize with or against particular characters or items. Due to
this, an effective item build is critical to obtaining victory.
The game mode which we source our data from is Ranked Conquest, the pre-
mier competitive mode of SMITE. Ranked Conquest is played with 10 players, 5
on each team. The goal of Conquest is to defeat at least 2 opposing Tower struc-
tures (6 possible), 1 opposing Phoenix structure (3 possible), and the opposing
Titan structure. Over the course of the match, each player generates statistics
including:

– Kills: the number of kills a player obtains against other players.
– Assists: the number of kills a player assists teammates or allied structures
in obtaining.
– Deaths: the number of times a player is killed.
– Player Damage: the amount of damage a player deals to opposing players.
– Damage Taken: the amount of damage a player receives from other players
during the match.
– Damage Mitigated: the amount of incoming damage a player mitigates
through base protections and protections obtained from items.
– Healing: the amount of healing a player generates for either themselves or
their teammates.
– Structure Damage: the amount of damage a player deals to opposing
structures over the course of the match.
– Gold Earned: the amount of gold a player earns over the match, obtained
through defeating neutral and opposing structures and players.
– Items Purchased: the items a player ends the match with, up to 6.
– Minutes: the length of the game in minutes.
– Win Status: whether the player was victorious in their match or not.
– Date Played: the date on which the match was played.

The difficulty faced when presented with the described data is that each data
instance is only 1/10 of the data produced by a match, in which 5/10 are assigned
a loss label and 5/10 are assigned a win label. With only a single instance of the
described data, it is uncertain whether the label assigned is the result of the
individual’s contribution or the contribution of their teammates or opponents.

3 Proposed Approach

Our proposed approach occurs in three distinct steps:


First, we determine the most positive and most negative cases based upon the
performance features of each case and the existing win / loss label. Then, we
use the performance features from these cases as seeds in a clustering algorithm,
forming two clusters of the performance data of the cases around the positive
and negative seeds, assigning new positive and negative labels to each case based
on the assigned cluster of its performance data.
Second, we use the configuration features of each case associated with the
new positive and negative label to train an explainable classifier which considers
the interactions between features. This explainable classifier is then inspected
and the configurations which produce a likelihood of belonging to the positive
class over a certain threshold are generated as potential recommendations.

Third, we use the configuration features of each case associated with the as-
signed positive and negative label to train an explainable classifier which consid-
ers only the individual impact of features. Each potential recommended config-
uration is then assessed by this second classifier and the potential recommended
configurations are ranked by likelihood of belonging to the positive class.

3.1 Clustering

The first step is designed to mitigate the uncertainty in the existing label for
each case – the existing label applies to a larger data instance of which our case
is only a part. Due to having only partial data, we accept the existing label as
informative but not definitive. To obtain more reliable labels we treat the data
as unlabeled and employ an unsupervised learning technique to obtain labels
based upon the performance data associated with each case.
In order to direct the unsupervised learning technique towards the trends
we wish to reveal in the data, we elected to seed the algorithm. Determining
these seeds requires identifying the two instances which best exemplify positive
performance features and negative performance features. Once these instances
have been identified, they are used as the starting point for the unsupervised
learning algorithm.
Once the unsupervised learning algorithm has completed its clustering, the
new cluster labels are recorded in each case, replacing the previous unreliable label.
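The seeding mechanism described above can be sketched as follows, assuming scikit-learn’s KMeans; the performance data and the two seed cases here are synthetic stand-ins, not SMITE data.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic standardized performance data: 100 cases, 9 statistics.
performances = rng.normal(size=(100, 9))

# Hypothetical exemplars of positive and negative performance.
positive_seed = performances[0]
negative_seed = performances[1]

# Passing the seeds through `init` (with n_init=1) starts KMeans from the
# chosen exemplars instead of random centroids, directing the clusters
# toward the trends the seeds represent.
km = KMeans(
    n_clusters=2,
    init=np.vstack([positive_seed, negative_seed]),
    n_init=1,
    random_state=0,
)
new_labels = km.fit_predict(performances)  # cluster 0 grows from the positive seed
```

After fitting, each case’s cluster index becomes its new label, replacing the original win / loss label.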

3.2 Configuration Generation

The second step generates possible recommended configurations. To prepare the
configuration data in each case for use, it is arranged so that all possible features
are represented in each instance as a binary inclusion feature, where a positive
value indicates a feature’s presence in a configuration and a negative value
indicates its absence.
Next, we train a supervised classifier which considers the interaction between
features to determine the label of a case given the case’s rearranged configuration
data. Due to the necessity of transparency for the remainder of the process, we
recommend a decision tree. A decision tree is recommended due to its clear
indication of the most important features (closer to the root of the tree), clear
indication of whether a particular feature should be included (branching on
inclusion or exclusion), and clear indication of the desirability of a path (the
probability of belonging to a desired class in each leaf).
After the supervised classifier has been trained, we analyze its decisions to
generate possible recommended configurations. In the case of a decision tree,
each branch of the tree is traced from the root down until the desired number
of features have been included. At this point, each trace endpoint is evaluated
and any traces above a probability threshold of belonging to the desired class
are recorded as possible recommendations.
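A minimal sketch of this trace-and-threshold idea, assuming scikit-learn’s DecisionTreeClassifier; the binary item matrix, the toy label, and the 12-item universe are synthetic stand-ins rather than SMITE data.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
n_items = 12
X = rng.integers(0, 2, size=(500, n_items))   # binary item-inclusion matrix
y = (X[:, 0] & X[:, 1]).astype(int)           # toy label: items 0 and 1 win together

clf = DecisionTreeClassifier(max_depth=6, random_state=0).fit(X, y)
t = clf.tree_

def trace(node, included, results, max_items=6, threshold=0.5):
    """Walk the tree, recording item sets whose path ends (leaf reached or
    enough items included) with P(positive) at or above the threshold."""
    counts = t.value[node][0]
    p_pos = counts[1] / counts.sum()
    if t.children_left[node] == -1 or len(included) >= max_items:
        if included and p_pos >= threshold:
            results.append((tuple(included), p_pos))
        return
    feat = t.feature[node]
    # Left child: feature value <= 0.5, i.e. the item is excluded from the build.
    trace(t.children_left[node], included, results, max_items, threshold)
    # Right child: feature value > 0.5, i.e. the item is included in the build.
    trace(t.children_right[node], included + [feat], results, max_items, threshold)

candidates = []
trace(0, [], candidates)   # (item indices, positive-class probability) pairs
```

In the real pipeline the trace continues until between 4 and 6 items are included; the 50% threshold here mirrors the cutoff used in the implementation section.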

3.3 Configuration Ranking

The final step ranks the generated configurations. To accomplish this, we train
a secondary supervised classifier which considers only individual features to de-
termine the label of a case given the case’s rearranged configuration data. We
recommend a Bernoulli Naive Bayes algorithm for this task. Bernoulli Naive
Bayes is recommended due to its completely independent consideration of each
feature. This provides a vital secondary lens through which to view a configu-
ration, capable of recognizing and surfacing the most potent configuration by
individual feature.
After the secondary classifier has been trained, each generated configuration
is evaluated for likelihood of belonging to the desired class. This likelihood is then
used to rank the generated configurations from highest to lowest likelihood.
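The ranking step can be sketched as follows, assuming scikit-learn’s BernoulliNB; the training data and candidate builds are synthetic placeholders, and class 1 stands in for the positive cluster.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

rng = np.random.default_rng(2)
n_items = 12
X = rng.integers(0, 2, size=(500, n_items))   # binary item-inclusion matrix
y = rng.integers(0, 2, size=500)              # cluster labels (1 = positive)

bnb = BernoulliNB().fit(X, y)

def rank(candidate_builds):
    """Rank candidate builds (lists of item indices) by P(positive class)."""
    vectors = np.zeros((len(candidate_builds), n_items))
    for i, build in enumerate(candidate_builds):
        vectors[i, list(build)] = 1
    scores = bnb.predict_proba(vectors)[:, 1]  # column 1 = class 1 = positive
    order = np.argsort(scores)[::-1]
    return [(candidate_builds[i], scores[i]) for i in order]

ranked = rank([[0, 1, 2], [3, 4, 5], [0, 4, 7]])
```

Because BernoulliNB scores each item independently, this second lens deliberately ignores the synergies the decision tree captured, which is exactly why it is useful as a tie-breaker.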

4 Implementation

4.1 Data Preparation

We begin by filtering the data obtained to include at least 2000 records for each
possible character, with more records allowed if the age of the records is less
than 2 weeks. We chose 2 weeks as the cutoff because that is the time frame in
which the game is updated by the developers to balance different statistics and
these updates often significantly change how the game is played. Additionally,
we only consider data generated by players currently awarded a rank of Platinum
or above by the in-game ranking system, which indicates that those players are
within the top 50% of players, as evaluated by number of games won and growth
over the current split, which lasts roughly 10 weeks. To account for the extreme
differences in builds for different characters, data is separated by character.
After we have obtained the data and chosen a particular character’s data
to use, the first step is separating the data into 3 matrices with the following
columns (features):

Table 1. Case Data Matrix Representation

Performance Data    Item Configuration    Target Column
----------------    ------------------    -------------
Kills               Item One              Win Status
Assists             Item Two
Deaths              Item Three
Player Damage       Item Four
Damage Taken        Item Five
Damage Mitigated    Item Six
Healing
Structure Damage
Gold Earned
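The separation into the three matrices of Table 1 can be sketched as follows, assuming the raw cases are held in a pandas DataFrame; the two example cases and their values are fabricated for illustration.

```python
import pandas as pd

# Two fabricated cases with the columns named in Table 1.
cases = pd.DataFrame({
    "Kills": [5, 2], "Assists": [7, 3], "Deaths": [1, 8],
    "Player Damage": [21000, 9000], "Damage Taken": [15000, 22000],
    "Damage Mitigated": [30000, 12000], "Healing": [0, 500],
    "Structure Damage": [4000, 100], "Gold Earned": [14000, 9000],
    "Item One": ["Ninja Tabi", "Ninja Tabi"],
    "Item Two": ["Pestilence", "Void Shield"],
    "Item Three": [None, None], "Item Four": [None, None],
    "Item Five": [None, None], "Item Six": [None, None],
    "Win Status": [1, 0],
})

performance_cols = ["Kills", "Assists", "Deaths", "Player Damage",
                    "Damage Taken", "Damage Mitigated", "Healing",
                    "Structure Damage", "Gold Earned"]
item_cols = [f"Item {n}" for n in ("One", "Two", "Three", "Four", "Five", "Six")]

performance_matrix = cases[performance_cols]   # fed to clustering
item_matrix = cases[item_cols]                 # fed to the classifiers
target_column = cases["Win Status"]            # unreliable label
```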

4.2 Clustering

For our implementation we elected to use the KMeans algorithm, but were un-
sure of the impact different seeds might have. To investigate this impact and
sanity check our methodology, we evaluated a variety of techniques, including
Gradient Descent Regression, Relief Regression, and Neural Networks, select-
ing each technique’s most positive and negative output as the seed instance for
KMeans. While each technique provided a different seed instance, the seeded
KMeans algorithm created clusters which were more than 99% similar regard-
less of the seed. These trials confirmed that the trends we were interested in
were present in the data and also gave us several options for determining the seeds.
In addition to KMeans, we also looked into fuzzy KMeans and DBSCAN as
unsupervised clustering algorithm alternatives. While fuzzy KMeans showed po-
tential, especially with its ability to designate certainty thresholds for acceptance
to a cluster, DBSCAN was unable to find any meaningful patterns in the data.
To reduce the complexity of the experiment (e.g. using different thresholds for
acceptance to each cluster in trial runs) we elected to use the KMeans algorithm,
with possible future work dedicated to discovering the impact of fuzzy KMeans.
The Performance Data Matrix is standardized through zero-centering and
division by standard deviation of each column. After the Performance Data
Matrix has been standardized, it is fed to a Gradient Descent Logistic Regression
algorithm, using the Win Status column as the target. After the Gradient
Descent Logistic Regression algorithm has been trained, it is used to predict the
case in the Performance Data Matrix most likely to be a win and the case in
the Performance Data Matrix most likely to be a loss. These two cases are then
used to seed a 2-cluster KMeans clustering algorithm.
After the clustering completes, the cluster containing the most wins is as-
signed a positive label and the cluster containing the most losses is assigned a
negative label which supplants the Win Status label. These positive and nega-
tive labels are then correlated with their respective Item Data cases in the Item
Data Matrix.
At this point, we consider the uncertainty in the data to be minimized
through the application of an unsupervised algorithm operating without knowl-
edge of the original label.

4.3 Configuration Generation

Now that we have our uncertainty-mitigated labels, the Item Data Matrix must
take a form in which the relative strengths and importance of each possible
item and their combinations can be evaluated. First, a matrix is created with a
column for each of the 264 possible items and a row for each case, with default
values of 0. If the item was purchased in that match, a 1 replaces the 0 at the
intersection of that column and row. Next, the resulting Item Purchase Matrix
is pruned to contain only columns in which an item was purchased during 3%
or more of cases.
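This binarize-and-prune step can be sketched as follows, assuming pandas; the item names and three cases are placeholders rather than the real 264-item catalogue.

```python
import pandas as pd

item_cols = [f"Item {n}" for n in ("One", "Two", "Three", "Four", "Five", "Six")]
cases = pd.DataFrame({
    "Item One": ["Ninja Tabi", "Ninja Tabi", "Warrior Tabi"],
    "Item Two": ["Pestilence", "Void Shield", "Pestilence"],
    "Item Three": [None, "Pestilence", None],
    "Item Four": [None, None, None],
    "Item Five": [None, None, None],
    "Item Six": [None, None, None],
})

# One column per distinct item, 1 where the item was purchased in that case.
purchase = pd.get_dummies(cases[item_cols].stack()).groupby(level=0).max()

# Prune items purchased in fewer than 3% of cases.
purchase = purchase.loc[:, purchase.mean() >= 0.03]
```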

The next step is to train a decision tree using the Item Purchase Matrix, with
the respective positive and negative cluster labels used as the target column for
each instance.
Our decision tree functions by splitting on the feature resulting in minimally
random children. In more technical terms, it selects the feature for which a
split results in the optimal information gain. This means that items selected
first, towards the top of the tree, are more significant than items selected later,
towards the bottom. To generate potential build recommendations, we trace the
decision tree from the top down, recording the items which are recommended for
purchase until at least 4 and at most 6 items have been recommended, returning
only traces which terminate in a 50% or higher chance of a positive label. This
methodology has the additional effect of capturing item synergies, as items which
synergize with each other will appear consistently in traces together.
Recommended builds are constructed by analyzing each returned trace and
grouping any trace with 5 similar items into a core and their differing items as
optional.
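The grouping of traces into a shared core plus optional items can be sketched as follows; the traces are synthetic item-index sets, and the pairwise-overlap strategy is our assumption about how "5 similar items" is detected, not the paper’s exact procedure.

```python
from itertools import combinations

# Synthetic traces: the first three share a 5-item core, the last is unrelated.
traces = [
    {1, 2, 3, 4, 5, 6},
    {1, 2, 3, 4, 5, 7},
    {1, 2, 3, 4, 5, 8},
    {9, 10, 11, 12, 13, 14},
]

def group_cores(traces, core_size=5):
    """Group traces sharing `core_size` items into a core with optional items."""
    groups = {}
    for a, b in combinations(range(len(traces)), 2):
        shared = traces[a] & traces[b]
        if len(shared) >= core_size:
            core = frozenset(sorted(shared)[:core_size])
            groups.setdefault(core, set()).update(traces[a] - core,
                                                  traces[b] - core)
    return groups

cores = group_cores(traces)   # maps each core to its set of optional items
```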

4.4 Configuration Ranking

Finally, we train a Bernoulli Naive Bayes algorithm on the same data as our de-
cision tree algorithm and use it to rank the generated configurations. Bernoulli
Naive Bayes (BNB) assumes that each feature (item) affects the results indepen-
dently and therefore does not capture these essential synergies. However, BNB
performs the function of capturing individual item strength, which can then be
used to evaluate the relative strength of different synergistic builds constructed
by the decision tree (ID3) algorithm [1].
After the potential build recommendations have been generated from the
decision tree trace, each recommendation core is classified using the BNB algo-
rithm, and the 3 recommendations with the highest likelihood of belonging to
the positive cluster are presented to the user.

5 Evaluation and Algorithm Explanation

Unfortunately, assessing the strength of these builds in an unbiased, statistical
way is extremely difficult due to both the constant shifting of the game environ-
ment and the differences between players.
We attempted to evaluate our methodology by collecting accuracy from the
algorithms used, as well as by training a Neural Network to perform the same
classification as our decision tree and BNB algorithms and evaluating the rec-
ommended builds against it.
For brevity, we will present only the data from a single
character’s recommendations.

5.1 Classification Accuracy


For the character Bellona, the decision tree algorithm was 91% accurate at pre-
dicting a positive or negative label given the item purchase data. Both the BNB
algorithm and Neural Network were 72% accurate. It is interesting to note that
given item purchase data alongside the original win label, the Neural Network
could not attain over 60% accuracy, even allowing for over-fitting.

5.2 Neural Network Evaluation


Given the generated builds, the Neural Network assigned probabilities of success
to the top 3 builds in the same order as they had been ranked by the BNB
algorithm.

Table 2. Configuration Trials

Recommended ’Core’ Build                                   BNB Ranking (Score)   NN Score
Ninja Tabi, Berserker’s Shield, Void Shield, Pestilence,   1 (86%)               76.4%
  Frostbound Hammer
Ninja Tabi, Berserker’s Shield, Void Shield, Pestilence,   2 (85.9%)             76.1%
  Hide of the Nemean Lion
Ninja Tabi, Berserker’s Shield, Void Shield, Shogun’s      3 (85.7%)             72.0%
  Kusari, Pestilence

5.3 Decision Tree Explanation


In the case of the decision tree, as discussed above, the generated configura-
tions are themselves the explanation. The first item presented is the first to be
purchased along the path, with each additional item following further down the
branch. The full tree can be viewed by printing a graphic of nodes and paths
labeled with the corresponding human-comprehensible features and values, ter-
minating in leaf nodes with probabilities of belonging to either class. Some of
our implementation choices ensure this ease of explanation. Because the data
fed to the decision tree consists of binary inclusion and exclusion values for
features which directly correspond to the configuration items, the significance
of any branch-and-value path (purchasing or not purchasing a particular item)
is extremely simple to convey.

5.4 Bernoulli Naive Bayes Explanation


Bernoulli Naive Bayes works by assigning a value to a feature based on its
correlation with the desired class. To explain the results, one need simply present
those values in a human-comprehensible format. Below, we have included the
values for a few items. If the values assigned to a build’s items are averaged,
the build’s rating can be calculated as 1/e^avg.

Table 3. Example Item Values from BNB Trained on Bellona Data

Item                 Value
Warrior’s Blessing   -1.61
Ninja Tabi           -0.81
Pestilence           -0.88
Winged Blade         -3.10
Shogun’s Kusari      -1.19
Void Shield          -0.93
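As an illustration of the rating formula described above (read here as 1/e^avg, our reconstruction of the paper’s notation), a hypothetical three-item build can be rated from the example values; this sketch makes no claim to reproduce the percentages in Table 2.

```python
import math

# Hypothetical partial build drawn from the example item values above.
item_values = {"Ninja Tabi": -0.81, "Pestilence": -0.88, "Void Shield": -0.93}

avg = sum(item_values.values()) / len(item_values)  # average item value
rating = 1 / math.exp(avg)                          # the 1/e^avg rating
```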

6 Conclusions

The proposed methodology utilizes proven techniques such as seeded clustering,
decision trees, and Bernoulli Naive Bayes to extract explainable configurations
from an unreliably labeled case base and limited problem context. The strategy
of utilizing performance data which has a meaningful correlation to both the
configuration data and an existing label to reduce uncertainty is applicable across
a wide spectrum of problems. Additionally, the process for extracting explainable
configurations from a limited case base can certainly be generalized for further
application. We are hopeful that the processes described and exemplified in
this paper lead to further investigations which push the potential of case-based
reasoning towards addressing problems requiring even greater flexibility.

References
1. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann Pub-
lishers Inc., San Francisco, CA, USA (1993)
2. Basu, S., Banerjee, A., Mooney, R.: Semi-supervised clustering by seeding. In: 19th
International Conference on Machine Learning (ICML), pp. 19–26. Sydney, Australia
(2002)
