
This paper is intended as a technical explanation, including possible formulas, for designing a new debate tabulation program based on a matrix optimization algorithm. I have spent a lot of time experimenting with different formulas and have arrived at the ones I think work best. I am, however, not a computer programmer. If anyone wants to develop a program based on these ideas, please do so and share it with the community.

Section 1: How a Matrix Optimization Works

The key idea that must be understood is that current algorithms for pairing, whether done by hand on notecards or by a computer program, are sequential, not simultaneous. For example, in one current version of a power-matching algorithm, debaters are ranked from top to bottom, then matched together two-by-two: first place debates second, third debates fourth, and so on. If a match is prohibited by tournament rules (for example, the first- and second-place debaters are from the same school), then the next debater is picked, and first place debates third. Every current algorithm uses some kind of sequential process.

The limitation is that only one variable can be considered, even if that one variable is often composed of several different measures. For example, a debater's place in the tournament is a variable, which might be composed of the measures win-loss record, speaker points, opponent wins, etc. It is impossible for a sequential algorithm to balance two variables at once, such as geography and skill level. All a sequential algorithm can do is occasionally strike a potential match for violating one criterion or another, e.g. eliminating a match because the two debaters are from the same state.

A simultaneous algorithm can optimize several variables at once. The basic outline of such an algorithm is simple to describe: step 1 is to assign a point value, or score, based on as many variables as desired, to every possible match; step 2 is to pick the pairing that has the highest average point value per match. Step 2 is a well-known and solved problem in computer science (see the Hungarian algorithm). Therefore, the only problem to solve is how to assign a score to every match.

Step 1 generates a matrix, listing debaters by row and by column. In an even, side-constrained round, the teams due Affirmative could be listed in each row, and the teams due Negative by column. In an odd round, the teams can be randomly assigned to either half, or the program could assign teams from the same school to the same side, since they are blocked and cannot debate each other. Using variables stored for each team, the program can populate the matrix with different scores for each possible match. In the case where the teams are blocked, the score should be set to 0. In step 2, the program picks out the optimal set of matches.
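To make the two-step outline concrete, here is a minimal sketch in Python. It is an illustration of the idea under my own assumptions, not a finished tab program: the team data, the stand-in scoring function, and the use of SciPy's linear_sum_assignment (one standard solver for the Hungarian-style assignment step) are all mine, not part of the original proposal.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical team data: (name, school, strength). These names exist only
# so the sketch runs; they do not come from the paper.
aff_teams = [("A1", "North", 0.9), ("B1", "South", 0.6), ("C1", "East", 0.3)]
neg_teams = [("A2", "North", 0.8), ("D1", "West", 0.5), ("E1", "South", 0.4)]

def match_score(aff, neg):
    """Step 1: combine as many variables as desired into one score.
    Here, as a stand-in, closer strengths score higher. Blocked matches
    (same school) get 0, so the optimizer avoids them whenever possible."""
    if aff[1] == neg[1]:
        return 0.0
    return 1.0 - abs(aff[2] - neg[2])

# Build the matrix: rows are teams due Affirmative, columns teams due Negative.
scores = np.array([[match_score(a, n) for n in neg_teams] for a in aff_teams])

# Step 2: pick the full pairing with the highest total (hence highest average)
# score per match. This is the classic assignment problem; SciPy's solver
# minimizes by default, so we ask it to maximize.
rows, cols = linear_sum_assignment(scores, maximize=True)
for r, c in zip(rows, cols):
    print(aff_teams[r][0], "vs.", neg_teams[c][0], "score", scores[r, c])
```

Everything that follows in Section 2 is just a recipe for filling in a scoring function like match_score above.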

Section 2: Particular Formulas for Particular Uses

Formula 1: Strength of Schedule

The problem with a high-high or high-low pairing method is that the strength of schedule can still vary widely, even within a bracket. As evidence, look at the opponent-wins column in the final results of a tournament; within the same bracket, say the 4-2s, opponent wins can vary from the low teens to the high twenties. The problem is that the method matches teams by only one variable (strength: record, points) and cannot consider a second variable (schedule strength) at the same time. This is no problem for a matrix optimization method. For a potential match between teams A and B, the program needs only the strengths of teams A and B, s_A and s_B, the strengths of the prior opponents of team A, {s_C, s_D, s_E, ...}, and those of team B, {s_N, s_O, s_P, ...}. The program then calculates σ_A, the standard deviation of the set {s_B, s_C, s_D, s_E, ...}, and σ_B, the standard deviation of the set {s_A, s_N, s_O, s_P, ...}. Each potential match is scored by

$$ \text{score}_{\text{sos}} = \frac{2}{\frac{1}{\sigma_A} + \frac{1}{\sigma_B}} \tag{1.1} $$

with a higher score being better. Therefore, this method looks for the opponent for team A that most increases the standard deviation of its opponent set (i.e., that most balances out its schedule), and likewise for team B. Using the harmonic mean to combine these two standard deviations ensures that only matches that balance out both teams' schedules can receive high scores. Strength could be measured in any way: wins alone; wins plus a fraction for speaker points; or points for the team's rank order out of the n teams at the tournament, where first place = n points, second = n − 1, ..., and last = n − n = 0. (My preferred way to measure a team's strength is discussed in a later section.) The strengths would need to be updated after every round.
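As an illustration, here is how formula (1.1) might be computed for one candidate match. The numbers and the choice of population standard deviation are my assumptions; the paper does not prescribe an implementation.

```python
from statistics import pstdev

# Hypothetical strengths on a 0-to-1 scale (e.g. rank points divided by n).
s_A, s_B = 0.8, 0.5
prior_opponents_A = [0.9, 0.7, 0.6]   # strengths of A's past opponents
prior_opponents_B = [0.4, 0.3, 0.2]   # strengths of B's past opponents

# sigma_A: spread of A's opponent set if it debates B; likewise for sigma_B.
sigma_A = pstdev(prior_opponents_A + [s_B])
sigma_B = pstdev(prior_opponents_B + [s_A])

# Formula (1.1): harmonic mean of the two standard deviations, so the match
# scores high only if it balances *both* teams' schedules. A real program
# should guard against sigma == 0 (identical opponent strengths).
score_sos = 2 / (1 / sigma_A + 1 / sigma_B)
print(round(score_sos, 4))
```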

Variant A

The method for strength-of-schedule pairings stated above ignores brackets. Indeed, because it is trying to push every team toward an equally balanced schedule, the method stated above will turn a tournament into a kind of partial round robin: each team will face opponents from the top, middle, and bottom brackets. To correct this, an additional factor, |wins_A − wins_B|, needs to be considered. To force within-bracket pairings, the scoring needs to be adjusted so that the best one-bracket pull-up match has a lower score than the worst within-bracket pairing. In this way, the program will choose a pull-up match only if it is necessary because a bracket has an uneven number of teams, and it will then choose the best possible pull-up match.

It is also worth considering multiple-bracket pull-ups. Although these are rare, sometimes they are necessary at small tournaments with only a few participating schools. It is not difficult to have the program consider these multiple brackets, too: the best two-bracket pull-up should have a lower score than the worst one-bracket pull-up, and so on. If strength is turned into a 0.01-to-1 scale (for example, by dividing team rank points by n), then the highest standard deviation possible is 0.5, and therefore the highest possible score for a match under (1.1) is 0.5. The scoring formula could be adjusted to

$$ \text{score}_{\text{sos.brackets}} = \frac{2}{\frac{1}{\sigma_A} + \frac{1}{\sigma_B}} + 0.5 \times (\text{rounds} - |\text{wins}_A - \text{wins}_B|) \tag{1.2} $$

This would shift the scaling up. After 5 rounds, for example, a five-bracket pull-up match would be scored 0.01 to 0.5, a four-bracket pull-up would be scored 0.51 to 1, ..., and a within-bracket match would be scored 2.51 to 3. Within brackets, the program would still try to push teams toward equally balanced schedules. In other words, a 3-1 team with an easy schedule would be paired against a strong 3-1 team, while a 3-1 team with a tough schedule would be paired against a weak 3-1 team. In this way, a strength-of-schedule pairing helps speed the sorting process by ensuring that teams neither coast to good records because of significantly weaker schedules than their bracket's nor suffer poor records because of significantly tougher schedules.
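A short sketch of how (1.2) folds the bracket factor into the score; only the formula comes from the text, and the example numbers are invented.

```python
from statistics import pstdev

def score_sos_brackets(s_A, s_B, opps_A, opps_B, wins_A, wins_B, rounds):
    """Formula (1.2): the strength-of-schedule score of (1.1) plus a bonus
    that shrinks as the two records diverge, so every within-bracket match
    outscores every one-bracket pull-up, every one-bracket pull-up outscores
    every two-bracket pull-up, and so on."""
    sigma_A = pstdev(opps_A + [s_B])
    sigma_B = pstdev(opps_B + [s_A])
    base = 2 / (1 / sigma_A + 1 / sigma_B)   # at most 0.5 on a 0.01-to-1 scale
    return base + 0.5 * (rounds - abs(wins_A - wins_B))

# After 5 rounds: a 3-2 vs. 3-2 match lands in 2.51..3.0,
# while 3-2 vs. 2-3 lands in 2.01..2.5.
print(score_sos_brackets(0.8, 0.5, [0.9, 0.7, 0.6, 0.4], [0.4, 0.3, 0.2, 0.6], 3, 3, 5))
print(score_sos_brackets(0.8, 0.5, [0.9, 0.7, 0.6, 0.4], [0.4, 0.3, 0.2, 0.6], 3, 2, 5))
```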

Formula 2: Geography plus initial rankings

One consideration for some state and national tournaments is geographic mixing. By rule, some tournaments block teams from the same geographic district from meeting. (These blocks can be treated just as within-school blocks are: by assigning those matches a zero.) Other tournaments merely want to encourage geographic mixing. In many cases, state and national tournaments might also like to run a partial round robin, where every team debates some top-, middle-, and bottom-bracket teams (perhaps based on rankings from the regular season). Formula (1.1) outlined above already runs a partial round robin; the measure of strength could come from regular-season points and not vary during the tournament. All that needs to be added is a measure of geographic diversity.

The key consideration is the distance between teams. If geographic districts are used, teams within a district could be assigned a distance of zero; teams in adjacent districts, a distance of 1; and teams in non-adjacent districts, a distance of 2. Alternatively, the program could use each school's ZIP code, find longitude and latitude data, and assign the actual distance between two teams. No matter how distance is measured, the method of assigning a score for geographic diversity is the same. For a potential match of team A and team B, where team A had the prior opponent C and team B had the prior opponent D, the program would look up {d_AB, d_AC, d_AD, d_BC, d_BD}, the distance between each combination of teams listed. The average distance created by this match would be

$$ d_{\text{avg}} = \frac{6}{\frac{2}{d_{AB}} + \frac{1}{d_{AC}} + \frac{1}{d_{AD}} + \frac{1}{d_{BC}} + \frac{1}{d_{BD}}} \tag{2.1} $$

where a higher score indicates more geographic spread. The use of the harmonic mean ensures that only matches where all distances are long receive high scores. This factor could be used in a tournament on its own as a way to pair rounds, especially for rounds 1 and 2, or it could be combined with a factor for strength of schedule.
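A small sketch of (2.1) with made-up district distances. In the district scheme a zero distance would make a term blow up and push the harmonic mean to zero, which matches the intent that same-district matches score worst; the epsilon guard below is my addition to avoid a literal division by zero.

```python
EPS = 1e-9  # my addition: avoids dividing by zero for same-district pairs

def d_avg(d_AB, d_AC, d_AD, d_BC, d_BD):
    """Formula (2.1): weighted harmonic mean of the five relevant distances.
    The match distance d_AB is counted twice, hence the weight of 2 on its
    term and the total weight of 6 in the numerator."""
    terms = (2 / max(d_AB, EPS) + 1 / max(d_AC, EPS) + 1 / max(d_AD, EPS)
             + 1 / max(d_BC, EPS) + 1 / max(d_BD, EPS))
    return 6 / terms

print(d_avg(2, 1, 2, 2, 1))   # well-mixed match: high score
print(d_avg(0, 2, 2, 2, 2))   # same-district match: score collapses toward 0
```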

Formula 3: Judges

The most exciting application of matrix optimization is judge assignment. If the tournament has collected each team's judging preferences, these can be used to assign each debate a mutually preferred judge. After the debate matches have been paired, the program would build a matrix of each debate versus each judge. If team A gave judge j a preference of j_A and team B gave her a preference of j_B (usually on a scale of 1 = most preferred to 6 = strike), then the score for debate AB to be given judge j is

$$ \text{judgescore} = \frac{1}{j_A^2 + j_B^2} \tag{3.1} $$

where a higher score indicates a more mutually preferred judge. If a tournament has surplus judges, then several blank rounds will need to be added to make the matrix square. These blank rounds should be scored by

$$ \text{judgescore} = \frac{1}{3 + \text{rds}_{\text{remain}}} \tag{3.2} $$

the inverse of (three plus) the rounds of commitment a judge has remaining. This will put the judges with the fewest rounds of commitment remaining into a round off, and it will push judges with the most rounds of commitment into actual debates. Finally, judges who have already seen a team or are blocked from seeing a team should receive a 0 for that round. This needs to be updated during the tournament.

Variant A

The above method considers the preferences of every team equally. If the goal is to give higher priority to teams in contention for break rounds, then the program needs to include a multiplier in (3.1) to inflate the judge scores for these teams. For example, this scheme could work: either team has two losses = 4 points; either team has one loss but neither has two = 3 points; both teams are undefeated = 2 points; and neither team can make elimination rounds = 1 point. Formula (3.1) would be modified to

$$ \text{judgescore} = \frac{\text{multiplier}}{j_A^2 + j_B^2} \tag{3.3} $$
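Here is a minimal sketch of the judge matrix under formulas (3.1) through (3.3), again using SciPy's assignment solver. The preference data, multipliers, and commitments are my own toy values; only the scoring rules come from the text.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical prefs: prefs[debate][judge] = (j_A, j_B) on the 1-6 scale,
# with 6 treated as a strike. Rows are debates, columns are judges.
prefs = [[(1, 2), (3, 3), (6, 1)],
         [(2, 2), (1, 4), (2, 3)]]
multiplier = [4, 2]          # Variant A: bubble debate = 4, undefeateds = 2
rds_remain = [1, 3, 5]       # rounds of commitment each judge has left

n_debates, n_judges = len(prefs), len(rds_remain)
scores = np.zeros((n_judges, n_judges))  # pad with blank rounds to stay square

for d in range(n_debates):
    for j in range(n_judges):
        j_A, j_B = prefs[d][j]
        # Strikes (and, in a real program, blocked judges or judges who have
        # already seen a team) score 0; otherwise apply (3.3), which reduces
        # to (3.1) when the multiplier is 1.
        scores[d][j] = 0 if 6 in (j_A, j_B) else multiplier[d] / (j_A**2 + j_B**2)

for d in range(n_debates, n_judges):     # blank rounds, formula (3.2)
    for j in range(n_judges):
        scores[d][j] = 1 / (3 + rds_remain[j])

rows, cols = linear_sum_assignment(scores, maximize=True)
for d, j in zip(rows, cols):
    label = f"debate {d}" if d < n_debates else "round off"
    print(label, "-> judge", j)
```

Because blank rounds score highest for the judge with the fewest rounds remaining, that judge is the one who sits out.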

Variant B

Of course, it is also possible to accomplish similar goals without teams' judge preferences. If the tournament ranks all the judges on an experience scale, say 10 points for very experienced down to 0 points for a novice judge, then judge scores for each debate can be assigned by

$$ \text{judgescore} = \text{multiplier} \times \text{experience} \tag{3.4} $$

It would make no sense to assign these scores with the experience variable alone, since then every round a judge could see would have the same score.

Variant C

Perhaps the most exciting option is that all debates in all divisions of all debate events could be considered simultaneously. The program would create a matrix with all debate matches and all judges. The new additional variable would be the judge's appropriateness for each division of each event. Perhaps an inexperienced judge would be given a score of 10 (most appropriate) for novice Lincoln-Douglas debate but a score of 0 (inappropriate) for varsity policy debate. This is, in essence, the same thing as an experience score, but given for every division and event. The program would use judges in the most appropriate divisions and events where they can still judge debates, but if they are blocked against all the teams in that division or event, then the program would seamlessly slip them into the next best division or event. To make the formula complete, the program would also need to consider which division and event has the highest priority for appropriate judges. Given the speed and technical demands, the highest-priority event would most likely be policy debate. The formula would then be
$$ \text{judgescore} = \text{priority}_{\text{div.ev}} \times \text{appropriateness} \tag{3.5} $$
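A brief sketch of how (3.5) might populate one all-divisions matrix; the priority numbers and appropriateness ratings below are invented examples, not values from the text.

```python
# Hypothetical priorities per division/event and per-judge appropriateness
# ratings (0 = inappropriate .. 10 = most appropriate), per formula (3.5).
priority = {"varsity policy": 3.0, "novice LD": 1.0}
appropriateness = {
    "judge1": {"varsity policy": 10, "novice LD": 6},
    "judge2": {"varsity policy": 0, "novice LD": 10},  # inexperienced judge
}

def judgescore(division, judge):
    return priority[division] * appropriateness[judge][division]

print(judgescore("varsity policy", "judge1"))  # 30.0: used where it matters most
print(judgescore("novice LD", "judge2"))       # 10.0: slotted into novice LD
```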

Formula (3.5) could be combined with the multiplier for break rounds or even with mutual preference scores, as long as those were used in all divisions of all events.

Variant D

The same method can be used for assigning judging panels. The judges would be assigned one by one, as sketched below. If the tournament uses three-judge panels, the program would divide the judges into three pools and a fourth set of unused judges. The pools could be divided in any number of ways: geography, experience, stylistic preferences, sex, age, etc. Each pool would be assigned using the matrix optimization method.

Formula 4: IEs

It is also possible to use matrix optimization to set up IE (individual events) panels using multiple variables. The IE competitors would need to be added to the panel one by one.
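Here is a sketch of the pool-by-pool panel assignment described in Variant D; the same one-seat-at-a-time pattern would apply to IE panels. The pools and the placeholder scoring function are my assumptions, not part of the original proposal.

```python
import random
import numpy as np
from scipy.optimize import linear_sum_assignment

random.seed(0)  # deterministic placeholder scores for the sketch

debates = ["AB", "CD", "EF"]
# Three hypothetical pools; a real program might split judges by geography,
# experience, stylistic preference, etc.
pools = [["exp1", "exp2", "exp3"],
         ["mid1", "mid2", "mid3"],
         ["new1", "new2", "new3"]]

def seat_score(debate, judge):
    """Placeholder: a real program would score with (3.1)/(3.3) and so on,
    returning 0 for blocks and for judges unsuitable for this debate."""
    return random.random()

# Assign one seat per debate from each pool in turn, so every panel ends up
# with exactly one judge from each pool.
panels = {d: [] for d in debates}
for pool in pools:
    scores = np.array([[seat_score(d, j) for j in pool] for d in debates])
    rows, cols = linear_sum_assignment(scores, maximize=True)
    for r, c in zip(rows, cols):
        panels[debates[r]].append(pool[c])

print(panels)
```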

Section 3: A Better Measure of Team Strength

The better measure of team strength I have come up with is weighted wins. If team A defeats teams B, C, and D, team A's weighted wins would be 3 + wins of B + wins of C + wins of D. In this way, team A earns additional wins for the strength of its defeated opponents. If team A loses to teams E and F, team A's weighted losses would be 2 + losses of E + losses of F. Team A earns additional losses if it loses to weak opponents. The ratio

$$ \frac{\text{w.wins}}{\text{w.wins} + \text{w.losses}} $$

measures a team's strength on a 0-to-1 scale. The weighted-wins ratio is NOT the same thing as opponent wins. Opponent wins measures the strength of a team's schedule; weighted wins considers the strength of opponents to adjust the measure of a team's strength.
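A short sketch of the weighted-wins ratio from toy results; the record format is my assumption, while the arithmetic follows the definition above.

```python
# Hypothetical results: wins[t] and losses[t] are each team's raw record;
# beat[t] and lost_to[t] list the opponents team t defeated and lost to.
wins = {"A": 3, "B": 2, "C": 1, "D": 3, "E": 0, "F": 2}
losses = {"A": 2, "B": 3, "C": 4, "D": 2, "E": 5, "F": 3}
beat = {"A": ["B", "C", "D"]}
lost_to = {"A": ["E", "F"]}

def weighted_strength(team):
    """Weighted-wins ratio: w.wins / (w.wins + w.losses), on a 0-to-1 scale."""
    w_wins = len(beat[team]) + sum(wins[o] for o in beat[team])
    w_losses = len(lost_to[team]) + sum(losses[o] for o in lost_to[team])
    return w_wins / (w_wins + w_losses)

# Team A: w.wins = 3 + (2 + 1 + 3) = 9, w.losses = 2 + (5 + 3) = 10,
# so the ratio is 9 / 19, roughly 0.474.
print(round(weighted_strength("A"), 3))
```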
