
Computer Science Department, Boston University, Boston MA

[bahargam, edori, best, evimaria]@cs.bu.edu

arXiv:1703.08762v1 [cs.AI] 26 Mar 2017

ABSTRACT
Whether teaching in a classroom or in a Massive Open Online Course, it is crucial to present the material in a way that benefits the audience as a whole. We identify two important tasks towards this objective: (1) group students so that they can maximally benefit from peer interaction, and (2) find an optimal schedule of the educational material for each group. In this paper, we solve this combined problem of team formation and content scheduling for education. Given a time frame d, a set of students S with their required amounts of learning for different activities T, and a number k of desired groups, we study the problem of finding k groups of students. The goal is to teach the students within the time frame d such that their potential for learning is maximized, and to find the best schedule for each group. We show this problem to be NP-hard and develop a polynomial-time algorithm for it. We show our algorithm to be effective on both synthetic and real data sets. For our experiments, we use real data on students' grades in a Computer Science department. As part of our contribution, we release a semi-synthetic dataset that mimics the properties of the real data.

Keywords
Team Formation; Clustering; Partitioning; Teams; MOOC; Large Classes

1. INTRODUCTION
Education has always been regarded as one of the most important tasks of society. Nowadays it is viewed as one of the best means to bridge the societal-inequality gap and to help individuals achieve their full potential. Accordingly, much work has been dedicated to studying how individuals learn and what the best way to teach them is (see [10, 5] for an overview). We recognize two substantial conclusions that studies in this area draw on how to improve students' learning outcomes. First, the use of personalized education: by shaping the content and delivery of the lessons to the individual ability and needs of each student, we can enhance the learning of students. Second, working in teams with their peers helps students to access the material from a different viewpoint as well [2, 6, 39, 27, 38].

In this paper we study the problem of creating personalized educational material for teams of students from a computational perspective. More specifically, we focus on two problems: the first is how to identify the right schedule for a group of students, when the group is formed a priori. The second is how to partition a set of students into groups and design personalized schedules per group so that the benefit of the students, in terms of how much they learn and absorb, is maximized.

A significant amount of work has been carried out on designing personalized educational content, such as [29] in the context of online education services and, most notably, on designing personalized schedules by Novikoff et al. [32], which has inspired our current work. Team formation in education is another well-studied area [2, 14, 31], and it has been shown that students can improve their abilities through interaction and communication with other team members [34].

However, to the best of our knowledge, we are the first to formally define and study the two problems of team formation and personalized scheduling for teams in the context of education. Our contribution is therefore to present formal definitions of the aforementioned problems, study their computational complexity, and design algorithms for solving them. In addition, we apply our algorithms to a dataset obtained from real students. We make our semi-synthetic dataset BUCSSynth, generated to faithfully mimic the real student data, available on our website.

Roadmap: The rest of the paper is organized as follows. After reviewing the related work in Section 2, we define our framework and setting in Section 3. In Section 4 we define the Group Schedule problem. In Section 5 we formally define Cohort Selection and show its computational complexity; in the same section, we present our CohPart algorithm to solve Cohort Selection. In Section 6 we show the usefulness of CohPart on synthetic and real-world datasets. Finally, we conclude the paper in Section 7.

2. RELATED WORK

Our problem is related to psychology, education and computer science, including ability grouping, repetition in learning, and team formation. We review some of these works here.

Ability grouping: The majority of studies in this area find that, over the whole population, there is a definite gain in academic performance due to ability grouping [17, 39, 23, 24, 21, 9]. Most studies agree that there is a high increase in learning for students in high-ability groups. Some report only a small gain, while others report zero gain, for low-ability groups; but even in this case, the gain to the medium- and high-ability groups counters these negative effects. The benefit of grouping on students' attitudes has also been studied in [23]: the authors show that students in grouped classes developed more positive attitudes toward the subjects they were studying than did students in ungrouped classes.

Repetition in learning: Repetition has long been regarded as essential in learning. When learning a new activity for the first time, new information is gained and stored in short-term memory. This information is lost over time when there is no attempt to retain it [33, 36, 19, 11, 1, 16, 15]. Repetition in learning and the spacing effect have even been studied in computer science [32]. In that work the authors try to optimize a single student's learning in light of Ebbinghaus's work. They model the education process as a sequence of abstract units that are repeated over time. However, they do not consider the importance of having a deadline (e.g., to prepare for a test), nor the fact that after enough repetitions the information moves to long-term memory and there is negligible gain from further repetition.

Team formation: An earlier version of this study appeared in [7]. Team formation has been studied in the operations research community [8, 13, 41, 42], which defines the problem as finding an optimal match between people and demanded functional requirements. It is often solved using techniques such as simulated annealing, branch-and-cut or genetic algorithms [8, 41, 42]. It has also been studied in computer science [2, 3, 20, 26, 30, 35, 4]. The majority of these works focus on forming a team to complete a task while minimizing the communication cost among team members; their focus is on finding only one team to perform a given task. [2] considers partitioning students, where each student has a single ability level for all activities and each team has a set of leaders and followers; the goal there is to maximize the gain of students, defined as the number of students who can do better by interacting with higher-ability students. Our problem differs in that we consider different ability levels for different activities.

3. PRELIMINARIES
Already Aristotle said that "it is frequent repetition that produces a natural tendency". The fundamental basis of our work is the realization that repetition is an essential part of learning: engaging with a topic multiple times (1) deepens and hastens students' engagement and understanding processes [11, 40]. In this paper we focus on developing optimal schedules for teaching groups of students (e.g., classes) that observe this dependency of learning quality on the reiteration of topics. We model a student's learning process as a sequence of topics that she learns about. In this sequence, topics may appear multiple times, and repetitions of a topic may count with different weights towards the overall benefit of the student.

(1) For example, learning about a topic multiple times, reiterating it, possibly in different formats or from different viewpoints.

Let S = {s1, s2, . . . , sn} be a set of students and T = {t1, t2, . . . , tm} be a set of topics. We assign topics to d timeslots based on two very simple rules: only one topic can be assigned to each timeslot, but the same topic can appear in multiple slots. A schedule A is a collision-free assignment of topics to the timeslots; A can be thought of as an ordered list of the topics, possibly with multiple occurrences. For a topic t ∈ T, the tuple (t, i) denotes the ith occurrence of t in a schedule. The notation A[r] = (t, i) refers to the tuple (t, i) that is assigned to timeslot r in A.

Topic requirements: For every student s ∈ S and topic t ∈ T there is a number of times that s has to hear about t in order to learn every aspect of this topic. We call this number the requirement of s on t and denote it by the integer function req(s, t).

Benefits from topics: In order for a student s to be fully knowledgeable about topic t, he has a requirement to hear about t req(s, t) times. We assume that until s has met this requirement, he gains knowledge and hence benefits to some extent from every repetition of t. After req(s, t) repetitions of t, while there is no detriment, there is also no additional benefit to s from hearing about t. We call b(s, (t, i)) (Equation (1)) the benefit of s from topic t when hearing about it for the ith time. We assume that s benefits equally from each of the first req(s, t) occurrences of t in A, thus b(s, (t, i)) = 1/req(s, t) if i ≤ req(s, t). After this point s has already mastered topic t, so there is no additional benefit from any later repetition of t and hence b(s, (t, i)) = 0:

    b(s, (t, i)) = 1/req(s, t)  if i ≤ req(s, t),  and 0 otherwise.    (1)

Note that for ease of exposition, we assume that all repetitions of t before req(s, t) carry equal benefit to s. However, the definition and all of our later algorithms could easily be extended to use some other function b_r(s, (t, i)). A natural choice, for example, is a function where earlier repetitions of t carry higher benefit than later ones, e.g., b_r(s, (t, i)) = 1/2^i; the intuition is that first you learn the fundamentals of t and later you add on additional information.

Given the benefits b(s, (t, i)), there is a natural extension to define the benefit B(s, A) that s gains from a schedule A. This benefit is simply a summation over all timeslots in A:

    B(s, A) = Σ_{r=1}^{d} b(s, A[r])    (2)

where A[r] is the tuple (t, i) that is scheduled at timeslot r in A. Observe that every time topic t appears in the schedule A, it contributes the same amount of benefit towards B(s, A), regardless of the exact timeslot allocation within A.

A.

4. THE GROUP SCHEDULE PROBLEM
Let P ⊆ S be a set of students; we refer to P as a group. The notion of the benefit of a schedule A to a single student s lends itself to a straightforward extension to the benefit of a group. We define the benefit B(P, A) that group P gains from A in Equation (3) as the sum of the benefits over all students in P:

    B(P, A) = Σ_{s ∈ P} Σ_{r=1}^{d} b(s, A[r])    (3)

Given a group P, our first task is to find an optimal schedule for this group, that is, a schedule that maximizes the group benefit of P. We call this the Group Schedule problem (Problem 1).

Problem 1 (Group Schedule). Let P be a group of students and T be a set of topics. For every s ∈ P and t ∈ T, let req(s, t) be the requirement of s on t, given for every student-topic pair. Find a schedule AP such that B(P, AP) is maximized for a deadline d.

There is a polynomial-time algorithm that solves Problem 1. We call this algorithm Schedule(P, d) and present it in Algorithm 1.

The requirements for a topic may be different for the different students in P. We say that the marginal benefit m(P, (t, i)) of the ith repetition of t (thus (t, i)) to P is the increase in the group benefit if (t, i) is added to A. The marginal benefit of (t, i) can be computed as the sum of benefits over all students in P, as given in Equation (4):

    m(P, (t, i)) = Σ_{s ∈ P} b(s, (t, i))    (4)

While all students benefit from repetitions of t until they meet their requirements for t, subsequent repetitions of the same topic may have different (decreasing) marginal benefits.

Algorithm 1 greedily fills the schedule: for each timeslot it chooses an instance of the topic with the largest marginal benefit. To achieve this we maintain an array B whose values are the marginal benefits of topics t ∈ T if the next repetition of t is added to the schedule AP. We keep the number of times that topic t has been added to AP in an array R. The algorithm repeats until all d timeslots in the schedule are filled: it selects the topic ut with the largest marginal benefit from B and adds it to the schedule AP (Lines 5 and 6). Then it updates the marginal benefit of ut, B[ut] (Lines 7-8).

Algorithm 1 Schedule: computing an optimal schedule AP for a group P.
Input: requirements req(s, t) for every s ∈ P and every topic t ∈ T; deadline d.
Output: schedule AP.
1: AP ← []
2: B[t] ← m(P, (t, 1)) for all t ∈ T
3: R[t] ← 1 for all t ∈ T
4: while |AP| < d do
5:   find the topic ut with maximum marginal benefit in B
6:   append (ut, R[ut]) to AP
7:   R[ut] ← R[ut] + 1
8:   update B[ut] to m(P, (ut, R[ut]))
9: end while

The runtime is best analyzed from the point of view of computing the marginal benefits of the topics in B. In each iteration of the loop, the marginal benefit is recomputed for only one topic: the topic ut with the largest benefit, which has been added to the schedule AP most recently. The total runtime of the algorithm is O(d(|P| + |T|)). If we keep the marginal benefits in a max-heap, we can reduce the running time to O(d(|P| + log |T|)).

Algorithm 1 yields an optimal schedule for a group P, i.e., it yields maximal benefit B(P, A) for the group. The marginal benefit of adding repetition (t, i) of topic t to A depends only on i and t, but not on any other topic. Hence the choices that we make in Algorithm 1 in any iteration do not change the marginal benefit m(P, (t, i)). Thus choosing the topic with the largest marginal benefit in every iteration of Algorithm 1 results in a schedule with maximal total benefit for the group.

5. COHORT SELECTION
So far we discussed how to find an optimal schedule of topics for a given group of students. The next natural question is, given a certain teaching capacity K (i.e., there are K teachers or K classrooms available), how to divide the students into K groups so that each student benefits as much as possible from this arrangement. In this section we investigate the problem of how to divide students into such groups and assign schedules to each group, so as to maximize the benefit of the students in every group.

This is a partitioning problem: we have to find a K-part partition P = P1 ∪ P2 ∪ . . . ∪ PK of the students into groups, so that the sum of the group benefits over all groups is maximized. We call the problem of finding a partition that yields the highest sum of group benefits the Cohort Selection problem.

Problem 2 (Cohort Selection). Let S be a set of students and T be a set of topics. For every s ∈ S and t ∈ T, let req(s, t) be the requirement of s on t, given for every student-topic pair. Find a partition P of the students into K groups such that

    B(P, d) = Σ_{P ∈ P} B(P, AP)    (5)

is maximized, where AP is the optimal schedule (for deadline d) for every group.

Theorem 1. Cohort Selection (Problem 2) is NP-hard.
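The greedy strategy of Algorithm 1 can be sketched in Python as follows. This is an illustrative re-implementation under our own naming (not the authors' code), using the max-heap variant mentioned in the runtime discussion; `req` maps (student, topic) pairs to required repetitions.

```python
import heapq

def schedule(req, P, topics, d):
    """Sketch of Schedule(P, d): greedily add the occurrence (t, i) with the
    largest marginal benefit m(P, (t, i)) = sum over s in P of b(s, (t, i))."""
    def marginal(t, i):
        # Each student who still needs an i-th hearing of t gains 1/req(s, t).
        return sum(1.0 / req[(s, t)] for s in P if i <= req[(s, t)])

    # Max-heap via negated keys, as in the O(d(|P| + log|T|)) variant.
    heap = [(-marginal(t, 1), t, 1) for t in topics]
    heapq.heapify(heap)
    A = []
    while len(A) < d:
        neg_m, t, i = heapq.heappop(heap)
        A.append((t, i))                            # schedule occurrence i of t
        heapq.heappush(heap, (-marginal(t, i + 1), t, i + 1))
    return A

# Two students who both need t1 once and t2 twice; deadline d = 3.
req = {("s1", "t1"): 1, ("s1", "t2"): 2, ("s2", "t1"): 1, ("s2", "t2"): 2}
print(schedule(req, ["s1", "s2"], ["t1", "t2"], 3))
# [('t1', 1), ('t2', 1), ('t2', 2)]
```

Because each marginal benefit depends only on the topic and its occurrence index, the greedy choice is never invalidated later, which is exactly the optimality argument above.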

Proof. We reduce the catalog segmentation problem [22] to the Cohort Selection problem. Catalog segmentation is the following problem: there is a universe of items U = {u1, u2, . . . , um} and given subsets S1, S2, . . . , Sn ⊆ U. Find two subsets X and Y of U, both of size |X| = |Y| = r, such that

    CS(X, Y) = Σ_{i=1}^{n} max{|Si ∩ X|, |Si ∩ Y|}    (6)

is maximized. It is proven by Kleinberg et al. [22] that catalog segmentation is NP-hard.

We now show that if we can solve Cohort Selection then we can also solve the catalog segmentation problem. More specifically, we map an instance of catalog segmentation to an instance of Cohort Selection as follows: every subset Si in catalog segmentation is mapped to a student si in Cohort Selection, and every element ui ∈ U of the universe of catalog segmentation is mapped to a topic ti in Cohort Selection. For student si and topic tj we set the requirement req(si, tj) = 1 if uj ∈ Si, and otherwise req(si, tj) = nm^3. We also set d = r and K = 2.

We can also map a solution of Cohort Selection to a solution of catalog segmentation and vice versa. Let P = {X, Y} be a partition of the students S in Cohort Selection, and let AX and AY be the optimal schedules for X and Y. We define the sets X and Y in catalog segmentation from AX and AY. Specifically, let {t^x_1, t^x_2, . . . , t^x_r} be the topics (possibly with multiplicity) that appear in AX. Then we define X = {u_{t^x_1}, u_{t^x_2}, . . . , u_{t^x_r}} to contain the elements in U corresponding to the topics in AX; Y is derived in a similar way from AY.

Given a solution X and Y to catalog segmentation, we can define the partition P = {X, Y} and the corresponding group schedules AX and AY. For every s ∈ S we assign s to X if |S ∩ X| > |S ∩ Y| and assign s to Y otherwise, where S is the set in catalog segmentation that we identified with student s. Further, the group schedule AX is the schedule that contains topic t if and only if ut ∈ X; similarly, AY contains topic t if and only if ut ∈ Y.

We show that if P = {X, Y} is an optimal solution to Cohort Selection, then the corresponding solution X, Y has to be an optimal solution to catalog segmentation. First, observe that because of the choice of the requirements in Cohort Selection, if B is the value of a solution to Cohort Selection, then the value of catalog segmentation(X, Y) is ⌊B⌋; further, ⌊B({X, Y}, d)⌋ = catalog segmentation(X, Y), where P = {X, Y} is derived from X and Y. Let us assume that P = {X, Y} is an optimal solution to Cohort Selection, but the derived X and Y are not optimal for catalog segmentation. That means there exist X′ and Y′ such that catalog segmentation(X, Y) < catalog segmentation(X′, Y′). However, in this case the partition P′ = {X′, Y′} with the schedules A_{X′} and A_{Y′} derived from X′ and Y′ would yield a higher value for the Cohort Selection problem, contradicting the optimality of P.

5.1 Partition algorithms
We first briefly describe two popular algorithms for clustering, K_means and Random Partitioning, and how they apply to our problem. Then we proceed to present our solution to Cohort Selection, CohPart, and a sampling-based speedup, CohPart_S.

Random Partitioning assigns each point randomly to a cluster. We use this partitioning as a baseline to compare our algorithm with; we also use it as the initialization step of our CohPart algorithm.

K_means is a clustering method used to minimize the average squared distance between points in the same cluster. Solving the K_means problem [18] exactly is NP-hard. Lloyd's algorithm [28] solves this problem by choosing k centers randomly and assigning the points to the closest center; the centers are then recomputed based on the points assigned to them. These two phases are repeated until there is no more improvement in the cost of the clustering. In our setting the students are the data points and the repetitions for each topic represent the dimensions.

CohPart algorithm: The CohPart algorithm (Cohort Partitioning) is presented in Algorithm 3 and consists of two phases. First there is an initialization phase (Lines 3-6), in which a random clustering is executed on all of the students (Line 3) and then, for each partition Pi, the centers are computed (Lines 4-6) using Algorithm 1. When the initial cluster centers are chosen, there is an iterative phase (Lines 7-14) where students get reassigned to clusters and the cluster centers are updated again.

In our notation, A and R both represent schedules (of a group of students or of a single student). A is the vector of size d consisting of topics and their repetitions (t, R[t]), one per time slot. R is a vector of size |T| stating, for each topic t, how many times it is repeated within the deadline d.

Algorithm 2 Benefit: computing the benefit of a single student s from a schedule R.
Input: requirements req(s, t) for a student s ∈ P and every topic t ∈ T; a single schedule R.
Output: b(s, R), the benefit of s from schedule R.
1: b ← 0
2: for all topics t ∈ T do
3:   b ← b + min(req(s, t), R[t]) / req(s, t)
4: end for
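Algorithm 2 admits a direct translation. A minimal sketch follows (names ours); note that we use req(s, t) as the denominator of the per-topic term, which is the choice consistent with Equation (1).

```python
# Sketch of Algorithm 2 (Benefit); function and variable names are ours.

def student_benefit(req, s, R):
    """b(s, R): benefit of student s from a count-vector schedule R,
    where R[t] says how many times topic t is repeated before the deadline.
    Per topic: min(req(s, t), R[t]) / req(s, t), consistent with Equation (1)."""
    return sum(min(req[(s, t)], reps) / req[(s, t)] for t, reps in R.items())

req = {("s1", "t1"): 2, ("s1", "t2"): 4}
R = {"t1": 3, "t2": 2}   # t1 over-scheduled, t2 only half-covered
print(student_benefit(req, "s1", R))   # min(2,3)/2 + min(4,2)/4 = 1.0 + 0.5 = 1.5
```

The `min` caps the benefit once the requirement is met, mirroring the zero branch of Equation (1).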

Runtime: CohPart is a heuristic for solving the Cohort Selection problem. In each iteration of the algorithm, the group from which each student can benefit the most is found, and the student is assigned to that group; this takes O(K|T|) per student. Then the schedule of each group is updated, and the algorithm iterates until convergence is achieved. The total running time of each iteration is O(K|S||T|). In our experiments we observed that our algorithm converges very fast, within a few tens of iterations.
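The iteration just described can be sketched end to end as follows. This is a hypothetical, self-contained re-implementation under our own naming, not the authors' code; group schedules are kept as count vectors so they can be scored with the Algorithm 2 benefit.

```python
import random
from collections import Counter

def group_schedule(req, P, topics, d):
    """Greedy group schedule as a count vector R (cf. Algorithm 1):
    repeatedly add the topic with the largest marginal benefit."""
    R = Counter()
    for _ in range(d):
        def marginal(t):
            return sum(1.0 / req[(s, t)] for s in P if R[t] + 1 <= req[(s, t)])
        R[max(topics, key=marginal)] += 1
    return R

def student_benefit(req, s, R):
    """b(s, R) per Algorithm 2, with req(s, t) as the denominator."""
    return sum(min(req[(s, t)], R[t]) / req[(s, t)] for t in R)

def cohpart(req, students, topics, d, K, max_iters=50, seed=0):
    """CohPart sketch: random initialization, then alternate between
    (i) reassigning each student to the group whose schedule benefits
    them most and (ii) recomputing each group's schedule."""
    rng = random.Random(seed)
    groups = [[] for _ in range(K)]
    for s in students:                       # Random Partitioning initialization
        groups[rng.randrange(K)].append(s)
    for _ in range(max_iters):
        centers = [group_schedule(req, P, topics, d) for P in groups]
        new_groups = [[] for _ in range(K)]
        for s in students:                   # O(K|T|) per student
            j = max(range(K), key=lambda i: student_benefit(req, s, centers[i]))
            new_groups[j].append(s)
        if new_groups == groups:             # converged
            break
        groups = new_groups
    return groups

students = ["s1", "s2", "s3", "s4"]
req = {("s1", "t1"): 1, ("s1", "t2"): 10, ("s2", "t1"): 1, ("s2", "t2"): 10,
       ("s3", "t1"): 10, ("s3", "t2"): 1, ("s4", "t1"): 10, ("s4", "t2"): 1}
groups = cohpart(req, students, ["t1", "t2"], d=1, K=2)
```

The example gives students s1, s2 and s3, s4 opposite requirements, so a good two-way partition can serve each pair a tailored one-slot schedule.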

Algorithm 3 CohPart for computing the partition P

based on the benefit of students from schedules. tionnaires. In IRT models the probability of giving a

Input: requirement req(s, t) for every s S and t correct answer by a student to a question is determined

based on the ability of student and the difficulty of the

T,

number of timeslots d, number of groups K. question. For our work we used the Graded Response

Model (GRM), an advanced IRT model which fits our

Output: partition P.

1: C = data well and handles partial credit values. Using our

2: P = {P1, P2, . . . , PK} data on grades of students for taken courses, GRM

helps us to deduce ability scores for each student and

3:Run Random Partitioning on the students and

obtain difficulty scores for each course. Having these score

Pi s parameters, then we can generate the missing grades

4: for i = 1, . . . , K do for courses that a student did not take. We also used

5: ci = Schedule(Pi , GRM to obtain a model to generate a larger dataset,

d) 6: end for i.e. BUCSSynth.

7: while convergence is achieved do

8: for all students s S do 6.1 Datasets

9: Pi s, such that i = argmaxj=1,...,k b(s, cj ) This subsection describes each dataset and their

10: end for attributes.

11: for i = 1, . . . , K do

12: ci = Schedule(Pi ,

d) 13: end for BUCS data: The original BUCS dataset consists of

14: end while grades of students in CS courses at Boston University.

This data was collected from Fall 2003 to Fall 2013.

Each row of data looked like: FALL 2003, CS101,

CohPart_S algorithm. The CohPart_S (Cohort U12345, U1, C+ which shows the semester year, course

Partition- ing with Sampling,) resembles CohPart number, students BU id, undergraduate/graduate year

except that it per- and the grade. It consists of 9833 students. We only

forms clustering on a random sample of students of size considered students who were tak- ing CS330 and CS210

nr (required courses to obtain a major in CS) which

and when clustering is finished assigns the remaining stu- consisted of 398 students and 41 courses. Here the

dents to the cluster with the maximum benefit b(s, cj ). courses correspond to topics. Obviously the new dataset

It reduces the running time to O(knr|T|) . We set nr = k had some missing values, not all 41 courses were taken

c

for different values of c. by those 398 students. To fill the grades for missing

(student, course) pairs, we used GRM. First using GRM,

5.2 Constraints on Topic Order we obtained the ability and difficulty parameters for all

In real-life, most often we cannot pick any scheduling of students and all courses. The abilities 2 and difficulties

top- ics we like. Instead, there are strict precedence parameters are avail- able online3. Then for each pair of

constraints among the topics. For example, one has to (student, course) in which student s did not take course

learn addition be- fore he can learn about multiplication c, we used the ability of s and difficulty of c to predict

during a math course. Therefore, we assume that along the grade of course c for that student. After having all

with the topics, a set of con- straints is also given. The

grades for all courses, we had to transform these grades

constraints can be simple ones, such as the first

occurrence of topic ti has to be before topic tj , or more to the number of required repetitions to learn a course.

complicated ones, topic tj can only be scheduled after at We assumed the number of required repe- tition to master

least r1 repetitions of ti1 and r2 repetitions of ti2 . Of a course (or topic) for the smartest student is 5 (base

course, the set of constraints can also be empty, if we do parameter). Note that throughout a semester students

not have any of them. We can easily modify algorithm review the course materials to solve homework, do

1 to take into account these constraints and check for project and prepare for quizzes, midterm and final exams,

prece- dence constraints. To achieve this, after line 4 we so they review material for at least 5 times. Thus for

can check for precedence constraints and in line 5 we

choose only the topics which their precedence students who got A, we considered 5 repetitions needed

constraints are met. to fully mas- ter the course and as the ability (and grade)

drops, number of repetition goes up (step parameter). We

6. EXPERIMENTS also tried differ- ent base and step values for our

experiments.

The goal of these experiments is to gain an

understanding of how our clustering algorithm works in

terms of performance (objective function). BUCSSynth data: In order to see how well our

Furthermore, we want to understand how the deadline algorithm scales to a larger dataset, we generated a

parameter impacts our algorithm. We used a real world synthetic data, based on the obtained parameters from

dataset, semi synthetic and synthetic datasets. The GRM. We call this dataset BUCSSynth. From BUCS

semi synthetic dataset and the source code to generate dataset, we observed that the ability of students follows

it are available in our website. We first introduce Graded a normal distribution with

Re- sponse Model (GRM) briefly, then explain different = 1.13 and = 1.41. Applying GRM to BUCS data,

datasets and finally show how well our algorithm is we obtained difficulty parameters for 41 courses. In

doing on each dataset. order to obtain difficulties for 100 courses, we used the

following approach:

1. Choose one of the 41 courses at random.

2. Use density estimation, smoothing and then get

the CDF of the difficulties.

Item Response Theory and Graded Response Model:

In psychometric, Item Response Theory (IRT) is a frame- work for designing and evaluating tests, questions and

2

ques- http://cs-people.bu.edu/bahargam/abilities

3

http://cs-

people.bu.edu/bahargam/difficulties

3. Randomly sample from the CDF to get the schedule for those students is more personalized and

difficulties for a new course. closer to their individual schedule, when having one

Using the aforementioned parameters, we generated the tutor for each student. The benefit grows dramatically

grades for 2000 students and 100 courses and we from 1 cluster to 10 cluster. But after 10 cluster the

transformed the grades to number of repetitions similar to increase in the potential is slower. We also show the 95

what we did for BUCS dataset. This dataset 4 and the code confidence interval, but it was small that cannot be seen

5

to generate it are available online. in some plots.

Synthetic data: Our first synthetic dataset is to BUCSdeadline: We also show the result for different

gener- ate ground truth data to compare our algorithm val- ues of deadline. As the deadline increases, the gap

to Ran- dom Partitioning and K_means . In this between K_means and our algorithm decreases. The

dataset we had generated 10 groups of students, each reason is as deadline is greater we have to take into

group containing 40 students. For each group we consideration more topics to teach to the students.

selected 5 courses and assigned repetitions randomly to Note that K_means algo- rithm behaves like our

those 5 courses such that the sum of repetition will be algorithm except it considers all the courses and ignores

equal to the deadline6. Then for the remaining 35 the deadline. So the greater the dead- line is, the closer

courses, we filled the required number of repe- titions K_means gets to our algorithm. But in real life, we do

with random numbers

5 taken from a normal distribu- not have enough time to repeat (or teach) all of the

tion with = deadline and = 3. We refer to this dataset courses (for e.g. for preparation before SAT exam).

as GroundTruth. We expect our algorithm to be able to Figure 1f illustrates the case when deadline is equal to

find the right clusters of students while K_means cannot the average sum of need vectors for different students.

find this hidden structure.

BUCSBase: We tried different values for base and step

We have also generated the repetitions for 400 students parameters (explained earlier) and the result is depicted

and 40 courses using Pareto, Normal and Uniform in Figure 1g, when the base and step are equal to 1. We

distributions. We refer to this datastes as pareto, observe that when the base is equal to 1 and step is also

normal and uniform. To generate number of repetitions small, K_means also performs well, but still our

for different courses using the pareto distribution, we algorithm is doing better than K_means . The larger is

used the shape parameter = 2. For normal distribution the value of base and step parameter, the better our

we used = 30 and = 5 and for uniform dataset we algorithm performs.

generated random numbers in the range of [5,100].

6.2 Results

Our experiments compare our algorithm, in terms of our objective function (students' benefit), with Random Partitioning and K_means. Recall that the students' benefit is defined in Equation (5). The current algorithm is implemented in Python 2.7 and all the experiments are run single-threaded on a MacBook Air (OS X 10.9.4, 4GB RAM). We compare our algorithm with Random Partitioning and with the K_means algorithm, the built-in k-means function in the SciPy library. Each experiment was repeated 5 times and the average results are reported in this section. For the sample size in the CohPart_S algorithm, we set parameter c (explained earlier) to 4 in all experiments.

6.2.1 Results on Real World Datasets

BUCS: We executed our algorithm on the BUCS dataset until reaching convergence and show how well it maximized the benefit of learning while varying the number of clusters. We compare CohPart and CohPart_S to Random Partitioning and K_means. The result is depicted in Figure 1e, where each point shows the benefit of all students when partitioning them into k groups. As we see, Random Partitioning has the lowest benefit and our algorithm has the highest. As the number of clusters increases (and hence there are fewer students in each cluster), the benefit also increases.

6.2.2 Results on Semi-synthetic Dataset

BUCSSynth dataset: We ran our algorithm on the BUCSSynth dataset to see how well it scales to a large number of students. The result is depicted in Figure 1h.

6.2.3 Results on Synthetic Datasets

Our first set of experiments on synthetic data used the ground truth dataset. The result is illustrated in Figure 1a. As we see, CohPart and CohPart_S both perform really well. For all of the courses the mean required repetition is close to 10, with standard deviation 3. We expect that students placed in the same group when generating the data should also be placed in the same cluster after running our algorithm, and that the schedule should include the selected courses of each group. In each group students have different repetition values for the selected courses, but the sum over these selected courses is equal to the deadline; our algorithm realized this structure and only considered these selected courses to obtain the schedule, whereas K_means lacked this ability and did not cluster these students together. The next datasets studied were the uniform, pareto and normal datasets, and the results are depicted in Figures 1b, 1c and 1d respectively. For these datasets our algorithm also outperformed K_means and Random Partitioning.
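The two baselines could be sketched as follows. The repetition matrix here is a toy stand-in, and calling `kmeans2` with `'++'` initialization is one way to invoke SciPy's built-in k-means; the paper does not specify the exact call or the number of groups k.

```python
import numpy as np
from scipy.cluster.vq import kmeans2, whiten

rng = np.random.default_rng(0)
reps = rng.uniform(5, 100, size=(400, 40))  # toy students-by-courses matrix
k = 8  # hypothetical number of groups

# K_means baseline: SciPy's built-in k-means on the repetition vectors.
# whiten() rescales each course (column) to unit variance first.
_, km_labels = kmeans2(whiten(reps), k, minit="++")

# Random Partitioning baseline: assign each student to a group uniformly.
rand_labels = rng.integers(0, k, size=reps.shape[0])

# Each cluster is one cohort; a schedule would then be computed per cohort.
cohorts = [np.flatnonzero(km_labels == g) for g in range(k)]
```

The reported numbers would then be the benefit objective evaluated on each partition, averaged over the 5 repetitions mentioned above.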

4 http://cs-people.bu.edu/bahargam/BUCSSynth
5 http://cs-people.bu.edu/bahargam/BUCSSynthCode
6 The repetitions for the selected courses are not equal across students in the same group, but for every student in the group the sum over the selected courses is equal to the deadline.

Figure 1: Performance of Random Partitioning, K_means, CohPart and CohPart_S for the Cohort Selection problem on different datasets ((a) Ground Truth, (b) Random, (c) pareto, (d) normal).

7. CONCLUSION

In this paper, we highlighted the importance of team formation and of scheduling educational materials for students. We suggested a novel clustering algorithm that forms different teams and teaches the team members based on their abilities in different topics. Our algorithm maximized the potential and benefit of team members for learning. The encouraging results that we obtained show that our proposed solution is

effective, and suggest that we should consider personalized teaching for students and form more efficient teams.

8. REFERENCES
[1] R. Agrawal, B. Golshan, and E. Terzi. Forming beneficial teams of students in massive online classes. In Proceedings of the First ACM Conference on Learning @ Scale, pages 155-156. ACM, 2014.
[2] R. Agrawal, B. Golshan, and E. Terzi. Grouping students in educational settings. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '14, pages 1017-1026, New York, NY, USA, 2014. ACM.
[3] A. Anagnostopoulos, L. Becchetti, C. Castillo, A. Gionis, and S. Leonardi. Power in unity: Forming teams in large-scale community systems. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management, CIKM '10, pages 599-608, New York, NY, USA, 2010. ACM.
[4] A. Anagnostopoulos, L. Becchetti, C. Castillo, A. Gionis, and S. Leonardi. Online team formation in social networks. In Proceedings of the 21st International Conference on World Wide Web, WWW '12, pages 839-848, New York, NY, USA, 2012. ACM.
[5] J. Aronson, editor. Improving Academic Achievement: Impact of Psychological Factors on Education.
[6] A. Ashman and R. Gillies. Cooperative Learning: The Social and Intellectual Outcomes of Learning in Groups. Taylor & Francis, 2003.
[7] S. Bahargam, D. Erdos, A. Bestavros, and E. Terzi. Personalized education; solving a group formation and scheduling problem for educational content. In The 8th International Conference on Educational Data Mining, 2015.
[8] A. Baykasoglu, T. Dereli, and S. Das. Project team selection using fuzzy optimization approach. Cybern. Syst., 38(2):155-185, Feb. 2007.
[9] B. S. Bloom. The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational Researcher, 13(6):4-16, 1984.
[10] J. Bransford, A. Brown, and R. Cocking, editors. How People Learn: Brain, Mind, Experience, and School - Expanded Edition. 2000.
[11] R. F. Bruner. Repetition is the first principle of all learning. Social Science Research Network, 2001.
[12] P. Brusilovsky and C. Peylo. Adaptive and intelligent web-based educational systems. International Journal of Artificial Intelligence in Education, 13(2):159-172, 2003.
[13] S.-J. Chen and L. Lin. Modeling team member characteristics for the formation of a multifunctional team in concurrent engineering. IEEE Transactions on Engineering Management, 51(2):111-124, May 2004.
[14] D. Esposito. Homogeneous and heterogeneous ability grouping: Principal findings and implications for evaluating and designing more effective educational environments. Review of Educational Research, 43(2):163-179, 1973.
[15] E. Galbrun, B. Golshan, A. Gionis, and E. Terzi. Finding low-tension communities. arXiv preprint arXiv:1701.05352, 2017.
[16] B. Golshan, T. Lappas, and E. Terzi. Profit-maximizing cluster hires. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1196-1205. ACM, 2014.
[17] B. Grossen. How should we group to achieve excellence with equity. PhD thesis, University of Oregon, July 1996.
[18] J. Hartigan and M. Wong. Algorithm AS 136: A K-means clustering algorithm. Applied Statistics, pages 100-108, 1979.
[19] M. Y. Jaber and H. V. Kher. Variant versus invariant time to total forgetting: The learn-forget curve model revisited. Computers & Industrial Engineering, 46(4):697-705, 2004.
[20] M. Kargar and A. An. Discovering top-k teams of experts with/without a leader in social networks. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management, CIKM '11, pages 985-994, New York, NY, USA, 2011. ACM.
[21] A. C. Kerckhoff. Effects of ability grouping in British secondary schools. American Sociological Review, 51(6):842-858, 1986.
[22] J. Kleinberg, C. Papadimitriou, and P. Raghavan. Segmentation problems. J. ACM, pages 263-280, 2004.
[23] C.-L. C. Kulik and J. A. Kulik. Effects of ability grouping on secondary school students: A meta-analysis of evaluation findings. American Educational Research Journal, 19(3):415-428, Jan. 1982.
[24] J. A. Kulik and C.-L. C. Kulik. Meta-analytic findings on grouping programs. Gifted Child Quarterly, 36(2):73-77, 1992.
[25] A. I. Lakatos. Introduction. Journal of the Society for Information Display, 8(1):11, 2000.
[26] T. Lappas, K. Liu, and E. Terzi. Finding a team of experts in social networks. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '09, pages 467-476, New York, NY, USA, 2009. ACM.
[27] C. F. Lin, Y.-C. Yeh, Y. H. Hung, and R. I. Chang. Data mining for providing a personalized learning path in creativity: An application of decision trees. Computers & Education, 68:199-210, 2013.
[28] S. P. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28:129-137, 1982.
[29] J. Lu. Personalized e-learning material recommender system. In International Conference on Information Technology for Application, pages 374-379, 2004.
[30] A. Majumder, S. Datta, and K. Naidu. Capacitated team formation problem on social networks. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '12, pages 1005-1013, New York, NY, USA, 2012. ACM.
[31] J. M. McPartland and others. School Structures and Classroom Practices in Elementary, Middle, and Secondary Schools. Report No. 14, Johns Hopkins University, Baltimore; distributed by ERIC Clearinghouse, Washington, D.C., 1987.
[32] T. P. Novikoff, J. M. Kleinberg, and S. H. Strogatz. Education of a model student. Proceedings of the National Academy of Sciences, 109(6):1868-1873, 2012.
[33] B. Pentland. The learning curve and the forgetting curve: The importance of time and timing in the implementation of technological innovations. In 49th Annual Meeting of the Academy of Management, Washington, DC, 1989.
[34] R. Hertz-Lazarowitz and N. Miller. Interaction in Cooperative Groups: The Theoretical Anatomy of Group Learning. Cambridge University Press, 1995.
[35] S. S. Rangapuram, T. Bühler, and M. Hein. Towards realistic team formation in social networks based on densest subgraphs. In Proceedings of the 22nd International Conference on World Wide Web, WWW '13, pages 1077-1088, Republic and Canton of Geneva, Switzerland, 2013. International World Wide Web Conferences Steering Committee.
[36] H. Roediger and J. Nairne. The Foundations of Remembering: Essays in Honor of Henry L. Roediger III. Psychology Press Festschrift Series. Psychology Press, 2007.
[37] A. Segal, Z. Katzir, K. Gal, G. Shani, and B. Shapira. EduRank: A collaborative filtering approach to personalization in e-learning. 2014.
[38] S. Gutierrez-Santos, M. Mavrikis, and A. P. Mining students' strategies to enable collaborative learning. 2014.
[39] R. E. Slavin. Ability grouping and student achievement in elementary schools: A best-evidence synthesis. Review of Educational Research, 57(3):293-336, 1987.
[40] C. J. Weibell. Principles of learning: A conceptual framework for domain-specific theories of learning. PhD thesis, Brigham Young University, Department of Instructional Psychology and Technology, 2011.
[41] H. Wi, S. Oh, J. Mun, and M. Jung. A team formation model based on knowledge and collaboration. Expert Systems with Applications, 36(5):9121-9134, 2009.
[42] A. Zakarian and A. Kusiak. Forming teams: an analytical approach. IIE Transactions, 31(1):85-97, 1999.
