What is foursquare?
An app that helps you explore your city and connect with friends
A platform for location-based services and data
What is foursquare?
People use foursquare to:
share with friends
discover new places
get tips
get deals
earn points and badges
keep track of visits
What is foursquare?
Mobile Social
Local
Stats
Video: http://vimeo.com/29323612
Overview
Intro to Foursquare Data
Place Graph
Social Graph
Explore
Conclusions
NY Flow Network
Ellis Island Immigration Museum, Battery Park, Liberty Island, National September 11 Memorial, New York Stock Exchange, Empire State Building
Collaborative filtering
How do we connect people to new places they'll like?
People, Places
Collaborative filtering
[Koren, Bell 08]
Item-Item similarity
Find items which are similar to items that a user has already liked
User-User similarity
Find items from users similar to the current user
Low-rank matrix factorization
First find latent low-dimensional coordinates of users and items, then find the nearest items in this space to a user
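The low-rank approach above can be sketched with a truncated SVD. This is a minimal illustration, not the production method: the function name `recommend`, the toy matrix, and the choice to rank venues by inner product in the latent space are all assumptions made for the example.

```python
import numpy as np

def recommend(X, user, rank):
    """Return the index of the best unvisited venue for `user`.

    X is a (venues x users) matrix of check-in strengths (hypothetical data).
    """
    X = np.asarray(X, dtype=float)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    venue_coords = U[:, :rank] * s[:rank]   # latent coordinates of venues
    user_coords = Vt[:rank, :].T            # latent coordinates of users
    # Score every venue for this user in the latent space (assumed scoring rule).
    scores = venue_coords @ user_coords[user]
    scores[X[:, user] > 0] = -np.inf        # skip venues already visited
    return int(np.argmax(scores))

# Toy data: users 0 and 1 share venue 0; user 1 also visits venue 1;
# user 2 visits venues 2 and 3 only.
X = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 1],
     [0, 0, 1]]
best_for_user_0 = recommend(X, user=0, rank=2)
```

In this toy matrix user 0 is steered toward venue 1, the place visited by the user whose history overlaps with theirs.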
Collaborative filtering
Item-Item similarity
Pro: can easily update w/ new data for a user
Pro: explainable, e.g. "people who like Joe's pizza also like Lombardi's"
Con: not as performant as richer global models
User-User similarity
Pro: can leverage social signals here as well... "similar" can mean people you are friends with, whom you've colocated with, whom you follow, etc...
$\mathrm{sim}(x_i, x_j) = \dfrac{x_i \cdot x_j}{\|x_i\| \, \|x_j\|}$
Friends: intersection
$\mathrm{sim}(A, B) = |A \cap B|$
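Both similarity measures are short enough to write out directly. The sketch below assumes plain Python lists and sets; the function names and the toy vectors are hypothetical, and returning 0 for an all-zero vector is a convention chosen here.

```python
import math

def cosine_similarity(x_i, x_j):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x_i, x_j))
    norm_i = math.sqrt(sum(a * a for a in x_i))
    norm_j = math.sqrt(sum(b * b for b in x_j))
    if norm_i == 0.0 or norm_j == 0.0:
        return 0.0  # assumed convention for empty rows
    return dot / (norm_i * norm_j)

def friend_overlap(friends_a, friends_b):
    """sim(A, B) = |A ∩ B|: the number of shared friends."""
    return len(set(friends_a) & set(friends_b))

# Toy venue rows: one entry per user, holding log(# of check-ins).
venue_a = [math.log(3), 0.0, math.log(2)]
venue_b = [math.log(3), 0.0, math.log(2)]
venue_c = [0.0, math.log(5), 0.0]

same = cosine_similarity(venue_a, venue_b)      # identical audiences
disjoint = cosine_similarity(venue_a, venue_c)  # no shared visitors
shared = friend_overlap({"ann", "bob", "cy"}, {"bob", "cy", "dee"})
```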
Each entry is the log(# of check-ins at place i by user j)
One row for each of the 30m venues...
$X \in \mathbb{R}^{n \times d}$
$K \in \mathbb{R}^{n \times n}$
$O(n^2 d)$
Requires ~4.5m machines to compute in < 24 hours!!! and 3.6PB to store!
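The storage figure can be checked with quick arithmetic. The sketch assumes 4 bytes per entry (a 32-bit float) and decimal petabytes; neither assumption is stated on the slide, but together they reproduce the 3.6PB number for a dense 30m x 30m matrix.

```python
n_venues = 30_000_000          # one row and one column per venue
bytes_per_entry = 4            # assumption: 32-bit floats

entries = n_venues * n_venues  # dense n x n similarity matrix K
total_bytes = entries * bytes_per_entry
total_petabytes = total_bytes / 1e15  # assumption: 1 PB = 10^15 bytes
```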
$K \in \mathbb{R}^{n \times n}$
map
reduce
final score
What happens when a new coffee shop opens in the East Village?
$A \in \mathbb{B}^{n \times n}$
$L \in \mathbb{R}^{n \times d}$
Graph embedding
Spring Embedding - Simulate physical system by iterating Hooke's law
Spectral Embedding - Decompose adjacency matrix A with an SVD and use eigenvectors with highest eigenvalues for coordinates
Laplacian eigenmaps [Belkin, Niyogi 02] - form graph Laplacian from adjacency matrix, $L = D - A$, apply SVD to L and use eigenvectors with smallest non-zero eigenvalues for coordinates
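Of the three methods, Laplacian eigenmaps is the simplest to sketch. The example below uses a symmetric eigendecomposition (`numpy.linalg.eigh`) rather than an SVD; for a symmetric positive semidefinite Laplacian the two give the same eigenvectors. The function name, the `1e-9` zero threshold, and the toy path graph are assumptions for illustration.

```python
import numpy as np

def laplacian_eigenmap(A, d):
    """Embed a graph in d dimensions from its adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    degree = np.diag(A.sum(axis=1))       # D: diagonal degree matrix
    laplacian = degree - A                # L = D - A
    eigvals, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues ascending
    nonzero = eigvals > 1e-9              # drop the zero eigenvalue(s)
    return eigvecs[:, nonzero][:, :d]     # smallest non-zero eigenvectors

# Toy path graph 0 - 1 - 2 - 3.
path = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]
coords = laplacian_eigenmap(path, d=1)
```

For the path graph the single coordinate orders the nodes along the path, which is the behaviour one would hope for from an embedding of a chain.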
Preserving structure
A connectivity algorithm G(K) such as k-nearest neighbors should be able to recover the edges from the coordinates such that G(K) = A
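A toy check of this property, assuming k-nearest neighbors as the connectivity algorithm and applying it to coordinates directly rather than to a kernel K. The function names and both point sets are hypothetical.

```python
def knn_graph(points, k):
    """Adjacency matrix linking each point to its k nearest neighbors."""
    n = len(points)
    graph = [[0] * n for _ in range(n)]
    for i in range(n):
        # Squared Euclidean distance from point i to every other point.
        by_distance = sorted(
            (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
            for j in range(n) if j != i
        )
        for _, j in by_distance[:k]:
            graph[i][j] = 1
    return graph

def preserves_structure(points, A, k):
    """True when the k-NN graph of the coordinates equals A exactly."""
    return knn_graph(points, k) == A

# A 4-cycle: 0 - 1 - 2 - 3 - 0.
cycle = [[0, 1, 0, 1],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [1, 0, 1, 0]]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]  # corners of a unit square
line = [(0,), (1,), (2,), (3,)]            # same nodes squashed onto a line

square_ok = preserves_structure(square, cycle, k=2)
line_ok = preserves_structure(line, cycle, k=2)
```

The square layout recovers every edge of the cycle, while the one-dimensional layout loses the edge between nodes 0 and 3, so only the first embedding is structure preserving.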
Embedding
Connectivity G(K)
Edges
Points
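$\max_{K \in \mathcal{K}} \; \mathrm{tr}(KA)$
$D_{ij} > (1 - A_{ij}) \max_m (A_{im} D_{im}) \quad \forall i, j$
where $D_{ij} = K_{ii} + K_{jj} - 2K_{ij}$ and $\mathcal{K} = \{K \succeq 0,\ \mathrm{tr}(K) \le 1,\ \sum_{ij} K_{ij} = 0\}$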
$A \in \mathbb{B}^{n \times n}$
$K \in \mathbb{R}^{n \times n}$
SVD
$L \in \mathbb{R}^{n \times d}$
[Slide shows overlapping excerpts from the Structure Preserving Embedding papers by Blake Shaw and Tony Jebara (Dept. of Computer Science, Columbia University, New York, NY 10027). The recoverable content follows.]
Structure Preserving Embedding [Shaw, Jebara]
From only connectivity information describing which nodes in a graph are connected, can we learn a set of low-dimensional coordinates for each node such that these coordinates can easily be used to reconstruct the original structure of the network?
A connectivity algorithm G(K) (k-nearest neighbors, ε-balls, b-matching, or maximum weight spanning tree) accepts a kernel K as input and returns an adjacency matrix. An embedding is structure preserving if G(K) = A.
SPE learns a matrix K via a semidefinite program (SDP) and then decomposes $K = L^\top L$ by performing singular value decomposition. SPE for greedy nearest-neighbor constraints solves the following SDP:
$\max_{K \in \mathcal{K}} \mathrm{tr}(KA)$ s.t. $D_{ij} > (1 - A_{ij}) \max_m (A_{im} D_{im}) \ \forall i, j$
where $D_{ij} = K_{ii} + K_{jj} - 2K_{ij}$ and $\mathcal{K} = \{K \succeq 0,\ \mathrm{tr}(K) \le 1,\ \sum_{ij} K_{ij} = 0\}$
SPE-SGD
We propose optimizing L directly using stochastic gradient descent (SGD). The constraints are written as a set of matrices $S = \{C_1, C_2, \ldots\}$, where each $C_l$ corresponds to a triplet $(i, j, k)$ such that $A_{ij} = 1$ and $A_{ik} = 0$, with
$\mathrm{tr}(C_l K) = K_{jj} - 2K_{ij} + 2K_{ik} - K_{kk}$
A triplet constraint is violated when $\mathrm{tr}(C_l K) > 0$. This set of all triplets subsumes the SPE distance constraints above. Temporarily dropping the centering and scaling constraints, the SDP becomes maximizing the following objective over L:
$f(L) = \lambda \, \mathrm{tr}(L^\top L A) - \sum_{l \in S} \max(\mathrm{tr}(C_l L^\top L), 0)$
We maximize f(L) via projected stochastic subgradient descent. The subgradient in terms of a single randomly chosen triplet is
$\nabla(f(L), C_l) = 2L(\lambda A - C_l)$ if $\mathrm{tr}(C_l L^\top L) > 0$, and 0 otherwise
and for each randomly chosen triplet constraint $C_l$, if $\mathrm{tr}(C_l L^\top L) > 0$ then update L according to
$L_{t+1} = L_t + \eta \, \nabla(f(L_t), C_l)$, where the step-size $\eta = 1/\sqrt{t}$
After each step, projection enforces $\mathrm{tr}(L^\top L) \le 1$ and $\sum_{ij} (L^\top L)_{ij} = 0$ by subtracting the mean from L and dividing each entry of L by its Frobenius norm.
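The SGD variant excerpted on this slide can be sketched in a few lines. This is a rough illustration under stated assumptions, not the authors' implementation. It takes the update to be $L \leftarrow L + \eta \, 2L(\lambda A - C_l)$ with $\eta = 1/\sqrt{t}$ whenever a sampled triplet is violated, followed by centering and Frobenius normalization. The names `spe_sgd`, `project`, `lam`, `steps`, and `seed` are hypothetical, L is stored as a d x n array so that $K = L^\top L$, and the constraint matrix is built densely for readability rather than speed.

```python
import numpy as np

def project(L):
    """Center the points and scale L to unit Frobenius norm."""
    L = L - L.mean(axis=1, keepdims=True)
    norm = np.linalg.norm(L)
    return L / norm if norm > 0 else L

def spe_sgd(A, d=2, lam=1.0, steps=2000, seed=0):
    """Sketch of SPE via projected stochastic subgradient steps."""
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = project(rng.random((d, n)))           # random start, d x n
    for t in range(1, steps + 1):
        i = rng.integers(n)
        neighbors = np.flatnonzero(A[i] == 1)
        others = np.flatnonzero(A[i] == 0)
        others = others[others != i]
        if len(neighbors) == 0 or len(others) == 0:
            continue                           # no valid triplet for node i
        j = rng.choice(neighbors)              # A[i, j] = 1
        k = rng.choice(others)                 # A[i, k] = 0
        # Dense C with tr(C K) = K_jj - 2 K_ij + 2 K_ik - K_kk.
        C = np.zeros((n, n))
        C[j, j] = 1.0
        C[k, k] = -1.0
        C[i, j] = C[j, i] = -1.0
        C[i, k] = C[k, i] = 1.0
        K = L.T @ L
        if np.trace(C @ K) > 0:                # neighbor j farther than k
            eta = 1.0 / np.sqrt(t)
            L = L + eta * 2.0 * L @ (lam * A - C)
            L = project(L)
    return L

# Toy 4-cycle: 0 - 1 - 2 - 3 - 0.
cycle = [[0, 1, 0, 1],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [1, 0, 1, 0]]
embedding = spe_sgd(cycle, d=2, steps=200)
```

Each column of `embedding` is the coordinate of one node. Building a full n x n matrix per step is only workable for toy graphs; a large-scale version would update just the three affected columns of L.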
Large-scale SPE
Notes for previous slide: Each node in this network is a person, and each edge represents a friendship on foursquare. The size of each node is proportional to how many friends that person has. We can see dense clusters of users on the right, at the top, and on the left. There is a large component in the middle, and there are clear hubs.
We can now use this low-dimensional representation of a high-dimensional network to better track what happens when a new coffee shop opens in the East Village. As expected, it spreads like a virus across this social substrate. As each person checks in to La Colombe, their friends light up. People who have discovered the place are shown in blue. The current check-in is highlighted in orange. It's amazing to see how La Colombe spreads. Many people have been talking about how ideas, tweets, and memes spread across the internet. For the first time we can track how new places opening in the real world spread in a similar way.
Iterative approach
start with random values and iterate
works great w/ map-reduce
$PR(i) = (1 - d) + d \sum_{j \in \{A_{ij} = 1\}} \dfrac{PR(j)}{\sum_k A_{jk}}$
$A \in \mathbb{B}^{n \times n}$
$PR(i) \propto v_i$ where $Pv = \lambda_1 v$
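The iterative approach amounts to repeatedly applying the PageRank update until the scores settle. The sketch below assumes a symmetric 0/1 adjacency matrix (friendships), so every node that appears as a neighbor has at least one edge. The damping value 0.85, the iteration count, and the toy star graph are assumptions for illustration.

```python
import random

def pagerank(A, d=0.85, iterations=100, seed=0):
    """Iterate PR(i) = (1 - d) + d * sum_j PR(j) / degree(j) over neighbors j."""
    rng = random.Random(seed)
    n = len(A)
    degree = [sum(row) for row in A]
    scores = [rng.random() for _ in range(n)]   # start with random values
    for _ in range(iterations):
        scores = [
            (1 - d) + d * sum(
                scores[j] / degree[j] for j in range(n) if A[i][j] == 1
            )
            for i in range(n)
        ]
    return scores

# Toy star graph: node 0 is friends with nodes 1, 2, and 3.
star = [[0, 1, 1, 1],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0]]
scores = pagerank(star)
```

Each node's new score depends only on its neighbors' previous scores, which is why the update splits naturally into a map step (emit PR(j) / degree(j) to each neighbor) and a reduce step (sum the contributions per node).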
Explore
A social recommendation engine built from check-in data
Foursquare Explore
Realtime recommendations from signals:
location
time of day
check-in history
friends' preferences
venue similarities
< 200 ms
Our data stack
MongoDB
Amazon S3, Elastic MapReduce
Hadoop
Hive
Flume
R and Matlab
Conclusion
Unique networks formed by people interacting with each other and with places in the real world
Massive scale -- today we are working with millions of people and places here at foursquare, but there are over a billion devices in the world constantly emitting this signal of userid, lat, long, timestamp
Join us!
foursquare is hiring!
110+ people and growing
foursquare.com/jobs
Blake Shaw @metablake blake@foursquare.com