
Graph And Social Mining

Xuan-Loi Vu

September 28, 2017

Vega Corp Graph And Social Mining September 28, 2017 1 / 14


Essential Graph

Graph - Characteristics

Types: directed vs undirected; weighted; signed; multigraph
Representation: adjacency matrix vs adjacency list vs edge list
Connectivity: connected; strongly connected vs weakly connected;
paths (shortest path), spanning tree; random walk
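
A minimal sketch of the three representations and the two notions of connectivity for a directed graph, in plain Python. The edge list and the helper names (`reachable`, `is_strongly_connected`, `is_weakly_connected`) are made up for illustration and are not part of the slides.

```python
from collections import defaultdict, deque

# Hypothetical directed, weighted edge list: (source, target, weight).
edges = [("a", "b", 1.0), ("b", "c", 2.0), ("c", "a", 0.5), ("c", "d", 1.5)]
nodes = sorted({n for u, v, _ in edges for n in (u, v)})

# Adjacency list: node -> {successor: weight}.
adj = defaultdict(dict)
# Reverse adjacency list: node -> {predecessor: weight}.
rev = defaultdict(dict)
for u, v, w in edges:
    adj[u][v] = w
    rev[v][u] = w

# Adjacency matrix: row = source, column = target, 0.0 = no edge.
index = {n: i for i, n in enumerate(nodes)}
matrix = [[0.0] * len(nodes) for _ in nodes]
for u, v, w in edges:
    matrix[index[u]][index[v]] = w


def reachable(start, neighbors):
    """Return the set of nodes reachable from start using BFS."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in list(neighbors(queue.popleft())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_strongly_connected():
    # Every node must be reachable from one start node both along
    # the edge directions and against them.
    start = nodes[0]
    everything = set(nodes)
    return (reachable(start, lambda n: adj[n]) == everything
            and reachable(start, lambda n: rev[n]) == everything)


def is_weakly_connected():
    # Same check, but edge direction is ignored.
    start = nodes[0]
    return reachable(start, lambda n: list(adj[n]) + list(rev[n])) == set(nodes)


print(is_strongly_connected(), is_weakly_connected())
```

In this toy graph, node `d` has no outgoing edge, so the graph is weakly but not strongly connected.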



Essential Graph

Graph Algorithms

Traversal: BFS, DFS
Shortest path: Dijkstra
Minimum Spanning Trees: Prim
Network Flow Algorithms
Maximum Bipartite Matching
Bridge Detection
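
Three of the algorithms listed above (BFS, Dijkstra, Prim) can be sketched in a few lines of plain Python. The graph and the function names below are illustrative assumptions; production code would more likely call a library such as NetworkX.

```python
import heapq
from collections import deque

# Hypothetical weighted undirected graph: node -> {neighbor: weight}.
graph = {
    "a": {"b": 4, "c": 1},
    "b": {"a": 4, "c": 2, "d": 5},
    "c": {"a": 1, "b": 2, "d": 8},
    "d": {"b": 5, "c": 8},
}


def bfs_order(graph, start):
    """Visit nodes level by level, ignoring weights."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def dijkstra(graph, start):
    """Shortest-path distances from start (non-negative weights only)."""
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return dist


def prim_mst_weight(graph, start):
    """Total weight of a minimum spanning tree of a connected graph."""
    seen = {start}
    heap = [(w, nxt) for nxt, w in graph[start].items()]
    heapq.heapify(heap)
    total = 0
    while heap and len(seen) < len(graph):
        w, node = heapq.heappop(heap)
        if node in seen:
            continue
        seen.add(node)
        total += w
        for nxt, w2 in graph[node].items():
            if nxt not in seen:
                heapq.heappush(heap, (w2, nxt))
    return total


print(bfs_order(graph, "a"))
print(dijkstra(graph, "a"))
print(prim_mst_weight(graph, "a"))
```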



Essential Network Measures

Centrality

For each node in a network, we can measure:


Degree Centrality
Eigenvector Centrality
Katz Centrality
PageRank
Betweenness Centrality
Closeness Centrality
...
For a group of nodes, we can measure almost all of the above metrics.
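
As a rough illustration, most of the node-level measures listed above are available in NetworkX. The sketch below assumes NetworkX is installed and uses its built-in Krackhardt kite graph as a stand-in network; the `alpha` and `beta` values for Katz centrality are arbitrary choices. PageRank is left out only to keep the sketch free of extra dependencies; NetworkX exposes it as `nx.pagerank`.

```python
import networkx as nx

# Small built-in example graph (10 nodes), used here only as a stand-in.
G = nx.krackhardt_kite_graph()

measures = {
    "degree": nx.degree_centrality(G),
    "eigenvector": nx.eigenvector_centrality(G, max_iter=1000),
    # alpha must stay below 1 / (largest adjacency eigenvalue) to converge.
    "katz": nx.katz_centrality(G, alpha=0.1, beta=1.0),
    "betweenness": nx.betweenness_centrality(G),
    "closeness": nx.closeness_centrality(G),
}

# Node with the highest score under each measure.
top = {name: max(scores, key=scores.get) for name, scores in measures.items()}
for name, node in top.items():
    print(f"{name:12s} top node: {node}")
```

The measures do not agree on a single "most central" node in this graph, which is the usual argument for computing several of them as features.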



Essential Network Measures

Transitivity and Reciprocity

Transitivity: global clustering coefficient vs local clustering coefficient
Reciprocity: the number of reciprocal pairs (edges returned in both directions in a directed graph)
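
A small sketch of both measures, assuming NetworkX is installed. The two toy graphs are made up. Note that `nx.reciprocity` reports the fraction of edges that are reciprocated rather than a raw count of pairs.

```python
import networkx as nx

# Undirected toy graph: one triangle (0, 1, 2) plus a pendant node 3.
G = nx.Graph([(0, 1), (1, 2), (0, 2), (2, 3)])

global_cc = nx.transitivity(G)           # 3 * triangles / connected triples
local_cc = nx.clustering(G)              # one coefficient per node
avg_local_cc = nx.average_clustering(G)  # mean of the local coefficients

# Directed toy graph: a <-> b is reciprocal, b -> c is not.
D = nx.DiGraph([("a", "b"), ("b", "a"), ("b", "c")])
recip = nx.reciprocity(D)                # reciprocated edges / all edges

print(global_cc, local_cc, avg_local_cc, recip)
```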



Essential Network Measures

Balance and Status

Social balance theory:


The friend of my friend is my friend
The friend of my enemy is my enemy
The enemy of my enemy is my friend
The enemy of my friend is my enemy
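
All four rules above amount to one check on a signed graph: a triangle is balanced when the product of its three edge signs is positive. A minimal sketch with made-up names and signs:

```python
from itertools import combinations

# Hypothetical signed graph: +1 = friend, -1 = enemy.
signs = {
    frozenset(("ann", "bob")): +1,
    frozenset(("bob", "cat")): +1,
    frozenset(("ann", "cat")): +1,
    frozenset(("ann", "dan")): -1,
    frozenset(("bob", "dan")): -1,
    frozenset(("cat", "dan")): +1,
}
people = sorted({p for pair in signs for p in pair})


def is_balanced(a, b, c):
    """A triangle is balanced when the product of its edge signs is positive."""
    product = (signs[frozenset((a, b))]
               * signs[frozenset((b, c))]
               * signs[frozenset((a, c))])
    return product > 0


# Only consider triples where all three edges exist.
triangles = [
    t for t in combinations(people, 3)
    if all(frozenset(pair) in signs for pair in combinations(t, 2))
]
unbalanced = [t for t in triangles if not is_balanced(*t)]
print(unbalanced)
```

Here `dan` is an enemy of `ann` and `bob` but a friend of their friend `cat`, so the two triangles containing the `cat`-`dan` edge are unbalanced.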



Essential Network Measures

Similarity

Structural Equivalence: we look at the neighborhood shared by two nodes
Regular Equivalence: we look at how the neighborhoods themselves are similar
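
Structural equivalence is commonly scored by comparing neighbor sets, for example with Jaccard or cosine similarity. The sketch below uses a made-up undirected graph and illustrates only those two scores. Regular equivalence needs an iterative computation over the neighbors' own similarities and is not shown.

```python
import math

# Hypothetical undirected graph as neighbor sets.
neighbors = {
    "a": {"c", "d", "e"},
    "b": {"c", "d", "f"},
    "c": {"a", "b"},
    "d": {"a", "b"},
    "e": {"a"},
    "f": {"b"},
}


def jaccard(u, v):
    """Shared neighbors divided by all neighbors of either node."""
    union = neighbors[u] | neighbors[v]
    return len(neighbors[u] & neighbors[v]) / len(union) if union else 0.0


def cosine(u, v):
    """Shared neighbors divided by the geometric mean of the two degrees."""
    denom = math.sqrt(len(neighbors[u]) * len(neighbors[v]))
    return len(neighbors[u] & neighbors[v]) / denom if denom else 0.0


print(jaccard("a", "b"), cosine("a", "b"))
```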



Essential Network Models

Real-World Networks

Real-world networks share common characteristics.


Degree distribution: follows a power-law distribution ($p_k = a k^{-b}$)
Clustering Coefficient: friendships are highly transitive
Average Path Length: any two members of the network are usually
connected via short paths



Essential Network Models

Models

To model real-world networks:


Random Graphs (average path lengths)
Small-World Model (clustering coefficient and small average path
length)
Preferential Attachment Model (degree distribution and small average
path length)
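
To see which real-world characteristic each model reproduces, one can generate the three graphs and compare the same statistics. This is a sketch assuming NetworkX is installed; the size `n`, the rewiring probability, and the seeds are arbitrary choices, not values from the slides.

```python
import networkx as nx

n = 300  # hypothetical network size

models = {
    # Random graph with an expected degree of about 10.
    "random": nx.gnp_random_graph(n, 10 / (n - 1), seed=42),
    # Small-world: ring lattice with 10 neighbors, 5% of edges rewired.
    "small_world": nx.watts_strogatz_graph(n, 10, 0.05, seed=42),
    # Preferential attachment: each new node attaches to 5 existing nodes.
    "pref_attachment": nx.barabasi_albert_graph(n, 5, seed=42),
}


def summarize(G):
    # Path length is only defined on a connected graph, so measure it
    # on the largest connected component.
    giant = G.subgraph(max(nx.connected_components(G), key=len))
    return {
        "avg_clustering": nx.average_clustering(G),
        "avg_path_length": nx.average_shortest_path_length(giant),
        "max_degree": max(d for _, d in G.degree()),
    }


summary = {name: summarize(G) for name, G in models.items()}
for name, stats in summary.items():
    print(name, stats)
```

With these settings the small-world graph should show much higher clustering than the random graph, and the preferential-attachment graph should show a much larger maximum degree (a heavy-tailed degree distribution).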



In Practice


For the current Telco problem and our social mining:

extract basic features
extract latent features
use these features for ML
Extract basic features:
Measure the metrics above for each node
Detect communities and measure each community's metrics
Extract latent features:
Apply deep-learning models (auto-encoder, ...)
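
The basic-feature step could look roughly like the sketch below, assuming NetworkX is installed. The karate club graph stands in for the real Telco graph, the feature names are illustrative, and greedy modularity maximization is just one of several community detection methods. The latent-feature step (auto-encoder) is not shown.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Stand-in network; a real pipeline would load the Telco graph instead.
G = nx.karate_club_graph()

# Node-level metrics.
degree = nx.degree_centrality(G)
clustering = nx.clustering(G)
betweenness = nx.betweenness_centrality(G)

# Community detection, then a node -> community id lookup.
communities = greedy_modularity_communities(G)
community_of = {
    node: cid for cid, members in enumerate(communities) for node in members
}

# One feature row per node: node metrics plus metrics of its community.
rows = []
for node in G.nodes():
    members = communities[community_of[node]]
    rows.append({
        "node": node,
        "degree": degree[node],
        "clustering": clustering[node],
        "betweenness": betweenness[node],
        "community": community_of[node],
        "community_size": len(members),
        "community_density": nx.density(G.subgraph(members)),
    })

print(rows[0])
```

The resulting rows can be turned into a table and joined with other customer attributes before training a model.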



In Practice NetworkX

NetworkX

Coursera course: Applied Social Network Analysis in Python
Reference book: Python for Graph and Network Analysis



In Practice GraphX

GraphX

Based on Spark RDDs
Implements basic and core graph components
Supports a Pregel API (modeled on Google's Pregel)
Reference book: Spark GraphX in Action



In Practice GraphX

GraphX supplements

GraphFrames:
Motif finding
Subgraphs
Graph algorithms: BFS, connected components, label propagation,
shortest paths, triangle count
SparklingGraph:
Almost all graph measures



References


https://networkx.github.io
https://spark.apache.org/docs/latest/graphx-programming-guide.html
https://sparkling-graph.github.io
https://graphframes.github.io

