UNIT II
MODELING AND VISUALIZATION
BY
JAUSMIN KJ,ME
ASSISTANT PROFESSOR
COMPUTER SCIENCE AND ENGINEERING
RMD ENGINEERING COLLEGE
SYLLABUS-UNIT II
Degree - Degree centrality counts the nodes directly connected to a node; nodes with a larger number of direct
connections are considered more central. If the edges are directed, in-degree centrality is differentiated from
out-degree centrality.
Betweenness - Betweenness centrality measures the connectivity of the neighbors of a node and gives a higher value
to nodes that bridge clusters.
Closeness - Closeness centrality takes into account how distant a node is from the other nodes in the
network.
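The three measures above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the toy graph, the function names, and the dictionary-of-sets representation are assumptions made for the example. Closeness is taken here as (n - 1) divided by the sum of shortest-path distances on a connected, undirected, unweighted graph, and betweenness counts the shortest paths that pass through a node (Brandes' accumulation scheme).

```python
from collections import deque

def degree_centrality(adj):
    """Number of direct neighbors of each node (undirected graph)."""
    return {node: len(neighbors) for node, neighbors in adj.items()}

def bfs_distances(adj, source):
    """Shortest-path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness_centrality(adj):
    """(reachable nodes - 1) / sum of distances; assumes a connected graph."""
    result = {}
    for node in adj:
        dist = bfs_distances(adj, node)
        total = sum(dist.values())
        result[node] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

def betweenness_centrality(adj):
    """Unnormalized betweenness for an undirected, unweighted graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, dist = [], {s: 0}
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                   # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each pair was counted twice

# Toy graph: triangles (a, b, c) and (d, e, f) joined by the bridge c - d.
graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
betweenness = betweenness_centrality(graph)
```

In this toy graph the bridging nodes c and d score highest on all three measures, which matches the intuition that betweenness rewards nodes that bridge clusters.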
3.2 Clustering
• Many social networks contain subsets of nodes that are highly connected within the subset. Clustering measures
are used to explore these communities.
• Clustering coefficient - measures the degree to which nodes in a graph tend to cluster together.
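As a rough illustration, the local clustering coefficient of a node can be computed as the fraction of pairs of its neighbors that are themselves connected. The sketch below assumes an undirected graph stored as a dictionary of neighbor sets; the graph and the function name are made up for the example.

```python
from itertools import combinations

def local_clustering(adj, node):
    """Fraction of neighbor pairs of `node` that are directly connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to check
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))

# Toy graph: triangles (a, b, c) and (d, e, f) joined by the bridge c - d.
graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
coefficients = {node: local_clustering(graph, node) for node in graph}
```

Node a sits entirely inside one triangle, so its coefficient is 1; node c also has a neighbor outside the triangle, so only one of its three neighbor pairs is connected.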
3.3 NODE-EDGE DIAGRAM
• With the node-edge visualization, many network analysis tasks, such as component size calculation,
centrality analysis, and pattern sketching, can be better presented in a more straightforward manner.
• There are three kinds of layouts:
• Random layout - places nodes at random geometric locations in the graph, which takes O(N) time but does
not give a clear visualization.
• Force-directed layout - also known as a spring layout: the edges act as springs and the nodes as repelling
objects. An initial random layout is generated first, and the force-directed algorithm then runs iteratively,
adjusting node positions until the repulsive forces between all graph nodes and the attractive forces
between adjacent nodes balance. The running time is at least O(N log N) or O(E).
• Tree layout - a basic tree layout chooses a node as the root of the tree, and the nodes connected to the
root become children of the root node.
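The force-directed idea can be sketched as follows. This is a simplified variant in the style of Fruchterman and Reingold, written for illustration only: the force formulas, the cooling schedule, the constant `k`, and all names are assumptions, and the naive all-pairs repulsion costs O(N^2) per iteration rather than the better bounds quoted above.

```python
import math
import random

def spring_layout(nodes, edges, iterations=100, seed=0):
    """Return {node: [x, y]} after iteratively balancing spring forces."""
    rng = random.Random(seed)
    pos = {n: [rng.random(), rng.random()] for n in nodes}  # random start
    k = math.sqrt(1.0 / len(nodes))  # assumed ideal distance between nodes
    step = 0.1                       # maximum movement per iteration
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion: every pair of nodes pushes each other apart.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[u][0] += dx / dist * force
                disp[u][1] += dy / dist * force
                disp[v][0] -= dx / dist * force
                disp[v][1] -= dy / dist * force
        # Attraction: each edge pulls its two endpoints together.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[u][0] -= dx / dist * force
            disp[u][1] -= dy / dist * force
            disp[v][0] += dx / dist * force
            disp[v][1] += dy / dist * force
        # Move each node along its net force, capped by the current step.
        for n in nodes:
            length = math.hypot(disp[n][0], disp[n][1]) or 1e-9
            move = min(length, step)
            pos[n][0] += disp[n][0] / length * move
            pos[n][1] += disp[n][1] / length * move
        step *= 0.95  # cool down so the layout settles
    return pos

nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
layout = spring_layout(nodes, edges)
```

Fixing the seed makes the initial random layout, and therefore the result, reproducible between runs.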
2.4 VISUALIZING SOCIAL NETWORKS WITH
MATRIX-BASED REPRESENTATIONS
• Online social network services are created to connect social relationships among people.
• Visualizations of online social networks are depicted and analyzed according to their attributes of sociality,
including Web communities, email groups, digital libraries, and Web 2.0 services.
• Online social network visualizations are also based on different views of social relationships, e.g. user-centric,
content-centric, and hybrid social relationships.
2.4.1 Web communities
• The SixDegrees.com website, which ran between 1997 and 2001, was an early representative created on the basis
of the Web interaction model.
• Various social network websites and Web-based dating services have since been established to help users build
up their social relationships and communities.
• In 2003, the Club Nexus friendship network community provided very rich profiles in which users explicitly
list their friends. This allows detailed social network analysis and the identification of activities and
preferences that determine the formation of friendships.
• In 2005, Vizster was developed based on node-edge network layouts for exploring connectivity in large graph
structures. It facilitates the analysis of social networks with techniques such as highlighting, panning,
zooming, and distortion.
• FOAF (Friend-of-a-Friend) - used to analyse and visualize human-centric social relationships based on
Semantic Web social metadata (XML/RDF).
• Microsoft Research Asia proposed a novel object-level search service, called Entity Cube, to help people
discover real-world entities, such as people, locations, and organizations, and explore their social relationships.
2.4.2 Email groups
• In 2004, Soylent was developed to study the social patterns and temporal rhythms of daily email
activities; it makes mutual interaction and collaboration activities clearly visible.
• Example patterns: the onion pattern, the nexus pattern, and the butterfly pattern.
• In 2005, two visual metaphors, Social Network Fragments (SNF) and Post History, were employed to
visualize the two major dimensions of email activities: people and time.
• SNF highlights the relationships found in an email archive.
• Post History (calendar panel and contacts panel) visualizes how email exchange activities progress over time.
2.4.3 Digital libraries
• Social networks can be analyzed mainly from two aspects: authors and writings.
2.4.3.1 Co-Authorship Networks
• With the visualization of co-authorships, some characteristics, such as clustering coefficient and average
path length, can be analyzed in co-authorship networks.
• In 2005, social network analysis of co-authorship was studied in depth in digital libraries.
• In addition to the node-edge representation, a matrix representation was used for the co-authorship network
to help analyze different co-authorship patterns.
2.4.3.2 Co-Citation Relations
• In 2006, a novel visualization tool called CircleView was developed. Documents with high impact and their
citation patterns can be identified immediately through its interactive design, highlight colors, and circles.
• In 2007, an interactive visualization tool was developed to present large co-citation networks with latent
visual cues and to allow direct interaction with the visualized graphs.
• In 2009, an innovative visualization technique, called FP-tree, was developed to present co-citation
network from a new perspective, namely, visualizing social networks based on a paper-reference matrix
instead of using a reference-reference matrix.
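To make the distinction between the two matrices concrete, the sketch below derives a reference-reference (co-citation) matrix from a paper-reference matrix. It does not implement the FP-tree technique itself; the data and names are invented, and the only point is that the reference-reference view is a summary computed from the richer paper-reference view.

```python
def cocitation_matrix(paper_reference):
    """Count, for each pair of references, the papers that cite both.

    paper_reference[p][r] is 1 if paper p cites reference r, else 0.
    """
    n_refs = len(paper_reference[0])
    cocite = [[0] * n_refs for _ in range(n_refs)]
    for row in paper_reference:
        cited = [j for j, flag in enumerate(row) if flag]
        for i in cited:
            for j in cited:
                if i != j:
                    cocite[i][j] += 1
    return cocite

# Rows are papers P0..P2, columns are references R0..R2 (made-up data).
paper_reference = [
    [1, 1, 0],  # P0 cites R0 and R1
    [1, 1, 1],  # P1 cites R0, R1 and R2
    [0, 1, 1],  # P2 cites R1 and R2
]
cocite = cocitation_matrix(paper_reference)
```

The co-citation matrix no longer records which paper produced each co-citation, which is one reason a paper-reference view can expose patterns that the reference-reference view hides.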
2.4.4 Web 2.0 services
• Since the concept of Web 2.0 was proposed in 2004, online social activities have flourished.
• Many Web 2.0 applications, such as Twitter and Facebook, are widely used by people to connect with their
social networks.
• Nexus is a visualization application for Facebook communities that illustrates their large network
graphs; in some cases the relationships are too complex to recognize.
• In 2010, an advanced interactive visualization interface, called IRNet, was proposed to address the
shortcomings of Nexus and TouchGraph in visualizing Facebook communities.
2.4.5 Classification of online social network visualizations
• Visualizations of online social networks can be further categorized into three types by their social
relationships:
• User-centric visualization - access a person's network and discover relationships based on interests.
• Content-centric visualization - organized around content, based on interests.
• Hybrid visualization - combines different kinds of relationships and interactions.
2.6 Matrix based representation
2.6.1 Matrix or Node-Link Diagram
Node-link diagrams are more effective for very small (under 20 vertices) and sparse networks, and when
the task is to follow paths in the network; matrices are more effective in the other cases.
Advantages of matrices
• Matrices provide powerful overview visualization.
• Matrices do not suffer from node overlapping.
• Matrices do not suffer from links crossing each other.
• Matrices show all possible pairs of vertices.
• Matrices are particularly appropriate for directed and dense networks.
Advantages of node-link diagrams
• These representations are familiar to a wide audience; they constitute a powerful communication tool.
• For small or sparse networks, node-link diagrams are more effective than matrices.
• Matrices use more space than node-link diagrams, so node-link diagrams provide a more compact
representation.
• Node-link diagrams are more appropriate for a number of path-related tasks.
2.6.2 Matrix + Node-Link Diagram
• Matrix Explorer was designed to combine the advantages of both representations and to support the visual
exploration of social networks. The steps for combining matrices and node-link diagrams are:
• Initiate the exploration
• Explore interactively and iteratively
• Find a consensus in the data or validate a hypothesis
• Present the findings
2.7 Node Link Diagram
• The principle of node-link diagrams is to graphically represent the actors of the network by nodes and
their connections by links. The readability and the message of a diagram depend on the positions of the nodes.
2.8 HYBRID REPRESENTATIONS
• Providing both matrix and node-link diagrams to the user has a number of advantages but also drawbacks:
• It requires a large amount of display space.
• At least two display monitors are required to use Matrix Explorer comfortably.
• Switching from one representation to the other may induce a high cognitive load for the user.
• Two hybrid representations were developed to address this, namely MatLink and NodeTrix.
2.8.1 AUGMENTING MATRICES
• The principle of MatLink is to augment a standard matrix representation with links on its borders, dual-encoding
the connections between actors. Two types of links are added to the representation:
• static links (in white on the figure) and
• interactive links (in a darker shade).
Assessing the Readability of MatLink
• The evaluation of MatLink introduced specific tasks of social network analysis: find a cut point, find a
clique (circle), and find communities (strongly connected groups).
• On these tasks, MatLink significantly improves on standard matrix representations.
• The only task for which node-link diagrams still perform better is the identification of cut points. With
MatLink, this task requires identifying specific visual patterns of the links.
Using MatLink for Navigating in the Matrix
• To improve the readability of matrices, MatLink supports navigation.
Three techniques that provide users with effective tools to navigate in large matrices with MatLink are listed
below:
• Melange: folds the space between two far-away nodes as if it were a piece of paper, so users can see parts
of the matrix that are far apart side by side.
• Bring-and-go: brings the neighbors of an actor closer, as if their links were elastic. By moving the cursor
over one of the neighbors and releasing the mouse, the view and the node travel to the node's original location.
• Link Sliding: allows users to lock their cursor to a given link and travel very fast to its destination.
2.8.2 MERGING MATRIX AND NODE-LINK DIAGRAM
• NodeTrix is a hybrid visualization merging node-link diagrams and matrices. The principle of NodeTrix is to
represent the global network as a node-link diagram and the locally dense subparts as matrices.
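The data structure behind this principle can be sketched as follows: given a community assignment, edges inside a community fill that community's adjacency matrix, while edges between communities stay as node-link links. This is only an illustration of the idea, not NodeTrix's actual implementation; the function name, the input format, and the example network are assumptions.

```python
def nodetrix_split(edges, community_of):
    """Split an undirected network into per-community matrices and links.

    community_of maps each actor to exactly one community label.
    """
    members = {}
    for actor, community in community_of.items():
        members.setdefault(community, []).append(actor)

    matrices, index = {}, {}
    for community, actors in members.items():
        actors.sort()
        index[community] = {a: i for i, a in enumerate(actors)}
        size = len(actors)
        matrices[community] = {
            "actors": actors,
            "cells": [[0] * size for _ in range(size)],
        }

    links = []  # connections drawn as node-link edges between matrices
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        if cu == cv:
            i, j = index[cu][u], index[cu][v]
            matrices[cu]["cells"][i][j] = 1
            matrices[cu]["cells"][j][i] = 1
        else:
            links.append((u, v))
    return matrices, links

edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
community_of = {"a": "X", "b": "X", "c": "X", "d": "Y", "e": "Y"}
matrices, links = nodetrix_split(edges, community_of)
```

Because `community_of` assigns one label per actor, this sketch shares the limitation that an actor cannot belong to two communities at once.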
Interactive Exploration
• NodeTrix provides a number of interactions based on traditional drag-and-drop of objects with the
mouse cursor to ease the creation, exploration, and editing of matrices.
Drawback
• It is impossible to place an actor in two different communities.
Presenting Findings
• NodeTrix can be used for both exploration and communication because matrices can be expanded to show
detailed information on actors and connections, or collapsed to show higher-level connection patterns.
2.9 MODELLING AND AGGREGATING SOCIAL NETWORK DATA
• First, maintaining the semantics of social network data is crucial for aggregating social network
information, especially in heterogeneous environments where the individual sources of data are under
diverse control.
• Second, semantic representations can facilitate the exchange and reuse of case study data in the
academic field of Social Network Analysis.
• The possibilities for electronic data exchange have already revolutionized a number of sciences, the
best-known examples being bioinformatics and genetics.
2.9.1 State-of-the-art in network data representation
• The most common kind of social network data can be modeled by a graph where the nodes represent
individuals and the edges represent binary social relationships. (Less commonly, higher-arity
relationships may be represented using hyper-edges, i.e. edges connecting multiple nodes.)
• The most commonly encountered formats are those used by the popular network analysis packages
Pajek and UCINET. These are text-based formats which have been designed in a way so that they can
be easily edited using simple text editors.
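As an example of such a text-based format, the sketch below reads a simplified subset of Pajek's .net layout: a `*Vertices` section listing numbered, optionally quoted labels, followed by `*Edges` (undirected) or `*Arcs` (directed) sections listing pairs of vertex numbers with an optional weight. Real Pajek files can carry more sections and attributes than this handles, and the sample network is made up.

```python
import shlex

def parse_pajek(text):
    """Parse a minimal Pajek .net string into (labels, edges).

    labels maps vertex number -> label; edges is a list of
    (source, target, weight) tuples with weight defaulting to 1.0.
    """
    labels, edges, section = {}, [], None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("*"):
            section = line.split()[0].lower()  # e.g. "*vertices"
            continue
        parts = shlex.split(line)              # keeps "quoted labels" whole
        if section == "*vertices":
            number = int(parts[0])
            labels[number] = parts[1] if len(parts) > 1 else parts[0]
        elif section in ("*edges", "*arcs"):
            weight = float(parts[2]) if len(parts) > 2 else 1.0
            edges.append((int(parts[0]), int(parts[1]), weight))
    return labels, edges

sample = """*Vertices 3
1 "Alice"
2 "Bob"
3 "Carol Smith"
*Edges
1 2
2 3 2
"""
labels, edges = parse_pajek(sample)
```

The format only records vertex labels and weighted ties, which is the limited expressivity that ontology-based representations aim to go beyond.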
2.9.2 Ontological representation of social individuals
• The Friend-of-a-Friend (FOAF) ontology that we use in our work is an OWL-based format for
representing personal information and an individual's social network.
• FOAF greatly surpasses graph description languages in expressivity by using the powerful OWL
vocabulary to characterize individuals.
• The idea of FOAF was to provide a machine processable format for representing the kind of
information that made the original Web successful, namely the kind of personal information described
in homepages of individuals.
• Thus FOAF has a vocabulary for describing the personal attribute information typically found on
homepages, such as the name and email address of the individual, projects, interests, and links to work and
school homepages.
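A small FOAF description can be produced without any RDF library by writing Turtle text directly. The sketch below is illustrative only: the helper name, the people, and the example.org addresses are invented, and labels are not escaped. The terms it emits (`foaf:Person`, `foaf:name`, `foaf:mbox`, `foaf:homepage`, `foaf:knows`) come from the FOAF vocabulary at http://xmlns.com/foaf/0.1/.

```python
FOAF_PREFIX = "@prefix foaf: <http://xmlns.com/foaf/0.1/> ."

def foaf_person(node_id, name, mbox=None, homepage=None, knows=()):
    """Return a Turtle description of one person as a blank node."""
    props = ["a foaf:Person", f'foaf:name "{name}"']
    if mbox:
        props.append(f"foaf:mbox <mailto:{mbox}>")
    if homepage:
        props.append(f"foaf:homepage <{homepage}>")
    for friend in knows:
        props.append(f"foaf:knows _:{friend}")  # link to another blank node
    return f"_:{node_id} " + " ;\n    ".join(props) + " ."

document = "\n\n".join([
    FOAF_PREFIX,
    foaf_person("alice", "Alice", mbox="alice@example.org",
                homepage="http://example.org/~alice", knows=["bob"]),
    foaf_person("bob", "Bob", knows=["alice"]),
])
print(document)
```

The `foaf:knows` statements are what turn a set of individual profiles into a social network that can be aggregated across sources.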
2.9.3 Ontological representation of social relationships
• Ontological representations of social networks such as FOAF need to be extended with a framework for
modelling and characterizing social relationships for two principal reasons:
1. To support the automated integration of social information on a semantic basis.
2. To capture established concepts in Social Network Analysis.
• The characteristics of social relationships:
1) Sign (positive or negative attitude of the relationship)
2) Strength (closeness or tie strength between nodes)
3) Provenance
4) Relationship history (interactions, individuals)
5) Relationship roles
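One way to picture these five characteristics is as the fields of a relationship record. The class below is a hypothetical sketch, not part of FOAF or any published ontology: the field names, types, and default values are assumptions chosen for the example, and provenance is read here as the source that asserted the relationship.

```python
from dataclasses import dataclass, field

@dataclass
class SocialRelationship:
    """Hypothetical record for one tie between two individuals."""
    source: str
    target: str
    sign: int = 1                 # +1 positive attitude, -1 negative
    strength: float = 0.0         # tie strength, assumed to lie in [0, 1]
    provenance: str = "unknown"   # where the statement came from
    history: list = field(default_factory=list)   # past interactions
    roles: dict = field(default_factory=dict)     # role played by each party

tie = SocialRelationship(
    source="alice",
    target="bob",
    sign=1,
    strength=0.8,
    provenance="email archive",
    history=["2004-03-01 email", "2004-03-05 email"],
    roles={"alice": "advisor", "bob": "student"},
)
```

Keeping provenance on every tie matters when data from several sources is aggregated, since conflicting statements can then be traced back to their origin.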
4. Random walk and its Application
5. Use of Hadoop and MapReduce
MapReduce
• Data-parallel programming model for clusters of commodity machines
• Pioneered by Google
- Processes 20 PB of data per day
• Popularized by the open-source Hadoop project
- Used by Yahoo!, Facebook, Amazon, …
MapReduce is used for
At Google:
1. Index building for Google Search
2. Article clustering for Google News
3. Statistical machine translation
At Yahoo!:
1. Index building for Yahoo! Search
2. Spam detection for Yahoo! Mail
At Facebook:
1. Data mining
2. Ad optimization
3. Spam detection
Challenges
• Cheap nodes fail, especially if you have many
- Mean time between failures (MTBF) for 1 node = 3 years
- MTBF for 1000 nodes = 1 day
- Solution: Build fault-tolerance into the system
• Commodity network = low bandwidth
- Solution: Push computation to the data
• Programming distributed systems is hard
- Solution: Users write data-parallel “map” and “reduce” functions; the system handles work
distribution and faults
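The MTBF figures in the first challenge follow from simple arithmetic. Assuming nodes fail independently at a constant rate (an assumption made here, not stated in the slides), the expected time until some node in the cluster fails is the single-node MTBF divided by the number of nodes.

```python
def cluster_mtbf_days(node_mtbf_years, node_count):
    """Expected days between failures anywhere in the cluster.

    Assumes independent nodes with a constant failure rate.
    """
    return node_mtbf_years * 365 / node_count

single_node = cluster_mtbf_days(3, 1)      # 1095 days, i.e. 3 years
thousand_nodes = cluster_mtbf_days(3, 1000)  # about 1.1 days
```

With 1000 nodes a failure is therefore expected roughly once a day, which is why fault tolerance has to be built into the system rather than handled manually.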
Hadoop Components
• Distributed file system (HDFS)
- Single namespace for entire cluster
- Replicates data 3x for fault-tolerance
• MapReduce framework
- Executes user jobs specified as “map” and “reduce” functions
- Manages work distribution & fault-tolerance
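The division of labor between user code and framework can be illustrated with a single-process simulation. The sketch below does not use the Hadoop API; `run_mapreduce` is a made-up stand-in for the framework, and the job it runs (counting the degree of each node from an edge list) is chosen to connect with the network measures earlier in this unit.

```python
from collections import defaultdict

def map_edge(edge):
    """User-written map: emit (node, 1) for both endpoints of an edge."""
    u, v = edge
    yield u, 1
    yield v, 1

def reduce_degree(node, counts):
    """User-written reduce: sum the counts collected for one node."""
    return node, sum(counts)

def run_mapreduce(records, mapper, reducer):
    """Stand-in for the framework: map, group by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)   # the "shuffle" step groups by key
    return dict(reducer(key, values) for key, values in sorted(groups.items()))

edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
degrees = run_mapreduce(edges, map_edge, reduce_degree)
```

In a real cluster the framework would run many mapper and reducer instances in parallel on different machines and rerun any that fail; the user-written functions stay the same.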