
Designing a Network Search System

April 8, 2013
Misbah Uddin, Rolf Stadler
ACCESS Linnaeus Center, KTH Royal Institute of Technology. Email: {ahmmud,stadler}@kth.se

Alexander Clemm
Cisco Systems, San Jose, California, USA. Email: alex@cisco.com

Abstract: Network search makes operational data available in real-time to management applications. In contrast to traditional monitoring, neither the data location nor the data format needs to be known to the invoking process, which simplifies application development, but requires an efficient search plane inside the managed system. This paper presents a query language for network search and discusses how search queries can be executed in a networked system. The search space consists of named objects that are modeled as sets of attribute-value pairs. The data model is more general than the relational model, and the query language is more expressive than relational calculus. The paper shows that distributed query processing can be performed using an echo algorithm and that name resolution can be embedded in query processing. Finally, a use case for network search in cloud computing, backed up by a prototype implementation, is presented.

Keywords: Network search, management paradigms, distributed management, name resolution.

1. INTRODUCTION

Network search, or search in networked systems, can be understood in three ways. First, as a generalization of monitoring whereby the monitoring data is retrieved by characterizing its content in simple terms, without giving the location or detailed structure of the data. Second, it can be understood as "googling the network" for operational data, in analogy to "googling the web" for content. Third, network search can be seen as a capability that views the network as a giant database of operational and configuration data, which can be queried through a database-like interface.

In this paper, we follow the database interpretation and develop further the concept of network search, which we motivated and introduced in [13]. Specifically, we view network data as objects with a simple structure: an object has a (globally unique) name, a type, and a variable number of additional attribute-value pairs. Objects are linked through shared attribute-value pairs, through which associations can be expressed and discovered. We introduce a language to express search queries. It turns out that search queries can be processed with techniques similar to those used for query processing in distributed (relational) database systems; we propose a tree protocol that dynamically creates spanning trees inside the networked system and incrementally aggregates the partial search results, which are sent from the leaves of the tree towards the root. Object name resolution is embedded in a natural way in query processing.

While some database concepts help in engineering a network search system, there are clear differences between querying a traditional distributed database and performing search in a networked system. First, a search result may not be an exact match, but only close enough to be returned by a query. Second, similar to web search, search results are ranked according to how closely a specific object in the result matches the query, the search history, etc.

We believe that an important application area for network search will be capturing and tracing dynamic service behavior across time and space. For instance, network search can be used to identify and trace media streams associated with a video-conference across the nodes of a network, or to find the set of virtual machines associated with a particular application in a server cluster. A long version of the paper provides more details about these use cases [14]. Such functionality can, of course, be hardcoded beforehand in specialized protocols; network search, however, allows us to dynamically introduce such capabilities into a networked system.

The paper is organized as follows. Section 2 contains the proposed data model. Section 3 describes the query language, and Section 4 explains how search queries can be processed in a distributed fashion in a networked system. Section 5 discusses a use case in some detail, and Section 6 reviews the paper's contribution and presents future work.

2. OBJECT MODEL

We consider physical and logical entities in a networked system, such as routers, servers, IP flows, virtual machines, etc., as objects in a search space (or object space). We associate an object in the space with each of these entities. An object is modeled as a bag of attribute-value pairs, containing configuration and operational information. An object is named and typed and, hence, has at least two attribute-value pairs. Figure 1 shows two examples: an IP flow object with information available on a router, and a virtual machine object with data from a server.

We introduce a relation between objects that links together objects that share attribute-value pairs. The relation will allow us to find information that belongs to a certain context in a networked system. For instance, it will enable us to trace an IP flow passing through the nodes of a network, or to search for the servers in a cluster that run applications belonging to a certain customer.

Consider objects $a, b$ in a search space $O$. We say $a$ is directly linked to $b$, denoted by $l(a, b)$, if $a$ and $b$ share an attribute-value pair. We say $a \in O$ is linked to $b \in O$, denoted by $l^*(a, b)$, if

$$l^*(a, b) := l(a, b), \ \text{or} \ \exists c \in O : l^*(a, c) \wedge l^*(c, b).$$
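The two relations can be illustrated with a small sketch. It assumes that objects are plain Python dictionaries with hashable values, and the four sample objects (a VM, an application, a probe, and a router) and their attribute values are hypothetical; only the definitions of direct linking and linking are taken from the text.

```python
from collections import deque


def directly_linked(a: dict, b: dict) -> bool:
    """l(a, b): true if a and b share at least one attribute-value pair."""
    return not set(a.items()).isdisjoint(b.items())


def closure(seeds: list, space: list) -> list:
    """All objects in `space` that are linked (directly or transitively)
    to some object in `seeds`, found by breadth-first expansion."""
    reached = {o["name"]: o for o in seeds}  # names are globally unique
    frontier = deque(seeds)
    while frontier:
        current = frontier.popleft()
        for candidate in space:
            if candidate["name"] not in reached and directly_linked(current, candidate):
                reached[candidate["name"]] = candidate
                frontier.append(candidate)
    return list(reached.values())


# Hypothetical objects: vm and app share a customer, app and probe share a server.
vm = {"name": "urn:ns:instance-17", "type": "VM",
      "customer": "urn:ns:customer:John"}
app = {"name": "urn:ns:app:web-1", "type": "application",
       "customer": "urn:ns:customer:John", "server": "urn:ns:server:Server-08"}
probe = {"name": "urn:ns:probe-3", "type": "probe",
         "server": "urn:ns:server:Server-08"}
router = {"name": "urn:ns:router-1", "type": "router"}
space = [vm, app, probe, router]

linked_names = {o["name"] for o in closure([vm], space)}
```

Here the VM and the probe are not directly linked, but the probe is still in the closure of the VM because both are linked through the application object. Since every object is typed, two objects of the same type always share the pair for `type`, which is why the sample objects use distinct types.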

Note that the same relations can be defined on subsets $A, B \subseteq O$. It is often useful to compute the closure of a subset $A$ under $l^*(\cdot, \cdot)$. For instance, all information associated with a video service can be found in the closure of the set of flow objects related to the service.

The model described above is simpler and coarser than the information models traditionally used in network management, such as SMIv2 [8], GDMO [2], CIM [1], and YANG [6], but it is also less expressive. We believe that our model is better suited for network search, as one can formulate queries with minimal knowledge about information structure. Furthermore, one can easily populate our model with data from available sources in a networked system, at the price of potentially losing structural information.

3. QUERY LANGUAGE

A query on a search space $O$ returns a subset of the information in this space. We describe the query language in BNF notation as follows:

$$
\begin{array}{llll}
\text{Basic:} & q & ::=\; t \mid q \vee q \mid q \wedge q & (1) \\
 & t & ::=\; a \mid v \mid a \;\mathit{op}\; v & (2) \\
 & \mathit{op} & ::=\; = \;\mid\; < \;\mid\; \ldots & (3) \\
\text{Link:} & q & ::=\; \rightarrow A \mid \Rightarrow A & (4) \\
\text{Projection:} & p & ::=\; \pi\, q \mid q & (5) \\
\text{Aggregation:} & r & ::=\; \alpha\, p \mid p & (6)
\end{array}
$$

First, we discuss queries $q$ based on rules (1), (2), (3), which return a set of objects. Rule (2) defines how a query is made up of tokens $t$. Here, $a$ stands for an attribute name, such as load, $v$ for a value, such as 0.7, and $\mathit{op}$ for a relational operator. Here are some examples of queries based on rules (1), (2), (3): $load$; $load > 0.7$; $server \wedge load > 0.7$; $server \vee router$; $(server \vee router) \wedge load = 0.7$.

Each token $t$ expresses a condition on an attribute-value pair. During query processing, the token is matched against all objects in the search space or, more precisely, against all attribute-value pairs of objects in the search space. If the match is successful, then the object is included in the query result. For example, the token $server$ returns all objects that contain the attribute name or value server. The token $load > 0.7$ returns all objects that include an attribute named load with a value larger than 0.7. Note that the match of a token to an attribute-value pair does not have to be exact, but can be approximate, for an object to be included in the query result. This issue is part of our future work.

The query $q_1 \vee q_2$ returns the union of the results of sub-queries $q_1$ and $q_2$. Likewise, $q_1 \wedge q_2$ returns the intersection of the results of sub-queries $q_1$ and $q_2$. We give an example from a datacenter that offers Infrastructure-as-a-Service: "find servers that have at least 12 CPU cores and a load lower than 20 percent", which can be expressed as

$$server \wedge cpuCores > 11 \wedge load < 0.2$$

Rule (4) describes link queries, whereby $A$ denotes a set of objects, $\rightarrow$ denotes the operator for direct linking, and $\Rightarrow$ denotes the operator for linking. In case of operator $\rightarrow$, the link query returns the objects that are directly linked to objects in $A$. In case of $\Rightarrow$, it returns the closure of $A$ with respect to $l^*(\cdot, \cdot)$, which means all objects $o \in O$ that are linked to objects in $A$. The following query computes the closure of an object and returns objects of a specific type from that closure: "find servers that run processes of customer John".

$$server \wedge \Rightarrow John$$

Rules (5) and (6) introduce a projection operator $\pi_{a_1, \ldots, a_n}$ and an aggregation operator $\alpha_{f,a}$, which are described in detail in a longer version of this paper [14].

4. DISTRIBUTED QUERY PROCESSING

Our approach to processing network queries makes use of the echo protocol, a tree-based protocol suitable for distributed polling [11] [7]. It is based on an algorithm first described by Segall [9]. The execution of the echo protocol can be understood as the subsequent expansion and contraction of a wave on a network graph. The execution starts and terminates on an initiating node of the graph, also called the root (node). The wave expands through explorer messages, which nodes forward to their respective neighbors. During the expansion phase, local operations are triggered on the nodes after receiving an explorer. The results of these local operations are collected in echo messages when the wave contracts, so that the aggregated result of the global operation becomes available at the root node.
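The expansion and contraction of the wave can be sketched as a small single-process simulation. This is only an illustration under simplifying assumptions, not the prototype's implementation: messages are delivered from one FIFO queue rather than asynchronously, the graph is connected, and the names `run_echo`, `GRAPH`, `DATABASES`, and `local_query`, as well as the sample topology and load values, are hypothetical.

```python
from collections import deque


def run_echo(graph, root, local_op, aggregate):
    """Simulate one echo wave on `graph` (node -> list of neighbours).
    Returns the aggregated result at the root and the parent map of the
    spanning tree built during the expansion phase."""
    parent = {root: None}
    partial = {root: local_op(root)}
    pending = {root: len(graph[root])}  # neighbours a node still waits for
    inbox = deque(("EXP", root, n, None) for n in graph[root])
    if pending[root] == 0:
        return partial[root], parent
    while inbox:
        kind, src, dst, payload = inbox.popleft()
        if kind == "EXP" and dst not in parent:
            # First explorer: the sender becomes the parent, the local
            # operation runs, and the wave expands to the other neighbours.
            parent[dst] = src
            partial[dst] = local_op(dst)
            others = [n for n in graph[dst] if n != src]
            pending[dst] = len(others)
            inbox.extend(("EXP", dst, n, None) for n in others)
        else:
            # A later explorer or an echo: one more neighbour has reported.
            if kind == "ECHO":
                partial[dst] = aggregate(partial[dst], payload)
            pending[dst] -= 1
        if pending[dst] == 0:
            if parent[dst] is None:
                return partial[dst], parent  # wave has contracted to the root
            inbox.append(("ECHO", dst, parent[dst], partial[dst]))
    raise RuntimeError("echo wave did not terminate at the root")


# Hypothetical search plane with six nodes and some non-tree edges.
GRAPH = {
    "n1": ["n2", "n3"], "n2": ["n1", "n3", "n4"], "n3": ["n1", "n2", "n5", "n6"],
    "n4": ["n2"], "n5": ["n3", "n6"], "n6": ["n3", "n5"],
}
# Hypothetical local databases, one per node.
DATABASES = {
    "n1": [{"name": "s1", "type": "server", "load": 0.9}],
    "n2": [{"name": "s2", "type": "server", "load": 0.3}],
    "n3": [{"name": "s3", "type": "server", "load": 0.8}],
    "n4": [],
    "n5": [{"name": "s5", "type": "server", "load": 0.75}],
    "n6": [{"name": "s6", "type": "server", "load": 0.1}],
}


def local_query(node):
    """Local operation: names of local objects matching the token load > 0.7."""
    return [o["name"] for o in DATABASES[node] if o.get("load", 0) > 0.7]


result, tree = run_echo(GRAPH, "n1", local_query, lambda own, child: own + child)
```

Each node hears exactly once from every neighbour (as an explorer or as an echo) before it reports to its parent, which is why a simple counter per node is enough to detect when its subtree has finished.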

IP flow object (information from a router):

  name          urn:ns:128.146.222.233:131.187.253.67
  type          IPflow
  srcIPaddress  128.146.222.233
  dstIPaddress  131.187.253.67
  protocol      6
  srcPort       03FA
  dstPort       0016
  packet        4
  octet         1129
  timestamp     10:05:11 24 April 2012

Virtual machine object (information from a server):

  name          urn:ns:instance-17
  type          VM
  uuid          4f5f86875be18e30c9000002
  cpuCores      1
  memory        1 GB
  storage       3 GB
  IPaddress     192.168.1.5
  server        urn:ns:server:Server-08
  customer      urn:ns:customer:John
  dateCreated   12:49:25 10 February 2012

Fig. 1. Sample objects in the search space: an object representing an IP flow with information from a router, and an object representing a virtual machine on a server.
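The objects of Fig. 1 map naturally onto dictionaries, which makes it easy to try out the token semantics of Section 3. The sketch below is a minimal illustration with exact matching only: the helper names (`token`, `cond`, `q_and`, `q_or`, `search`) and the two server objects are assumptions made for this example, and the union and intersection of query results are realized as per-object disjunction and conjunction, which is equivalent for token queries.

```python
import operator

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def token(text):
    """Token a or v: matches objects with that attribute name or that value."""
    return lambda obj: text in obj or text in obj.values()


def cond(attr, op, value):
    """Token a op v: matches objects whose attribute satisfies the comparison."""
    def matches(obj):
        try:
            return attr in obj and OPS[op](obj[attr], value)
        except TypeError:  # e.g. comparing a string with a number
            return False
    return matches


def q_and(*preds):
    """Intersection of the results of the sub-queries."""
    return lambda obj: all(p(obj) for p in preds)


def q_or(*preds):
    """Union of the results of the sub-queries."""
    return lambda obj: any(p(obj) for p in preds)


def search(space, pred):
    """Names of all objects in the search space that match the query."""
    return [o["name"] for o in space if pred(o)]


# The two objects of Fig. 1.
flow = {"name": "urn:ns:128.146.222.233:131.187.253.67", "type": "IPflow",
        "srcIPaddress": "128.146.222.233", "dstIPaddress": "131.187.253.67",
        "protocol": 6, "srcPort": "03FA", "dstPort": "0016",
        "packet": 4, "octet": 1129, "timestamp": "10:05:11 24 April 2012"}
vm = {"name": "urn:ns:instance-17", "type": "VM",
      "uuid": "4f5f86875be18e30c9000002", "cpuCores": 1, "memory": "1 GB",
      "storage": "3 GB", "IPaddress": "192.168.1.5",
      "server": "urn:ns:server:Server-08", "customer": "urn:ns:customer:John",
      "dateCreated": "12:49:25 10 February 2012"}
# Hypothetical server objects, not part of Fig. 1.
server08 = {"name": "urn:ns:server:Server-08", "type": "server",
            "cpuCores": 16, "load": 0.15}
server09 = {"name": "urn:ns:server:Server-09", "type": "server",
            "cpuCores": 8, "load": 0.1}
SPACE = [flow, vm, server08, server09]

by_token = search(SPACE, token("server"))
big_idle = search(SPACE, q_and(token("server"),
                               cond("cpuCores", ">", 11),
                               cond("load", "<", 0.2)))
```

Note that the token server also returns the virtual machine object, because that object has an attribute named server. This is an instance of the imprecision that comes with matching against both attribute names and values.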

During the expansion phase, the protocol constructs a spanning tree on the network graph for the purpose of collecting and aggregating the partial results during the contraction phase.

The echo protocol executes on the network graph of the search plane (described in [13]). The protocol can be started on any search node once a query $q$ has been received. First, the query is disseminated by explorer messages to every node and executed as a local operation against the local database $D$. The results of the local operations are sent, by echo messages, on the spanning tree from child nodes to parent nodes, where the partial results are aggregated. (Note that the term aggregation here refers to the processing of the partial query results, not to a possible aggregation operator in the query $q$.)

Figure 2 shows a sample spanning tree created by the echo protocol on nodes $n_1, \ldots, n_6$ with root $n_1$. It further shows the message exchange between nodes. Each node shows the local database $D$ containing the objects with information from that node, together with the variable result, which contains the (partial) result of the query $q$.

The definition of the local operation, the aggregation operation of the query result, and the current local state of the query collection are modeled in an object, called the aggregator object of the echo protocol. For lack of space, we describe the aggregator in a long version of this paper [14].

Fig. 2. Distributed query processing: The echo protocol creates a spanning tree in the search plane. Each node contains the local database $D$. The variable result contains the partial result of a query $q$. Some of the explorer (EXP) and echo (ECHO) messages are shown.

5. USE CASE: SEARCH ON A CLOUD INFRASTRUCTURE

In order to experiment with network search concepts, we have instrumented a cloud Infrastructure-as-a-Service (IaaS) platform in our laboratory with network search functions. The platform contains nine high-performance servers, interconnected by Gigabit Ethernet, and runs the OpenStack cloud management software. (See [15] for details.)

The components of the network search system include search nodes (each server on the testbed runs such a node) and managers in the management plane that run on a management station. Each search node contains a local database based on MongoDB [4] that maintains the network objects. Currently the database has four types of objects, namely, server, virtual machine (VM), application, and customer. A data sensing component reads system files, such as /proc [5], libvirt configuration [3], etc., and populates and periodically updates the objects in the local database at a rate corresponding to their respective lifetime. A distributed query processing component implements the protocol outlined in Section 4. The manager component offers two types of interfaces for accessing network search functionality: a simple line console and a graphical, menu-supported interface that allows the user to compose queries and browse the output in various ways. Due to lack of space, details about this implementation are reported in [10] [12].

We have used the network search system on the OpenStack platform for conducting a range of exploratory experiments. The platform can be loaded by external load generators that were developed for evaluating performance management solutions for OpenStack [15]. The produced load has a time-varying pattern of several types of applications running in virtual machines of different configurations and lifetimes.

Here are some of the experiments we have performed. First, we inquire about the load on a server cluster, which is given by a range of IP addresses: $\alpha_{sum, load}(192.168.212.*)$. Given the case that the load is unexpectedly high, we want to find out which applications are running on this cluster: $\pi_{name}(application \wedge \Rightarrow 192.168.212.*)$. Finally, we want to identify the customers for which these applications are executed: $\pi_{name}(customer \wedge \Rightarrow app_1 \ldots app_n)$.

In a further set of experiments, we study the behavior of the virtual machines on the platform under an adaptive placement policy. First, we are interested in learning the distribution of the uptimes of the virtual machines. The query $\pi_{uptime}(VM)$ provides the uptime of the active virtual machines, out of which the distribution can be computed. (If a distribution aggregator is implemented, then the distribution can be directly computed as part of the query.) Second, we want to study the movement of virtual machines that belong to a specific application. We can achieve this by periodically issuing the query $\pi_{name, server}(VM \wedge \Rightarrow app_x)$.

We ran a series of performance tests on the search system, the details of which can be found in [10].

6. DISCUSSION

The contribution of this paper centers around a simple query language for search in networked systems. Queries are based on a model where objects are represented as a set of attribute-value pairs. We propose a method for distributed execution of search queries in a networked system in which nodes maintain objects that contain configuration and operational information. We argue why the proposed method provides the correct result for a network query. The use cases further motivate the paradigm of network search and demonstrate that the introduced language is useful. Our implementation gives evidence that the design can be implemented (even if the testbed is of limited size).

Similar to the case of web search, the simplicity of our query language has the drawback that, often, the information we are interested in cannot be expressed in a sufficiently precise manner, and, therefore, the query result needs interpretation. For instance, the query "search for servers that run processes of customer John" cannot be directly expressed in our query language, but only through attribute names, values, and object links, for example, through $server \wedge \Rightarrow name = {*}John$. This query returns objects with attribute name (or value) server that are linked to objects whose names end with John. The query result needs to be interpreted and, hopefully, contains the information we searched for in the first place.

An argument can be made that search queries can be implemented as specialized protocols in a networked system, and, therefore, a generic search system is not needed (see Section 5). However, we believe that a network search system enables new functionality to be added on demand, or allows network applications to dynamically adapt their information demand.

In future work, we plan to further develop the paradigm of network search. Here are some of our priorities. First, while the contribution in this paper focuses on database aspects of network search, network search includes concepts that go beyond database functionality, most importantly, approximate matching of attributes and ranking of search results (see Section 1). We envision, for instance, that ranking takes into account the freshness of the data, the locality of the query invocation, and the number of tokens in a query that match a particular object, and that the ranking process is realized as a distributed aggregation function.

Second, we plan on improving the scalability of distributed query processing for network search. While this paper describes a distributed method for query processing, each query still invokes an operation on every search node, which is expensive in a large system. We are considering several approaches to reduce the footprint of a query. Domain knowledge can be used to guide the search process and thus reduce the search space. Alternatively, an index structure can be developed to reduce the number of nodes that are involved in processing a query. Link queries require special attention, since each such query involves several executions of echo. Possible heuristics for reducing the overhead include restricting the search to those links with the number of intermediate objects below a given bound, and limiting the subsequent executions of echo to those nodes that produced a non-empty query result during the previous execution.

Additionally, work is needed on the development of efficient local databases, the population of local databases with data from available sources, the development of concepts regarding the privacy and security of local data, as well as a framework for search in a multi-domain environment.

REFERENCES

[1] Common information model. http://dmtf.org/standards/ci, April 2013.
[2] GDMO - Guidelines for Definition of Managed Objects. http://www.cellsoft.de/telecom/gdmo.htm, April 2013.
[3] libvirt 0.7.5 - Application Development Guide. http://libvirt.org/guide/html/, April 2013.
[4] MongoDB Manual. http://docs.mongodb.org/manual/, April 2013.
[5] Proc(5). http://man7.org/linux/man-pages/man5/proc.5.html, April 2013.
[6] M. Bjorklund. RFC 6020: YANG - A Data Modeling Language for the Network Configuration Protocol, October 2010.
[7] K.S. Lim and R. Stadler. A navigation pattern for scalable internet management. In Integrated Network Management Proceedings, 2001 IEEE/IFIP International Symposium on, pages 405-420. IEEE, 2001.
[8] K. McCloghrie, D. Perkins, and J. Schoenwaelder. RFC 2578: Structure of Management Information Version 2, April 1999.
[9] A. Segall. Distributed network protocols. IEEE Transactions on Information Theory, 29:23-35, 1983.
[10] A. Skinner. A system for googling operational data in clouds. Master's thesis, KTH The Royal Institute of Technology, December 2012.
[11] R. Stadler. Protocols for distributed management. Technical Report 2012:028, KTH, Communication Networks, 2012. QC 20120604.
[12] M. Uddin, A. Skinner, R. Stadler, and A. Clemm. Real-time search in clouds (demo), 2013. IM 2013.
[13] M. Uddin, R. Stadler, and A. Clemm. Management by network search. In IFIP/IEEE International Symposium on Network Operations and Management (NOMS 2012), Maui, Hawaii, April 16-20, 2012.
[14] M. Uddin, R. Stadler, and A. Clemm. A query language for network search. In IFIP/IEEE International Symposium on Integrated Network Management (IM 2013), Ghent, Belgium, May 27-31, 2013.
[15] F. Wuhib, R. Stadler, and H. Lindgren. Dynamic resource allocation with management objectives: Implementation for an OpenStack cloud. Technical Report 2012:021, KTH, Communication Networks, 2012. QC 20120528.