
ELSEVIER Chinese Astronomy and Astrophysics 38 (2014) 211–221

Design and Implementation of CNEOST Image Database Based on NoSQL System

WANG Xin¹,²
¹ Purple Mountain Observatory, Chinese Academy of Sciences, Nanjing 210008
² Key Laboratory of Space Object and Debris Observation, Purple Mountain Observatory,
Chinese Academy of Sciences, Nanjing 210008

Abstract The China Near Earth Object Survey Telescope is the largest Schmidt
telescope in China, and it has acquired more than 3 TB of astronomical image data
since first light in 2006. After the upgrade of the CCD camera in 2013, over
10 TB of data will be obtained every year. The management of these massive images
is not only an indispensable part of the data processing pipeline but also the basis
of data sharing. Based on a requirements analysis, an image management system
is designed and implemented using a non-relational database.
Key words: astronomical databases—surveys

1. REQUIREMENT ANALYSIS

The China Near Earth Object Survey Telescope (CNEOST), located in the Xuyi station
of Purple Mountain Observatory (PMO), is now the largest Schmidt telescope in China.
It is mainly dedicated to the survey of near Earth objects, in addition to other types of
observations to serve the research fields of celestial mechanics, astrometry and astrophysics.
Equipped with a 4k×4k CCD camera, the CNEOST has accumulated 3 TB of image data
since it was put into operation in 2006. After being upgraded to a 10k×10k CCD
camera in 2013, it will acquire more than 10 TB of image data every year (ideally
nearly 20 TB each year).
Nowadays, the data obtained by large astronomical instruments are mainly managed
by database technology. Thus for this telescope, the primary need is to realize the storage
and management of its massive image data. The astronomical image data are not only
large in volume but also grow linearly with time, at a rate of nearly 20 TB each year.
Upgrading the system simply by enlarging a single storage device is not only very
expensive but also impractical, because the physical limit of the device will be reached
quite soon. Therefore, both the storage device and the storage method should be readily
expandable, so that the storage capacity can grow continuously without migrating the
existing data.

† Supported by National Natural Science Foundation of China
Received 2012–09–12; revised version 2012–10–19
A translation of Acta Astron. Sin. Vol. 54, No. 4, pp. 382–391, 2013
wangxin@pmo.ac.cn
0275-1062/14/$ - see front matter © 2014 Elsevier B.V. All rights reserved.
doi:10.1016/j.chinastron.2014.04.008

Since the images are huge in both number and total volume, users cannot download all
the data; instead they select only the images they need through a retrieval system.
The CNEOST stores its images in the FITS file format, which is widely used by
astronomical instruments. In a FITS file, important information about the image is
kept as key-value pairs in the header of the header data unit (HDU; in this paper,
"HDU" refers to this header section). These key-value pairs are not only necessary
parameters for the follow-up processing, but also the main clues for image retrieval.
In this sense, how to use the key-value pairs in the FITS files to manage and query
the image data is the key to the usability of the database.
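To make the idea of header key-values concrete, the following Python sketch parses a few 80-character-style header cards into a dictionary. The sample cards, their values and the helper name parse_cards are illustrative assumptions, not taken from actual CNEOST files, and the parser deliberately ignores FITS details such as CONTINUE cards and quotes escaped inside strings.

```python
def parse_cards(cards):
    """Parse simple FITS header cards ('KEY = value / comment') into a dict.

    Simplified sketch: handles strings, logicals, integers and floats only.
    """
    header = {}
    for card in cards:
        key = card[:8].strip()
        if key == "END":
            break
        if card[8:10] != "= ":          # COMMENT, HISTORY or blank cards
            continue
        raw = card[10:]
        if raw.lstrip().startswith("'"):
            # String value: text between the first pair of single quotes.
            value = raw.split("'")[1].rstrip()
        else:
            token = raw.split("/")[0].strip()   # drop the trailing comment
            if token in ("T", "F"):
                value = token == "T"
            else:
                try:
                    value = int(token)
                except ValueError:
                    value = float(token)
        header[key] = value
    return header


# Hypothetical cards; real CNEOST keywords and values may differ.
sample = [
    "SIMPLE  =                    T / conforms to FITS standard",
    "NAXIS1  =                 4096 / length of axis 1",
    "EXPTIME =                 40.0 / exposure time in seconds",
    "FILTER  = 'R       '           / filter name",
    "END",
]
header = parse_cards(sample)
```

The resulting dictionary is the kind of key-value set that the rest of the paper stores and queries.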
The diverse observations performed with the CNEOST for different scientific objectives
bring differences in how data are collected and which attributes are recorded. For
example, in survey observations the telescope is operated in a mode in which it follows
the rotation of the Earth, and only the coordinates of the observed sky area are recorded
in the FITS file. For moving targets, the telescope tracks the apparent trajectory of
the target, so the designation of the target, the velocity of the telescope at exposure
time, etc. should be included in the HDU of the FITS file. The telescope is kept completely
motionless when it observes a geosynchronous satellite, and a corresponding mark should
also be included in the FITS file. All this information is necessary for the subsequent
image processing. Meanwhile, with upgrades of the telescope system and extensions of the
data acquisition terminal, the number of recorded image attributes increases continuously.
For instance, when new filters were installed on the CNEOST in 2008, new key-values
were added to the FITS file to indicate the filter. In addition to the raw image data,
the CNEOST team has begun to release preprocessed images, so the preprocessing should
also be marked in the FITS file. All these factors cause the set of image attributes
to change continuously, and in response the management of the image attribute data
must be flexible.
Generally, an astronomical database holds a huge volume of data but has relatively few
users, so there is no high demand for concurrent access. Moreover, as an optical
telescope the CNEOST acquires images only at night, so there is no special requirement
for concurrent reading and writing either. In terms of concurrency, any of the existing
mainstream database management systems can meet the requirements of the CNEOST.
In this paper, we design and implement such an image database system for the CNEOST.

2. ARCHITECTURE DESIGN
2.1 A Brief Review of Current Database Architectures
In the long history of astronomical databases, three architectures have been commonly
used: the data file system, the database system, and the combination of both.
In a data file system, the attributes of the observational data, such as the observation
time, pointing and target, are encoded in file names of fixed length and format. The
data are then classified and stored according to file name and path, and external access
is generally provided by FTP. This long-established kind of data system is easy to
construct and widely used. Even nowadays, many astronomical data are released in this
manner. In particular, for observational data stored in text format, where the number
of data attributes is limited and relatively fixed, this architecture is still a very
good choice. Formerly, the CNEOST also adopted the data file system to manage its
image data. In recent years, however, the shortcomings of this architecture have been
exposed as the requirements on the CNEOST image data become more and more complicated.
First, only a few attributes can be handled in such a system, and strictly speaking
these attributes cannot be used to retrieve images, so searching and managing a large
number of images is tedious and time-consuming. Secondly, the total number of files
is limited by the capacity of the file system, so the system is not extensible. After
the telescope is upgraded to the new CCD camera, the capacity of a single machine can
hardly meet the needs of sustained management. Finally, data backup and mirroring
have to be handled manually.
To overcome these shortcomings, the idea of using database technology to manage
astronomical data arose naturally, especially with the development of the relational
(SQL) database. With a database system, many attributes of the data files can be managed,
and different retrieval functions can be realized easily through the powerful query
capability of the relational database. Meanwhile, management tasks such as data backup
and mirroring can be carried out automatically. The China Optical Surveillance Network
for Space Targets and Debris of the Chinese Academy of Sciences has adopted such a
database system to manage its orbit determination data, storing the files in the database
as large object (LOB) fields. However, as the number of files increases, the retrieval
efficiency declines sharply, and sometimes the database even fails to support normal
usage. If this method were applied to the massive image data from the telescope, the
performance would degrade even more severely, because a single image file is larger and
the total amount of data is huge. To make matters worse, a relational database is
difficult to scale out in parallel, i.e. its capacity cannot be extended indefinitely
simply by adding new hardware. Thus, it can hardly accommodate a yearly increase of 10 TB.
To solve the problems of a pure database system mentioned above, the idea of combining
the database system with the file system emerged. The attribute information used for
retrieval is managed by a database system, while the image files themselves are handled
by the file system, with only their paths recorded in the database. In this manner,
the retrieval convenience of the database system is kept, while the adverse effect
of LOB fields is avoided. This architecture has been adopted by the Guo Shoujing
telescope (LAMOST)[1] and the millimeter-wave radio telescope of PMO¹. If a distributed
file system (DFS) is used to store the massive data, the capacity of the data system
can be extended linearly simply by adding more storage servers. In short, this is
a relatively sound architecture that meets the requirements of retrieval, performance
and extensibility. Its only weakness is a high demand on administration: a database
system and a parallel file system must be maintained independently.

1 http://www.radioast.csdb.cn/sjgxdt.php


2.2 Selections of Database System and of File System
According to the above discussion, a reasonable architecture for a data management
system that fulfills the requirements of the CNEOST is the combination of a database
system and a parallel file system. We discuss the selection of the database management
system and of the file system in turn.
Compared with the image files, the image attribute data are much smaller. Whether
in terms of record count, storage capacity or concurrent read/write requests, their
requirements can be met easily by any mainstream database management system. Thus,
the selection should mainly consider the flexibility and extensibility of attribute
management.
Because of the restriction of the table structure, the fields of a table in a relational
database are fixed and very difficult to change. However, the image attributes from
the CNEOST vary constantly, and the attributes of two images may differ. To manage
such varying attribute information, a special table structure was proposed in Reference
[2]: each attribute is stored as a single record, with both the key and the value stored
as fields. Although this method solves the table-structure problem for varying attributes,
it produces multiple attribute records for a single image, and hence adds extra cost
to the database system, because a sub-query or join must be used whenever a complex
query is performed.
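The cost of the one-record-per-attribute layout can be seen in a small Python sketch. The image identifiers, attribute names and values below are made up for illustration; the point is only that a query on two attributes must first regroup the rows by image (the job of a join or sub-query in SQL), whereas a record holding all attributes of one image can be tested directly.

```python
from collections import defaultdict

# One record per attribute: (image_id, key, value). Values are hypothetical.
rows = [
    (1, "FILTER", "R"), (1, "EXPTIME", 40.0),
    (2, "FILTER", "V"), (2, "EXPTIME", 40.0),
    (3, "FILTER", "R"), (3, "EXPTIME", 10.0), (3, "OBJECT", "2012 AB"),
]


def find_in_rows(rows, wanted):
    """Return ids of images matching all (key, value) pairs in `wanted`."""
    grouped = defaultdict(dict)
    for image_id, key, value in rows:      # regroup rows: the "join" step
        grouped[image_id][key] = value
    return sorted(i for i, attrs in grouped.items()
                  if all(attrs.get(k) == v for k, v in wanted.items()))


# The same data with one record per image; attributes may differ per record.
documents = [
    {"id": 1, "FILTER": "R", "EXPTIME": 40.0},
    {"id": 2, "FILTER": "V", "EXPTIME": 40.0},
    {"id": 3, "FILTER": "R", "EXPTIME": 10.0, "OBJECT": "2012 AB"},
]


def find_in_documents(documents, wanted):
    """Return ids of records matching all (key, value) pairs in `wanted`."""
    return sorted(d["id"] for d in documents
                  if all(d.get(k) == v for k, v in wanted.items()))


wanted = {"FILTER": "R", "EXPTIME": 40.0}
matches_rows = find_in_rows(rows, wanted)
matches_docs = find_in_documents(documents, wanted)
```

Both functions return the same image, but only the first needs the regrouping pass over all attribute rows.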
In this paper, we investigate how to manage the image attribute database.
Since 2011, non-relational (NoSQL) databases have become well known. As a complement
to the SQL database, the NoSQL database has attractive properties. It abandons the
relational model (hence the name NoSQL) and adopts a schemaless data structure: there
is no need to define the table structure (fields), and different records in the same
table need not share the same structure. This property matches the nature of the image
attribute information very well. After some comparisons, we chose the MongoDB database²,
for the following reasons: (1) it uses the schemaless document as its data structure;
(2) it is mature among the various NoSQL databases and has relatively good development
interfaces; (3) it is currently the non-relational database closest to the relational
database, and has a powerful query function.
Besides, MongoDB has a built-in file storage mechanism, GridFS. We therefore do not
need to set up a separate DFS, which removes the problem of managing two different
systems: managing the database is at the same time managing the file system. Finally,
like most NoSQL systems, MongoDB has built-in automatic partitioning and replication
mechanisms, which enable parallel extension of the database. Simply by adding servers,
both capacity and throughput can be upgraded.
2.3 MongoDB and GridFS[3]
MongoDB is a powerful, flexible and extensible data storage system. It is not a relational
database, but a document-oriented non-relational database. The document is the elementary
unit in MongoDB, like a row in an SQL database. But, free from the restrictions of the
relational model, a document can be much more complex than a row.

2 http://www.mongodb.org

An ordered set of keys with their associated values constitutes a document, and
multiple documents together make up a collection. A collection resembles a table in
a relational database, but without the restriction of a schema. MongoDB is also simple
and convenient to administer: as far as possible, the server configures itself
automatically. Many complicated functions, such as grouping and partitioning, become
very simple under the configuration management of MongoDB.
GridFS is a mechanism for storing large binary files in MongoDB. A successful deployment
of the MongoDB database also means a successful deployment of the GridFS distributed
file system, so no other independent file storage is needed. GridFS directly uses the
replication and partitioning mechanisms of the database system, so parallel extension
and failure recovery are easy. In addition, because the files are stored in a database,
GridFS avoids some common problems of traditional file systems, such as disk fragmentation.
2.4 Architecture Design
After considering the characteristics of MongoDB and the requirements of the CNEOST,
we adopt the sharding cluster as the database architecture. Here “sharding” is the process
of distributing data among multiple servers, also known as “partitioning” in some other
database systems. Since the data are distributed among different servers, more data
and a larger load can be handled by increasing the number of servers. With the automatic
sharding of MongoDB, the capacity of the database is conveniently enlarged by adding
more servers to the system. The architecture of the sharding cluster is shown in Fig. 1.

Fig. 1 System architecture

In such a system, the data service is provided by a cluster of multiple sharding
nodes, and all the data are evenly distributed over the nodes. The multiple configuration
services back each other up, each keeping the configuration information of the cluster.
Multiple route services can also be deployed, each serving as an access entry through
which users reach the database via the Web. A route service does not store any data;
it only provides unified access to the database and forwards each request to the
appropriate data shard. A configuration service only stores the cluster configuration
information and carries almost no workload. Data management is the responsibility of
the sharding nodes.

3. DATABASE DESIGN

3.1 Management of Image Data


3.1.1 Document Structure
The image data are stored by means of GridFS, a storage specification built on top of
MongoDB whose data are kept in ordinary collections. The basic idea of GridFS is to
divide a big file into many blocks (or chunks) and to store each block as a single
document. The GridFS specification uses two collections to store and manage files:
the collection fs.files holds the basic file information, and the collection fs.chunks
stores the file chunks. Since there is no schema restriction, the image attribute
information can be added to the documents of fs.files, while fs.chunks is left in its
default form and maintained entirely by the system. The structure of the extended
collection fs.files is shown in Table 1.

Table 1 Structure of the collection fs.files


Field       Type      Description
_id         ObjectId  System default ID of the document *
length      Long      File size in bytes *
chunkSize   Long      Chunk size in bytes *
md5         String    MD5 of the file *
uploadDate  Date      Timestamp of storage in the database *
filename    String    Filename *
header      String    HDU of the FITS file
start       Long      Exposure start time
exp         Long      Exposure time
pointing    Document  Image center
ax1         Long      Dimension of axis X
ax2         Long      Dimension of axis Y
band        String    Filter type
type        String    Dark, bias, flat or light
hdu         Document  Key-value pairs in HDU

In Table 1, the keys defined by the GridFS specification are marked with asterisks.
These keys, created and filled in automatically by the system, contain the same kind
of information as an ordinary file system. The other, added keys store the image
attributes. For the convenience of time-related comparisons, the exposure start time
is recorded as a Unix timestamp. The HDU is stored redundantly: its full character
string is kept in the header key, so that the original comments and formatting are
preserved, while its key-value pairs are combined into an embedded document stored
as the value of the hdu key, which allows queries by key name or key value.
For astronomical images, all the information about the image attributes is recorded in
the HDU, which contains much more information than is used for retrieval. For the most
frequently used attributes, which are relatively invariant in astronomical images, a
redundant design is adopted: they are kept not only in the embedded document of the hdu
key, but also as separate keys of the document. Currently, these attributes include the
exposure start time, exposure duration, pointing of the image center, image size, filter
and image type. Thanks to the schemaless design, these frequently used attributes can
be redefined, added or removed as the operation of the equipment changes.
Because the documents are schemaless, there are many ways to display large amounts of
such data, but none is as user-friendly as a two-dimensional table. The query results
are therefore displayed in a two-dimensional table made up of the frequently used
attributes. The redundant design also greatly improves the retrieval efficiency,
because there is no need to search the embedded hdu document.
For the convenience of subsequent indexing, the pointing of the image center is also
recorded as an embedded document with only two keys, alpha and delta, corresponding
to the right ascension and declination.
With the document structure described above, the management of the image attribute
information follows a more natural logic. One image corresponds to one document, and
an attribute key and its value correspond to a key and its value in the document; it
is no longer necessary to store both the key and the value as values. Meanwhile,
since the documents in a collection may have different structures, the document
structure can be adjusted at any time to meet new requirements.
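As an illustration of the structure described above, the Python sketch below assembles the extended part of one fs.files document from a parsed header. The header keywords (DATE-OBS, EXPTIME, RA, DEC, NAXIS1, NAXIS2, FILTER, IMAGETYP), the sample values and the function name build_attributes are assumptions made for the example; the actual CNEOST keywords are not given in the text. The GridFS-maintained keys marked with asterisks in Table 1 are omitted because the system fills them in.

```python
import calendar
import time


def build_attributes(header_text, hdu):
    """Build the extended fs.files keys for one image from its parsed header."""
    # Exposure start as a Unix timestamp (DATE-OBS assumed to be UTC).
    start = calendar.timegm(time.strptime(hdu["DATE-OBS"], "%Y-%m-%dT%H:%M:%S"))
    return {
        "header": header_text,                 # full header kept verbatim
        "start": start,
        "exp": hdu["EXPTIME"],
        "pointing": {"alpha": hdu["RA"], "delta": hdu["DEC"]},
        "ax1": hdu["NAXIS1"],
        "ax2": hdu["NAXIS2"],
        "band": hdu["FILTER"],
        "type": hdu["IMAGETYP"],
        "hdu": dict(hdu),                      # every key-value pair, queryable
    }


# Hypothetical parsed header of one image.
hdu = {
    "DATE-OBS": "2013-01-01T00:00:00", "EXPTIME": 40, "RA": 150.25,
    "DEC": 12.5, "NAXIS1": 4096, "NAXIS2": 4096, "FILTER": "R",
    "IMAGETYP": "light",
}
doc = build_attributes("(raw header text)", hdu)
```

A second image with extra or missing header keys would simply yield a document with a different hdu sub-document, which is the flexibility the schemaless design is meant to provide.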
3.1.2 Sharding Design
According to the storage mechanism of GridFS, the collection fs.files contains only
the attribute information, and the massive image data are stored in the collection
fs.chunks. A query on a sharded collection generally needs to access every shard, which
causes extra cost and is not economical for small collections. Therefore, we shard only
the collection fs.chunks and distribute it over the sharding nodes, while the
collection fs.files is kept on a single sharding node.
To shard a collection, a specific key of its documents must be assigned as the shard
key, which serves as the sharding criterion. In the collection fs.chunks, the key
files_id, whose value corresponds to the _id in the collection fs.files, identifies
the file to which the block belongs. All blocks of one file thus have the same
files_id. If this key is chosen as the shard key, all the blocks of one file stay on
the same sharding node when fs.chunks is sharded. The retrieval efficiency is thereby
improved, because no hop between nodes is needed when fetching an image from the database.
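The effect of choosing files_id as the shard key can be illustrated with a toy model in Python. The sketch splits a file into GridFS-style chunk documents and assigns each chunk to a node from its files_id alone. The modulo placement rule, the tiny chunk size and the integer ids are simplifying assumptions; MongoDB assigns ranges of shard-key values to shards rather than using a modulo, but in both cases chunks sharing one files_id land together.

```python
def split_into_chunks(files_id, data, chunk_size):
    """Split `data` into chunk documents with keys files_id, n and data."""
    return [
        {"files_id": files_id, "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]


def node_for(chunk, node_count):
    """Toy placement rule: the node depends only on the shard key value."""
    return chunk["files_id"] % node_count


# Two hypothetical files with integer ids, split with a tiny chunk size.
chunks = (split_into_chunks(7, b"a" * 10, chunk_size=4)
          + split_into_chunks(8, b"b" * 6, chunk_size=4))

# Record which nodes hold chunks of each file.
placement = {}
for chunk in chunks:
    placement.setdefault(chunk["files_id"], set()).add(node_for(chunk, 3))

# Reassemble file 7 by ordering its chunks on n.
restored = b"".join(c["data"] for c in sorted(
    (c for c in chunks if c["files_id"] == 7), key=lambda c: c["n"]))
```

Each file maps to exactly one node in this model, so reading an image back touches a single node.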
3.1.3 Inquiry and Index Design
Querying is the most fundamental function of a database and also its most common
operation, and indexes are used to accelerate it. As in most astronomical image
databases, the most popular query criteria for the CNEOST image database are the time
range and the sky area of the observation. To optimize such queries, the related keys
are indexed. Three indexes are created: an index on the start key, an index on the
pointing key, and a combined index on both keys. A descending index is applied to the
start key, because the latest images in the database are generally queried more often.
To support spherical queries, the 2d index, a spatial indexing scheme provided by
MongoDB, is applied to the pointing key. With the 2d index, spherical queries on
two-dimensional coordinates can be performed.
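A time-range and sky-area query against such documents might be assembled as in the following Python sketch, which only builds plain dictionaries in MongoDB's query syntax and does not contact a server. The index specifications are written as lists of (key, direction or type) pairs, the form that drivers such as PyMongo accept. The function name, the key order of the combined index and the use of the $within and $centerSphere operators of the MongoDB 2.x series are assumptions, since the text does not give the actual query code.

```python
import math

# Index specifications as (key, direction/type) pairs; the order of keys in
# the combined index is an assumption.
INDEXES = [
    [("start", -1)],                      # descending: newest images first
    [("pointing", "2d")],                 # MongoDB 2d geospatial index
    [("pointing", "2d"), ("start", -1)],  # combined index
]


def build_query(t_begin, t_end, alpha, delta, radius_deg):
    """Build a query dict for a time range plus a circular sky area.

    Times are Unix timestamps; alpha, delta and radius_deg are in degrees and
    must follow the same convention used when `pointing` was stored.
    """
    return {
        "start": {"$gte": t_begin, "$lte": t_end},
        "pointing": {
            "$within": {
                # $centerSphere takes [center, radius in radians].
                "$centerSphere": [[alpha, delta], math.radians(radius_deg)]
            }
        },
    }


query = build_query(1356998400, 1357084800, 150.25, 12.5, 1.0)
```

With a driver, each entry of INDEXES would be passed to the collection's index-creation call and the query dictionary to its find method.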
Because all the image attributes are stored as key-value pairs under the hdu key,
queries on any attribute can be realized. However, the observations made by the CNEOST
are diverse and the users of the database come from a wide range of fields, so the
names and the number of the queried attributes may differ. For example, users interested
in photometric information may consider the filter first, while those who need only
positional information will simply ignore it. Users doing astronomical research generally
prefer preprocessed images, whereas users studying image processing techniques need
the raw images. Traditionally, a database offers users a query form for retrieval. Such
a form works well when the attributes suitable for retrieval are few and relatively
fixed, but it is not suitable for the image retrieval of the CNEOST.
According to the properties of the images, the most frequently used attributes, i.e.
the observation time and sky area, are offered to users as input fields in the traditional
form. The observation time is expressed as a time interval, and the sky area is defined
by its center coordinates and radius. For the other attributes, in order to let users
describe query conditions with any key or combination of keys, we define the query
expression in the form 'KEY operator VALUE' and allow multiple query conditions to
be given at the same time. Multiple conditions are combined as their logical sum.
Here KEY is the name of any attribute, and the operators are defined in Table 2.

Table 2 Description of different operators


Operator  Description for string value  Description for numerical value
>         Greater than                  Greater than
>=        Greater than or equal to      Greater than or equal to
<         Less than                     Less than
<=        Less than or equal to         Less than or equal to
==        Equals                        Equals
=         Like                          Equals
<>        Not equals                    Not equals
!=        Not like                      Not equals

For string and numerical values, the meaning of the same operator may differ slightly.
For the operators = and !=, the query supports regular expressions for string values.
Two further query expressions can be used, namely KEY and !KEY, which test for the
existence and nonexistence of the key. Through a text area in the query form, users
can freely enter query conditions on different keys and their combinations.
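One way to turn the expressions defined above into MongoDB query conditions is sketched below in Python. It builds plain dictionaries only. The mapping of each operator to a MongoDB operator, the "hdu." prefix on key names, the rule that a value is numeric when it parses as a number, and the reading of "logical sum" as OR are all assumptions; the text does not describe the actual implementation.

```python
import re

# Longest operators first so that '>=' is not read as '>' followed by '='.
_PATTERN = re.compile(r"^\s*(\w[\w-]*)\s*(>=|<=|==|<>|!=|>|<|=)\s*(.+?)\s*$")
_COMPARE = {">": "$gt", ">=": "$gte", "<": "$lt", "<=": "$lte"}


def _number(text):
    """Return text as int or float if possible, else None."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return None


def parse_condition(expr):
    """Translate 'KEY op VALUE', 'KEY' or '!KEY' into a query dict."""
    match = _PATTERN.match(expr)
    if match is None:                       # bare KEY or !KEY: existence test
        name = expr.strip()
        exists = not name.startswith("!")
        return {"hdu." + name.lstrip("!"): {"$exists": exists}}
    key, op, text = match.groups()
    field = "hdu." + key
    number = _number(text)
    value = text if number is None else number
    if op in _COMPARE:
        return {field: {_COMPARE[op]: value}}
    if op == "==" or (op == "=" and number is not None):
        return {field: value}
    if op == "<>" or (op == "!=" and number is not None):
        return {field: {"$ne": value}}
    regex = re.compile(text)                # '=' / '!=' on strings: like / not like
    return {field: regex} if op == "=" else {field: {"$not": regex}}


def parse_query(lines, combine="$or"):
    """Combine one condition per non-empty line with the given operator."""
    conditions = [parse_condition(line) for line in lines if line.strip()]
    if not conditions:
        return {}
    return conditions[0] if len(conditions) == 1 else {combine: conditions}
```

For example, parse_condition("EXPTIME >= 30") yields a $gte condition on hdu.EXPTIME, and passing combine="$and" to parse_query would implement the other possible reading of how conditions are joined.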
3.2 Thumbnail Management
FITS files are not directly supported by HTML, so thumbnail previews of the FITS
images are provided for the convenience of image selection. A thumbnail is created
dynamically and automatically when a preview request is received. To avoid creating
the thumbnail of one image repeatedly, the thumbnails are managed in the database:
a collection named thumb holds all the created thumbnails. If the requested thumbnail
already exists, it is read directly from the database instead of being created again.
There are three keys in the thumb collection. Besides the _id key created by the
system, two keys are defined. The key files_id, corresponding to the value of the _id
key in the collection fs.files, indicates the original image file of the thumbnail.
The key jpg stores the thumbnail in JPEG format.
As the system is used, a large number of thumbnails will be produced from the huge
number of images. But the thumbnails are only for quick viewing and are useless in
the follow-up research. On the one hand, the accumulation of thumbnails wastes much
space and places a heavy burden on the database system; on the other hand, regular
cleaning would be a burden on system management. We therefore make the collection thumb
a capped collection, a feature of MongoDB. A capped collection has a fixed size; when
the collection exceeds the defined capacity, the system automatically ages out the
oldest documents in insertion order. With the capped collection, the management effort
for the collection thumb is reduced to nearly zero: the system deletes the oldest
thumbnails to stay within the capacity and keeps only the most recently created ones.
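The behaviour relied on here, creating a thumbnail once and letting the oldest entries drop out when a size limit is reached, can be mimicked in a few lines of Python. The class below is an in-memory stand-in for the thumb collection, not MongoDB code; the class name, the byte limit and the render callback are hypothetical, and entries are discarded strictly in insertion order, which is how MongoDB capped collections age out documents.

```python
from collections import OrderedDict


class ThumbCache:
    """In-memory stand-in for a size-capped thumbnail collection."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.items = OrderedDict()          # files_id -> JPEG bytes, oldest first

    def get_or_create(self, files_id, render):
        """Return the stored thumbnail, rendering and storing it if missing."""
        if files_id in self.items:
            return self.items[files_id]
        jpg = render(files_id)
        self.items[files_id] = jpg
        self.used += len(jpg)
        while self.used > self.max_bytes and len(self.items) > 1:
            _, oldest = self.items.popitem(last=False)   # drop oldest insert
            self.used -= len(oldest)
        return jpg


calls = []


def fake_render(files_id):
    """Hypothetical renderer returning 4 placeholder bytes per thumbnail."""
    calls.append(files_id)
    return b"\xff\xd8\xff\xd9"


cache = ThumbCache(max_bytes=8)
for files_id in (1, 2, 1, 3):
    cache.get_or_create(files_id, fake_render)
```

In this run the second request for image 1 is served from the cache, and inserting image 3 pushes image 1 out because it was inserted first.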

4. IMPLEMENTATION AND PROSPECT OF THE SYSTEM

Based on the design described above, we have implemented a preliminary database system
for the CNEOST using a sharding cluster of 3 nodes. Although the official MongoDB
documentation does not recommend running multiple sharding nodes on one physical server
in a production environment, we still chose to run the multiple services on one physical
machine, because the access load of the CNEOST image database is not high, while the
performance of present-day servers is quite high.
The server is equipped with dual Intel Xeon E5620 CPUs and 64 GB of memory. The operating
system is 64-bit CentOS (version 6.2) and the database is MongoDB (version 2.2). The
database storage is a RAID5 disk array with a hot spare, connected to the server through
optical fibers. Once mounted, the disk array provides a storage capacity of about 65 TB.
The logical volume manager (LVM) and the XFS file system are applied to the storage,
which is partitioned into 3 logical volumes holding the data of the 3 sharding nodes
respectively. For the convenience of annual upgrades, the size of each node is set to
20 TB, nearly the maximal size of the annual data of the CNEOST. The architecture is
consistent with Fig. 1. The whole database system consists of 7 services: 3 configuration
services, 3 sharding node services and 1 route service.
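As a rough check of the sizing above, the short Python calculation below estimates how long three 20 TB volumes last at the annual data rates quoted in Section 1. It ignores the existing 3 TB archive, the thumbnails and any database overhead, which is a simplifying assumption.

```python
node_count = 3
node_size_tb = 20
total_tb = node_count * node_size_tb            # capacity assigned to the nodes

# Annual data rates quoted in the text: more than 10 TB, ideally near 20 TB.
years_at_low_rate = total_tb / 10
years_at_high_rate = total_tb / 20
```

Under these assumptions the three volumes hold roughly three to six years of data, with each node sized to about one year of output at the higher rate.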
The user interface is built with the PHP language and the ExtJS framework³. The interface,
as shown in Fig. 2, is divided into two panels. In the left panel, the main attributes of
the query results are displayed in a grid. To inspect an image in detail, an operation
column at the end of each row offers the operations of viewing the FITS HDU, viewing the
thumbnail and downloading the image. The right panel is the area for entering query
conditions, including the text boxes for the observation time and observed sky area,
and the text area for query conditions on multiple attributes. Test runs and preliminary
use indicate that the system correctly realizes the designed functions and performance.
At present, high availability has not been taken into account in the preliminary database
system; only a log is kept to shorten failure recovery of the database. Fortunately,
the CNEOST database has not yet faced a strict reliability requirement. Of course, as the
usage and the amount of data increase in the future, high reliability will become more
and more important. For this reason, we plan to expand each sharding data node into a
cluster in which multiple nodes back each other up, with all these clusters together
constituting a bigger sharding cluster. A scheme of this plan is shown in Fig. 3.
3 http://www.sencha.com/products/extjs/

Fig. 2 User interface for inquiry

Fig. 3 Architecture in Planning


In Fig. 3, the solid-line frames indicate services running on the same physical machine,
and the dashed-line frames indicate the three sharding nodes that back each other up.
Such an arrangement is called a “replica set” in MongoDB, and it also supports automatic
configuration.

5. CONCLUSION AND DISCUSSION

By using a non-relational database system, an image database system for the CNEOST has
been constructed. It fulfils the design goals of massive image storage and management,
adjustable attribute management and flexible querying. Thanks to MongoDB, a
document-oriented non-relational database system, many of these features were easy
to realize. Because MongoDB is highly extensible and also quite powerful in automatic
configuration, the database system can be easily extended and adjusted if the amount
of data increases significantly or the requirements change with time.
The non-relational database has many merits, and its high performance under huge
amounts of data has attracted a lot of attention in the Internet industry. The work
presented in this paper shows that the schemaless character and extensibility of the
non-relational database match the characteristics of astronomical image data very well.
Hence, it is a very suitable database for the management of astronomical image data.
Compared with image data, other types of astronomical data are much smaller, so it is
feasible to apply the non-relational database to the unified management of all
astronomical data. Facing the management of next-generation massive scientific data,
and particularly motivated by the need to manage the 100 PB of data of the Large Synoptic
Survey Telescope (LSST), the SciDB project was started in 2008 to build a database
for scientific data management and analysis based on a non-relational database model⁴.
With the great increase of astronomical data and the development of the non-relational
database, we believe that more and more non-relational databases will be applied to
astronomical data management. This promises to improve management efficiency, reduce
the maintenance burden, and provide better services to astronomical research.

ACKNOWLEDGEMENT Thanks to Prof. Ma Yuehua from PMO for her support. The
author is also grateful to Prof. Zhao Haibin, Li Bin, Lu Hao and Xia Yan for their kind
help.

References

1 Li H. X., Sang J., Wang S., et al., Astronomical Research & Technology, 2006, 3, 56
2 Fan D. W., Cui C. Z., Zhao Y. H., Astronomical Research & Technology, 2011, 8, 306
3 Chodorow K., Dirolf M., MongoDB: The Definitive Guide, Cheng X. F. translated, Beijing: Posts &
Telecom Press, 2011, 99-101

4 http://www.scidb.org