
DSpace

Import, Export & Backup


Mukesh A Pund
Scientist
NISCAIR

Backup Vs. Export/Import

Backup is meant for guarding the data against disk crash, virus attack, hacking or any other calamity

Export/Import is meant for exchange of digital objects across repositories

Hardware Required for Backup

Any one of the following

CD-ROM/DVD-ROM

DAT Drive

External Hard disk

Another system on the LAN

DSpace Directory Structure

/dspace/assetstore (bitstreams, most important)

/dspace/bin (commands to be used at the command line; can always be regenerated from the dspace-source files)

/dspace/config (you might have customized it; a one-time backup is good enough)

/dspace/handle-server

DSpace Directory Structure

/dspace/lib (can always be regenerated from dspace-source)

/dspace/logs (essential for generating statistical reports and for bug tracing)

/dspace/reports (can be regenerated from the log files)

/dspace/search (can be regenerated using the index-all command)

Where DSpace stores data

The /dspace/assetstore directory holds all the bitstreams and licenses

The PostgreSQL database contains information on

Communities

Collections

E-groups

E-persons and their passwords

A host of other information

What should be Backed up

Your DSpace PostgreSQL database

/dspace/assetstore (minimum backup)

/dspace (entire directory)

Creating Backup Directory

Create one directory where the backup files will be stored

E.g.

# mkdir /dspacebkp

# chmod 777 /dspacebkp

tar Command (compress)

To back up the /dspace directory

$ tar zcvf /dspacebkp/dspace150508.tar.gz /dspace

To back up only /dspace/assetstore

$ tar zcvf /dspacebkp/asset150508.tar.gz /dspace/assetstore
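
The file names above carry the date by hand (150508). A minimal sketch of how the date stamp and a readability check could be automated is given here; backup_dir is a hypothetical helper name, not a DSpace command, and the ddmmyy stamp simply mirrors the naming used on this slide.

```shell
# Sketch: archive a directory into a date-stamped tar.gz and verify it.
# backup_dir is a hypothetical helper name; it is not part of DSpace.
backup_dir() {
    src=$1        # directory to archive, e.g. /dspace
    dest=$2       # backup directory, e.g. /dspacebkp
    label=$3      # file name prefix, e.g. dspace
    stamp=$(date +%d%m%y)                 # ddmmyy, e.g. 150508
    archive="$dest/$label$stamp.tar.gz"

    # -C changes to the parent directory so the archive stores relative paths.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1

    # Listing the archive confirms that it can be read back.
    tar -tzf "$archive" > /dev/null || return 1

    # Print the archive path so the caller can log or copy it.
    echo "$archive"
}
```

It might be called as backup_dir /dspace /dspacebkp dspace, or as backup_dir /dspace/assetstore /dspacebkp asset for the minimum backup.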

Untar (uncompress)

To untar and unzip the tar.gz file, you may use the following command

$ tar zxvf /dspacebkp/dspace150508.tar.gz

WARNING: The safer approach is to run the above command in a temporary directory and copy the result to the dspace directory only after the file has been untarred successfully
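
The safer approach described in the warning can be sketched roughly as follows. safe_restore is a hypothetical helper name, and the scratch directory comes from mktemp; this is one possible way to do it, not a prescribed DSpace procedure.

```shell
# Sketch: extract into a scratch directory first, then copy into place.
# safe_restore is a hypothetical helper name.
safe_restore() {
    archive=$1    # e.g. /dspacebkp/dspace150508.tar.gz
    target=$2     # directory that receives the extracted tree
    scratch=$(mktemp -d) || return 1

    # If extraction fails, the target directory is never touched.
    if ! tar -xzf "$archive" -C "$scratch"; then
        rm -rf "$scratch"
        return 1
    fi

    # Copy the extracted tree, preserving permissions and timestamps.
    mkdir -p "$target" && cp -Rp "$scratch"/. "$target"/
    status=$?
    rm -rf "$scratch"
    return $status
}
```

Because extraction happens in the scratch directory, a truncated or corrupted archive leaves the existing files alone.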

Backup of database

The following commands are for PostgreSQL database backup

Run pg_dump as the dspace user

Ex:

$ su - dspace

$ /usr/local/pgsql/bin/pg_dump dspace > /dspacebkp/dspace_db_150508

Backup of database

Where

dspace is the name of the database

/dspacebkp/dspace_db_150508 is the backup file in which all the table definitions and contents will be stored
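
A minimal sketch of the same dump with the date filled in automatically is given here. dump_database and the PG_DUMP variable are hypothetical names introduced for illustration; PG_DUMP defaults to whichever pg_dump is on the PATH and may be set to /usr/local/pgsql/bin/pg_dump as in the example above.

```shell
# Sketch: dump a PostgreSQL database into a date-stamped file.
# dump_database and PG_DUMP are hypothetical names, not DSpace settings.
PG_DUMP=${PG_DUMP:-pg_dump}

dump_database() {
    db=$1         # database name, e.g. dspace
    dest=$2       # backup directory, e.g. /dspacebkp
    out="$dest/${db}_db_$(date +%d%m%y)"

    # Write to a temporary name so a failed dump never looks complete.
    if "$PG_DUMP" "$db" > "$out.partial"; then
        mv "$out.partial" "$out" && echo "$out"
    else
        rm -f "$out.partial"
        return 1
    fi
}
```

Run as the dspace user, dump_database dspace /dspacebkp would write a file such as /dspacebkp/dspace_db_150508 and print its name.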

Restoring the backup data


One can use either of the following commands:

psql

pg_restore

Restoring the Database

WARNING: You do not need to restore unless your data got corrupted.

Restoring is not to be used as a routine

Of course, backups should be taken periodically

Using psql to Restore

$ psql -d dspace -f /dspacebkp/dspace_db_150508

Where

dspace is the name of the database

dspace_db_150508 is the backup file taken on 15th May 2008.

Using pg_restore to Restore

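$ pg_restore -d dspace /dspacebkp/dspace_db_150508

Note: pg_restore reads only archives that pg_dump wrote in a non-plain-text format (for example with -Fc); a plain SQL dump is restored with psql

More options of pg_restore can be explored by

$ man pg_restore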
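
Since psql and pg_restore expect different dump formats, a small check can suggest which one fits a given backup file. The sketch below relies on custom-format archives (pg_dump -Fc) starting with the bytes PGDMP; restore_tool is a hypothetical helper name, and the check does not recognise tar or directory format archives, which pg_restore also reads.

```shell
# Sketch: pick the restore tool that matches the dump format.
# restore_tool is a hypothetical helper name.
restore_tool() {
    dump=$1
    # Custom-format archives written by "pg_dump -Fc" start with the
    # five bytes PGDMP; plain SQL dumps are ordinary text.
    if [ "$(head -c 5 "$dump")" = "PGDMP" ]; then
        echo pg_restore
    else
        echo psql
    fi
}
```

For the plain dump taken earlier, restore_tool /dspacebkp/dspace_db_150508 would print psql.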

Export/Import in DSpace

Not to be used as a backup mechanism

Export and import deal only with bitstreams, metadata, licenses and handles

You can export or import

An item

A collection

Export

Importing

dsrun org.dspace.app.itemimport.ItemImport \
--add \
--eperson=dspace@localhost.localdomain \
--collection=collectionID \
--source=items_dir \
--mapfile=mapfile

Where

--add may be replaced by --replace or --delete

--mapfile names the map file, which can be used later to remove the uploaded items

What is exported

The following files will be created for every item

dublin_core.xml (metadata)

handle (one line containing the handle number)

license.txt

The actual file (bitstream: could be a PDF, a DOC or an image file)

contents (two lines: the license file name and the actual bitstream name)
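
The importer reads the same per-item layout from the directory given as --source. A rough sketch of preparing one item directory by hand follows; make_item_dir is a hypothetical helper name, only a title is written as metadata, and the license and handle files are left out because a newly added item is assumed not to have them yet.

```shell
# Sketch: lay out one item directory for import.
# make_item_dir is a hypothetical helper name.
make_item_dir() {
    item_dir=$1   # e.g. items_dir/item_000
    title=$2      # Dublin Core title; plain text without XML special
                  # characters, to keep the sketch short
    bitstream=$3  # path of the file to attach

    mkdir -p "$item_dir" || return 1
    cp "$bitstream" "$item_dir/" || return 1

    # Minimal metadata file with a single title element.
    cat > "$item_dir/dublin_core.xml" <<EOF
<dublin_core>
  <dcvalue element="title" qualifier="none">$title</dcvalue>
</dublin_core>
EOF

    # contents lists the bitstream file names, one per line.
    basename "$bitstream" > "$item_dir/contents"
}
```

For example, make_item_dir items_dir/item_000 "Sample title" paper.pdf would produce a directory that can be passed to the import command through --source=items_dir.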