
ModeShape 3
Getting Started Guide
Exported from JBoss Community Documentation Editor at 2015-09-18 17:05:01 EDT

Copyright 2015 JBoss Community contributors.
JBoss Community Documentation
Table of Contents
1 What's new
2 Migrating from 2.x
2.1 Public API
2.2 Storage vs. connectors
2.3 Federation
2.4 Binary storage
2.5 Sequencers
2.6 MIME type detection
2.7 Text extractors
2.8 Configuration and running the engine
2.9 Migrating content
2.10 JBoss AS
3 Getting Started
3.1 Complete Maven examples
3.2 Embedding ModeShape in application or library built with Maven
3.2.1 Prerequisites
3.2.2 Add ModeShape Dependencies
3.2.3 Configuring a ModeShape repository
3.2.4 Configuring the Infinispan Cache
3.2.5 Starting a ModeShape Repository
3.2.6 Using JCR's RepositoryFactory
3.3 ModeShape and JBoss AS7
This guide contains information about:
new features in ModeShape 3,
migration to ModeShape 3 and
getting started with ModeShape 3.
See the child pages for the relevant sections.
1 What's new
ModeShape 3 has changed quite significantly since the 2.x releases:
Use of Infinispan for all caching and storage - This gives a powerful and flexible foundation for
creating JCR repositories that are fast, scalable, and highly available. Infinispan offers several storage
options (via cache loaders), but can also be used as a distributed, multi-site, in-memory data grid.
Improved performance - ModeShape 3 is faster than 2.x - most operations are at least one if not
several orders of magnitude faster. We will publish performance and benchmarking results closer to
the final release.
Improved scalability - ModeShape 3 has been designed to store and access the content so that a
node can have hundreds of thousands of child nodes (even with same-name-siblings) yet still be
incredibly fast. Additionally, repositories can scale to millions of nodes and be deployed across many
processes.
Improved configuration - There is no more global configuration of the engine; instead, each
repository is configured with a separate JSON file, which must conform to a JSON Schema and can
be validated by ModeShape prior to use. Repository configurations can even be changed while the
repository is running (some restrictions apply), making it possible to add/change/remove sequencers,
authorization providers, and many other configuration options while the repository is in use.
Runtime management - Each repository can be deployed, started, stopped, and undeployed while
the engine and other repositories are still in use.
Deployment to JBoss AS7 - Our AS7 kit installs ModeShape as an AS7 service, allowing you to
configure and manage repositories using the AS7 tooling. Infinispan and JGroups are also built-in
services in AS7 that can be managed the same way. Plus, ModeShape clustering will work out of the
box using AS7's built-in clustering (domain management) mechanism. ModeShape and JBoss AS7
will be the easiest way to deploy, manage and operate enterprise-grade repositories. See this page
for details on how to install, configure and use.
Storage options - ModeShape continues to have great options for where it can store content.
Although ModeShape 2 had its own connector framework, ModeShape 3 simply uses Infinispan's
cache loaders with out-of-the-box support for storing content in:
In-memory (no cache loader)
BerkeleyDB
Relational databases (via JDBC), including in-memory, disk-based, or remote
File system
Cassandra
Cloud storage (e.g., Amazon's S3, Rackspace's Cloudfiles, or any other provider supported by
JClouds)
Remote Infinispan
JTA support - Clients (including EJBs) can use JCR Sessions within JTA transactions.
Visibility of changes - Sessions now immediately see all changes persisted/committed by other
sessions, although transient changes of the session always take precedence. When combined with
the new way node content is being stored it should reduce the potential for conflicts during session
save operations. This means that all of the Sessions using a given workspace can share the cache of
persisted content, resulting in faster performance. It also significantly reduces the overhead of each
session, which means ModeShape can handle more sessions.
Thread-safety - The JCR specification only requires that the Repository and
RepositoryFactory interfaces are thread-safe. ModeShape's implementations of many of the other
interfaces, especially Session, Workspace, NodeTypeManager, etc., are all thread-safe. This
means that it is possible for multiple threads to share one Session for reading; note that Session is
inherently stateful, so sharing a Session for writes is not recommended.
New monitoring API that allows accessing the history for over a dozen metrics using a variety of
windows and durations.
JCR-based sequencing API - sequencers now use the JCR API to access the content being
processed and to create/update the derived content. Sequencers can also dynamically register
namespaces and node types. This simplifies the process for creating custom sequencers. We've
already migrated most of our 2.x sequencers to this new API, and will be migrating the rest over the
next few weeks.
MIME type detector API - A simple API for implementing custom MIME type detectors. (Most of the
time, ModeShape's built-in detectors are sufficient.)
Text extractor API - A simple API for implementing custom code to extract searchable text from
binary values. ModeShape provides extractors for several MIME types, via the Tika library, but
custom extractors can be implemented by wrapping the code that parses the binary formats.
Improved Binary storage - A new facility was added to store binary values of all sizes, including
those that are larger than available memory (e.g., gigabytes). An optimization to store small binary
values with the rest of the content is available. We've started out with a file system store that will work
even in clustered environments, but we also plan to add stores that use Infinispan and DBMSes.
Deprecated APIs - A few API interfaces and methods were deprecated in 2.7.0.Final, and these have
been removed. Most of ModeShape's small public API remains the same or has only
backward-compatible changes.
Bug Fixes - Of course, we've incorporated quite a few bug fixes and other minor improvements, too.
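As a rough mental model of the "Visibility of changes" item above, the following toy sketch layers each session's transient changes over a map of persisted content that all sessions share. The names used here (SessionView, read, write, save) are hypothetical and are not ModeShape classes; the sketch only illustrates the described read behaviour, assuming a simple path-to-value map stands in for node content.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: hypothetical names, not ModeShape's internal classes.
class SessionView {
    // Persisted content, shared by every session that uses the same workspace.
    private final Map<String, String> persisted;
    // Unsaved changes that only this session can see.
    private final Map<String, String> transientChanges = new HashMap<String, String>();

    SessionView(Map<String, String> sharedPersistedContent) {
        this.persisted = sharedPersistedContent;
    }

    // Transient changes take precedence; otherwise read the shared persisted content.
    String read(String path) {
        if (transientChanges.containsKey(path)) {
            return transientChanges.get(path);
        }
        return persisted.get(path);
    }

    // Record a change that stays private to this session until save() is called.
    void write(String path, String value) {
        transientChanges.put(path, value);
    }

    // Publish the transient changes so that other sessions see them immediately.
    void save() {
        persisted.putAll(transientChanges);
        transientChanges.clear();
    }

    public static void main(String[] args) {
        Map<String, String> shared = new ConcurrentHashMap<String, String>();
        SessionView first = new SessionView(shared);
        SessionView second = new SessionView(shared);
        first.write("/cars/hybrid", "Prius");
        System.out.println("before save: " + second.read("/cars/hybrid"));
        first.save();
        System.out.println("after save: " + second.read("/cars/hybrid"));
    }
}
```

Because every SessionView holds only its own unsaved changes, the per-session overhead in this model stays small, which mirrors the motivation given in the list above.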
We've been planning on adding support for some other features not outlined in the JCR API, but these are
likely going to be pushed to 3.1:
Federation - Access and manipulate data from other external systems as if it were within the
repository.
Map-Reduce - Use map-reduce operations to perform reporting and custom read-only operations in
parallel against the entire content of a repository. ModeShape will use this to enable validation of
repository content against the current set or a proposed set of node types, as well as optimizing the
storage format/layout of each node.
2 Migrating from 2.x

This page is primarily for developers that have already used ModeShape 1 or 2, and describes the
similarities and differences in ModeShape 3. As you'll see, your application code will likely not have to
change much (if at all), since ModeShape 3 still supports the standard JCR API.
However, the biggest change is how ModeShape repositories store their content and how they are
configured/managed. We now understand that configuring ModeShape 2 repositories was overly complex,
and we wanted to fix that in ModeShape 3. Also, a goal of ModeShape 3 was that we could achieve at least
a 10-fold increase in scalability (of clustering and of repository size), and this was simply not possible with
the old storage system.
None of these changes were undertaken lightly, and all were made with the goal of making ModeShape 3
easier to configure, easier to use, faster, more resilient, and more scalable. We think that once you learn
how ModeShape 3 has changed, you'll really like it.
Public API
Storage vs. connectors
Federation
Binary storage
Sequencers
MIME type detection
Text extractors
Configuration and running the engine
Migrating content
JBoss AS
2.1 Public API

Client applications use the standard JCR 2.0 API to interact with ModeShape 2 and ModeShape 3
repositories. So most (if not all) of your application code will not need to be changed. ModeShape 2 did
provide a small public API that extended the standard JCR API with a few additional capabilities, and
ModeShape 3 supports and slightly expands this public API.
Several of the methods and interfaces in ModeShape's public API were deprecated by version 2.8,
and these have been removed from the ModeShape 3 API.
ModeShape 3 also passes 100% of the unofficial JSR-283 (JCR 2.0) compatibility tests, as maintained by
the reference implementation. (The official TCK has quite a few bugs that have been fixed by the reference
implementation community. So although these compatibility tests are not official, we believe these tests are
a more accurate representation of the compliance with the intent of the specification. Plus, other
implementations use these same tests.)
ModeShape also provided several other APIs:
The RESTful API that was in 2.x is still supported, although the URLs have changed. ModeShape 3
adds a new RESTful API, and this is now the default. This API is cleaner and more capable. The
RESTful client library is capable of talking to both ModeShape 2.x and 3.x servers.
The WebDAV API is still supported and has been improved.
The JDBC Driver is still supported and is largely unchanged.
2.2 Storage vs. connectors

ModeShape 2.x provided several storage connectors:
the disk-based connector, which stored any content on the local file system
the JPA connector, which stored any content in a JDBC database using Hibernate
the Infinispan connector, which stored any content within an Infinispan 4 or 5 cache
the JBoss Cache connector, which stored any content within a JBoss Cache instance (this was legacy
and not recommended for general use)
plus several access connectors:
the file system connector, which could project into the repository nodes that represented the files and
folders within an area of the file system
the SVN connector, which could project into the repository nodes that represented the files and
folders within an SVN repository
the JCR connector, which could project into the repository nodes that existed in another JCR
repository instance
ModeShape 1 and 2 used a single SPI for both kinds of connectors. We learned fairly early that this was not
ideal, since the two different access patterns really required dedicated operations. Also, ModeShape didn't
originally provide a centralized caching system, and retrofitting one proved to be quite complicated, so each
connector tended to implement its own caching mechanism. Finally, optimizing content operations using the
SPIs was proving difficult due to the design of the SPI itself.
One of our goals of ModeShape 3 was to dramatically increase the scalability of a repository, both in terms
of scaling out (by clustering multiple ModeShape processes) and in terms of the amount of content. So in the
summer of 2011, we embarked on a project to build a new repository engine that could achieve these goals
while correcting the problems in the 2.x connector system. We ultimately decided that if a repository put the
node representations in an Infinispan cache, Infinispan would act as a very efficient cache and could persist
the content using a variety of techniques. On the small end, Infinispan was easily embeddable and could
store content on the file system, in databases, or even in the cloud. But for larger scale configurations, a
single Infinispan cache can manage our content in-memory by creating an effective "super-heap" across
multiple processes and machines (and even spread across multiple data centers), ensuring that several
copies of each node are maintained and distributed across the cluster. Infinispan is a data grid that can scale
to very large sizes, and ModeShape repositories can benefit from these capabilities.
So in ModeShape 3, a repository always uses a single Infinispan cache as both its caching and storage
system. All workspace content (except for binary value storage; see below) and the system content are all
stored within this single cache.
2.3 Federation
ModeShape 3.0 does not yet support using the JCR API to access information in external systems. That is
the most important feature for 3.1, and will reintroduce the concept of a connector as a mechanism to do
this. One major difference, however, will be that ModeShape 3.1 will no longer be able to create a repository
that consists entirely of federated content. Instead, every ModeShape 3 repository will store its own content,
but you'll also be able to federate and integrate into the repository the content from external systems.
Conceptually this is a bit different than in ModeShape 2, which seemed to allow a repository to be
configured such that all content was federated from external systems. Technically, even
ModeShape 2 required a storage connector to store the repository's system content, so it was
never actually possible to have a repository that consisted entirely of federated content.
2.4 Binary storage

Storing large BINARY values was something that ModeShape 2 didn't do very well. That's because very
large BINARY values are used differently and really need to be stored differently. For example, BINARY
values are always streamed and never need to be pulled completely into memory. They are also immutable,
which means they can be treated differently than the rest of the content.
ModeShape 3 was designed to explicitly handle very, very large BINARY values. To do that, ModeShape 3
separates out the storage of binary values into a completely separate store, where the values are stored
based upon their SHA-1 hash. This means that the same BINARY value is never stored more than once,
even if that BINARY value is used in properties on multiple nodes. ModeShape provides several binary
storage options:
Store binary values on the local file system, which can be a regular directory, a network share or even
a temporary directory. This option is generally fast, as safe as your file system, and uses native locks
to prevent multiple processes from conflicting. It is an excellent choice for local (non-clustered),
embedded repositories.
Store binary values in Infinispan. Although it is possible to use the same cache as the rest of the
repository uses, it is often far better to use two other caches and to configure those caches
specifically for what they store. For example, one cache is used to store metadata about the BINARY
values; this metadata is small, lightweight, and can thus be replicated across the cluster. The other
cache is used to store the actual binary data, separated into chunks (usually up to 1MB in size), and
for this cache distributing the data across the cluster is often desirable.
Store binary values in a relational database. This storage is recommended only when you are
expected to persist all content inside a relational database. The binary data is broken into chunks
(usually up to 1MB in size).
Store binary values in MongoDB. This storage option has not been thoroughly tested, but can be
considered as an option.
Store binary values in a custom store. ModeShape 3 provides an SPI for implementing your own
binary storage.
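To make the idea of storing values "based upon their SHA-1 hash" concrete, here is a minimal in-memory sketch of content-addressed storage. It is not ModeShape's binary store SPI: the class and method names (ContentAddressedStore, store, read, size) are hypothetical, and a HashMap stands in for the file system, cache, or database that a real store would use.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not ModeShape's binary storage SPI.
class ContentAddressedStore {
    // Maps the SHA-1 hex digest of a value to the value's bytes.
    private final Map<String, byte[]> valuesByKey = new HashMap<String, byte[]>();

    // Stores the content under its SHA-1 key and returns that key.
    // Storing identical content again reuses the existing entry.
    String store(byte[] content) {
        String key = sha1Hex(content);
        if (!valuesByKey.containsKey(key)) {
            valuesByKey.put(key, content.clone());
        }
        return key;
    }

    // Returns a copy of the stored bytes, or null if the key is unknown.
    byte[] read(String key) {
        byte[] value = valuesByKey.get(key);
        return value == null ? null : value.clone();
    }

    // Number of distinct values currently held.
    int size() {
        return valuesByKey.size();
    }

    // Computes the lowercase hexadecimal SHA-1 digest of the content.
    static String sha1Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available on this JVM", e);
        }
    }

    public static void main(String[] args) {
        ContentAddressedStore store = new ContentAddressedStore();
        String first = store.store("same bytes".getBytes(StandardCharsets.UTF_8));
        String second = store.store("same bytes".getBytes(StandardCharsets.UTF_8));
        System.out.println(first.equals(second) + ", entries: " + store.size());
    }
}
```

Since the key is derived from the content and BINARY values are immutable, a property on any node only needs to hold the key; two nodes with the same value end up referring to the same stored entry.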
2.5 Sequencers
ModeShape 3 sequencers work exactly the same way as they did in ModeShape 1.x and 2.x: they
automatically take new or updated content (matched by a path-based rule), generate additional structured
content, and write that new content into the repository (in a location determined by the configuration).
They are configured differently, most notably because each repository is configured with its own sequencers.
Implementing custom sequencers, however, is far easier in 3.0, since sequencers generate the additional
content by directly using the JCR API rather than the proprietary graph API in ModeShape 2. Sequencer
implementations are also able to register the node types programmatically, which simplifies the overall
configuration for a repository.
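The path-based rule described above (match an input path, then derive an output location from it) can be approximated with a plain regular expression. The sketch below is only an analogue of a rule such as "/files//(*.csv[*])/jcr:content => /sequenced/text/delimited/$1"; it is not ModeShape's path expression parser, it ignores workspace names, same-name-sibling indexes and property criteria, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical regex analogue of a sequencer path rule; not ModeShape's parser.
class PathRuleSketch {
    // Input side: any depth below /files, capturing the name of the *.csv node.
    private static final Pattern INPUT =
        Pattern.compile("^/files/(?:[^/]+/)*([^/]+\\.csv)/jcr:content$");
    // Output side: $1 is replaced with the captured node name.
    private static final String OUTPUT = "/sequenced/text/delimited/$1";

    // Returns the output path for a changed node, or null if the rule does not apply.
    static String outputPathFor(String changedPath) {
        Matcher matcher = INPUT.matcher(changedPath);
        return matcher.matches() ? matcher.replaceFirst(OUTPUT) : null;
    }

    public static void main(String[] args) {
        System.out.println(outputPathFor("/files/reports/2012/sales.csv/jcr:content"));
        System.out.println(outputPathFor("/other/sales.csv/jcr:content"));
    }
}
```

In this model, content that does not match the input side of the rule is simply ignored, which corresponds to the sequencer not being invoked for that content.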
2.6 MIME type detection

ModeShape 1 and 2 had the ability to automatically detect the MIME type for BINARY values. Several
detectors were provided by ModeShape, and these very often didn't need to be customized or altered.
ModeShape 3 also has the same ability, but we've made several improvements. First of all, we've added two
new methods to the org.modeshape.jcr.api.Binary interface (which extends the standard
javax.jcr.Binary interface) to obtain the MIME type. Your applications can use these methods to
discover the MIME type for a BINARY value and, for example, to set the " jcr:mimeType" property on the
node.
Secondly, we've removed our SPI for custom MIME type detectors. Instead, ModeShape 3 simply uses the
Apache Tika framework, which has several MIME type detectors and provides its own SPI for custom
detectors.
2.7 Text extractors

Text extractors also work exactly the same way as they did in ModeShape 1.x and 2.x. Their purpose is to
extract searchable text from BINARY values, so that full-text search and queries are able to find results that
matched the content of BINARY values. ModeShape 3 continues to support several built-in extractors,
including one that uses Apache Tika. However, in ModeShape 3.x we've added a simple SPI so that you can
easily create your own extractors.
2.8 Configuration and running the engine

ModeShape 2 configuration files contained specifications for multiple repositories. Some components, like
sequencers and repository sources, were configured separately from the repositories. To run an engine, you
first read in the configuration file and then created an engine from the configuration:
// Load the one configuration file ...

JcrConfiguration configuration = new JcrConfiguration();
configuration.loadFrom("modeshape-config.xml");
// Create and start an engine ...
JcrEngine engine = configuration.build();
engine.start();
// Get the repositories by their names ...
Repository repository1 = engine.getRepository("Cars");
Repository repository2 = engine.getRepository("Catalog");
Using a single configuration file for the engine seemed to make sense, but it was also confusing because a
single sequencer might be used in multiple repositories. It was also potentially problematic, because a single
source might be used by multiple repositories, even though this was not allowed. ModeShape 2 didn't allow
modifying the configuration while the engine was running, which meant it was not possible to dynamically
add or remove repositories without completely shutting down and restarting the engine. (In reality, very little
was shared between repositories.)
ModeShape 3 separates the configuration of each repository into a separate file, which are each "deployed"
to an engine:
// Start the engine ...

ModeShapeEngine engine = new ModeShapeEngine();
engine.start();
// Deploy and use repository 1 ...
RepositoryConfiguration config1 = RepositoryConfiguration.read("cars-config.json");
engine.deploy(config1);
Repository repository1 = engine.getRepository("Cars");
// Deploy and use repository 2 ...
RepositoryConfiguration config2 = RepositoryConfiguration.read("catalog-config.json");
engine.deploy(config2);
Repository repository2 = engine.getRepository("Catalog");
// Undeploy repository 1 ...
engine.undeploy("Cars");
As you can see, it's now possible to dynamically deploy and undeploy repositories even when the engine is
running and other repositories are in use. There are multiple ways of reading in the configuration, too:
read from a java.io.File
read from a resolved java.net.URL
read from a String containing a URL or a path to a file on the file system or classpath
read a string containing the configuration
You might have also noticed in the example above that ModeShape 3 configuration files are JSON files, not
XML files like in ModeShape 1 and 2. We thought that XML configuration files are noisy and make it difficult
to see the bigger picture. JSON files, on the other hand, are quite easy to read and edit. And ModeShape
does use a JSON Schema that dictates the allowed structure of the configuration files, so ModeShape can
even validate your configuration files:
Problems problems = config1.validate();

if ( problems.hasProblems() ) {
// Output a summary of the problems (with line numbers) ...
System.out.println(problems);
}
ModeShape 3 configuration files also have sensible defaults for everything, so this file is actually a valid
configuration for a repository named "my-repo":
my-repo.json
{ "name" : "my-repo" }
Of course, you'll likely want to specify more options, so here is another example of a repository with most of
the available options specified:
my-repo-config.json
{
    "name" : "my-repo",
    "transactionMode" : "auto",
    "monitoring" : {
        "enabled" : true
    },
    "workspaces" : {
        "predefined" : ["otherWorkspace"],
        "default" : "default",
        "allowCreation" : true,
        "initialContent" : {
            "ws1" : "file1.xml",
            "ws2" : "file2.xml",
            "*" : "default.xml"
        }
    },
    "node-types" : ["file1.cnd", "file2.cnd"],
    "storage" : {
        "cacheName" : "Thorough",
        "cacheConfiguration" : "infinispan_configuration.xml",
        "transactionManagerLookup" : "org.infinispan.transaction.lookup.GenericTransactionManagerLookup",
        "binaryStorage" : {
            "type" : "file",
            "directory" : "Thorough/binaries",
            "minimumBinarySizeInBytes" : 4096
        }
    },
    "security" : {
        "anonymous" : {
            "username" : "<anonymous>",
            "roles" : ["readonly","readwrite","admin"],
            "useOnFailedLogin" : false
        },
        "providers" : [
            {
                "name" : "My Custom Security Provider",
                "classname" : "com.example.MyAuthenticationProvider"
            },
            {
                "classname" : "jaas",
                "policyName" : "modeshape-jcr"
            }
        ]
    },
    "query" : {
        "enabled" : true,
        "textExtracting": {
            "threadPool" : "test",
            "extractors" : {
                "customExtractor": {
                    "name" : "MyFileType extractor",
                    "classname" : "com.example.myfile.MyExtractor"
                },
                "tikaExtractor":{
                    "name" : "General content-based extractor",
                    "classname" : "tika"
                }
            }
        },
        "indexStorage" : {
            "type" : "filesystem",
            "location" : "Thorough/indexes",
            "lockingStrategy" : "native",
            "fileSystemAccessType" : "auto"
        },
        "indexing" : {
            "threadPool" : "modeshape-workers",
            "analyzer" : "org.apache.lucene.analysis.standard.StandardAnalyzer",
            "similarity" : "org.apache.lucene.search.DefaultSimilarity",
            "batchSize" : -1,
            "indexFormat" : "LUCENE_35",
            "readerStrategy" : "shared",
            "mode" : "sync",
            "rebuildOnStartup": {
                "when" : "if_missing",
                "includeSystemContent": false,
                "mode": "sync"
            },
            "asyncThreadPoolSize" : 1,
            "asyncMaxQueueSize" : 0,
            "backend" : {
                "type" : "lucene"
            },
            "hibernate.search.custom.overridden.property" : "value"
        }
    },
    "sequencing" : {
        "removeDerivedContentWithOriginal" : true,
        "sequencers" : {
            "zipSequencer" : {
                "classname" : "ZipSequencer",
                "pathExpressions" : ["default:/files(//)(*.zip[*])/jcr:content[@jcr:data] => default:/sequenced/zip/$1"]
            },
            "delimitedTextSequencer" : {
                "classname" : "org.modeshape.sequencer.text.DelimitedTextSequencer",
                "pathExpressions" : [
                    "default:/files//(*.csv[*])/jcr:content[@jcr:data] => default:/sequenced/text/delimited/$1"
                ],
                "splitPattern" : ","
            }
        }
    },
    "clustering" : {
    }
}
See our documentation about the ModeShape JSON configuration file format for more information.
It is also possible to access the configuration of a running repository, change the configuration, and then
update the running repository:
// Get the configuration ...

RepositoryConfiguration config = repository1.getConfiguration();
// Edit the configuration (which is a JSON document) to change a value ...
Editor editor = config.edit();
editor.getOrCreateDocument(FieldName.STORAGE)
      .getOrCreateDocument(FieldName.BINARY_STORAGE)
      .setNumber(FieldName.MINIMUM_BINARY_SIZE_IN_BYTES, newLargeValueSizeInBytes);
Changes changes = editor.getChanges();
// Apply the changes to the deployed repository ...
Future<Boolean> future = engine.update(config.getName(), changes);
// And optionally wait until the repository configuration is updated ...
future.get();
Many configuration changes can be applied to a repository while it is running, but not everything.
For example, changing where data is stored will apply only after the repository is shutdown and
restarted.
2.9 Migrating content

ModeShape 3.0 provides an efficient backup and restore capability that works at the repository level. This
means that each backup will contain all of the content in all of the workspaces of a single repository.
Backups can be used to recover a repository back to an earlier state (due to a corruption, hardware
failure, etc.), and the same mechanism could in principle be used to migrate ModeShape 2.x repositories to 3.x.
However, that would require a compatible backup mechanism in ModeShape 2.x, and we've not yet received any
requests to provide one. So at this time it is not possible to upgrade a ModeShape 2.x repository to 3.x or 4.x.
2.10 JBoss AS
One other major change is that ModeShape 3 can be installed into JBoss AS7, which is a very fast and
lightweight application server. The integration is very good: ModeShape is a service (or subsystem) within
AS7, and is configured and managed using the regular AS7 configuration files or tooling. Managing a
ModeShape instance across a JBoss AS7 domain (cluster) is just as easy as with any other AS7 subsystem.
Plus, ModeShape just uses AS7's built-in support for Infinispan, JGroups, security, and data sources, which
means you configure these components using AS7's tools.
ModeShape 3 no longer provides integration with JBoss AS 5 and 6.
ModeShape 3 can of course be used with other application and web servers, including JBoss AS5
and 6. But just like with ModeShape 2, doing so basically just embeds ModeShape within your web
application or service, and no other integration with the server is provided.
3 Getting Started
We've published the ModeShape artifacts and JARs for this beta release only in the JBoss Maven repository.
The rest of this page shows how you can use ModeShape within your Maven-based projects. We've also
added several distributions on our project's download page:
a binary distribution with all the JARs, JavaDoc, and examples
a kit to install ModeShape into an EAP installation
a source distribution
So without further ado...
Complete Maven examples
Embedding ModeShape in application or library built with Maven
Prerequisites
Add ModeShape Dependencies
Logging
Use newer Infinispan and JGroups versions
Using a transaction manager
Using the JBoss Transaction Manager
Using other transaction managers
Configuring a ModeShape repository
Configuring the Infinispan Cache
Simple configuration
Cache with Cache Store
Starting a ModeShape Repository
Starting the ModeShape engine
Deploying our Repository
Using the Repository and the JCR API
Stopping the repository and engine
Using JCR's RepositoryFactory
ModeShape and JBoss AS7
3.1 Complete Maven examples

We have a number of self-contained examples that you can checkout, build, and then modify to try different
things. So if Git is your thing, the easiest way to get going with ModeShape 3.7.2 is to simply clone this
repository and build the examples. For details, see our modeshape-examples repository on GitHub, and
follow the instructions shown in the readme file on that page.
If Git isn't your thing, then read on to learn how to build a JCR application that embeds ModeShape and how
you can install ModeShape into AS7 and use it from within your web applications and services.
3.2 Embedding ModeShape in application or library built with Maven
If your Java SE application or library uses Maven, then embedding ModeShape is really very easy. (If not,
then you probably want to wait to start testing ModeShape 3 until the first beta release, when we'll publish a
ZIP file that contains all the JARs, documentation and examples.)
The instructions on this page are for Java SE applications. If you're creating applications for
deployment onto JBoss AS7, see the specific documentation about how to install ModeShape into
AS7/EAP and use it with your web applications.
3.2.1 Prerequisites
Before you can use Maven to build an application that uses ModeShape, you'll need to have JDK 6 and
Maven 3 installed.
All ModeShape releases since 3.0.0.Final are now available directly from the Maven Central repository. It
takes a few hours (at least) after the artifacts are in the JBoss repository before they appear in Maven
Central. So if you don't see a recent release in Maven Central, just give it a bit of time - or use the JBoss
Maven repository.
3.2.2 Add ModeShape Dependencies

The next step is to edit your application or library's POM file and add the dependencies to the JCR API and
ModeShape. The easiest way to do that is to use one of our Maven BOMs that specifies the versions for all
of the ModeShape components and all of its dependencies:
Maven dependencies for the JCR API and ModeShape engine

<dependencyManagement>
<dependencies>

<dependency>
<groupId>org.modeshape.bom</groupId>
<artifactId>modeshape-bom-embedded</artifactId>
<version>3.7.2.Final</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
Then include in the POM "<dependencies>" section the ModeShape modules that you will directly use.
Note that you don't need to specify any of the versions, since that's what the modeshape-bom-embedded
provides. The one module that you need to include is the primary JCR implementation:
Maven dependencies for the JCR API and ModeShape engine
<dependencies>
...
<dependency>
<groupId>org.modeshape</groupId>
<artifactId>modeshape-jcr</artifactId>
</dependency>
...
</dependencies>
But you should also include any other modules that you'll either directly use or optional modules that you
want to use. For example, if you're going to use any of ModeShape's public API (instead of just the JCR
API), then you should include this dependency:
Optional Maven dependencies for the ModeShape public API
<dependency>
<groupId>org.modeshape</groupId>
<artifactId>modeshape-jcr-api</artifactId>
</dependency>
If you want to use one of Infinispan's cache stores, then pick from ONE of the following:
Maven dependencies for the Infinispan Cache Stores (Pick One)

<dependency>
<groupId>org.infinispan</groupId>
<artifactId>infinispan-cachestore-bdbje</artifactId>
</dependency>
<dependency>
<groupId>org.infinispan</groupId>
<artifactId>infinispan-cachestore-jdbm</artifactId>
</dependency>
<dependency>
<groupId>org.infinispan</groupId>
<artifactId>infinispan-cachestore-jdbc</artifactId>
</dependency>
<dependency>
<groupId>org.infinispan</groupId>
<artifactId>infinispan-cachestore-cassandra</artifactId>
</dependency>
<dependency>
<groupId>org.infinispan</groupId>
<artifactId>infinispan-cachestore-cloud</artifactId>
</dependency>
Adding multiple cache stores may be necessary if you're using multiple Infinispan caches, each
with a different cache store. Adding a dependency on any cache stores that you're not using,
however, simply brings in unnecessary (transitive) dependencies and should be avoided.
If you're going to use the JDBC Cache Store (e.g., "infinispan-cachestore-jdbc"), then you'll also
need to add a dependency on the JDBC driver or embeddable database. For example, here's the
dependency required to use the embeddable H2 database:
Maven Dependency for the H2 embeddable database
<dependency>
<groupId>com.h2database</groupId>
<artifactId>h2</artifactId>
<version>1.1.117</version>
</dependency>
Logging
ModeShape is designed to use the same logging framework as your application, and it can dynamically bind
to Log4J, SLF4J, Logback and the JDK's logging system. Your application or library will probably already be
using one of these logging frameworks and will already have them in the dependencies.
Use newer Infinispan and JGroups versions

ModeShape 3.7.2 is currently dependent upon Infinispan 5.2.7.Final and JGroups 3.2.10.Final. These
versions are those that ship with the corresponding version of EAP (currently 6.1.1). If you're deploying to
EAP, you simply get the version of ModeShape, Infinispan and JGroups included in your EAP version.
However, for other cases it may be desirable to use a newer version of Infinispan 5.x and the corresponding
version of JGroups. Doing this is actually straightforward, especially if you're using Maven: in your POM file
simply add dependencies on the Infinispan dependencies you're using (e.g., "infinispan-core"
plus any cache store artifacts) and JGroups before the ModeShape dependencies.
For example, here are parts of a POM file that do this:
Overriding Infinispan and JGroups dependencies
<project>

<properties>
<infinispan.version>5.2.7.Final</infinispan.version>
<jgroups.version>3.2.10.Final</jgroups.version>
</properties>
<dependencies>
<dependency>
<groupId>org.infinispan</groupId>
<artifactId>infinispan-core</artifactId>
<version>${infinispan.version}</version>
</dependency>
<dependency>
<groupId>org.infinispan</groupId>
<artifactId>infinispan-cachestore-jdbm</artifactId>
<version>${infinispan.version}</version>
</dependency>
<dependency>
<groupId>org.jgroups</groupId>
<artifactId>jgroups</artifactId>
<version>${jgroups.version}</version>
</dependency>
<dependency>
<groupId>javax.jcr</groupId>
<artifactId>jcr</artifactId>
</dependency>
<dependency>
<groupId>org.modeshape</groupId>
<artifactId>modeshape-jcr-api</artifactId>
</dependency>
<dependency>
<groupId>org.modeshape</groupId>
<artifactId>modeshape-jcr</artifactId>
</dependency>

</dependencies>
<dependencyManagement>
<dependencies>

<dependency>
<groupId>org.modeshape.bom</groupId>
<artifactId>modeshape-bom-embedded</artifactId>
<version>3.7.2.Final</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
</project>
Note that we're using properties to specify the versions of these artifacts. This makes the versions easy to
change and keeps each one in a single (readable) location (since the "infinispan.version"
property is used in multiple places).
Be sure to pick one of the combinations of Infinispan and JGroups mentioned above.
Using a transaction manager

If you're deploying ModeShape within a JavaSE application (or a non-JavaEE environment such as Tomcat),
you will likely want to choose a transaction manager. (Infinispan has a simple one that is good enough for
non-clustered testing, but probably not for production.)
Using the JBoss Transaction Manager

We're somewhat partial to the JBoss Transaction Manager. It's solid and used in the popular JBoss
Application Server and Red Hat Middleware platforms. And it's what we use in our own testing and
examples.
Using it is easy, especially if you're using our embedded BOM (as we described above), because all you
have to do is add a dependency in your POM on the JBoss Transaction Manager:
Adding the JBoss Transaction Manager dependency
<project>

<dependencies>

<dependency>
<groupId>org.jboss.jbossts</groupId>
<artifactId>jbossjta</artifactId>
</dependency>

</dependencies>

</project>
Note that you don't need to (but can) specify the version, since our BOM already defines the default version.
The BOM also excludes many of the dependencies and components that are not necessary when running in a
non-clustered environment.
By default, the Infinispan configuration will automatically look for and find the transaction manager.
Using other transaction managers

If you want to use another transaction manager such as Atomikos or Spring Transaction Manager, simply
add it as a normal dependency to your application, but be sure that it's one that Infinispan can automatically
find. If not, then you'll have to provide an implementation of the
org.infinispan.transaction.lookup.TransactionManagerLookup interface and specify it in
your Infinispan configuration file:
<transaction
transactionManagerLookupClass="<specify your TransactionManagerLookup implementation class here>"
transactionMode="TRANSACTIONAL"
lockingMode="OPTIMISTIC"/>
3.2.3 Configuring a ModeShape repository

The ModeShape engine is capable of running (or "deploying") multiple JCR repositories. However, each
repository is configured separately and is completely independent from all other repositories. To configure a
repository, you'll need a configuration file. Starting with ModeShape 3.0, these configuration files use the
JSON format (which is a lot easier to read and create). Here is the minimum configuration file for a
repository:
{ }
That's not a mistake. An empty JSON document is a completely valid repository configuration. Everything
has a default value except for the repository's name, which is taken from the filename (without its extension)
if one is not specified in the file. For example, if this configuration is saved as "my_repository.json", the
repository will be named "my_repository".
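The fallback to the file name can be pictured with a small sketch. It assumes the default name is simply the file name with its directory and extension removed, which is consistent with the examples in this guide; the helper defaultRepositoryName is hypothetical and is not part of the ModeShape API, whose actual logic may differ.

```java
// Hypothetical helper, not part of ModeShape: derives a fallback repository
// name from a configuration file name by dropping the directory and extension.
final class DefaultRepositoryName {

    static String defaultRepositoryName(String configPath) {
        // Keep only the last path segment (handles both '/' and '\').
        int slash = Math.max(configPath.lastIndexOf('/'), configPath.lastIndexOf('\\'));
        String fileName = configPath.substring(slash + 1);
        // Drop the extension, if any (a leading dot is not treated as an extension).
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    public static void main(String[] args) {
        // prints "my_repository"
        System.out.println(defaultRepositoryName("conf/my_repository.json"));
    }
}
```

A "name" field in the configuration file would take precedence over whatever such a rule produces.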
Of course, lots of other options can be specified in the configuration file, but typically only the non-default
values are specified. Since most of the defaults are sensible, many configurations will be pretty small.
Here's a configuration file that uses most of the available fields, most of which happen to be set to the same
values as the defaults. (The description that follows cites line numbers in this file, counting from the opening brace.)
'my_repository.json'
{
"name" : "Test Repository",
"jndiName" : "jcr/Test Repository",
"monitoring" : {
"enabled" : true
},
"workspaces" : {
"default" : "defaultWorkspace",
"predefined" : ["otherWorkspace"],
"allowCreation" : true
},
"storage" : {
"cacheConfiguration" : "/path/to/infinispan/cache/configuration.xml",
"cacheName" : "Test Repository",
"transactionManagerLookup" :
"org.infinispan.transaction.lookup.GenericTransactionManagerLookup",
"binaryStorage" : {
"minimumBinarySizeInBytes" : 4096,
"minimumStringSize" : 4096,
"type" : "file"
}
},
"security" : {
"jaas" : {
"policyName" : "modeshape-jcr"
},
"anonymous" : {
"roles" : ["readonly","readwrite","admin"],
"username" : "<anonymous>",
"useOnFailedLogin" : false
},
"providers" : [
{
"classname" : "org.example.MyAuthorizationProvider",
"member1" : "value of instance member1"
}
]
},
"query" : {
"enabled" : true,
"rebuildUponStartup" : "if_missing", //DEPRECATED use indexing/rebuildOnStartup,
"indexStorage" : {
"type" : "filesystem",
"location" : "/path/on/filesystem",
"lockingStrategy" : "simple",
"fileSystemAccessType" : "auto"
},
"indexing" : {
"rebuildOnStartup": {
"when" : "if_missing",
"includeSystemContent": true,
"mode": "async"
},
"analyzer" : "org.apache.lucene.analysis.standard.StandardAnalyzer",
"similarity" : "org.apache.lucene.search.DefaultSimilarity",
"indexFormat" : "LUCENE_CURRENT",
"readerStrategy" : "shared",
"backend" : {
"type" : "lucene"
},
"batchSize" : -1,
"mode" : "sync",
"asyncThreadPoolSize" : 1,
"asyncMaxQueueSize" : 0
},
"extractors" : [MODE:ModeShape and JBoss AS7]
},
"sequencing" : {
"removeDerivedContentWithOriginal" : true,
"sequencers" : [MODE:ModeShape and JBoss AS7] => /ddl"
},
{
"name" : "XSD sequencer",
"classname" : "xsd",
"pathExpressions" : [ "/(*.xsd)/jcr:content[@jcr:data]" ]
}
]
}
}
This configuration defines:

The name of the repository (on line 2) to be "Test Repository", which will take precedence over
the name of the file.
The repository will be registered in JNDI (if JNDI is available in the environment) with the name
"jcr/Test Repository" (line 3). By default, the JNDI name will follow the pattern "jcr/<name>",
where "<name>" is the repository name.
The repository will periodically collect performance and statistical metrics in the background (line 5).
This is enabled by default, but can be set to false to turn off the collection.
The "defaultWorkspace" workspace (on line 8) is used by default when the client calls a
Repository.login(...) method that doesn't have the workspace name as a parameter or if the
client provides a null reference for the workspace name. If not specified, the default workspace for the
repository will be named "default".
One other workspace named "otherWorkspace" (line 9) will exist upon startup. By default, only the
default workspace will exist.
Clients can use the "Workspace.createWorkspace(...)" methods to create new workspaces
(line 10). This is the default.
The repository will look for an Infinispan configuration file at
"/path/to/infinispan/cache/configuration.xml" (line 13) to create a new Infinispan
CacheContainer instance. The value can be an (absolute or relative) path on the file system, the path
to a resource on the (application, system, or thread-context) classloader, or the JNDI name where the
CacheContainer instance can be found in JNDI. If no configuration file is found at any of these
locations, a default Infinispan configuration (a basic, local mode, non-clustered, in-memory cache) will
be used.
The repository will use the Infinispan cache named "Test Repository" (line 14). If not specified,
the repository's name is used.
The repository will attempt to find the JTA transaction manager using the
"org.infinispan.transaction.lookup.GenericTransactionManagerLookup" class. This
is the default, and will work for many environments. You can specify the name of any class that
implements the "org.infinispan.transaction.lookup.TransactionManagerLookup"
interface, including several provided by Infinispan.
The repository will store all BINARY values equal to or larger than 4096 bytes (line 16) in the binary
store that uses the file system (line 18). Smaller BINARY values are held in-memory or persisted with
the node information. The default size is 4096 bytes, and the default type is "filesystem".
The repository can also store all STRING values equal to or larger than a specified number of
characters. In this case, all STRING values with 4096 or more characters (line 17) will be stored in the
binary store that uses the file system (line 18). Smaller STRING values are held in-memory or
persisted with the node information. By default, the minimumStringSize value will be set to the
explicit or default value of minimumBinarySizeInBytes.
The repository will use several security providers for authentication and authorization. By default, only
the anonymous provider is used. The order of the providers is important: a caller will be authenticated
or authorized if any of the providers succeed for the caller:
The JAAS policy named "modeshape-jcr" will be used (lines 23-24). If the "jaas" nested
document is not specified, JAAS will not be used. If specified in this fashion, the JAAS security
provider will always be used first. The "modeshape-jcr" policy is used by default if JAAS is
enabled.
Any providers as configured by the "providers" nested array (lines 31-36), where each array
value is a nested document specifying the provider's name, description, and type (or
classname). Only the "type" (or "classname") field is required. The two built-in types are
"jaas" and "servlet", but any implementation of the
"org.modeshape.jcr.security.AuthorizationProvider" interface can be specified
instead. Any instance members on the implementation class can be set by specifying
additional fields of the same name, as long as the member type is String, a primitive boolean or
number, java.util.Map, or java.util.List.
The anonymous provider (lines 26-30) is enabled by default and (if enabled) always is the last
provider to be consulted. It authenticates all users with read and write permission by default,
although the exact roles (either "readonly", "readwrite", or "admin") can be configured with the
"roles" field; specify an empty "roles" array to completely disable the anonymous provider.
All sessions that are authenticated by this provider will be given the username given by the
"username" field (line 30), which defaults to the literal "<anonymous>" value (including the
angle brackets). Any user that fails to properly authenticate with another provider will not be
given an anonymous session unless the "useOnFailedLogin" field is set to true.
The query system (lines 38-67) is enabled by default but is explicitly enabled on line 39.
When the repository starts up, only the missing indexes will be rebuilt (lines 48-51) (which is
also the default), the system content area (under /jcr:system) will be indexed as well (by default
the system area isn't re-indexed), and all of the re-indexing will be done asynchronously (by
default, however, it is done synchronously).
The indexes will be stored on the file system under the directory "/path/on/filesystem"
and will use simple locking and automatically choose the kind of file system storage based
upon the operating system (lines 42-45). By default the indexes are stored in memory (with a
"type" value of "ram" and no other fields), so be sure to configure this carefully for your
application/environment.
The indexing system will use the "modeshape-workers" thread pool for re-indexing the
workspace content in the background (line 48), and will use the "StandardAnalyzer" for
tokenizing terms (line 49) and the "DefaultSimilarity" class for scoring (line 50). By
default the indexes will be stored using the current format (line 51), though it's recommended
to explicitly set the value matching the Lucene version you've started using (e.g.,
"LUCENE_34"). The readers will be shared (line 52) until index changes are discovered. The
"backend" nested document (lines 53-55) specifies how ModeShape is to handle changes to
the indexes; the default of "lucene" (line 54) means the changes will be written directly to the
local Lucene indexes, while other options allow using a JMS queue, JGroups, a "blackhole"
option for testing, or even a custom implementation. The other advanced properties (lines
56-59) specify the maximum node updates per transaction, whether the indexes are to be
written synchronously, and the thread pool size and queue size for asynchronous writes.
Text extractors (lines 61-70) are used to find the search terms from BINARY values. No text
extractors are used by default, but specifying the name, description, and type (or classname)
for one or more text extractor implementation classes enables this feature. Two text extractor
types are provided out of the box, and both are configured here with the required "type" fields
(e.g., "tika" and "vdb") and an optional description (useful for documentation and during
administration).
The configured sequencers (lines 72-87) specify the types of sequencers that should be run. Each
sequencer is configured with one or more path expressions that are matched against the paths of
changed nodes; when any changed path matches the expression, the sequencer is called on the
changed property/node and the generated output of the sequencer invocation is written to the location
specified in the path expression. Each sequencer is configured by specifying the required "type"
field, and an optional name and description. Custom implementations of the
"org.modeshape.jcr.api.sequencer.Sequencer" interface can be specified using the
"classname" field (instead of the "type" field), and any instance members on the implementation
class can be set by specifying additional fields of the same name, as long as the member type is
String, a primitive boolean or number, java.util.Map, or java.util.List. Several types of
sequencers are available out of the box:
"cnd" parses JCR CND files to generate a node structure describing the namespaces, node
types, property definitions, and child node definitions.
"class" and "java" parse Java class files and source files (respectively) and generate a
node structure describing the encoded types, fields, methods, parameters, etc.
"ddl" parses the more important DDL statements from SQL-92, Oracle, Derby, and
PostgreSQL, and constructs a graph structure containing a structured representation of these
statements. The resulting graph structure is largely the same for all dialects, though some
dialects have non-standard additions to their grammar, and thus require dialect-specific
additions to the graph structure.
"image" extracts metadata from JPEG, GIF, BMP, PCX, PNG, IFF, RAS, PBM, PGM, PPM
and PSD image files. This sequencer extracts the file format, image resolution, number of bits
per pixel and optionally number of images, comments and physical resolution.
"model" parses the model files produced by the Teiid Designer to extract the structured
relational data model described by the XMI file, and outputs a node structure that represents
this model.
"vdb" parses the VDB archive files produced by the Teiid Designer to extract the virtual
database information and the structured relational data model described in each of the
contained XMI model files, and outputs a node structure that represents the VDB and these
models.
"wsdl" parses WSDL files that adhere to the W3C's Web Service Definition Language (WSDL)
1.1 specification, and outputs a representation of the WSDL file's messages, port types,
bindings, services, types (including embedded XML Schemas), documentation, and extension
elements (including HTTP, SOAP and MIME bindings). This derived information is intended to
mirror the structure and semantics of the actual WSDL files while also making it possible for
ModeShape users to easily navigate, query and search over this derived information. This
sequencer captures the namespace and names of all referenced components, and will resolve
references to components appearing within the same file.
"xsd" parses XML Schema Documents that adhere to the W3C's XML Schema Part 1 and Part
2 specifications, and outputs a representation of the XSD's attribute declarations, element
declarations, simple type definitions, complex type definitions, import statements, include
statements, attribute group declarations, annotations, other components, and even attributes
with a non-schema namespace. This derived information is intended to accurately reflect the
structure and semantics of the XSD files while also making it possible for ModeShape users to
easily navigate, query and search over this derived information. This sequencer captures the
namespace and names of all referenced components, and will resolve references to
components appearing within the same files.
"xml" parses XML files and extracts the element, attribute, namespace, DTD, entity, comments
and other information in the file, producing a node structure representative of this information.
"zip" extracts the files and folders contained in the ZIP archive file, extracting the files and
folders into the repository using JCR's nt:file and nt:folder built-in node types. The
structure of the output thus matches the logical structure of the contents of the ZIP file. Note
that the resulting files may then be sequenced.
"mp3" processes MP3 audio files added to a repository and extracts the ID3 metadata for the
file, including the track's title, author, album name, year, and comment, and then writes a node
structure representing this information.
"fixedwidth" extracts rows and fixed-width columns from text streams and generates a node
structure representative of the rows and column values in each row.
"delimited" extracts rows and delimited columns from text streams and generates a node
structure representative of the rows and column values in each row.
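Returning to the security settings described earlier in this list, the provider ordering can be sketched in a few lines: providers are consulted in order, the first one that succeeds determines the session's user, and the anonymous provider comes last. This is only an illustration under stated assumptions. The Provider interface, the sample credentials, and the anonymous rule shown here (accept only callers that supplied no credentials, as with "useOnFailedLogin" set to false) are hypothetical stand-ins and are not the ModeShape interfaces.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class ProviderChainSketch {

    /** Hypothetical stand-in for a security provider; not the ModeShape interface. */
    interface Provider {
        /** Returns the authenticated user name, or empty if this provider rejects the caller. */
        Optional<String> authenticate(String username, String password);
    }

    /** Consults the providers in order; the first one that succeeds wins. */
    static Optional<String> authenticate(List<Provider> providers, String username, String password) {
        for (Provider provider : providers) {
            Optional<String> user = provider.authenticate(username, password);
            if (user.isPresent()) {
                return user;
            }
        }
        return Optional.empty();
    }

    /** Builds a sample chain: one credential-checking provider followed by the anonymous one. */
    static List<Provider> sampleChain() {
        // Made-up credentials, for illustration only.
        Provider custom = (u, p) ->
            "alice".equals(u) && "secret".equals(p) ? Optional.of(u) : Optional.<String>empty();
        // Accepts only callers without credentials, so a failed login is not downgraded to anonymous.
        Provider anonymous = (u, p) ->
            u == null ? Optional.of("<anonymous>") : Optional.<String>empty();
        return Arrays.asList(custom, anonymous);
    }

    public static void main(String[] args) {
        List<Provider> chain = sampleChain();
        System.out.println(authenticate(chain, "alice", "secret"));
        System.out.println(authenticate(chain, null, null));
        System.out.println(authenticate(chain, "alice", "wrong"));
    }
}
```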
3.2.4 Configuring the Infinispan Cache

As noted in the previous section, the repository configuration can specify the configuration file for the
Infinispan CacheContainer (see line 13 in the previous example). If neither a configuration nor an existing
CacheContainer instance can be found, a basic Infinispan configuration (a basic, local mode,
non-clustered in-memory cache) will be used.
The rest of this section describes some basic ways to configure Infinispan. However, please see the
Infinispan documentation for much more detailed information about how to properly configure Infinispan and
its cache loaders using its XML configuration file format.
Simple configuration
As with ModeShape, Infinispan's minimal configuration is a (basically) empty file:
Minimal Infinispan configuration
<infinispan />
This default configuration will result in a basic, local mode (not replicated or distributed), non-clustered,
in-memory cache. While this cache will make the ModeShape repository exceedingly fast, it's not the
most practical. So more often than not, you'll want to configure Infinispan to persist information.
Cache with Cache Store

One of the reasons Infinispan is so fast is because it keeps an in-memory cache of the information (node
content in ModeShape's case) most recently used. If all of the information can be kept in memory, then
retrieving and/or updating the information is extremely fast. However, keeping all the information in-memory
is not always a good idea, and Infinispan addresses this in several ways.
The most powerful way is to form a cluster of Infinispan caches so that Infinispan can distribute multiple
copies of each piece of information across the different members of the cluster. Normally there are many more machines
than there are copies, so the effective storage capacity is many, many times the capacity of a single
machine. Doing this forms a data grid, and Infinispan can always calculate on which processes in the grid a
piece of information is stored. And, because each piece of information is stored in multiple locations on the
grid, the information kept in memory is safe even if some of the grid fails.
An alternative is to use a cluster but to replicate every piece of information across all the processes in the
cluster. The size of these clusters is typically much smaller than a data grid, since for durability only a
handful of copies are needed. And because every process in the cluster contains all the data, this too is
extremely fast, though it can't scale to the capacity of a data grid.
Keeping information in memory is fast, but sometimes it's desirable to also persist the information
somewhere. Perhaps all of the information is to be persisted, or perhaps only that which can't be kept in
memory is to be persisted. Either way, Infinispan's cache loaders provide a way for Infinispan to write out
the information to an external store. The cache loaders that can persist information are also called cache
stores.
The cache loader system also means that we can use Infinispan even when we don't have a cluster where
Infinispan can replicate or distribute the information. In other words, we can configure an Infinispan cache
store when we're running ModeShape as a single process, and we're still able to persist the information.
Even in this mode, Infinispan will still act as a cache by keeping the most recently used items in-memory.
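The idea of an in-memory cache in front of a persistent store can be illustrated with plain JDK collections. The sketch below is not Infinispan code and makes no claim about how Infinispan is implemented: a HashMap stands in for the cache store, and an access-ordered LinkedHashMap keeps only the most recently used entries in memory. All class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class WriteThroughCacheSketch {

    // Stands in for the external cache store (for example, files on disk).
    private final Map<String, String> store = new HashMap<>();
    // Holds only the most recently used entries.
    private final LinkedHashMap<String, String> memory;

    WriteThroughCacheSketch(final int maxInMemory) {
        // accessOrder = true moves an entry to the end each time it is read or written.
        this.memory = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                // Evict from memory only; the store still holds every entry.
                return size() > maxInMemory;
            }
        };
    }

    void put(String key, String value) {
        store.put(key, value);  // every write is persisted
        memory.put(key, value); // and kept in memory while it is recent
    }

    String get(String key) {
        String value = memory.get(key);
        if (value == null) {
            value = store.get(key); // cache miss: load from the store
            if (value != null) {
                memory.put(key, value);
            }
        }
        return value;
    }

    boolean isInMemory(String key) {
        return memory.containsKey(key);
    }

    public static void main(String[] args) {
        WriteThroughCacheSketch cache = new WriteThroughCacheSketch(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.put("c", "3"); // "a" is evicted from memory but remains in the store
        System.out.println(cache.isInMemory("a") + " " + cache.get("a"));
    }
}
```

Because every write goes to the store, evicting an entry from memory never loses data; it only makes the next read of that entry slower.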
Here is a simple configuration file for Infinispan that defines a single cache named "Test Repository"
that stores its contents in an Oracle/Sleepycat BerkeleyDB database stored on the file system at
"/path/to/bdb":
Sample Infinispan configuration using BerkeleyDB cache store
<infinispan
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:infinispan:config:5.1
http://www.infinispan.org/schemas/infinispan-config-5.1.xsd"
xmlns="urn:infinispan:config:5.1">

<global>
</global>

<default>
</default>

<namedCache name="Test Repository">
<loaders
passivation="false"
shared="false"
preload="false">
<loader
class="org.infinispan.loaders.bdbje.BdbjeCacheStore"
fetchPersistentState="false"
purgeOnStartup="false">
<properties>
<property name="location" value="/path/to/bdb"/>
</properties>
</loader>
</loaders>
</namedCache>
</infinispan>
When a ModeShape repository is configured to use this Infinispan cache, all the repository contents will be
persisted to disk (either in the binary store or in the Infinispan cache). Thus, the repository can be shut down
and restarted without loss of any information.
Of course other cache stores are available. You can start out using them by replacing the
"org.infinispan.loaders.bdbje.BdbjeCacheStore" value in the Infinispan configuration with
another value. Again, see the Infinispan documentation for the details of how to properly configure cache
stores for your environment and needs.
org.infinispan.loaders.file.FileCacheStore - A simple loader that stores information on
the file system. This has severe limitations but is a simple cache loader for testing purposes. Note that
it is not transactional, and it should not be used on NFS or Windows shares that do not properly
implement file locking.
org.infinispan.loaders.bdbje.BdbjeCacheStore - A very fast cache loader that is ideal
when the client application or library can accept the license terms of BerkeleyDB.
org.infinispan.loaders.jdbm.JdbmCacheStore - A cache loader that uses JDBM, a free
alternative to BerkeleyDB.
org.infinispan.loaders.jdbc.JdbcStringBasedCacheStore - A JDBC-based cache
loader that stores each ModeShape node in a separate row in a simple 4-column table. This isn't as
fast as some other cache loaders, but works very well when the repository content needs to be stored
in a relational database. See the Infinispan documentation for details on configuring the JDBC store.
org.infinispan.loaders.cloud.CloudCacheStore - A cache loader that stores repository
content in Amazon S3, Rackspace Cloudfiles, or any other provider supported by JClouds.
org.infinispan.loaders.remote.RemoteCacheStore - A cache loader that can access a
remote Infinispan data grid.
org.infinispan.loaders.cassandra.CassandraCacheStore - A cache loader that can store
repository content in an Apache Cassandra database. See the Infinispan documentation for the
details on this cache loader.
3.2.5 Starting a ModeShape Repository

Now that we have a configuration for a ModeShape repository and a configuration for our Infinispan cache,
we can start writing the code to start up ModeShape, deploy our repository, and start using JCR.
Starting the ModeShape engine

The first step is to instantiate and start the ModeShape engine. As we mentioned earlier, the ModeShape 3
engine has no configuration, so this is almost trivial:
Start the ModeShape engine
// Create and start the engine ...
ModeShapeEngine engine = new ModeShapeEngine();
engine.start();
This uses the org.modeshape.jcr.ModeShapeEngine class' no-argument constructor, and then calls
start(), which will block until the engine is running. Since the engine is extremely lightweight, this returns
almost immediately.
At this point we have a running ModeShape engine, but it doesn't contain any repositories. That's next.
Deploying our Repository

In order to deploy a repository to our running engine, we need to read in the repository's configuration. This
is easily done with one of the org.modeshape.jcr.RepositoryConfiguration.read(...) static
methods to read a java.io.File, a java.io.InputStream, the java.net.URL to the file, or a String
with either the path to the resource file on the classpath or the JSON string itself. In this example, we'll read
the file from the classpath:
Read a ModeShape repository configuration
RepositoryConfiguration config = RepositoryConfiguration.read("my-repository-config.json");
Here, the name of the repository will either be defined in the file, or will be "my-repository-config" due
to the name of the file being read. Of course, we can also optionally change the name programmatically:
Optionally set the repository name programmatically
config = config.withName("My Repository");
Once we've read in the configuration, we can validate it to ensure it was constructed correctly. If not, we'll
print out the problems (which will have the line number and description for each error) and simply exit,
although you probably want to do something more useful.
Validate the repository configuration

// Verify the configuration for the repository ...
Problems problems = config.validate();
if (problems.hasErrors()) {
System.err.println("Problems with the configuration.");
System.err.println(problems);
System.exit(-1);
}
Any errors at this point will absolutely prevent deploying a repository, and they need to be dealt with. That's
why the above sample code exits the process if there are errors. However, not everything in the
configuration can be validated at this time. For example, references to CND files or initial content files can
only be dereferenced within a running environment, something which the RepositoryConfiguration
does not have on its own.
So after we determine the configuration has no errors, the next step is to deploy it to our engine:
Deploy the repository to the engine
javax.jcr.Repository repository = engine.deploy(config);
If there are any catastrophic problems, the repository will not successfully deploy and the above method will
throw an exception. If the repository does successfully deploy, then the repository will be in a running state.
Starting with ModeShape 3.6, the repository will record warnings and errors that do not prevent deployment
but which otherwise may be significant problems:
Checking for deployment problems
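// getStartupProblems() is a ModeShape method, not part of javax.jcr.Repository, so cast first
Problems startupProblems = ((org.modeshape.jcr.JcrRepository) repository).getStartupProblems();
if (startupProblems.hasErrors() || startupProblems.hasWarnings()) {
System.err.println("Problems deploying the repository.");
System.err.println(startupProblems);
System.exit(-1);
}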
Again, your application should handle such errors more gracefully than the sample code above.
After this, at any time we could shut down the repository and/or remove it from the engine. But let's
continue by getting a JCR Session.
Using the Repository and the JCR API

Once a repository has been deployed to an engine (and is running), we can simply look up the repository by
name:
Get the JCR Repository by name
javax.jcr.Repository repository = engine.getRepository("My Repository");
And at this point, we can use the standard JCR API to obtain a Session and start using the repository:
Create and use a JCR Session
javax.jcr.Session session = repository.login("default");
// Get the root node ...
Node root = session.getRootNode();
assert root != null;
System.out.println("Found the root node in the \"" + session.getWorkspace().getName() + "\" workspace");
session.logout();
Stopping the repository and engine

When we're finished with the engine, we can shut it down to stop all repositories, terminate any ongoing
background operations (such as sequencing), and reclaim any resources that were acquired by this engine.
Since this might take a little time, the "shutdown()" method returns immediately with a
java.util.concurrent.Future that you can use to wait until the shutdown process has completed. Of
course, if you don't want to block while the engine shuts down, there's no need to call "get()" on the future.
Shut down the ModeShape engine, optionally blocking until completed
Future<Boolean> future = engine.shutdown();
if (future.get()) { // blocks until the engine is shut down
    System.out.println("Shutdown successful");
}
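If blocking indefinitely on "get()" is not acceptable, the standard Future API also offers a timed wait. The following is a minimal sketch of that idea, not ModeShape code: the class name ShutdownWaiter and the helper awaitShutdown are hypothetical, and the main method uses an ordinary executor task as a stand-in for the Future<Boolean> returned by engine.shutdown().

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; not part of ModeShape.
class ShutdownWaiter {

    /** Waits up to the given timeout; returns true only if the future reported success in time. */
    static boolean awaitShutdown(Future<Boolean> future, long timeout, TimeUnit unit) {
        try {
            return Boolean.TRUE.equals(future.get(timeout, unit));
        } catch (TimeoutException e) {
            return false; // still shutting down; the caller may wait again or give up
        } catch (ExecutionException e) {
            return false; // the shutdown task itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in for engine.shutdown(): a task that finishes shortly and reports success.
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<Boolean> future = executor.submit(() -> {
            Thread.sleep(50);
            return true;
        });
        boolean ok = awaitShutdown(future, 5, TimeUnit.SECONDS);
        System.out.println(ok ? "Shutdown successful" : "Shutdown did not complete in time");
        executor.shutdown();
    }
}
```

In an application, the future passed to awaitShutdown would be the one returned by the engine, and the timeout would be chosen to suit how long background operations are allowed to run.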
This entire section showed how to use ModeShape to start an engine, deploy a repository, obtain the
repository, create a Session, and then shut down the repository and the engine. This required the use of
ModeShape-specific classes, which isn't always desirable. In the next section, we'll see how this same
process can be done while only using the standard JCR API.
3.2.6 Using JCR's RepositoryFactory

The JCR 2.0 specification introduced the javax.jcr.RepositoryFactory interface, which can be used
with the Java SE service-provider mechanism (java.util.ServiceLoader) to find a Repository instance
without using any implementation-specific APIs. The basic process is as follows:
Use only the standard JCR API to find a Repository
Map<String,String> parameters = ...
Repository repository = null;
for (RepositoryFactory factory : ServiceLoader.load(RepositoryFactory.class)) {
    repository = factory.getRepository(parameters);
    if (repository != null) break;
}
Session session = repository.login("default");
...
Note how simple this is, while under the covers it is doing exactly the same process we described above.
Here, the parameters contain implementation-specific properties, but your application can easily read them
from a file to keep all implementation-specific details out of your application code.
ModeShape requires one parameter:
Properties file for the ModeShape RepositoryFactory
org.modeshape.jcr.URL = file:path/to/my_repository.json
where the value of the property is a URL that resolves to the JSON configuration file. The URL can also
point to a file on the file system using an absolute path (e.g., "file:///abs/path/to/my_repository.json")
or to a configuration file served by a web server or governance repository (e.g.,
"http://www.example.com/repos/my_repository.json").
At this point using ModeShape just requires using the standard JCR API.
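Reading the parameters from a properties file, as suggested above, might look like the following sketch. It uses only java.util.Properties; the class name RepositoryParameters and the load method are hypothetical names chosen for illustration, and the main method reads from an in-memory string rather than a real file so that it runs on its own.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper; not part of ModeShape or the JCR API.
class RepositoryParameters {

    /** Parses properties-format text and copies every entry into a Map<String,String>. */
    static Map<String, String> load(Reader reader) throws IOException {
        Properties props = new Properties();
        props.load(reader);
        Map<String, String> parameters = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            parameters.put(name, props.getProperty(name));
        }
        return parameters;
    }

    public static void main(String[] args) throws IOException {
        // Same single property as in the properties file shown above.
        String text = "org.modeshape.jcr.URL = file:path/to/my_repository.json\n";
        Map<String, String> parameters = load(new StringReader(text));
        System.out.println(parameters.get("org.modeshape.jcr.URL"));
    }
}
```

The resulting map can be passed as the parameters argument to factory.getRepository(...) in the loop shown earlier. To read an actual file, a java.io.FileReader (or any other Reader) can be supplied in place of the StringReader.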
Oh, and if you want to shut down the ModeShape engine, you can (try to) cast the
javax.jcr.RepositoryFactory instance to an org.modeshape.jcr.api.RepositoryFactory
instance. If successful, you can call its "shutdown()" method, which returns a Future<Boolean> just like
ModeShapeEngine's "shutdown()" method.
3.3 ModeShape and JBoss AS7

If you're building a web application or service (using any Java web or EE technology) and deploying to
JBoss AS7, then the easiest way to set up ModeShape is to install it as a subsystem within AS7. Then you
can use the AS7 administrative tools (including the CLI) to dynamically configure one or more repositories,
and ModeShape registers them in JNDI where your applications can simply look them up and start using
them.
See our detailed instructions for installing and working within ModeShape and JBoss AS7.