EMC Documentum Architecture:
Foundations and Services for Managing Content across the Enterprise
A Detailed Review

Abstract

EMC Documentum is an enterprise content management (ECM) platform for ordering the flow and
delivery of unstructured business information across an extended enterprise. Based on an extensible, open,
scalable, flexible, and secure architecture that meets the needs of global, distributed organizations,
Documentum is a set of integrated products and services that work together. From creation to capture,
organization, and archiving, the Documentum content management solution addresses a range of strategic
business challenges.
November 2009

Copyright 2008, 2009 EMC Corporation. All rights reserved.


EMC believes the information in this publication is accurate as of its publication date. The information is
subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED AS IS. EMC CORPORATION
MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE
INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an applicable
software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks
on EMC.com. All other trademarks used herein are the property of their respective owners.
Part Number H3411.1

Table of Contents
Executive summary
Introduction
  Audience
Bringing order to unstructured business information
  Business benefits: Beyond information silos
  What EMC Documentum delivers
EMC Documentum: A service-oriented architecture
The foundation group
  Content objects
  Object relations
  Storing content objects
  Repository infrastructure
  Connecting to an underlying storage infrastructure
  High-Volume Server
  Security services
The application services group: Managing content as interrelated modules
  Compliance Services
  Core Content Services
  Process Services
The developer resources and tools group: Designing, developing, and administering information-based applications
  Configuration capabilities
  Administration capabilities
The experiences group: Managing end users' interactions
  Client Infrastructure
  End-user application frameworks
Conclusion

Executive summary
From engineering drawings and manufacturing procedures to marketing collateral and sales presentations,
business content comes in many forms. This unstructured content is critical to the smooth and efficient
functioning of an organization, yet needs to be managed in a systematic way.
Enterprise content management (ECM) technology captures, organizes, stores, and delivers unstructured
content within an enterprise and beyond. It manages this information according to predefined business
rules, policies, and procedures and establishes relationships among pieces of content so that the given items
can be used in different contexts and renditions.
ECM creates categorization schemas and metadata that make search and retrieval faster and more efficient.
It facilitates the publication of content through multiple channels. For instance, you can publish the same set
of words and pictures to a website, distribute it as a fax, print it as a hard-copy document, and send it to a wireless
device. To satisfy compliance mandates, the ECM system ensures content archiving and long-term
retention. In short, ECM systems automate the lifecycle processing of content.

Introduction
The EMC Documentum platform provides the foundation on which developers can build applications and
solutions for everything from managing business documents to publishing content across multilingual
websites to enabling collaboration with interactive tools. This white paper details the architecture of EMC
Documentum and identifies the four primary capability groups that form the foundation of ECM strategy. It
also explains how Documentum fits into a service-oriented approach to content-based applications.
Recent EMC Documentum composition platform advancements are worth a mention in this section, as they
represent a departure from the conventional approach of writing custom code. As businesses seek to reduce
costs, complexity, and risk associated with developing applications, EMC Documentum enhances its
platform offerings to address such challenges. Initially targeting case-based applications, the EMC
Documentum xCelerated Composition Platform (xCP) allows organizations to build case-based
applications and solutions. xCP is the new standard in application development as it combines integrated
technologies, development and deployment tools, and best practices in a single platform that emphasizes
configuration versus coding. xCP, however, is not the focus of this traditional architecture paper.

Audience
This white paper addresses application developers and IT executives who are looking to unite vertical
information silos by standardizing on a service-oriented platform that can manage content assets while
providing superior scalability and ease of use.

Bringing order to unstructured business information


Business benefits: Beyond information silos
Enterprise content management systems help integrate departments and other groups that otherwise
function within separate information silos. Using the ECM system, you can also share information with
business partners and any other contacts within the extended enterprise.
Why is this necessary and how does it add value? To be sure, the research and development department
will continue to produce product specifications and patents, while the marketing department generates
collateral and press releases, and the customer service organization responds to queries. Yet employees and
business partners need to access, collaborate on, and share such information across departmental boundaries,
such as when launching new products or collaborating to create new customer experiences. Secure access
to and easy sharing of such information is critical.

What EMC Documentum delivers


EMC Documentum orders the flow and delivery of unstructured and semi-structured business information
across an extended enterprise.
Based on an extensible, open, and secure architecture, Documentum is in fact a set of integrated products
and services that work together in varying combinations. From creation to capture, categorization, and
electronic storage, through just-in-time delivery and archiving, Documentum supplies the core technologies
that are critical for managing content within your organization:

Global and distributed. For enterprises with sites and customers around the world, Documentum
responds to users and content regardless of physical location. It includes unique caching capabilities
for high-performance management to any location in the world. To accommodate local languages and
currencies, the architecture stores multilingual content and metadata in shared repositories, forming a
single, virtual repository that spans geographical boundaries and languages.

Extensible. Extend Documentum to meet unique operational needs by embedding business rules or
custom-designed content objects. Documentum incorporates a service-oriented architecture (SOA) that
exploits the capabilities of enterprise content services for integrating with disparate enterprise
applications. Add customizations in multiple areas, including user authentication, rich media handling,
and legacy storage support.

Open. Because Documentum is standards-based, it easily integrates with both existing IT
infrastructures and evolving, web-centric environments. It exploits SOAP and REST as architectural
approaches to web services. There are standard Documentum APIs for WebDAV, FTP, SMB, and
JDBC. The architecture is fully JEE compliant (for web-based applications) and supports the Microsoft
.NET environment. Documentum integrates readily with enterprise applications and systems, including
directory services using the LDAP standard. It also supports a wide range of XML-based standards.

Interactive. Documentum supports interactive applications by assembling content resources around
communities of interest and particular business activities. It organizes content in an intelligent manner,
and turns information about people, what they know, and what they do into valuable business
resources.

Scalable. With support for billions of objects, benchmarks with 100,000 concurrent users, and an ingestion
speed of 450,000 documents per hour, Documentum is the proven leader in scalability. As your content
management needs grow larger and more complex, the solution manages increasing content volumes,
high-traffic loads, additional users, and more complex workflow processes. It also addresses the
network latency and large-scale distribution issues that face global enterprises by utilizing its
underlying multiprocessor systems as well as caching and clustering environments (vertical and
horizontal scalability).

Secure. As organizations make repository content available to a wider range of contributors,
Documentum enforces secure access designations. Access control lists define the users, groups, and
roles that can access the repository or the discrete objects that it contains, as well as the operations that
can be performed. Sensitive information in the repository file stores can be encrypted. Documentum also keeps
network communications among servers and desktop clients secure through the Secure Sockets Layer
(SSL). It supports electronic signatures and offers auditing of all system activities.
Finally, Documentum secures roving content: those files and other objects that may be moving around
the network and beyond the purview of the repository.
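
The access control list idea described under "Secure" can be pictured with a minimal Python sketch. It assumes an ACL is simply a mapping from a user or group name to a set of permitted operations. The class `Acl` and the operation names are hypothetical illustrations, not Documentum APIs or Documentum permission levels.

```python
from dataclasses import dataclass, field


@dataclass
class Acl:
    """Hypothetical ACL: maps a user or group name to allowed operations."""
    entries: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, principal: str, *operations: str) -> None:
        self.entries.setdefault(principal, set()).update(operations)

    def allows(self, user: str, groups: set[str], operation: str) -> bool:
        # Access is granted if the user, or any group the user belongs to,
        # holds the requested operation.
        principals = {user} | groups
        return any(operation in self.entries.get(p, set()) for p in principals)


acl = Acl()
acl.grant("engineering", "read", "write")   # group-level entry
acl.grant("alice", "delete")                # user-level entry
```

In this sketch a member of the engineering group may read and write, while only alice may delete; anyone not listed is denied by default.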

EMC Documentum: A service-oriented architecture


Because of its service-oriented architecture, Documentum provides a unified environment for capturing,
storing, accessing, organizing, controlling, retrieving, delivering, and archiving any type of unstructured
information. It also supports the resources for managing that content across an extended enterprise, for
publishing to the Internet, and for processing high-volume, content-intensive transactions.
For the purpose of simplification within this paper, consider that the Documentum architecture consists of
four conceptual groups:

The foundation group provides platform-oriented services and is a unified environment where content
is stored, accessed, and secured.

The application services group provides various business-oriented, application-level services for
organizing, controlling, sequencing, and delivering content to and from the repository.

The developer resources and tools group provides capabilities for designing, developing, and
administering enterprise-scale applications that use content within the context of business processes.
This group also includes Documentum Application Program Interfaces (APIs) as well as Enterprise
Content Services (ECS) for loosely coupling content-related objects with external enterprise services
and mash-ups.

The experiences group provides the frameworks and user interfaces for interacting with content
management functionality in desktop- or browser-based applications.

Each of these groups consists of a series of components, which, together, form a unified, consistent, and
extensible architecture, as shown in Figure 1.

Figure 1. Documentum consists of four groups: foundation (bottom violet), application
services (middle gold), experiences (top gray), and tools (right navy)
Let us examine the capabilities of these four groups and identify how they interrelate to provide a
comprehensive environment for managing content across an enterprise.

The foundation group


The foundation group provides platform services, such as storing, accessing, and securing content in a
unified content infrastructure.
Documentum contains an enterprise-wide repository where the logical services for accessing content are
separated from the underlying systems for storing it. To an application, the Documentum repository
appears as a unified environment, though content may reside on multiple servers and physical storage
devices and distributed geographically across an organization. Put another way, the operation of the
repository is independent of the network topology and the underlying storage infrastructure.
Documentum stores content in a consistent manner, regardless of content type, file size or complexity, and
file format. Content types include, but are not limited to, the following:

Ordinary business documents

Compound documents (containing interlinked and highly formatted text and graphics)

Web pages

XML documents and components

Scanned images

Digitized photographs

Multimedia digital assets (such as music, sounds, and full-motion video)

Fixed documents (such as the outputs and reports from enterprise applications)

E-mail and instant messages

Collaborative content such as threaded discussions, chats, blog posts, wiki pages, votes, and notes

Computer-aided design (CAD) drawings

Documents and data records from enterprise resource planning (ERP) applications

Virtual reality environments

Content objects
Content Server is an object-based system. Everything that users manipulate, whether documents, folders,
security profiles, or business processes, is stored and managed as an object by Content Server. Objects
comprise three parts:

Content assets or source data represent the core information stored in its native format.

Content attributes or metadata describe the content assets with descriptors such as keywords, owner,
version, links, and creation date.

Methods or operations are the instructions the system performs on content assets, such as transform,
notify, and display.

Out of the box, Documentum provides 100+ object types. These may be extended, or new ones may
be created. A content object's set of attributes and methods is configurable and extensible. Using
Documentum development tools, developers can create new object types that behave exactly as dictated by
specific business needs.
Furthermore, content attributes characterize the relationships among the stored content objects. The
repository organizes content around its metadata; users and applications use the metadata to interact with
and retrieve relevant content.
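
The three-part object model (content asset, attributes, methods) can be sketched in a few lines of Python. This is an illustration only: `ContentObject` and `find_by_attribute` are hypothetical names, the sample documents are invented, and nothing here reflects the actual Content Server type system.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ContentObject:
    """Hypothetical content object: a native-format asset plus metadata."""
    content: bytes                                   # content asset (source data)
    attributes: dict[str, object] = field(default_factory=dict)  # metadata

    def display(self) -> str:
        # A method (operation) that the system performs on the content asset.
        return self.content.decode("utf-8")


def find_by_attribute(objects, name, value):
    """Retrieve objects through their metadata rather than their content."""
    return [o for o in objects if o.attributes.get(name) == value]


spec = ContentObject(b"Pump assembly procedure",
                     {"owner": "rnd", "version": "1.0",
                      "creation_date": date(2009, 11, 1)})
memo = ContentObject(b"Launch plan", {"owner": "marketing", "version": "2.1"})
```

The lookup function touches only the attributes, which mirrors how the repository organizes content around its metadata.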

Object relations
The Documentum Docbase lets you establish many types of relationships between objects. Content Server
includes various system-defined relationships, such as the relationship between a document and a note object
representing annotations to the document, or the relationship between a document and the workflow and
lifecycles assigned to it. In addition, the system lets you define custom relationships. For example, you
might define a relationship between two document types so that a document of one type is automatically
updated whenever a document of the other type is updated. Developers can define this type of relationship
and write procedures to manage it.
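
One way to picture such a custom relationship is the sketch below, in which updating one document propagates to every document related to it. The `Document` class, its `relate` method, and the version counter are assumptions made for illustration; they do not represent how Content Server stores or manages relationships.

```python
class Document:
    """Hypothetical document with a simple version counter."""

    def __init__(self, name):
        self.name = name
        self.version = 1
        self._dependents = []   # documents linked by the custom relationship

    def relate(self, dependent):
        """Declare that `dependent` must be refreshed when this document changes."""
        self._dependents.append(dependent)

    def update(self):
        self.version += 1
        # The procedure that manages the relationship: propagate the update.
        for dependent in self._dependents:
            dependent.update()


spec = Document("product-spec")
datasheet = Document("datasheet")
spec.relate(datasheet)
spec.update()   # also bumps the related datasheet
```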

Storing content objects


Documentum works as a unified environment for storing content objects. It stores these objects in their
native formats and encrypts them as needed. This way, applications rely on a single set of services and
programming interfaces to access content, regardless of where and how the system stores the content
objects themselves. The repository enforces security measures to ensure that only your authorized users and
applications have access to the assets and indexes of content attributes.
Documentum responds to the needs of your organization. This adaptability and flexibility is particularly
important for organizations that operate in multiple locations and require a distributed repository for
storing, caching, fetching, and updating content, while also managing rapid access across the enterprise.
The virtual reach makes it possible to implement distributed environments in various ways that ensure
enterprise-wide access. It enhances system performance and maintains underlying security and compliance
requirements. This provides you with many options for designing and deploying the virtual repository in
ways that best meet your operational objectives.
For example, a global company might host a Documentum content repository in multiple geographical
regions, storing content locally to meet corporate quality-of-service guarantees. To further enhance
productivity and business objectives, such a company can also support a series of branch offices in remote
locations. Important documents, large multimedia files, and other content types can be distributed and
cached at the branch offices, so that they are immediately available to local users (with no performance
degradation associated with accessing files across low-bandwidth connections).
Users at branch offices can access and modify content, as their roles require. The overall security and
access controls extend across the entire enterprise environment in a seamless fashion. If need be, content
stored at branch offices can be automatically encrypted for added security. Updates made by branch office
workers can be synchronized with the regional repositories in a predictable manner, optimized to ensure a
responsive experience and the currency of the revised content. The end result is a distributed virtual
repository that securely manages content, regardless of geography or network bandwidth, to meet strategic
business goals and objectives.

Repository infrastructure
The Documentum repository consists of four main components, which behave as a single entity from an
application point of view: a file store containing the content assets; attribute tables within a relational
database; an XML store for XML content; and full-text indexes. Figure 2 shows the four components of the
Documentum repository.

Figure 2. The four components of the Documentum repository: file stores containing
content assets; attribute tables within a relational database; an XML store for XML
content; and full-text indexes. All components behave as a single entity from an
application point of view

File store and RDBMS


Documentum stores content attributes in a relational database for rapid query and retrieval. It stores the
content assets as files in a logical file store, which in turn can encompass one or more physical file stores
and connect through a variety of network topologies.

XML Store
The Documentum XML Store is a native XML repository for XML content. It preserves XML structure,
allowing you to query content at any level of detail (for example, individual elements, attributes, content
objects, or metadata attributes), even on very large information sets. The XML Store provides performance
and feature advantages over relational databases and file systems through specialized XML indexing
methods, caching, and an architecture optimized for XML.
XML Store content is accessible through standard Documentum APIs and subject to the same policies and
management as other content objects. It manages updates and additions to existing Documentum
operations. You can query XML content in an XML Store using XQuery, via Documentum Foundation
Classes (DFC) API calls. Query results return to the application as an XML document. XML Store also
indexes content in the full-text system. Searches and queries can be performed with DQL, full-text
expressions, XQuery, or any combination.
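
To illustrate what querying XML "at any level of detail" means, the sketch below selects elements and filters by an attribute. It uses the limited XPath support in Python's standard `xml.etree.ElementTree` module as a stand-in; it is not XQuery and does not use the XML Store or DFC. The sample manual is invented.

```python
import xml.etree.ElementTree as ET

manual = ET.fromstring("""
<manual product="pump-200">
  <chapter id="c1"><title>Installation</title><step>Mount the pump.</step></chapter>
  <chapter id="c2"><title>Maintenance</title><step>Replace the seal.</step></chapter>
</manual>
""")

# Element-level query: every chapter title in the document.
titles = [t.text for t in manual.findall("./chapter/title")]

# Attribute-level query: the steps of one specific chapter only.
steps = [s.text for s in manual.findall("./chapter[@id='c2']/step")]
```

Because the structure of the document is preserved, a query can return a single step without retrieving the whole manual.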

Full-text indexes
Documentum maintains a full-text index of all text-based content assets stored within the Documentum
repository so it can rapidly search through large collections of unstructured information. The indexed
content assets include documents, text files, XML components, HTML files, and closed-caption tracks of
video files.
The FAST Index Server search technology is embedded within Documentum. The search capability is
modular, with alternate engines for market-specific Documentum offerings. For instance, the Documentum
OEM edition, built for software vendors that embed Documentum in their products, offers the open-source
alternative, Lucene, as the default engine. For all standard enterprise customer offerings, however, a FAST
search engine is built into the repository.
The full-text index, which is automatically created by an indexing process when content is added to the
repository, contains:

All words of the content assets stored within the repository

All keywords and other content attributes (or metadata) that describe the content assets

As part of the content ingestion process, an index agent forwards content to an index server, which
maintains the full-text index database. Documentum ensures that query performance and scalability are not
affected by repository size. To scale up for high-speed content ingestion, the indexing process can run on
multiple indexing pipelines deployed on multiple CPUs. This is particularly important in content archiving
applications for e-mail, enterprise reports, and SAP data. Figure 3 shows the indexing and query process
flows.

Figure 3. Documentum maintains a full-text index of all text-based content assets stored
within the Documentum repository. It accomplishes this through a set of plug-ins and APIs
for querying and indexing functions

Documentum search is exposed through Documentum Query Language (DQL), and either Documentum
APIs or Content Services can be used to issue query statements. In addition to searching the text within the
content assets, the full-text engine also searches all content attributes. So within a single query, the search
engine analyzes content on two levels, the content assets and the content attributes, and returns a unified
results list. As part of its query algorithms, the search engine analyzes and normalizes text, and identifies
synonyms based on a thesaurus of related terms. The search engine stores and supports multiple languages
within a single index, eliminating the need for multiple, language-specific indexes. It now supports more
than 70 languages.
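
The two-level search described above (content assets and content attributes, returned as one result list) can be sketched as a toy inverted index. This is a generic illustration of the technique, not the embedded search engine; `FullTextIndex` is a hypothetical class, and normalization here is limited to lowercasing, with no thesaurus or language handling.

```python
from collections import defaultdict


class FullTextIndex:
    """Toy inverted index covering both content text and attribute values."""

    def __init__(self):
        self._postings = defaultdict(set)   # term -> ids of matching objects

    def add(self, object_id, text, attributes):
        # Index words from the content asset and from its metadata alike.
        words = text.split() + [str(v) for v in attributes.values()]
        for word in words:
            self._postings[word.lower()].add(object_id)

    def search(self, term):
        # A single lookup yields one unified result list for both levels.
        return sorted(self._postings.get(term.lower(), set()))


index = FullTextIndex()
index.add("doc1", "Quarterly sales report", {"owner": "finance"})
index.add("doc2", "Budget memo", {"keywords": "sales"})
```

Searching for "sales" finds doc1 through its content and doc2 through its keyword attribute.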

Connecting to an underlying storage infrastructure


The Documentum repository transparently connects with the underlying storage infrastructure, which
consists of multiple disk drives and other types of mass storage devices. You can design the storage
infrastructure according to the specific reliability, security, policy, cost, and operational needs of your
organization. Documentum makes no distinction between content stored in different types of environments.
It relies on the file system APIs to communicate with the file system interface of the underlying file store.
Documentum supports any type of storage system, from a server's local hard drives and network-accessible
RAID arrays to network-attached storage (NAS) or complex storage area networks (SAN), regardless of
manufacturer. The storage system is transparent to the Documentum platform. For example, full-motion
video files can reside on a high-performance streaming server while text-oriented files are hosted on a file
server tuned to rapidly look up filenames. If, for operational, performance, or security reasons, an
enterprise manages all of its content in a relational database management system (RDBMS), then the
content assets can also be stored as binary large objects (BLOBs) adjacent to the attribute tables.
Documentum also provides two storage-specific services that enable system designers to enhance content
storage capabilities: Content Storage Services and Content Services for EMC Centera.

Content Storage Services


Content Storage Services add a storage policy engine to the Documentum repository that enables event-triggered,
ad hoc, and batch execution of storage allocation and migration policies. Storage administrators
can define, manage, and update the content storage policies to store live or frequently updated content on
one set of devices, and archived content on another. Content Storage Services include audit events and
migration logs, which enable easy reporting and chargeback capabilities. Content Storage Services can
connect to Write Once Read Many (WORM) drives using the retention file store connector.
For example, when content is initially created it can be automatically stored in an online storage device.
Frequently accessed content can remain within a high-performance storage environment, while rarely
accessed content can migrate on a scheduled basis to a near-line, more economical storage environment.
Valuable content that needs to be preserved for a predetermined period of time, such as the final versions of
business documents, can be automatically stored in a highly secure storage environment. Transitory
content, such as successive drafts of business documents or other work-in-progress items, can be securely
stored and rapidly accessed as needed, and then routinely purged when the project ends.
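
The tiering example above can be expressed as a small policy function. This is a sketch under stated assumptions: the tier names, the 90-day threshold, and the function `choose_tier` are invented for illustration and are not Content Storage Services settings.

```python
from datetime import date, timedelta


def choose_tier(last_accessed: date, is_final_version: bool, today: date,
                cold_after_days: int = 90) -> str:
    """Hypothetical policy: return the storage tier for a content object."""
    if is_final_version:
        return "retention"    # preserved for a predetermined period
    if today - last_accessed > timedelta(days=cold_after_days):
        return "near-line"    # rarely accessed, more economical storage
    return "online"           # frequently accessed, high performance


today = date(2009, 11, 30)
```

A policy engine would evaluate a rule like this when an event fires, on demand, or in a scheduled batch, and then migrate the content accordingly.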

Content Services for EMC Centera


Content Services for EMC Centera is the bridge between the Documentum repository and Centera, an EMC
content-addressed storage (CAS) system that ensures fast, easy, online access with assured content
authenticity and petabyte scalability. The enterprise content management capabilities of Documentum
function seamlessly with the EMC Centera CAS architecture to deliver an extensible and scalable
foundation layer for fixed content assets. By providing these valuable capabilities on the storage level,
EMC Centera complements the software-level security and compliance Documentum provides for fixed
content assets.
EMC Centera provides a scalable, secure storage environment for cost-effective retention, protection, and
disposition of fixed content (including electronic records, e-mail archives, and scanned images) within an
enterprise environment. EMC Centera is optimized to store long-lived and archival content.
Content Services for EMC Centera relies on the plug-in architecture of the Documentum platform. The
content is stored directly within EMC Centera, which serves as a file store instead of a file system of the
underlying operating system. The content objects contain Centera-issued claim checks that are stored as
properties of the content objects in the Documentum repository.
Offering single store capabilities, EMC Centera ensures there are no duplicate or redundant versions, which
improves overall storage efficiency and performance. For non-Centera storage environments,
Documentum also has the ability to deduplicate content.
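
The general idea of content-addressed storage, and why it avoids duplicates, can be sketched as follows. The address is derived from the content itself, so storing the same bytes twice yields the same address and only one stored copy. The use of SHA-256 and the class `ContentAddressedStore` are assumptions for illustration; this is not the Centera API or its addressing scheme.

```python
import hashlib


class ContentAddressedStore:
    """Toy content-addressed store: the address is derived from the content."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # SHA-256 is an assumption for this sketch, not Centera's actual scheme.
        address = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(address, data)   # identical content is kept once
        return address                          # plays the role of the claim check

    def get(self, address: str) -> bytes:
        return self._blobs[address]


store = ContentAddressedStore()
check_a = store.put(b"signed contract v1")
check_b = store.put(b"signed contract v1")   # duplicate submission
```

In the arrangement described above, the returned value would be stored as a property of the content object in the repository and used later to retrieve the content.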

High-Volume Server
Through High-Volume Server, Documentum supports data-intensive, transactionally oriented, content
management applications. These services enable rapid ingestion, efficient database storage, and reliable
access to content.
High-Volume Server enhances the online and offline ingestion of content and reduces the metadata
footprint of the database within the repository. These services include lightweight system objects, together
with data partitioning and batch processing capabilities that are added to the Documentum platform.

Lightweight System Objects


Documentum recognizes the commonality of metadata for content objects in high-volume, repeatable
situations, such as transactional or archival applications. Documentum can optimize these content objects
as Lightweight System Objects (LwSOs).
An LwSO is a composite object, composed of a parent object and a series of child objects (as shown in
Figure 4). The common metadata is maintained by the parent object. The unique metadata is then maintained by each child object.
The LwSO infrastructure enables normalization of metadata whenever a large set of objects share common
attributes and policies. Storing the common metadata once (as part of the parent object) saves storage space
and increases overall performance.

Figure 4. Lightweight System Objects (LwSOs) separate unique, application-specific
metadata from common system-specific metadata, creating a parent/child relationship.
The system-specific metadata, which is identical across all the similar objects, is kept in a
single parent object. The numerous child objects contain only the attributes unique to the
object. As a result, the LwSOs can be stored efficiently within the HVS
LwSOs can be useful when there is content that can share common system and application attributes in a
hierarchical or parent-child relationship. This type of object model can improve object ingestion, as well as
reduce the database footprint. Consider the case of scanned check images. A bank may scan thousands of
checks in a single day. If all of the checks scanned on a particular day could share common attributes, like
bank name and bank routing number, and share a common retention policy, then LwSOs could be useful.
Placing the common attributes on the parent object allows the lightweight children to share the attributes
while eliminating the need to store this redundant information in the database for each child object.
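
The scanned-check example can be modeled with a shared parent and lightweight children. The sketch below is illustrative only: the class names, field names, and sample values (including the placeholder routing number) are invented and do not reflect the High-Volume Server object model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParentObject:
    """Shared metadata, stored once for the whole batch of checks."""
    bank_name: str
    routing_number: str
    retention_policy: str


@dataclass
class LightweightCheck:
    """Child object: holds only its unique attributes plus a parent reference."""
    parent: ParentObject
    check_number: int
    amount_cents: int

    def metadata(self) -> dict:
        # Build the full view by combining shared and unique attributes.
        return {**vars(self.parent), "check_number": self.check_number,
                "amount_cents": self.amount_cents}


parent = ParentObject("Example Bank", "000000000", "7-years")
checks = [LightweightCheck(parent, n, 1000 * n) for n in range(1, 4)]
```

Every child refers to the same parent instance, so the bank name, routing number, and retention policy exist once regardless of how many checks are ingested.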

Data partitioning
Data partitioning enables related objects to be placed in distinct database partitions (or ranges) when stored
within the repository. This improves the management, search, and processing of content within the
repository, by ensuring that the objects are managed in a highly efficient manner.
Data partitioning lets you maintain huge amounts of data in slices, resulting in better performance and
manageability. For instance, range partitioning provides performance (and cost savings) benefits that
increase with the amount of old data that is kept. Highly active content is maintained in hot partitions,
while less active data remains in the cold partitions. Hot partitions can be backed up more frequently than
cold partitions. Thus, data partitioning can reduce the time and cost of database backups and other types of
very large database maintenance activities.
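As a rough sketch of the hot/cold idea, the following Python fragment assigns objects to yearly range partitions and derives a backup schedule from partition age. The partition naming scheme (`p2009`), the one-year definition of "hot", and the daily/monthly frequencies are assumptions chosen for illustration; real partitioning is configured in the underlying database.

```python
from datetime import date


def partition_for(created: date) -> str:
    """Map an object's creation date to a yearly range partition."""
    return f"p{created.year}"


def is_hot(partition: str, today: date, hot_years: int = 1) -> bool:
    """Treat partitions from the most recent `hot_years` years as hot."""
    return today.year - int(partition[1:]) < hot_years


def backup_plan(partitions, today: date) -> dict:
    """Back up hot partitions daily and cold partitions monthly."""
    return {p: ("daily" if is_hot(p, today) else "monthly")
            for p in sorted(partitions)}


# Hypothetical objects and creation dates.
objects = {"doc-a": date(2007, 3, 1), "doc-b": date(2009, 6, 15),
           "doc-c": date(2009, 11, 2)}
partitions = {partition_for(d) for d in objects.values()}
plan = backup_plan(partitions, today=date(2009, 11, 30))
```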

Batch processing
Batch processing speeds ingestion by combining operations that interact with the database. Large volumes
of content can be ingested at a very rapid rate by streamlining the object model and by supporting bulk
operations on associated metadata.
In particular, batch processing reduces the total overhead associated with inserting each object individually
into the repository. Objects are grouped into a batch, and then inserted into the repository in a single
operation. The greater the number of objects that need to be inserted, the greater the reduction in overhead
achieved.
Repository operations, including the ingestion of objects, involve process checking and validation. When
ingesting a large number of objects, this checking and validation would otherwise be repeated for each
object. Scoping makes it possible to eliminate the redundant checking associated with an operation when
performing that operation on a large number of objects. Thus, batch processing is often used with the
scoping feature to eliminate redundant processing associated with repository operations.
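The combination of batching and scoping can be illustrated with Python's built-in `sqlite3` module standing in for the repository database. This is a sketch of the general technique only: the `objects` table, the `validate_type` check, and the `ingest_batch` helper are hypothetical and are not Documentum APIs.

```python
import sqlite3

validation_calls = 0


def validate_type(type_name: str) -> None:
    """Stand-in for a per-operation check (hypothetical)."""
    global validation_calls
    validation_calls += 1
    if type_name != "check_image":
        raise ValueError(f"unknown type: {type_name}")


def ingest_batch(conn, type_name, names):
    """Validate once for the whole batch, then insert in one transaction."""
    validate_type(type_name)  # scoped: one check instead of one per object
    with conn:                # a single commit for the whole batch
        conn.executemany(
            "INSERT INTO objects (type_name, name) VALUES (?, ?)",
            [(type_name, name) for name in names],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (type_name TEXT, name TEXT)")
ingest_batch(conn, "check_image", [f"check-{i}" for i in range(1000)])
count = conn.execute("SELECT COUNT(*) FROM objects").fetchone()[0]
```

The per-object cost that remains is the row insert itself; the validation and the commit are paid once per batch, which is why the saving grows with batch size.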

Security services
Documentum provides core content security by managing access to the underlying repository. Additional
security can be added via Trusted Content Services and Information Rights Management Services.
The core security services include:

Authentication

Authorization

Auditing

Each fulfills a unique function within an organization's security architecture. First, Documentum builds on
the underlying enterprise-wide security infrastructure to authenticate access to the repository. Next,
Documentum manages access control lists (ACLs) to authorize access to content stored within the
repository. Any activity can be audited using flexible auditing tools, with an audit trail stored in the
repository. Documentum can then encrypt all communications between the content server and other
systems such as clients, web-based applications, and directory servers.
Let us examine each in turn.

Authentication
Documentum relies, initially, on the authentication mechanisms of the underlying operating system or
database, such as a username/password challenge, to manage access to the repository. Documentum
supports token-based authentication for application-level access, ensuring that client applications have
valid tokens to connect to the repository and gain access to the content. Documentum includes RSA Access
Manager connections for single sign-on. The authentication mechanisms are extendable to include
Kerberos validation and authentication plug-ins from CA Netegrity.
Enterprise identity management
Documentum is designed to integrate seamlessly within an enterprise-wide security architecture; where an
enterprise directory service exists, Documentum relies on it for enterprise identity management.
Documentum supports connections to multiple directory services and can be integrated with many popular
directory servers, including Microsoft Active Directory, Sun ONE Directory Server, Oracle Internet
Directory, IBM Tivoli Directory Server, and Novell eDirectory. Documentum also supports the Microsoft
Active Directory Application Mode (ADAM) service. Documentum uses Lightweight Directory Access
Protocol (LDAP) to synchronize user and group identities, ensuring user identities are managed as an
enterprise-wide resource without adding extra administrative burden. Documentum includes support for
LDAP certification database automation, which expedites administrative tasks for identity management.

Authorization
Once a user or application authenticates an identity, the person or program can access the stored content
based on the privileges associated with that identity. The authorization rules (also called access controls or
permissions) then determine what content can be accessed or modified.
Documentum assigns authorization rules through access control lists (ACLs), which are automatically
applied to all repository objects when the objects are created. The ACLs can be modified manually by users
as well as automatically via lifecycle changes, through business processes, and through other applications.
Documentum applies authorization at the object level, so that every content object, version, and rendition, as
well as every container (ranging from folders to storage servers) and any other object (business process,
policy, audit trail, and so on), is secured by an ACL throughout its lifecycle.
Three criteria for ACLs
Documentum authorizes access based on one of three criteria:

Explicit assignment to an individual user

Membership in a user group

Assignment to a predefined role

Individuals, groups, and roles can own a content object managed by Documentum. For example, when
developing a press release, anybody with the role of PR Manager might be authorized to create a new
press release, and any member of the PR Group can have privileges to edit it. The tasks can be shared
and coordinated by managing role definitions across a workgroup so that managing a press release is no
longer limited to a predefined named individual.
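A minimal sketch of how the three criteria might combine is shown below. The `User` and `Acl` classes, the action names, and the rule that grants from all three sources are simply merged are assumptions for illustration; they do not describe how Content Server evaluates ACLs internally.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    groups: set = field(default_factory=set)
    roles: set = field(default_factory=set)


@dataclass
class Acl:
    # Each mapping goes from an accessor name to the set of allowed actions.
    users: dict = field(default_factory=dict)
    groups: dict = field(default_factory=dict)
    roles: dict = field(default_factory=dict)

    def allowed(self, user: User) -> set:
        """Merge explicit, group-based, and role-based grants."""
        actions = set(self.users.get(user.name, set()))
        for group in user.groups:
            actions |= self.groups.get(group, set())
        for role in user.roles:
            actions |= self.roles.get(role, set())
        return actions


# Hypothetical ACL for the press release example.
press_release_acl = Acl(
    groups={"PR Group": {"edit"}},
    roles={"PR Manager": {"create"}},
)
alice = User("alice", roles={"PR Manager"})
bob = User("bob", groups={"PR Group"})
carol = User("carol")
```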
Basic permissions
Documentum provides seven levels of basic permissions, or access privileges:

None: Content objects in the repository cannot be seen, reducing complexity by hiding content
irrelevant to predefined users. It is also an effective way to screen sensitive documents or projects, and
ensure that only those people and processes with proper privileges can find object references in the
repository.

Browse: Content attributes (or metadata) for content objects can be viewed, but the content assets
cannot be opened and read.

Read: Content assets can be opened and read, but not changed.

Relate: A user can create relationships between a given content object and other objects within the
repository. This permission is used by tools such as annotations where each annotation is a new object
that relates to an existing content object.

Version: A user can make changes to a content asset but cannot overwrite an existing version; changes
are saved in a new version, which can include a modified file, modified metadata, or both.

Write: A user can make changes to a content object (both the content asset and the associated
metadata) and save those changes without creating a new version. This level of access control is
usually restricted to the content owner.

Delete: A user can delete a content object.

This set of permissions is cumulative: Each level automatically grants all access rights of the levels below
it. For example, a user with write privileges can also version, relate, read, and browse the
contents. Delete is a special case, discussed next. There are additional advanced features regarding user
authorizations that are described later in this paper.
Object-level delete privileges
The delete object permission grants deletion privileges while denying other levels of access; that is, a
user or process can delete a content object without having permission to write, version, read, or relate. This
capability enables a corporate archivist, librarian, or records manager to dispose of objects from the
repository according to retention policies, without being able to access any aspect of their contents.
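The cumulative levels and the delete special case, as described in this section, can be modeled with an ordered enumeration. The sketch below follows the description in this paper only; the numeric values and the `can` helper are illustrative assumptions, not the product's API.

```python
from enum import IntEnum


class Permit(IntEnum):
    # Ordered so that a higher value implies every level below it.
    NONE = 1
    BROWSE = 2
    READ = 3
    RELATE = 4
    VERSION = 5
    WRITE = 6
    DELETE = 7


def can(granted: Permit, requested: Permit) -> bool:
    """Return True if `granted` covers the `requested` level."""
    if granted is Permit.DELETE:
        # Special case from the text: delete grants deletion only.
        return requested is Permit.DELETE
    if requested is Permit.DELETE:
        return False
    return granted >= requested
```

With this model, `can(Permit.WRITE, Permit.VERSION)` is true because write sits above version, while `can(Permit.DELETE, Permit.READ)` is false because the delete permission carries no other access.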
Extended permissions
Documentum supports multiple, extended permissions for managing the content objects within the
repository, such as:

Change location: A user can change the location of a content asset from one folder to another. By
default, a user with browse permission or greater has change location privilege.

Change permission: A user other than the content owner can change a content asset's standard
permissions.

Change owner: A user other than the content owner can change the owner of a content asset. This is
important when content ownership is to be reassigned, and the original content owner is unavailable.

Execute procedure: A user can execute an external procedure on content assets, such as creating a
rendition. By default, a user with browse permission or greater inherits execute procedure
privileges.

Change state: A user can change a content asset's lifecycle state.

Documentum controls access to the content objects as well as secures how they are organized and
categorized in the repository. As a result, Documentum provides the core security services that determine
what actions can be performed on a content object.
Auditing
Every operation performed by Documentum can be recorded in an auditable record. The audit trail can be
fully configured in the Documentum administrator (where it can also be viewed) and is secured in the
repository by strong encryption.
The audit trail meets the rigorous requirements of the FDA 21 CFR Part 11 regulation, considered a
benchmark for auditing. But the audit trail can go further into the scope and granularity of audited events,
and can be used to trace possible security breaches and optimize system utilization.
Each auditable record lists both the new and the previous values associated with an event (such as the time
and username when a document is checked out of the repository), enabling quick determination of what has
changed. End users and administrators can also view the history of documents and other objects stored in
the repository, so that they can determine how and when the information changes.
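The idea of an auditable record that carries both the previous and the new value can be sketched as follows. The `AuditRecord` fields, the in-memory `audit_trail` list, and the helper functions are assumptions made for illustration; the actual audit trail is stored as objects in the repository.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    event: str
    object_id: str
    user: str
    attribute: str
    old_value: object
    new_value: object
    timestamp: datetime


audit_trail = []


def set_attribute(obj: dict, object_id: str, attribute: str,
                  new_value, user: str) -> None:
    """Change an attribute and record the before and after values."""
    old_value = obj.get(attribute)
    obj[attribute] = new_value
    audit_trail.append(AuditRecord(
        "attribute_change", object_id, user, attribute,
        old_value, new_value, datetime.now(timezone.utc)))


def history(object_id: str) -> list:
    """Return the audit records for one object, oldest first."""
    return [r for r in audit_trail if r.object_id == object_id]


# Hypothetical document and users.
doc = {"status": "draft"}
set_attribute(doc, "doc-1", "status", "in review", user="alice")
set_attribute(doc, "doc-1", "status", "approved", user="bob")
```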

Encrypted communication
All communications involving Content Server (such as between Content Server and an application
server, Content Server and desktop clients, and Content Server and a directory server) use SSL standard
encryption to prevent security breaches by eavesdropping.

Trusted Content Services


Documentum adds Trusted Content Services to tackle application-specific security situations beyond the
authentication and authorization mechanisms provided by the core security services of the content platform.
Trusted Content Services include:

Encrypted file stores: Repository content files can be encrypted to prevent system-level intrusion and
to secure content files stored on backup media. The encryption can be done selectively per file store, so
each repository can combine encrypted and unencrypted content.

Digital shredding of deleted items: Shredding irrevocably destroys content at an operating system
level by overwriting the data on the storage device. Documentum shreds content stored on both file
systems and CAS devices.

Support for electronic signatures: Users can sign electronic documents in a manner that meets
established industry standards to verify the integrity of the signed document.

In addition, Trusted Content Services can enrich the underlying security model and extend authorization
mechanisms through Mandatory Access Control List (MACL). This mechanism provides an additional
level of security before granting an authenticated user access to a content object.
Specifically, MACL can:

Enforce membership rules: Ensures that a user is a member of an externally defined group before
verifying authorization privileges

Enforce restriction rules: Restricts a user's access privileges to a specific level even if the ACL
provides for a higher level of access

Apply application-level access control: Augments an ACL with an application-specific security
setting
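The membership and restriction rules above can be sketched as a filter applied on top of the level an ACL already grants. The integer levels, the `effective_permission` function, and the interpretation that a user must belong to every required group are assumptions for this sketch, not the Trusted Content Services implementation.

```python
# Illustrative permission levels (higher means more access).
NONE, BROWSE, READ, WRITE = 1, 2, 3, 6


def effective_permission(acl_level: int, user_groups: set,
                         required_groups: frozenset = frozenset(),
                         restricted_to=None) -> int:
    """Apply membership and restriction rules to an ACL-granted level."""
    # Membership rule: without the required groups, nothing is granted.
    if not required_groups <= user_groups:
        return NONE
    # Restriction rule: cap the level even if the ACL grants more.
    if restricted_to is not None:
        return min(acl_level, restricted_to)
    return acl_level
```

For example, a user whose ACL entry grants `WRITE` but who is subject to a restriction to `READ` ends up with `READ`, and a user outside a required group ends up with `NONE` regardless of the ACL.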

Information Rights Management Services


Information Rights Management (IRM) Services extend the security and access controls on documents and
other types of content beyond the boundaries of the content platform. IRM Services secure roving content that
requires persistent protection across the network and wherever the content is located and stored.
IRM Services add an IRM policy server to the enterprise environment, as shown in Figure 5. This server
establishes the policies by which documents, e-mail messages, or other types of objects can be opened,
displayed, printed, and further distributed outside the repository. Before leaving the Documentum Content
Server, the content together with the usage policy is secured via encryption. Only the encrypted file
(containing the content) is transferred from the repository and is available outside the security perimeter.
IRM Services support Microsoft Office (Word, PowerPoint, Excel, and Outlook), Adobe PostScript,
HTML, RIM BlackBerry, and Lotus Notes applications, and can be customized to support other file
formats.

Figure 5. IRM Services add an IRM Policy Server to the enterprise information environment
to secure the content that is no longer managed by Documentum Content Server.
IRM Services then control the process by which the content is decrypted and made accessible to recipients.
An end user needs to obtain a key to decrypt the content by accessing a policy server over the network.
This policy server verifies the user's identity through its own authentication mechanism. Once the user is
authenticated, the policy server provides the end user with a key to decrypt the content. Once the content
is decrypted, the user's use of the content is limited by the predefined usage policy. For example, there might be limits on
the number of times the content can be viewed, whether recipients can print or copy the document into
another file, whether recipients can forward the document to third parties, or other operational constraints.
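The key-release flow can be outlined as below. This sketch models only the control flow (authenticate, check the usage policy, release the key); it performs no real encryption, stores passwords in plain text purely for brevity, and every name in it (`PolicyServer`, `protect`, `request_key`, `max_views`) is hypothetical rather than part of IRM Services.

```python
import secrets


class PolicyServer:
    """Toy policy server: authenticates users and releases content keys."""

    def __init__(self):
        self._users = {}     # user -> password (plain text for illustration)
        self._keys = {}      # content_id -> decryption key
        self._policies = {}  # content_id -> usage policy
        self._views = {}     # (content_id, user) -> views so far

    def register_user(self, user: str, password: str) -> None:
        self._users[user] = password

    def protect(self, content_id: str, max_views: int) -> bytes:
        """Create a key and a usage policy for a piece of content."""
        key = secrets.token_bytes(32)
        self._keys[content_id] = key
        self._policies[content_id] = {"max_views": max_views}
        return key

    def request_key(self, content_id: str, user: str, password: str) -> bytes:
        """Release the key only to authenticated users within policy."""
        if user not in self._users or self._users[user] != password:
            raise PermissionError("authentication failed")
        seen = self._views.get((content_id, user), 0)
        if seen >= self._policies[content_id]["max_views"]:
            raise PermissionError("view limit reached")
        self._views[(content_id, user)] = seen + 1
        return self._keys[content_id]


server = PolicyServer()
server.register_user("alice", "s3cret")
issued_key = server.protect("contract-42", max_views=2)
```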

The application services group: Managing content as interrelated modules
Focused on building solutions, Documentum includes a comprehensive suite of business-oriented
applications that provide services for managing content. These services function as interrelated modules,
meaning that one service calls another to obtain needed information or functionality.
Documentum provides multiple application services in support of the composition of business solutions.
These are grouped around three functional areas: Compliance Services, Core Content Services, and Process
Services. Furthermore, they are formally registered and cataloged in the Enterprise Content Services
Catalogs, where they are then made available as componentized web services to external applications.

Compliance Services
Compliance Services provide capabilities for retaining content and managing content as records. These
services are Retention Policy Services, Federated Records Services, and Records Manager.

Retention Policy Services


Retention Policy Services (RPS) specify and enforce the retention of objects in the Documentum repository
by attaching one or more retention policies to those objects. The retained objects, or records, are
immutable, that is, they cannot be changed or deleted for the duration of the retention policy. An additional
hold capability retains documents according to ad hoc events such as an audit or litigation.
By applying policies to containers (such as folders) or processes (such as workflows or lifecycles),
document retention is enforced programmatically, with little to no human involvement. The policies and
automation tools can also be used for content disposition (permanent archiving or destruction), ensuring
that files are appropriately disposed of and helping to limit content accumulation.
RPS enhances the standard Documentum controls along three important dimensions:

Notifications: Notifies owners or authorities based on trigger events, such as entry into, or completion
of, a retention phase

Auditing: Audits and records the before and after of metadata changes during a recordkeeping
action

Reporting: Provides report query engines with standard recordkeeping criteria and predefined
recordkeeping reports

Using RPS, organizations can meet compliance regulations, legal requirements, and best practices. RPS can
be added independently to any supported Documentum environment. RPS is the retention engine powering
the EMC Documentum Records Manager application.
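The retention and hold behavior described in this section can be sketched as a simple guard on deletion. The `Record` class, its fields, and the rule that an object stays retained while any attached policy or hold is active are assumptions made for illustration, not the RPS data model.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Record:
    name: str
    retain_until: list = field(default_factory=list)  # one date per policy
    holds: set = field(default_factory=set)           # e.g. audit, litigation

    def is_retained(self, today: date) -> bool:
        """Retained while any policy is unexpired or any hold is active."""
        return bool(self.holds) or any(today < d for d in self.retain_until)

    def delete(self, today: date) -> str:
        if self.is_retained(today):
            raise PermissionError(f"{self.name} is under retention")
        return f"{self.name} deleted"


# Hypothetical record with two policies and one ad hoc hold.
invoice = Record("invoice-2009-001",
                 retain_until=[date(2012, 1, 1), date(2016, 1, 1)])
invoice.holds.add("litigation-17")
```

In this model a hold keeps the object retained even after every policy date has passed, which mirrors the ad hoc hold capability described above.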

Virtual Content Management (VCM)


With the introduction of Virtual Content Management (VCM), the EMC Documentum platform, which
traditionally provides robust infrastructure for storing and managing any kind of enterprise content in a
single repository, has extended that enterprise approach to managing content in any and all repositories
within the enterprise. From a central EMC Documentum repository designated as the master repository,
one can manage content in any repository through the use of proxy objects created within the master but
referring to objects located in source repositories, throughout the enterprise. Objects in the source
repositories are securely locked down to prevent change or deletion and are managed in place. Only
relevant metadata is transferred to the master for management purposes.

Federated Records Services


Federated Records Services (FRS) is a product built on top of the VCM framework, which provides the
foundation to connect disparate records repositories spread across an enterprise. With FRS, Documentum
provides a centralized location for records management and raises the level of assurance that records are
being managed in a consistent and systematic manner.
Using the VCM framework, FRS allows for the connection to remote (or source) repositories
(Documentum, external third-party repositories, file shares) to manage their content from the master
Documentum repository without necessarily importing content into the master Documentum environment.
FRS provides the ability to configure which sources are to be imported, processed, and managed by the
master environment. Through VCM, FRS provides an adapter to any Documentum repository.
Documentum partners provide support for adapters to third-party repositories and file shares.
In addition, using the VCM framework, FRS provides the ability to create a proxy object within a
Documentum repository that links to the main object stored in the external repository. FRS supports the
application of retention policies, mark-ups, and disposition to these proxy objects, which interact with
Documentum repositories and other third-party repositories. The capabilities of the adapters and the
external repository determine the level of support/enforcement.

Records Manager
EMC Documentum Records Manager extends core Documentum content management capabilities by
adding features and functionality, such as corporate file plans, classification, and file-level and field-level
security.
The Records Manager architecture provides recordkeeping functionality as services that can be used for
electronic and physical records alike, as shown in Figure 6. Like functionality is aggregated into discrete
modules. By selecting the appropriate Records Manager modules, customers can deploy a records solution
that meets their unique requirements. Customers can also add additional modules if and when their
requirements change.

Figure 6. The Documentum records management capabilities support electronic
documents, e-mail, and paper-based documents as managed records. These capabilities
leverage the complementary offerings of the overall Documentum platform.
Records Manager leverages Retention Policy Services and the capabilities of Documentum to provide
records management capabilities in a modular fashion. The modules and their capabilities are described in
Table 1.

Table 1. Records Manager modules and features

Records Manager module: Capabilities

Containment policies: Controls the number of tiers within the folder or file plan hierarchy and the actions
that are permitted within each tier, such as a check-in or records declaration.
Containment policies also allow or block recordkeeping governance by document
type and limit the number of a record's classifications, which is architecturally
equivalent to the number of links associated with an object.

File plan: Provides a permanent, systemwide classification schema for records, defining
record naming, organization, and descriptive metadata, specified and managed by a
records administrator. A document is overtly declared as a record by storing it in a
location managed by the file plan, and classified using metadata specified by the
file plan. Retention is defined by the classification.

Naming policies: Configures the naming conventions for records and the file plan by controlling what
attributes are used, what date format is enforced, whether human entries get
validated, how names are dynamically generated, and more.

Security policies: Extends existing Documentum security by adding document-level permissions that
are discrete rather than cumulative. For example, browse permission can be granted
to a certain user, group, or role for a specific document type, such as invoices.

Retention policies: Determines the length of time a document, folder, or cabinet is retained, based on
operational, legal, regulatory, fiscal, or internal requirements. For the duration of its
applied retention policy, the managed object cannot be deleted, nor can it be revised
in any way, although a new version of the object may be checked in.

Supplemental markings / Shared markings: Extends access controls by adding permissions based on
participation in a designated group, and restricting permissions to users who are members of all
designated groups.


Core Content Services


The Core Content Services provide the fundamental capabilities for accessing and storing repository
content. These include library services, workflow services, lifecycle services, XML services, Federated
Search Services, Content Transformation Services, Content Intelligence Services, and Content Delivery
Services.

Library services
Library services manage content in three key ways:

Check-in/check-out (or locking): Capabilities ensure users with editing privileges do not overwrite
one another's versions or make incompatible updates. For example, when one person is editing a
document, another person cannot overwrite their edits.

Versioning: Capabilities track the multiple versions of documents or other content objects, and
provide the ability to revert to prior versions as required. For example, the repository can maintain
multiple versions of a set of web pages, and revert to a version from a prior date when needed.

Basic renditioning: Capabilities maintain alternative representations of documents or other content
objects in their different formats, resolutions, or natural languages. Documentum can automatically
generate renditions through Transformation Services and maintain the relationship between the
original object and its renditions, ensuring the object's integrity and enabling users to manage
renditions individually or collectively. For example, content initially authored as a Microsoft Word
document can be rendered as a fixed formatted Adobe Acrobat PDF file, or an HTML formatted web
page with associated embedded image files.

Documentum includes a comprehensive suite of library services, in addition to the three mentioned above.
Documentum's suite of library services, in turn, relies on an extensive set of security services to determine
how users or applications are authenticated and authorized to access repository content.
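Check-out locking and versioning can be illustrated with a small in-memory class. The `Document` class and its method names are assumptions for the sketch; the decision to model a revert as a new version that copies an older one is likewise an illustrative choice, not a statement about Content Server behavior.

```python
class Document:
    """Toy document with an exclusive check-out lock and a version list."""

    def __init__(self, content: str):
        self.versions = [content]
        self.locked_by = None

    def check_out(self, user: str) -> str:
        if self.locked_by not in (None, user):
            raise PermissionError(f"locked by {self.locked_by}")
        self.locked_by = user
        return self.versions[-1]

    def check_in(self, user: str, content: str) -> int:
        """Save a new version and release the lock; returns version number."""
        if self.locked_by != user:
            raise PermissionError("check out the document first")
        self.versions.append(content)
        self.locked_by = None
        return len(self.versions)

    def revert(self, version_number: int) -> int:
        """Make an earlier version current by appending a copy of it."""
        self.versions.append(self.versions[version_number - 1])
        return len(self.versions)


page = Document("v1 text")
page.check_out("alice")
```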

Workflow services
Documentum workflow automates business activities and policies for repository content. A workflow is
defined by a model, the sequence of steps that comprise the process, and the actions that must occur at each
step. A workflow can describe a simple or complex process; it can be serial, with activities occurring one
after another, or parallel, with all activities occurring simultaneously; and it can combine serial and parallel
activities. Because an object's workflow state is defined by a set of content attributes attached to the object,
it travels with the object.
For example, a press release workflow might require an approval process involving five people and seven
serial steps.
Documentum persistently manages the state of multiple instances of each workflow, often hundreds or
thousands of instances, by storing workflow objects in the Documentum repository. Similarly, workflow
templates (definitions) are stored as repository objects so various services, such as security, versioning, and
retention, can be applied.
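A workflow that mixes serial and parallel activities can be modeled as an ordered list of stages, where each stage is a set of activities that may proceed at the same time. The `Workflow` class and the activity names below are hypothetical and only illustrate the serial/parallel distinction; activity names are assumed to be unique within a template.

```python
class Workflow:
    """Stages run serially; activities inside one stage run in parallel."""

    def __init__(self, stages):
        self.stages = [set(stage) for stage in stages]
        self.current = 0
        self.done = set()

    def pending(self) -> set:
        """Activities of the current stage that are not yet complete."""
        if self.current >= len(self.stages):
            return set()
        return self.stages[self.current] - self.done

    def finished(self) -> bool:
        return self.current >= len(self.stages)

    def complete(self, activity: str) -> None:
        if activity not in self.pending():
            raise ValueError(f"{activity!r} is not currently pending")
        self.done.add(activity)
        # Advance once every activity in the current stage is complete.
        while not self.finished() and not self.pending():
            self.current += 1


# Hypothetical press release template: draft, two parallel reviews, approval.
template = [["draft"], ["legal review", "marketing review"], ["approve"]]
wf = Workflow(template)
wf.complete("draft")
```

Each `Workflow` instance carries its own state, so many instances can be created from one template, which loosely mirrors the template/instance split described above.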

Lifecycle services
Documentum defines, maps, and implements flexible content lifecycle rules according to the business
policies established by the enterprise.
Like workflow, an object's lifecycle state is defined by a set of content attributes attached to the object, so
it also travels with the object. But instead of being defined by a flexible workflow model, lifecycle services
are defined by a set of business policies or business rules. While a workflow routes a document among
various users and automatic tasks, lifecycles define the business rules for changes that apply to content as it
moves through predefined stages (such as draft, in review, active, and obsolete). As you might
expect, unlike workflow, each content object has only a single lifecycle.
Lifecycle services automate the lifecycle policies of repository content. These services assign a lifecycle
stage to the content object, and then manage the object's transition from one stage to another. An
organization can extend the lifecycle stages to encompass its own operating policies (see Figure 7).

Figure 7. Lifecycle services assign a lifecycle stage to a content object and then manage
the object's transition from one stage to another.
Lifecycle services are a powerful content management capability. Policies instigating changes in access
control, logical and physical location, retention rules, labeling, naming, versioning, renditioning, and
workflow and business processes can be mapped to the lifecycle stages. Different object types can have
different lifecycle definitions.
For example, consider the lifecycles for press releases and patent applications:

When a company develops a press release, any member of the corporate communications department
may edit it prior to approval. Only marketing managers and product managers responsible for products
mentioned in the press release may read the drafts. Once the press release is approved, all senior
managers in the company can read it, but only the director of corporate communications can change it.
When the final version is published on the company's website, all prior (or draft) versions are
automatically deleted from the repository after 30 days. These access policies are distinct from a
workflow that routes the press release to the company managers who have to approve it before it can
be promoted to the final stage.

When a company creates a patent application, only designated researchers and staff attorneys can edit
the content, while research directors and the corporate counsel can read it. Once the application is
finalized and submitted to an external patent authority, other company researchers and managers can
then read the application. All drafts of the application are automatically archived for seven years. The
submitted version is automatically classified as a record and submitted to the company archives for
perpetual storage in a secure storage environment.
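The way policies attach to lifecycle stages can be sketched as a small state machine. The stage names come from the examples in this section, but the `POLICY` table, the role names, and the `ManagedObject` class are assumptions invented for the sketch.

```python
STAGES = ("draft", "in review", "active", "obsolete")

# Hypothetical policy: which roles may edit or read at each stage.
POLICY = {
    "draft": {"edit": {"communications"},
              "read": {"communications", "product manager"}},
    "in review": {"edit": {"communications"},
                  "read": {"communications", "product manager"}},
    "active": {"edit": {"communications director"},
               "read": {"communications", "communications director",
                        "product manager", "senior manager"}},
    "obsolete": {"edit": set(),
                 "read": {"communications director"}},
}


class ManagedObject:
    """Content object whose access rules follow its lifecycle stage."""

    def __init__(self, name: str):
        self.name = name
        self.stage = STAGES[0]

    def promote(self) -> str:
        position = STAGES.index(self.stage)
        if position + 1 == len(STAGES):
            raise ValueError("already in the final stage")
        self.stage = STAGES[position + 1]
        return self.stage

    def allows(self, role: str, action: str) -> bool:
        return role in POLICY[self.stage][action]


release = ManagedObject("press-release")
```

Promoting the object changes who may edit or read it without any change to the object itself, which is the sense in which policies are mapped to lifecycle stages.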

XML services
Documentum provides XML services for managing XML documents in their native format within the
Documentum repository. The XML services are provided through two complementary features: XML
Applications and XML Store. Documentum also provides XML Transformation services, as discussed next
in Content Transformation Services.
XML Applications
The XML Applications feature in the Content Server directly stores XML-tagged content and manages the
content within the Documentum repository. Content Server preserves the hierarchical structure and links
among XML components and documents. It provides the ability to automatically parse, validate, transform,
map, and store incoming XML documents. It also supports several features that are essential for managing
XML documents in their native format: XML content validation, automatic attribute population (extraction
of metadata values out of XML element and attribute values), XML link management, and XML
componentization (automatic bursting/chunking of larger documents into reusable components).
XML content validation
XML content validation ensures that the XML elements within an XML document are well formed and
conform to a predefined definition. Documentum can validate XML documents at any time, including
during ingestion into the repository.
An XML document can be validated against a Document Type Definition (DTD) or an XML schema. The
validation process ensures that the components, attributes, structure, types, and values correspond to the
specified format. In addition, Documentum can also manage the DTDs and schemas as Documentum
repository objects that can be versioned, secured, or retained as records.
Automatic attribute population
Documentum can automatically extract values from the elements and attributes of an XML document, and
use those values to populate object metadata. This allows the XML documents to be searched via DQL and
to trigger workflow, lifecycle, and policy events. Documentum also supports two-way attribute population
where updates in metadata can be reflected in the XML content stored in the repository.
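One-way attribute population can be approximated with Python's standard `xml.etree.ElementTree` module. In this sketch the mapping format, the `populate` function, and the sample document are assumptions; note also that `ElementTree` checks only that the XML is well formed and does not validate against a DTD or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: metadata name -> (element path, attribute or None).
MAPPING = {
    "title": ("title", None),
    "product_code": ("product", "code"),
    "author": ("author", None),
}


def populate(xml_text: str, mapping: dict) -> dict:
    """Extract element text or attribute values into a metadata dict."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well formed
    metadata = {}
    for name, (path, attribute) in mapping.items():
        node = root.find(path)
        if node is None:
            continue  # element absent: leave the metadata value unset
        value = node.get(attribute) if attribute else (node.text or "").strip()
        if value is not None:
            metadata[name] = value
    return metadata


sample = """<datasheet>
  <title>Widget 3000</title>
  <product code="W-3000"/>
</datasheet>"""
metadata = populate(sample, MAPPING)
```

Once the values are held as ordinary metadata, they can be queried and can drive events in the same way as any other attribute.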
Link management
XML links can be modeled to allow Documentum to maintain relationships between XML components and
other content. The platform automatically imports associated content or images that are linked to new XML
content, and updates links when an XML document is checked back into the repository after an editing
session. This enables people and applications to determine where a particular XML component is
referenced, or find all the content objects that it references. For example, before modifying a product
description, a product manager might want to see all of the other documents to which this description is
linked.
XML componentization
XML documents can easily be broken into granular components (commonly called chunks) for
manageability of content at various levels of granularity. The componentization can be configured to define
what types of objects to create, where to store them, what metadata to populate, and the level within a
document at which the componentization should occur (for example, all chapter elements become objects).
The hierarchical relationship between components is modeled within the repository as a virtual document.
For example, a data sheet could be broken into separate components, such as a short product description,
one or more feature sets, images with associated captions, and a summary. Each component can then be
managed separately as a discrete content object, with its own security levels, versioning, lifecycle, and
content attributes, as shown in Figure 8.

Figure 8. XML components are managed as discrete objects, just like any other content
object in the Documentum repository
Chunking is often used to facilitate content reuse. A predefined set of content components can be combined
and rendered in different contexts to meet various business situations. For instance, a set of news-related
headlines can be displayed as a news summary, while each headline can be paired with the relevant news-related paragraphs to produce a press release.
Content fragments within documents can also be accessed by XQuery for reuse and other purposes, without
chunking them into separate objects. However, chunking remains a valuable feature for granular
management of XML components; for example, to apply different security levels, or to route the
components through different approval processes.
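A minimal Python sketch of componentization at a configured element level follows. It assumes that chunk elements are not nested inside one another, and it uses an invented component-ref placeholder element to stand in for the parent-child relationship that the repository models as a virtual document. None of the names come from the Documentum API.

```python
import xml.etree.ElementTree as ET

def chunk_document(xml_text, chunk_tag):
    """Split every <chunk_tag> element into its own component.

    Returns (skeleton_xml, chunks). The skeleton keeps the document
    structure, with each chunk replaced by a <component-ref> placeholder.
    Assumes chunk elements are not nested inside other chunk elements.
    """
    root = ET.fromstring(xml_text)
    chunks = []
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != chunk_tag:
                continue
            chunk_id = f"chunk-{len(chunks) + 1}"
            tail = child.tail
            child.tail = None  # keep trailing text out of the chunk itself
            chunks.append({
                "id": chunk_id,
                "content": ET.tostring(child, encoding="unicode"),
            })
            placeholder = ET.Element("component-ref", {"ref": chunk_id})
            placeholder.tail = tail
            parent[index] = placeholder
    return ET.tostring(root, encoding="unicode"), chunks

DOC = ("<book><title>Guide</title>"
       "<chapter><p>One</p></chapter>"
       "<chapter><p>Two</p></chapter></book>")

SKELETON, CHUNKS = chunk_document(DOC, "chapter")
```

Each entry in CHUNKS could then be stored, secured, and versioned on its own, while SKELETON records where the components belong in the original document.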

XML Store and XQuery


Documentum XML Store integrates xDB, EMC's native XML database, into Documentum Content Server.
XML Store adds standards-based XQuery to the proven XML capabilities of Documentum Content Server,
allowing users to efficiently and accurately query content at any level of detail (for example, individual
elements, attributes, content objects, or Documentum metadata attributes), even on very large information
sets. XML Store complements the other content stores in the Documentum repository, including the
RDBMS for metadata, the file system, and full-text indexes. As an integral part of the Documentum
repository, content in the XML store is subject to the same security, policies, and management as all other
content objects.
XML Store complements and enhances XML applications. Compared to XML applications on their own,
XML Store reduces the effort required to store and manage XML, while improving the searchability and
ease of retrieval of XML content, especially when an application needs to retrieve fragments of XML
documents instead of the entire documents themselves. Without XML Store, users commonly configure
their XML applications to extract content values from XML documents in order to populate the Content
Server metadata store. This step is needed to enable more structured searches (for example, searches
based on tag and attribute values instead of full-text). With XML Store, the document tags and attributes
can be searched directly using XQuery, so less work is required to configure XML applications and less
processing is required at document load time to parse and extract metadata values.
In addition, without XML Store the entire document found by a search needs to be retrieved into the calling
application. Operations such as extracting fragments, joining data from multiple documents, transforming
from one XML tag set to another, and sorting and grouping would then have to be performed in subsequent
steps. With XML Store, XQuery can perform all these operations on the fly, even composing entirely new
documents from the results of a search.
XQuery is also a W3C standard, which means that XML processing applications do not need to rely on a
proprietary query language. This openness and portability allows greater leverage of your investments and
reduces training and maintenance costs.
XQuery
XQuery is a standard query language for querying collections of XML data. It is comparable to SQL for
relational databases, but designed to handle the hierarchical and often irregular structures found in XML.
XQuery provides the query capabilities to:

Select elements/attributes from XML documents

Extract data from elements/attributes

Evaluate expressions on the data

Join the data across multiple documents

Order the results in any desired sequence

Compose new XML elements/attributes from the query results
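To make the six capabilities concrete without assuming any particular XQuery processor, the sketch below walks through the same steps (select, extract, evaluate, join, order, compose) in Python with the standard ElementTree module. It is an analogue, not XQuery itself: in XQuery these steps would typically be written as a single FLWOR expression. The sample data and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

PARTS = ET.fromstring("""<parts>
  <part id="p1"><name>Valve</name><price>40</price></part>
  <part id="p2"><name>Pump</name><price>120</price></part>
  <part id="p3"><name>Seal</name><price>5</price></part>
</parts>""")

ORDERS = ET.fromstring("""<orders>
  <order part="p2" qty="3"/>
  <order part="p3" qty="10"/>
</orders>""")

def order_report(parts, orders, min_total=0):
    """Join orders to parts and compose a new <report> element."""
    # Select: pick the <part> elements and key them by their id attribute.
    by_id = {part.get("id"): part for part in parts.findall("part")}
    rows = []
    for order in orders.findall("order"):
        # Join: match each order to its part across the two documents.
        part = by_id.get(order.get("part"))
        if part is None:
            continue
        # Extract and evaluate: read the values and compute a total.
        total = int(part.findtext("price")) * int(order.get("qty"))
        if total >= min_total:
            rows.append((part.findtext("name"), total))
    # Order: largest total first.
    rows.sort(key=lambda row: row[1], reverse=True)
    # Compose: build new XML from the query results.
    report = ET.Element("report")
    for name, total in rows:
        ET.SubElement(report, "line", {"name": name, "total": str(total)})
    return report

REPORT = order_report(PARTS, ORDERS)
```

The resulting REPORT element contains one line per matched order, sorted by total, and is itself a new XML document that did not exist in either source.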

A number of powerful content operations are made possible with XQuery, including:

Combined content, metadata, and full-text searches: XQuery can be combined with DQL and full-text searches. For example, it's simple to find documents containing the phrase "safety warning" in
any text element, with a part number element that matches "12345" and a metadata value of last
updated within the past year.

Advanced content analysis: XQuery enables better analysis of content, as well as the relationships
between content. For example, Where-Used reports today simply list the files that link to an object,
such as an image or an anchor point in a document. With XQuery, reports can be built that show the
actual elements (for example, paragraphs) within those files that contain the links.

Flexible linking and reuse: Links and inclusions (for example, DITA conref, XInclude) can be
resolved to specific elements within a target without needing to prechunk the content. Applications no
longer have to retrieve and parse the entire target document to resolve the link. XQuery can directly
extract the linked content.

Dynamic content composition: Content at any level of detail (for example, individual elements,
attributes, content objects, or Documentum metadata attributes) can be retrieved in a query and used to
compose entirely new sets of information for applications like personalized content delivery and
content analysis.

Applications use a DFC interface to invoke XQuery access to the content in XML Store.

Federated Search Services


Documentum includes technologies and services to integrate, access, and query content beyond the
information stored within a Documentum repository. Federated Search Services (formerly Enterprise
Content Integration (ECI)) use technology that leverages a framework of adapters for various internal and
external repositories. Federated search is useful when interacting with information stored in third-party
(non-Documentum) repositories and external websites.
Documentum relies on federated search for cross-repository searches as well as to query and retrieve
content from external information sources, including:

FileNet, Open Text, Microsoft SharePoint, IBM Lotus Notes, and content stores from other vendors

SAP, Oracle, and other enterprise application vendors

Lexis/Nexis and Factiva infobases and other dynamic web-based content environments

Static intranets accessed by third-party search environments such as the Autonomy search engine and
the Google enterprise search appliance

Desktop search engines provided by Google and any online search engine such as Google, Yahoo, and
Voila

Federated Search Services use an adapter framework and a query-brokering environment to enable these
federated search capabilities (see Figure 9). Each information source gets a unique adapter that maps the
content-related metadata defined within the external information source into a schema supported by the
Documentum platform.

Figure 9. Federated Search Services are based on an adapter framework to enable
federated search capabilities
Federated Search Services function through a two-step process. First, the query broker maps a query into a
format supported by an external information source and then submits the query to the source. Then the
query processor receives the requested information from the external source, extracts the metadata, filters
the response, and returns the results.
Users can simultaneously submit a single query to multiple information sources via any client (including a
browser-based mobile client), receive the results from the multiple query processors interacting with the
external sources, and merge the results into a single set based on predefined criteria (such as relevance or
date published).
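The adapter and broker roles described above can be sketched in a few lines of Python. Everything here is hypothetical: the InMemoryAdapter class stands in for an adapter that would really translate the query into a source's native format and call that source, and the field maps play the part of the per-source metadata mapping into a common schema.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from datetime import date

@dataclass
class Result:
    """Common result schema shared by all sources."""
    title: str
    source: str
    relevance: float
    published: date

class InMemoryAdapter:
    """Hypothetical adapter over a list of records in a source-specific shape."""

    def __init__(self, name, records, field_map):
        self.name = name
        self.records = records
        self.field_map = field_map  # common field -> source field name

    def search(self, term):
        fields = self.field_map
        hits = []
        for record in self.records:
            title = record[fields["title"]]
            if term.lower() in title.lower():
                # Map the source's metadata into the common schema.
                hits.append(Result(title, self.name,
                                   record[fields["relevance"]],
                                   record[fields["published"]]))
        return hits

def federated_search(term, adapters, sort_key="relevance"):
    """Send one query to every adapter at once and merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(adapters))) as pool:
        result_lists = list(pool.map(lambda a: a.search(term), adapters))
    merged = [hit for hits in result_lists for hit in hits]
    return sorted(merged, key=lambda hit: getattr(hit, sort_key), reverse=True)

NEWS = InMemoryAdapter(
    "news",
    [{"headline": "Safety warning issued", "score": 0.9, "date": date(2009, 5, 1)},
     {"headline": "Quarterly results", "score": 0.4, "date": date(2009, 6, 1)}],
    {"title": "headline", "relevance": "score", "published": "date"},
)
WIKI = InMemoryAdapter(
    "wiki",
    [{"name": "Pump safety checklist", "rank": 0.7, "updated": date(2009, 9, 15)}],
    {"title": "name", "relevance": "rank", "published": "updated"},
)
```

Calling federated_search("safety", [NEWS, WIKI]) returns a single list ordered by relevance; passing sort_key="published" merges the same hits by date instead.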

Content Transformation Services


Documentum provides a framework and a suite of Content Transformation Services (CTS) for changing
various kinds of content, such as documents, photos, video, and medical images, into different formats and
resolutions. The CTS framework provides common administration, configuration and customizations of the
various transformations. Content Transformation Services (see Figure 10) are built as self-contained
modules for accomplishing specific tasks. Some of the modules include:

Document Transformation Services (DTS): Supports document transformations, such as rendering
MS Office documents as PDF and HTML files. DTS runs as a separate server-side process without
requiring user authentication. The transformation can be triggered by users from the user interface or
automatically by a business process or lifecycle stage change.

Advanced Document Transformation Services (ADTS): Extends DTS by adding support for
additional document formats: Microsoft Project, Microsoft Visio, AutoCAD, and multi-page TIFF.
ADTS creates bookmarks and preserves links within documents and supports many advanced options
for controlling PDF output formats. ADTS includes an active storyboard capability for browsing
directly through PDF documents stored in the Documentum repository.

XML Transformation Services (XTS): Features extensive XML format transformations, an
eXtensible Stylesheet Language Transformations (XSLT) engine with full XSL-FO support, a style
sheet tool kit, and XML schema transformation support.
XTS transforms XML to popular web formats (such as HTML), mobile formats (such as WML,
cHTML, and XHTML Basic), Portable Document Format (PDF), help file formats (such as JavaHelp,
Microsoft WinHelp, and Microsoft Compiled HTML Help), Rich Text Format (RTF), and PostScript.
The tool kit provides support for Darwin Information Typing Architecture (DITA) and DocBook
standards. XTS can convert XML from one schema to another, invoked by workflows, lifecycles, user-based actions, or other applications. The rendering can occur in a channel-specific way. Thus the same
XML can be rendered for web, WAP, or print and graphics, and displayed appropriately for the device
capabilities of each channel.

Media Transformation Services (MTS): Provides rich media transformations and analysis for static
digital assets, including photos, scanned images, and Microsoft PowerPoint slide decks. MTS can read
and write metadata associated with digital assets, such as Adobe XMP tagging technology. MTS
includes capabilities for automatically managing PowerPoint slides as discrete objects, as well as
extracting thumbnails and low-resolution images from high-resolution assets. As a result, digital assets
can be centrally managed and reused in different contexts. MTS configuration capabilities can
integrate the Documentum platform's support of rich media repositories with the underlying content
storage infrastructure.

Audio/Video Transformation Services: Extends the capabilities of MTS to support multiple audio,
video, and animation formats. These services also integrate streaming media storage and delivery into
the content storage infrastructure.

Figure 10. Application developers can use the Content Transformation Services modular,
plug-in architecture to develop and deploy new transformation services

Content Intelligence Services


Content Intelligence Services (CIS) analyze the text within documents and other content objects,
automatically classifying the content assets; put another way, CIS determines what the text is about. The
results of the classification can be used to automatically populate the content metadata or to map the
content assets into a taxonomy.
CIS uses linguistic algorithms to analyze content, utilizing content-related terms, keywords, and attributes
related to the information domain of an enterprise. CIS aggregates content from disparate sources, runs it
through a parser, and uses three engines to analyze the resulting text, as shown in Figure 11.

Figure 11. Content Intelligence Services analyze the text within documents and other
content objects, and automatically classify the content assets
The three analysis engines include:

Information Extraction Engine: Extracts tags, content properties, and text from the parsed content
and generates metadata. It is itself further refined by the other two engines

Conceptual Classification Engine: Relates the parsed content to predetermined categories or
conceptual taxonomies

Semantic Analysis Engine: Analyzes the content based on enterprise-specific taxonomies or other
semantic considerations

CIS produces a list of concepts contained within the set of documents or other content objects. These
concepts can improve search accuracy as well as provide the ability to automatically categorize the
repository.
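The step of mapping content into a taxonomy can be pictured with a deliberately simple stand-in. The Python sketch below scores text against keyword lists per category; CIS itself is described as using linguistic algorithms, so this keyword count is only an assumption-laden illustration, and the taxonomy, threshold, and function name are invented.

```python
import re
from collections import Counter

# Hypothetical enterprise taxonomy: category -> indicative terms.
TAXONOMY = {
    "safety": {"hazard", "warning", "protective"},
    "finance": {"invoice", "payment", "loan"},
}

def classify(text, taxonomy=TAXONOMY, threshold=2):
    """Return the categories whose terms occur at least `threshold` times."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {
        category: sum(words[term] for term in terms)
        for category, terms in taxonomy.items()
    }
    matches = [category for category, score in scores.items() if score >= threshold]
    # Strongest match first.
    return sorted(matches, key=lambda category: -scores[category])

SAMPLE_TEXT = "Warning: hazard near the pump. Wear protective gear. Invoice attached."
```

The categories returned by classify(SAMPLE_TEXT) could then be written to the object's metadata or used to file the object under the matching taxonomy node.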

Content Delivery Platform


Documentum provides a publishing platform that is designed for global distribution of content of any size
with maximum speed, security, and accuracy. The platform consists of two delivery tiers that are
represented by two products: Interactive Delivery Services (IDS) and Interactive Delivery Services
Accelerated (IDSx). IDS and IDSx automate the deployment process by delivering content and metadata
from a centralized and managed content source (Documentum Content Server) to multiple cached network
locations such as in a runtime web infrastructure (including web server farms, enterprise portals, and
application servers).
Content delivery features that are common to both IDS and IDSx include the following.
Target distribution environments
IDS and IDSx can supply static and dynamic content to a wide variety of network-accessible applications,
personalization, portal, and e-commerce servers from enterprise vendors such as BEA, IBM, Microsoft,
Oracle, Sun, and SAP. Managers of these external environments can rely on the Documentum platform's
versioning, workflow, lifecycle, and other content management capabilities to maintain the content within
their applications. Distribution can be based on sets of business rules or queries that define the frequency of
updates and the content to be distributed. The platform can support discrete sets of distribution rules for
each environment. An administrative interface is used to manage all target locations and configurations.
Two-way communication
Next-generation websites are quickly adopting many social computing aspects to address the needs of their
customers and prospects, thereby strengthening customer loyalty and improving services. Sites are being
enriched with Web 2.0 technologies that engage the consumer in a two-way dialogue. This is accomplished
through tools that engage the customer in online forums, feedback mechanisms, and consumer ratings.
IDS and IDSx support two-way communication between the source and target components. This allows
a corporation to ingest user-generated content (captured by blogs, wikis, ratings, and online feedback forms)
directly into the repository. This bidirectional communication (known as write-back) allows users from any
location to upload content (or data) into a feedback form where it can be collected and routed back to the
content repository for approval and re-publishing.
XML database integration
XML technology is not limited to data-centric or document-centric applications, as it blurs the line between
structured content and unstructured content. However, XML operations such as parsing, querying, and
transforming are costly in terms of memory and processing time when performed against individual XML
documents in a file system. A native XML database solves this issue by making these operations highly
performant and by also satisfying the scalability requirements of highly dynamic websites.
IDS and IDSx provide native XML database integration with xDB, Documentum's XML database.
Corporations looking to support the delivery of personalized and dynamic content by providing partners
and consumers direct access to role-relevant content and information are leveraging the capabilities of a
fully integrated XML database. IDS and IDSx can publish content and metadata both to a target location
and to an XML repository. The XML database simply acts as another target location that can be queried by
other applications or systems.
Interactive Delivery Services Accelerated (IDSx) (see Figure 12) extends the capabilities of Interactive
Delivery Services (IDS). IDSx provides enterprise scalability and global distribution of content with the
following additive features:

Interactive Delivery Services Accelerated automates the publishing of content to multiple geographic
locations by providing target to target replication. This allows IDSx to support globally distributed
publishing architectures that leverage server farms or geographically dispersed datacenters.

As web content and sites become richer (images and Flash videos), robust delivery mechanisms are
required to support these interactive user experiences. IDSx uses high-speed WAN transfer that
ensures that content and metadata are automatically and simultaneously delivered to all global
channels. IDSx accelerated file transfer is approximately 1,000 times faster than FTP/HTTP/HTTPS
delivery. The transfer of large content files, such as video or complex layout documents, from
dispersed locations can also leverage IDSx.

IDSx also leverages a corporation's existing IT infrastructure and delivers predictable, guaranteed
delivery times regardless of network conditions. Its transfer protocol utilizes existing network
infrastructures and monitors all transfers using Adaptive Rate Control in order to remain fair to other
applications (VoIP, web, and e-mail) without causing performance degradation. IDSx can facilitate
new revenue opportunities that deliver large datasets over inexpensive wide area networks.

Figure 12. Interactive Delivery Services Accelerated (IDSx) provides enterprise scalability
and global distribution of content

Process Services
The Process Services capabilities of Documentum include Collaborative Services, capabilities for
managing shared workspaces, as well as business process management, a set of products for managing
business processes across the enterprise.

Collaborative Services
Documentum provides Collaborative Services based on a set of six collaborative objects: spaces, discussion
threads, contextual folders, notes, calendars, and data tables:

Spaces are shared, ad hoc workspaces that have their own membership lists and ownership. Only users
listed as members can access a space and the content stored within. Spaces support internal and
external users. Members can be external to the organization and not otherwise authenticated to access
the Documentum platform.

Discussion threads are a collection of messages organized around a predefined topic. A discussion
thread can be attached to any other object stored within the Documentum repository, such as a
document or a collection of documents stored within a folder.

Contextual folders collect and organize content within a collaborative environment, providing
additional information about the purpose of a folder. This descriptive information can appear as a
banner headline or as a mini-help environment within the context of a folder display.

Notes are web-based text files, stored in the repository, that maintain the context (and links) to related
objects. For example, a note can be a comment on a paragraph within a document, an annotation for a
document as a whole, or a summary for a set of documents stored within a folder.

Calendars provide the capabilities for members to organize, track, and schedule events for their teams.

Data tables are an easy way to collect information via a form, and then organize the resulting fielded
entries in a tabular form. Each row in the data table is an object within the repository, and can be
routed through a workflow for review and approval. Notes and discussion threads can also be attached
to the row.

These collaborative objects are stored just like other content objects within the Documentum repository.
They are managed with various repository services including check-in/out, search, workflow, retention,
security, and lifecycle.
Collaborative Services support subscriptions. Members can subscribe to any object of interest within a
space (such as all the items in a folder or a particular discussion thread) and then automatically receive
notifications when information related to the object changes.
Collaborative Services provide the services-oriented interfaces to call the collaborative objects. In turn,
Collaborative Services can be combined with related platform services. For instance, a discussion thread
accompanying the authoring and editing of a patent application can automatically be managed as a record
and be subject to the identical retention policies as the draft patent documents themselves.

Business process management


Documentum provides a complete suite of BPM products, the Documentum Process Suite, that manages
the complete lifecycle of business processes across the enterprise (see Figure 13). The suite supports
continuous business performance improvement methodologies. It orchestrates processes spanning beyond
Documentum to external systems, data sources, and applications.
The Process Suite is a key part of EMC's xCelerated Composition Platform solutions that support
transactional and case-based processes. These are mission-critical, transactional processes that implement
some of the most important activities companies undertake, such as loan origination, invoice processing,
claims processing, and case management solutions.
Process Suite combines a process engine and a business activity-monitoring (BAM) engine, in addition to
the core content repository, to deliver extensive BPM capabilities. The Process Suite includes a modeling
environment for designing end-to-end business applications. Since the suite is based on the unified
architecture of the Documentum platform, it easily handles any type of content as the process payload,
from e-forms and XML documents to compound documents and rich media.
Process Suite includes TaskSpace within the user experience layer (described below), enabling rapid
configuration of process business logic for the rapid deployment of complete applications. This includes the
capabilities to gather inputs from high-fidelity forms that resemble paper versions of documents. These
forms can include online submissions, barcodes, offline form-filling, copy and paste from Word or PDF,
digital signatures, form template versioning, document preview, and action invocation.

Figure 13. Documentum provides a suite of BPM products to manage content-intensive
business processes across the enterprise. The business activity monitoring (BAM) engine
monitors critical aspects of the business processes and provides up-to-date reports. The
Business Process Engine runs and manages the end-to-end processes, and integrates
with external applications through a SOA framework. All of the content is stored and
managed within the repository
Process Suite supports a graphical, object-oriented business process design environment. The Process
Builder specifies the flow of information from activity to activity, as well as the logic that determines the
sequence of activities. The processes and activities are reusable and fully distributed. The Process Builder
supports global structured data types as part of its underlying data model. Consequently, structured data can
be incorporated as a lightweight data type into the operation of the process models, and exposed by the
reporting tools.
At runtime, the Business Process Engine interacts with repository content, following the steps in a business
process as defined by the Process Suite. The Business Process Engine collects information from a
browser-based form, a Simple Object Access Protocol (SOAP) service, a RESTful web service, or a Java
API, then runs a set of process activities. The Process Engine includes persistent state management, queue
management services, an automated task framework, timer and deadline services, and audit tracking, data
collection, and aggregation services to structure the predefined sequence of actions and activities that
constitute the business process.
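The split between activities (what each step does) and flow logic (which step comes next) can be shown with a small Python sketch. It is not the Process Engine: the activity names, the loan payload, and the run_process function are hypothetical, and the audit trail is just a list of the activities that ran.

```python
def run_process(activities, transitions, start, payload, max_steps=100):
    """Run activities in the order chosen by the transition logic.

    activities:  name -> function(payload) returning the updated payload
    transitions: name -> function(payload) returning the next name or None
    Returns (final_payload, audit_trail).
    """
    audit_trail = []
    current = start
    steps = 0
    while current is not None:
        if steps >= max_steps:
            raise RuntimeError("process exceeded max_steps")
        payload = activities[current](dict(payload))
        audit_trail.append(current)
        current = transitions[current](payload)
        steps += 1
    return payload, audit_trail

# Hypothetical loan origination flow.
ACTIVITIES = {
    "capture": lambda p: {**p, "captured": True},
    "review": lambda p: {**p, "risk": "high" if p["amount"] > 50000 else "low"},
    "approve": lambda p: {**p, "status": "approved"},
    "escalate": lambda p: {**p, "status": "needs manager approval"},
}
TRANSITIONS = {
    "capture": lambda p: "review",
    "review": lambda p: "escalate" if p["risk"] == "high" else "approve",
    "approve": lambda p: None,
    "escalate": lambda p: None,
}
```

Running run_process(ACTIVITIES, TRANSITIONS, "capture", {"amount": 20000}) follows the capture, review, approve path, while a larger amount is routed to the escalate activity by the same definitions.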
The Process Engine includes a process reporting service that enables report designers to rapidly create new
BAM reports, charts, and alerts. Designers can export data sources to Crystal Reports for advanced
features. The Process Suite supports an extensible business process management environment, in which
third-party tools, including rules, reporting, and modeling engines, can be added as required.
The end result is a robust business process management environment that leverages managed content and
structures the flow of content across an enterprise.

The developer resources and tools group: Designing,
developing, and administering information-based
applications
An enabler of information-based applications and solutions, the Documentum platform includes a group of
developer resources and tools that provide the capabilities for designing, developing, and
administering content applications. These resources and tools, covered in the following paragraphs, provide
access to EMC Documentum repository content and to all content services. This group comprises design
capabilities, tools, configuration, and administration capabilities as well as an integrated development
environment that allows developers to easily build applications that take advantage of EMC Documentum
unified content services.

Documentum design capabilities


The Documentum platform provides developers with a great deal of flexibility by supporting a wide variety
of standards and a broad suite of Enterprise Content Services (ECS) and Application Programming
Interfaces (APIs) including those that are Documentum specific as well as interfaces available for
distributed content, data access, protocols, and XML. This section will focus on providing you with an
overview of Documentum Enterprise Content Services and APIs. The Documentum platform offers a suite
of design capabilities that support the current technology patterns used by customers to compose business
solutions. The diversity of these offerings is purposeful, as specific technologies may possess properties,
such as agility versus stability, that lend themselves more to solving particular application challenges.
Documentum Enterprise Content Services
Service-oriented architecture (SOA) looks at IT assets as service components, establishing a software
architectural approach to building business applications. The SOA approach is based on creating standalone, task-specific reusable software components that function and are made available as services.
Consistent with this philosophy, EMC Documentum Enterprise Content Services (ECS) provides a
complete services architecture framework that can be incorporated within an organization's internal,
service-oriented architecture plans.
Documentum ECS provide content-related services that are loosely coupled and can be dynamically
assembled to meet business needs. ECS encapsulate the content management functions of Documentum as
a set of discrete service offerings that are designed to make content applications easier to design, develop,
and support. Enterprise Content Services can be broken down further into the following technologies:

Documentum Web Services
Documentum RESTful Services
Documentum Java Services
Documentum Interoperability Services: Content Management Interoperability Services (CMIS)

ECS comprise a collection of content management services, including core functionality that is provided as
part of EMC Documentum Content Server, together with a growing set of application level, extended
services. These more functionally oriented services include business process management services,
collaboration services, content intelligence services, search services, content transformation services,
enterprise integration services, compliance management services, interactive delivery services, and
interactive delivery services accelerated.
Developers exploit ECS when building a new service that is tailored to unique enterprise requirements. In
many instances, developers simply create the new business logic and orchestrate predefined ECS, wrap the
results into the new service, and add it to the ECS Catalog for others to consume. But in some situations,
developers need a more granular level of customization than provided through ECS. In these cases,
developers can drop down to Documentum APIs (covered in the next section), such as BOF or DFC, for
coding new functions, which can then be orchestrated as another service.
The following subsections provide a brief overview of each ECS technology as well as general usage
guidelines that are intended to help architects decide which service model is most appropriate to meet
specific functionality requirements in solution development. Documentum's flexible design capabilities
provide a significant development advantage, as different business scenarios may benefit from using
specific technologies. Note, however, that each client is different in both goals and
environment, and not all recommendations work in all situations. As you design a solution, consider the
trade-offs between usability, performance, and time to market. Technology selection should be based on
your business requirements and context.
Documentum Web Services
Documentum SOAP-based Web Services include the core functionality that makes up Documentum
Foundation Services (DFS) together with other Documentum application services. These services exploit a
SOA development framework. Web Services are designed from the ground up to expose key content
management functions as standards-compliant web services that ensure that Documentum can operate as an
integral part of an organization's information infrastructure, developed using web services.
Documentum Web Services provide:

A set of core and extended services, implemented as web services, that expose Documentum content
management functionality

A Java SDK to enable development of service consumers using client runtime support, and
development of custom services based on Plain Old Java Objects (POJOs), or Service-based Business
Objects (SBOs) using service runtime support (common for Documentum Web Services and
Documentum Java Services)

A WSDL service interface to enable development of service consumers using development
platforms that support SOAP messaging, including .NET

Web Services also honor Business Object Framework (BOF) components, such as Type-based Business Objects (TBOs)
and Aspects, as well as basic DFC (described in the API section that follows). Thus, the services can call
and invoke predefined objects when integrating with the Documentum repository.
Documentum Core Web Services: Documentum Foundation Services
Out-of-the-box, Documentum Content Server delivers a core set of Web Services, Documentum
Foundation Services (DFS), which represent essential functions of the Documentum platform. Each service
provides a set of independent operations. The object service, for example, provides basic content
management functionality in operations such as create, get, update, and delete. The current DFS
services and their related functions are listed in Table 2.
Table 2. Example of some basic DFS Web Services

Service            Description

Object             Fundamental ECM operations for creating, getting, updating, and
                   deleting repository objects, as well as copy and move operations

Version control    Operations that produce and control versions within the repository,
                   such as check-in and check-out

Query              Operations for obtaining data from repositories using ad hoc queries,
                   such as pass-through, cache query, results, and query builder

Schema             Operations that examine repository metadata

Search             Operations that concern full-text and property-based searches against
                   both the enterprise repository and external information sources

Workflow           Operations that obtain data about workflow process templates stored
                   in repositories, and an operation that starts a workflow process
                   instance

Lifecycle          Manages lifecycles on persistent objects, including attaching and
                   detaching lifecycles and executing operations associated with those
                   lifecycles

ACL                Enables management of Access Control Lists (ACL) on persistent
                   objects, including the creation, update, and deletion of ACLs

Web Services usage guidelines


Web Services are generally considered to be a comprehensive standards-based approach to services that
provide extensibility and security. Typical applications include:
Attributes:

Offers extensibility, structure, and security
Technology based on SOAP, WSDL, and WS-* standards
Focused on accessing named operations
Language, platform, and transport agnostic
Great for the data center
Good at exposing logic
Communications are strict and structured
Examples include Documentum Foundation Services (DFS)

Documentum RESTful Services


The Documentum RESTful Services consist of a collection of new HTTP-based RESTful interfaces to
Documentum. RESTful services are a popular alternative to SOAP-based web services. These services
complement the SOAP-based Web Services such as DFS that are already available, with a focus on simple
consumption using a wider set of programming environments like Java, .NET, Flex, JavaScript, Python,
and Ruby.
Application developers can develop rich Internet applications by linking the content services, provided by
Documentum RESTful Services, with services from external applications and frameworks, and thus
provide content-enabled solutions that leverage enterprise content in new ways.

RESTful Services usage guidelines


RESTful Services are generally considered an agile, resource-oriented approach to services that is
fast and sufficient for "simple" services. Typical applications include mashups and portals.
Attributes:

Provides agility, simplicity, and ease of development
Uses technology based on HTTP-based RESTful interfaces
Focuses on accessing named resources
Is good at exposing data
Is language and platform agnostic
Enables clients to leave impressions in the software
Has less reliance on tools

Documentum Java Services


As a convenience and performance enhancement, Documentum provides Java Services that developers can
use to build new services. These Java Services are coarser-grained APIs that developers can call as
they would standard, non-remote Java methods.
Java Services Usage Guidelines
Java Services are invoked locally and provide rapid development and testing and high performance for
local deployments. Typical applications include building other web services using these Java services.
Attributes:

Allows local interaction as though DFS were simply a Java API
Bases technology on Java
Can be used to develop new services
Provides the performance of a Java call for CPU- and memory-intensive local applications

Documentum Interoperability Services


Documentum provides Interoperability Services through Content Management Interoperability Services
(CMIS). CMIS is a set of proposed standards used in conjunction with both web services and RESTful
services that ensure interoperability among disparate content repositories. CMIS is focused specifically on
providing interoperability as it standardizes the basic operations of an ECM system, and makes them
widely available as web or RESTful services.
The CMIS standard is designed to augment, not replace, existing ECM systems and their current
application interfaces. CMIS focuses on the core and more basic capabilities of an ECM system, that is, the
create, read, write, delete, and query functions. When deployed, CMIS ensures interoperability by defining
how these capabilities function in a uniform manner over a variety of ECM systems. CMIS will provide
customers with maximum investment protection for their existing ECM assets, as well as the ability to
freely adopt third-party applications that leverage the standard.
CMIS provides OID-based Create, Retrieve, Update, Delete (CRUD) services for objects. These actions
define the fundamental operations that compliant repositories need to manage:

The Create services create an object and return an OID
The Retrieve services return the properties or content-stream of an object. These services may be
invoked with a filter specifying the properties to be returned
The Update services update the properties or content stream of an object. A multi-valued property
can only be updated by replacing the entire list of its values
The Delete services delete one or more objects. For versioned documents, all versions can be deleted
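
The four service groups above can be sketched as a small in-memory model. This is an illustrative sketch only, not the CMIS API: the class CrudSketch, its method names, the "oid-N" identifier format, and the use of a series id to group the versions of a document are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of OID-based CRUD services; not the CMIS API.
class CrudSketch {
    // OID -> properties. Every property holds a list so multi-valued ones fit.
    private final Map<String, Map<String, List<String>>> objects = new HashMap<>();
    // Version series id -> OIDs of every version in that series.
    private final Map<String, List<String>> series = new HashMap<>();
    private int nextId = 1;

    // Create: store a new object (one version of a series) and return its OID.
    String create(String seriesId, Map<String, List<String>> properties) {
        String oid = "oid-" + nextId++;
        objects.put(oid, new LinkedHashMap<>(properties));
        series.computeIfAbsent(seriesId, k -> new ArrayList<>()).add(oid);
        return oid;
    }

    // Retrieve: return the properties named in the filter (null means all).
    Map<String, List<String>> retrieve(String oid, Set<String> filter) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : require(oid).entrySet()) {
            if (filter == null || filter.contains(e.getKey())) {
                result.put(e.getKey(), new ArrayList<>(e.getValue()));
            }
        }
        return result;
    }

    // Update: a multi-valued property is replaced as a whole list.
    void update(String oid, String property, List<String> newValues) {
        require(oid).put(property, new ArrayList<>(newValues));
    }

    // Delete: remove all versions of a series; returns how many were removed.
    int deleteAllVersions(String seriesId) {
        List<String> oids = series.remove(seriesId);
        if (oids == null) {
            return 0;
        }
        for (String oid : oids) {
            objects.remove(oid);
        }
        return oids.size();
    }

    boolean exists(String oid) {
        return objects.containsKey(oid);
    }

    private Map<String, List<String>> require(String oid) {
        Map<String, List<String>> props = objects.get(oid);
        if (props == null) {
            throw new IllegalArgumentException("Unknown OID: " + oid);
        }
        return props;
    }

    public static void main(String[] args) {
        CrudSketch repo = new CrudSketch();
        repo.create("appraisal", Map.of("title", List.of("Appraisal"),
                "keywords", List.of("draft")));
        String v2 = repo.create("appraisal", Map.of("title", List.of("Appraisal"),
                "keywords", List.of("draft", "reviewed")));
        repo.update(v2, "keywords", List.of("final"));
        System.out.println(repo.retrieve(v2, Set.of("keywords")));
        System.out.println(repo.deleteAllVersions("appraisal") + " versions deleted");
    }
}
```

Note that update never appends to a multi-valued property. The caller supplies the complete new list, which mirrors the replacement rule described above.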

CMIS Services Usage Guidelines


CMIS is designed to solve a specific business challenge: ensuring interoperability for people and
applications using multiple content repositories. Its use is recommended when the required functionality
is limited to basic services, such as Create, Read, Update, and Delete, and interoperability between
disparate repositories is a key application requirement. Typical applications include those that need to
interface with franchise operations, dealerships, or interdepartmental organizations with disparate
repositories. An example might be a mortgage company whose business is national in scope, but which
must work with a series of local business partners to obtain flood reports, credit reports, title reports,
appraisals, and so forth, that are required to secure financing.
Attributes:

Provides basic services that support interoperability among multiple repository types
Technology based on OASIS standard Content Management Interoperability Services (CMIS)
Can use a single application to interact with content stored and managed by several ECM systems
Can deploy an enterprise workflow that interacts with content managed by multiple ECM systems
Can rely on a standardized, service-oriented interface to develop applications once, and deploy them
across multiple ECM systems

Documentum Application Programming Interfaces (APIs)


In instances where developers need a more granular level of customization than is provided through the
new xCP platform capabilities or ECS, Documentum provides access to Documentum functionality through
Documentum APIs. This section provides an overview of the Documentum APIs, including:

Documentum Foundation Classes (DFC)
Business Objects Framework (BOF)
    SBO
    TBO
    Aspects

Documentum Foundation Classes


Documentum Foundation Classes (DFC) is the published and supported programming interface for
accessing the functionality of the Documentum platform. DFC exposes the Documentum object model as
an object-oriented library for other applications to use. DFC provides Java class libraries that expose the
functions that drive the Documentum environment. DFC is a supported API recommended for developing
extensions for Business Object Frameworks or implementing new services. For application development,
EMC recommends leveraging the rapid development capabilities of Enterprise Content Services. All
new-generation Documentum clients have standardized on this approach.
Business Objects Framework
Documentum includes a Business Objects Framework (BOF), a structured environment for developing
content applications. The BOF shields application developers from the implementation details of the
platform's granular DFC and the underlying object model on which the DFC is based. Thus, BOF enables
application developers to easily develop highly reusable components that can be shared by multiple
applications.
BOF functions by abstracting the Documentum APIs and aggregating sets of these APIs into a business
logic layer. BOF provides a way to develop reusable business logic components, called business objects.
(Business objects are entities with predefined classes and properties (attributes) and can have unstructured
content associated with them.) The BOF can implement business logic as reusable components that can be
plugged into middle-tier network applications or client applications. These business objects combine
presentation and business logic with direct access to all content services.
Types of business objects
Documentum supports these types of business objects.
Type-based Business Object (TBO)
TBOs are tightly linked to an object type stored in the Documentum repository. Application developers can
add additional methods to the built-in or configured object type. Examples include catalog, product,
contract, and customer.
Service-based Business Object (SBO)
SBOs provide methods that perform more generalized procedures not usually bound to a specific object
type or repository. Rather, such objects represent a collection of functions that may operate on other kinds
of business objects. Examples include mailbox alert, catalog export, and syndicate service.
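The difference between the two styles can be sketched in plain Java. The types below are hypothetical stand-ins written for this illustration, not DFC or BOF classes. The Contract class plays the role of a TBO bound to one object type, and CatalogExportService plays the role of an SBO that works across types. The review threshold is an arbitrary example value.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-ins for the two styles; none of these types belong to DFC or BOF.
class BusinessObjectSketch {

    // Stand-in for a generic repository object of a given object type.
    static class RepositoryObject {
        final String typeName;
        final String name;

        RepositoryObject(String typeName, String name) {
            this.typeName = typeName;
            this.name = name;
        }
    }

    // TBO style: bound to one object type ("contract") and adds methods to it.
    static class Contract extends RepositoryObject {
        private final long amount;

        Contract(String name, long amount) {
            super("contract", name);
            this.amount = amount;
        }

        // Type-specific business rule; the threshold is an arbitrary example value.
        boolean requiresLegalReview() {
            return amount >= 100_000;
        }
    }

    // SBO style: a service that is not tied to one type and works on any objects.
    interface CatalogExportService {
        List<String> export(List<? extends RepositoryObject> objects);
    }

    static class CsvCatalogExport implements CatalogExportService {
        @Override
        public List<String> export(List<? extends RepositoryObject> objects) {
            List<String> rows = new ArrayList<>();
            for (RepositoryObject object : objects) {
                rows.add(object.typeName + "," + object.name);
            }
            return rows;
        }
    }

    public static void main(String[] args) {
        Contract contract = new Contract("Supply agreement", 250_000);
        RepositoryObject product = new RepositoryObject("product", "Widget");
        System.out.println("Legal review needed: " + contract.requiresLegalReview());
        System.out.println(new CsvCatalogExport().export(List.of(contract, product)));
    }
}
```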
Aspects
Documentum supports aspects, an additional framework for extending object behavior and attributes.
Aspects are a type of BOF entity that can be dynamically attached to object instances to provide fields and
methods beyond the standard ones for the object type. The extended behavior can include functionality that
applies to types across the object hierarchy.
For example, an aspect can label objects as retainable or web-viewable. This single aspect can then be
applied to multiple distinct object types. Aspects speed application development and improve code reuse,
because the extended attributes and behavior do not alter the underlying type definitions.
Aspects can be associated with either an individual object or an object type. When associated with an
object type, the aspect is automatically associated with each new object of the specified object type.
Aspects can also have properties defined for them. Properties defined for an aspect appear to users as if
they are defined for the object type of the object to which the aspect is attached.
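As a rough illustration of these rules, the following sketch models instance-level attachment, type-level association, and aspect properties that read like ordinary object properties. All names here (AspectSketch, the "retainable" aspect, the retention_years property) are assumptions made for the example and do not correspond to DFC types.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of aspects; the class and property names are not DFC types.
class AspectSketch {

    // An aspect carries a name plus extra properties with default values.
    static class Aspect {
        final String name;
        final Map<String, Object> defaults;

        Aspect(String name, Map<String, Object> defaults) {
            this.name = name;
            this.defaults = defaults;
        }
    }

    static class RepoObject {
        final String typeName;
        private final Map<String, Object> properties = new LinkedHashMap<>();
        private final Set<String> aspects = new LinkedHashSet<>();

        RepoObject(String typeName) {
            this.typeName = typeName;
        }

        // Attach to this instance only; the type definition is left untouched.
        void attach(Aspect aspect) {
            if (aspects.add(aspect.name)) {
                for (Map.Entry<String, Object> e : aspect.defaults.entrySet()) {
                    properties.putIfAbsent(e.getKey(), e.getValue());
                }
            }
        }

        boolean hasAspect(String name) {
            return aspects.contains(name);
        }

        // Aspect properties are read exactly like the object's own properties.
        Object get(String property) {
            return properties.get(property);
        }
    }

    static class Repository {
        private final Map<String, List<Aspect>> typeAspects = new HashMap<>();

        // Type-level association: applies to every object created afterwards.
        void associateWithType(String typeName, Aspect aspect) {
            typeAspects.computeIfAbsent(typeName, k -> new ArrayList<>()).add(aspect);
        }

        RepoObject create(String typeName) {
            RepoObject object = new RepoObject(typeName);
            for (Aspect aspect : typeAspects.getOrDefault(typeName, List.of())) {
                object.attach(aspect);
            }
            return object;
        }
    }

    public static void main(String[] args) {
        Aspect retainable = new Aspect("retainable", Map.of("retention_years", 7));
        Repository repository = new Repository();
        repository.associateWithType("contract", retainable);

        RepoObject contract = repository.create("contract"); // attached automatically
        RepoObject memo = repository.create("memo");
        memo.attach(retainable);                             // attached to one instance

        System.out.println(contract.get("retention_years"));
        System.out.println(memo.hasAspect("retainable"));
    }
}
```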

Integrated development capabilities: Documentum Composer


Application developers can use EMC Documentum Composer, an Eclipse-based integrated development
environment (IDE) for developing, deploying, and configuring applications running on the Documentum
platform. By leveraging a standards-based IDE, developers increase their productivity while reducing the
cost of application development. Eclipse enables an ecosystem of customers, partners, and business
analysts.
Documentum Composer supports a series of mechanisms for rapid application development. It exposes an
ECS catalog viewer so that developers can easily discover and access the ECS services available within
Documentum. Composer also includes a well-defined plug-in model for adding functionality to the
platform environment.
As an Eclipse-based IDE, Composer integrates with the broad range of application resources (and their
varied editors) available to developers within the Documentum environment. Composer enables multiple
tools to share a common set of information resources. It provides an open development framework, with
well-defined interfaces and extension points.
As a result, application developers can leverage their investments in DFS and ECS. They can easily
develop web services-oriented applications that integrate content-related objects with resources and
services of external enterprise applications.

Configuration capabilities
Configuration tools provide capabilities to customize Documentum to the needs of a business without
developing or modifying the code for a content application. Configuration tools adapt the platform
capabilities to the ways that an organization functions. These tools support essential configuration features,
including presets, high-fidelity forms, and smart containers.

Presets
A preset determines the selections or actions available to end users in particular situations as they use
Webtop. Creating a preset offers a way to limit screen options to those selections that are relevant to
the user's task in the particular situation. A Webtop configuration tool provides the web forms for
selecting particular preset options.
A preset is assigned to a particular item or set of items. For example, a preset could be assigned to a
particular user group. A preset could also be assigned to a particular user group when combined with a
specific folder location.
A preset consists of one or more rules. Each rule determines the selections or actions available within a
specific functional area. For example, a rule can determine available lifecycles, available actions, or
available auto-complete text. As a result, end users can work in a customized ECM environment that is
tailored to their specific tasks and activities.
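The way presets and rules combine can be sketched as a simple lookup. This is a hypothetical model, not the Webtop configuration API. The class and method names are invented, and the precedence rule (a preset scoped to a group plus a folder wins over one scoped to the group alone) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical model of preset resolution; not the Webtop configuration API.
class PresetSketch {

    static class Preset {
        final String group;   // user group the preset is assigned to
        final String folder;  // optional folder location; null means any folder
        final Map<String, List<String>> rules; // functional area -> allowed selections

        Preset(String group, String folder, Map<String, List<String>> rules) {
            this.group = group;
            this.folder = folder;
            this.rules = rules;
        }

        boolean appliesTo(String userGroup, String location) {
            return group.equals(userGroup) && (folder == null || folder.equals(location));
        }
    }

    private final List<Preset> presets = new ArrayList<>();

    void add(Preset preset) {
        presets.add(preset);
    }

    // Returns the selections allowed in one functional area, or empty when no
    // preset restricts it. Assumption: group-plus-folder beats group alone.
    Optional<List<String>> allowed(String userGroup, String location, String area) {
        Preset best = null;
        for (Preset p : presets) {
            if (!p.appliesTo(userGroup, location) || !p.rules.containsKey(area)) {
                continue;
            }
            if (best == null || (best.folder == null && p.folder != null)) {
                best = p;
            }
        }
        return best == null ? Optional.empty() : Optional.of(best.rules.get(area));
    }

    public static void main(String[] args) {
        PresetSketch presets = new PresetSketch();
        presets.add(new Preset("claims", null,
                Map.of("lifecycles", List.of("standard", "expedited"))));
        presets.add(new Preset("claims", "/Claims/Auto",
                Map.of("lifecycles", List.of("auto-claim"))));
        System.out.println(presets.allowed("claims", "/Claims/Auto", "lifecycles"));
        System.out.println(presets.allowed("claims", "/Claims/Home", "lifecycles"));
        System.out.println(presets.allowed("sales", "/Claims/Auto", "lifecycles"));
    }
}
```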

High-fidelity forms
Documentum provides Forms Builder as an integrated tool to create high-fidelity form templates that look
like their paper counterparts. Forms designers can create and manage templates using a word processing
paradigm similar to Microsoft Word.
Users who fill out forms created from high-fidelity form templates have the option to submit the data online
by posting the data to a website or web-based application using HTTP, or by e-mailing the data using their
preferred mail application. For speedy recognition of data on forms that are filled out electronically but
submitted via hard copy or fax, high-fidelity form templates support bar code generation that can encode
values such as account number, last name, or other key identifying data.
Content captured through these forms can be stored and managed within the Documentum repository.
High-fidelity forms can share the same data model as web forms, thus providing alternative views of the
same structured content. As a result, application designers can create customized, high-fidelity forms that
have the appearance of paper, without the bottlenecks, overhead, cost, and risk of maintaining paper-based
processes.

Smart containers
Smart containers provide templates for hierarchical objects, such as loan applications, case folders, and
health care patient records. Smart containers are designed to instantiate new objects and set relationships
among them without coding. These containers incorporate built-in logic to automatically link to the discrete
objects in a predetermined fashion. Thus, a home loan application might include the customer's application,
verification of the customer's income, e-mail about the application, and property appraisal, together with the
logic to link the appraisal to the property address on the application. The platform-based logic can be
reused across many applications.
Developers specify the capabilities of a smart container model, and define runtime parameters. Then
administrators use Composer as a customization tool to establish the parameters and logic of a smart
container instance without the assistance of developers. As a result, less development time is spent on
coding pre-existing content and relationships, and firms can leverage smart containers across multiple
applications to ensure consistency.
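The idea of a template that instantiates objects and links them without per-application code can be sketched as follows. This is a hypothetical model only. The Template and Instance classes, the placeholder names, and the use of a name prefix as the runtime parameter are assumptions for illustration and do not describe how Composer stores or applies smart containers.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a smart container template; not a Composer artifact format.
class SmartContainerSketch {

    // The template lists placeholder objects and the relationships among them.
    static class Template {
        private final List<String> placeholders = new ArrayList<>();
        private final List<String[]> relationships = new ArrayList<>();

        Template placeholder(String name) {
            placeholders.add(name);
            return this;
        }

        Template relate(String from, String to) {
            if (!placeholders.contains(from) || !placeholders.contains(to)) {
                throw new IllegalArgumentException("Relate only declared placeholders");
            }
            relationships.add(new String[] {from, to});
            return this;
        }

        // Instantiate: create one object per placeholder, then link them as declared.
        // The runtime parameter here is just a prefix for the new object names.
        Instance instantiate(String namePrefix) {
            Instance instance = new Instance();
            for (String p : placeholders) {
                instance.objects.put(p, namePrefix + " - " + p);
            }
            for (String[] r : relationships) {
                instance.links.add(instance.objects.get(r[0]) + " -> "
                        + instance.objects.get(r[1]));
            }
            return instance;
        }
    }

    // The result of instantiation: created objects plus the links between them.
    static class Instance {
        final Map<String, String> objects = new LinkedHashMap<>();
        final List<String> links = new ArrayList<>();
    }

    public static void main(String[] args) {
        Template homeLoan = new Template()
                .placeholder("application")
                .placeholder("income verification")
                .placeholder("appraisal")
                .relate("appraisal", "application");
        Instance loan = homeLoan.instantiate("Loan 1001");
        System.out.println(loan.objects.values());
        System.out.println(loan.links);
    }
}
```

The same template object can be instantiated repeatedly, which is the reuse across applications that the text describes.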

Administration capabilities
Documentum Administrator provides the capabilities to monitor, administer, configure, and maintain the
content servers, repositories, and federations located within an enterprise. This administration tool supports
general content management functionality for such tasks as establishing user groups, setting roles and
permissions, and managing security, lifecycles, workflows, and virtual documents. Documentum Administrator
also runs within Webtop.
The base administration capabilities supported by this tool can be enhanced to support additional reporting
and monitoring functions by integrating with third-party applications.

The experiences group: Managing end users' interactions
Documentum supports end users' interactions through an extensible client infrastructure together with a set
of activity-specific applications. These applications target knowledge workers, interactive professionals,
and transaction-oriented specialists, who often work in production roles and need to complete business
tasks.
Documentum delivers both a client infrastructure for developing customized experiences, as well as
predefined end-user applications.

Client Infrastructure
Documentum includes an applications experiences group for developing web-based clients and user
applications. Documentum supports two kinds of experiences, one for conventional, browser-based
applications and another for fine-grained interactivity, based on Rich Internet Application (RIA)
capabilities.

The Web Development Kit framework


Documentum includes a Web Development Kit (WDK), an application development framework for
developing web-based clients and user applications that feature a conventional, browser-based experience.
Many Documentum clients and applications, including Webtop, Web Publisher, and Compliance Manager,
are built using WDK.
Documentum also uses the WDK to provide a series of Application Connectors for integrating
Documentum functionality within Word, Excel, PowerPoint, and Documentum Client for Outlook. The
WDK framework provides application developers with a consistent and unified environment for creating
web-based applications that interact with the Documentum repository. The WDK framework relies on a
form-control-event approach, consistent with .NET WebForms and the JavaServer Faces standard (JSR
127). The WDK framework supports Lightweight System Objects (LwSOs, described earlier) to improve
the user experience with certain content operations (such as archiving large numbers of e-mail messages
or restoring them from an archive).
The WDK framework provides a set of WDK services that runs locally on a client-side device, either
within a browser or a desktop application, and interacts with server-side business objects (developed using
the BOF) or with DFC functions (see Figure 14).

Figure 14. Documentum includes a Web Development Kit for developing both browser-based and Windows-based desktop applications.
Within Windows-based desktop applications, the WDK framework provides COM objects for sending and
receiving HTTP messages to and from a web application server.

Application Connectors
Application Connectors are WDK components that provide access to the Documentum repository and
content services from within desktop applications such as Microsoft Office applications. Application
Connectors are built on an open framework that enables application developers to add connectors as
plug-ins. Because Application Connectors function consistently within various desktop applications, a
single set meets all of an application developer's needs.
Application Connectors appear as menu items within a desktop application's pull-down menu. From
Microsoft Word, Excel, and PowerPoint, the Application Connectors directly call the server-side
components within the Documentum platform, perform the action, and return the results to the calling
Office application. Files and folders can be replicated between the Documentum repository and desktop
storage in a secure manner, to enable offline operations. Offline files and folders are then synchronized
with the repository upon reconnection.
For example, a Microsoft Word user could use the Documentum menu to query and access documents
stored within the Documentum repository. The Application Connector first authenticates the user and then
authorizes access rights, enabling the user to easily access the documents within Word. Meanwhile, the
server-side content is managed by the business policies of the Documentum platform.
As another example, the Documentum Salesforce.com Connector allows enterprises the flexibility of using
any best-of-breed, front-end sales application, while providing the security, compliance, and powerful tools
of Documentum's industry-leading content management capabilities. Application developers can use the
Application Connector SDK to develop additional application connectors for the desktop applications of
their choice.

Capabilities for Rich Internet Applications


Keeping abreast of rapidly evolving web-centric technologies, Documentum also delivers highly
interactive experiences that run within a browser, rather than being served as predefined pages from a web
server. Documentum's Rich Content Management Platform (RCMP), which provides its capabilities for Rich
Internet Applications (RIAs), supports UI components that are developed with such popular technologies
as Adobe Flex and AJAX. RCMP provides the infrastructure for controlling interactions among the
components and for managing events.
To deliver an interactive experience, RCMP supports interoperability and common event handling among
disparate UI components. The UI components, in turn, interact most frequently with DFS services to
manage the underlying content. On occasion, these components make calls directly to BOF and DFC
interfaces.
Thus, an interactive experience can include several UI components, each based on a different RIA
technology, which are able to communicate with one another and respond to a common set of events. For
example, developers might develop a user experience with a Flex-based image viewer, JavaScript tree
control, and JavaScript form. Clicking on one of the tree nodes forces a refresh of the form and image
viewer.
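The tree, form, and image viewer example can be sketched as a small publish/subscribe arrangement. This is an assumption about how common event handling might look in principle, written in Java for consistency with the other sketches. EventBus, Component, and the "nodeSelected" event name are hypothetical and are not RCMP interfaces.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical event bus; it illustrates the idea only and is not an RCMP API.
class EventBusSketch {

    static class EventBus {
        private final Map<String, List<Consumer<String>>> handlers = new HashMap<>();

        void subscribe(String eventName, Consumer<String> handler) {
            handlers.computeIfAbsent(eventName, k -> new ArrayList<>()).add(handler);
        }

        // Deliver the payload to every component subscribed to this event.
        void publish(String eventName, String payload) {
            for (Consumer<String> handler : handlers.getOrDefault(eventName, List.of())) {
                handler.accept(payload);
            }
        }
    }

    // Stand-in for a UI component that redraws itself when told which node to show.
    static class Component {
        final String name;
        String showing = "(nothing)";

        Component(String name) {
            this.name = name;
        }

        void refresh(String nodeId) {
            showing = nodeId;
        }
    }

    public static void main(String[] args) {
        EventBus bus = new EventBus();
        Component form = new Component("form");
        Component viewer = new Component("image viewer");
        bus.subscribe("nodeSelected", form::refresh);
        bus.subscribe("nodeSelected", viewer::refresh);

        // The tree control only publishes; it does not know who is listening.
        bus.publish("nodeSelected", "invoice-42");
        System.out.println(form.name + " shows " + form.showing);
        System.out.println(viewer.name + " shows " + viewer.showing);
    }
}
```

Because the publisher and the subscribers share only an event name, components built with different UI technologies could in principle take part in the same interaction.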
Developers can also develop mash-up experiences where they combine a UI component that is managing
content through DFS with a component that calls an external Web resource. For example, a list of
documents containing the addresses of the primary authors can be mashed-up with a Google map to
visualize the authors' locations.
In short, through this component-based approach, Documentum can support a wide variety of end-user
experiences, delivered both within a desktop browser and on mobile devices.

End-user application frameworks


Documentum delivers end-user application frameworks for knowledge workers, interactive professionals,
and transaction-oriented business specialists. Each application framework optimizes user interactions for
particular types of business tasks and activities.

Knowledge workers
Knowledge workers need to work on multiple projects at once, locate information
across an array of resources, and collaborate with team members on a variety of subjects.
Furthermore, knowledge workers must be able to use the particular clients and applications they prefer to
locate, access, and update information. Documentum provides the end-user experiences to make knowledge
workers more efficient and productive.
Documentum Client for Outlook
Many knowledge workers do business through e-mail, and particularly Outlook. The Documentum Client
for Outlook leverages Outlook's native features, menus, and toolbars to file, store, find, and manage e-mail
messages and documents within Documentum. Through ordinary drag-and-drop operations, e-mail
messages (including attachments) can be filed into the appropriate repository folder. Once stored within the
repository, Documentum functionality, such as properties, permissions, versions, and locations, can be
applied directly from Outlook.
CenterStage
Knowledge workers also need to stay abreast of activities and events within their continuously changing
business environments. CenterStage is designed to deliver the interactive experiences for accessing and
managing collaborative content within the framework of an enterprise information infrastructure.
CenterStage provides guided navigation, syndication services, social networking, and intelligent search of
disparate information sources.

Interactive professionals
Whether it is Flash websites, streaming videos, or interactive product images, customers expect engaging
and interactive experiences in the digital world. With this expectation comes an explosion of rich media
content and the huge volumes of media content that businesses need to control. Photo editors, graphic
designers, cinematographers, and other creative professionals who produce this interactive content need to
manage their interactive assets in a systematic manner.
Media WorkSpace
Leveraging Flex technologies, Media WorkSpace provides a personalized, dynamic, and immersive way to
view, find, compare, annotate, review, and share interactive media. It includes dynamic previewing
capabilities, real-time filtering, relevance ranking, image annotations, and easy functions to collect and
compare sets of interactive assets. Media WorkSpace is designed to increase productivity with fewer mouse
clicks as well as substantially reduce the time and effort required for managing interactive assets.
Web Publisher Page Builder
Web Publisher Page Builder provides a WYSIWYG (What-You-See-Is-What-You-Get) interactive
interface for managing content on web sites. This interface is tailored to both web page designers and
content contributors. Page designers can quickly assemble new pages by dragging and dropping prebuilt,
dynamic, or custom components (such as RSS feeds, URL links, video clips, and so on) from a content
repository to a web page. Companies can quickly deploy and iterate relevant, meaningful, and current web
pages to their consumers.

Transactional
Documentum supports the end-user experiences for managing transactional content both from the
perspective of the process designer and the transaction-oriented business specialist. It provides the
capabilities for business analysts and process designers to quickly model and execute business processes in
a web-based environment, in direct partnership with their IT counterparts. These experiences rely on Flex
to create the user experience and on DFS to interact with the content and process services. The result is
an improved end-user and customer experience, a decrease in development cycle time, better decision
making, and substantial savings by enabling an agile operational enterprise.
TaskSpace
TaskSpace provides a highly configurable user interface that unites process, content, and monitoring into a
single, powerful user experience. It provides a 360-degree view of case work and high-performance document
viewing, including tight integration with high fidelity forms that truly resemble their paper counterparts.
TaskSpace is based on re-usable components and configurable actions to support rapid application
development. It provides an all-in-one user experience for transactional business applications.
Business Activity Monitor
For line managers responsible for business operations, Documentum provides a high-level view of real-time
process execution, leading to better understanding of overall process performance. Business Activity
Monitor gives managers the ability to detect and diagnose potential problems before they affect the
customer or impact business.
Also a Flex-based user experience, Business Activity Monitor includes dynamic key performance indicator
(KPI) tracking capabilities, together with interactive charts and drill downs. There is a graphical dashboard
builder to create and customize user experiences as well as the ability to implement active alerts and
responses. Business Activity Monitor is designed to manage dynamic business processes by providing
managers with the tools to sense, assess, and respond to changing events.

Conclusion
Unstructured content serves as the critical information source for many applications. The EMC
Documentum architecture provides a strategy for solving today's needs to manage unstructured content,
and for investing in tomorrow's opportunities to profit from content-centric applications. Documentum
delivers the services for managing unstructured business information within an enterprise and beyond.
Using Documentum, companies can ensure that unstructured content is stored, secured, delivered, and
archived in a systematic manner that follows predefined business rules and conforms to established policies
and procedures.
Documentum enables companies to develop robust content applications that solve mission-critical business
problems. For example, marketers and external business partners can always have easy access to the latest
product information, while engineers and scientists follow established business processes when
documenting new technologies. Companies can archive and retain content to meet compliance
requirements, while enabling multiple departments and external business partners to easily work together
and share any type of content over the network.
Finally, Documentum provides application-level components for developing enterprise-scale applications
that use content within the context of business processes and delivers a broad range of application
experiences to desktop- and browser-based applications. These capabilities form the foundation for
tomorrow's solutions: managed content that disparate applications can access and consume, as needed, as
flexible web services based on a service-oriented environment.
