
Productivity Growth for Services

A Middle Market Primer on Content Management Systems


Requiring the System

Moving Away From Paper

One bright day, your company decides it's time to operate in the 21st century and migrate to a paperless (or at least a reduced-paper) office. Great, this is good news! You're tired of tripping over randomly distributed filing boxes in the hallways. You surmise that a content management system would put those megabytes of trash on a couple of redundant hard drives. Moreover, those tiresome people who are always trying to find that one electronic document at 5:00 pm on Friday will be satisfied. A network-based system would allow any salesperson or customer to find that lost invoice on their own time. Wonderful! All your headaches will be reduced to a set of defined procedures. Every company document will be indexed and stored, out of sight, forever.

Now, here's the bad news. You and your department have been assigned this installation effort. Where do you begin?

The Challenge

The urge to build a content management system for your company is no surprise. Since the majority of American businesses are involved in the service industry, most activities at these firms depend heavily on codifying ideas and processes. The easiest way to formalize and maintain firm-specific knowledge is with some form of electronic content management system (CMS). Moreover, the emerging regulatory environment that governs public companies (e.g. SOX, SEC, NASD, HIPAA, Homeland Security) requires that CMS designs comply with legal guidelines.

The challenge for most organizations is how to do this with a quantifiable return on investment. We're discussing real returns, not some simple spreadsheet-based model laced with overly optimistic assumptions. Management wants the money back in a measurable way, not via some fictitious total cost of ownership model capable of growing financial trees to the moon. Often these projects are driven by senior managers but implemented by sub-CIO level officers.
Consequently, these individuals are given the Spartan edict: "Come back with your shield or on it!" This is not a task to be carried out by rudimentary contract coders. This assignment requires a seasoned system integrator who understands that a CMS is a business-critical system. Imagine every idea from the past year is dropped in one box only to evaporate the next day. In other words, if the system doesn't work, your firm loses money - lots of it.

Content Management Systems

Modern document management systems are essentially file assembly lines consisting of four major components:

Document Construction - In this stage, the publishers create the documents or messages they want to push to the CMS. The document construction stages often utilize standard editors. These applications operate on the actual ASCII and binary components of a file.

Contribution Workflow - Following the construction stage, the document enters the submission workflow. A workflow consists of a series of logical operations and queues. A workflow system can assist in assigning other components to the document (such as metadata and taxonomy) and delegate administrative tasks.

Storage and Retrieval - As the document exits the workflow, all electronic content enters a referential data collection. The information references are created directly via metadata tagging or machine indexing.

Subscription and Distribution - The final component of a document management system is the retrieval interface. Documents are queried and distributed through a number of programmatic interfaces. These interfaces can use SOAP, JMS or any other standard protocol. Gateways to individual systems are built over the standard interfaces. The document query can be executed using either a search engine or SQL.
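The four components above can be sketched as a tiny in-memory pipeline. This is only an illustration under stated assumptions: the class and function names (Document, ContentPipeline, keyword_tagger) are hypothetical and are not taken from any particular CMS product, and a real system would use durable storage and a search engine or SQL database rather than Python dictionaries.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Document:
    """Stage 1 output: a constructed document (hypothetical structure)."""
    doc_id: str
    body: str
    metadata: set = field(default_factory=set)


class ContentPipeline:
    """Toy model of the workflow, storage and retrieval stages."""

    def __init__(self):
        self.queue = deque()  # contribution workflow queue
        self.store = {}       # doc_id -> Document (the data collection)
        self.index = {}       # metadata tag -> set of doc_ids

    def submit(self, doc):
        """Stage 2: a constructed document enters the workflow queue."""
        self.queue.append(doc)

    def process(self, tagger):
        """Tag each queued document, then store and index it (stage 3)."""
        while self.queue:
            doc = self.queue.popleft()
            doc.metadata |= tagger(doc)
            self.store[doc.doc_id] = doc
            for tag in doc.metadata:
                self.index.setdefault(tag, set()).add(doc.doc_id)

    def query(self, tag):
        """Stage 4: retrieve stored documents by metadata tag."""
        return [self.store[i] for i in sorted(self.index.get(tag, set()))]


def keyword_tagger(doc):
    """Toy machine-indexing step: tag by keywords found in the body."""
    keywords = {"invoice", "contract", "report"}
    return {word for word in keywords if word in doc.body.lower()}


if __name__ == "__main__":
    pipeline = ContentPipeline()
    pipeline.submit(Document("d1", "Invoice for March services"))
    pipeline.submit(Document("d2", "Quarterly report", {"finance"}))
    pipeline.process(keyword_tagger)
    print([d.doc_id for d in pipeline.query("invoice")])
```

In this sketch the tagging step is passed in as a function, so the same workflow could accept author-supplied tags, machine indexing, or both.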

Many Solutions in a Fragmented Industry

Complicating matters, content management systems exist in all forms and capabilities. This is one of the most fragmented segments in the software industry. Multiple vendor packages and a whole host of open source projects dot the technology landscape. Any one of these solutions can be based on a complete spectrum of technologies. The code base can be Java, Python, Perl, C++, etc. There are even solutions using functional languages like Haskell and Erlang. Additionally, some systems can be deployed on a variety of operating systems and hardware configurations.

Implementation and Planning

Requirements and Processes

The most important step when selecting a content management system is the requirements engineering phase. This is usually conducted by business analysts and systems architects. This planning stage begins with the compilation of a feature list, moving next to business requirements and then to technical requirements. Once the system constraints are finalized, the analyst will create a set of use cases accompanied by business process maps. Once the documentation phase is completed, the systems architect will design the components and define the technical specifications. Smaller projects may combine the analyst and architect roles.

Economic Gain and Pain

The business justification for this new technical strategy may include customer requirements, regulatory requirements, storage costs or business effectiveness. Unlike large firms with decades of electronic data processing experience (e.g. financial service firms, insurance providers or global goods distributors), most middle market firms will find that moving critical (and not so critical) information to a completely electronic format is dauntingly unfamiliar and difficult. If you are not familiar with document automation, you are in for a surprising amount of work.
The effort that goes into specifying, designing, building and testing a document management system requires trained and dedicated resources. These costs do not scale with the number of documents because the system is designed to minimize individual document costs. Unfortunately, most costs are generated during project implementation and operational oversight. The implementation costs are largely related to the number of features the system must have, as dictated by the requirements. The operating costs are related to data maintenance and system administration. Fortunately, a good system pushes the creation and subscription efforts onto the users.

Removing Legacy Methods

In some cases, an enterprise may rely on a number of small departmental systems constructed in an unplanned environment. Replacing this diverse application base can reduce costs by consolidating licensing and software support. Since enterprise CM systems are scalable by definition, costs also fall as complexity is reduced. Reducing a family of heterogeneous systems to a set of distributed but identical points of presence will reduce training, integration and software modification costs. Simplifying the content management system will also lessen integration costs with third party subscribers. Moreover, reducing system complexity and creating a unified system eliminates many costs associated with data schema design and storage components.

User Experience

Most legacy systems provide little utility for active contributors. A unified content system should provide full previewing, staging and rollback capabilities. Administrators will also require improved monitoring and data manipulation features. These allow administrators to easily specify workflow, metadata, taxonomies and thesauri. Finally, the end user should be able to access content in a simple and effective manner. The metadata and the application interface should accurately deliver the appropriate content as requested by the subscriber. Each industry group also caters to its subscribing users differently: some prefer to distribute lengthy documents in PDF format, others require instant delivery of timely information over an RSS feed, and some just want simple, formatted web pages. Other industries deal heavily in analytical information, tabular reports in simple text and models created in Excel spreadsheets.
Clearly, a content management system must support a diverse array of content types and delivery mechanisms.

Long Term Operating Goals

Increasing scale will require the system to handle more documents faster. With the increase in scale, the system will also need to implement inexpensive redundant systems. Expanding scope will increase the number of actors and roles in the system along with the amount of metadata. The increase in scope will require the system to be extendable programmatically. This is also important for servicing a growing number of external, third party systems subscribing to firm-wide content.

Content Management System Concepts

Data and Content

Metadata is simply data that describes other data. For content management systems, creating metadata is often referred to as tagging. The metadata can be created either deterministically or inferentially. The deterministic creation of metadata requires actors (usually the author) to create the set of metadata. Metadata created inferentially is generated by another piece of software that examines the context of the text itself. The metadata usually consists of a set of sets. Each subset groups terms that are related.
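As a rough illustration of the two ways of creating metadata, the sketch below models the "set of sets" as a mapping from a subset name to a frozen set of related terms. The function names, the subset names and the controlled vocabulary are all hypothetical; a real inferential tagger would use far more sophisticated text analysis than simple word matching.

```python
def deterministic_metadata(author_tags):
    """Metadata supplied explicitly by an actor, grouped into related subsets."""
    return {group: frozenset(terms) for group, terms in author_tags.items()}


def inferential_metadata(text, vocabulary):
    """Metadata inferred by scanning the text against a controlled vocabulary."""
    words = {w.strip(".,;:!?").lower() for w in text.split()}
    inferred = {}
    for group, terms in vocabulary.items():
        hits = frozenset(t for t in terms if t in words)
        if hits:
            inferred[group] = hits
    return inferred


def merge_metadata(*sources):
    """Union the subsets from several metadata sources, group by group."""
    merged = {}
    for source in sources:
        for group, terms in source.items():
            merged[group] = merged.get(group, frozenset()) | terms
    return merged


if __name__ == "__main__":
    # Hypothetical vocabulary: each key names one subset of related terms.
    vocabulary = {"region": {"europe", "asia"}, "product": {"bonds", "equities"}}
    by_author = deterministic_metadata({"region": ["emea"]})
    by_machine = inferential_metadata("Outlook for bonds in Europe.", vocabulary)
    print(merge_metadata(by_author, by_machine))
```

Keeping the two sources separate until the merge step makes it possible to audit which tags came from an author and which were inferred.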

Taxonomy or categorization organizes the associated metadata into a mesh of relationships. The relationships are usually abstracted in hierarchical form. This results in a taxonomy tree that categorizes the individual files in a logical set of groupings.

Thesauri (specifically metadata thesauri) are cross-referenced catalogs that map sets of metadata to other sets of metadata. Thesauri are useful for transforming a set of documents from one set of taxonomy trees to another, for use across different lines of business.

Snippets are small units of body text. An atomic snippet is a single string of text between two metadata tags. Snippets can consist of other snippets.

Base Documents are the most basic form of file in a content management system. These can be plain text files, XML files or any other form of document.

Document Transformations create another document from a base document. Typically, these mappings are one-way transformations.

Version Control tracks the incremental changes in a file. These incremental changes are sometimes committed to an archive in case a rollback is needed.

Locking occurs during stages in a workflow. Locking establishes dynamic access control determined by both actor roles and document state in a workflow. The version control system can also lock access to specific documents moving through a workflow.

Activities

Workflow is a logical progression of operations. It consists of queues and logical operations arranged in a network across the workflow design.

Origination (creation) initiates the content management workflow. Content creation results in a new file in the system. The new file is usually an approved document type.

Modification of content is an activity in the workflow that changes the contents of a file.

Disablement occurs when files are removed from the workflow.
The file physically resides in the system but is placed outside the content management workflow.

Deletion occurs when content is excised permanently from the system via complete removal from the file system.

Approval of content occurs when a designated file is approved by the current system overseer and is moved to the next step of the workflow.

Review requires that a content file be opened and read by an actor in order to complete an audit step in the workflow.

Assembly of content is a workflow step where two or more independent files are unified to form a single, self-contained document.

Actors

Contributors are users who supply content to the document management system. They are authorized to create, modify and disable documents.

Content Administrators are a group of actors who can modify or disable a content file. This group can also force a review and an approval workflow cycle on another user.

Data Architects are a user group that controls the design and implementation of metadata and taxonomy. Data architects can also create, modify or delete a workflow.

System Administrators can create, modify and delete users and roles. System administrators can insert and remove metadata and taxonomy structures from the system. System administrators can also create, modify or delete a workflow.

Software Solutions

Open Source

This approach utilizes a central code base selected from a major open source project. Currently, a large number of open source projects supply document management code. The most important factor when selecting an open source project is scale. Many projects work well for department-level systems but fail when used for global, enterprise systems. Each open source project usually comprises the core document management components. To fully meet the requirements of any firm, an open source code base would need additional enhancements. These may introduce some design and construction risk. An open source approach can also utilize a service-based architecture. These systems can be extended for data importation and exportation. Service-based architectures can also use remote calls and external operations for the actual construction and management of documents.

Vendor Hybrid Solutions

Hybrid strategies consist of a purchased application supplemented with either additional vendor products or open source components. A typical out-of-the-box solution, selected as the system core, should meet over 80% of the business requirements. The vendor core should also solve all critical business and technical requirements prima facie. As in the open source solution, if there are any must-save legacy system components, they can be harvested and connected to a vendor-supplied programming core. This strategy provides a simple solution, but can increase risk by neglecting some business requirements.
Turnkey Solutions

A turnkey approach depends entirely on a vendor delivering a complete solution that meets all requirements. The primary vendor acts as a general contractor, subcontracting critical components to other vendors. Turnkey strategies reduce execution and delivery risks, but require substantial up-front investments. Moreover, the requirements process becomes critical for success. Turnkey solutions usually require extended contractual maintenance.

Conclusion

Selecting a CMS solution is complex. A successful CM solution depends on the specific business operations within each firm and, in turn, the specific requirements of each industry. Complicating matters is the wide variety of solutions available from the vendor and open source communities. Fortunately, the activities within this software realm can be broken down into four major categories: origination, workflow, storage and distribution. The crucial factors for successful implementation include a thorough compilation of all business processes and rigorous supplier selection.

About Altametric

Altametric, LLC was formed in mid-2005 under the direction of Antonio Anselmo. Previously, Dr. Anselmo worked for 12 years at J.P. Morgan Chase in the Financial Engineering and Electronic Commerce groups in the Investment Bank. Prior to this, he was a Scientist at Varian Associates for 4 years. He holds a B.Sc., M.Eng. and Ph.D. from Cornell University and an M.B.A. from the Amos Tuck School at Dartmouth College.

Altametric LLC


sales@altametric.com