Table of Contents
Introduction
  Purpose of this Manual
  Intended Audience
  Document Conventions
Getting Started
  Starting Buldoser
  Using the Docbase Browser
    Opening cabinets and folders
    Getting an object's Object ID
    Unlocking objects
    Deleting objects
    Refreshing the Browser
    Showing and Hiding User Cabinets
    Logging into another Docbase
    Exiting the Docbase Browser
    Viewing the current version of Buldoser
Docbase to Docbase Overview
  The Buldoser Methodology
  Supported Object Types
  ETL Process
    Check in content objects
    Mandate a content freeze
    Move supporting objects
    Create Batch Folder
    Extract Content
    Finish Extract
    Switch Docbase
    Transform
    Load Content
    Finish Load
    Resolve Errors
    Reprocess Errors
    Load Relationships
    Finish Relationship Load
    Resolve Relationship Errors
    Reprocess Relationship Errors
    Test Content
    Undo Load
Extracting Content from a Docbase
  Starting an Extract
  Finishing a Stopped Extract
Mapping Data Values
  Mapping Attributes
  Creating ACLs and Accessing Groups
  Creating Folders
  Preview Folder Creation
Loading Content into a Docbase
  Starting a New Load
  Finishing a Stopped Load
  Reprocessing Load Errors
  Starting a Relationship Load
  Finishing a Stopped Relationship Load
  Reprocessing Relationship Load Errors
  Undoing a Load
Scheduling an ETL Operation
  Scheduling Overview
    Extract Jobs
    Load Jobs
  Scheduling an Extract Job
  Scheduling a Load Job
  Editing an Existing Extract or Load Job
  Deleting an Extract or Load Job
Database to Docbase Overview
  Connecting to a Data Source
  Mapping a Data Source to Documentum's Data Model
    The Object View
    Supporting Views
    Object Configurations
    Inline Data Transformation
    Attribute Configuration
    Content Configuration
    Folder Configuration
    Security Configuration
    Versioning Configuration
    Pre- and Post-Processing
    Multi-threaded Loading Algorithm
Loading Content from a Database
  Creating a New Configuration or Configuring an Existing Configuration
  Finishing a Stopped Database Load
  Undoing a Database Load
Appendix - EDMS98 Operations
  Content View
  Defining Object View
Introduction
This section describes the purpose of this manual and its intended audience.
Intended Audience
Movement of content between Docbases requires knowledge of the Documentum repository and of the specific content being moved. This manual is for administrators of Docbases and assumes the user is familiar with basic Documentum skills, including:
- Documentum Query Language (DQL) (for more information, see the Documentum DQL Reference)
- Documentum object model (for more information, see the Documentum Object Reference)
- Documentum security structures such as Users, Groups, and ACLs
- Using Documentum Administrator
- Documentum Cabinets and Folders
Document Conventions
Table 1-2. Conventions used in this Guide
Convention
Where used
- For emphasis, for support documentation titles, and for text found in tables.
- To indicate keyboard keys, button names, or menu items that need to be pressed, clicked, or selected.
- Quotes: to indicate text to be entered in a field. The quotes are not typed; only the information within the quotes is typed.
- Double less than / greater than signs: to indicate a variable type of information. The signs and the placeholder text are not typed. The specific information represented by the variable within the signs is typed.
Getting Started
This section provides instructions for getting Buldoser running the first time. It gives an overview of logging in and operating the Docbase Browser.
Starting Buldoser
Buldoser can be run from the installed shortcuts (Windows only) or the supplied batch files (LaunchBuldoser.bat or LaunchBuldoser.sh for Windows or UNIX, respectively). The first time Buldoser is run it will ask for a license key. See Figure 1.
Figure 1: Buldoser license key challenge dialog
License keys can be obtained from Crown Partners. License keys are issued per installation machine to customers who have purchased a license. To purchase a license or obtain a license key, contact productsales@crownpartners.com.
The license key is stored in a file named BuldoserLicense.txt in the installation directory. If an upgraded license is purchased, the new key must be entered in this file.
After the license key has been entered successfully, the End User License Agreement is presented. Select the Accept option, then click the Ok button to proceed. The login dialog will then be displayed. See Figure 2.
If no Docbase is currently available, Buldoser will present an error message. Check that the dmcl.ini file contains correct connection information and that the Docbases are currently available.
To start Buldoser, enter a valid username and password, and then click Login. Using a Documentum super user is suggested for performing ETL operations. The Buldoser Docbase Browser will open. See Figure 3.
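The dmcl.ini check mentioned above can be done by hand, but a small script helps when several machines are involved. The following is a minimal sketch, not part of Buldoser. It assumes the common dmcl.ini layout with a [DOCBROKER_PRIMARY] section holding host and port entries; the host name in the sample is hypothetical, and check_dmcl_ini is an illustrative helper name.

```python
import configparser

# Hypothetical sample; a real dmcl.ini normally lives with the Documentum client install.
SAMPLE_DMCL_INI = """
[DOCBROKER_PRIMARY]
host = docbroker.example.com
port = 1489
"""


def check_dmcl_ini(text):
    """Return a list of problems found in the connection settings (empty if none)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section("DOCBROKER_PRIMARY"):
        return ["missing [DOCBROKER_PRIMARY] section"]
    problems = []
    if not parser.get("DOCBROKER_PRIMARY", "host", fallback="").strip():
        problems.append("no docbroker host is set")
    port = parser.get("DOCBROKER_PRIMARY", "port", fallback="")
    if not port.strip().isdigit():
        problems.append("port is missing or not a number")
    return problems


print(check_dmcl_ini(SAMPLE_DMCL_INI))
```

An empty list only means the file is well formed; it does not prove that the docbroker or the Docbases are actually running.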
Opening cabinets and folders
To open a cabinet or folder, double-click the cabinet or folder in either the tree view on the left or the table on the right. The cabinet or folder may also be single-clicked, and then opened or closed using the +/- sign in the tree view. If a cabinet or folder contains a large number of items, Buldoser will warn that opening it may take a long time.
Getting an object's Object ID
To copy an item's Object ID to the Clipboard, select the item, then right-click and select Get Object ID. This can be useful for pasting into IAPI, IDQL, or another administrator tool for doing further research on an object. The Get Object ID menu item may also be reached from the File menu.
Unlocking objects
To unlock checked out objects, select the items then right-click and select Unlock. This can be useful for unlocking objects so that they may be deleted. Multi-select is enabled, so items may be selected using Ctrl + click for individual items or Shift + click for a range of items. The Unlock menu item may also be reached from the File menu.
Deleting objects
To delete objects, select the items, then right-click and select Delete Current Version. This will delete only the current version of the objects. To delete all versions of the selected objects, select Delete All Versions.
Multi-select is enabled, so items may be selected using Ctrl + click for individual items or Shift + click for a range of items. The Delete menu items may also be reached from the File menu.
Refreshing the Browser
To refresh the current view, select Refresh from the File menu. This is useful for viewing new cabinets that are created during a load. Cabinets or folders may also be clicked again in the tree view to refresh their contents.
Showing and Hiding User Cabinets
By default, Buldoser shows only the current user's personal cabinet and all non-personal user cabinets. To show all users' cabinets, select Show User Cabinets from the File menu. The menu item will then toggle to Hide User Cabinets, which can be selected to hide other users' personal cabinets.
Logging into another Docbase
To log into a different Docbase, select Switch Docbase from the File menu. The Login dialog will appear and a different Docbase may be selected.
Exiting the Docbase Browser
To exit the Docbase Browser and Buldoser, select Exit from the File menu.
Viewing the current version of Buldoser
To view the current version of Buldoser, select About Buldoser from the Help menu.
Figure 4: Content to Supporting Object Relationship
When Buldoser moves Content Objects, it assumes all supporting and related objects already exist in the target Docbase. They must be moved separately before the Content Objects are moved. This approach is different from other tools such as Dump & Load and DocApps, which will move any supporting or related objects in addition to the Content Objects to ensure success.
The following reasons support the Buldoser approach:
- Moving these supporting and related objects adds much more time to the Extract and load process, which is not practical for very large volumes.
- These objects usually already exist in a controlled implementation, making it unnecessary to move them.
- In some scenarios, the administrator does not wish to use the same supporting objects in the target that are used in the source. For instance, the user who owns a particular document in the source Docbase should be changed in the target Docbase.
- With other tools, these supporting and related objects are frequently duplicated, causing clutter and inefficiency in the target Docbase.
- Other tools tend not to provide an exhaustive list of what was automatically moved, making it difficult for administrators to determine what was moved in the batch.
The Buldoser methodology for ensuring success is to identify for the administrator which objects must exist in the target Docbase, and to allow the user to map supporting objects from the source to the target. Figure 5 illustrates the mapping concept.
The idea of mapping supporting and related objects is very powerful. In addition to making ETL operations much more efficient, it gives the administrator complete control over what is moved and guarantees there is no duplication of supporting objects. The mapping feature also allows the administrator to clean up the data as it is moved by streamlining the object model, security model, folder structure, etc. during ETL operations.
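The mapping concept can be illustrated with a short sketch. This is not Buldoser's implementation; the function name, the user names, and the data layout are all hypothetical. It shows only the idea that every supporting object referenced by the content must either exist in the target under the same name or be mapped to something that does.

```python
def find_unmapped(dependencies, target_objects, mappings):
    """Return the source dependencies that would not resolve in the target.

    dependencies   -- names of supporting objects the extracted content refers to
    target_objects -- names of supporting objects that exist in the target
    mappings       -- explicit source-name -> target-name overrides
    """
    unresolved = []
    for name in dependencies:
        target_name = mappings.get(name, name)  # unmapped names keep their value
        if target_name not in target_objects:
            unresolved.append(name)
    return unresolved


# Hypothetical document owners in the source and users in the target.
source_owners = ["jsmith", "mjones", "legacy_admin"]
target_users = {"jsmith", "mjones", "dm_admin"}
owner_mappings = {"legacy_admin": "dm_admin"}

print(find_unmapped(source_owners, target_users, {}))              # ['legacy_admin']
print(find_unmapped(source_owners, target_users, owner_mappings))  # []
```

Without a mapping, the owner that exists only in the source is reported before the load starts; with the mapping, every dependency resolves and nothing is duplicated in the target.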
ETL Process
Figure 6 below describes the overall process for moving content from Docbase to Docbase using Buldoser.
Check in content objects
To ensure all content is moved, any outstanding updates should be checked into the Docbase using the appropriate client application.
Mandate a content freeze
To keep updates from being lost during the process, all users should be notified that an ETL operation is scheduled to take place, and that updates should be postponed until after the operation is complete.
Move supporting objects
Any configuration objects that exist in the source Docbase should be moved to the destination Docbase before the content movement begins, as content depends on these objects existing in the target. Supporting objects include, but are not limited to:
- Object Types
- Users
- ACLs
- Alias Sets
- Lifecycles
- Folders
- XML Applications
- Formats
- Storage Locations
Create Batch Folder
A batch folder is simply the location to which content objects will be Extracted. This location is remembered by Buldoser as the batch name, so every effort should be made to give each batch a unique name. The content size should be calculated before the Extract step to ensure the batch folder has enough space.
Extract Content
Using Buldoser, Extract the objects to be moved to the batch folder. Buldoser will also Extract a list of the supporting objects that must exist in the target Docbase. See the section, Extracting Content from a Docbase for more information. Buldoser allows Extract operations to be stopped and restarted from the stopping point at a later time. If the Extract is stopped, proceed to the Finish Extract step. Once the Extract is complete, proceed to the Switch Docbase step.
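The space calculation recommended for the batch folder can be sketched as follows. This is an illustration under stated assumptions, not a Buldoser feature: the estimated content size must come from elsewhere (for example, from a query against the source Docbase), and the helper name and the 20% safety margin are arbitrary choices.

```python
import shutil


def has_room_for_batch(batch_folder, estimated_bytes, safety_factor=1.2):
    """Return True if the volume holding batch_folder can take the batch.

    estimated_bytes is the expected size of the Extracted content;
    safety_factor leaves headroom for metadata and log files.
    """
    free = shutil.disk_usage(batch_folder).free
    return free >= estimated_bytes * safety_factor


# Check whether roughly 5 GB of content would fit in the current directory.
print(has_room_for_batch(".", 5 * 1024**3))
```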
Finish Extract
If the Extract was stopped, Buldoser may restart the Extract from the stopping point. See the section, Extracting Content from a Docbase for more information. Once the Extract is complete, proceed to the Switch Docbase step.
Switch Docbase
After the Extract is finished, log in to the target Docbase to map the supporting objects. See the section, Using the Docbase Browser for more information.
Transform
Once logged into the target Docbase, the administrator performs the Transform step. Mapping data values identifies any potential issues due to supporting objects not existing in the target Docbase, and allows the administrator to resolve these issues before the load. This step can also create custom Folders, along with any ACLs and groups that correspond to the existing folder structure. This concept is core to Buldoser's ETL process; it is described in more detail in the section, The Buldoser Methodology. For more information on performing data mapping, see the section, Mapping Data Values.
Load Content
Using Buldoser, load the Extracted objects from the batch folder. Buldoser will use the mappings stored in the batch folder to transform the content objects during the load. See the section, Loading Content into a Docbase for more information. Buldoser allows loads to be stopped and restarted from the stopping point at a later time. If the load is stopped, proceed to the Finish Load step. If the load completes but has errors, proceed to the Resolve Errors step. If the load is not stopped and has no errors, Buldoser will proceed immediately to the Load Relationships step.
Finish Load
If the load was stopped, Buldoser may restart the load from the stopping point. See the section, Loading Content into a Docbase for more information. If the load completes but has errors, proceed to the Resolve Errors step. If the load is not stopped and has no errors, Buldoser will proceed immediately to the Load Relationships step.
Resolve Errors
Should errors exist in the load, Buldoser will set aside the objects in an error log to be reprocessed once the issue is fixed. Usually errors are due to Docbases being stopped or to incorrect or incomplete mappings. Review the log file for the load to determine the problem. Once the issue is resolved, proceed to the Reprocess Errors step.
Reprocess Errors
Buldoser allows for only the errors in a load to be reprocessed. This way the administrator does not have to remove and reload successful objects to try the load again. Buldoser attempts to always move forward during the load process to be as efficient as possible. See the section, Loading Content into a Docbase for more information on reprocessing load errors. Once all errors are reprocessed, proceed to the Load Relationships step.
Load Relationships
This step refers to the process of loading dm_relations and virtual document links, generically referred to as relationships. Relationships are loaded in a separate phase from the core Content Objects for two reasons:
- Both the parent and child objects in a relationship must exist before the relationship can be created. If relationships were created at the same time as the content objects, it would force the administrator to load the content objects in separate batches and in a particular order.
- If any errors exist in the initial phase, they can be resolved before relationships are created. If relationships were loaded immediately after, errors would be duplicated in both phases, creating twice the number of issues for the administrator to resolve.
If there are no errors during the initial load, the relationship load will start automatically immediately after the first phase. If relationships exist but were not created immediately after the first phase, use Buldoser to create the relationships. See the section, Loading Content into a Docbase for more information.
Buldoser allows relationship loads to be stopped and restarted from the stopping point at a later time. If the relationship load is stopped, proceed to the Finish Relationship Load step. If the load completes but has errors, proceed to the Resolve Relationship Errors step. If the load is not stopped and has no errors, proceed to the Test Content step.
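The reasoning behind the separate relationship phase can be sketched in a few lines. This is a simplified model, not Buldoser's code; the object IDs and the function name are hypothetical. A relationship is created only when both ends already exist in the target, and anything else is set aside so that only the failures need to be reprocessed later.

```python
def load_relationships(relationships, loaded_ids):
    """Split (parent_id, child_id) pairs into created and failed relationships."""
    created, errors = [], []
    for parent_id, child_id in relationships:
        if parent_id in loaded_ids and child_id in loaded_ids:
            created.append((parent_id, child_id))
        else:
            errors.append((parent_id, child_id))  # set aside for reprocessing
    return created, errors


# Phase 1 loaded three content objects; doc_x was never moved.
loaded = {"doc_a", "doc_b", "doc_c"}
relations = [("doc_a", "doc_b"), ("doc_a", "doc_x")]
created, errors = load_relationships(relations, loaded)

# After the missing child is moved, only the errors are reprocessed.
loaded.add("doc_x")
retried, remaining = load_relationships(errors, loaded)
print(created, retried, remaining)
```

Because the content phase finishes first, the order in which content objects were loaded never matters to the relationship phase.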
Finish Relationship Load
If the relationship load was stopped, Buldoser may restart the load from the stopping point. See the section, Loading Content into a Docbase for more information. If the load completes but has errors, proceed to the Resolve Relationship Errors step. If the load is not stopped and has no errors, proceed to the Test Content step.
Resolve Relationship Errors
Should errors exist in the relationship load, Buldoser will set aside the objects in an error log to be reprocessed once the issue is fixed. Usually errors are due to child objects not existing in the target Docbase. To resolve them, locate the child objects in the source Docbase and move them using Buldoser. Once the child objects are moved, proceed to the Reprocess Relationship Errors step.
Reprocess Relationship Errors
Buldoser allows for only the errors in a relationship load to be reprocessed. This way the administrator does not have to remove and reload successful objects to try the load again. See the section, Loading Content into a Docbase for more information on reprocessing relationship load errors. Once all errors are reprocessed, proceed to the Test Content step.
Test Content
After any load operation, the content should be tested in the target Docbase using the appropriate client application. Usually it is sufficient to test the first 100 objects or so, then randomly test 5-10% of the remaining population. If testing fails, re-examine the Extract and load logs for any errors, incorrect mappings, or incorrectly handled objects. If the batch was executed incorrectly, proceed to the Undo Load step. If the testing succeeds, the process is complete.
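The testing rule of thumb above can be turned into a reproducible selection. The sketch below is only an illustration; the function name, the object ID format, and the choice of 5% as the default fraction are assumptions.

```python
import random


def pick_test_sample(object_ids, head=100, fraction=0.05, seed=None):
    """Return the first `head` objects plus a random fraction of the rest."""
    head_part = list(object_ids[:head])
    rest = list(object_ids[head:])
    sample_size = round(len(rest) * fraction)
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return head_part + rng.sample(rest, sample_size)


loaded_ids = [f"obj{i:04d}" for i in range(1100)]
to_test = pick_test_sample(loaded_ids, seed=1)
print(len(to_test))  # 150: the first 100 plus 5% of the remaining 1000
```

Passing a seed lets two testers check the same objects; raising fraction to 0.10 covers the upper end of the suggested range.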
Undo Load
Buldoser provides the capability to remove any objects that are loaded. See the section, Loading Content into a Docbase for more information.
Starting an Extract
This section describes starting a new Extract. Extracts may be stopped and completed at a later time from the stopping point. To complete a stopped Extract, see the section, Finishing a Stopped Extract.
To start Extracting content from a Docbase, follow these steps:
1. Create a location on the file system to contain the Extracted content and attributes. Make sure the location has enough space.
2. Select New Extract from the Docbase Extract menu. The Extract dialog will appear.
3. Enter the location into the Extract Directory text box. Select the [] button to browse for the location.
4. Create a DQL statement to identify the objects to Extract. The DQL must be of the form
dm_document where
Note that the DQL statement may not contain the (all) keyword. Buldoser can also automatically create a DQL statement based on a cabinet, folder, or document that is selected from the Docbase Browser. If documents are selected, the DQL statement will collect object IDs as the Docbase is browsed. To close the Extract dialog but save the values in the DQL, click Save Settings.
5. Select Threads to indicate the number of threads that will be used for this Extract operation.
6. Select Extract Content to Extract all renditions along with the attributes.
   a. Select Local Extract to indicate that the Metadata and any content and renditions will be stored locally in the Extract Directory indicated in Step 3.
   b. Select Links from File Store to designate that content and renditions will not be Extracted; instead, a link relative to the content storage directory of the filestore will be saved in the XML metadata. This is referred to as Contentless Migration and can reduce the time of the overall operation if the renditions are significantly large.
   c. Choose the Metadata only option if contentless objects are desired. No content will be Extracted as a result of this option.
7. Select Extract Renditions to indicate whether the content has more than just page=0 renditions.
   a. Select Page 0 Renditions if all of the objects in the batch have associated dmr_content objects with page=0.
   b. Select All Renditions if any renditions in the batch have dmr_content objects associated with page=1. This will also preserve content metadata. Note that this imposes an additional load on the docbase during both the Extract and load operations. It is recommended to verify that there are dmr_content objects with page=1 in the batch prior to selecting this option. For more information on content metadata, refer to the Content Server Fundamentals Guide.
8. Select Extract Relationships to Extract any dm_relation objects that refer to the objects being Extracted as the parent in the relationship.
9. Select Extract Lifecycle Setting to Extract which Lifecycle is attached to the objects. Note: This does not Extract the Lifecycle itself. Lifecycle movement should be performed using Documentum's Application Builder.
10. Select Extract Virtual Docs to Extract virtual document relationships that refer to the objects being Extracted as the parent in the relationship.
11. Select Extract All Versions to Extract the entire version tree for each object. If this option is not selected, only the current version will be Extracted.
12. Select Extract Audit Trail to Extract any dm_audittrail objects associated with the content within this batch. Note that moved audit trail entries will be associated with the same user as in the existing system.
13. Click Save Settings to save the settings but not perform the Extract immediately.
14. Click Extract Dependencies to create a dependency mapping file only.
15. Click Cancel to close the dialog.
16. Click the Extract button to start Extracting immediately. First the Folders, Groups and ACLs for the batch will be Extracted. Upon completion the following status dialog will appear.
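The DQL restrictions from step 4 can be checked before a statement is pasted into the Extract dialog. The sketch below is not part of Buldoser. It assumes the required form is an object type followed by where and a qualification, which is one reading of the form shown in step 4, and the cabinet path and object name in the examples are hypothetical.

```python
import re


def check_extract_dql(statement):
    """Return a list of problems found in an Extract DQL statement."""
    problems = []
    # Assumed form: "<object type> where <qualification>".
    if not re.match(r"^\s*\w+\s+where\s+\S", statement, re.IGNORECASE):
        problems.append("expected '<object type> where <qualification>'")
    # The (all) keyword is not allowed in the statement.
    if re.search(r"\(\s*all\s*\)", statement, re.IGNORECASE):
        problems.append("the (all) keyword may not be used")
    return problems


print(check_extract_dql("dm_document where folder('/Engineering', descend)"))
print(check_extract_dql("dm_document (all) where object_name like 'Spec%'"))
```

The first statement passes both checks; the second is reported for using (all), which step 11 (Extract All Versions) replaces.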
The status dialog shows the following information:
- The operation being performed.
- The current batch location.
- The total number of objects anticipated to be processed by this operation. Note that this number counts the individual objects across a version tree, and that the progress bar is only updated after an entire version tree has been processed.
- Total Threads: The number of threads being used. Each thread has a Docbase session and a separate pool of resources used for processing objects.
- Max Objects/Thread: The number of objects that can be allocated to a single thread at a given time.
- The number of objects that remain to be processed for the entire operation.
- Status messages for the operation.
- The number of objects per second being processed collectively by all of the worker threads.
Using the < and > buttons, each individual thread's statistics can be analyzed:
1. Number Waiting: Number of objects that are waiting to be processed by this thread. Note that for a multiple-version load, one object accounts for the entire version tree.
2. Number Processed: Number of objects processed by this thread. This includes multiple versions of objects.
3. Number Failed: Number of objects that failed to be processed.
4. Average Processing Time: Average processing time for this thread.
The dialog also provides the following controls:
- Opens the log file associated with the currently selected thread.
- Stops the dealing of objects to each of the threads. Note that this does not stop the operation immediately; each thread still needs to process its queue of objects. Only when the Stopped message appears is the operation in a completed state.
- Close: Closes the HUD Dialog.
- Update Stats: Forces the recalculation of the statistics on the currently displayed thread.
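The stop behavior described above (dealing stops, but each thread still drains its own queue) can be modeled with bounded queues. This is a simplified sketch and not Buldoser's threading code: the round-robin dealing, the function name, and the stop_after parameter that simulates pressing stop are all assumptions made for illustration.

```python
import queue
import threading


def run_batch(object_ids, total_threads=2, max_objects_per_thread=3, stop_after=None):
    """Deal object IDs to worker threads; optionally stop dealing early."""
    queues = [queue.Queue(maxsize=max_objects_per_thread) for _ in range(total_threads)]
    processed = [[] for _ in range(total_threads)]

    def worker(index):
        while True:
            item = queues[index].get()
            if item is None:  # sentinel: nothing more will be dealt
                break
            processed[index].append(item)  # stand-in for the real work

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(total_threads)]
    for thread in threads:
        thread.start()

    dealt = 0
    for n, object_id in enumerate(object_ids):
        if stop_after is not None and n >= stop_after:
            break  # stop requested: no further objects are dealt
        queues[n % total_threads].put(object_id)  # blocks while that queue is full
        dealt += 1

    for q in queues:
        q.put(None)  # queued objects are still processed before the sentinel
    for thread in threads:
        thread.join()
    return dealt, processed


dealt, processed = run_batch(list(range(10)), stop_after=5)
print(dealt, processed)
```

Everything dealt before the stop is still processed, which is why the operation only counts as complete once every thread has emptied its queue.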
3. Enter the location of the Extract into the Extract Directory text box. The location may be browsed by clicking the [] button.
4. To close the dialog and save the entries, click Save Settings.
5. To close the dialog, click Cancel.
6. To resume the Extract, click Finish Extract.
3. Select Open from the File menu. A file browser will appear. Navigate to the Extract directory, select the dependency mapping file, and click Open.
If an Extract was just performed, Buldoser will present the option to open the dependency mapping file from the previous Extract.
4. After the file is opened, the Transform dialog will evaluate each dependency and identify whether the supporting object exists in the target Docbase. For nonexistent mappings, the value will display a red exclamation point, and the dependency type will display 2 exclamation points to make it easy to identify any potential errors during the load. Initially, the Transform dialog will show Object Type mappings.
5. To resolve a mapping for a missing dependency or to map a value from the source Docbase to a different value in the target Docbase, select the row or rows to be mapped. 6. Select a value from the Target Options drop-down. 7. Click the Map button to map the selected rows to the selected target option. 8. Click the Map All button to map all rows to the selected target option. 9. Click the Clear button to remove mappings from selected rows. 10. Click the Clear All button to remove mappings from all rows. 11. To change to another Dependency Type, select the type from the drop-down. The screen will update to show dependencies of the selected type. 12. After all mappings are completed, the changes can be saved by selected Save from the File menu. If changes are not saved and the dialog is closed, Buldoser will warn the user before closing the file.
Mapping Attributes
Each object type included in the batch can have the value of a target attribute mapped from a different source prior to loading. This feature is especially useful when consolidating multiple custom types into a single custom type. To map attributes, first map object types to their desired target value. For more information, see the steps outlined in the section Transform. Once the object types are mapped, select the button labeled Map Attributes.
In Buldoser version 3.3 and above, dm_owner and acl_domains are mapped using the dm_dbo alias if the owner_name attribute is the docbase owner. This makes the dependency file more portable across docbases by saving the step of mapping to the docbase owner.
To create an ACL, open the Transform dialog and select the ACL dependency type. Refer to the Transform section for more information. At this point, any number of ACLs can be selected from the table. Clicking the Create Selected ACLs button creates these ACLs in the target docbase.
Creating Folders
The Transform screen is also used to recreate an existing folder structure in the target repository. It is important to know that this function will recreate the entire security structure of a repository. Specifically, ACLs to which a folder refers will be created if they do not already exist in the repository. Additionally, any groups those ACLs reference will be created if they do not already exist in the docbase. The ACLs of this folder structure should be mapped as desired prior to performing this operation. For more information on how to map ACLs, refer to the section Transform.
Upon an Extract operation, all of the folder objects for a given batch are Extracted to the specified load directory. If a folder is of a custom folder type (i.e., dm_folder is the supertype), the custom metadata associated with that folder will also be Extracted.
Some things to keep in mind while Extracting folders:
• Map any desired folder types in the Transform window prior to creating folders.
• Map any desired folder object type and attribute mapping prior to creating folders.
• Map any desired folder ACL mapping prior to creating folders. It is important to know that any ACL mapping will supersede any existing ACL value of a folder from the source repository.
To create a Folder, open the Transform dialog and select the Folder dependency type. Refer to the Transform section for more information. At this point, any number of Folders can be selected from the table. Clicking the Create Selected Folders button creates these folders in the target docbase.
Select the desired folder(s) and click the Create Selected Folders button. Note: If a large number of folders is selected, processing may take some time. All of the folders created for this batch will be displayed in the table.
2. Enter the location of the Extract into the Load Directory text box, or browse using the [] button. If a ContentLess Extract is specified, the option will be presented to specify a Base Path. First map a network drive to the existing filestore of the source docbase. An example of this location is: C:\Documentum\data\sbdev\content_storage_01\00002417. Note that in this case the source docbase is sbdev and the docbase ID is 2417 (hex). This functionality saves time only if the renditions are large.
3. Optionally, a Processing Class can be specified. Implementing the interface IBuldoserTransform gives access to the full power of Java and the DFC classes to implement any custom functionality that might be required.
4. Select the Synchronization Setting to tell Buldoser how to react if the CURRENT version of an object being loaded matches a CURRENT object that already exists in the target system.
First, the r_object_id is compared against the r_object_id of the object in the batch. This case would only occur if the extract and load occurred within the same docbase. Second, the buldoser_audit_trail object is checked to determine whether the source r_object_id of this object has been loaded. If so, the object is considered loaded. Lastly, the object_name and folder_path attributes are used to determine whether this object has already been loaded. The Synchronization Setting is particularly useful for running scheduled jobs against a production docbase. Run nightly, such a job can be used to synchronize good content to a development system.
   a. Select ALWAYS Create Version Tree to create objects regardless of existing objects (default). No check is made for an existing object.
   b. Select CREATE If Previous Object Does Not Exist to only load this version tree if the CURRENT version of the batch is not found.
   c. Select REPLACE If Previous Object Exists to delete the existing version tree if the CURRENT version of the object is found.
   d. Select CURRENT versions only. If Previous Object Exists Append Version Tree to append the existing version tree with only the CURRENT version of the object. In other cases only the CURRENT version of the object will get loaded.
5. Select the Lifecycle Promotion setting to indicate how many lifecycle promotions need to occur after the lifecycle is attached to this object.
   a. Select No Lifecycle Promotion to only attach the lifecycle and do no promotion.
   b. Select Promote Desired Number of Cycles to promote all content loaded with lifecycles a set number of times.
      i. Select the number of times to promote any objects that are loaded with Lifecycles in the Promote Cycles drop-down.
   c. Select Promote Content to Previous State to run the promote command on the content the same number of times as the value of the r_current_state attribute. Note: Not all lifecycles have an r_current_state that starts at zero. Check the i_state_no and the state_name attributes of the dm_policy objects in the source and target docbase prior to running this operation.
   d. Select Set Previous Lifecycle State to set the r_current_state attribute of the content after attaching the lifecycle. This is useful in Web Content Management Systems where the promotion through lifecycle states takes a great deal of time. In this case, simply run the Site Publishing Job after the load to take full advantage of this feature.
6. Select the number of threads to use for the load from the Threads drop-down. Selecting multiple threads helps the content to load faster, but the benefit is dependent on the hardware. Each thread will create an additional Docbase Session, which will impose an additional load on the Content Server. It is best to start with one or two
threads for initial loads, then increase the number of threads for subsequent loads after monitoring network, memory, and processor resources. Additionally, the default number of sessions a client can obtain is 10. Set the MAX_SESSION_COUNT attribute in the dmcl.ini file of the client running Buldoser to modify this setting.
7. Check the Verbose box to create a very detailed log file for the load. This option can negatively impact performance for very large loads. If this option is not selected, only errors will be written to the log file.
8. Check the Trace box to have Buldoser write a DMCL-level trace for every 1000th object that is loaded. This option may be used for diagnostic purposes.
9. Check the Chart Performance box to create a tab-delimited data file containing relevant performance information on the load in milliseconds. A file is created for each thread and is named chart<thread number>.txt. By opening a chart file using Microsoft Excel, a line graph may be created to display performance trends over the life of the load. This option is typically used during a small trial load for diagnostic purposes. For large loads this option may degrade performance.
10. Check the Auto Create Formats box to automatically create formats if they don't exist in the destination Docbase. Buldoser will create a format with the correct name only; an administrator must fill in the remaining attributes once the load is complete. It is highly recommended to move formats with a Documentum-provided tool or script outside of Buldoser before performing ETL operations. This option is provided as a fallback mechanism only.
11. Check the Auto Set Owner box to automatically set the owner_name attribute of each object to the Docbase owner of the target Docbase. Having the Docbase owner own content is a standard convention. This allows administrators to quickly map this attribute without using the Transform dialog.
Click Save Settings to save the settings but not perform the load immediately. Click Cancel to close the dialog. Click the Load button to start loading immediately. An instance of the following status dialog will appear.
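The three existence checks and four options described under the Synchronization Setting in step 4 can be read as a small decision procedure. The Python sketch below is one interpretation under stated assumptions, not Buldoser's code: the function names, the dictionary-based objects, the sample IDs, and the action strings are all invented for illustration, and the behavior of REPLACE when no previous object exists is assumed to be a normal load.

```python
from enum import Enum


class SyncSetting(Enum):
    ALWAYS = "ALWAYS Create Version Tree"
    CREATE = "CREATE If Previous Object Does Not Exist"
    REPLACE = "REPLACE If Previous Object Exists"
    CURRENT_ONLY = "CURRENT versions only"


def already_loaded(obj, target_ids, audited_source_ids, target_names):
    """Apply the three checks in the order the manual lists them."""
    # 1. Same r_object_id in the target (extract and load in the same docbase).
    if obj["r_object_id"] in target_ids:
        return True
    # 2. The audit trail records this source r_object_id as loaded.
    if obj["r_object_id"] in audited_source_ids:
        return True
    # 3. Fall back to matching on object_name and folder_path.
    return (obj["object_name"], obj["folder_path"]) in target_names


def plan_action(setting, exists):
    """Return a description of what the load would do (labels are illustrative)."""
    if setting is SyncSetting.ALWAYS:
        return "load version tree"          # existence is never checked
    if setting is SyncSetting.CREATE:
        return "skip" if exists else "load version tree"
    if setting is SyncSetting.REPLACE:
        # Assumption: with no previous object, the tree is simply loaded.
        return "delete existing tree, then load version tree" if exists else "load version tree"
    return "append CURRENT version to existing tree" if exists else "load CURRENT version only"
```

Note that with ALWAYS the existence check is skipped entirely, which is why plan_action ignores the exists flag for that setting.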
Indicates the operation being performed.
Current Batch Location.
Total number of objects anticipated to be processed with this operation. An important distinction to make here is that this number represents the individual number of objects across a version tree. Additionally, this progress bar is only updated after the processing of an entire version tree.
Total Threads: Number of threads being used. Each thread has a Docbase session and a separate pool of resources used for processing objects.
Max Objects/Thread: Indicates the number of objects that can be allocated to be processed by a single thread at a given time.
Number of objects that remain to be processed for the entire operation.
Status messages for the operation.
Number of objects/second being processed collectively by all of the worker threads.
Using the < and > buttons, each individual thread's statistics can be analyzed.
5. Number Waiting: Number of objects that are waiting to be processed by this thread. Note that for a multiple-version load, one waiting object accounts for an entire version tree.
6. Number Processed: Number of objects processed by this thread. This includes multiple versions of objects.
7. Number Failed: Number of objects that failed to be processed.
8. Average Processing Time: Average processing time for this thread.
View Log: Will open the log file associated with the currently selected thread.
Stop: Will stop the dealing of objects to each of the threads. Note that this will not stop the operation; each thread will still need to process its queue of objects. Only when the Stopped message appears is the operation in a completed state.
Close: Closes the HUD dialog.
Update Stats: Will force the recalculation of the statistics on the currently displayed thread.
3. Enter the location of the load into the Load Directory text box. The location may be browsed by clicking the [] button.
4. To close the dialog and save the entries, click Save Settings.
5. To close the dialog, click Cancel.
6. To resume the load, click Finish Load.
7. To modify the values for promote cycles, verbose logging, tracing, auto ACL creation, auto Format creation, auto Folder creation, or auto set owner, click the Modify Load Settings button. The dialog will change to display the above options.
3. Enter the location of the load into the Load Directory text box. The location may be browsed by clicking the [] button.
4. To close the dialog and save the entries, click Save Settings.
5. To close the dialog, click Cancel.
6. To resume the load, click Reprocess.
7. To modify the values for promote cycles, verbose logging, tracing, auto ACL creation, auto Format creation, auto Folder creation, or auto set owner, click the Modify Load Settings button. The dialog will change to display the above options.
2. Enter the location of the load into the Load Directory text box. The location may be browsed by clicking the [] button.
3. To close the dialog and save the entries, click Save Settings.
4. To close the dialog, click Cancel.
5. To start the relationship load, click Create Relationships. Relationship loads are single-threaded only, so only one instance of the following status dialog will appear.
Title Bar                    Indicates the percent completion and thread number (which is always zero for Relationship loads).
Operation Type               Indicates the operation type. For a Relationship Load this is always BuldoserRelationshipLoad.
Load Location                Indicates the Batch location where this operation is being performed.
Thread Number                Indicates the thread number (always zero for relationship loads).
Progress                     Indicates the number completed.
Messages                     Describes the current operation.
Successful Relationships     Indicates the number of successful relationships.
Failed Relationships         Indicates the number of failed relationships.
Begin Time                   Timestamp for when the load began.
Average Load Time            Indicates the running average time per relationship in milliseconds.
Projected Completion Time    Indicates when the relationship load should complete, calculated by extrapolation.
View Log                     Opens the log file.
Stop                         Stops the load. To restart the load, see the section, Finishing a Stopped Relationship Load.
Close                        Only available after the load is complete or has been stopped.
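The manual does not say how the Projected Completion Time is extrapolated. One simple reading, sketched below in Python, multiplies the running average time per relationship by the number of relationships still to be processed. The function names, and the assumption that the total number of relationships is known in advance, are illustrative only.

```python
from datetime import datetime, timedelta


def average_load_ms(begin, now, completed):
    """Running average time per relationship, in milliseconds."""
    return (now - begin).total_seconds() * 1000 / completed


def projected_completion(begin, now, completed, total):
    """Linear extrapolation: remaining items times the running average."""
    if completed <= 0:
        return None  # nothing processed yet, so there is nothing to extrapolate from
    remaining = total - completed
    return now + timedelta(milliseconds=average_load_ms(begin, now, completed) * remaining)
```

For example, if 100 of 400 relationships have completed in the first 10 seconds, the average is 100 ms per relationship and the projection is 30 seconds after the current time.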
3. Enter the location of the load into the Load Directory text box. The location may be browsed by clicking the [] button.
4. To close the dialog and save the entries, click Save Settings.
5. To close the dialog, click Cancel.
6. To resume the load, click Finish Relationships.
3. Enter the location of the load into the Load Directory text box. The location may be browsed by clicking the [] button.
4. To close the dialog and save the entries, click Save Settings.
5. To close the dialog, click Cancel.
6. To resume the load, click Reprocess Relationship Errors.
Undoing a Load
Undo completely removes the previously loaded objects, their relationships, and the audit records that track the loading process. Undo additionally removes any automatically created Folders, ACLs, and Groups. Undo does not remove automatically created formats.
To undo a load, follow these steps:
1. Identify the location on the file system that contained the loaded objects. Buldoser uses this location in its audit trail to identify objects loaded from the batch.
2. Select Undo Load from the Docbase Load menu. The Undo dialog will appear.
3. Enter the location of the load into the Load Directory text box. The location may be browsed by clicking the [] button.
4. To close the dialog and save the entries, click Save Settings.
5. To close the dialog, click Cancel.
6. To undo the load, click Undo. An instance of the following status dialog will appear. Undo operations run single-threaded only.
Item              Description
Title Bar         Indicates the percent completion and thread number (which is always zero for Undo).
Operation Type    Indicates that this is a BuldoserUndo.
Load Location     Indicates the Batch location where this operation is being performed.
Thread Number     Indicates the thread number (always zero for Undo).
Progress          Indicates the number completed.
Messages          Describes the current operation.
Stop              Stops the undo.
Close             Only available after the undo is complete or has been stopped.
Scheduling Overview
Buldoser Scheduler provides the capability to schedule Docbase Extracts and Docbase Loads. Scheduler has many applications, including:
• Scheduling Extracts or loads to be performed during off hours when users are not accessing the source or target Docbase;
• Regularly monitoring a location for content that is generated by another system for import into a Docbase;
• Creating a regular backup of selected files while the Docbase is running; and
• Replicating content from a source to a target Docbase.
Scheduler is implemented as a Documentum job and requires that Buldoser be installed on the Content Server. Scheduler uses exactly the same installation procedure as normal Buldoser operation. For more information on installation, see the Buldoser Installation Guide and Release Notes. The operation of Buldoser Extract and Load Jobs is described in further detail in the following paragraphs.
Extract Jobs
When Extract Jobs are scheduled, they are configured exactly the same as a normal Buldoser Docbase Extract, with the addition of a job name, run frequency, and run mode (e.g., minutes, hours, days). Normally, Extracts will run the DQL statement and Extract the objects directly to the extract location. When Documentum runs an Extract Job, Buldoser will modify the Extract by:
• Creating a sub folder underneath the Extract location for the Extract. The name of the folder will follow the convention Backup yyyy-mm-dd hh-mm-ss. An example of a sub folder is Backup 2005-04-21 04:00:00. Objects will be Extracted to this location instead of the configured location.
• Modifying the DQL statement to only Extract objects that have changed since the last execution of the job. An example of a modified DQL statement is: select distinct r_object_id from dm_document where folder('/Demo',descend) and r_modify_date>Date('04/21/2005').
• Creating an audit trail entry that records the date and time of the execution. This audit entry is used for modifying the DQL above.
These changes allow Buldoser to run incremental Extracts of objects that are identified by the DQL statement for the job.
Load Jobs
When Load Jobs are scheduled, they are configured exactly the same as a normal Buldoser Docbase Load, with the addition of a job name, run frequency, and run mode (e.g., minutes, hours, days). Normally, Loads will load objects that are found in the configured load location. When Documentum runs a Load Job, Buldoser will modify the load by:
• Looking for any new batches that have been created in the job's load location since the last time the job ran. Batches are sub folders with the naming convention Backup yyyy-mm-dd hh-mm-ss. An example of a sub folder is Backup 2005-04-21 04:00:00.
• Copying a master mapping dependency file to the sub folder if it exists.
• Creating an audit trail entry that records the name of the batch that was executed.
These changes allow Buldoser to poll a location for generated or Extracted content, as well as to keep from duplicating loaded batches.
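The job behavior described above (a dated batch sub folder, a DQL statement narrowed by the last run date, and a Load Job that picks up only batches it has not seen) can be sketched in a few lines of Python. This is a hedged illustration, not Buldoser's code: the function names are invented, the folder name follows the stated hh-mm-ss convention rather than the colon-separated example, and the set of already-loaded batch names stands in for the audit trail.

```python
import re
from datetime import datetime
from pathlib import Path

BATCH_PREFIX = "Backup "
BATCH_FORMAT = "%Y-%m-%d %H-%M-%S"   # the stated yyyy-mm-dd hh-mm-ss convention


def batch_folder_name(run_time):
    """Name of the sub folder created for one scheduled Extract."""
    return BATCH_PREFIX + run_time.strftime(BATCH_FORMAT)


def incremental_dql(base_dql, last_run):
    """Narrow the job's DQL to objects modified since the last recorded run."""
    if last_run is None:
        return base_dql                      # first run: extract everything
    stamp = last_run.strftime("%m/%d/%Y")
    has_where = re.search(r"\bwhere\b", base_dql, re.IGNORECASE)
    joiner = " and " if has_where else " where "
    return f"{base_dql}{joiner}r_modify_date>Date('{stamp}')"


def new_batches(load_location, already_loaded):
    """Batch sub folders not yet recorded as loaded, oldest first."""
    found = [p for p in Path(load_location).iterdir()
             if p.is_dir()
             and p.name.startswith(BATCH_PREFIX)
             and p.name not in already_loaded]
    # Zero-padded timestamps sort chronologically when sorted as text.
    return sorted(found, key=lambda p: p.name)
```

Because the timestamp in the folder name is zero-padded, sorting the names as text is enough to process batches in the order they were extracted.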
3. For Extract Directory through Extract Audit Trail, configure the dialog in the same way as a normal Docbase Extract. For more information on how to configure a Docbase Extract, see the section, Starting an Extract.
4. Enter a name for the job in the Job Name text box.
5. Enter how often the job should run in the Run Frequency text box. The value should be an integer.
6. Select the units for the run frequency in the Run Mode drop-down.
7. Click the Cancel button to close the dialog.
8. Click the Create Extract Job button to create the job in the Docbase. The job will be grouped under the Buldoser category.
3. For Load Directory through Auto Set Owner, configure the dialog in the same way as a normal Docbase Load. For more information on how to configure a Docbase Load, see the section, Starting a New Load.
4. Enter a name for the job in the Job Name text box.
5. Enter how often the job should run in the Run Frequency text box. The value should be an integer.
6. Select the units for the run frequency in the Run Mode drop-down.
7. Click the Cancel button to close the dialog.
8. Click the Create Load Job button to create the job in the Docbase. The job will be grouped under the Buldoser category.
2. Select the job to be edited from the list.
3. Click the Edit button.
4. Follow the same instructions as creating an Extract or Load Job within the sections Scheduling an Extract Job and Scheduling a Load Job, respectively.
3. Test the mapping to see if it's correct.
The following paragraphs describe the process of creating a mapping.
The Object View
After connecting to the data source, the first step in creating a load is to identify the objects to be moved. Buldoser accomplishes this via an SQL statement that queries the primary table(s) that contain the objects. This query statement is known as the Object View. The Object View should result in a list of objects, one row per object, in versioning order. Versioning order is defined as oldest version to latest version. Usually this is the creation order of the objects. For non-versioned objects, the order is not relevant. Buldoser requires the order to be correct to make sure version trees are correctly re-created during the load. When creating an SQL statement, be sure to use the format supported by the selected driver. If the data source contains multiple tables, a primary key is required in the Object View. This allows any other tables that are registered with Buldoser to link to the Object View.
Supporting Views
Supporting Views are other tables or views within the data source that contain information that is necessary to build Documentum objects. These views are usually related to the Object View in a 1:M relationship. Oftentimes these tables contain all the renditions of the object, repeating attribute values, or folder links. If the data source only has a single table, there will be no supporting views. These views must have a foreign key that establishes a relationship to the Object View. If the foreign key exists within the Object View, a view must be created within the database that joins the two tables together. See the diagram below for an example of a conceptual data model.
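As a concrete illustration of an Object View and a Supporting View, the Python sketch below builds a tiny in-memory SQLite database. Everything in it is invented for the example (the documents and renditions tables, their columns, and the sample rows); it only shows the shape of the two queries: one row per object in versioning order with a primary key, and a 1:M lookup through a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (
        doc_id      INTEGER PRIMARY KEY,   -- primary key of the Object View
        title       TEXT,
        created     TEXT,                  -- creation order = versioning order
        prev_doc_id INTEGER                -- previous version, NULL at the base
    );
    CREATE TABLE renditions (
        rendition_id INTEGER PRIMARY KEY,
        doc_id       INTEGER REFERENCES documents(doc_id),  -- foreign key
        file_path    TEXT
    );
    INSERT INTO documents VALUES (1, 'Policy v1', '2004-01-10', NULL);
    INSERT INTO documents VALUES (2, 'Policy v2', '2004-03-02', 1);
    INSERT INTO documents VALUES (3, 'Memo',      '2004-02-15', NULL);
    INSERT INTO renditions VALUES (1, 1, 'policy/v1.doc');
    INSERT INTO renditions VALUES (2, 1, 'policy/v1.pdf');
    INSERT INTO renditions VALUES (3, 2, 'policy/v2.doc');
""")

# Object View: one row per object, oldest version first.
OBJECT_VIEW_SQL = """
    SELECT doc_id, title FROM documents ORDER BY created, doc_id
"""

# Supporting View: zero or more rows per object, linked by the foreign key.
SUPPORTING_VIEW_SQL = """
    SELECT file_path FROM renditions WHERE doc_id = ? ORDER BY rendition_id
"""


def load_plan(connection):
    """Pair each Object View row with its Supporting View rows."""
    plan = []
    for doc_id, title in connection.execute(OBJECT_VIEW_SQL).fetchall():
        files = [row[0] for row in connection.execute(SUPPORTING_VIEW_SQL, (doc_id,))]
        plan.append((doc_id, title, files))
    return plan
```

The ORDER BY clause is what guarantees that 'Policy v1' is seen before 'Policy v2', which is the property the manual requires for version trees to be re-created correctly.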
In the above scenario, a view could also be created for the Object View that joins data from the Customer table, since the relationship is 1:1. Understanding which table represents the Object View and which tables represent Supporting Views usually requires some knowledge of the tables and relationships in the data source. Refer to the design documentation or product literature to determine which tables to use.
Object Configurations
Once the source data model is understood and registered with Buldoser, the tables and columns can be mapped to Documentum object attributes, security, folders, and content renditions. These mappings are created by object type in Object Configurations. Buldoser allows for more than one Object Configuration for a particular load; the configuration to apply will be selected on an object-by-object basis by evaluating its Applies When criteria. Since configurations may be similar and may take some effort
to create, Buldoser allows for the copying of a configuration to speed the mapping process and reduce errors. Object Configurations consist of Attribute Configurations, Content Configuration, Folder Configuration, Security Configuration, Version Configuration, and Pre- and Post-Processing Configuration. Each topic is described in more detail below.
Inline Data Transformation
In addition to loading data from a database into Documentum, Buldoser also provides the ability to transform data during the load. For instance, suppose a column contains the status of a particular document in the legacy system. When that column is moved to an attribute, the user wishes to change the value of the status attribute to a new value that reflects a difference in business rules. This example is illustrated in the table below.
Status Column (old value)    Status Attribute (new value)
Work in Process              WIP
Staged                       Staging
Approved                     Active
Archived                     Expired
Buldoser calls this process Inline Data Transformation. It is offered in Attribute Configuration, Content Configuration (for formats), Folder Configuration, and Security Configuration. Inline Data Transformation is also useful for deriving configurations other than attribute data. For instance, suppose a data source doesn't have the concept of an Access Control List (ACL), but the administrator wishes to determine the ACL for a particular object based on a column called Department. By mapping the Department column to the ACL Name in Security Configuration, the administrator can use the Department to drive the ACL to be used. This example is illustrated in the table below.
Department Column (old value)    ACL (new value)
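At its core, an Inline Data Transformation is a lookup from old values to new values. The Python sketch below uses the status values from the table above; the function name and the pass-through behavior for unmapped values are assumptions made for the illustration, and the department and ACL names are hypothetical because the manual's Department table rows are not reproduced here.

```python
# Status mapping taken from the example table above.
STATUS_MAP = {
    "Work in Process": "WIP",
    "Staged": "Staging",
    "Approved": "Active",
    "Archived": "Expired",
}

# Hypothetical department-to-ACL mapping, invented for this example.
DEPARTMENT_ACL_MAP = {
    "Finance": "acl_finance",
    "Legal": "acl_legal",
}


def inline_transform(value, mapping, default=None):
    """Return the mapped value.

    Assumption: an unmapped value passes through unchanged unless a
    default is supplied. The manual does not state the actual behavior.
    """
    if value in mapping:
        return mapping[value]
    return value if default is None else default
```

A call such as inline_transform("Approved", STATUS_MAP) yields "Active", and supplying a default, for instance a catch-all ACL name, gives unmapped departments a predictable result.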
Attribute Configuration tells Buldoser how to assign values to the attributes of an Object Type for a specific Object Configuration. There are several options depending upon whether the attribute to be assigned is single-valued or repeating. For single-valued attributes, the options are:
• No Configuration: no values will be assigned to the object.
• Static Value: a literal value will be assigned to all objects to which the current configuration is applied.
• Map Column: a column is mapped to the attribute. Inline Data Transformation can be applied to change the value from the database before assignment to the attribute.
For repeating-valued attributes, the same options are available as for single-valued attributes, plus:
• Map Multiple Columns: multiple columns are mapped to the attribute. The first value from each column will be assigned to the attribute. This option is usually used for single-table data sources where the administrator has created multiple columns to contain repeating attribute values.
• Map Column with Delimiter: a single column is mapped to the attribute with the addition of a delimiter. The delimiter is used to parse out multiple values that are stored in the column. This option is usually used for single-table data sources where the administrator has created a single column to contain repeating attribute values and has separated multiple values with a delimiter.
For Date type attributes, values from the source must be in mm/dd/yyyy hh:mm:ss AM format. If the data type of the source column is Date, then the date will be formatted automatically.
Content Configuration
Content Configuration tells Buldoser how to add content renditions to an object. To add a content rendition, Buldoser needs the full physical location of the file and the Documentum format. The physical location can be broken into a base file path and a
relative file path, since most Content Management Systems, whether homegrown or purchased, store content relative to a base location. Buldoser provides two options for configuring physical location and format:
• Identify one-to-many columns that contain file locations and automatically determine format: This option is usually used for single-table data sources where the administrator has created multiple columns to contain file locations. Buldoser will pull the file extension from the file and look up the Documentum format. For formats with the same file extension, Buldoser will use the last file extension. If a specific file extension is desired, Buldoser allows the administrator to configure which format to use.
• Identify a column for location and a column for format and map formats: This option is usually used for multi-table data sources where a separate view contains the physical location and format for the rendition. Buldoser provides Inline Data Transformation to map formats from a column to Documentum formats.
Folder Configuration
Folder Configuration tells Buldoser which folders will contain an object. Buldoser provides two options for configuring folders:
• Select a fixed folder for all objects: The simple option is to put all objects in the same folder location. A folder may be selected or typed in. If the folder doesn't exist, Buldoser provides the option to create it on-the-fly.
• Identify a column that indicates folder location: In this case, a column from the data source either contains a folder path or indicates a folder path. Buldoser provides Inline Data Transformation to map column values to a particular Documentum folder path if there isn't a direct mapping.
If folders are not configured, all objects will be located in the personal cabinet of the currently logged-in user.
Security Configuration
Security Configuration tells Buldoser which ACL will be applied to an object.
Buldoser provides two options for configuring security:
• Select a fixed ACL for all objects: The simple option is to assign the same ACL to all objects.
• Identify a column that indicates the ACL: In this case, a column from the data source either contains an ACL or indicates an ACL. Buldoser provides Inline Data Transformation to map column values to a particular Documentum ACL if there isn't a direct mapping.
Copyright 2003 - 2005 - Crown Partners, LLC
If Security is not configured, ACLs will be applied with the default configuration of the Content Server. See the Documentum Content Server Administrator's Manual for more information on Content Server configuration.
Versioning Configuration
Versioning configuration is a required step for data sources that implement versioning; otherwise it is optional. There are three items to configure for versioning:
• Previous version column: The column in the Object View that contains the value of the Primary Key for the previous version must be identified.
• Version Label column: A column that contains version labels may be optionally identified.
• Base of the version tree: A method must be selected that tells Buldoser how to identify the base of a version tree. Configuring this item allows Buldoser to implement multi-threading. Three options are available:
   o Previous version column = Primary Key: With this option, any time the Previous Version column's value equals the Primary Key, Buldoser recognizes the base of a version tree.
   o Previous version column = Null: With this option, any time the Previous Version column's value is Null, Buldoser recognizes the base of a version tree.
   o Previous version column = Literal Value: With this option, any time the Previous Version column's value equals a literal string entered by the administrator, Buldoser recognizes the base of a version tree.
Pre- and Post-Processing
For those scenarios that require special processing or validation, Buldoser provides the capability to execute external Java processing methods before and after each object is loaded. The custom class file must be identified within the class path so that Buldoser can use it at run-time. These methods are executed from a provided Java interface named IBuldoserTransform. A set of JavaDocs is provided for use when implementing the interface, and can be found at <Buldoser Install Location>\docs\apidocs\index.html.
Full access is given to the object in memory before the load, and a handle to the IDfSysObject interface is provided after the object is loaded. When Buldoser executes the method, it will check for a returned value. If one is found, it will fail the object and write the message to the log file.
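The contract described above (a hook that returns a message causes the object to fail and the message to be logged) can be illustrated independently of the real interface. The actual extension point is the Java interface IBuldoserTransform, whose method names and signatures are documented in the JavaDocs and are not reproduced here. The Python sketch below is only a language-neutral model of the contract; every name in it is hypothetical.

```python
from typing import Callable, Optional

# A hook inspects an object and returns an error message, or None if all is well.
Hook = Callable[[dict], Optional[str]]


def load_object(obj: dict, pre_process: Hook, post_process: Hook, log: list) -> bool:
    """Model of the contract: a returned message fails the object and is logged."""
    message = pre_process(obj)              # runs on the in-memory object
    if message:
        log.append(f"FAILED {obj['object_name']}: {message}")
        return False

    loaded = dict(obj, loaded=True)         # stand-in for the actual load

    message = post_process(loaded)          # runs on the loaded object
    if message:
        log.append(f"FAILED {obj['object_name']}: {message}")
        return False
    return True


def require_title(obj: dict) -> Optional[str]:
    """Example validation hook: reject objects without a title."""
    return None if obj.get("title") else "title is required"


def no_op(obj: dict) -> Optional[str]:
    """Example hook that never fails an object."""
    return None
```

In this model an empty or missing return value means success, which mirrors the manual's statement that the object fails only if a returned value is found.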
Multi-threaded Loading Algorithm
Buldoser incorporates a multi-threaded load algorithm that is modeled after a card dealer dealing cards to players. A controlling (dealer) thread first makes the one and only connection to the source database. Worker threads (players) are launched, which make connections to the target Docbase. Each player monitors its own queue of objects to load. The dealer iterates through the Object View, gathering all the data required to load a particular object, and pushes it onto the queue of the player with the fewest objects. When all the queues are full, the dealer will wait until there is availability. When all the queues are empty, the players will wait on the dealer, although this rarely happens.
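The dealer-and-players arrangement described above can be sketched with standard Python threads and bounded queues. This is a minimal illustration of the pattern under stated assumptions, not Buldoser's implementation: the function and parameter names are invented, load_one stands in for loading a single object (or version tree) into the target, and the dealer here waits on the least-loaded queue rather than on all queues at once.

```python
import queue
import threading


def deal(object_view, worker_count, max_per_queue, load_one):
    """Deal items to per-worker queues; return what each worker loaded."""
    queues = [queue.Queue(maxsize=max_per_queue) for _ in range(worker_count)]
    loaded = [[] for _ in range(worker_count)]
    stop = object()                          # sentinel telling a player to finish

    def player(index):
        # Each player drains only its own queue, blocking while it is empty.
        while True:
            item = queues[index].get()
            if item is stop:
                break
            loaded[index].append(load_one(item))

    threads = [threading.Thread(target=player, args=(i,)) for i in range(worker_count)]
    for thread in threads:
        thread.start()

    # The dealer is the only reader of the source.
    for item in object_view:
        least_loaded = min(queues, key=lambda q: q.qsize())
        least_loaded.put(item)               # blocks while that queue is full

    for q in queues:
        q.put(stop)
    for thread in threads:
        thread.join()
    return loaded
```

The bounded queue size plays the role of the Max Objects/Thread setting: it caps how far the dealer can run ahead of the players.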
7. Enter the log path location into the Log Path text box, or browse for the location by clicking the [] button.
8. Click Cancel to close the Wizard and stop configuration without saving.
9. Click Next to continue. Step 2 of the Wizard will appear. If a configuration already exists at this location, Buldoser will update the Wizard with the existing settings.
10. 11.
12. 13.
14.
Enter or select the class name of the driver in the Driver drop-down. The value for the ODBC driver and the Oracle driver are provided. Enter the connection URL for the data source in the Data Source text box. For ODBC connections, the value is formatted as jdbc:odbc:<name of the ODBC connection>. Enter a user name for the connection in the User Name text box, if applicable. Enter a password for the connection in the Password text box, if applicable. Buldoser will not store the password in the configuration file so it must be entered every time. Click the Test button to test the connection information. If Buldoser is successful connecting, the Next button will be enabled. If not, Buldoser will
63
return an error. Continue modifying the connection information until the test is successful. 15. Click Cancel to stop configuring and close the Wizard without saving. 16. Click Previous to go back to Step 1. 17. Click Next to proceed to Step 3.
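For readers who want to check connection details outside the Wizard, the following is a minimal sketch of the same kind of test using the standard java.sql API. The class and method names are hypothetical, and how Buldoser performs its own test is not documented here. Note also that the JDBC-ODBC bridge is no longer bundled with Java 8 and later, so a jdbc:odbc: URL requires an older runtime or a separate driver.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

class ConnectionTestSketch {
    /** Builds an ODBC-style URL in the format given for the Data Source text box. */
    static String odbcUrl(String odbcConnectionName) {
        return "jdbc:odbc:" + odbcConnectionName;
    }

    /** Returns true if a connection can be opened (and is then closed), false on any SQL error. */
    static boolean canConnect(String url, String user, String password) {
        try (Connection connection = DriverManager.getConnection(url, user, password)) {
            return connection != null;
        } catch (SQLException e) {
            // Covers a missing driver, a bad URL, and rejected credentials alike.
            return false;
        }
    }
}
```

A call such as canConnect(odbcUrl("legacy_db"), user, password), where legacy_db is a made-up connection name, returns false until the driver, URL, and credentials are all valid.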
18. Enter a name for the Object View in the Object View Name text box. This value is user-defined and can be anything that represents the data set.
19. Enter the SQL statement that identifies the objects to be moved into the SQL text box. The SQL must be formatted correctly for the driver entered in Step 2.
20. Click the Test SQL button to validate the SQL statement. If Buldoser executes the SQL successfully, the Next button and Primary Key drop-down will be enabled. If not, Buldoser will return an error. Continue modifying the SQL statement until the test is successful.
21. Select the Primary Key column for the Object View in the Primary Key drop-down.
22. Click Cancel to stop configuring and close the Wizard without saving.
23. Click Previous to go back to Step 2.
24. Click Next to proceed to Step 4.
25. Step 4 only applies to multi-table data sources. For single-table data sources such as Excel spreadsheets, click Next to proceed to Step 5.
26. For each Supporting View, select a View from the View Name drop-down and the column that links the view to the Object View from the Linking Column drop-down, then click Add.
27. To remove a Supporting View that has been added, select the rows to remove and click Remove.
28. Click Cancel to stop configuring and close the Wizard without saving.
29. Click Previous to go back to Step 3.
30. Click Next to proceed to Step 5.
31. Click Add to create a new Object Configuration.
32. Click Edit to edit a selected Object Configuration.
33. Click Copy to copy a selected Object Configuration.
34. Click Remove to remove a selected Object Configuration.
35. If Add or Edit was clicked, the Object Configuration will appear.
36. Create filters in the Configuration Applies When table to make the configuration apply to only certain rows in the Object View.
a. Click Add to create a new row.
b. Click Remove to remove a selected row.
c. For each row, select an Object View column from the Source Column drop-down, select a comparison operator from the Is drop-down, and enter a value for the Source Column in the Value text box.
37. Select an Object Type from the Object Type drop-down. If the Object Type for the configuration is changed after attributes have been configured, the attribute configuration will be lost.
38. Click Configure Attributes. The Attribute Configuration dialog will appear.
39. Select the attribute to be configured, and then select a Configuration Type from the Configuration Type drop-down.
40. To map a single column to an attribute, select an attribute, then Map Column from the Configuration Type drop-down.
41. Select a view and column from the View and Column drop-downs, respectively.
42. To map values from the source column to attribute values, click Map Values. The Attribute Value Mapping dialog will appear.
43. Enter or select a value in the Maps To column for each Column Value from the data source. If the attribute has value assistance, the values will appear in a drop-down in the Maps To column. Note: SQL Server 2000 performs DISTINCT queries case-insensitively, so source values that differ only in case appear as a single entry.
44. Click OK to save the mappings or Cancel to close the dialog without saving.
45. The Attribute Configuration dialog will reappear. Click Save to save the mappings. If another attribute is selected before clicking Save, the configuration will be lost.
46. To enter a static value for a single-valued attribute, select a single-valued attribute, then Enter Static Value from the Configuration Type drop-down.
47. Enter a value in the Value text box. Click Save to save the configuration.
48. To enter multiple static values for a repeating-valued attribute, select a repeating-valued attribute, then Enter Static Values from the Configuration Type drop-down.
49. Enter a value in the Value text box, and then click Add to add it to the list. Click Remove to remove a value from the list. Click Save to save the configuration.
50. To map multiple columns to a repeating-valued attribute, select a repeating-valued attribute, then Map Multiple Columns from the Configuration Type drop-down.
51. Select a view and column from the View and Column drop-downs, and then click Add to add it to the list. Click Remove to remove an entry from the list. Click Save to save the configuration.
52. To map a column with a delimiter to a repeating-valued attribute, select a repeating-valued attribute, then Map Column with Delimiter from the Configuration Type drop-down.
53. Select a view and column from the View and Column drop-downs and enter a delimiter in the Delimiter text box. Click Save to save the configuration.
54. Once all attributes are configured, click OK to save the configuration and return to the Object Configurator. Click Cancel to cancel all changes made since the dialog was opened.
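The Map Column with Delimiter option can be pictured as splitting one source value into several repeating values. The sketch below is illustrative only: the class and method names are hypothetical, and the trimming of whitespace and dropping of empty entries are assumptions, since the manual does not say how Buldoser treats them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class DelimiterSketch {
    /** Splits a delimited column value into the values of a repeating attribute. */
    static List<String> toRepeatingValues(String columnValue, String delimiter) {
        List<String> values = new ArrayList<>();
        if (columnValue == null) {
            return values;
        }
        // Pattern.quote treats the delimiter literally, so "|" or "." are safe to use.
        for (String part : columnValue.split(Pattern.quote(delimiter))) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) { // assumption: blank entries are skipped
                values.add(trimmed);
            }
        }
        return values;
    }
}
```

For example, a column value of "Smith; Jones;Lee" with the delimiter ";" yields the three values Smith, Jones, and Lee.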
55.
56. To use a fixed ACL for all objects, select the Use a Fixed ACL and ACL Domain radio button. Next, select an ACL and ACL Domain from the drop-down. Entries are formatted as ACL Domain.ACL Name.
57. To use a column from the data source to drive the ACL that is used, select the Use an ACL from a Column radio button. Select a view, ACL column, and ACL Domain column from the drop-downs. To map values from the selected columns to ACLs in the current Docbase, click the Map ACL Values button. The ACL Mapping dialog will appear.
58. For each row, select the ACL in the Maps To drop-down that should be used for the value from the source column. ACLs are formatted as ACL Domain.ACL Name.
59. Click OK to save the mappings and close the dialog. Click Cancel to close without saving. The Security Configuration dialog will reappear.
60. Click OK to save the configuration and close the dialog. Click Cancel to close without saving. The Object Configurator will reappear.
61.
62. Enter the base path for all files in the Base Content Path text box, or click [] to browse for the location.
63. Select the radio button for the Configuration Type to use.
64. For content in a column or columns, select each view and column and click Add. Click Remove to remove a view and column.
65. To identify which format will be used when multiple formats exist in the target Docbase for the same file extension, click Map Duplicate Formats. The Format Mapping dialog will appear.
66. For each file extension on the left, select the format to be used from the Maps To drop-down on the right.
67. Click OK to close the dialog and save the mappings. To close without saving, click Cancel. The Content Configuration dialog will reappear.
68. For content in a separate view, select the view, physical location, and format columns. If this is a Documentum Docbase and the data_ticket attribute is used for physical location, check the Documentum Ticket checkbox to have Buldoser calculate the location.
69. If the formats identified in the Format column are not Documentum formats, click the Map Formats button. The Format Mapping dialog will appear.
70. For each source format on the left, select a Documentum format to use instead in the Maps To drop-down.
71. Click OK to save the mappings and close the dialog. Click Cancel to close without saving. The Content Configuration dialog will reappear.
72. Click OK to save the configuration and close the dialog. Click Cancel to close without saving. The Object Configurator will reappear.
73. Click the Configure Folders button. The Folder Configuration dialog will appear.
74. To use a fixed folder path for all objects, select the Use a Fixed Folder Path radio button. Next, enter a folder path in the drop-down. To list folders in the target Docbase in the drop-down, click the List Folders button.
75. To use a column from the data source to drive the folder that is used, select the Use Folder Path from a Column radio button. Select a view and folder column from the drop-downs. Optionally enter a folder path prefix and suffix. To map values from the selected column to folders in the current Docbase, click the Map Folder Paths button. The Path Mapping dialog will appear.
76. For each row, select the folder in the Maps To drop-down that should be used for the value from the source column.
77. Click OK to save the mappings and close the dialog. Click Cancel to close without saving. The Folder Configuration dialog will reappear.
78. Click OK to save the configuration and close the dialog. Click Cancel to close without saving. The Object Configurator will reappear.
79. Click the Configure Versioning button. The Version Configuration dialog will appear.
80. Select the column that contains the primary key value of the previous version in the Previous Version Column drop-down.
81. Select the view and column for the version labels from the Version Label View and Version Label Column drop-downs, respectively.
82. Select the method for determining the base of a version tree from the three radio buttons. This must be configured correctly for multi-threading to function properly.
83. Click OK to save the configuration and close the dialog. Click Cancel to close without saving. The Object Configurator will reappear.
84. Click the Configure Pre- and Post-Processing button. The Processing Configuration dialog will appear.
85. Enter the name of the class that implements IBuldoserTransform into the Processing Class text box. The class must exist within the classpath or Buldoser will return an error. Leave the box blank for no processing.
86. Click OK to save the configuration and close the dialog. Click Cancel to close without saving. The Object Configurator will reappear.
87. Click OK to save the Object Configuration and close the dialog. Click Cancel to close without saving.
88. After all configuration is complete, Step 5 will reappear.
89. Click Cancel to stop configuring and close the Wizard without saving.
90. Click Previous to go back to Step 4.
91. Click Next to proceed to Step 6.
92. Step 6 provides the ability to preview objects as they will be built using the configurations created in Step 5. Click Next Object to advance through the data set. Each object is displayed with its configuration in sections, showing values for attributes, security, folders, content, versioning, and processing class. When the end of the data set is reached, Buldoser will start over from the beginning.
93. Click Previous to go back to Step 5 and correct any errors in configuration.
94. Click Next to proceed to Step 7.
95. Select the number of threads to use for the load from the Threads drop-down. Using multiple threads helps the content load faster, but the benefit depends on the hardware being used. It is best to start with one or two threads for initial loads, then increase the number of threads after monitoring network, memory, and processor resources.
96. Check the Verbose box to create a very detailed log file for the load. This option can negatively impact performance for very large loads. If this option is not selected, only errors will be written to the log file.
97. Check the Trace box to have Buldoser write a DMCL-level trace for every 1000th object that is loaded. This option may be used for diagnostic purposes.
98. Check the Chart Performance box to create a tab-delimited data file containing relevant performance information on the load in milliseconds. A file is created for each thread and is named chart<thread number>.txt. By opening a chart file using Microsoft Excel, a line graph may be created to display performance trends over the life of the load. This option is typically used during a small trial load for diagnostic purposes. For large loads this option may degrade performance.
99. Check the Auto Create ACLs box to automatically create ACLs if they don't exist in the destination Docbase. Buldoser will create a System ACL with the correct name only; an administrator must fill in the correct users, groups, and permission levels once the load is complete. It is highly recommended to move ACLs with a Documentum-provided tool or script outside of Buldoser before performing ETL operations. This option is provided as a fallback mechanism only.
100. Check the Auto Create Formats box to automatically create formats if they don't exist in the destination Docbase. Buldoser will create a format with the correct name only; an administrator must fill in the remaining attributes once the load is complete. It is highly recommended to move formats with a Documentum-provided tool or script outside of Buldoser before performing ETL operations. This option is provided as a fallback mechanism only.
101. Check the Auto Create Folders box to automatically create folders if they don't exist in the destination Docbase. Buldoser will create the entire folder path with the correct name only; an administrator must fill in the remaining attributes once the load is complete. It is highly recommended to move folders with a Documentum-provided tool or script outside of Buldoser before performing ETL operations. This option is provided as a fallback mechanism only.
102. Check the Auto Set Owner to DBO box to automatically set the owner_name attribute of each object to the Docbase owner of the target Docbase. Having the Docbase owner own content is a standard convention.
103. Check the Dealer-Side Database Load box to have the heavy database operations executed on the Dealer side instead of at the thread level. This option determines whether the bulk of the database operations occur on the controlling thread or on each of the worker threads. It is best to check this option for loads with content and to leave it unchecked for loads without content.
104. Click Previous to go back to Step 6.
105. Click Next to proceed to Step 8.
106. Configuration is now complete and Buldoser is ready to load. Click Save and Close to save the configuration and close without initiating the load.
107. Click Previous to return to Step 7.
108. Click Cancel to close the Wizard without saving.
109. Click Load Now to start the load. The Dealer Heads-Up Display (HUD) will appear.
The dialog is explained below:
Item: Definition
Location: Indicates the Log Path for the batch.
Total Objects: Indicates the total number of objects for the batch.
Total Threads: Indicates the number of threads that was selected. The number may be increased or decreased during the load by clicking the + and - buttons.
Max Objects/Thread: Indicates the number of objects that each worker thread will cache. This number can be increased or decreased during the load by clicking the + and - buttons.
Dealing Progress: Indicates how many objects have been dealt by the Dealer thread.
Provides messages that indicate the current operation of the load.
Indicates the average speed for dealing an object.
Indicates how fast on average objects are being loaded across all threads.
Indicates for which thread statistics are being displayed. The thread number is indicated. Use the < and > buttons to cycle through the threads to view each thread's statistics.
Number Waiting: Indicates how many objects are in the selected thread's queue.
Number Processed: Indicates the number of objects that have been processed by the selected thread.
Number Failed: Indicates the number of objects that have failed for the selected thread.
Average Load Time: Indicates the average load speed for the selected thread.
View Log: Opens the load log for the selected thread.
Stop: Allows the dealing to be stopped. The load may be restarted from the stopping point by selecting Finish Load from the Database Load menu. Note that each worker thread will finish the objects that are currently in its waiting queue.
Close: Once the load has been stopped or has completed, the Close button allows the dialog to be closed.
Updates the per-thread statistics for the currently selected thread.
3. Enter the location of the database configuration into the Load Directory text box. The location may be browsed by clicking the [] button.
4. Enter the password to connect to the database in the Password text box.
5. To close the dialog and save the entries, click Save Settings.
6. To close the dialog, click Cancel.
7. To resume the load, click Finish Load.
3. Enter the location of the load into the Load Directory text box. The location may be browsed by clicking the [] button.
4. To close the dialog and save the entries, click Save Settings.
5. To close the dialog, click Cancel.
6. To undo the load, click Undo. An instance of the following status dialog will appear. Undo operations run single-threaded only.
Title Bar: Indicates the percent completion and thread number (which is always zero for Undo).
Indicates that this is a BuldoserDbUndo.
Indicates the Batch location where this operation is being performed.
Indicates the thread number (always zero for Undo).
Indicates the number completed.
Describes the current operation.
Stops the undo.
Only available after the undo is complete or has been stopped.
Content View
1. A view must be created in the database that runs the Docbase, using the following query:
create view content_view as
select a.data_ticket, full_format, parent_id_i
from dmr_content_s a, dmr_content_r b
where a.r_object_id_i = b.r_object_id_i
2. A network drive will need to be mapped to the filestore directory on the source Docbase. In Windows this directory is typically c:\Documentum\Data\<Docbase>
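As background for the data_ticket column selected by content_view: in Documentum file stores, a content file's location is commonly derived by writing the signed ticket as eight hexadecimal digits and splitting them into two-digit path segments. The sketch below shows only that arithmetic. It is an assumption that Buldoser's Documentum Ticket option works this way, the class and method names are hypothetical, and the Docbase ID directory and file extension that normally surround these segments are deliberately left out.

```java
class DataTicketSketch {
    /** Converts a signed data_ticket value into its four two-digit hexadecimal path segments. */
    static String ticketToRelativePath(int dataTicket) {
        // %08x prints a negative int as its unsigned 32-bit hexadecimal form.
        String hex = String.format("%08x", dataTicket);
        return hex.substring(0, 2) + "/" + hex.substring(2, 4) + "/"
                + hex.substring(4, 6) + "/" + hex.substring(6, 8);
    }
}
```

For instance, a ticket of -2147474649 corresponds to the hexadecimal value 80002327 and therefore to the segments 80/00/23/27 beneath the mapped filestore directory.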
Next, link this query (in Step 3 of 8) with the Supporting View defined as content_view.
The Content and Versioning buttons are particularly important for this operation. Set them according to the screenshots below. Note that the <Mapped Filestore Path> (within the Base Content Path field) should be the drive and path where the filestore is mapped on the machine running Buldoser.