Sent: Thursday, April 28, 2005 3:44 PM

To:

Cc:

eop.gov;

Subject: RE: Dup Exchange Message IDs

I have a possible reason why we are seeing messages that seem unique but have the same PR_SEARCH_KEY. A user might be keeping a 'form' email in their drafts folder, which could then be modified with a new message, recipients, etc., and sent out. I noticed (at least in the PST I saw) that the messages with duplicate PR_SEARCH_KEYs were similar in content and from the same sender.

I am also getting some feedback about the two properties I mentioned previously (PR_CLIENT_SUBMIT_TIME and PR_INTERNET_MESSAGE_ID) from some other Exchange escalation engineers. The consensus seems to be that there is no single MAPI property that Exchange can give you that will guarantee uniqueness. Properties like PR_SEARCH_KEY can be used to weed out obvious cases (for example, if the PR_SEARCH_KEY values are actually different), along with timestamps (if the messages were sent at different times, they are probably unique). But to be absolutely sure that no duplicates exist, additional parts of the message, such as body, subject, recipients, and attachments, will need to be compared.
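The layered check described above might be sketched as follows. This is only an illustration in Python, assuming each message has already been exported into a plain dictionary. The field names (`search_key`, `submit_time`, `subject`, `body`, `recipients`, `attachments`) are hypothetical stand-ins for the corresponding message properties and are not an Exchange or MAPI API.

```python
from typing import Any, Dict

# Hypothetical field names standing in for properties of an exported message.
CHEAP_FIELDS = ("search_key", "submit_time")
CONTENT_FIELDS = ("subject", "body", "recipients", "attachments")


def _normalize(value: Any) -> Any:
    # Treat recipient and attachment collections as unordered.
    if isinstance(value, (list, tuple, set, frozenset)):
        return tuple(sorted(value))
    return value


def is_duplicate(a: Dict[str, Any], b: Dict[str, Any]) -> bool:
    """Return True only if both messages match on every compared field."""
    # Step 1: cheap properties can rule a duplicate out, but cannot confirm one.
    for field in CHEAP_FIELDS:
        if a.get(field) != b.get(field):
            return False
    # Step 2: confirm by comparing the message content itself.
    return all(
        _normalize(a.get(field)) == _normalize(b.get(field))
        for field in CONTENT_FIELDS
    )
```

With this ordering, a 'form' email reused from the drafts folder with a new body would pass the cheap checks only if its search key and timestamp happen to match, and would still be rejected as a duplicate at the content comparison.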

Thanks,

..

From: _ [mailto:@mdy.com]

Sent: Thursday, April 28, 2005 11:49 AM

To:

Cc:

Subject: RE: Dup Exchange Message IDs

eop.gov;

eop.gov

_,

Thank you very much for your information.

Can you please help us to understand more about the usage of PR_SEARCH_KEY?

Specifically, in the case of EOP, they are filing messages from Exchange Journals. Is it possible under any circumstances that two 'not identical' messages with the same value of PR_SEARCH_KEY will be found in Exchange Journals?

Thank you very much for your help,



From:

Sent: Wednesday

To:

Cc:

Subject: RE: Dup Exchange Message IDs

oa.eop.gov;

III,

Thank you for this email, it was very helpful to both me and the engineer I am working with. I wanted to pass along some thoughts that the engineer has about determining message uniqueness. According to him, the authoritative way to determine duplicates between two or more messages in Exchange 2000/2003 is with a combination of the following two properties:

• Date header or PR_CLIENT_SUBMIT_TIME

• Message-ID or PR_INTERNET_MESSAGE_ID

5/3/2005

GEORGE W. BUSH PRESIDENTIAL RECORD

OAP00045302

He also said that he has advised another customer, who was attempting to engineer a de-duplication solution, to use this method. Again, he says, "... only those two properties stated above can be the authoritative way to determine duplicates due to multiple JournalRecipients being used for multiple stores."

He also clarifies what 'authoritative' means. He says it is what we call the '5 9's way' of determining uniqueness of messages, 5 9's meaning that it will provide 99.999% certainty in determining dupes.
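A rough sketch of the two-property approach, using the header forms named above (Message-ID and Date) rather than the MAPI properties. It assumes each journal copy is available as RFC 822 text, which is an assumption of this example and not something the thread states. The helper names `dedupe_key` and `unique_messages` are made up for illustration; the parsing calls come from Python's standard `email` package.

```python
from datetime import datetime
from email import message_from_string
from email.utils import parsedate_to_datetime
from typing import Iterable, List, Optional, Tuple

Key = Tuple[str, Optional[datetime]]


def dedupe_key(raw: str) -> Key:
    """Build a (Message-ID, Date) key from the headers of one message."""
    msg = message_from_string(raw)
    message_id = (msg["Message-ID"] or "").strip()
    date_header = msg["Date"]
    sent = parsedate_to_datetime(date_header) if date_header else None
    return (message_id, sent)


def unique_messages(raw_messages: Iterable[str]) -> List[str]:
    """Keep the first message seen for each (Message-ID, Date) key."""
    seen = set()
    unique = []
    for raw in raw_messages:
        key = dedupe_key(raw)
        if key not in seen:
            seen.add(key)
            unique.append(raw)
    return unique
```

Two copies of one message captured by different journal mailboxes would carry the same Message-ID and Date headers, so only the first copy would be kept.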

I hope this information is helpful, and please let me know if you have any more questions.

Thanks,

-

From: _ [mailto:@mdy.com]

Sent: 126,2005 10:07 PM

To:

Cc:

Subject: RE: Dup Exchange Message IDs

eop.gov;



_ wrote in his email: A search key is used to compare the data in two objects. An object's search key is stored in its PR_SEARCH_KEY property. Because a search key represents an object's data and not the object itself, two different objects with the same data can have the same search key. When an object is copied, for example, both the original object and its copy have the same data and the same search key.

From here it looks like an original message and its copy are considered two different message objects with the same data, and so should have different values for PR_INTERNET_MESSAGE_ID and PR_MESSAGE_SUBMISSION_ID. That is why I ran a test with duplicate messages, and I found that these values are not different.

The duplication of messages can happen for different reasons.

First example: when using journaling, a message may be sent to multiple recipients whose mailboxes are assigned to different journals. In this case the same message can appear in multiple journals, and FileSurf will try to file it multiple times.

Second example: when filing directly from the user's mailbox (this option is not currently used by EOP, but is still a valid option), one recipient can modify an incoming message and file it to FileSurf. When a second recipient tries to file the same message in its original form, it will be considered a duplicate.

Third example: when filing from PST files, Janet and Dan found two messages with the same PR_SEARCH_KEY. I did not see these messages myself, but from what I know, they were almost identical and had only slight differences in the date sent and recipient list. They looked like they were sent from Blackberries. I do not know the nature of these messages, but it would be interesting to check whether they have different PR_INTERNET_MESSAGE_ID and PR_MESSAGE_SUBMISSION_ID values.

Now about FileSurf. When FileSurf files a message, it extracts the value of PR_SEARCH_KEY and stores it in the database. When it files the next message, it checks whether the same value of this property is already recorded in the database. If it is, then the second message is considered a duplicate. In general, there are different configuration options that determine how duplicates are handled. In the particular case of EOP, we do not allow filing duplicate messages to the same file plan location. This is done to save space in the database and in the repository, make searches more efficient, etc. What would you suggest we use to identify duplicates?
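The store-and-check behavior described above could look roughly like the sketch below. It is not FileSurf's actual implementation, which the thread does not show. The table name `filed`, the column names, and the helpers `open_store` and `file_message` are all invented for this example, and SQLite stands in for whatever database the product uses.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a store with one row per (file plan location, key)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS filed ("
        " location TEXT NOT NULL,"
        " dedupe_key BLOB NOT NULL,"
        " PRIMARY KEY (location, dedupe_key))"
    )
    return conn


def file_message(conn: sqlite3.Connection, location: str, dedupe_key: bytes) -> bool:
    """Record the key; return False if it was already filed at this location."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO filed (location, dedupe_key) VALUES (?, ?)",
        (location, dedupe_key),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted and 0 when it was ignored.
    return cur.rowcount == 1
```

Keying on the pair (location, key) mirrors the rule that duplicates are rejected only within the same file plan location. Whichever property or combination of properties ends up being used to identify duplicates, only the value passed as `dedupe_key` would change.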

And finally, why are we using PR_SEARCH_KEY? Back in 2002, we opened a case with Microsoft (case # SRZ020926001555). The recommendation was the following: "Based on our deep research, we found another MAPI property called PR_SEARCH_KEY may meet your requirements."

Things probably have changed since then and we are ready to address it. I hope we can find the right solution.

Thank you very much for your help,

Thanks,

From: [mailto:@microsoft.com]

Sent: 2005 8:47 PM

To:

Cc:

Subject: RE: Dup Exchange Message IDs

oa.eop.gov;

oa.eop.gov

.,

Here are some answers from the Exchange engineer in Charlotte that I'm working with:

1. EOP is currently at Exchange 2000, but Exchange 2000 does have the PR_INTERNET_MESSAGE_ID property.

2. FileSurf should not (more specifically, cannot) be using PR_SEARCH_KEY to determine duplicates.

For the last two questions, we need some more information. First (and these questions may have been answered already, so I apologize), why will there be duplication of messages? If a copy of a message is made, we would expect that all properties (SEARCH_KEY, INTERNET_MESSAGE_ID, body, recipients, etc.) would stay the same. We are not sure about what will (or should) happen if a message is modified and saved, but is modification of messages something that needs to be addressed? In other words, why are emails being modified?

The Exchange engineer and I would like to know how FileSurf works (how it determines duplicates, for example) so we can figure out how best to solve this problem.

Thanks!

-

From: [mailto:@mdy.com]

Sent: Tuesday, April 26, 2005 7:10 PM

To:

Cc:

Subject: RE: Dup Exchange Message IDs

-

I have the following questions:

1. I think EOP is using Exchange 2000, not 2003 yet. Is this correct? Do all messages have PR_INTERNET_MESSAGE_ID in Exchange 2000?

2. Going back to PR_SEARCH_KEY. According to the documentation, this property uniquely identifies a message's data, not the message object itself.

Does it mean that only if two message objects have absolutely identical data, they will have the same PR_SEARCH_KEY?

If this is the case, should we still consider them duplicates?

3. I just created a copy of a message. Both the original message and the copy have identical PR_SEARCH_KEY, PR_INTERNET_MESSAGE_ID and PR_MESSAGE_SUBMISSION_ID values. So using PR_INTERNET_MESSAGE_ID does not help to distinguish them. The question is, do we want to consider them different messages?

4. I opened the copy, modified its body and saved it. This had no effect on any of the three properties; none of them changed. In this case we have two different messages, and they will be identified as duplicates no matter which property we use.
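One way to tell apart the original and the edited copy from point 4 is to fingerprint the content itself. Content hashing is not proposed anywhere in this thread; it is offered here only as a sketch of the idea. The function name `content_fingerprint` and its parameters are hypothetical, and the inputs are assumed to have been extracted from the message beforehand.

```python
import hashlib
from typing import Iterable


def content_fingerprint(
    subject: str,
    body: str,
    recipients: Iterable[str],
    attachments: Iterable[bytes] = (),
) -> str:
    """Return a SHA-256 digest over the parts of a message a user can edit."""
    digest = hashlib.sha256()

    def add(data: bytes) -> None:
        # Length-prefix each part so adjacent parts cannot run together.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)

    add(subject.encode("utf-8"))
    add(body.encode("utf-8"))
    for address in sorted(recipients):
        add(address.encode("utf-8"))
    add(b"--attachments--")
    for blob in attachments:
        add(blob)
    return digest.hexdigest()
```

An unmodified copy would produce the same digest as its original, while the copy with the edited body would not, even though all three ID properties stay the same.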

Thanks,



From:

Sent:

To:

Cc:

Subject: FW:

oa.eop.gov;

All,

Here's some clarification from _. This makes more sense to me, and seems more promising. He is saying that PR_INTERNET_MESSAGE_ID is the best way to determine uniqueness of mail. In Exchange 2003, all messages will have this property, and most should also have the PR_MESSAGE_SUBMISSION_ID property.

The only time he has seen messages with PR_MESSAGE_SUBMISSION_ID and not PR_INTERNET_MESSAGE_ID was in Exchange 5.5.

Let me know if you have any questions.

Thanks!

From: _ (CPR)

Sent: 5:03 PM

To: ____

Subject: RE: Dup Exchange Message IDs

To clarify a bit further ...

PR_MESSAGE_SUBMISSION_ID is stamped on all MAPI submissions or mail and can be deemed a message ID for tracking purposes, but it's really what we call the MTS-ID ... it is also used for mail that traverses the X.400 protocol, e.g., and for message tracking purposes for mail that has no PR_INTERNET_MESSAGE_ID. Most mail in E2k(3) will have both #2 and #3. The real means of determining the uniqueness of a message is PR_INTERNET_MESSAGE_ID. If that prop is missing ... the fallback is PR_MESSAGE_SUBMISSION_ID.

PR_INTERNET_MESSAGE_ID is the real way to determine this ... since in E2k(3) all messages will have this property. I've only seen messages with no PR_INTERNET_MESSAGE_ID but with a PR_MESSAGE_SUBMISSION_ID in 5.5, and we all know why, i.e., 5.5 was an MTA product, not SMTP. SMTP was an extensible/EDK feature of 5.5.
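The fallback rule stated above can be written down in a few lines. This sketch assumes the property values have already been read into a mapping keyed by property name. That mapping and the function name `uniqueness_key` are illustrative assumptions, not part of MAPI.

```python
from typing import Mapping, Optional, Tuple


def uniqueness_key(props: Mapping[str, Optional[str]]) -> Optional[Tuple[str, str]]:
    """Prefer PR_INTERNET_MESSAGE_ID; fall back to PR_MESSAGE_SUBMISSION_ID."""
    internet_id = props.get("PR_INTERNET_MESSAGE_ID")
    if internet_id:
        return ("PR_INTERNET_MESSAGE_ID", internet_id)
    submission_id = props.get("PR_MESSAGE_SUBMISSION_ID")
    if submission_id:
        return ("PR_MESSAGE_SUBMISSION_ID", submission_id)
    # Neither property is present, so no key can be formed from them.
    return None
```

Returning the property name alongside the value keeps a submission ID from ever being mistaken for an Internet message ID that happens to have the same text.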

II

Thanks for the great info. I have passed this along to the messaging people at EOP, and I'm sure they are going to have some questions. One thing I think they may have some heartburn about is the following: "The real means of determining the uniqueness of a message is PR_INTERNET_MESSAGE_ID. If that prop is missing ... the fallback is PR_MESSAGE_SUBMISSION_ID."

From that statement, it doesn't sound like there is an absolute method of determining the uniqueness of a message other than to look for both of those properties.

I'll send more info as I get it, and thanks very much for your time.
