Generation of Adverse Drug Event Detection Rules by Weighted Association Rule Mining

IOSR Journal of Engineering (IOSRJEN) e-ISSN: 2250-3021, p-ISSN: 2278-8719 Vol. 3, Issue 1 (Jan. 2013), ||V4|| PP 43-46
Varuna Sivasakthi, Gomathy Ramesh
(Computer Science and Engineering, Bannari Amman Institute of Engineering, Sathyamangalam, India)

Abstract: Adverse drug events (ADEs) are a public health issue. The objective of this work is to detect ADE cases automatically by data mining. Different kinds of outcomes are traced, and weighted association rule mining (WARM) is used to discover ADE detection rules subject to time constraints. In this paper, we propose a self-assigned weights method to discover positive and negative association rules, instead of relying on weights assigned by users. The rules are then filtered, validated, and reorganized. The rules use a variable number of conditions related to laboratory results, diseases, drug administration, and demographics. Some rules involve innovative conditions, such as drug discontinuations.

Keywords: Adverse drug events (ADEs), association rules, data mining, self-assigned weights.
I. INTRODUCTION
Adverse drug events (ADEs) endanger patients and are the most common type of iatrogenic injury. They can be defined as injuries due to medication management rather than to the underlying condition of the patient. Data mining has been used for ADE detection, but mainly to analyze voluntary ADE reports by means of supervised rule induction methods such as decision trees, association rules, or Bayesian neural networks, and not to analyze hospitalization records. As a consequence, the results can only be used to analyze other voluntary ADE reports.

Association rule mining is an important data mining technique that aims to explore the relations between items in the data. The classical model relies on the support measure, which treats every transaction equally. In real-life datasets, however, items usually differ in importance, so the transactions that contain them should carry different weights. A frequent itemset may therefore not be as important as it appears. Much effort has been dedicated to association rule mining with user-specified weights, but medical and health data do not come with such weights. In addition, negative rules are also important for data analysis.

The weights in this paper are derived from the internal structure of the database, based on the assumption that important transactions consist of important items. The weights are learned from the data itself and can then be used in the weighted association rule mining (WARM) algorithm, which requires item weights to be assigned.
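As a point of reference for the classical, unweighted model described above, the following minimal Python sketch computes support and confidence when every transaction counts equally. The item names and the four transactions are hypothetical and serve only as an illustration; they are not taken from the study data.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent) / support(antecedent)."""
    antecedent = frozenset(antecedent)
    sup_a = support(antecedent, transactions)
    if sup_a == 0:
        return 0.0
    return support(antecedent | frozenset(consequent), transactions) / sup_a


# Hypothetical transactions (one set of items per hospital stay).
transactions = [
    frozenset({"drug_A", "background_B", "outcome_C"}),
    frozenset({"drug_A", "background_B"}),
    frozenset({"drug_A", "outcome_C"}),
    frozenset({"background_B"}),
]

sup = support({"drug_A", "background_B"}, transactions)
conf = confidence({"drug_A", "background_B"}, {"outcome_C"}, transactions)
```

Here each of the four transactions contributes 1/4 to the support, whatever items it contains. This is the equal treatment that the weighted model is meant to replace.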
II. WEIGHTED ASSOCIATION RULE MINING
Weighted association rule mining generalizes the traditional model to the case where items have weights. Some researchers introduced a weighted support and a weighted confidence for association rules, based on costs assigned to items or transactions. These definitions broke the downward closure property (every subset of a frequent itemset is itself frequent), so the weighted algorithms became more complicated. A further problem remains: the weights have to be assigned by users, who may not know how to set them correctly. Moreover, all the classical algorithms produce only positive associations between items that are present in transactions. A directed graph is created in which nodes denote items and links represent association rules, and all nodes and links are allowed to carry weights.
III. DETECTING THE ADVERSE DRUG EVENTS
In order to detect ADEs, a list of outcomes is first defined, and the link between those outcomes and prior drug administrations or discontinuations is studied by means of WARM techniques applied to a training set. Rules are obtained in which an outcome is explained by a set of drugs in combination with a clinical background, in the form of ADE detection rules (e.g., drug_A & background_B => outcome_C). Those rules are then applied to past hospital stays of an evaluation set to obtain contextualized statistics such as the confidence (e.g., the probability of outcome_C when drug_A and background_B are present). Regarding data mining techniques, two issues have to be solved:
1) The temporal constraints have to be taken into account;
2) We have to use weighted association rule mining methods even though ADEs are not explicitly flagged in the routinely collected data, whereas such flags are usually required by classical rule induction methods.

TABLE I
Description of the hospitals and stays used

Hospital   Wards                          Number of stays  Age in years  Men         Duration in days
                                          included         Mean (sd)     proportion  Mean (sd)
French #1  Medicine, surgery, obstetrics  50,072           52.8 (21.6)   29.2%       5.48 (6.10)
French #2  Geriatrics                     1,367            71.4 (18.4)   42.1%       11.4 (15.1)
French #3  Geriatrics and Cardiology      7,846            45.4 (27.5)   51.6%       10.7 (15.3)
Danish #1  Medicine, surgery, obstetrics  26,245           55.6 (25.9)   40.4%       4.56 (11.8)
Danish #2  Medicine, surgery, obstetrics  23,067           53.1 (22.6)   44.8%       4.51 (8.41)
Bulgarian  Endocrinology                  6,880            49.4 (16.1)   26.4%       6.96 (2.54)
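The evaluation step of this section, in which a rule such as drug_A & background_B => outcome_C is applied to past stays to obtain contextualized statistics, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `contextual_confidence`, the hospital labels H1 and H2, and the five stays are hypothetical, and temporal constraints are left out.

```python
from collections import defaultdict


def contextual_confidence(stays, conditions, outcome):
    """Confidence of the rule `conditions => outcome`, computed per hospital.

    `stays` is an iterable of (hospital, items) pairs, where `items` is the
    set of conditions and outcomes observed during one hospital stay.
    Hospitals in which the conditions never occur are omitted from the result.
    """
    conditions = frozenset(conditions)
    matched = defaultdict(int)   # stays in which all conditions are present
    positive = defaultdict(int)  # matched stays that also show the outcome
    for hospital, items in stays:
        if conditions <= items:
            matched[hospital] += 1
            if outcome in items:
                positive[hospital] += 1
    return {h: positive[h] / matched[h] for h in matched}


# Hypothetical evaluation set.
stays = [
    ("H1", {"drug_A", "background_B", "outcome_C"}),
    ("H1", {"drug_A", "background_B"}),
    ("H1", {"drug_A"}),
    ("H2", {"drug_A", "background_B", "outcome_C"}),
    ("H2", {"background_B", "outcome_C"}),
]

result = contextual_confidence(stays, {"drug_A", "background_B"}, "outcome_C")
```

Grouping by hospital is one simple way to make the statistic contextual: the same rule can have a different confidence in each place where it is applied.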
IV. ALGORITHM FOR WEIGHTED SUPPORT AND WEIGHTED CONFIDENCE
The problem of mining association rules that satisfy a minimum weighted support and a minimum weighted confidence can be decomposed into three sub-problems:
1. Rank the items and assign a weight to each of them.
2. Find the significant itemsets whose weighted support meets the given threshold.
3. Build rules within the itemsets found in Step 2.

4.1 Ranking Items with HITS
Let $I = \{I_1, I_2, \ldots, I_m\}$ be a set of items and let $D = \{T_1, T_2, \ldots, T_n\}$ be the set of transactions in the database. Clearly, $D$ is equivalent to the bipartite graph $G = (D, I, E)$, where $E = \{(T, I_i) : I_i \in T,\ T \in D,\ I_i \in I\}$. In the HITS model, transactions play the role of hubs and items the role of authorities: the authority score auth(Ii) of an item is the sum of the hub scores of the transactions that contain it, and the hub score of a transaction is in turn updated from the authority scores of its items. Both scores are updated iteratively.

4.2 Weighted Metric
After the iteration process, we obtain auth(Ii) for every item. The authority score of an item reflects its importance. For this reason, we normalize all authority scores by the maximal one. The normalized auth(Ii) is the weight of Ii, denoted $w_i$, with $W = \{w_1, w_2, \ldots, w_m\}$:

$authMax = \max\{auth(I_i) \mid I_i \in I\}$   (2)
$w_i = auth(I_i) / authMax$   (3)

The items in an itemset are always arranged by weight, from the largest to the smallest, as shown in Table II.

TABLE II
Items and their weights in the example database

Item             Self-assigned weight
Urine            1.0
Micturition      0.786
Lumbar           0.752
Urinary Bladder  0.751
Nephritis        0.695
Urethra          0.668
Nausea           0.445

4.3 Rules Generation
The third step is more time-consuming than in the classical algorithm. The classical algorithm (Apriori ap-genrules) generates rules using the theorem that if confidence(X => Y - X) < minconf, then confidence(X' => Y - X') < minconf for every $X' \subseteq X$. Since confidence(X => Y - X) = support(Y)/support(X) and confidence(X' => Y - X') = support(Y)/support(X'), the result follows easily without weights, because $support(X') \ge support(X)$. When weights are assigned, however, this result can no longer be obtained. Indeed, when X is a weighted frequent itemset, X' may not be. We do not know
the exact relationship between sawsupport(X) and sawsupport(X'). Therefore, each time a rule is obtained, its weighted support and weighted confidence have to be checked.

Notation:
PARS: positive association rules set
NARS: negative association rules set
new-genrules(fk, Hm): rule generation procedure, where fk is a frequent itemset and Hm is a set of rule consequents

The proposed algorithm is shown below:

initialize hub(t) to 1 for each transaction t in D
for (l = 0; l < num_it; l++) do begin
    hub'(t) = 0 for each transaction t in D
    for all items Ii in I do begin
        auth(Ii) = sum of hub(t) over the transactions t that contain Ii
        hub'(t) += auth(Ii) for each transaction t that contains Ii
    hub(t) = hub'(t) for each transaction t in D
authMax = Max{auth(Ii) | Ii in I}
W(Ii) = auth(Ii) / authMax
generate the frequent itemsets L with WARM
for each fk of the K-frequent itemsets, K >= 2 do
    for each h1 in H1 do
        if sawsup(h1) >= minsawsup then
            if |sawinterest - 1| > mini then
                if sawinterest - 1 > 0
                    if sawconf((fk - h1) => h1) >= minsawconf then
                        PARS = PARS ∪ {(fk - h1) => h1}
                    if sawconf(¬(fk - h1) => ¬h1) >= minsawconf then
                        PARS = PARS ∪ {¬(fk - h1) => ¬h1}
                if sawinterest - 1 < 0
                    if sawconf((fk - h1) => ¬h1) >= minsawconf then
                        NARS = NARS ∪ {(fk - h1) => ¬h1}
                    if sawconf(¬(fk - h1) => h1) >= minsawconf then
                        NARS = NARS ∪ {¬(fk - h1) => h1}
            else return
        else delete h1 from H1
    call new-genrules(fk, H1)
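The weight-assignment part of the algorithm (the hub and authority iteration followed by the normalization of Section 4.2) can be sketched in Python as below. This is a minimal illustration, not the authors' implementation: the function name `hits_item_weights`, the default of 20 iterations, and the example transactions are hypothetical, and the rescaling of hub scores at each iteration is an added assumption to keep the values bounded (it does not change the normalized weights of a given iteration). The sketch assumes a non-empty database in which every transaction contains at least one item.

```python
def hits_item_weights(transactions, num_iterations=20):
    """Self-assigned item weights from a HITS-style iteration.

    Transactions act as hubs and items as authorities on the bipartite
    transaction/item graph. Returns {item: weight} with the largest weight
    equal to 1.0.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    hub = [1.0] * len(transactions)
    auth = {}
    for _ in range(num_iterations):
        # Authority of an item: sum of hub scores of transactions containing it.
        auth = {
            i: sum(h for h, t in zip(hub, transactions) if i in t)
            for i in items
        }
        # Hub of a transaction: sum of authority scores of its items.
        hub = [sum(auth[i] for i in t) for t in transactions]
        # Assumption (not in the listing): rescale hubs to sum to 1.
        total = sum(hub)
        hub = [h / total for h in hub]
    auth_max = max(auth.values())
    return {i: a / auth_max for i, a in auth.items()}


# Hypothetical database: drug_A occurs in every transaction.
weights = hits_item_weights([
    {"drug_A", "background_B"},
    {"drug_A", "outcome_C"},
    {"drug_A"},
])
```

In this toy database drug_A appears in every transaction, so it collects the full hub mass and receives the maximal weight, while the two other items, which occur symmetrically, receive equal and smaller weights.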
V. RESULTS AND DISCUSSION
This method is able to discover ADE detection rules automatically. Some of these rules are already known and validated. In addition, the method makes it possible to discover new knowledge, such as segmentation conditions or previously unknown rules. The confidence of a rule often varies considerably depending on the place where it is applied. Those differences might be due to latent variables that are not observed in the data, such as risk monitoring policies or the medical background of the patient.

For the data mining phase, the data have to be simplified. For instance, the duration and dose of medications are ignored, as well as the numeric values of the laboratory results. However, the rules so obtained can be enriched with such parameters later, for instance in a clinical decision support system (CDSS) for prospective ADE prevention.

Producing ADE detection rules by data mining is complex. Indeed, ADE cases are not flagged in the data: when hyperkalemia is observed, we simply do not know whether it is an ADE or not. Moreover, most outcomes are principally due to the patient's diseases and only occasionally due to drugs. For that reason, an automated filtering step and an expert filtering and reorganization of the rules are performed. Once the rules have been filtered and modified, they are automatically evaluated using the evaluation set.

Regarding temporal constraints, methods based on the order of appearance of events appear not to be relevant for ADEs, because the order in which the conditions appear is not decisive; rather, the conditions have to be active simultaneously. It is not a problem of order of appearance, but a problem of concomitant presence and delay up to the condition. In addition, the discontinuation of a drug is itself a kind of event. For all these reasons, the temporal conditions are analyzed and filtered before rule induction, to ensure that all the events that are candidates to explain an outcome are compatible with the outcome with regard to time. The same constraints are then applied during the automated evaluation of the rules.
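The temporal filtering described above can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the `Event` structure, the day-based time scale, the `max_delay` parameter, and the choice to represent a drug discontinuation as an event of its own that starts on the day the drug is stopped. The paper does not specify these details.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    name: str
    start_day: int            # day of the stay on which the condition starts
    stop_day: Optional[int]   # day on which it ends, or None if still active


def candidate_events(events: List[Event], outcome_day: int,
                     max_delay: int) -> List[Event]:
    """Keep the events that are temporally compatible with the outcome.

    An event is kept if it starts strictly before the outcome and is either
    still active or ended at most `max_delay` days before the outcome.
    """
    kept = []
    for event in events:
        if event.start_day >= outcome_day:
            continue  # starts on or after the outcome: cannot explain it
        if event.stop_day is None or event.stop_day >= outcome_day - max_delay:
            kept.append(event)
    return kept


# Hypothetical stay with an outcome observed on day 5.
events = [
    Event("drug_A", 1, None),
    Event("drug_B", 1, 2),
    Event("drug_B_discontinued", 2, None),
    Event("drug_C", 6, None),
]
kept_names = [e.name for e in candidate_events(events, outcome_day=5, max_delay=2)]
```

With these hypothetical values, drug_B is dropped because it ended too long before the outcome, while its discontinuation remains a candidate, which mirrors the idea that a discontinuation is itself an event.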
The number of conditions is not constrained by the method, and the output provides more complex rules than in other studies. In addition, this study takes into account the effects of drug discontinuation. Some previous works have involved segmentation conditions, such as the age, the renal function, the hepatic function,
and the patient's weight. Conversely, many important ADE detection rules are not discovered by data mining, because either the conditions never occur or, when the conditions are present, the outcome never occurs. This is probably because those rules are well known and, consequently, the risk is well monitored. However, it is possible to input the corresponding rules and enforce their automated evaluation.
VI. CONCLUSION
Weighted association rule mining brings innovative and semi-automated solutions for ADE detection. The method is quite generic and could be applied to other kinds of data as soon as they are available in Electronic Health Records, such as structured results of electrocardiograms. A drawback of the method is that only the data that are recorded can be mined. The patient's weight and known drug allergies could have been used, but this information was not sufficiently present in the dataset.

The results of the method used here bring an important contribution to ADE knowledge. The rules that are obtained are versatile and can be used either as detection rules on past hospital stays or as prevention rules in a Clinical Decision Support System context. Those rules are already loaded in several prototypes developed within the PSIP (Patient Safety Intelligent Procedures) Project:
1) A tool designed for retrospective ADE detection and follow-up in past hospitalizations: the Scorecards.
2) A knowledge-based system for prospective ADE prevention during the medication process, which is used by three Clinical Decision Support Systems: one embedded in a computerized physician order entry system, another embedded in an Electronic Health Record, and a prescription simulation tool that is available even without any hospital information system.

To summarize the proposed approach: first, the weights of the items are derived from a database with only Boolean attributes. Second, the weights are assigned to the items. Third, those weights are used to mine positive and negative association rules.