A COMPARISON OF CAUSALITY ASSESSMENT TOOLS FOR SUSPECTED ADVERSE DRUG REACTIONS IN HOSPITALIZED PATIENTS AT A TERTIARY CARE HOSPITAL

Objective: The objective of the study was to compare six causality assessment (CA) tools for suspected adverse drug reactions (ADRs) reported in hospitalized patients at a tertiary care hospital in India. Methods: Intensive ADR monitoring was performed in indoor patients of two randomly selected medicine units. A detailed case report of each suspected ADR (n=120) was provided to six independent experts for CA using either the visual analog scale (VAS) or the WHO-UMC scale. The investigator assessed causality using Naranjo's scale, the Koh et al. scale, the French method, and the Karch and Lasagna scale. Similar causality categories from these scales were coded for correlation. Agreement among experts and between the various CA tools was analyzed using Cohen's kappa and Fleiss' kappa. Reasons for disagreement among different scales were evaluated. Results: The total number of drugs suspected to cause ADRs varied among the experts and the investigator. "Likely" and "Plausible" causality were suggested frequently by experts using VAS, whereas a "Possible" causal association was frequent according to experts using the WHO-UMC scale and also according to the investigator using all algorithms except the Koh et al. scale. None-to-slight agreement was observed among experts who used VAS (k=0.117), whereas substantial agreement was observed among experts using the WHO-UMC scale (k=0.707). Substantial agreement was observed between the Karch and Lasagna scale and the French method (k=0.740). Both scales demonstrated moderate agreement with Naranjo's scale. Disagreement among the WHO-UMC scale, the French method, and the Karch and Lasagna scale was associated with polypharmacy, serious ADRs, non-availability of laboratory data, and skin and subcutaneous tissue ADRs. Conclusion: The higher inter-rater agreement with the WHO-UMC scale suggests its utility for CA of suspected ADRs in indoor patients. The French method and the Karch and Lasagna scale can be used for CA in hospitalized patients as an adjunct to Naranjo's scale. Factors associated with disagreement should be considered at the time of reporting ADRs and evaluating causality.


INTRODUCTION
Causality assessment (CA) is defined as the evaluation of the possibility of a drug being the cause of an adverse drug reaction (ADR). It measures the strength of the relationship between a drug and suspected ADR. It is performed to identify important ADRs, to generate signals, and to evaluate the risk-benefit profile of drugs [1]. A precise and accurate method of CA is key to effective management and minimization of ADRs [2].
Accurate CA faces certain challenges, such as inadequate data, lack of a universally accepted method, and an inherently complex evaluation. The principal methods of CA are expert judgment, the algorithmic approach, and the probabilistic method. Expert judgment is widely used and involves an expert who applies knowledge and experience to evaluate causality. However, inter- and intra-rater variation and weak reproducibility are observed with this method [3]. Various scales have been utilized for expert judgment, including the WHO-UMC scale [4] and the visual analog scale (VAS) [5]. However, a low inter-rater agreement is observed with the WHO-UMC scale as compared to algorithms [6].
Algorithms are simple to use and demonstrate a higher intra- and inter-rater agreement. They have poor sensitivity but good specificity as compared to expert judgment and probabilistic methods [3]. A number of algorithms are in use, including Naranjo's scale [7], the Koh et al. scale [8], the Karch and Lasagna scale [9], and the French method [10].
Studies conducted previously show poor agreement between the three methods of CA [11] and also among different scales being used [12].
These studies, however, have usually relied on spontaneously reported ADRs, whereas intensive monitoring of ADRs is likely to provide more robust information for accurate CA. Furthermore, few studies have evaluated the agreement between the WHO-UMC scale, VAS, and the above-mentioned algorithms for ADRs occurring in indoor patients.
With this view, the present study was conducted to evaluate the agreement between the above-mentioned CA tools for ADRs reported in hospitalized patients and to evaluate the utility of these scales in the Indian health-care system.

METHODS
This was an observational, prospective, single-center study conducted in indoor patients of two randomly selected units of the Department of Medicine at a tertiary care hospital in Gujarat, India, over a period of 23 months, that is, September 2016-August 2018. Permission to conduct the study was obtained from the IEC (Ref. No: IEC/Certi/21/17) and the head of the Department of Medicine. Patients of either gender, aged more than 12 years, who were admitted to the selected medicine units with an even indoor registration number and who developed an ADR following admission or were hospitalized due to an ADR were included after written informed consent. Intensive monitoring of observed ADRs was performed. Enrolled patients were followed up until resolution of the ADR or discharge, whichever was earlier. A sample size of 30 ADRs was needed to assess agreement between two scales or methods, as determined from previous studies [13][14][15]. Hence, a total of 120 ADRs were included in the sample to evaluate the agreement between four algorithms.

Parikh et al.
Case reports of all ADRs were provided to experts for CA using either the WHO-UMC scale or VAS. Experts 1 (clinician), 2 (clinician), and 5 (pharmacologist) evaluated causality using VAS, whereas experts 3 (clinician), 4 (clinician), and 6 (pharmacovigilance associate) evaluated causality using the WHO-UMC scale. The length of the VAS from zero to the mark assigned by the expert was measured by the investigator and converted to causality categories as described by Arimone et al. [5]. The drug assigned the highest score on VAS or the highest category on the WHO-UMC scale was considered the primary suspect drug by the respective expert. The investigator evaluated causality of the same ADRs using Naranjo's scale, the Koh et al. scale, the French method, and the Karch and Lasagna scale. Severity and preventability of ADRs were assessed by the investigator using the modified Hartwig and Siegel scale [16] and the modified Schumock and Thornton criteria [17], respectively.
The causality of ADRs among four algorithms, WHO-UMC scale and VAS was matched using the coding system described by Thaker et al.
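Once categories from different scales have been mapped to common codes, pairwise agreement can be quantified with Cohen's kappa, the statistic named in the abstract. The following is a minimal sketch of that calculation, not the authors' analysis script. The function name `cohen_kappa`, the category labels, and the six example ratings are illustrative assumptions and are not study data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters (or two scales) rating the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed agreement: share of items given the same coded category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the two marginal proportions per category.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical coded categories for six ADRs (illustration only).
scale_1 = ["possible", "possible", "probable", "possible", "probable", "certain"]
scale_2 = ["possible", "probable", "probable", "possible", "probable", "possible"]
kappa = cohen_kappa(scale_1, scale_2)
print(round(kappa, 3))
```

Here four of six ratings match (observed agreement 0.667) while the marginal distributions alone would give 0.417, so kappa is lower than the raw agreement rate suggests.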
Factors associated with disagreement among three scales, that is, the Karch and Lasagna scale, the French method, and the WHO-UMC scale, were also evaluated using the Chi-square test. For this, ADRs were divided into Group A (ADRs that showed 100% agreement among the three scales) and Group B (the remaining ADRs). P<0.05 was considered statistically significant.
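The Group A versus Group B comparison for a single factor reduces to a 2x2 contingency table. The sketch below computes the Pearson Chi-square statistic and its P value for one degree of freedom. It assumes no continuity correction, since the text does not state whether one was applied. The function name `chi_square_2x2` and the counts are hypothetical and are not taken from the study.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (illustration only):
#                      factor present   factor absent
# Group A (agreement)        20               40
# Group B (remaining)        35               25
chi2, p_value = chi_square_2x2(20, 40, 35, 25)
print(f"chi2={chi2:.2f}, significant={p_value < 0.05}")
```

With small expected cell counts, an exact test would usually be preferred over this approximation.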

CA by Experts using VAS
Experts 1, 2, and 5 suspected a total of 209 drugs for 117 ADRs, 209 drugs for 104 ADRs, and 257 drugs for 120 ADRs, respectively. Experts 1 and 2 assessed three and sixteen cases, respectively, as "not related to drug." Disagreement regarding the primary suspect drug was observed in eight cases. In three cases, all experts suspected different primary drugs. In the remaining five cases, experts 1 and 5, experts 2 and 5, and experts 1 and 2 agreed in two, two, and one case, respectively. All experts assigned "Likely" and "Plausible" causality more frequently for the primary suspect drug (65% by expert 1, 77.9% by expert 2, and 74.2% by expert 5). Expert 5 assigned "Certain" causality more frequently (4.2%) than the other two experts (2.5% by expert 1 and none by expert 2). For all suspect drugs, "Plausible," "Unassessable," and "Doubtful" associations were suggested frequently by experts. An "Unlikely" association was also suggested in a fair number of reports (Table 2).

CA by investigator using algorithms
The investigator suspected a total of 190 drugs for 120 ADRs. A "Possible" association with the primary suspect drug was most frequent using all algorithms except the Koh et al. scale (Table 3).

Agreement among different scales used by experts and investigator
Substantial agreement was observed among experts using the WHO-UMC scale (k=0.707), whereas none-to-slight agreement was observed among experts using VAS (k=0.117).
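Agreement among three experts rating the same ADRs is the situation Fleiss' kappa addresses. The following is a minimal sketch of the calculation together with a helper that attaches the verbal labels used in this paper. The cut-offs in `interpret_kappa` are a commonly cited set of bands and are an assumption here, since the text does not state which thresholds were applied. The function names and the example counts are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = raters who put item i in category j."""
    if not counts:
        raise ValueError("counts must contain at least one item")
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    # Per-item agreement: share of rater pairs that chose the same category.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement from the overall proportion of each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_cat)
    if p_e == 1.0:
        return 1.0  # all ratings fell in a single category
    return (p_bar - p_e) / (1 - p_e)


def interpret_kappa(k):
    """Verbal label for kappa (assumed cut-offs, see the lead-in)."""
    if k <= 0:
        return "no agreement"
    if k <= 0.20:
        return "none to slight"
    if k <= 0.40:
        return "fair"
    if k <= 0.60:
        return "moderate"
    if k <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical example: 4 ADRs, 3 experts, columns = possible/probable/certain.
example = [[3, 0, 0], [2, 1, 0], [0, 3, 0], [1, 1, 1]]
k = fleiss_kappa(example)
print(round(k, 3), interpret_kappa(k))
```

Applied to the values reported here, these bands label k=0.117 as "none to slight" and k=0.707 as "substantial", which matches the wording used in the text.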

Factors associated with disagreement among scales
Polypharmacy, serious ADRs, non-availability of laboratory data, and skin and subcutaneous tissue ADRs were found to be associated with disagreement among the Karch and Lasagna scale, the French method, and the WHO-UMC scale (Table 6).

DISCUSSION
CA is an integral part of pharmacovigilance. It is performed to identify important ADRs, to generate signals, and to evaluate the risk-benefit profile of drugs. Given its importance, an accurate CA is essential. While a number of tools are available for CA, few studies have evaluated their utility and agreement for ADRs occurring in indoor patients.
In the present study, the incidence of ADRs was 5.2%. A lower incidence (2.12%) was reported by Doshi et al. [22] in a study of intensive monitoring of ADRs in indoor patients of two medical units of the same hospital. Rajpara and Kanani [23] also reported a low incidence (0.58%) of ADRs in a study conducted in indoor patients of a tertiary care hospital in Vadodara, Gujarat. Differences in the prescribing pattern of drugs, as well as individual susceptibility, could have contributed to this discrepancy.
Patients >70 years of age were frequently affected by ADRs. However, the number of patients screened in this age group was substantially lower than in most other groups. Elderly patients are more prone to ADRs due to factors such as multidrug therapy and changes in the pharmacokinetics and pharmacodynamics of drugs [24]. However, Rajakannan et al. [25] reported a higher incidence of ADRs in the age group of 31-45 years (24.69%) compared to the age group of 61-75 years (20.19%) in a study conducted in South India. Further studies are recommended to evaluate the age-wise difference in the incidence of ADRs in different ethnic populations.
Infections (25.8%) were common in the study population. As a result, antimicrobials were frequently used and were the most common suspect drugs during CA. Gastrointestinal, respiratory, and cardiovascular disorders were also frequently observed. The underlying disorder can produce signs and/or symptoms similar to ADR which can influence CA.
The system organ classes most commonly affected by ADRs were skin and subcutaneous tissue disorders (30%), gastrointestinal system disorders (24.1%), and metabolic and nutritional disturbances (15.8%). In the study by Doshi et al. [22], GI ADRs (27%) were most frequent and cutaneous ADRs (25%) were the most common cause of hospitalization. CNS ADRs (25.3%) were more frequent in the study by Rajpara and Kanani [23], followed by GI ADRs (14.9%) and skin and subcutaneous tissue disorders (13.8%). Variation in the pattern of ADRs can be due to differences in the pattern of drug use and individual susceptibility.
Following antimicrobial agents, drugs acting on the renal system were frequently suspected by the experts and the investigator, since these drugs are often associated with metabolic and nutritional disorder ADRs. Other drug groups were less frequently suspected. Expert 5 demonstrated a tendency to suspect drugs acting on the GI system frequently, whereas experts using the WHO-UMC scale did not suspect these drugs at all. This type of variation is expected in the expert judgment method as it depends on the knowledge and experience of the assessor. Hence, inter-rater agreement is often poor.
A large variation was observed with regard to the total number of drugs suspected by experts using VAS. Increased communication among experts, with discussion of case safety reports, can be employed to overcome such discrepancies. The variation was smaller among experts using the WHO-UMC scale since these experts suspected only one or two drugs for a given ADR. However, this tendency poses a risk of missing a rare drug-event association.
Good agreement was observed with regard to the primary suspect drug in the majority of cases. However, experts using VAS showed disagreement in this regard in eight cases. Furthermore, experts 1 and 2 labeled some cases as "not related to drug" as opposed to other experts, reflecting the inter-rater variation. Perfect agreement was observed with regard to the primary suspect drug among experts using the WHO-UMC scale, which can partly be attributed to suspicion of fewer drugs.
Experts using VAS suggested "likely" and "plausible" associations between the primary suspect drug and the ADR most frequently. A variation, however, was observed with regard to the two extremes of causality, that is, "certain" and "excluded". While clinicians (experts 1 and 2) chose a "certain" relationship less frequently, the pharmacologist tended not to "exclude" a causal association with the suspect drug. Further studies are recommended to evaluate these tendencies in a larger number of experts. A fair number of "unassessable" and "doubtful" associations were also reported by experts. In the opinion of the authors, the numerical nature of the scale can be responsible for this finding, as experts often judge the probability of a causal association in terms of percentage and not in terms of causality category. However, this opinion needs further evaluation. With regard to other suspect drugs, a fair number of "plausible" and "unassessable" associations were suggested by all experts, suggesting their suspicion regarding a possible causal role of these drugs. "Doubtful" and "unlikely" associations derived by experts suggested a tendency not to rule out the possibility of drug causation.
CA by experts using the WHO-UMC scale was more uniform, as only three associations, "Possible," "Probable," and "Certain," were arrived at. This can be attributed to the fact that only the one or two most likely drugs were suspected by these experts. A "Possible" association was most frequent, which suggested that ADRs presented with confounding factors, lack of dechallenge/negative dechallenge, and/or multiple suspect drugs. Sharma et al. [26] also reported a high proportion (73%) of "Possible" associations in 200 ADRs assessed by three experts using the WHO-UMC scale in a study conducted in Maharashtra, India. In a fair number of cases (22-29%), other drugs were also suspected to have a "Possible" causal association with the ADR.
Inter-rater agreement among experts using VAS was poor, which suggests that this tool is not ideal for a panel of experts to determine the causality of ADRs in indoor patients. Arimone et al. [5] also reported none-to-slight agreement (k=0.2) among five experts using VAS for CA of 150 drug-event pairs in 30 ADRs. On the other hand, the substantial agreement observed between experts using the WHO-UMC scale suggests that the criteria included in the scale are applied in a similar manner by experts even from different professional backgrounds and can be useful for CA in similar settings. Sharma et al. [26] also reported an almost perfect agreement (k=0.89) among three raters using the WHO-UMC scale for CA of 200 ADRs in a study conducted in Maharashtra, India. Mouton et al. [27] also reported a substantial agreement (k=0.61) among four raters using the WHO-UMC scale for CA of 48 drug-event pairs.
CA by the investigator using algorithms revealed similar proportions of "Possible" and "Probable" associations with Naranjo's scale, the French method, and the Karch and Lasagna scale. Lack of dechallenge/negative dechallenge, confounding factors, and/or multiple suspect drugs contributed to frequent "Possible" associations (65-70%) with these scales. Few "Certain" associations were derived with Naranjo's scale and the Karch and Lasagna scale, and none with the French method, since the latter avoids a "Definite" association [11]. A "Probable" association was more frequently (57%) suggested with the Koh et al. scale due to the difference in its scoring system as compared to other algorithms.
CA using the French method and Karch and Lasagna scale demonstrated a substantial agreement higher than any other pair of algorithms. Furthermore, both scales demonstrated moderate agreement with Naranjo's scale, which suggests the utility of these scales in the validation of CA with Naranjo's scale in similar settings. A moderate agreement of these two algorithms with the WHO-UMC scale was also observed; however, the agreement was lower than that with Naranjo's scale.
Polypharmacy, serious ADRs, non-availability of laboratory data and skin and subcutaneous tissue ADRs were found to be associated with disagreement among the WHO-UMC scale, the French method, and Karch and Lasagna scale. These factors need consideration during reporting and at the time of CA to ensure an accurate assessment.