Negative Appendectomy Rate in Urban Referral Hospitals in Tanzania: A Cross-sectional Analysis of Associated Factors

Background: Acute appendicitis (AA) has a lifetime risk of 8.3% with a consequent 23% lifetime risk of emergency appendectomy. In atypical presentations, making a clinical diagnosis is difficult, leading to a high perforation rate (PR) or to misdiagnoses and high negative appendectomy rates (NAR). This study aimed to establish the NAR and explore the associated factors and possible attainable solutions to reduce it in urban referral hospitals in Tanzania. Methods: This was a cross-sectional study of 91 consecutive patients, aged 10 years and older, undergoing appendectomy for suspected AA with histological evaluation of specimens. The study was powered to detect the NAR at a 95% confidence level and 80% power. Results: The histological NAR was 38.5% and the perforation rate was 25.3%. The Alvarado score (AS) was rarely applied (6%), despite a demonstrated ability in this study to decrease the NAR by half. Females were four times more likely to undergo negative appendectomy than males. Conclusion: The NAR is clinically significant, as about two out of every five patients undergoing emergency appendectomy for suspected AA do not require the procedure. The AS is underutilized despite a demonstrated ability to decrease the NAR. We recommend that the AS be incorporated in the management of patients with suspected appendicitis.


Introduction
Acute appendicitis (AA) has a lifetime prevalence of between 6.7% and 8.6%, with a corresponding lifetime risk of emergency appendectomy of 12.0% to 23.1% (1). Despite its frequent occurrence, making a correct clinical diagnosis is often difficult when the presentation is atypical. Delay in diagnosis leads to perforation, while misdiagnosis results in unnecessary appendectomy (2,3). A low negative appendectomy rate (NAR) has traditionally been interpreted as being associated with missed early AA and, consequently, progression to perforation.
By contrast, a high NAR, while reducing the risk of missed early AA, commonly subjects patients to unnecessary surgery (4). While this trade-off still prevails in resource-limited health services, imaging technologies available in highly resourced health services can reduce the NAR without increasing the perforation rate (5,6). A high NAR leads to unnecessary surgical intervention with its associated risk of morbidity, economic burden, and the potential adverse consequences of unnecessary anesthesia (7-11).
The precision of the diagnosis of AA is a major determinant of the NAR. This precision can be increased by the use of medical imaging, clinical scoring systems, and laparoscopy. Diagnostic scoring systems such as the Alvarado score (AS) have parameters that correlate positively with the diagnosis of AA (12). With the AS, the most established scoring system, a score of less than 5 has been endorsed as having enough sensitivity to virtually rule out AA (13). Medical imaging displays the appendix and the associated features of inflammation during AA. Ultrasound for suspected AA yields an overall NAR of about 4.9% to 9.7% (14). Use of computed tomography (CT) results in a NAR of 2.5% to 8.5% (15). In Sub-Saharan Africa, AA is associated with significant, potentially avoidable morbidity and mortality. This is due to prehospital and in-hospital delays caused predominantly by limited human resources, infrastructure, and diagnostic capacity (16). Access to laparoscopy and magnetic resonance imaging is limited in this setting. This situation is hypothesized to adversely affect the NAR, which ranges from 17% to 33.1% (17,18). This study was undertaken to establish the baseline NAR and to explore associated factors and possible attainable solutions to reduce it in urban referral hospitals in Tanzania. Furthermore, these parameters could serve as measures of performance and as evaluation parameters for future interventions aimed at improving AA case management in this region.

Methods
This was a cross-sectional analytical study conducted in four urban referral hospitals in Dar es Salaam City, Tanzania, from May 2018 to April 2019. Three hospitals were public district referral hospitals with fully equipped laboratories and radiology services offering ultrasound; CT was not available at these sites. The fourth hospital was a private referral hospital with CT services in addition to the diagnostic capacity of the public hospitals. Patients above the age of 10 years who underwent appendectomy or emergency laparotomy for suspected AA were included. Pregnant women, those who had alternative diagnoses intraoperatively, and those who underwent incidental appendectomy were excluded. We applied a finite population correction based on a population of 120, the total number of appendectomy procedures with the outcome of interest expected during the study period. Based on a 95% confidence level and power of 80%, using the 33% NAR and a 5% precision level, the minimum sample size required was 89 (18). To allow for attrition and lost data, a sample size of 95 was targeted. Appendectomy specimens were collected with corresponding data abstraction tools. All appendix specimens collected were analyzed histologically by a consultant anatomical pathologist. Standard quality assurance processes of the pathology laboratory mandated random 10% confirmation by a second consultant pathologist. We collected information on patient demographics, lag time (defined as the duration in days from onset of illness until appendectomy), signs and symptoms during the illness, and the white blood cell count with differentials. AS use, the score assigned, medical imaging use, and operative findings were also recorded. The main outcome was the histological diagnosis of the appendix.
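The sample size figure above can be reproduced with a short calculation. The sketch below assumes the standard single-proportion precision formula followed by a finite population correction; the text does not state the exact formula used, and the function name is hypothetical. Note that the 80% power quoted in the text does not enter a precision-based formula of this kind.

```python
import math
from statistics import NormalDist


def sample_size_proportion(p, precision, confidence=0.95, population=None):
    """Minimum n to estimate a proportion p to within +/- precision.

    Assumed formula (not confirmed by the source): n0 = z^2 * p * (1 - p) / d^2,
    then, if a finite population N is given, n = n0 / (1 + (n0 - 1) / N).
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z ** 2 * p * (1 - p) / precision ** 2
    if population is not None:
        # Finite population correction.
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


# Inputs taken from the text: expected NAR 33%, precision 5%, population 120.
print(sample_size_proportion(p=0.33, precision=0.05, population=120))  # 89
```

Under these assumptions the calculation returns 89, the minimum sample size reported in the text.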
AA was defined histologically as the transmural presence of acute inflammatory cells, and negative appendectomy was defined as the absence of transmural inflammatory cells. The NAR was determined as the ratio of histologically negative appendices to the total number of appendectomy specimens. Descriptive statistics such as proportions, means, medians, ranges, and standard deviations were calculated. Continuous variables were tested for normality using the Shapiro-Wilk test, and proportions were compared by chi-square (χ²) and Fisher's exact tests. We calculated the AS for all patients from the collected data. Each parameter used to make a radiological diagnosis of AA on CT of the abdomen was given a score of 1 when present. The parameters for CT were appendix diameter >7 mm, free fluid in the right iliac fossa (RIF), fat stranding, and the presence of an appendicolith. As the score increased, the likelihood of AA increased. In a similar manner, the ultrasound features used to create the ultrasound score were RIF fluid, appendix diameter >7 mm, and the presence of an appendicolith. These scores were evaluated for association with NAR. Group means for normally distributed variables were compared by Student's t test, whereas medians of non-normally distributed variables were compared by non-parametric tests (Mann-Whitney U and Kruskal-Wallis). Regression analyses were used to identify and quantify predictors of negative appendectomy; p≤0.05 was considered statistically significant. The study did not interfere with patient care and management decisions. Participants were not placed at additional risk during participation in the study. Permission to conduct the study was sought from the Aga Khan University Educational Research Board (reference number AKU/2017/245/fb) and from the respective hospitals' ethical committees.
Consent was sought from participants, as was a material management agreement for transporting, examining, and archiving the collected appendiceal specimens. Collected data were archived by the AKU.
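The scoring steps described in the Methods can be sketched as follows. The CT and ultrasound scores follow the text directly (one point per feature present). The Alvarado weights are the standard published ones; the field names, and the way each finding was coded as present or absent, are assumptions for illustration rather than the study's actual data abstraction tool.

```python
# Standard Alvarado weights (maximum total 10). Keys are hypothetical field names.
ALVARADO_WEIGHTS = {
    "migratory_pain": 1,
    "anorexia": 1,
    "nausea_vomiting": 1,
    "rif_tenderness": 2,
    "rebound_tenderness": 1,
    "elevated_temperature": 1,
    "leukocytosis": 2,
    "left_shift": 1,
}

# Imaging features named in the Methods; each scores 1 when present.
CT_FEATURES = ("diameter_gt_7mm", "rif_free_fluid", "fat_stranding", "appendicolith")
US_FEATURES = ("rif_free_fluid", "diameter_gt_7mm", "appendicolith")


def alvarado_score(findings):
    """Sum the weights of the Alvarado items recorded as present."""
    return sum(w for item, w in ALVARADO_WEIGHTS.items() if findings.get(item, False))


def imaging_score(findings, features):
    """Count how many of the listed imaging features are present."""
    return sum(1 for item in features if findings.get(item, False))


# Hypothetical patient record, for illustration only.
patient = {
    "rif_tenderness": True,
    "nausea_vomiting": True,
    "diameter_gt_7mm": True,
    "fat_stranding": True,
}

score = alvarado_score(patient)
print("Alvarado score:", score, "| below 5:", score < 5)
print("CT score:", imaging_score(patient, CT_FEATURES))
print("Ultrasound score:", imaging_score(patient, US_FEATURES))
```

Missing fields are treated as absent here, which is a simplification; a real analysis would need to handle missing data explicitly.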

Results
Ninety-two eligible candidates underwent appendectomy during the study period. One patient was excluded following an incidental appendectomy due to findings of uterine fibroid disease. The total number of participants analyzed was 91. Table 1 summarizes the characteristics of the participants. The physicians who evaluated the participants were predominantly medical officers (83.5%). In one center, the medical officers made the decisions semi-independently, in consultation with their on-call consultants. Sixty-one patients were evaluated at the private health facility and 30 at the public facilities. Full blood count was not conducted in two participants. Ultrasound examination was conducted 32 times and CT 61 times to evaluate AA. Sonographers conducted 53% and medical radiologists 47% of the ultrasound evaluations. Two participants did not undergo either imaging modality, and four participants underwent ultrasound followed by CT. Surgical access was most commonly through McBurney's incision [69% (63/91)]; laparoscopy was not used. Reactive free fluid in the RIF was encountered on gross appearance in 95% of the procedures. The appendix was grossly perforated in 19.8% and appeared grossly uninflamed in 11% of the cases. After histological analysis, the NAR was 38.5% (35/91), the perforation rate was 25.3% (23/91), and noncomplicated appendicitis accounted for 36.3% (33/91). Appendicular carcinoma was not encountered. There were two cases of eosinophilic infiltration, one case of schistosomiasis, and one case of enterobiasis of the appendix, each inciting a limited inflammatory response that did not meet the histopathological definition of AA. There was also one case of foreign body reaction and one case of inflammatory cells confined to the serosa without evidence of mucosal inflammation. Males had a NAR of 28.1% (16/57) and females of 55.9% (19/34); this difference was statistically significant (χ²=6.960, p=0.008).
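The comparison of NAR by sex can be checked from the counts reported above. The sketch below computes the Pearson chi-square statistic for a 2x2 table without continuity correction, which is an assumption about how the reported value was obtained; the p-value uses the closed form for one degree of freedom.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p-value on 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts from the Results: males 16 negative of 57, females 19 negative of 34.
male_neg, male_aa = 16, 57 - 16
female_neg, female_aa = 19, 34 - 19

chi2, p_value = chi_square_2x2(male_neg, male_aa, female_neg, female_aa)
print(f"chi-square = {chi2:.3f}, p = {p_value:.3f}")
```

With these counts the uncorrected statistic agrees with the reported χ² of 6.960 and p of 0.008.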
On binary logistic regression, there was no statistically significant association between NAR and duration of illness. The presence of RIF rebound tenderness was independently negatively associated with NAR (χ²=4.242, p=0.039). Other clinical findings were not associated with the histological outcomes of appendectomy. Those with negative appendectomy had a lower leukocyte count than those with AA; similarly, the absolute neutrophil count was higher in those with AA than in those with negative appendectomy. Both differences were statistically significant on the Mann-Whitney U test. Table 2 summarizes the clinical and laboratory findings. The Alvarado score was determined by the treating clinicians in only 6% of the cases. We therefore computed a calculated AS from the collected participants' data. The mean calculated AS in those with negative appendectomy was lower than in those with AA; this difference was statistically significant on the Mann-Whitney U test (z=-3.864, p<0.001). Half of those with negative appendectomy had a calculated AS of less than 5, compared with one quarter of those with AA.

The difference was statistically significant. Negative appendectomy was not associated with ultrasound use (p>0.05), ultrasound score (p>0.05), or level of training of the ultrasound operator (p>0.05). The CT abdomen diagnosis had a statistically significant association with the outcome of appendectomy (χ²=9.531, p=0.009). Those with AA had a higher mean CT score than those with negative appendectomy. A binary logistic regression analysis assessed factors associated with negative appendectomy. The factors entered in the model were sex of the participant, calculated AS of less than 5, leukocyte count, and CT score. The intercept of the model was statistically significant at p<0.05, with an odds ratio of 16,358. Of these factors, sex of the participant, leukocyte count, and CT score were shown to have a statistically significant association with NAR. The model predicted that females are four times as likely as males to have a negative appendectomy, with a 95% confidence interval (CI) of 0.938 to 16.12.
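For orientation, a crude (unadjusted) odds ratio for sex can be computed directly from the counts reported in the Results. This is a minimal sketch and not the adjusted regression model described above: it ignores the other covariates, and the Woolf (log-based) confidence interval is an assumed method, so its output is not expected to match the model's estimate.

```python
import math
from statistics import NormalDist


def crude_odds_ratio(exposed_events, exposed_non, unexposed_events, unexposed_non,
                     confidence=0.95):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval."""
    odds_ratio = (exposed_events * unexposed_non) / (exposed_non * unexposed_events)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / exposed_events + 1 / exposed_non
                       + 1 / unexposed_events + 1 / unexposed_non)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Females: 19 negative, 15 AA. Males: 16 negative, 41 AA (from the Results).
or_sex, ci_low, ci_high = crude_odds_ratio(19, 15, 16, 41)
print(f"crude OR = {or_sex:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

Because the regression adjusts for AS, leukocyte count, and CT score, differences between this crude figure and the adjusted odds ratio are to be expected.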

Discussion
The NAR in this study was 38.5% despite the use of medical imaging. This high NAR is a concern, as more than a third of patients undergoing emergency appendectomy for suspected AA did not require the procedure. This finding is in sharp contrast to the described NAR of less than 5% with the use of clinical decision rules and diagnostic imaging (14,15,19). Clinical decision rules were rarely used in our setting; the diagnostic accuracies of the imaging investigations that were more commonly used in our setting are unknown and, in view of our findings, are hypothesized to be lower than those cited elsewhere. These differences are possible contributors to the observed findings. Female sex was statistically significantly associated with NAR, with females constituting 54% of those with negative appendectomy. This result is similar to that of a study by Tseng et al., in which females contributed 62% of the negative appendectomies (15). Other authors found females to account for 30-50% of their NAR (3). In our study, it was further shown that females were about four times more likely to have a negative appendectomy than males. This is likely because gynecological disease processes, which do not occur in males, may present as AA. The AS was used in only 6% of our participants, despite strong recommendations for its use in multiple international guidelines and from studies in the region (13,20). Nineteen participants who had negative appendectomy also had an AS of less than 5, and had these participants not undergone appendectomy our NAR would have been 17% (16/91). Ultrasound use and experience of the radiologist did not have a statistically significant association with NAR. This is in contrast to findings by other authors, which reaffirm that the sole use of ultrasound can decrease the NAR to about 10% (14,15,21).

NYAMURYEKUNG'E ET AL.
Ultrasound use is associated with inherent subjectivity, hence it is hypothesized that the radiologist's expertise has an impact on the accuracy of investigations (22).
In studies citing the role of ultrasound in outcomes of appendicitis, most investigations were conducted and interpreted by medical radiologists and consultants, in contrast to our study, in which sonographers conducted just over half of the examinations (15,21). The association between the experience of the radiologist and NAR was possibly not evident in our study because we did not have sufficient power to detect this difference.
CT scans were shown to be useful in decreasing NAR and diagnosing AA (χ²=9.531, p=0.009). The effect size was moderate: the NAR was 32.8% among those who underwent CT. The ability of CT to decrease the NAR has been well established. Use of CT scans is associated with a NAR of 2.7-8.7% (14,15,19,23). Despite the use of CT scanning in our study, the NAR in those who underwent this modality was still high. It is likely that the diagnostic accuracy in our setting is not similar to that described in the literature (24).

Conclusions and recommendations
The NAR is clinically significant, as about two out of every five patients undergoing emergency appendectomy for suspected AA do not require the procedure. The AS is underutilized despite a demonstrated ability to decrease the NAR.
We strongly recommend the uniform use of the AS in patients with suspected AA. This is expected to reduce our NAR substantially. Implementation science research studies are recommended to provide solutions to curb the high NAR in our setting.

Authors' contributions
Masawa Klint, first author and primary researcher; Athar Ali, primary and content supervisor; Miten Patel, contributed to data collection; Aidan Njau, co-supervisor; Omar Sherman, reviewed the histology; Ahmed Jusabani, statistician and methodology supervisor; Ali Akbar Zehri, co-supervisor and reviewed the manuscript.