Evaluation of the Inter- and Intra-Observer Reliability of the AO Classification of Intertrochanteric Fractures and the Choice of Fixation Device (DHS, PFNA, and DCS)

Background: The Arbeitsgemeinschaft für Osteosynthesefragen (AO) classification is the most frequently used tool for classifying intertrochanteric fractures. However, there is limited evidence regarding its reliability. This study was therefore designed to evaluate the inter-observer and intra-observer reliability of the AO-2018 intertrochanteric fracture classification.
Methods: A retrospective study was conducted at Imam Khomeini Hospital Complex on the radiographs of patients who presented with intertrochanteric fractures from March 21, 2018, to March 19, 2019. Four orthopedic trauma surgeons assessed 96 anteroposterior pelvic radiographs of intertrochanteric fractures and classified them using the 2018 AO intertrochanteric fracture classification. The radiographs were read on 2 separate occasions, 1 month apart. Inter-observer and intra-observer reliability were assessed using kappa statistics.
Results: Neither the mean inter-observer agreement (K = 0.322; 95% CI: 0.321–0.323) nor the mean intra-observer agreement (K = 0.317; 95% CI: 0.314–0.320) was satisfactory for the AO subgroups. For the AO main groups, the inter-observer (K = 0.61; 95% CI: 0.608–0.611) and intra-observer (K = 0.560; 95% CI: 0.544–0.566) reliability showed moderate agreement.
Conclusion: The AO classification does not show adequate inter-observer and intra-observer reliability and reproducibility. It will therefore be difficult to base treatment protocols on the AO classification.


INTRODUCTION
An intertrochanteric fracture is a fracture that occurs in the region between the greater and lesser trochanters of the proximal femur. It is extracapsular, so the vascularity of the femoral head is rarely affected (1). Intertrochanteric fractures account for about 50% of hip fractures and are usually caused by low-energy mechanisms such as falls (2). They can occur in both the elderly and the young, but they are more common in elderly patients with osteoporosis (3). Overall, 6 million hip fractures are estimated to occur by 2050 (4).
Most patients present with an inability to bear weight and a painful, shortened, and externally rotated lower limb (5). Standard X-rays of the pelvis and femur can be used to evaluate and diagnose intertrochanteric fractures. The radiological findings also help to measure the width of the medullary cavity and to assess the diaphyseal morphology. Thus, adequate radiological evaluation is required to understand the fracture type and for preoperative planning (6).
The primary goals of intertrochanteric fracture treatment are early mobilization and avoidance of secondary complications, which can be achieved by appropriate reduction and fixation with one of several fixation devices (7). A number of fixation devices are available, each with its own indications, advantages, and disadvantages. The selection of the device depends on the type of fracture. However, no implant fully satisfies all fixation requirements of intertrochanteric fractures (8). Implant selection and placement are important factors that can determine and predict fixation failure (9). Identifying atypical fractures or unstable fracture patterns is important for fracture management (10). A classification system should be valid and reliable and should have prognostic value that can help in planning treatment protocols (11).
Because the AO classification is the most commonly used classification and serves as the basis for protocols for choosing appropriate fixation devices, assessing its reliability is worthwhile for optimizing treatment outcomes. Therefore, this study aimed to evaluate the inter-observer and intra-observer reliability of the AO classification system in intertrochanteric fractures.

METHODS
This retrospective study was conducted at Imam Khomeini Hospital Complex, Tehran, Iran. All adult patients (age ≥ 18 years) with new intertrochanteric fractures who were admitted to the hospital from March 21, 2018, to March 19, 2019, were included. Patients with pathological fractures, periprosthetic fractures, and subtrochanteric and neck fractures were excluded. Initially, the radiographs of 136 patients were identified, of which 96 met our inclusion criteria and were enrolled in this study. Four orthopedic trauma surgeons read the radiographs and provided their classifications twice, with a one-month interval between readings.
Data sources and collection: The health information system (HIS) of Imam Khomeini Hospital Complex was used to identify patients with an intertrochanteric fracture in the data collection period. The demographic characteristics of the patients, such as age and sex, and the radiologic images (X-rays) were obtained from the HIS. The X-rays were matched and coded with the structured questionnaires. Four experienced orthopedic trauma surgeons, who independently perform an average of 4 to 7 intertrochanteric fracture fixations per month and who had 3 to 10 years of experience, reviewed the radiographs and suggested the type of fixation device. The radiographs were reviewed at 2 different times with a one-month interval between the readings. The reviewers were not told that there would be a second reading. In the first round, the observers classified each fracture according to the AO intertrochanteric fracture classification-2018 and suggested their choice of treatment. One month later, the observers were provided with the same set of radiographs, rearranged in a different order, along with the same questionnaires and a chart of the AO intertrochanteric fracture classification-2018, and were asked to classify each fracture and select the fixation device of choice.
Data analysis: Kappa statistics were calculated with SPSS version 24 to assess inter-observer and intra-observer reliability. Inter-observer agreement was evaluated by comparing the responses of the 4 observers in each of the 2 readings, while intra-observer reliability was evaluated by comparing each observer's readings on the 2 occasions. The kappa value ranges from −1.0 (complete disagreement) through 0 (chance agreement) to 1.0 (complete agreement). The strength of agreement indicated by the kappa values was interpreted using the criteria of Landis and Koch (12), who classify the level of agreement into six groups: perfect agreement (K ≥ 0.80), substantial agreement (K = 0.61-0.80), moderate agreement (K = 0.41-0.61), fair agreement (K = 0.21-0.41), slight agreement (K = 0-0.21) and poor agreement (K < 0). The level of significance was set at a P-value < 0.05.
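As an illustration of the kappa statistic described above, the following is a minimal Python sketch, not the SPSS procedure used in this study. It computes Cohen's kappa for two sets of ratings, averages it over all pairs of observers, and maps a kappa value to an agreement level using the cut-offs quoted above. Averaging pairwise kappas is an assumption about how a mean inter-observer value could be obtained. The function names and the example ratings are hypothetical. The quoted bands share their boundary values, so the sketch assigns a boundary value to the lower band, which is also an assumption.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and equally long")
    n = len(ratings_a)
    # Observed agreement: share of radiographs given the same label.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


def mean_pairwise_kappa(ratings_by_observer):
    """Average Cohen's kappa over every pair of observers (assumed approach)."""
    pairs = list(combinations(ratings_by_observer, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


def agreement_level(kappa):
    """Label a kappa value with the cut-offs quoted in the text.

    Boundary values (0.21, 0.41, 0.61, 0.80) go to the lower band; this
    tie-breaking rule is an assumption of the sketch.
    """
    if kappa < 0:
        return "poor"
    if kappa <= 0.21:
        return "slight"
    if kappa <= 0.41:
        return "fair"
    if kappa <= 0.61:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "perfect"


# Hypothetical ratings of five radiographs by two observers.
first_reader = ["A1", "A1", "A2", "A2", "A3"]
second_reader = ["A1", "A2", "A2", "A2", "A3"]
kappa = cohen_kappa(first_reader, second_reader)
print(round(kappa, 4), agreement_level(kappa))
```

For intra-observer reliability, the same `cohen_kappa` function can be applied to one observer's first and second readings of the same radiographs.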

DISCUSSION
In this study, four attending orthopedic trauma surgeons with different levels of experience in intertrochanteric fracture management evaluated the reliability of the AO 2018 intertrochanteric classification. Classification of intertrochanteric fractures serves as a guideline for treatment and helps to predict the result (13) or provides a reasonable estimation of the likely outcome (14). The reliability of a fracture classification therefore depends on the inter-observer and intra-observer agreement. A low level of agreement among observers, or between readings by the same observer, can limit the use of a classification system in decision making (15). If the preoperative classification is not correct, its prognostic usefulness will also be limited (14). However, there is limited evidence on the reliability of fracture classification using the AO-2018 criteria in the study area. This study was therefore intended to determine the inter-observer and intra-observer agreement of the AO-2018 intertrochanteric fracture classification.
In this study, the inter-observer reliability of the AO intertrochanteric fracture classification was fair for the subgroups at both the first and second observations, and moderate for the main groups at both observations. The inter-observer agreement on the choice of fixation device was also moderate at both observations. The intra-observer agreement for the subgroups and main groups was lower than the inter-observer agreement: it was fair for the subgroups and moderate for the main groups.
A previous study by Schipper et al (16), which used the AO classification system to classify trochanteric fractures on 20 X-rays, reported a mean intra-observer kappa value of 0.48 and inter-observer kappa values of 0.33 and 0.34 for the subgroups. For the main groups, the intra-observer kappa value was 0.78, while the inter-observer kappa values were 0.67 and 0.63. These findings are in agreement with our results. However, in contrast to the above study (15), the intra-observer agreement in our study was slightly lower than the inter-observer agreement. In addition, our study evaluated the agreement among observers on the choice of fixation device, which showed a moderate level of agreement.
A study by Pervez et al (13), in which 88 sets of radiographs were assessed using the AO classification and the Jensen modification of the Evans classification, found mean intra-observer agreement of K = 0.42 for the subgroups and K = 0.72 for the main groups. Similarly, mean inter-observer agreement was K = 0.33 for the subgroups and K = 0.62 for the main groups. Moreover, a study by De Boeck (17) also found the AO classification unreliable. Our results are in agreement with these studies, as we did not find adequate reliability.
The study by Newey et al (18) found that the AO intertrochanteric fracture classification system is unnecessarily complicated and falls short of playing a useful role in the management of intertrochanteric fractures. Since a classification system is intended to indicate the nature of the injury and provide a rationale for treatment (18), and most orthopedic surgeons use this classification to choose appropriate fixation devices, there is a need for modified criteria or a modified classification system that can help surgeons make appropriate clinical decisions. The main limitation of this study was the use of X-rays that were not equally standardized.
In conclusion, the AO intertrochanteric classification did not show adequate inter-observer and intra-observer reliability and reproducibility in this study. Based on these findings and those of other studies, the AO intertrochanteric classification may not reliably support treatment selection protocols. Finally, it is advisable to have one additional fixation device available as a backup during the operation (DHS plus PFNA, or DHS plus DCS), because, given the above results, the fracture pattern encountered during the operation may differ from the one classified on the radiograph.