Research Review By Dr. Jeff Muir©


Date Posted:

May 2014

Study Title:

McKenzie Lumbar Classification: Inter-rater agreement by physical therapists with different levels of formal McKenzie postgraduate training


Authors:

Werneke MW, Deutscher D, Hart DL, et al.

Authors' Affiliations:

CentraState Medical Center, Freehold, NJ; Physical Therapy Service, Maccabi Healthcare Services, Tel Aviv, Israel; Focus On Therapeutic Outcomes, Inc., White Stone, VA; School of Rehabilitation Science, Institute of Applied Sciences and Department of Clinical Epidemiology and Biostatistics, McMaster University, Ontario, Canada; Team Care Physical Therapy, Oxford, NC; Physical Therapy Department, St David’s Hospital, Austin, TX; Department of Health Services, Policy and Practice, Brown University, Providence, RI.

Publication Information:

Spine 2014; 39: E182–E190.

Background Information:

Accurate classification of patients with non-specific low back pain is important in directing treatment and improving outcomes. Classification systems, however, need to achieve a minimum level of chance-corrected inter-rater agreement before they are accepted for widespread clinical use (1). One common system for low back pain is the McKenzie classification system (2), which assigns patients to one of three main syndromes: derangement, dysfunction and posture (2). Classification is based primarily on the subjective history and on an objective physical examination that uses repeated end-range lumbar movements and manual or static positioning techniques. EDITOR’S NOTE: we have reviewed the McKenzie classification categories in prior reviews – please see Related Reviews below.

McKenzie classification is commonly used by physical therapists and other manual medicine providers to evaluate, clinically diagnose, and manage patients with lumbar impairments (3-6). Practitioners achieve credentialing through postgraduate training, progressing through 4 levels, with Level D representing the highest level of certification. Despite the system's popularity, inter-rater reliability among McKenzie practitioners (that is, whether practitioners consistently agree with each other) has not been well researched. A small number of studies (7-11) have shown adequate inter-rater reliability, although the methodology of these studies has been called into question.

The purpose of the current study was to examine the association between therapist level of McKenzie postgraduate training and agreement of McKenzie syndrome classification variables for patients with low back pain.

Pertinent Results:

Inter-rater reliability among all therapists evaluated, including those who had completed the highest level of McKenzie credentialing, was greater than expected by chance alone but fell below acceptable agreement for every classification variable assessed.

A Kappa of 0.60 was determined a priori to represent suitable inter-rater agreement. None of the paired therapists approached this level of inter-rater agreement, regardless of their level of McKenzie certification. Therapists who had achieved Level D demonstrated Kappa values ranging from 0.11 to 0.43. Level A, B and C therapists similarly demonstrated varying levels of agreement, although none exceeded 0.44 for any diagnostic category used by the McKenzie system.

The authors propose that previous studies that demonstrated suitable agreement between practitioners were flawed in their methodology. They suggest that the use of separate subjective and objective examinations in the current study was superior to the use of simultaneous examinations in previous studies. Furthermore, the multicentre approach and greater number of participating therapists provide a more comprehensive analysis and thus more reliable results.

Clinical Application & Conclusions:

Inter-rater agreement for the main McKenzie syndromes and related classification variables (presence of lateral shift, derangement reducibility, directional preference, and centralization) in patients with low back pain did not reach an acceptable level among therapists at any level of McKenzie certification.

The findings call into question the clinical utility of the classification system for therapists with these levels of training. Future research is needed to determine if alterations to the educational paradigms are required to improve inter-rater reliability.

Study Methods:

Study Design:
This was a prospective study involving 47 physical therapists trained in McKenzie (Mechanical Diagnosis and Therapy – MDT) techniques, designed to determine whether they could agree on the main McKenzie syndromes (derangement, dysfunction, posture and other).

Adult patients (mean age, 51 [SD=15]) seeking rehabilitation for low back pain with or without referred lower extremity symptoms were asked to participate in the study and sign a consent form. Subjects were recruited from 25 clinics throughout Israel, covering all 5 nationally defined geographic districts. Patients were eligible to participate in this study if they:
  • were fluent in Hebrew,
  • were not pregnant,
  • had not had spinal or hip surgery within the past year, and
  • were not involved in workers' compensation or car insurance litigation.

All examiners were physical therapists who signed a consent form agreeing to participate and follow study procedures. Participating therapists (n = 47) had an average of 14 years (SD = 6; range, 5–32) of experience treating patients with low back pain. Their average age was 43 (SD = 8; range, 30–65), 76% were female, all had a bachelor’s degree in physical therapy, 17% had also obtained a master’s degree and none had a doctoral degree.

Patient Examinations:
Patient examination followed the recommended guidelines and the Standards for Reporting Diagnostic Accuracy criteria for improving the design of agreement studies. Each pair of raters was instructed to perform independent, consecutive evaluations of 25 to 30 patients. Each patient was evaluated independently by both examiners during a single clinic visit. The order of the paired examiners was alternated so that each therapist served as the first examiner for 50% of the patients.

Data Analysis:
Inter-rater agreement for pairs of physical therapist raters was calculated using generalized Kappa values. Kappa values of 0.60 to 0.79, 0.80 to 0.90, and greater than 0.90 are interpreted as moderate, strong, and almost perfect agreement, respectively (12). For this study, a Kappa value of 0.60 was set a priori as the minimum acceptable level of agreement.
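
For readers unfamiliar with the statistic, the following is a minimal sketch of how chance-corrected agreement can be computed for one pair of raters. It uses the simple two-rater (Cohen's) form of Kappa rather than the generalized Kappa reported by the authors. The function name, the variable names and the ten patient classifications are hypothetical, invented purely for illustration. They are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters who classified the same patients."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: share of patients given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's own label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical classifications for 10 patients (illustration only, not study data).
therapist_1 = ["derangement"] * 6 + ["dysfunction", "dysfunction", "posture", "other"]
therapist_2 = ["derangement"] * 4 + ["dysfunction", "other", "dysfunction",
                                     "derangement", "posture", "derangement"]

ACCEPTABLE_KAPPA = 0.60  # threshold the study set a priori

kappa = cohen_kappa(therapist_1, therapist_2)
print(round(kappa, 2))            # 0.31
print(kappa >= ACCEPTABLE_KAPPA)  # False
```

In this invented example the two therapists agree on 6 of 10 patients (60% raw agreement), yet 42% agreement would be expected by chance given how often each therapist used each label, so Kappa is only about 0.31. This shows how raw agreement can look reasonable while chance-corrected agreement remains well below the 0.60 threshold.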

Study Strengths / Weaknesses:

Strengths:
  • A large pool of therapists at multiple sites was utilized.
  • Subjective and objective evaluations were performed independently and separately.
  • Therapists with all levels of McKenzie certification were compared.

Weaknesses:
  • Not all participating therapists were able to examine the minimum number of patients.
  • Seventeen therapists were unable to complete all study stages.
  • The use of back-to-back examinations has been suggested to contribute to aggravation of symptoms, which may affect diagnosis.

Additional References:

  1. Viera AJ, Garrett JM. Understanding interobserver agreement: the kappa statistic. Fam Med 2005; 37: 360–63.
  2. McKenzie R, May S. The Lumbar Spine: Mechanical Diagnosis and Therapy. 2nd ed. Waikanae, New Zealand: Spinal Publication Ltd; 2003.
  3. Battie MC, Cherkin DC, Dunn R, et al. Managing low back pain: attitudes and treatment preferences of physical therapists. Phys Ther 1994; 74: 219–26.
  4. Byrne K, Doody C, Hurley DA. Exercise therapy for low back pain: a small-scale exploratory survey of current physiotherapy practice in the Republic of Ireland acute hospital setting. Man Ther 2006; 11: 272–8.
  5. Foster NE, Thompson KA, Baxter GD, et al. Management of nonspecific low back pain by physiotherapists in Britain and Ireland. A descriptive questionnaire of current clinical practice. Spine 1999; 24: 1332–42.
  6. Gracey JH, McDonough SM, Baxter GD. Physiotherapy management of low back pain: a survey of current practice in Northern Ireland. Spine 2002; 27: 406–11.
  7. Riddle DL, Rothstein JM. Intertester reliability of McKenzie’s classifications of the syndrome types present in patients with low back pain. Spine 1993; 18: 1333–44.
  8. Kilby J, Stigant M, Roberts A. The reliability of back pain assessment by physiotherapists, using a McKenzie Algorithm. Physiotherapy 1990; 76: 579–83.
  9. Razmjou H, Kramer JF, Yamada R. Intertester reliability of the McKenzie evaluation in assessing patients with mechanical low back pain. J Orthop Sports Phys Ther 2000; 30: 368–83; discussion 384–9.
  10. Kilpikoski S, Airaksinen O, Kankaanpaa M, et al. Interexaminer reliability of low back pain assessment using the McKenzie method. Spine 2002; 27: E207–14.
  11. Clare HA, Adams R, Maher CG. Reliability of McKenzie classification of patients with cervical or lumbar pain. J Manipulative Physiol Ther 2005; 28: 122–7.
  12. McHugh ML. Inter-rater reliability: the kappa statistic. Biochem Med (Zagreb) 2012; 22: 276–82.