How Reliable Are Structured Psychiatric Interviews?

June 15, 2026 by Chris Aiken, MD
Structured diagnostic interviews are the gold standard, but their reliability varies widely by diagnosis

STUDY TYPE: Systematic review and meta-analysis
FUNDING: Independent

Background

Structured diagnostic interviews include the Structured Clinical Interview for DSM (SCID) and the Mini-International Neuropsychiatric Interview (MINI). They are the gold standard for diagnosis, and nearly all psychopharmacology research is based on these tools. This meta-analysis asked how reliable they are when the same patient is interviewed twice.

The Study
  • 57 studies (46 included in the meta-analysis) covering 8,146 adults assessed with 17 different structured interviews.
  • Each study had patients complete the same interview twice, with a different interviewer each time, and measured test-retest reliability with Cohen’s kappa (κ), where 0 = chance agreement and 1 = perfect agreement.
  • Studies spanned 26 countries and four decades of diagnostic criteria, from DSM-III through DSM-5 and ICD-10.
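
For readers who want to see how κ is computed, here is a minimal Python sketch of the standard Cohen's kappa formula: observed agreement minus chance agreement, divided by one minus chance agreement. The function name and the ten-patient example are hypothetical illustrations, not data or code from the meta-analysis.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two interviewers rating the same patients.

    ratings_a and ratings_b are equal-length lists of category labels
    (for example True/False for "diagnosis present").
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)

    # Observed agreement: share of patients given the same label twice.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: expected overlap given each interviewer's base rates.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical example: 10 patients, diagnosis present (True) or absent (False).
first_interview = [True, True, True, True, False, False, False, False, False, False]
second_interview = [True, True, True, False, False, False, False, False, False, True]

kappa = cohens_kappa(first_interview, second_interview)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the two interviewers agree on 8 of 10 patients (80%), but because about half of that agreement would be expected by chance, κ comes out well below 0.80. That gap between raw agreement and κ is why reliability studies report κ rather than percent agreement.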

Results

Overall test-retest reliability was κ = 0.69, which falls in the “substantial” range, but it varied enormously across disorders.

Substance use disorders scored higher (κ = 0.72) than mental disorders (κ = 0.65). Among mental disorders, bipolar disorder had the best reliability (κ = 0.74) and nonaffective psychoses the worst (κ = 0.55). Among substance use disorders, opioid use disorder topped the list (κ = 0.81) and hallucinogen use disorder came in lowest (κ = 0.59).
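
To put these numbers in context, the sketch below labels each κ reported above using the conventional Landis and Koch benchmarks. It assumes that the “substantial” label in this summary refers to that scale; the cut points come from that convention, not from the meta-analysis, and the function name is hypothetical.

```python
def label_kappa(kappa):
    """Return the conventional Landis and Koch label for a kappa value."""
    if kappa < 0:
        return "poor"
    # Upper bound of each band, checked in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
    ]
    for upper_bound, label in bands:
        if kappa <= upper_bound:
            return label
    return "almost perfect"


# Kappa values as reported in this summary.
reported = {
    "Overall": 0.69,
    "Substance use disorders": 0.72,
    "Mental disorders": 0.65,
    "Bipolar disorder": 0.74,
    "Nonaffective psychoses": 0.55,
    "Opioid use disorder": 0.81,
    "Hallucinogen use disorder": 0.59,
}

for diagnosis, kappa in reported.items():
    print(f"{diagnosis}: kappa = {kappa:.2f} ({label_kappa(kappa)})")
```

These bands are rules of thumb rather than clinical thresholds, so a diagnosis that sits just on either side of a cut point should not be read as meaningfully different.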

None of the methodological quality factors — sample size, retest interval, interviewer blinding — explained the variability. The one exception: for substance use disorders, newer diagnostic criteria (DSM-III-R, DSM-IV, ICD-10) outperformed the older DSM-III.

Practice Implications
  1. Diagnoses built on behavioral criteria (eg, addictions) are more reliable than those that rest on subjective criteria or interpretive judgment (eg, psychosis).
  2. These instruments need to be augmented by collateral history, longitudinal observation, treatment response, mental status, and associated signs. The Bipolarity Index does that for mood disorders.
  3. Despite these problems, structured interviews are more accurate than unstructured ones, and are sadly underused in practice.
  4. I’ve used them routinely for two decades, and created a free version (and am computerizing it). Use them, distribute them, no permission needed.

—Chris Aiken, MD
Director, Psych Partners
Editor in Chief, Carlat Psychiatry Report
