Applicability of Regression Models for the Analysis of Ordinal PANSS Data in Schizophrenia: A Cohort Study
- Authors: Gvozdeckii A.N.1,2, Prokopovich G.A.1,2, Dobrovolskaya A.E.1,2, Kusnirev I.S.1,2, Sofronov A.G.1,2
- Affiliations:
- The North-West State Medical University named after I.I. Mechnikov of the Ministry of Health of Russia
- St. Petersburg Psychiatric Hospital No. 3 named after I.I. Skvortsov-Stepanov
- Section: RESEARCH
- Submitted: 30.07.2025
- Accepted: 15.05.2026
- Published: 15.05.2026
- URL: https://consortium-psy.com/jour/article/view/15731
- DOI: https://doi.org/10.17816/CP15731
- ID: 15731
Abstract
BACKGROUND: There is ongoing debate about the appropriate measurement level for symptom severity scores derived from clinical rating scales. The use of statistically inappropriate analytical methods for such data may distort results and lead to misinterpretations.
AIM: To study the applicability of linear, beta, beta-binomial, and ordinal regression models for assessing changes in schizophrenia symptoms over time using the PANSS.
METHODS: The study cohort comprised patients diagnosed with schizophrenia or schizoaffective disorder. Symptom severity was quantified using the Positive and Negative Syndrome Scale (PANSS). Observation period, sex, brexpiprazole prescription upon admission, brexpiprazole monotherapy, and dropout from the study prior to completion of follow-up were used as covariates. Bayesian mixed-effects regression models were fitted to the obtained dataset: Model 1 (normal distribution), Model 2 (ordered beta distribution), Model 3 (beta-binomial distribution), and Model 4 (ordinal regression). The applicability of the models was assessed using the fit index γ, defined as the proportion of predicted values that correspond to a possible PANSS symptom score. A model was considered consistent at γ=1. Additionally, the 95% Highest Density Interval (HDI) was calculated for γ.
RESULTS: The study enrolled 24 patients with schizophrenia (75% of whom were men) aged 20 to 45 years. The fit index γ was 0.00 (95% HDI 0.00–0.00) for Model 1, 0.17 (95% HDI 0.14–0.21) for Model 2, 1.00 (95% HDI 1.00–1.00) for Model 3, and 1.00 (95% HDI 1.00–1.00) for Model 4. Model 1 allows results that fall outside the range of the PANSS. Models 1 and 2 can produce fractional values.
CONCLUSION: Statistical models designed to analyze continuous variables (linear and beta regression) are inapplicable for ordinal variables and, in particular, changes in schizophrenia symptoms over time on the PANSS. Beta-binomial and ordinal regression models are recommended for rating scores.
INTRODUCTION
The Italian scientist Galileo Galilei is credited with the phrase “…il faut mesurer tout ce qui est mesurable, et tâcher de rendre mesurable tout ce qui ne l’est pas directement” ("... measure what is measurable, and make measurable what is not so") [1]. In current understanding, measurement is the process of obtaining one or more values that can be reasonably attributed to a quantity [2]. Measurements play an important role in society and are now considered a reliable source of knowledge [3].
An example of early measurements in psychiatry is the work of the Norwegian Royal Commission, which counted patients with "mania, melancholia, dementia, idiocy, blind in one or both eyes, deaf, mute, and leprous", taking into account sex and place of residence [4]. More recently, the scientific community has begun discussing the model of measurement-based psychiatric care, which is defined as follows: "the use of valid clinical measurement tools to objectify assessment, treatment, and clinical outcomes, including efficacy, safety, tolerability, functioning, and quality of life in patients with mental disorders" [4]. This concept is largely associated with the widespread use of one type of measurement tool, the rating scale, in research practice. A rating scale is a series of items designed to quantify or rank the manifestations of a single variable (i.e., a trait) [5]. At the same time, measurement results can be expressed not only in numbers but also in letters and words; that is, measurement as an empirical human activity is not limited solely to quantitative assessment [3].
To date, disagreement remains among researchers regarding the interpretation and analysis of rating scores [6, 7]. The discussion begins with the classification of measurement levels, which distinguishes between nominal, ordinal, interval, and ratio scales [8]. The first position (the "intervalist" approach) interprets the score as an interval variable. This view is based on the well-known argument that this issue does not belong to the realm of statistics ("numbers do not know where they come from") [9]. From this perspective, the assumption that the data distribution is continuous naturally follows, along with the applicability of parametric models [10]. This view is supported by several simulation and empirical studies [7, 10, 11], prompting some authors to suggest ending the debate and using parametric methods to analyze such data [12].
The second position (the "ordinalist" approach) interprets rating scores as ordinal variables [6, 7, 10]. Applying linear models to ordinal data is regarded as a methodological error due to problems in interpreting results [13], reduced statistical power [10, 14], detection of non-existent effects, and distortion of real effects [15]. Non-parametric tests and their generalizations are proposed for analyzing ordinal data [16, 17].
The debate between proponents of the "intervalist" and "ordinalist" positions is directly relevant to psychiatry. The problem can be formulated as follows: data obtained using psychiatric scales, being ordinal by nature, are often treated as continuous variables, yet the consequences of this assumption remain unclear [18]. Accordingly, it is unclear whether applying statistical models designed for continuous variables to rating scores is erroneous.
The Positive and Negative Syndrome Scale (PANSS) is considered as an example [19]. A number of researchers classify PANSS results as interval-level measurements [20, 21], and therefore scale scores are analyzed as continuous variables [22]. Within this paradigm, linear [21, 23] and beta regression [24] are used to model PANSS scores. From the "ordinalist" point of view, the use of ordinal [16, 17] or beta-binomial regression [25] is considered acceptable for analyzing PANSS scores.
The use of appropriate statistical methods ensures the reliability and validity of conclusions and contributes to the development of evidence-based clinical practice [26]. Accordingly, the choice of a model for analyzing PANSS scores requires justification. However, to the best of our knowledge, a justification for the applicability of linear and beta regression models to PANSS scores, considering their ordinal nature, is not available in the literature [18]. Although ordinal and beta-binomial regression models are theoretically acceptable for analyzing ordinal data, their justification specifically for the PANSS has also not been established.
The study aim was to assess the applicability of linear, beta, beta-binomial, and ordinal regression models for assessing changes in schizophrenia symptoms over time using the PANSS.
METHODS
Study design
A prospective cohort study was conducted.
Setting
The study was conducted from November 2022 to February 2023 at the Saint Petersburg Psychiatric Hospital No. 3 named after I.I. Skvortsov-Stepanov.
Participants
Inclusion criteria:
• a diagnosis of schizophrenia (F20) or schizoaffective disorders (F25) according to the International Classification of Diseases, 10th Revision (ICD-10);
• a psychotic state as the reason for hospitalization;
• brexpiprazole prescription in routine clinical practice at any time during hospitalization.
We restricted the study to patients receiving brexpiprazole to demonstrate how to build a mixed-effects model in a typical drug efficacy evaluation.
Non-inclusion criteria:
• inability to provide voluntary informed consent for treatment and participation in the study;
• PANSS scores above the following threshold values: delusions >4 points, excitement >3 points, hostility >3 points, anxiety >5 points, tension >5 points, suspiciousness >4 points, hallucinations >4 points, depression >5 points, poor impulse control >3 points (threshold values were chosen taking into account the patients' ability to provide informed voluntary consent to participate in the study).
Exclusion criteria:
• patient refusal to participate in the study at any stage;
• development of any condition during the study that could interfere with accurate assessment of the severity of the disease symptoms;
• change of diagnosis during the study (criterion introduced after study initiation).
Data sources
Schizophrenia symptom severity was assessed using the PANSS [19] in its Russian-language adapted and validated version [23]. Assessments were made at the following time points: 1, 3, 5, 7, 10, 14, 21, 28, and 42 days from study inclusion. Symptoms were evaluated by the attending physician in the clinic during a structured clinical interview, which lasted no more than 30–40 minutes. Upon completion of the interview, the physician completed a form recording the severity of each of the nine PANSS symptoms on a 7-point scale (from 1 to 7). Scoring followed the evaluation criteria for each symptom.
Only symptoms planned for inclusion in the statistical models were entered into the spreadsheet. If a patient left the hospital or brexpiprazole was replaced with another antipsychotic before study completion, the resulting missing values were not imputed.
Statistical methods
Sample size
The required sample size was not calculated at the study planning stage.
Statistical models
For all models, the PANSS score was denoted by Y. The first model (Model 1) is a linear regression with normally distributed residuals [15]. The normal distribution is continuous, symmetric, and can take any negative or positive values [27]. Linear regression and its special cases (t-test, ANOVA) [28] are widely used to analyze continuous variables [16].
The second model (Model 2) is based on a continuous beta distribution. This distribution is defined by the parameters α ("success") and β ("failure"), which allows estimation of values within the interval from 0 to 1 [29]. If the minimum (Ymin) and maximum (Ymax) scores on the scale are known, the measurement result Y can be linearly transformed to a value YT, lying in the interval from 0 to 1, as follows: YT = (Y − Ymin)/(Ymax − Ymin). Since 0 and 1 are not allowed for the beta distribution, either modification of the data [29] or the use of an ordered beta model [30] may be required; the latter approach was used in this study.
The third model (Model 3) is based on the discrete beta-binomial distribution, which is used to analyze count data. To obtain a particular score on a rating scale, a series of consecutive comparisons is made according to the "yes or no" principle. For example, a score of 5 on a 7-point scale means that scores 1, 2, 3, and 4 were rejected (four consecutive successes), but the score of 5 was not rejected when compared to the score of 6 [31]. The probability of success or failure is modeled using a beta distribution. Thus, the beta-binomial model is a discrete analog of beta regression [25].
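The counting logic above can be illustrated with a minimal sketch, assuming a 7-point item (six comparisons) and hypothetical shape parameters alpha and beta that were chosen only for illustration and are not estimates from the study. It evaluates the textbook beta-binomial probability mass function and maps the counts 0 to 6 back to the scores 1 to 7, so all probability mass falls on valid integer scores.

```python
from math import comb, exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k: int, n: int, alpha: float, beta: float) -> float:
    """P(K = k) for a beta-binomial with n trials and shapes alpha, beta."""
    log_ratio = log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)
    return comb(n, k) * exp(log_ratio)


N_TRIALS = 6            # 7 - 1 comparisons on a 7-point item
ALPHA, BETA = 2.0, 3.0  # hypothetical shape parameters (assumption)

# Counts of successes 0..6 correspond to item scores 1..7.
score_probs = {
    k + 1: beta_binomial_pmf(k, N_TRIALS, ALPHA, BETA)
    for k in range(N_TRIALS + 1)
}

for score, prob in score_probs.items():
    print(f"score {score}: {prob:.4f}")
```

Because the support is the set of integer counts, no fractional or out-of-range score can receive positive probability under this distribution.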
The fourth model (Model 4) is based on the assumption that there is a latent variable Y* that cannot be measured directly. The observed discrete scores correspond to segments of this normally distributed latent variable, separated by threshold points [17]. The ordinal model accounts for the fact that the distances between categories may not be equivalent [16, 17]. A special case of this approach is non-parametric testing, such as the Mann-Whitney test [32].
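The threshold idea can be sketched as follows, assuming a cumulative model with a normal latent variable of unit variance. The six threshold values and the latent mean eta are hypothetical and deliberately unequally spaced; they are not taken from the fitted models.

```python
from statistics import NormalDist


def ordinal_probs(eta: float, thresholds: list[float]) -> list[float]:
    """Category probabilities for a latent N(eta, 1) variable cut at thresholds."""
    latent = NormalDist(mu=eta, sigma=1.0)
    # Cumulative probabilities at each cut point, padded with 0 and 1.
    cdf = [0.0] + [latent.cdf(t) for t in thresholds] + [1.0]
    # Each category is the probability mass between two adjacent cut points.
    return [cdf[i + 1] - cdf[i] for i in range(len(cdf) - 1)]


# Hypothetical, unequally spaced thresholds: 6 cut points give 7 categories.
THRESHOLDS = [-2.0, -1.2, -0.1, 0.3, 1.5, 2.4]

probs = ordinal_probs(eta=0.5, thresholds=THRESHOLDS)
for score, prob in enumerate(probs, start=1):
    print(f"score {score}: {prob:.4f}")
```

Six thresholds always yield exactly seven categories, and unequal gaps between thresholds are what allow the distances between adjacent scores to differ.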
Statistical analysis
All calculations were performed using R, version 4.5.1.
The dependent variable was the PANSS score for each schizophrenia symptom. Scores for each of the nine symptoms (from 1 to 7, categorical variable) were included in the regression model independently.
The following independent variables were used:
• observation time (days from the study start; continuous non-negative variable);
• patient sex (male or female; categorical variable);
• patient age (full years at hospitalization; continuous non-negative variable);
• brexpiprazole therapy from the moment of hospitalization (yes or no; binary variable);
• brexpiprazole monotherapy (yes or no (polypharmacy); binary variable);
• dropout before study completion (before 42 days; yes or no; binary variable).
To account for repeated measurements, the de-identified participant ID (categorical variable) was included in the model.
The data analysis was performed using a two-component mixed-effects model with repeated measurements, random intercepts, and slopes. The general syntax of the models was as follows: score~time+drop+start+mono+sex+age+(time|id)+(time|item), where score is the score on a particular PANSS item, time is the observation time, start is brexpiprazole therapy from the start of hospitalization, mono is monotherapy, sex is patient sex, age is patient age, drop is dropout from the study, id is the individual participant ID, and item is the PANSS item.
The inclusion of patient ID as a random effect accounted for repeated measurements and within-subject correlation and variability (the …|id part of the syntax). Symptoms were treated as random effects for a more accurate assessment of values (the …|item part of the syntax). Since patients could respond to treatment at different rates, an individual time factor was included for each random effect (the time|… part of the syntax). A separate analysis of dropout before day 42 was not conducted.
The dependent variable in Model 1 (score~...) was used without transformation. In Model 2 (prop~...), the dependent variable was transformed to values between 0 and 1 (prop = (score − 1)/(7 − 1)). In Model 3 (cnt|trials(trl)~...), the dependent variable was defined as the number of successful trials relative to the total number of trials (cnt = score − 1, trl = 7 − 1). In Model 4 (score|thres(trl)~...), the dependent variable was treated as an ordinal variable with a predefined number of thresholds (trl = 7 − 1).
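The four codings of the dependent variable can be sketched in one helper. The names score, prop, cnt, and trl follow the text above; the function name encode_score and the constants are hypothetical and introduced only for illustration.

```python
SCORE_MIN, SCORE_MAX = 1, 7  # range of a single PANSS item


def encode_score(score: int) -> dict:
    """Return the response codings described for Models 1 to 4."""
    if not SCORE_MIN <= score <= SCORE_MAX:
        raise ValueError(f"score must lie in {SCORE_MIN}..{SCORE_MAX}")
    trl = SCORE_MAX - SCORE_MIN  # 7 - 1 = 6 trials or thresholds
    return {
        "score": score,                     # Model 1 (raw) and Model 4 (ordered category)
        "prop": (score - SCORE_MIN) / trl,  # Model 2: rescaled to the interval 0..1
        "cnt": score - SCORE_MIN,           # Model 3: number of successes
        "trl": trl,                         # Model 3 trials; Model 4 thresholds
    }


for s in range(SCORE_MIN, SCORE_MAX + 1):
    print(encode_score(s))
```

The extreme scores 1 and 7 map to prop values of exactly 0 and 1, which is the boundary case that motivates an ordered beta model rather than a plain beta model.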
All models were fitted in a Bayesian framework using the brms package [33]. The use of Bayesian models was justified by their flexibility, the availability of many pre-installed distributions, and the accessibility of software. The model parameters were set to default values, except for the coefficients characterizing fixed effects (set_prior("normal(0, 5)", class = "b")). A weakly informative prior for all coefficients was used to improve model convergence [17], including on the logit scale [30]. To assess the significance of parameters, we converted the maximum probability of effect to a p-value [34]. The distribution of parameters was evaluated using the 95% Highest Density Interval (HDI), and a significant result was defined as p<0.005 [35]. All models were subjected to stratified K-Fold cross-validation. Patient ID was the grouping variable. The number of folds was 8.
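The conversion of a maximum probability of effect to a p-value can be sketched as below. This assumes the commonly used two-sided relation p = 2 × (1 − pd), where pd is the larger of the posterior shares above and below zero; the source does not spell out the formula, so this is an assumption. The posterior draws are simulated and hypothetical.

```python
import random


def probability_of_direction(draws: list[float]) -> float:
    """Larger of the posterior shares lying strictly above or below zero."""
    n = len(draws)
    share_positive = sum(1 for d in draws if d > 0) / n
    share_negative = sum(1 for d in draws if d < 0) / n
    return max(share_positive, share_negative)


def pd_to_p(pd: float) -> float:
    """Two-sided p-value analogue, assuming p = 2 * (1 - pd)."""
    return 2.0 * (1.0 - pd)


# Hypothetical posterior draws for a time coefficient (not study output).
rng = random.Random(2025)
draws = [rng.gauss(-0.05, 0.02) for _ in range(4000)]

pd = probability_of_direction(draws)
p_value = pd_to_p(pd)
print(f"pd = {pd:.4f}, p = {p_value:.4f}")
print("significant at the 0.005 threshold:", p_value < 0.005)
```

Under this relation, the p<0.005 threshold corresponds to a probability of direction above 0.9975.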
To evaluate the applicability of the models for analyzing rating data, we determined the proportion of predicted values that corresponded to a possible PANSS score (the fit index γ). A model was considered consistent at γ=1. Additionally, the Bayes factor (BF) was calculated for the model parameters. However, the BF was not used for direct comparison, as the dependent variables differed.
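A minimal sketch of the fit index γ, assuming it is computed as the share of predicted values that are exactly one of the integer scores 1 to 7. The function name fit_index_gamma and the example predictions are hypothetical, and how predictions are drawn from each fitted model is not specified here.

```python
def fit_index_gamma(predicted: list[float], allowed=range(1, 8)) -> float:
    """Share of predicted values that equal a possible item score."""
    allowed_scores = set(allowed)
    hits = sum(1 for value in predicted if value in allowed_scores)
    return hits / len(predicted)


# Hypothetical predictions illustrating the three situations in the text.
continuous_unbounded = [0.4, 2.7, 3.0, 7.9]  # fractional and out-of-range values
continuous_bounded = [1.0, 2.5, 6.2, 7.0]    # in range, but partly fractional
discrete_in_range = [1, 4, 7, 7]             # integer scores only

print(fit_index_gamma(continuous_unbounded))
print(fit_index_gamma(continuous_bounded))
print(fit_index_gamma(discrete_in_range))
```

A value such as 3.0 counts as a match because it equals the integer score 3, whereas 2.7 or 7.9 does not, so only a model whose predictions always land on valid scores reaches γ=1.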
Additional models were constructed in a frequentist framework. The models were used to assess observed power. For this purpose, patients were resampled with replacement from the original dataset. After generating 1,000 datasets, the proportion of statistically significant predictors was calculated across simulations.
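The resampling step can be sketched as a patient-level bootstrap. The model fit itself is replaced by a hypothetical callback, is_significant, which is assumed to refit a model on the resampled data and report whether a given predictor is significant. The toy rows and the toy decision rule are placeholders, not a statistical test and not study data.

```python
import random


def resample_patients(rows: list[dict], rng: random.Random) -> list[dict]:
    """Draw patients with replacement, keeping all visits of each drawn patient."""
    by_id: dict = {}
    for row in rows:
        by_id.setdefault(row["id"], []).append(row)
    patient_ids = sorted(by_id)
    drawn = [rng.choice(patient_ids) for _ in patient_ids]
    resampled = []
    # Relabel each draw so that a patient drawn twice counts as two subjects.
    for new_id, source_id in enumerate(drawn, start=1):
        for row in by_id[source_id]:
            resampled.append({**row, "id": new_id, "source_id": source_id})
    return resampled


def observed_power(rows, is_significant, n_sim=1000, seed=0) -> float:
    """Proportion of bootstrap datasets in which the callback reports significance."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_sim) if is_significant(resample_patients(rows, rng)))
    return hits / n_sim


toy_rows = [
    {"id": "p1", "time": 1, "score": 4}, {"id": "p1", "time": 42, "score": 2},
    {"id": "p2", "time": 1, "score": 3}, {"id": "p2", "time": 42, "score": 3},
    {"id": "p3", "time": 1, "score": 5}, {"id": "p3", "time": 42, "score": 1},
]


def toy_rule(rows: list[dict]) -> bool:
    """Placeholder for a model fit: flags a mean drop of at least 1.5 points."""
    first = [r["score"] for r in rows if r["time"] == 1]
    last = [r["score"] for r in rows if r["time"] == 42]
    return sum(first) / len(first) - sum(last) / len(last) >= 1.5


print(observed_power(toy_rows, toy_rule, n_sim=200))
```

Resampling whole patients rather than single visits keeps each patient's repeated measurements together, which preserves the within-subject correlation that the mixed-effects models are meant to capture.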
Continuous variables were described as mean (standard deviation), and discrete variables (including rating scales) as median (Q1; Q3).
Ethical considerations
The study was approved by the local ethics committee of the State Budgetary Healthcare Institution "Hospital for War Veterans" in Saint Petersburg (Protocol No. 28 dated October 13, 2022). All participants provided signed informed consent before participating. Patients continued to receive all necessary medical care regardless of consent (or refusal) to participate in the study.
RESULTS
Participants
During the study, 31 patients were screened, all of whom met the inclusion criteria. Five patients met the non-inclusion criteria, all due to PANSS scores exceeding threshold values. After providing signed informed consent, 26 patients were enrolled in the study. Two patients were excluded during the study, both due to change in diagnosis. Thus, data from 24 patients were analyzed.
This sample was predominantly male (n=18, 75%). The mean patient age was 31.0 years (Me (Q1; Q3)=31.0 (20; 45), SD=7.7). Eleven patients (46%) had secondary specialized or higher education. At the time of hospitalization, seven patients (29%) were employed or in education, 14 (58%) were unemployed, and 3 (13%) had a disability. Three patients (13%) were married, and four (17%) were divorced.
The duration of mental illness ranged from 12 to 60 months, averaging 40.1 months (SD=12.3). Information on previous hospitalizations was unavailable for three patients (13%); 10 patients (42%) reported one prior hospitalization, and 11 (46%) reported ≥2 prior hospitalizations. At the study’s outset, the predominant syndromes were hallucinatory-delusional syndrome in 13 patients (54%) and affective-delusional syndrome in 11 patients (46%).
Brexpiprazole therapy was initiated after admission in six patients (25%); three of these patients received monotherapy. In three patients, brexpiprazole was combined with trazodone, quetiapine, and chlorprothixene (one case each, 4% each). Quetiapine and chlorprothixene were continued from prior therapy to facilitate transition to brexpiprazole. An additional 12 patients (50%) were switched from combination therapy to brexpiprazole monotherapy until study end: the initial combination included olanzapine in six cases, haloperidol in five cases, and aripiprazole in one case; bromdihydrochlorbenzodiazepine was used in four patients (17%). In six patients (25%), combination therapy (other antipsychotic therapy + brexpiprazole) continued until the end of the study due to long cross-titration. Patients were switched from therapy with zuclopenthixol, haloperidol, olanzapine, aripiprazole (one case each), and lurasidone (two cases). Two patients (8%) discontinued brexpiprazole therapy before study completion.
During the observation period, akathisia was the only side effect registered in three (13%) patients.
Brexpiprazole dosing followed prescribing information. The initial dose of the drug was 1 mg per day for the first four days, and 2 mg per day on days 5–7. Ten patients (42%) remained on a dose of 2 mg until the end of the study, and seven patients (29%) each were on doses of 3 mg and 4 mg, respectively.
Changes over time in the severity of symptoms
During the study, the severity of all schizophrenia symptoms decreased (Table 1). A median reduction of less than 2 points was noted for the symptoms of "suspiciousness", "hostility", and "aggressiveness". The overall changes in PANSS scores relative to the fixed effects of the model are presented in Table S1 (Supplementary).
Comparison of the statistical models
When comparing the models, a notable difference emerges in the structure of statistically significant effects versus statistically important effects based on the BF (Table 2). The time factor is statistically significant across all models, the study dropout factor only in Model 2, and the age factor in all models except Model 1. Based on the BF, a non-zero effect is probable for the time factor in all models, for the dropout factor in Model 2, and for the initiation of brexpiprazole therapy in Model 4. In all models, the random intercept and slope effects are statistically significant. The association between the time factor and PANSS items is statistically significant in Models 1 and 2. The association between the time factor and the individual patient ID lacks statistical significance, but is considered important in Model 2 based on the BF. In Model 3, the probability of zero variance for the random slope of time on the PANSS is high and statistically significant. A nonzero effect for the association between time and PANSS items is also probable in Model 4. Thus, based on the BF, zero is more probable for some of the model coefficients. However, these coefficients were not excluded from the final comparison, as they reflect clinically relevant information, and further model optimization was not a study objective.
Model fit to the ordinal nature of PANSS data was evaluated using the γ fit index. Model 1 and Model 2 predict values that do not correspond (γ<1) to the range of possible PANSS scores, whereas Model 3 and Model 4 predict values that perfectly correspond (γ=1) to that range. The results of modeling in the frequentist framework are presented in Table S2 (Supplementary). For all models, the correspondence of predicted values to PANSS scores is consistent with the Bayesian models.
As an example, Figure 1 illustrates changes over time in "delusions" symptom severity across different models, with other effects averaged. Visual analysis shows that the median of the observed data can take fractional values, due to its calculation when there is an even number of observations. Nevertheless, for observed data, the median and 95% HDI generally correspond to the levels of the 7-point scale. Models 1 and 2 predict values not present on the scale (fractional scores). For instance, Model 1 does not exclude scores below 1 and allows negative values. The sole advantage of Model 2 over Model 1 is the correct handling of the scale's lower limit. Models 3 and 4 correctly estimate the mean score, which does not fall outside the levels of the 7-point scale. Similar changes were observed when plotting the trajectories of other symptoms (data not shown).
Visual analysis of histograms for all PANSS items confirms differences between the models in predicting scores. Model 1 can yield any value, Model 2 yields any value in the range of 1 to 7, and Models 3 and 4 yield any integer values in the range of 1 to 7 (Figure 2).


DISCUSSION
We considered four models with differing assumptions about the data type of the PANSS scores. A critical issue with Model 1 is the generation of out-of-range results and the presence of fractional values in the predicted scores. Model 2 also produces fractional values, but these do not exceed the PANSS score range. All predictions from Models 3 and 4 correspond to possible PANSS scores. Thus, for statistical analysis of PANSS schizophrenia symptom severity scores, beta-binomial and ordinal regressions represent valid statistical models.
The obtained results cannot be accurately interpreted without addressing the fundamental problem associated with the ordinal level of measurement. Let us return to the PANSS. All scale items are graded from 1 point (patient condition does not meet the description) to 7 points (maximum symptom severity). Obviously, "mild" severity (3 points) is greater than the "absence" of a symptom (1 point) but less than "severe" (5 points).
The verbal descriptions of symptom severity can be encoded as follows: 1<2<3<4<5<6<7. Since results are expressed in natural numbers, there is an implicit assumption of a unit of measurement and, consequently, the validity of arithmetic operations. However, units of measurement are not defined for ordinal variables, and algebraic operations do not exist for them [2], which is counterintuitive when working with notations resembling natural numbers. For example, the difference between 7 and 5 is equal to the difference between 3 and 1; the difference is 2. This can be expressed verbally as follows: "The difference between extreme symptom severity and severe severity is equal to the difference between mild symptom severity and its absence; the difference is equal to ...". The value of this difference in units of symptom severity is unknown. By contrast, for counting apples: "The difference between seven and five apples is equal to the difference between three and one apple; the difference is equal to two apples". The difference between these examples is obvious: a unit of measurement (pieces) is naturally defined for apples, which is absent in ordinal scales (points, letters, labels, ranks) by definition. This property of an ordinal variable explains why it cannot be considered quantitative [36].
The theoretical invalidity of analyzing ordinal data as metric data leads in practice to an increase in false-positive results and a decrease in the statistical power of tests, up to the inversion of the result [15], as confirmed empirically. Despite this, proposals to analyze ordinal data as metric variables remain prevalent [11, 12]. It has been suggested that prohibiting parametric methods for ordinal data may be excessive [37]. In our opinion, the potential to obtain unrealistic results (values exceeding the scale gradations) is a more significant issue. Part of this problem is the lack of published indications regarding the principle of assigning specific numbers to ordinal categories when using linear models [38]. The results of this study can be added to the existing list of problems associated with analyzing ordinal data as metric variables [15].
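The dependence of mean-based comparisons on the numbers assigned to ordinal categories can be shown with a small sketch. The two groups of scores and the alternative coding are hypothetical and were constructed for illustration; both codings preserve the order of the seven categories.

```python
from statistics import mean, median_low

# Hypothetical item scores for two groups (not study data).
group_a = [3, 3, 3, 3]
group_b = [1, 1, 1, 7]

# Two order-preserving ways to assign numbers to the seven categories.
equal_steps = {k: k for k in range(1, 8)}
stretched_top = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 20}


def recode(scores: list[int], coding: dict) -> list[int]:
    """Replace each category label with the number assigned by the coding."""
    return [coding[s] for s in scores]


results = {}
for name, coding in [("equal_steps", equal_steps), ("stretched_top", stretched_top)]:
    a, b = recode(group_a, coding), recode(group_b, coding)
    results[name] = {
        "mean_a_higher": mean(a) > mean(b),
        "median_a_higher": median_low(a) > median_low(b),
    }
    print(name, results[name])
```

With equal steps, group A has the higher mean; with the stretched coding, group B does, although no observation changed category. The order of the medians is the same under both codings, which is why rank-based summaries are not affected by the choice of numbers.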
The assumption of continuity of the beta distribution (Model 2) [29, 30] does not resolve the problem of equal distances between scale categories [39, 40]. Further development of models for assessing discrete values [40] may enable a frequency interpretation of score-based assessments, but this approach is currently inaccessible to investigators.
Accurate results in analyzing PANSS scores can be obtained with beta-binomial and ordinal regression models. In the constructed models, scores were evaluated at the level of individual scale items rather than as total scores. This is due to the absence of a mechanism to transform the sum of item scores (ordinal values) into an ordinal or interval scale [41]. Consequently, when conducting repeated assessments of patients, multilevel models should be employed [17]. This analytical strategy is supported by clinical practice. For example, the total score on the PANSS negative symptom subscale is not a reliable measure of symptom severity [22].
The findings on the interpretation of rating scale assessments can be extrapolated to other scales. It is evident that using models based on unrealistic assumptions that treat ordinal data as continuous leads to implausible estimates [13]. This problem is universal, as rating scales are widely used in psychiatry and possess an ordinal structure, yet are interpreted as metric variables [18].
Patients were selected using convenience sampling. The use of threshold values in the inclusion criteria precluded obtaining a dataset with all possible scale scores. Bayesian beta-binomial and ordinal models predicted all possible PANSS scores, which reflects model specifications rather than observed dataset values.
Due to the absence of a prior power calculation and failure to divide the sample into training and validation subsets, an unambiguous assessment of model predictive strength cannot be provided. This is because models may differ in their control of Type I and Type II errors, as demonstrated by the analysis of observed power for each variable across models within the frequentist framework (see Table S3 in the Supplementary).
Patients likely withdrew due to treatment outcomes. However, this dropout mechanism was not accounted for in the modeling, which could have introduced bias. Due to the fundamental differences between the models, there is no basis to assume that any bias was uniform or negligible. The sample size could have affected parameter estimates and result stability during model selection. Additionally, no model optimization was performed to determine the simplest structure that describes the dataset.
Nevertheless, these limitations did not affect the assessment of the consistency of the models, as it follows from their underlying assumptions. Comparison of predictive efficiency was beyond the scope of this study.
CONCLUSION
The assessment of symptom severity is fundamental to clinical decision-making. The ordinal nature of rating scales determines subsequent data analysis methods. Using continuous distributions, such as normal and beta distributions, lacks theoretical and practical justification for analyzing scale scores. Linear and beta regression models are inappropriate for the statistical analysis of PANSS symptom severity scores. Ordinal and beta-binomial regression models are appropriate because they account for the nature of the original PANSS score data. Further research is needed to justify the application of other statistical models for the analysis of ordinal data.
About the authors
Anton N. Gvozdeckii
The North-West State Medical University named after I.I. Mechnikov of the Ministry of Health of Russia; St. Petersburg Psychiatric Hospital No. 3 named after I.I. Skvortsov-Stepanov
Author for correspondence.
Email: comisora@yandex.ru
ORCID iD: 0000-0001-8045-1220
SPIN-code: 4430-6841
Scopus Author ID: 55933857400
Candidate of Medical Sciences, Assistant of the Department of Psychiatry and Narcology, Deputy Chief Physician for Organizational and Methodological Work
Russian Federation, 191015, Russia, Saint-Petersburg, Kirochnaya ul. 41; 197341, Russia, Saint-Petersburg, Fermskoe shosse 36
Galina A. Prokopovich
The North-West State Medical University named after I.I. Mechnikov of the Ministry of Health of Russia; St. Petersburg Psychiatric Hospital No. 3 named after I.I. Skvortsov-Stepanov
Email: galinapro1@rambler.ru
ORCID iD: 0000-0001-7909-6727
SPIN-code: 5985-3715
Scopus Author ID: 57203003009
Candidate of Medical Sciences, associate professor, Chair of Psychiatry and Addiction Medicine, doctor-methodologist
Russian Federation, 191015, Russia, Saint-Petersburg, Kirochnaya ul. 41; 197341, Russia, Saint-Petersburg, Fermskoe shosse 36
Alla E. Dobrovolskaya
The North-West State Medical University named after I.I. Mechnikov of the Ministry of Health of Russia; St. Petersburg Psychiatric Hospital No. 3 named after I.I. Skvortsov-Stepanov
Email: maxmmm@yandex.ru
ORCID iD: 0000-0002-3582-6078
SPIN-code: 4423-4454
Scopus Author ID: 57202999908
MD, Cand. Sci. (Med.), Associate Professor, Psychiatry and Narcology Department, Deputy Chief Medical Officer
Russian Federation, 191015, Russia, Saint-Petersburg, Kirochnaya ul. 41; 197341, Russia, Saint-Petersburg, Fermskoe shosse 36
Ivan S. Kusnirev
The North-West State Medical University named after I.I. Mechnikov of the Ministry of Health of Russia; St. Petersburg Psychiatric Hospital No. 3 named after I.I. Skvortsov-Stepanov
Email: splitter887@gmail.com
ORCID iD: 0009-0006-9477-3566
SPIN-code: 3608-7610
Postgraduate student, psychiatrist
Russian Federation, 191015, Russia, Saint-Petersburg, Kirochnaya ul. 41; 197341, Russia, Saint-Petersburg, Fermskoe shosse 36
Aleksander G. Sofronov
The North-West State Medical University named after I.I. Mechnikov of the Ministry of Health of Russia; St. Petersburg Psychiatric Hospital No. 3 named after I.I. Skvortsov-Stepanov
Email: alex-sofronov@yandex.ru
ORCID iD: 0000-0001-6339-0198
SPIN-code: 4846-6528
Scopus Author ID: 57202998979
MD, Dr. Sci. (Med.), Professor, Head of the Psychiatry and Narcology Department, Chief Medical Officer
Russian Federation, 191015, Russia, Saint-Petersburg, Kirochnaya ul. 41; 197341, Russia, Saint-Petersburg, Fermskoe shosse 36
References
- Martin TH. Galilée: les droits de la science et la méthode des sciences physiques. Didier et cie; 1868.
- International Vocabulary of Metrology — Basic and General Concepts and Associated Terms (VIM), 3rd ed. JCGM 200; 2012. doi: 10.59161/JCGM200-2012
- Mari L, Maul A, Torres Irribarra D, Wilson M. Quantities, quantification, and the necessary and sufficient conditions for measurement. Measurement. 2017;100:115–121. doi: 10.1016/j.measurement.2016.12.050
- Aboraya A, Nasrallah HA, Elswick DE, et al. Measurement-based care in psychiatry — past, present, and future. Innov Clin Neurosci. 2018;15(11–12):13–26.
- Hamilton M. The role of rating scales in psychiatry. Psychol Med. 1976;6(3):347–349. doi: 10.1017/S0033291700015774
- Buri M, Curt A, Steeves J, Hothorn T. Baseline-adjusted proportional odds models for the quantification of treatment effects in trials with ordinal sum score outcomes. BMC Med Res Methodol. 2020;20(1):104. doi: 10.1186/s12874-020-00984-2
- Koo M, Yang SW. Likert-type scale. Encyclopedia. 2025;5(1):18. doi: 10.3390/encyclopedia5010018
- Stevens SS. On the theory of scales of measurement. Science. 1946;103(2684):677–680. doi: 10.1126/science.103.2684.677
- Norman G. Likert scales, levels of measurement and the “laws” of statistics. Adv Health Sci Educ. 2010;15(5):625–632. doi: 10.1007/s10459-010-9222-y
- Eiselen R, Huyssteen GB van. A comparison of statistical tests for Likert-type data: The case of swearwords. J Open Humanities Data. 2023;9(1):18. doi: 10.5334/johd.132
- Huh I, Gim J. Exploration of Likert scale in terms of continuous variable with parametric statistical methods. BMC Med Res Methodol. 2025;25(1):218. doi: 10.1186/s12874-025-02668-1
- DeWees TA, Mazza GL, Golafshar MA, Dueck AC. Investigation into the effects of using normal distribution theory methodology for Likert scale patient-reported outcome data from varying underlying distributions including floor/ceiling effects. Value Health. 2020;23(5):625–631. doi: 10.1016/j.jval.2020.01.007
- Veríssimo J. Analysis of rating scales: A pervasive problem in bilingualism research and a solution with Bayesian ordinal models. Bilingualism. 2021;24(5):842–848. doi: 10.1017/S1366728921000316
- Howcroft DM, Rieser V. What happens if you treat ordinal ratings as interval data? Human evaluations in NLP are even more under-powered than you think. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics; 2021. p. 8932–8939. doi: 10.18653/v1/2021.emnlp-main.703
- Liddell TM, Kruschke JK. Analyzing ordinal data with metric models: What could possibly go wrong? J Exp Soc Psychol. 2018;79:328–348. doi: 10.1016/j.jesp.2018.08.009
- Theobald EJ, Aikens M, Eddy S, Jordt H. Beyond linear regression: A reference for analyzing common data types in discipline based education research. Phys Rev Phys Educ Res. 2019;15(2):020110. doi: 10.1103/PhysRevPhysEducRes.15.020110
- Bürkner PC, Vuorre M. Ordinal regression models in psychology: A tutorial. Adv Methods Pract Psychol Sci. 2019;2(1):77–101. doi: 10.1177/2515245918823199
- Geck S, Roithmeier M, Bühner M, et al. COSMIN systematic review and meta-analysis of the measurement properties of the positive and negative syndrome scale (PANSS). eClinicalMedicine. 2025;82:103155. doi: 10.1016/j.eclinm.2025.103155
- Kay SR, Fiszbein A, Opler LA. The positive and negative syndrome scale (PANSS) for schizophrenia. Schizophr Bull. 1987;13(2):261–276. doi: 10.1093/schbul/13.2.261
- Obermeier M, Mayr A, Schennach-Wolff R, et al. Should the PANSS be rescaled? Schizophr Bull. 2010;36(3):455–460. doi: 10.1093/schbul/sbp124
- Wang X, Su Y, Yan H, et al. Association Study of KCNH7 Polymorphisms and Individual Responses to Risperidone Treatment in Schizophrenia. Front Psychiatry. 2019;10:633. doi: 10.3389/fpsyt.2019.00633
- Baandrup L, Allerup P, Nielsen MØ, et al. Rasch analysis of the PANSS negative subscale and exploration of negative symptom trajectories in first-episode schizophrenia — data from the OPTiMiSE trial. Psychiatry Res. 2020;289:112970. doi: 10.1016/j.psychres.2020.112970
- Ivanova E, Khan A, Liharska L, et al. Validation of the Russian version of the positive and negative syndrome scale (PANSS-ru) and normative data. Innov Clin Neurosci. 2018;15(9–10):32–48.
- Lee LHN, Procyshyn RM, White RF, et al. Developing prediction models for symptom severity around the time of discharge from a tertiary-care program for treatment-resistant psychosis. Front Psychiatry. 2023;14:1181740. doi: 10.3389/fpsyt.2023.1181740
- Tillé Y. Yet another attempt to classify positive univariate probability distributions. Aust J Stat. 2024;53(3):87–101. doi: 10.17713/ajs.v53i3.1776
- Enoyoze E, Enoyoze GE. Statistical applications in the biomedical sciences: A review. Int J Sci Res Arch. 2024;12(2):1594–1601. doi: 10.30574/ijsra.2024.12.2.1433
- Im S. Performance of the beta-binomial model for clustered binary responses: Comparison with generalized estimating equations. J Mod Appl Stat Methods. 2021;19(1):2–25. doi: 10.22237/jmasm/1619482380
- Shatz I. Assumption-checking rather than (just) testing: The importance of visualization and effect size in statistical diagnostics. Behav Res Methods. 2024;56(2):826–845. doi: 10.3758/s13428-023-02072-x
- Smithson M, Verkuilen J. A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables. Psychol Methods. 2006;11(1):54–71. doi: 10.1037/1082-989X.11.1.54
- Kubinec R. Ordered beta regression: A parsimonious, well-fitting model for continuous data with lower and upper bounds. Polit Anal. 2023;31(4):519–536. doi: 10.1017/pan.2022.20
- Iannario M. Modelling uncertainty and overdispersion in ordinal data. Commun Stat Theory Methods. 2014;43(4):771–786. doi: 10.1080/03610926.2013.813044
- Long Y, Ruiter SC de, Luijten LWG, et al. Statistical practice of ordinal outcome analysis in neurologic trials. Neurology. 2025;104(4):e210229. doi: 10.1212/WNL.0000000000210229
- Bürkner PC. brms: An R package for Bayesian multilevel models using Stan. J Stat Soft. 2017;80(1):1–28. doi: 10.18637/jss.v080.i01
- Makowski D, Ben-Shachar MS, Chen SHA, Lüdecke D. Indices of effect existence and significance in the Bayesian framework. Front Psychol. 2019;10:2767. doi: 10.3389/fpsyg.2019.02767
- Benjamin DJ, Berger JO, Johannesson M, et al. Redefine statistical significance. Nat Hum Behav. 2018;2(1):6–10. doi: 10.1038/s41562-017-0189-z
- Penna MP, Agus M, Hitchcott PK, Pessa E. Psychometric methods: The need for new conceptual advances. Measurement. 2018;117:96–107. doi: 10.1016/j.measurement.2017.11.054
- Harpe SE. How to analyze Likert and other rating scale data. Curr Pharm Teach Learn. 2015;7(6):836–850. doi: 10.1016/j.cptl.2015.08.001
- Selman CJ, Lee KJ, Ferguson KN, et al. Statistical analyses of ordinal outcomes in randomised controlled trials: A scoping review. Trials. 2024;25(1):241. doi: 10.1186/s13063-024-08072-2
- Zou KH, Carlsson MO, Quinn SA. Beta-mapping and beta-regression for changes of ordinal-rating measurements on Likert scales: A comparison of the change scores among multiple treatment groups. Stat Med. 2010;29(24):2486–2500. doi: 10.1002/sim.4012
- Taverne C, Lambert P. Inflated discrete beta regression models for Likert and discrete rating scale outcomes. ISBA Discussion Paper. 2014;19:1–25.
- Kampen JK. Reflections on and test of the metrological properties of summated rating, Likert, and other scales based on sums of ordinal variables. Measurement. 2019;137:428–434. doi: 10.1016/j.measurement.2019.01.08