A standardized scoring method for the copy of cube test, developed to be suitable for use in psychiatric populations

Background Although the 'copy of cube test', a version of which is included in the Short Test of Mental Status (STMS), has existed for years, little has been done to standardize it in detail. The aim of the current study was to develop a novel and detailed standardized method of administering and scoring this test. Methods The study sample included 93 healthy control subjects (53 women and 40 men) aged 35.87 ± 12.62 years and 127 patients suffering from schizophrenia (54 women and 73 men) aged 34.07 ± 9.83 years. The psychometric assessment included the Positive and Negative Symptoms Scale (PANSS), the Young Mania Rating Scale (YMRS), and the Montgomery-Åsberg Depression Rating Scale (MADRS). Results A scoring method was developed based on the frequencies of responses of healthy controls. Cronbach's α was equal to 0.75 and inter-rater reliability was 0.90. Three indices and five subscales of the Standardized Copy of the Cube Test (SCCT) were eventually developed. They included the Deficit Index (DcI), which includes the Missing Elements (ME) and Mirror Image (M) subscales, the Deformation Index (DfI), which includes the Deformation (D) and Rotation (R) subscales, and the Closing-In Index (CiI). Discussion The SCCT seems to be a reliable, valid and sensitive-to-change instrument for the testing of psychiatric patients. The great advantage of this instrument is that it only requires paper and a pencil, and is thus easily administered and brief. Further research is necessary to test its usefulness as a neuropsychological test.


Background
The copy of cube task is a well-known, simple paper-and-pencil test which is part of the Short Test of Mental Status (STMS) [1,2]. Additionally, patterns of blocks of cubes are incorporated in the Bender Gestalt Test [3][4][5][6][7][8][9][10][11]. This simple test requires the subject to copy a Necker cube. This shape is an optical illusion first published in 1832 by the Swiss crystallographer Louis Albert Necker, and it is an ambiguous line drawing. In essence, it is a wireframe drawing of a cube in isometric perspective. This means that parallel edges of the cube are drawn as parallel lines in the picture. The ambiguity lies in the fact that when two lines cross, the picture does not show which is in front and which is behind. This leads to what is called multistable perception, since the observer may experience the cube 'flipping' between its two perceptual solutions.
This phenomenon is very interesting as it shows that from an ambiguous picture, the human visual system picks an interpretation of each part that makes the whole consistent. Humans do not usually see an inconsistent interpretation of the cube (for example, an impossible object). Most people see the lower-left face as being in front, possibly because people view objects from above, with the top side visible, far more often than from below with the bottom visible, so the brain selects as most probable the interpretation that the cube is viewed from above. Thus, the use of the Necker cube in neuropsychology has shed light on the human visual system. The phenomenon has served as evidence of the human brain being a neural network with two distinct and equally possible interchangeable stable states [12].
The scoring method indicated in the STMS rates performance from 0 to 2. Psychiatric patients, however, including most patients with schizophrenia, are likely to receive a score of 1 or 2, which is largely similar to controls. Samples showing how patients with schizophrenia perform in this task are shown in Figure 1. It is obvious that by using this scoring method to assess the drawings of psychiatric patients, valuable information might be lost.
The reversal of the perception of the Necker cube has been extensively studied, but this is not the case concerning its copying. To date no standardized method has been developed. The aims of the current study were to develop a novel and detailed standardized method of administration and scoring of the copy of the Necker cube test and to preliminarily test this method in schizophrenic patients. This new scoring method aims to be reliable, valid and sensitive to change in response to treatment.
All subjects were physically healthy with normal clinical and laboratory findings. All control subjects and patients gave informed consent and the protocol received approval from the University's Ethics Committee. The patients were either inpatients or outpatients of a private psychiatric clinic.

Clinical diagnosis
The diagnosis was set according to DSM-IV-TR criteria on the basis of a semistructured interview based on the Schedules for Clinical Assessment in Neuropsychiatry (SCAN) version 2.0 [13].

The SCCT procedure
The SCCT procedure required the subject to copy a Necker cube. The template shape is shown in Figure 1 and in Additional file 1. The SCCT instructions ask the subject to draw an identical shape on the same piece of paper. The template shape was printed on the left half of the sheet, leaving space for the subject to reproduce it on the right. No time limit was set and no time recording was made.
The assessment included the Random Letter Test for the assessment of attention and vigilance [14] to ensure that subjects could concentrate sufficiently. This includes the following four series of letters: LTPEAOAISTDALAA, ANIABFSAMPZEOAD, PAKLATSXTOEABAA and ZYFMTSAHEOAAPAT. The first and third series include five 'A's, while the second and the fourth include four 'A's. The test requires the patient to hit the desk when the examiner pronounces 'A'. Both errors of omission and errors of commission are recorded. It is expected (and verified in the present study) that the mean number of errors made by healthy controls in this test is around 0.2.
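The error counting in the Random Letter Test can be illustrated with a minimal sketch. It assumes that the subject's taps are recorded as the positions of the letters at which the desk was hit; this recording format, the function name `score_series` and the example taps are hypothetical and are not part of the original procedure.

```python
# Letter series as quoted in the text (letters only).
SERIES = [
    "LTPEAOAISTDALAA",
    "ANIABFSAMPZEOAD",
    "PAKLATSXTOEABAA",
    "ZYFMTSAHEOAAPAT",
]

def score_series(series, taps, target="A"):
    """Return (omissions, commissions) for one letter series.

    taps: set of 0-based letter positions at which the subject hit the desk
    (an assumed recording format, chosen only for this illustration).
    """
    targets = {i for i, letter in enumerate(series) if letter == target}
    omissions = len(targets - taps)    # target letters with no tap
    commissions = len(taps - targets)  # taps on non-target letters
    return omissions, commissions

# Number of target letters per series.
target_counts = [s.count("A") for s in SERIES]

# Hypothetical subject: misses the last 'A' of the first series
# and taps once on a non-target letter (position 2, 'P').
example_errors = score_series(SERIES[0], {4, 6, 11, 13, 2})
```

The total number of errors for a subject would then be the sum of omissions and commissions over the four series.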

The psychometric assessment
The psychometric assessment included the Positive and Negative Symptoms Scale (PANSS) [15], the Young Mania Rating Scale (YMRS) [16], and the Montgomery-Åsberg Depression Rating Scale (MADRS) [17] in order to assess the clinical picture of patients. The PANSS assesses psychotic symptoms, the YMRS manic symptoms and the MADRS depressive symptoms.

Raters
All authors served as raters with regard to the psychometric scales and neuropsychological testing. They were not blind to clinical diagnosis. Only brief training was given, as all of them were already experienced in the field. There was no specific training concerning the SCCT because the essence of the development procedure was that the scoring directions included in the test should be sufficient alone.

Statistical analysis
The statistical analysis included the development of frequency tables for scores of healthy controls so as to arrive at percentile scores and develop a scoring method for the scale. The Pearson's R correlation coefficient, factor analysis (varimax normalized rotation) and item analysis [18] (calculation of Cronbach's α) were used to explore the internal structure of the scale. Analysis of variance [19] was used to test the difference between groups, and was performed separately for subjects below and above the age of 40. Discriminant function analysis was also used to explore differences between groups and the power of the scale in discriminating between them. The Pearson's R correlation coefficient was calculated to assess the inter-rater reliability. However, the calculation of correlation coefficients is not a sufficient method to test the reliability and reproducibility of a method and its results, because it is an index of correlation and not an index of agreement [19][20][21]. The calculation of means and standard deviations for each SCCT item and total score during the rating by each examiner may provide an impression of the stability of results.
Additionally, the means and the standard deviations of the differences between rating and re-rating were calculated for each SCCT item, and plots of rating vs re-rating and of difference vs average value were created for each variable. In fact, it is not possible to use statistics to define acceptable agreement [19]. However, these plots may assist the decision. This method has been used in previous studies concerning the validation of scientific methods [22,23].
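The distinction between correlation and agreement can be made concrete with a short sketch. The scores below are hypothetical and the function names are illustrative only; the sketch computes the Pearson coefficient together with the mean and standard deviation of the rating vs re-rating differences, and the (average, difference) pairs that a difference vs average plot would display.

```python
from math import sqrt
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def difference_summary(rating, rerating):
    """Mean and SD of the differences, plus (average, difference) pairs."""
    diffs = [a - b for a, b in zip(rating, rerating)]
    avgs = [(a + b) / 2 for a, b in zip(rating, rerating)]
    return mean(diffs), stdev(diffs), list(zip(avgs, diffs))

# Hypothetical total scores: the re-rating is always 2 points higher.
rating = [10, 12, 15, 18, 20]
rerating = [12, 14, 17, 20, 22]

r = pearson_r(rating, rerating)
mean_diff, sd_diff, points = difference_summary(rating, rerating)
```

In this constructed example the correlation is perfect although the two ratings never coincide; the systematic offset appears only in the mean of the differences, which is why the differences are examined in addition to the correlation coefficient.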

Results
The frequency tables for scores of healthy controls are shown in Table 1. In the same table the proposed scoring for each item is also shown. This scoring method is based on the frequencies of responses of healthy controls (percentile scores).
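As an illustration of how percentile scores can be read off a frequency table, the following sketch converts the responses of a control group on a single item into cumulative percentages. The data are hypothetical, and the actual item scores of the SCCT are those given in Table 1; the sketch shows only the general principle and is not the exact rule used to build that table.

```python
from collections import Counter

def percentile_ranks(control_responses):
    """Map each observed response to the percentage of controls
    whose response is at or below it (cumulative percentage)."""
    counts = Counter(control_responses)
    n = len(control_responses)
    ranks = {}
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        ranks[value] = 100.0 * cumulative / n
    return ranks

# Hypothetical responses of ten controls on one item (number of errors).
controls = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
ranks = percentile_ranks(controls)
```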
Subjects were divided into those under and over the age of 40. For those below the age of 40, controls were 28.57 ± 7.18 years old vs patients 30.18 ± 6.30 years old (P = 0.09); for those above the age of 40, controls were 50.70 ± 6.90 years old vs patients 55.60 ± 9.90 years old (P = 0.001). The one-way analysis of variance (ANOVA) revealed significant results for subjects under the age of 40 (P < 0.001) but not for those above this age (P = 0.055). Note that SCCT-14 had no variance, so it was not included in the analysis. The results are shown in Table 2 along with post hoc tests. Splitting the sample by age made the groups considerably smaller; thus, this study does not have adequate power to detect a difference between healthy controls and people with schizophrenia in those over 40, and this testing should be considered exploratory. The results indicate that the difference between healthy controls and patients with schizophrenia gets smaller with age because the performance of controls gets worse, even though patients were significantly older in the above-40 group. The Pearson's R correlation coefficients among the SCCT items in the total study sample are shown in Table 3.
The Pearson's R correlation coefficients among the SCCT items and the PANSS (Positive, Negative and General Psychopathology scales), the YMRS and the MADRS are shown in Table 4.
The results of the factor analysis (varimax normalized rotation) are shown in Table 5. The analysis (using the Kaiser criterion of eigenvalues larger than 1) produced four factors explaining 71% of the total variance. The scores in the subscales created on the basis of these factors and the differences between groups in these subscales are shown in Table 6. The last SCCT item (closing in) was included as a fifth subscale, since it did not contribute to the factor analysis. The one-way ANOVA revealed significant differences between the two diagnostic groups, and post hoc tests showed that this difference concerned some of the subscales but not all (P < 0.001; Table 6). The correlation coefficients among these subscales are shown in Table 7, and they are non-significant. A second factor analysis of these subscales produced two superfactors explaining 29% and 28% of the total variance, respectively (Table 8).
Item analysis (calculation of Cronbach's α): Cronbach's α was equal to 0.75, and omitting any single item did not dramatically increase the α coefficient.
The discriminant function analysis results are shown in Tables 9 and 10. This analysis produced the following function: when 2 × (SCCT-4) + 3 × (SCCT-5) + 2 × (SCCT-13) ≥ 363.6, the subject is likely to be a healthy control rather than a patient with schizophrenia. This function correctly classified 62.36% of controls and 89.76% of patients with schizophrenia, which is a satisfactory performance.
The Pearson's R correlation coefficient for inter-rater reliability was 0.90 for the total SCCT scale and ranged from 0.51 to 0.90 for individual items (Table 11). The means and standard deviations for each SCCT item and total score for the rating and re-rating, as well as the respective plots and the plots of difference vs average value for each variable, suggested that the SCCT is reliable.
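For reference, Cronbach's α for k items is k/(k − 1) × (1 − Σ item variances / variance of the total score). A minimal sketch of this calculation, together with the 'α if item omitted' check mentioned above, is given below; the item scores are hypothetical and the function names are illustrative only.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, with subjects in the same order."""
    k = len(items)
    item_variance_sum = sum(variance(scores) for scores in items)
    totals = [sum(subject_scores) for subject_scores in zip(*items)]
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

def alpha_if_item_deleted(items):
    """Cronbach's alpha recomputed with each item left out in turn."""
    return [
        cronbach_alpha(items[:i] + items[i + 1:])
        for i in range(len(items))
    ]

# Hypothetical scores of four subjects on three items.
items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [4, 1, 3, 2],
]
alpha = cronbach_alpha(items)
alphas_without_item = alpha_if_item_deleted(items)

# Two perfectly consistent items give alpha = 1.
alpha_consistent = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]])
```

An item whose omission markedly raises α would be a candidate for removal; according to the results above, no SCCT item behaved in this way.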
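Applied to a single subject, the function above amounts to a weighted sum of three item scores compared with a threshold. The following sketch illustrates this; the weights and the threshold are taken from the text, the comparison is read here as 'at or above the threshold', and the item scores in the example are hypothetical.

```python
# Weights and threshold as reported in the text.
WEIGHTS = {4: 2, 5: 3, 13: 2}  # SCCT item number -> weight
THRESHOLD = 363.6

def discriminant_score(item_scores):
    """Weighted sum of SCCT-4, SCCT-5 and SCCT-13.

    item_scores: dict mapping SCCT item number to that item's score.
    """
    return sum(weight * item_scores[item] for item, weight in WEIGHTS.items())

def classify(item_scores):
    """Assumption: scores at or above the threshold suggest a control."""
    if discriminant_score(item_scores) >= THRESHOLD:
        return "control"
    return "patient"

# Hypothetical item scores for two subjects.
subject_a = {4: 60, 5: 50, 13: 70}
subject_b = {4: 30, 5: 40, 13: 50}
```

With these illustrative values, `subject_a` obtains a weighted sum of 410 and `subject_b` a weighted sum of 280, so the two fall on opposite sides of the threshold.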

Discussion
The SCCT is a test of visual-motor ability and, although several decades have passed since the copy of a cube test was introduced, little has been done to standardize it. This may be due to the complex pattern of these tests and a preference of examiners to score them on the basis of an 'overall' impression or 'qualitatively'. Little data can be found in the literature, and even then only because the test is included in the STMS [1,2]. The Bender Gestalt Test includes complex three-dimensional figures constituted from many Necker cubes, but again scoring is simplistic [3][4][5][8][9][10][11]. Scoring is based on the overall impression and quality of the drawing as well as on common errors observed, and the focus is on detecting 'organic' brain defects. However, in this way many details in the performance of patients may be lost, and this is especially true when the test is used in psychiatric populations. The current study attempted to develop a standardized scoring method that would allow the examiner to reliably quantify the subject's performance in the copy of the Necker cube test. This test requires the subject to copy a simple drawing template. Both the drawing template and the resulting SCCT, along with the scoring method developed by the current study, are shown in Additional file 1. The test and its scoring method proved to be reliable and stable. There are some indications that it could also be sensitive to change after treatment. An example of possible change after 2 months of antipsychotic treatment is shown in Figure 2. However, targeted research is necessary to show whether this is the case, and it is also necessary to apply the test to different patient populations, especially to patients suffering from 'organic' brain disease, before and after therapeutic intervention.
The scoring method is designed to allow maximum contrast and differentiation between healthy subjects and patients while leaving little room for subjective assessment. Largely, the scoring method expands levels 2-4 of the Bender-Gestalt scoring system. Further research is necessary to show whether such a detailed approach adds substantially to the understanding of the neurocognitive deficit of mental patients or simply consumes time. The results of the discriminant function analysis support the usefulness of this new scoring method. By using the function, the SCCT can assist in the differentiation of patients with schizophrenia from healthy controls. However, apart from the discriminant function analysis, we did not calculate sensitivity and specificity for one or more specific cut-off points, because the overlap between groups was considerable and the test seems to be useful for assessing aspects of cognitive function but not as a specific diagnostic test for a specific illness.
The correlation coefficients among individual SCCT items, although some were significant, suggest that overall each item assesses a distinct issue. This is also reflected in the factor analysis. The four factors that emerge explain 71% of the total variance. The SCCT can be divided into subscales on the basis of the factor analysis and its interpretation. In this way, five subscales can be created. The first factor includes items 1, 4, 7 and 11 and constitutes the Missing Elements (ME) subscale. The second includes items 2, 3, 5, 6, 8, 9 and 10 and constitutes the Deformation (D) subscale. The third includes only item 13 and constitutes the Mirror Image (M) subscale. The fourth includes only item 12 and constitutes the Rotation (R) subscale. Item 14 had no variability and thus constitutes a separate subscale, the Closing-In (CI) subscale.
Correlations among these subscales are very weak. The factor analysis of these subscales produced two superfactors which, together with the separate Closing-In subscale, give three 'indices'. The first (subscales ME and M) constitutes the 'Deficit Index' (DcI), while the second (subscales D and R) is the 'Deformation Index' (DfI). The third index (subscale CI alone) is the 'Closing-In Index' (CiI). It is important to note that all the items of the SCCT included in the DcI are easy for the healthy subject, while the more difficult ones (2, 5 and 8) are included in the DfI. Patients differ from controls concerning the DfI and CiI indices (P < 0.001) but not the DcI. In the frame of the above, the SCCT is divided into the following three indices and five subscales: (a) Deficit Index (DcI), which includes the following two subscales: (1) Missing Elements (ME) subscale (items 1, 4, 7 and 11); (2) Mirror Image (M) subscale (item 13); (b) Deformation Index (DfI), which includes the following two subscales: (1) Deformation (D) subscale (items 2, 3, 5, 6, 8, 9 and 10); (2) Rotation (R) subscale (item 12); (c) Closing-In Index (CiI), which includes the Closing-In (CI) subscale (item 14).
Further research is necessary to elucidate the underlying cognitive functions and deficits that are reflected in these indices and subscales. The correlations among the psychometric scales (PANSS, YMRS and MADRS) and individual items and subscales of the SCCT revealed some very interesting points (Table 4). The Deficit Index correlates negatively with all psychometric scales. The MADRS also correlates negatively with all subscales and indices. Generally, the correlation between the scoring of the SCCT and the psychometric scales is significant. The above suggests a complex neurocognitive profile for schizophrenia as revealed by the SCCT. Further research is necessary to uncover specific issues and mechanisms. Commenting on these correlations is beyond the scope of the current manuscript, and the data included here are insufficient as they do not focus on this research target. We believe that further factor analysis with the inclusion of different patient groups will help to further elucidate the mechanisms underlying performance in the SCCT.
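The item, subscale and index structure described above can be summarized as a simple lookup. In the sketch below the item assignments follow the text, whereas the aggregation of item scores by summation is an assumption made only for illustration; the actual scoring directions are those of Additional file 1.

```python
# Item assignments as described in the text.
SUBSCALES = {
    "ME": [1, 4, 7, 11],           # Missing Elements
    "D": [2, 3, 5, 6, 8, 9, 10],   # Deformation
    "M": [13],                     # Mirror Image
    "R": [12],                     # Rotation
    "CI": [14],                    # Closing-In
}
INDICES = {
    "DcI": ["ME", "M"],  # Deficit Index
    "DfI": ["D", "R"],   # Deformation Index
    "CiI": ["CI"],       # Closing-In Index
}

def subscale_scores(item_scores):
    """Sum the item scores within each subscale (summation is an assumption)."""
    return {
        name: sum(item_scores[item] for item in items)
        for name, items in SUBSCALES.items()
    }

def index_scores(item_scores):
    """Sum the subscale scores within each index (summation is an assumption)."""
    subscales = subscale_scores(item_scores)
    return {
        name: sum(subscales[s] for s in members)
        for name, members in INDICES.items()
    }

# Hypothetical subject scoring 1 on every one of the 14 items.
example = {item: 1 for item in range(1, 15)}
example_indices = index_scores(example)
```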

Conclusions
The current study has developed a reliable, valid and possibly sensitive-to-change instrument. The great advantage of this instrument is that it only requires paper and a pencil, and hence is easily administered and brief. Further research is necessary to test its usefulness as a neuropsychological test.

Additional material
Additional file 1: Standardized Copy of the Cube Test (SCCT). The SCCT.

Authors' contributions
manuscript and approved the final version. StM collected data, assisted in the interpretation of results, gave input to revisions of the manuscript and approved the final version. SK collected data, assisted in the interpretation of results, gave input to revisions of the manuscript and approved the final version. VAT collected data, assisted in the interpretation of results, gave input to revisions of the manuscript and approved the final version. TO collected data, assisted in the interpretation of results, gave input to revisions of the manuscript and approved the final version.