The standardised copy of pentagons test

Background The 'double-diamond copy' task is a simple paper and pencil test that is part of the Bender-Gestalt Test and the Mini Mental State Examination (MMSE). Although it is a widely used test, its method of scoring is crude and its psychometric properties are not adequately known. The aim of the present study was to develop a sensitive and reliable method of administration and scoring. Methods The study sample included 93 normal control subjects (53 women and 40 men) aged 35.87 ± 12.62 years and 127 patients suffering from schizophrenia (54 women and 73 men) aged 34.07 ± 9.83 years. Results The scoring method was based on the frequencies of responses of healthy controls and proved to be relatively reliable, with Cronbach's α equal to 0.61, test-retest correlation coefficient equal to 0.41 and inter-rater reliability equal to 0.52. The factor analysis produced two indices and six subscales of the Standardised Copy of Pentagons Test (SCPT). The total score as well as most of the individual items and subscales distinguished between controls and patients. The discriminant function correctly classified 63.44% of controls and 75.59% of patients. Discussion The SCPT seems to be a satisfactory, reliable and valid instrument, which is easy to administer, suitable for use in non-organic psychiatric patients and demands minimal time. Further research is necessary to test its psychometric properties and its usefulness and applications as a neuropsychological test.


Background
The 'double-diamond copy' task is a well-known, simple paper and pencil test included in the Bender-Gestalt Test [1][2][3][4][5][6][7][8][9]. A slightly different version ('double-pentagon copy') with a different overlapping shape is also included in the Mini Mental State Examination (MMSE) [10,11]. It is composed of two overlapping pentagons, with the overlapping shape being a rhombus. It assesses visual motor ability. However, for both scales this item is scored in a very simple way. For example, in the MMSE it receives a 0/1 score and in the Bender-Gestalt Test a 0-4 score, with sample drawings to guide the examiner. The overall method is more 'qualitative' and focuses on the 'organic/neuropsychiatric' end of the spectrum (for example, dementia), since scoring levels 0-2 are reserved for very poor performance.
Non-organic psychiatric patients, however, including most patients with schizophrenia, are likely to receive a score of 2-4. Samples showing how patients with schizophrenia perform in this task are shown in Figure 1. It is obvious that by using these scoring methods to assess the drawings of psychiatric patients, valuable information might be lost.
The aim of the current study was to develop a novel and detailed standardised method for the administration and scoring of a task similar to the 'double-diamond copy' task. This task included two pentagons overlapping into a rhombus but with a slightly different shape in comparison to the Bender-Gestalt figure (Figure 1). This new task with its novel scoring method aims to be reliable, valid and sensitive to change in response to treatment, and to be suitable for use in mental patients suffering from disorders other than dementia.
All subjects were physically healthy with normal clinical and laboratory findings. All control subjects and patients gave informed consent and the protocol received approval by the University's Ethics Committee. The patients were either inpatients or outpatients of a private psychiatric clinic.

Clinical diagnosis
The diagnosis was made according to DSM-IV-TR criteria on the basis of a semistructured interview based on the Schedules for Clinical Assessment in Neuropsychiatry version 2.0 (SCAN v 2.0) [12].
Normal controls were assessed on the basis of an unstructured clinical interview.

The Standardised Copy of the Pentagons Test (SCPT) procedure
The SCPT procedure required the subject to copy a shape of two partially overlapping pentagons, analogous to a shape of the Bender-Gestalt Test and similar to the figure used in some versions of the MMSE. The shape includes two pentagons whose overlap is a four-angle rhombus. The shape is shown in Figure 1 and in Additional file 1. The SCPT instructions ask the subject to draw an identical shape on the same piece of paper. The template shape was printed on the left half of the sheet, leaving space for the subject to reproduce it on the right. No time limit was set and no time recording was made.
The assessment included the Random Letter Test (RLT) for the assessment of attention and vigilance [13].
It includes the following four series of letters: LTPEAOAISTDALAA; ANIABFSAMPZEOAD; PAKLATSXTOEABAA and ZYFMTSAHEOAAPAT. The first and third series include five 'A's, while the second and fourth include four 'A's. The test requires the patient to hit the desk when the examiner pronounces 'A'. Errors of omission and commission are recorded. It is expected (and was verified in the present study) that the mean number of errors made by normal controls in this test is around 0.2 [14].
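As an illustration of how omission and commission errors in the RLT could be tallied, the following is a minimal sketch. The four series are taken from the text (each entered as 15 letters); the recording format (a set of 0-based positions at which the subject hit the desk) and the function name `score_series` are assumptions made for this example and are not part of the published procedure.

```python
# The four letter series read aloud by the examiner (from the text).
SERIES = [
    "LTPEAOAISTDALAA",
    "ANIABFSAMPZEOAD",
    "PAKLATSXTOEABAA",
    "ZYFMTSAHEOAAPAT",
]
TARGET = "A"


def score_series(letters, taps):
    """Return (omissions, commissions) for one series.

    letters: the series pronounced by the examiner.
    taps: set of 0-based positions where the subject hit the desk
          (hypothetical recording format, assumed for this sketch).
    """
    # Omission: a target letter was pronounced but the subject did not tap.
    omissions = sum(
        1 for i, ch in enumerate(letters) if ch == TARGET and i not in taps
    )
    # Commission: the subject tapped on a letter that is not the target.
    commissions = sum(
        1 for i in taps if 0 <= i < len(letters) and letters[i] != TARGET
    )
    return omissions, commissions


# Number of targets per series.
target_counts = [s.count(TARGET) for s in SERIES]
print(target_counts)  # [5, 4, 5, 4]

# Hypothetical subject on the first series: taps on 'L' (position 0)
# and misses the 'A' at position 6.
example_taps = {0, 4, 11, 13, 14}
print(score_series(SERIES[0], example_taps))  # (1, 1)
```

Summing both error types over the four series gives a total error count per subject, which can then be compared with the mean of around 0.2 reported for normal controls.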

Statistical analysis
Frequency tables were created concerning the scores of healthy controls. These tables were used to produce percentile scores and develop a scoring method for the scale. The Pearson's R correlation coefficient, factor analysis (varimax normalised rotation) and item analysis [18] (calculation of Cronbach's α) were used to explore the internal structure of the scale. Analysis of variance (ANOVA) [19] was used to test the difference between groups, and was performed separately for subjects below and above the age of 40. Discriminant function analysis was also used to explore the power of the scale in discriminating between groups. The Pearson's R correlation coefficient was calculated to assess the test-retest reliability as well as the inter-rater reliability. However, the calculation of correlation coefficients is not a sufficient method to test the reliability and reproducibility of a method and its results, because it is an index of correlation and not an index of agreement [19][20][21]. The calculation of means and standard deviations for each SCPT item and total score during the first (test) and second (retest) applications may provide an impression of the stability of results over time.
The means and the standard deviations of the differences concerning each SCPT item between test and retest were also calculated, and plots of the test vs retest and difference vs average value for each variable were generated. In fact, it is not possible to use statistics to define acceptable agreement [19]. However, these plots may assist the decision. This method has been used in previous studies concerning the validation of scientific methods [22,23].
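The difference vs average approach described above can be sketched as follows. This is a minimal example using hypothetical scores, not study data. The function name `agreement_summary` and the 95% limits of agreement (mean difference ± 1.96 SD) are assumptions added for illustration; the text itself reports only the means and standard deviations of the differences and the corresponding plots.

```python
from statistics import mean, stdev


def agreement_summary(first, second):
    """Summarise agreement between two applications of the same measure."""
    differences = [a - b for a, b in zip(first, second)]
    averages = [(a + b) / 2 for a, b in zip(first, second)]
    mean_diff = mean(differences)
    sd_diff = stdev(differences)  # sample standard deviation
    return {
        # (average, difference) pairs: the points of the difference vs average plot
        "points": list(zip(averages, differences)),
        "mean_diff": mean_diff,
        "sd_diff": sd_diff,
        # Assumption: conventional 95% limits of agreement, not stated in the text
        "limits": (mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff),
    }


# Hypothetical total scores for five subjects at test and retest.
first_scores = [10, 12, 9, 14, 11]
second_scores = [11, 12, 8, 13, 12]

summary = agreement_summary(first_scores, second_scores)
print(summary["mean_diff"], summary["sd_diff"])
print(summary["limits"])
```

A mean difference close to zero with a small standard deviation suggests stable results, but whether the spread is acceptable remains a judgment that the statistics alone cannot make.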

Results
The frequency tables for scores of healthy controls are shown in Table 1. In the same table, the proposed scoring for each item is also shown. This scoring method is based on the frequencies of responses of healthy controls (percentile scores).
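To make the idea of a frequency-based scoring concrete, the sketch below tabulates how often each response occurs among controls and the cumulative percentage up to that response. The data and the function name `frequency_table` are hypothetical, and the actual mapping from percentiles to item scores used in Table 1 is not reproduced here.

```python
from collections import Counter


def frequency_table(responses):
    """Return rows of (value, count, percent, cumulative percent)."""
    counts = Counter(responses)
    n = len(responses)
    rows = []
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        rows.append(
            (value, counts[value], 100 * counts[value] / n, 100 * cumulative / n)
        )
    return rows


# Hypothetical item: number of angles drawn for one pentagon by 10 controls.
control_responses = [5, 5, 5, 4, 5, 5, 6, 5, 5, 4]

table = frequency_table(control_responses)
for row in table:
    print(row)
```

Responses that are rare among healthy controls fall in the extreme percentiles, which is the kind of information a percentile-based item score can draw on.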
The one-way ANOVA revealed a significant difference in the total SCPT score between patients and controls for subjects under the age of 40 (P < 0.001) but not for those above this age (P = 0.17; Table 2). Note that SCPT-14 and SCPT-15 had no variance, so they were not included in the analysis of separate items. The results are shown in Table 2 along with post hoc tests. It seems that there are no differences in older subjects because the performance of controls gets worse, while the change in the performance of patients is not great.
The Pearson's R correlation coefficients for the SCPT items are shown in Table 3 (total study sample).
The Pearson's R correlation coefficients for the SCPT items and the Positive and Negative Syndrome Scale (PANSS; positive, negative and general psychopathology subscales), the YMRS and the MADRS are shown in Table 4 (only for patients with schizophrenia).
The results of the factor analysis (varimax normalised rotation) are shown in Table 5. The analysis (using the Kaiser criterion of eigenvalues larger than 1) produced six factors explaining 62% of the total variance. On the basis of this factor analysis, subscales were created; the differences between groups concerning these subscales are shown in Table 6. The last SCPT item (closing-in) was included as a seventh subscale since it did not contribute to the factor analysis. One-way ANOVA revealed significant differences between the two diagnostic groups, and post hoc tests showed that this difference concerned some of the subscales but not all (P < 0.001; Table 6). The correlation coefficients for these subscales are shown in Table 7. Some correlations among these scales are statistically significant but weak. A second factor analysis of these subscales produced three superfactors explaining 22%, 22% and 15% of total variance, respectively. The first one included subscales 2 and 5, the second included subscales 1, 3, 4 and 6, and the third included subscales 3 and 7 (Table 8).
Item analysis (calculation of Cronbach's α): Cronbach's α was equal to 0.61. The α coefficient did not change significantly when any item was omitted from the analysis.
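For readers unfamiliar with the item analysis, Cronbach's α for k items is k/(k-1) multiplied by (1 minus the sum of the item variances divided by the variance of the total score). The sketch below computes α and the 'α if item deleted' values on a small hypothetical data set; the function names and the data are illustrative assumptions and do not reproduce the study's analysis.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of subjects, each a list of item scores."""
    k = len(rows[0])
    # Sample variance of each item across subjects.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the total score across subjects.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item left out in turn."""
    k = len(rows[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(row) if j != drop] for row in rows])
        for drop in range(k)
    ]


# Hypothetical scores: 4 subjects x 3 items.
scores = [
    [2, 3, 3],
    [1, 1, 2],
    [3, 3, 2],
    [2, 1, 1],
]

alpha = cronbach_alpha(scores)
print(round(alpha, 2))  # 0.75
print(alpha_if_item_deleted(scores))
```

If removing any single item leaves α roughly unchanged, as reported here, no individual item is driving or undermining the internal consistency of the scale.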
The Pearson's R correlation coefficient for inter-rater reliability was 0.52 for the total SCPT scale and ranged from 0.46 to 0.86 for individual items (Table 11); with regard to test-retest reliability, the same coefficient was equal to 0.46 and the item coefficients ranged from -0.12 to 0.70 (Table 9). Retest was performed within 5 days of first testing. The calculation of means and standard deviations for each SCPT item and total score during the first (test) and second (retest) applications, as well as the plots of the test vs retest and difference vs average value for each variable, suggested that the SCPT is reliable and replicable.

Discussion
The SCPT is a test of visual motor ability, and although several decades have passed since it was introduced, little has been done to standardise it. This may be due to its complex pattern and a preference to score it on the basis of an 'overall' impression or 'qualitatively'. Few data can be found in the literature, and these exist only because the task is included in the MMSE and the Bender-Gestalt Test. Until now, scoring has been based on the overall impression and quality of the drawing as well as on common errors observed. The focus is on detecting 'organic' brain defects (for example, due to tumour, stroke or dementia). However, in this way many details in the performance of patients may be lost, and this is especially true when the test is used in psychiatric populations. Even the Bender-Gestalt Test uses a very simple way to score this task. The current study attempted to develop a standardised scoring method that would allow the examiner to reliably quantify the subject's performance in the copy of pentagons test. This test requires the subject to copy a simple drawing template. Both the drawing template and the resulting SCPT along with the scoring method developed by the current study are shown in Additional file 1. The test and its scoring method proved to be satisfactorily reliable and stable. It is not clear whether it is also sensitive to change after treatment. In one patient, performance improved after 2 months of antipsychotic treatment (Figure 2). However, it is still necessary to apply the test to different patient populations, especially to patients suffering from 'organic' brain disease, before and after therapeutic intervention. The scoring method is such that it allows for maximum contrast and differentiation between normal subjects and psychiatric patients. It also leaves little space for subjective assessment. In essence, the proposed scoring method expands levels 2-4 of the Bender-Gestalt scoring system.
Although some of the correlation coefficients among individual SCPT items were significant, overall each item assesses a distinct issue. This is also reflected in factor analysis. The six factors that emerge explain roughly 10% of the total variance each and 64% combined. The SCPT can be divided into subscales on the basis of the factor analysis and its interpretation. In this way, six subscales can be created. The first factor includes items 5, 6, 7 and 9 and largely reflects 'proportion'. Thus it may constitute the basis of a subscale named 'proportion' (P). The second one includes items 1, 2 and 3 and reflects the number of missing angles in the drawing. Thus it constitutes the basis of a subscale under the title 'missing angles' (MA). The third factor includes items 11 and 12 and reflects the quality of the line drawing in the shape. The resulting subscale is named 'quality of lines' (QL). The fourth factor includes items 8 and 13 (and 14, although that item's lack of variance did not permit its inclusion in the factor analysis) and is an index of image distortion, and constitutes the basis of the 'image distortion' (ID) subscale. The fifth includes items 2 (again) and 10 and reflects differences in size between the template and the shape drawn by the subject, thus being the basis of the 'size' (S) subscale. The sixth factor includes items 4 and 9 (again) and reflects correction efforts, giving rise to the 'correction' (C) subscale. A final subscale, which includes only item 15 and is named 'closing-in' (CI), should be added. Schizophrenic patients differ from controls in P, MA and QL but not in the remaining subscales.
Correlations among these subscales are significant but weak. The factor analysis of these subscales produced three superfactors, named 'indices'. The first (subscales MA and S) constitutes the 'deficit index' (DcI), while the second (subscales P, QL, ID and C) is the 'deformation index' (DfI). The third index (subscales QL and CI) is the 'closing-in index' (CiI). It is important to note that all the items of the SCPT included in the DcI are easy for the normal subject, while the more difficult ones (2, 5 and 8) are included in the DfI. Patients differ from controls concerning the DfI and CiI indices (P < 0.001) but not the DcI. In the context of the above, the SCPT is divided into the following three indices and six subscales:
a. Deficit index (DcI), which includes the following two subscales:
1. Missing angles (MA) subscale (items 1, 2 and 3)
2. Size (S) subscale (items 2 and 10)
b. Deformation index (DfI), which includes the following four subscales:
1. Proportion (P) subscale (items 5, 6, 7 and 9)
2. Quality of lines (QL) subscale (items 11 and 12)
3. Corrections (C) subscale (items 4 and 9)
4. Image distortion (ID) subscale (items 8, 13 and 14)
c. Closing-in index (CiI), which includes the following two subscales:
1. Quality of lines (QL) subscale (items 11 and 12)
2. Closing-in (CI) subscale (item 15)
The correlations among the psychometric scales (PANSS, YMRS and MADRS) and individual items and subscales of the SCPT revealed some very interesting points (Table 4). The PANSS-Positive subscale correlates inversely with the DfI and CiI. The PANSS-Negative subscale also correlates inversely with most indices. PANSS-General Psychopathology again correlates inversely with the DfI and CiI. The YMRS does not correlate with any index; in the current study it was used in order to have a measure for comparison with bipolar patients in future studies. The MADRS correlated negatively with most indices. From the above it is obvious that the relationship of schizophrenia and its psychometric profile to cognitive function as assessed by the SCPT is rather complex and non-linear, and further research is necessary to uncover specific issues and mechanisms.
We believe that future factor analysis with the inclusion of different patient groups will help to further elucidate the mechanism underlying the performance in the SCPT.

Conclusions
In summary, the current study has developed a reliable and valid instrument. The great advantage of this instrument is that it is a paper and pencil test that is easily administered, takes little time and is appropriate for use in non-organic mental patients. Further research is necessary to test its usefulness and its applications as a neuropsychological test.

Additional material
Additional file 1: Standardised Copy of the Pentagons Test (SCPT).