Ordinal Scale and Nonparametric Methods
Ordinal scales involve ranking. The numbers used in an ordinal scale imply only a greater-than or less-than relationship; they do not indicate how much more or how much less. In other words, in any ordinal scale, objects are ranked but the distance between objects cannot be measured. Ordinal scales need not use numbers at all: verbal categories such as "Low," "Medium," and "High" constitute an ordinal scale.
Ordinal scales are frequently encountered in research studies: measurement of attitudes, consumer tastes and preferences, and ranking of attributes are all very common. The usual parametric tests are not appropriate for such data for two reasons. First, they assume measurement on an interval or ratio scale. Second, they assume that the samples are drawn from a population with a known distribution, such as the normal distribution. You therefore need hypothesis testing procedures designed for ordinal scales. These fall under a set of elegant nonparametric methods.
This article gives an illustrative account of nonparametric methods used with ordinal scales of measurement. The coverage is by no means exhaustive. However, typical situations are discussed to show how useful these tests are.
The Kolmogorov-Smirnov test is a test of goodness of fit for the univariate case when the scale of measurement is ordinal. It is similar to the chi-square test of goodness of fit in that it also examines whether the observed frequencies agree with the frequencies expected under a well-defined null hypothesis. The chi-square test, however, requires only nominal measurement and so ignores the ordering of the categories. The Kolmogorov-Smirnov test is more powerful than the chi-square test when ordinal data are encountered in a decision problem. In the concluding remarks, you will see the advantages of using the Kolmogorov-Smirnov test over the chi-square test. To understand how this test works in practice, let us take an example.
Example:
A manufacturing company producing decorative paints is interested in knowing whether the consumers have distinct preferences for different shades in the context of a new decorative paint that it proposes to market. If the consumers have special preference for any particular shade, then the company would market only that shade. Else, it would plan to market all the shades. A sample of 150 consumers was interviewed and the data collected on shade preferences are given in the table below:
Table showing shade preferences
Shade | Number of Consumers preferring |
Very Light | 25 |
Light | 35 |
Medium | 55 |
Dark | 20 |
Very Dark | 15 |
What are your conclusions?
The test compares the cumulative distribution function expected under the null hypothesis with the observed cumulative distribution function. If we designate Fo(X) as the expected cumulative distribution function and Sn(X) as the observed cumulative distribution function, Kolmogorov-Smirnov D is calculated as D = Max |Fo(X)-Sn(X)|. That is, D is the largest absolute difference between the expected cumulative proportion and the observed cumulative proportion across the categories. Please note that n is the sample size. The following table shows the necessary calculations.
Table 1: Basic Calculations for the Example
Shade | Observed Frequency | Observed Proportion | Observed Cumulative Proportion Sn(X) | Expected Proportion | Expected Cumulative Proportion Fo(X) | |Fo(X)-Sn(X)| |
Very Light | 25 | 0.1667 | 0.1667 | 0.2000 | 0.2000 | 0.0333 |
Light | 35 | 0.2333 | 0.4000 | 0.2000 | 0.4000 | 0.0000 |
Medium | 55 | 0.3667 | 0.7667 | 0.2000 | 0.6000 | 0.1667 |
Dark | 20 | 0.1333 | 0.9000 | 0.2000 | 0.8000 | 0.1000 |
Very Dark | 15 | 0.1000 | 1.0000 | 0.2000 | 1.0000 | 0.0000 |
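The calculations in the table can be reproduced with a few lines of code. The following is a minimal sketch in Python using only the standard library; the variable names (`observed`, `sn`, `fo`, `d_stat`) are illustrative choices and not part of the original example.

```python
from itertools import accumulate

shades = ["Very Light", "Light", "Medium", "Dark", "Very Dark"]
observed = [25, 35, 55, 20, 15]   # observed frequencies from the example
n = sum(observed)                 # sample size (150)
k = len(observed)                 # number of ordered categories

# Observed cumulative proportions Sn(X)
sn = [running_total / n for running_total in accumulate(observed)]

# Expected cumulative proportions Fo(X) when all shades are equally preferred
fo = [(i + 1) / k for i in range(k)]

# Absolute differences and the Kolmogorov-Smirnov D
diffs = [abs(f - s) for f, s in zip(fo, sn)]
d_stat = max(diffs)

for shade, s, f, d in zip(shades, sn, fo, diffs):
    print(f"{shade:<10} Sn={s:.4f} Fo={f:.4f} |Fo-Sn|={d:.4f}")
print(f"D = {d_stat:.4f}")
```

The largest difference arises at the third category (Medium), which is where the observed cumulative proportion runs furthest ahead of the expected one.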
The null hypothesis is that all shades are equally preferred.
The alternative hypothesis is that they are not equally preferred.
Computed D = Max |Fo(X)-Sn(X)| = 0.1667. For large samples, the critical D value for a level of significance of 5% is given by D = 1.36 / √n.
Substituting n = 150 in this expression, you get critical D = 0.1110. Since the calculated D (0.1667) exceeds the critical D (0.1110), reject the null hypothesis at the 5% level. The conclusion is that the shades are not all equally preferred. The results point to a preference for the medium shade, where the gap between the observed and expected cumulative proportions is largest.
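The decision step can be sketched as follows. The critical value of 0.1110 quoted above agrees with the usual large-sample approximation 1.36 / √n at the 5% level, so the sketch assumes that approximation; the function name `ks_critical_5pct` is hypothetical and chosen only for illustration.

```python
import math

def ks_critical_5pct(n):
    """Large-sample approximation to the 5% critical value of D (assumed form: 1.36 / sqrt(n))."""
    return 1.36 / math.sqrt(n)

n = 150                 # sample size in the example
d_computed = 0.1667     # D taken from the calculation table
d_critical = ks_critical_5pct(n)

# Reject the null hypothesis when the computed D exceeds the critical D
reject_null = d_computed > d_critical

print(f"critical D = {d_critical:.4f}")
print("reject null hypothesis" if reject_null else "do not reject null hypothesis")
```

For small samples, the critical value should be read from a table of the Kolmogorov-Smirnov statistic rather than from this approximation.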
Concluding Remarks on Kolmogorov-Smirnov Test
You could very well have used the chi-square test of goodness of fit, instead of the Kolmogorov-Smirnov test, to test the hypothesis of equal preference for all shades in this example. When the data are ordinal, however, the Kolmogorov-Smirnov test has the following advantages over the chi-square test.
It is easier to compute.
It does not have the minimum expected frequency requirement for each cell that the chi-square test has.
For very small samples, the chi-square test cannot be used, but the Kolmogorov-Smirnov test can.
When samples are small and categories have to be combined to meet the requirements of the chi-square test, the chi-square test is definitely less powerful than the Kolmogorov-Smirnov test.
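For comparison, the chi-square test of goodness of fit mentioned above can be applied to the same shade data. This is a minimal sketch, assuming equal expected frequencies under the null hypothesis and the standard tabulated 5% critical value of 9.488 for 4 degrees of freedom; the variable names are illustrative.

```python
observed = [25, 35, 55, 20, 15]   # shade preferences from the example
n = sum(observed)
expected = n / len(observed)      # 30 consumers per shade under equal preference

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E over the categories
chi_square = sum((o - expected) ** 2 / expected for o in observed)

degrees_of_freedom = len(observed) - 1   # 4
CRITICAL_5PCT_4DF = 9.488                # tabulated chi-square value (5%, 4 d.f.)

reject_null = chi_square > CRITICAL_5PCT_4DF
print(f"chi-square = {chi_square:.2f}, reject null: {reject_null}")
```

With a sample of 150 and an expected frequency of 30 in every cell, both tests reject the hypothesis of equal preference here; the advantages listed above matter most when samples are small.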
Median Test
The median test is used for testing whether two groups differ in their median value. In simple terms, the median test examines whether the two groups come from populations with the same median. The test requires that the measurement scale is at least ordinal and that the samples are independent (not necessarily of the same size). The null hypothesis is that the two populations have the same median. Let us take an example to see how this test is useful in a typical practical situation.
Example: A private bank is interested in finding out whether the customers belonging to two groups differ in their satisfaction level. The two groups are current account holders and savings account holders. A random sample of 20 customers from each group was interviewed regarding their perceptions of the bank's service quality using Likert-type (ordinal scale) statements. A score of "1" represents very dissatisfied and a score of "5" represents very satisfied. The aggregate scores compiled for each respondent in each group are tabulated below:
Current Account | Savings Account |
79 | 85 |
86 | 80 |
40 | 50 |
50 | 55 |
75 | 65 |
38 | 50 |
70 | 63 |
73 | 75 |
50 | 55 |
40 | 45 |
20 | 30 |
80 | 85 |
55 | 65 |
61 | 80 |
50 | 55 |
80 | 75 |
60 | 65 |
30 | 50 |
70 | 75 |
50 | 62 |
What are your conclusions regarding the satisfaction level of these two groups?
The first task in the median test is to obtain the grand median. Arrange the combined data of both groups in descending order of magnitude, that is, rank them from the highest to the lowest. Then select the middlemost observation in the ranked data. In this case there are 40 observations, so the grand median is the average of the 20th and 21st observations in the array arranged in descending order of magnitude.
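Table showing descending order of aggregate score and rank in the combined sample (first 21 of the 40 observations)
Descending Order | Rank |
86 | 1 |
85 | 2.5 |
85 | 2.5 |
80 | 5.5 |
80 | 5.5 |
80 | 5.5 |
80 | 5.5 |
79 | 8 |
75 | 10.5 |
75 | 10.5 |
75 | 10.5 |
75 | 10.5 |
73 | 13 |
70 | 14.5 |
70 | 14.5 |
65 | 17 |
65 | 17 |
65 | 17 |
63 | 19 |
62 | 20 |
61 | 21 |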
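The ranking of the combined sample, with tied scores sharing the average of the ranks they occupy, can be sketched as follows. The function name `average_ranks_descending` is a hypothetical helper introduced for illustration; the two lists hold the scores from the data table of the example.

```python
from collections import Counter

current = [79, 86, 40, 50, 75, 38, 70, 73, 50, 40,
           20, 80, 55, 61, 50, 80, 60, 30, 70, 50]
savings = [85, 80, 50, 55, 65, 50, 63, 75, 55, 45,
           30, 85, 65, 80, 55, 75, 65, 50, 75, 62]

def average_ranks_descending(scores):
    """Map each distinct score to its average rank, where rank 1 is the highest score."""
    counts = Counter(scores)
    ranks = {}
    next_rank = 1
    for value in sorted(counts, reverse=True):
        tie_size = counts[value]
        # Tied scores occupy ranks next_rank .. next_rank + tie_size - 1;
        # each receives the average of those ranks.
        ranks[value] = next_rank + (tie_size - 1) / 2
        next_rank += tie_size
    return ranks

ranks = average_ranks_descending(current + savings)
for value in sorted(ranks, reverse=True):
    print(f"{value:>3} | {ranks[value]}")
```

The ranks themselves are not used in the test statistic; they only help to locate the 20th and 21st observations that determine the grand median.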
The grand median is the average of the 20th and 21st observations = (62+61)/2 = 61.5. Please note that in the above table, the average rank is taken whenever scores are tied. The next step is to prepare a contingency table of two rows and two columns. The cells represent the number of observations in each group that are above and below the grand median. Whenever some observations coincide with the grand median, the accepted practice is to first count the observations that are strictly above the grand median and put the rest under "below grand median." In other words, "below grand median" in such cases includes observations less than or equal to the grand median.
Scores of Current Account Holders and Savings Account Holders as compared with Grand Median
 | Current Account Holders | Savings Account Holders | Marginal Total |
Above Grand Median | 8(a) | 12(b) | 20(a+b) |
Below Grand Median | 12(c) | 8(d) | 20(c+d) |
Marginal Total | 20(a+c) | 20(b+d) | 40(a+b+c+d) = n |
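The grand median and the cell counts of the contingency table can be checked with a short script. This is a minimal sketch using Python's standard library; the helper `split_at_median` is a hypothetical name, and it follows the convention described above of counting only observations strictly above the grand median as "above."

```python
from statistics import median

current = [79, 86, 40, 50, 75, 38, 70, 73, 50, 40,
           20, 80, 55, 61, 50, 80, 60, 30, 70, 50]
savings = [85, 80, 50, 55, 65, 50, 63, 75, 55, 45,
           30, 85, 65, 80, 55, 75, 65, 50, 75, 62]

# Grand median of the combined sample (average of the 20th and 21st values)
grand_median = median(current + savings)

def split_at_median(scores, m):
    """Return (count strictly above m, count less than or equal to m)."""
    above = sum(1 for s in scores if s > m)
    return above, len(scores) - above

a, c = split_at_median(current, grand_median)   # current account holders
b, d = split_at_median(savings, grand_median)   # savings account holders

print(f"grand median = {grand_median}")
print(f"above: a={a}, b={b}; below: c={c}, d={d}")
```

In this example no score equals the grand median of 61.5, so the convention for ties with the median does not affect the counts.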
Null Hypothesis: There is no difference between the current account holders and savings account holders in the perceived satisfaction level.
Alternative Hypothesis: There is a difference between the current account holders and savings account holders in the perceived satisfaction level.
The test statistic to be used is given by
chi-square = n(|ad - bc| - n/2)² / [(a+b)(c+d)(a+c)(b+d)]
This chi-square statistic is the one we would have obtained in a contingency table with nominal data, except for the term n/2 subtracted in the numerator as a correction for continuity. The correction is needed because a continuous distribution is used to approximate a discrete distribution.
On substituting the values of a, b, c, d, and n, we have
chi-square = 40(|8×8 - 12×12| - 20)² / (20×20×20×20) = 40(60)² / 160000 = 0.90
Critical chi-square for 1 d.f. at the 5% level of significance = 3.84. Since the computed chi-square (0.90) is less than the critical chi-square (3.84), we have no convincing evidence to reject the null hypothesis. Thus the data are consistent with the null hypothesis that there is no difference between the current account holders and savings account holders in the perceived satisfaction level.
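The computation of the test statistic and the decision can be sketched as follows. The sketch assumes the continuity-corrected chi-square for a 2 x 2 table, which reproduces the value of 0.90 reported above; the function name `median_test_chi_square` is hypothetical.

```python
def median_test_chi_square(a, b, c, d):
    """Continuity-corrected chi-square for a 2 x 2 median-test table."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Cell counts from the contingency table of the example
chi_square = median_test_chi_square(a=8, b=12, c=12, d=8)

CRITICAL_5PCT_1DF = 3.84   # critical chi-square for 1 d.f. at the 5% level

reject_null = chi_square > CRITICAL_5PCT_1DF
print(f"chi-square = {chi_square:.2f}, reject null: {reject_null}")
```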