Ordinal Scale and Nonparametric Methods

Preamble:

Ordinal scales involve ranking. Numbers used in an ordinal scale imply a greater-than or less-than relationship; they do not indicate how much greater or how much less. In other words, objects on an ordinal scale are ranked, but the distance between them cannot be measured. Numbers are not required in an ordinal scale: verbal categories such as "Low," "Medium," and "High" constitute an ordinal scale just as well.

Ordinal scales are frequently encountered in research studies: measurement of attitudes, consumer tastes and preferences, and ranking of attributes are all very prevalent. The usual parametric tests are not appropriate for such data for two reasons. First, they assume measurement on an interval or ratio scale. Second, they assume that the samples are drawn from a population with a known distribution, such as the normal distribution. You need hypothesis testing procedures designed specifically for ordinal scales. These fall under a set of elegant nonparametric methods.

This article attempts to give an illustrative account of nonparametric methods that are used with ordinal scales of measurement. The coverage is by no means exhaustive. However, typical situations are discussed to show how useful these tests are.

Kolmogorov-Smirnov Test:

The Kolmogorov-Smirnov test is a test of goodness of fit for the univariate case when the scale of measurement is ordinal. It is similar to the chi-square test of goodness of fit in the sense that it also examines whether the observed frequencies agree with the frequencies expected under a well-defined null hypothesis. The chi-square test, of course, requires only nominal measurement. The Kolmogorov-Smirnov test is more powerful than the chi-square test when ordinal data are encountered in a decision problem. In the concluding remarks, you will see the advantages of using the Kolmogorov-Smirnov test over the chi-square test. To understand how this test works in practice, let us take an example.

Example:

A manufacturing company producing decorative paints is interested in knowing whether the consumers have distinct preferences for different shades in the context of a new decorative paint that it proposes to market. If the consumers have special preference for any particular shade, then the company would market only that shade. Else, it would plan to market all the shades. A sample of 150 consumers was interviewed and the data collected on shade preferences are given in the table below:

Table showing shade preferences

Shade        Number of Consumers Preferring
Very Light   25
Light        35
Medium       55
Dark         20
Very Dark    15

What are your conclusions?

Analysis and Interpretations:

The test compares the cumulative distribution function expected under the null hypothesis with the observed cumulative distribution function. If Fo(X) denotes the expected cumulative distribution function and Sn(X) the observed cumulative distribution function, where n is the sample size, the Kolmogorov-Smirnov D is calculated as D = Max |Fo(X)-Sn(X)|. That is, D is the largest absolute difference between the expected cumulative proportion and the observed cumulative proportion. The following table shows the necessary calculations.

Table 1: Basic Calculations for the Example

Shade       Observed   Observed    Observed Cumulative  Expected    Expected Cumulative  |Fo(X)-Sn(X)|
            Frequency  Proportion  Proportion Sn(X)     Proportion  Proportion Fo(X)
Very Light  25         0.1667      0.1667               0.2000      0.2000               0.0333
Light       35         0.2333      0.4000               0.2000      0.4000               0.0000
Medium      55         0.3667      0.7667               0.2000      0.6000               0.1667
Dark        20         0.1333      0.9000               0.2000      0.8000               0.1000
Very Dark   15         0.1000      1.0000               0.2000      1.0000               0.0000

The null hypothesis is that all shades are equally preferred

The alternative hypothesis is that they are not equally preferred

Computed D = Max |Fo(X)-Sn(X)| = 0.1667. For large samples, the critical D value at the 5% level of significance is given by

$$D_{\text{critical}} = \frac{1.36}{\sqrt{n}}$$

Substituting n = 150 in this expression, you get a critical D of 0.1110. Since the calculated D (0.1667) exceeds the critical D (0.1110), reject the null hypothesis at the 5% level. The conclusion is that not all shades are equally preferred. The results show a significant preference for the medium shade.
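The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original example: the function name `ks_one_sample_d` is hypothetical, the null hypothesis of equal preference is encoded as an expected cumulative proportion of i/k for the i-th of k shades, and the 5% critical value is assumed to come from the usual large-sample approximation 1.36/sqrt(n), which reproduces the 0.1110 quoted above.

```python
import math

# Observed frequencies from the shade-preference table, in ordinal order
# (Very Light, Light, Medium, Dark, Very Dark).
observed = [25, 35, 55, 20, 15]


def ks_one_sample_d(frequencies):
    """Return Max |Fo(X) - Sn(X)| against an equal-preference null hypothesis."""
    n = sum(frequencies)
    k = len(frequencies)
    d_max = 0.0
    cumulative = 0
    for i, f in enumerate(frequencies, start=1):
        cumulative += f
        sn = cumulative / n  # observed cumulative proportion Sn(X)
        fo = i / k           # expected cumulative proportion Fo(X)
        d_max = max(d_max, abs(fo - sn))
    return d_max


n = sum(observed)
d = ks_one_sample_d(observed)
# Assumed large-sample approximation for the 5% critical value.
d_critical = 1.36 / math.sqrt(n)

print(f"D = {d:.4f}, critical D = {d_critical:.4f}, reject H0: {d > d_critical}")
```

The largest gap occurs at the Medium category, where the observed cumulative proportion (0.7667) is furthest from the expected one (0.6000).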

Concluding Remarks on Kolmogorov-Smirnov Test

You could very well have used the chi-square test of goodness of fit instead of the Kolmogorov-Smirnov test to test the hypothesis of equal preference for all shades in this example. When the data are measured on an ordinal scale, however, the Kolmogorov-Smirnov test is more powerful than the chi-square test and is preferred for the following reasons.

  1. It is easier to compute.

  2. It does not have the minimum-frequency-per-cell requirement that the chi-square test has.

  3. For very small samples, the chi-square test cannot be used, but the Kolmogorov-Smirnov test can.

  4. When samples are small and categories have to be combined for the chi-square test to be applied properly, the chi-square test is definitely less powerful than the Kolmogorov-Smirnov test.
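For comparison, the chi-square goodness-of-fit statistic for the same shade data can be computed as follows. This is a sketch under stated assumptions: the variable names are illustrative, and the critical value 9.488 (4 degrees of freedom, 5% level) is taken from standard chi-square tables rather than from the example itself.

```python
# Observed shade preferences (Very Light ... Very Dark) from the example.
observed = [25, 35, 55, 20, 15]

n = sum(observed)
expected = n / len(observed)  # equal preference: 150 / 5 = 30 per shade

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E over all cells.
chi_square = sum((o - expected) ** 2 / expected for o in observed)

degrees_of_freedom = len(observed) - 1
critical = 9.488  # assumed table value for 4 d.f. at the 5% level

print(f"chi-square = {chi_square:.2f}, d.f. = {degrees_of_freedom}, "
      f"reject H0: {chi_square > critical}")
```

With an expected frequency of 30 per shade, the statistic works out to about 33.33, so for these data the chi-square test also rejects the hypothesis of equal preference.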


Median Test:

The median test is used for testing whether two groups differ in their median value. In simple terms, the median test focuses on whether the two groups come from populations with the same median. The test requires that the measurement scale be at least ordinal and that the samples be independent (not necessarily of the same size). The null hypothesis is that the two populations have the same median. Let us take an example to appreciate how this test is useful in a typical practical situation.

Example: A private bank is interested in finding out whether customers belonging to two groups differ in their satisfaction level. The two groups are current account holders and savings account holders. A random sample of 20 customers from each category was interviewed regarding their perceptions of the bank's service quality using Likert-type (ordinal scale) statements. A score of "1" represents very dissatisfied and a score of "5" represents very satisfied. The compiled aggregate scores for each respondent in each group are tabulated below:

 

Current Account   Savings Account
79                85
86                80
40                50
50                55
75                65
38                50
70                63
73                75
50                55
40                45
20                30
80                85
55                65
61                80
50                55
80                75
60                65
30                50
70                75
50                62

What are your conclusions regarding the satisfaction level of these two groups?

Analysis and Interpretations:

The first task in the median test is to obtain the grand median. Arrange the combined data of both groups in descending order of magnitude; that is, rank them from the highest to the lowest. Select the middlemost observation in the ranked data. In this case, with 40 observations in all, the median is the average of the 20th and 21st observations in the descending array.

Table showing descending order of aggregate score and rank in the combined sample

Descending Order   Rank    Descending Order   Rank
86                 1       61                 21
85                 2.5     60                 22
85                 2.5     55                 24.5
80                 5.5     55                 24.5
80                 5.5     55                 24.5
80                 5.5     55                 24.5
80                 5.5     50                 30
79                 8       50                 30
75                 10.5    50                 30
75                 10.5    50                 30
75                 10.5    50                 30
75                 10.5    50                 30
73                 13      50                 30
70                 14.5    45                 34
70                 14.5    40                 35.5
65                 17      40                 35.5
65                 17      38                 37
65                 17      30                 38.5
63                 19      30                 38.5
62                 20      20                 40

Grand median is the average of 20th and 21st observation = (62+61)/2 =61.5. Please note that in the above table, average rank is taken whenever the scores are tied. The next step is to prepare a contingency table of two rows and two columns. The cells represent the number of observations that are above and below the grand median in each group. Whenever some observations in each group coincide with the median value, the accepted practice is to first count the observations that are strictly above grand median and put the rest under below grand median. In other words, below grand median in such cases would include less than or equal to grand median.
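The grand median and the tied ranks can be checked with a short Python sketch. It is only an illustration: the helper name `average_ranks` is hypothetical, and the two lists hold the scores as listed in the example.

```python
from statistics import median

# Aggregate scores as listed in the example (20 respondents per group).
current = [79, 86, 40, 50, 75, 38, 70, 73, 50, 40,
           20, 80, 55, 61, 50, 80, 60, 30, 70, 50]
savings = [85, 80, 50, 55, 65, 50, 63, 75, 55, 45,
           30, 85, 65, 80, 55, 75, 65, 50, 75, 62]

# Combined sample in descending order, as in the ranking table.
combined = sorted(current + savings, reverse=True)

# With 40 observations, median() averages the 20th and 21st values.
grand_median = median(combined)


def average_ranks(sorted_scores):
    """Map each distinct score to the mean of the positions it occupies."""
    ranks = {}
    i = 0
    while i < len(sorted_scores):
        j = i
        while j < len(sorted_scores) and sorted_scores[j] == sorted_scores[i]:
            j += 1
        # Tied scores occupy positions i+1 .. j; assign their average.
        ranks[sorted_scores[i]] = (i + 1 + j) / 2
        i = j
    return ranks


ranks = average_ranks(combined)
print("Grand median:", grand_median)
print("Rank of score 80:", ranks[80], "| rank of score 50:", ranks[50])
```

Note that the ranks themselves are not used in the median test statistic; they only help locate the middle of the combined array.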

Scores of Current Account Holders and Savings Account Holders as compared with Grand Median

                     Current Account Holders   Savings Account Holders   Marginal Total
Above Grand Median   8 (a)                     12 (b)                    20 (a+b)
Below Grand Median   12 (c)                    8 (d)                     20 (c+d)
Marginal Total       20 (a+c)                  20 (b+d)                  40 (a+b+c+d = n)
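The cell counts in this table can be reproduced from the raw scores. The sketch below is illustrative only: `split_at_median` is a hypothetical helper, and it follows the convention described above of counting scores strictly above the grand median and placing the rest (less than or equal) in the "below" row.

```python
from statistics import median

# Aggregate scores as listed in the example (20 respondents per group).
current = [79, 86, 40, 50, 75, 38, 70, 73, 50, 40,
           20, 80, 55, 61, 50, 80, 60, 30, 70, 50]
savings = [85, 80, 50, 55, 65, 50, 63, 75, 55, 45,
           30, 85, 65, 80, 55, 75, 65, 50, 75, 62]

grand_median = median(current + savings)


def split_at_median(scores, m):
    """Return (count strictly above m, count at or below m)."""
    above = sum(1 for s in scores if s > m)
    return above, len(scores) - above


a, c = split_at_median(current, grand_median)  # current account column
b, d = split_at_median(savings, grand_median)  # savings account column

print(f"Above grand median: a = {a}, b = {b}")
print(f"Below grand median: c = {c}, d = {d}")
```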

Null Hypothesis: There is no difference between the current account holders and the savings account holders in the perceived satisfaction level.

Alternative Hypothesis: There is a difference between the current account holders and the savings account holders in the perceived satisfaction level.

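The test statistic to be used is given by

$$\chi^2 = \frac{n\left(|ad - bc| - \frac{n}{2}\right)^2}{(a+b)(c+d)(a+c)(b+d)}$$

This chi-square statistic is the one we would have obtained in a contingency table with nominal data, except for the term n/2 subtracted in the numerator as a correction for continuity. The correction is needed because a continuous distribution is used to approximate a discrete distribution.

On substituting the values of a, b, c, d, and n, we have

$$\chi^2 = \frac{40\left(|8 \times 8 - 12 \times 12| - 20\right)^2}{20 \times 20 \times 20 \times 20} = \frac{40 \times 60^2}{160000} = 0.90$$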

The critical chi-square for 1 d.f. at the 5% level of significance is 3.84. Since the computed chi-square (0.90) is less than the critical chi-square (3.84), we have no convincing evidence to reject the null hypothesis. Thus the data are consistent with the null hypothesis that there is no difference between the current account holders and the savings account holders in the perceived satisfaction level.
