Ordinal Scale and Hypothesis Testing

Ordinal Scale and Nonparametric Methods

preamble:

Ordinal scales involve ranking. The use of numbers in an ordinal scale implies a greater-than or less-than relationship; it does not imply how much greater or how much less. In other words, in any ordinal scale, objects are ranked but the distance between objects cannot be measured. An ordinal scale need not use numbers: verbal categories such as "Low," "Medium," and "High" constitute an ordinal scale just as well.

Ordinal scales are frequently encountered in research studies: measurement of attitudes, consumer tastes and preferences, and ranking of attributes are all very prevalent. The usual parametric tests are not appropriate for such data for two reasons. First, they assume measurement on an interval or ratio scale. Second, they assume that the samples are drawn from a population with a known distribution such as the normal distribution. You therefore need hypothesis testing procedures designed for ordinal scales. These fall under a set of elegant nonparametric methods.

This article gives an illustrative account of nonparametric methods used with ordinal scales of measurement. The coverage is by no means exhaustive; however, typical situations are discussed to show how useful these tests are.

Kolmogorov-Smirnov Test:

The Kolmogorov-Smirnov test is a test of goodness of fit for the univariate case when the scale of measurement is ordinal. It is similar to the chi-square test of goodness of fit in that it also examines whether the observed frequencies are in accordance with the frequencies expected under a well-defined null hypothesis. The chi-square test, however, requires only nominal measurement and so ignores the ordering of the categories. The Kolmogorov-Smirnov test is more powerful than the chi-square test when ordinal data are encountered in a decision problem. The concluding remarks list the advantages of using the Kolmogorov-Smirnov test over the chi-square test. To understand how this test works in practice, let us take an example.

Example:

A manufacturing company producing decorative paints is interested in knowing whether the consumers have distinct preferences for different shades in the context of a new decorative paint that it proposes to market. If the consumers have special preference for any particular shade, then the company would market only that shade. Else, it would plan to market all the shades. A sample of 150 consumers was interviewed and the data collected on shade preferences are given in the table below:

Table showing shade preferences

Shade	Number of Consumers preferring
Very Light	25
Light	35
Medium	55
Dark	20
Very Dark	15

What are your conclusions?

Analysis and Interpretations:

The test compares the cumulative distribution function expected under the null hypothesis with the observed cumulative distribution function. If we designate Fo(X) as the expected cumulative distribution function and Sn(X) as the observed cumulative distribution function, where n is the sample size, the Kolmogorov-Smirnov D is calculated as D = Max |Fo(X)-Sn(X)|. That is, D is the largest absolute difference between the expected cumulative proportion and the observed cumulative proportion. The following table shows the necessary calculations.

Table 1: Basic Calculations for the Example

Shade	Observed Frequency	Observed Proportion	Observed Cumulative Proportion Sn(X)	Expected Proportion	Expected Cumulative Proportion Fo(X)	|Fo(X)-Sn(X)|
Very Light	25	0.1667	0.1667	0.2000	0.2000	0.0333
Light	35	0.2333	0.4000	0.2000	0.4000	0.0000
Medium	55	0.3667	0.7667	0.2000	0.6000	0.1667
Dark	20	0.1333	0.9000	0.2000	0.8000	0.1000
Very Dark	15	0.1000	1.0000	0.2000	1.0000	0.0000
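The columns of this table can be reproduced with a few lines of Python. The following is a minimal sketch using only the observed counts from the example; the variable names are illustrative and not part of the original article.

```python
from itertools import accumulate

# Observed shade preferences from the example, ordered from lightest to darkest.
observed = {"Very Light": 25, "Light": 35, "Medium": 55, "Dark": 20, "Very Dark": 15}

n = sum(observed.values())   # sample size
k = len(observed)            # number of ordered categories

# Observed cumulative proportions Sn(X).
sn = [count / n for count in accumulate(observed.values())]

# Expected cumulative proportions Fo(X) when every shade is equally preferred.
fo = [(i + 1) / k for i in range(k)]

# Absolute differences per shade and the Kolmogorov-Smirnov D (their maximum).
diffs = [abs(f - s) for f, s in zip(fo, sn)]
d_stat = max(diffs)

for shade, s, f, d in zip(observed, sn, fo, diffs):
    print(f"{shade:<10} Sn={s:.4f}  Fo={f:.4f}  |Fo-Sn|={d:.4f}")
print(f"D = {d_stat:.4f}")
```

Because the expected proportions are derived from the number of categories, the same sketch works for any set of ordered categories under an equal-preference null hypothesis.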

The null hypothesis is that all shades are equally preferred.

The alternative hypothesis is that they are not equally preferred.

Computed D = Max |Fo(X)-Sn(X)| = 0.1667. For large samples, the critical D value for a level of significance of 5% is given by

$$D_{\text{critical}} = \frac{1.36}{\sqrt{n}}$$

Substituting n = 150 in this expression, you get critical D = 0.1110. Since the calculated D (0.1667) exceeds the critical D (0.1110), reject the null hypothesis at the 5% level. The conclusion is that not all shades are equally preferred. The largest deviation occurs at the medium shade, which points to a marked preference for it.
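The decision step can be sketched in Python as follows. The sketch assumes the common large-sample approximation 1.36 / sqrt(n) for the 5% critical value, which reproduces the 0.1110 quoted above for n = 150; the function names are hypothetical and chosen only for illustration.

```python
import math

def ks_critical_value_5pct(n: int) -> float:
    """Large-sample 5% critical value for the one-sample K-S test (assumed: 1.36 / sqrt(n))."""
    return 1.36 / math.sqrt(n)

def reject_null(d_stat: float, n: int) -> bool:
    """Reject the null hypothesis when the computed D exceeds the critical D."""
    return d_stat > ks_critical_value_5pct(n)

d_crit = ks_critical_value_5pct(150)
print(f"critical D = {d_crit:.4f}")
print("reject null hypothesis:", reject_null(0.1667, 150))
```

For small samples the critical value should be read from a Kolmogorov-Smirnov table rather than computed from this approximation.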

Concluding Remarks on Kolmogorov-Smirnov Test

You could very well have used the chi-square test of goodness of fit, instead of the Kolmogorov-Smirnov test, to test the hypothesis of equal preference for all shades in this example. When the data are measured on an ordinal scale, however, the Kolmogorov-Smirnov test is more powerful than the chi-square test and has further practical advantages:

It is easier to compute
It does not have the problem of a minimum expected frequency in each cell that the chi-square test has
For very small samples, the chi-square test cannot be used, but the Kolmogorov-Smirnov test can be used
When samples are small and there are categories to be combined for proper use, the chi-square test is definitely less powerful than the Kolmogorov-Smirnov test.
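For comparison, the chi-square goodness-of-fit statistic for the same shade data can be computed as below. This is only an illustrative sketch: the article does not report a chi-square result for this example, and the critical value 9.488 is the standard tabulated chi-square value for 4 degrees of freedom at the 5% level.

```python
# Observed counts for the five shades in the paint example.
observed = [25, 35, 55, 20, 15]

n = sum(observed)
expected = n / len(observed)   # equal preference gives the same expected count per shade

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E over all categories.
chi_sq = sum((o - expected) ** 2 / expected for o in observed)

df = len(observed) - 1         # degrees of freedom
critical_5pct = 9.488          # tabulated chi-square critical value, 4 d.f., 5% level

print(f"chi-square = {chi_sq:.2f} with {df} d.f.")
print("reject null hypothesis:", chi_sq > critical_5pct)
```

Note that this statistic treats the shades as unordered labels; shuffling the order of the categories leaves it unchanged, whereas the Kolmogorov-Smirnov D depends on the ordering.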

Median Test:

The median test is used for testing whether two groups differ in their median value. In simple terms, the median test asks whether the two groups come from populations with the same median. The test requires that the measurement scale be at least ordinal and that the samples be independent (not necessarily of the same size). The null hypothesis is that the two populations have the same median. Let us take an example to appreciate how this test is useful in a typical practical situation.

Example: A private bank is interested in finding out whether the customers belonging to two groups differ in their satisfaction level. The two groups are current account holders and savings account holders. A random sample of 20 customers from each group was interviewed regarding their perceptions of the bank's service quality using Likert-type (ordinal scale) statements. A score of "1" represents very dissatisfied and a score of "5" represents very satisfied. The aggregate scores for each respondent in each group are tabulated below:

Current Account	Savings Account
79	85
86	80
40	50
50	55
75	65
38	50
70	63
73	75
50	55
40	45
20	30
80	85
55	65
61	80
50	55
80	75
60	65
30	50
70	75
50	62

What are your conclusions regarding the satisfaction level of these two groups?

Analysis and Interpretations:

The first task in the median test is to obtain the grand median. Arrange the combined data of both groups in descending order of magnitude, that is, rank them from the highest to the lowest, and select the middlemost observation in the ranked data. In this case the combined sample has 40 observations, so the median is the average of the 20th and 21st observations in the array arranged in descending order of magnitude.
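The grand median can be checked with a short Python sketch. The two lists hold the scores from the example; the variable names are illustrative only.

```python
from statistics import median

# Aggregate satisfaction scores from the example (20 respondents per group).
current = [79, 86, 40, 50, 75, 38, 70, 73, 50, 40,
           20, 80, 55, 61, 50, 80, 60, 30, 70, 50]
savings = [85, 80, 50, 55, 65, 50, 63, 75, 55, 45,
           30, 85, 65, 80, 55, 75, 65, 50, 75, 62]

# Combine both groups and arrange in descending order of magnitude.
combined = sorted(current + savings, reverse=True)

# With 40 observations, the median is the average of the 20th and 21st values.
twentieth, twenty_first = combined[19], combined[20]
grand_median = (twentieth + twenty_first) / 2

print(twentieth, twenty_first, grand_median)

# The standard library gives the same value without manual indexing.
assert grand_median == median(combined)
```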

Table showing descending order of aggregate score and rank in the combined sample

Descending Order	Rank	Descending Order	Rank
86	1	61	21
85	2.5	60	22
85	2.5	55	24.5
80	5.5	55	24.5
80	5.5	55	24.5
80	5.5	55	24.5
80	5.5	50	30
79	8	50	30
75	10.5	50	30
75	10.5	50	30
75	10.5	50	30
75	10.5	50	30
73	13	50	30
70	14.5	45	34
70	14.5	40	35.5
65	17	40	35.5
65	17	38	37
65	17	30	38.5
63	19	30	38.5
62	20	20	40
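The ranks in this table give tied scores the average of the positions they occupy in the descending array. A minimal sketch of that rule is shown below; the helper name `average_ranks_descending` is hypothetical and introduced only for illustration.

```python
def average_ranks_descending(scores):
    """Map each distinct score to its rank, with rank 1 for the highest score.

    Tied scores share the average of the positions they occupy.
    """
    ordered = sorted(scores, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Advance j past the run of values tied with ordered[i].
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # The run occupies positions i+1 .. j; assign their average.
        ranks[ordered[i]] = (i + 1 + j) / 2
        i = j
    return ranks

current = [79, 86, 40, 50, 75, 38, 70, 73, 50, 40,
           20, 80, 55, 61, 50, 80, 60, 30, 70, 50]
savings = [85, 80, 50, 55, 65, 50, 63, 75, 55, 45,
           30, 85, 65, 80, 55, 75, 65, 50, 75, 62]

ranks = average_ranks_descending(current + savings)
for score in sorted(ranks, reverse=True):
    print(score, ranks[score])
```

The ranks are not needed for the median test itself, which uses only counts above and below the grand median, but they make it easy to locate the 20th and 21st observations.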

The grand median is the average of the 20th and 21st observations = (62+61)/2 = 61.5. Please note that in the above table, the average rank is assigned whenever scores are tied. The next step is to prepare a contingency table of two rows and two columns. The cells represent the number of observations that are above and below the grand median in each group. Whenever some observations coincide with the grand median, the accepted practice is to first count the observations that are strictly above the grand median and put the rest under "below grand median." In other words, "below grand median" in such cases includes observations less than or equal to the grand median.

Scores of Current Account Holders and Savings Account Holders as compared with Grand Median

	Current Account Holders	Savings Account Holders	Marginal Total
Above Grand Median	8(a)	12(b)	20(a+b)
Below Grand Median	12(c)	8(d)	20(c+d)
Marginal Total	20(a+c)	20(b+d)	40(a+b+c+d) = n
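The four cell counts can be reproduced by counting the scores strictly above the grand median in each group and assigning the rest to the "below" row. The following sketch uses the example data and the grand median of 61.5; the names a, b, c and d follow the table.

```python
current = [79, 86, 40, 50, 75, 38, 70, 73, 50, 40,
           20, 80, 55, 61, 50, 80, 60, 30, 70, 50]
savings = [85, 80, 50, 55, 65, 50, 63, 75, 55, 45,
           30, 85, 65, 80, 55, 75, 65, 50, 75, 62]

grand_median = 61.5

# Strictly above the grand median.
a = sum(score > grand_median for score in current)
b = sum(score > grand_median for score in savings)

# Everything else (less than or equal to the grand median) counts as below.
c = len(current) - a
d = len(savings) - b

print("Above grand median:", a, b)
print("Below grand median:", c, d)
```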

Null Hypothesis: There is no difference between the current account holders and savings account holders in the perceived satisfaction level.

Alternative Hypothesis: There is a difference between the current account holders and savings account holders in the perceived satisfaction level.

The test statistic to be used is given by

$$\chi^2 = \frac{n\left(|ad - bc| - \frac{n}{2}\right)^2}{(a+b)(c+d)(a+c)(b+d)}$$

This chi-square statistic is the one we would have obtained in a contingency table with nominal data, except for the term n/2 used in the numerator as a correction for continuity. The correction is needed because a continuous distribution is used to approximate a discrete distribution.

On substituting the values of a, b, c, d, and n, we have

$$\chi^2 = \frac{40\left(|8 \times 8 - 12 \times 12| - 20\right)^2}{20 \times 20 \times 20 \times 20} = \frac{40 \times 60^2}{160000} = 0.90$$

Critical chi-square for 1 d.f. at the 5% level of significance = 3.84. Since the computed chi-square (0.90) is less than the critical chi-square (3.84), we have no convincing evidence to reject the null hypothesis. Thus the data are consistent with the null hypothesis that there is no difference between the current account holders and savings account holders in the perceived satisfaction level.
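The computed value of 0.90 can be verified with a small Python sketch. It assumes the usual continuity-corrected chi-square for a 2 x 2 table, in which n/2 is subtracted from |ad - bc| before squaring; that form reproduces the reported 0.90. The function name is hypothetical.

```python
def median_test_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Continuity-corrected chi-square for a 2 x 2 median-test table.

    a, b are the counts above the grand median; c, d are the counts below.
    """
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

chi_sq = median_test_chi_square(8, 12, 12, 8)
critical_5pct = 3.84   # chi-square critical value for 1 d.f. at the 5% level

print(f"chi-square = {chi_sq:.2f}")
print("reject null hypothesis:", chi_sq > critical_5pct)
```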