Confidence regions and other tools for an extension of correspondence analysis based on cumulative frequencies
Amenta, Pietro
Conceptualization
2020-01-01
Abstract
Over the past 50 years, correspondence analysis (CA) has increasingly been used by data analysts to examine the association structure of categorical variables that are cross-classified to form a contingency table. However, the literature has paid little attention to the case where the variables are ordinal. Indeed, Pearson’s chi-squared statistic X^2 can perform badly in studying the association between ordinal categorical variables (Agresti in An introduction to categorical data analysis, Wiley, Hoboken, 1996; Barlow et al. in Statistical inference under order restrictions, Wiley, New York, 1972). Taguchi’s (Nair in Technometrics 28(4):283–291, 1986; Nair in J Am Stat Assoc 82:283–291, 1987) and Hirotsu’s (Biometrika 73:165–173, 1986) statistics have been introduced in the literature as simple alternatives to Pearson’s index for contingency tables with ordered categorical variables. Taguchi’s statistic takes into account the presence of an ordinal categorical variable by considering the cumulative sum of the cell frequencies across the categories of that variable. An extension of correspondence analysis using a decomposition of Taguchi’s statistic has been introduced to accommodate this feature of the variables. This extension considers the impact of differences between adjacent ordered categories on the association between row and column categories. The main aim of this paper is to introduce a confidence region for each of the ordered categories so that one may determine the statistical significance of a category with respect to the null hypothesis of independence. We highlight that the construction of these confidence circles has not been considered in the literature for this approach to CA. We also introduce a suitable decomposition of Taguchi’s statistic to test the statistical significance of each column category.
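To make the role of the cumulative frequencies concrete, the following minimal sketch computes Taguchi's statistic in the form usually attributed to Nair (1986), assuming the columns hold the ordered categories and the rows the non-ordered ones. The function names `taguchi_statistic` and `pearson_collapsed` and the example counts are illustrative assumptions, not taken from the paper. The sketch also checks the known identity that Taguchi's statistic equals the sum of the Pearson chi-squared statistics of the tables obtained by collapsing the ordered columns at each cut point.

```python
import numpy as np


def taguchi_statistic(table):
    """Taguchi's cumulative chi-squared statistic (illustrative sketch).

    Assumes rows are non-ordered categories, columns are ordered categories,
    and every cumulative column proportion lies strictly between 0 and 1.
    """
    n = np.asarray(table, dtype=float)
    total = n.sum()
    row_totals = n.sum(axis=1)
    # Cumulative counts across the ordered columns; the last cut is trivial.
    z = np.cumsum(n, axis=1)[:, :-1]
    # Cumulative column proportions d_s and weights 1 / (d_s * (1 - d_s)).
    d = z.sum(axis=0) / total
    weights = 1.0 / (d * (1.0 - d))
    # Deviation of each row's cumulative proportion from the overall one.
    dev = z / row_totals[:, None] - d
    return float(np.sum(weights * row_totals[:, None] * dev**2))


def pearson_collapsed(table, s):
    """Pearson's X^2 for the table collapsed into columns 1..s and s+1..J."""
    n = np.asarray(table, dtype=float)
    collapsed = np.column_stack([n[:, :s].sum(axis=1), n[:, s:].sum(axis=1)])
    expected = (
        np.outer(collapsed.sum(axis=1), collapsed.sum(axis=0)) / collapsed.sum()
    )
    return float(((collapsed - expected) ** 2 / expected).sum())


# Hypothetical counts: 3 non-ordered rows by 4 ordered columns.
example = np.array([[20, 15, 10, 5],
                    [10, 15, 15, 10],
                    [5, 10, 15, 20]])

t_value = taguchi_statistic(example)
sum_of_pearson = sum(
    pearson_collapsed(example, s) for s in range(1, example.shape[1])
)
print(t_value, sum_of_pearson)
```

Because every cut point between adjacent ordered categories contributes its own term, differences between neighbouring categories enter the statistic directly, which an ordinary Pearson statistic on the uncollapsed table does not capture.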