CLC number: TP301

On-line Access: 2018-08-06

Received: 2016-09-21

Revision Accepted: 2017-01-14

Crosschecked: 2018-06-15

ORCID: Divya Pandove, http://orcid.org/0000-0001-8694-1538

Frontiers of Information Technology & Electronic Engineering  2018 Vol.19 No.6 P.699-711

http://doi.org/10.1631/FITEE.1601549


An intuitive general rank-based correlation coefficient


Author(s):  Divya Pandove, Shivani Goel, Rinkle Rani

Affiliation(s):  Research Lab, Computer Science and Engineering Department, Thapar University, Patiala 147004, India

Corresponding email(s):   dpandove@gmail.com, shigo108@yahoo.co.in, raggarwal@thapar.edu

Key Words:  General rank-based correlation coefficient, Multivariate analysis, Predictive metric, Spearman’s rank correlation coefficient


Divya Pandove, Shivani Goel, Rinkle Rani. An intuitive general rank-based correlation coefficient[J]. Frontiers of Information Technology & Electronic Engineering, 2018, 19(6): 699-711.

@article{Pandove2018,
title="An intuitive general rank-based correlation coefficient",
author="Divya Pandove and Shivani Goel and Rinkle Rani",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="19",
number="6",
pages="699-711",
year="2018",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1601549"
}

%0 Journal Article
%T An intuitive general rank-based correlation coefficient
%A Divya Pandove
%A Shivani Goel
%A Rinkle Rani
%J Frontiers of Information Technology & Electronic Engineering
%V 19
%N 6
%P 699-711
%@ 2095-9184
%D 2018
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1601549

TY  - JOUR
T1  - An intuitive general rank-based correlation coefficient
A1  - Divya Pandove
A1  - Shivani Goel
A1  - Rinkle Rani
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 19
IS  - 6
SP  - 699
EP  - 711
SN  - 2095-9184
Y1  - 2018
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.1601549
ER  -


Abstract:
Correlation analysis is an effective mechanism for studying patterns in data and making predictions. Many interesting discoveries have been made by finding correlations in seemingly unrelated data. We propose an algorithm that quantifies correlation and yields an intuitive, more accurate correlation coefficient. The resulting predictive metric, the general rank-based correlation coefficient, measures the correlation between paired values. It fulfills the five basic criteria of a predictive metric: independence from sample size, a value between −1 and 1, measurement of the degree of monotonicity, insensitivity to outliers, and intuitive demonstration. The metric has been validated by experiments on a real-time dataset and by random number simulations, and mathematical derivations of the proposed equations are provided. We compare it with Spearman’s rank correlation coefficient; the results show that the proposed metric fares better than Spearman’s coefficient on all the predictive metric criteria.
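For orientation, the baseline named in the abstract, Spearman’s rank correlation coefficient, can be sketched in a few lines of Python. This is only an illustration of the existing metric, not the proposed general rank-based coefficient, whose equations are not given on this page. The function names `average_ranks` and `spearman_rho` are hypothetical, and the sketch assumes the common convention of assigning tied values the mean of their rank positions.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's coefficient: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# A monotone but nonlinear relation with an extreme last value.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 1000]
rho = spearman_rho(x, y)
```

Because only the ranks enter the computation, the extreme value 1000 in `y` does not change the result, and the coefficient always lies between −1 and 1. These are two of the five criteria listed in the abstract.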

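An intuitive general rank correlation coefficient

Summary: Correlation analysis is an effective mechanism for studying data patterns and making predictions. Establishing correlations in seemingly unrelated data can yield many interesting findings. An algorithm is proposed to quantify the theory of correlation and produce an intuitive, more accurate correlation coefficient. To compute the correlation between paired values, a predictive metric called the general rank correlation coefficient is proposed. It satisfies the five basic criteria of a predictive metric: independence from sample size, a value between −1 and 1, measurement of the degree of monotonicity, insensitivity to outliers, and intuitive demonstration. In addition, the metric is validated using a real-time dataset and random number simulation experiments. The mathematical derivations of the proposed equations are also presented, and the metric is compared with Spearman’s rank correlation coefficient. The results show that the metric outperforms the existing metric on all predictive metric criteria.

Keywords: General rank correlation coefficient; Multivariate analysis; Predictive metric; Spearman’s rank correlation coefficient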
