Big Data Quality: Factors, Frameworks, and Challenges

  • Mohammad Abdallah Al-Zaytoonah University of Jordan http://orcid.org/0000-0002-3643-0104
  • Mohammad Muhairat Al-Zaytoonah University of Jordan
  • Ahmad Althunibat Al-Zaytoonah University of Jordan
  • Ayman Abdalla Al-Zaytoonah University of Jordan
Keywords: Big Data, Quality, Quality Dimension, Quality Factors, Quality Frameworks, Quality Challenges

Abstract

Big Data applications are widely used in many fields, including artificial intelligence, marketing, commercial applications, and health care, as the role of Big Data in the COVID-19 pandemic has shown. To ensure that Big Data applications are produced and used at a quality level acceptable to their consumers, it is important to have quality factors that these applications should satisfy and quality frameworks that apply and test those factors. However, the quality measurement process faces challenges that limit its applicability and trustworthiness. In this research, we list the quality factors, dimensions, and frameworks that are commonly used to measure Big Data quality. We also list the challenges that researchers and data scientists frequently face during the Big Data quality measurement process.

Author Biographies

Mohammad Abdallah, Al-Zaytoonah University of Jordan

Assistant Professor

Software Engineering Department

 

Mohammad Muhairat, Al-Zaytoonah University of Jordan

Associate Professor

Software Engineering Department

Ahmad Althunibat, Al-Zaytoonah University of Jordan

Associate Professor

Software Engineering Department

Ayman Abdalla, Al-Zaytoonah University of Jordan

Associate Professor

Computer Science Department

Published
2020-08-31
How to Cite
Abdallah, M., Muhairat, M., Althunibat, A., & Abdalla, A. (2020). Big Data Quality: Factors, Frameworks, and Challenges. COMPUSOFT: An International Journal of Advanced Computer Technology, 9(8), 3785-3790. Retrieved from https://ijact.in/index.php/ijact/article/view/1164