Multimodal content-based recommender system using three-dimension convolution neural network
Keywords:
Multimodal Recommender System, Content-Based Recommendation (CBR), Three-Dimension Convolution Neural Network (3D CNN), Support Vector Machine (SVM), Emotion, Implicit Feedbacks, Explicit Feedbacks
Abstract
Research on recommender systems has grown tremendously over the past few years; however, work on how user emotions can serve as implicit feedback to supplement these systems remains sparse. Recommender systems should take advantage of the wide availability of digital data to collect inputs of various types that improve accuracy, whether those inputs are gathered implicitly or explicitly. This study proposes a multimodal content-based recommender system for image recommendation that uses both implicit and explicit feedback. To obtain the implicit feedback, a three-dimensional convolutional neural network (3D CNN) is constructed to predict whether the emotion shown on the user's face is positive or negative. The 3D convolution combines spatial and temporal information so that the network can learn the transitions between consecutive frames. The network's predictions are used as implicit feedback for the recommendation algorithm. Each content-based recommender system is built by training a Support Vector Machine (SVM) classifier on item-profile features together with either explicit or implicit feedback. The multimodal recommender system is then built by combining the outputs of the two content-based recommender systems with a binary logistic regression algorithm. Performance measures are computed by comparing predicted feedback with ground-truth feedback. The results show that the 3D CNN contributes to implicit feedback prediction in the recommender system, and that combining the results of two recommender systems that use different feedback techniques can enhance the performance of the proposed system.
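To make concrete how a three-dimensional convolution mixes spatial and temporal data, the following is a minimal sketch in plain Python. It is not the network described in the abstract: the function name `conv3d_valid`, the toy clip, and the hand-picked temporal-difference kernel are all assumptions for illustration, whereas a real 3D CNN learns many such kernels from data. The sketch only shows that a kernel spanning two consecutive frames responds where the frames differ.

```python
def conv3d_valid(clip, kernel):
    """Slide a 3D kernel [kt][kh][kw] over a clip [T][H][W] (no padding, stride 1).

    As in most deep learning libraries, this is cross-correlation:
    the kernel is not flipped before it is applied.
    """
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over time (dt) as well as height (dy) and width (dx).
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy clip (hypothetical data): 3 frames of 3x3 pixels.
# The first two frames are identical; a bright pixel appears in the third.
static = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
changed = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
clip = [static, static, changed]

# Hand-picked kernel spanning 2 frames and 1x1 pixels: next frame minus current frame.
kernel = [[[-1.0]], [[1.0]]]

response = conv3d_valid(clip, kernel)
# response[0] compares frames 0 and 1, response[1] compares frames 1 and 2.
```

With this kernel the first output plane is all zeros because the first two frames are identical, while the second plane is non-zero only at the pixel that changed. A purely two-dimensional convolution applied frame by frame could not produce this kind of transition signal.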
License
Copyright (c) 2020 COMPUSOFT: An International Journal of Advanced Computer Technology
This work is licensed under a Creative Commons Attribution 4.0 International License.