MULTIMODAL CONTENT-BASED RECOMMENDER SYSTEM USING THREE-DIMENSION CONVOLUTION NEURAL NETWORK
Abstract
Research on Recommender Systems has grown tremendously over the past few years; however, work on how user emotions can serve as implicit feedback to supplement these systems remains sparse. Recommender Systems should take advantage of the high availability of digital data to collect input of various types that allows the system to improve its accuracy, whether implicitly or explicitly. In this study, a Multimodal Content-Based Recommender System for image recommendation is built on both Implicit and Explicit Feedback. To obtain the Implicit Feedback, a Three-Dimension Convolution Neural Network is constructed to predict whether the emotion shown on the user's face is positive or negative. The network mixes spatial and temporal data in its Three-Dimension Convolution in order to learn the transitions between consecutive frames. The network's predictions are used as Implicit Feedback for the recommendation algorithm. The Multimodal Recommender System is built by combining the outputs of two Content-Based Recommender Systems using a binary Logistic Regression algorithm. Each Content-Based Recommender System is built by training a Support Vector Machine classifier on the features of the item profile together with either Explicit or Implicit Feedback. The performance measures are computed from the predicted and ground-truth feedback. The results show that a Three-Dimension Convolution Neural Network contributes to Implicit Feedback prediction in the Recommender System. In addition, combining the results of two Recommender Systems that use different feedback techniques can enhance the performance of the proposed system.
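To illustrate how a Three-Dimension Convolution mixes spatial and temporal data across consecutive frames, the following is a minimal pure-Python sketch rather than the network used in this study. The function name `conv3d_valid`, the toy clip, and the temporal-difference kernel are all hypothetical and chosen only for illustration; a trained network would learn its kernel weights from data. As in most deep learning libraries, the operation shown is technically a cross-correlation (the kernel is not flipped).

```python
def conv3d_valid(clip, kernel):
    """Stride-1, no-padding 3D cross-correlation of a clip with one kernel.

    clip:   nested list indexed [frame][row][col]
    kernel: nested list indexed [dt][dy][dx]
    Both layouts are assumptions made for this sketch.
    """
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):          # slide along time
        plane = []
        for y in range(H - kh + 1):      # slide along image rows
            row = []
            for x in range(W - kw + 1):  # slide along image columns
                acc = 0.0
                # Each output value sums over a small space-time cuboid,
                # so it depends on several consecutive frames at once.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += kernel[dt][dy][dx] * clip[t + dt][y + dy][x + dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy clip: 3 frames of 2x2 pixels whose brightness rises by 1 per frame.
clip = [[[float(t)] * 2 for _ in range(2)] for t in range(3)]

# Hypothetical 2x1x1 kernel: next frame minus current frame.
temporal_diff = [[[-1.0]], [[1.0]]]

response = conv3d_valid(clip, temporal_diff)
print(response)
```

With this kernel the response is nonzero only where pixel values change between frames, which is the kind of transition cue a two-dimensional convolution applied to single frames cannot capture.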