Discovering and Mining Links for Protein Databases
Keywords:
Link analysis, Markov chain, correspondence analysis, graph mining, principal component analysis
Abstract
This work introduces a link analysis procedure for discovering relationships in a protein database, or more generally a relational database, that generalizes simple correspondence analysis. The procedure first extracts the links to the related protein database and to the database of malfunctioned proteins. The datasets are then used for training in order to find missing interactions and the sequences related to them. The link analysis proceeds by performing a random walk on the data, which defines a Markov chain. The elements of interest are isolated through stochastic complementation, which yields a reduced Markov chain containing only those elements. This reduced chain is then analysed by projecting the elements of interest with principal component analysis. Several protein datasets are analysed with the proposed methodology, showing the usefulness of the technique for extracting relationships in relational databases or graphs.
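The pipeline described in the abstract (random walk, stochastic complementation, projection) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy graph of four protein nodes and two annotation nodes, the function names, and the use of plain PCA on the rows of the reduced transition matrix, which may differ from the exact projection used in the paper. The stochastic complement is computed with the standard formula P11 + P12 (I - P22)^-1 P21.

```python
import numpy as np


def transition_matrix(adjacency):
    """Row-normalise a non-negative adjacency matrix into random-walk probabilities."""
    adjacency = np.asarray(adjacency, dtype=float)
    degrees = adjacency.sum(axis=1, keepdims=True)
    if np.any(degrees == 0):
        raise ValueError("every node needs at least one link")
    return adjacency / degrees


def stochastic_complement(P, keep):
    """Reduce the chain P to the states in `keep`: P11 + P12 (I - P22)^-1 P21."""
    keep = np.asarray(keep)
    drop = np.setdiff1d(np.arange(P.shape[0]), keep)
    P11 = P[np.ix_(keep, keep)]
    P12 = P[np.ix_(keep, drop)]
    P21 = P[np.ix_(drop, keep)]
    P22 = P[np.ix_(drop, drop)]
    # Solve (I - P22) X = P21 instead of forming the inverse explicitly.
    X = np.linalg.solve(np.eye(len(drop)) - P22, P21)
    return P11 + P12 @ X


def pca_project(X, n_components=2):
    """Project the rows of X onto their leading principal components."""
    centered = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:n_components].T


# Hypothetical toy graph: nodes 0-3 are proteins, nodes 4-5 are annotation records.
# Undirected links: 0-4, 1-4, 1-5, 2-5, 3-5.
edges = [(0, 4), (1, 4), (1, 5), (2, 5), (3, 5)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

P = transition_matrix(A)                               # random walk on the full graph
reduced = stochastic_complement(P, keep=[0, 1, 2, 3])  # chain on the proteins only
coords = pca_project(reduced, n_components=2)          # 2-D coordinates per protein

print(reduced)
print(coords)
```

In this sketch the proteins are the elements of interest, so the annotation nodes are eliminated by the complement; the reduced matrix remains row-stochastic, and proteins that reach the same annotations get similar rows and therefore nearby projected coordinates.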
License
Copyright (c) 2024 COMPUSOFT: An International Journal of Advanced Computer Technology
This work is licensed under a Creative Commons Attribution 4.0 International License.