Discovering and Mining Links for Protein Databases

Authors

  • Mercy AI Dept. of Computer Applications, MPC, Thanjavur
  • Padmavathi S Dept. of Computer Science and Information Technology, MPC, Thanjavur, India

Keywords:

Link analysis, Markov chain, correspondence analysis, graph mining, principal component analysis

Abstract

This work introduces a link analysis procedure, a generalization of simple correspondence analysis, for discovering relationships in a protein database or, more generally, in any relational database. The procedure starts by extracting the links between the protein database of interest and a database of malfunctioning proteins. The datasets are used for training in order to identify missing interactions and the sequences related to them. The links are then analysed by performing a random walk over them, which defines a Markov chain. Stochastic complementation restricts this chain to the elements of interest, yielding a reduced Markov chain. The reduced chain is in turn analysed by projecting the elements of interest onto a low-dimensional space with principal component analysis. Several protein datasets are analysed with the proposed methodology, showing the usefulness of the technique for extracting relationships from relational databases or graphs.
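The three steps named in the abstract (random walk, stochastic complementation, principal component analysis) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy graph, the function names, and the choice to run PCA directly on the rows of the reduced transition matrix are all hypothetical, since the abstract does not give these details.

```python
import numpy as np

def transition_matrix(adjacency):
    """Row-normalise an adjacency matrix into random-walk transition probabilities."""
    A = np.asarray(adjacency, dtype=float)
    return A / A.sum(axis=1, keepdims=True)

def stochastic_complement(P, keep):
    """Reduce the chain to the states in `keep`: P11 + P12 (I - P22)^-1 P21."""
    keep = np.asarray(keep)
    drop = np.setdiff1d(np.arange(P.shape[0]), keep)
    P11 = P[np.ix_(keep, keep)]
    P12 = P[np.ix_(keep, drop)]
    P21 = P[np.ix_(drop, keep)]
    P22 = P[np.ix_(drop, drop)]
    # Solve (I - P22) X = P21 rather than forming the inverse explicitly.
    return P11 + P12 @ np.linalg.solve(np.eye(len(drop)) - P22, P21)

def pca_project(X, n_components=2):
    """Project the rows of X onto their first principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# Hypothetical toy graph: nodes 0-3 stand for proteins (elements of interest),
# nodes 4-5 stand for linking records that connect them.
adjacency = np.array([
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
])

P = transition_matrix(adjacency)                       # full Markov chain
P_reduced = stochastic_complement(P, [0, 1, 2, 3])     # chain on proteins only
coords = pca_project(P_reduced, n_components=2)        # 2-D map of the proteins
```

The reduced matrix is again row-stochastic, so it can be read as a random walk that jumps directly between the elements of interest. Proteins with similar rows in the reduced matrix end up close together in the projection.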

Published

2014-01-30

How to Cite

Mercy, A., & Padmavathi, S. (2014). Discovering and Mining Links for Protein Databases. COMPUSOFT: An International Journal of Advanced Computer Technology, 3(01), 467–472. Retrieved from https://ijact.in/index.php/j/article/view/82

Section

Original Research Article
