An Efficient Annotation of Search Results Based on Feature Ranking Approach from Web Databases
Keywords:
web databases, data units, structured databases, semantic data, multi annotator method, schema value annotator, support vector machine, data alignment algorithm
Abstract
As the number of web databases grows, a major part of the deep web has become database-backed. For many search engines, the data encoded in the returned result pages comes from underlying structured databases, referred to as web databases (WDBs). A result page returned from a WDB contains multiple search result records (SRRs). Data units obtained from these databases are encoded into dynamically generated result pages intended for human reading. To make these data units machine-processable, the relevant information must be extracted and the data assigned meaningful labels. This paper proposes feature ranking to extract the relevant information from the features extracted from WDBs. Feature ranking helps in understanding the data and in identifying the relevant features. The study evaluates the feature ranking process using linear support vector machines on various features of WDBs for the annotation of relevant results. In the experiments, the proposed system gives better results than earlier methods.
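The idea of ranking features with a linear support vector machine can be illustrated with a minimal sketch. It is not the system evaluated in the paper: it assumes a toy data set, a plain subgradient-descent trainer for the L2-regularised hinge loss, and ranking by the absolute value of the learned weights. The feature names (`data_type`, `tag_path`, `adjacency`) and all parameter values are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: rank features by the magnitude of the weights
# learned by a linear SVM. Feature names and data are hypothetical.

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Batch subgradient descent on the L2-regularised hinge loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:  # point violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def rank_features(weights, names):
    """Sort features by absolute weight, largest first."""
    return sorted(zip(names, weights), key=lambda p: abs(p[1]), reverse=True)


def predict(w, b, x):
    """Return +1 or -1 according to the side of the hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical features of four data units; label +1 means "relevant".
feature_names = ["data_type", "tag_path", "adjacency"]
X = [
    [1.0, 0.6, -0.2],
    [1.0, -0.2, 0.2],
    [-1.0, 0.2, 0.2],
    [-1.0, -0.6, -0.2],
]
y = [1, 1, -1, -1]

w, b = train_linear_svm(X, y)
ranking = rank_features(w, feature_names)
for name, weight in ranking:
    print(f"{name}: {weight:+.3f}")
```

In this toy set the first feature separates the two classes on its own, so it receives the largest weight and appears first in the ranking. With real WDB features, the inputs would normally be scaled to comparable ranges before the weights are compared.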
License
Copyright (c) 2015 COMPUSOFT: An International Journal of Advanced Computer Technology
This work is licensed under a Creative Commons Attribution 4.0 International License.