Cluster based natural language question answering for e-resources in a library

dc.contributor.authorJahan, M. A. C. Akmal
dc.contributor.authorJiffriya, M. A. C.
dc.contributor.authorRagel, Roshan G.
dc.contributor.authorDeegala, Sampath
dc.date.accessioned2024-07-29T10:55:49Z
dc.date.available2024-07-29T10:55:49Z
dc.date.issued2013-07-04
dc.description.abstractProcessing questions and question answering (QA) based on a pool of text documents is an information retrieval mechanism which responds to queries in natural language by returning brief answers. QA involves (1) understanding user questions which are asked in a natural language through a Natural Language Processing technique and (2) categorizing the documents based on data mining for fast and real time behaviour of the system. The work presented in the abstract targets the second issue for an e-library system of a university. In a library, there are a large number of resources in electronic format similar toprinted material such as e-books, e-journals, etc. Most of the users do not know the mechanism of selecting the appropriate e-resource for their reference. When a user wants a brief answer for a question or most appropriate source to extract the answer, he should search the entire source documents in a particular category. Such an approach will incur a significant amount of time mostly failing the real time requirements and sometimes the required answer may not be available in the documents. The problem of unacceptable time delay may be overcome by introducing an e-library management system in a university and accessing information by processing naturallanguage questions through the system, which stores all the electronic resources. This isachieved by narrowing down the search space by looking at the index or abstracts of the resources and looking for answers by matching required question with narrowed down resources. For this purpose we have applied an approach of cluster based document retrieval in which matching a query is performed against clusters of documents instead of individual documents. Therefore, we need to find an appropriate clustering algorithm for document clustering which is accurate and efficient for the problem domain. We have tested data from the given domain with K-Means, Hierarchical, EM and Cobweb clustering algorithms available in a well-known data mining tool named WEKA. Numbers of incorrectly clustered instances of the documents and the computational time for clustering have been measured for each algorithm. According to the results, K-Means algorithm showed 74% of accuracy with six clusters and performed faster than the other clustering algorithms. Since the K-Means algorithm shows higher performance on document clustering it was selected for constructing clusters for the e-resource pool
dc.identifier.citationPeradeniya University Research Sessions PURSE - 2012, Book of Abstracts, University of Peradeniya, Sri Lanka, Vol. 17, July. 4. 2012 pp. 150
dc.identifier.isbn9789555891646
dc.identifier.issn13914111
dc.identifier.urihttps://ir.lib.pdn.ac.lk/handle/20.500.14444/476
dc.language.isoen_US
dc.publisherThe University of Peradeniya
dc.subjectNatural language
dc.subjectE-resources
dc.subjectCluster
dc.subjectLibrary
dc.titleCluster based natural language question answering for e-resources in a library
dc.typeArticle

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
M.A.C.Akmal jahan.pdf
Size:
202.09 KB
Format:
Adobe Portable Document Format

License bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed to upon submission
Description:

Collections