Semantic information helps in identifying the context of a document. It will be interesting to find out how effectively this information can be used in recommending related docume...
We present two machine learning approaches to information extraction from semi-structured documents that can be used if no annotated training data are available, but there does ex...
In this paper we briefly describe a new conceptual model for XML called XSEM. It is a combination of several approaches in the area. It divides the conceptual modeling process into c...
Standard Machine Learning approaches to text classification use the bag-of-words representation of documents to derive the classification target function. Typical linguistic stru...
This paper presents a series of tools for the extraction of specialized corpora from the web and their subsequent analysis, mainly with statistical techniques. It is an integrated sy...