Sciweavers

» Marking Text Documents
EMNLP
2008
15 years 8 months ago
HTM: A Topic Model for Hypertexts
Previously, topic models such as PLSI (Probabilistic Latent Semantic Indexing) and LDA (Latent Dirichlet Allocation) were developed for modeling the contents of plain texts. Recent...
Congkai Sun, Bin Gao, Zhenfu Cao, Hang Li
RIAO
2007
15 years 8 months ago
XML Fragments Extended with Database Operators
XML documents represent a middle range between unstructured data such as textual documents and fully structured data encoded in databases. Typically, information retrieval techniq...
Yosi Mass, Dafna Sheinwald, Benjamin Sznajder, Siv...
TREC
2007
15 years 7 months ago
Overview of the TREC 2007 Question Answering Track
The TREC 2007 question answering (QA) track contained two tasks: the main task consisting of a series of factoid, list, and “Other” questions organized around a set of targets, ...
Hoa Trang Dang, Diane Kelly, Jimmy J. Lin
CLEF
2010
Springer
15 years 7 months ago
A Textual-Based Similarity Approach for Efficient and Scalable External Plagiarism Analysis - Lab Report for PAN at CLEF 2010
In this paper we present an approach to detect external plagiarism based on textual similarity. This is an efficient and precise method that can be applied over large sets of docum...
Daniel Micol, Óscar Ferrández, Ferna...
DOCENG
2010
ACM
15 years 5 months ago
From templates to schemas: bridging the gap between free editing and safe data processing
In this paper we present tools that provide an easy way to edit XML content directly on the web, with the usual benefit of valid XML content. These tools make it possible to crea...
Vincent Quint, Cécile Roisin, Stépha...