Abstract: The thematic text segmentation task consists in identifying the most important thematic breaks in a document in order to cut it into homogeneous passages. We propose in t...
Sylvain Lamprier, Tassadit Amghar, Bernard Levrat,...
This paper examines IS higher education, concentrating on issues of ‘coherence’ in IS curricula. While curriculum coherence can be jeopardized by poor curriculum design, misal...
We develop a new component analysis framework, the Noisy-Or Component Analyzer (NOCA), that targets high-dimensional binary data. NOCA is a probabilistic latent variable model tha...
We present a kernel-based algorithm for hierarchical text classification where the documents are allowed to belong to more than one category at a time. The classification model is...
Rich Site Summary (RSS) technology is a web content syndication format commonly used to organise news and the content of news-like sites. Indeed, any information that can be broke...