We consider the problem of learning a sparse multi-task regression, where the structure in the outputs can be represented as a tree with leaf nodes as outputs and internal nodes a...
This work provides a framework for learning sequential attention in real-world visual object recognition, using an architecture of three processing stages. The first stage rejects...
We introduce a model class for statistical learning which is based on mixtures of propositional rules. In our mixture model, the weight of a rule is not uniform over the entire ins...
The asymptotic behavior of stochastic gradient algorithms is studied. Relying on some results of differential geometry (Lojasiewicz gradient inequality), the almost sure pointconve...
The performance of machine learning algorithms can be improved by combining the output of different systems. In this paper we apply this idea to the recognition of noun phrases. W...