Search Sciweavers | Sciweavers

82 search results - page 11 / 17

» Balancing Exploration and Exploitation in Learning to Rank O...

ML 2002, ACM

133 views, Machine Learning

Finite-time Analysis of the Multiarmed Bandit Problem

15 years 5 months ago

Download homes.dsi.unimi.it

Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the search for a balance between exploring the environment to find profitable actions while t...

Peter Auer, Nicolò Cesa-Bianchi, Paul Fisch...

KDD 2007, ACM

178 views, Data Mining

Practical learning from one-sided feedback

16 years 6 months ago

Download www.eecs.tufts.edu

In many data mining applications, online labeling feedback is only available for examples which were predicted to belong to the positive class. Such applications include spam filt...

D. Sculley

ATAL 2010, Springer

152 views, Intelligent Agents

Learning context conditions for BDI plan selection

15 years 7 months ago

Download www.cs.rmit.edu.au

An important drawback to the popular Belief, Desire, and Intentions (BDI) paradigm is that such systems include no element of learning from experience. In particular, the so-calle...

Dhirendra Singh, Sebastian Sardiña, Lin Pad...

RAS 2010

117 views

Extending BDI plan selection to incorporate learning from experience

15 years 4 months ago

Download goanna.cs.rmit.edu.au

An important drawback to the popular Belief, Desire, and Intentions (BDI) paradigm is that such systems include no element of learning from experience. We describe a novel BDI exe...

Dhirendra Singh, Sebastian Sardiña, Lin Pad...

CORR 2004, Springer

103 views, Education

Online convex optimization in the bandit setting: gradient descent without a gradient

15 years 5 months ago

Download www.cs.cmu.edu

We study a general online convex optimization problem. We have a convex set S and an unknown sequence of cost functions c1, c2, ..., and in each period, we choose a feasible po...

Abraham Flaxman, Adam Tauman Kalai, H. Brendan McM...
