Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the search for a balance between exploring the environment to find profitable actions while t...
In many data mining applications, online labeling feedback is only available for examples which were predicted to belong to the positive class. Such applications include spam filt...
An important drawback to the popular Belief, Desire, and Intentions (BDI) paradigm is that such systems include no element of learning from experience. In particular, the so-calle...
An important drawback to the popular Belief, Desire, and Intentions (BDI) paradigm is that such systems include no element of learning from experience. We describe a novel BDI exe...
We study a general online convex optimization problem. We have a convex set S and an unknown sequence of cost functions c1, c2, . . . , and in each period, we choose a feasible po...
Abraham Flaxman, Adam Tauman Kalai, H. Brendan McM...