We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
If all features causing heterogeneity were observed, a mixture of experts approach (Jacobs et al., 1991) is likely to be superior to using a single model. When unobserved or very n...
In the single rent-to-buy decision problem, without a priori knowledge of the amount of time a resource will be used we need to decide when to buy the resource, given that we can ...
Logistic Regression is a well-known classification method that has been used widely in many applications of data mining, machine learning, computer vision, and bioinformatics. Spa...
Localization is an important and extensively studied problem in ad-hoc wireless sensor networks. Given the connectivity graph of the sensor nodes, along with additional local info...
Amitabh Basu, Jie Gao, Joseph S. B. Mitchell, Giri...