Sciweavers

» Optimizing for parallelism and data locality
MM
2004
ACM
16 years 3 days ago
An online-optimized incremental learning framework for video semantic classification
This paper considers the problems of feature variation and concept uncertainty in typical learning-based video semantic classification schemes. We propose a new online semantic c...
Jun Wu, Xian-Sheng Hua, HongJiang Zhang, Bo Zhang
IPPS
2007
IEEE
16 years 29 days ago
C++ based System Synthesis of Real-Time Video Processing Systems targeting FPGA Implementation
Implementing real-time video processing systems places high demands on computation and memory performance. FPGAs have proven to be an effective implementation architecture for thes...
Najeem Lawal, Mattias O'Nils, Benny Thörnberg
EUROPAR
2007
Springer
16 years 25 days ago
Toward Scalable Matrix Multiply on Multithreaded Architectures
We show empirically that some of the issues that affected the design of linear algebra libraries for distributed memory architectures will also likely affect such libraries for s...
Bryan Marker, Field G. Van Zee, Kazushige Goto, Gr...
IPPS
2010
IEEE
15 years 4 months ago
A lock-free, cache-efficient multi-core synchronization mechanism for line-rate network traffic monitoring
Line-rate data traffic monitoring in high-speed networks is essential for network management. To satisfy the line-rate requirement, one can leverage multi-core architectures to par...
Patrick P. C. Lee, Tian Bu, Girish P. Chandranmeno...
INTENSIVE
2009
IEEE
16 years 1 months ago
Accelerating K-Means on the Graphics Processor via CUDA
In this paper an optimized k-means implementation on the graphics processing unit (GPU) is presented. NVIDIA’s Compute Unified Device Architecture (CUDA), available from the G8...
Mario Zechner, Michael Granitzer