Sciweavers

5448 search results - page 389 / 1090
» Breakpoints and Time in Distributed Computations
IPPS
2007
IEEE
16 years 1 month ago
A Fault Tolerance Protocol with Fast Fault Recovery
Fault tolerance is an important issue for large machines with tens or hundreds of thousands of processors. Checkpoint-based methods, currently used on most machines, roll back all ...
Sayantan Chakravorty, Laxmikant V. Kalé
PDCAT
2007
Springer
16 years 27 days ago
High Throughput Multi-port MT19937 Uniform Random Number Generator
There have been many previous attempts to accelerate MT19937 using FPGAs, but we believe that we can substantially improve the previous implementations to develop a higher throug...
Vinay Sriram, David Kearney
IPPS
2006
IEEE
16 years 24 days ago
A proactive fault-detection mechanism in large-scale cluster systems
To improve the whole dependability of large-scale cluster systems, an online fault detection mechanism is proposed in this paper. This mechanism can detect the fault in time befor...
Linping Wu, Dan Meng, Wen Gao, Jianfeng Zhan
IPPS
2003
IEEE
16 years 1 day ago
Master-slave Tasking on Heterogeneous Processors
In this paper, we consider the problem of scheduling independent identical tasks on heterogeneous processors where communication times and processing times are different. We assum...
Pierre-François Dutot
IPPS
1999
IEEE
15 years 11 months ago
Reducing Parallel Overheads Through Dynamic Serialization
If parallelism can be successfully exploited in a program, significant reductions in execution time can be achieved. However, if sections of the code are dominated by parallel ove...
Michael Voss, Rudolf Eigenmann