Tuesday, November 18, 2008

Adventures on the Floor

Today I visited the SiCortex booth and didn't really learn anything I didn't already know (low power, really great at "chatty" problems because everything is connected via a very fast backplane rather than a network). The guy knew about Phil Dickens [1] (they haven't been at this that long, so they don't have too many customers yet) and had a great quote from Phil that has been making its way around the guys at SiCortex: when asked what he needed to do to prep for his 648-processor SiCortex, he said, "Well, I might have swept the floor once." He also liked the fact that he could fit it in his lab without adding additional power or cooling, and have a grad student admin it without involving the IT department (don't worry Gregg, I won't put one in my office).




I also visited the NVIDIA booth and spoke with a CUDA expert, who confirmed what I had been suspecting once I thought this through a little better: the GPGPU may not be the right fit for my FFT-heavy program. The problem is that the program does a lot of small FFTs (it could be a bunch of 15x15 2D FFTs, definitely not the 128x128 or 256x256 FFTs that the NVIDIA CUDA expert says would be necessary to start seeing a benefit from offloading to the GPU). There is a latency cost to transfer the data to and from the GPU, so you want a larger FFT; otherwise you end up spending more time waiting for the data to move across the PCI bus than you save by offloading the FFT. The good news is that their CUDA-based FFT library is very similar to the FFTW library I am using, so it would require very few coding changes. He said it may be possible to batch multiple small FFTs together, but I don't know if that will fit in well with this code. Perhaps I can get someone to donate a Tesla-equipped workstation to do some testing on.
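
To picture why batching might help, here is a rough back-of-the-envelope sketch in Python. It only models the copy cost (a fixed per-transfer latency plus a size-dependent term) and ignores the FFT compute time entirely. The function names and the latency and bandwidth values are placeholders I made up for illustration, not measurements of any real card or bus.

```python
def transfer_time(n_bytes, latency_s, bandwidth_bytes_per_s):
    """Time for one host<->device copy: fixed latency plus a size-dependent term."""
    return latency_s + n_bytes / bandwidth_bytes_per_s


def overhead_per_fft(n, batch, latency_s, bandwidth_bytes_per_s, bytes_per_element=8):
    """Round-trip copy overhead per n x n FFT when `batch` of them move together.

    bytes_per_element=8 assumes single-precision complex data (two 4-byte floats).
    """
    n_bytes = batch * n * n * bytes_per_element
    # One copy to the GPU and one copy back, each paying the fixed latency once.
    round_trip = 2 * transfer_time(n_bytes, latency_s, bandwidth_bytes_per_s)
    return round_trip / batch


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the shape of the trade-off.
    LATENCY_S = 10e-6   # assumed fixed cost per transfer
    BANDWIDTH = 4e9     # assumed bytes per second across the bus

    for batch in (1, 10, 100, 1000):
        cost = overhead_per_fft(15, batch, LATENCY_S, BANDWIDTH)
        print(f"batch={batch:5d}  copy overhead per 15x15 FFT: {cost:.3e} s")
```

With one 15x15 FFT per transfer the fixed latency dominates the cost; as the batch grows, the per-FFT overhead falls toward the pure bandwidth term. Whether that is enough to beat the CPU still depends on the real hardware numbers and on whether my code can gather its small FFTs into batches at all.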




I also spent some time in a breakout room at the Cluster Resources booth with Chris Samuel of VPAC and Scott Jackson of CRI to discuss TORQUE. We agreed that getting job arrays finished and solid is a high-priority task for the coming year. Hopefully by SC in Portland job arrays will be in wide use among TORQUE users.
