The ICR (note fabulous new branding) organised a half-day seminar on “GPU Acceleration in Bioinformatics”. NVIDIA were kind enough to sponsor the event, which meant flying speakers in from as far as Russia and Hong Kong. Thanks, NVIDIA!
All computers contain at least one processor, the CPU (central processing unit), but many also contain a specialised secondary processor geared toward doing the calculations necessary to produce graphics quickly, the GPU (graphics processing unit). Although these may be embedded on the motherboard of your PC (or even on the same die as the CPU in some cases), for serious amounts of graphical power you’ll need a separate device, a so-called graphics card.
The first question to address is why you would want to use a GPU at all. In desktop scenarios this is easy to answer, because many users have a GPU in their machine doing almost nothing most of the time. What if you could make use of all this power? This is what we call “hardware acceleration”. Microsoft recently posted a very informative article outlining how they’ve baked this kind of GPU utilisation right into Windows 8 to speed up everything from the obvious (3D game performance) to the seemingly mundane (text rendering). Even so, the majority of the power of a GPU is still largely unused in a desktop machine when not gaming.
However, the situation in computational biology is not so simple, as most of my compute is performed on Linux clusters, nowhere near a traditional desktop computer, with no GUI and therefore no apparent need for a GPU. At first glance, you might suspect that bioinformaticians just wanted an excuse to get their funders to furnish them with all-singing-all-dancing gaming rigs, but this isn’t the case at all. GPUs are exceptionally good at performing huge numbers of similar operations in parallel, and it just so happens that many of the problems we have in biology (a great example being sequence alignment to a reference genome) are exactly this sort of embarrassingly parallel problem: each read can be aligned without knowing anything about any other read.
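To make “embarrassingly parallel” a bit more concrete, here is a minimal Python sketch. It is not how BWA, SOAP3-dp or any real aligner works: it is a brute-force, ungapped comparison against a made-up toy reference, and the names best_hit, align_all, REFERENCE and READS are my own inventions for illustration. The point is only that each read is scored completely independently of the others, which is the property that lets the work be spread across thousands of GPU threads.

```python
from concurrent.futures import ThreadPoolExecutor


def best_hit(read, reference):
    """Return (offset, mismatches) for the best ungapped placement of read.

    Brute force: slide the read along the reference and count mismatches
    at every offset. Ties go to the leftmost offset.
    """
    best = None
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if best is None or mismatches < best[1]:
            best = (offset, mismatches)
    return best


def align_all(reads, reference, workers=4):
    """Score every read independently; no read needs another read's result."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda read: best_hit(read, reference), reads))


# Toy data, assumed purely for illustration.
REFERENCE = "ACGTACGTTAGCCGATAGGCTA"
READS = ["TAGCC", "GATAG", "ACGTT", "GGCTT"]

HITS = align_all(READS, REFERENCE)
for read, (offset, mismatches) in zip(READS, HITS):
    print(read, offset, mismatches)
```

The thread pool here is only a stand-in to show that the work splits cleanly; in plain Python it won’t actually make this faster. On a GPU, the equivalent of best_hit would be run by many threads at once, one per read (or per read/offset pair).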
Having established the potential need for GPUs, you need to choose a language: CUDA (NVIDIA specific) or OpenCL (an open standard). I’m not a computer scientist, but I am a gamer and I’ve noted that NVIDIA cards tend to be better (for my purposes) than ATi cards, though I’ve only switched to NVIDIA in the last few years. This was partly because I noticed that non-gaming applications I use such as Handbrake (a video transcoder) would be much faster using a CUDA-enabled card. You might count it as anecdotal evidence that CUDA is more widely supported and used when even consumers have heard of it. OpenCL offers hardware-agnostic portability and academics will often vouch for open solutions, but as I have a preference for NVIDIA cards, certain apps I like run in CUDA already and anything written in OpenCL would run on an NVIDIA card, whereas CUDA code won’t run on anyone else’s… the case for NVIDIA seems clear.
The talks themselves were very good, with highlights including:
SOAP3-dp. A new version of the SOAP aligner has just been released. It looks quite exciting as it now runs on GPUs, running “easy” alignments via two-way Burrows-Wheeler transforms (i.e. it works like BWA) on the GPU and leaving harder reads to the CPU, much like running Stampy in hybrid mode with BWA, which has been my preferred approach. I guess I need to run some comparisons. On the other hand, I guess we need to get our GPU server up and running first. Soon.
Coincidentally, the very next day I received an automated update from Gerton Lunter that a new version of Stampy had been released. It’s different in two main ways. Firstly, it’s now multithreaded so should be substantially faster. Secondly, it no longer runs concurrently with BWA; rather, you align first using BWA and then allow Stampy to realign any poorly aligned reads. You could run Stampy on any BAM file, so we might end up using a SOAP3-dp/Stampy hybrid. Who can say? I recently had a disagreement with someone who was using Novoalign, which I gather is much like running Stampy/BWA, but costs actual money. Proprietary hardware and even operating systems? Fine. Proprietary aligners? I think not, at least not without some serious empirical evidence.
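The “fast aligner first, careful aligner on the leftovers” idea can be sketched in a few lines of Python. To be clear, this is only my illustration of the general two-pass pattern, not Stampy’s actual logic: I don’t know what criteria it uses to decide that a read is poorly aligned, so the mapping-quality cut-off of 20, the function name split_by_mapq and the toy read list are all assumptions.

```python
def split_by_mapq(alignments, min_mapq=20):
    """Split first-pass alignments into those to keep and those to realign.

    alignments is an iterable of (read_name, mapping_quality) pairs.
    The min_mapq threshold is an arbitrary, assumed cut-off.
    """
    keep, realign = [], []
    for name, mapq in alignments:
        if mapq >= min_mapq:
            keep.append(name)
        else:
            realign.append(name)
    return keep, realign


# Made-up first-pass results standing in for the output of a fast aligner.
FIRST_PASS = [("read1", 60), ("read2", 3), ("read3", 37), ("read4", 0)]

KEEP, REALIGN = split_by_mapq(FIRST_PASS)
print("keep as-is:", KEEP)
print("hand to the slower aligner:", REALIGN)
```

The appeal of the pattern is that the slow, sensitive aligner only ever sees the small fraction of reads that the fast one struggled with.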
Unipro UGENE. All the way from Novosibirsk in Russia came Yuriy Vaskin to show off the UGENE software. I’d used it briefly before and have been generally impressed by the sheer number of features that they’ve packed into it – everything from (multiple) multiple alignment tools to a full workflow manager (think Taverna Workbench) and even the world’s only Windows-based implementation of the BWA aligner. I’m very biased toward the command-line and scripting while at work so I’m not sure there’s a place for a GUI-based program in my workflow, but I can certainly see the utility of such software, especially for groups without a dedicated bioinformatician. My interest is not particularly in using UGENE as a general workbench, but as an enhanced genome browser. I currently use IGV to view our BAM files; users can access data locked away on our Linux servers from their Windows (or Mac, or Linux if we had any of those) desktops because I followed the instructions here (http://www.broadinstitute.org/igv/DataServer); very handy indeed. Users are shielded from a crash-course in SSHing into a Linux server and the data is shielded from the users, who are only human (like me) and bound to accidentally delete or corrupt it eventually. However, sometimes IGV feels both a little ugly and a bit bare-bones. I guess a short attention-span and the urge for shiny novelty are common issues in the 21st century. In the age of buttery-smooth apps running at 60fps on our smartphones, Java Swing desktop apps do look a bit ugly, but it would be foolish to judge software on “shininess” over utility, particularly when IGV can be run on any platform with a Java Virtual Machine.
I was disappointed to discover that on my (fairly wimpy) Intel Core2 Duo workstation with 8GB of RAM, UGENE took several hours to index a 14GB BAM file, when I can view the same file near-instantly using IGV and a standard bai index file. Despite this, there are plenty of reasons to recommend UGENE; it’s not a great genome browser, but I feel it’s important to suffix that statement with the word “yet”.
Finally, there was some debate over why you should invest so much money in the enterprise-standard NVIDIA Tesla cards when you could feasibly use their much cheaper and consumer-focussed (i.e. gaming) GeForce cards. There are several good reasons. While consumer cards are indeed cheaper, this is because they are produced in enormous quantities with the aim of squeezing the maximum profit out of each one, meaning that they’re clocked at higher speeds, built from less reliable parts and may even have reduced feature/instruction sets. Even playing your favourite game for 30+ hours a week with all the dials turned to maximum does not compare to the stress a Tesla card might face when consistently hammered with jobs 24/7/365 in a server. It’s the same story with all computer hardware; server hard drives are more expensive because they’re more robust, and you are of course also paying for support which you just don’t receive or even want as a consumer. It’s also worth noting that consumer cards have a different form factor and might not fit into a server in the first place, or may block the airflow across the system in a catastrophic way. You might build your own compute farm from consumer cards if you were, say, running some sort of Bitcoin mining operation at home, but you’ll really need Tesla cards if you’re doing science or engineering. I suppose you could do some testing and local development using consumer cards, but when in full production you just need to spend the money. You do want an accurate, repeatable answer, don’t you?
Thanks to the speakers for making me think, to NVIDIA for sponsoring the event and to Igor Kozin from the ICR’s Scientific Computing Team for organising it. Maybe some rough results will show up on this blog at some point, once I start toying with GPU-accelerated aligners?