High performance computing has been used in finance for years, even when high performance meant a big, complicated Cray supercomputer. Now that high performance is available on standalone workstations or commodity hardware clusters, financial services firms are turning to it for pricing, pre-trade risk analysis, global risk, and regulatory compliance.
The most important advance in the last year, or perhaps just the last six to nine months, is the mainstream use of general-purpose computing on graphics processing units (GPGPU). GPUs were designed for sophisticated gaming graphics but are now being applied to enterprise computing with huge increases in speed and sharp reductions in operating costs. They have changed the business for some companies and are rapidly attracting attention from users.
The Benfield unit of insurance giant Aon uses GPUs to run real-time hedging during the day for variable life annuities. Steve Betts, the global chief information officer for Aon, says the company had started working with IBM Cell computing blades, until IBM abandoned the Cell chip set.
After giving up on Cell and testing AMD and Intel processors, Aon chose IBM iDataPlex servers with Nvidia GPU cards. Betts says that the Nvidia GPU offered 1,000 to 3,000% better performance than Intel processors.
GPUs were known to be powerful but also considered difficult to program. However, in the last three years, Nvidia has invested heavily in software to program the processors and in redesigns of the hardware to make them easier for programmers. Betts says that C++ programmers can easily pick up the Nvidia CUDA programming architecture. "We have really not had challenges finding appropriate developers. Working in C++ and the CUDA extensions is a pretty widely available skill set." While GPUs are not a solution for all the computing tasks Aon faces, he expects they will continue to be an option for high end statistical analysis. The company is also looking at GPUs for catastrophe and other modeling. "You have to make sure you have the case to fit the technology," he says.
Information provider Bloomberg used Nvidia GPUs for a bond pricing service it launched a few years ago. The computation would have required 2,000 CPUs at a cost of $4 million, plus $1.2 million a year in power. Instead, the firm deployed the pricing engine with 48 GPUs costing $144,000 and consuming $31,000 in electricity a year, in 42 times less space.
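The scale of those savings is easier to see as ratios. The short sketch below uses only the figures quoted above; the variable names are illustrative, and the ratios and per-unit costs are simple quotients of those figures, not numbers reported by Bloomberg.

```python
# Figures quoted in the article for Bloomberg's bond pricing engine.
cpu_count = 2_000
cpu_hardware_usd = 4_000_000
cpu_power_usd_per_year = 1_200_000

gpu_count = 48
gpu_hardware_usd = 144_000
gpu_power_usd_per_year = 31_000

# Derived values (plain division of the quoted figures).
hardware_ratio = cpu_hardware_usd / gpu_hardware_usd
power_ratio = cpu_power_usd_per_year / gpu_power_usd_per_year
implied_cost_per_cpu = cpu_hardware_usd / cpu_count
implied_cost_per_gpu = gpu_hardware_usd / gpu_count

print(f"Hardware cost ratio: {hardware_ratio:.1f}x")
print(f"Annual power cost ratio: {power_ratio:.1f}x")
print(f"Implied cost per CPU: ${implied_cost_per_cpu:,.0f}")
print(f"Implied cost per GPU: ${implied_cost_per_gpu:,.0f}")
```

On these figures the GPU deployment costs roughly 28 times less to buy and roughly 39 times less to power each year, even though each GPU is priced higher than each CPU.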
Nvidia has focused on the high performance computing market in addition to its traditional graphics market for games, says Sumit Gupta, product lead for HPC at the company. Last year the company had $100 million in revenue from HPC, up from zero three years ago. The firm's supercomputing conference last year attracted 2,400 attendees and its GPUs are used in three of the top five supercomputers. He says the HPC business could be worth up to $5 billion a year for the company, more than its current annual revenue.
Gupta sees strong momentum for Nvidia in HPC because its CUDA architecture makes it possible for programmers to write for it in C, C++ and Fortran.
"A developer can take a laptop and start writing applications using the GPU; you don't have to buy a supercomputer to start programming our GPUs." Gupta says that GPU programming is now being taught in 400 universities around the world. That gives it a distinct lead over GPU efforts underway at Intel and AMD, he claims.
Andrew Sheppard, a consultant and trainer specialising in GPUs for financial services, began inviting people to GPU meet-ups in January. More than 700 people around the world have now signed up for the meet-ups, which are often held at Microsoft offices.
"Microsoft has offices pretty much everywhere, and they are pushing GPU actively," says Sheppard. The meet-ups usually have experts as speakers in informal sessions, often someone from Nvidia or Microsoft. Participants can learn about GPUs and share their ideas. Participation has grown because GPUs are a compelling technology, he adds.
"Rule one if you want to displace an existing technology: you have to be 10 times better or 10 times cheaper. GPUs are hundreds of times faster than CPUs, 10 times cheaper and require less power and space. Microsoft is pushing very hard with HPC and GPU for the computational speed and the huge amounts of data we have to process," says Sheppard.
Nvidia's software has made a real difference in programming, he added: "If you were doing this two years ago it was pretty painful, and more than two years ago you had to be deep into bits and bytes. The tools have become good; you can program in C and C++, debug, and develop in Microsoft Visual Studio - all this has come along in the last six to eight months."
Microsoft built the Nvidia compiler into Visual Studio because GPUs are a big business, used by the largest banks in the world, says Joe Pagano, who covers banking and capital markets for Microsoft. "We have put GPUs into a familiar development environment, Visual Studio."
Now more than 80% of the downloads from Nvidia are for Windows, added David Rich, a director of marketing at Microsoft: "We have all the driver support to run on servers and clusters." He also noted that progress in making Nvidia GPUs more widely usable has been recent and rapid. "The hardware got error correcting, IEEE floating point, OEM server platforms are built for it, a steady stream of libraries is being released and we offer Visual Studio support. Before you put anything in production, all these things have to be in place," he says.
IBM's Risk Management Solutions achieved a speed increase of more than 100 times with GPUs, he says, adding: "If you do that, everybody else has to prepare to do it too because they don't want to be left in the dust."
Pagano says the growth in high performance computing in finance comes in two main areas - risk management and predictive analytics. Some of the work which banks do cannot run in parallel, so clusters and the ability to expand into a cloud can provide the power needed. "If you are running an Excel workbook on your desktop and it isn't fast enough, you can send it into a cluster." Microsoft can support up to 256 processors, but many clusters in finance are 4 to 16 nodes, says Rich.
Robert Brinkman, a US IBM financial services executive, added that HPC covers two areas in finance - the work done on grids, which is mostly middle office, and low latency applications for trading. HPC requires more than fast processors - it places demands on the I/O system and on storage.
One solution is to run applications in memory, and IBM is ready to oblige. "A lot of firms want big machines with lots of memory, so they can keep the data off the disk. A CPU going to disk is hundreds or thousands of times slower than running in memory," says Brinkman.
IBM's Watson, which won the American game show Jeopardy!, had 3,000 cores and four terabytes of memory so it could answer questions fast. Brinkman says he is working with a firm which wants a petabyte of memory so it can do analytics pre- and post-trade.
IBM has a portfolio of products, including very fast servers and the fast Netezza data warehouse, to create a framework that financial institutions and ISVs can use to develop their solutions. The framework incorporates BNT 10 gigabit switches from Blade Network Technologies, another IBM acquisition; solid state drives; GPUs; and FPGAs (field-programmable gate arrays, in which the processor logic is itself reconfigurable to act as hard-wired circuitry). This summer it will announce a computer with 3,072 cores and up to 524,288 nodes capable of communicating in 200 nanoseconds.
Simon Garland, chief strategist at Kx Systems, which deals in real-time databases of tick data, says the drive for more memory is also coming to high end commodity boxes. Clients are buying machines with half a terabyte to two terabytes of memory.
"If you have something like that, you can do an awful lot in memory. You can load up years of data in memory and then just run from there," says Garland.
Tervela, a high speed messaging specialist, uses a little of everything to achieve extremely low latency data processing. Barry Thompson, founder and chief technology officer, divides HPC into three types of problems:
They all have strengths and weaknesses, he says. FPGAs, for instance, are fine for large amounts of data with minimal or no floating point operations. "The cons with FPGA are really long time to market, completely inflexible code and the people on Wall Street who can code on them are a pain to manage," he says.
JP Morgan in London uses a combination of FPGAs and Intel CPUs from Maxeler, a London-based HPC specialist, to speed up the processing, pricing, and risk calculations of credit derivatives by 30 times compared to an 8-core Intel processor.
James Spooner, vice president of acceleration at Maxeler, says the challenge in programming for parallel processing is you have to think about more than one thing at a time. "Everyone is trying to solve the same problem - how do you teach a global population of programmers used to a simple serial machine to program in a parallel way. That is a tricky thing for the computer science community to work out."
Tervela's Thompson says that GPUs are much easier to use than FPGAs - "they rock on floating point and are amazing at hyper parallelization" - but present challenges getting data in and out fast.
To address the I/O issues, GridIron Systems in California has developed a fast intelligent layer built on solid state devices with multi-level cache to provide a solution at a reasonable cost while reducing bandwidth demands. The system learns in real-time what the requirements are and can increase performance by 2-10 times.
"We are going after applications with a lot of data, a lot of growth, high access rates and concurrent use," says Dave Anderson from the office of the CTO at GridIron. "Concurrent use in particular is an area that hasn't been addressed by any of the other products out there. When you get multiple users trying to get at data off a disk at the same time, the inherent serial nature of a disk drive becomes a bottleneck." By contrast, SSDs are random access and have extremely high I/O rates and high bandwidth. That lets GridIron run in random access mode, which fixes the concurrency issue.
Kaminario, an Israeli company, has also addressed the I/O problem with fast DRAM SSD storage that is used in finance, telecommunications and health care.
Tervela's Thompson thinks GPUs are ready for prime time. "Any idiot can code to a GPU; they are designed for game programmers, after all," he says. "The interfaces are straightforward - really good technology. People talked about it five years ago, now the GPUs and the developer toolkits are powerful enough that the average developer can develop on top of them."
One of the first ways to deliver affordable high performance computing was with grids that linked dedicated servers together or scavenged idle desktops and servers and drew on their compute power to generate answers.
Jingwen Wang, vice president of products at Platform Computing, says grid is growing beyond its traditional role in capital markets to reach buy-side firms and retail banks. One driver of grid is the price: senior business managers want to get more value from their existing infrastructure and delay new server purchases for a year or two. One fund manager told Platform its Intel x86 servers were running at 10-20% utilisation when they were used for occasional Monte Carlo runs. By moving to grid, the firm could share the servers. It saw a 4-5 times gain in performance, cut bandwidth use to a tenth and hit 70-80% utilisation.
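Monte Carlo runs suit grids because the work splits into batches that do not depend on one another. The sketch below is a minimal illustration of that idea, not Platform's software or any firm's model: it prices a hypothetical European call option under textbook lognormal assumptions, with every parameter value, function name and seed chosen purely for the example. Each batch uses its own random stream, so a scheduler could hand each one to a different server; here the batches simply run one after another.

```python
import math
import random


def simulate_chunk(seed, n_paths, spot, strike, rate, vol, maturity):
    """Return the discounted sum of call payoffs for one independent batch."""
    rng = random.Random(seed)  # separate random stream per batch
    drift = (rate - 0.5 * vol ** 2) * maturity
    diffusion = vol * math.sqrt(maturity)
    total = 0.0
    for _ in range(n_paths):
        terminal = spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        total += max(terminal - strike, 0.0)
    return math.exp(-rate * maturity) * total


def price_call(n_chunks=8, paths_per_chunk=25_000):
    # Hypothetical inputs chosen only for illustration.
    params = dict(spot=100.0, strike=100.0, rate=0.05, vol=0.2, maturity=1.0)
    # Each batch needs only its seed and the shared parameters, so the
    # batches could be dispatched to separate grid nodes and summed later.
    partial_sums = [
        simulate_chunk(seed, paths_per_chunk, **params)
        for seed in range(n_chunks)
    ]
    return sum(partial_sums) / (n_chunks * paths_per_chunk)


estimate = price_call()
print(f"Estimated call price: {estimate:.2f}")
```

For these inputs the closed-form Black-Scholes value is about 10.45, which gives a rough check on the estimate. Only the partial sums need to travel back from each batch, which is one reason this kind of workload spreads easily across shared servers.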
Wang says that six or seven years ago grid decisions were made by line-of-business managers. Now more banks are using a central IT organisation that takes responsibility for the grid and runs it across lines of business to achieve better utilisation while reducing space and energy costs.
High performance places demands across a system including networks, storage, databases and input/output. As an example, SunGard's Protegent compliance software maintains three years of market tick data, about one billion records a day, on a Sybase IQ database. Sybase IQ organises data in columns, which are faster to access than rows in a traditional database, says Steve Rider, a product architect.
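The difference between the two layouts can be shown with a toy example. The sketch below is only an illustration of row versus column organisation in plain Python, not a description of how Sybase IQ stores data; the tick records and field names are made up.

```python
# Row layout: one record per tick, with all of its fields kept together.
rows = [
    {"symbol": "ABC", "price": 10.0, "size": 100},
    {"symbol": "XYZ", "price": 20.5, "size": 300},
    {"symbol": "ABC", "price": 10.2, "size": 200},
]

# Column layout: one list per field, aligned by position.
columns = {
    "symbol": [r["symbol"] for r in rows],
    "price": [r["price"] for r in rows],
    "size": [r["size"] for r in rows],
}

# Total traded size: the row layout visits every record to pick out one
# field, while the column layout reads a single contiguous list.
total_from_rows = sum(r["size"] for r in rows)
total_from_columns = sum(columns["size"])

print(total_from_rows, total_from_columns)
```

Both totals are the same; the point is that a query over one field of a billion records a day only has to read that field's column, not every full record.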
Combining this kind of data architecture approach with appropriate hardware acceleration and storage access speed improvements is going to mean that the HPC levels accessible to only top-end high rollers are going to become more widely available. They may already be in the games machine in your kids' bedroom ...