Graves: inherent benefits of in-memory analytics

Among the many complaints from financial institutions about the volume of post-crisis regulation is the amount of data that is required by the regulators.

As STAC – the Securities Technology Analysis Center – puts it: “Analysing time-series data such as tick-by-tick quote and trade histories is crucial to many trading functions, from algorithm development to risk management. But the domination of liquid markets by automated trading – especially high-frequency trading – has made such analysis both more urgent and more challenging. This places a premium on technology that can store and analyse that activity efficiently.”

The STAC Benchmark Council developed the STAC-M3 Benchmarks to provide a common basis for “quantifying the extent to which emerging hardware and software innovations improve the performance of tick storage, retrieval, and analysis”.

In a series of STAC tests at the end of last year, one of the fastest-performing systems was the eXtremeDB database system from Washington State-based McObject.

Running on an IBM POWER8 Linux system, eXtremeDB achieved new records in a set of 17 STAC-M3 tests, including the best-ever results for jitter. Adding together the mean response times for the 17 tests, eXtremeDB on POWER8 took 44% less time than the nearest competitive solution while using just one-third the number of cores.

The new STAC-M3 results beat standing performance records in 6 of the 17 tests, and set records for lowest standard deviation of results (lowest jitter) in 5 of 17 tests. The results build on previous record-setting STAC-M3 implementations by McObject and IBM that, when combined, hold 16 of the 17 test performance records when compared to other solutions using non-clustered 2-socket servers.
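To make the two measures above concrete, here is a minimal Python sketch of how mean response time and jitter (taken here as the population standard deviation of response times) might be computed, together with the "percent less time" comparison. The sample timings are invented for illustration and are not STAC-M3 results; the function names are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical response times in milliseconds for one benchmark operation
# on two made-up systems. These are NOT STAC-M3 results.
system_a = [10.2, 10.4, 10.1, 10.3, 10.2]
system_b = [9.0, 14.5, 8.7, 12.9, 9.4]


def summarise(samples):
    """Return (mean response time, jitter), with jitter as the standard deviation."""
    return mean(samples), pstdev(samples)


def percent_less(candidate_total, baseline_total):
    """How much less time the candidate took than the baseline, in percent."""
    return 100.0 * (1.0 - candidate_total / baseline_total)


mean_a, jitter_a = summarise(system_a)
mean_b, jitter_b = summarise(system_b)

# The two means are close, but system B is far less predictable.
print(f"A: mean={mean_a:.2f} ms, jitter={jitter_a:.2f} ms")
print(f"B: mean={mean_b:.2f} ms, jitter={jitter_b:.2f} ms")

# A candidate whose summed mean times are 56 units against a baseline of 100
# took 44% less time.
print(f"{percent_less(56.0, 100.0):.0f}% less time")
```

Low jitter matters in trading because a system with a good average but occasional slow responses is harder to build predictable strategies on.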

Notably, McObject did not design the software specifically for HFT capital markets environments. When the company was formed in 2001, it was addressing the needs of embedded real-time systems more typically found in avionics and defence applications, and its customer base includes BAE Systems, EADS and Boeing as well as financial institutions such as the National Stock Exchange of India.

President and chief executive Steve Graves was one of the co-founders of the company and says that the embedded nature of the application has several benefits: the design is based on a core in-memory database system that eliminates performance-draining I/O, cache management, data transfer, and other sources of latency that are hard-wired into traditional disk-based relational database management systems.

Not least of the problems that have to be overcome is that, by their nature, embedded systems have to function with what Graves calls “meagre resources” in terms of compute power, which has a direct bearing on the STAC results, in which eXtremeDB ran faster on fewer cores.

The company’s background also means high reliability, he says: “If one of our systems fails in defence and aerospace it isn’t just inconvenient – bad things happen.”

Other approaches to handling high volumes of data, such as using field programmable gate arrays, have limitations that are overcome by the embedded architecture. “FPGAs are really fast, but they fail on moving the data,” says Graves. “In embedded systems the application is the database.”

The Financial Edition of the system takes the embedded database strengths – including a streamlined hybrid in-memory/persistent storage database system design, multi-core optimisation, developer flexibility and high scalability – and adds specialised features to address financial data management challenges:

  • Columnar data layout for fields of type ‘sequence’. Sequences can be combined to form a time series, ideal for working with tick streams, historical quotes and other sequential data
  • A library of vector-based math functions that accelerate management of time series data by maximising L1/L2 cache use. Functions can be pipelined to form an assembly line of operations on sequences in support of statistical/quantitative analysis
  • Support for common languages and interfaces, including SQL, C/C++, Java, C#, Python and ODBC/JDBC
  • Sharding (breaking large databases into smaller ‘shards’ to distribute the load) and distributed query processing, which support high scalability
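The ideas in the list above can be illustrated with a small Python sketch. It is not eXtremeDB code and does not use the product's API; every class and function name is hypothetical. It shows a columnar layout (one array per field), a pipeline of functions that processes the columns a chunk at a time, and a sharded store that sends a query to each shard and gathers the results. The chunk size of 4 is an arbitrary value chosen to keep the example small.

```python
from array import array
from zlib import crc32

CHUNK = 4  # tiny for illustration; an assumption, not a tuned value


class TickSeries:
    """Columnar layout: one contiguous array per field, not one record per tick."""

    def __init__(self):
        self.timestamps = array("q")  # signed 64-bit integers
        self.prices = array("d")      # double-precision floats
        self.volumes = array("q")

    def append(self, ts, price, volume):
        self.timestamps.append(ts)
        self.prices.append(price)
        self.volumes.append(volume)


def chunks(*columns):
    """Yield aligned slices of the given columns, one chunk at a time."""
    for start in range(0, len(columns[0]), CHUNK):
        yield tuple(col[start:start + CHUNK] for col in columns)


def multiply(stream):
    """Pipeline stage: element-wise price * volume for each chunk."""
    for prices, volumes in stream:
        yield [p * v for p, v in zip(prices, volumes)], volumes


def accumulate(stream):
    """Final pipeline stage: reduce all chunks to (notional, volume) totals."""
    notional, volume = 0.0, 0
    for products, volumes in stream:
        notional += sum(products)
        volume += sum(volumes)
    return notional, volume


class ShardedStore:
    """Distributes symbols across shards using a stable hash of the symbol."""

    def __init__(self, n_shards):
        self.shards = [dict() for _ in range(n_shards)]

    def add_tick(self, symbol, ts, price, volume):
        index = crc32(symbol.encode()) % len(self.shards)
        self.shards[index].setdefault(symbol, TickSeries()).append(ts, price, volume)

    def vwap_all(self):
        """Run the pipeline on every shard and gather per-symbol VWAPs."""
        result = {}
        for shard in self.shards:  # in a real system shards could run in parallel
            for symbol, s in shard.items():
                notional, volume = accumulate(multiply(chunks(s.prices, s.volumes)))
                result[symbol] = notional / volume
        return result


store = ShardedStore(n_shards=2)
ticks = [("AAA", 10.0, 100), ("BBB", 20.0, 50), ("AAA", 11.0, 300),
         ("AAA", 12.0, 100), ("BBB", 22.0, 50)]
for ts, (symbol, price, volume) in enumerate(ticks):
    store.add_tick(symbol, ts, price, volume)

vwaps = store.vwap_all()
print(vwaps)
```

Because each stage only ever holds one chunk of each column, the working set stays small, which is the property the cache-oriented design in the list is aiming for.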

One recent user is Shenzhen Kingdom Technology, a large financial securities software developer and systems integrator in China’s domestic market, which has deployed new order execution systems to meet growing demand for trading speed and throughput. Its existing trading systems used on-disk DBMS technology, which imposed disk and file I/O that limited the systems’ performance to 3,000 trades per second. The in-memory DBMS architecture of eXtremeDB enables the new OMS to process as many as 38,000 trades a second, the company said.
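As a rough back-of-the-envelope check on those figures, the short Python calculation below works out the throughput ratio and the average time available per trade. The per-trade figure assumes trades are handled strictly one after another, which the company did not state, so it should be read only as an order-of-magnitude guide.

```python
DISK_TPS = 3_000     # trades per second cited for the on-disk systems
MEMORY_TPS = 38_000  # trades per second cited for the new OMS

speedup = MEMORY_TPS / DISK_TPS

# Average time per trade in microseconds, assuming serial processing
# (an assumption made for illustration only).
budget_disk_us = 1_000_000 / DISK_TPS
budget_memory_us = 1_000_000 / MEMORY_TPS

print(f"throughput ratio: {speedup:.1f}x")
print(f"per trade: {budget_disk_us:.0f} µs versus {budget_memory_us:.0f} µs")
```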

The new VIP Quick Order and VIP Stock Options technology also uses eXtremeDB’s Data Relay feature. Data Relay simplifies the code that “looks inside” the database system’s transaction buffer to identify changes that should be relayed to external systems such as enterprise DBMSs, and accelerates performance by eliminating the CPU-intensive task of monitoring the database itself for this activity. Shenzhen Kingdom’s order management systems use Data Relay to provide persistent records of selected trading activity and prevent data loss in the event of disaster, as required by regulation.
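The general pattern described here, inspecting a transaction's buffered changes at commit time rather than polling the database for modifications, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the Data Relay API: the `InMemoryStore`, `Transaction` and `relay_orders` names, and the "order:" key prefix, are all hypothetical.

```python
class Transaction:
    """Collects changes in a buffer until the transaction commits."""

    def __init__(self, store):
        self._store = store
        self.buffer = []  # list of (key, value) changes made in this transaction

    def put(self, key, value):
        self.buffer.append((key, value))

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self._store.commit(self)
        return False  # on error, propagate it; buffered changes are dropped


class InMemoryStore:
    """Toy key-value store that hands committed changes to a relay callback."""

    def __init__(self, relay=None):
        self.data = {}
        self._relay = relay

    def begin(self):
        return Transaction(self)

    def commit(self, txn):
        for key, value in txn.buffer:
            self.data[key] = value
        if self._relay is not None:
            # The relay sees exactly what changed; nothing scans the database.
            self._relay(list(txn.buffer))


external_log = []  # stands in for an external enterprise DBMS


def relay_orders(changes):
    """Forward only the changes that need a persistent record."""
    external_log.extend((k, v) for k, v in changes if k.startswith("order:"))


store = InMemoryStore(relay=relay_orders)
with store.begin() as txn:
    txn.put("order:1", {"symbol": "AAA", "qty": 100})
    txn.put("cache:last_price", 10.5)

print(external_log)
```

Only the order record reaches the external log; the cached price stays in memory, mirroring the idea of relaying selected activity only.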

The VIP Quick Order and Stock Options systems are now used on the Shanghai Stock Exchange, with roll-out planned soon on the Shenzhen Stock Exchange. Shenzhen Kingdom also plans to use its technology platform incorporating eXtremeDB as the foundation for additional capital market systems such as risk management, the company said.

The role of high speed analytics in pre-trade risk management is demonstrated by India’s NSE, which also integrated eXtremeDB to replace traditional relational database management system software in performance-critical features of its algorithmic trading solution based on the NeatXS trading platform.

NSE.IT’s Algo Solution integrates eXtremeDB as the real-time database for NeatXS’s risk management features, which protect Algo Solution users – typically brokerages, specialised trading firms, asset management firms and hedge funds – from a wide range of risks related to factors including order size, allocation of margins, gross and net exposure, options coverage, securities’ mark-to-market value, and fat finger keystroke errors.
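A few of the risk factors named above can be expressed as simple pre-trade checks. The Python sketch below is illustrative only and is not drawn from NeatXS or eXtremeDB; the `Order` and `Limits` types, the field names and the threshold values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Order:
    symbol: str
    side: str        # "buy" or "sell"
    quantity: int
    price: float


@dataclass
class Limits:
    max_order_qty: int         # largest quantity allowed in a single order
    max_gross_exposure: float  # cap on total notional across open positions
    price_band: float          # allowed fractional deviation from last price


def check_order(order, limits, last_price, gross_exposure):
    """Return a list of reasons to reject the order; empty means it passes."""
    reasons = []
    if order.quantity > limits.max_order_qty:
        reasons.append("order size exceeds limit")
    if abs(order.price - last_price) > limits.price_band * last_price:
        reasons.append("price outside band (possible fat-finger error)")
    if gross_exposure + order.quantity * order.price > limits.max_gross_exposure:
        reasons.append("gross exposure limit breached")
    return reasons


limits = Limits(max_order_qty=1_000, max_gross_exposure=1_000_000.0, price_band=0.05)

ok = check_order(Order("AAA", "buy", 100, 10.1), limits,
                 last_price=10.0, gross_exposure=0.0)
mistyped = check_order(Order("AAA", "buy", 100, 101.0), limits,
                       last_price=10.0, gross_exposure=0.0)

print(ok)
print(mistyped)
```

Each check needs current reference data (last price, running exposure) at the moment the order arrives, which is why the speed of the underlying database bears directly on how quickly an order can be cleared.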

In algorithmic and high-frequency trading, risk management is automated and requires analysis of orders after they are placed but before they are executed. “The NeatXS risk management function receives thousands of new data points per second. This flow has increased exponentially and is still growing, due in part to exchanges’ expanding data broadcast services. Storing, filtering and organising the data for complex analysis can become a bottleneck for algorithmic strategies that must ‘get in and get out’ quickly. For our most recent upgrade, we identified the risk management module’s real-time database technology as an area where we could optimise the product and deliver greater value to users,” said VS Kumar, NSE.IT’s president, Americas.

According to Dr Pareshnath Paul, chief delivery officer at NSE.IT, the software “has enabled us to reduce latency to the sub-millisecond level per order while implementing a complex risk and compliance system, successfully positioning our Algo Solution in an increasingly competitive marketplace”.