The development of machine readable news is typical of today’s trading environment and the services that are offered to its participants. The competitive spirit that drives the market has led to an obsession with speed, complexity and technology and has created a marketing paradise for technology vendors.
Complex, benchmark-beating algorithms are now a necessity: microsecond-based, low latency execution is a must. Liquidity seeking, execution venue-trawling smart order routers are essential. And if all else fails, stick the prefix ‘intelligent’ in front of a product offering and sales are sure to follow.
The latest addition to these trading tools is machine readable news – a service by which financial news can be elementised, metatagged and fed into firms’ trading models.
The dissemination of financial news has always played a role in trading decisions, particularly in the currency markets where movements are often dictated by central bank announcements or sovereign events. But the traditional delivery methods for news articles, earnings announcements, ratings upgrades and the like are of limited relevance in today’s high-frequency trading market.
So it should be of little surprise that financial news providers have devoted considerable resources to making their content machine readable. But the question remains – just how useful is machine-readable news? Do traders feel that price-driven algorithms have gone as far as they can and that an extra edge is needed or are they simply buying whatever new technology their competitor has?
As Richard Brown, global business manager, Machine Readable News at Thomson Reuters, admits, matching competitors’ strengths is important. “In news-based algorithmic trading, you have to keep up with the Joneses and the majority of our customers think that automated news-based trading will give them a competitive edge,” he says.
There are several things that firms have to consider before implementing an MRN service other than how to spend the considerable fortune that the service will undoubtedly bring them. According to Armando Gonzalez, chief executive and founder of Ravenpack, a US-based service that transposes standard news content into a machine readable format for algorithmic and quantitative trading, there are five basic considerations.
The first is to receive the content in the right format. “Some newswires will have added metadata which makes it better for searching but not necessarily for trading purposes or for computer interpretation,” says Gonzalez. “The scope of the metadata is important – what attributes does it stress?” he asks.
The other important aspect is whether the news is being elementised. This is different from adding metadata, which allows machines to find a story. Elementising involves creating fields of categories and values – company names, market sectors, price and so on – within the story in a format that can be processed by machines. “So are you merely adding metadata or are you properly elementising the underlying information?”
Gonzalez also highlights the need for low latency, the ability to collect MRN data from multiple sources and to store and retrieve this data, and access to historical news data in the right format. Historical data allows firms to conduct backtesting: without years of it in an MRN format, there is really no way to build quantitative models, he says.
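The distinction Gonzalez draws can be sketched in a few lines of Python. This is an illustration only, not any vendor's actual format: the class names, field names, company and figures below are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaggedStory:
    """A story with metadata only: enough for a machine to find it."""
    headline: str
    body: str
    topics: List[str] = field(default_factory=list)


@dataclass
class ElementisedStory:
    """The same story with its content broken out into typed fields."""
    company: str
    sector: str
    event_type: str
    value: float
    currency: str


def find_stories(stories, topic):
    # Metadata supports search: which stories carry this topic tag?
    return [s for s in stories if topic in s.topics]


def beats_estimate(story, estimate):
    # Elementised fields support computation: compare a number directly,
    # without parsing the body text.
    return story.value > estimate


# Hypothetical example data.
tagged = TaggedStory(
    headline="Example Corp reports quarterly earnings",
    body="Example Corp said earnings per share were 1.25 dollars.",
    topics=["earnings", "technology"],
)
elementised = ElementisedStory(
    company="Example Corp",
    sector="technology",
    event_type="earnings_per_share",
    value=1.25,
    currency="USD",
)

print(find_stories([tagged], "earnings"))
print(beats_estimate(elementised, 1.10))
```

A trading model can act on the second form straight away. The first form only tells it that a relevant story exists.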
The most important consideration, however, is the ability to derive analytics from the MRN. “The MRN is just the format, you then have to figure out what to do with it and how to apply it in a profitable way and find alpha in the analytics,” says Gonzalez. “But you have to address the first four points before you get to the fifth and most important part.”
Deriving analytics from more data-driven and numbers-based news stories, such as earnings announcements or GDP figures, is straightforward enough. But where figures are absent, the key question is how firms analyse and quantify sentiment. “We have an existing sentiment engine product, the News Analytics service we have developed with Ravenpack,” says Alan Slomowitz, director of Algorithmic and Trading Products for Dow Jones’ Content Technology Solutions division.
“Using different sentiment analyses that are out there in the market and applying them to our news is one facet. The other aspect is the context and the various drivers of news. People are looking to see whether the story is good or bad and in what context this is. We have been looking at various academic studies that are out there looking at language and whether positive or negative language has an effect on share prices and market movements.”
The technology involved in the identification of negative and positive words is relatively straightforward, involving control dictionaries and software to pick out the relevant words, says Slomowitz. The next stage is identifying different kinds of events within one story and looking at the tone around these events. The underlying technology for this is a combination of semantics analysis and data mining. “There are lots of variations on this technology, all trying to identify events, themes, trends and so on and this is where there has been an increased emphasis with more firms trying to gain trading advantages from news content,” says Slomowitz.
Reuters’ Brown makes a similar distinction between basic and advanced methods and the technology used to track author sentiment. Simple categorisation involves scoring an article based on certain words and phrases and then taking an arithmetic mean to derive the average tone of the article. “This may sound comprehensive but it is not,” says Brown. “In most articles there are mentions of multiple companies where one may be mentioned positively and another negatively so that when you apply a categorisation approach these positive and negative mentions cancel each other out and give a neutral result.”
A more comprehensive approach is to adopt a hybrid statistical sentiment scoring methodology that takes positive and negative scores for certain words and phrases and then puts them into context, using the proximity of words to each other, the surrounding words and other sophisticated natural language processing techniques, in order to come up with an entity-level score for each company within the article.
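The cancelling-out effect Brown describes is easy to reproduce. The sketch below assumes a tiny, made-up control dictionary and an invented two-company sentence. It illustrates the simple categorisation approach in general, not the scoring used by any particular service.

```python
import re

# Hypothetical control dictionary mapping words to a tone score.
WORD_SCORES = {
    "surged": 1.0,
    "record": 1.0,
    "strong": 1.0,
    "slumped": -1.0,
    "weak": -1.0,
    "warning": -1.0,
}


def article_tone(text):
    """Arithmetic mean of the scores of every dictionary word in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    scores = [WORD_SCORES[w] for w in words if w in WORD_SCORES]
    if not scores:
        return 0.0  # no scored words: treat the article as neutral
    return sum(scores) / len(scores)


article = (
    "Alpha surged on record sales. "
    "Meanwhile rival Beta slumped after a profit warning."
)

# Two positive and two negative words average out to a neutral score,
# even though neither company's news is neutral.
print(article_tone(article))
```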
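As a rough illustration of entity-level scoring, the sketch below assigns each scored word to the nearest company mention and averages per company. This nearest-mention rule is a deliberately crude stand-in for the proximity and natural language processing techniques described above. The dictionary, company names and sentence are hypothetical, and only single-word company names are handled.

```python
import re

# Hypothetical control dictionary mapping words to a tone score.
WORD_SCORES = {
    "surged": 1.0,
    "record": 1.0,
    "slumped": -1.0,
    "warning": -1.0,
}


def entity_scores(text, entities):
    """Average the scores of the dictionary words closest to each entity."""
    tokens = re.findall(r"[a-z]+", text.lower())
    names = {e.lower(): e for e in entities}
    # Token positions at which each entity is mentioned.
    mentions = [(i, names[t]) for i, t in enumerate(tokens) if t in names]
    collected = {e: [] for e in entities}
    if mentions:
        for i, token in enumerate(tokens):
            if token in WORD_SCORES:
                # Attribute the word to the nearest mention by token distance
                # (ties go to the earlier mention).
                _, nearest = min(mentions, key=lambda m: abs(m[0] - i))
                collected[nearest].append(WORD_SCORES[token])
    return {e: (sum(s) / len(s) if s else 0.0) for e, s in collected.items()}


article = (
    "Alpha surged on record sales. "
    "Meanwhile rival Beta slumped after a profit warning."
)

# The same sentence that averages to neutral at article level now yields
# a positive score for Alpha and a negative score for Beta.
print(entity_scores(article, ["Alpha", "Beta"]))
```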
This approach to assessing a company’s fortunes through the sentiment of news articles was first adopted in the PR industry as a way of tracking clients’ fortunes, publicity-wise. A team of linguistic analysts would be employed but the speed at which they worked (six to 10 articles per hour) was never deemed quick enough to be applicable to financial services and trading decisions. However, advances in automation mean that machine readable news services can apply the same sentiment analysis to six to 10 articles per second.
So how useful has tracking author sentiment proved to be? How are firms able to tell if this sentiment reflects the behaviour of the market in any way? “You have to remember that we are assessing author sentiment, not trying to predict the direction of the market,” says Brown. “We suggest to our clients ways in which they can incorporate author sentiment with market sentiment because a lot of it is about relative values – what is the relative value of one IT company’s author sentiment against the IT sector? Are retail stocks performing better than financial stocks? An article can be applied to a company, that company can be applied to a sector and so on.”
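One simple way to read the “relative values” Brown mentions is to compare each company’s sentiment score with the average for its sector. The sketch below assumes that interpretation. The company names, sectors and scores are invented for illustration.

```python
from statistics import mean


def relative_sentiment(company_scores, sector_of):
    """Return each company's score minus the mean score of its sector."""
    by_sector = {}
    for company, score in company_scores.items():
        by_sector.setdefault(sector_of[company], []).append(score)
    sector_mean = {sector: mean(scores) for sector, scores in by_sector.items()}
    return {
        company: score - sector_mean[sector_of[company]]
        for company, score in company_scores.items()
    }


# Hypothetical author-sentiment scores per company.
scores = {"ItCoA": 0.75, "ItCoB": 0.25, "ItCoC": -0.25, "ShopD": 0.5, "ShopE": -0.5}
sectors = {"ItCoA": "IT", "ItCoB": "IT", "ItCoC": "IT", "ShopD": "retail", "ShopE": "retail"}

# A positive value means the company is written about more favourably
# than its sector as a whole.
print(relative_sentiment(scores, sectors))
```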
There are pretty much endless opportunities for applying MRN, says Brown. It is not just a tool for high frequency traders, it can also act as a risk management tool, as a stock screening tool and even as a simple filter on news stories to help analysts know what to read and what to ignore. Whatever the application of the service, Brown stresses that the most important point is to make sure that there is “an appropriate level of data available to be able to backtest the models and have complete confidence in them”.
It is a valuable point as, despite the investment that has been put into semantics technology and the effort to combine it with ‘good’ and ‘bad’ sentiment, it is still not clear to Slomowitz whether all of it is accurate enough for traders to completely trust. “They are still waiting for more analysis and research to confirm that they can use these as predictive feeds into the trading models that they are building. This is the other piece that needs to be completed.”
Slomowitz senses that the vast majority of firms will be using MRN services to trade in more familiar areas where the tagging and identifying will be easier and there will be some preconceived ideas regarding how they will trade. “So for those engaged in M&A arbitrage, by using MRN feeds that concentrate on M&A announcements, they will be able to trade faster. They may engage in more sophisticated trades in this area but by and large it is the tried and tested trading strategies that are being employed.”
The next stage is for these traders to look historically at MRN and identify unusual trading patterns based on how the market or specific traders reacted to certain news involving less obvious companies, says Slomowitz, something that is particularly relevant when one considers the collapse of the sub-prime market and the fact that few people saw it coming.
“A lot of the events that people trade off are scheduled – you know when they are going to happen and set yourself up accordingly. But how do you set yourself up to be able to take advantage of unscheduled events? That is the bigger challenge and it is in this area that I think technology still has some catching up to do in order to perform the pattern analysis and complex event processing required.”
Of course, more important than any of this is what the development of MRN means for the financial journalist. “There will be certain types of news that will not need to be written by journalists and the more these types of stories are automated, the more it will allow journalists to focus on what they’re good at – investigating, writing and publishing quality articles,” says Ravenpack’s Gonzalez. (Phew! Ed.)
After so many years of writing the same sentiments regarding back-office banking staff, this is perhaps to be expected. However, Gonzalez’s next statement is what should really scare journalists. “MRN will also enable readers to quickly and easily verify the news that they read so they can tell whether a journalist is spot-on or off-the-wall.”