The scale of the increase in data processing facing the capital markets business is hard to comprehend. Research firm Celent estimates that the average daily trading volume in the foreign exchange market was around US$4 trillion last year, which represented a 20% increase over the past three years, preceded by even greater growth of 72% prior to the onset of the financial crisis in 2008.
The electronification of trading is enormous. HFT volumes in FX will be in the range of 28% this year, compared to 5% in 2004. The big banks are running 92% of their FX transactions through e-trading systems. As a result, firms are having to invest in their processing capacity to stay on top of the market.
The term ‘big data’ has been coined to describe the capturing and processing of data too large to be held in a conventional relational database, often in real time. It was originally used by online, retail-focussed firms such as Google and Facebook, which have to capture and process data often without knowing how it might need to be used later. That makes the structuring of data into tables, a necessary formatting process for use in databases, impossible.
“There is a lot of unstructured data in finance that doesn’t fit with the standard relational technology,” says Louis Lovas, director of solutions at OneMarketData (OMD). “The big relational vendors are nowhere to be found at the quant houses and big banks because their technologies do not fit. And there are a number of reasons why they don’t fit. In commercial industry big data is defined by the adage of the three V’s – volume, velocity and variety, a somewhat nebulous description relating to unstructured social content. In that context of social big data the goal is judging human behaviours, mood shifts and buying patterns. There is a disquieting alchemy behind the social science in the hunt for business benefit within the glut of data. The analysis involves not only what data to keep, but what to throw away.”
He continues, “By contrast, within finance the big data definition is much more precise since there is need for reliability, accuracy and timeliness. FX itself represents the world’s largest and most liquid market exceeding $4 trillion daily turnover. Unlike any other asset class, the global currency market ‘never sleeps’. FX clearly fits well within big data’s high-volume definition.”
The most valuable asset
Being sure in a business based on trading risk for capital is crucial, and as they say in the army ‘Time spent on reconnaissance is never wasted.’
“Reliable and accurate data is the ultimate driving force in all capital markets transactions,” says Paul Kennedy, business manager, reference data at Interactive Data. “Firms talk about cleaning or scrubbing data which is almost a religious issue as it is about individual versions of the truth. Do you have a supplier whose information you trust? Have you got an automated intelligence system to check if a price comes in that it is the price you are expecting; is it within the standard deviation of the last one? Is it a price and not a piece of text? How do I avoid the fat finger syndrome that you often get with order management systems?”
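The checks Kennedy lists can be illustrated with a minimal Python sketch. It assumes a short list of recently accepted prices is available; the function name `check_price` and the `max_sigmas` tolerance are hypothetical and not taken from any vendor's system.

```python
import math
from statistics import stdev


def check_price(raw, recent_prices, max_sigmas=3.0):
    """Return (ok, reason) for one incoming price.

    raw           -- the value as received from the feed (may be text)
    recent_prices -- recently accepted prices, oldest first (assumed input)
    max_sigmas    -- hypothetical tolerance, in standard deviations
    """
    # Is it a price and not a piece of text?
    try:
        price = float(raw)
    except (TypeError, ValueError):
        return False, "not a number"

    # Reject NaN, infinities, zero and negative values.
    if not math.isfinite(price) or price <= 0:
        return False, "not a valid positive price"

    # With fewer than two prior prices there is nothing to compare against.
    if len(recent_prices) < 2:
        return True, "accepted (no history to compare)"

    # Is it within a few standard deviations of the last accepted price?
    # If the history is perfectly flat, sigma is 0 and any move is flagged.
    sigma = stdev(recent_prices)
    if abs(price - recent_prices[-1]) > max_sigmas * sigma:
        return False, "outside expected range (possible fat finger)"

    return True, "accepted"


recent = [1.3050, 1.3052, 1.3049, 1.3051, 1.3050]
print(check_price("1.3051", recent))  # a small move
print(check_price("13.051", recent))  # a misplaced decimal point
print(check_price("n/a", recent))     # text instead of a price
```

A production system would likely add per-instrument tolerances and a trusted second source; this sketch only shows the shape of the test.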
He believes the greatest challenge at the moment is that although firms have to understand whether the data they have is accurate, the business is being buffeted by three winds of change: technology, cost constraints and the parallel search for profitability. The ability to generate and manage these massive data sets with nanosecond response times is a huge potential advantage, but spending on the systems clashes with the tough business environment and tight spreads that characterise the market. Automation is one option firms can look at to keep their costs down.
Some of this technology is being adapted from other industries; the Hadoop data management system is an open source platform, based on a white paper Google released in 2004 detailing its MapReduce data management model. HSBC and Bank of America Merrill Lynch are both investigating its potential at present. By storing data in chunks that are not related, across an architecture of networked computers, it can be searched more flexibly and without the need to load and unload it into a database. The use of computers operating in parallel allows processing to occur at a much faster pace. There are also more established commercial options available.
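The MapReduce idea can be sketched in a few lines of Python. This is an in-process illustration only, not Hadoop's API: the chunks stand in for blocks of data held on different machines, a thread pool stands in for the cluster, and the function names are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_chunk(chunk):
    """Map step: emit a (currency_pair, amount) pair for every trade in one chunk."""
    return [(pair, amount) for pair, amount in chunk]


def shuffle(mapped_chunks):
    """Shuffle step: group the emitted values by key across all chunks."""
    grouped = defaultdict(list)
    for emitted in mapped_chunks:
        for key, value in emitted:
            grouped[key].append(value)
    return grouped


def reduce_group(key, values):
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def total_volume_by_pair(chunks, workers=4):
    """Run the map step on each chunk concurrently, then shuffle and reduce."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_chunk, chunks))
    grouped = shuffle(mapped)
    return dict(reduce_group(key, values) for key, values in grouped.items())


# Three unrelated chunks of (currency_pair, traded_amount) records.
chunks = [
    [("EURUSD", 5), ("USDJPY", 2)],
    [("EURUSD", 3)],
    [("GBPUSD", 7), ("USDJPY", 1)],
]
print(total_volume_by_pair(chunks))
```

Because each chunk is mapped independently, no chunk needs to know about any other, which is what lets the work be spread across networked machines.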
“Big Data platforms like OneTick provide a premier solution for large scale FX data management,” says Lovas. “OneTick combines the attributes of a high performance, massively parallel database with real-time complex event processing in a single solution. Bundling a large analytical library of functions accessible via a graphical model builder, it is designed for quants and sophisticated traders. But it also includes the familiar interfaces of SQL/ODBC and APIs for C++, Java and C# for the developer community.”
Many of OMD’s current clients take advantage of data centre proximity hosting and co-location for managed services. This typically includes taking rack space, storage platforms (e.g. NAS, SAN) and resident cross connects to markets from leading vendors.
“They look to firms like OneMarketData to provide the next level of managed service – data management to complement our licensed product. This allows them to focus on bottom-line profitability of the firm, achieved through those elements in the trade life cycle,” Lovas adds.
In-memory storage and computation environments, such as SAP’s HANA platform, can handle several terabytes of data, which is enough for a firm’s daily operational data, and can provide a single data view for multiple user communities, who would be able to access it in a way that suits them. For example, the front office might want to look at real-time information while the middle office may use it to execute risk analytics.
“Risk and finance, from their traditional middle office seat, are now bridging the front and back office and bringing them closer together,” says Stuart Grant, EMEA business development manager - Financial Services at database provider Sybase. “As a result it is no longer good enough to move data around in batch windows, and execute risk measures on an overnight basis across a firm’s entire portfolio. Banks want to do this on an intraday, on-demand basis using current data. That’s where in-memory storage and Map Reduce come into their own, Map Reduce being a similar concept to Hadoop.”
However, whilst in-memory storage provides low-latency access to data, it is not cost effective to store all of a firm’s historical data in memory. Products like Sybase IQ, a columnar database, are capable of storing petabytes of data and then applying techniques such as MapReduce to deliver low-latency analytics.
The drivers for big data
While market participants are cutting costs, there is an almost contradictory push in the search for alpha, where firms want to invest not just in FX as a currency class but as part of their hedging strategy for equity and fixed income. That creates a more sophisticated requirement to enter the market.
“[Those issues] are why big data is such a buzzword,” says Kennedy. “But it means different things to different people. In the technology team it’s a plumbing issue. Where is the data, is it in a cloud? Do I have to move the data or move my application to manipulate the data? When I’m at a certain volume do I need to use certain technologies? Should I use Hadoop? Should I use a column-oriented database instead of a relational one? I have to consider how flexible the data model is, for example if there is a new Greek currency tomorrow, what is the impact on the system? How do I value this strange new asset if I have no historical experience?”
As a technology issue, Kennedy notes that business and traders can be oblivious to the ‘plumbing’ around big data, and do not tend to care about the solutions used, as long as they can make a successful trade. From the perspective of management it is often a question of whether the firm is being competitive and secure with well controlled costs.
Grant says that market conditions are driving up the number of FX transactions, creating the need for big data technology. “A number of the larger tier one banks are trying to dominate the flow of the FX environment to shore up revenues, because FX is relatively stable, regardless of what happens in other asset classes,” he said. “The margins that banks earn in FX are slimmer than those in other asset classes. So they need that increase in flow to improve revenue and profit. That’s where big data comes in, with costs associated with handling and processing of data.”
“The driver is transaction volume,” agrees Ralf Behnstedt, managing partner of consultancy FX Architects. “The algorithmic trading approach means the computer decides when to buy or sell, so the simple calculation of scale by the number of traders connected to the FX markets doesn’t happen anymore. You have a couple of computers making decisions that could give you 100,000 transactions and tomorrow just five transactions from them.”
The turf war that Grant says has been building in the EMEA FX market involves not only the large banks trying to claim dominance but also, as a consequence, high street non-bank FX players, which is adding to existing connectivity challenges.
“Most organisations now regularly look at 50-60 feeds, which in EMEA is greater than if you were looking at the equity environment,” he said. “In addition to the connectivity issues, they are now in a latency war. Some are reducing their latency in terms of their market making capabilities; how long it takes them to generate a price. I’ve heard of firms trying to reduce price generation from 100 milliseconds to sub-10 milliseconds. When you think of the volume of prices going out to the market, that’s significant.”
The other side of the equation is the development of algorithmic trading. Like equities, FX has been electronic for some time, but in equities there have been more factors to base algos on, which has made equity algorithms the more complex of the two. As algorithms have become more sophisticated, the know-how has flowed into the FX market, with firms using multiple algorithms and a commensurate growth in the measurement of their performance. To increase their FX order flow, banks are now adapting these systems for retail clients.
“Some banks are looking at how they can bring their two sides of the FX business together to link retail and institutional flow,” says Grant. “So on the retail side we are starting to see FX providers offer algorithms to their high street customers. There are opportunities to reduce costs by improving your ability to trade and the speed at which you can trade based on the strategy you use. For example, who narrows their bid-offer spreads first when there’s an event in the market? Who are the laggards? That’s where big data really comes into its own, where organisations can start developing more sophisticated strategies.”
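Grant's question about who narrows spreads first can be framed as a simple calculation over historical quotes. The Python sketch below is illustrative only: the tuple layout, the provider names and the `tight_spread` threshold are assumptions, and prices are expressed as integer pips so that spread comparisons are exact.

```python
def time_to_narrow(quotes, event_time, tight_spread):
    """Measure how long each provider takes to tighten its spread after an event.

    quotes       -- (timestamp, provider, bid, ask) tuples sorted by timestamp,
                    with bid and ask in integer pips (assumed layout)
    event_time   -- timestamp of the market event
    tight_spread -- hypothetical spread, in pips, that counts as 'narrow'

    Returns {provider: elapsed time until the spread first returned to
    tight_spread or better after having widened}. Providers that never
    widened, or never recovered, are left out.
    """
    widened = set()
    recovered = {}
    for timestamp, provider, bid, ask in quotes:
        if timestamp < event_time or provider in recovered:
            continue
        spread = ask - bid
        if spread > tight_spread:
            widened.add(provider)
        elif provider in widened:
            recovered[provider] = timestamp - event_time
    return recovered


def rank_providers(recovered):
    """Order providers from fastest to slowest; the laggards come last."""
    return sorted(recovered, key=recovered.get)


quotes = [
    (99, "BankA", 13050, 13052),   # before the event, ignored
    (99, "BankB", 13050, 13052),
    (100, "BankA", 13045, 13055),  # both widen when the event hits
    (100, "BankB", 13044, 13056),
    (101, "BankA", 13049, 13051),  # BankA is back to 2 pips after 1 second
    (101, "BankB", 13046, 13054),
    (104, "BankB", 13049, 13051),  # BankB takes 4 seconds
]
times = time_to_narrow(quotes, event_time=100, tight_spread=2)
print(times, rank_providers(times))
```

Run over years of tick data and many events, the same measurement would need the kind of storage and parallel processing the article describes.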
Comply or die
This growth in trading has knock-on effects for other areas of the business. Each trade has to be checked by the bank to ensure that it is not exceeding risk positions or is in some way unauthorised.
Behnstedt says, “Regulations pose a challenge for banks as they have to manage large volumes of data, for example based on know your customer rules, but they have a silo-oriented approach so there may not be much interaction between the Forex silo, the equities silo and the bond silo. As trading volumes increase so banks are less able to cope. They are looking for methods of volume intensive processing; it doesn’t mean that they will move from a position of 10,000 clients to 1,000,000 clients the next day, it is more an issue that transaction volumes might spike.”
Banks already have large structured information stores, data warehouses, but these are really designed to be used after the trade has been made, notes Behnstedt. Often used to meet Basel III and reporting requirements, they are not designed to calculate details of the trading operations, such as the average mark-up of trades, or which currencies a client is trading and which he would normally have exposure in. Business intelligence technologies that allow a deeper understanding of this information are in place in firms but not usually attached to data warehouses.
“Business intelligence functionalities which help to analyse data are always held in systems in the middle office, sometimes in the risk management area,” he explains. “By making trading more automated banks have a problem in that they have to cope with the data volume and apply business intelligence. So they need to look for or build something that makes it easier to manage. The question is whether you can build and support this or outsource it to a provider.”
The wood from the trees
Getting the data processed is well and good but at some point its analysis has to be crystallised into something that FX traders can actually use. Given that the definition of big data is a dataset beyond the capacity of a normal database, what tools can be used for the sake of practicality?
“There is understanding data using analysis, which I might call the value of the data, then there is visualisation, which are ways of interacting with the data,” explains Kennedy. “If I have got an enormous quantity of bid-ask trading data how do I, as a human, make sense of that? Visualisation demonstrates that a picture can be worth a thousand words, it allows us to make sense of the numbers in the sense of where the market is going and what is affecting it.”
“Big Data is about the capture and storage of deep data and linking disparate data sets under some common thread to tease out an intelligible answer, a diamond from a mountain of coal,” adds Lovas.
In his view, it facilitates many functions, including the provision of wider price transparency, both in depth, in terms of years of price data, and in breadth, sourced across many banks and ECNs. It is then the fuel driving the quantitative trading engine, including alpha discovery and research, strategy back-testing and optimisation, portfolio valuation and transaction cost analysis (TCA). Accurate historical data (depth of book, trades, even daily closing prices) is the ante to the fiercely competitive trading game.
“Big Data has improved price transparency which has made FX transaction cost analysis easier,” he says. “The highly fragmented FX market can be brought together under big data platforms which support broad connectivity and can perform aggregation for accurate time and sales data. This is also accessible via complex event processing systems for real-time TCA.”
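The aggregation and TCA that Lovas describes can be illustrated with a small Python sketch. It is not OneTick's API: the venue names, the snapshot layout and the choice of the aggregated mid-price as the benchmark are all assumptions made for the example.

```python
def best_bid_offer(quotes):
    """Aggregate one quote snapshot across venues.

    quotes -- {venue: (bid, ask)} at a single moment (assumed layout)
    Returns the highest bid and the lowest ask seen on any venue.
    """
    best_bid = max(bid for bid, _ in quotes.values())
    best_ask = min(ask for _, ask in quotes.values())
    return best_bid, best_ask


def slippage_bps(side, exec_price, quotes):
    """Cost of one fill against the aggregated mid-price, in basis points.

    A positive result means the fill was worse than mid: a buy above it
    or a sell below it.
    """
    best_bid, best_ask = best_bid_offer(quotes)
    mid = (best_bid + best_ask) / 2
    if side == "buy":
        cost = exec_price - mid
    else:
        cost = mid - exec_price
    return cost / mid * 10_000


# Hypothetical EUR/USD snapshot from three venues.
snapshot = {
    "VenueA": (1.3050, 1.3053),
    "VenueB": (1.3051, 1.3054),
    "VenueC": (1.3049, 1.3052),
}
print(best_bid_offer(snapshot))
print(round(slippage_bps("buy", 1.3053, snapshot), 2))
```

In a fragmented market the aggregated best bid can momentarily exceed the best ask across venues; a fuller treatment would also match each fill to the snapshot in force at its timestamp.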