The perception of speed is relative to the perspective of the observer.
-- Author Unknown
There is no question that the computer has radically enhanced our ability to perform time series analysis (TSA) in a, well, timely manner. The manual collection and plotting of thousands of data points could take weeks in the pre-PC days. Now our computers can crunch numbers and produce results in hours, if not minutes. Yes, to the casual observer it happens fast. However, is fast actually fast enough when split-second market-timing decisions hang in the balance? Ask any trader who has had to wait several minutes for something that was needed in several seconds, and you’ll begin to understand that the perception of speed truly IS relative to the perspective of the observer.
You would think that if you had a state-of-the-art PC running the fastest processor on the market, maxed out with high-speed RAM, and spinning a hard drive with the quickest seek time in the industry, no one would be able to perform TSA faster than you. The truth is that your high-tech workhorse probably doesn’t deliver any measurable speedup, because the bottleneck isn’t inside the PC; it’s in the database.
"When all you have is a hammer, everything looks like a nail"
-- Abraham Maslow (1908-1970) American Psychologist
It’s hard to blame the average financial software developer for storing “data” in a “database”. After all, isn’t that what a database is supposed to do, store data? Yes, it is hard to blame them, but blame them we must. Any in-depth study of the method in which TSA results are obtained would reveal that a structured relational database is the last way that TSA data should be stored. However, when you are a database programmer, you only have a hammer in your toolbox.
"Having an RDBMS doesn't mean instant decision-support nirvana. As enabling as RDBMSs have been for users, they were never intended to provide powerful functions for data synthesis, analysis, and consolidation (functions collectively known as multidimensional data analysis)."
-- Ted Codd, inventor of the relational database model, 1993
So, if the man who invented the relational database model doesn’t think that it is suited for data analysis, why do the financial software vendors keep insisting on storing data in relational databases? Simple: to them, data looks like nails.
A look at traditional data storage
SQL databases consist of a set of row/column-based “tables”, cataloged by a “data dictionary”. A table is a “container” that stores data. In practice, a table looks a lot like a spreadsheet: it is composed of rows (records), and each row is composed of columns (fields). A collection of related tables is known as a database.
Using the very flexible SQL (structured query language), you can retrieve data from any table, or groups of related tables, and have that data presented to you as a “view”.
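As a rough illustration of the table and view concepts described above, the following sketch uses Python's built-in sqlite3 module. The table name, column names, and sample rows are invented for this example and do not come from any particular financial product.

```python
import sqlite3

# An in-memory database is enough to illustrate the idea.
conn = sqlite3.connect(":memory:")

# A table is a container of rows (records) made up of columns (fields).
conn.execute(
    "CREATE TABLE trades (symbol TEXT, ts TEXT, price REAL, volume INTEGER)"
)
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?, ?)",
    [
        ("ABC", "2004-03-01 09:30:00", 10.00, 100),
        ("ABC", "2004-03-01 09:30:01", 10.05, 200),
        ("XYZ", "2004-03-01 09:30:00", 55.10, 50),
    ],
)

# A view is a stored query that presents a subset of the data.
conn.execute(
    "CREATE VIEW abc_trades AS "
    "SELECT ts, price, volume FROM trades WHERE symbol = 'ABC'"
)

# Query the view as if it were a table.
abc_rows = conn.execute("SELECT * FROM abc_trades ORDER BY ts").fetchall()
for row in abc_rows:
    print(row)
```

The view hides the filtering logic from the caller, but each query against it still goes through the general-purpose SQL engine.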
This basic functionality, and the flexibility to store and relate almost anything, is what makes the RDBMS model so powerful and so widely used for nearly every serious business application.
Unfortunately, this “one size fits all” approach to data storage and retrieval is exactly why the RDBMS model fails so miserably for financial analysis and reporting applications.
The RDBMS model produces substantial overhead due to its inherent multiple-row and table record structures. When you heap indices, clusters, and procedures on top, you create even more overhead which slows down performance considerably.
Since an RDBMS treats all records as equally “important”, its storage is not optimized for time-ordered access.
Also, since an RDBMS has no inherent data compression methods, it is usually combined with exception reporting and averaging techniques, which may result in data loss and inaccurately reproduced data.
Now let’s layer on some more bottlenecks
The speed of writing to an RDBMS is quite slow (from the perspective of the PC). Major RDBMS vendors often claim benchmarks that include very high transactions per second (TPS). What they don’t say is that the TPS figure refers to actions performed on the data after it is already in the database, and not to the speed at which data is written to the database or retrieved from it. What goes on inside the database is of little interest to the end user. The data acquisition speed, and the actual time that it takes to put a set of results onto the screen, is where money is made and lost.
An additional SQL drawback, from the perspective of any financial data reporting, is that statistics are not automatically calculated by the RDBMS, because SQL’s built-in aggregate mathematics is largely limited to counts, sums, minimums, maximums, and averages.
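To make this limitation concrete, here is a minimal sketch using SQLite, chosen only because it ships with Python. SQLite offers the basic aggregates but no built-in standard deviation, so the statistic has to be computed by the application after fetching the rows. Other database products may provide statistical extensions; the table name and sample prices below are made up for illustration.

```python
import sqlite3
import statistics

prices = [10.0, 10.5, 9.5, 10.0, 11.0]  # made-up sample prices

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticks (price REAL)")
conn.executemany("INSERT INTO ticks VALUES (?)", [(p,) for p in prices])

# The basic aggregates are available directly in SQL.
total, low, high, mean = conn.execute(
    "SELECT SUM(price), MIN(price), MAX(price), AVG(price) FROM ticks"
).fetchone()

# Anything beyond that (here, population standard deviation) means
# pulling the rows out and doing the math in application code.
fetched = [row[0] for row in conn.execute("SELECT price FROM ticks")]
std_dev = statistics.pstdev(fetched)

print(total, low, high, mean, std_dev)
```

The second query is the extra round trip the paragraph above is describing: the data has to leave the database before the statistic can be calculated.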
What’s more, a traditional RDBMS is generally limited to a one-second time resolution. This is a problem when you are acquiring high-burst quantities of data with sub-second time stamps.
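The effect of coarse time resolution can be sketched in a few lines of Python. The ticks below are invented; the point is only that when time stamps are truncated to whole seconds, several ticks collapse onto the same key and their ordering within the second is lost.

```python
from datetime import datetime

# Four hypothetical ticks arriving within about one second.
ticks = [
    (datetime(2004, 3, 1, 9, 30, 0, 125000), 10.00),
    (datetime(2004, 3, 1, 9, 30, 0, 480000), 10.02),
    (datetime(2004, 3, 1, 9, 30, 0, 910000), 10.01),
    (datetime(2004, 3, 1, 9, 30, 1, 40000), 10.03),
]

# Keyed at one-second resolution: later ticks overwrite earlier ones.
by_second = {}
for ts, price in ticks:
    by_second[ts.replace(microsecond=0)] = price

# Keyed at full (microsecond) resolution: every tick survives.
by_microsecond = {ts: price for ts, price in ticks}

print(len(by_second), "distinct keys at one-second resolution")
print(len(by_microsecond), "distinct keys at microsecond resolution")
```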
The ideal solution for the financial reporting industry is a storage and retrieval methodology that is able to access data extremely fast, nearly instantaneously, and can calculate the statistics for a given time span "on the fly" without the overhead of SQL.
Could that solution be Modulus Financial Engineering’s Market Database Server (MDS™)? We think so.
MDS: A new data retrieval methodology custom-designed for financial analysis
Modulus Financial Engineering, a Chicago-based provider of libraries and software development kits for trading system research and development since (date), is known for providing financial analysis tools to companies such as E*TRADE, Lycos Finance, Smith Barney and Yahoo Finance. On a never-ending quest for speed, its software engineers may have written a whole new rulebook when it comes to number crunching.
MDS uses a new generation of data storage called the binary flat file. Not to be confused with low-tech comma-delimited ASCII files, or any of the other images that come to mind when the term “flat file” is used, this new data storage framework is specially optimized for the financial analysis industry.
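The actual MDS file format is not described here, so the following is only a generic sketch of the binary flat file idea: fixed-width records sorted by time stamp, located with a plain binary search, with a statistic computed on the fly for the requested time span. The record layout (two 64-bit floats), the function names, and the sample data are all assumptions made for this example, and the search shown is an ordinary binary search, not the MAX II algorithm.

```python
import io
import statistics
import struct

# Assumed record layout: timestamp and price, both little-endian float64.
RECORD = struct.Struct("<dd")

def write_records(stream, rows):
    """Append (timestamp, price) rows; rows must already be time-sorted."""
    for ts, price in rows:
        stream.write(RECORD.pack(ts, price))

def read_ts(stream, index):
    """Read only the timestamp of the record at the given index."""
    stream.seek(index * RECORD.size)
    return RECORD.unpack(stream.read(RECORD.size))[0]

def lower_bound(stream, count, target):
    """Index of the first record whose timestamp is >= target."""
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        if read_ts(stream, mid) < target:
            lo = mid + 1
        else:
            hi = mid
    return lo

def prices_between(stream, start_ts, end_ts):
    """Prices with start_ts <= timestamp < end_ts, read in one slice."""
    count = stream.seek(0, io.SEEK_END) // RECORD.size
    first = lower_bound(stream, count, start_ts)
    last = lower_bound(stream, count, end_ts)
    stream.seek(first * RECORD.size)
    data = stream.read((last - first) * RECORD.size)
    return [price for _, price in RECORD.iter_unpack(data)]

# An in-memory buffer stands in for a file on disk.
buf = io.BytesIO()
write_records(buf, [(float(t), 100.0 + t) for t in range(10)])

window = prices_between(buf, 3.0, 7.0)
window_mean = statistics.mean(window)
print(window, window_mean)
```

Because every record has the same width, the position of record n is simply n times the record size, so a time span can be located in a logarithmic number of reads without consulting any index or parsing any text.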
The heart of MDS is its patent-pending MAX II™ search algorithm, which in-house tests show is capable of scanning through terabytes of data (one terabyte is 1,024 gigabytes) in xx seconds, which is xx times faster than the fastest known RDBMS in existence.
Although the new routines carry the company’s highest security classification, Tom Wong, Senior Software Engineer at Modulus Financial Engineering was willing to say this:
“Our innovative use of Julian date key fields, coupled with our MAX II search algorithm, enables our RMD Server to outperform industry standard RDBMS products like SQL Server 2000 by as much as 1,800% in terms of speed and efficiency.”
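How MDS builds its key fields is not disclosed, but the general appeal of a Julian date key can be sketched as follows. This example assumes "Julian date" means the astronomical Julian Day Number (a running count of days), rather than the day-of-year format that sometimes goes by the same name. The function names and the offset constant name are chosen for this illustration.

```python
from datetime import date

# Offset between Python's proleptic Gregorian ordinal (day 1 is 0001-01-01)
# and the Julian Day Number of the same calendar date.
JDN_OFFSET = 1721425

def julian_day_number(d):
    """Convert a calendar date to an integer Julian Day Number."""
    return d.toordinal() + JDN_OFFSET

def from_julian_day_number(jdn):
    """Convert an integer Julian Day Number back to a calendar date."""
    return date.fromordinal(jdn - JDN_OFFSET)

start = julian_day_number(date(2004, 3, 1))
end = julian_day_number(date(2004, 3, 31))

# Integer keys make date arithmetic and range checks trivial.
print(start, end, end - start)
```

A single integer per day sorts and compares faster than a formatted date string, which is one plausible reason to use it as a key field in a time-ordered store.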
In the financial industry, where fortunes are made and lost in the blink of an eye, and the perception of speed truly IS relative to the perspective of the observer, MDS has raised the bar beyond the reach of any traditional RDBMS available today.
MDS is available as a stand-alone server product that may be integrated with almost any development language, including C++, VB, Delphi, .NET, and Java, via the included Application Programming Interface (API).
For more information contact:
© 2004 Modulus Financial Engineering. All rights reserved. MDS and MAX II are Trademarks of Modulus Financial Engineering. All other trademarks mentioned are the property of their respective owners. Benchmarks and product comparisons were accurate as of the date this paper was written.