Saturday, 8 October 2016

Understanding why export from Essbase to relational database can be slow



This morning while reading the Hyperion EPM Week in Review (http://www.hyperionepm.com/hyperion-epm-week-review-oct-7-2016/ ), I came across this comment from Brian where he mentioned that he has seen very slow performance when exporting data directly from Essbase to a relational source as compared to a file. To tell the truth, even I have seen this issue. So let us see if I can explain why the export to a relational source is slower as compared to export to a flat file or a binary file. 
The below diagram shows the high level architecture of Essbase along with the associated data flow pathways.


Observe that in the bottom right hand corner you have Essbase applications capable of interacting directly with the RDBMS server using the ODBC/JDBC drivers. This is what is used when we push data from Essbase to a relational source. 

There are two main reasons for the delay: -

  • Networked environments.
  • Relational systems implementing transaction control.
The above diagram is quite simplistic because it is missing the glue which holds the entire applications together. This is because that glue is something which is shown at a high level of abstraction like HTTP, TCP/IP and drivers. This glue is what we call networking. And unfortunately networking is not foolproof. While the above diagram shows up the high level protocols, what it misses is that underlying these protocols are physical devices likes switches, routers and adapters. And these devices are not foolproof. Packets can be scrambled, lost or come out of order in the network. Any network on an average will suffer packet loss. A network can in the long run, on an average, suffer packet loss between 0% to 4%. Any number greater than this and you would be seeing network issues. This means that while it took say 100ms for packets containing say 100 rows to come without any possible loss, it is quite possible that it takes 4ms to fix the mismatched packets and so we now have a total time of 104ms. And for computer systems these extra milliseconds are equivalent to eternity. This is one reason why when I push data from Essbase to a relational system, I am using a medium that is error prone and hence I can see slight performance degradation since my networking protocols are doing the error check and handling.

The second reason is actually a bit more subtle. Remember that you are pushing data to a relational system. And one of the main characteristics of relational systems is that they implement something called as ACID properties. This is mainly related to transaction processing. The ACID properties are as follows: -

  • Atomicity: - Each transaction is all or nothing.
  • Consistency: - Ensures a transaction brings a database from one valid state to another.
  • Isolation: - Ensures that concurrent execution of transactions result in a system state that would be obtained if the transactions were executed serially.
  • Durability: - Ensures that once a transaction is committed, it will remain so even if we suffer power loss, crashes or errors.
The main reason for having the ACID properties is so that when I push data from say Essbase to the relational system, the properties ensure that there is a sanity check. All data goes in. The contents are not inconsistent and suppose I shut down the database and restart it, the records still very much exist. And in order to achieve this, the relational system maintains logs that record the transaction. Each recording of this takes some time and so although it might take say 100ms to push data into a flat file, when you push it into a relational system the transaction logging will take some time and you will observe that it takes slightly more than 100ms when you push data to the relational system.

These are the two of the many reasons why pushing data into a relational system from Essbase can be slightly slow at times. With exporting to files, the associated overheads of network and database is bypassed and you get very good export speed.

(However, there are ways to slow export to file as well…This is what you have to do… Fill up your hard disk with data so that your free space on disk is approximately equal to the data export file size plus an odd 10 or 20 megabytes. Now run the export. You will find that export will slow down. This is because of something called as disk fragmentation. If you are a Windows user, you would have observed copying 10000 1KB file takes more time than copying 1 10000KB file. This is because of the hard disk trying to reduce disk fragmentation)

No comments:

Post a Comment