Thursday, July 26, 2012

A look at why using Multi-value tables are not always the best approach

I recently had a case at a client where they were unable to get a full import of their user information from Oracle into FIM 2010 with the full import failing after 36 hours without importing all the necessary information. The institution has a 250000+ user base and complex infrastructure.  The Oracle MA utilized a multi-value table for allocation of some RBAC components within their AD.

The Oracle MA had the following tables configured:

1.       Main User information table (250000+ items, 18 fields, 790mb)

2.       Multi-value table (250000+ items, 4 fields,  690mb)

3.       Delta table

I followed the usual troubleshooting steps ensuring that:

1.       The Oracle tables were correctly built and indexed.

2.       The Oracle client version was correct.

3.       The FIM Server was installed, configured and patched correctly.

4.       All possible bottlenecks were identified and eliminated between servers in question.

With all these resolved, we did not get a significant increase in speed, so I did a test of importing the user information excluding the multi-value table from the configuration and the import completed in less than 40 minutes. It was in fact the multi-value table causing the import performance degradation. Although it was correctly indexed, contained only 4 fields, it was almost the same size as the main user table.

We had three options:

1.       Reduce the complexity of the data in the main and multi-value tables (was not possible in this instance).

2.       Consolidate the data into a single view and pass the multi-value processing to FIM rules extension.

3.       Write a custom management agent to incorporate multi-values directly (Due to the work involved in achieving this it was not really an option).

I then asked the client to consolidate the data into a single view writing the multi-values delimited to a single value field. I was then able to still do the import in less than 40 minutes and use a rules extension to extract the multi-values from the single value table to an array which I then flowed to the relevant multi-value attributes in the metaverse. The total process of full import and sync then completed in less than 3 hours.

I am a big advocate of using multi-value tables where appropriate, but there certain scenarios where using multi-value tables are just not feasible and other approaches may be needed.

When evaluating whether or not to use a multi-value table it is important remember that the size and data complexity if the table will affect performance. My own rule of thumb is once the multi-value table exceeds 33-50% (based on complexity) of size of the main table there will be a performance impact on the import of data into the connector space.

5 comments:

  1. Jacques ... you make a good point, but only in the context of when you're forced into using the Oracle MA, which I presume uses a similarly poorly performed delta view approach to the SQL MA. An ECMA will not have this issue at all ... provided you perform in-memory dataset/hashtable joins to handle the multi-value relationships. I had a similar experience with a 24 hour SQL MA full import of just 100K records (student-class-school relationships), which imported in a few minutes when performed with an ECMA. Essentially you need something to convert an RDBMS into an LDIF format ... which is my preferred approach for all but the most rudimentary of RDBMS designs.

    ReplyDelete
  2. Hi Bob,
    Thanks for the reply. A custom MA would have been my first choice but was not an option in this scenario and this merely presents some of the real world challenges around using multi-value tables and also gives another alternative if a custom MA is out of the question.

    ReplyDelete
  3. I'm with Bob on this one... the built-in SQL and Oracle Management Agents are known to have pretty poor performance due to the way in which the SQL queries are executed. Particularly when it comes to complex views or multi-value fields. When having to integrate with SQL, I try to avoid using the built-in MAs wherever possible - having your own ECMA as part of your arsenal probably isn't a bad idea.

    From what I read in your article, it seemed your main reason for going this route was only the work/effort involved to create such an xMA? Wouldn't be an issue if you had one now that you could re-use. You also wouldn't need to modify the underlying data source's schema, which may be a moot point if you needed to have a multi-value table/view created in the first place, but can be quite important to some clients where the DBA's don't want modifications to their application database.

    There's a MS test-lab that demonstrates how to write a SQL xMA in ECMA2 at http://download.microsoft.com/download/1/8/D/18DDD470-7DA8-4D04-8ABB-1F6C184E6C35/FIM_TLG_ECMA2_CALLv1.docx

    I don't imagine it would be hard to modify in order to provide multi-value capability.

    ReplyDelete
  4. Hi Ross,
    Thank you for the input.As stated, I had to use the Oracle MA due to supportability reasons and I had to look at an alternative solution which would work on the built in MA

    ReplyDelete
  5. Hi Ross,
    Thank you for the input.As stated, I had to use the Oracle MA due to supportability reasons and I had to look at an alternative solution which would work on the built in MA

    ReplyDelete