Thursday, July 26, 2012

A look at why using Multi-value tables are not always the best approach

I recently had a case at a client where they were unable to get a full import of their user information from Oracle into FIM 2010 with the full import failing after 36 hours without importing all the necessary information. The institution has a 250000+ user base and complex infrastructure.  The Oracle MA utilized a multi-value table for allocation of some RBAC components within their AD.

The Oracle MA had the following tables configured:

1.       Main User information table (250000+ items, 18 fields, 790mb)

2.       Multi-value table (250000+ items, 4 fields,  690mb)

3.       Delta table

I followed the usual troubleshooting steps ensuring that:

1.       The Oracle tables were correctly built and indexed.

2.       The Oracle client version was correct.

3.       The FIM Server was installed, configured and patched correctly.

4.       All possible bottlenecks were identified and eliminated between servers in question.

With all these resolved, we did not get a significant increase in speed, so I did a test of importing the user information excluding the multi-value table from the configuration and the import completed in less than 40 minutes. It was in fact the multi-value table causing the import performance degradation. Although it was correctly indexed, contained only 4 fields, it was almost the same size as the main user table.

We had three options:

1.       Reduce the complexity of the data in the main and multi-value tables (was not possible in this instance).

2.       Consolidate the data into a single view and pass the multi-value processing to FIM rules extension.

3.       Write a custom management agent to incorporate multi-values directly (Due to the work involved in achieving this it was not really an option).

I then asked the client to consolidate the data into a single view writing the multi-values delimited to a single value field. I was then able to still do the import in less than 40 minutes and use a rules extension to extract the multi-values from the single value table to an array which I then flowed to the relevant multi-value attributes in the metaverse. The total process of full import and sync then completed in less than 3 hours.

I am a big advocate of using multi-value tables where appropriate, but there certain scenarios where using multi-value tables are just not feasible and other approaches may be needed.

When evaluating whether or not to use a multi-value table it is important remember that the size and data complexity if the table will affect performance. My own rule of thumb is once the multi-value table exceeds 33-50% (based on complexity) of size of the main table there will be a performance impact on the import of data into the connector space.