Friday, March 23, 2012

Performance problems with Nested SCOPE statements

I'm hoping someone has run across this particular issue and/or has identified some acceptable alternatives:

In using some scope statements to alter the behavior for certain measures, I've determined that when scope statements are nested, they will impact performance dramtically. For example

Scope [Measures].[myMeasure];
Scope Leaves( [Periods] ); -- Approx 100 members
Scope Leaves( [Markets] ); -- Approx 1500 members
This = { some simple calculation }
End scope;
End scope;
End scope;

NB. A 4th dimension, Products has about 450,000 members

At connect time, this statement is re-evaluated. The result is a long connect time. If the scope statement is removed for either Periods of Markets, the connection is immeadiate, but of course the results are wrong. Order of the scope statements doesn't affect the outcome (the [Measure] scope must be first).

I understand the difference in scale (100 or 1500 vs 150,000) but so far, the results seem to be out of line with reasonable expectations. (< 1 second vs ~65 seconds).

NB. The techqiue, causes the calculation to be evaluated at the product level, then SUM'ed over the Periods and Markets. All levels of Product aggregation use / evaluate the same calculation.

Thoughts & suggestions?

It probably depends on what is in your {some simple calculation}. There may be an alternative approach which would be faster, but it's impossible to say without more details. I don't think its the nesting of scopes that is the issue, it's more related to the size of the sub cube over which you are performing the assignment.

The same thing could be expressed as a single scope statement, but I would not expect this to change the performance at all.

Scope (Leaves([Periods]) * Leaves([Markets]));

([Measures].[myMeasure]) = "some simple calculation";

End Scope;

|||

Thanks for your feedback.

First, you're right that changing the "form" of the scope statements doesn't change the performance characteritics. (Already tried that). Indeed, the problem seems to the the size / scope of the sub cube being accessed.

The performance characteritics is indeed linked to the complexity of the calculation. For example, if I use this = 1; (the simplest possible expression), the connect returns immeadiately.

In this case, my target expression is:

this = Iif( [myMeasure], 1, null)

Essentually, this "resets" the expression to 1, for every level of the product attributes / hierarchies, then aggregates the result within stores and periods.

My basic problem is how to stop this calculation from occuring "during the connect" and instead have it invoked / calculated when the measure is actually used (when I'm willing to take the performance hit). It's easy to see / test, since in VS, I simply browse the cube and reconnect and watch Task Manager on the server. (It jumps to 100% for 60 seconds or so).

NB. The server in question is a x64, 4 way, dual core Opteron (2.6 Ghz) with 20 Gig of memory, running Enterprise edition (SP2). Everything is running 64 bit code.

|||

Found the answer, so I thought I'd post it for other people who might trip across this issue:

The problem actually exists in the front end "tool", not in Analysis Services. In this case, I was testing / debugging with Visual Studio, making small modifications to the calculation, then quickly browsing.

Turns out VS issues the following statement shortly after connecting:

select filter([Measures].members, [Measures].currentmember.member_caption = [Measures].currentmember.member_caption) on columns from [myCube]

This statement causes all of the facts to be referenced / resolved, including the measure in question. Therefore, a full bottom up calculation is performed. The query only ends after 60 seconds, because VS uses a 60 second timeout. (No wonder all of my tests came out the same!)

I can only speculate, that the reason VS does this, is to "warm" the cache, to make the subsequent queries faster. Results (and error messages) are discarded.

NB. For the record, this particular expression isn't intended to be used at the "higher levels". But, since I've isolated the key performance issue, at least alternatives can no be developed.

|||Glad you found the issue. I tend to test more complicated stuff with queries in SSMS, rather than using the browser, so I have not hit this issue myself. The OWC browser in VS does not really have a reputation for sending the "best" MDX.

No comments:

Post a Comment