Thursday, April 19, 2007

Singleton DMX

If you've used DMX to generate predictions from a data mining model you will be familiar with the singleton query. This is where you obtain a prediction for a particular set of input attributes; either within a case table or in a nested table. There was an example of one of these in a previous post.

Now the form of that query was:

select prediction
from data mining model
prediction join
(select (select something as nested row 
          union select somethingelse as nested row etc) as nestedtable)
as t

Now, what I wanted to do was using openquery() to return all the nested rows, but I did not have an explicit case. How do you make it work?

After bugging the team on the www.sqlserverdatamining.com web site in the newsgroup section the answer was obvious. Just go ahead and create a dummy case. A sample query that works for one of my models follows.

 

SELECT
    Predict([Model_NaiveBayes].[Bucket]),
    PredictProbability([Model_NaiveBayes].[Bucket])
From [Model_NaiveBayes]
NATURAL PREDICTION JOIN
SHAPE { OPENQUERY(Test, 'SELECT 1 as CaseKey') }
APPEND (
    { OPENQUERY(Test,'
    select 1 as ForeignKey, Term
    from Terms
    cross apply Matches('some long discourse containing many terms that I want to characterise'',''\b('' + Term + '')\b'')')
    } RELATE [CaseKey] TO [ForeignKey]
) AS [Msg Term Vectors]
AS T

In this case it uses the Matches TVF which I describe in a previous post to identify the terms in the text.

No comments: