Hanalei, Hawaii Sunday, March 14, 2010

Tip: Using Full Text Search With a Table-Valued Function

I’ve been working with the SubSonic forums lately (CommunityServer) as I prep them for retirement (archiving, searchable). One thing I really need to is to be able to run accurate searches based on free text – so over the weekend I’ve been goofing around with Full Text indexing and it hit me that not a lot of people know a little trick you can pull with a Table-valued function.

I’ve been working with the SubSonic forums lately (CommunityServer) as I prep them for retirement (archiving, searchable). One thing I really need to is to be able to run accurate searches based on free text – so over the weekend I’ve been goofing around with Full Text indexing and it hit me that not a lot of people know a little trick you can pull with a Table-valued function.

SQL Functions
Also known as “UDFs”, functions are something that people don’t use too much. It’s understandable in our current climate of ODD – ORM Dependency Disorder (hey I *just* thought of that – kind of like it :) where all DB procedures (SPs or whatever) are generally avoided. I won’t get into that debate – but if you’re free to put a little programming in your DB, I’ll show you how you can get away from dropping business logic in there and raising it into your ORM while still flexxing the power of SQL Server.

There are 2 main function types in SQL Server 2005 that you can write (not counting system functions here):

  • Scalar: returns a single, typed value
  • Table-valued: returns an in-memory table of records

Scalar functions are very, very useful and you can use them to run calculations that you might want to re-use in your application – such as finding the distance between two points given their lat/long.

Table-valued functions are incredibly useful if you want to reuse functionality that returns one or more results – such as a Full Text search.

Hooking Search Up To a Function
In my last post I showed you how to hook up Full Text indexing, which allows you to do some really cool free text searches. Once you get that done, you can work up a Table-value Function that accepts the search term (this example is using the CommunityServer database):

ALTER FUNCTION [dbo].[SearchPosts] 
(    
    @search nvarchar(500)
)
RETURNS TABLE 
AS
RETURN 
(
    SELECT KEY_TBL.[KEY] as PostID,
       KEY_TBL.RANK as Relevance
    FROM 
       FREETEXTTABLE(cs_Posts, Body, 
       @search) AS KEY_TBL    
)

 

Notice that in this code I’m purposely avoiding joining any tables – I’m only exposing what the function gives me: a Primary Key of the searched record (which in this case is the PostID) and a RANK. I’ll work with the returned data in other ways in a moment.

Note: I could use CONTAINS or other method of search here – this part’s up to you. I like FREETEXTTABLE because it works for me

Using Linq To Sql with Table-value Functions
Linq to Sql supports table-valued functions in the same way it supports tables – you can use them in your queries just like you would any other class. To see how this works I’ll work up a Linq to Sql class and pull in CommunityServer’s posts, as well as my groovy function:

tablefx-1

I’ve named my DataContext “DB” inside of the “CS” class just to keep things simple – now I can query my search results (using a Console App here):

static void Main(string[] args) {

    CS.DB db = new CS.DB();

    var results=from s in db.SearchPosts("distinct query")
                select s;

    Console.WriteLine(results.Count());
    Console.ReadLine();
}

 

And I get back a bunch of results – lots of people have written posts in our forums, evidently, on the subject of DISTINCT queries using SubSonic:

2009-04-13_1046

That’s not all that useful – but I can tweak this a bit and pull in some relevant information – as well as restricting the whole thing to the top 20 records:

static void Main(string[] args) {

    CS.DB db = new CS.DB();

    var results = from s in db.SearchPosts("distinct query")
                  join p in db.cs_Posts on s.PostID equals p.PostID
                  orderby s.Relevance descending
                  select new
                  {
                      Author=p.PostAuthor,
                      Date=p.PostDate,
                      Subject=p.Subject,
                      Rank=s.Relevance
                  };
    foreach (var post in results.Take(20)) {
        Console.WriteLine("Author: " + post.Author + "; Title: " + post.Subject);
    }
    Console.ReadLine();
}

 

This works perfectly:

2009-04-13_1050

Using SQL Profiler I can take a look under the hood to make sure Linq to Sql isn’t running away with my database:

exec sp_executesql N'SELECT TOP (20) [t1].[PostAuthor] AS [Author], 
[t1].[PostDate] AS [Date], [t1].[Subject], [t0].[Relevance] AS [Rank]
FROM [dbo].[SearchPosts](@p0) AS [t0]
INNER JOIN [dbo].[cs_Posts] AS [t1] ON [t0].[PostID] = [t1].[PostID]
ORDER BY [t0].[Relevance] DESC',N'@p0 nvarchar(14)',@p0=N'distinct query'

 

Notice how everything is parameterized as it should be? This is good stuff!

Summary
Many people don’t like diddling with StoredProcedures and Functions because, to them, “business logic will bleed in”. To some degree I can see why people think that – and again I won’t debate the point :). I think that this is pretty biz-logic free here, and shows a great way to push the search capabilities into application.


John Kirk - April 13, 2009 - I like diddling.
robconery - April 13, 2009 - Thanks John - as a CPA I think your input is, as always, super. Does anyone else have a friend who dogs them on their blog?
Matt Sherman - April 14, 2009 - Hi Rob, in fact this is the only way I've found to query full-text while also filtering by other columns. I am working on a dating site and need to do a full-text search of "About Me" while also filtering by gender and location. This technique seems to work well and also seems database-efficient. Imagine if you had to first return all of the full-text results and then filter -- yikes! I first learned it here: http://sqlblogcasts.com/blogs/simons/archive/2008...
P.K. - April 14, 2009 - Awesome Rob! Tricks like this are worth their weight in gold. I love seeing how we can use L2S with Sql Server + SqlServer tricks.
cowgaR - April 14, 2009 - intense debate comments sux... they sux badly (maybe not here but in TDD discussion I'm just reading) both in Firefox and a horror called IE. I know the response will be "no, you suck!" but I've need to tell the truth to the world until Kim arrest me!
cowgaR - April 14, 2009 - well, I've just read that Obama used it. I should have been quiet... ;-) (and yes, again, please write your name & email..again)
robconery - April 14, 2009 - Can you tell me why? I moved off Disqus because it was really, really slow in IE - what's the prob here?
cowgaR - April 14, 2009 - first of all, somehow I am receiving notification that you _dare_ to react on my comment ;P and the checkbox Notify me of follow-up is "unchecked" for second, it didn't remember form data the first time I've tried it, now it is somehow ok (my third post). But these are all minor issues. Major one was that I wanted to sort discussion by time in "integration testing" blog post but I also wanted to have them nested. wasn't able to do that. It seems as somehow everyone that was writing comment there started a whole new thread, which can't be the case. Simply there's no tree (so no chance of to whom particular person reacted to). plus I am getting "unable to load media" accessing your blog on every page (maybe some firefox plugin is messing in).
Elcorin - April 14, 2009 - Hello,
I have already seen it somethere...

Have a nice day
Elcorin
P.K. - April 22, 2009 - Hi Rob, just want to raise this blog post, once more. I really like how you are doing this BUT can u create a FILTER that does this. For example. Lets reference the Storefront code truck #17126 (pre-Kona) (http://mvcsamples.codeplex.com/SourceControl/chan... => Commerce.MVC.DataDataAccessSqlServer SqlInventoryRepository.cs public IQueryable<InventoryRecord> GetInventory() { return from i in _db.InventoryRecords select new InventoryRecord { ID = i.InventoryRecordID, DateEntered = i.DateEntered, Increment = i.Increment, Notes = i.Notes, ProductID = i.ProductID }; } (continued on next comment, because this new comment system has a max-length field :( )
P.K. - April 22, 2009 - (continued from comment, above) ... and if we want to get all the Inventory items for a product, we use the Inventory Filters ... public static IQueryable<InventoryRecord> ForProduct(this IQueryable<InventoryRecord> qry, int productID) { return from i in qry where i.ProductID == productID select i; } kewl! Love this heaps! Now, what happens if we wanted all the inventory items for a product which has some type of 'text query' in the inventory notes field. This notes field is Full Text Catalog'd. Can we make a filter, for this?
Tobias Rundbom - April 22, 2009 - @P.K.
I've done something like that.

I let my filter take another IQueryable as parameter like

Search(this IQueryable<InventoryRecord> qry, IQueryable<SearchResult> search_table){
return from i in qry join s in search_table on q.ID = s.ID
}

I then use a "wrapper" class Search

public class Search
{
public IQueryable<InventoryRecord> Inventory{ get; set; }
public IQueryable<SearchResult> Inventory{ get; set; }
}

that I retrieve in the service.

What I wonder though is how you go about to test all this?
robconery - April 22, 2009 - Why couldn't we? It's a table - you can do anything with it that you can do with any other table...
P.K. - April 23, 2009 - Hi Tobias. Thanks so much for the reply :) :) hmm. please bare with me, as I am very blond (literally). I understand the filter code and what you are doing in that. What I don't understand is what is a search result? Is that a custom class that is (something like) .. public class FullTextSearchResult { public int Key { get; set; } // You named this property, as ID in your example, above. public int Relevance { get; set; } } And if that's on the right track ... 1) Where and what is the repository code? 2) What does some code look like that uses this filter? I was playing around with the following idea, which i really really really hate. I was hoping it would lead me to a proper solution (aka. baby steps). (continued on next comment ....)
P.K. - April 23, 2009 - SqlInventoryRepository.cs // ### This method currently exists, created by RobC. public IQueryable<InventoryRecord> GetInventory() { return from i in _db.InventoryRecords select new InventoryRecord { ID = i.InventoryRecordID, DateEntered = i.DateEntered, Increment = i.Increment, Notes = i.Notes, ProductID = i.ProductID }; } ### This is my evil method :( public IQueryable<InventoryRecord> GetInventory(string notes) { return from i in _db.InventoryRecords join s in _db.InventoryRecordsFTSOnNotes(notes) on i.ID = s.Key select new InventoryRecord { ID = i.InventoryRecordID, DateEntered = i.DateEntered, Increment = i.Increment, Notes = i.Notes, ProductID = i.ProductID }; } i really don't like this. (..continued again...)
P.K. - April 23, 2009 - (and now the final post..) I'm wondering if it should be... public IQueryable<FullTextSearchResult> GetInventory(string notes) { return from i in _db.InventoryRecordsFTSOnNotes(notes) select new FullTextSearchResult { i.Key, Relevance }; } and throw that into the SqlInvetoryRepository.cs file. Hmmm. not sure. I'm so worried :( thoughts guys?
robconery - April 23, 2009 - Hey can you fix that icon of yours please :):):)
P.K. - April 23, 2009 - fixed. appologies (i've changed my gravatar 'g' rating icon). no harm ment :) (not sure if there's any lag between updating and seeing it on this site).
robconery - April 23, 2009 - Thanks! I really do appreciate it :)
tobiasrundbom - April 23, 2009 - @P.K.
Well, in the repository I do something like,

public Search SearchInventory(string phrase)
{
var ctx = new DB();
Search result = new Search();
search.Inventory = from i in ctx.Inventories select new Inventory {...}
search.SearchResult = from s in ctx.SearchInventories(phrase) select new Inventory {...}
return result;
}

and then the service uses something like the follwing:
Search result = _repository.SearchInventory('foobar');
return result.Inventory.Search(result.SearchResult).ToList();

I'm not sure this is optimal though, it feels a bit like things should be more separated from each other.

This does make it possible to make your own search method in the testrepository and run unit tests. However, that search method doesn't employ the same logic as the search in the database (that's impossible) so my question is if any one has any better ideas on how to do this or how to best unit test this?
Stoontarrot - June 9, 2009 - почитаем и посмотрим
Tip Using Full Text Search With a Table Valued Function Rob Conery | Patio Chairs - June 12, 2009 - [...] Tip Using Full Text Search With a Table Valued Function Rob Conery Posted by root 9 days ago (http://blog.wekeroad.com) Searchposts quot distinct query quot select s console writeline results continued on next comment because this new comment system has a max length field p k says rob conery is powered by wordpress under creative commons Discuss  |  Bury |  News | Tip Using Full Text Search With a Table Valued Function Rob Conery [...]
Tip Using Full Text Search With a Table Valued Function Rob Conery | debt settlement program - June 15, 2009 - [...] Tip Using Full Text Search With a Table Valued Function Rob Conery Posted by root 23 minutes ago (http://blog.wekeroad.com) Linq to sql supports table valued functions in the same way it supports tables you exec sp executesql n 39 select top 20 t1 postauthor as author t1 continued on next comment because this new comment system has a max length field rob conery is powered by w Discuss  |  Bury |  News | Tip Using Full Text Search With a Table Valued Function Rob Conery [...]
Gecko