Lies and out-dated statistics

The saying goes that you can prove anything with statistics – and I am sure the same figures are being used by the McCain, Obama and Clinton camps to tell vastly different stories.  But enough about politics – what has this to do with Project Server 2007?  SQL Server uses statistics too – and if they are not up to date it might make the wrong decisions!

As an example if you were trying to find me in a database and there were indexes on my department, surname and date of birth, and you knew may name, birth month (but not the actual day or year) and my department what is the best way to find me?  With a name like Smith, you probably wouldn’t want to start there.  So perhaps you would find all the people in Project Support, then look for any with surname Smith, then trim down to the ones born in the right month and finally see if any (or hopefully the only one) is called Brian.  If you were looking for Chris Boyd, or Christophe Fiessinger then you might start your filtering on their names first – as there aren’t as many Boyds or Fiessingers as there are Smiths. What you are doing is building an execution plan, and usually the most efficient way is to trim down the number of results at the earliest point in the plan.  SQL Server does this by keeping statistical data about the contents of data tables and indexes, and builds plans according to this data.  If the data is not accurate it can lead to execution plans that are not optimal.

Maintenance plans in SQL Server give you the ability to schedule jobs to reorganize your indexes to ensure they are efficiently stored both for reading and writing – and also update your statistics to ensure SQL will make the best possible decisions when deciding a query plan.  For SharePoint databases you should avoid using the default rebuild index option of the maintenance plan wizard as it does not re-apply existing index options – but you can script your own tasks to do this.

There are some great resources online for maintenance of your databases – I’d suggest the recent Whitepaper titled Database Maintenance for Microsoft® SharePoint® Products and Technologies and also for general background reading the SQL Books Online, Database Engine Administrator InfoCenter is an excellent resource. 

I will be following up in the next few weeks with blogs specifically about two areas where a good set of up to date statistics can improve the performance in Project Server 2007.

  • Building Data Analysis cubes – outdated statistics are one of the causes of recent problems seen with slow cube building and heavy tempdb usage.
  • Project Center Views – the error that can appear after around 30 seconds if you have many projects, and also large project that you drill into.