in
Eric Vaillancourt on SQL Server...

MS SQL Admin

March 2008 - Posts

  • Avoid using local variables in your Select statements

    People are always surprised when I tell that their queries will run slower when they use local variables in their queries.  Let me explain: If you use a local variable in a query predicate instead of a parameter or literal, the optimizer resorts to a guesstimate for selectivity of the predicate. Use parameters or literals in the query instead of local variables, and the optimizer will do a better selecting a query plan. For example, execute both of these queries.  The first one uses a local variable and the second one a literal:

    use adventureworks

     

    --first one with a local varaiable

    declare @StartOrderDate datetime

    set @StartOrderDate = '20040731'

     

    select * from Sales.SalesOrderHeader hdr, Sales.SalesOrderDetail dtl

    WHERE hdr.SalesOrderID = dtl.SalesOrderId

    AND hdr.OrderDate >= @StartOrderDate

     

    --secondeone with a literal

    SELECT * FROM Sales.SalesOrderHeader hdr, Sales.SalesOrderDetail dtl

    WHERE hdr.SalesOrderID = dtl.SalesOrderId

    AND hdr.OrderDate >= '20040731'

    The number of rows from Sales.SalesOrderHeader that the optimizer estimates will qualify vs. the condition hdr.OrderDate >= @StartOrderDate is 9439.5, which is exactly 30% of the size of the table. (the observations that follow are based on my SQL Server 2005 version; your results may differ depending on your SQL Server version, available memory, etc.).

    The cardinality of the result set for the predicate "hdr.OrderDate >= '20040731'" is estimated as 40 for a selectivity of 0.13%. The plan chosen for this query uses a nested loop join instead of a merge join because of this improved estimate.

    image

    To force SQL Server to optimize a query every time it is run, and use the values of local variables to estimate cardinality and cost during optimization of the query, add the RECOMPILE query hint to the query. For example, modify the first of the two preceding example queries shown to:

    SELECT * FROM Sales.SalesOrderHeader hdr, Sales.SalesOrderDetail dtl

    WHERE hdr.SalesOrderID = dtl.SalesOrderId

    AND hdr.OrderDate >= @StartOrderDate

    OPTION(RECOMPILE)

     

    This eliminates the problem of bad estimates due to local variables, at the cost of recompiling the query every time you run it.  Compilation overhead can be quite high, especially for frequently-run, small queries. To avoid this overhead, further modify your application to use parameterized batches or stored procedures instead of dynamic SQL.

     

    To eliminate the use of local variables, consider :

    1. Rewriting the query to use literals directly in queries instead of assigning them to variables and then using the variables in the queries;
    2. Using sp_executesql or API server cursors with parameters that replace your use of local variables;
    3. Using a stored procedure with parameters that replace your use of local variables.

    Hope this helps,

     

    Eric Vaillancourt

     

     

     

    Reference : http://www.microsoft.com/technet/prodtechnol/sql/2005/qrystats.mspx

  • Preventing Microsoft Excel or any other application from connecting to SQL Server 2005

    Today, I was asked a question on how to prevent Microsoft Excel from connecting to SQL Server.  At first I wasn’t sure how to respond to this, but I finally found a way.  It is a simple trick that uses the FOR LOGON option now available since the SP2 release of SQL Server 2005.

     

     

    USE master

    GO

     

    CREATE TRIGGER Prevent_Apps_logon

    ON ALL SERVER FOR LOGON

    AS

    BEGIN

          IF APP_NAME() LIKE '%excel%' -- or any other app name!

          ROLLBACK

    END

     

    Hope this helps,

     

    Eric

     

     

     

  • Auditing Logins in SQL Server 2005

    When I teach my SQL administration classes, I'm often ask if SQL Server provides a mechanism to log user logins.  Before SQL Server 2005 SP2, this task was possible but not reliable.  In SQL Server 2005 this task is easy to implement using DDL triggers (Data Definition Language).

    One reason that people ask me this question, is because they need to comply with Sarbanes-Oxley.  I will be writing a series of articles on auditing for SOX in the next weeks.

    As a consultant, I am frequently invited to attend meetings with Sarbanes-Oxley auditors to discuss the security and integrity of corporate data.  they want to know who has access to the data, how access is granted, and how is the company monitoring to prevent someone from sneaking in, logging on, and doing something they shouldn't be doing.

    In SQL 2005, a trigger can be created in order to perform any action upon a DDL statement. The scope can be database level or Server level. DDL triggers allow us to track critical changes to our database environment that may be intentional, by mistake, or malicious.

    When you design DDL triggers, it's important to determine what events you want to audit, and to determine in which scope each event occurs. In this article, we will write a trigger to capture logins, which are server level events. You can get a list of all events that are valid for use in DDL triggers.

    The first step is to create a database and a table that we will use to store the logs.

    Execute the following code:

     

    use master

    GO

     

    create database Admin_Log

    GO

     

    use Admin_Log

    GO

     

    create schema [admin]

    GO

     

    CREATE TABLE admin.logs (

    LogID int IDENTITY(1,1),

    EventTime DATETIME,

    EventType VARCHAR(100),

    LoginName VARCHAR(100),

    HostName VARCHAR(100),

    AppName VARCHAR(255),

    Event_Data XML)

    GO

    Note that the table makes use of the XML data type, which is new in SQL Server 2005. As you'd assume by its name, its job is to hold XML data.

    Once your table is in place to keep track of the events, you're ready to create the trigger necessary to watch for these event.

     

    Execute the following code:

     

    use master

    GO

     

    create trigger ServerWideLoginLogs

    on all server

    with execute as self

    for LOGON

    as begin

    DECLARE @event XML

    SET @event = eventdata()

    INSERT INTO Admin_Log.admin.logs (EventTime,EventType,LoginName,HostName,AppName,Event_Data)

    VALUES(CAST(CAST(@event.query('/EVENT_INSTANCE/PostTime/text()') AS VARCHAR(64)) AS DATETIME),

           CAST(@event.query('/EVENT_INSTANCE/EventType/text()') AS VARCHAR(100)),

           CAST(@event.query('/EVENT_INSTANCE/LoginName/text()') AS VARCHAR(100)),

           CAST(@event.query('/EVENT_INSTANCE/ClientHost/text()') AS VARCHAR(100)),

           APP_NAME(),

           @event)

    end

    GO

     

    This script creates the trigger that declares that we want to look for all DDL Login events that occur on the server. The information resulting when this trigger fires will go into the admin.logs table that we created earlier. Notice that we casting information from our @event variable (that we initialized using the eventdata() function) to insert data into our table.

    In this context, the eventdata() function will gather information about the triggered event (LOGON) because that's the event specified in the statement FOR.

    To test our new trigger, try connecting to the server and execute the following code:

    select * from admin.logs

    image 

    After running this statement, you will see that a record is inserted every time someone connects to the server.

    What's in your XML?
    The Event_Data field is populated by the EVENTDATA() function from our INSERT statement in our trigger. In this case, the eventdata() function returns XML data related to the connection, including the date and time of the login, the server name, the user ID, and the machine system ID.

    image

    Hope this was helpful,

     

    Eric Vaillancourt

     

  • Understanding SET STATISTICS IO and SET STATISTICS TIME

    Today, I want to talk about two commands that are often overlooked.  SET STATISTICS IO and SET STATISTICS TIME these two commands are very useful when comes time to tune a query.

    Tuning a query seems simple enough. In essence, we want our searches to run faster.  When I teach my class on “Optimizing SQL Server” I realize that many people find it difficult to accomplish this task.  While there are many reasons why query tuning is difficult, this article will concentrate on one aspect.  And that is that query tuning takes place in an environment that is often changing from second to second, making it hard to really know what exactly is going on.

    When you run a query, different server resources are used. One of those resources is the CPU time. If you run the same query over and over again, the CPU time needed to run the query will remain very close from one execution to other. I am not talking about the total time it takes to run query from the start to the end, but the amount of CPU time needed to run the query.

    Another resource that the Server needs is IO. Whenever you run a query, the server must retrieve data from memory (logical reads) or from the disk (physical reads).  Using high CPU and IO resources will result in slow query performance. 

    So your goal should be to rewrite your queries in such a way that they use less CPU and IO resources. If you accomplish this, then the performance of your query will improve.

    Both the SET STATISTICS IO and the SET STATISTICS TIME Transact-SQL commands have been around a long time. But for some reason, they are often ignored.

    Both commands can be “turned on” using Transact-SQL commands or by setting the appropriate connection properties in Options dialog from the tools menu in Management Studio.

    clip_image002

    By default, these settings are not turned on.  The SET STATISTICS IO and the SET STATISTICS TIME commands act like settings, turning on and off various reported information on the resources our queries use.

    So let’s look at an example of how use these commands.

    To begin our example, start Management Studio and connect to a SQL Server. For this example, we will use the sample database provided by with SQL Server 2005 and that is AdventureWorks.  So start by making AdventureWorks your default database.

    Next, run this query:

    USE AdventureWorks;

    GO      

    SELECT * FROM Person.Contact;

    GO

    Normally this query should return 19972 rows.  Now, let’s run the same query, but this time, let's run the SET STATISTICS IO and the SET STATISTICS TIME commands before we run the query.  Before we begin our example, run these two commands DBCC DROPCLEANBUFFERS and DBCC FREEPROCCACHE (don’t run these on a production server) first. These two commands will clear out SQL Server’s data and procedure cache.

     

    DBCC DROPCLEANBUFFERS
    DBCC FREEPROCCACHE

     

    Now, enter and run the following Transact-SQL commands:

    SET STATISTICS IO ON
    SET STATISTICS TIME ON

     

    Once you have done this, then run this command again:

    SELECT * FROM Person.Contact;

    GO

     

    After you run these command, you will see new information in the message tab that you did not see before.

     

    clip_image004

     

    (The results you get may vary, but we will talk about this below as we explain everything you see above.)

    Let me explain what it all means:

     

    SET STATISTICS TIME Results:

     

    “SQL Server parse and compile time:” tells us how much CPU time and total time it took SQL Server to parse and compile our T-SQL statement.  In my case, it took 0 ms. of CPU time, and a total time of 359 ms. of elapsed time. Your CPU time will probably vary because your server is different than mine.

     

    If you re-execute the query, again without clearing the cache (as recommended above), most likely both instances of CPU and compile time will be 0, as the cached query plan will be reused, so no new compile time is required.

     

    This is not too important, what is of interest to us is the time information displayed at the bottom of the query’s output. Here is it is again:

     

    SQL Server Execution Times:

       CPU time = 125 ms,  elapsed time = 1112 ms.

     

    What this tells us is how much CPU time was used to execute the query, and how long the query took to run.  The CPU time is a somewhat consistent measurement of the amount of CPU resources it takes for your query to run, and this is relatively independent of how busy your CPU is.

     

    The elapsed time number is a measurement of how long the query took to execute (not counting the time for locks or reads). This number will vary a lot because depending on the load on your server.

     

    Because the CPU time is relatively consistent, this is what you should use to determine whether the changes you make in your queries, are helping or making things worse.

     

    SET STATISTICS IO Results:

     

    Table 'Contact'. Scan count 1, logical reads 561, physical reads 1, read-ahead reads 559, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    This is not all useful; let me try to explain what it means.

     

    Output item

    Meaning

    Table

    Name of the table, in our case ‘Contact’

    Scan count

    Refers to the number of times that tables referenced in the query have been accessed. In our example, the table in our query was only accessed 1 time. For queries that don’t include joins, this information is not too helpful. But if your query has one or more joins, then this information may be of use to you.  You may want to pay attention to the Scan Count, watching to see if it is increasing or decreasing as you tune the query.

    logical reads

    Number of pages read from the data cache.  This is the most useful piece of data provided by the SET STATISTICS IO commands. As you probably know, before SQL Server can do anything with data, that data must be in its data cache. In addition, when SQL Server reads data from the data cache, it reads it in 8K pages.

     

    A logical read is the number of pages SQL Server had to read from the data cache in order to produce the results specified by our query. The exact same number of logical reads will always occur when the same query runs against the exact same data on SQL Server.

     

    Knowing the number of logical reads that SQL Server takes to execute a query is important because it is the one thing that never changes from one execution to another giving us the perfect reference to use when we are tuning a query to determine if you we are succeeding or failing.

     

    The fewer the logical reads performed by a query, the more efficient it is, and the faster it will perform, assuming all other things are held equal.

    physical reads

    Number of pages read from disk.  This information is not too useful for query tuning. 

     

    First of all, a physical read refers to when SQL Server must go to disk to get data pages it needs to put into the data cache before it can execute the query.

     

    As mentioned above, all data must be in the data cache before SQL Server can execute a query. Whenever SQL Server begins to execute a query, the first thing it does is to check and see if the data pages it needs are in the data cache.

    read-ahead reads

    Number of pages placed into the cache for the query.  Like physical reads, this information is not too useful for query tuning.

     

    This value tells you the number of physical page reads that SQL Server performed as part of its read-ahead mechanism. To help optimize its performance, SQL Server reads physical data pages ahead of when it thinks your queries might need the data.

    lob logical reads

    Number of text, ntext, image, or large value type (varchar(max), nvarchar(max), varbinary(max)) pages read from the data cache.

    lob physical reads

    Number of text, ntext, image or large value type pages read from disk.

    lob read-ahead reads

    Number of text, ntext, image or large value type pages placed into the cache for the query.

     

    Now for a real example:

     

    To be honest with you, when you execute a query without a WHERE clause there’s not much you can do to tune a query because the server performs a scan on the table.  Let’s take another example:

     

    Execute the following code:

    USE AdventureWorks;

    GO      

    SELECT * FROM Person.Contact WHERE firstname='Carla';

    GO

     

    Now look at the Message tab:

     

    clip_image006

     

    Even though we have filtered our query and the elapsed time is down (data is cached), the logical reads are still at 561.  This means that reading 25 rows is costing us the same as in the previous example that was returning 19972 rows. 

     

    One thing we could do to reduce the number of logical reads is to create an index on firstname (I will cover the creation of index in another article).

     

    You can create an index directly in Management Studio by selecting the table in the object explorer and right clicking on the Indexes folder:

     

    clip_image008

     

    A new index dialog will appear:

     

    clip_image010

     

    Or you can execute the following code:

     

    USE [AdventureWorks]

    GO

    CREATE NONCLUSTERED INDEX [IDX_firstname] ON [Person].[Contact]

    (

           [FirstName] ASC

    )

    GO

     

    Either way, you will achieve the same result.  Now re-execute the query and look at the message tab:

     

    clip_image012

     

    Now that is quite an improvement…

    Hope this was useful, feel free to post any comments.

     

    Eric

     

     

  • Scripting Multiple Objects

    The other day I found a way to script multiple objects using the "Object Explorer Detail" tab in Management Studio. To be honest, I never thought that this tab was useful.

    All you have to do is select the type object you want to script in the object explorer and do a multiple selection in the Object Explorer Detail window.

    Look at the example below:

    image

    Hope this helps,

     

    Eric

©2008 SQLProf.com & Eric Vaillancourt
Powered by Community Server (Non-Commercial Edition), by Telligent Systems