Friday, March 30, 2012

How to trouble shoot a deadlock problem

I have a job that took 1 hour and 55 mintes to run, which is causing deadlock situatin with another database. How would I go about troubleshoot this problem.
Thanks
LystrWell, you could run Enterprise Manager, go to Manage->Current activity and look at process info to find the locking process. Look at the locks / process id / object.

Try running your queries with the (nolock) hint or use temp tables.|||The message I am getting is:

The VB Application identified by the event source logged this Application ANSDAC: Thread ID: 2768, Logged: Microsoft OLE DB Provider for ODBC Drivers Session.Save() -2147467259 [Microsoft][ODBC SQL Server Driver][SQL Server]Transaction (Process ID 66) was deadlocked on lock resources with another process and has been chosed as the deadlock victim. Rerun the transaction.

Would I need to rerun the process ID 66?|||Yes, whatever query/statement being run by that process was aborted due to deadlocking. It must be re-run.

Try running your job on non-peak/business hours, if possible. What job are you running, btw?|||Then name of the job is call Archive IIS Log which consists of 4 steps
1-Update department-run sp_Update_Log
2-Copy to Archive Database-run sp_ArchiveLog
3-Clean up Log-run sp_cleaningLog
4-Free log space-Backup Log IISLog with No_Log

It runs at 9am everyday which is a busy time. The developer feel that the solution should be to move the database to another server. The Job did finish it just took 1 and 55 minutes, but no one can tell me how long do the job normally take.

So I should run each stored procedure.

Also, when I look at process ID 66
this is what I see"

SELECT MonthEnd."Accounting_date", Manual_Adjustment."Accounting date", Manual_Adjustment."Adjustment_amount", Manual_Adjustment."CheckNumber" FROM { oj "MonthEnd" MonthEnd INNER JOIN "ManualAdjustment" Manual_Adjustment ON MonthEnd."Accounting_date" = Ma

However this is not the database that they are complaining about. Is it possible for 66 to be connected with more than 1 database?

Thanks

Lystra|||use Trace Flags. Check out in Books Online about them else search on google key words [TRACE FLAG, DEADLOCK].|||Yes, process 66 can change the database it is running against at any time.

-When it is blocking, look at the blocking process (the job process) and see what database and query/statement it is running.

-Also look at running it at off peak hours if at all possible. If it is something that updates a bunch of records in a single table you are bound to cause locking. Espcially if this thing runs that long.

-On the database they are complaining about - do the job scripts have a lot of references to this table or are there only a few? If there are a few it should be easier to pinpoint which statement is causing the locking.

-Try starting Sql Profiler before the job runs and let it run to track all sql statements made by the job. As soon as the blocking stops, look to see if any long queries just finished running in the profiler. These could be the culprits.

Once you know the statements causing the issue you can better determine how to resolve the issue.|||First order of business, go beat the developer soundly for scheduling an automated process during a high usage time for users. That may take a while to beat them soundly, but it is an important step, so I'll wait while you finish.

Now that they've been beaten, go back and beat them again. That might make them think twice about such idiocy in the future.

Ok, that is as good a preventative as we can manage at the moment, so now we should move on to correcting the underlying problem. Pick the lowest usage period of the day when you can be certain that all the affected systems will be available. 03:00 works well for me, although some folks prefer 20:00 or something like it. Change the job so that it now runs at a low usage time instead of a high usage time.

After the job runs at its new time, check the SQL Server and NT logs for any error messages. Sometimes there are hidden dependancies that can trip you up when you move a job. If those both check out, then check the log tables that this process maintains to be sure that they also look as you'd expect.

Once you get done cleaning up the problem, go hose the blood off of the developer and carry them to a nice quiet closet somewhere to sleep off the beatings... There's no point in being mean about things, you just have to curb that behavior problem!

-PatP|||The database is called ISSLOG, which log all of the traffic from the web. Now why would this logging database cause deadlocking? The database log information like, clienthost, username, logtime, service, machine, serverIP, processing time, etc..,

I have added the code please look to see why it would cause a deadlock issue.

Thanks

Lystra|||I think my eyes just popped. I think if code1, code2 and code3 are run one after the other, there might be some other processing running that's causing the deadlock. If Code2 and Code3 are run at the same time, it might cause some trouble.|||Then name of the job is call Archive IIS Log which consists of 4 steps
1-Update department-run sp_Update_Log (CODE 3)
2-Copy to Archive Database-run sp_ArchiveLog (CODE 1)
3-Clean up Log-run sp_cleaningLog (CODE 3)
4-Free log space-Backup Log IISLog with No_Log

The code is ran one after the other.

If the code is looking to log traffic why should it create deadlock, because it's not looking at rows of data.

Thanks|||First order of business, go beat the developer soundly for scheduling an automated process during a high usage time for users. That may take a while to beat them soundly, but it is an important step, so I'll wait while you finish.

Now that they've been beaten, go back and beat them again. That might make them think twice about such idiocy in the future.

Ok, that is as good a preventative as we can manage at the moment, so now we should move on to correcting the underlying problem. Pick the lowest usage period of the day when you can be certain that all the affected systems will be available. 03:00 works well for me, although some folks prefer 20:00 or something like it. Change the job so that it now runs at a low usage time instead of a high usage time.

After the job runs at its new time, check the SQL Server and NT logs for any error messages. Sometimes there are hidden dependancies that can trip you up when you move a job. If those both check out, then check the log tables that this process maintains to be sure that they also look as you'd expect.

Once you get done cleaning up the problem, go hose the blood off of the developer and carry them to a nice quiet closet somewhere to sleep off the beatings... There's no point in being mean about things, you just have to curb that behavior problem!

-PatP

What pat said, but just for good measure, beat them again after the problem's sorted, then hose them down. ;)

No comments:

Post a Comment