Recently I got a request for a top 10 list of things that a DBA should be doing on a daily basis from an engineering managers perspective. Well, here is what I came up with, and please note, that this list is dependant on the poor guy being a full-time DBA, not some programmer who also doubles as a DBA as many of these tasks can be fairly time intensive depending on the size of the SQL Server installation and number of servers/databases/jobs that the DBA has to maintain.
1.) Ensuring backups are running (and complete successfully)
Recently a round of SQL injection attacks put our backup/recovery strategy to the test and uncovered a few flaws that we hadn't really encountered before. Namely, applications that continuously try and maintain connections to the database (note: a database should be in single user mode to do a recovery properly as the restore process needs to have exclusive access)
2.) Ensuring storage space/RAM is not a concern
Here at my job, when we launch a new site it's usually (these days) a statewide deployment, with the states Governor, senators, lobbyists, and other political figures present, which means lots of news coverage, promotional material distribution, and the like. Which always results in a sizable spike in traffic for about a month afterwards, then it normalizes again but is on average higher than it was before. Starting late last year, this spike in traffic made the SQL Server logs grow exponentially and eventually took down the site due to a non-proactive system administrator and no full-time DBA. It's a good idea to keep this kind of thing as automated as possible. As a matter of fact, all new HP Servers come with Insight Manager which can do this for you; I'd suggest finding someone capable to configure it for you.
3.) Proactively monitoring database performance
This is somewhat of an open ended statement, and can mean a lot of different things to different people. Work with your DBA to determine a set of metrics that are important to you for knowing what's going on in the database. Starting with SQL Server 2005, there is a rich set of data from the DMVs (Dynamic Management Views) available for snapshots into the internals of SQL Server. If you cannot find anything to start with take a look on MSDN for information on whats available, from there you will get some ideas. Some suggestions to start are the Top 100 Most resource intensive queries on your system. Start with the queries that are utilizing the highest amount of IO, as disk access is the slowest part of a database usually.
4.) Ensuring that a usable disaster recovery plan is in place
This takes time, resources, planning, and people dedicated enough to the job to want to do it in off hours and on weekends, otherwise you could have some upset customers when your site goes down for "Testing". It is pretty important though, as who wants to be the one to tell the Boss that the site won't come back up cause you can't figure out why the database restore software gives a generic error message? I didn't want to, but I was that guy!
5.) Ensuring an adequate set of tools are available and are familiar with their functionality
Every profession has its set of tools and utilities that make life easier. Database professionals ... are... well... often left out. However, there are some good tools out there for them, you just have to look for them! Red-Gate is an excellent starting point, their one product, SQL-Prompt is a godsend for people who do a lot of T-SQL development and use SQL Server Management Studio. IntelliSense for T-SQL! YAAY! (note: SQL Server 2008 Management Studio has this built in when connected to a 2008 Server)
6.) Ensuring that all database changes submitted by developers are reviewed and signed off on
This is also somewhat open-ended, and is dependant on your organization having a structured change management process. Let's face it though, not all developers take the time to really understand how to properly program in SQL, and may not be making best use of existing indexes, so make sure your DBA looks over all the new SQL code before it goes to production. It can save time on hard-to-locate performance hits later.
7.) Identifying rapidly growing objects in the database (transaction logs, tables, etc)
This goes hand in hand with #2, however, this is more of a proactive approach, usually automation techniques will let you know when you're starting to approach a pre-determined threshold. Doing this proactively, you will know when that time is coming, and you can have a strategy in place to handle it.
8.) Ensuring scheduled tasks are running (and complete successfully)
Backups, data cleansing, ETL tasks, all of them. No one likes coming in in the morning and finding an application down because a nightly dataload did not complete properly or a partner did not upload it. Be ready to roll back any partial updates!
9.) Ensuring the disaster recovery plan actually works
I reiterate because I love you. No, really, I do. I mean it. Call me?
10.) Be familiar with what your platform can do
Sounds simple enough, but I can't tell you the number of times I've interviewed potential candidates, and they couldn't tell me the difference between an inner join and an outer join. But a DBA that can't tell me what Profiler is? Nah.. not acceptable. If nothing else, know the toolsets that came with the product like the back of your hand, sometimes they're too generic for a specific task, but I will lay money on them being useful 99% of the time!
Posted
Jun 24 2008, 12:28 PM
by
dacrowlah