Thursday, August 27, 2009

Building a High-end Windows 2008 Database Server - Page 2


The Components

Hard Drives

With a database server, the hard drives are the key to performance.   The memory and processor certainly play a part, but hard drives are incredibly slow when compared to everything else.  There are a number of things that can be done to improve the speed.  Some are obvious to anyone who has built a system before and others are very advanced.

Spin Rate
It is a given to use 15,000 RPM drives.  A slower drive can’t keep up with the transfer rate of its own interface and access times are greatly impacted.

Type
The first decision these days is whether to select SAS or parallel SCSI.  I went with SAS because of the flexibility and because I believe that parallel SCSI has reached the end of its life cycle.  At their fastest speeds, SAS allows for a longer cable length and can connect to significantly more drives per controller.  In the future, when I decommission my current controllers, I can reuse them with a combination of SAS and SATA drives in a lesser system.  Because SAS connecters are much smaller than SCSI, it is possible to put two ports on a single drive to allow a drive to be connected to two different controllers for controller failover at the drive level.  SAS drives are also typically smaller, allowing you to fit more of them in a system.

RAID
Speed and throughput are not the same thing, although they are related.  Throughput is about how many transactions you can handle at once.  Increasing throughput by using more drives or spreading transactions across multiple servers will actually decrease the speed if the number of simultaneous transactions is relatively small.  But when you have many simultaneous transactions, you will get more speed by increasing the throughput.  Smart controllers and sufficient caching can help reduce speed loss while also improving throughput.

Given the above, the next decision is how many drives I will need.  This also requires that I decide on the number of drive volumes and RAID level for each.  Since I am building a single-purpose server, specifically a SQL Server, I can really optimize here.  I do have budget constraints to consider and I need to make sure I can get a case to handle my plan.  I should note that drive capacity is only a minor consideration because even the smallest drives have ample capacity given the number of drives I want to use to boost my RAID throughput.

There is much to consider with RAID, especially when you add in variants such as RAID 50 and 60.  Most RAID levels have their pluses and minuses, but I find RAID 10 tends to give me the best balance between performance and the number of drives you can lose before data loss.  The fact that you can’t hold as much data as RAID 5 is not an issue for me since I am using many drives to increase my throughput.  Also, losing a drive under RAID 5 absolutely kills performance until the drive is replaced and rebuilt.

There is a little confusion with RAID 10 but modern controllers have eliminated the issue.  Two schemes have been called RAID 10: You can have a stripe that is mirrored or you can have mirrors that are striped.  The latter is significantly better because you can lose one drive on each mirror without any data loss, whereas the former cannot withstand the loss of more than one drive (you can’t afford a second lost drive because one of the stripes will already be offline after the first loss).  However, both schemes physically store the bits of data identically, so modern controllers use that fact to give the same level of protection to both.

I need the following volumes:

      1)      System (OS)
      2)      Data
      3)      Logs
      4)      TempDB
      5)      Backups

I used a two-drive RAID 1 for the OS.  I also used a single drive for the backups.  This is because the backups are already redundant to the original data and are replicated to both servers, and the backups get backed up again each day to offline storage as well as a third server in another state.

TempDB also doesn’t require redundancy as it is rebuilt every time SQL Server starts and it only contains interim results and not permanent storage.  You can speed up all of your databases by storing TempDB on its own volume.  You can store TempDB in memory instead if you know what its maximum size will be.  That is not the case for me, so I used a two-drive RAID 0 volume.  One final consideration with TempDB is that you get optimized usage with SQL Server if you break TempDB into as many files as you have CPU cores.  The ideal is to put each file on its own hard drive and skip RAID altogether.  I will not have enough drives initially to do this.

Data and Logs are accessed differently and should, therefore, be on separate volumes.  Like TempDB, you can benefit by breaking databases into multiple files and putting each on its own volumes.  I am supporting far too many databases for this to be possible.  Instead, I created a six-drive RAID 10 for data and a four-drive RAID 10 for logs.

Here is the summary using a total of 15 drives:

Purpose
RAID Type
Drives
OS
1
2
Data
10
6
Logs
10
4
TempDB
0
2
Backups
none
1

Putting 15 drives into a case creates a major constraint on what kind of case I can use, especially since I need the drives to be hot swappable and I plan to add more drives later.

I will be describing more things to do to make the drives faster when I describe the hardware and OS settings below.  If you read nothing else, make sure you read that section.


Next Page: The Components continued




Please go to the first page to read or post comments.