The Components
Hard Drives
With a database server, the hard drives are the key to performance. The memory and processor certainly play a part, but hard drives are incredibly slow when compared to everything else. There are a number of things that can be done to improve the speed. Some are obvious to anyone who has built a system before and others are very advanced.
Given the above, the next decision is how many drives I will need. This also requires that I decide on the number of drive volumes and RAID level for each. Since I am building a single-purpose server, specifically a SQL Server, I can really optimize here. I do have budget constraints to consider and I need to make sure I can get a case to handle my plan. I should note that drive capacity is only a minor consideration because even the smallest drives have ample capacity given the number of drives I want to use to boost my RAID throughput.
There is much to consider with RAID, especially when you add in variants such as RAID 50 and 60. Most RAID levels have their pluses and minuses, but I find RAID 10 tends to give me the best balance between performance and the number of drives you can lose before data loss. The fact that you can’t hold as much data as RAID 5 is not an issue for me since I am using many drives to increase my throughput. Also, losing a drive under RAID 5 absolutely kills performance until the drive is replaced and rebuilt.
There is a little confusion with RAID 10 but modern controllers have eliminated the issue. Two schemes have been called RAID 10: You can have a stripe that is mirrored or you can have mirrors that are striped. The latter is significantly better because you can lose one drive on each mirror without any data loss, whereas the former cannot withstand the loss of more than one drive (you can’t afford a second lost drive because one of the stripes will already be offline after the first loss). However, both schemes physically store the bits of data identically, so modern controllers use that fact to give the same level of protection to both.
I need the following volumes:
1) System (OS)
2) Data
3) Logs
4) TempDB
5) Backups
I used a two-drive RAID 1 for the OS. I also used a single drive for the backups. This is because the backups are already redundant to the original data and are replicated to both servers, and the backups get backed up again each day to offline storage as well as a third server in another state.
TempDB also doesn’t require redundancy as it is rebuilt every time SQL Server starts and it only contains interim results and not permanent storage. You can speed up all of your databases by storing TempDB on its own volume. You can store TempDB in memory instead if you know what its maximum size will be. That is not the case for me, so I used a two-drive RAID 0 volume. One final consideration with TempDB is that you get optimized usage with SQL Server if you break TempDB into as many files as you have CPU cores. The ideal is to put each file on its own hard drive and skip RAID altogether. I will not have enough drives initially to do this.
Data and Logs are accessed differently and should, therefore, be on separate volumes. Like TempDB, you can benefit by breaking databases into multiple files and putting each on its own volumes. I am supporting far too many databases for this to be possible. Instead, I created a six-drive RAID 10 for data and a four-drive RAID 10 for logs.
Here is the summary using a total of 15 drives:
Purpose | RAID Type | Drives |
OS | 1 | 2 |
Data | 10 | 6 |
Logs | 10 | 4 |
TempDB | 0 | 2 |
Backups | none | 1 |
Putting 15 drives into a case creates a major constraint on what kind of case I can use, especially since I need the drives to be hot swappable and I plan to add more drives later.
I will be describing more things to do to make the drives faster when I describe the hardware and OS settings below. If you read nothing else, make sure you read that section.
Next Page: The Components continued
Please go to the first page to read or post comments.