I’ve been managing a lot of Dell servers lately, where the baseline showed very poor performance for local drives connected to PERC (PowerEdge Expandable RAID Controller) controllers. Poor enough to trigger negative marks on a MSSQL RAP. Typically, read and write latency would never get below 11 ms, even with next to no load on a freshly reinstalled server. Even the cheapest laptops with 4500 RPM SATA drives would outperform such stats, and these servers had 10K or 15K RPM SAS drives on a 6 Gbps bus. We have a combination of H200, H700 and H710 PERC controllers on these servers, and the issues didn’t seem to follow a pattern, with one exception: all H200-equipped servers experienced poor performance.
A support ticket with Dell gave the usual response: update your firmware and drivers. We did, and one of the H700-equipped servers got worse. Further inquiries with Dell gave a recommendation to replace the H200 controllers with the more powerful H700. After having a look at the specs for the H200 I fully agree with their assessment, although I do wonder why on earth they sold them in the first place. The H200 doesn’t appear to be worth the price of the cardboard box it is delivered in. It has absolutely no cache whatsoever, and it also disables the built-in cache on the drives. Snap from the H200 User’s Guide:
This sounds like something one would use in a print server or small departmental file server on a very limited budget, not in a four-way database cluster node. And it explains why the connected drives are painfully slow: with no cache anywhere in the path, you are reduced to platter speed.
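To put a rough number on "platter speed", here is a minimal sketch of the standard average rotational latency calculation (the time for half a revolution). The function name is made up for illustration, and the figures cover only the rotational part: seek time and transfer time are drive-specific, come on top of this, and are not modelled here.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in milliseconds.

    On average the platter has to turn half a revolution before the
    requested sector passes under the head.
    """
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms in a minute
    return ms_per_revolution / 2.0


if __name__ == "__main__":
    # The spindle speeds of the SAS drives mentioned in this post.
    for rpm in (10_000, 15_000):
        print(f"{rpm} RPM: {avg_rotational_latency_ms(rpm):.1f} ms")
```

So every uncached request pays a few milliseconds of rotation alone before any seek time is added. A controller with a write-back cache can acknowledge writes long before the platter gets involved.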
Note: The H200 is replaced by the H310 on newer servers. I have yet to test it, but from what the specs tell me it is just as bad as the H200.
Update: Test data from a H310 equipped test server doing nothing but displaying the perfmon curve:
Pull out the H200 controllers and replace them with H700s. H310 controllers should be replaced with H710s. Sounds easy enough, but it isn’t, especially if the server is in production and you don’t want to reinstall it from scratch. The H700 is able to import the RAID config stored on drives previously connected to a H200, provided you have all the drives and connect them the same way as on the H200. It is a simple matter of pressing ‘F’ during boot at the right moment to import the “foreign” configuration. Booting Windows is another matter altogether though. It all looks very promising at first, and the normal boot sequence is displayed as expected. But suddenly, midway through the boot, you are greeted with a BSOD. This is caused by the fact that the driver for the H200 is not compatible with the H700. But if you follow this “simple” list it should be doable.
Replace PERC H200 with H700
- Make sure you have backups of anything you might need on the drives. In a cluster this is not as important, as you can reinstall and re-add a node without affecting the cluster as long as the other node(s) are intact, and the data is usually stored on shared SAN drives. But you never know.
- Read the rest of this article before you continue.
- Replace the H200 driver with the corresponding H700 driver:
  - Extract the driver installation package ( /s /e=[path])
  - Device manager, H200 properties, Driver, Update driver
  - Choose “Browse my computer for driver software”
  - Choose “Let me pick…”
  - Click “Have disk”
  - Browse to the extracted installation package, and find the oemsetup.inf file:
  - Choose the corresponding H700 version. If you have a H200 Modular, chances are you will replace it with a H700 Modular. Modular means blade server version, Adapter versions are for full-size PCI Express cards, and Integrated versions are for chips soldered to the main board.
- Reboot, and confirm that the boot fails with a BSOD halfway through.
- Turn off the server and replace the controller.
- Press ‘F’ when the H700 board asks if you want to import the foreign config, or enter the PERC BIOS for a menu-driven approach.
After you have replaced the controller, performance will still be dismal, albeit a little less dismal. This is because the imported RAID config still has the cache disabled. To fix this, use Dell OMSA (OpenManage Server Administrator) or the PERC BIOS interface. If you use OMSA, the settings can be changed on the fly.
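If you prefer the OMSA command line over the GUI, the same change can be scripted. The sketch below builds calls to `omconfig storage vdisk action=changepolicy`, one policy per call. The controller and virtual disk IDs (0 and 0) and the chosen policy values (`ara` for adaptive read ahead, `wb` for write back) are assumptions for illustration, not settings taken from my servers, and the helper names are made up. Check the real IDs with `omreport storage vdisk` and verify the syntax against the CLI guide for your OMSA version before running anything.

```python
import subprocess

# Policy values as I understand the OMSA CLI; treat them as assumptions
# and verify against the CLI guide for your OMSA version.
ALLOWED = {
    "readpolicy": {"ra", "nra", "ara"},
    "writepolicy": {"wb", "wt", "fwb"},
}


def build_policy_cmd(controller, vdisk, policy, value):
    """Build the omconfig argument list for a single policy change."""
    if value not in ALLOWED.get(policy, set()):
        raise ValueError(f"unsupported setting: {policy}={value}")
    return [
        "omconfig", "storage", "vdisk",
        "action=changepolicy",
        f"controller={controller}",
        f"vdisk={vdisk}",
        f"{policy}={value}",
    ]


def apply_policy(cmd, dry_run=True):
    """Print the command, and run it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical IDs: controller 0, virtual disk 0. Dry run only.
    apply_policy(build_policy_cmd(0, 0, "readpolicy", "ara"))
    apply_policy(build_policy_cmd(0, 0, "writepolicy", "wb"))
```

The sketch defaults to a dry run that only prints the commands, and it deliberately does not touch the disk cache policy.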
I leave the Disk Cache Policy disabled, as the cache on the drives isn’t backed by the battery connected to the H700. The following perfmon graph shows the effect of enabling the cache: