Best Practices for Configuring and Using the SAN
When planning your SAN, take the following items into consideration. Incorporate the practices outlined below to optimize the performance of your SAN, as well as its manageability in the event of a failure or when upgrading or modifying your configuration.
Environment
- Your storage servers should be dedicated to running SANsymphony software only. Do not use them for any other purpose.
- Design your SAN network so that all storage devices (physical storage, storage servers, switch fabrics) allow for the shutdown of a complete storage path, from application server through physical disk, without impacting other paths. This allows such events (upgrades, failures, additions, and so forth) to occur with minimal impact.
- Label cables and host bus adapters (HBAs).
- Separation of components leads to a higher degree of availability because they are not in the same environment:
  - Keep storage servers out of the public network.
  - Separate servers from storage racks.
  - Keep switches in separate racks.
  - Do not take power from the same grid (if possible).
  - Separate physical plant (A/C, access controls).
- Depending on your SAN configuration, you may wish to change the default HBA data rate setting to ensure that it matches the speed of the infrastructure.
Hardware Configuration
- When upgrading your network, more and smaller is preferred to fewer and bigger: it is better to add another storage server to the network (leveraging N + 1) than to add more ports to a larger storage server.
- Calculate the I/Ops required by the application server layer and then decide on the number of application server ports to use (a sizing sketch follows this list). Alternatively, the following configuration handles average traffic:
  - Two storage server target-only ports
  - Two unidirectional dedicated mirror ports
  - Two storage server initiator ports
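As a rough illustration of the port calculation above, here is a minimal Python sketch. The `ports_needed` helper and the per-port I/Ops rating are hypothetical placeholders, not SANsymphony parameters; substitute values measured on your own HBAs.

```python
import math

# Hypothetical sizing helper: estimate the number of application server
# ports needed to carry a required aggregate I/Ops load. The per-port
# rating is an assumed placeholder, not a vendor specification.
def ports_needed(required_iops: int, iops_per_port: int = 25_000) -> int:
    """Ports needed to carry required_iops, rounded up."""
    return math.ceil(required_iops / iops_per_port)

# Example: 90,000 I/Ops against ports assumed to sustain 25,000 each.
print(ports_needed(90_000))  # -> 4
```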
- Physical storage performance is a key factor in overall system performance, particularly in high-load environments.
- DataCore storage servers implement a RAID1 mirror when synchronous (secondary, tertiary) mirroring is used with multipathing. Also consider these storage types for your configuration (a comparison sketch follows this list):
  - JBOD allows for the best performance when used with NMV striping. Problem isolation and replacement are simple in the event of a failure.
  - RAID0 performs striping at the disk level. While this is good for overall storage performance, it may conflict with the NMV striping feature. This method also groups disks together, so a failure will affect more of the overall managed storage. It is a good solution for AIM buffers and high-performance applications.
  - RAID5 is a good compromise between reliability and performance, but a write penalty is always incurred in exchange for failure protection. (Each front-end write requires reading the existing data and parity, a parity calculation, and writing both the new data and the new parity.) Use it only for the most critical storage.
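To make those trade-offs concrete, here is a small Python sketch comparing effective random-write throughput using the standard textbook write-penalty factors. The per-disk I/Ops figure and the eight-disk group are assumptions for illustration only.

```python
# Standard write-penalty factors: back-end disk operations generated by
# one front-end random write (JBOD/RAID0: 1; RAID1 mirror: 2 writes;
# RAID5: 2 reads + 2 writes = 4). Per-disk I/Ops is an assumption.
WRITE_PENALTY = {"JBOD": 1, "RAID0": 1, "RAID1": 2, "RAID5": 4}

def effective_write_iops(layout: str, disks: int, iops_per_disk: int = 180) -> float:
    """Front-end random-write I/Ops a disk group can sustain."""
    return disks * iops_per_disk / WRITE_PENALTY[layout]

for layout in WRITE_PENALTY:
    print(f"{layout:5s} (8 disks): {effective_write_iops(layout, 8):,.0f} write I/Ops")
```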
NMV Configuration
- Match the storage type to an NMV pool and keep it consistent. Do not mix vendor types and configuration types (RAID0, JBOD) in the same pool.
- Create a pool for each type of workload and task in your environment. Six pools could be created for the configuration below (a validation sketch follows the table).

I/O TYPE    | MATCHING STORAGE CONFIGURATION
Small block | JBOD or RAID0
Large block | RAID5
Hot Spare   | Any matched to primary pool type
Mirrors     | Any matched to primary pool type
Snapshot    | JBOD
AIM         | Any
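One way to enforce the matching rules above is to treat the pool layout as data and validate it. The sketch below is a hypothetical illustration (pool names, vendors, and membership are invented), not a SANsymphony API.

```python
# Each pool maps to (vendor, configuration) pairs for its member disks;
# the loop flags any pool that mixes vendors or configuration types.
pools = {
    "small-block": [("VendorA", "RAID0"), ("VendorA", "RAID0")],
    "large-block": [("VendorB", "RAID5"), ("VendorB", "RAID5")],
    "snapshot":    [("VendorA", "JBOD"), ("VendorB", "JBOD")],  # mixed vendors
}

for name, members in pools.items():
    vendors = {vendor for vendor, _ in members}
    configs = {config for _, config in members}
    if len(vendors) > 1 or len(configs) > 1:
        print(f"pool '{name}' mixes vendors {vendors} or configurations {configs}")
```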
- Application vendors' recommendations for separate disks translate directly to separate pools for virtual disk sources. For example, Microsoft recommends putting the Exchange change log on a different volume than the Exchange database itself in order to assure the best performance. Simply putting those entities on two virtual volumes from the same pool does not allow the pool performance to be tailored to the appropriate hardware configuration. Furthermore, this approach could give rise to a condition where the larger I/Os of the database traffic block the smaller, faster I/Os of the change log.
High Availability Configuration
- Use LUN distribution if your storage controller supports an active/active configuration. LUN distribution requires true active/active functionality from a storage subsystem with multiple controllers. This is the preferred method for implementing High Availability because it provides the highest performance level and the greatest amount of control.
- While currently supported, Dual Path configuration is not recommended because it undermines the foundation on which High Availability is based.
- With Dual Path (DP) configurations, we advise that you configure the two mirror paths within the virtual volumes of a given DP pool to use different channels (for example, all mirror mappings from Storage Server 1 to Storage Server 2 go over one path between the storage servers, and all mirror mappings from Storage Server 2 to Storage Server 1 go over a completely different path).
- Using a single mirror path for both mirror channels with Dual Path pools is bad practice: it can cause excessive arbitration and disruption on that single mirror path, affecting I/O in both directions. This can lead to failed application server access.
- If you intend to set up a Windows application server for HA using MPIO, install MPIO before mapping third-party AP volumes to the application server.
- Paths from each application server to each storage server should be physically independent. Physically separate switches for Fibre Channel (FC), and independent subnets and switches for iSCSI, provide the highest level of availability.
- Each switch that is connected to the storage server should have at least three independent zones. (Follow vendor recommendations regarding the creation of additional zones.) Traversing zones should only be possible through storage servers. The three zones, sketched in code after this list, are:
  - Storage – storage server initiators to physical storage targets
  - Mirror – storage server initiators to storage server targets
  - Application server – application server initiators to storage server targets
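The zone layout can be checked the same way as the pool layout. The following sketch models the three zones as sets of port names (all names are invented for illustration) and flags any port that appears in more than one zone.

```python
# The three zones expressed as sets of port names (all names hypothetical).
# The check flags any port that appears in more than one zone, since zones
# should only meet at the storage servers themselves.
zones = {
    "storage": {"ss1-init1", "ss2-init1", "array1-tgt1", "array2-tgt1"},
    "mirror": {"ss1-init2", "ss2-tgt2"},
    "application-server": {"app1-init1", "app2-init1", "ss1-tgt1", "ss2-tgt1"},
}

names = list(zones)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        shared = zones[a] & zones[b]
        if shared:
            print(f"zones '{a}' and '{b}' share ports: {shared}")
```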
- If using iSCSI, refer to Best Practices for iSCSI High Availability.
- If mapping the same virtual volume to multiple application servers, refer to LUN Numbering Consistency.
AIM Configuration
- A clear understanding of why the data is critical or why it needs to be moved off-site is crucial in order to configure AIM appropriately. Important questions to ask include:
  - How much latency between the source and destination volumes is tolerable?
  - How many discrete applications need to be protected by AIM? This influences the number of AIM sets required.
  - Which application server files must be replicated? It is beneficial to determine this and organize the files accordingly. Files that can easily be reinstalled do not have to be replicated if space or throughput is an issue. Keep files that need to be replicated (such as user or data files) on different volumes than files that do not (such as system or program files). Limiting the amount of less important replication allows vital information to be replicated to the destination at a faster rate.
  - How much data is expected to change during the latency period? This influences the size of the link and the buffer. The buffer should be large enough to collect the production data changes for the duration of the worst-case IP link outage (see the sizing sketch after this list).
  - How much data can be transferred over the link? This must be measured; the advertised speed is not actually achieved across the link. This influences what is important enough to be AIM-protected. Refer to AIM Initialization Time Estimates for more information.
  - If AIM sets are to be initialized over the inter-site link, the link should be correspondingly faster and more reliable.
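The buffer-sizing question above reduces to simple arithmetic: the buffer must absorb the production change rate for the duration of the worst-case link outage. The figures in this Python sketch are illustrative assumptions only.

```python
# All figures are illustrative assumptions; measure your own change rate
# and link throughput rather than relying on advertised speeds.
change_rate_mb_per_hr = 2_000    # measured production data change rate
worst_case_outage_hr = 12        # longest anticipated IP link outage
measured_link_mb_per_hr = 4_500  # measured, not advertised, throughput
safety_factor = 1.5              # headroom for bursts above the average

# The buffer must hold every change produced during the worst-case outage.
buffer_mb = change_rate_mb_per_hr * worst_case_outage_hr * safety_factor
print(f"buffer should hold at least {buffer_mb / 1024:.1f} GB")  # ~35.2 GB

# The link must also drain changes faster than they accumulate, or the
# buffer can never empty again after an outage.
assert measured_link_mb_per_hr > change_rate_mb_per_hr, "link too slow"
```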
- Buffers should reside on local disks that are outside of SANsymphony control (not on virtual volumes). Refer to AIM Buffers for more important buffer information.
- Disks used to transfer AIM data should be fast (RAID0 or JBOD) so that the transfer rate is not limited. External or portable disks are the best practice.
Snapshot Configuration
- Create Snapshot relationships using destination virtual volumes from a dedicated NMV pool configured with the smallest SAU size (1 MB). Doing this allows for optimal storage allocation on the destination (a worked illustration follows this section).
- The impact of Snapshot I/O on performance will be minimized by placing the destination virtual volumes on faster physical disks.
- I/O to destinations that are not Complete Images (CI) increases the I/O load on the Snapshot source. This I/O can adversely impact performance. Consider performing a CI to limit the impact.
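As a worked illustration of the SAU-size recommendation above, the sketch below compares how much destination space a scattered write pattern would consume at different SAU sizes, assuming (hypothetically) that allocation happens in whole SAUs per written region. The write pattern is invented for illustration.

```python
import math

# Space on the destination is allocated in whole SAUs, so scattered small
# writes waste less space with 1 MB units than with larger ones. The write
# pattern below (10,000 scattered 0.25 MB regions) is hypothetical.
def allocated_mb(regions: int, region_mb: float, sau_mb: int) -> int:
    """MB consumed if each written region lands in its own SAU(s)."""
    return regions * math.ceil(region_mb / sau_mb) * sau_mb

for sau in (1, 32, 128):
    print(f"SAU {sau:>3} MB -> {allocated_mb(10_000, 0.25, sau):,} MB allocated")
```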