Note: This is a template to ease development. You must code the storage-vendor-specific parts of the scripts yourself.
Complex operations sometimes need to be simplified. Here, the operations are failover and failback of SAN storage replication between sites (e.g. production and DR), and the aim is to let operators or less technically confident colleagues perform them more easily during disasters or drill tests. This template was created for that purpose.
Written primarily in PowerShell, this package contains a set of SAN storage failover and failback scripts for Microsoft Failover Cluster (including Hyper-V clusters), plus vendor-neutral pseudo code for the SAN storage operations (to be modified to support different SAN vendors). Besides storage failover and failback, it can also cater for the services running on top of the storage, such as databases and virtual machines.
Moreover, it features a user-friendly interactive console menu, while the complex operations are handled by the scripts in the back end.
Because it serves as a general template for implementation, you have to fill in the blanks (the SAN-vendor-specific commands) to fit your own use. Although the Failover Cluster parts and the interactive console menu are already there, further coding and testing cannot be avoided.
The scripts were designed to be as reusable (e.g. parameterized) as possible so that I can adapt them more easily to different projects. They have also been released as an open-source project in the GitHub repository ws-storage-failover. IT pros are encouraged to fork it for their own SAN storage systems and, if possible, contribute their modifications in return, helping others who want to implement it for their own choice of SAN storage system.
- Script failover and failback between replicated SAN storage with Microsoft Failover Cluster (Hyper-V and others)
- Operator-friendly interactive console menus through which failover, failback, validation and status reporting can be performed
- Steps can be performed individually or all at once
- Scripted in PowerShell (Windows operations) and pseudo code (SAN-specific operations) as a template to be modified or customized
- Define variables or fill in parameters in SAN-Parameters.xml and SAN-Parameters.ps1
- Pseudo code of SAN operations is provided inside <# #> inline comment blocks, to be replaced as required according to the command-line reference of your SAN storage
- Speed up development of automated/manual SAN failover/failback with this set of scripts
- Leverage the command-line interface that SAN storage systems usually provide over SSH, which makes it possible to command them from scripts using plink.exe (from PuTTY)
- Default scenario is for storage systems located in two sites with SAN-level replication
- Console output is logged under the Logs folder, with error messages stored separately
- Open-source project on GitHub, encouraging forking
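As an illustration of the plink.exe approach mentioned in the list above, here is a minimal sketch of a wrapper that sends one command to a SAN over SSH. The function names, the parameter names and the default install path are assumptions made for this example and are not taken from SAN-Plink.ps1; only the plink options `-ssh`, `-batch`, `-l` and `-pw` are standard.

```powershell
# Hypothetical helper (not part of the shipped scripts): builds the plink.exe
# argument list for one SAN CLI command.
function Get-PlinkArgument {
    param(
        [Parameter(Mandatory = $true)] [string] $SanAddress,
        [Parameter(Mandatory = $true)] [string] $User,
        [Parameter(Mandatory = $true)] [string] $Password,
        [Parameter(Mandatory = $true)] [string] $Command
    )
    # -ssh    force the SSH protocol
    # -batch  disable interactive prompts so a script never hangs
    # -l/-pw  login name and password
    return @('-ssh', '-batch', '-l', $User, '-pw', $Password, $SanAddress, $Command)
}

# Hypothetical wrapper: runs the command through plink.exe and fails loudly
# if plink reports a non-zero exit code.
function Invoke-SanCommand {
    param(
        [Parameter(Mandatory = $true)] [string] $SanAddress,
        [Parameter(Mandatory = $true)] [string] $User,
        [Parameter(Mandatory = $true)] [string] $Password,
        [Parameter(Mandatory = $true)] [string] $Command,
        [string] $PlinkPath = 'C:\Program Files\PuTTY\plink.exe'  # assumed install path
    )
    $plinkArgs = Get-PlinkArgument -SanAddress $SanAddress -User $User `
                                   -Password $Password -Command $Command
    $output = & $PlinkPath @plinkArgs 2>&1
    if ($LASTEXITCODE -ne 0) {
        throw ('plink exited with code {0}: {1}' -f $LASTEXITCODE, ($output -join "`n"))
    }
    return $output
}

# Example call (placeholder address and account):
# Invoke-SanCommand -SanAddress 'san-dr.example.local' -User 'svc' -Password '***' -Command 'lsvdisk'
```

Keeping the argument building separate from the actual call makes it possible to check the command line without a SAN at hand.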
- SAN storage systems with SAN replication enabled (2 sites – production and DR sites assumed) with hosts running Failover Cluster (e.g. Hyper-V) in each site
- A Windows client that runs the console menu with PowerShell 3.0 or above (included in Windows 8 and Windows Server 2012 and later)
- PuTTY should be installed in the location specified in SAN-Parameters.ps1
- A one-time interactive SSH connection (with PuTTY or plink.exe) to the SAN storage systems in both sites may be required in order to cache their host keys in the registry
- Run PowerShell as Administrator prior to running the Console Menu script
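The prerequisites above could be checked before the menu starts. The following is only a sketch: `Test-IsAdministrator`, `Get-PrerequisiteProblem` and the default plink.exe path are hypothetical names and values chosen for this example, not something the shipped scripts are known to contain.

```powershell
# Hypothetical check: is the current PowerShell session elevated? (Windows only)
function Test-IsAdministrator {
    $identity  = [Security.Principal.WindowsIdentity]::GetCurrent()
    $principal = New-Object Security.Principal.WindowsPrincipal($identity)
    return $principal.IsInRole([Security.Principal.WindowsBuiltInRole]::Administrator)
}

# Hypothetical check: returns one message per unmet prerequisite
# (an empty result means everything looks fine).
function Get-PrerequisiteProblem {
    param(
        [int]    $PSMajorVersion  = $PSVersionTable.PSVersion.Major,
        [bool]   $IsAdministrator = (Test-IsAdministrator),
        [string] $PlinkPath       = 'C:\Program Files\PuTTY\plink.exe'  # assumed path
    )
    $problems = @()
    if ($PSMajorVersion -lt 3) {
        $problems += "PowerShell 3.0 or above is required (found $PSMajorVersion)."
    }
    if (-not $IsAdministrator) {
        $problems += 'PowerShell must be run as Administrator.'
    }
    if (-not (Test-Path -LiteralPath $PlinkPath)) {
        $problems += "plink.exe was not found at $PlinkPath."
    }
    return $problems
}

# Example: print any problems before showing the menu.
# Get-PrerequisiteProblem | ForEach-Object { Write-Warning $_ }
```

The host-key caching step is not covered here, since it needs one interactive connection to each SAN.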
|   SAN-Console_Menu.ps1        // Menu for selecting among operations (failover, failback, validation, etc.)
|
+---Logs                        // Console output and errors are recorded separately here
|       ...
|
+---Parameters
|       SAN-Parameters.ps1      // Script options can be specified here (PowerShell)
|       SAN-Parameters.xml      // SAN replication options specified here (XML)
|
+---Scripts
        SAN-Failback.ps1        // Failback operation subscript called by Console Menu script
        SAN-Failover.ps1        // Failover operation subscript called by Console Menu script
        SAN-GetStatus.ps1       // Status querying subscript called by Console Menu script
        SAN-Plink.ps1           // Functions which command SAN using plink.exe from PuTTY
        SAN-Variables.ps1       // Variables from XML, PS1 parameter files and SAN are further processed here
        SAN-Validate.ps1        // Validation subscript called by Console Menu script to confirm settings are valid
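To give an idea of how values from an XML parameter file can be turned into PowerShell variables, here is a small sketch. The XML layout, the element and attribute names and the addresses are invented for this example; the real schema of SAN-Parameters.xml may look quite different.

```powershell
# Hypothetical XML layout, inlined so the example is self-contained.
# With a real file one would typically use: [xml](Get-Content -Path <file>)
$sampleXml = @'
<SanParameters>
  <Site siteName="Production" sanAddress="san-prod.example.local" />
  <Site siteName="DR" sanAddress="san-dr.example.local" />
</SanParameters>
'@

$parameters = [xml]$sampleXml

# Build a lookup table: site name -> SAN management address
$sanAddressBySite = @{}
foreach ($site in $parameters.SanParameters.Site) {
    $sanAddressBySite[$site.GetAttribute('siteName')] = $site.GetAttribute('sanAddress')
}

# Example: $sanAddressBySite['DR'] now holds the DR SAN address
```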
- Edit the SAN-Parameters.ps1 and SAN-Parameters.xml files to set options such as the user account for SSH communication with your SAN.
- All pseudo code (SAN-specific operations) inside <# #> should be replaced with the implementation for your SAN vendor. For example, change Echo <# display LUN #> to the actual command lsvdisk (IBM/Lenovo Storwize), lun show (NetApp), volcoll (HPE Nimble), etc.
- Perform further development or customization according to your needs. For example:
- Follow the comments in the script
- Names of functions, variables and echo messages are self-explanatory
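One possible way to replace a pseudo code placeholder is sketched below, using the "display LUN" example from the list above. The lookup table and the function name are assumptions made for illustration; the vendor commands are the ones named above, and you should confirm them against your SAN's command-line reference.

```powershell
# Template placeholder as described above:
#   Echo <# display LUN #>
# Hypothetical lookup of the vendor command that replaces the placeholder.
$DisplayLunCommand = @{
    'Storwize' = 'lsvdisk'     # IBM/Lenovo Storwize
    'NetApp'   = 'lun show'    # NetApp
}

function Get-DisplayLunCommand {
    param([Parameter(Mandatory = $true)] [string] $Vendor)
    if (-not $DisplayLunCommand.ContainsKey($Vendor)) {
        throw "No 'display LUN' command defined for vendor '$Vendor'."
    }
    return $DisplayLunCommand[$Vendor]
}

# Example: the returned string is what would be sent to the SAN over SSH.
# Get-DisplayLunCommand -Vendor 'Storwize'
```

If you only ever support one vendor, simply writing the real command in place of the placeholder is simpler than a lookup table.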
- End Replication - from Production Site to DR Site
- Create Clone from Replicated LUN in DR Site
- Present Cloned LUN to DR Site Host, Take Disk Online, Run Replication from Production Site to DR Site
- Import VM in DR Site
- Disconnect VM Network Adapter in DR Site
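The Windows-side steps above (take the disk online, import the VM, disconnect its network adapter) might be sketched as follows. `Import-IsolatedDrVm` and its parameters are hypothetical names for this example; `Set-Disk`, `Import-VM` and `Disconnect-VMNetworkAdapter` are standard cmdlets from the Storage and Hyper-V modules, which are assumed to be available on the DR host. The SAN-side steps (ending replication, cloning and presenting the LUN) are left out because they are vendor specific.

```powershell
# Hypothetical sketch of the Windows-side steps on the DR host.
# Requires the Storage and Hyper-V PowerShell modules.
function Import-IsolatedDrVm {
    param(
        [Parameter(Mandatory = $true)] [int]    $DiskNumber,    # disk number of the cloned LUN
        [Parameter(Mandatory = $true)] [string] $VmConfigPath   # VM configuration file on that LUN
    )
    # Take Disk Online
    Set-Disk -Number $DiskNumber -IsOffline $false

    # Import VM in DR Site (registers the VM from its configuration file)
    $vm = Import-VM -Path $VmConfigPath

    # Disconnect VM Network Adapter in DR Site, so the test copy
    # cannot conflict with the production VM on the network
    Disconnect-VMNetworkAdapter -VMName $vm.Name

    return $vm
}

# Example call (placeholder values):
# Import-IsolatedDrVm -DiskNumber 3 -VmConfigPath 'E:\VMs\App01\Virtual Machines\<GUID>.xml'
```

Depending on how the clone is presented, the disk may also need its read-only flag cleared before the VM can start; that is not handled in this sketch.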
- End Replication - from Production Site to DR Site
- Create Clone from Replicated LUN in DR Site
- Present Cloned LUN to DR Site Host, Take Disk Online, Run Replication from Production Site to DR Site
- Stop and Delete Replication Group - from Production Site to DR Site
- Stop and Delete Replication Group - from DR Site to Production Site
- Stop VM in Production Site
- Stop CSV in Production Site
- Delete VM in Production Site
- Unpresent LUN of CSV from Production Site Hosts
- Delete LUN of CSV in Production Site
- Create LUN of CSV in Production Site
- Create Replication Group - from DR Site to Production Site
- Add LUN to Replication Group - from DR Site to Production Site
- Run Replication Group - from DR Site to Production Site
- Stop VM in DR Site
- Take Disk Offline in DR Site Host
- Unpresent Cloned LUN from DR Site Host
- Stop and Delete Replication Group - from DR Site to Production Site
- Present LUN of Replicated CSV to Production Site Hosts
- Run CSV in Production Site
- Import VM in Production Site
- Disconnect VM Network Adapters in Production Site
- Run VM in Production Site
- Create Replication Group - from Production Site to DR Site
- Add LUN to Replication Group - from Production Site to DR Site
- Run Replication Group - from Production Site to DR Site
- Stop and Delete Replication Group - from Production Site to DR Site
- Stop and Delete Replication Group - from DR Site to Production Site
- Stop CSV in Production Site
- Unpresent LUN of CSV from Production Site Hosts
- Delete LUN of CSV in Production Site
- Create LUN of CSV in Production Site
- Create Replication Group - from DR Site to Production Site
- Add LUN to Replication Group - from DR Site to Production Site
- Run Replication Group - from DR Site to Production Site
- Take Disk Offline in DR Site Host
- Unpresent Cloned LUN from DR Site Host
- Stop and Delete Replication Group - from DR Site to Production Site
- Present LUN of Replicated CSV to Production Site Hosts
- Run CSV in Production Site
- Create Replication Group - from Production Site to DR Site
- Add LUN to Replication Group - from Production Site to DR Site
- Run Replication Group - from Production Site to DR Site
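Step sequences like the ones above lend themselves to a table-driven runner, so that steps can be performed individually or all at once. The sketch below is one way to do that and is not how the shipped scripts are known to be organized; `Invoke-Step`, the `$FailbackSteps` table and the empty actions are hypothetical, and the step names are copied from the last list above.

```powershell
# Hypothetical step table: each entry has a display name and an action.
# The <# #> placeholders stand for the real Windows or SAN commands.
$FailbackSteps = @(
    @{ Name = 'Take Disk Offline in DR Site Host';                                      Action = { <# Windows command #> } },
    @{ Name = 'Unpresent Cloned LUN from DR Site Host';                                 Action = { <# SAN command #> } },
    @{ Name = 'Stop and Delete Replication Group - from DR Site to Production Site';   Action = { <# SAN command #> } },
    @{ Name = 'Present LUN of Replicated CSV to Production Site Hosts';                Action = { <# SAN command #> } },
    @{ Name = 'Run CSV in Production Site';                                             Action = { <# cluster command #> } },
    @{ Name = 'Create Replication Group - from Production Site to DR Site';            Action = { <# SAN command #> } },
    @{ Name = 'Add LUN to Replication Group - from Production Site to DR Site';        Action = { <# SAN command #> } },
    @{ Name = 'Run Replication Group - from Production Site to DR Site';               Action = { <# SAN command #> } }
)

# Runs every step in order (Number = 0) or only the chosen 1-based step.
# An exception thrown by a step stops the remaining steps.
function Invoke-Step {
    param(
        [Parameter(Mandatory = $true)] [object[]] $Steps,
        [int] $Number = 0
    )
    if ($Number -lt 0 -or $Number -gt $Steps.Count) {
        throw "Step number must be between 1 and $($Steps.Count), or 0 for all steps."
    }
    if ($Number -eq 0) { $selected = $Steps } else { $selected = @($Steps[$Number - 1]) }
    foreach ($step in $selected) {
        Write-Host "Running: $($step.Name)"
        & $step.Action
    }
}

# Examples:
# Invoke-Step -Steps $FailbackSteps             # all steps, in order
# Invoke-Step -Steps $FailbackSteps -Number 3   # only the third step
```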
- There is no one-size-fits-all solution – modification is inevitable
- Not all error messages are recorded separately in the error log file; some errors exist only in the main log. Output and errors encountered in the menu itself are not logged
- SAN storage credentials are stored in the ps1 configuration file in clear text (protect the file properly)
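Regarding the logging limitation above, the sketch below shows one way to route everything an operation emits into a main log while copying errors into a separate error log. It is an assumption-laden example, not the logging code of the shipped scripts: `Invoke-Logged` and the file names Console.log and Error.log are hypothetical.

```powershell
# Hypothetical wrapper: runs an action, writes all of its output to a main log
# and additionally writes error records to a separate error log.
function Invoke-Logged {
    param(
        [Parameter(Mandatory = $true)] [scriptblock] $Action,
        [Parameter(Mandatory = $true)] [string]      $LogDir
    )
    if (-not (Test-Path -LiteralPath $LogDir)) {
        New-Item -ItemType Directory -Path $LogDir | Out-Null
    }
    $mainLog  = Join-Path $LogDir 'Console.log'   # hypothetical file name
    $errorLog = Join-Path $LogDir 'Error.log'     # hypothetical file name

    try {
        # 2>&1 merges non-terminating errors into the output stream
        & $Action 2>&1 | ForEach-Object {
            if ($_ -is [System.Management.Automation.ErrorRecord]) {
                Add-Content -Path $errorLog -Value "$_"
            }
            Add-Content -Path $mainLog -Value "$_"
            $_   # pass the item on so it still reaches the console
        }
    }
    catch {
        # Terminating errors end up here; log them in both files, then rethrow
        Add-Content -Path $errorLog -Value "$_"
        Add-Content -Path $mainLog  -Value "$_"
        throw
    }
}

# Example call (hypothetical log folder):
# Invoke-Logged -LogDir '.\Logs' -Action { 'Starting failover'; Write-Error 'LUN not found' }
```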
Ver | Date | Changes |
---|---|---|
1.1 | 20190302 | Improved delimiter choice in Scripts\SAN-Variables.ps1 |
1.0a | 20141231 | First released in December 2014 |