Week 6 —— Planning High Availability

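Failover clustering
A failover cluster is a high-availability infrastructure layer made up of multiple computers, each acting as a redundant node. The cluster as a whole tolerates some nodes going offline, failing, or being damaged without affecting the normal operation of the system. The process by which one server takes over from a failed server is usually called "failover".

If one server becomes unavailable, another server automatically takes over from the failed server and continues processing tasks. Each server in the cluster has at least one other server in the cluster designated as its standby.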

1. What are the components of failover clustering?

• A cluster consists of two or more nodes that offer services to the network.
• The types of service offered by a cluster include file servers, print servers, DHCP servers, Hyper-V virtual machines, or any other application that has been written to be cluster aware, such as Exchange and SQL Server.

2. What is quorum?
• Quorum is the mechanism that ensures the cluster keeps functioning only while a majority of the cluster (counted in votes) is available, even when communication between parts of the cluster breaks or parts of the cluster are lost.

3. Why quorum?
In a cluster, multiple nodes share a common cluster database, and services can run on any node in the cluster.

Split-brain describes a situation in which multiple nodes in a cluster try to bring the same service or application online at the same time, so they compete for the same resources. Requiring a majority prevents this, because at most one part of a divided cluster can hold more than half of the votes.
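
To see how a majority rule avoids split-brain, here is a minimal Python sketch. It assumes one vote per node and a hypothetical five-node cluster whose network splits into two partitions; `has_quorum` and the node names are illustrative only and do not correspond to any Windows Server API.

```python
def has_quorum(votes_present: int, total_votes: int) -> bool:
    """Return True when strictly more than half of all votes are present."""
    return 2 * votes_present > total_votes


# Hypothetical five-node cluster (one vote each) split by a network fault.
total_votes = 5
partition_a = ["node1", "node2", "node3"]
partition_b = ["node4", "node5"]

for name, members in (("A", partition_a), ("B", partition_b)):
    if has_quorum(len(members), total_votes):
        state = "keeps running its services"
    else:
        state = "stops its services"
    print(f"Partition {name} ({len(members)} of {total_votes} votes) {state}")
```

Because two disjoint partitions cannot both hold more than half of the votes, at most one side continues to run the clustered services.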

4. How does quorum work?
A node that becomes unavailable has its vote removed from the cluster by the remaining nodes.

For example, in a cluster with five votes:
◆ An administrator patches a node, which requires reboots, so the node is unavailable for a period of time. As the node goes into maintenance mode, it removes its vote from the cluster, reducing the total number of votes from five to four.
◆ The administrator starts maintenance on another node, which again requires reboots. The node removes its vote, reducing the total number of votes in the cluster to three.
◆ A third node fails, or an overzealous administrator takes yet another node down for maintenance, and its vote is lost. Two of the three total votes remain, which is more than 50 percent, so the cluster stays running. The remaining nodes then remove the unavailable node's vote from the cluster.
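
The arithmetic in this example can be traced with a short Python sketch. It assumes five nodes with one vote each and models only the vote counting described above; the names `has_quorum`, `history`, and `survives` are hypothetical and not part of any cluster API.

```python
def has_quorum(votes_present: int, total_votes: int) -> bool:
    """True when strictly more than half of the current total votes are present."""
    return 2 * votes_present > total_votes


total_votes = 5          # assumption: five nodes, one vote each
history = [total_votes]  # total vote count after each step

# Two rounds of planned maintenance: each node gives up its vote before rebooting.
for _ in range(2):
    total_votes -= 1
    history.append(total_votes)

# A third node becomes unavailable: 2 of the 3 current votes are still present.
votes_present = total_votes - 1
survives = has_quorum(votes_present, total_votes)

# The remaining nodes then remove the missing node's vote from the total.
total_votes = votes_present
history.append(total_votes)

# For contrast: against a fixed total of five votes, two would not be a majority.
survives_with_fixed_total = has_quorum(2, 5)

print("total votes over time:", history)
print("cluster survives the third outage:", survives)
print("would survive with a fixed total of 5:", survives_with_fixed_total)
```

The contrast at the end shows why lowering the total matters: two votes are a majority of three, but not of five.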

5. What are the requirements of a failover cluster?

Server Hardware:
• The underlying hardware must be certified for Windows Server 2012 R2 before you deploy the Hyper-V role and join the server to a cluster.
• On the CPU side there is some flexibility: nodes can use different CPUs, but the CPUs should come from the same vendor.
• If the storage assigned to the nodes is DAS or FC, all storage stack components installed on the servers must be identical.
• If the cluster storage is going to be iSCSI, ensure that your network adapters are uniquely profiled.

Storage Prerequisites:
• Shared storage is one of the building blocks of a failover cluster.
• If you are using iSCSI or FC LUNs, there are some gotchas you should remember:

  • Ensure that storage compatibility is checked as per the Windows Server Catalog
  • Use MPIO or LBFO
  • Mask the LUNs per cluster and ensure isolation of LUNs via zoning
  • If you wish to utilize native disk support, then use only basic disks. For Cluster Shared Volumes (CSV), NTFS-formatted volumes are preferable (ReFS volumes are also supported)

Software Prerequisites:
• All the members should run the same version of the operating system,
• be at the same service pack level, and
• have the same set of patches or software updates installed.
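
A consistency check along these lines could be sketched in Python as follows. The inventory is hypothetical: the node names, the property keys, and the update identifiers `KB-A` and `KB-B` are placeholders, and in practice the values would have to be collected from each node by some other means.

```python
# Hypothetical inventory of three cluster members (all values are placeholders).
nodes = {
    "node1": {"os_version": "6.3.9600", "service_pack": 0, "updates": {"KB-A", "KB-B"}},
    "node2": {"os_version": "6.3.9600", "service_pack": 0, "updates": {"KB-A", "KB-B"}},
    "node3": {"os_version": "6.3.9600", "service_pack": 0, "updates": {"KB-A"}},
}


def find_mismatches(inventory):
    """Return the properties whose values are not identical on every node."""
    mismatched = []
    for prop in ("os_version", "service_pack", "updates"):
        values = [info[prop] for info in inventory.values()]
        if any(value != values[0] for value in values[1:]):
            mismatched.append(prop)
    return mismatched


print("properties that differ between nodes:", find_mismatches(nodes))
```

In this sample data only the installed updates differ, because `node3` is missing one of the placeholder updates.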

Environment Prerequisites:
• When you create a cluster, its name is registered in Active Directory as a cluster name object (CNO), which is a virtual computer object. It is recommended to keep all the computer accounts and the CNO of a cluster in their respective OUs.
• A Hyper-V cluster needs multiple traffic channels to address its various needs.
• The primary network traffic for a Hyper-V cluster can be segregated as follows:

  • Management
  • Cluster
  • Virtual machine access
  • Live migration
  • Hyper-V Replica
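
One way to sanity-check such a plan is sketched below in Python. It assumes that segregation is done by giving each traffic type its own IP subnet, which is only one option (VLANs or dedicated adapters are others); the subnet addresses are placeholders and `overlapping_pairs` is a hypothetical helper.

```python
import ipaddress
from itertools import combinations

# Hypothetical plan: one subnet per traffic type (addresses are placeholders).
plan = {
    "Management": "10.0.1.0/24",
    "Cluster": "10.0.2.0/24",
    "Virtual machine access": "10.0.3.0/24",
    "Live migration": "10.0.4.0/24",
    "Hyper-V Replica": "10.0.5.0/24",
}


def overlapping_pairs(network_plan):
    """Return pairs of traffic types whose subnets overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in network_plan.items()}
    return [
        (a, b)
        for a, b in combinations(nets, 2)
        if nets[a].overlaps(nets[b])
    ]


print("overlapping traffic networks:", overlapping_pairs(plan))
```

An empty result means every traffic type in the plan has a subnet of its own.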
