Terraform 是一个 IT 基础架构自动化编排工具,它的口号是 "Write, Plan, and create Infrastructure as Code", 基础架构即代码。Terraform 几乎可以支持所有市面上能见到的云服务。
Terraform 要解决的就是在云上那些硬件资源分配管理的问题。相比较 Chef, Puppet, Ansible 这些软件配置工具,Terraform 提供的是软件配置之前,软硬件(基础)资源构建的问题。
当我们创建资源时,使用 terraform 比 ansible 好在哪里?
- 并发创建,速度快
- 扩容/缩容 很方便,改一个数字就行
- state 文件记录资源状态
就创建资源这个角度来说,terraform 和 ansible 都能完成,terraform能够并发,效率高很多,另外它在资源生产成功之后会在本地以一个state文件的形式记录整个资源的详细信息,而这些信息的记录使得整个模板所定义的资源可以保证前后端的高度一致性,可以有利于后续对于整个一套资源的有效的版本控制。同时Terraform拥有一个Data Source功能,利用这个功能可以实现对于已有资源的获取,比如在生产资源之前想要查看当前有哪些可用区,有哪些可用镜像等,所有的这些都可以通过DataSource实现。
Terraform 和 Ansible 的结合
- terraform 调用 ansible
- Provisioner(local-exec, remote-exec) (官方推荐)
- ansible 调用 terraform
- Ansible module for terraform(官方推荐)
- 其它方式
- terraform output 生成 inventory 给 ansible 使用(手工)
- Terraform template 渲染后,生成 inventory 给 ansible 使用(自动)
- Terraform创建的时候使用tag,ansible直接对tag 操作(完全解耦,云平台,动态主机列表)
- 第三方工具解析state文件给ansible使用。 比如 Terraform - Inventory 这个第三方工具能够将Terraform生产出的资源转化为Ansible想要的Inventory文件
安装软件
- 下载zip 文件 https://www.terraform.io/downloads.html
- 解压后直接就能用。把文件放到合适的路径,比如 /usr/local/bin
生成配置文件
-
新建目录,并生成配置文件,比如 azure.tf
# Configure the provider provider "azurerm" { version = "=1.20.0" } # Create a new resource group resource "azurerm_resource_group" "rg" { name = "royTR" location = "eastasia" }
配置有两部分:provider 和 resource。provider 告知与哪一个云平台打交道,这里是Azure;如果使用AWS,这里就写成 provider "aws"。第二部分是资源,说明要生成哪些资源,例子中是resource group,还可以继续往下写,比如网卡,存储,虚拟机等。
格式:resource resource_type resource_name { }
A resource block has two string parameters before opening the block: the resource type (first parameter) and the resource name (second parameter). The combination of the type and name must be unique in the configuration.
我已经通过Azure CLI 登陆过,所以上面provider 部分没有提供用户验证信息,如果单独配置,使用如下形式:
# Configure the Microsoft Azure Provider
provider "azurerm" {
# More information on the authentication methods supported by
# the AzureRM Provider can be found here:
# http://terraform.io/docs/providers/azurerm/index.html
subscription_id = "..."
client_id = "..."
client_secret = "..."
tenant_id = "..."
}
这些信息怎么获取? 可以用Azure CLI 的命令生成:
az ad sp create-for-rbac --role="Contributor" --scopes="/subscriptions/${SUBSCRIPTION_ID}"
详细信息参考微软文档
创建资源
-
初始化
在初始化项目的时候,Terraform 会解析目录下的*.tf文件并加载相关的 provider插件。
$ terraform init
Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...
- Downloading plugin for provider "azurerm" (1.20.0)...
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
-
apply changes
This output shows the execution plan, describing which actions Terraform will take in order to change real infrastructure to match the configuration.
$ terraform apply . An execution plan has been generated and is shown below. Resource actions are indicated with the following symbols: + create Terraform will perform the following actions: + azurerm_resource_group.rg id: <computed> location: "eastasia" name: "royTR" tags.%: <computed> Plan: 1 to add, 0 to change, 0 to destroy. Do you want to perform these actions? Terraform will perform the actions described above. Only 'yes' will be accepted to approve. Enter a value: yes # 查看 execution plan 符合期望,输入 yes 确认,之后真正执行。 azurerm_resource_group.rg: Creating... location: "" => "eastasia" name: "" => "royTR" tags.%: "" => "<computed>" azurerm_resource_group.rg: Creation complete after 1s (ID: /subscriptions/7c91db0e-eb7f-491b-997f-32cf55b85dea/resourceGroups/royTR) Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
- 查看状态
$ terraform state show
id = /subscriptions/7c91db0e-eb7f-491b-997f-32cf55b85dea/resourceGroups/royTR
location = eastasia
name = royTR
tags.% = 0
更多
$ terraform state list
module.roy-azure.azurerm_availability_set.hdp-avset
module.roy-azure.azurerm_network_interface.bastion-nic
...
$ terraform state show module.roy-azure.azurerm_virtual_machine.hdp-slave[1]
...
location = japaneast
name = roy-tf0-hdp-slave-02
...
$ terraform state show module.roy-azure.azurerm_network_interface.hdp[0]
...
ip_configuration.0.load_balancer_backend_address_pools_ids.# = 0
ip_configuration.0.load_balancer_inbound_nat_rules_ids.# = 0
ip_configuration.0.name = hdp-01-ip-conf
....
private_ip_address = 10.0.10.8
...
更改资源
-
改配置
修改刚才的文件,添加tag部分。
# Configure the provider
provider "azurerm" {
version = "=1.20.0"
}
# Create a new resource group
resource "azurerm_resource_group" "rg" {
name = "royTR"
location = "eastasia"
tags {
environment = "TF sandbox"
}
}
-
apply changes
An execution plan has been generated and is shown below. Resource actions are indicated with the following symbols: ~ update in-place Terraform will perform the following actions: ~ azurerm_resource_group.rg tags.%: "0" => "1" tags.environment: "" => "TF sandbox" Plan: 0 to add, 1 to change, 0 to destroy.
销毁基础设施
terraform destroy
$ terraform destroy
azurerm_resource_group.rg: Refreshing state... (ID: /subscriptions/xxxx/resourceGroups/royTR-rg)
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
- destroy
Terraform will perform the following actions:
- azurerm_resource_group.rg
Plan: 0 to add, 0 to change, 1 to destroy.
Do you really want to destroy all resources?
Terraform will destroy all your managed infrastructure, as shown above.
There is no undo. Only 'yes' will be accepted to confirm.
Enter a value: yes
azurerm_resource_group.rg: Destroying... (ID: /subscriptions/xxxxx/resourceGroups/royTR-rg)
azurerm_resource_group.rg: Still destroying... (ID: /subscriptions/xxxx/resourceGroups/royTR-rg, 10s elapsed)
azurerm_resource_group.rg: Still destroying... (ID: /subscriptions/xxxxx/resourceGroups/royTR-rg, 20s elapsed)
azurerm_resource_group.rg: Still destroying... (ID: /subscriptions/xxxxx/resourceGroups/royTR-rg, 30s elapsed)
azurerm_resource_group.rg: Still destroying... (ID: /subscriptions/xxxxx/resourceGroups/royTR-rg, 40s elapsed)
azurerm_resource_group.rg: Destruction complete after 48s
Destroy complete! Resources: 1 destroyed.
单独删除一个资源:
$ terraform destroy -target=module.roy-azure.azurerm_virtual_machine.hdp[2]
...
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
- destroy
Terraform will perform the following actions:
- module.roy-azure.azurerm_virtual_machine.hdp[2]
Plan: 0 to add, 0 to change, 1 to destroy.
Do you really want to destroy all resources?
....
Destroy complete! Resources: 1 destroyed.
资源的依赖关系
要创建一个VM,需要一些资源已经具备,这些资源可能包括:
- Resource group
- Virtual network and subnet
- Public IP
- Network security group
- Network interface
先来一个简单的例子,创建网络:
# Create virtual network
resource "azurerm_virtual_network" "vnet" {
name = "royTFVnet"
address_space = ["10.0.0.0/16"]
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
}
在location
等部分引入了插值(interpolation),它已经在前面的资源定义,之后直接调用,格式是 TYPE.NAME.ATTRIBUTE.
Azure 网络和虚拟机的基础架构如下图所示:
azure_vm_structure.png
把上面的图,变成代码,创建VM需要的整个文件:
# Configure the provider
provider "azurerm" {
version = "=1.20.0"
}
# Create a new resource group
resource "azurerm_resource_group" "rg" {
name = "royTR"
location = "eastasia"
tags {
environment = "TF sandbox"
}
}
# Create virtual network
resource "azurerm_virtual_network" "vnet" {
name = "royTFVnet"
address_space = ["10.0.0.0/16"]
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
}
# Create subnet
resource "azurerm_subnet" "subnet" {
name = "royTFSubnet"
resource_group_name = "${azurerm_resource_group.rg.name}"
virtual_network_name = "${azurerm_virtual_network.vnet.name}"
address_prefix = "10.0.1.0/24"
#address_prefix = "${cidrsubnet(var.cluster_cidr, 8, 10)}"
}
# Create public IP
resource "azurerm_public_ip" "publicip" {
name = "myTFPublicIP"
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
public_ip_address_allocation = "dynamic"
}
# Create Network Security Group and rule
resource "azurerm_network_security_group" "nsg" {
name = "myTFNSG"
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
security_rule {
name = "SSH"
priority = 1001
direction = "Inbound"
access = "Allow"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "22"
source_address_prefix = "*"
destination_address_prefix = "*"
}
}
# Create network interface
resource "azurerm_network_interface" "nic" {
name = "myNIC"
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
network_security_group_id = "${azurerm_network_security_group.nsg.id}"
ip_configuration {
name = "myNICConfg"
subnet_id = "${azurerm_subnet.subnet.id}"
private_ip_address_allocation = "dynamic"
public_ip_address_id = "${azurerm_public_ip.publicip.id}"
}
}
# Create a Linux virtual machine
resource "azurerm_virtual_machine" "vm" {
name = "royTFVM"
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
network_interface_ids = ["${azurerm_network_interface.nic.id}"]
vm_size = "Standard_DS1_v2"
storage_os_disk {
name = "myOsDisk"
caching = "ReadWrite"
create_option = "FromImage"
managed_disk_type = "Premium_LRS"
}
storage_image_reference {
publisher = "Canonical"
offer = "UbuntuServer"
sku = "16.04.0-LTS"
version = "latest"
}
os_profile {
computer_name = "royvm"
admin_username = "royzeng"
}
os_profile_linux_config {
disable_password_authentication = true
ssh_keys {
path = "/home/royzeng/.ssh/authorized_keys"
key_data = "ssh-rsa AAAAB3Nz{snip}hwhqT9h"
}
}
}
使用 Provisioners 进行环境配置
Provisioners 可以在资源创建/销毁时在本地/远程执行脚本。
Provisioners 通常用来引导一个资源,在销毁资源前完成清理工作,进行配置管理等。
Provisioners拥有多种类型可以满足多种需求,如:文件传输(file),本地执行(local-exec),远程执行(remote-exec)等 Provisioners可以添加在任何的resource当中:
# Create a Linux virtual machine
resource "azurerm_virtual_machine" "vm" {
<...snip...>
provisioner "file" {
connection {
type = "ssh"
user = "royzeng"
private_key = "${file("~/.ssh/id_rsa")}"
}
source = "newfile.txt"
destination = "newfile.txt"
}
provisioner "remote-exec" {
connection {
type = "ssh"
user = "royzeng"
private_key = "${file("~/.ssh/id_rsa")}"
}
inline = [
"ls -a",
"cat newfile.txt"
]
}
}
上面的方式适合有public ip,能够直接连接的机器,对于不能直接连接的vm,通过跳板来实现。
官方的方法,定义 bastion_host
resource "null_resource" "connect_private" {
connection {
bastion_host = "${aws_instance.bastion.public_ip}"
host = "${aws_instance.private.private_ip}"
user = "ubuntu"
private_key = "${file("~/.ssh/id_rsa")}"
}
provisioner "remote-exec" {
inline = ["echo 'CONNECTED to PRIVATE!'"]
}
}
或者
resource "azurerm_virtual_machine" "vm" {
<...snip...>
provisioner "remote-exec" {
connection {
bastion_host= "${azurerm_public_ip.bastion.ip_address}"
type = "ssh"
user = "${var.admin-username}"
private_key = "${file("~/.ssh/id_rsa")}"
}
inline = [
"sudo parted /dev/disk/azure/scsi1/lun0 mklabel msdos",
"sudo parted /dev/disk/azure/scsi1/lun0 mkpart primary 1 100%",
"sudo partprobe",
"sleep 5; sudo mkfs.xfs /dev/disk/azure/scsi1/lun0-part1",
"sudo mkdir /roytest",
"sudo mount /dev/disk/azure/scsi1/lun0-part1 ${var.mount_path[0]}",
"echo 'UUID='`sudo blkid -s UUID -o value $(readlink -f /dev/disk/azure/scsi1/lun0-part1)` ${var.mount_path[0]} 'xfs defaults 0 0' | sudo tee -a /etc/fstab",
"df -hl | grep /dev/sd"
]
}
}
另一种方法,用local-exec 来跳转
provisioner "local-exec" {
## 简化方式
command = "ssh -o "ProxyCommand ssh -q -W %h:%p -i mykey jump_server” -C 'echo hello'"
## 真实环境用的方式
command = <<EOF
sleep 30; ansible-playbook -i '${element(azurerm_network_interface.master_bind.*.private_ip_address, count.index)},' ${local.ansible_ssh_args} ${var.ansible_path}/mount_disk.yml --extra-vars '{
"root_user": "centos",
"deviceName": "/dev/disk/azure/scsi1/lun0",
"mountPath": "${var.mount_path}",
"bind_zone_name": "${var.bind_zone_name}"
}'
EOF
}
使用 null resource 和 trigger 来解耦
为了让ansible 脚本单独运行,而不需要创建或销毁资源,可以用 null_resource 调用 provisioner 来实现。
resource "null_resource" "datanode" {
count = "${var.count.datanode}"
triggers {
instance_ids = "${element(aws_instance.datanode.*.id, count.index)}"
}
provisioner "remote-exec" {
inline = [
...
]
connection {
type = "ssh"
user = "centos"
host = "${element(aws_instance.datanode.*.private_ip, count.index)}"
}
}
}
输入变量
新建一个文件定义变量
# file variables.tf
---
variable "prefix" {
default = "royTF"
}
variable "location" { }
variable "tags" {
type = "map"
default = {
Environment = "royDemo"
Dept = "Engineering"
}
}
文件中 location
部分没有定义,运行terraform的时候,会提示输入:
$ terraform plan -out royplan
var.location
Enter a value: eastasia
<...snip...>
This plan was saved to: royplan
To perform exactly these actions, run the following command to apply:
terraform apply "royplan"
其它输入变量的方式
命令行输入
$ terraform apply \
>> -var 'prefix=tf' \
>> -var 'location=eastasia'
文件输入
$ terraform apply \
-var-file='secret.tfvars'
默认读取文件 terraform.tfvars
,这个文件不需要单独指定。
环境变量输入
TF_VAR_name
,比如 TF_VAR_location
变量类型
- list
- map
对于 list 变量
# 定义 list 变量
variable "image-RHEL" {
type = "list"
default = ["RedHat", "RHEL", "7.5", "latest"]
}
# 调用 list 变量
storage_image_reference {
publisher = "${var.image-RHEL[0]}"
offer = "${var.image-RHEL[1]}"
sku = "${var.image-RHEL[2]}"
version = "${var.image-RHEL[3]}"
}
map 是一个可以被查询的表。
variable "sku" {
type = "map"
default = {
westus = "16.04-LTS"
eastus = "18.04-LTS"
}
}
查询方式(使用 lookup)
storage_image_reference {
publisher = "Canonical"
offer = "UbuntuServer"
sku = "${lookup(var.sku, var.location)}"
version = "latest"
}
输出变量
定义输出
output "ip" {
value = "${azurerm_public_ip.publicip.ip_address}"
}
测试
$ terraform apply
...
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
ip = 52.184.97.1
$ terraform output ip
52.184.97.1
Bug? 第一次运行,ip 输出是空的,terraform output ip
命令的结果也是空的,过一段时间才能看到结果。
$ terraform output -module=roy-azure
bastion-private-ip = 10.0.1.4
bastion-public-ip = 40.115.243.72
cluster_cidr = 10.0.0.0/16
cluster_location = japaneast
cluster_prefix = roy-tf0
cluster_resource_group = roy-tf0-rg
hdp-master-ip = 10.0.10.4,10.0.10.6,10.0.10.7
hdp-master-name = roy-tf0-hdp-master-01,roy-tf0-hdp-master-02,roy-tf0-hdp-master-03
hdp-slave-ip = 10.0.10.5,10.0.10.9,10.0.10.8
hdp-slave-name = roy-tf0-hdp-slave-01,roy-tf0-hdp-slave-02,roy-tf0-hdp-slave-03
k8s-master-ip = 10.0.20.8,10.0.20.5
k8s-master-name = roy-tf0-k8s-master-01,roy-tf0-k8s-master-02
k8s-slave-ip = 10.0.20.6,10.0.20.7,10.0.20.4
k8s-slave-name = roy-tf0-k8s-slave-01,roy-tf0-k8s-slave-02,roy-tf0-k8s-slave-03
virtual_network = roy-tf0-vnet
Data Source
DataSource 的作用可以通过输入一个资源的变量名,然后获得这个变量的其他属性字段。
用 Azure网络 来举例,提供一些信息,查询其它的属性。具体必须提供什么,能查到什么,参考这个链接。
data "azurerm_virtual_network" "test" {
name = "production"
resource_group_name = "networking"
}
output "virtual_network_id" {
value = "${data.azurerm_virtual_network.test.id}"
}
output "virtual_network_subnet" {
value = "${data.azurerm_virtual_network.test.subnets[0]}"
}
生成主机列表
ansible通过主机列表来连接目标主机,我们就要想办法让terraform来生成。(local-exec 也是一种方式,这是另一种思路:用terraform 调用ansible)
terraform 生成inventory的思路是:从模板到文件,需要先用template_file 渲染成一个字符串,然后用local_file把这个字符串输出到一个文件。
模版文件
## file inventory.tpl
[backend]
${bastion_private_ip}
[frontend]
${bastion_pub_ip}
[all:vars]
ansible_ssh_private_key_file = ${key_path}
ansible_ssh_user = dcpuser
渲染和输出
## file inventory.tf
data "template_file" "inventory" {
template = "${file("./test/inventory.tpl")}"
vars {
bastion_private_ip = "${element(azurerm_network_interface.bastion-nic.*.private_ip_address, count.index)}"
bastion_pub_ip = "${element(azurerm_public_ip.bastion.*.ip_address, count.index)}"
key_path = "~/.ssh/id_rsa"
}
}
resource "local_file" "save_inventory" {
content = "${data.template_file.inventory.rendered}"
filename = "./myhost"
}
运行后,当前目录生成文件myhost
[backend]
13.78.94.242
[frontend]
10.0.1.4
[all:vars]
ansible_ssh_private_key_file = ~/.ssh/id_rsa
ansible_ssh_user = dcpuser
对于多个主机,使用 join
来把它们合在一起。
File inventory.tf
data "template_file" "k8s" {
template = "${file("./templates/k8s.tpl")}"
vars {
k8s_master_name = "${join("\n", azurerm_virtual_machine.k8s-master.*.name)}"
}
}
resource "local_file" "k8s_file" {
content = "${data.template_file.k8s.rendered}"
filename = "./inventory/k8s-host"
}
File k8s.tpl
[kube-master]
${k8s_master_name}
Final result
[kube-master]
k8s-master-01
k8s-master-02
k8s-master-03
使用module进行代码的组织管理
Module 是 Terraform 为了管理单元化资源而设计的,是子节点,子资源,子架构模板的整合和抽象。将多种可以复用的资源定义为一个module,通过对 module 的管理简化模板的架构,降低模板管理的复杂度,这就是module的作用。
Terraform中的模块是以组的形式管理不同的Terraform配置。模块用于在Terraform中创建可重用组件,以及用于基本代码组织。每一个module都可以定义自己的input与output,方便代码进行模块化组织。
用模块,可以写更少的代码。比如用下面的代码,调用已有的module 创建vm。
调用官方module
# declare variables and defaults
variable "location" {}
variable "environment" {
default = "dev"
}
variable "vm_size" {
default = {
"dev" = "Standard_B2s"
"prod" = "Standard_D2s_v3"
}
}
# Use the network module to create a resource group, vnet and subnet
module "network" {
source = "Azure/network/azurerm"
version = "2.0.0"
location = "${var.location}"
resource_group_name = "roytest-rg"
address_space = "10.0.0.0/16"
subnet_names = ["mySubnet"]
subnet_prefixes = ["10.0.1.0/24"]
}
# Use the compute module to create the VM
module "compute" {
source = "Azure/compute/azurerm"
version = "1.2.0"
location = "${var.location}"
resource_group_name = "roytest-rg"
vnet_subnet_id = "${element(module.network.vnet_subnets, 0)}"
admin_username = "royzeng"
admin_password = "Password1234!"
remote_port = "22"
vm_os_simple = "UbuntuServer"
vm_size = "${lookup(var.vm_size, var.environment)}"
public_ip_dns = ["roydns"]
}
调用自己写的module
## file main.cf
module "roy-azure" {
source = "./test"
}
## file test/resource.tf
variable "cluster_prefix" {
type = "string"
}
variable "cluster_location" {
type = "string"
}
resource "azurerm_resource_group" "core" {
name = "${var.cluster_prefix}-rg"
location = "${var.cluster_location}"
}
参考文档
https://docs.microsoft.com/en-us/azure/virtual-machines/linux/terraform-install-configure
https://learn.hashicorp.com/terraform/azure/install_az
https://www.terraform.io/docs/providers/azurerm/r/virtual_machine.html
Create a VM cluster with Terraform https://docs.microsoft.com/en-us/azure/terraform/terraform-create-vm-cluster-with-infrastructure
http://aukjan.vanbelkum.nl/2016/02/23/Ansible-inventory-from-Terraform/
Terraform Azure modules https://registry.terraform.io/browse?offset=9&provider=azurerm
How to use Ansible with Terraform https://alex.dzyoba.com/blog/terraform-ansible/