GCE: Changing the Host Error Wait Time
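A host error (compute.instances.hostError) means that the physical machine hosting the VM had a hardware or software problem that caused the VM to crash. A host error that involves a complete hardware failure, or other hardware problems, can prevent the VM from being live migrated. If your VM is set to restart automatically (the default), Google typically restarts it within 3 minutes of detecting the error. Depending on the problem, the restart can take up to 5.5 minutes.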
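Sometimes a VM becomes unresponsive before the host error is detected. You can use the --host-error-timeout-seconds flag (Preview) to shorten the time Compute Engine waits before restarting or terminating the VM. The flag sets the maximum time that Compute Engine waits, after detecting that the VM is unresponsive, before it restarts or terminates the VM. For details, see "Set availability policies" in the Compute Engine documentation.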
1. Shortening the host error wait time
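hostErrorTimeoutSeconds (Preview): sets the maximum time, in seconds, that Compute Engine waits after detecting that a VM is unresponsive before it restarts or terminates the VM. Possible values:
- [Default] Unset: Compute Engine waits up to 5.5 minutes (330 seconds) before restarting the unresponsive VM.
- A number of seconds between 90 and 330, in increments of 30: sets how long Compute Engine waits before restarting the unresponsive VM.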
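The value constraint above (90 to 330 seconds, in steps of 30) is easy to get wrong in scripts, so it may help to check a value before passing it to gcloud. The following is a minimal sketch in bash; the function name is_valid_host_error_timeout is hypothetical and not part of gcloud.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if the argument is an allowed value for
# --host-error-timeout-seconds (90..330 in increments of 30), 1 otherwise.
is_valid_host_error_timeout() {
  local t="$1"
  # Reject empty strings and anything that is not purely digits.
  case "$t" in
    ''|*[!0-9]*) return 1 ;;
  esac
  # 10# forces base 10 so that a value such as "090" is not read as octal.
  t=$((10#$t))
  [ "$t" -ge 90 ] && [ "$t" -le 330 ] && [ $((t % 30)) -eq 0 ]
}
```

For example, `is_valid_host_error_timeout 120 && echo ok` prints ok, while 100 (not a multiple of 30) and 360 (above the maximum) are rejected.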
Example:
gcloud beta compute instances set-scheduling wanggaoli-dasan \
--zone=asia-northeast1-b \
--host-error-timeout-seconds=90
Audit logs (the first and last entries for the operation):
{
"protoPayload": {
"@type": "type.googleapis.com/google.cloud.audit.AuditLog",
"authenticationInfo": {
"principalEmail": "wanggaoli@yunion-hk.com",
"principalSubject": "user:wanggaoli@yunion-hk.com"
},
"requestMetadata": {
"callerIp": "34.168.207.222",
"callerSuppliedUserAgent": "google-cloud-sdk gcloud/405.0.0 command/gcloud.beta.compute.instances.set-scheduling invocation-id/63e829c5d6044343a26f86aedb13cdfd environment/devshell environment-version/None interactive/True from-script/False python/3.9.2 term/screen (Linux 5.10.133+),gzip(gfe)",
"requestAttributes": {
"time": "2022-10-21T05:55:12.761312Z",
"auth": {}
},
"destinationAttributes": {}
},
"serviceName": "compute.googleapis.com",
"methodName": "beta.compute.instances.setScheduling",
"authorizationInfo": [
{
"permission": "compute.instances.setScheduling",
"granted": true,
"resourceAttributes": {
"service": "compute",
"name": "projects/mec-test-344202/zones/asia-northeast1-b/instances/wanggaoli-dasan",
"type": "compute.instances"
}
}
],
"resourceName": "projects/mec-test-344202/zones/asia-northeast1-b/instances/wanggaoli-dasan",
"request": {
"@type": "type.googleapis.com/compute.instances.setScheduling",
"hostErrorTimeoutSeconds": "90"
},
"response": {
"selfLinkWithId": "https://www.googleapis.com/compute/beta/projects/mec-test-344202/zones/asia-northeast1-b/operations/5345066212986423983",
"status": "RUNNING",
"targetLink": "https://www.googleapis.com/compute/beta/projects/mec-test-344202/zones/asia-northeast1-b/instances/wanggaoli-dasan",
"zone": "https://www.googleapis.com/compute/beta/projects/mec-test-344202/zones/asia-northeast1-b",
"progress": "0",
"startTime": "2022-10-20T22:55:12.620-07:00",
"id": "5345066212986423983",
"@type": "type.googleapis.com/operation",
"name": "operation-1666331711770-5eb8515c21079-cb78d67c-81a0d17f",
"selfLink": "https://www.googleapis.com/compute/beta/projects/mec-test-344202/zones/asia-northeast1-b/operations/operation-1666331711770-5eb8515c21079-cb78d67c-81a0d17f",
"targetId": "4609340071823249091",
"insertTime": "2022-10-20T22:55:12.609-07:00",
"operationType": "setScheduling",
"user": "wanggaoli@yunion-hk.com"
},
"resourceLocation": {
"currentLocations": [
"asia-northeast1-b"
]
}
},
"insertId": "f9l5ube2i15w",
"resource": {
"type": "gce_instance",
"labels": {
"zone": "asia-northeast1-b",
"project_id": "mec-test-344202",
"instance_id": "4609340071823249091"
}
},
"timestamp": "2022-10-21T05:55:11.824199Z",
"severity": "NOTICE",
"logName": "projects/mec-test-344202/logs/cloudaudit.googleapis.com%2Factivity",
"operation": {
"id": "operation-1666331711770-5eb8515c21079-cb78d67c-81a0d17f",
"producer": "compute.googleapis.com",
"first": true
},
"receiveTimestamp": "2022-10-21T05:55:13.732887733Z"
}
{
"protoPayload": {
"@type": "type.googleapis.com/google.cloud.audit.AuditLog",
"authenticationInfo": {
"principalEmail": "wanggaoli@yunion-hk.com",
"principalSubject": "user:wanggaoli@yunion-hk.com"
},
"requestMetadata": {
"callerIp": "34.168.207.222",
"callerSuppliedUserAgent": "google-cloud-sdk gcloud/405.0.0 command/gcloud.beta.compute.instances.set-scheduling invocation-id/63e829c5d6044343a26f86aedb13cdfd environment/devshell environment-version/None interactive/True from-script/False python/3.9.2 term/screen (Linux 5.10.133+),gzip(gfe)",
"requestAttributes": {},
"destinationAttributes": {}
},
"serviceName": "compute.googleapis.com",
"methodName": "beta.compute.instances.setScheduling",
"resourceName": "projects/mec-test-344202/zones/asia-northeast1-b/instances/wanggaoli-dasan",
"request": {
"@type": "type.googleapis.com/compute.instances.setScheduling"
}
},
"insertId": "-l2ndfwddik4",
"resource": {
"type": "gce_instance",
"labels": {
"project_id": "mec-test-344202",
"zone": "asia-northeast1-b",
"instance_id": "4609340071823249091"
}
},
"timestamp": "2022-10-21T05:55:15.286348Z",
"severity": "NOTICE",
"logName": "projects/mec-test-344202/logs/cloudaudit.googleapis.com%2Factivity",
"operation": {
"id": "operation-1666331711770-5eb8515c21079-cb78d67c-81a0d17f",
"producer": "compute.googleapis.com",
"last": true
},
"receiveTimestamp": "2022-10-21T05:55:15.745905867Z"
}
Updating multiple VMs:
# inslist.txt contains the VM names, one per line
while IFS= read -r name; do gcloud beta compute instances set-scheduling "$name" --host-error-timeout-seconds=90 --zone=asia-northeast1-b; done < inslist.txt
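For longer lists it can be useful to preview the commands first and to keep going when one VM fails. The sketch below is one way to do that, not the only one: the function name set_timeout_for_list and the GCLOUD override variable are hypothetical conveniences, while the gcloud command and flags are the ones used above.

```shell
#!/usr/bin/env bash
# GCLOUD is a hypothetical override: run with GCLOUD=echo to print the
# commands instead of executing them.
GCLOUD="${GCLOUD:-gcloud}"

# Usage: set_timeout_for_list LIST_FILE ZONE SECONDS
set_timeout_for_list() {
  local list_file="$1" zone="$2" seconds="$3" name
  # "|| [ -n "$name" ]" also handles a last line without a trailing newline.
  while IFS= read -r name || [ -n "$name" ]; do
    # Skip blank lines and lines starting with '#'.
    case "$name" in
      ''|'#'*) continue ;;
    esac
    # </dev/null keeps the command from consuming the rest of the list on stdin.
    "$GCLOUD" beta compute instances set-scheduling "$name" \
      --zone="$zone" \
      --host-error-timeout-seconds="$seconds" < /dev/null \
      || echo "failed: $name" >&2
  done < "$list_file"
}
```

A dry run would look like `GCLOUD=echo; set_timeout_for_list inslist.txt asia-northeast1-b 90`; setting GCLOUD back to gcloud applies the change.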
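2. Getting migration notifications
The value of the maintenance-event metadata key changes 60 seconds before a maintenance event starts. This gives your application code a way to trigger any tasks you want to run before the maintenance event, such as backing up data or updating logs.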
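A minimal sketch of how a script running on the VM might watch for this change is shown below. It assumes the standard metadata server path for the instance's maintenance-event key and the wait_for_change query parameter, which makes the request block until the value changes. The handle_event function is a hypothetical hook; replace its body with your own pre-maintenance tasks.

```shell
#!/usr/bin/env bash
# Assumed metadata server URL for the maintenance-event key.
METADATA_URL="http://metadata.google.internal/computeMetadata/v1/instance/maintenance-event"

# Hypothetical hook: decide what to do for a given key value.
# NONE means no maintenance event is pending.
handle_event() {
  case "$1" in
    NONE) echo "no maintenance pending" ;;
    *)    echo "maintenance event: $1; running pre-maintenance tasks" ;;
  esac
}

# Blocks on the metadata server and calls handle_event on every change.
watch_maintenance_event() {
  local event
  while true; do
    # The Metadata-Flavor header is required by the metadata server.
    if event="$(curl -sf -H 'Metadata-Flavor: Google' \
        "${METADATA_URL}?wait_for_change=true")"; then
      handle_event "$event"
    else
      sleep 1  # request failed; retry after a short pause
    fi
  done
}
```

Call `watch_maintenance_event` from a long-running service on the VM. Because the notice arrives only about 60 seconds ahead, the tasks started from handle_event should be short.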