- EC2 instances must be automatically added to Active Directory (AD) on provisioning and removed from AD on termination
- Each EC2 instance must have 3 private IP addresses (required for MS SQL Always-On) assigned by DHCP
- An NLB must be used to expose the MS SQL Always-On Listener because failing over through it is faster than changing a CNAME value
- Dedicated ASG to keep each instance fault tolerant
- Dedicated data disks are re-attached to the new instance in the ASG with the same drive letters
1. Adding instances to AD on provisioning
cfn-init is used to execute the join-ad.ps1 script on the instance during provisioning.
After the instance is joined to AD, its hostname is changed from the default to the instance ID by
AWS-JoinDirectoryServiceDomain isn’t used because it works only with AWS Directory Service.
2. Removing instances from AD on termination
When an instance is terminated, it should be disabled in Active Directory so that the directory stays up to date.
AWS SSM doesn’t have built-in functionality for this, but it can execute the Remove-Computer command on the instance.
When an instance is terminated via the ASG:
- ASG sends event to CloudWatch Events
- CloudWatch Events rule triggers Lambda
- Lambda triggers SSM
- SSM executes script on instance
- Lambda checks SSM execution result
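The Lambda part of the steps above could be sketched roughly as follows. This is a minimal illustration, not the code shipped in this repository: the function names are hypothetical, the PowerShell command string is a placeholder (a real unjoin also needs the domain admin credential), and the retry counts are arbitrary.

```python
import time
from typing import Any, Callable, Dict

TERMINATE_DETAIL_TYPE = "EC2 Instance-terminate Lifecycle Action"

# Statuses after which an SSM command invocation no longer changes.
FINAL_STATUSES = {"Success", "Failed", "Cancelled", "TimedOut"}


def build_ssm_request(event: Dict[str, Any]) -> Dict[str, Any]:
    """Turn an ASG terminate lifecycle event into SSM send_command arguments."""
    if event.get("detail-type") != TERMINATE_DETAIL_TYPE:
        raise ValueError(f"unexpected detail-type: {event.get('detail-type')!r}")
    instance_id = event["detail"]["EC2InstanceId"]
    return {
        "InstanceIds": [instance_id],
        "DocumentName": "AWS-RunPowerShellScript",
        # Placeholder command (assumption): the real script would also
        # pass the domain admin credential to Remove-Computer.
        "Parameters": {"commands": ["Remove-Computer -Force"]},
    }


def wait_for_status(
    fetch_status: Callable[[], str],
    attempts: int = 10,
    delay_seconds: float = 5.0,
) -> str:
    """Poll fetch_status until the command reaches a final status."""
    for _ in range(attempts):
        status = fetch_status()
        if status in FINAL_STATUSES:
            return status
        time.sleep(delay_seconds)
    raise TimeoutError("SSM command did not finish in time")
```

Inside a Lambda handler the request would typically be passed to boto3 as `ssm.send_command(**build_ssm_request(event))`, and `fetch_status` would wrap `ssm.get_command_invocation(CommandId=..., InstanceId=...)["Status"]`. Keeping both helpers free of AWS calls makes them easy to test locally with a sample event.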
3. cfn-init and UserData logs export to CloudWatch Logs
All PowerShell scripts located in bootstrap-scripts write their logs to the default cfn-init log file:
Amazon SSM Agent is used to ship those logs to CloudWatch Logs. The Amazon SSM Agent config is set during cfn-init execution. This keeps the logs available after the instance is terminated.
The AMI may ship an outdated version of Amazon SSM Agent, so during cfn-init execution the existing version is replaced with the latest one.
4. Domain admin password
The domain admin password is stored in AWS Secrets Manager. Both InstanceRole and LambdaRole must have read access to that secret.
Known issues and workarounds
1. Flow for attaching and initializing Windows disks and creating partitions with specific drive letters
EBS disks are attached to EC2 instances with
Attached disks are initialized in Windows with
Add-EC2Volume doesn’t provide any output that can be used as input for
As a workaround, EBS disks are attached one by one. Each newly attached disk is passed to Initialize-Disk not as the output of Add-EC2Volume but as the output of:
Get-Disk | Where-Object PartitionStyle -eq RAW
Once the disk is initialized, a new partition is created and a specific drive letter is assigned:
New-Partition -DriveLetter X
2. Disks re-attachment in case of instance replacement
Disks are not included in the LaunchTemplate; they are created and deleted together with the CloudFormation stack.
When an instance in the ASG fails, it is replaced with a new instance.
The same disks are attached to the new instance in the same order. This is done by UserData scripts that reference the existing Volume IDs.
When the disks are re-attached, they keep the same drive letters in Windows.
However, if new disks were added to Windows manually before UserData executed, the drive letters will change.
3. Windows hostname is shorter than AWS InstanceID
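For context, a Windows (NetBIOS) computer name holds at most 15 characters, while a current-format instance ID (i- followed by 17 hex characters) is 19 characters long, so the ID has to be shortened before it can be used as the hostname. The sketch below shows one possible shortening rule. It is an assumption for illustration only, not necessarily what join-ad.ps1 does, and the function name is hypothetical.

```python
NETBIOS_MAX_LENGTH = 15  # maximum length of a Windows computer name


def hostname_from_instance_id(instance_id: str) -> str:
    """Shorten an instance ID so that it fits into a NetBIOS name.

    Assumed rule: keep the "i-" prefix and the trailing hex characters,
    dropping characters from the start of the hex part.
    """
    if len(instance_id) <= NETBIOS_MAX_LENGTH:
        # Old-format IDs (i- plus 8 hex characters) already fit.
        return instance_id
    prefix, _, suffix = instance_id.partition("-")
    keep = NETBIOS_MAX_LENGTH - len(prefix) - 1  # room left after "i-"
    return f"{prefix}-{suffix[-keep:]}"
```

Any truncation can in principle produce collisions between two instance IDs that share the retained characters, so a real deployment may want to check AD for an existing computer account with the same name before renaming.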
4. First parse of UserData log by Amazon SSM Agent is not correct
Amazon SSM Agent is configured with the correct timestamp mask for the UserData execution log:
"TimestampFormat": "yyyy/MM/dd HH:mm:ss'Z':"
However, the first batch of events is not displayed correctly in the CloudWatch Logs console (later events are displayed correctly).
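To check offline whether a given log line matches that mask, the same pattern can be expressed as a Python strptime format. This is only a rough sketch: the sample line in the usage note is made up, and the function name is hypothetical.

```python
from datetime import datetime
from typing import Tuple

# Python equivalent of the .NET pattern yyyy/MM/dd HH:mm:ss'Z':
STRPTIME_FORMAT = "%Y/%m/%d %H:%M:%SZ:"
# Length of a rendered timestamp prefix such as "2019/05/01 12:34:56Z:"
PREFIX_LENGTH = len("2000/01/01 00:00:00Z:")


def split_log_line(line: str) -> Tuple[datetime, str]:
    """Split a log line into its timestamp and the remaining message.

    Raises ValueError if the line does not start with a matching timestamp.
    """
    stamp = datetime.strptime(line[:PREFIX_LENGTH], STRPTIME_FORMAT)
    return stamp, line[PREFIX_LENGTH:].strip()
```

For example, `split_log_line("2019/05/01 12:34:56Z: Starting UserData")` yields the timestamp 2019-05-01 12:34:56 and the message `Starting UserData`. Lines that raise ValueError are the ones the mask would not recognize as the start of a new event.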
5. AWS SSM isn’t stable enough for Windows
It should be replaced with Elastic Filebeat or something similar.
AWS SSM and CloudWatch are used for demo purposes only.
Production usage caution
This code is for demo purposes only and should never be used in production.
- Sample event to use for Lambda testing (replace
"detail-type": "EC2 Instance-terminate Lifecycle Action",
- This template can be used to add missing parts to quickstart-microsoft-sql