Centralized Inventory of Managed VMs in AWS

Sunil Kumar Mohanty
Fortum Technology Blog
5 min read · Apr 9, 2024

In any organization, there is a constant need to manage and operate VMs (virtual machines) centrally. Motivations include standardization, patch management, better cost management through reservations or savings plans, ensuring compliance, and listing instances that have certain applications installed.

I would personally prefer to transition away from VMs towards managed services or serverless architectures. That way we abstract away the infrastructure management and lower our operational overhead. However, despite our best efforts and good intentions, neither we nor the majority of organizations can completely avoid having at least some virtual machines to manage.

The predicament arises from legacy applications, managed services not supporting some use cases (e.g. TimescaleDB), ecosystem compatibility (integration with certain enterprise tooling), regulatory constraints, licensing considerations, and more. So the natural first step is to ensure we have an accurate, centralized repository of all VM configurations.

The AWS built-in service for configuration management is AWS Config.

AWS Config’s multi-account and multi-region data aggregation does help with the discoverability of VMs¹. However, its scope is limited to centralized configuration management: it looks at EC2 instances from the outside and sees their configuration and compliance posture. For example, it helps answer questions such as the following (one of them is sketched as a stored query after the list):

  • All EC2 instances in the organization
  • Publicly accessible EC2 instances
  • Instances with specific tags
  • Non-compliant instances
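
Config advanced queries use a SQL-like syntax and can be saved as stored queries. Below is a minimal Terraform sketch (the query text is illustrative, not from the original setup); run it against an aggregator for the organization-wide view.

resource "aws_config_stored_query" "running_instances" {
  name       = "running-ec2-instances"
  expression = <<-EOT
    SELECT accountId, awsRegion, resourceId, configuration.instanceType
    WHERE resourceType = 'AWS::EC2::Instance'
      AND configuration.state.name = 'running'
  EOT
}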

However, Config does not provide any information about an instance from the inside. For that, Systems Manager Inventory comes to the rescue.

Systems Manager Inventory — single account view

Systems Manager Inventory provides a wealth of metadata and configuration information about managed instances². The information collected by Inventory includes³:

  • Instance information: InstanceId, OS details, CPU details, etc.
  • Installed applications: applications installed through yum, apt, and the like
  • Network configuration: IP addresses, subnets
  • AWS components: details about installed AWS components such as the SSM Agent
  • Windows registry keys
  • Files: information about the filesystem
  • Custom inventory: custom metadata such as business name and contact details (a sample file follows this list)
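
For custom inventory, one option is to drop a small JSON file into the SSM Agent's custom inventory directory on the node (on Linux, typically /var/lib/amazon/ssm/<instance-id>/inventory/custom/). The type name must carry the Custom: prefix; the attribute names below are hypothetical:

{
  "SchemaVersion": "1.0",
  "TypeName": "Custom:OwnerInformation",
  "Content": {
    "BusinessName": "example-business-unit",
    "Contact": "platform-team@example.com"
  }
}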

Inventory works with any managed node, i.e., any machine that has the SSM Agent installed and can communicate with AWS Systems Manager. A managed node can therefore also be a VM from another cloud provider, an IoT device, or an on-premises server⁴.
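
Non-EC2 machines are registered through a hybrid activation; the resulting activation code and ID are used when installing the SSM Agent on the node. Below is a minimal Terraform sketch (the role name and registration limit are illustrative, not from the original setup):

resource "aws_iam_role" "hybrid_nodes" {
  name = "ssm-hybrid-activation"

  # Allow the Systems Manager service to assume this role on behalf of the nodes
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "ssm.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "hybrid_nodes" {
  role       = aws_iam_role.hybrid_nodes.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

resource "aws_ssm_activation" "hybrid_nodes" {
  name               = "on-prem-servers"
  iam_role           = aws_iam_role.hybrid_nodes.id
  registration_limit = 10
  depends_on         = [aws_iam_role_policy_attachment.hybrid_nodes]
}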

Inventory for a managed node is collected by running the AWS-GatherSoftwareInventory SSM document (provided by AWS in all accounts) against the node. The State Manager feature of Systems Manager can then be used to associate this document with all instances and run it on a schedule.

A Terraform implementation is shown below. It collects inventory from all EC2 instances in an account once a day.

resource "aws_ssm_association" "this" {
name = "AWS-GatherSoftwareInventory"
targets {
key = "InstanceIds"
values = ["*"]
}
schedule_expression = "rate(1 day)"
}
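
Note that the targets block also accepts tag-based keys (e.g. key = "tag:Environment") if only a subset of instances should be inventoried.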

The solution above gathers inventory information but is limited to an individual account. To meet my ultimate goal of aggregating the data at the organization level, i.e. across multiple accounts and regions, enter Resource Data Sync.

Resource Data Sync for Inventory — AWS Organization view

Resource Data Sync is a powerful Systems Manager feature that collects inventory information from multiple AWS accounts and regions into an S3 bucket. It stores the raw data in JSON format, which can then be queried using Athena and integrated with QuickSight for visualization.

Resource Data Sync needs to be set up in every account and every region for this to work. The easiest way to implement that is with CloudFormation StackSets. Below is an example template for the stack:

AWSTemplateFormatVersion: 2010-09-09
Description: CloudFormation template to create resources for Systems Manager Inventory
Parameters:
  ResourceBucketName:
    Type: String
    Description: Name of the S3 bucket where the inventory data will be stored

Resources:
  ResourceDataSync:
    Type: AWS::SSM::ResourceDataSync
    Properties:
      SyncName: centralized-resource-data-sync
      SyncType: SyncToDestination
      S3Destination:
        BucketName: !Ref ResourceBucketName
        BucketRegion: !Ref 'AWS::Region'
        SyncFormat: JsonSerDe

  InventoryCollection:
    Type: AWS::SSM::Association
    Properties:
      AssociationName: software-inventory
      Name: AWS-GatherSoftwareInventory
      ScheduleExpression: "rate(1 day)"
      Targets:
        - Key: InstanceIds
          Values:
            - "*"

Next is the integration with Amazon Athena. The data exported by Resource Data Sync is in JSON format and is organized in a structure that lets Glue create partitions automatically. At the root of the bucket there is one folder per inventory type (AWS:Application, AWS:InstanceInformation, and so on), and each of these is further partitioned by account, region, and resource type. Below is an example of the detailed structure:

s3://example-inventory-bucket/AWS:Application/accountid=012345678912/region=us-west-1/resourcetype=ManagedInstanceInventory/

The code below is an example implementation of a Glue crawler using Terraform:

resource "aws_glue_catalog_database" "this" {
name = "ssminventory"
}

resource "aws_glue_crawler" "this" {
database_name = aws_glue_catalog_database.this.name
name = "inventory"
role = aws_iam_role.crawler_role.arn

s3_target {
exclusions = [
"**/test.json",
]
path = "s3://${var.resource_data_sync_bucket_id}"
}

configuration = jsonencode(
{
CrawlerOutput = {
Partitions = {
AddOrUpdateBehavior = "InheritFromTable"
}
}
CreatePartitionIndex = true
Version = 1
}
)

recrawl_policy {
recrawl_behavior = "CRAWL_EVERYTHING"
}

schema_change_policy {
delete_behavior = "LOG"
update_behavior = "LOG"
}

schedule = "rate(1 day)"
}

resource "aws_iam_role" "crawler_role" {

name = "inventory-crawler-role"
path = "/inventory/"
description = "Policy used by glue crawler to access resource data sync bucket"
assume_role_policy = data.aws_iam_policy_document.crawler_role_assume_role_policy.json
}

data "aws_iam_policy_document" "crawler_role_assume_role_policy" {
statement {
actions = ["sts:AssumeRole"]
principals {
type = "Service"
identifiers = ["glue.amazonaws.com"]
}
}
}


data "aws_iam_policy_document" "glue_access_resource_data_sync" {
statement {
sid = "s3"
actions = [
"s3:GetObject",
"s3:PutObject"
]
resources = ["${var.resource_data_sync_bucket_arn}/*"]
effect = "Allow"
}
}

resource "aws_iam_role_policy" "glue_access_resource_data_sync" {
name = "access-resource-data-sync-bucket"
role = aws_iam_role.crawler_role.id
policy = data.aws_iam_policy_document.glue_access_resource_data_sync.json
}

resource "aws_iam_role_policy_attachment" "glue_service_role" {
role = aws_iam_role.crawler_role.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole"
}

Please note that Resource Data Sync automatically adds a test.json file to the above folders, which should be excluded from the scope of the crawler (hence the exclusions block above).

Unfortunately, the crawler above posed a problem for me while crawling the AWS:Application folder. The crawler creates a partition column called resourcetype because of the resourcetype=ManagedInstanceInventory folder, and the JSON files also contain a resourcetype field. As a result, the catalog table was created with a duplicate resourcetype column, leading to an error when querying from Athena. To address the issue, I created the aws_application table manually using Terraform and disabled Glue from updating the table schema, as sketched below.
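
Below is a trimmed sketch of such a table definition. The data columns follow the AWS:Application inventory schema (only a few are shown here); the important detail is that resourcetype appears only as a partition key and is deliberately left out of the data columns:

resource "aws_glue_catalog_table" "aws_application" {
  name          = "aws_application"
  database_name = aws_glue_catalog_database.this.name
  table_type    = "EXTERNAL_TABLE"

  parameters = {
    classification = "json"
  }

  # Partition columns created by the folder structure
  partition_keys {
    name = "accountid"
    type = "string"
  }
  partition_keys {
    name = "region"
    type = "string"
  }
  # Declared only here; the identically named JSON field is omitted below
  partition_keys {
    name = "resourcetype"
    type = "string"
  }

  storage_descriptor {
    location      = "s3://${var.resource_data_sync_bucket_id}/AWS:Application/"
    input_format  = "org.apache.hadoop.mapred.TextInputFormat"
    output_format = "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"

    ser_de_info {
      serialization_library = "org.openx.data.jsonserde.JsonSerDe"
    }

    columns {
      name = "name"
      type = "string"
    }
    columns {
      name = "version"
      type = "string"
    }
    columns {
      name = "resourceid"
      type = "string"
    }
  }
}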

Finally, I was able to query my complete managed VM inventory 🚀
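
With that in place, questions like "which instances have a given package installed, and at what version" become a single query. Below is a sketch saved as an Athena named query; the package name is only an example, and the lowercase column names assume Glue's crawled schema:

resource "aws_athena_named_query" "package_versions" {
  name     = "instances-with-openssl"
  database = aws_glue_catalog_database.this.name
  query    = <<-EOT
    SELECT accountid, region, resourceid, name, version
    FROM aws_application
    WHERE name = 'openssl'
    ORDER BY accountid, region
  EOT
}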
