Getting More from AWS CloudWatch

AWS CloudWatch was launched as an observability service that can monitor applications and collate many types of logs (including infrastructure logs, application logs, and audit logs), all in one place. The service was later extended to provide visualization of these log metrics and events to interpret the real-time status of a system. It’s an important part of security for AWS, as we shall see.

CloudWatch does more than collect logs; it also provides features to respond to system changes, such as degrading performance of an application running on EC2 or an instance shutting down completely. It also provides support for implementing event-driven workflows: for example, triggering a Lambda function when a file is uploaded into an AWS S3 bucket.

In this article, we’ll look into the features AWS CloudWatch provides, and discuss how these can be used in real-world systems.

AWS CloudWatch Logs Insights

AWS CloudWatch Logs is used to monitor, store, and access the logs of several AWS services, including AWS EC2, CloudTrail, ELB, and so on. Depending on the service, logs reach CloudWatch through a built-in integration, through the CloudWatch agent installed on the host (on EC2, for example), or through the awslogs log driver used by containers. For container services such as AWS ECS and AWS Fargate, AWS also introduced FireLens, which uses Fluentd or Fluent Bit to route logs to any tool a customer chooses.

AWS CloudWatch Logs collates these logs using log streams and log groups:

  • A log stream is a series of log events that have the same source.
  • A log group consists of log streams that have the same retention, monitoring, and access-control settings. For example, a Lambda function is associated with one log group, and for each function instance, there is one log stream. For EC2 and others, you can create log groups with specific streams going into each. As of now, there is no limit on the number of log streams per log group.

AWS also offers a Logs Insights tool that can be used to write queries and analyze log data, helping you effectively address operational issues. It can also auto-detect fields from AWS service logs (JSON format), which helps provide better visualization without any effort from DevOps.
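As a rough illustration of what such a query looks like, the sketch below builds the arguments for a Logs Insights query that returns the most recent log events matching a pattern. The log group name and the helper `build_insights_query` are hypothetical; the resulting dictionary is shaped so that it could be passed to boto3's `logs.start_query(**params)`, which is assumed here rather than called.

```python
import time


def build_insights_query(log_group, pattern, minutes=60, limit=20, now=None):
    """Build keyword arguments for a Logs Insights query (hypothetical helper)."""
    end = int(now if now is not None else time.time())
    # Logs Insights queries are pipelines of commands separated by "|".
    query = (
        "fields @timestamp, @message"
        f" | filter @message like /{pattern}/"
        " | sort @timestamp desc"
        f" | limit {limit}"
    )
    return {
        "logGroupName": log_group,        # assumed log group name
        "startTime": end - minutes * 60,  # epoch seconds
        "endTime": end,
        "queryString": query,
    }


# "/aws/lambda/my-function" is a placeholder, not a real log group.
params = build_insights_query("/aws/lambda/my-function", "ERROR", now=1_700_000_000)
print(params["queryString"])
```

The `@timestamp` and `@message` fields are among those that Logs Insights generates automatically for every log event, so a query like this works without any prior field configuration.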

Figure 1: CloudWatch Logs Insights (Source: AWS)

AWS CloudWatch Alarms

CloudWatch data is all about logs and metrics. Once these data are in CloudWatch, you can act on the current value of a metric or on a change in the state of a service. To raise an alert, you create an alarm in CloudWatch Alarms, which watches a metric or the result of a math expression applied to metrics. When the condition defined in the alarm is met, the alarm can take one or more actions.

You can also create a composite alarm, whose state is derived from the states of other alarms. Its rule is a boolean expression (AND, OR, NOT) over those alarms, so you can, for example, raise an alert only when all of the underlying alarms are firing at the same time.
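A minimal sketch of both kinds of alarm, under some assumptions: the alarm names, instance ID, and SNS topic ARN below are placeholders, and the helper functions are hypothetical. The dictionaries are shaped after the parameters of boto3's `cloudwatch.put_metric_alarm` and `cloudwatch.put_composite_alarm`, which are not called here.

```python
def metric_alarm(name, metric_name, instance_id, threshold, topic_arn):
    """Parameters for a simple threshold alarm on an EC2 metric."""
    return {
        "AlarmName": name,
        "Namespace": "AWS/EC2",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,              # seconds per data point
        "EvaluationPeriods": 2,     # consecutive periods that must breach
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],
    }


def composite_rule(alarm_names, operator="AND"):
    """Combine alarm states into a composite alarm rule expression."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    return f" {operator} ".join(f'ALARM("{name}")' for name in alarm_names)


# Placeholder identifiers, for illustration only.
TOPIC = "arn:aws:sns:us-east-1:123456789012:ops-alerts"
INSTANCE = "i-0123456789abcdef0"

high_cpu = metric_alarm("high-cpu", "CPUUtilization", INSTANCE, 80.0, TOPIC)
status_failed = metric_alarm("status-check-failed", "StatusCheckFailed", INSTANCE, 0.0, TOPIC)

composite = {
    "AlarmName": "instance-unhealthy",
    "AlarmRule": composite_rule([high_cpu["AlarmName"], status_failed["AlarmName"]]),
    "AlarmActions": [TOPIC],
}
print(composite["AlarmRule"])
```

Passing `operator="OR"` instead would make the composite alarm fire when either underlying alarm is in the ALARM state.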

AWS CloudWatch Events

AWS exposes metrics in several ways. Many services save log files to S3 buckets, but this mechanism isn’t optimal for security; there can be a delay of tens of minutes before data for a given event will appear here.

Log streams (mentioned above) are a second mechanism. These are typically updated more frequently than S3 buckets, but their latency is still too high for security and infrastructure monitoring.

AWS CloudWatch Events are the third mechanism, and they are the best source for monitoring. They are delivered in near real time and are triggered by changes in AWS resources. You set up rules to match events, such as an EC2 instance changing state from running to stopped, and then route them to targets like a Lambda function or a Kinesis stream. With this feature, you can keep track of operational changes as they occur and take corrective action as required.

Figure 2: AWS CloudWatch Events (Source: AWS)
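To make rule matching concrete, here is a small sketch of an event pattern for EC2 state changes, together with a deliberately simplified matcher. The `matches` function is an illustration only and is not how CloudWatch Events is implemented; it handles exact-value lists and nested objects, not the full pattern syntax. The rule name and instance ID are placeholders, and `rule_params` is shaped after boto3's `events.put_rule`, which is assumed rather than called.

```python
import json

# Pattern: EC2 instances that have entered the "stopped" or "terminated" state.
STOPPED_INSTANCE_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped", "terminated"]},
}


def matches(pattern, event):
    """Simplified matcher: a list means 'any of these values', a dict recurses."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# A trimmed-down example event; real events carry more fields.
sample_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}

# Hypothetical rule name; the pattern is passed to the API as a JSON string.
rule_params = {
    "Name": "ec2-instance-stopped",
    "EventPattern": json.dumps(STOPPED_INSTANCE_PATTERN),
    "State": "ENABLED",
}

print(matches(STOPPED_INSTANCE_PATTERN, sample_event))
```

A target such as a Lambda function would then be attached to the rule so that every matching event invokes it.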

You can also use CloudWatch Events to schedule actions that run at a specific time using a cron expression. For example, if you need to process the files uploaded to an S3 bucket every day at midnight, you can define a scheduled rule that triggers a Lambda function at that time. Alternatively, you can define a rule that triggers the Lambda function whenever the contents of that bucket change.
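The daily-at-midnight schedule could be expressed roughly as follows. CloudWatch Events cron expressions have six fields (minutes, hours, day of month, month, day of week, year) and are evaluated in UTC; the helper `daily_cron` and the rule name are hypothetical, and `schedule_rule` is shaped after boto3's `events.put_rule`, which is not called here.

```python
def daily_cron(hour=0, minute=0):
    """Build a schedule expression that fires once a day at the given UTC time."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # "?" in the day-of-week field means "no specific value".
    return f"cron({minute} {hour} * * ? *)"


# Hypothetical rule name, for illustration only.
schedule_rule = {
    "Name": "nightly-file-processing",
    "ScheduleExpression": daily_cron(),
    "State": "ENABLED",
}
print(schedule_rule["ScheduleExpression"])
```

Because the schedule is in UTC, a job that must run at local midnight needs the hour adjusted accordingly.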

AWS CloudWatch Benefits

For the sake of analyzing the benefits of CloudWatch, let’s assume your solution involves only AWS services. Here are a few of the benefits gained using CloudWatch versus other available services:

  • CloudWatch doesn’t just monitor and query logs; it also provides alarms, event-driven workflow support, and many other features. Few tools offer such a broad range of capabilities in one place.
  • The CloudWatch integration with AWS services is very smooth, with little effort required on your part. For example, Lambda functions use CloudWatch Logs by default as a collector service. Other services, such as EC2, use an agent or log driver that can be baked into an AMI.
  • CloudWatch not only collects host logs but application and audit logs (from AWS CloudTrail) as well, making it very easy for a security team to check security logs and identify security-related loopholes and anomalies.
  • CloudWatch uses the AWS IAM security mechanism to access AWS services logs. This is a built-in feature of AWS, requiring zero effort from you. For third-party services, you generally need to manage the authentication part separately.
  • CloudWatch makes managing log retention very easy, as it can integrate with AWS S3 for archiving less-used logs. With third-party services, you need to plan for storage outside of AWS, which can be costly and sometimes non-compliant.

AWS CloudWatch Use Cases

Let’s now take a look at a few of the use cases for customers to build solutions around the CloudWatch service.

Debugging via CloudWatch Synthetics and AWS X-Ray

AWS launched CloudWatch Synthetics to let developers and DevOps engineers create configurable scripts called “canaries.” These scripts monitor your application endpoints and URLs. Running on a schedule (as often as once every minute, 24/7), canaries check that these endpoints and URLs behave as your script expects. When they stop doing so, an alert is raised.
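Conceptually, a canary is a scripted check that runs on a schedule and reports failures. The sketch below shows that idea using only the Python standard library; it does not use the CloudWatch Synthetics runtime or its libraries, and the function names and URLs are hypothetical.

```python
import urllib.request


def http_status(url, timeout=5):
    """Fetch a URL and return its HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def run_canary(urls, fetch=http_status):
    """Check each URL once and return a mapping of URL to failure reason."""
    failures = {}
    for url in urls:
        try:
            status = fetch(url)
            if status != 200:
                failures[url] = f"unexpected status {status}"
        except Exception as exc:  # network errors, timeouts, HTTP errors
            failures[url] = f"request failed: {exc}"
    return failures


# A fake fetcher stands in for real HTTP calls so the example runs offline.
fake_responses = {
    "https://example.com/health": 200,
    "https://example.com/login": 503,
}
failures = run_canary(fake_responses, fetch=fake_responses.get)
print(failures)
```

In a real canary, a non-empty result would mark the run as failed, which is the signal an alarm would then act on.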

AWS also integrated CloudWatch Synthetics with its X-Ray service, enabling DevOps to trace end-to-end requests from these canaries, with the following benefits:

  • You can use CloudWatch alarms for canaries, triggering an alarm any time there is an increase in errors, faults, throttling rates, or slow responses. Using the X-Ray service map, you can also see all of these in one place for each request.
  • You can lower the mean time to resolution of ongoing failures in both upstream and downstream services.
  • Synthetics’ canaries let you identify performance bottlenecks by looking at the X-Ray map over time.

CloudWatch Alarms Based on Anomaly Detection

CloudWatch provides an anomaly-detection feature, which analyzes a period of past metric data to build a model of expected values. You set the threshold (the width of the band around the model’s expected values) that determines what is considered a “normal” range. Using CloudWatch alarms, you can then raise an alert any time the metric value rises above or falls below this range.
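An alarm of this kind might be defined along the following lines. This is a sketch under assumptions: the alarm name, instance ID, and helper `anomaly_alarm` are hypothetical, and the dictionary is shaped after boto3's `cloudwatch.put_metric_alarm`, which is not called here. The second argument of `ANOMALY_DETECTION_BAND` is the band width in standard deviations.

```python
def anomaly_alarm(name, namespace, metric_name, dimensions, band_width=2, period=300):
    """Parameters for an alarm that fires when a metric leaves its expected band."""
    return {
        "AlarmName": name,
        # Fire when the metric is above the upper or below the lower bound.
        "ComparisonOperator": "LessThanLowerOrGreaterThanUpperThreshold",
        "EvaluationPeriods": 3,
        "ThresholdMetricId": "band",  # refers to the band expression below
        "Metrics": [
            {
                "Id": "m1",
                "ReturnData": True,
                "MetricStat": {
                    "Metric": {
                        "Namespace": namespace,
                        "MetricName": metric_name,
                        "Dimensions": dimensions,
                    },
                    "Period": period,
                    "Stat": "Average",
                },
            },
            {
                "Id": "band",
                "Expression": f"ANOMALY_DETECTION_BAND(m1, {band_width})",
                "Label": "expected range",
                "ReturnData": True,
            },
        ],
    }


# Placeholder alarm name and instance ID.
alarm = anomaly_alarm(
    "cpu-anomaly",
    "AWS/EC2",
    "CPUUtilization",
    [{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
)
print(alarm["Metrics"][1]["Expression"])
```

A larger `band_width` widens the "normal" range and makes the alarm less sensitive; a smaller one does the opposite.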

AWS GuardDuty and CloudWatch Events for Slack Notifications

AWS GuardDuty is a monitoring service that collects events from CloudTrail, VPC Flow Logs, and DNS logs. It then analyzes these data using machine learning and anomaly detection, giving you valuable security intelligence.

To get your security team to act on these findings, however, you need to notify them. GuardDuty automatically sends its findings to CloudWatch Events, so all that remains is to route them to the right place. As most teams use Slack for their day-to-day communications, one approach is to set a CloudWatch Events rule that triggers a Lambda function, which then uses Slack APIs to post alerts to a configured Slack channel.
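A Lambda function for this approach could look roughly like the sketch below. It assumes a Slack incoming webhook (one of several Slack APIs that would work) whose URL is stored in a hypothetical `SLACK_WEBHOOK_URL` environment variable, and it reads only a few fields from the finding. The sample event is trimmed down and its values are illustrative.

```python
import json
import os
import urllib.request


def build_slack_message(event):
    """Turn a GuardDuty finding event into a Slack message payload."""
    detail = event.get("detail", {})
    text = (
        f"GuardDuty finding (severity {detail.get('severity', 'n/a')}): "
        f"{detail.get('title', 'untitled')}\n"
        f"Type: {detail.get('type', 'unknown')} | "
        f"Region: {detail.get('region', 'unknown')}"
    )
    return {"text": text}


def lambda_handler(event, context):
    """Entry point invoked by the CloudWatch Events rule."""
    webhook_url = os.environ["SLACK_WEBHOOK_URL"]  # hypothetical variable name
    body = json.dumps(build_slack_message(event)).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return {"status": response.status}


# Trimmed-down example of the event shape; real findings carry many more fields.
sample_event = {
    "source": "aws.guardduty",
    "detail-type": "GuardDuty Finding",
    "detail": {
        "severity": 8,
        "title": "SSH brute force attack against an EC2 instance",
        "type": "UnauthorizedAccess:EC2/SSHBruteForce",
        "region": "us-east-1",
    },
}
message = build_slack_message(sample_event)
print(message["text"])
```

The matching rule would use an event pattern with `"source": ["aws.guardduty"]` and `"detail-type": ["GuardDuty Finding"]`, optionally filtered on severity so that only significant findings reach the channel.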

Summary

As more and more applications are being built using serverless architecture, logging and monitoring have become more critical to debug and identify production issues. AWS CloudWatch helps developers and DevOps teams to collect all logs, events, and metrics in one place and analyze them to resolve issues swiftly, minimizing MTTR. CloudWatch also supports triggering recovery actions using the service’s alarms and events features.

CloudWatch has many uses beyond the ones described here. To realize its full potential for cloud web security, it should be combined with AWS GuardDuty. We’ll post another article soon about GuardDuty, and how to use it (along with CloudWatch) for security monitoring and automated remediation. Stay tuned!
