Logging and monitoring are a crucial part of robust web security today. To properly manage traffic and protect an organization’s sites, apps, and APIs, security managers must be able to:
- See at a glance the volume and composition of incoming traffic.
- Get immediate alerts when anomalous activity is detected.
- Monitor the decisions being made by their web security solution(s).
- Observe the effects of all revisions to security policies as they are made.
- And fine-tune their security posture to avoid false positive (FP) and false negative (FN) alarms.
To support these requirements, a good web security solution must provide:
- Full traffic data (headers and payloads) for all requests that are being blocked
- Full details of why these decisions are being made (necessary for monitoring the effectiveness of security policies, and reducing FP alarms)
- Full traffic data (headers and payloads) for all requests that are not being blocked (which is necessary for reducing FN alarms)
- An intuitive interface for quickly accessing this information at all time scales (from broad overviews down to individual requests), with support for potentially complex queries
- The ability to view all this data in real time, so you can react immediately to new threats
- The ability to access this data in comprehensive historical logs, to help understand trends, construct new security policies, and so on
- And the ability to create custom alerts based on traffic events.
All three of the top-tier CSPs (cloud service providers) offer native logging and monitoring tools as part of their platforms.
In this article, we’ll discuss the tools from all three providers (AWS, GCP, and Microsoft Azure), and talk about:
- What logging/monitoring abilities each provider offers
- The drawbacks and limitations of each CSP’s tools
- How to compensate for their weaknesses, and get robust logging, monitoring, and alerting capabilities for your sites, apps, and APIs regardless of the CSP that you use.
Logging and Monitoring on AWS
When considering AWS security, the primary services that come to mind are AWS WAF and AWS Shield. However, as mentioned above, logging and monitoring are also important parts of a solid security posture. For these, AWS offers Amazon CloudWatch Logs.
Amazon CloudWatch Logs allows you to store and access log files from different services, such as EC2 instances, Lambda functions, Amazon Route 53, and more. Many Amazon services, such as Amazon CloudFront, do not support direct streaming into CloudWatch; this can be worked around by writing data instead to an S3 bucket, and then using a Lambda function to write the data to CloudWatch. However, some services offer direct support for CloudWatch; for example, as of late 2021, AWS WAF and AWS Shield Advanced logs can be sent directly to a CloudWatch Logs group.
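As a sketch of the S3-to-CloudWatch glue step described above, the helper below converts the lines of a log file into the event format that the CloudWatch Logs `put_log_events` API expects. The function and resource names are illustrative, and the actual boto3 calls (shown as comments) require an AWS environment:

```python
import time

def to_log_events(lines):
    """Convert raw log lines (e.g., from a CloudFront log file in S3)
    into the list of event dicts that CloudWatch Logs put_log_events
    expects: {"timestamp": <ms epoch>, "message": <line>}."""
    # Real code would parse each line's own timestamp; we use "now"
    # for simplicity. Empty lines are skipped.
    now_ms = int(time.time() * 1000)
    return [{"timestamp": now_ms, "message": line}
            for line in lines if line.strip()]

# Inside the Lambda handler, you would fetch the S3 object that
# triggered the invocation and forward its lines (boto3 assumed):
#   logs = boto3.client("logs")
#   logs.put_log_events(logGroupName="my-group",
#                       logStreamName="my-stream",
#                       logEvents=to_log_events(lines))
```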
Logs Insights includes a language for querying logs. A single request can query up to 20 log groups, and queries can be saved for later use. Thus, complex queries only need to be written once, and can then be re-used as necessary.
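For example, a Logs Insights query that surfaces the client IPs generating the most blocked requests in an AWS WAF log group might look like the following (field names assume the standard WAF log format):

```
fields @timestamp, @message
| filter action = "BLOCK"
| stats count(*) as blocked by httpRequest.clientIp
| sort blocked desc
| limit 20
```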
CloudWatch also lets you create “metric filters” that predefine the terms and patterns CloudWatch should look for when receiving log data. Over time, log data can produce meaningful metrics you can analyze for different types of insights.
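As an illustration, a metric filter can be defined from the CLI; the log group and metric names below are hypothetical:

```shell
# Hypothetical names: count blocked requests in a WAF log group
# and publish them as a custom CloudWatch metric.
aws logs put-metric-filter \
  --log-group-name aws-waf-logs-my-web-acl \
  --filter-name BlockedRequests \
  --filter-pattern '{ $.action = "BLOCK" }' \
  --metric-transformations \
      metricName=BlockedRequests,metricNamespace=Security,metricValue=1
```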
CloudWatch has several drawbacks.
First, it requires manual setup. Admins need to understand their infrastructure and define proper filters in order to capture and analyze important data. If this isn’t done correctly, useful information can be lost or missed.
Also, as applications scale, the amount of data to be logged and analyzed grows rapidly. Most AWS services offer logging to CloudWatch, and as you increase your use of them, you also increase log data. This can make later analysis time-consuming and unwieldy.
For hybrid or multi-cloud deployments, CloudWatch isn’t ideal. If you want to log data from other sources, you’ll have to create custom logging with events in JSON format and handle them yourself. Again, with a growing infrastructure, there will be more logs and maintenance required.
Lastly, real-time access to data is generally not available. CloudWatch does not inherently offer this capability, so you have to construct workarounds yourself. You can assign a Kinesis stream, Lambda function, or Kinesis Data Firehose stream via subscriptions to receive all log data that fits within a defined filter. From there, you’re able to custom-process or analyze these logs, as well as store them using S3.
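A subscription filter of this kind can be attached from the CLI; the ARNs below are placeholders:

```shell
# Hypothetical ARNs: stream all events from a log group to Kinesis
# for custom near real-time processing downstream.
aws logs put-subscription-filter \
  --log-group-name my-app-logs \
  --filter-name AllEvents \
  --filter-pattern "" \
  --destination-arn arn:aws:kinesis:us-east-1:123456789012:stream/my-log-stream \
  --role-arn arn:aws:iam::123456789012:role/CWLtoKinesisRole
```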
Even when direct data streaming into CloudWatch is supported, Amazon still does not provide real-time availability of traffic data. Only “near real time” access is offered.
Many AWS services send logs to CloudWatch for free, and many applications will be able to operate within the AWS free tier. However, if you have a large infrastructure, you will be charged on a pay-as-you-use basis, with each CloudWatch service billing differently. More information is available on the AWS pricing page.
Logging and Monitoring on GCP
When considering web security for GCP, the first tool that comes to mind is Cloud Armor. Although this service logs its data, it does not do so separately. Instead, Cloud Armor data is sent to the Cloud Load Balancing logs.
GCP’s primary logging tool is Cloud Logging, a fully managed and scalable Google Cloud Platform service that’s divided into five smaller subservices: Logs Dashboard, log-based metrics, Logs Router, Logs Storage, and Logs Explorer.
- Logs Dashboard provides visualization of the most common log data provided by Google.
- Custom log-based metrics are based on the content of log entries (similar to CloudWatch metric filters). With these, you can closely monitor critical parts of your infrastructure. The service comes with some default metrics, such as byte count for logs, billable bytes, and exported log entries, among others.
- Logs Router allows you to route different logs to other GCP services using “sinks.” Each sink uses a filter to select which log entries are forwarded, to be processed and analyzed by other services.
- Logs Storage lets you create log buckets to store logs; you choose the retention period and, again using filters, store only the logs that are useful to you.
- Logs Explorer allows you to write queries against your logs. It uses Google’s Logging Query Language to run and save fast, complex queries, so you can write them once and run them whenever you need to. (Note: As with AWS, engineers will have to learn the language and maintain these queries when something changes. However, you can also use Python to write queries and then upload them to Google’s Cloud, as demonstrated here.) Logs Explorer also presents the latest logs grouped by timestamp, as well as some functionality to create alerts.
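For instance, a Logging Query Language filter for requests denied by Cloud Armor might look like the following (the resource type and payload fields assume Cloud Armor’s standard log format behind an external HTTP(S) load balancer):

```
resource.type="http_load_balancer"
jsonPayload.enforcedSecurityPolicy.outcome="DENY"
timestamp >= "2024-05-01T00:00:00Z"
```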
As for monitoring, the primary GCP service here is Cloud Monitoring. Cloud Armor can export its data into Cloud Monitoring; more information on this is available here.
Google Cloud has a number of weaknesses for monitoring and logging web traffic.
First, although Cloud Armor can send data into Cloud Monitoring, admins can still find it difficult to view the information that they want. It’s common to want broad visibility into everything that’s currently happening in an environment, but GCP isn’t really designed for this. Rather, metrics are provided for individual security policies, or for individual backend services.
Admins can create custom dashboards and custom queries, but this requires using the Cloud Monitoring API, which can be complicated.
As for logging, this can be awkward too. For example, logs for a Google Cloud Armor security policy are only available in the Google Cloud console. Complicated environments such as hybrid deployments can cause problems; it’s difficult (sometimes even impossible) for Cloud Logging to ingest logs from on-premises Kubernetes clusters or Knative deployments, for example. Also, Cloud Logging’s new UI is slow, with a fresh and empty account sometimes taking 3 to 4 seconds to change from one tab to another. This can become very frustrating and time-consuming if you need to track down an error using the UI.
Lastly, Google Cloud Platform doesn’t make traffic data available in real time. At best, Cloud Monitoring can provide data in near real time. At worst, log data is sometimes available only after considerable delays.
Google defines two billable metrics: ingestion and storage. These are quite complex and heavily dependent on your use of the service. More information is available at the pricing page.
Logging and Monitoring on Azure
Azure Monitor is split up into subservices that provide log capabilities, analytics, and alerts.
For logging, there’s Azure Monitor Logs. This allows you to use a version of the Kusto Query Language (KQL) to perform queries against log data collected and organized automatically by Azure. With these queries, you can then access other capabilities of the service, such as Log Analytics to analyze your data, Log Alerts to receive notifications or take automated actions when a query matches a certain result, and more. You can also visualize this data in an Azure dashboard by exporting query results to it and rendering custom charts.
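As an example, a KQL query that surfaces the noisiest client IPs in Application Gateway WAF logs might look like the following (the table and column names assume the AzureDiagnostics schema, which varies by resource type):

```
AzureDiagnostics
| where Category == "ApplicationGatewayFirewallLog"
| summarize Requests = count() by clientIp_s, bin(TimeGenerated, 5m)
| order by Requests desc
| take 20
```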
For monitoring, there’s Azure Monitor Metrics. In contrast with Azure Logs, this service doesn’t allow for the writing of complex queries, but it does support near real-time scenarios. It does this by regularly logging lightweight numerical values in a time-series database that describe a part of your system at a certain point in time. You can still perform analysis and visualization on these metrics using the Metrics Explorer and have alerts or actions triggered via rules.
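As an illustration, a metric alert rule can be created from the CLI; the resource IDs and names below are placeholders:

```shell
# Hypothetical resource IDs: alert when average CPU exceeds 80%
# over a 5-minute window, evaluated every minute.
az monitor metrics alert create \
  --name HighCpu \
  --resource-group my-rg \
  --scopes /subscriptions/<sub-id>/resourceGroups/my-rg/providers/Microsoft.Compute/virtualMachines/my-vm \
  --condition "avg Percentage CPU > 80" \
  --window-size 5m \
  --evaluation-frequency 1m
```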
Azure Monitor’s drawbacks appear when trying, for example, to analyze data over long periods. Even though Azure stores the most recent 93 days of data, you can only query a maximum of 30 days’ worth of data on a single chart. Moreover, many organizations will want more than 93 days of data when doing long-term analysis.
Like GCP and AWS, Azure Monitor requires engineers to use a query language, and Azure’s is a bit more complicated than the others. Also, its interface can be slow.
Lastly, although log data is generally more versatile than metrics, Azure doesn’t support real-time applications for logs. This can delay your reaction time when dealing with potential threats.
Azure Monitor has a rather complicated pricing structure, with each component of the service priced differently depending on usage and the tier selected (either Pay-As-You-Go or various Commitment Tiers).
Cloud Logging and Monitoring: Common Problems
As shown above, although the top-tier CSPs provide native tools for logging and monitoring, they share some substantial weaknesses.
Not designed for multi-cloud or hybrid deployments. Unsurprisingly, each CSP wants to encourage exclusive usage of its platform, and their tools are meant for this purpose. So, these services are limited in their abilities to support other providers.
Not necessarily designed for good visibility. Many of these tools were developed piecemeal, and it shows. Although they might provide details about granular performance (e.g., of a specific security policy), it can be difficult to get a good overview of entire environments. Security managers and executives will often be frustrated when they want their tools to “show me everything that’s going on in my site’s traffic right now,” only to be told that this information isn’t easily available.
Accessing data can be awkward. While each provider offers some preconfigured queries and dashboards, very few organizations will find all their needs met out of the box. Engineers will need to write and maintain queries in specific languages. Many administrative tasks can only be accomplished through APIs. And even when a UI is available, often its performance can be less than desired.
Data isn’t available in real time. As mentioned previously, it’s often vitally important (especially during an attack) to know what’s happening in a site, app, or API right now. The CSP tools can’t provide this information. At best, some data is available in near real-time; at worst, there can be a delay of several hours before it can be accessed and analyzed.
How to Get Effective, Real-Time Traffic Visibility
Traffic logging and monitoring is vitally important for many reasons: to create robust security policies, to fine-tune a security posture and eliminate false alarms, to defend against attacks as they occur, to analyze and understand long-term trends, and so on.
Reblaze is a comprehensive WAAP (Web Application and API Protection) platform that provides both real-time traffic visibility and full historical logs. Through its dashboard, users can see everything that’s happening across multiple environments, zoom in and out of different time scales, and even drill down into individual requests, all with a few clicks. An intuitive UI provides the ability to quickly construct point-and-click filters and queries, for both current and historical data, without having to learn or use a specific language.
For more information about Reblaze, or to schedule a demo, you can contact us here.