We talk a lot about PCI in the digital-commerce space, whether that’s TLS versions, vulnerability scans, iframes, or verifying code security. What we don’t talk enough about, though, is logging, log retention, and log monitoring.
When was the last time you know someone actually looked through your commerce site’s logs? Do you know how long you are keeping those logs? Are you sure you’re logging enough information?
These questions, and others like them, are essential for developing solid “situational awareness” of your commerce site. PCI DSS Requirement 10 requires you to track and monitor access to your systems and to cardholder data. You need to know who is using the administrative functions of your site, when they are using them, and what they are doing with their administrative access. You need to be able to see who has access to the audit trails and who can turn those audit trails on and off. You even need to report on invalid access attempts to your administration pages.
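In practice, that means every administrative action should produce a structured, machine-parseable audit event. Here is a minimal sketch using Python’s standard logging module; the `log_admin_action` helper and its field names are illustrative assumptions, not a PCI-mandated format:

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical audit logger: emits one JSON line per administrative action,
# capturing the who/what/when details Requirement 10 asks about.
audit_log = logging.getLogger("audit")
audit_log.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(message)s"))
audit_log.addHandler(handler)

def log_admin_action(user_id, action, resource, success, source_ip):
    """Record an administrative action as a structured audit event."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "action": action,      # e.g. "update_price", "disable_audit_trail"
        "resource": resource,
        "success": success,    # failed attempts must be logged too
        "source_ip": source_ip,
    }
    audit_log.info(json.dumps(event))
    return event

# A failed login to an admin page is exactly the kind of event to capture:
log_admin_action("jdoe", "admin_login", "/admin", False, "203.0.113.7")
```

Emitting one JSON object per event keeps the logs trivially searchable by the aggregation tools discussed later.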
Beyond this, your application logs can tell you that:
- You’ve had a persistent attacker attempting to steal data from your site by performing low-and-slow, obfuscated SQL injection attacks from an open proxy in South Africa.
- You have a group of organized criminals using your site to validate their stock of stolen credit card numbers.
- You have an issue connecting to one of your backend data services, causing service level issues for your fulfillment processes.
- The newly deployed code is generating volumes of errors related to a change that your payment processor asked you to make.
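Patterns like these can often be surfaced with simple analysis over parsed log events. As a sketch, here is one way to flag the stolen-card-validation pattern; the `(source_ip, auth_succeeded)` event shape is an assumption about how your payment logs might be parsed, not a standard format:

```python
from collections import defaultdict

def find_card_testers(events, threshold=5):
    """Flag source IPs with a burst of failed payment authorizations,
    a common signature of criminals validating stolen card numbers.

    events: iterable of (source_ip, auth_succeeded) tuples from parsed logs.
    """
    failures = defaultdict(int)
    for source_ip, auth_succeeded in events:
        if not auth_succeeded:
            failures[source_ip] += 1
    return {ip for ip, count in failures.items() if count >= threshold}

# Six failed authorizations from one IP stand out against normal traffic:
events = [("198.51.100.9", False)] * 6 + [("192.0.2.10", True)] * 3
suspicious = find_card_testers(events)
print(suspicious)
```

A real deployment would use a sliding time window and feed the result into an alerting system, but the core idea is this small.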
The list could go on and on. The point is that there is an incredible wealth of knowledge in your logs.
These logs are different from the metrics that you are collecting from a business perspective. Your conversion rates, server response time graphs, average cart values, and so on, all provide tremendous insight into the business performance of your commerce site but provide limited security value.
Why do we need to collect logs?
There are two significant reasons why we need to be concerned about collecting and maintaining application logs. The first is the most obvious – when something goes wrong, you need to be able to trace back to what happened and when so you can begin to fix the problem. In the case of an intrusion or data breach, your logs will help trace how the attackers were able to exploit your platform, when they got in, and what they accessed, and will provide details for how to start fixing the vulnerability. The logs will also help investigators develop reports so you can begin the process of remediating the business impact of the incident.
The second reason is more “long tail” – your logs can provide insight into the who, what, when, where, and why of your site performance. Logs won’t give you the quantitative measurements showing slowdowns and speedups but, if you are doing a good job of generating and capturing them, they will show you why those slowdowns and speedups are happening. Good logs will also help your support and development teams identify and resolve problems faster and with greater confidence than they could without them.
How do we deal with logs?
Getting your level of logging right can be difficult. Too many logs and you’ll get washed away in noise. Too few, and you could miss important data trends. In either case, it’s very likely that you are going to have too many messages for a human to manage on their own.
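One practical way to strike that balance is to tune verbosity per subsystem rather than setting a single global level. A minimal sketch using Python’s standard logging module; the logger names (“checkout”, “catalog”) are hypothetical examples:

```python
import logging

# Quiet default everywhere: warnings and above only.
logging.basicConfig(level=logging.WARNING)

# The payment flow deserves more detail; a chatty catalog module gets less.
logging.getLogger("checkout").setLevel(logging.INFO)
logging.getLogger("catalog").setLevel(logging.ERROR)

logging.getLogger("checkout").info("order submitted")   # emitted
logging.getLogger("catalog").info("cache refreshed")    # suppressed as noise
```

The same per-source tuning exists in most logging frameworks and aggregation pipelines; the goal is always to keep the signal from drowning in routine chatter.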
Fortunately for us humans, many good tools exist to help us make sense of all the data in the logs. As a class, these tools are called “log aggregation” tools. The name means that the tools are meant to combine logs from many different sources into a single logging system. These tools come in many shapes and sizes and at many different costs.
Among the best free options, you will find tools like ELK and Graylog. These are very good tools and will get the job done. The software is free, but the time and server resources to run them are not. If you have people and technical resources to run them, these tools can be configured to answer many of the questions asked earlier and can be configured to satisfy PCI DSS requirements.
You have many more options if you go the commercial route. Tools such as Splunk, Loggly, Datadog, and many others offer both on-premises and cloud-based log aggregation and analysis. These tools provide additional automated insight into the logs and offer a managed service, so you don’t have to dedicate your own resources to managing the logs. The commercial terms for these tools are usually based on the volume of logs you are storing and the level of automated analysis you want performed on those logs.
Your day-to-day PCI DSS log monitoring compliance efforts become less of a burden once you have your log management solution running and tuned. All the tools mentioned allow you to configure automated alerts for known conditions and can be configured to store logs in accordance with PCI DSS retention requirements. PCI DSS compliance then becomes a matter of validating that your log management solution is functioning as expected and reviewing the results of its analysis. Not only will you improve your PCI DSS compliance, but you will also be generating value from the knowledge gleaned from the logs and their analysis.
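On retention specifically, PCI DSS Requirement 10 calls for keeping audit log history for at least a year, with the most recent three months immediately available for analysis. A minimal sketch of an automated retention check (the function and its inputs are illustrative, not part of any tool’s API):

```python
from datetime import datetime, timedelta, timezone

# PCI DSS Requirement 10: retain at least 12 months of audit log history.
RETENTION = timedelta(days=365)

def retention_ok(oldest_log_timestamp, now=None):
    """True if the stored logs span the full PCI DSS retention window."""
    now = now or datetime.now(timezone.utc)
    return now - oldest_log_timestamp >= RETENTION

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(retention_ok(datetime(2023, 1, 1, tzinfo=timezone.utc), now))  # True
print(retention_ok(datetime(2024, 3, 1, tzinfo=timezone.utc), now))  # False
```

Running a check like this on a schedule turns the retention requirement from an annual audit scramble into a continuously verified property.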
If you would like to improve your PCI DSS compliance efforts with respect to log monitoring, give Capgemini a call. You can also review the PCI Security Standards Council’s guidance for log monitoring here: https://www.pcisecuritystandards.org/documents/Effective-Daily-Log-Monitoring-Guidance.pdf