Thursday, April 17, 2014

Questioning Information Security: A couple of examples

We've discussed the necessity in information security of asking good questions (Questioning Information Security - Part 1) and how to answer those questions using data and analytics (Questioning Information Security - Part 2). Now let's apply this approach to answer two questions for a hypothetical enterprise.

Which employees have the worst security behavior?

Ever wonder which employee has the worst security behavior? New employees come in to the organization and get training in information security. They are instructed about the hazards of clicking on links in email messages from unexpected senders, the risks of using web mail and file sharing sites at work, and the potential liability of storing sensitive data on external media.

You send them out into the world to do good things, but you know that one of them is really going to cause big security problems. Looking at them as they leave the class, you are certain it is going to be one of the guys in the far left column.

Back in the trenches, the security engineers and analysts are fighting the good fight against malware infections, botnets, unsecured servers and hosts, broken security software, lost devices, and attacks against the perimeter. Lots of activity. From whom is it stemming? If you could just get to the root... who is causing these problems? Then you remember reading those cool posts where you learned that you have all the data you need to answer your security questions.

Rolling up your sleeves, you determine that identifying the person with the worst security behavior is going to require Active Directory (for employee information), the web gateway event logs (who is hitting high-risk categories), the AV logs (who is getting the malware alerts), the system management logs (which tell you patch levels and apps installed on systems), and the network vuln scan data.

You set up ETL jobs to periodically pull this data from its various sources into your PostgreSQL database, and you bind the data together using hostnames and IP addresses from the DNS logs.

You crunch the data - looking at a simple count of security events by employee over the last three months and are surprised but not surprised that 90% of your security problems come from 1% of your users.
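As a rough sketch of that crunching step, here is one way the count might look in Python. The event tuples, employee IDs, and the helper names `events_per_employee` and `share_from_top` are all made up for illustration; in practice the rows would come out of the database the ETL jobs populate.

```python
from collections import Counter

def events_per_employee(events):
    """Count security events per employee from (employee, source) tuples."""
    return Counter(employee for employee, _source in events)

def share_from_top(counts, top_fraction):
    """Fraction of all events attributable to the top `top_fraction` of employees."""
    ranked = sorted(counts.values(), reverse=True)
    top_n = max(1, round(len(ranked) * top_fraction))
    total = sum(ranked)
    return sum(ranked[:top_n]) / total if total else 0.0

# Hypothetical sample: 100 employees, one very noisy and a handful with one event.
events = [("emp001", "av")] * 60 + [("emp001", "web_gateway")] * 30
events += [(f"emp{i:03d}", "web_gateway") for i in range(2, 12)]
all_employees = [f"emp{i:03d}" for i in range(1, 101)]

counts = events_per_employee(events)
for emp in all_employees:
    counts.setdefault(emp, 0)  # employees with no events still count as users

print(counts.most_common(3))
print(f"{share_from_top(counts, 0.01):.0%} of events come from the top 1% of employees")
```

Including the employees with zero events matters here: leave them out and the "1% of users" is measured against only the people who already show up in the logs.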

So you dive in and create the reports that will drive action in the organization and come up with something like this...

Armed with this, you know who the worst security actors are in the company. Starting with the biggest offenders, your team provides individual training to set the worst offenders right. You see the curve flatten over time. You are proud of yourself. But then, you ask yourself, "Am I asking the right question?"

Which employees expose the organization to the greatest risk?

What you really care about is which employees expose the organization to the greatest risk. To figure that out you decide you need to tie in the security behavior score you've developed with the user access permissions and the system risk ratings. You have a centralized store of user system access permissions because you do periodic access permission certifications. And you have a database of the risk profile for each of those systems because of your risk management program.

Adding these together with the security behavior scoring that you already did...
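One way to picture "adding these together" is the small Python sketch below. The scoring rule (behavior score multiplied by the highest risk rating among the systems a user can reach) is only an assumption for illustration, as are the user names, scores, and ratings; your risk program may call for a different formula entirely.

```python
def exposure_score(behavior_score, accessible_systems, system_risk):
    """Scale a user's behavior score by the riskiest system they can reach."""
    highest_risk = max((system_risk.get(s, 0) for s in accessible_systems), default=0)
    return behavior_score * highest_risk

# Hypothetical inputs: behavior scores, access permissions, system risk ratings.
behavior = {"intern_a": 40, "enoch_root": 8}
access = {"intern_a": ["wiki"], "enoch_root": ["general_ledger", "treasury", "wiki"]}
system_risk = {"wiki": 1, "general_ledger": 9, "treasury": 10}

exposure = {
    user: exposure_score(score, access.get(user, []), system_risk)
    for user, score in behavior.items()
}
ranked = sorted(exposure.items(), key=lambda kv: kv[1], reverse=True)
for user, score in ranked:
    print(user, score)
```

With these made-up numbers the user with the better behavior score still lands at the top of the list, because of what that user can reach.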

Some tweaks to your report. Done! As it turns out, some of the people with better security behavior do need some attention, like Enoch Root, your CFO!

And by the way, it turns out that this guy is the one who was causing you all the grief :)

Friday, April 11, 2014

Questioning Information Security - It's all about the data

In Questioning Information Security Part 1, I argued that your security is only as good as the questions you ask. If you never ask the questions - 'Is my network exposed to compromise through third-party connectivity?', 'How can my password reset function be defeated?', or 'Who is using unauthorized systems on my network?' - you will never know the answers.

But great questions are only the seed of the answer. Just as a scientific hypothesis necessitates a quest of rigorous experimentation, an Information Security question requires data and analytics.

Data Collection

Fortunately, your environment has all the data needed to answer your security questions. Let me say it again - your environment has all the data you need to answer your security questions. The diagram below shows some of the data that is commonly available in enterprise network environments.

Mixed together properly, your DNS activity, web logs, firewall logs, endpoint events, malware events, AD data, and net flow hold rich treasures to answer any number of questions - ones that you are asking and ones that you haven't yet asked.

But how do you collect it? How do you process it? If your last attempt at wrangling and analyzing large amounts of data was back in 2005, you probably felt a bit like this guy.  The relational databases just didn't scale. It was expensive. It was a lot of work. And queries took forever.

The world of data processing began to change in the mid-2000s when Hadoop, an open source implementation of Google's MapReduce model for parallel processing of huge data sets, emerged under the Apache Software Foundation. As Hadoop became more accessible and associated technologies such as Hive and Pig matured, wrangling this data wasn't so difficult any more. Now data analytics can occur on an industrial scale.

Massive data sets can be loaded onto commodity hardware running open source software and analyzed effectively, with complex queries completing in a reasonable time, something that was previously unthinkable. With the NoSQL paradigm, the shackles and overhead of relational data structures are off, and structured and unstructured data can be processed from all kinds of sources.

Of course, you don't have to collect all of the data in your environment to answer really meaningful security questions. Just start with the data necessary to answer the core questions well.

Data Correlation

Some cool questions can be answered by simply analyzing a single dimension of data that spans a long period of time, such as firewall or DNS logs. Things get exponentially more interesting when you correlate the different data sources together. How do you do this though with data from a variety of sources that was really never 'designed' to be together?

The common denominator of the data is an IP address or a hostname. All your data that interacts with the network can be tied pretty easily to one of these two attributes. With data collected over time and correlated, really interesting questions can start to be answered. Take Active Directory - it is useful to correlate other data to it because that is where you can tie in to the users behind the events. AD doesn't have the IP address of each user, but the domain authentication logs do. Tie that to your DNS logs and you can get down to a hostname. The process is similar for just about any other data source. I have found it useful to do this data enrichment during the data load processing.
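Here is a minimal Python sketch of that enrichment, assuming the authentication log has already been parsed into (timestamp, user, IP) records and the DNS data into an IP-to-hostname mapping. The record layouts, sample values, and helper names (`build_auth_index`, `user_at`, `enrich`) are hypothetical. An event is attributed to whoever most recently authenticated from its IP address at or before the event time.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import datetime

def build_auth_index(auth_log):
    """Group (timestamp, user, ip) auth records by IP, ordered by time."""
    index = defaultdict(list)
    for ts, user, ip in sorted(auth_log):
        index[ip].append((ts, user))
    return index

def user_at(index, ip, when):
    """User who most recently authenticated from `ip` at or before `when`."""
    records = index.get(ip, [])
    pos = bisect_right([ts for ts, _user in records], when)
    return records[pos - 1][1] if pos else None

def enrich(event, index, ip_to_host):
    """Return a copy of the event with user and hostname attached."""
    return dict(event,
                user=user_at(index, event["ip"], event["time"]),
                hostname=ip_to_host.get(event["ip"]))

# Hypothetical sample data: two users on the same IP at different times.
auth_log = [
    (datetime(2014, 4, 1, 8, 0), "alice", "10.0.0.5"),
    (datetime(2014, 4, 1, 13, 0), "bob", "10.0.0.5"),
]
ip_to_host = {"10.0.0.5": "lt-0042"}
index = build_auth_index(auth_log)

event = {"time": datetime(2014, 4, 1, 9, 30), "ip": "10.0.0.5", "type": "malware_alert"}
enriched = enrich(event, index, ip_to_host)
print(enriched["user"], enriched["hostname"])
```

The time-aware lookup is the important part. IP addresses get reassigned, so joining on IP alone would pin events on whoever held the address last.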

Data Contextualization

Data contextualization is simply the process of taking that security data and putting it within the context of its place in the business, the environment, the overall risk of the enterprise. Then, something as lonely as a vulnerability report can be really useful. Then you can know how important that vuln is because of the risk profile of the system it resides in, and you can report vulnerabilities by business unit or, better, by business process. You can tie vulns to specific people within specific business units. Lots of places you can go with other data.
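To make the vulnerability example concrete, here is a small Python sketch under a few assumptions: each scanner finding carries a host and a severity, and a separate (hypothetical) inventory maps each host to a business unit and a risk rating. Multiplying severity by the system's risk rating is just one plausible weighting, not a standard.

```python
from collections import defaultdict

def contextualize(findings, systems):
    """Attach business unit and a risk-weighted score to each scanner finding."""
    rows = []
    for finding in findings:
        ctx = systems.get(finding["host"], {"business_unit": "unknown", "risk": 1})
        rows.append({**finding,
                     "business_unit": ctx["business_unit"],
                     "weighted": finding["severity"] * ctx["risk"]})
    return rows

def by_business_unit(rows):
    """Sum the weighted scores per business unit."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["business_unit"]] += row["weighted"]
    return dict(totals)

# Hypothetical findings and system inventory.
findings = [
    {"host": "pay-db-01", "vuln": "VULN-A", "severity": 5.0},
    {"host": "lobby-kiosk", "vuln": "VULN-A", "severity": 5.0},
    {"host": "pay-db-01", "vuln": "VULN-B", "severity": 7.5},
]
systems = {
    "pay-db-01": {"business_unit": "retail", "risk": 5},
    "lobby-kiosk": {"business_unit": "facilities", "risk": 1},
}

rows = contextualize(findings, systems)
totals = by_business_unit(rows)
print(totals)
```

The same finding (VULN-A) scores five times higher on the payment database than on the lobby kiosk, which is the whole point of adding context.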


Ultimately, these answers to the questions you've asked are valuable insofar as they result in action, which requires good communication. One of the goals is to make it valuable to the business and create a meaningful business-specific dashboard that isn't color-coded based on swags.

Rather, the dashboard is based on real data such that you can dive in to its dimensions and layers, such as the data behind outsider fraudulent transactions in the retail division. Or employee accidental data loss in the R&D group. Being driven and derived from data, you might see rates of non-compliance with use of encryption on laptops and host-based external media controls and perhaps promiscuous policies related to use of cloud storage.

In the last post in this series, I'll expand on ideas for questions to answer using data collection, correlation, and analytics. 

Thursday, April 10, 2014

Questioning Information Security - You are only as good as your questions

Your security is only as good as the questions you ask. It is the questions that drive the search for answers. And the answer drives informed action or inaction. Anything else is a random, uninformed walk. So, as you shape your security strategy to support the innovations of the business, it is through asking good questions and finding correct answers that effective security is achieved. No one else but the enemy will tell you the questions you should have asked and the answers you should have come up with. But by then it is too late. Because they told you by running all over your systems.

Before we jump in to the information security side of this, let's take a look at the historical implications of leaders who didn't ask effective questions about their own security.

Wars and Empires Lost 

Wars and Empires have been lost because those charged with defending their country did not ask the questions needed to correctly determine the defenses necessary to defeat their enemy. Darius III ruled Persia from 336 to 330 BC. At its height, the Persian Empire reached in to three continents, spanning over 8 million square kilometers. Alexander, King of Macedonia, had eyes on creating a vast empire. To do that required conquering Persia.

Darius knew that Alexander had designs on his empire. Unfortunately, Darius' spies failed to provide him good intelligence about the weaponry of Alexander's army. They didn't ask the question, 'is my military weaponry sufficient to defend against an attack by Alexander?' As it turned out - no. His spears were just a bit too short.

And what happens when your spear is too short? You lose. And Alexander, instead of being 'Alexander the So-So' became Alexander the Great. And who talks about Darius III?

Questioning Information Security 

Enterprises have sustained massive losses because they didn't ask the right questions. The breaches of the NSA, Target, Neiman Marcus, TJ Maxx, 7-11, Heartland Payment Systems, RBS WorldPay and others are rooted in failures to ask good questions, analyze them properly, and act on the answers.

Here is a framework from which we'll discuss this idea of 'Question Security'. It is really simple. Ask questions, answer the questions through data collection and analytics, and act on the answers.

Lame Questions
Think about your own enterprise within this framework. How good are you? Honestly. Most companies are doing something like this (below) where they are only collecting some limited, siloed data and acting on it.

And what are the questions being asked? Pretty weak ones and, frankly, ones that aren't terribly useful on their own. What vulnerabilities does this scanner tell me that I have? What vulnerabilities does this security consultant (who is likely running the same scanner) tell me that I have?

Lame Answers
The type of answers you get from questions like this - what vulnerabilities does this scanner tell me that I have - aren't too useful for protecting an enterprise. Nothing against scanners, but really, how useful is this?

On top of that, in many cases the data is not collected frequently enough. For example, many organizations only scan their perimeter once a quarter. A lot can change in a quarter. That is the equivalent of only doing security event monitoring once a quarter. 'Any attacks? No. I looked at traffic for a few days a couple of months ago. Didn't see anything.' Who would do that? Well...

So, not only are the questions being asked and answered weak, but the questions aren't being answered very frequently so the information is STALE!

Good Questions Lead to Good Answers
Answers all start with a question. And good questions lead to good answers. So what questions are you asking?

Here are some examples of good questions:
  • How did my Internet footprint change between yesterday and today?
  • Who is using unauthorized systems on the network?
  • Which of my users has the worst security behavior?
  • Which of my users exposes the organization to the greatest risk?
  • What is the security profile of my customer care business unit across all people, applications, and infrastructure?
With some good questions staged, we'll dive in to the data stuff required to answer these good questions here.
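To give a taste of how small the first step can be, take the first question in the list above. If you keep a daily snapshot of which (host, port) pairs are reachable from the Internet, the change between yesterday and today is a set difference. The snapshot format, the `footprint_diff` helper, and the sample hosts below are made up for illustration.

```python
def footprint_diff(yesterday, today):
    """Compare two sets of (host, port) pairs seen exposed to the Internet."""
    return {
        "opened": sorted(today - yesterday),
        "closed": sorted(yesterday - today),
    }

# Hypothetical snapshots from two consecutive daily perimeter scans.
yesterday = {("www.example.com", 443), ("mail.example.com", 25)}
today = {("www.example.com", 443), ("mail.example.com", 25), ("dev.example.com", 3389)}

changes = footprint_diff(yesterday, today)
print("Newly exposed:", changes["opened"])
print("No longer exposed:", changes["closed"])
```

The hard part is collecting a trustworthy snapshot every day, not the comparison itself.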