Thursday, January 15, 2009

Distilled vol.1,2 - Security Metrics

-- Aurora Report summary by Jayson Cavendish based on the book by Andrew Jaquith - Security Metrics - Replacing Fear, Uncertainty, and Doubt.

 

Foreword (Daniel E. Geer Jr.)

 

"To measure is to know" - Maxwell

 

Security is a means, not an end.

 

"The purpose of risk management is to improve the future not explain the past" - Borge

 

"Risk management means taking deliberate action to shift the odds in your favor -- increasing the odds of good outcomes and reducing the odds of bad outcomes." - Borge

 

Security metrics are the servants of risk management, and risk management is about making decisions. Therefore, the only security metrics we are interested in are those that support decision making about risk for the purpose of managing that risk.

 

We need to understand, quantify, measure, score, package and trade digital security risks as effectively as all the other risks with which the financial services sector already deals.

 

When speaking about VaR (Value at Risk) a top economist leans over the lectern at the end of the day and says: "It works because there is zero ambiguity about which of you owns what risks." 

 

In the field of security there is nothing but ambiguity about who owns what risk.

 

In the digital world, the defender's work factor is proportional to the sum of all the methods the attackers possess times the complexity of that which is to be defended. The attacker's work factor is the cost of creating new methods as fast as old ones must be retired, while the complexity ensures that the supply of new methods can never be exhausted.

 

The canon of digital security is now that of cost effectiveness, and the analysis of cost effectiveness runs on the fuel of measurement: measurement of inputs and outputs, of states and rates, of befores and afters.

 

Geer’s fervent hope is that "Security Metrics" sets off a competitive frenzy to measure things, to fulfill the Grand Challenge of quantitative information risk management. This may be brutal for some, but it is better than enduring complexity crafted by knaves to make a trap for fools.

 

Audience

 

Practitioners need to know how, what, and when to measure. Their bosses need to know what to expect.

 

Introduction: Escaping the Hamster Wheel of Pain

 

Thinking about security as a circular, zero-sum game cripples our ability to think clearly. The world ought to be a place where digital security risk becomes reducible to numbers. Digital security risk would then be a commodity that can be identified, rated, mitigated, traded and, above all, quantified and valued.

 

Hamster Wheel of Pain

 

  1. Use product, and discover you're hosed
  2. Panic
  3. Twitch uncomfortably in front of boss
  4. Fix the bare minimum (but in a vigorous, showy way).
  5. Hope problems go away

 

Stated differently: You're hosed; we fix it; 30 days later, you're hosed again; Patch, pray, repeat.

 

Breaking this cycle of managing the identified issues and then identifying more issues starts by asking questions.

 

  1. What is the value of the individual information assets residing on workstations, servers and mobile devices? What is the value in aggregate?
  2. How much value is circulating through the firm right now?
  3. How much value is entering or leaving? What are its velocity and direction?
  4. Where are my most valuable assets? Who supplies and demands the most information assets?
  5. Are my security controls enforcing or promoting the information behaviors we want?
  6. What is the value at risk today? What could we lose if the 1% chance scenario occurs?
  7. How much does each control in my security portfolio cost?
  8. How will my risk change if I reweight my security portfolio?
  9. How do my objective measures of assets, controls, flow and portfolio allocation compare with those of others in my peer group?

 

Metrics Supplant Risk Management

 

"Security is a process" - Bruce Schneier

 

How are processes measured? Through metrics and key indicators -- in other words, numbers about numbers.

 

We need to establish a set of key indicators that tell us how healthy our security operations are, on a standalone basis and with respect to our peers.

 

  1. What metrics does your product or service provide?
  2. Which would be considered key indicators?
  3. How do you use these indicators to demonstrate improvement to customers? And to each customer's investors, regulators and employees?
  4. Could you, or a management consultant, go to different companies and gather comparable statistics?
  5. Do you benchmark your customer base using these key indicators?

 

These key indicators should incorporate time and money measures, should be measured consistently, and should be comparable across companies to facilitate benchmarking.

 

Defining Security Metrics

 

Information security experts meet simple questions, readily answered in any other business context, with embarrassed silence. These questions are:

 

  1. Is my security better this year?
  2. What am I getting for my security dollars?
  3. How do I compare with my peers?

 

Instead of metrics-driven indicators, organizations' security decisions are driven by fear of the catastrophic consequences of an information attack, uncertainty about their vulnerability, and doubts about the sufficiency of current safeguards.

 

Security Measurement Business Drivers

 

Information asset fragility, Provable Security, Cost pressures, Accountability.

 

Both awareness of information security and the will to address it are spreading.

 

No good, consistent security metrics are available. Consequently, the amount a company can spend on "improving" security has no natural bounds beyond the company's ability to pay.

 

Cost-benefit analyses and ROI calculations are becoming standard prerequisites for any information security sale.

 

Legislation and regulations such as Basel II Capital Accords, Gramm-Leach-Bliley, HIPAA, and Sarbanes-Oxley are generating accountability for information security.

 

Modeling Security Metrics

 

Modelers think about risk equations, loss expectancy, economic incentives, and why things happen. Measurers think more about empirical data, correlation, data sharing and causality.

 

Correlations are sometimes interesting but if they don't lead to an understanding of the dynamics that give rise to the correlation, then we have not generated information from the data that allows decisions and responsible actions to be taken.

 

First, you have to measure the threat, not just the incidents and the controls people apply, if you want to know why some incidents happen and others don't. Second, you need a model, because correlations on their own are sterile.

 

We can steal much more from the quality control literature, particularly if we treat security flaws as special cases of quality flaws, assuming they are accidental, and not fruits of sabotage.

 

If you have not calibrated the model with measurement, only one thing is certain: You will either overspend or under-protect.

 

From the systems administration viewpoint, nothing is so sweet as having all platforms be alike. From the security administration viewpoint nothing is so frightening as having all platforms be alike.

 

What makes a good Metric?

 

In security, business leaders ask the following questions:

 

  1. How effective are my security processes?
  2. Am I better off than I was this time last year?
  3. How do I compare with my peers?
  4. Am I spending the right amount of money?
  5. What are my risk transfer options?

 

"Metric", as defined by the Oxford dictionary: a system or standard of measurement.

 

Metrics used for quantifying value (doing things right) and those used to measure performance (continually doing those things better) are attributable to IT.

 

Good metrics should be consistently measured, cheap to gather, expressed as a number or percentage with at least one unit of measure, and ideally contextually specific.

 

"Metrics" that depend on the subjective judgments of those ever-so-reliable creatures - humans - are not metrics at all. They are ratings.

 

Good metrics mean something to the persons looking at them. They shed light on an underperforming part of the infrastructure under their control, chronicle continuous improvement, or demonstrate the value their people and processes bring to the organization.

 

ALE is security's spherical cow. It encourages practitioners to think about dollar impact on an aggregate, averaged basis, in spite of the fact that losses do not gravitate to the middle; they cluster at the far edges.
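
To make the "spherical cow" point concrete, here is a minimal sketch in Python. It assumes the conventional definition of ALE (single loss expectancy times annualized rate of occurrence); the dollar figure and the frequency are hypothetical and chosen only for illustration.

```python
def annualized_loss_expectancy(single_loss, annual_rate):
    """ALE = single loss expectancy (SLE) x annualized rate of occurrence (ARO)."""
    return single_loss * annual_rate


# Hypothetical figures, chosen only for illustration.
single_loss = 2_000_000   # dollars lost if the breach happens
annual_rate = 1 / 20      # expected once every 20 years

ale = annualized_loss_expectancy(single_loss, annual_rate)
print(f"ALE: ${ale:,.0f} per year")

# In any single year the organization loses either nothing or the
# full amount; the averaged ALE figure itself is never observed.
possible_yearly_losses = {0, single_loss}
print(f"Possible actual losses in a year: {sorted(possible_yearly_losses)}")
```

The averaged number sits between two outcomes that actually occur, which is the sense in which losses cluster at the edges rather than in the middle.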

 

Metrics are an emerging field of study for information security professionals. Security Metrics attempt to put numbers around activities that safeguard information resources.

 

Organizations looking to start a security metrics program should think less about taxonomies (COBIT, ITIL, ISO17799, et al.) and more about key indicators that quantify particular security activities.

 

Diagnosing Problems and Measuring Technical Security

 

Remember the scientific method:

 

  1. Formulate a hypothesis about the phenomenon,
  2. Design tests to support or disprove the hypothesis,
  3. Rigorously conduct and measure the results of each test,
  4. Draw a conclusion based on the evidence.

 

In security, metrics help organizations:

 

  • Understand security risks,
  • Spot emerging problems,
  • Understand weaknesses in their security infrastructures,
  • Measure performance of countermeasure processes,
  • Recommend technology and process improvements.

 

Common security metrics for diagnosing problems and measuring technical security activities fall into four categories: perimeter defenses, coverage and control, availability/reliability, and applications.

 

The primary benefit of the diagnostic method is that hypotheses are proven or disproved based on empirical evidence rather than intuition. Because each hypothesis supports the other, the cumulative weight of cold, hard facts builds a supporting case that cannot be disputed.

 

Metrics need to provide insights that you don't already possess, arm you with information you can use to spend your organization's dollars more wisely, or help you diagnose problems better.

 

Antivirus and Antispam

 

The ratio of incidents that required human intervention to the total number of incidents gives a much more honest assessment of the effectiveness of the antivirus system.
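
As a rough sketch, the manual-intervention ratio could be computed like this in Python. The function name and the sample counts are hypothetical.

```python
def manual_intervention_rate(total_incidents, manual_incidents):
    """Share of malware incidents that needed a human to clean up."""
    if total_incidents == 0:
        return 0.0  # no incidents, so nothing required intervention
    return manual_incidents / total_incidents


# Hypothetical month: 200 detections, 30 of which needed manual cleanup.
rate = manual_intervention_rate(200, 30)
print(f"Manual intervention rate: {rate:.1%}")
```

A falling rate over time suggests the automated defenses are handling more of the load on their own.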

 

The number of outbound viruses or spyware samples caught by the perimeter mail gateway's content filtering system is an excellent indicator of how clean the internal network really is.

 

Firewall and Network Perimeter

 

Open access points can present a security risk for firms with many far-flung offices in urban environments -- especially when considered in combination with the number of remote offices connected directly to core transaction networks.

 

Attacks

 

The lowest level, security events, feeds into the SIEM from source systems. These events are processed by the SIEM and are not necessarily intended to be viewed by humans. If certain types of events correlate strongly, the system generates an alert and forwards it to a security dashboard, along with supporting data.

 

These statistics are certainly interesting in and of themselves, but they are also interesting in relation to each other. When the corpus of these events can be scoped down to a well-defined group of assets, such as public web servers, we can use these numbers to create a "funnel" that shows the ratio of Internet Web sessions to prospective attackers, suspected attackers, and actual (manually investigated) attackers.
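
The funnel can be sketched as each stage expressed as a fraction of total web sessions. This is a minimal illustration in Python; the stage names follow the text, while the function name and the counts are hypothetical.

```python
def attack_funnel(sessions, prospective, suspected, actual):
    """Express each funnel stage as a fraction of total web sessions."""
    stages = [
        ("sessions", sessions),
        ("prospective attackers", prospective),
        ("suspected attackers", suspected),
        ("actual attackers", actual),
    ]
    return {
        name: (count / sessions if sessions else 0.0)
        for name, count in stages
    }


# Hypothetical counts for a group of public web servers.
funnel = attack_funnel(sessions=1_000_000, prospective=10_000,
                       suspected=500, actual=5)
for stage, share in funnel.items():
    print(f"{stage:>22}: {share:.4%}")
```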

 

Coverage and Control

 

With turnkey dedicated servers supplied by a vendor, the security group cannot do anything to the system because it will invalidate support from the vendor. With these systems we allow them to connect to the network, but only with very strict security and very specific network filters.

 

Control means the degree to which a control is being applied in a manner consistent with the security organization's service standards, across the scope of covered resources.

 

Coverage metrics identify the implementation gaps: of the eligible workstations and servers, how many have antivirus/antimalware software? How many have updated signatures? These metrics help administrators understand the extent of their overall control regime.
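
A minimal sketch of these coverage questions in Python, assuming a simple host inventory is available. The `Host` record and its field names are hypothetical, not taken from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical inventory record; the field names are assumptions.
    name: str
    eligible: bool            # in scope for the antivirus standard
    has_antivirus: bool
    signatures_current: bool


def coverage_metrics(hosts):
    """Return antivirus coverage and signature currency over eligible hosts."""
    eligible = [h for h in hosts if h.eligible]
    if not eligible:
        return {"antivirus_coverage": 0.0, "signature_currency": 0.0}
    with_av = [h for h in eligible if h.has_antivirus]
    current = [h for h in with_av if h.signatures_current]
    return {
        "antivirus_coverage": len(with_av) / len(eligible),
        "signature_currency": len(current) / len(eligible),
    }


inventory = [
    Host("ws-01", True, True, True),
    Host("ws-02", True, True, False),
    Host("srv-01", True, True, True),
    Host("srv-02", True, False, False),
    Host("appliance-01", False, False, False),  # vendor-managed, not eligible
]
print(coverage_metrics(inventory))
```

The gap between the two numbers separates hosts that lack the control entirely from hosts where the control is installed but stale.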

 

The degree to which an organization keeps its infrastructure patched indicates the effectiveness of its overall security program. Consistent patching does not necessarily correlate with higher levels of security, but inconsistent patching correlates highly with insecurity.

 

Related to latency are overall cycle times. Patching can be thought of as having at least three distinct steps: identification of target systems that need particular patches, patch testing, and patch distribution and installation.
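
One way to sketch the cycle-time idea in Python is to record a date at the end of each step and report the days spent in each. The four timestamps and the function name are assumptions made for illustration; a real program would pull them from its patch management records.

```python
from datetime import date


def patch_cycle_days(released, identified, tested, installed):
    """Days spent in each patching step, plus the overall cycle time."""
    return {
        "identification": (identified - released).days,
        "testing": (tested - identified).days,
        "distribution_and_installation": (installed - tested).days,
        "total": (installed - released).days,
    }


# Hypothetical dates for a single patch.
cycle = patch_cycle_days(
    released=date(2009, 1, 1),
    identified=date(2009, 1, 3),
    tested=date(2009, 1, 8),
    installed=date(2009, 1, 15),
)
print(cycle)
```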

 

As it happens, in many companies patch testing for workstations is becoming less common, due to the ever-decreasing window between the release of a patch and its evil twin, the follow-on malicious exploit.  It is becoming too risky to wait.

 

Host Configuration

 

The most important point about host configuration metrics is that they measure whether workstations and servers are configured in a manner that allows the organization's security objectives to be achieved.

 

The CIS High Security benchmark is a very good standard that reflects the consensus view from the NSA, NIST and other experts about what a securely configured Windows system should look like. Another emerging standard is the Federal Desktop Core Configuration (FDCC).

 

Event logging controls and synchronization of system clocks are important for security incident handling. When problems occur, investigators need access to event information. The best way to make this available is to forward events to centralized logging servers such as syslog.

 

Every organization's security metrics program should give a clear-eyed view of the vulnerabilities present in its network and, in particular, its hosts.

 

Availability and Reliability

 

Unplanned downtime measures the amount of volatility in the operational environment. Measuring the amount of downtime related to security incidents can be quite revealing, especially when expressed in dollar costs to the organization due to loss of productivity or revenue.

 

The "support response time" measures how long it takes an organization to recognize a security outage and to initiate support activities. "Mean time to recover" identifies the amount of time required to restore operations to a fully operational state.
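
A minimal sketch of both measures in Python, assuming each outage is logged as three timestamps: when it started, when support was engaged, and when service was restored. Measuring recovery from the start of the outage is an assumption here; some teams measure from the moment support is engaged.

```python
from datetime import datetime
from statistics import mean


def outage_metrics(outages):
    """outages: non-empty list of (started, support_engaged, restored)."""
    response_hours = [
        (engaged - started).total_seconds() / 3600
        for started, engaged, _restored in outages
    ]
    recovery_hours = [
        (restored - started).total_seconds() / 3600
        for started, _engaged, restored in outages
    ]
    return {
        "mean_support_response_hours": mean(response_hours),
        "mean_time_to_recover_hours": mean(recovery_hours),
    }


# Hypothetical outage log.
log = [
    (datetime(2009, 1, 5, 9, 0), datetime(2009, 1, 5, 9, 30),
     datetime(2009, 1, 5, 11, 0)),
    (datetime(2009, 1, 9, 14, 0), datetime(2009, 1, 9, 15, 30),
     datetime(2009, 1, 9, 18, 0)),
]
print(outage_metrics(log))
```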

 

Businesses that actively walk through their BCP/DRP processes are probably better equipped to deal with a security incident.

 

Change Control

 

A critical component of any security program is the process used to manage changes to the configuration of the environment. It is hard to see how you could have a security program without change control.

 

Three key metrics can help assess the degree of change control an organization possesses:

 

  • The number of production changes,
  • The number of exceptions,
  • The number of unauthorized changes (violations to the process).

 

Application Security

 

Software written without sufficient attention to security carries much more risk than software that adheres to generally accepted principles for coding secure software -- as much as five times more risk.

 

The objective of testing is to find vulnerabilities that can be exploited to compromise the application's integrity, confidentiality or availability.

 

Qualitative assessments (such as design reviews, architecture assessments, code reviews, and penetration tests) earlier in the application life cycle uncover issues before they become bona fide vulnerabilities in production.

 

Using KSLOC (thousands of source lines of code) as the denominator for other metrics, such as the number of defects or vulnerabilities, provides a view of "vulnerability density" that can be consistently measured against an acceptable threshold.
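
Vulnerability density can be sketched in a few lines of Python. The function name and the sample figures are hypothetical.

```python
def vulnerability_density(vulnerabilities, source_lines):
    """Vulnerabilities per thousand source lines of code (KSLOC)."""
    if source_lines <= 0:
        raise ValueError("source_lines must be positive")
    ksloc = source_lines / 1000
    return vulnerabilities / ksloc


# Hypothetical application: 12 findings in 48,000 lines of code.
density = vulnerability_density(12, 48_000)
print(f"{density:.2f} vulnerabilities per KSLOC")
```

Normalizing by code size lets a small application and a large one be compared on the same scale.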

 

Complex systems fail complexly.  Thus, if complexity contributes to insecurity, we ought to devise methods for measuring code complexity as a leading indicator of future security problems.

 

An open source toolkit that works well for Java code is the Project Mess Detector, a.k.a. PMD.
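
PMD targets Java, but the underlying idea of counting decision points per function can be sketched with Python's standard `ast` module. This is only a rough approximation of cyclomatic complexity, not what PMD computes: the set of counted node types is an assumption, and nested functions are also counted toward their enclosing function.

```python
import ast

# Node types treated as decision points (an illustrative choice).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approximate_complexity(source):
    """Map each function name in `source` to a rough complexity score."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            score = 1  # one path through a function with no branches
            for child in ast.walk(node):
                if isinstance(child, DECISION_NODES):
                    score += 1
                elif isinstance(child, ast.BoolOp):
                    # "a and b and c" adds two extra decision points
                    score += len(child.values) - 1
            scores[node.name] = score
    return scores


sample = """
def check(user, password):
    if not user or not password:
        return False
    for attempt in range(3):
        if attempt > 1:
            return False
    return True
"""
print(approximate_complexity(sample))
```

Tracking such a score per module over time gives a leading indicator to watch before problems surface in production.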

 

Diagnostic security metrics borrow from management consulting techniques by asking two questions: what hypothesis can be formed about the efficiency or effectiveness of security controls, and what evidence can be marshaled to support or disprove the hypothesis?

 

Measuring Program Effectiveness

 

"Trust, but verify" - Reagan

 

Trust is good. Control is better.

 

Companies can, and should, develop working environments in which managers and employees communicate freely and trust each other. But trust is not the only ingredient; for an organization to be effective, trust must be backed by systems of accountability.

 

Security is about control. The prevailing standards and formal literature on security all speak of "security controls". Regardless of the source, the notion of control ranks high as one of the key objectives of information security.

 

Security controls are designed to ensure that an organization meets its confidentiality, integrity and availability objectives.

 

Metrics promote accountability by quantifying the effectiveness of security processes.

 

It is more important to define critical assets than to assess overall asset value. Employ the principle of Occam's Razor to cleave assets into two groups: critical assets and everything else. Critical assets are those for which an organization possesses zero tolerance for risk.

 

Planning and Organization

 

The primary control objectives for planning and organization include defining the IT strategic plan and desired architecture; communicating goals and direction; assessing risks; and managing investments in time, money and capital.

 

From the security point of view, the most important things to measure are how effectively the organization assesses risks, manages security issues associated with personnel, and manages costs.

 

Assessing Risk

 

It is important to capture high-level metrics that quantify what an organization knows about the nature of risk inherent in its infrastructure, people, and information.

 

  1. Do critical assets reside on systems that are compliant with the organization's security standards?
  2. Has the organization reviewed critical assets and functions for physical and information security risks?
  3. Is the organization prepared to handle compromises of critical assets?

 

An organization's metrics should avoid complicated risk assessment formulas in favor of hard and fast facts that can be independently verified.

 

Human Resources

 

  1. Are security responsibilities included in job descriptions and assessed during performance reviews?
  2. Does the organization face potential downstream issues due to employees without background checks?
  3. How well dispersed are security responsibilities throughout the organization?

 

A centralized security group should exist whose job it is to safeguard all information assets and keep everyone safe.

 

In geographically diverse organizations, local security coordinators liaise with regional and central security staff to guide security investments, handle incidents, and coordinate communications within their business units.

 

Managing Investments

 

Security costs are no different from any other kind of IT costs. Generally they equate to activities that:

 

  1. Operate the organization's security systems,
  2. Implement new security controls to meet new requirements,
  3. Get out in front of new business requirements.

 

Furthermore, for the purposes of budgetary metrics, security organizations should track the level of spending for their operational, new implementation and discretionary security initiatives.

 

Acquisition and Implementation

 

Planning and organization controls describe "what" an organization must do.  The acquisition and implementation controls begin to answer the question "how".

 

From the perspective of security, the goal of acquisition and implementation is to ensure that appropriate security controls are incorporated into information systems as early as possible.

 

  1. How frequently are security teams engaged when business units draw up requirements for new information systems?
  2. How much attention do business units pay to customer requirements when planning a new information system?
  3. How often are business units requiring controls that ensure that customer information managed by new systems cannot be tampered with by unauthorized parties?
  4. How often are business units requiring controls that ensure the confidentiality of customer information managed by new systems?

 

Two terms recurring in the metrics merit further explanation. The word "coverage" refers to the degree to which organizational requirements are met for a collection of systems. The word "consultation" refers to formal and informal collaboration between business and security units to increase awareness and acceptance of security, especially at the beginning of the system's implementation life cycle.

 

Installing and Accrediting Solutions

 

Accrediting and certifying information systems is the process of ensuring that system owners understand and accept the risks associated with their systems, and that installed systems demonstrably comply with prevailing security standards.

 

  1. Do information systems adhere to organizational security standards?
  2. Have the owners of information systems agreed to, and accepted, security risks?
  3. Has the organization included security in the costs of new systems?
  4. Do information owners sign off on the security of their information systems, or not?

 

Knowing the costs implies hardware, software, expenses, and labor allocations. Budget allocation means just that: people who own the budgets need to explicitly set aside dollars to support the security functions of the systems they own or use.  Thus, determining the percentage of systems with built-in security costs enforces a sort of budgetary and planning rigor that most organizations benefit from.

 

High, medium and low classification systems are too subjective. If classification must be done, a more factual determinant is whether or not a system is customer-facing or internet-facing. Compromise of systems visible to customers or the internet costs more in reputation and revenue losses than compromise of internal supporting systems.

 

Developing and Maintaining Procedures

 

Tangible forms that instructions take for keeping systems up and running go by many names such as "procedures", "work instructions", and "operational controls". Regardless of the name they should include the following:

 

  1. Procedure for stopping and starting the system
  2. Day to day operational responsibilities of the system
  3. Availability policy for the system and service level agreements
  4. Monitoring and oversight responsibilities
  5. Problem management processes
  6. BCP/DRP instructions
  7. Technical Architecture
  8. Security responsibilities for users and operators
  9. Data security policy
  10. System ownership defined

 

For the purposes of security, the procedures for monitoring, data security, BCP/DRP and security responsibilities matter most.

 

Each system that management deems critical needs appropriate operational procedures.

 

Track which systems have suitable operational policies and procedures defined.

 

Delivery and Support

 

No antivirus software, intrusion prevention system, or vulnerability scanner can possibly tell an organization whether its people possess sufficient security awareness training.

 

Delivery and support metrics focus on user training and awareness, system security and data management.

 

Educating and Training Users

 

End-user education is a cornerstone of modern thinking about information security.

 

These education programs impart the organization's policies and procedures for such generally applicable security concepts as antivirus protection, "acceptable use" of company resources, software licensing, password policy, social engineering, and recommended steps for reporting security incidents.

 

  1. Are employees acknowledging their security responsibilities as users of information systems?
  2. Are employees receiving training at intervals consistent with company policies?
  3. Do security staff members possess sufficient skills and professional certifications?
  4. Are security staff members acquiring new skills at rates consistent with management objectives?
  5. Are security awareness and training efforts leading to measurable results?

 

NIST's guidance on metrics warns that one of the ironic side effects of increased awareness can be an increase in reported incidents.

 

Ensuring System Security

 

When the words "people" and "security" are uttered in the same breath, many professionals automatically think about things like "acceptable use" policies, training, passwords, and responsibilities. If one substitutes the words "accountable parties" for "people", the discussion shifts more to the operational aspects of security management, specifically to concepts like authorization. Put succinctly, we need to know if the right people have the right access to the right things at the right time, and that the wrong people do not.

 

Permissions when granted to a particular person or organizational role constitute an entitlement. Today's information security battleground is all about entitlements - who's got them, whether they were granted properly, and how to enforce them.

 

  1. How consistently does the organization implement principles of accountability?
  2. How pervasive is the principle of role based access to systems?
  3. Does the organization segregate responsibility for production systems to prevent change control problems?
  4. What users possess a significant amount of privileged access to information systems?
  5. Does the organization review employee entitlements?
  6. Is the business at risk from terminated employees, especially those with privileged access?

 

Systems that define entitlements exclusively through roles tend to have better security. Role-based access control (RBAC) can be implemented as follows:

 

  • Provision individual user accounts,
  • Define (name) and provision roles for the system or application,
  • Define application entitlements by assigning the appropriate permissions/privileges to each role,
  • Map users to roles using application provided tools or a central identity management system.

 

User permissions are therefore defined as the union of the permissions they possess explicitly and those they inherit by virtue of the roles they possess.
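
The union described above can be sketched directly with Python sets. The user, role, and permission names are hypothetical, and the three mappings stand in for whatever an application or identity management system actually stores.

```python
def effective_permissions(user, explicit_grants, user_roles, role_permissions):
    """Union of a user's explicit permissions and those inherited via roles."""
    permissions = set(explicit_grants.get(user, ()))
    for role in user_roles.get(user, ()):
        permissions |= set(role_permissions.get(role, ()))
    return permissions


# Hypothetical entitlement data.
role_permissions = {
    "teller": {"view_account", "post_deposit"},
    "auditor": {"view_account", "view_audit_log"},
}
user_roles = {"alice": ["teller", "auditor"], "bob": ["teller"]}
explicit_grants = {"bob": {"reset_password"}}

print(sorted(effective_permissions("alice", explicit_grants,
                                   user_roles, role_permissions)))
print(sorted(effective_permissions("bob", explicit_grants,
                                   user_roles, role_permissions)))
```

In a system that defines entitlements exclusively through roles, the explicit-grants mapping would be empty, which makes reviews much simpler.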

 

One revealing metric is the percentage of users who have privileged access to more than one critical system. Another is the percentage of "high risk" users, defined as those who have access to multiple tiers on the same system, such as database administrator privileges and root access (system administrator).
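
The first of these metrics could be computed roughly as follows, assuming a mapping from each user to the set of critical systems on which that user holds privileged access. The mapping, the names, and the function are hypothetical.

```python
def multi_system_privileged_share(privileged_access, all_users):
    """Fraction of users with privileged access to more than one critical system."""
    if not all_users:
        return 0.0
    flagged = [
        user for user in all_users
        if len(privileged_access.get(user, ())) > 1
    ]
    return len(flagged) / len(all_users)


# Hypothetical data: only dana is privileged on two critical systems.
users = ["alice", "bob", "carol", "dana"]
privileged_access = {
    "bob": {"payments-db"},
    "dana": {"payments-db", "trading-gateway"},
}
share = multi_system_privileged_share(privileged_access, users)
print(f"{share:.0%} of users are privileged on more than one critical system")
```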

 

Measuring the percentage of employees who have not taken vacation in a long time might reveal "indispensable" employees at high risk of burnout.

 

Identifying and Allocating Costs

 

To understand the operational side of security we need metrics that capture essential data about security costs incurred during day to day operations.

 

Security costs include operational hardware, software, people, consulting, and outside audit fees. 

 

Security incidents are another category of costs that contain hard and soft dollar estimates of the costs the organization incurs when investigating, reacting to, and reporting on break-ins, data disclosures, and other security problems.

 

Hard costs include legal fees, regulatory fines, crisis communication representation, external forensics fees, and extraordinary charges for equipment and software related to the incident.

 

Soft costs include labor for hours spent by internal counsel, IT personnel, and senior management.

 

  1. What do the security systems managed by the organization cost relative to top-line business activities?
  2. How much of the operational security budget ties back to business activities?
  3. How significant was the cost impact of security incidents?

 

Certain security costs are easy to allocate: Outsourced security monitoring services for a DMZ, SSO Systems, Database and web application monitoring tools/firewalls, Managed intrusion detection, and External audit fees and consulting.

 

If costs have clear and obvious per-employee expenses, pro rata allocation to each business unit based on head count represents a fair and transparent method.
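
Pro rata allocation by head count is simple enough to sketch in Python. The business unit names and figures are hypothetical.

```python
def allocate_by_headcount(total_cost, headcount_by_unit):
    """Split a shared security cost across units in proportion to head count."""
    total_heads = sum(headcount_by_unit.values())
    if total_heads == 0:
        raise ValueError("at least one unit must have staff")
    return {
        unit: total_cost * heads / total_heads
        for unit, heads in headcount_by_unit.items()
    }


# Hypothetical: a $120,000 per-employee security cost shared by three units.
allocation = allocate_by_headcount(
    120_000, {"sales": 50, "engineering": 100, "finance": 50}
)
print(allocation)
```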

 

Managing Data

 

Understanding data's direction, magnitude, and sensitivity is an important discipline for information security programs.

 

Security metrics for data should measure the amount of data flowing into and out of the organization.

 

  1. How much information is flowing into the organization on a daily, monthly, and yearly basis?
  2. How much sensitive customer information does the company manage?
  3. How safe are storage media entrusted to third parties?
  4. Are media assets decommissioned properly to remove traces of sensitive information?
  5. How quickly can the organization respond to complaints from employees and customers about potential data privacy or integrity problems?

 

Managing Third-Party Services

 

  • Understanding requirements for access: scope of access, resources needed, expected duration, and security impact.
  • Approving access: provisioning user accounts and documenting access grants.
  • Periodically reviewing access rights.
  • Removing access: deprovisioning.

 

  1. How quickly are requests for access by third parties vetted and approved?
  2. Once approval is granted, how quickly can the organization grant access to third party applicants?
  3. Does the organization understand what security controls are needed for third party access?
  4. To what extent do third parties implement the correct security controls appropriate to their level of access?
  5. How frequently are the access rights of third parties reviewed?
  6. Are third-party applications authorized?

 

Monitoring the Process

 

Monitoring processes determine how well the organization's controls work by "instrumenting" oversight functions such as external audit and security assurance.

 

Effective security requires that organizations know the sources and consequences of security events as they happen.

 

A critical component of any effective security program is the process used to monitor information systems for deviations against standards.

 

Without change controls, security monitoring processes become less effective.

  1. To what extent does the organization monitor the security of its information systems?
  2. Is the organization monitoring systems that contain customer data?
  3. To what extent are systems monitored for changes to their configuration?

 

Assurance activities often include nontechnical activities like review of process documentation, system walkthroughs, data-center tours, and interviews with system owners, business units, and IT operations.  Technical assurance activities - control testing - commonly include penetration tests and system vulnerability scans.

 

  1. On systems for which independent assurance is needed, are security controls working as designed?
  2. Are electronic communications with third parties secured from tampering, interception, and attacks?
  3. How effective is the process for independently testing security controls?
  4. How much money is the organization spending on assurance activities?

 

Ensuring Regulatory Compliance

 

In contrast to assurance activities, audits examine how well an organization implements those controls in compliance with the statute or contract.

 

Accountants speak a different language than IT security people. Instead of "policies and processes" they speak of "key controls and general controls", and instead of "vulnerabilities and violations" they speak of "material weaknesses and deficiencies".  These terms carry special meaning for auditors and therefore for boards of directors as well.

 

  1. How much time and effort are security staff spending on audit-related activities?
  2. Have audits uncovered serious weaknesses in existing controls?
  3. How much time and effort are security staff spending fixing problems uncovered by audits?
  4. Have audit activities uncovered problems with controls that would affect customer trust or privacy?

-- Aurora Report says: and now you have a bit more ammo for measuring your security efforts. The full book is well worth the read. This review leaves out the final chapters, which discuss how to put all these metrics together into a Security Dashboard replete with analytic methodology, charts and graphs.
