Identity in Assembly and Integration

2005-11-30; Updated 2008-09-27

Securely integrating a shared service across highly distributed software systems presents a significant challenge at every phase of the software development life cycle. Moreover, there is a crucial need within the project team(s) for common abstractions and a common understanding of all the relevant aspects of a shared service. This document discusses the issues and necessary abstractions related to integrating identity services, which are particularly critical as the basis for granting or denying access to system resources and data.

Acknowledgements. Contributions and reviews by Dan Blum, Stefan Brands, Pamela Curtis, Howard Lipson, Gary McGraw, and Tony Nadalin are gratefully acknowledged. Errors and omissions are my own.

Introduction

In their report [APWG 05] for the month of June 2005, the Anti-Phishing Working Group identified and studied 15,050 reports of phishing covering 74 different corporate brands. The average time online for the sites used in the phishing attacks was 5.9 days. The longest time online for these sites was 30 days. The report found 154 unique password-stealing applications and 526 URLs containing malicious code for password stealing.

Phishing is only one example of an attack targeting identity-related data. Such attacks are made possible by weaknesses in identity technologies, by usability factors, and by vulnerabilities in identity deployments. Individual users are often unable to discern when it is appropriate and safe to disclose personal information. From a risk management viewpoint, identity data is poorly protected in many systems, yet it remains among the most valuable data sets to both attacker and victim. When software projects move into the assembly and integration phase, the systems must be prepared to meet a host of security challenges that await in the target production deployment environment.

Overview

In assembly and integration, the logical design assumptions for a system meet the physical, business, technical, organizational, and individual user realities of the target system environment, including identity systems such as user repositories, directory services, and provisioning systems. This document describes common issues, approaches, and integration considerations regarding identity integration in software, with the goal of beginning to build a shared understanding of the problems and solutions in this space so that identity may develop a consistently strong representation and usage within and across domains.

Software development teams lack agreed-upon standard representation and consumption patterns for authentication, attribute query or update, and authorization of identity information across technological and organizational domains. The current state of identity consists of numerous identity silos that are directly bound to domain-specific technologies, policies, and organizational domains, each with its own interpretation of how to issue, encapsulate, and negotiate identity data and services. This lack of consistency creates issues for distributed systems that are required to traverse identity silos and domains, and its overall effect is numerous one-off solutions for identity, each of which contains its own arcane, tightly coupled, and technology-specific ways of dealing with identity. There is a well-understood best practice in software development that developers should not attempt to write their own cryptographic algorithms because of the complexity, the lack of peer review, and the value of what the cryptographic functions are protecting. In contrast, developers routinely write one-off identity solutions that are never peer reviewed by a wider audience. This identity information is propagated and integrated throughout software systems and used as a basis for making security decisions about access control to critical resources and the confidentiality of personal and business data.

These one-off solutions frequently are further integrated with other identity silos, creating a mishmash of identity solutions with varying limitations and, in the worst case, generating a lowest common denominator effect. The situation is exacerbated because many identity solutions are already in place as legacy systems while software is being developed, so projects inherit the standard issues found in legacy integration with identity, such as brittleness and lack of support for robust protocols and current standards. Why is this an especially serious problem for identity management? Matt Bishop states that “Every access control mechanism is based on an identity of some sort.” Bishop goes on to state that all decisions of access and resource allocation assume that the binding of an identity to the principal is correct [Bishop 02]. Hence, identity is a foundation-level element for security and accountability decisions, and breakage at this level in design has profound implications for the system’s security as a whole. Transactions may employ multiple identity contexts throughout their life cycles, broadening the scope of identity’s usage for access.

Added to the above system-level problems are the user population’s lack of awareness, ability, and tools for managing and propagating their own digital identity information and their lack of ability and technical tools to use in determining the veracity of requests for their personal information. The net result of this is the emergence of phishing attacks and other attacks targeting these vulnerabilities at the directory, user desktop, and client level.

To protect identity on the client and server and throughout the system, software development teams require an overarching understanding of identity’s architectural elements and approaches to integrating identity into software systems. Such an understanding will enable them to navigate the chasm that exists between the assumptions made about identities and the actual state of the systems with which they are attempting to integrate. Knowledge of the identity elements, behaviors, and constraints, combined with knowledge of the software user’s abilities with the desktop and clients, will give software development teams the tools they need to build more robust software based on secure usage of identity. Ideally, the architecture should abstract the low-level concerns relating to identity from the developer.

This document examines architectural concerns relating to identity abstraction, security, and privacy in distributed systems so that software development teams may compose a framework that harmonizes these concerns in a way commensurate with their risk management decisions.

Identity Architectural Concerns

Identity information typically provides the basis for a number of functional requirements (e.g., being used as a key value to direct access to appropriate resources and data) and non-functional requirements such as security and privacy. Each scenario that interacts with identity data and services may generate and/or rely on identity information for different purposes. Identity usage paradigms in representing and using identity in the system can create conflicting goals. One example is the inherent tension between privacy and security. Many privacy goals revolve around data subjects controlling who can access information about themselves; security goals, especially protection and detection mechanisms, are concerned with not letting information fall into the wrong hands and gathering as much data as possible about the system’s users. The balance between the two is determined by how the identity is protected, by what mechanisms, and who owns and operates the mechanisms.

Identity comprises a number of architectural concerns. Each concern’s elements and constraints can result in architectural tradeoffs against other identity architecture concerns. Project teams map out the constraints and rationale for the identity elements in their system, conducting architectural tradeoff analysis regarding how identity is generated, represented, consumed, and transformed. Examples of identity architectural concerns include the following:

Access control: Identity is a foundation-level component for many access control mechanisms. Identity information about a digital subject is bound to a principal. Access control mechanisms consume identity data from the principal to make and enforce access control decisions. Weaknesses in identity systems affect the overall viability of access control, security, and privacy mechanisms.
Regulatory and legal: Increasingly, legislation and regulation recognize the value of identity data. Countries and industries have specific requirements that must be addressed to ensure that identity is protected. Applications that have an international user base face additional regulatory and legal concerns that may span legal boundaries.
Privacy: Privacy concerns relate to identity information that is linked at some level to an individual. They center on what personal data is disclosed and may manifest themselves in the system design through privacy legislation, liability, and/or psychological acceptability and success of the solution. Systems may implement privacy mechanisms using pseudonyms or anonymous mechanisms. There is an inherent tension between security and privacy that plays out most directly in the identity space. The tension revolves around the extent to which the user and the relying party have control and visibility of personal data. To be effective, the identity architecture must resolve these concerns in a manner that is congruent with each party’s requirements.
Personalization: Information relating to digital subjects is used by a wide array of applications, such as Internet portals, business websites, loyalty programs, customer relationship management services, personalization engines, and content management servers, to enhance the customer experience and provide convenience and targeted services on behalf of businesses and consumers. Personal data stored by organizations may also be shared and correlated for a variety of reasons, including data mining and target marketing; these uses of personal data may directly conflict with goals for pseudonymous protection of data subject information.
Domain attributes: Information relating to digital subjects may be used in a system for attribution purposes, including provisioning, credentialing, and data and function views based on keys. The domain-specific attributes that are related to the digital subject may be mapped to the identity and used in the domain without the knowledge of the end user, depending on the policies in the domain. The actual brokering of access control functions may be delegated to or impersonated by other system servers and services, as is the case in some distributed system architectures where individual user accounts are impersonated by application servers.
Provisioning: These are services responsible for managing identity information and assigning rights and privileges in one or more domains. Provisioning services are typically role and/or rule based and may have sophisticated workflow, logging, and audit capabilities.
Audit and reporting: These systems can be used to record, track, and trace identity information throughout systems. Audit logs and usage reports may be used for regulatory, compliance, and security purposes, and depending on implementation they may create privacy issues for individuals. Anonymizing and pseudonymizing sanitizers may be used to allow for system reporting and monitoring without disclosing identity information.
Identity mapping services: In distributed systems, identities are communicated and transformed in a variety of ways. Identity mapping services are used to manage these relationships when identity information, such as attributes, is mapped onto other principals.

Depending on how each of the above architectural concerns is implemented and integrated, there is a potential cascading effect to the detriment of the other concerns. A holistic architectural view of how identity is generated, represented, consumed, transformed, communicated, and stored in the system is required to gauge the system impacts of the design decisions.
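The pseudonymizing sanitizers mentioned under audit and reporting can be illustrated with a minimal sketch. It assumes a keyed hash (HMAC-SHA-256) as the pseudonymization technique, which is one possible choice and is not prescribed by this document; the key, the field names, and the record layout are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret held by the audit service. In practice it would come
# from a key store, not from source code.
PSEUDONYM_KEY = b"example-key-for-illustration-only"


def pseudonymize(user_id: str, key: bytes = PSEUDONYM_KEY) -> str:
    """Map a user identifier to a stable pseudonym using a keyed hash."""
    digest = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "subj-" + digest[:16]


def sanitize_record(record: dict, identity_fields=("user_id", "email")) -> dict:
    """Return a copy of an audit record with identity fields pseudonymized."""
    clean = dict(record)
    for field in identity_fields:
        if field in clean:
            clean[field] = pseudonymize(str(clean[field]))
    return clean


event = {
    "user_id": "alice",
    "email": "alice@example.com",
    "action": "read",
    "resource": "/payroll",
}
sanitized = sanitize_record(event)
print(sanitized["action"], sanitized["user_id"])
```

Because the same identifier always maps to the same pseudonym, reports can still correlate events per subject, while anyone without the key cannot readily recover the identity. The same property means that pseudonymized logs remain linkable, so they still warrant protection.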

Identity Integration Considerations in Distributed Systems Architecture

When software is assembled and integrated into the system for production release, the overall strength of the security mechanisms is affected by

how the software uses identity
the assumptions around the generation and binding of identity information
the communication and interoperability architectural paradigms of identity

Security properties, especially authentication, often do not compose. Nevertheless, information systems are often built on composition.

The tradeoffs involved in proxying identity also need to be analyzed across a number of identity integration patterns, including impersonation, delegation, proxy, tunneling, and other scenarios.

Role-Based Access Control

Role-based access control is a type of mandatory access control that allows a subject’s information to be mapped onto a role at runtime so that access control decisions can be made against the role’s privileges. Subjects are assigned a set of roles, and the system evaluates the roles to determine authorized access to objects. Role-based access control can enhance administrative flexibility in the system for dealing with user accounts and at the same time allow for granular authorization controls on objects and transactions based on roles rather than individual accounts. In many role-based access control systems, the subject’s attributes are propagated along with the role credentials, so risks to identity information, such as information leakage, remain.
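The role evaluation described above can be sketched in a few lines. This is a simplified illustration, not a complete role-based access control model; the role names, subjects, and permissions are hypothetical.

```python
# Hypothetical role and permission names, for illustration only.
ROLE_PERMISSIONS = {
    "claims_adjuster": {("claim", "read"), ("claim", "update")},
    "auditor": {("claim", "read"), ("audit_log", "read")},
}

# Subjects are assigned roles rather than individual permissions.
SUBJECT_ROLES = {
    "alice": {"claims_adjuster"},
    "bob": {"auditor"},
}


def is_authorized(subject: str, obj: str, action: str) -> bool:
    """Grant access if any role assigned to the subject carries the permission."""
    roles = SUBJECT_ROLES.get(subject, set())
    return any((obj, action) in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(is_authorized("alice", "claim", "update"))  # True
print(is_authorized("bob", "claim", "update"))    # False
```

The decision function needs only the subject's roles. Passing additional subject attributes along with the role credentials is what creates the leakage risk noted above.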

Federation

In a federated identity model, one or more relying parties and service providers enter into league together, where assertions made by one party are recognizable and verifiable by the other parties. Assertions can be made about any number of attributes pertaining to the identity, including authentication, authorization, and domain-specific attributes. Federation protocols and standards, such as SAML, Liberty ID-FF, and WS-Federation, allow for identity information to be transferred across domain contexts. There are many use cases that employ federation; for example, in a federated single sign-on scenario, an employee at a company could log on to the company’s intranet site, and when the employee clicked to browse his or her health benefits on a site hosted by a third-party healthcare benefits provider, the employee’s credentials would be federated to the site, and the employee would be logged on to the healthcare plan system automatically with the requisite privileges and data. As systems become increasingly integrated, federation is emerging as a desirable property for identity data.

Figure 4. Federated identity across security domains

The convenient portability properties that federation provides can conflict with privacy goals. In some cases the federation protocols set up a panoptic situation that reduces privacy in a multiparty federation [Brands 05]. Depending on implementation, personal information can be correlated and traced across domains through logs and other mechanisms. Federation, as its name implies, relies on mutually agreed upon static groups and is not suited to ad hoc groups that want to rely on the same identity data. Federation servers create failure points as targets of denial of service and disruption of availability attacks. As with all identity services, federation design decisions need to deal with the balance of security, privacy, and usability.
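The core exchange in federation, where one party makes an assertion that another party can verify, can be sketched as follows. This is a deliberately simplified stand-in: it uses a JSON payload with an HMAC over a shared key, whereas SAML and similar standards define their own formats and typically rely on public-key signatures. The function names, key, and claim layout are assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical key agreed between the two federation partners.
FEDERATION_KEY = b"shared-secret-for-illustration-only"


def issue_assertion(subject, attributes, lifetime_s=300, key=FEDERATION_KEY, now=None):
    """Asserting party: sign a statement about an authenticated subject."""
    issued = time.time() if now is None else now
    payload = json.dumps(
        {"sub": subject, "attrs": attributes, "exp": issued + lifetime_s},
        sort_keys=True,
    ).encode("utf-8")
    signature = hmac.new(key, payload, hashlib.sha256).digest()
    return base64.b64encode(payload).decode() + "." + base64.b64encode(signature).decode()


def verify_assertion(token, key=FEDERATION_KEY, now=None):
    """Relying party: return the claims if signature and expiry check out, else None."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.b64decode(payload_b64)
        signature = base64.b64decode(signature_b64)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # not issued by a trusted partner, or altered in transit
    claims = json.loads(payload)
    current = time.time() if now is None else now
    if current > claims["exp"]:
        return None  # assertion has expired
    return claims


token = issue_assertion("employee-1234", {"plan": "family"})
claims = verify_assertion(token)
print(claims["sub"], claims["attrs"])
```

Every attribute placed in the assertion is disclosed to the relying party, which is where the privacy tradeoffs discussed above become concrete: the asserting party decides what the relying party learns.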

Identity Abstraction Layer

Abstraction layers are frequently used to provide reusability, vendor and technology independence, and a layer of indirection for databases, operating systems, and other computing resources. This technique is less commonly used for identity services and stores; however, there is no technical reason for this to be the case, and emerging technologies make it simpler for identity services and stores to enjoy the same abstraction benefits as databases and other resources. Those technologies include open, interoperable protocols such as SOAP; federated identity standards such as Liberty ID-FF, SAML, and WS-Federation; and security token services (STS), such as those proposed in the WS-Trust specification, which eliminate the need for pairwise identity relationships where identity and its consumers are tightly coupled. The combined impact of open protocols and standards, federated identity, and STS creates a situation where identity information may be abstracted and loosely coupled to domains and services. The net result is that an application’s queries for identity information can be bound to an identity abstraction layer using open protocols and standards-based security tokens. The back end user repository and policy stores can then be switched, merged, and replicated, similar to how an application server hides the details of the database server behind it so that switching vendors or technologies (or versions) does not require a total rewrite of presentation and business logic layer code. Identity usage patterns, such as authentication, personalization, and attribution, can use the services in an identity abstraction layer rather than being bound in a stovepipe fashion to a single identity store.
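The indirection described above can be sketched as an interface that application code depends on, with interchangeable back ends behind it. The class and method names here are hypothetical, and the in-memory back ends merely stand in for real directories or legacy repositories.

```python
from abc import ABC, abstractmethod
from typing import Optional


class IdentityStore(ABC):
    """Interface the application codes against; back ends are interchangeable."""

    @abstractmethod
    def get_attributes(self, subject_id: str) -> Optional[dict]:
        """Return the subject's attributes, or None if the subject is unknown."""


class InMemoryDirectory(IdentityStore):
    """Stand-in for a directory service."""

    def __init__(self, entries: dict):
        self._entries = entries

    def get_attributes(self, subject_id):
        entry = self._entries.get(subject_id)
        return dict(entry) if entry is not None else None


class LegacyHrTable(IdentityStore):
    """Stand-in for a legacy repository with its own record layout."""

    def __init__(self, rows: list):
        self._rows = rows  # (login, full_name, department) tuples

    def get_attributes(self, subject_id):
        for login, full_name, department in self._rows:
            if login == subject_id:
                return {"name": full_name, "department": department}
        return None


def greeting(store: IdentityStore, subject_id: str) -> str:
    """Application code: depends only on the IdentityStore interface."""
    attrs = store.get_attributes(subject_id)
    return f"Welcome, {attrs['name']}" if attrs else "Unknown user"


directory = InMemoryDirectory({"alice": {"name": "Alice Smith", "department": "Claims"}})
legacy = LegacyHrTable([("alice", "Alice Smith", "Claims")])
print(greeting(directory, "alice"))
print(greeting(legacy, "alice"))
```

The application function is unchanged when the back end is swapped, which is the property that lets repositories be switched, merged, or replicated without rewriting presentation and business logic code.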

Key Functions

The key functions of an identity abstraction layer include

Identity runtime services: The main job of the runtime services is to virtualize the authoritative source of the identity information so the developer does not have to know implementation details about where and how the identity information is stored. The typical services provided by an identity abstraction layer can include query services for applications to query the abstraction layer for validating and exchanging identity information. An identity abstraction layer can provide identity communication services to encapsulate, negotiate, transform, and propagate identity information.
Reporting: Privacy, security, legal, and regulatory architectural concerns all have unique requirements and constraints around reporting, logging, and auditability. An identity abstraction layer allows for a consistent way to provide these services and partitions access to these functions and reports.

Providing identity information to applications through the identity abstraction layer may require behind the scenes provisioning of attributes, views, or repositories. Such provisioning should be designed by the identity infrastructure architect and the application architects, while remaining as transparent as possible to the developers.

In these cases, identity is provisioned to applications’ user repositories and other locations. Provisioning services add, update, and delete identity and information relating to identities such as authentication, authorization, keys, and domain-specific attributes. In systems that do not have integrated identity at runtime through federation or other means, provisioning may be used to keep identity information and policy in sync across the system. In large organizations that have disparate technologies and protocols for identity, provisioning systems provide a way to have consistent representation of identities and a central point for logging and reporting.

Two common ways of deploying provisioning include metadirectories and virtual directories. Metadirectories connect disparate user repositories through roles and/or rules combined with joins and filters. The metadirectory stores information about the identity as a key value to keep the information in sync and consistent across the user repositories. Virtual directories provide an interface across multiple stores and represent a single point to connect to for identity information. Virtual directories connect to identity repositories but do not store information about specific identities.
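The difference between the two deployment styles can be sketched as follows. This is a toy model under stated assumptions: repositories are plain dictionaries, the join key is simply the subject identifier, and the metadirectory pushes every attribute to every repository, whereas a real product would apply roles, rules, joins, and filters. All names are hypothetical.

```python
class VirtualDirectory:
    """Single lookup point that queries the underlying stores on demand.

    It keeps no identity data of its own.
    """

    def __init__(self, stores):
        self._stores = stores  # store name -> {subject_id: attributes}

    def lookup(self, subject_id):
        merged = {}
        for store in self._stores.values():
            merged.update(store.get(subject_id, {}))
        return merged


class MetaDirectory:
    """Keeps its own record per identity and pushes changes to each repository."""

    def __init__(self, stores):
        self._stores = stores
        self._records = {}  # join key -> last synchronized attributes

    def sync(self, subject_id, attributes):
        self._records[subject_id] = dict(attributes)
        for store in self._stores.values():
            store.setdefault(subject_id, {}).update(attributes)


hr = {"alice": {"department": "claims"}}
mail = {"alice": {"email": "alice@example.com"}}
stores = {"hr": hr, "mail": mail}

virtual = VirtualDirectory(stores)
print(virtual.lookup("alice"))

meta = MetaDirectory(stores)
meta.sync("alice", {"department": "audit"})
print(hr["alice"], mail["alice"])
```

The virtual directory answers from the live repositories each time, so it reflects whatever they currently hold. The metadirectory changes the repositories themselves, which keeps them consistent but also makes it another store of identity data to protect.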

Goals

The following are the goals of an identity abstraction layer:

Abstract back end resources: The identity abstraction layer provides a layer of indirection across the back end identity technologies, products, and protocols. From an architectural viewpoint, the layer of indirection has great utility in allowing for flexibility in the implementation behind the identity interface; architects can choose to change products, vendors, availability schemes, and other architectural concerns, since they are hidden behind the façade.
Provide for interoperability/pluggability: The identity abstraction layer enables interoperability on two levels. First, by making identity strong and portable, the layer allows for interoperability across applications with disparate identity stores. Second, distributed applications can integrate across domains while retaining a strong identity profile.
Service-oriented focus: An identity abstraction layer is a step towards a service-oriented architecture (SOA), where clients and services are decoupled both logically and at runtime. Service-oriented identity information makes a minimum of assumptions about the systems that connect to it and provides interoperable standards for portability of security information.

Additional Constituents

The identity abstraction layer may contain additional architectural constituents, including

Checkpoint functions: Checkpoint functions include logging services of entry and exit of processes, point of origin, date and time, header values, and other criteria. Checkpoint functions are widely used in security product technologies and may also represent a privacy concern regarding information leakage when ported across domains.
Subject descriptors: The subject descriptor pattern described by Blakley et al. in Security Design Patterns [Blakley 04] can be used to provide access to security information. The subject descriptor provides methods to get attributes from an attribute list that contains security attributes, as well as an iterator method for adding attributes. A variant of this pattern is used in the Java Authentication and Authorization Service (JAAS).
Naming services: Naming services are required to describe policy and status and to issue and verify identity tokens for applications. The identity abstraction layer must provide a naming service to perform the identification. One pitfall in distributed systems is naming conflicts resulting from disparate and legacy implementations, and the naming services model must address these potential issues as well.
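The subject descriptor idea listed above can be illustrated with a small sketch. It is not the interface defined in Security Design Patterns or in JAAS; the class and method names are hypothetical and only show the general shape of an object that gives uniform access to a subject's security attributes.

```python
class SubjectDescriptor:
    """Holds the security attributes of a subject behind a small, uniform interface."""

    def __init__(self, attributes=None):
        self._attributes = dict(attributes or {})

    def get_attribute(self, name, default=None):
        """Return a single security attribute, or the default if it is absent."""
        return self._attributes.get(name, default)

    def add_attribute(self, name, value):
        """Attach a security attribute to the subject."""
        self._attributes[name] = value

    def __iter__(self):
        """Iterate over (name, value) pairs in a stable, sorted order."""
        return iter(sorted(self._attributes.items()))


subject = SubjectDescriptor({"user": "alice"})
subject.add_attribute("role", "auditor")
for name, value in subject:
    print(name, value)
```

Callers that make access decisions read attributes through the descriptor rather than reaching into a particular identity store, which keeps them independent of where the attributes originated.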

Guarding the Keys to the Kingdom

Since identity information is so central to so many security decisions and to so much application functionality, it represents a highly prized target for attackers. From a cultural viewpoint, identity information is understood to require extra due diligence by government, regulatory bodies, and individual users. Identity information and its related architectural constituents therefore may be held to a higher standard for both security and privacy elements, and additional security analysis, design, implementation, operations, and auditing may be required. Examine the security model of the identity services and identity stores in the context of the overall system security to ensure that the identity services and identity stores are among the strongest links in the system. The more identity information is centralized logically or physically, the more risk to identity information is also aggregated.

Availability

Identity services provide an interface to information about subjects stored in the identity stores in a system. They can also present a single point of failure that attackers may target to bring application systems down, without the need for the attackers to target the application itself. In fact, since identity services and stores are often reused in organizations to serve identity information to multiple applications, an attacker who executes a denial-of-service or other availability attack against identity services and stores can have a large adverse impact on the availability of the system. Failover, replication, dynamic binding, decentralization, and clustering techniques can be used to combat availability threats.
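Of the techniques listed above, failover is the simplest to sketch: a client tries replicated identity service endpoints in order and uses the first one that answers. The replicas here are hypothetical callables standing in for network clients, and the exception type is introduced only for this example.

```python
class IdentityServiceUnavailable(Exception):
    """Raised when no replica could answer the lookup."""


def lookup_with_failover(subject_id, replicas):
    """Try each replica in order and return the first successful answer."""
    last_error = None
    for replica in replicas:
        try:
            return replica(subject_id)
        except ConnectionError as exc:
            last_error = exc  # remember the failure and try the next replica
    raise IdentityServiceUnavailable(
        f"all {len(replicas)} replicas failed"
    ) from last_error


def down_replica(subject_id):
    """Simulates an identity server that is unreachable."""
    raise ConnectionError("primary unreachable")


def healthy_replica(subject_id):
    """Simulates a replica that answers normally."""
    return {"subject": subject_id, "status": "active"}


result = lookup_with_failover("alice", [down_replica, healthy_replica])
print(result)
```

Failover of this kind only helps if the replicas fail independently; replicas that share the same network path or the same vulnerable software can be taken down together.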

Hardened Servers and Services

Due to the criticality of the data that identity servers host and the access they vouch for in the system, identity servers should be hardened to the highest level of surety that is practical. The goal of identity servers is to provide and verify identity information for applications, not to run web servers, database servers, and so on. Standard server hardening techniques that limit privileges and services to those strictly necessary apply in this instance. Hardening special-purpose identity servers, such as directory services servers, is a more straightforward task than hardening identity servers that are general-purpose tools in the organization and may contain both identity and line-of-business or domain information.

Design for Failure

As a high-priority target for attackers, identity servers and services are likely to be the recipient of the most sophisticated attacks that opponents can muster. A study by IBM [BNET 05] showed a marked increase in attack sophistication. The study reported that there were over 237 million security attacks in the first half of 2005 and that more sophisticated, “customized” attacks increased by more than 50 percent in 2005.

Host integrity monitoring, network and host-based intrusion detection systems, network security monitoring, and secure exception management practices enable more robust detection when protection mechanisms fail.

Incident Response

Many attacks against identity, particularly identity theft, rely in large part on the victim being ignorant that theft has occurred for some period of time. The damage an attacker can cause can be partially mitigated by effective, rapid, and targeted response to identity data theft. An effective program could include clear communication lines and response patterns, along with a set of guidelines that the victimized users can implement to deal with the aftermath of an identity theft.

Usability

At runtime, the end point for identity data is frequently the user session and user desktop. Therefore, securing identity often comes back to a battle between usability and security. The work done protecting an identity across dozens of hops across servers and nodes can be defeated by attackers targeting the desktop layer. Robust identity systems must ensure that the usability of identity is factored in so users understand their roles and responsibilities in using their identity in the system.

Assurance

Identity information is used as a basis for access control decisions and audit purposes. Identity services must be designed so that the confidentiality, integrity, and accountability mechanisms relating to identity information provide assurance that the system will meet its overall specification including protection and detection mechanisms that rely on identity information.

Project Roles

Software development teams benefit from understanding project roles. The software development team may contain these roles:

identity architect, who is responsible for the identity system design and implementation
application architect, who owns the whole set of non-functional requirements like security and scalability
developer, who is responsible for writing the application code but is not expected to be an expert in implementing cryptographic systems, password recovery, provisioning, and other identity-related functions

Aligning architecture and organizational roles and responsibilities supports the separation of architectural concerns.

Future Directions

The identity space remains a ferment of activity in the technology industry. Universities, businesses, government, criminals, and privacy advocates all realize the utility of identity in their various enterprises. Developing a shared understanding of the conceptual framework of identity problems and solutions remains a challenge. As each group develops new solutions and implementations, cascade effects are felt across the other identity stakeholders’ interests, so staying current with the issues and solutions that emerge is critical. Two websites that closely track identity issues across the landscape are IdentityBlog (www.identityblog.com) and Identity Corner (www.idcorner.org).

References

[APWG 05] Anti-Phishing Working Group. Phishing Activity Trends Report, June 2005.

[Bishop 02] Bishop, Matt. Computer Security: Art and Science. Boston, MA: Addison-Wesley, 2002.

[Blakley 04] Blakley, Bob & Heath, Craig. Security Design Patterns. The Open Group, 2004.

[Brands 05] Brands, Stefan. "UK Study recommends federated architecture – but not a la Liberty Alliance." The Identity Corner, March 21, 2005.

[BNET 05] Business Wire. "IBM Report: Government, Financial Services and Manufacturing Sectors Top Targets of Security Attacks in First Half of 2005; 'Customized' Attacks Jump 50 Percent As New Phishing Threats Emerge." BNET Business Network, 2005.

[Cameron 05] Cameron, Kim. The Laws of Identity, 2005.

[Comm 04] 9-11 Commission. The 9-11 Commission Report. Washington, D.C.: U.S. Government Printing Office, 2004.

Cigital, Inc. Copyright

Permission to reproduce this document and to prepare derivative works from this document for internal use is granted, provided the copyright and “No Warranty” statements are included with all reproductions and derivative works.

For information regarding external or commercial use of copyrighted materials owned by Cigital, including information about “Fair Use,” contact Cigital at copyright@cigital.com.

The Build Security In (BSI) portal is sponsored by the U.S. Department of Homeland Security (DHS), National Cyber Security Division. The Software Engineering Institute (SEI) develops and operates BSI. DHS funding supports the publishing of all site content.
