Identity in Assembly and Integration

Gunnar Peterson, Cigital, Inc.

Copyright © 2005, 2008 Cigital, Inc.

2005-11-30; Updated 2008-09-27

Securely integrating a shared service across highly distributed software systems presents a significant challenge at every phase of the software development life cycle. Moreover, there is a crucial need within the project team(s) for common abstractions and a common understanding of all the relevant aspects of a shared service. This document discusses the issues and necessary abstractions related to integrating identity services, which are particularly critical as the basis for granting or denying access to system resources and data.

Acknowledgements. Contributions and reviews by Dan Blum, Stefan Brands, Pamela Curtis, Howard Lipson, Gary McGraw, and Tony Nadalin are gratefully acknowledged. Errors and omissions are my own.

Introduction

In their report [APWG 05] for the month of June 2005, the Anti-Phishing Working Group identified and studied 15,050 reports of phishing covering 74 different corporate brands. The average time online for the sites used in the phishing attacks was 5.9 days. The longest time online for these sites was 30 days. The report found 154 unique password-stealing applications and 526 URLs containing malicious code for password stealing.

Phishing is only one example of attacks targeting identity-related data. The attack vectors against identity-related data are made possible by overall weaknesses in identity technologies and usability factors and vulnerabilities in identity deployments. Individual users are not able to discern when it is appropriate and safe to disclose personal information. From a risk management viewpoint, identity data is ill protected in many systems, and yet it remains among the most valuable data sets to both an attacker and the victim. When software projects move into the assembly and integration phase, the systems must be prepared to meet a host of security challenges that wait in the target production deployment environment.

Overview

In assembly and integration, the logical design assumptions for a system meet the physical, business, technical, organizational, and individual user realities of the target system environment, including identity systems such as user repositories, directory services, and provisioning systems. This document describes common issues, approaches, and integration considerations regarding identity integration in software, with the goal of beginning to build a shared understanding of the problems and solutions in this space so that identity may develop a consistently strong representation and usage within and across domains.

Software development teams have not agreed on standard representation and consumption patterns for authentication, attribute query or update, and authorization of identity information across technological and organizational domains. The current state of identity consists of numerous identity silos that are directly bound to domain-specific technologies, policies, and organizational domains, each with its own interpretation of how to issue, encapsulate, and negotiate identity data and services. This lack of consistency creates issues for distributed systems that must traverse identity silos and domains, and its overall effect is numerous one-off solutions for identity, each of which contains its own arcane, tightly coupled, and technology-specific ways of dealing with identity. There is a well-understood best practice in software development that developers should not attempt to write their own cryptographic algorithms because of the complexity, the lack of peer review, and the value of what the cryptographic functions are protecting. Developers, in contrast, routinely write one-off identity solutions that are never peer reviewed by a wider audience. This identity information is propagated and integrated throughout software systems and used as a basis for making security decisions about access control to critical resources and the confidentiality of personal and business data.

These one-off solutions frequently are further integrated with other identity silos, creating a mishmash of identity solutions with varying limitations and, in the worst case, a lowest common denominator effect. Exacerbating the problem, many identity solutions are already in place as legacy systems while software is being developed, so projects inherit the standard issues found in legacy integration with identity, such as brittleness and lack of support for robust protocols and current standards. Why is this an especially serious problem for identity management? Matt Bishop states that “Every access control mechanism is based on an identity of some sort.” Bishop goes on to state that all decisions of access and resource allocation assume that the binding of an identity to the principal is correct [Bishop 02]. Hence, identity is a foundation-level element for security and accountability decisions, and breakage at this level in design has profound implications for the system’s security as a whole. Transactions may employ multiple identity contexts throughout their life cycles, broadening the scope of identity’s usage for access.

Added to the above system-level problems are the user population’s lack of awareness, ability, and tools for managing and propagating their own digital identity information and their lack of ability and technical tools to use in determining the veracity of requests for their personal information. The net result of this is the emergence of phishing attacks and other attacks targeting these vulnerabilities at the directory, user desktop, and client level.

To protect identity on the client and server and throughout the system, software development teams require an overarching understanding of identity’s architectural elements and approaches to integrating identity into software systems. Such an understanding will enable them to navigate the chasm that exists between the assumptions made about identities and the actual state of the systems with which they are attempting to integrate. Knowledge of the identity elements, behaviors, and constraints, combined with knowledge of what users know and can do on the desktop and in clients, will give software development teams the tools they need to build more robust software based on secure usage of identity. Ideally, the architecture should abstract the low-level concerns relating to identity from the developer.

This document examines architectural concerns relating to identity abstraction, security, and privacy in distributed systems so that software development teams may compose a framework that harmonizes these concerns in a way commensurate with their risk management decisions.

Identity Architectural Concerns

Identity information typically provides the basis for a number of functional requirements (e.g., being used as a key value to direct access to appropriate resources and data) and non-functional requirements such as security and privacy. Each scenario that interacts with identity data and services may generate and/or rely on identity information for different purposes. Identity usage paradigms in representing and using identity in the system can create conflicting goals. One example is the inherent tension between privacy and security. Many privacy goals revolve around data subjects controlling who can access information about themselves; security goals, especially protection and detection mechanisms, are concerned with not letting information fall into the wrong hands and gathering as much data as possible about the system’s users. The balance between the two is determined by how the identity is protected, by what mechanisms, and who owns and operates the mechanisms.

Identity comprises a number of architectural concerns. Each concern’s elements and constraints can result in architectural tradeoffs against other identity architecture concerns. Project teams map out the constraints and rationale for the identity elements in their system, conducting architectural tradeoff analysis regarding how identity is generated, represented, consumed, and transformed. Examples of identity architectural concerns include the following:

Depending on how each of the above architectural concerns is implemented and integrated, there is a potential cascading effect to the detriment of the other concerns. A holistic architectural view of how identity is generated, represented, consumed, transformed, communicated, and stored in the system is required to gauge the system impacts of the design decisions.

Identity Integration Considerations in Distributed Systems Architecture

When software is assembled and integrated into the system for production release, the overall strength of the security mechanisms is affected by

Identity Domains and Relationships

Identity information is context sensitive. Identity domains, services, and providers control what identity information is made available and verifiable by what relying parties. In a distributed system, identity domains and identity providers are typically tied back to one or more parties that can provide identity services, such as verification. These parties could include commercial and government interests and social networks. Each party that performs the identity provider role has direct implications on privacy and security depending on how identity will be treated in the system (and potentially across systems) and what stakeholder is in control of what identity functions. Systems that employ replication of identity information need mechanisms to recognize the authoritative source of the identity information.

Identity Information Leakage in Cross-Domain Relationships

Information systems are increasingly interconnected. One of the findings of the 9/11 Commission [Comm 04] was that United States intelligence organizations’ stovepipes impeded intelligence analysis. In business scenarios, information systems are integrated, such as when companies’ intranet sites provide single sign-on capabilities to other companies that provide 401k and health benefit services. The relying parties, the financial and health benefits services organizations, may rely on their customers’ systems to provide information about the identity of the user who is connecting to their services. However, the two systems may not have consistent security policies, enforcement, audit, or privacy requirements.

Identity information leakage can occur when identity providers supply more information than is necessary to perform the functional task and do not protect the identity information when it is transmitted across the domains’ boundaries. A classic example would be a service that requires that authorized users be 21 years of age or over. The relying party asks the identity provider for the age information. If the identity provider gives the relying party the user’s birth date so that the relying party can calculate the age of the user, then the user’s birth date has been propagated to a separate service that now can retain (or disclose or otherwise lose) a valuable piece of personal information that the service does not absolutely require to perform its functions. A more appropriate response could be that the relying party asks the identity provider or the data subject whether the user is at least 21 years old and receives a Boolean yes/no response. Some information has been revealed to the service provider in this instance, but far less critical data has been revealed.

Emerging technologies like Web Services and Federated Identity have direct implications on identity information leakage. Early efforts around portable identity for web usage, like Microsoft Passport, suffered from disclosing identity information to parties that did not have a justifiable place in the transaction [Cameron 05]. Directory services that replicate identity information at the data level can also create exposure by replicating more identity information than dependent systems require.
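The age example above can be sketched in a few lines of Python. This is a minimal illustration, not a real identity provider API: the `IdentityProvider` class, its `is_at_least` method, and the sample subject are all hypothetical names introduced here. The point it shows is that the relying party receives only a Boolean, while the birth date stays inside the provider.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Subject:
    """Record held by the identity provider (hypothetical schema)."""
    user_id: str
    birth_date: date


class IdentityProvider:
    """Hypothetical provider that answers predicates instead of releasing raw attributes."""

    def __init__(self, subjects):
        self._subjects = {s.user_id: s for s in subjects}

    def is_at_least(self, user_id: str, years: int, today: date) -> bool:
        """Return only a yes/no answer; the birth date never leaves the provider."""
        born = self._subjects[user_id].birth_date
        # Subtract one year if this year's birthday has not happened yet.
        had_birthday = (today.month, today.day) >= (born.month, born.day)
        age = today.year - born.year - (0 if had_birthday else 1)
        return age >= years


idp = IdentityProvider([Subject("alice", date(1990, 5, 17))])
print(idp.is_at_least("alice", 21, today=date(2011, 5, 16)))  # False: one day short
print(idp.is_at_least("alice", 21, today=date(2011, 5, 17)))  # True
```

The relying party still learns something (the user is or is not 21), but it has nothing worth retaining beyond the decision itself.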

Identity Integration in N Tier Systems

N Tier architecture such as J2EE and .Net has become a de facto standard in modern application environments. One N Tier paradigm is logically and/or physically partitioning the web, application, and database servers. This architecture creates a separation between presentation and business logic code and components and abstracts the back end data resources and technologies. Given that these separate servers can execute in separate policy and process spaces, what options exist for authentication and authorization at the entry point to each tier, and what identity attributes are carried forward and in what manner as they traverse the system? In a typical scenario, security credential and access control decisions exist, in large part, in the middle tier.

The diagram below shows that even in a straightforward N Tier architecture, there are multiple decision points for access control, configuration management, and identity attribute propagation. Distributed applications can include many more contexts and domains at each tier and include such elements as asynchronous messaging systems, long-running transactions, and batch systems, making this model even more complex. Identity architects and application architects need to address a design strategy that applies risk management to the identity information propagation.

Figure 1. Access control points in an N Tier application

Figure 1 shows three main points for access control in a simple N Tier application. Each tier may be supported by unique defense-in-depth layers that guard the physical, network, host, application, and data resources. Each tier may have one or more user repositories and identity services that are used to provide and negotiate identity information for access control usage.

Access control point 1, in many systems, consists of the user providing his or her username and password. Additional factors may come into play, such as certificates. Once the user is authenticated, the web server creates a session for the user based on the user’s credentials and access rights in the system.

To access the data in an N Tier system, the user’s transactions must first traverse the middle tier. The access control point number 2 in the diagram shows the web server process authenticating itself to the middle tier application server. While the web server on behalf of the user performs the authentication and authorization functions, these do not typically consume the end user’s credentials, but rather system accounts or other general-purpose accounts. How then is the user’s information carried through the system and persisted? What information is required for the system to be able to perform granular access control in the middle tier and data tier on behalf of the user? There are several common methods of resolving this situation, described below as impersonation and delegation.

Access control point 3 is further removed from the original user session. It is the gateway to the data on the system and may require information that is not used by the middle tier or presentation tier but that may need to be persisted through the transaction life cycle for access control decisions at this tier.

Notes on Identity Services and Stores Unification

Each tier in the N Tier system can have one or more identity services and stores. The layering of an application across multiple logical and/or physical tiers results in a linkage of the identity services and stores in each tier. The unification of the identity services and stores can result in a lowest common denominator effect for applications’ overall security, especially when access control functions and requests are chained. During assembly and integration, each tier should be investigated to identify weak points that may adversely affect the security of the application as a whole.

Impersonation

Impersonation allows the servers in a system to authenticate under system-type accounts. In the N Tier architecture example, the web and application servers impersonate Alice. They may propagate her credentials for use in the application, but the actual access control requests to the relying servers are carried out using the server’s credentials.

Figure 2. Impersonation in an N Tier application

Impersonation allows for flexibility in that the application and database servers do not have any knowledge of end-user accounts. However, this design may have unanticipated consequences if the server accounts (Bob and Charlie) have more privileges than are necessary for Alice’s requests. Injection attacks can be used by attackers to insert commands on the user’s (Alice’s) behalf and have them executed by accounts with greater privileges to call out and gain command shells or other resources. Impersonation requires detailed security analysis of the rights and privileges for accounts and functions that allow the system to use impersonation to map identity onto other accounts.
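The privilege gap described above can be made concrete with a small sketch. Everything here is assumed for illustration: the privilege names, the `charlie_svc` account standing in for a server account, and the function names. It models a data tier that authorizes only the connecting account, so the end user's name travels as data and plays no part in the decision.

```python
# Privileges held by each account (hypothetical values for illustration).
PRIVILEGES = {
    "alice": {"read_own_orders"},
    "charlie_svc": {"read_own_orders", "read_all_orders", "drop_table"},
}


def database_execute(account: str, operation: str) -> bool:
    """Data tier: authorizes against the connecting account only."""
    return operation in PRIVILEGES[account]


def app_server_impersonating(end_user: str, operation: str) -> bool:
    """Middle tier: end_user is carried as data, but the call runs as the service account."""
    return database_execute("charlie_svc", operation)


def excess_privileges(service_account: str, end_user: str) -> set:
    """Privileges an injected command could reach that the end user was never granted."""
    return PRIVILEGES[service_account] - PRIVILEGES[end_user]


# An operation Alice was never granted still succeeds under the service account.
print(app_server_impersonating("alice", "drop_table"))    # True
print(sorted(excess_privileges("charlie_svc", "alice")))  # ['drop_table', 'read_all_orders']
```

A review of an impersonation design could start from a difference like `excess_privileges`: each entry is something an injection flaw in the middle tier would expose.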

Delegation

The delegation model carries the user’s credentials through the transaction life cycle. The access control checks at all access control points are performed using the end user’s (Alice’s) credentials. Alice delegates her identity data for authorization purposes to the Web and Application Servers to negotiate access. This solution requires more up front configuration to ensure that the servers are provisioned with the correct credentials or mapping to user credentials.

Figure 3. Delegation in an N Tier application

A delegation model has access control advantages because, once the servers are properly provisioned, user accounts are constrained to only what they are allowed to do throughout the whole transaction life cycle. The downsides of this approach include increased administrative burden for provisioning and the potential for performance impacts resulting from the loss of pooled resources such as connection pools, which are common in impersonation and system account scenarios. Any time identity data is handled by systems outside of the user’s control, diligence is required to ensure the identity information retains confidentiality, integrity, and availability properties even in failure scenarios.
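As a rough sketch of the delegation model, the following Python passes the end user's identity through every tier and repeats the access check at each access control point. The tier functions, the `AccessDenied` exception, and the privilege table are hypothetical; a real deployment would carry a verifiable credential rather than a bare user name.

```python
PRIVILEGES = {"alice": {"read_own_orders"}}  # hypothetical grants


class AccessDenied(Exception):
    """Raised when the end user lacks the requested privilege at some tier."""


def check(user: str, operation: str, tier: str) -> None:
    """Evaluate the request against the end user's own privileges."""
    if operation not in PRIVILEGES.get(user, set()):
        raise AccessDenied(f"{tier}: {user} may not {operation}")


def database_tier(user: str, operation: str) -> str:
    check(user, operation, "database")     # access control point 3
    return f"{operation} executed for {user}"


def application_tier(user: str, operation: str) -> str:
    check(user, operation, "application")  # access control point 2
    return database_tier(user, operation)


def web_tier(user: str, operation: str) -> str:
    check(user, operation, "web")          # access control point 1
    return application_tier(user, operation)


print(web_tier("alice", "read_own_orders"))  # read_own_orders executed for alice
```

Because the data tier checks Alice's privileges itself, a command injected in the middle tier cannot exceed what Alice is allowed to do. The cost is that each tier needs Alice's grants provisioned, and connections can no longer be pooled under one shared account.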

Note

Security Design Patterns [Blakley 04] describes Secure Proxy patterns, motivated by the issue that “Security properties, especially authentication, often do not compose. Nevertheless, information systems are often built on composition.” It analyzes the tradeoffs involved in proxying and covers a number of identity integration patterns, including impersonation, delegation, proxy, tunneling, and other scenarios.

Role-Based Access Control

Role-based access control is an access control model in which a subject’s information is mapped onto a role at runtime so that access control decisions can be made against the role’s privileges. Subjects are assigned a set of roles, and the system evaluates the roles to determine authorized access to objects. Role-based access control can enhance administrative flexibility in the system for dealing with user accounts and at the same time allow for granular authorization controls on objects and transactions based on roles rather than individual accounts. In many role-based access control systems, the subject’s attributes are propagated along with the role credentials, so risks to identity information, such as identity information leakage, remain.
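A minimal sketch of the role evaluation described above follows. The role names, permission strings, and subjects are invented for illustration. The second function shows one way to limit leakage: pass only the roles downstream, not the subject's other attributes.

```python
# Hypothetical role and assignment tables.
ROLE_PERMISSIONS = {
    "claims_viewer": {"claim:read"},
    "claims_adjuster": {"claim:read", "claim:update"},
}
SUBJECT_ROLES = {
    "alice": {"claims_viewer"},
    "dana": {"claims_adjuster"},
}


def is_authorized(subject: str, permission: str) -> bool:
    """Grant access if any of the subject's roles carries the permission."""
    roles = SUBJECT_ROLES.get(subject, set())
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


def credentials_for_downstream(subject: str) -> dict:
    """Propagate only the roles; personal attributes stay behind."""
    return {"roles": sorted(SUBJECT_ROLES.get(subject, set()))}


print(is_authorized("alice", "claim:read"))    # True
print(is_authorized("alice", "claim:update"))  # False
print(credentials_for_downstream("dana"))      # {'roles': ['claims_adjuster']}
```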

Federation

In a federated identity model, one or more relying parties and service providers enter into league together, where assertions made by one party are recognizable and verifiable by the other parties. Assertions can be made about any number of attributes pertaining to the identity, including authentication, authorization, and domain-specific attributes. Federation protocols and standards, such as SAML, Liberty ID-FF, and WS-Federation, allow for identity information to be transferred across domain contexts. There are many use cases that employ federation; for example, in a federated single sign-on scenario, an employee at a company could log on to the company’s intranet site, and when the employee clicked to browse his or her health benefits on a site hosted by a third-party healthcare benefits provider, the employee’s credentials would be federated to the site, and the employee would be logged on to the healthcare plan system automatically with the requisite privileges and data. As systems become increasingly integrated, federation is emerging as a desirable property for identity data.

Figure 4. Federated identity across security domains

The convenient portability properties that federation provides can conflict with privacy goals. In some cases the federation protocols set up a panoptic situation that reduces privacy in a multiparty federation [Brands 05]. Depending on implementation, personal information can be correlated and traced across domains through logs and other mechanisms. Federation, as its name implies, relies on mutually agreed upon static groups and is not suited to ad hoc groups that want to rely on the same identity data. Federation servers create failure points as targets of denial of service and disruption of availability attacks. As with all identity services, federation design decisions need to deal with the balance of security, privacy, and usability.
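The core idea of federation, an assertion issued in one domain that another domain can verify, can be sketched with Python's standard library. This is a deliberate simplification and not SAML, Liberty ID-FF, or WS-Federation: it assumes a pre-shared HMAC key in place of the XML signatures and public-key trust those standards use, and the claim names (`sub`, `attrs`, `aud`, `exp`) and the sample values are hypothetical.

```python
import hashlib
import hmac
import json

SHARED_KEY = b"demo-key-agreed-by-the-federation"  # assumption: exchanged out of band


def issue_assertion(subject, attributes, audience, expires_at):
    """Identity provider: sign a statement about the subject for one relying party."""
    claims = {"sub": subject, "attrs": attributes, "aud": audience, "exp": expires_at}
    payload = json.dumps(claims, sort_keys=True).encode()
    signature = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return payload, signature


def verify_assertion(payload, signature, audience, now):
    """Relying party: accept only untampered, unexpired assertions addressed to it."""
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return None
    claims = json.loads(payload)
    if claims["aud"] != audience or now >= claims["exp"]:
        return None
    return claims


payload, sig = issue_assertion("employee-4711", {"plan": "gold"}, "benefits.example", expires_at=1000)
print(verify_assertion(payload, sig, "benefits.example", now=900))         # the claims
print(verify_assertion(payload + b" ", sig, "benefits.example", now=900))  # None (tampered)
```

Binding the assertion to a single audience and sending an opaque subject identifier with only the attributes that party needs are two small ways to limit the cross-domain correlation discussed above.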

Identity Abstraction Layer

Abstraction layers are frequently used to provide reusability, vendor and technology independence, and a layer of indirection for databases, operating systems, and other computing resources. This technique is relatively less used for identity services and stores; however, there is no technical reason for this to be the case, and emerging technologies make it simpler for identity services and stores to enjoy the same abstraction benefits as databases and other resources. Those technologies include open, interoperable protocols such as SOAP; federated identity standards such as Liberty ID-FF, SAML, and WS-Federation; and security token servers (STS) such as the STS standards proposed in the WS-Trust specification, which eliminate the need for pairwise identity relationships where identity and its consumers are tightly coupled. The combined impact of open protocols and standards, federated identity, and STS creates a situation where identity information may be abstracted and loosely coupled to domains and services. The net result is that an application’s queries for identity information can be bound to an identity abstraction layer using open protocols and standards-based security tokens. The back end user repository and policy stores can then be switched, merged, and replicated, similar to how an application server hides the details of the database server behind it so that switching vendors or technologies (or versions) does not require a total rewrite of presentation and business logic layer code. Identity usage patterns, such as authentication, personalization, and attribution can use the services in an identity abstraction layer rather than being bound in a stovepipe fashion to a single identity store.
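The database analogy above can be illustrated with a small sketch in which application code depends on an abstract interface and the backing store is swapped without touching that code. The `IdentityStore` interface, both backends, and the `greeting` function are hypothetical; a real layer would speak to directories and token services over standard protocols.

```python
from abc import ABC, abstractmethod


class IdentityStore(ABC):
    """Contract the application codes against (hypothetical interface)."""

    @abstractmethod
    def get_attribute(self, user_id: str, name: str):
        """Return the attribute value, or None if it is unknown."""


class DirectoryStore(IdentityStore):
    """Stand-in for a directory service: nested dictionaries."""

    def __init__(self, entries):
        self._entries = entries

    def get_attribute(self, user_id, name):
        return self._entries.get(user_id, {}).get(name)


class LegacyTableStore(IdentityStore):
    """Stand-in for a legacy user table: rows of (user_id, name, value)."""

    def __init__(self, rows):
        self._rows = rows

    def get_attribute(self, user_id, name):
        for row_user, row_name, value in self._rows:
            if (row_user, row_name) == (user_id, name):
                return value
        return None


def greeting(store: IdentityStore, user_id: str) -> str:
    """Application code: depends only on the abstraction, never on a backend."""
    display_name = store.get_attribute(user_id, "display_name") or "guest"
    return f"Welcome, {display_name}"


directory = DirectoryStore({"alice": {"display_name": "Alice"}})
legacy = LegacyTableStore([("alice", "display_name", "Alice")])
print(greeting(directory, "alice"))  # Welcome, Alice
print(greeting(legacy, "alice"))     # Welcome, Alice
```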

Key Functions

The key functions of an identity abstraction layer include

Providing identity information to applications through the identity abstraction layer may require behind the scenes provisioning of attributes, views, or repositories. Such provisioning should be designed by the identity infrastructure architect and the application architects, while remaining as transparent as possible to the developers.

In these cases, identity is provisioned to applications’ user repositories and other locations. Provisioning services add, update, and delete identity and information relating to identities such as authentication, authorization, keys, and domain-specific attributes. In systems that do not have integrated identity at runtime through federation or other means, provisioning may be used to keep identity information and policy in sync across the system. In large organizations that have disparate technologies and protocols for identity, provisioning systems provide a way to have consistent representation of identities and a central point for logging and reporting.

Two common ways of deploying provisioning include metadirectories and virtual directories. Metadirectories connect disparate user repositories through roles and/or rules combined with joins and filters. The metadirectory stores information about the identity as a key value to keep the information in sync and consistent across the user repositories. Virtual directories provide an interface across multiple stores and represent a single point to connect to for identity information. Virtual directories connect to identity repositories but do not store information about specific identities.
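The difference between the two approaches can be sketched as follows. Both classes and the two sample repositories are hypothetical. The metadirectory keeps a join record per identity and pushes changes to each linked repository, while the virtual directory holds no identity records and resolves each query against the live repositories using rules.

```python
# Two hypothetical repositories with different local keys.
hr = {"alice": {"name": "Alice", "dept": "Claims"}}
mail = {"alice@example.com": {"name": "Alice", "quota_mb": 500}}


class Metadirectory:
    """Stores a join record per identity and pushes changes to every linked repository."""

    def __init__(self):
        self._joins = {}  # canonical id -> list of (repository, local key)

    def join(self, canonical_id, repository, local_key):
        self._joins.setdefault(canonical_id, []).append((repository, local_key))

    def sync(self, canonical_id, attribute, value):
        for repository, local_key in self._joins[canonical_id]:
            repository[local_key][attribute] = value


class VirtualDirectory:
    """Holds no identity records; resolves each query against the live repositories."""

    def __init__(self, resolvers):
        self._resolvers = resolvers  # callables: user id -> dict or None

    def lookup(self, user_id):
        view = {}
        for resolve in self._resolvers:
            view.update(resolve(user_id) or {})
        return view


meta = Metadirectory()
meta.join("alice", hr, "alice")
meta.join("alice", mail, "alice@example.com")
meta.sync("alice", "name", "Alice Smith")  # both repositories now agree

vdir = VirtualDirectory([hr.get, lambda uid: mail.get(f"{uid}@example.com")])
print(vdir.lookup("alice"))
```

After the sync, the virtual directory's merged view reflects the updated name from both repositories alongside each repository's own attributes.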

Goals

The following are the goals of an identity abstraction layer:

Additional Constituents

The identity abstraction layer may contain additional architectural constituents, including

Guarding the Keys to the Kingdom

Since identity information is so central to so many security decisions and to so much application functionality, it represents a highly prized target for attackers. From a cultural viewpoint, identity information is understood to require extra due diligence by government, regulatory bodies, and individual users. Identity information and its related architectural constituents therefore may be held to a higher standard for both security and privacy elements, and additional security analysis, design, implementation, operations, and auditing may be required. Examine the security model of the identity services and identity stores in the context of the overall system security to ensure that the identity services and identity stores are among the strongest links in the system. The more identity information is centralized logically or physically, the more risk to identity information is also aggregated.

Availability

Identity services provide an interface to information about subjects stored in the identity stores in a system. They can also become a single point of failure that attackers may target to bring application systems down without needing to attack the application itself. In fact, since identity services and stores are often reused across an organization, serving identity information to multiple applications, an attacker who executes a denial-of-service or other availability attack against identity services and stores can have a large adverse impact on the availability of the system. Failover, replication, dynamic binding, decentralization, and clustering techniques can be used to combat availability threats.
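As one illustration of the failover technique mentioned above, the sketch below tries a list of identity service replicas in order and returns the first answer. The `ReplicaUnavailable` exception and the two stand-in replicas are hypothetical; production clients would typically add timeouts, health checks, and backoff.

```python
class ReplicaUnavailable(Exception):
    """Raised by a replica that cannot answer (hypothetical error type)."""


def lookup_with_failover(replicas, user_id):
    """Try each replica in order and return the first successful answer."""
    for replica in replicas:
        try:
            return replica(user_id)
        except ReplicaUnavailable:
            continue  # this replica is down; try the next one
    raise ReplicaUnavailable(f"all {len(replicas)} identity replicas failed")


def primary(user_id):
    """Stand-in for a replica that is currently unreachable."""
    raise ReplicaUnavailable("primary is not responding")


def secondary(user_id):
    """Stand-in for a healthy replica."""
    return {"user_id": user_id, "source": "secondary"}


print(lookup_with_failover([primary, secondary], "alice"))
# {'user_id': 'alice', 'source': 'secondary'}
```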

Hardened Servers and Services

Due to the criticality of the data that identity servers host and the access they vouch for in the system, identity servers should be hardened to the highest level of surety that is practical. The goal of identity servers is to provide and verify identity information for applications, not to run web servers, database servers, and so on. Standard server hardening techniques that limit the available privileges and services to those strictly necessary apply in this instance. Hardening special-purpose identity servers, such as directory services servers, is a more straightforward task than hardening identity servers that are general-purpose tools in the organization and may contain both identity and line-of-business or domain information.

Design for Failure

As a high-priority target for attackers, identity servers and services are likely to be the recipient of the most sophisticated attacks that opponents can muster. A study by IBM [BNET 05] showed a marked increase in attack sophistication. The study reported that there were over 237 million security attacks in the first half of 2005 and that more sophisticated, “customized” attacks increased by more than 50 percent in 2005.

Host integrity monitoring, network and host-based intrusion detection systems, network security monitoring, and secure exception management practices enable more robust detection when protection mechanisms fail.
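One small piece of secure exception management can be sketched in code: an access check that fails closed and leaves an audit trail when the underlying identity lookup breaks. The `is_allowed` wrapper, the logger name, and the failing check are assumptions made for illustration, not part of any particular product.

```python
import logging

logger = logging.getLogger("identity.audit")  # hypothetical logger name


def is_allowed(check, subject: str, resource: str) -> bool:
    """Run an access check; on any failure, record it and deny access."""
    try:
        return bool(check(subject, resource))
    except Exception:
        # Fail closed: an error in the identity service must never grant access.
        logger.exception("access check failed for %s on %s; denying", subject, resource)
        return False


def broken_check(subject, resource):
    """Stand-in for a check whose identity store is unreachable."""
    raise ConnectionError("identity store unreachable")


print(is_allowed(broken_check, "alice", "payroll"))    # False, and the failure is logged
print(is_allowed(lambda s, r: True, "alice", "wiki"))  # True
```

The logged failure gives the detection mechanisms something to act on, which matters as much as the denial itself.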

Incident Response

Many attacks against identity, particularly identity theft, rely in large part on the victim being ignorant that theft has occurred for some period of time. The damage an attacker can cause can be partially mitigated by effective, rapid, and targeted response to identity data theft. An effective program could include clear communication lines and response patterns, along with a set of guidelines that the victimized users can implement to deal with the aftermath of an identity theft.

Usability

At runtime, the end point for identity data is frequently the user session and user desktop. Therefore, securing identity often comes back to a battle between usability and security. The work done protecting an identity across dozens of hops across servers and nodes can be defeated by attackers targeting the desktop layer. Robust identity systems must ensure that the usability of identity is factored in so users understand their roles and responsibilities in using their identity in the system.

Assurance

Identity information is used as a basis for access control decisions and audit purposes. Identity services must be designed so that the confidentiality, integrity, and accountability mechanisms relating to identity information provide assurance that the system will meet its overall specification including protection and detection mechanisms that rely on identity information.

Project Roles

Software development teams benefit from understanding project roles. The software development team may contain these roles:

Aligning architecture and organizational roles and responsibilities supports the separation of architectural concerns.

Future Directions

The identity space remains a ferment of activity in the technology industry. Universities, businesses, government, criminals, and privacy advocates all realize the utility of identity in their various enterprises. Developing a shared understanding of the conceptual framework of identity problems and solutions remains a challenge. As each group develops new solutions and implementations, cascade effects are felt across the other identity stakeholders’ interests, so staying current with the issues and solutions that emerge is critical. Two websites that closely track identity issues across the landscape are IdentityBlog (www.identityblog.com) and Identity Corner (www.idcorner.org).

References

[APWG 05] Anti-Phishing Working Group. Phishing Activity Trends Report, June 2005. 

[Bishop 02] Bishop, Matt. Computer Security: Art and Science. Boston, MA: Addison-Wesley, 2002.

[Blakley 04] Blakley, Bob & Heath, Craig. Security Design Patterns. The Open Group, 2004. 

[Brands 05] Brands, Stefan. "UK Study recommends federated architecture – but not a la Liberty Alliance." The Identity Corner, March 21, 2005.

[BNET 05] Business Wire. "IBM Report: Government, Financial Services and Manufacturing Sectors Top Targets of Security Attacks in First Half of 2005; 'Customized' Attacks Jump 50 Percent As New Phishing Threats Emerge." BNET Business Network, 2005.

[Cameron 05] Cameron, Kim. The Laws of Identity, 2005.

[Comm 04] 9-11 Commission. The 9-11 Commission Report. Washington, D.C.: U.S. Government Printing Office, 2004.

Cigital, Inc. Copyright

Copyright © Cigital, Inc. 2005-2007. Cigital retains copyrights to this material.

Permission to reproduce this document and to prepare derivative works from this document for internal use is granted, provided the copyright and “No Warranty” statements are included with all reproductions and derivative works.

For information regarding external or commercial use of copyrighted materials owned by Cigital, including information about “Fair Use,” contact Cigital at copyright@cigital.com.

The Build Security In (BSI) portal is sponsored by the U.S. Department of Homeland Security (DHS), National Cyber Security Division. The Software Engineering Institute (SEI) develops and operates BSI. DHS funding supports the publishing of all site content.
