Presentation Integration

 


Contents

Aliases

Context

Problem

Forces

Solution

Example

Resulting Context

Testing Considerations

Security Considerations

Acknowledgments

Aliases

Screen scraping

Context

You have multiple independent applications that are organized as functional silos. Each application has a user interface.

Problem

How do you integrate information systems that were not designed to work together?

Forces

To correctly solve this problem, you need to consider the following forces:

  • When integrating multiple applications, you should minimize changes to existing systems because any change to an existing production system represents a risk and might require extensive regression testing.
  • Because many computer systems are designed according to the Layered Application pattern, some of the layers are typically deployed in physical tiers that are not externally accessible. For example, security policy may require the database tier of an application to reside on a separate physical server that is not directly accessible to other applications. Most Web applications allow external access only to the user interface and protect the application and database tiers behind firewalls.
  • Many applications, particularly older mainframe applications, feature little or no separation between business and presentation logic. The only remote components of these applications are simple display and input devices. These terminals access a central mainframe computer, which hosts all business logic and data storage.
  • The business logic layer of many applications is programming-language specific and is not available remotely, unless it was developed on top of a specific remoting technology such as DCOM or Common Object Request Broker Architecture (CORBA).
  • Directly accessing an application's database layers can cause corruption of application data or functionality. In most applications, important business rules and validations reside in the business logic, and they often reside in the presentation layer also. This logic is intended to prevent erroneous user entry from affecting the integrity of the application. Making data updates directly through the data store bypasses these protection mechanisms and increases the risk of corrupting the application's internal state.

Solution

Access the application's functionality through the user interface by simulating a user's input and by reading data from the screen display. Figure 1 shows the elements of a solution that is based on the Presentation Integration pattern.

Figure 1. Presentation Integration connects to an existing application through the presentation layer

The Presentation Integration pattern is sometimes disparagingly called screen scraping because the middleware collects (or scrapes) the information that is displayed on the screen during a user session. Collecting this information tends to be the simpler part of the integration. The more difficult part is navigating to the correct screen in the application, which the middleware has to do in the same way a human user does.

To simulate user interaction, the integration solution has to use a terminal emulator that appears to the application as a regular terminal, but that can be controlled programmatically to simulate user input. Such a terminal emulator is usually specific to the exact type of user interface device that is supported by the application. Fortunately, in the mainframe world, IBM's 3270 terminal standard is so widespread that many commercial 3270 terminal emulators are available. Instead of displaying information to the user, these emulators make the screen data available through an API. In the case of 3270 emulators, a standard API exists that is called the High Level Language Application Program Interface (HLLAPI). The emulator can send data to the application to simulate keystrokes that a user would make on a real 3270 terminal. Because the terminal emulator mimics a user's actions, it usually does not depend on the specific application that it is interacting with. Additional middleware logic must encode the correct keystrokes and extract the correct fields from the virtual screen.
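The field-extraction step described above can be sketched as follows. This is a minimal illustration, not HLLAPI itself: it assumes the emulator hands back the virtual screen as one flat character buffer of 24 rows by 80 columns, and the names `Field`, `CUSTOMER_SCREEN`, `extract_fields`, and `make_screen`, as well as the field positions, are hypothetical.

```python
from dataclasses import dataclass

ROWS, COLS = 24, 80  # assumed screen size: 24 rows by 80 columns


@dataclass(frozen=True)
class Field:
    """Location of one value on the virtual screen (1-based row and column)."""
    row: int
    col: int
    length: int


# Hypothetical layout of a customer inquiry screen; the positions are made up.
CUSTOMER_SCREEN = {
    "customer_id": Field(row=3, col=15, length=8),
    "name": Field(row=4, col=15, length=30),
    "city": Field(row=5, col=15, length=20),
}


def extract_fields(screen: str, layout: dict[str, Field]) -> dict[str, str]:
    """Slice each field out of a flat ROWS * COLS character buffer."""
    if len(screen) != ROWS * COLS:
        raise ValueError("unexpected screen size")
    values = {}
    for name, field in layout.items():
        start = (field.row - 1) * COLS + (field.col - 1)
        values[name] = screen[start:start + field.length].strip()
    return values


def make_screen(lines: list[str]) -> str:
    """Build a flat buffer from text lines, padded to the full screen size."""
    padded = [line.ljust(COLS)[:COLS] for line in lines]
    padded += [" " * COLS] * (ROWS - len(padded))
    return "".join(padded)


# A simulated screen capture; every value starts in column 15.
screen = make_screen([
    "CUSTOMER INQUIRY",
    "",
    "Customer ID:".ljust(14) + "00012345",
    "Name:".ljust(14) + "Jane Example",
    "City:".ljust(14) + "Springfield",
])

print(extract_fields(screen, CUSTOMER_SCREEN))
```

Note that the layout table is specific to one screen of one application, while the slicing logic is generic. This mirrors the split between the application-independent emulator and the additional middleware logic.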

The widespread trend of equipping applications with Web-based interfaces has revived interest in Presentation Integration as a viable integration approach. Web applications are easily accessible over the Internet. However, the only accessible portion is the user interface, which is accessed through HTTP, a relatively simple protocol. Web applications transmit presentation data in HTML. Because HTML is relatively easy to parse programmatically, Presentation Integration is a popular approach.
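As a rough illustration of how little code is needed to parse HTML programmatically, the following sketch collects the cells of an HTML table by using only the `html.parser` module from the Python standard library. The page content and the names `CellCollector`, `PAGE`, and `scrape_rows` are invented for this example. A real solution would first fetch the page over HTTP.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> or <th> cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Invented sample page standing in for a provider's Web user interface.
PAGE = """
<html><body>
<table>
  <tr><th>Business</th><th>Employees</th><th>Sales tax reported</th></tr>
  <tr><td>Example Bakery</td><td>12</td><td>4,150.00</td></tr>
  <tr><td>Sample Hardware</td><td>7</td><td>0.00</td></tr>
</table>
</body></html>
"""


def scrape_rows(html: str) -> list[list[str]]:
    """Return all table rows found in the HTML as lists of cell strings."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return parser.rows


for row in scrape_rows(PAGE):
    print(row)
```

Every value comes back as a string, including the numeric columns. Any typing or validation has to be added by the integration solution.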

Unfortunately, the ease of collecting information from a provider's Web page over the Internet has caused some application providers to intentionally exploit the biggest weakness of Presentation Integration: brittleness. Because Presentation Integration usually depends on the exact geometric layout of the information, rearranging data fields can easily break a Presentation Integration solution. The graphical nature of HTML allows a provider to easily modify the HTML code that describes the layout of the information on the screen. Such layout changes can then break attempts to collect information from the Web page.
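The brittleness can be illustrated with a small sketch. It assumes that the page has already been parsed into rows of cell strings, with the first row holding the column headers. The functions `by_position` and `by_header` and the sample data are hypothetical.

```python
def by_position(rows, column_index):
    """Brittle: assumes the value always sits in the same column."""
    return [row[column_index] for row in rows[1:]]


def by_header(rows, header):
    """Less brittle: looks the column up by its header text."""
    index = rows[0].index(header)  # raises ValueError if the header is renamed
    return [row[index] for row in rows[1:]]


# The same data before and after the provider swaps two columns.
original = [["Business", "Employees"], ["Example Bakery", "12"]]
rearranged = [["Employees", "Business"], ["12", "Example Bakery"]]

print(by_position(original, 1))          # employee count, as intended
print(by_position(rearranged, 1))        # silently returns the business name
print(by_header(rearranged, "Employees"))  # still finds the employee count
```

Looking a column up by its header survives a reordering, but it still fails as soon as the provider renames the header. Neither approach removes the dependency on the presentation. The second one only fails more visibly.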

Presentation Integration is based on the interaction between the components that are described in Table 1.

Table 1: Presentation Integration Components

Component: Presentation layer
Responsibilities:
  • Renders a visual presentation to be displayed on a user terminal
  • Accepts user input and translates it into commands to be executed by the business logic
Collaborators: Terminal emulator

Component: Terminal emulator
Responsibilities:
  • Impersonates a user session to the presentation layer
  • Makes screen information available through an API
  • Allows other applications to issue commands to the presentation layer
Collaborators: Presentation layer and other applications

Component: Other applications
Responsibilities:
  • Consume application data
  • Issue commands
Collaborators: Terminal emulator

 

Example

A big challenge faced by government agencies is the lack of integrated data across multiple state agencies. For example, integrated data gives an income tax agency a more holistic view of a business because the integrated data might show the number of employees that the business has and the amount of sales tax that the business reports, if any. This type of information can be used to identify businesses where there is a difference between the tax owed and the tax actually collected; this common issue is referred to as a tax gap. However, integrating information from multiple state agencies is often constrained by political and security concerns. In most cases, it is easier for an agency to obtain end-user access to another agency's data as opposed to obtaining direct database access. In these situations, you can use Presentation Integration to gain end-user access to a remote data source in the context of an automated integration solution.

Resulting Context

Presentation Integration is almost always an option and has certain advantages, but also suffers from a number of limitations:

Benefits

  • Low-risk. In Presentation Integration, a source application is the application that the other applications extract data from. It is unlikely that the other applications that access the source application can corrupt it because the other applications access the data the same way that a user accesses the data. This means that all business logic and validations incorporated into the application logic protect the internal integrity of the source application's data store. This is particularly important with older applications that are poorly documented or understood.
  • Non-intrusive. Because other applications appear to be a regular user to the source application, no changes to the source application are required. The only change that might be required is a new user account.
  • Works with monolithic applications. Many applications do not cleanly separate the business and presentation logic. Presentation Integration works well in these situations because it executes the complete application logic regardless of where the logic is located.

Liabilities

  • Brittleness. User interfaces tend to change more frequently than published programming interfaces or database schemas. Additionally, Presentation Integration may depend on the specific position of fields on the screen so that even minor changes such as moving a field can cause the integration solution to break. This effect is exacerbated by the relative wordiness of HTML.
  • Limited access to data. Presentation Integration only provides data that is displayed in the user interface. In many cases, other applications are interested in internal codes and data elements such as primary keys that are not displayed in the user interface. Presentation Integration cannot provide access to these elements unless the source application is modified.
  • Unstructured information. In most cases, the presentation layer displays all data values as a collection of string elements. Much of the internal metadata is lost in this conversion. The internal metadata that is lost includes data types, constraints, and the relationship between data fields and logical entities. To make the available data meaningful to other applications, a semantic enrichment layer has to be added. This layer is typically dependent on the specifics of the source application and may add to the brittleness of the solution.
  • Inefficient. Presentation Integration typically goes through a number of unnecessary steps. For example, the source application's presentation logic has to render a visual representation of the data even though it is never displayed. The terminal emulation software in turn has to parse the visual representation to turn it back into a data stream.
  • Slow. In many cases, the information that you want to obtain is spread across multiple user screens because of limited screen space. For example, information may be split between summary and detail screens. The emulator then has to visit multiple screens to obtain a coherent set of information, which requires multiple requests to the source application and slows down the data access.
  • Complex. Extracting information from a screen is relatively simple compared to locating the correct screen or screens. Because the integration solution simulates a live user, the solution has to authenticate to the system, change passwords regularly according to the system policy, use cursor keys and function keys to move between screens, and so on. This type of input typically has to be hard-coded or manually configured so that external systems can access the Presentation Integration solution as a meaningful business function, such as "Get Customer Address." This translation between business function and keystrokes can add a significant amount of overhead. The same issues of complexity also affect error handling and the control of atomic business transactions.
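The translation between a business function and keystrokes that the last liability describes might look like the following sketch. Everything in it is an assumption made for illustration: `FakeSession` stands in for a real terminal emulator session, and the screen titles, menu option, and function key are invented. The point is that a single call such as "Get Customer Address" expands into a hard-coded sequence of screens and keys, with a check after each step.

```python
class FakeSession:
    """Stand-in for a terminal emulator session (hypothetical, for illustration)."""

    # (current screen, key pressed) -> next screen
    TRANSITIONS = {
        ("LOGIN", "ENTER"): "MAIN MENU",
        ("MAIN MENU", "ENTER"): "CUSTOMER SEARCH",
        ("CUSTOMER SEARCH", "ENTER"): "CUSTOMER SUMMARY",
        ("CUSTOMER SUMMARY", "PF5"): "CUSTOMER DETAIL",
    }

    def __init__(self, addresses):
        self.title = "LOGIN"
        self.addresses = addresses
        self.customer_id = None
        self._buffer = ""

    def type_text(self, text):
        self._buffer = text

    def press(self, key):
        next_title = self.TRANSITIONS.get((self.title, key))
        if next_title is None:
            raise RuntimeError(f"key {key} is not valid on screen {self.title}")
        if self.title == "CUSTOMER SEARCH":
            self.customer_id = self._buffer  # remember which customer was requested
        self.title = next_title
        self._buffer = ""

    def read_field(self, name):
        if self.title == "CUSTOMER DETAIL" and name == "address":
            return self.addresses.get(self.customer_id, "")
        raise RuntimeError(f"field {name} is not on screen {self.title}")


def expect(session, title):
    """Fail fast if navigation did not land on the expected screen."""
    if session.title != title:
        raise RuntimeError(f"expected screen {title!r}, got {session.title!r}")


def get_customer_address(session, user, password, customer_id):
    """Business function implemented as a hard-coded keystroke script."""
    expect(session, "LOGIN")
    session.type_text(f"{user}\t{password}")
    session.press("ENTER")
    expect(session, "MAIN MENU")
    session.type_text("2")  # assumed menu option for customer search
    session.press("ENTER")
    expect(session, "CUSTOMER SEARCH")
    session.type_text(customer_id)
    session.press("ENTER")
    expect(session, "CUSTOMER SUMMARY")
    session.press("PF5")  # assumed function key for the detail screen
    expect(session, "CUSTOMER DETAIL")
    return session.read_field("address")


session = FakeSession({"00012345": "1 Example Street"})
print(get_customer_address(session, "svc_user", "secret", "00012345"))
```

Checking the screen title after every step is one way to turn a navigation failure into an explicit error instead of reading data from the wrong screen. It does not address password changes or transaction control, which the text also lists as sources of complexity.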

Testing Considerations

One advantage of using Presentation Integration is that most user interfaces execute a well-defined and generally well-understood business function. This can be an enormous advantage when dealing with monolithic systems that might be poorly documented or understood.

Unfortunately, this advantage is often offset by the fact that testing usually depends on the ability to isolate individual components so that they can be tested individually with a minimum of external dependencies. Such a testing approach is generally not possible when using Presentation Integration.

Security Considerations

Presentation Integration uses the same security model as an end user who logs into the application. This can be an asset or a liability depending on the needs of the applications that are participating in the integration solution. An end-user security model typically enforces a fine-grained security scheme that includes the specific data records or fields that a user is permitted to see. This makes exposing the functions through Presentation Integration relatively secure.

The disadvantage of the fine-grained security scheme is that it can be difficult to create a generic service that can retrieve information from a variety of data sources. In those cases, a special user account has to be created that has access rights to all the data resources that are needed by the external applications.

Acknowledgments

[Ruh01] Ruh, William. Enterprise Application Integration. A Wiley Tech Brief. Wiley, 2001.
