Implementing Data Transfer Object in .NET with a DataSet

Version 1.1.0

GotDotNet community for collaboration on this pattern

Complete List of patterns & practices

Context

You are implementing a distributed application in the .NET Framework. The client application displays a form that requires making multiple calls to an ASP.NET Web service to satisfy a single user request. Based on performance measurements, you have found that making multiple calls degrades application performance. To increase performance, you would like to retrieve all the data that the user request requires in a single call to the Web service.

Background

Note: The following is the same sample application that is described in Implementing Data Transfer Object in .NET with a Typed DataSet.

The following is a simplified Web application that communicates with an ASP.NET Web service to deliver recording and track information to the user. The Web service in turn calls a database to provide the data that the client requests. The following sequence diagram depicts the interaction among the application, the Web service, and the database for a typical page.

Figure 1: Behavior of a typical user request

Figure 1 illustrates the sequence of calls needed to fulfill the entire user request. The first call retrieves the recording information, and the second call retrieves the track information for the specified recording. In addition, the Web service must make separate calls to the database to retrieve the required information.

Database Schema

The schema that is used in the example shown in Figure 2 depicts a recording record that has a one-to-many relationship with a track record.

Figure 2: Schema for sample application

Implementing a DTO

One way to improve the performance of this user request is to package all the required data into a data transfer object (DTO) that can be sent with a single call to the Web service. This reduces the overhead associated with two separate calls and allows you to use a single connection with the database to retrieve both the recording and the track information. For a detailed description of how this improves performance, see the Data Transfer Object pattern.

Implementation Strategy

There are a number of possible implementations for Data Transfer Object in the .NET Framework. To use a DTO, you have to complete the following four steps. The good news is that the DataSet class that is built into the .NET Framework already takes care of three (actually, more like three and one-half) of the following steps:

1.Design the DTO class. One step in this process is to decide what data types and structures to support. The DataSet is generic enough to use for any DTO purpose; therefore, you do not need to design a new class for each DTO.

2.Write or generate the code for the data transfer class. DataSet is part of the .NET library, so you do not have to code it.

3.Create an instance of the DTO, and fill it with data. This is the only step that you have to program. DataSet provides convenient functions to load the DTO with data from a database or Extensible Markup Language (XML) document, greatly simplifying this task.

4.Serialize the DTO into a byte or character stream (and back) so that the content of the object can be sent over the network. The DTO has built-in serialization functions.

A DataSet holds a collection of DataTable objects. Each DataTable object represents the data that was retrieved using a SELECT statement or stored procedure execution. The data in a DataSet can be written out or read as XML. A DataSet also stores schema information, constraints, and relationships among multiple DataTable objects. Through a DataSet, you can add, edit, and delete data; therefore, a DataSet makes an ideal data transfer object in the .NET Framework, especially when you add the requirement to display the DataSet in controls on a form.

Because the .NET Framework already implements the DataSet, the remainder of this implementation strategy focuses on how to fill a DataSet from a data source and how to use the resulting DataSet in the Web Form.

Filling a DataSet from the Database

This example demonstrates how to fill the DataSet with the data that the sample application requires by using a database query. This includes the recording record as well as all the track records that are associated by the recordingId.

Assembler.cs

The Assembler class is a specialized instance of the Mapper pattern [Fowler03]. Its purpose is to isolate the DTO from the rest of the system. The following code example shows how the DTO is created from the database:

 

using System;
using System.Data;
using System.Data.SqlClient;

public class Assembler
{
   public static DataSet CreateRecordingDto(long id)
   {
      string selectCmd = 
         String.Format(
         "select * from recording where id = {0}",
         id);

      SqlConnection myConnection = 
         new SqlConnection(
         "server=(local);database=recordings;Trusted_Connection=yes");
      SqlDataAdapter myCommand = new SqlDataAdapter(selectCmd, 
         myConnection);

      DataSet ds = new DataSet();
      myCommand.Fill(ds, "recording");

      String trackSelect = 
         String.Format(
         "select * from Track where recordingId = {0} order by Id",
         id);

      SqlDataAdapter trackCommand = 
         new SqlDataAdapter(trackSelect, myConnection);
      trackCommand.Fill(ds, "track");

      ds.Relations.Add("RecordingTracks",
         ds.Tables["recording"].Columns["id"],
         ds.Tables["track"].Columns["recordingId"]);

      return ds;
   }
}

This code has some interesting aspects. You need to execute queries to fill both the recording and track tables. You must also explicitly define the relationship between the two tables even though the relationship is defined in the database.

Note: The example shown here does not describe the only way to fill the DataSet. There are many ways to retrieve this data from the database. For example, you could use a stored procedure.

Using a DataSet in an ASP.NET Page

When using the .NET user interface controls (Web Forms or Windows Forms) a DataSet is a natural choice. For example, the sample application page uses two DataGrid controls, RecordingGrid and TrackGrid. Because you need to retrieve both the recording and the tracks for the recording, it makes sense to use a single DataSet that contains multiple tables.

Given the DataSet that was built by the Assembler class, this code displays how to assign a DataSet to the DataSource property of the two grid controls:

 

using System;
using System.Data;

public class RetrieveForm : System.Web.UI.Page
{
   private RecordingCatalog catalog = new RecordingCatalog();

   // 
 
   protected void Button1_Click(object sender, System.EventArgs e)
   {
      string stringId = TextBox1.Text;
      long id = Convert.ToInt64(stringId);

      DataSet ds = catalog.Get(id);
      RecordingGrid.DataSource = ds.Tables["recording"];
      RecordingGrid.DataBind();

      TrackGrid.DataSource = ds.Tables["track"];
      TrackGrid.DataBind();
   }
}

Tests

Because the DataSet is provided by the .NET Framework, you do not need to write tests to verify that it functions correctly. You could argue this point, but you should assume that classes provided by the Framework are innocent until proven guilty; therefore, what you need to test is the code that assembles the DataSet, which in this case is the Assembler class.

RecordingAssemblerFixture.cs

This fixture tests that the contents of the DataSet are filled and that the relationship between recording and track is defined correctly:

 

using NUnit.Framework;
using System.Data;

[TestFixture]
public class RecordingAssemblerFixture
{
   private DataSet ds;
   private DataTable recordingTable; 
   private DataRelation relationship;
   private DataRow[] trackRows; 

   [SetUp]
   public void Init()
   {
      ds = Assembler.CreateRecordingDto(1234);
      recordingTable = ds.Tables["recording"];
      relationship = recordingTable.ChildRelations[0];
      trackRows = recordingTable.Rows[0].GetChildRows(relationship);
   }

   [Test]
   public void RecordingCount()
   {
      Assert.Equals(1, recordingTable.Rows.Count);
   }

   [Test]
   public void RecordingTitle()
   {
      DataRow recording = recordingTable.Rows[0];
      string title = (string)recording["title"];
      Assert.Equals("Up", title.Trim());
   }


   [Test]
   public void RecordingTrackRelationship()
   {
      Assert.Equals(10, trackRows.Length);
   }

   [Test]
   public void TrackContent()
   {
      DataRow track = trackRows[0];

      string title = (string)track["title"];
      Assert.Equals("Darkness", title.Trim());
   }

   [Test]
   public void InvalidRecording()
   {
      DataSet ds = Assembler.CreateRecordingDto(-1);
      Assert.Equals(0, ds.Tables["recording"].Rows.Count);
      Assert.Equals(0, ds.Tables["track"].Rows.Count);
   }
}

These tests describe how to access the individual elements of the DataSet. The tests themselves demonstrate some of the issues, in that you need to know the column names as well as the types of the objects. Because of this direct dependency, this code must change if the database schema changes. These types of issues are mitigated when you use a typed DataSet. For more information, see Implementing Data Transfer Object in .NET with a Typed DataSet.

Resulting Context

The following are the benefits and liabilities related to using a DataSet as a data transfer object:

Benefits

Development tool support. The DataSet class is implemented in ADO.NET, so there is no need to design and implement the data transfer object. There is also extensive support in the Microsoft Visual Studio version 6.0 development system for automating the creation and filling of DataSet objects.

Integration with controls. A DataSet works directly with the built-in controls in Windows Forms and Web Forms, making it a logical choice as a data transfer object.

Serialization. The DataSet comes with the ability to serialize itself into XML. Not only is the content serialized but the schema for the content is also present in the serialization.

Disconnected database model. The DataSet is a snapshot of the current contents of the database. This means that you can alter the contents of the DataSet and subsequently use the DataSet as the means to update the database.

Liabilities

Interoperability. Because the DataSet class is part of ADO.NET, it is not the best choice for a data transfer object in cases requiring interoperability with clients that are not running the .NET Framework. You can still use DataSet, however, the client will be forced to parse the XML and build its own representation. If interoperability is a requirement, see Implementing Data Transfer Object in .NET with Serialized Objects.

Stale data. The DataSet, as stated previously, is disconnected from the database. It is filled with a snapshot of the data in the database when it is constructed. This implies that the actual data in the database may be different from what is contained in the DataSet. For reading primarily static data, this is not a major issue. If the data is constantly changing, however, using a DataSet is not recommended.

Dependency on database schema. Because the DataSet is most often filled from the database, any code that references the column names depends on the database schema. Also, because the programmer must explicitly code the relationships between tables, if a relationship changes in the database, the code also has to be modified.

Potential for performance degradation. Instantiating and filling a DataSet can be expensive. Serializing and deserializing a DataSet can also be very time consuming. A good rule of thumb for using a DataSet is that a DataSet is a good choice when you are using more than one table or relying on the capability of the DataSet to update the database. If you are displaying the results from a single table and do not require the capabilities that a DataSet provides, you could consider using a DataReader to load strongly-typed objects, which may offer better performance.

Not type-safe. The values that you receive from the DataSet may have to be cast to the correct data type. This requires you to determine what the types are supposed to be. This can be tedious and error-prone because you have to inspect the DataSet type information explicitly. A typed DataSet, as described in "Working with a Typed DataSet" [Microsoft02], alleviates this issue by generating a strongly-typed DataSet subclass that inherits from the generic DataSet class.

Proliferation of two-tiered architecture. The convenience of using a DataSet can become a liability when it tempts developers to pass DataSets from the database directly to the user interface. This can couple the user interface tightly to the physical database schema. A number of mechanisms can help avoid this issue. For example, a DataSet can be filled from a stored procedure so that the DataSet structure is abstracted from the physical database schema. Alternatively, DataSets can be loaded from XML documents, which can be transformed using Extensible Stylesheet Language (XSL). This provides another level of indirection among user interface, business logic, and data storage.

Related Patterns

For more information, see the following related patterns:

Implementing Data Transfer Object in .NET with Serialized Objects.

Implementing Data Transfer Object in .NET with a Typed DataSet.

Assembler. In Patterns of Enterprise Application Architecture, Fowler defines Assembler as a specialized instance of the Mapper pattern [Fowler03].

Acknowledgments

[Fowler03] Fowler, Martin. Patterns of Enterprise Application Architecture. Addison-Wesley, 2003.

[Powell03] Powell, Matt. "DataSets, Web Services, DiffGrams, Arrays, and Interoperability." MSDN Library, February, 2003. Available at: http://www.msdn.microsoft.com/library/default.asp?url=/library/en-us/dnservice/html/service02112003.asp.