According to reports from the most prestigious international information technology analyst firms, cloud computing represents the next wave in the IT world, not only for organizations but for every one of us, today and in the near future. Data processing runs as a service in a data center, either your own or a third party's, and is contracted for as long as it is needed. This follows the pay-per-use pattern, a paradigm that adapts closely to the specific needs, and the greater or lesser demand for resources, of organizations and of the final users of this technology.
This service-based paradigm is organized around three main types, depending on the level of resources required by organizations: Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS). Together they cover virtually all the current needs of any consumer, from an individual or end user (SaaS and PaaS) to a large corporation, through public administrations and small and medium enterprises (see Figure 1).
Microsoft's cloud service proposal in this regard is called Windows Azure. It is organized as PaaS, now a reality, and as SaaS with Office 365, a cloud office suite launched internationally in recent weeks.
This article focuses on the database-as-a-service model, named SQL Azure™ by Microsoft, and highlights various differences with respect to its flagship database, SQL Server. SQL Azure™ can be likened to renting a relational database in the cloud, to be used as needed in an open environment (Public Cloud, Web and mobile applications) or in a corporate environment (Private Cloud), either on its own or integrated with the rest of the databases in a company's data center.
Whether you migrate an on-premise database entirely to the cloud, or move only a few databases and keep working with other database systems or with your own data in SQL Server (On-Premises), the scenarios are plentiful for technologies that follow either a low-cost service model or a pay-as-you-go system.
What Is Microsoft SQL Azure™?
According to Microsoft, SQL Azure™ is a highly available and scalable cloud database service built on SQL Server technologies. When we talk about a database service, we usually think of a consumer who pays for the use of that service, without reflecting on how powerful the database server (hardware resources) should be, or on how many physical servers and human resources the organization needs to guarantee a high percentage of service availability. The SQL Azure service provides high availability and fault tolerance without requiring server/hardware administration (at least in the way you traditionally take care of a DBMS physically installed in your data centers). Figure 2 shows how to manage your virtual servers and databases from one central Web point, the SQL Azure Database Management Portal (SADMP).
Current Scenarios for Working with SQL Azure™
One of the first decisions to make regarding the use of a relational database in the cloud is which database to use and why. Many individuals worldwide are looking at SQL Azure™ and deciding which part of their business would benefit most from its use. The best answer, of course, is that choosing a cloud database model is not only a question of whether or not the service should be in the cloud; it should be part of a planned process to achieve a business goal.
Successful companies of all sizes make reasoned, well-thought-out decisions as part of a common corporate strategy: analyzing and choosing the part of a total process whose move represents real savings of both money and other resources. Now, a great many of them are analyzing uses of SQL Azure™ (as evidenced on the Windows Azure evidence site) as they attempt to determine the best scenario for their unique needs. Perhaps you are among these analysts.
Environments for Working with SQL Azure™
Once you have decided to go with an SQL Azure™ database, the first question you'll have is how best to interact with it: how you can reach, view, manage and administer your cloud databases. This can be accomplished by using the Windows Azure Portal (Figure 3). From here you can create and manage your database servers, firewall rules and SQL Azure databases. Once these are set up, you can use various tools to administer and manage your data:
SQL Azure™ Database Management Portal (SADMP). A Silverlight-based application that allows you to connect to a database and perform basic management tasks using visual tools (limited, and not very powerful): creating tables, stored procedures, databases, and so on. It also provides a console for executing T-SQL commands, but it is quite basic.
SQL Server Management Studio (SSMS) 2008 R2. Probably the best option for the DBA: all the power of SSMS for connecting to, querying and scripting (although not as fully as against an On-Premise database) a SQL Azure™ database. This version incorporates tools for migrating schemas and (basic) data from SQL Server On-Premise to SQL Azure™.
Microsoft Visual Studio 2008/2010. An option for database developers. Some functionality has been added to both versions of Visual Studio for connecting to and accessing SQL Azure™ databases and database objects, not only from the Server Explorer but programmatically (ADO.NET, ODBC). An interesting option is the use of DAC project templates, which allow developers and DBAs to create and migrate SQL schemas to SQL Azure™ and maintain a relationship between DAC application versions, managed from the Visual Studio DAC project.
Your own applications: via the SQL Azure™ client APIs, ADO.NET, ODBC, etc. This represents lower-level access for developers creating applications (desktop, Web, mobile, etc.) that create and consume data in SQL Azure™ sources, using the popular data access objects in ADO.NET. A tool such as Visual Studio 2010 for writing the code is advisable to ensure a better experience when building this kind of application.
Tools for Migrating Data to SQL Azure™ (Schemas & Data)
One way to begin your use of SQL Azure™ is through an existing application. There are subtle differences between SQL Server and SQL Azure™, as they are two diverse approaches to data management, one On-Premise and the other as a service in the cloud. Knowing this, you will need some kind of migration path from your existing database to SQL Azure™.
Before going any further, you should be aware that these instructions assume you are migrating a SQL Server database. If you are dealing with other databases, take time to check out the various tools available in SQL Server Migration Assistant (SSMA), a free application intended for Oracle, Microsoft Access, Sybase, MySQL and other non-Microsoft databases.
If you have SQL Server 2005/2008 databases in production, you must first distinguish the two main elements of the migration process: schema migration and data migration. We will look at both processes and the most suitable tools available (currently, only one tool takes proper care of both schema and data migration in all circumstances).
The following process of migrating an existing database to SQL Azure™ is considered the official SQL Azure™ approach, according to the SQL Azure Team Blog and the Hands On Labs provided by Microsoft.
1. Provision the destination SQL Azure™ database
2. Extract the origin database schema into DDL script
3. Validate DDL Script (if possible without brute force)
4. Execute origin DDL script against SQL Azure™
5. Disable foreign key constraints (you can load data in any order)
6. Disable non-clustered indexes (improves the data load performance)
7. Load data (export/import) from origin to destination (best choice BCP/SSIS)
8. Rebuild non-clustered indexes
9. Enable foreign key constraints
10. Validate schema/data migration
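Steps 5-6 and 8-9 of the process above can be sketched in T-SQL. The table, constraint and index names below are hypothetical; adapt them to each table in your own schema:

```sql
-- Step 5: disable a foreign key constraint so data can load in any order.
ALTER TABLE dbo.Orders NOCHECK CONSTRAINT FK_Orders_Customers;

-- Step 6: disable a non-clustered index to speed up the data load.
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders DISABLE;

-- ... step 7: load the data (BCP/SSIS) ...

-- Step 8: rebuild the non-clustered index.
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REBUILD;

-- Step 9: re-enable the foreign key, validating the rows loaded meanwhile.
ALTER TABLE dbo.Orders WITH CHECK CHECK CONSTRAINT FK_Orders_Customers;
```

Note that the clustered index must never be disabled this way, since SQL Azure™ requires it for inserting rows.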
When determining whether or not to migrate a database to SQL Azure™, you should first be aware of which SQL Server features are not supported, or which ones behave differently, in SQL Azure™. For example, the data type GUID is not supported in SQL Azure™, and SQL Azure™ requires that every table have a clustered index before new rows can be inserted. These are only two of the differences between the database engines; a more complete list of differences resides at MSDN Online.
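As an illustration of the clustered index requirement (table, column and index names here are hypothetical):

```sql
-- A heap (a table without a clustered index) can be created in SQL Azure,
-- but rows cannot be inserted into it until a clustered index exists.
CREATE TABLE dbo.EventLog (EventID int NOT NULL, Message nvarchar(200));

-- INSERT INTO dbo.EventLog VALUES (1, N'started');  -- fails: no clustered index

CREATE CLUSTERED INDEX CIX_EventLog ON dbo.EventLog (EventID);

INSERT INTO dbo.EventLog VALUES (1, N'started');     -- now succeeds
```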
There are two possibilities for obtaining the schema of an existing SQL Server database in order to create DDL scripts for SQL Azure™: the Generate and Publish Scripts contextual menu in SSMS, and a Visual Studio 2010 DAC project connected to an On-Premise SQL Server, extracting this information and making the proper corrections to prepare it for the cloud.
Generate and Publish Scripts (SSMS)
After having the SQL Azure™ account, server and database prepared for the migration, your first step involves preparing the DDL script with correct SQL Azure™ code for creating the same schema in the destination database.
You can use a couple of helpful tools for this part of the process, primarily one of the export features of SQL Server Management Studio 2008 R2, the Generate and Publish Scripts feature, which supports exporting schemas and small sets of data to SQL Azure™ (under the database contextual menu, Tasks > Generate Scripts). Take into account that XML Schema collections and data types are not allowed in SQL Azure™, so carefully choose which objects of your database you want to export to the cloud. In Scripting options, select Advanced options (Figure 3) in the General section, then set the option Script for the database engine type to SQL Azure Database. This selection disables some of the characteristics of the dialog that are not supported in SQL Azure™ (e.g., extended properties, script logins, the USE command, and so on).
Here's a practical consideration: before going further with the advanced options for script generation, be sure you understand the most important features not supported in SQL Azure™. The export process may finish with a few errors. Print a copy of the logs to review these errors and determine which features are essential for your database or application. You may need to make a couple of programming changes before going to the cloud with Azure (e.g., if your database uses xp_cmdshell or runs bulk copy operations directly from stored procedures).
Now, validate the corrections in the final DDL script. Be sure you understand which errors occurred and why, in order to solve them, and document them for future use in the migration process; this will ensure you don't have to create, delete and re-create the schema many times. Remember that using SQL Azure™ Database as a service is billable, and pricing depends on both inbound and outbound traffic.
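Executing the validated DDL script against the SQL Azure™ database (step 4 of the process) can be done from the command line with the sqlcmd utility. The server, database, login and file names below are placeholders:

```
sqlcmd -S tcp:myserver.database.windows.net -d MyAzureDB ^
       -U mylogin@myserver -P mypassword -i migration_ddl.sql
```

Remember that the SQL Azure™ firewall rules defined in the Windows Azure Portal must allow connections from the machine running the script.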
DAC Project (VS2010 Team Edition)
Another interesting possibility, most suitable for developers and DBAs accustomed to working with scripts in this development environment, is to use the DAC template available in Visual Studio 2010 Ultimate Edition for connecting to an On-Premise database and extracting the schema and object definitions. Once this has been accomplished, you can work on the scripts for detecting and solving possible sources of incompatibility.
If you decide to use this approach to create your DDL scripts for SQL Azure™, consider that these VS projects do not support the ALTER TABLE T-SQL command; have all your constraints defined directly in your CREATE scripts.
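In practice, this means defining keys and constraints inline. A minimal sketch, with hypothetical table and constraint names:

```sql
-- Constraints declared inside CREATE TABLE, rather than added
-- afterwards with ALTER TABLE (not supported by these VS projects):
CREATE TABLE dbo.Orders
(
    OrderID    int      NOT NULL,
    CustomerID int      NOT NULL,
    OrderDate  datetime NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID),
    CONSTRAINT FK_Orders_Customers FOREIGN KEY (CustomerID)
        REFERENCES dbo.Customers (CustomerID)
);
```

Declaring the primary key as CLUSTERED also satisfies the SQL Azure™ clustered index requirement in one step.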
Also note that since this is an early version, there are many objects not yet supported for this kind of application using the Import Data-tier Application Wizard.
On the other hand, you can take advantage of several useful features that help you detect and correct problems in your script code, in addition to IntelliSense on database objects, a good friend when writing code, not only for .NET applications but also for T-SQL. One of these features is static Code Analysis, which is turned off by default. You can turn it on from the project properties window, using the Code Analysis tab (Figure 5); there you can enable various rule sets, such as Data.Design, Data.Naming, Data.Performance and so forth.
When using Visual Studio DAC projects to create the initial schema for exporting to SQL Azure™, you are also adopting a tool for maintaining this schema during the lifecycle of your database application, not only for the schema migration process but for keeping all the needed database definitions together. Having all your code at one single point of administration is very useful for future modifications and for consistency, a key aspect when several-member teams develop many parts of an application.
On the other hand, one disadvantage of this approach is that you don't have tools for detecting problems related to feature incompatibilities between On-Premise T-SQL and T-SQL in SQL Azure™, or other cloud issues; you need to know the known limitations of SQL Azure™ and avoid generating a noncompliant schema before executing the schema migration. Here again, the validation process is important: check your schema objects to detect possible sources of conflict.
If you have been following the steps of the process, you should now have a new database with all the needed objects created in SQL Azure™, but as yet unfilled. Now is the time to populate the database. We have mentioned a couple of the possibilities open to you. Although Generate and Publish Scripts (SSMS) also has the option of exporting data from one database to another via the “Export Data” feature, it is only recommended for very small sets of data.
You can transfer data to the SQL Azure Database by using the following:
SQL Server 2008 Integration Services (SSIS)
The bulk copy utility (BCP.exe)
Scripts that use INSERT statements to load data into the database
SQL Azure™ Data Sync (CTP2). Consider this as a synchronization tool more than a tool for migrating data. (It will be detailed in a future article in SolidQ Journal.)
If your application has thousands or millions of rows in its tables, a better approach is to use BCP or SSIS, the two most effective tools for loading data into SQL Azure™.
At SolidQ, we have created a command-line application called SolidQ.BCP that uses the System.Data.SqlClient.SqlBulkCopy class, with load results similar to BCP; however, it reads the origin data directly from SQL Server 2008 or 2008 R2 database tables instead of using text data files (as BCP.exe does), saving a few process steps. (Thanks go to our colleague Enrique Catalá for his big efforts in making such a tool possible.) The following are our recommended tools and best practices, based upon our experimentation with data migration to SQL Azure™.
BCP and Best Practices
The bulk copy utility (BCP.exe) is a command-line tool for generating text data files from a SQL Server database and uploading that data to a destination database, in our case SQL Azure™. First, we exported from the original database to a data file; second, we imported from the data file into the SQL Azure database, with the corresponding objects already created (remember to create the clustered index on each table before trying to insert rows, if you don't want to get an error!). In the following code sequence, you can see these two processes.
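A sketch of the two bcp invocations follows; the server, database, table and login names are placeholders, and -n selects native data format:

```
REM 1) Export the table from the On-Premise database to a data file:
bcp SourceDB.dbo.Orders out Orders.dat -S localhost -T -n

REM 2) Import the data file into the already-created SQL Azure table:
bcp AzureDB.dbo.Orders in Orders.dat -n ^
    -S tcp:myserver.database.windows.net -U mylogin@myserver -P mypassword
```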
In various projects and tests, we obtained an average of 187 rows/sec using a laptop with 3 GB of RAM and a dual-core processor, with a 1 Mb Internet connection, uploading data in one single process. With the same configuration and two parallel processes (over the same table), we obtained the same average as for one process; in that case, each process was in charge of half the data in the table (in the example with one table, 587,000 data rows were uploaded).
In a more complex scenario, we uploaded, with the same configuration, 17 GB of data to a SQL Azure database in 22 hours 37 minutes.
How could this performance be improved? We shortened these figures considerably by using an Azure-side approach, with the assistance of a (medium-sized) VM Role machine in Windows Azure, delegating the uploading process and computing to it. The result was surprising: the uploading process for the same database (17 GB) took only 5 hours 33 minutes, and we achieved an average of approximately 9,989 rows/second.
SSIS and Best Practices
SQL Server Integration Services (SSIS) is another popular technology for moving data (and even transforming it) from On-Premise to SQL Azure™. One of the strengths of this method is that you can detect some sources of conflict in data types, as well as apply data transformations for SQL Azure™ compatibility, during the active process of data copying.
In our trials, we ran a migration process for an example data table with 587,000 rows, using a simple package that read data from a SQL Server 2008 R2 database and wrote it directly into SQL Azure™ as our workbench test. The origin data was in the correct format, without any incompatibility issues. The test was conducted on the same machine described in the preceding section, with two different data destinations (ADO NET and OLE DB Destination), and the same result was achieved in both cases: an average of 185 rows/sec.
Is it possible to improve performance? Of course. In our test case, we took advantage of a VM Role in Azure to launch the same SSIS project, but in a medium instance machine (4 processors, 3.7 GB RAM), with results similar to the BCP-from-VM-Role test: about 9,550 rows/sec.
We are now working on parallelizing this SSIS approach to improve performance as much as possible in both scenarios, On-Premise servers and VM Role/Worker Role, as well as on defining different tests, not only on medium instances but on large and extra-large instances too, in order to verify the impact of processor and RAM on SQL Azure™ loading processes. We expect to have a more in-depth description of this experiment in future issues of SolidQ Journal and, if possible, at our annual Solid Quality Summit.
Automating the Migration Process: SQL Azure™ Migration Wizard (SQLAzureMW)
The open source tool that joins this whole process together is SQL Azure Migration Wizard (SQLAzureMW), available for download on CodePlex. This tool is in constant review and development. It is a Windows desktop application that takes care of the entire process, from the selection of the origin database to the migration of the data into SQL Azure: validating and proposing alternative data types, creating scripts, and highlighting the possibly conflictive points of the code used for this task. It will still require reviewing on your part, due to its potential for false positives.
Although this tool has many advantages and is highly recommended, even by the SQL Azure™ team, you will need to take into consideration, as with all automated processes, that you may lose some control over the process. More importantly, you must review and validate the resulting structures before going to the cloud. For example, if there is no clustered index in an origin table (needed in the destination database), the tool will create one for you, probably on the primary key, even though this may not be the best choice, depending on your own data and your particular business.
In this article, we have presented only the big picture of migrating an On-Premise database to Microsoft's cloud data platform, SQL Azure™. We have introduced the model of a database as a service and described some of the advantages of this model, a rational and accessible one for public and private companies of various sizes. Users do not need to purchase an expensive license, but can use the database engine however and whenever it is needed.
Following a general process for the migration of existing databases to SQL Azure™, we have described the pros and cons of various tools and techniques for migrating the schema objects (through the creation and management of DDL scripts) and for migrating the data with the recommended tools BCP and SSIS; we have provided useful information for loading data with good performance using a VM Role in Azure, and for improving cloud loads made from your own data centers. Finally, we have introduced the popular tool SQLAzureMW for automating the migration process, available for free from CodePlex, which will assist you in moving your database onto the cloud on the Azure platform. You can take advantage of this tool even from the Windows Azure platform, to improve your data migration processes and shorten the time required, especially if you have some kind of Web Role, Worker Role or VM Role that could launch it from Azure.
About the Author
Miguel López (blog | twitter) is a director of Cloud Computing at SolidQ Spain & Portugal. He has more than 13 years of broad IT expertise tied to Microsoft technologies. He has been a Visual Basic developer, analyst, consultant and trainer in workflow technologies. He has published a couple of books, is a speaker at TechEd, has participated in Microsoft's national and international events, and has published a course with MSDN on Microsoft Expression Web.