What is Data Migration?
Data migration is the process of transferring data from one data storage system to another. It also covers data transfers between different data formats and applications.
The data migration process also includes data preparation, extraction, and transformation. It is usually conducted when introducing new systems and processes in an organization.
The following are some common scenarios that require data migration:
- Replacement, upgrade, and expansion of storage systems and equipment
- Legacy software upgrade and replacement
- Firms moving from a local storage system to a cloud-based system to optimize operations
- Website consolidation
- Installation of new systems that coexist with and augment existing applications sharing the same dataset
- Infrastructure maintenance
- Switching to centralized databases to attain interoperability
- Consolidation of information systems
- Data center relocation
Types of Data Migration
There are six types of data migration, and a single migration project can involve more than one of them:
1. Storage Migration
Storage migration is where a business moves data from one storage location, or physical medium, to another. A common reason is to upgrade to more modern storage equipment. It therefore encompasses movement from paper to digital, tapes to hard disk drives (HDD), HDD to solid-state drives (SSD), and hardware-based storage to virtual (cloud-based) storage.
The movement is typically driven not by a lack of space but by a desire to upgrade storage technology. It normally does not alter the content or format of the data. During storage migration, steps such as data validation, cloning, data cleaning, and removal of redundant data can be carried out.
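As an illustration of the data validation step mentioned above, the following is a minimal Python sketch that copies a file to a new storage location and compares checksums before and after. The function names (sha256_of, migrate_file) are hypothetical and chosen only for this example; real storage migrations usually rely on vendor or platform tooling.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_file(source: Path, target: Path) -> bool:
    """Copy source to target and report whether the two checksums match."""
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)  # copy2 also tries to preserve file metadata
    return sha256_of(source) == sha256_of(target)
```

A matching checksum only shows that the bytes arrived unchanged, which fits the point that storage migration normally does not alter content or format.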
2. Database Migration
Databases are storage systems in which data is structured in an organized way, and they are managed through database management systems (DBMS). Database migration therefore involves either moving from one DBMS to another or upgrading from the current version of a DBMS to the latest version of the same DBMS. The former is more challenging, especially if the source system and the target system use different data structures.
3. Application Migration
Application migration occurs when an organization goes through a change in application software or changes an application vendor. This migration requires moving data from one computing environment to another. A new application platform may require radical transformation due to new application interactions after the migration.
The major challenge is that the old and target infrastructures often have distinct data models and use different data formats. Vendors may provide application programming interfaces (APIs) to protect data integrity, and vendor web interfaces can sometimes be scripted to facilitate data migration.
4. Cloud Migration
Cloud migration concerns the movement of data or applications from an on-premises location to the cloud or from one cloud environment to another. It is, in essence, a specific form of storage migration. IT experts continue to see an increase in cloud migration and forecast that the majority of major corporations will be operating in the cloud by 2030.
5. Business Process Migration
Business process migration requires the movement of business applications and data on business processes and metrics to a new environment. The metrics can include customer, product, and operational information. The migration is commonly prompted by business optimization, reorganization, or mergers and acquisitions (M&A). Such changes are often driven by the need to enter new markets and remain competitive.
6. Data Center Migration
Data center migration relates to the migration of data center infrastructure to a new physical location or the movement of data from the old data center infrastructure to new infrastructure equipment at the same physical location. A data center houses the data storage infrastructure, which maintains the organization’s critical applications. It consists of servers, network routers, switches, computers, storage devices, and related data equipment.
Data Migration Process
The data migration process should be well planned, seamless, and efficient so that it does not run over budget or drag on. It spans the planning, migration, and post-migration phases; the individual steps are listed after the overview of the two migration approaches in this section.
The data migration process can also follow the ETL process:
- Extraction of data
- Transformation of data
- Loading data
ETL tools can manage the complexities of the data migration process, such as processing huge datasets, profiling data, and integrating multiple application platforms.
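The three ETL stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the legacy CSV export, its column names, and the customers table are invented for the example, and an in-memory SQLite database stands in for the target system.

```python
import csv
import io
import sqlite3

# Hypothetical legacy export; in practice this would come from the source system.
LEGACY_CSV = """cust_id,full_name,signup
1, Ada Lovelace ,03/01/2021
2,Alan Turing,15/06/2020
"""


def extract(raw: str) -> list[dict]:
    """Extract: read rows from the legacy CSV export."""
    return list(csv.DictReader(io.StringIO(raw)))


def transform(rows: list[dict]) -> list[tuple]:
    """Transform: trim names and convert DD/MM/YYYY dates to YYYY-MM-DD."""
    records = []
    for row in rows:
        day, month, year = row["signup"].split("/")
        records.append(
            (int(row["cust_id"]), row["full_name"].strip(), f"{year}-{month}-{day}")
        )
    return records


def load(conn: sqlite3.Connection, records: list[tuple]) -> None:
    """Load: write the transformed records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, name TEXT NOT NULL, signup_date TEXT NOT NULL)"
    )
    with conn:  # commit on success, roll back if an insert fails
        conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", records)


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(LEGACY_CSV)))
print(conn.execute("SELECT * FROM customers ORDER BY id").fetchall())
```

Keeping the three stages as separate functions makes each one testable on its own, which is useful when the transformation rules change during the project.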
The data migration process remains the same whether a big bang approach or a trickle approach is adopted. A brief overview of the two approaches is given as follows:
1. Big Bang Data Migration Approach
The big bang data migration approach moves all data in one single operation from the current environment to the target environment. It is fast, less complex, and less costly than a phased migration. However, all systems are down and unavailable to users during the migration, so it should be conducted during public holidays or other periods when users are not expected to use the system.
These advantages are offset by the risk of an expensive failure, since a large volume of data can overwhelm the network during transmission. Because of this risk, the big bang approach is more suitable for small companies or for projects where the migration involves a small amount of data. Furthermore, it should not be used on systems that cannot sustain any downtime.
2. Trickle Data Migration Approach
The trickle data migration approach is a phased approach to data migration. Trickle data migration breaks down the migration process into sub-processes where data is transferred in small increments. The old system remains operational and runs parallel with the migration. The advantage is that there is no downtime in the live system, and it is less susceptible to errors and unexpected failures.
However, the iterative nature of the process makes it more complex, and it takes longer to complete. Throughout the process, data must be kept synchronized between the old system and the new environment. The trickle approach is ideal for organizations with large data volumes that cannot afford any downtime to their systems.
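A trickle migration can be pictured as a loop that moves one small batch at a time and records how far it has got. The sketch below is a simplified illustration, not a production design: the records and checkpoint tables, the batch size, and the use of in-memory SQLite databases for both systems are assumptions made for the example.

```python
import sqlite3

BATCH_SIZE = 2  # deliberately tiny so the demo needs several batches


def migrate_batch(source: sqlite3.Connection, target: sqlite3.Connection) -> int:
    """Copy the next batch of rows after the checkpoint; return rows moved."""
    last_id = target.execute("SELECT last_id FROM checkpoint").fetchone()[0]
    rows = source.execute(
        "SELECT id, payload FROM records WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, BATCH_SIZE),
    ).fetchall()
    if rows:
        with target:  # rows and checkpoint are committed together or not at all
            target.executemany("INSERT INTO records VALUES (?, ?)", rows)
            target.execute("UPDATE checkpoint SET last_id = ?", (rows[-1][0],))
    return len(rows)


# Old system with five rows; it stays readable while batches are copied.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
source.executemany(
    "INSERT INTO records VALUES (?, ?)", [(i, f"row-{i}") for i in range(1, 6)]
)
source.commit()

# New environment with an empty table and a checkpoint starting at zero.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
target.execute("CREATE TABLE checkpoint (last_id INTEGER NOT NULL)")
target.execute("INSERT INTO checkpoint VALUES (0)")
target.commit()

batches = 0
while migrate_batch(source, target):
    batches += 1
print(batches, target.execute("SELECT COUNT(*) FROM records").fetchone()[0])
```

Because the checkpoint is stored with the data, an interrupted run can resume from the last completed batch. The sketch only handles rows with increasing ids; updates or deletes made in the old system after a row is copied would need a separate synchronization mechanism.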
The steps of the data migration process are as follows:
- Pre-migration planning – The planning involves the evaluation of existing data sets for stability. An analysis of the source and target systems should be carried out. Data standards should also be set to help spot potential data problems. The decision on whether to use the big bang or trickle approach is also made at this stage. More crucially, it is where migration budgets, timelines, schedules, and deadlines are set.
- Data inspection – The data inspection stage involves inspecting the scope of the data that is to be migrated in terms of quality, anomalies, or any possible conflicts and duplications. Software application tools can be used to clean the data if the volume warrants it.
- Data backup – This stage involves backing up all data that is to be migrated to guard against a migration failure that could lead to data loss. It is a prudent measure that greatly reduces the risk of losing data.
- Migration process design – This stage stipulates the migration testing procedures, acceptance criteria, and personnel responsibilities. Hiring an ETL developer or data engineer to take charge of the process is also part of this stage, as is identifying and hiring other specialists such as system analysts and business analysts.
- Execute and validate – Here, the migration is initiated and rolled out, and the extraction, transformation, and loading (ETL) processes go live. The duration depends on the volume of data involved and the data migration approach chosen. It is essential to monitor the process for any sign of failure and, if the trickle approach is selected, for downtime in the old system. Continuous communication with business units is also paramount during the migration. Finally, the migration should be validated to confirm that it was executed according to the set guidelines and that the data in the new environment is complete and viable for business use.
- Decommission and monitor – A post-migration step in which the old system is shut down and decommissioned, and the new environment is monitored.
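The validation in the execute and validate step can be partly automated. The following Python sketch compares a row count and a content fingerprint for the same table in the old and new systems. It assumes both sides expose the table with identical columns and value types; the helper names and the customers table are hypothetical.

```python
import hashlib
import sqlite3


def table_fingerprint(conn: sqlite3.Connection, table: str) -> tuple[int, str]:
    """Return (row count, SHA-256 over all rows ordered by the first column)."""
    digest = hashlib.sha256()
    count = 0
    # The table name is interpolated, so it must come from trusted code.
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY 1"):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def tables_match(source: sqlite3.Connection, target: sqlite3.Connection, table: str) -> bool:
    """True when both systems hold the same rows for the given table."""
    return table_fingerprint(source, table) == table_fingerprint(target, table)


def make_db(rows: list[tuple]) -> sqlite3.Connection:
    """Build a small in-memory database for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    conn.commit()
    return conn


old_system = make_db([(1, "Ada"), (2, "Alan")])
new_system = make_db([(2, "Alan"), (1, "Ada")])  # same rows, different insert order
print(tables_match(old_system, new_system, "customers"))
```

If the migration transforms the data, the comparison would have to be made after applying the same transformation to the source rows, since the raw fingerprints would otherwise differ by design.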
Data Migration Best Practices
Certain best practices should be observed when conducting a data migration to keep the process seamless, improve the chance of success, and avoid costly delays.
- A dedicated migration team should be set up with the right specialists in place to manage and steer the project.
- Data migration should be treated as a chance to clean data and raise its quality before it is transferred, so that the new system does not inherit poor-quality data with old problems.
- The amount of data to be migrated should be right-sized as much as possible. Data cleaning can assist in ensuring that only quality and useful data is migrated.
- All data should be profiled before writing mapping scripts.
- Back up data before the migration begins to guard against data loss.
- Keep testing the migration from planning and design stages to execution and maintenance to guarantee the success of the migration project.
- The old system should only be switched off after the migration is confirmed to be successful. If the migration fails, it can then be rolled back without downtime because the old system is still running.
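Profiling data before writing mapping scripts can start with very simple checks. The sketch below counts missing values per field and finds duplicate key values in a list of records. The profile function, the sample records, and the choice of id as the key are assumptions for illustration only.

```python
from collections import Counter


def profile(rows: list[dict], key: str) -> dict:
    """Summarize missing values per field and duplicated values of `key`."""
    missing = Counter()
    for row in rows:
        for field, value in row.items():
            if value is None or str(value).strip() == "":
                missing[field] += 1
    key_counts = Counter(row[key] for row in rows)
    duplicates = sorted(k for k, n in key_counts.items() if n > 1)
    return {"rows": len(rows), "missing": dict(missing), "duplicate_keys": duplicates}


# Hypothetical sample with one blank email, one null email, and a repeated id.
sample = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": None},
]
report = profile(sample, key="id")
print(report)
```

A report like this helps decide which records to clean or leave behind, which supports right-sizing the data before it is moved.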
Data Migration Risks
Data migration risks include the following:
- Losing data – Data can be lost during migration; hence, it is crucial to back up and plan the migration diligently with help from professionals.
- Prolonged migration time – Data migration can take a long time, from a few months to several years, and can be prolonged further if the process encounters network bottlenecks that slow transmission. Connection speeds and infrastructure limitations can also affect the progress of the migration.
- Data security – Data should be encrypted before migration to protect it during the transfer.
- Breaking the budget – Prolonged migration can lead to breaking the budget. Personnel and vendor software costs may outstrip budgeted amounts, leading to financial challenges that can threaten the success of the migration process.
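The effect of connection speed on migration time can be estimated with simple arithmetic before the project starts. The function below is a rough sketch: the efficiency factor (the share of nominal bandwidth actually achieved) is an assumed parameter, and the estimate covers raw transfer time only, not extraction, transformation, or validation.

```python
def estimate_transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to move data_gb gigabytes over a link_mbps link.

    efficiency is an assumed fraction of the nominal bandwidth that is usable.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be positive and efficiency in (0, 1]")
    bits = data_gb * 8 * 1000**3  # decimal gigabytes to bits
    seconds = bits / (link_mbps * 1000**2 * efficiency)
    return seconds / 3600


# Hypothetical case: 450 GB over a fully utilized 1,000 Mbps link.
print(estimate_transfer_hours(450, 1000, efficiency=1.0))
```

Running the estimate for a few bandwidth and efficiency values gives a range that can feed into the timelines and budgets set during planning.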
Data Migration Tools
Examples of data migration tools include the following, among others:
- IBM InfoSphere
- Microsoft SQL Server
- Oracle Data Service Integrator
- Informatica PowerCenter
- IRI NextForm
- AWS Database Migration Service
- Azure DocumentDB
- Talend Open Studio