What is SSIS ?
- SSIS is a platform for data integration and workflow applications. It features a data warehousing tool used for data extraction, transformation, and loading (ETL). The tool may also be used to automate maintenance of SQL Server databases and updates to multidimensional cube data.
- SQL Server Integration Service (SSIS) is a component of the Microsoft SQL Server database software that can be used to execute a wide range of data migration tasks. SSIS is a fast & flexible data warehousing tool used for data extraction, loading and transformation like cleaning, aggregating, merging data, etc.
- It makes it easy to move data from one database to another database. SSIS can extract data from a wide variety of sources like SQL Server databases, Excel files, Oracle and DB2 databases, etc.
- SSIS also includes graphical tools & wizards for performing workflow functions like sending email messages, FTP operations, data sources, and destinations.
- Features of SSIS:
- Organized and lookup transformations
- Tight integration with other Microsoft SQL family
- Provides rich Studio Environments
- Provides a lot of data integration functions for better transformations
- High-speed data connectivity
Why SSIS?
- Extract, Transform, and Load (ETL) data from SQL Server to a file and also from file to SQL.
- Sending an email.
- Download the File from FTP.
- Rename ,Delete , Move File From Defined Path.
- It allows you to join tables from different databases (SQL, Oracle, etc...) and from potentially different servers.
How SSIS Works?
- SSIS consists of three major components, mainly:
- Operational Data:
A database created to combine data from several sources for further actions on the data is known as an operational data store (ODS). This is the place where most of the data used in the current operation is housed before it’s transferred to the data warehouse for longer-term storage or archiving.
- ETL process:
ETL is a process to Extract, Transform and Load the data. The process of gathering data from multiple sources, modifying it to fit your needs, and then loading it into a target data warehouse is known as extract, transform, and load, or ETL. ETL offers a ONE STOP Solution that can solve all of these issues.
Extract: Extraction is the process of extracting the data from various homogeneous or heterogeneous data sources based on different validation points.
Transformation: In transformation, entire data is analyzed and various functions are applied on it in order to load the data to the target database in a cleaned and general format.
Load: Loading is the process of loading the processed data to a target data repository using minimal resources.
- Data Warehouse
- Data Warehouse captures the data from diverse sources for useful analysis and access.
- Data warehousing is a large set of data accumulated which is used for assembling and managing data from various sources for the purpose of answering business questions. Hence, helps in making decisions.
How to install SSDT(Sql Server Data Tools)?
Prerequisite and environment Setup for SSIS Project
- For Starting SSIS we need 2 Studios
- SQL Server Data Tools (SSDT) for developing the Integration Services packages that a business solution requires. SQL Server Data Tools (SSDT) provides the Integration Services project in which you create packages.
Installation Steps:
-
- Download SSDT setup from Microsoft website.
URL: https://docs.microsoft.com/en-us/sql/ssdt/previous-releases-of-sql-server-data-tools-ssdt-and-ssdt-bi?view=sql-server-ver15
-
- When you open the .exe file, you will be asked to restart the system before installation. So, restart first and Run Setup. And press Next.
-
- It will show the tools required and the features such as SQL Server Database, SSAS(SQL Server Analysis Services), SSRS(SQL Server Reporting Services) and SSIS(SQL Server Integration Services). Make sure you check SSIS and click the “install” button. Refer the below screenshot for the same.
We will see following contents In SSIS:
- Variables
- Connection Manager
- SSIS Toolbox
- Data Flow Task
Variable:
Variables store values that a SSIS package and its containers, tasks, and event handlers can use at runtime.
System variables : Defined by Integration Services
- SSIS provides a set of system variables that store information about the running package and its objects. These variables can be used in expressions and property expressions to customize packages, containers, tasks, and event handlers.
User-Defined variables : Defined by Package Developers
How to create user - define variable?
How to set expression for variable
Connection Manager:
SSIS provides different types of connection managers that enable packages to connect to a variety of data sources and servers:
- There are built-in connection managers that Setup installs when you install Integration Services.
- There are connection managers that are available for download from the Microsoft website.
- You can create your own custom connection manager if the existing connection managers do not meet your needs.
Let's see how we can add Connection Manager.
1)Solution Explorer > Connection Managers > New Connection Manager .
You can see the list of connection managers for different type of connections.
2)Add connection manager.
After adding your connection. you can see the all connection here.
SSIS Toolbox:
Steps: Menu bar > SSIS > select SSIS Toolbox.
now, you can see SSIS Toolbox on the left side.
SSIS Toolbox have list of tasks and containers that you can perform.
List of Containers:
- For each Loop Container : Runs a control flow repeatedly by using an enumerator.
- For Loop Container : Runs a control flow repeatedly by testing a condition.
- Sequence Container : Groups tasks and containers into control flows that are subsets of the package control flow.
List of Task:
- Data Flow Task
The task that runs data flows to extract data, apply column level transformations, and load data.
- Data Preparation Tasks
These tasks do the following processes: copy files and directories; download files and data; run Web methods; apply operations to XML documents; and profile data for cleansing.
- Workflow Tasks
The tasks that communicate with other processes to run packages, run programs or batch files, send and receive messages between packages, send e-mail messages, read Windows Management Instrumentation (WMI) data, and watch for WMI events.
- SQL Server Tasks
The tasks that access, copy, insert, delete, and modify SQL Server objects and data.
- Scripting Tasks
The tasks that extend package functionality by using scripts.
- Analysis Services Tasks
The tasks that create, modify, delete, and process Analysis Services objects.
- Maintenance Tasks
The tasks that perform administrative functions such as backing up and shrinking SQL Server databases, rebuilding and reorganizing indexes, and running SQL Server Agent jobs. You can add task/container by dragging the task/container from SSIS toolbox to design area.
Data Flow Task :
Drag the Data Flow task from SSIS Toolbox to design area and double click on it.
You are now in Data flow tab.
Now you can see that SSIS Toolbox has different components.
Type:
- Source : from where you want your data.
- Destination : it is where you want to move your data.
- Transformation : It is Operation that perform ETL(Extract, Transform, Load)
Conclusion:
SQL Server Integration Services offer tasks to insert data into your destination and to alter and validate data during the load process.It's helpful to understand the various SSIS tasks and how to use them instead of using TReplace:SQL to develop a stored procedure to validate or modify data.
RELATED BLOGS:
Create SSIS Data Flow Task Package Programmatically