Category - SSIS

Basics of SSIS(SQL Server Integration Service)
Sep 14, 2020

What is SSIS ? SSIS is a platform for data integration and workflow applications. It features a data warehousing tool used for data extraction, transformation, and loading (ETL). The tool may also be used to automate maintenance of SQL Server databases and updates to multidimensional cube data. SQL Server Integration Service (SSIS) is a component of the Microsoft SQL Server database software that can be used to execute a wide range of data migration tasks. SSIS is a fast & flexible data warehousing tool used for data extraction, loading and transformation like cleaning, aggregating, merging data, etc. It makes it easy to move data from one database to another database. SSIS can extract data from a wide variety of sources like SQL Server databases, Excel files, Oracle and DB2 databases, etc. SSIS also includes graphical tools & wizards for performing workflow functions like sending email messages, FTP operations, data sources, and destinations. Features of SSIS: Organized and lookup transformations Tight integration with other Microsoft SQL family Provides rich Studio Environments Provides a lot of data integration functions for better transformations High-speed data connectivity   Why SSIS? Extract, Transform, and Load (ETL) data from SQL Server to a file and also from file to SQL. Sending an email. Download the File from FTP. Rename ,Delete , Move File From Defined Path. It allows you to join tables from different databases (SQL, Oracle, etc...) and from potentially different servers.   How SSIS Works? SSIS consists of three major components, mainly: Operational Data: An operational data store (ODS) is a database designed to integrate data from multiple sources for additional operations on the data. This is the place where most of the data used in the current operation is housed before it’s transferred to the data warehouse for longer-term storage or archiving. ETL process: ETL is a process to Extract, Transform and Load the data. Extract, Transform and Load (ETL) is the process of extracting the data from various sources, transforming this data to meet your requirement and then loading into a target data warehouse. ETL provides a ONE STOP SOLUTION for all these problems. Extract: Extraction is the process of extracting the data from various homogeneous or heterogeneous data sources based on different validation points. Transformation: In transformation, entire data is analyzed and various functions are applied on it in order to load the data to the target database in a cleaned and general format. Load: Loading is the process of loading the processed data to a target data repository using minimal resources. Data Warehouse Data Warehouse captures the data from diverse sources for useful analysis and access. Data warehousing is a large set of data accumulated which is used for assembling and managing data from various sources for the purpose of answering business questions. Hence, helps in making decisions.   How to install SSDT(Sql Server Data Tools)? Prerequisite and environment Setup for SSIS Project For Starting SSIS we need 2 Studios SQL Server Data Tools (SSDT) for developing the Integration Services packages that a business solution requires. SQL Server Data Tools (SSDT) provides the Integration Services project in which you create packages. Installation Steps: Download SSDT setup from Microsoft website.   URL: https://docs.microsoft.com/en-us/sql/ssdt/previous-releases-of-sql-server-data-tools-ssdt-and-ssdt-bi?view=sql-server-ver15 When you open the .exe file, you will be asked to restart the system before installation. So, restart first and Run Setup. And press Next. It will show the tools required and the features such as SQL Server Database, SSAS(SQL Server Analysis Services), SSRS(SQL Server Reporting Services) and SSIS(SQL Server Integration Services). Make sure you check SSIS and click the “install” button. Refer the below screenshot for the same.   We will see following contents In SSIS: Variables Connection Manager SSIS Toolbox Container Tasks Data Flow Task   Variable: Variables store values that a SSIS package and its containers, tasks, and event handlers can use at runtime.   System variables : Defined by Integration Services SSIS provides a set of system variables that store information about the running package and its objects. These variables can be used in expressions and property expressions to customize packages, containers, tasks, and event handlers.   User-Defined variables : Defined by Package Developers   How to create user - define  variable?   How to set expression for variable   Connection Manager: SSIS provides different types of connection managers that enable packages to connect to a variety of data sources and servers: There are built-in connection managers that Setup installs when you install Integration Services. There are connection managers that are available for download from the Microsoft website. You can create your own custom connection manager if the existing connection managers do not meet your needs.   Let's see how we can add Connection Manager. 1)Solution Explorer > Connection Managers > New Connection Manager . You can see the list of connection managers for different type of connections.   2)Add connection manager.   After adding your connection. you can see the all connection here.   SSIS Toolbox: Steps: Menu bar > SSIS > select SSIS Toolbox. now, you can see SSIS Toolbox on the left side. SSIS Toolbox have list of tasks and containers that you can perform.   List of Containers: For each Loop Container : Runs a control flow repeatedly by using an enumerator. For Loop Container : Runs a control flow repeatedly by testing a condition. Sequence Container : Groups tasks and containers into control flows that are subsets of the package control flow.   List of Task: Data Flow Task The task that runs data flows to extract data, apply column level transformations, and load data.   Data Preparation Tasks These tasks do the following processes: copy files and directories; download files and data; run Web methods; apply operations to XML documents; and profile data for cleansing.   Workflow Tasks The tasks that communicate with other processes to run packages, run programs or batch files, send and receive messages between packages, send e-mail messages, read Windows Management Instrumentation (WMI) data, and watch for WMI events.   SQL Server Tasks The tasks that access, copy, insert, delete, and modify SQL Server objects and data.   Scripting Tasks The tasks that extend package functionality by using scripts.   Analysis Services Tasks The tasks that create, modify, delete, and process Analysis Services objects.   Maintenance Tasks The tasks that perform administrative functions such as backing up and shrinking SQL Server databases, rebuilding and reorganizing indexes, and running SQL Server Agent jobs. You can add task/container by dragging the task/container from SSIS toolbox to design area.   Data Flow Task : Drag the Data Flow task from SSIS Toolbox to design area and double click on it. You are now in Data flow tab. Now you can see that SSIS Toolbox has different components.   Type: Source : from where you want your data. Destination : it is where you want to move your data. Transformation : It is Operation that perform ETL(Extract, Transform, Load)   Conclusion: SQL Server Integration Services provide tasks to transform and validate data during the load process and transformations to insert data into your destination. Rather than create a stored procedure with T-SQL to validate or change data, is good to know about the different SSIS tasks and how they can be used.   RELATED BLOGS: Create SSIS Data Flow Task Package Programmatically 

Create SSIS Data Flow Task Package Programmatically
Jul 27, 2020

In this article, we will review how to create a data flow task package of SSIS in Console Application with an example. Requirements Microsoft Visual Studio 2017 SQL Server 2014 SSDT Article  Done with the above requirements? Let's start by launching Microsoft Visual Studio 2017. Create a new Console Project with .Net Core.  After creating a new project, provide a proper name for it. In Project Explorer import relevant references and ensure that you have declared namespaces as below: using Microsoft.SqlServer.Dts.Pipeline.Wrapper; using Microsoft.SqlServer.Dts.Runtime; using RuntimeWrapper = Microsoft.SqlServer.Dts.Runtime.Wrapper;   To import above namespaces we need to import below refrences.   We need to keep in mind that, above all references should have same version.   After importing namespaces, ask user for the source connection string, destination connection string and table that will be copied to destination. string sourceConnectionString, destinationConnectionString, tableName; Console.Write("Enter Source Database Connection String: "); sourceConnectionString = Console.ReadLine(); Console.Write("Enter Destination Database Connection String: "); destinationConnectionString = Console.ReadLine(); Console.Write("Enter Table Name: "); tableName = Console.ReadLine();   After Declaration, create instance of Application and Package. Application app = new Application(); Package Mipk = new Package(); Mipk.Name = "DatabaseToDatabase";   Create OLEDB Source Connection Manager to the package. ConnectionManager connSource; connSource = Mipk.Connections.Add("ADO.NET:SQL"); connSource.ConnectionString = sourceConnectionString; connSource.Name = "ADO NET DB Source Connection";   Create OLEDB Destination Connection Manager to the package. ConnectionManager connDestination; connDestination= Mipk.Connections.Add("ADO.NET:SQL"); connDestination.ConnectionString = destinationConnectionString; connDestination.Name = "ADO NET DB Destination Connection";   Insert a data flow task to the package. Executable e = Mipk.Executables.Add("STOCK:PipelineTask"); TaskHost thMainPipe = (TaskHost)e; thMainPipe.Name = "DFT Database To Database"; MainPipe df = thMainPipe.InnerObject as MainPipe;   Assign OLEDB Source Component to the Data Flow Task. IDTSComponentMetaData100 conexionAOrigen = df.ComponentMetaDataCollection.New(); conexionAOrigen.ComponentClassID = "Microsoft.SqlServer.Dts.Pipeline.DataReaderSourceAdapter, Microsoft.SqlServer.ADONETSrc, Version=14.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91"; conexionAOrigen.Name = "ADO NET Source";   Get Design time instance of the component and initialize it. CManagedComponentWrapper instance = conexionAOrigen.Instantiate(); instance.ProvideComponentProperties();   Specify the Connection Manager. conexionAOrigen.RuntimeConnectionCollection[0].ConnectionManager = DtsConvert.GetExtendedInterface(connSource); conexionAOrigen.RuntimeConnectionCollection[0].ConnectionManagerID = connSource.ID;   Set the custom properties. instance.SetComponentProperty("AccessMode", 0); instance.SetComponentProperty("TableOrViewName", "\"dbo\".\"" + tableName + "\"");   Reinitialize the source metadata. instance.AcquireConnections(null); instance.ReinitializeMetaData(); instance.ReleaseConnections();   Now, Add Destination Component to the Data Flow Task. IDTSComponentMetaData100 conexionADestination = df.ComponentMetaDataCollection.New(); conexionADestination.ComponentClassID = "Microsoft.SqlServer.Dts.Pipeline.ADONETDestination, Microsoft.SqlServer.ADONETDest, Version=14.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91"; conexionADestination.Name = "ADO NET Destination";   Get Design time instance of the component and initialize it. CManagedComponentWrapper instanceDest = conexionADestination.Instantiate(); instanceDest.ProvideComponentProperties();   Specify the Connection Manager. conexionADestination.RuntimeConnectionCollection[0].ConnectionManager = DtsConvert.GetExtendedInterface(connDestination); conexionADestination.RuntimeConnectionCollection[0].ConnectionManagerID = connDestination.ID;   Set the custom properties. instanceDest.SetComponentProperty("TableOrViewName", "\"dbo\".\"" + tableName + "\"");   Connect the source to destination component: IDTSPath100 union = df.PathCollection.New(); union.AttachPathAndPropagateNotifications(conexionAOrigen.OutputCollection[0], conexionADestination.InputCollection[0]);   Reinitialize the destination metadata. instanceDest.AcquireConnections(null); instanceDest.ReinitializeMetaData(); instanceDest.ReleaseConnections();   Map Source input Columns and Destination Columns foreach (IDTSOutputColumn100 col in conexionAOrigen.OutputCollection[0].OutputColumnCollection) {     for (int i = 0; i < conexionADestination.InputCollection[0].ExternalMetadataColumnCollection.Count; i++)     {         string c = conexionADestination.InputCollection[0].ExternalMetadataColumnCollection[i].Name;         if (c.ToUpper() == col.Name.ToUpper())         {             IDTSInputColumn100 column = conexionADestination.InputCollection[0].InputColumnCollection.New();             column.LineageID = col.ID;             column.ExternalMetadataColumnID = conexionADestination.InputCollection[0].ExternalMetadataColumnCollection[i].ID;         }     } }   Save Package into the file system. app.SaveToXml(@"D:\Workspace\SSIS\Test_DB_To_DB.dtsx", Mipk, null);   Execute package. Mipk.Execute(); Conclusion In this article, we have explained one of the alternatives for creating SSIS packages using .NET console application. In case you have any questions, please feel free to ask in the comment section below.   RELATED BLOGS: Basics of SSIS(SQL Server Integration Service)