Category - SQL Server

Kafka with ELK implementation
Aug 17, 2020

Apache Kafka is one of the most common buffer solutions deployed together with the ELK Stack. Kafka sits between the log shippers and the indexing tier, acting as a buffer for the data being collected. In this blog, we'll see how to deploy all the components required to set up a resilient logging pipeline with Apache Kafka and the ELK Stack:

Filebeat – collects logs and forwards them to a Kafka topic.
Kafka – brokers the data flow and queues it.
Logstash – pulls the data from the Kafka topic, processes it, and ships it to Elasticsearch.
Elasticsearch – indexes the data.
Kibana – analyzes the data.

My environment: To perform the steps below, I set up a single Ubuntu 18.04 VM on AWS EC2 using local storage. In real-life scenarios you will probably have all these components running on separate machines. I started the instance in the public subnet of a VPC and then set up a security group to enable access from anywhere using SSH and TCP 5601 (for Kibana). Finally, I added a new Elastic IP address and associated it with the running instance. The example logs used for this tutorial are Apache access logs; you could just as well use VPC Flow Logs, ALB access logs, etc.

Log in to your Ubuntu system with a user that has sudo privileges. For a remote Ubuntu server, use ssh to access it; Windows users can use PuTTY or PowerShell to log in. Elasticsearch requires Java to run, so make sure your system has Java installed:

sudo apt install openjdk-11-jdk-headless

Verify the installation with the command below, which shows the current Java version:

java --version
openjdk 11.0.3 2019-04-16
OpenJDK Runtime Environment (build 11.0.3+7-Ubuntu-1ubuntu218.04.1)
OpenJDK 64-Bit Server VM (build 11.0.3+7-Ubuntu-1ubuntu218.04.1, mixed mode, sharing)

Step 1: Installing Elasticsearch

We will start by installing the main component in the stack: Elasticsearch. Since version 7.x Elasticsearch is bundled with its own Java runtime, so we can jump right ahead by adding Elastic's signing key. Download and install the public signing key:

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

On Debian/Ubuntu you may need to install the apt-transport-https package before proceeding:

sudo apt-get install apt-transport-https

Our next step is to add the repository definition to the system:

echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list

You can then install the Elasticsearch Debian package with:

sudo apt-get update && sudo apt-get install elasticsearch

Before we bootstrap Elasticsearch, we need to apply some basic settings in the Elasticsearch configuration file at /etc/elasticsearch/elasticsearch.yml:

sudo su
nano /etc/elasticsearch/elasticsearch.yml

Since we are installing Elasticsearch on AWS, we will bind Elasticsearch to localhost.
Also, we need to define the private IP of our EC2 instance as a master-eligible node:

network.host: "localhost"
http.port: 9200
cluster.initial_master_nodes: ["<InstancePrivateIP>"]

Save the file and run Elasticsearch with:

sudo service elasticsearch start

To confirm that everything is working as expected, point curl at http://localhost:9200. You should see something like the following output (give Elasticsearch a minute or two before you start to worry about not seeing any response):

{
  "name" : "elasticsearch",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "W_Ky1DL3QL2vgu3sdafyag",
  "version" : {
    "number" : "7.2.0",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "508c38a",
    "build_date" : "2019-06-20T15:54:18.811730Z",
    "build_snapshot" : false,
    "lucene_version" : "8.0.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Step 2: Installing Logstash

Next up, the "L" in ELK: Logstash. Since we already defined the Elastic repository on the system, installing Logstash is easy; just run:

sudo apt-get install logstash -y

Logstash also requires Java, so verify it is installed:

java -version

Next, we will configure a Logstash pipeline that pulls our logs from a Kafka topic, processes these logs, and ships them on to Elasticsearch for indexing. Let's create a new config file:

sudo nano /etc/logstash/conf.d/apache.conf

input {
  kafka {
    bootstrap_servers => "localhost:9092"
    topics => "apache"
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  geoip {
    source => "clientip"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}

As you can see, we are using the Logstash Kafka input plugin to define the Kafka host and the topic we want Logstash to pull from. We apply some filtering to the logs and ship the data to our local Elasticsearch instance.

Step 3: Installing Kibana

Let's move on to the next component in the ELK Stack: Kibana. As before, we will use a simple apt command to install Kibana:

sudo apt-get install kibana

We will then open up the Kibana configuration file at /etc/kibana/kibana.yml and make sure we have the correct configurations defined:

server.port: 5601
server.host: "<INSTANCE_PRIVATE_IP>"
elasticsearch.hosts: ["http://<INSTANCE_PRIVATE_IP>:9200"]

Then enable and start the Kibana service:

sudo systemctl enable kibana
sudo systemctl start kibana

We also need to install Filebeat:

sudo apt install filebeat

Open up Kibana in your browser with http://<PUBLIC_IP>:5601. You will be presented with the Kibana home page.

Create SSIS Data Flow Task Package Programmatically
Jul 27, 2020

In this article, we will review how to create an SSIS data flow task package in a console application, with an example.

Requirements:
Microsoft Visual Studio 2017
SQL Server 2014 SSDT

Done with the above requirements? Let's start by launching Microsoft Visual Studio 2017 and creating a new Console project. After creating the new project, give it a proper name. In Solution Explorer, add the relevant references and ensure that you have declared the namespaces below:

using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime;
using RuntimeWrapper = Microsoft.SqlServer.Dts.Runtime.Wrapper;

To use these namespaces, the corresponding SSIS assemblies must be referenced in the project. Keep in mind that all of these references should have the same version.

After importing the namespaces, ask the user for the source connection string, the destination connection string, and the table that will be copied to the destination.

string sourceConnectionString, destinationConnectionString, tableName;
Console.Write("Enter Source Database Connection String: ");
sourceConnectionString = Console.ReadLine();
Console.Write("Enter Destination Database Connection String: ");
destinationConnectionString = Console.ReadLine();
Console.Write("Enter Table Name: ");
tableName = Console.ReadLine();

After the declarations, create instances of Application and Package.

Application app = new Application();
Package Mipk = new Package();
Mipk.Name = "DatabaseToDatabase";

Create the ADO.NET source connection manager on the package.

ConnectionManager connSource;
connSource = Mipk.Connections.Add("ADO.NET:SQL");
connSource.ConnectionString = sourceConnectionString;
connSource.Name = "ADO NET DB Source Connection";

Create the ADO.NET destination connection manager on the package.

ConnectionManager connDestination;
connDestination = Mipk.Connections.Add("ADO.NET:SQL");
connDestination.ConnectionString = destinationConnectionString;
connDestination.Name = "ADO NET DB Destination Connection";

Insert a data flow task into the package.

Executable e = Mipk.Executables.Add("STOCK:PipelineTask");
TaskHost thMainPipe = (TaskHost)e;
thMainPipe.Name = "DFT Database To Database";
MainPipe df = thMainPipe.InnerObject as MainPipe;

Add the ADO.NET source component to the data flow task.

IDTSComponentMetaData100 conexionAOrigen = df.ComponentMetaDataCollection.New();
conexionAOrigen.ComponentClassID = "Microsoft.SqlServer.Dts.Pipeline.DataReaderSourceAdapter, Microsoft.SqlServer.ADONETSrc, Version=14.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91";
conexionAOrigen.Name = "ADO NET Source";

Get the design-time instance of the component and initialize it.

CManagedComponentWrapper instance = conexionAOrigen.Instantiate();
instance.ProvideComponentProperties();

Specify the connection manager.

conexionAOrigen.RuntimeConnectionCollection[0].ConnectionManager = DtsConvert.GetExtendedInterface(connSource);
conexionAOrigen.RuntimeConnectionCollection[0].ConnectionManagerID = connSource.ID;

Set the custom properties.

instance.SetComponentProperty("AccessMode", 0);
instance.SetComponentProperty("TableOrViewName", "\"dbo\".\"" + tableName + "\"");

Reinitialize the source metadata.

instance.AcquireConnections(null);
instance.ReinitializeMetaData();
instance.ReleaseConnections();

Now, add the destination component to the data flow task.
IDTSComponentMetaData100 conexionADestination = df.ComponentMetaDataCollection.New();
conexionADestination.ComponentClassID = "Microsoft.SqlServer.Dts.Pipeline.ADONETDestination, Microsoft.SqlServer.ADONETDest, Version=14.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91";
conexionADestination.Name = "ADO NET Destination";

Get the design-time instance of the component and initialize it.

CManagedComponentWrapper instanceDest = conexionADestination.Instantiate();
instanceDest.ProvideComponentProperties();

Specify the connection manager.

conexionADestination.RuntimeConnectionCollection[0].ConnectionManager = DtsConvert.GetExtendedInterface(connDestination);
conexionADestination.RuntimeConnectionCollection[0].ConnectionManagerID = connDestination.ID;

Set the custom properties.

instanceDest.SetComponentProperty("TableOrViewName", "\"dbo\".\"" + tableName + "\"");

Connect the source to the destination component.

IDTSPath100 union = df.PathCollection.New();
union.AttachPathAndPropagateNotifications(conexionAOrigen.OutputCollection[0], conexionADestination.InputCollection[0]);

Reinitialize the destination metadata.

instanceDest.AcquireConnections(null);
instanceDest.ReinitializeMetaData();
instanceDest.ReleaseConnections();

Map the source output columns to the destination columns.

foreach (IDTSOutputColumn100 col in conexionAOrigen.OutputCollection[0].OutputColumnCollection)
{
    for (int i = 0; i < conexionADestination.InputCollection[0].ExternalMetadataColumnCollection.Count; i++)
    {
        string c = conexionADestination.InputCollection[0].ExternalMetadataColumnCollection[i].Name;
        if (c.ToUpper() == col.Name.ToUpper())
        {
            IDTSInputColumn100 column = conexionADestination.InputCollection[0].InputColumnCollection.New();
            column.LineageID = col.ID;
            column.ExternalMetadataColumnID = conexionADestination.InputCollection[0].ExternalMetadataColumnCollection[i].ID;
        }
    }
}

Save the package to the file system.

app.SaveToXml(@"D:\Workspace\SSIS\Test_DB_To_DB.dtsx", Mipk, null);

Execute the package.

Mipk.Execute();

Conclusion

In this article, we have explained one of the alternatives for creating SSIS packages using a .NET console application. In case you have any questions, please feel free to ask in the comment section below.

DELETE and UPDATE CASCADE in SQL Server foreign key
Jul 20, 2020

In this article, we will review the DELETE and UPDATE CASCADE rules in SQL Server foreign keys with different examples.

DELETE CASCADE: When we create a foreign key using this option, deleting a referenced row in the parent table (the table with the primary key) also deletes the referencing rows in the child table.

UPDATE CASCADE: When we create a foreign key using UPDATE CASCADE, updating a referenced row in the parent table also updates the referencing rows in the child table.

We will be discussing the following topics in this article:
Creating DELETE CASCADE and UPDATE CASCADE rules in a foreign key using a T-SQL script
Triggers on a table with a DELETE or UPDATE cascading foreign key

Let us see how to create a foreign key with DELETE and UPDATE CASCADE rules, along with a few examples.

Creating a foreign key with DELETE and UPDATE CASCADE rules

Please refer to the below T-SQL script, which creates a parent table, a child table, and a foreign key on the child table with the DELETE CASCADE rule. Insert some sample data using the below T-SQL script and check the records. Now I delete a row in the parent table with CountryID = 1, which also deletes the rows in the child table that have CountryID = 1.

Please refer to the below T-SQL script to create a foreign key with the UPDATE CASCADE rule. Now update CountryID in the Countries table for a row, which also updates the referencing rows in the child table States.

The following T-SQL script creates a foreign key with UPDATE and DELETE CASCADE rules. To know the update and delete actions of a foreign key, query the sys.foreign_keys view, replacing the constraint name in the script. The below image shows that a DELETE CASCADE action and an UPDATE CASCADE action are defined on the foreign key.

Let's move forward and check the behavior of the DELETE and UPDATE rules when the foreign keys are on a child table that itself acts as a parent table to another child table. The below example demonstrates this scenario: "Countries" is the parent table of the "States" table, and the "States" table is the parent table of the "Cities" table.

We will now create a foreign key with the DELETE CASCADE rule on the States table, referencing CountryID in the parent table Countries. Then, on the Cities table, create a foreign key without a DELETE CASCADE rule. If we try to delete a record with CountryID = 3, it will throw an error: the delete on the parent table "Countries" tries to delete the referencing rows in the child table States, but on the Cities table we have a foreign key constraint with no action for delete, and the referenced value still exists in that table. The delete fails at the second foreign key.

When we create the second foreign key with the DELETE CASCADE rule, the above delete command runs successfully, deleting records in the child table "States", which in turn deletes records in the second child table "Cities".

Triggers on a table with a DELETE CASCADE or UPDATE CASCADE foreign key

An INSTEAD OF UPDATE trigger cannot be created on a table if a foreign key with UPDATE CASCADE already exists on the table. It throws the error "Cannot create INSTEAD OF DELETE or INSTEAD OF UPDATE TRIGGER 'trigger name' on table 'table name'. This is because the table has a FOREIGN KEY with cascading DELETE or UPDATE." Similarly, we cannot create an INSTEAD OF DELETE trigger on the table when a foreign key with the DELETE CASCADE rule already exists on the table.
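As a minimal sketch of the kind of scripts referenced above (column names and data types here are illustrative assumptions, not taken from the original scripts), the parent and child tables with a cascading foreign key could look like this:

-- Illustrative sketch: column names and data types are assumptions.
CREATE TABLE dbo.Countries
(
    CountryID   int PRIMARY KEY,
    CountryName varchar(50)
);

CREATE TABLE dbo.States
(
    StateID   int PRIMARY KEY,
    StateName varchar(50),
    CountryID int,
    CONSTRAINT FK_States_Countries FOREIGN KEY (CountryID)
        REFERENCES dbo.Countries (CountryID)
        ON DELETE CASCADE
        ON UPDATE CASCADE
);

-- Deleting a parent row now also deletes its referencing rows in dbo.States.
DELETE FROM dbo.Countries WHERE CountryID = 1;

-- Check the delete/update actions defined on the foreign key.
SELECT name, delete_referential_action_desc, update_referential_action_desc
FROM sys.foreign_keys
WHERE name = 'FK_States_Countries';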
Conclusion

In this article, we explored a few examples of the DELETE CASCADE and UPDATE CASCADE rules in SQL Server foreign keys. In case you have any questions, please feel free to ask in the comment section below.

Table partitioning in SQL
Jun 10, 2020

What is table partitioning in SQL? Table partitioning is a way to divide a large table into smaller, more manageable parts without having to create separate tables for each part. Data in a partitioned table is physically stored in groups of rows called partitions, and each partition can be accessed and maintained separately. Partitioning is not visible to end users; a partitioned table behaves like one logical table when queried. Data in a partitioned table is partitioned based on a single column, the partition column, often called the partition key. Only one column can be used as the partition column, but it is possible to use a computed column. The partition scheme maps the logical partitions to physical filegroups; it is possible to map each partition to its own filegroup or all partitions to one filegroup.
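To make the terms above concrete, here is a minimal sketch assuming a hypothetical Orders table partitioned by OrderDate into yearly ranges; the object names, boundary values, and the use of a single PRIMARY filegroup are illustrative assumptions. A partition function supplies the boundary values, and the partition scheme then maps the resulting partitions to filegroups:

-- Illustrative sketch: names and boundary values are assumptions.
CREATE PARTITION FUNCTION pfOrderDate (date)
AS RANGE RIGHT FOR VALUES ('2019-01-01', '2020-01-01');   -- creates 3 partitions

-- Map all logical partitions to one filegroup (PRIMARY here for simplicity).
CREATE PARTITION SCHEME psOrderDate
AS PARTITION pfOrderDate ALL TO ([PRIMARY]);

-- OrderDate is the partition column (partition key).
CREATE TABLE dbo.Orders
(
    OrderID   int           NOT NULL,
    OrderDate date          NOT NULL,
    Amount    decimal(10,2) NULL
) ON psOrderDate (OrderDate);

Queries against dbo.Orders look exactly like queries against any other table; the partitioning stays invisible to end users.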

Restore the encrypted database
Jun 01, 2020

To restore an encrypted database, we have to run the below script in the master database:

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<your_password>';

CREATE CERTIFICATE <your_certificate_name>
FROM FILE = '<path of .cer file>'
WITH PRIVATE KEY (FILE = '<path of .pvk file>', DECRYPTION BY PASSWORD = '<your_password>');

Now you can follow the normal restore process in SQL Server.
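The normal restore itself is the standard RESTORE DATABASE command; as a rough sketch, with a hypothetical database name and file paths:

-- Sketch only: database name and paths are placeholders.
RESTORE DATABASE EncryptedDB
FROM DISK = N'D:\Backups\EncryptedDB.bak'
WITH MOVE 'EncryptedDB'     TO N'D:\Data\EncryptedDB.mdf',
     MOVE 'EncryptedDB_log' TO N'D:\Data\EncryptedDB_log.ldf',
     RECOVERY;

The restore succeeds only because the certificate created above matches the one that encrypted the source database.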

What is meant by MSBI?
Oct 16, 2019

MSBI stands for Microsoft Business Intelligence. This powerful suite is composed of tools which help in providing the best solutions for Business Intelligence and data mining queries. It uses Visual Studio along with SQL Server, and it empowers users to gain access to accurate and up-to-date information for better decision making in an organization. It offers different tools for the different processes required in Business Intelligence (BI) solutions.

MSBI is divided into 3 categories:

SSIS – SQL Server Integration Services – integration tool. This tool is used for integration tasks such as moving data from one database to another, for example from Oracle to SQL Server or from Excel to SQL Server. It is also used for bulk operations in the database, like inserting lakhs of records at once. We can create Integration Services modules which will do the job for us.

SSAS – SQL Server Analysis Services – analysis tool. This tool is used to analyze the performance of SQL Server in terms of load balancing, heavy data, transactions, etc., so it is more or less related to the administration of SQL Server. It is a very powerful tool, and through it we can analyze the data being inserted into the database, such as how many transactions happen in a second.

SSRS – SQL Server Reporting Services – reporting tool. This is a very efficient tool as it is platform-independent. We can generate reports using this tool and use them in any type of application. Nowadays it is very popular in the market.

"A visual always helps better to understand any concept." The diagram below broadly outlines Microsoft Business Intelligence (MSBI).

Query Optimization
Oct 11, 2019

Query optimization is the most underrated but most important topic to master when implementing SQL queries, stored procedures, or functions. Knowing the syntax and structure of SQL operations is good, but one must also know optimization. Without that knowledge a developer can create DDL and DML statements, but they will not be well-designed procedures: while executing those statements there is a chance they will take a long time or create a deadlock situation. Proper joining is also considered part of optimization.

The simplest logical query execution order is:

1. From
2. Joins
3. Where
4. Group by
5. Having clause
6. Column list
7. Distinct
8. Order by
9. Top

From: First, all the records are fetched from the table mentioned after the FROM keyword.
Joins: Joins are an essential part of any SQL statement. A developer must have proper knowledge of the tables; otherwise, a wrong join can populate wrong data or extend execution time.
Where: Another filter applied to the query after the joins. It is used to reduce the number of records according to the given filter.
Group by: Used for grouping records with aggregate functions, or grouping records by a particular column.
Having: When we need to filter on an aggregate function, it should go in the HAVING clause.
Column list: With '*', we return all the columns and all records from a table. If not necessary, we should list only the columns that are actually useful.
Distinct: Used to remove duplicate records while fetching rows through a SELECT statement.
Order by: Used to sort the data ascending or descending by column.
Top: Used to limit the number of records to be displayed.

The query execution flow above is the first step of optimization: every keyword should be placed according to this plan. The last 3 steps (7, 8, and 9) are the most costly because they process all the records before doing their operations. GROUP BY and HAVING clauses also take time during execution because they use aggregate functions. Along with that, avoid built-in functions when they are not necessary; CONVERT and CAST are the most frequent built-in functions that can extend execution time because of the conversion.

As mentioned above, joins are the most important part of any SQL statement: a good join improves performance, while a wrong one can mislead you. First of all, while implementing a join, check whether the tables are properly indexed or not. Indexing is very important at table-creation time; a table must have at least one clustered index.

Make less or no use of temp tables. Temp tables tend to increase the complexity of a query because they increase the use of the tempdb database. If a temp table is necessary, create a clustered index on it, which improves performance, and don't wait for the temp table to be dropped automatically; drop it when it is of no use.

Check the execution plan if the query is taking too much time; by reading the plan we can easily find which query, or portion of the query, is taking time. The execution plan shows which table used the maximum processing time out of the overall time.

Make your indexes unique using an integer or uniqueidentifier column, which increases performance. A table can have only one clustered index and one or more nonclustered indexes.
Use small data types for indexing. To check the existence of a record, don't depend on a COUNT statement in the query; use EXISTS instead (see the sketch below). Use the WITH (NOLOCK) hint to avoid locks while fetching records from a table. Avoid the NOT IN operator in the WHERE condition; instead you can go for a LEFT JOIN. In the same way there is no need to go for the IN operator; you can simply use an INNER JOIN. Please avoid loops and cursors while creating any stored procedure, because looping causes CPU usage by calling the same statement again and again, so avoid them if not needed. Use UNION ALL instead of UNION for combining two or more SELECT statements when duplicates don't need to be removed.
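To make a few of these rewrites concrete, here is a minimal T-SQL sketch; the Customers, Orders, ActiveCustomers, ArchivedCustomers, and #OrderStage names are hypothetical:

-- Illustrative sketch: all table and column names are assumptions.

-- Existence check: prefer EXISTS over COUNT(*).
IF EXISTS (SELECT 1 FROM dbo.Orders WHERE CustomerID = 42)
    PRINT 'Customer has orders';

-- Instead of NOT IN, use a LEFT JOIN and filter on NULL.
SELECT c.CustomerID
FROM dbo.Customers AS c
LEFT JOIN dbo.Orders AS o ON o.CustomerID = c.CustomerID
WHERE o.CustomerID IS NULL;

-- Instead of IN, an INNER JOIN returns the matching rows.
SELECT DISTINCT c.CustomerID
FROM dbo.Customers AS c
INNER JOIN dbo.Orders AS o ON o.CustomerID = c.CustomerID;

-- UNION ALL skips the duplicate-removal step that UNION performs.
SELECT CustomerID FROM dbo.ActiveCustomers
UNION ALL
SELECT CustomerID FROM dbo.ArchivedCustomers;

-- If a temp table is unavoidable, index it and drop it explicitly.
CREATE TABLE #OrderStage (OrderID int, Amount decimal(10,2));
CREATE CLUSTERED INDEX IX_OrderStage_OrderID ON #OrderStage (OrderID);
-- ... work with #OrderStage ...
DROP TABLE #OrderStage;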

Schedule Database backup on SQL Server Express Edition
Sep 20, 2019

Have you ever attempted to set up an automated backup of your SQL Server Express Edition and found that there is no SQL Server Agent where you can schedule a job that will take a backup of your database? The world does not end there, and you don't need to pay extra just to get backups via SQL Agent, which is available only in the Standard and Enterprise editions. There are many options to automate a backup job that runs at a specific time and does not require manual intervention. Here, we will learn how to do it via sqlcmd, using a batch file and the Windows built-in Task Scheduler. Hope you find this useful.

Create a BAT (batch) file that executes the command to take a backup of the database and save it:

echo off
:: --------------------------------------------------
:: clear console
cls
:: --------------------------------------------------
:: Define variables
set SERVERNAME=YOUR_SERVER_NAME
set DATABASENAME=DATABASE_NAME
set MyTime=%TIME: =0%
set MyDate=%DATE:~-4%.%DATE:~7,2%.%DATE:~4,2%.%MyTime:~0,2%.%MyTime:~3,2%.%MyTime:~6,2%
set FileName=%DATABASENAME%_%MyDate%.bak
set BAK_PATH=DIRECTORY_PATH
set DEST_FILE=%BAK_PATH%%FileName%
:: --------------------------------------------------
:: BACKUP Database
sqlcmd -E -S %SERVERNAME% -d master -Q "BACKUP DATABASE [%DATABASENAME%] TO DISK = N'%DEST_FILE%' WITH INIT , NOUNLOAD , NAME = N'%DATABASENAME% backup', NOSKIP , STATS = 10, NOFORMAT"
:: --------------------------------------------------
:: Optional part
:: --------------------------------------------------
:: Zip the backup file
7z a -tzip "%DEST_FILE%.zip" "%DEST_FILE%"
:: --------------------------------------------------
:: Delete the unzipped file
DEL "%DEST_FILE%"

SERVERNAME is the name of the SQL Server machine.
DATABASENAME is the database that will be backed up.
FileName is the database name with the date appended and a .bak extension.
BAK_PATH is the directory in which the database backup file will be saved.
DEST_FILE combines the backup path and file name.

After defining all the variables, the database backup is generated and saved as a zip file at the DEST_FILE path, and at the end the unzipped .bak file is deleted.

Now it's time to schedule the created batch file:

1. Start Menu -> Task Scheduler -> Run as administrator
2. Click on Create Task... from the right bar and configure it with Triggers and Actions.
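Once the task is scheduled, you may also want to confirm from inside SQL Server that backups are actually being produced. One way is to query the msdb backup history; this is only a sketch, and DATABASE_NAME is the same placeholder used in the batch file above:

-- Sketch: list the most recent backups recorded for the database.
SELECT TOP (10)
       database_name,
       backup_start_date,
       backup_finish_date,
       type                -- 'D' = full database backup
FROM msdb.dbo.backupset
WHERE database_name = 'DATABASE_NAME'
ORDER BY backup_finish_date DESC;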