
Quick Setup: Kafka with ELK Integration

Aug 17, 2020

Apache Kafka is the most common buffer deployed alongside the ELK Stack. Kafka sits between the log shippers and the indexing layer, decoupling the two and queuing the data being collected.

In this blog, we'll see how to deploy all the components required to set up a resilient log pipeline with Apache Kafka and the ELK Stack:

Filebeat – collects logs and forwards them to a Kafka topic.
Kafka – brokers the data flow and queues it.
Logstash – aggregates the data from the Kafka topic, processes it and ships to Elasticsearch.
Elasticsearch – indexes the data.
Kibana – for analyzing the data.
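
To make the buffering role concrete, here is a minimal in-memory sketch of the idea. It is not Kafka and uses no Kafka client; `TopicBuffer`, `produce`, `poll` and `lag` are hypothetical names chosen for illustration. It only shows why a retained log with a consumer offset lets the shipper keep writing while the indexer is offline.

```python
class TopicBuffer:
    """In-memory stand-in for a topic: an append-only log plus one consumer offset."""

    def __init__(self):
        self._log = []      # messages are retained, not removed when read
        self._offset = 0    # position of the single, hypothetical consumer

    def produce(self, message):
        """Shipper side (Filebeat in this pipeline): append a message."""
        self._log.append(message)

    def poll(self, max_records=100):
        """Consumer side (Logstash): return unread messages and advance the offset."""
        batch = self._log[self._offset:self._offset + max_records]
        self._offset += len(batch)
        return batch

    def lag(self):
        """Number of messages produced but not yet consumed."""
        return len(self._log) - self._offset


topic = TopicBuffer()

# The shipper keeps producing while the indexer is offline...
for n in range(5):
    topic.produce(f"log line {n}")
print(topic.lag())                  # 5

# ...and the indexer catches up in batches once it is back.
print(topic.poll(max_records=3))    # ['log line 0', 'log line 1', 'log line 2']
print(topic.lag())                  # 2
```

A real Kafka topic adds partitions, replication and persistent storage on top of this picture, which is what makes the pipeline resilient rather than merely decoupled.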

My environment:

To perform the steps below, I set up a single Ubuntu 18.04 VM on AWS EC2 using local storage. In real-life scenarios, you will probably run these components on separate machines.

I started the instance in the public subnet of a VPC and then set up a security group to enable access from anywhere using SSH and TCP 5601 (for Kibana).

This tutorial uses Apache access logs for the pipeline, but you could equally use VPC Flow Logs, ALB access logs, and so on.


Log in to your Ubuntu system with a user that has sudo privileges. For a remote Ubuntu server, use SSH to access it; Windows users can use PuTTY or PowerShell to log in.

The components in this pipeline run on the JVM. Elasticsearch 7.x bundles its own JDK, but Logstash and Kafka rely on a system Java, so make sure Java is installed. The following command installs OpenJDK 11:

sudo apt install openjdk-11-jdk-headless

Check that the installation succeeded by printing the Java version:

~$ java --version
openjdk 11.0.3 2019-04-16
OpenJDK Runtime Environment (build 11.0.3+7-Ubuntu-1ubuntu218.04.1)
OpenJDK 64-Bit Server VM (build 11.0.3+7-Ubuntu-1ubuntu218.04.1, mixed mode, sharing)

To round off the environment, I allocated a new Elastic IP address and associated it with the running instance.

Step 1: Installing Elasticsearch

We will start by installing the main component in the stack: Elasticsearch. Since version 7.x, Elasticsearch is bundled with Java, so we can jump right ahead and download and install Elastic's public signing key:

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

On Debian-based systems you may need to install the apt-transport-https package before proceeding:

sudo apt-get install apt-transport-https

Our next step is to add the repository definition to our system:

echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list

You can install the Elasticsearch Debian package with:

sudo apt-get update && sudo apt-get install elasticsearch

Before we bootstrap Elasticsearch, we need to apply some basic configurations using the Elasticsearch configuration file at: /etc/elasticsearch/elasticsearch.yml:

sudo su
nano /etc/elasticsearch/elasticsearch.yml

Since we are installing Elasticsearch on AWS, we will bind Elasticsearch to the localhost.

Also, we need to define the private IP of our EC2 instance as a master-eligible node:

network.host: "localhost"
http.port: 9200
cluster.initial_master_nodes: ["<InstancePrivateIP>"]

Save the file and run Elasticsearch with:

sudo service elasticsearch start

To confirm that everything is working as expected, point curl to: http://localhost:9200, and you should see something like the following output (give Elasticsearch a minute or two before you start to worry about not seeing any response):

{
"name" : "elasticsearch",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "W_Ky1DL3QL2vgu3sdafyag",
"version" : {
    "number" : "7.2.0",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "508c38a",
    "build_date" : "2019-06-20T15:54:18.811730Z",
    "build_snapshot" : false,
    "lucene_version" : "8.0.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
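
If you would rather script this check than retry curl by hand, the sketch below polls the same endpoint and pulls the version number out of the response. It uses only the Python standard library; the function names and the `attempts` and `delay` defaults are assumptions made for illustration, and it assumes an unsecured node as configured in this tutorial.

```python
import json
import time
import urllib.request


def parse_version(body):
    """Extract version.number from the JSON document served at the root URL."""
    return json.loads(body)["version"]["number"]


def wait_for_elasticsearch(url="http://localhost:9200", attempts=12, delay=10.0):
    """Poll the root endpoint until it answers, then return the reported version.

    attempts and delay are illustrative defaults, not values from the tutorial.
    """
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                return parse_version(response.read().decode("utf-8"))
        except OSError:
            # Covers refused connections, timeouts and urllib's URLError:
            # the node is probably still starting up, so wait and retry.
            time.sleep(delay)
    raise TimeoutError(f"no response from {url} after {attempts} attempts")
```

Calling `wait_for_elasticsearch()` on the instance returns the version string (for the response shown above, `7.2.0`) or raises `TimeoutError` if the node never comes up.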

Step 2: Installing Logstash

Next up, the "L" in ELK: Logstash. Logstash needs Java, so first verify that Java is installed:

java -version

The output reflects whichever version is installed. On a machine running OpenJDK 8, for example, it looks like this:

openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.16.04.1-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)

Since we already defined the repository in the system, all we have to do to install Logstash is run:

sudo apt-get install logstash -y

Next, we will configure a Logstash pipeline that pulls our logs from a Kafka topic, processes these logs, and ships them on to Elasticsearch for indexing.

Let's create a new config file:

sudo nano /etc/logstash/conf.d/apache.conf

Add the following pipeline definition to the file:

input {
  kafka {
    bootstrap_servers => "localhost:9092"
    topics => "apache"
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  geoip {
    source => "clientip"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}

As you can see, we're using the Logstash Kafka input plugin to define the Kafka host and the topic we want Logstash to pull from. The filter section parses each line with grok, sets the event timestamp from the log entry, and enriches it with geographic data for the client IP. Finally, we ship the data to our local Elasticsearch instance.
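
To see roughly what the grok and date filters do to a single log line, here is a Python sketch. The regular expression is a simplified approximation of the COMBINEDAPACHELOG pattern, not the pattern Logstash actually ships, and the sample line is made up. The group names mirror the field names the config above relies on (`clientip`, `timestamp`); `parse_access_log` is a hypothetical helper.

```python
import re
from datetime import datetime

# Simplified approximation of Apache's combined log format.
COMBINED_LOG = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_access_log(line):
    """Return a dict of extracted fields, or None when the line does not match."""
    match = COMBINED_LOG.match(line)
    if match is None:
        return None
    event = match.groupdict()
    # Counterpart of the date filter: dd/MMM/yyyy:HH:mm:ss Z -> aware datetime.
    event["@timestamp"] = datetime.strptime(
        event["timestamp"], "%d/%b/%Y:%H:%M:%S %z"
    )
    return event


# Made-up sample line in combined log format.
sample = (
    '203.0.113.7 - - [17/Aug/2020:10:05:03 +0000] '
    '"GET /index.html HTTP/1.1" 200 1024 "-" "curl/7.58.0"'
)
event = parse_access_log(sample)
print(event["clientip"], event["verb"], event["response"])  # 203.0.113.7 GET 200
print(event["@timestamp"].isoformat())                      # 2020-08-17T10:05:03+00:00
```

One difference worth knowing: where this sketch returns `None` for a line that does not match, Logstash keeps the event and tags it with `_grokparsefailure`, which you can then search for in Kibana.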

Step 3: Installing Kibana

Let’s move on to the next component in the ELK Stack — Kibana. As before, we will use a simple apt command to install Kibana:

sudo apt-get install kibana

We will then open up the Kibana configuration file at /etc/kibana/kibana.yml and make sure we have the correct configurations defined. Because Elasticsearch is bound to localhost in this setup, Kibana has to reach it on localhost as well:

server.port: 5601
server.host: "<INSTANCE_PRIVATE_IP>"
elasticsearch.hosts: ["http://localhost:9200"]

Then enable and start the Kibana service:

sudo systemctl enable kibana
sudo systemctl start kibana

Open up Kibana in your browser with http://<PUBLIC_IP>:5601. You will be presented with the Kibana home page.

Step 4: Installing Filebeat

Next, we need to install Filebeat, which collects the logs and forwards them to Kafka:

sudo apt install filebeat


SQL-Server

Understanding Different Types of Switching in Table Partitioning in Microsoft SQL Server

Parshwa Kapadia

Jun 21, 2024

Introduction

When the goal is to optimize database performance and manageability, it's important to understand the nuances of table partitioning in SQL Server, including partition switching. Partition switching is a feature in SQL Server that allows for fast data movement between tables and partitions. This blog explores the different types of partition switching and their applications in SQL Server.

What is Partition Switching?

Partition switching involves moving data between partitions or between a partition and a non-partitioned table without physically copying the data. Instead, metadata pointers are updated, making the operation extremely fast and efficient. This is especially useful for data archiving, loading new data, and maintaining large datasets.

Types of Partition Switching

1. Switching Between Partitions in the Same Table

Switching data between partitions within the same table can be useful for reorganizing data or when performing operations that require temporary partition rearrangement.

Example: Suppose you have a table SalesData partitioned by month and you need to move data from one month to another.

-- Switch data from partition 2 to partition 3
ALTER TABLE SalesData SWITCH PARTITION 2 TO SalesData PARTITION 3;

2. Switching Between a Table and a Partitioned Table

This type of switching is typically used for bulk loading or removing data. You can switch a partition of a partitioned table to a non-partitioned table (and vice versa) to quickly load or archive data.

Example: Loading new data into a partitioned table SalesData from a staging table StagingSalesData.

-- Ensure the staging table matches the schema of the partitioned table
CREATE TABLE StagingSalesData (
    SaleID int,
    SaleDate datetime,
    Amount money
);

-- Switch the staging table data into the partition
ALTER TABLE StagingSalesData SWITCH TO SalesData PARTITION 1;

3. Switching Between a Partitioned Table and Another Partitioned Table

This involves moving data between two different partitioned tables. It's useful when dealing with different data lifecycle management scenarios, such as archiving old data into a separate historical table.

Example: Switching data from a partition in CurrentSalesData to a partition in HistoricalSalesData.

-- Both tables should have the same structure and partition scheme
ALTER TABLE CurrentSalesData SWITCH PARTITION 2 TO HistoricalSalesData PARTITION 1;

4. Switching Data Out of a Partitioned Table

This is used to remove data from a partitioned table and move it into a non-partitioned table for further processing or archiving.

Example: Switching data from a partition in SalesData to a table OldSalesData.

-- Ensure the target table matches the schema of the partitioned table
CREATE TABLE OldSalesData (
    SaleID int,
    SaleDate datetime,
    Amount money
);

-- Switch the data out of the partition
ALTER TABLE SalesData SWITCH PARTITION 1 TO OldSalesData;

Guidelines for Partition Switching

To ensure smooth partition switching, consider the following guidelines:

Schema Matching: Ensure that the schemas of the source and target tables match exactly, including constraints and indexes.
Partition Alignment: The source and target partitions must align correctly based on the partition function.
Check Constraints: Check constraints on the tables must be consistent with the partition boundary conditions.
Minimal Indexes: Avoid using non-aligned indexes on partitioned tables to ensure efficient switching.

Benefits of Partition Switching

Performance Efficiency: Since partition switching involves metadata operations rather than physical data movement, it is extremely fast and efficient.
Minimal Downtime: Enables quick data loading, archiving, and reorganization with minimal downtime.
Data Management Flexibility: Facilitates flexible data management strategies, allowing for efficient data lifecycle management.

Conclusion

Partition switching is a powerful feature in SQL Server that enhances the performance and manageability of large datasets. Understanding the different types of partition switching and their applications allows you to implement efficient data loading, archiving, and maintenance strategies. By leveraging partition switching, you can ensure that your SQL Server environment remains robust, responsive, and well-organized, ultimately supporting your organization's data management goals.

SQL-Server

Enhancing Performance and Manageability: Table Partitioning in Microsoft SQL Server

Parshwa Kapadia

Jun 14, 2024

Introduction

For anyone overseeing data management and performance optimization, implementing table partitioning in Microsoft SQL Server is a strategic way to enhance database performance and manageability. Table partitioning is a powerful technique that allows large tables to be divided into smaller, more manageable pieces, improving query performance and simplifying maintenance tasks. In this blog, we'll explore the concept of table partitioning, its benefits, and a step-by-step guide to implementing it in SQL Server.

Understanding Table Partitioning

Table partitioning involves dividing a large table into smaller, more manageable segments called partitions. Each partition can be managed and accessed independently, which can significantly improve query performance and simplify maintenance tasks. Partitioning is especially beneficial for large tables with millions or billions of rows, where operations such as data loading, archiving, and querying can become cumbersome.

Key Concepts

Partition Function: Defines how data is distributed across partitions based on a specified column or columns.
Partition Scheme: Maps the partitions defined by the partition function to specific filegroups within the database.
Aligned Indexes: Indexes that are partitioned in the same way as the table, ensuring that queries using these indexes benefit from partitioning.

Benefits of Table Partitioning

Improved Query Performance: Queries that target specific partitions can avoid scanning the entire table, resulting in faster response times. Parallel processing of partitions can enhance performance for complex queries.
Simplified Maintenance: Partition-level operations such as loading, archiving, and deleting data can be performed independently, reducing the impact on overall database performance. Large tables become easier to manage, as partitions can be individually managed and optimized.
Enhanced Data Management: Partitioning can facilitate better data organization and management, such as separating historical data from current data. It also makes data purging and archiving more efficient.

Types of Table Partitions in SQL Server

1. Range Partitioning

Range partitioning is the most common type of partitioning in SQL Server. It involves dividing a table based on a range of values in a specified column, often a date or numerical column. Each partition holds data that falls within a specific range.

Use Cases: Partitioning data by date to manage historical data efficiently. Improving query performance for range-based queries.

Example:

CREATE PARTITION FUNCTION rangePartitionFunction (datetime)
AS RANGE LEFT FOR VALUES ('2021-01-01', '2022-01-01', '2023-01-01');

CREATE PARTITION SCHEME rangePartitionScheme
AS PARTITION rangePartitionFunction TO (fg1, fg2, fg3, fg4);

CREATE TABLE SalesData (
    SaleID int,
    SaleDate datetime,
    Amount money
) ON rangePartitionScheme (SaleDate);

2. List Partitioning

List partitioning allows you to divide a table based on a list of values. Each partition is associated with specific values of a column, often used for categorizing data by discrete values such as regions or departments. SQL Server has no dedicated LIST syntax, so the example below emulates it with a range function whose boundary values are the category names.

Use Cases: Partitioning data by specific categories (e.g., regions, product types). Enhancing query performance for category-based queries.

Example:

CREATE PARTITION FUNCTION listPartitionFunction (nvarchar(20))
AS RANGE LEFT FOR VALUES ('North', 'South', 'East', 'West');

-- Four boundary values create five partitions, so the scheme needs five filegroups
CREATE PARTITION SCHEME listPartitionScheme
AS PARTITION listPartitionFunction TO (fg1, fg2, fg3, fg4, fg5);

CREATE TABLE SalesRegionData (
    SaleID int,
    Region nvarchar(20),
    Amount money
) ON listPartitionScheme (Region);

3. Composite Partitioning

Composite partitioning combines two or more partitioning strategies. The most common combination is range-list or range-hash partitioning. This approach allows for more complex and flexible data distribution strategies.

Use Cases: Managing large datasets with multiple logical divisions. Enhancing performance and manageability for complex queries.

Example:

-- Range-List Partitioning Example
CREATE PARTITION FUNCTION rangePartitionFunction (datetime)
AS RANGE LEFT FOR VALUES ('2021-01-01', '2022-01-01', '2023-01-01');

CREATE PARTITION FUNCTION listPartitionFunction (nvarchar(20))
AS RANGE LEFT FOR VALUES ('North', 'South', 'East', 'West');

Choosing the Right Partitioning Strategy

Selecting the appropriate partitioning strategy depends on several factors, including data characteristics, query patterns, and maintenance requirements. Here are some guidelines to help you choose:

Range Partitioning: Best for time-series data or data with natural ranges. Ideal for scenarios where you frequently query specific ranges of data.
List Partitioning: Suitable for categorical data with a limited number of discrete values. Useful for scenarios where queries target specific categories.
Composite Partitioning: Best for complex data structures that require multiple partitioning dimensions. Ideal for large datasets with varied query patterns and maintenance needs.

Implementing Table Partitioning in SQL Server

Step 1: Planning and Design

Identify Candidate Tables: Analyze your database to identify large tables that will benefit from partitioning. Consider factors such as table size, query patterns, and data lifecycle.
Choose Partitioning Column: Select a column that will be used to distribute data across partitions, often based on date or range values. Ensure the column has a high degree of cardinality to evenly distribute data.

Step 2: Creating a Partition Function

Define the Partition Function: Create a partition function that specifies the boundaries for each partition.

CREATE PARTITION FUNCTION myPartitionFunction (int)
AS RANGE LEFT FOR VALUES (1000, 2000, 3000);

Step 3: Creating a Partition Scheme

Map Partitions to Filegroups: Create a partition scheme that maps each partition to a specific filegroup.

CREATE PARTITION SCHEME myPartitionScheme
AS PARTITION myPartitionFunction TO (fg1, fg2, fg3, fg4);

Step 4: Creating a Partitioned Table

Create the Table Using Partition Scheme: Create the partitioned table and specify the partition scheme.

CREATE TABLE myPartitionedTable (
    id int,
    data nvarchar(100),
    partition_column int
) ON myPartitionScheme (partition_column);

Step 5: Managing Indexes on Partitioned Tables

Create Aligned Indexes: Ensure indexes are partitioned in the same way as the table.

CREATE INDEX idx_myPartitionedTable
ON myPartitionedTable (partition_column)
ON myPartitionScheme (partition_column);

Step 6: Maintaining Partitioned Tables

Data Management: Use partition-level operations for data loading, archiving, and purging. Utilize partition switching to efficiently move data between tables.
Monitoring and Optimization: Regularly monitor partition performance and manage storage distribution. Rebuild or reorganize partitions as needed to maintain optimal performance.

Conclusion

Implementing table partitioning in Microsoft SQL Server is a powerful strategy for improving database performance and manageability, especially for large tables. Guiding your team through the careful planning and implementation of partitioning can lead to significant performance gains and simplified maintenance processes. By following the steps outlined in this blog, you can ensure a successful partitioning implementation that enhances your organization's data management capabilities. Table partitioning is not just a technical enhancement; it's a strategic move towards better data management and performance optimization. Embrace this powerful feature to keep your SQL Server environment robust and responsive.

SQL-Server

Seamless Migration: On-Premise Report Server to Azure SQL Managed Instance

Parshwa Kapadia

Apr 22, 2024

Introduction

The migration of an on-premise report server to Azure SQL Managed Instance requires strategic planning and meticulous execution. This transition offers numerous benefits, including scalability, reliability, and reduced maintenance overhead. In this blog, we'll explore the essential steps involved in migrating an on-premise report server to Azure SQL Managed Instance, ensuring a seamless transition for your organization.

Understanding Azure SQL Managed Instance

Before diving into the migration process, let's briefly understand Azure SQL Managed Instance. It is a fully managed platform as a service (PaaS) offering from Microsoft Azure, providing near-complete compatibility with on-premise SQL Server. Managed Instance offers features like automatic patching, automated backups, and built-in high availability, making it an attractive option for hosting SQL Server workloads in the cloud.

Pre-Requisites

1. Azure SQL Managed Instance
2. SQL Server user account, used to connect to Azure SQL Managed Instance
3. Azure Virtual Machine

Configure Azure SQL Managed Instance

1. Go to Azure Portal and search for Azure SQL Managed Instance.
2. Set up the username and password; they will be required to connect from SSMS and SSRS later.
3. Set up the required configuration.
4. Create the Azure SQL Managed Instance.
5. Create a new database (optional).
6. Open SSMS and verify the instance connection with SQL Server Authentication by entering the username and password from step 2.
7. If the connection succeeds, Azure SQL Managed Instance is configured correctly.

Configure Azure Virtual Machine

1. Go to Azure Portal and search for Virtual Machine.
2. Select the Windows Operating System and set up the required configurations.
3. Create a Virtual Machine and connect via RDP.

Install SSRS (SQL Server Reporting Services) in Azure VM

1. Connect to your Azure VM using RDP.
2. Download the 2022 SSRS installer.
3. Launch the 2022 SSRS installer.
4. Choose Install Reporting Services and click Next.
5. Choose the appropriate Edition to match your licensing, then click Next.
6. Accept the license and click Next.
7. Choose Install Reporting Services Only and click Next.
8. Change the Installation Location to a path of your choice, if you would like, then click Install.
9. Open Report Server Configuration Manager and click on Connect.
10. Start the Report Service if it's not started.

Connect On-Premises SQL Server

1. Connect to your on-premises SQL Server.
2. Take a backup of your ReportServer and ReportServerTempDB databases.
3. After both databases are backed up successfully, upload the backups to Azure Blob Storage.

Connect Azure SQL Managed Instance in SSMS

1. Connect to your Azure SQL Managed Instance with your credentials.
2. Generate a SAS token to access the Azure Blob Storage account.
3. Create new credentials in SQL Managed Instance.

CREATE CREDENTIAL [AZURE BLOB URL WITH CONTAINER/FOLDER]
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'SAS TOKEN';
GO

4. Restore the ReportServer and ReportServerTempDB databases.

RESTORE DATABASE ReportServer
FROM URL = 'AZURE BLOB URL OF DATABASE BACKUP FILE';
GO

RESTORE DATABASE ReportServerTempDB
FROM URL = 'AZURE BLOB URL OF DATABASE BACKUP FILE';
GO

5. Delete the old record from the ReportServer.dbo.Keys table based on MachineName or InstanceName.

DELETE ReportServer.[dbo].[Keys] WHERE MachineName = 'OLD MACHINE NAME'

6. To make all subscriptions visible on the new server, reassign their owner by executing the query below.

DECLARE @OldUserID uniqueidentifier
DECLARE @NewUserID uniqueidentifier
SELECT @OldUserID = UserID FROM dbo.Users WHERE UserName = 'OLD SERVER NAME WITH USER'
SELECT @NewUserID = UserID FROM dbo.Users WHERE UserName = 'NEW SERVER NAME WITH USER'
UPDATE dbo.Subscriptions SET OwnerID = @NewUserID WHERE OwnerID = @OldUserID

7. Restart SQL Server Reporting Service.
8. Open the Report Server in the browser to verify all the Reports and Subscriptions.

Configure SSRS (SQL Server Reporting Services) in Azure VM

1. Connect to your Azure VM using RDP.
2. Open Report Server Configuration Manager and click on Connect.
3. Start the Report Service if it's not started.
4. Go to Database and click on Change Database.
5. Choose the existing database option and click on Next.
6. Enter the database connection information of Azure SQL Managed Instance, test the connection, and click on Next. This step is important.
7. Inside credentials, choose the SQL Server Credentials option, enter the username and password of Azure SQL Managed Instance, and click on Next.
8. Verify the SQL Server Instance Name and other details in Summary and click on Next.
9. Click on Finish.
10. In Report Server Configuration Manager, select Web Service URL, then click Apply.
11. Go to Web Portal URL, then click Apply.
12. Go to E-mail Settings and update your email settings to send report subscription emails.
13. Open a browser and enter your report server Web Portal URL.

About the Author

Parshwa Kapadia

Project Lead at MagnusMinds IT Solution

A seasoned project lead with extensive experience in Database Management, SSIS, SSRS, PowerBI, .NET Development, Cloud Services, Version Control, CI/CD, Setup of Azure/AWS accounts, Infrastructure Setup, and Server Configuration. I have had the privilege of leading and contributing to a wide array of projects, ranging from database optimization and architecture design to cloud migration and software development. With a passion for technology and a commitment to excellence, I am excited to share my insights, experiences, and expertise with you. Join me on this journey as we explore the dynamic world of IT together.
