I hope you are all are safe and doing well! I want to share one small thing with you all that happened to me on 11th June 2020. It may help you to know how important can small things be. Let’s begin!

I was assigned a task to import real data provided by the client in excel at 11 am approx. After I import real data, I had to do some database operations to make the tool ready for testing. I had enough time to do this as it’s hardly a work of an hour or two.

I started with the import process to a new table first. Generally, I use the import-export tool provided by the SQL server itself. But it takes time to follow each step in that tool. My senior helped me make this process faster by providing me SQL query, which is useful to import data from an excel file.

sp_configure 'show advanced options', 1;

RECONFIGURE;

GO

sp_configure 'Ad Hoc Distributed Queries', 1;

RECONFIGURE;

GO

SELECT * INTO Data_dq

FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',

'Excel 12.0; Database=E:\Sample.xlsx;Extended Properties=Excel 12.0', [Sheet1$]);

Till now, all good! All the data got imported successfully and quickly. There were around 100,000 records in file with 200+ columns. After a few minutes, I found that the data types for some fields are not proper, so the imported data is not proper. Hence I decided to change all columns data type to nvarchar(MAX). I executed the below query now:

INSERT INTO Data_d(Column1, Column2, Column3, ..., Column200)

SELECT Column1, Column2, Column3, ..., Column200

FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',

'Excel 12.0; Database=E:\Sample.xlsx;Extended Properties=Excel 12.0', [Sheet1$]);

All the data imported successfully, and all of it seemed to be proper now.

It was around 12 pm now. Still, I had enough time in my hand. The client called me up and asked me to get the tool ready before 7 pm and informed me about how important it is. The call took around half an hour. I took a lunch break and started to work again at 1:30 pm.

I started work with checking imported data. I cross-checked the data. As the data was valid, so I started performing the next steps around 3 pm. It was going well so far, and near about 4 pm, I performed one query to update one field. That field was representing the tree path of that node from top to bottom. I waited for a few minutes. I spent half an hour on it, but it was still running. I though it’s taking time because of the massive amount of data.

One hour had gone by, and the query was still executing. I was also not wanting to cancel this query execution because I was worried if it will take this much time once again. I was unsure whether the query was in an infinite loop, or the SQL server got hanged. So, I updated my senior regarding this issue. It was 5 pm already, and I was feeling the pressure now.

Senior then told me to execute one select query to know how many records were updated. I see there were around 23000 records that were updated in 1:30 hours. We knew that the query was running and updating records, but it was very slow. We were not able to cover up 100,000 records with this speed.

Now we started thinking in another direction to improve its speed. We had only 2 hours to meet the deadline. We have decided to keep this process running and thought to play with it on another server by creating a copy of this database. If in case we do not get to speed up the existing process, then at least update more data. Something is better than nothing.

Created DB on another server, and we turned off encryption on that database as it was one of the grey areas in our mind. We then tried to add the primary key in the table as it generates a clustered index, which helps us speed up. But it was throwing timeout error while creating primary key (maybe because of a considerable amount of data). We tried to create indexes on two columns, which were querying to update records. But we were unable to create any indexes as it did not show any columns which can be indexed. It was a surprise to us as to why we were unable to create indexes.

We also tried to search the Internet about it, but we felt that we could not waste time, due to the pressure. Only 1 hour to go now. We thought that might be because of nvarchar field; it is not allowing us to create indexes. At this point, we were unaware that indexes could not be created on nvarchar(max). So, we added two new columns and given it INT datatype, but again, it was throwing time out error while I tried to save. The pressure was increasing.

We truncated the table and then added a primary key as well as added two INT columns. Then we imported data in this table again. We were successful this time! It was a good sign for us. After that, we have added INT values in those two new columns, and then we have created indexes on those two columns as well.

We were really out of time as we had some more steps to perform, and then we had to do some testing as well. So, we informed the client that we will need some more time. The client was a bit disappointed. For him, it was just an hour of work. He asked us to continue working on it and told us that he would try to buy one more hour from the end client. We were also not so happy with this delay.

Then we checked how many records were updated on the previous server. It was around 43000 only after continuous execution of 3:30 hours. We were going to execute the same steps again with indexes and the primary key. Surprisingly it took only 12 seconds to update all the records!! After this point, we relaxed a bit and gained some confidence and followed all other steps. All the steps were quickly executed, and we were able to meet the deadline this time.

A small thing makes a big difference sometimes!!!

Need an Expert Web & Mobile Application
Developer for your project?

Related blogs

SEO vs. PPC Which is Better for Your Business?

Explore the differences between SEO vs. PPC.

Custom Mobile App Development: A Comprehensive Guide 2024

This blog post delves into the world of custom mobile app development, exploring its benefits, the process involved, and how it empowers businesses to stay ahead in the competitive market.

Top Reason to Choose MEAN Stack for Web Development

MEAN stands for MongoDB, ExpressJS, AngularJS and Node.js – a popular set of technologies for building scalable and rich web applications.

All Rights Reserved. Copyright © 2024 | TechAvidus