How to: tackle database performance issues


HDDIT companies spend millions of dollars trying to recover losses incurred due to poor application performance. I am sure each one of us has complained about a machine or application being slow or even dead, and then spent time at the coffee machine waiting for the results of a long running query. How can we fix that?

Most of the business applications or systems are designed to retrieve and/or write information to a local hard disk or a database system.

Consider a typical multi-tier architecture. It will contain the client tier, web tier, application tier, and data tier as shown below.


The data tier represents the database and mainly acts as the storage/manager for business data. Usually when an end-user/client requests some information or executes a query on the client tier, he/she expects to have a response ASAP. However the client tier has to talk to the data tier in order to get back the appropriate information to the client. This might take a few microseconds or sometimes even a few hours depending on several parameters.

Common parameters responsible for such delays include:

  1. Architecture of the system
  2. Algorithm
  3. Code complexity
  4. Unoptimized SQL queries
  5. Hardware (CPUs, RAM)
  6. Number of users
  7. Network traffic
  8. Database size
  9. Etc.

Out of all these parameters, unoptimized SQL queries contribute to the majority (around 60-70%) of database performance issues.


To avoid these delays, let’s look at some common database optimization approaches. There are three main approaches to go about optimizing databases:

  1. Optimize the database server hardware and network usage. This involves changing the hardware to a specific configuration to speed up the read/write onto the database. For example, use RAID 10 if there are equal read/write activities or RAID 5 if there are more read operations. This task is often performed as part of the deployment planning or infrastructure planning in the requirement analysis phase of the Software Development Lifecycle (SDLC). This exercise is also referred as hardware sizing.
  2. Optimize the database design. This involves the normalization of database. For example, you can go up to the third normal form of normalization, which will definitely help to make the database quicker. Usually this task is carried out during the design phase of the SDLC.
  3. Optimize the database queries. This involves studying the query plan, analyzing the queries for use of indexes, and joining and simplifying the queries for better performance. It is the most critical and effective approach for optimizing the database. The activity of query optimization can start in implementation phase and continue during testing, evolution, and maintenance phases.


In this post, I will focus on the database/SQL query optimization techniques/guidelines alone. The idea is to help tackle some of the critical database performance issues.


Many databases come with a built-in optimizer, which performs optimization and helps improve the performance to a certain extent. However the results are not always promising. There are database monitoring tools which only capture information on the resources consumed by the database servers. This can help address 20% of the performance issues. The best way to go about query optimization is to review the functional areas, which take a long time to respond mainly because of the underlying SQL queries.

Below I have tried to list a few SQL rules with examples, based on experience and best practices, which can help optimize the database to a great extent.

  1. Not using “WHERE” clause to filter the data will return all records and therefore make the system very slow Example 1: Original Query 1: select * from Production.TransactionHistory Returns 113443 rows Optimized Query 2: select * from Production.TransactionHistory where ProductID=712 Returns 2348 rows CAST-How-to-tackle-database-performance-issues-example-1 As the number of records retrieved are less (using “where” clause) the query executes much faster.

  3. Not using required column names in the “SELECT” part will take more time to return the same number of rows. Example 2: Original Query 1: select * from Production.TransactionHistory where ProductID=712 Returns 2348 rows Optimized Query 2: select TransactionID, Quantity from Production.TransactionHistory where ProductID=712 Returns 2348 rows CAST-How-to-tackle-database-performance-issues-example-2 Examples 1 & 2 might look quite obvious, but the idea is to think in a filtering mode and fetch the optimal set of data required for your application.

  5. Using Cartesian joins that lead to Cartesian products kills performance, especially when large data sets are involved. A Cartesian join is a multiple-table query that does not explicitly state a join condition among the tables and results in a Cartesian product. Example 3: Query 1: select count(*) from Production.Product Return 504 rows Query 2: select count(*) from Production.TransactionHistory Return 113443 rows Query 3: select count(*) from Production.Product, Production.TransactionHistory Return 57175272 rows (= 504 x 113443) -> Cartesian Product -------------------------------------------------------------------------- Original Query 4: select P.ProductID,TH.Quantity from Production.Product P, Production.TransactionHistory TH Return 57175272 rows Optimized Query 5: select P.ProductID,TH.Quantity from Production.Product P, Production.TransactionHistory TH where P.ProductID = TH.ProductID Return 113443 rows CAST-How-to-tackle-database-performance-issues-example-3

  7. Use Joins on indexed columns as much as possible. Example 4: Query 1: select P.ProductID,TH.Quantity from Production.Product P, Production.TransactionHistory TH where P.ProductID = TH.ProductID CAST-How-to-tackle-database-performance-issues-example-4 Execute the query without any index for the first time. Re-run the same query after adding an index on column ProductID.

  9. Avoid full table scans when dealing with larger tables. To prevent full table scans, we can add clustered indexes on the key columns with distinct values. Example 5: Query 1: select P.ProductID,PIn.LocationID from Production.Product P,Production.ProductInventory PIn where P.ProductID = PIn.ProductID  


    • A. Execute the query without any indexes for the first time. It will table scan by default. (Execution plan showing table scans for Product and ProductInventory tables below.) CAST-How-to-tackle-database-performance-issues-example-5A
    • B. Re-run the same query after adding clustered indexs on one of columns (LocationID). (Execution plan showing clustered index scan on the table ProductInventory and table scan on table Product below.) CAST-How-to-tackle-database-performance-issues-example-5B
    • C. Re-run the same query after adding indexes on both columns ProductID and LocationID to avoid table scan. (Execution Plan using index scan for both the tables Product and ProductInventory below.) CAST-How-to-tackle-database-performance-issues-example-5C

    In some cases, where there are not many unique values, a table scan can be more efficient as indexes will not be used.

  10. In general, subqueries tend to degrade database performance. In many cases, the alternate option is to use joins. Example 6: Non-correlated sub-query Original Query 1: SELECT Name,ProductID FROM Production.Product WHERE ProductID NOT IN (SELECT ProductID FROM Production.TransactionHistory) Correlated sub-query Original Query 2: SELECT Name,ProductID FROM Production.Product P WHERE NOT EXISTS (SELECT ProductID FROM Production.TransactionHistory where P.ProductID=ProductID) Replace sub-query by join Optimized Query 3: SELECT Name,P.ProductID FROM Production.Product P LEFT OUTER JOIN Production.TransactionHistory TH On (P.ProductID = TH.ProductID) where TH.ProductID is NULL CAST-How-to-tackle-database-performance-issues-example-6

  12. Too many indexes on a single table are costly. Every index increases the time it takes to perform INSERTS, UPDATES and DELETES, so the number of indexes should not be too high. There is an additional disk and memory cost involved with each index. The best solution is to monitor your system performance. If there is need to improve the performance, you can go for more indexes.

  14. When a SQL query is executed in a loop, it will require several roundtrips between the client and the database server. This consumes network resources/bandwidth and hurts performance. Therefore SQL queries inside loops should be avoided. The recommended workaround is to create one query using a temporary table. In this case, only one network round trip will be required. And further optimize the query.

There are few database analyzers in the market which check the SQL code against such rules and help identifying the weak SQL queries.

I will continue to blog on this subject to cover advanced optimizing guidelines linked to Stored Procedures, Cursors, Views and Dynamic SQL. Hope this post gives you a few tips to identify and resolve database performance issues.

Please feel free to share your feedback/questions on this blog, or experiences with any tools you have tried for database optimization.

Filed in:
  This report describes the effects of different industrial factors on  structural quality. Structural quality differed across technologies with COBOL  applications generally having the lowest densities of critical weaknesses,  while JAVA-EE had the highest densities. While structural quality differed  slightly across industry segments, there was almost no effect from whether the  application was in- or outsourced, or whether it was produced on- or off-shore.  Large variations in the densities in critical weaknesses across applications  suggested the major factors in structural quality are more related to  conditions specific to each application. CRASH Report 2020: CAST Research on  the Structural Condition of Critical Applications Report
Open source is part of almost every software capability we use today. At the  very least libraries, frameworks or databases that get used in mission critical  IT systems. In some cases entire systems being build on top of open source  foundations. Since we have been benchmarking IT software for years, we thought  we would set our sights on some of the most commonly used open source software  (OSS) projects. Software Intelligence Report <> Papers
Making sense of cloud transitions for financial and telecoms firms Cloud  migration 2.0: shifting priorities for application modernization in 2019  Research Report
Jayesh Golatkar
Jayesh Golatkar Associate VP of Product Development at Deloitte India
Dynamic professional with right attitude and strong technical background. Passionate about Product development & conceiving new ideas. An Inspiring lead, effective communicator with excellent team building & interpersonal skills. Believes in leading by example. Successful in building & executing Product strategies. Proven ability to drive teams & product to success.
Load more reviews
Thank you for the review! Your review must be approved first
New code

You've already submitted a review for this item