Why You Should NOT Use SELECT *

Database Star
20 Aug 202403:35

Summary

TLDRThe video script emphasizes the pitfalls of using 'SELECT *' in SQL queries, citing inefficiency, potential index misuse, vulnerability to table changes, and maintenance difficulties. It advocates for specifying columns to improve performance and maintainability, with the exception of ad-hoc queries for initial data exploration. The speaker also promotes SQL Cheat Sheets for mastering SQL syntax.

Takeaways

  • šŸš« Avoid using 'SELECT *' in production code as it can lead to inefficiencies and maintenance issues.
  • āš™ļø Writing 'SELECT *' might seem convenient, but it can waste resources by retrieving more data than necessary.
  • šŸ” Indexes may not be utilized effectively if all columns are selected, potentially slowing down query performance.
  • šŸ›  Be cautious, as 'SELECT *' can break your application if the table schema changes, like adding or removing columns.
  • šŸ“ 'SELECT *' queries are hard to maintain and understand, as they do not explicitly list the needed columns.
  • šŸ‘€ For quick checks or initial data exploration, 'SELECT *' can be useful, but not for feature development.
  • šŸ’” Using 'SELECT *' in ad-hoc queries helps in understanding the structure and content of a new database.
  • šŸ“ˆ When working with new datasets or databases, 'SELECT *' can assist in identifying column names and data types.
  • āŒ Refrain from using 'SELECT *' in production environments to prevent unexpected changes and performance issues.
  • šŸ”— For a better grasp on SQL syntax and commands, consider using cheat sheets to improve efficiency and accuracy.

Q & A

  • Why is it generally advised not to use 'SELECT *' in SQL queries?

    -Using 'SELECT *' can be inefficient because it retrieves all columns from a table, even if only a few are needed, leading to unnecessary data transfer and potential waste of resources.

  • How can 'SELECT *' affect the use of indexes in a database?

    -Indexes are optimized for specific query patterns. If a query uses 'SELECT *', the database might decide not to use an index that could have improved performance for a more targeted column selection.

  • What is the potential risk of using 'SELECT *' when the table schema changes?

    -If a table's schema changes, such as adding or removing columns, a 'SELECT *' query will automatically include these changes without any modifications to the query itself, which can lead to unexpected results or performance issues.

  • How does using 'SELECT *' impact the maintainability of code?

    -'SELECT *' makes it difficult to understand which columns are being returned, complicating the process of working with the data in application code or reporting tools, and thus making the code harder to maintain.

  • In what scenario is it acceptable to use 'SELECT *' according to the script?

    -Using 'SELECT *' is acceptable for quick, ad-hoc queries where the goal is to explore the contents of a table, such as when examining a new database or loading a sample dataset.

  • What is the purpose of an index in a database?

    -An index in a database is an object that helps improve the performance of queries by allowing the database to quickly look up data without scanning every row in a table.

  • Why might a database choose not to use an index when a 'SELECT *' query is executed?

    -A database may opt not to use an index with a 'SELECT *' query because the query does not specify a pattern that the index is designed to optimize, leading to a full table scan instead.

  • What is the impact of retrieving more data than necessary on client-side performance?

    -Retrieving more data than necessary can slow down the client-side performance as the client has to process and possibly display a larger amount of data, which can be inefficient and wasteful.

  • How can a 'SELECT *' query affect the results or performance of a query when the table schema is altered?

    -A 'SELECT *' query can cause the results to change or the query to slow down if the table schema is altered, such as when a column is added or removed, because the query will automatically reflect these changes without any manual intervention.

  • What is the recommendation for using 'SELECT *' in feature development or production code?

    -It is recommended not to use 'SELECT *' in feature development or production code due to its inefficiency, potential for performance issues, and difficulty in maintaining the code.

  • What resource is mentioned in the script for those who want a quick reference guide for SQL commands and features?

    -The script mentions SQL Cheat Sheets as a resource for those who need a quick reference guide for SQL commands and features to avoid forgetting the syntax.

Outlines

00:00

šŸš« Avoiding 'SELECT *' in Queries

This paragraph discusses the common but inefficient practice of using 'SELECT *' in SQL queries. It explains that while it's a quick way to retrieve all columns from a database table, it can lead to inefficiencies due to unnecessary data transfer. The speaker also points out that it may prevent the use of indexes, which are crucial for query performance. Additionally, it highlights the risk of code breakage if the table schema changes and the difficulty in maintaining such queries in the long run. However, an exception is made for quick, ad-hoc queries where the goal is to explore the table's contents.

Mindmap

Keywords

šŸ’”Select *

Select * is a shorthand in SQL for selecting all columns from a database table. It is often used for convenience, as it requires less typing than specifying each column individually. However, the video script emphasizes that using Select * can lead to inefficiencies and potential issues in database management and query performance, which is central to the video's message on best practices in SQL query writing.

šŸ’”Inefficiency

In the context of the video, inefficiency refers to the unnecessary use of resources, such as bandwidth and processing power, when a database returns more data than what is actually needed by the application. This occurs when Select * is used, causing the database to send all columns to the client, even if only a few are utilized.

šŸ’”Indexes

Indexes in databases are data structures that improve the speed of data retrieval operations. They allow the database to quickly locate data without scanning every row in a table. The script mentions that using Select * might prevent the database from utilizing indexes effectively, as the decision to use an index can depend on the specific columns being queried.

šŸ’”Query Performance

Query performance is a measure of how quickly a database can retrieve data in response to a query. The video script suggests that using Select * can negatively impact query performance because it may prevent the use of indexes and result in the retrieval of excess data, leading to slower query execution times.

šŸ’”SQL Cheat Sheets

SQL Cheat Sheets are reference materials that provide a quick overview of SQL commands and syntax. The script mentions them as a helpful tool for developers to avoid forgetting SQL syntax, indicating their utility in maintaining and improving SQL query writing skills.

šŸ’”Code Maintenance

Code maintenance refers to the process of altering existing code to correct faults, improve performance, or add functionality. The video script points out that using Select * can make code harder to maintain because it obscures which columns are being returned, complicating the process of updating or debugging the code.

šŸ’”Ad-hoc Queries

Ad-hoc queries are informal and typically one-time queries used to gather information or perform analysis on an as-needed basis. The script makes an exception for using Select * in this context, as it can be a quick way to explore the contents of a database table without the need for a formal query structure.

šŸ’”Feature Development

Feature development in software refers to the creation and implementation of new features or functionalities. The video script advises against using Select * in feature development or production code due to the potential for reduced efficiency and maintainability.

šŸ’”Production Code

Production code is the final, stable version of code that is deployed in a live environment for end-users. The script emphasizes that Select * should not be used in production code to avoid the risks associated with its use, such as inefficiency and the potential for code breakage.

šŸ’”Data Analysis

Data analysis involves inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making. The video script connects data analysis with the use of SQL for exploring and understanding the data within a database.

šŸ’”Sample Database

A sample database is a pre-populated database used for testing, learning, or demonstration purposes. The script mentions using Select * when working with a sample database to quickly understand the structure and content of the data it contains.

Highlights

The transcript discusses the common practice of using 'SELECT *' in SQL queries and its potential drawbacks.

Writing 'SELECT *' is considered easy and time-saving, but it has four main reasons why it's not recommended for production code.

Inefficiency is the first reason highlighted, as 'SELECT *' retrieves more data than necessary, leading to wasteful data transfer.

The second reason is that using 'SELECT *' might prevent the database from utilizing indexes, which could slow down query performance.

An SQL Cheat Sheet is mentioned as a helpful resource for common SQL commands and features, aiding in syntax retention.

The third reason against 'SELECT *' is that it can break code if the table schema changes, such as adding or removing columns.

The fourth reason is maintainability; 'SELECT *' queries are harder to read and work with in application and reporting tools.

An exception is presented for using 'SELECT *' in quick or ad-hoc queries for initial data exploration.

The use of 'SELECT *' is encouraged for understanding table contents and column names when working with new databases or datasets.

The transcript advises against using 'SELECT *' in feature development or production code due to its disadvantages.

The importance of data analysis and understanding what data represents is emphasized for those working with SQL.

A follow-up video on loading a sample database and using SQL to answer questions about the data is recommended for further learning.

The transcript provides insights into best practices for writing efficient and maintainable SQL queries.

The potential impact of 'SELECT *' on application performance and the importance of considering query efficiency are discussed.

The transcript suggests that while 'SELECT *' is convenient, it should be used judiciously to avoid long-term issues.

The importance of understanding the implications of 'SELECT *' on database performance and code stability is highlighted.

The transcript concludes by reiterating the recommendation to avoid 'SELECT *' in production environments for better code health.

Transcripts

play00:00

You probably do this all the time.

play00:02

I know I do. But we shouldnā€™t.

play00:04

And Iā€™ll tell you why.

play00:06

First, what am I talking about?

play00:08

This is about writing Select * for your queries.

play00:12

Why do we write Select *?

play00:15

Well, itā€™s just easy. We can use a singleĀ  character to get all of the columns from a table.

play00:20

In our application or reportingĀ  tool that runs the query,Ā Ā 

play00:23

we can just ignore the columns that we donā€™t need.

play00:26

It saves time.

play00:28

However, there are four reasons whyĀ  I recommend not using Select * inĀ Ā 

play00:32

code that youā€™re using that you want toĀ  keep or deployā€¦ but thereā€™s one exception.

play00:36

The first reason is that aĀ  query like this is inefficient.

play00:40

You use Select Star and theĀ  database returns all of theĀ Ā 

play00:43

columns from the table. Itā€™s likely thatĀ  you donā€™t need all of the columns. So,Ā Ā 

play00:48

you work with some of the columnsĀ  from the table and ignore the rest.

play00:51

This means the database is returning moreĀ  data than you need back to the client,Ā Ā 

play00:55

which is inefficient and wasteful.

play00:58

The second reason is thatĀ  indexes might not be used.

play01:02

Indexes are database objects that help toĀ  improve the performance of your queries.

play01:07

They are created to help with a specific type orĀ  range of queries, and the columns that are beingĀ Ā 

play01:12

retrieved from a table are one of the factors thatĀ  the database uses to decide which index to use.

play01:18

If the query retrieves all columns in a table,Ā  the database may decide not to use an index,Ā Ā 

play01:24

but the index could have been used if the queryĀ  only retrieved the few columns you needed.

play01:30

This can cause the query to run slower thanĀ  you want, and it can be hard to improve.

play01:34

If you want a quick reference guide forĀ  many common SQL commands and features,Ā Ā 

play01:38

and stop forgetting the syntax for SQL,Ā Ā 

play01:41

youā€™ll love the SQL Cheat Sheets I created.Ā  Get your copy at the link in the description.

play01:46

The third reason is that your query or codeĀ  can break if changes are made to the table.

play01:53

A SELECT * query does not specifyĀ  the columns in the Select clause.

play01:57

If a column is added to the table, it will showĀ  up in the Select * query without any changes.

play02:03

If a column is removed from the table,Ā  it will no longer show in the query.

play02:08

This may seem OK, but it can have anĀ  undesired effect on your queries. TheĀ Ā 

play02:13

modification of a query canĀ  cause the results to change,Ā Ā 

play02:16

or the query to slow down, even thoughĀ  you didnā€™t change the query itself.

play02:21

Reason number 4 is that a SelectĀ  * query is hard to maintain.

play02:25

With a Select * query, itā€™s hard to read itĀ  and know what columns are being returned.

play02:31

If youā€™re using it in application code,Ā Ā 

play02:33

you donā€™t know what the columns are calledĀ  so you canā€™t work with them in your code.

play02:38

In a reporting tool, itā€™sĀ  also harder to work with them.

play02:41

You need to take the extra step of looking intoĀ  the table to see what the columns are called,Ā Ā 

play02:46

which can slow things down andĀ  make the code harder to work with.

play02:50

However, thereā€™s one exception whereĀ  I think itā€™s OK to use select *.

play02:54

And that is for quick or ad-hoc queries, whereĀ  you just want to see whatā€™s inside a table.

play02:59

I use it a lot when looking at data in tables,Ā Ā 

play03:01

when Iā€™ve loaded a sample data set, orĀ  when Iā€™m starting to work with a newĀ Ā 

play03:05

database. It helps to understand whatā€™s inĀ  the table and what the columns are called.

play03:11

But, it shouldnā€™t be used for anyĀ  feature development or production code.

play03:15

Speaking of data analysis, exploringĀ  a database and finding out what theĀ Ā 

play03:19

data represents is a common taskĀ  for all of us working with SQL.

play03:23

Youā€™ll want to watch this video next to seeĀ  an example of loading a sample database,Ā Ā 

play03:27

querying the tables to understand the data, andĀ  using SQL to answer some questions about the data.

play03:33

Thanks for watching.

Rate This
ā˜…
ā˜…
ā˜…
ā˜…
ā˜…

5.0 / 5 (0 votes)

Related Tags
SQL Best PracticesDatabase EfficiencyQuery OptimizationIndex UtilizationCode MaintenanceAd-Hoc QueriesData AnalysisSELECT StarPerformance IssueSQL Cheat Sheet