Why You Should NOT Use SELECT *

Database Star
20 Aug 202403:35

Summary

TLDRThe video script emphasizes the pitfalls of using 'SELECT *' in SQL queries, citing inefficiency, potential index misuse, vulnerability to table changes, and maintenance difficulties. It advocates for specifying columns to improve performance and maintainability, with the exception of ad-hoc queries for initial data exploration. The speaker also promotes SQL Cheat Sheets for mastering SQL syntax.

Takeaways

  • 🚫 Avoid using 'SELECT *' in production code as it can lead to inefficiencies and maintenance issues.
  • ⚙️ Writing 'SELECT *' might seem convenient, but it can waste resources by retrieving more data than necessary.
  • 🔍 Indexes may not be utilized effectively if all columns are selected, potentially slowing down query performance.
  • 🛠 Be cautious, as 'SELECT *' can break your application if the table schema changes, like adding or removing columns.
  • 📝 'SELECT *' queries are hard to maintain and understand, as they do not explicitly list the needed columns.
  • 👀 For quick checks or initial data exploration, 'SELECT *' can be useful, but not for feature development.
  • 💡 Using 'SELECT *' in ad-hoc queries helps in understanding the structure and content of a new database.
  • 📈 When working with new datasets or databases, 'SELECT *' can assist in identifying column names and data types.
  • ❌ Refrain from using 'SELECT *' in production environments to prevent unexpected changes and performance issues.
  • 🔗 For a better grasp on SQL syntax and commands, consider using cheat sheets to improve efficiency and accuracy.

Q & A

  • Why is it generally advised not to use 'SELECT *' in SQL queries?

    -Using 'SELECT *' can be inefficient because it retrieves all columns from a table, even if only a few are needed, leading to unnecessary data transfer and potential waste of resources.

  • How can 'SELECT *' affect the use of indexes in a database?

    -Indexes are optimized for specific query patterns. If a query uses 'SELECT *', the database might decide not to use an index that could have improved performance for a more targeted column selection.

  • What is the potential risk of using 'SELECT *' when the table schema changes?

    -If a table's schema changes, such as adding or removing columns, a 'SELECT *' query will automatically include these changes without any modifications to the query itself, which can lead to unexpected results or performance issues.

  • How does using 'SELECT *' impact the maintainability of code?

    -'SELECT *' makes it difficult to understand which columns are being returned, complicating the process of working with the data in application code or reporting tools, and thus making the code harder to maintain.

  • In what scenario is it acceptable to use 'SELECT *' according to the script?

    -Using 'SELECT *' is acceptable for quick, ad-hoc queries where the goal is to explore the contents of a table, such as when examining a new database or loading a sample dataset.

  • What is the purpose of an index in a database?

    -An index in a database is an object that helps improve the performance of queries by allowing the database to quickly look up data without scanning every row in a table.

  • Why might a database choose not to use an index when a 'SELECT *' query is executed?

    -A database may opt not to use an index with a 'SELECT *' query because the query does not specify a pattern that the index is designed to optimize, leading to a full table scan instead.

  • What is the impact of retrieving more data than necessary on client-side performance?

    -Retrieving more data than necessary can slow down the client-side performance as the client has to process and possibly display a larger amount of data, which can be inefficient and wasteful.

  • How can a 'SELECT *' query affect the results or performance of a query when the table schema is altered?

    -A 'SELECT *' query can cause the results to change or the query to slow down if the table schema is altered, such as when a column is added or removed, because the query will automatically reflect these changes without any manual intervention.

  • What is the recommendation for using 'SELECT *' in feature development or production code?

    -It is recommended not to use 'SELECT *' in feature development or production code due to its inefficiency, potential for performance issues, and difficulty in maintaining the code.

  • What resource is mentioned in the script for those who want a quick reference guide for SQL commands and features?

    -The script mentions SQL Cheat Sheets as a resource for those who need a quick reference guide for SQL commands and features to avoid forgetting the syntax.

Outlines

00:00

🚫 Avoiding 'SELECT *' in Queries

This paragraph discusses the common but inefficient practice of using 'SELECT *' in SQL queries. It explains that while it's a quick way to retrieve all columns from a database table, it can lead to inefficiencies due to unnecessary data transfer. The speaker also points out that it may prevent the use of indexes, which are crucial for query performance. Additionally, it highlights the risk of code breakage if the table schema changes and the difficulty in maintaining such queries in the long run. However, an exception is made for quick, ad-hoc queries where the goal is to explore the table's contents.

Mindmap

Keywords

💡Select *

Select * is a shorthand in SQL for selecting all columns from a database table. It is often used for convenience, as it requires less typing than specifying each column individually. However, the video script emphasizes that using Select * can lead to inefficiencies and potential issues in database management and query performance, which is central to the video's message on best practices in SQL query writing.

💡Inefficiency

In the context of the video, inefficiency refers to the unnecessary use of resources, such as bandwidth and processing power, when a database returns more data than what is actually needed by the application. This occurs when Select * is used, causing the database to send all columns to the client, even if only a few are utilized.

💡Indexes

Indexes in databases are data structures that improve the speed of data retrieval operations. They allow the database to quickly locate data without scanning every row in a table. The script mentions that using Select * might prevent the database from utilizing indexes effectively, as the decision to use an index can depend on the specific columns being queried.

💡Query Performance

Query performance is a measure of how quickly a database can retrieve data in response to a query. The video script suggests that using Select * can negatively impact query performance because it may prevent the use of indexes and result in the retrieval of excess data, leading to slower query execution times.

💡SQL Cheat Sheets

SQL Cheat Sheets are reference materials that provide a quick overview of SQL commands and syntax. The script mentions them as a helpful tool for developers to avoid forgetting SQL syntax, indicating their utility in maintaining and improving SQL query writing skills.

💡Code Maintenance

Code maintenance refers to the process of altering existing code to correct faults, improve performance, or add functionality. The video script points out that using Select * can make code harder to maintain because it obscures which columns are being returned, complicating the process of updating or debugging the code.

💡Ad-hoc Queries

Ad-hoc queries are informal and typically one-time queries used to gather information or perform analysis on an as-needed basis. The script makes an exception for using Select * in this context, as it can be a quick way to explore the contents of a database table without the need for a formal query structure.

💡Feature Development

Feature development in software refers to the creation and implementation of new features or functionalities. The video script advises against using Select * in feature development or production code due to the potential for reduced efficiency and maintainability.

💡Production Code

Production code is the final, stable version of code that is deployed in a live environment for end-users. The script emphasizes that Select * should not be used in production code to avoid the risks associated with its use, such as inefficiency and the potential for code breakage.

💡Data Analysis

Data analysis involves inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making. The video script connects data analysis with the use of SQL for exploring and understanding the data within a database.

💡Sample Database

A sample database is a pre-populated database used for testing, learning, or demonstration purposes. The script mentions using Select * when working with a sample database to quickly understand the structure and content of the data it contains.

Highlights

The transcript discusses the common practice of using 'SELECT *' in SQL queries and its potential drawbacks.

Writing 'SELECT *' is considered easy and time-saving, but it has four main reasons why it's not recommended for production code.

Inefficiency is the first reason highlighted, as 'SELECT *' retrieves more data than necessary, leading to wasteful data transfer.

The second reason is that using 'SELECT *' might prevent the database from utilizing indexes, which could slow down query performance.

An SQL Cheat Sheet is mentioned as a helpful resource for common SQL commands and features, aiding in syntax retention.

The third reason against 'SELECT *' is that it can break code if the table schema changes, such as adding or removing columns.

The fourth reason is maintainability; 'SELECT *' queries are harder to read and work with in application and reporting tools.

An exception is presented for using 'SELECT *' in quick or ad-hoc queries for initial data exploration.

The use of 'SELECT *' is encouraged for understanding table contents and column names when working with new databases or datasets.

The transcript advises against using 'SELECT *' in feature development or production code due to its disadvantages.

The importance of data analysis and understanding what data represents is emphasized for those working with SQL.

A follow-up video on loading a sample database and using SQL to answer questions about the data is recommended for further learning.

The transcript provides insights into best practices for writing efficient and maintainable SQL queries.

The potential impact of 'SELECT *' on application performance and the importance of considering query efficiency are discussed.

The transcript suggests that while 'SELECT *' is convenient, it should be used judiciously to avoid long-term issues.

The importance of understanding the implications of 'SELECT *' on database performance and code stability is highlighted.

The transcript concludes by reiterating the recommendation to avoid 'SELECT *' in production environments for better code health.

Transcripts

play00:00

You probably do this all the time.

play00:02

I know I do. But we shouldn’t.

play00:04

And I’ll tell you why.

play00:06

First, what am I talking about?

play00:08

This is about writing Select * for your queries.

play00:12

Why do we write Select *?

play00:15

Well, it’s just easy. We can use a single  character to get all of the columns from a table.

play00:20

In our application or reporting  tool that runs the query,  

play00:23

we can just ignore the columns that we don’t need.

play00:26

It saves time.

play00:28

However, there are four reasons why  I recommend not using Select * in  

play00:32

code that you’re using that you want to  keep or deploy… but there’s one exception.

play00:36

The first reason is that a  query like this is inefficient.

play00:40

You use Select Star and the  database returns all of the  

play00:43

columns from the table. It’s likely that  you don’t need all of the columns. So,  

play00:48

you work with some of the columns  from the table and ignore the rest.

play00:51

This means the database is returning more  data than you need back to the client,  

play00:55

which is inefficient and wasteful.

play00:58

The second reason is that  indexes might not be used.

play01:02

Indexes are database objects that help to  improve the performance of your queries.

play01:07

They are created to help with a specific type or  range of queries, and the columns that are being  

play01:12

retrieved from a table are one of the factors that  the database uses to decide which index to use.

play01:18

If the query retrieves all columns in a table,  the database may decide not to use an index,  

play01:24

but the index could have been used if the query  only retrieved the few columns you needed.

play01:30

This can cause the query to run slower than  you want, and it can be hard to improve.

play01:34

If you want a quick reference guide for  many common SQL commands and features,  

play01:38

and stop forgetting the syntax for SQL,  

play01:41

you’ll love the SQL Cheat Sheets I created.  Get your copy at the link in the description.

play01:46

The third reason is that your query or code  can break if changes are made to the table.

play01:53

A SELECT * query does not specify  the columns in the Select clause.

play01:57

If a column is added to the table, it will show  up in the Select * query without any changes.

play02:03

If a column is removed from the table,  it will no longer show in the query.

play02:08

This may seem OK, but it can have an  undesired effect on your queries. The  

play02:13

modification of a query can  cause the results to change,  

play02:16

or the query to slow down, even though  you didn’t change the query itself.

play02:21

Reason number 4 is that a Select  * query is hard to maintain.

play02:25

With a Select * query, it’s hard to read it  and know what columns are being returned.

play02:31

If you’re using it in application code,  

play02:33

you don’t know what the columns are called  so you can’t work with them in your code.

play02:38

In a reporting tool, it’s  also harder to work with them.

play02:41

You need to take the extra step of looking into  the table to see what the columns are called,  

play02:46

which can slow things down and  make the code harder to work with.

play02:50

However, there’s one exception where  I think it’s OK to use select *.

play02:54

And that is for quick or ad-hoc queries, where  you just want to see what’s inside a table.

play02:59

I use it a lot when looking at data in tables,  

play03:01

when I’ve loaded a sample data set, or  when I’m starting to work with a new  

play03:05

database. It helps to understand what’s in  the table and what the columns are called.

play03:11

But, it shouldn’t be used for any  feature development or production code.

play03:15

Speaking of data analysis, exploring  a database and finding out what the  

play03:19

data represents is a common task  for all of us working with SQL.

play03:23

You’ll want to watch this video next to see  an example of loading a sample database,  

play03:27

querying the tables to understand the data, and  using SQL to answer some questions about the data.

play03:33

Thanks for watching.

Rate This

5.0 / 5 (0 votes)

相关标签
SQL Best PracticesDatabase EfficiencyQuery OptimizationIndex UtilizationCode MaintenanceAd-Hoc QueriesData AnalysisSELECT StarPerformance IssueSQL Cheat Sheet
您是否需要英文摘要?