Why You Should NOT Use SELECT *
Summary
TLDRThe video script emphasizes the pitfalls of using 'SELECT *' in SQL queries, citing inefficiency, potential index misuse, vulnerability to table changes, and maintenance difficulties. It advocates for specifying columns to improve performance and maintainability, with the exception of ad-hoc queries for initial data exploration. The speaker also promotes SQL Cheat Sheets for mastering SQL syntax.
Takeaways
- š« Avoid using 'SELECT *' in production code as it can lead to inefficiencies and maintenance issues.
- āļø Writing 'SELECT *' might seem convenient, but it can waste resources by retrieving more data than necessary.
- š Indexes may not be utilized effectively if all columns are selected, potentially slowing down query performance.
- š Be cautious, as 'SELECT *' can break your application if the table schema changes, like adding or removing columns.
- š 'SELECT *' queries are hard to maintain and understand, as they do not explicitly list the needed columns.
- š For quick checks or initial data exploration, 'SELECT *' can be useful, but not for feature development.
- š” Using 'SELECT *' in ad-hoc queries helps in understanding the structure and content of a new database.
- š When working with new datasets or databases, 'SELECT *' can assist in identifying column names and data types.
- ā Refrain from using 'SELECT *' in production environments to prevent unexpected changes and performance issues.
- š For a better grasp on SQL syntax and commands, consider using cheat sheets to improve efficiency and accuracy.
Q & A
Why is it generally advised not to use 'SELECT *' in SQL queries?
-Using 'SELECT *' can be inefficient because it retrieves all columns from a table, even if only a few are needed, leading to unnecessary data transfer and potential waste of resources.
How can 'SELECT *' affect the use of indexes in a database?
-Indexes are optimized for specific query patterns. If a query uses 'SELECT *', the database might decide not to use an index that could have improved performance for a more targeted column selection.
What is the potential risk of using 'SELECT *' when the table schema changes?
-If a table's schema changes, such as adding or removing columns, a 'SELECT *' query will automatically include these changes without any modifications to the query itself, which can lead to unexpected results or performance issues.
How does using 'SELECT *' impact the maintainability of code?
-'SELECT *' makes it difficult to understand which columns are being returned, complicating the process of working with the data in application code or reporting tools, and thus making the code harder to maintain.
In what scenario is it acceptable to use 'SELECT *' according to the script?
-Using 'SELECT *' is acceptable for quick, ad-hoc queries where the goal is to explore the contents of a table, such as when examining a new database or loading a sample dataset.
What is the purpose of an index in a database?
-An index in a database is an object that helps improve the performance of queries by allowing the database to quickly look up data without scanning every row in a table.
Why might a database choose not to use an index when a 'SELECT *' query is executed?
-A database may opt not to use an index with a 'SELECT *' query because the query does not specify a pattern that the index is designed to optimize, leading to a full table scan instead.
What is the impact of retrieving more data than necessary on client-side performance?
-Retrieving more data than necessary can slow down the client-side performance as the client has to process and possibly display a larger amount of data, which can be inefficient and wasteful.
How can a 'SELECT *' query affect the results or performance of a query when the table schema is altered?
-A 'SELECT *' query can cause the results to change or the query to slow down if the table schema is altered, such as when a column is added or removed, because the query will automatically reflect these changes without any manual intervention.
What is the recommendation for using 'SELECT *' in feature development or production code?
-It is recommended not to use 'SELECT *' in feature development or production code due to its inefficiency, potential for performance issues, and difficulty in maintaining the code.
What resource is mentioned in the script for those who want a quick reference guide for SQL commands and features?
-The script mentions SQL Cheat Sheets as a resource for those who need a quick reference guide for SQL commands and features to avoid forgetting the syntax.
Outlines
š« Avoiding 'SELECT *' in Queries
This paragraph discusses the common but inefficient practice of using 'SELECT *' in SQL queries. It explains that while it's a quick way to retrieve all columns from a database table, it can lead to inefficiencies due to unnecessary data transfer. The speaker also points out that it may prevent the use of indexes, which are crucial for query performance. Additionally, it highlights the risk of code breakage if the table schema changes and the difficulty in maintaining such queries in the long run. However, an exception is made for quick, ad-hoc queries where the goal is to explore the table's contents.
Mindmap
Keywords
š”Select *
š”Inefficiency
š”Indexes
š”Query Performance
š”SQL Cheat Sheets
š”Code Maintenance
š”Ad-hoc Queries
š”Feature Development
š”Production Code
š”Data Analysis
š”Sample Database
Highlights
The transcript discusses the common practice of using 'SELECT *' in SQL queries and its potential drawbacks.
Writing 'SELECT *' is considered easy and time-saving, but it has four main reasons why it's not recommended for production code.
Inefficiency is the first reason highlighted, as 'SELECT *' retrieves more data than necessary, leading to wasteful data transfer.
The second reason is that using 'SELECT *' might prevent the database from utilizing indexes, which could slow down query performance.
An SQL Cheat Sheet is mentioned as a helpful resource for common SQL commands and features, aiding in syntax retention.
The third reason against 'SELECT *' is that it can break code if the table schema changes, such as adding or removing columns.
The fourth reason is maintainability; 'SELECT *' queries are harder to read and work with in application and reporting tools.
An exception is presented for using 'SELECT *' in quick or ad-hoc queries for initial data exploration.
The use of 'SELECT *' is encouraged for understanding table contents and column names when working with new databases or datasets.
The transcript advises against using 'SELECT *' in feature development or production code due to its disadvantages.
The importance of data analysis and understanding what data represents is emphasized for those working with SQL.
A follow-up video on loading a sample database and using SQL to answer questions about the data is recommended for further learning.
The transcript provides insights into best practices for writing efficient and maintainable SQL queries.
The potential impact of 'SELECT *' on application performance and the importance of considering query efficiency are discussed.
The transcript suggests that while 'SELECT *' is convenient, it should be used judiciously to avoid long-term issues.
The importance of understanding the implications of 'SELECT *' on database performance and code stability is highlighted.
The transcript concludes by reiterating the recommendation to avoid 'SELECT *' in production environments for better code health.
Transcripts
You probably do this all the time.
I know I do. But we shouldnāt.
And Iāll tell you why.
First, what am I talking about?
This is about writing Select * for your queries.
Why do we write Select *?
Well, itās just easy. We can use a singleĀ character to get all of the columns from a table.
In our application or reportingĀ tool that runs the query,Ā Ā
we can just ignore the columns that we donāt need.
It saves time.
However, there are four reasons whyĀ I recommend not using Select * inĀ Ā
code that youāre using that you want toĀ keep or deployā¦ but thereās one exception.
The first reason is that aĀ query like this is inefficient.
You use Select Star and theĀ database returns all of theĀ Ā
columns from the table. Itās likely thatĀ you donāt need all of the columns. So,Ā Ā
you work with some of the columnsĀ from the table and ignore the rest.
This means the database is returning moreĀ data than you need back to the client,Ā Ā
which is inefficient and wasteful.
The second reason is thatĀ indexes might not be used.
Indexes are database objects that help toĀ improve the performance of your queries.
They are created to help with a specific type orĀ range of queries, and the columns that are beingĀ Ā
retrieved from a table are one of the factors thatĀ the database uses to decide which index to use.
If the query retrieves all columns in a table,Ā the database may decide not to use an index,Ā Ā
but the index could have been used if the queryĀ only retrieved the few columns you needed.
This can cause the query to run slower thanĀ you want, and it can be hard to improve.
If you want a quick reference guide forĀ many common SQL commands and features,Ā Ā
and stop forgetting the syntax for SQL,Ā Ā
youāll love the SQL Cheat Sheets I created.Ā Get your copy at the link in the description.
The third reason is that your query or codeĀ can break if changes are made to the table.
A SELECT * query does not specifyĀ the columns in the Select clause.
If a column is added to the table, it will showĀ up in the Select * query without any changes.
If a column is removed from the table,Ā it will no longer show in the query.
This may seem OK, but it can have anĀ undesired effect on your queries. TheĀ Ā
modification of a query canĀ cause the results to change,Ā Ā
or the query to slow down, even thoughĀ you didnāt change the query itself.
Reason number 4 is that a SelectĀ * query is hard to maintain.
With a Select * query, itās hard to read itĀ and know what columns are being returned.
If youāre using it in application code,Ā Ā
you donāt know what the columns are calledĀ so you canāt work with them in your code.
In a reporting tool, itāsĀ also harder to work with them.
You need to take the extra step of looking intoĀ the table to see what the columns are called,Ā Ā
which can slow things down andĀ make the code harder to work with.
However, thereās one exception whereĀ I think itās OK to use select *.
And that is for quick or ad-hoc queries, whereĀ you just want to see whatās inside a table.
I use it a lot when looking at data in tables,Ā Ā
when Iāve loaded a sample data set, orĀ when Iām starting to work with a newĀ Ā
database. It helps to understand whatās inĀ the table and what the columns are called.
But, it shouldnāt be used for anyĀ feature development or production code.
Speaking of data analysis, exploringĀ a database and finding out what theĀ Ā
data represents is a common taskĀ for all of us working with SQL.
Youāll want to watch this video next to seeĀ an example of loading a sample database,Ā Ā
querying the tables to understand the data, andĀ using SQL to answer some questions about the data.
Thanks for watching.
Browse More Related Video
Oracle Interview question : what is dual table in oracle
Dream Report: Acquiring Data from SQL Server
Dream Report: Acquiring Data using SQL Statements
Power BI Project For Beginners | Sales Insights Data Analysis Project - 3 - Data Analysis Using SQL
Part3 : Database Testing | How To Test Schema of Database Table | Test Cases
MYSQL
5.0 / 5 (0 votes)