Does the number of columns returned affect the speed of a query?
You should avoid SELECT *:
- It leads to confusion when you change the table layout.
- It selects unneeded columns, and your data packets get larger.
- The columns can get duplicate names, which is also not good for some applications.
- If all the columns you need are covered by an index, SELECT columns will only use this index, while SELECT * will need to visit the table records to get the values you don't need. That is also bad for performance.
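The covering-index point can be checked on a small scale. This is a minimal sketch using SQLite through Python's sqlite3 module; the table, column, and index names are made up for illustration, and other engines word their query plans differently.

```python
import sqlite3

# Hypothetical table and index names, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, forename TEXT, surname TEXT, notes TEXT)"
)
conn.execute("CREATE INDEX idx_person_name ON person (surname, forename)")


def plan(sql):
    """Return SQLite's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan text


# Both requested columns live in the index, so the table is never read.
covered_plan = plan("SELECT forename, surname FROM person WHERE surname = 'Smith'")

# SELECT * also needs `notes`, so every index hit requires a table lookup.
star_plan = plan("SELECT * FROM person WHERE surname = 'Smith'")

print(covered_plan)
print(star_plan)
```

In SQLite the first plan mentions a covering index and the second does not, which is the extra table visit described above.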
SELECT * is rarely a good idea. It may not slow down your DBMS fetch a lot, but it will probably result in more data being transmitted over the network than you need.
However, that's likely to be swamped into insignificance by the use of the LIKE '%frank%' clause, which is basically non-indexable and will result in a full table scan.
You might want to consider cleaning up the data as it enters the database, since that will almost certainly make subsequent queries run much faster.
If you're after frank, then make sure it's stored as frank and use:
select x,y,z from table where name = 'frank'
If you want to get franklin as well, use:
select x,y,z from table where name like 'frank%'
Both of these will be able to use an index on the name column; "%frank%" will not.
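Here is a rough way to see the difference between the three predicates, sketched with SQLite via Python's sqlite3. The table name `people` and index name are hypothetical, and the COLLATE NOCASE on the column is a SQLite-specific requirement for its default case-insensitive LIKE to use an index; other engines have their own rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema. COLLATE NOCASE matches SQLite's default
# case-insensitive LIKE, which is what allows the prefix search to use the index.
conn.execute(
    "CREATE TABLE people (x INTEGER, y INTEGER, z INTEGER, name TEXT COLLATE NOCASE)"
)
conn.execute("CREATE INDEX idx_people_name ON people (name)")


def plan(sql):
    """Return SQLite's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


exact_plan = plan("SELECT x, y, z FROM people WHERE name = 'frank'")
prefix_plan = plan("SELECT x, y, z FROM people WHERE name LIKE 'frank%'")
contains_plan = plan("SELECT x, y, z FROM people WHERE name LIKE '%frank%'")

for label, text in [("exact", exact_plan), ("prefix", prefix_plan), ("contains", contains_plan)]:
    print(label, "->", text)
```

The exact and prefix queries are reported as index searches, while the leading-wildcard query falls back to scanning the whole table.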
I'm going to go against the flow here and say you should go with the select *. I think that premature optimization is the root of a lot of problems, and you may well find that it doesn't affect your performance when you get to real utilization. Of course, by the book it is slower, it must be, but that doesn't mean the difference is important in practice.
Something to be aware of, though, is that some SQL engines (MS-SQL for sure) will cache the select *, so if you are using a prepared statement, or a view or stored procedure that has it, and change the table schema, it won't pick up on the change unless the view or sp is recompiled. That is a good reason to avoid doing it if you aren't running these queries dynamically.
And of course, this varies by database engine, so a little load testing would be in order to make sure the hit isn't obviously large.
Regardless of performance issues, it is good practice to always enumerate all fields in your queries.
- What if you decide to add a TEXT or BLOB column in the future that is used for a particular query? Your SELECT * will return the additional data whether you need it or not.
- What if you rename a column? Your SELECT * will still run, but the code relying on the old column name will be broken.
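To make the first point concrete, here is a small sketch (SQLite via Python's sqlite3, with a hypothetical `person` table and `photo` column) that compares how much data each style of query pulls back before and after a BLOB column is added. The size estimate is deliberately crude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, forename TEXT, surname TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Frank', 'Smith')")


def payload_size(sql):
    """Crude size of the first row: text/bytes lengths, 8 for anything else."""
    row = conn.execute(sql).fetchone()
    return sum(len(v) if isinstance(v, (str, bytes)) else 8 for v in row)


star_before = payload_size("SELECT * FROM person")
listed_before = payload_size("SELECT id, forename, surname FROM person")

# Later schema change: a large BLOB column that only some other query needs.
conn.execute("ALTER TABLE person ADD COLUMN photo BLOB")
conn.execute("UPDATE person SET photo = ? WHERE id = 1", (b"\x00" * 100_000,))

star_after = payload_size("SELECT * FROM person")
listed_after = payload_size("SELECT id, forename, surname FROM person")

print(star_before, star_after, listed_before, listed_after)
```

The enumerated query returns the same amount of data after the change, while the SELECT * query silently grows by the size of the new column.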
For small projects, you can usually get away with select *. It's "right" to not do that, though. You won't notice any appreciable speed difference for one table in a non-index query... the only thing you're appreciably doing is using more bandwidth for columns you don't read.
That said, you will notice a difference in index-only queries where you're hitting the full table when you only needed to hit the index. This will especially crop up when you're doing joins.
Select * does have uses though, and if you use it properly (say, in combination with a cache, making sure it's select table.*, and addressing results by column name) you can reduce queries made by your application.
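One way to read the cache suggestion, sketched with Python's sqlite3: fetch the whole row once with `table.*`, keep it in a cache, and let callers pick fields by name. The `get_person` helper, the dictionary cache, and the schema are all hypothetical, and a real cache would also need invalidation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be addressed by column name
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, forename TEXT, surname TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Frank', 'Smith')")

row_cache = {}              # hypothetical in-process cache: id -> full row
stats = {"queries": 0}      # counts how often the database is actually hit


def get_person(person_id):
    """Fetch the full row once, then serve later requests from the cache."""
    if person_id not in row_cache:
        stats["queries"] += 1
        row_cache[person_id] = conn.execute(
            "SELECT person.* FROM person WHERE person.id = ?", (person_id,)
        ).fetchone()
    return row_cache[person_id]


# Two callers that need different columns share one cached row.
forename = get_person(1)["forename"]
surname = get_person(1)["surname"]
print(forename, surname, stats["queries"])
```

Because the cached row holds every column, callers needing different fields do not each trigger their own narrow query.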
If I remember correctly from college (and it's been a while), selecting * is not preferred, but not that bad, until you start joining. When you get into the relational algebra of creating the joined tuples, every column adds to the time, so I would definitely avoid it if possible.
The number of columns in the table does not affect the performance of your query. The number of columns operated upon in the query will.
Note the following example from the Oracle concepts manual:
Row Format and Size
Oracle stores each row of a database table containing data for less than 256 columns as one or more row pieces. If an entire row can be inserted into a single data block, then Oracle stores the row as one row piece. However, if all of a row’s data cannot be inserted into a single data block or if an update to an existing row causes the row to outgrow its data block, then Oracle stores the row using multiple row pieces. A data block usually contains only one row piece for each row. When Oracle must store a row in more than one row piece, it is chained across multiple blocks.
When a table has more than 255 columns, rows that have data after the 255th column are likely to be chained within the same block. This is called intra-block chaining. A chained row’s pieces are chained together using the rowids of the pieces. With intra-block chaining, users receive all the data in the same block. If the row fits in the block, users do not see an effect in I/O performance, because no extra I/O operation is required to retrieve the rest of the row.
HOWEVER: If there are 400 columns, I would bet that most rows will not fit in one block, and hence you will see a lot more 'db file sequential read' than normally required. As well, I remember Steve Adams (or someone long ago) mentioning that there is an additional cost for accessing a column "further down the list". Sorry, I don't have that link.
If person only has Id, Forename, and Surname, the queries should be equivalent. However, the query time will grow in proportion to the number of columns (really, the amount of data) returned.
Also, if your query will only ever need those three columns, you should only ask for those three. If you SELECT * and you change your schema later, you're basically just adding extra processing to all of your queries with no real added benefit.
I would visit this question on why using the "Select *" construct is not preferred.
In my experience, selecting 3 columns versus select * in a 3 column table might not have a noticeable impact performance-wise, but as tables get larger and wider you will notice a performance difference.
Generally, in any situation, you want to stay away from using
SELECT * FROM TABLE
in your code. Doing so can lead to several issues, only one of which is performance. Two others I can think of off the top of my head are resource utilization (if you're selecting columns you don't need, or somebody adds columns later...you're bringing back data and wasting memory) and code readability (if somebody sees SELECT * FROM in your code...they're not necessarily going to know which columns are actually being used in your application).
Just a couple of things to think about...but the best practice is NOT to use it.
Yes it does. Basically:
- More data has to be transferred from your database server
- The database server has to fetch more data
You shouldn't use select *
In addition to the other answers, consider that SELECT * will return data from all tables in the query. Start adding other tables through JOINs, and you'll start seeing things you don't want to see.
I believe I've also seen cases where SELECT * requires data actually be fetched from a joined table, as opposed to only using the indexes on that table to help narrow down the overall result set. I can't think of an example of that, though.
There are multiple dimensions to this. For one, the * will make your code more fragile. When you change table layouts in later versions, code that relies on column order might break, or it might not break but read or modify the wrong columns if the data types still match, which can be a really nasty problem!
Moreover, if you always request all columns you will require more memory on your database client and the database server for the unneeded columns. This can be really expensive if the table contains long character fields, very many fields and/or BLOBs. Selecting unnecessary columns will also thrash the server's cache by flooding it with superfluous content that is never looked at by a client.
So in general you should not use it. Most object relational mapping tools generate SQL that contains all column names anyway, so during development this is probably not an issue. I personally only tend to use * for quick ad-hoc queries that I have to type manually.
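The "wrong column, same type" failure mode is easy to reproduce. A minimal sketch with Python's sqlite3, assuming a hypothetical `person` table whose later version gains a `title` column ahead of `forename`; the two in-memory databases stand in for the old and new releases.

```python
import sqlite3


def forename_by_position(conn):
    # Fragile: assumes forename is always the second column of SELECT *.
    return conn.execute("SELECT * FROM person").fetchone()[1]


def forename_by_name(conn):
    # Robust: asks for the column it wants.
    return conn.execute("SELECT forename FROM person").fetchone()[0]


# Original layout.
old = sqlite3.connect(":memory:")
old.execute("CREATE TABLE person (id INTEGER, forename TEXT, surname TEXT)")
old.execute("INSERT INTO person VALUES (1, 'Frank', 'Smith')")

# Later layout: a new TEXT column now sits in front of forename.
new = sqlite3.connect(":memory:")
new.execute("CREATE TABLE person (id INTEGER, title TEXT, forename TEXT, surname TEXT)")
new.execute("INSERT INTO person VALUES (1, 'Dr', 'Frank', 'Smith')")

print(forename_by_position(old), forename_by_position(new))
print(forename_by_name(old), forename_by_name(new))
```

Nothing raises an error against the new layout; the positional version simply starts returning the title, which is why this kind of bug can go unnoticed.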
This is the correct way and the most optimal. The reason is that you're only gathering the data needed, so it takes up only the space you actually need when storing the data before you get your results.
SELECT Id, Forename, Surname FROM Person WHERE PersonName LIKE '%frank%'
This is incorrect, as it pulls in unused fields, which take up more space when running your query and slow down your results. Even if you get lucky and use all the fields in your query, it's best to list them individually. This will clarify the query, and what data is to be returned, to any other developer who might need to modify the query in the future.
SELECT * FROM Person WHERE PersonName LIKE '%frank%'
The only time I use "select *" is not even really a "select *": it is COUNT(*). Note that COUNT(*) is not the same as COUNT(ID):
the first returns the number of rows in the table,
but the second returns the number of rows with a NOT NULL ID value.
A subtle distinction, but worth remembering.
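The difference between counting rows and counting non-NULL ID values shows up as soon as a NULL is present. A tiny sketch with Python's sqlite3, using a hypothetical table `t` whose ID column is deliberately nullable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER, name TEXT)")  # ID is nullable on purpose
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "a"), (None, "b"), (3, "c")],
)

# COUNT(*) counts rows; COUNT(ID) counts rows where ID IS NOT NULL.
count_star = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
count_id = conn.execute("SELECT COUNT(ID) FROM t").fetchone()[0]

print(count_star, count_id)
```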
SELECT * will be slower since it has to transfer more data, and also for some of the other reasons already mentioned. It really becomes a problem when joining tables, since you start adding many more columns when really all you want to do is join so you can filter.
If you really want to use *, specify the table you want all the columns from, like SELECT Person.* FROM Person...
That will narrow down the amount of data returned and makes it a little more readable.
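For a quick look at what a join does to the column list, here is a sketch using Python's sqlite3. The `Address` table and its columns are invented for the example; only `Person` comes from the discussion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (Id INTEGER PRIMARY KEY, Forename TEXT, Surname TEXT)")
# Hypothetical second table, joined only so we can filter on it.
conn.execute("CREATE TABLE Address (Id INTEGER PRIMARY KEY, PersonId INTEGER, City TEXT)")
conn.execute("INSERT INTO Person VALUES (1, 'Frank', 'Smith')")
conn.execute("INSERT INTO Address VALUES (10, 1, 'Leeds')")


def column_names(sql):
    """Names of the result columns, in order, as reported by the driver."""
    return [col[0] for col in conn.execute(sql).description]


join = "FROM Person JOIN Address ON Address.PersonId = Person.Id WHERE Address.City = 'Leeds'"

star_cols = column_names("SELECT * " + join)
person_cols = column_names("SELECT Person.* " + join)

print(star_cols)
print(person_cols)
```

The plain * brings back every Address column as well, including a second column named Id, while Person.* keeps the result to the one table you actually wanted.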
Let me play devil's advocate and suggest a scenario where SELECT * is a better choice. Suppose you are creating a user interface where you take the results of the dataset and display it in some form of table or grid. You could build the columns in the UI to match the columns in the dataset and do the SELECT * FROM MyView.
By using a View in the database you have complete control over what columns are returned by the query, and the UI can be dynamic enough to display all of the columns. Changes to the view would be reflected immediately in the UI without recompiling. Obviously I would suggest following the previous advice and specifying all of the columns in the view definition.
Just thought I would add that, as sometimes people get dogmatic about following certain rules and forget that context matters.
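A rough sketch of that view-driven grid idea, using Python's sqlite3. `MyView` is the name from the answer; the `Person` schema and the `load_grid` helper are assumptions made up for the example, and a real UI would render the headers and rows rather than print them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person (Id INTEGER PRIMARY KEY, Forename TEXT, Surname TEXT, Notes TEXT)"
)
conn.execute("INSERT INTO Person VALUES (1, 'Frank', 'Smith', 'internal')")

# The view enumerates its columns explicitly; only the UI query uses *.
conn.execute("CREATE VIEW MyView AS SELECT Forename, Surname FROM Person")


def load_grid():
    """Return (headers, rows) for whatever columns the view currently exposes."""
    cur = conn.execute("SELECT * FROM MyView")
    headers = [col[0] for col in cur.description]
    return headers, cur.fetchall()


headers_before, rows_before = load_grid()

# Redefine the view; load_grid() itself is not touched.
conn.execute("DROP VIEW MyView")
conn.execute("CREATE VIEW MyView AS SELECT Id, Forename, Surname FROM Person")

headers_after, rows_after = load_grid()
print(headers_before, headers_after)
```

The grid code never names a column, so the set of displayed columns is controlled entirely by the view definition.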