Top 20+ MySQL Best Practices

Database operations often tend to be the main“active”, “inactive”, “pending”,
bottleneck for most web applications today. It’s not“expired” etc…
only the DBA’s (database administrators) that haveThere is even a way to get a “suggestion” from
to worry about these performance issues. We asMySQL itself on how to restructure your table. When
programmers need to do our part by structuring tablesyou do have a VARCHAR field, it can actually
properly, writing optimized queries and better code.suggest you to change that column type to ENUM
Here are some MySQL optimization techniques forinstead. This done using the PROCEDURE
programmers.ANALYSE() call. Which brings us to:
1. Optimize Your Queries For the Query Cache10. Get Suggestions with PROCEDURE ANALYSE()
Most MySQL servers have query caching enabled.PROCEDURE ANALYSE() will let MySQL analyze the
It’s one of the most effective methods ofcolumns structures and the actual data in your table to
improving performance, that is quietly handled by thecome up with certain suggestions for you. It is only
database engine. When the same query is executeduseful if there is actual data in your tables because
multiple times, the result is fetched from the cache,that plays a big role in the decision making.
which is quite fast.For example, if you created an INT field for your
The main problem is, it is so easy and hidden from theprimary key, however do not have too many rows, it
programmer, most of us tend to ignore it. Some thingsmight suggest you to use a MEDIUMINT instead. Or if
we do can actually prevent the query cache fromyou are using a VARCHAR field, you might get a
performing its task.suggestion to convert it to ENUM, if there are only few
// query cache does NOT workunique values.
$r = mysql_query("SELECT username FROM userYou can also run this by clicking the “Propose table
WHERE signup_date >= CURDATE()");structure” link in phpmyadmin, in one of your table
// query cache works!views.
$today = date("Y-m-d");Keep in mind these are only suggestions. And if your
$r = mysql_query("SELECT username FROM usertable is going to grow bigger, they may not even be
WHERE signup_date >= '$today'");the right suggestions to follow. The decision is
The reason query cache does not work in the first lineultimately yours.
is the usage of the CURDATE() function. This applies11. Use NOT NULL If You Can
to all non-deterministic functions like NOW() andUnless you have a very specific reason to use a
RAND() etc… Since the return result of the functionNULL value, you should always set your columns as
can change, MySQL decides to disable query cachingNOT NULL.
for that query. All we needed to do is to add an extraFirst of all, ask yourself if there is any difference
line of PHP before the query to prevent this frombetween having an empty string value vs. a NULL
happening.value (for INT fields: 0 vs. NULL). If there is no reason
2. EXPLAIN Your SELECT Queriesto have both, you do not need a NULL field. (Did you
Using the EXPLAIN keyword can give you insight onknow that Oracle considers NULL and empty string as
what MySQL is doing to execute your query. This canbeing the same?)
help you spot the bottlenecks and other problems withNULL columns require additional space and they can
your query or table structures.add complexity to your comparison statements. Just
The results of an EXPLAIN query will show you whichavoid them when you can. However, I understand
indexes are being utilized, how the table is beingsome people might have very specific reasons to
scanned and sorted etc…have NULL values, which is not always a bad thing.
Take a SELECT query (preferably a complex one,From MySQL docs:
with joins), and add the keyword EXPLAIN in front of“NULL columns require additional space in the row
it. You can just use phpmyadmin for this. It will showto record whether their values are NULL. For MyISAM
you the results in a nice table. For example, let’stables, each NULL column takes one bit extra, rounded
say I forgot to add an index to a column, which Iup to the nearest byte.”
perform joins on:12. Prepared Statements
After adding the index to the group_id field:There are multiple benefits to using prepared
Now instead of scanning 7883 rows, it will only scan 9statements, both for performance and security
and 16 rows from the 2 tables. A good rule of thumb isreasons.
to multiply all numbers under the “rows” column,Prepared Statements will filter the variables you bind to
and your query performance will be somewhatthem by default, which is great for protecting your
proportional to the resulting number.application against SQL injection attacks. You can of
3. LIMIT 1 When Getting a Unique Rowcourse filter your variables manually too, but those
Sometimes when you are querying your tables, youmethods are more prone to human error and
already know you are looking for just one row. Youforgetfulness by the programmer. This is less of an
might be fetching a unique record, or you might just beissue when using some kind of framework or ORM.
just checking the existence of any number of recordsSince our focus is on performance, I should also
that satisfy your WHERE clause.mention the benefits in that area. These benefits are
In such cases, adding LIMIT 1 to your query canmore significant when the same query is being used
increase performance. This way the database enginemultiple times in your application. You can assign
will stop scanning for records after it finds just 1,different values to the same prepared statement, yet
instead of going thru the whole table or index.MySQL will only have to parse it once.
// do I have any users from Alabama?Also latest versions of MySQL transmits prepared
// what NOT to do:statements in a native binary form, which are more
$r = mysql_query("SELECT * FROM user WHEREefficient and can also help reduce network delays.
state = 'Alabama'");if (mysql_num_rows($r) > 0) {There was a time when many programmers used to
// ...avoid prepared statements on purpose, for a single
}important reason. They were not being cached by the
// much better:MySQL query cache. But since sometime around
$r = mysql_query("SELECT 1 FROM user WHEREversion 5.1, query caching is supported too.
state = 'Alabama' LIMIT 1");if (mysql_num_rows($r) >To use prepared statements in PHP you check out
0) {the mysqli extension or use a database abstraction
// ...layer like PDO.
}// create a prepared statementif ($stmt =
4. Index the Search Fields$mysqli->prepare("SELECT username FROM user
Indexes are not just for the primary keys or the uniqueWHERE state=?")) {
keys. If there are any columns in your table that you// bind parameters
will search by, you should almost always index them.$stmt->bind_param("s", $state);
As you can see, this rule also applies on a partial string// execute
search like “last_name LIKE ‘a%’”. When$stmt->execute();
searching from the beginning of the string, MySQL is// bind result variables
able to utilize the index on that column.$stmt->bind_result($username);
You should also understand which kinds of searches// fetch value
can not use the regular indexes. For instance, when$stmt->fetch();printf("%s is from %sn", $username,
searching for a word (e.g. “WHERE post_content$state);
LIKE ‘%apple%’”), you will not see a benefit$stmt->close();
from a normal index. You will be better off using mysql}
fulltext search or building your own indexing solution.13. Unbuffered Queries
5. Index and Use Same Column Types for JoinsNormally when you perform a query from a script, it
If your application contains many JOIN queries, youwill wait for the execution of that query to finish before
need to make sure that the columns you join by areit can continue. You can change that by using
indexed on both tables. This affects how MySQLunbuffered queries.
internally optimizes the join operation.There is a great explanation in the PHP docs for the
Also, the columns that are joined, need to be the samemysql_unbuffered_query() function:
type. For instance, if you join a DECIMAL column, to an“mysql_unbuffered_query() sends the SQL query
INT column from another table, MySQL will be unablequery to MySQL without automatically fetching and
to use at least one of the indexes. Even the characterbuffering the result rows as mysql_query() does. This
encodings need to be the same type for string typesaves a considerable amount of memory with SQL
columns.queries that produce large result sets, and you can
// looking for companies in my statestart working on the result set immediately after the
$r = mysql_query("SELECT company_name FROMfirst row has been retrieved as you don’t have to
userswait until the complete SQL query has been
LEFT JOIN companies ON (users.state =performed.”
companies.state)However, it comes with certain limitations. You have to
WHERE users.id = $user_id");either read all the rows or call mysql_free_result()
// both state columns should be indexedbefore you can perform another query. Also you are
// and they both should be the same type andnot allowed to use mysql_num_rows() or
character encodingmysql_data_seek() on the result set.
// or MySQL might do full table scans14. Store IP Addresses as UNSIGNED INT
6. Do Not ORDER BY RAND()Many programmers will create a VARCHAR(15) field
This is one of those tricks that sound cool at first, andwithout realizing they can actually store IP addresses
many rookie programmers fall for this trap. You mayas integer values. With an INT you go down to only 4
not realize what kind of terrible bottleneck you canbytes of space, and have a fixed size field instead.
create once you start using this in your queries.You have to make sure your column is an UNSIGNED
If you really need random rows out of your results,INT, because IP Addresses use the whole range of a
there are much better ways of doing it. Granted it32 bit unsigned integer.
takes additional code, but you will prevent a bottleneckIn your queries you can use the INET_ATON() to
that gets exponentially worse as your data grows.convert and IP to an integer, and INET_NTOA() for
The problem is, MySQL will have to perform RAND()vice versa. There are also similar functions in PHP
operation (which takes processing power) for everycalled ip2long() and long2ip().
single row in the table before sorting it and giving you$r = "UPDATE users SET ip =
just 1 row.INET_ATON('{$_SERVER['REMOTE_ADDR']}')
// what NOT to do:WHERE user_id = $user_id";
$r = mysql_query("SELECT username FROM user15. Fixed-length (Static) Tables are Faster
ORDER BY RAND() LIMIT 1");When every single column in a table is
// much better:“fixed-length”, the table is also considered
$r = mysql_query("SELECT count(*) FROM user");“static” or “fixed-length”. Examples of
$d = mysql_fetch_row($r);column types that are NOT fixed-length are:
$rand = mt_rand(0,$d[0] - 1);VARCHAR, TEXT, BLOB. If you include even just 1 of
$r = mysql_query("SELECT username FROM userthese types of columns, the table ceases to be
LIMIT $rand, 1");fixed-length and has to be handled differently by the
So you pick a random number less than the number ofMySQL engine.
results and use that as the offset in your LIMIT clause.Fixed-length tables can improve performance because
7. Avoid SELECT *it is faster for MySQL engine to seek through the
The more data is read from the tables, the slower therecords. When it wants to read a specific row in a
query will become. It increases the time it takes for thetable, it can quickly calculate the position of it. If the row
disk operations. Also when the database server issize is not fixed, every time it needs to do a seek, it
separate from the web server, you will have longerhas to consult the primary key index.
network delays due to the data having to beThey are also easier to cache and easier to
transferred between the servers.reconstruct after a crash. But they also can take more
It is a good habit to always specify which columns youspace. For instance, if you convert a VARCHAR(20)
need when you are doing your SELECT’s.field to a CHAR(20) field, it will always take 20 bytes
// not preferredof space regardless of what is it in.
$r = mysql_query("SELECT * FROM user WHEREBy using “Vertical Partitioning” techniques, you
user_id = 1");can separate the variable-length columns to a
$d = mysql_fetch_assoc($r);echo "Welcomeseparate table. Which brings us to:
{$d['username']}";16. Vertical Partitioning
// better:Vertical Partitioning is the act of splitting your table
$r = mysql_query("SELECT username FROM userstructure in a vertical manner for optimization reasons.
WHERE user_id = 1");Example 1: You might have a users table that contains
$d = mysql_fetch_assoc($r);echo "Welcomehome addresses, that do not get read often. You can
{$d['username']}";choose to split your table and store the address info
// the differences are more significant with biggeron a separate table. This way your main users table
result setswill shrink in size. As you know, smaller tables perform
8. Almost Always Have an id Fieldfaster.
In every table have an id column that is the PRIMARYExample 2: You have a “last_login” field in your
KEY, AUTO_INCREMENT and one of the flavors oftable. It updates every time a user logs in to the
INT. Also preferably UNSIGNED, since the value canwebsite. But every update on a table causes the
not be negative.query cache for that table to be flushed. You can put
Even if you have a users table that has a uniquethat field into another table to keep updates to your
username field, do not make that your primary key.users table to a minimum.
VARCHAR fields as primary keys are slower. AndBut you also need to make sure you don’t
you will have a better structure in your code byconstantly need to join these 2 tables after the
referring to all users with their id’s internally.partitioning or you might actually suffer performance
There are also behind the scenes operations done bydecline.
the MySQL engine itself, that uses the primary key17. Split the Big DELETE or INSERT Queries
field internally. Which become even more important, theIf you need to perform a big DELETE or INSERT
more complicated the database setup is. (clusters,query on a live website, you need to be careful not to
partitioning etc…).disturb the web traffic. When a big query like that is
One possible exception to the rule are theperformed, it can lock your tables and bring your web
“association tables”, used for the many-to-manyapplication to a halt.
type of associations between 2 tables. For example aApache runs many parallel processes/threads.
“posts_tags” table that contains 2 columns:Therefore it works most efficiently when scripts finish
post_id, tag_id, that is used for the relations betweenexecuting as soon as possible, so the servers do not
two tables named “post” and “tags”.experience too many open connections and
These tables can have a PRIMARY key that containsprocesses at once that consume resources, especially
both id fields.the memory.
9. Use ENUM over VARCHARIf you end up locking your tables for any extended
ENUM type columns are very fast and compact.period of time (like 30 seconds or more), on a high
Internally they are stored like TINYINT, yet they cantraffic web site, you will cause a process and query
contain and display string values. This makes them apileup, which might take a long time to clear or even
perfect candidate for certain fields.crash your web server.
If you have a field, which will contain only a fewIf you have some kind of maintenance script that
different kinds of values, use ENUM instead ofneeds to delete large numbers of rows, just use the
VARCHAR. For example, it could be a column namedLIMIT clause to do it in smaller batches to avoid this
“status”, and only contain values such ascongestion.