I have a NewsStories tables which I'm left joining with some related tables. Each News story can have multiple images, categories and addresses. So the query is essentially:
SELECT * FROM NewStories
LEFT JOIN Images ON Newstories.id=Images.story_id
LEFT JOIN Categories ON NewsStories.id=Categories.story_id
LEFT JOIN Addresses ON NewsStories.id=Addresses.story_id
WHERE ...
There are usually a few images and addresses per story, and 1 or 2 categories. The NewsStories table has about 10,000 articles.
The trouble is that the performance is rather slow (in the order of 15-20 seconds, although it does vary quite a bit and sometimes drops to as low as 5 seconds).
I was wondering if there's a better way of organizing the query to speed it up (I'm quite new to SQL).
In particular it seems quite wasteful that the number of rows for a given story is multiplied by the number of images times the number of addresses times the number of categories.
I'm essentially trying to reconstruct the properties of the News story into a single object which I can manipulate in a frontend.
Here's the explain (apologies if the formatting doesn't come out correctly). I'm guessing I'm not indexing Addresses properly if it's "Using where". Is that correct?
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Addresses ALL NULL NULL NULL NULL 6640 Using where
1 SIMPLE NewsStories eq_ref PRIMARY PRIMARY 767 NewsStories.Addresses.story_id 1 Using where
1 SIMPLE Images ref PRIMARY PRIMARY 767 NewsStories.NewsStories.id 1 Using index
1 SIMPLE Categories ref PRIMARY PRIMARY 767 NewsStories.NewStories.id 1
CREATE [UNIQUE|FULLTEXT|SPATIAL] INDEX index_name [index_type] ON tbl_name (index_col_name,...) [index_type]
index_col_name: col_name [(length)] [ASC | DESC]
index_type: USING {BTREE | HASH}
Check if you really have to select every column in all of the tables? If not, make sure that you only select the columns that you need, avoid using select*
Double check if you really need LEFT JOINS, if no, use INNER JOINs.
If performance is still an issue after you're done tweaking your query, consider denormalizing your schema to eliminate joins
You may also want to consider reducing the load on the database by using caching applications like sphinxsearch and memcached
Check none of your joins are to views rather than actual tables
references:
http://dev.mysql.com/doc/refman/5.0/en/create-index.html
Check none of your joins are to views rather than actual tables
. (y) - NullPointer
Make sure your tables are properly indexed. You need to have an index for every column you use to to select data, otherwise MySQL will go through every row in the table.
Try explaining the query (EXPLAIN SELECT * FROM ...etc
) in the MySQL console to see how MySQL treats the tables internally. Paste the results into your question for additional help.