10

I have a NewsStories tables which I'm left joining with some related tables. Each News story can have multiple images, categories and addresses. So the query is essentially:

SELECT * FROM NewStories 

LEFT JOIN Images ON Newstories.id=Images.story_id

LEFT JOIN Categories ON NewsStories.id=Categories.story_id

LEFT JOIN Addresses ON NewsStories.id=Addresses.story_id

WHERE ...

There are usually a few images and addresses per story, and 1 or 2 categories. The NewsStories table has about 10,000 articles.

The trouble is that the performance is rather slow (in the order of 15-20 seconds, although it does vary quite a bit and sometimes drops to as low as 5 seconds).

I was wondering if there's a better way of organizing the query to speed it up (I'm quite new to SQL).

In particular it seems quite wasteful that the number of rows for a given story is multiplied by the number of images times the number of addresses times the number of categories.

I'm essentially trying to reconstruct the properties of the News story into a single object which I can manipulate in a frontend.

Here's the explain (apologies if the formatting doesn't come out correctly). I'm guessing I'm not indexing Addresses properly if it's "Using where". Is that correct?

id  select_type table   type    possible_keys   key key_len ref rows    Extra

1   SIMPLE  Addresses   ALL NULL    NULL    NULL    NULL    6640    Using where

1   SIMPLE  NewsStories eq_ref  PRIMARY PRIMARY 767 NewsStories.Addresses.story_id 1    Using where

1   SIMPLE  Images  ref PRIMARY PRIMARY 767 NewsStories.NewsStories.id  1   Using index

1   SIMPLE  Categories  ref PRIMARY PRIMARY 767 NewsStories.NewStories.id   1


  • id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE Addresses ALL NULL NULL NULL NULL 6640 Using where 1 SIMPLE NewsStories eq_ref PRIMARY PRIMARY 767 NewsStories.Addresses.story_id 1 Using where 1 SIMPLE Images ref PRIMARY PRIMARY 767 NewsStories.NewsStories.id 1 Using index 1 SIMPLE Categories ref PRIMARY PRIMARY 767 NewsStories.NewStories.id 1 - Vijay Boyapati
  • See my answer on this question: stackoverflow.com/questions/6771975/sql-joining-6-tables/… - yper-crazyhat-cubeᵀᴹ

2 답변


6

  • Make sure that you have indexes on the fields that are in your WHERE statements and ON conditions, primary keys are indexed by default but you can also create indexes manually if you have to.

CREATE [UNIQUE|FULLTEXT|SPATIAL] INDEX index_name [index_type] ON tbl_name (index_col_name,...) [index_type]

index_col_name: col_name [(length)] [ASC | DESC]

index_type: USING {BTREE | HASH}

  • Check if you really have to select every column in all of the tables? If not, make sure that you only select the columns that you need, avoid using select*

  • Double check if you really need LEFT JOINS, if no, use INNER JOINs.

  • If performance is still an issue after you're done tweaking your query, consider denormalizing your schema to eliminate joins

  • You may also want to consider reducing the load on the database by using caching applications like sphinxsearch and memcached

  • Check none of your joins are to views rather than actual tables

references:

http://www.sphinxsearch.com

http://dev.mysql.com/doc/refman/5.0/en/create-index.html


  • Thanks Mark. Follow up question: I understand why I should index columns that appear in the ON of the left join, but don't understand why I should index columns in the where. Those columns are various attributes of the News story I may wish to filter on, but I don't see much value in indexing. Does it really improve performance to index columns not used in the join? If so why? - Vijay Boyapati
  • yes, this is explicitly mentioned in the mysql reference guide. "MySQL uses indexes for these operations: To find the rows matching a WHERE clause quickly." dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html - Mark Basmayor
  • Excellent. Thanks for the pointer. - Vijay Boyapati
  • Smart answer for Check none of your joins are to views rather than actual tables. (y) - NullPointer

0

Make sure your tables are properly indexed. You need to have an index for every column you use to to select data, otherwise MySQL will go through every row in the table.

Try explaining the query (EXPLAIN SELECT * FROM ...etc) in the MySQL console to see how MySQL treats the tables internally. Paste the results into your question for additional help.


  • I hope this formats correctly: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE Addresses ALL NULL NULL NULL NULL 6640 Using where 1 SIMPLE NewsStories eq_ref PRIMARY PRIMARY 767 NewsStories.Addresses.story_id 1 Using where 1 SIMPLE Images ref PRIMARY PRIMARY 767 NewsStories.NewsStories.id 1 Using index 1 SIMPLE Categories ref PRIMARY PRIMARY 767 NewsStories.NewStories.id 1 - Vijay Boyapati

Linked


Related

Latest