This week we’ve been having a pretty big problem with one of the projects I’m working on. We’ve been getting so many hits on the server that it’s being overloaded. Not a bad problem to have. However, I didn’t think the server should be overloaded so Dan and I looked at the problem for quite a while and had basically narrowed it down to one mysql query. We weren’t sure this was the problem, but everything else we had tried had failed and we thought this was the problem.
We had a query that was consistently showing up in the mysql slow-queries log that didn’t seem like it should be slow. The query was:
SELECT keywordId, keyword FROM Keywords ORDER BY keywordId LIMIT 800, 10
We wanted to pull out 10 rows starting from row 800 (the 800 number changed with every query and ranged anywhere from 10 to 4,000). keywordId is a primary key so it has an index on it. For it to be a slow query didn’t make sense to me.
After exhausting all my resources, I asked Neal to put me in contact with Sacha Pachev, a mysql expert. He worked for mysql for a while and did some core development on the database. When I called him he knew immediatly what the problem was and explained it to me very well.
It turns out that when mysql has a LIMIT clause where you specify which row to start with, it goes through each row of the table before getting to the starting row. I’m not sure why (I think it may have something to do with the order clause and that mysql has to order the table before it finds the starting row), but I know that it was certainly slowing down our server. On queries where the starting row was around 3,000 it was taking over 700 seconds to execute the query. I don’t know why…but it was slow.
Either way, I have to totally thank Sasha Pachev for helping me out. That guy’s a genious.