28 January 2016

MySQL Swap Insanity: Solved

Yes, it has been a while since my last post.  Still in the midst of MySQL ecosystem, and enjoying it tremendously.  I thought I would make this post as it relates to a topic that has some varied results in the online world, and I would like to kick start this blog again.  We'll see how that goes... :)

MySQL has been able to harness the potential of more powerful (CPU) and larger (RAM, disk space) systems in recent years like never before. With this comes the possibility that something can go awry in a big way, and that just happened with us when we upgraded from MySQL 5.1 to MySQL 5.6. 

Our original 5.1 system was on CentOS, and aside from two funky issues (one being MySQL process getting shot in the head by cfengine), our system was stable and performant. With continuous data initiatives/pressures (more data, improved performance and increased availability), we upgraded our system to MySQL 5.6 on Ubuntu 14.04, managed by puppet.

The first day or two were smooth; transition of data was a big deal, but other than that, it was almost a non-event.

Until day 3. Swap started to increase.

Not only that, swap increased in a big way. From 1% to over 80% in a matter of days. Yes, it started to plateaued, and in some cases, it was at the ~70% mark, and in other cases, at the ~99% mark. And yes, we had a couple of MySQL instances restart due to the dreaded signal 11 message (ie crash). At that point, we ended up restarting several MySQL instances daily so that I would not get woken up in the middle of the night with an issue. And this way I would not have to repair dozens or hundreds of tables.

Restarting MySQL in this manner was clearly not acceptable, as it reminded me of my old SQL server days when we restarted those instances daily, a common Windows "fix" at that time in history. 

So what could be the culprit in dealing with this swap issue in MySQL? 

There were a number of possibilities that we considered.  The first four possibilites were MySQL related, the remaining ones were system related.

  1. key_buffer_size - reduce size 
  2. max_connections - reduce value
  3. performance_schema - turn it off 
  4. table_open_cache - reduce value
  5. swappiness - reduce value
  6. numactl --interleave=all 
  7. sysctl -q -w vm.drop_caches=3 


We (including my trusty DBA associate, Jonathan Park) tried a few things on this list, including #1, #2, #3, #6, and #7. 

We even tested on a box with Innodb engine set off. 

We still noticed that swap increased under our load test.  

In fact, we were very successful at increasing swap rapidly over a short time. We could start with swap at 1% and it would increase to about 80% in about 15 minutes. And I was merely running 8 threads to make that happen. 

What Worked?

Swappiness.

We noticed that swappiness was set to 60 by default (we have Ubuntu 14.04), and we reduced it to 10.  We ran our tests and swap did increase a little bit, but not as badly as before.  So we reduced it to 1 (not zero) and swap behavior stayed constant.

I am interested to find out what the MySQL community has done in addition to the prominent post at Jeremy Cole: MySQL Swap Insanity and the NUMA Architecture and would like to see what may have worked for you!

And if you try setting swappiness to 1, does it work for you?


24 April 2012

The Conflicted Data Analyst

Inspired by a post from Juice Analytics.

We are a conflicted people. We love our TV and movie violence but worry that it ruins our children’s minds. We want to reduce healthcare costs, but don’t want to restrict the free market.

Conflicts like these leave little room for a satisfactory answer. Basic principles are in conflict and deeply-rooted desires run up against painful consequences. We don’t want to choose, and the middle ground feels like failure.

The people who practice data analysis and visualization have their own set of conflicts. Does this sound familiar?

“I want a feature-rich analytics tool, but…
My powerful analytics tool is too complex to use.”

“I want a dashboard that summarizes my business or functional area, but…
I don’t want to feel limited in what information I get to see.”

“I want a predictive model, but…
I don’t trust black box models when a result doesn’t align with my understanding.”

“I only want to focus on actionable data, but…
hey, look at that data point…that’s kind of interesting.”

“I want this report immediately, but…
now that I’ve got it, I don’t really have time to look at it.”

“I want to access data without a technical resource, but…
I don’t want to learn SQL because that’s for technical people.”

“I want to run scenario analyses on my data to make better decisions, but… I don’t believe the analysis can account for the complexity of my business.”

“I love geographic maps and network diagrams, but…
it is hard to see patterns or find insights from these types of visualizations.”

In a sense, each of these conflicts represents an opportunity to innovate to solve a persistent problem within our data analysis solutions. Find a way to satisfy the initial desire without suffering the assumed consequence, and you’ve got a chance to delight users of data.

24 January 2012

How to Find Out if an Entire String Is Numeric

Problem: Find out if an entire string is numeric.  (Just like the is_numeric function in php.)

Problem, Part II: Most online resources show how you can find out if part of your string is numeric.

Solution:
SELECT str FROM tbl
  WHERE str REGEXP('(^[0-9]+$)');
Give it a whirl!


For those who would like to view some test results, here goes:
mysql> SELECT str FROM tbl;
+-----+
| str |
+-----+
| 10  |
| 10A |
| 10E |
| 78  |
| E78 |
+-----+
5 rows in set (0.00 sec)
mysql> SELECT str FROM tbl WHERE str REGEXP('(^[0-9]+$)');
+-----+
| str |
+-----+
| 10  |
| 78  |
+-----+
2 rows in set (0.00 sec)
Now for the other way around:
mysql> SELECT str FROM tbl WHERE str NOT REGEXP('(^[0-9]+$)');
+-----+
| str |
+-----+
| 10A |
| 10E |
| E78 |
+-----+
3 rows in set (0.00 sec)

03 June 2011

How to Parse the Most Selected Tables From MySQL General Query Log in One Line

Problem: Pin down redundant queries by finding out which tables are being selected from the most.

I like general query log parsing tools, such as mk-query-digest or mysqlsla.

However, there are times when I am on a server as a dba with limited access and can only read files in /tmp.  And other times I just need to get something done quickly.

So what did I do to find out the most selected tables from a database?

1. Make sure /tmp has enough space to do what I'm about to do
2. Turn on the general query log with the file at /tmp/mysql_general.log.  This is usually /var/log/mysqllog/mysql_general or similar, but I need access to this log without waking up the sysadmin.
3. Turn off the general query log before /tmp fills up (for some people this might be just minutes)
4. Parse using this nifty set of commands:
grep -i "SELECT " /tmp/mysql_general.log | grep -io "SELECT .*" | sed 's|\(FROM [^ ]*\) .*|\1|' | sort | uniq -c | sort -nr | head -100

That's all!

As my mother would say, "Try it... You'll like it."

27 April 2011

My Ignite Talk at MySQL Conference

My Ignite Talk (5 minute talk, 20 slides, 15 seconds per slide) at the MySQL Conference about Data Visualization was picked among three for "the Best of Ignite".  I would say I barely scraped into third spot, and it was inspiring giving a talk after Baron Schwartz gave his at the keynote on the last day of the conference.

The other two talks were "Scale Fail" by Josh Berkus (hilarious, must see) and "Causes of Downtime" by Baron Schwartz.  Enjoy!

02 February 2011

Need to Replicate to a Database with a Different Name?

This is the first time this has happened to me, so I got excited about it until I noticed that this command has been with us from nearly the very beginning (ie MySQL 4).

The Situation
Master Server contains db1
Slave Server contains db1 and db2
Slave Server needs db1 from the Master to replicate to db2

The Solution
On my.cnf on the slave server:
--replicate-rewrite-db=from_name->to_name

Thus, this situation would dictate the following:
--replicate-rewrite-db=db1->db2
Easy!

The Wrinkle
Part 1
What if the Master Server contained db1 and db2; and
the Slave needs only db1 from the Master to replicate to db2 on the Slave?

Part 2
We don't want to replicate both db1 and db2 to this slave, although we want to replicate both databases to another slave - this means that the filter for databases should reside on my.cnf on the slave...

Assuming that we are using replicate-rewrite-db=db1->db2:
Using replicate-do-db on the my.cnf file on the slave, which database would you use with that command?

db1 or db2?  It only takes a moment in time to find out... :)

11 January 2011

How to Log User Connections in MySQL

There is nothing that MySQL 5.1 explicitly performs to log user connections, but there is a combination of MySQL commands that can log user connections without having to turn on the general query log. I repeat: you do not have to turn the general query log on to make this work!

Would you like to find out what that is?

With the help of Baron Schwartz's post on http://www.xaprb.com/blog/2006/07/23/how-to-track-what-owns-a-mysql-connection/ I was able to find out as well. However, I wanted to simplify things a bit, because all I need is to track MySQL connections within MySQL for now. I'll probably increase the scope in the future, but for now, let's keep it simple.

Ingredients
init-connect
NOW()
CURRENT_USER()
CONNECTION_ID()

See it coming together?

Alrighty, here's the recipe...

Assumptions: The name of the database we will use to store this information will be called admin.

1. Create the table to store the connection information in. You don't have to include the unique key if you don't want to.

CREATE TABLE admin.connections (id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, connect_time DATETIME NOT NULL, user_host VARCHAR(50) NOT NULL, connection_id INT UNSIGNED NOT NULL);

2. Set the init-connect variable. This is a string to be executed for each client that connects. Details here.

SET GLOBAL init_connect = "INSERT INTO admin.connections (connect_time, user, connection_id) VALUES (NOW(), CURRENT_USER(), CONNECTION_ID());";

3. Check permissions of all of your users to ensure that they can insert a record into the admin.connections table.

4. Login as a user without global privileges and a row should be inserted in the admin.connections table. Note that the init-connect system variable does not work on users with global privileges. For better or for worse, all of us DBAs know what that means.

5. Watch your connection table grow and flourish. You've just started a new pet (project).