Over the past two days we experienced two 60/90 periods with
sluggish speeds.
The cause has been identified. The issue was with what the
techie types call a HEARTBEAT cable connection. It seems that when two servers are
linked, as is the case with ours, they exchange data over one connection but
have a second connection, the heartbeat connection, that allows them to know how
the other is feeling. Apparently we experienced a partial failure of this
connection. It caused the servers to become preoccupied with checking on each
other and not pay attention to incoming traffic. The connection has been replaced and the heartbeat
is now restored to a steady thump…thump.
But wait, there is more
During this performance issue, one of our sites ran some
tests. They actually identified a bad server located between us and them, an issue
unrelated to the one above but an issue nonetheless. They asked us for help
in understanding the situation. Our hosting firm offered up the following
response and tips. I asked their permission to share that response with
everyone.
Trace Routes Help in Identifying a BAD Server
The fact that you see a higher
time spent on just one server in the trace indicates that the particular internet switch is
lowering the priority of your request. Switches do this to allow more important
traffic to go through. This type of prioritizing is becoming more and more
common. The following article may help explain: http://forums.whirlpool.net.au/archive/98073
Trace routes are most useful when you have no connection to
a particular destination. They will show you the exact internet switch where the
connection is breaking down.
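
For anyone who wants to read a trace route the way described above, here is a minimal Python sketch. It is only an illustration, not something our hosting firm supplied. It assumes you have already copied the time for each stop in the trace into a list (in milliseconds, with None where a stop did not answer). The function names and the 50 ms margin are made up for this example.

```python
from typing import List, Optional


def deprioritized_hops(hop_ms: List[Optional[float]],
                       margin_ms: float = 50.0) -> List[int]:
    """Return 1-based hop numbers that are slow only on their own.

    A hop is flagged when it is more than margin_ms slower than the final
    hop. If the destination still answers quickly, the slow hop is most
    likely just giving the trace request a low priority.
    The 50 ms margin is an assumed value, not a standard.
    """
    if not hop_ms or hop_ms[-1] is None:
        return []  # no destination time to compare against
    final_ms = hop_ms[-1]
    return [
        i + 1
        for i, ms in enumerate(hop_ms[:-1])
        if ms is not None and ms > final_ms + margin_ms
    ]


def last_hop_before_break(hop_ms: List[Optional[float]]) -> Optional[int]:
    """Return the last hop that answered when the trace ends in timeouts.

    Returns None when the final hop answered (no break), and 0 when no
    hop answered at all.
    """
    if not hop_ms or hop_ms[-1] is not None:
        return None
    for i in range(len(hop_ms) - 1, -1, -1):
        if hop_ms[i] is not None:
            return i + 1
    return 0


# Hypothetical trace: hop 3 is slow, but the hops after it are fine.
print(deprioritized_hops([1.2, 8.5, 240.0, 12.1, 14.3]))
# Hypothetical trace: nothing answers after hop 2.
print(last_hop_before_break([1.2, 8.5, None, None]))
```

The first check covers the "one slow server" case and the second covers the "no connection" case. The numbers in the two sample lists are invented.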
To test speed, a simple ping might work better
Ping has been turned on for our
servers. You can ping root.calancom.com at several points in a day, and ideally across
several days. This will give you a sense of what normal ping times are from
your location. In addition, you need a benchmark outside of the calan servers.
For instance, ping Google’s DNS servers, 8.8.8.8 and
8.8.4.4, at the same time and you’ll see how the two compare. Yes,
Google is likely quicker, but that does not matter. It is how the
two compare at a given moment that matters, not which is THE fastest.
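
If you would rather not write the ping times down by hand, a small script can collect them. The following is a rough Python sketch under a few assumptions: that the ping command is available on your machine, that it accepts -c for the count (or -n on Windows), and that it prints each reply with a "time=… ms" value. The function names are invented for this example.

```python
import platform
import re
import statistics
import subprocess
from typing import List, Optional

# Matches the per-reply time, e.g. "time=11.0 ms", "time=11ms" or "time<1ms".
_TIME_RE = re.compile(r"time[=<]\s*([\d.]+)\s*ms")


def reply_times_ms(ping_output: str) -> List[float]:
    """Pull every reply time (in milliseconds) out of ping's text output."""
    return [float(value) for value in _TIME_RE.findall(ping_output)]


def average_ms(ping_output: str) -> Optional[float]:
    """Average reply time, or None when no reply was received."""
    times = reply_times_ms(ping_output)
    return statistics.mean(times) if times else None


def run_ping(host: str, count: int = 4) -> str:
    """Run the system ping command and return what it printed.

    Assumes 'ping' is on the PATH; Windows uses -n for the count,
    Linux and macOS use -c.
    """
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(
        ["ping", count_flag, str(count), host],
        capture_output=True,
        text=True,
    )
    return result.stdout


# Example use (needs a network connection, so it is left commented out):
# for host in ("root.calancom.com", "8.8.8.8", "8.8.4.4"):
#     print(host, average_ms(run_ping(host)))
```

Running the commented-out loop a few times a day and saving the numbers gives you the baseline for both calan and Google.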
Armed with those data points,
should you feel you are experiencing slowness, you can ping both calan and
Google. You’ll see whether just one or both are running slower than your baseline
results. If the connection to calan’s servers is the issue, root.calancom.com
will have high ping times and Google will have about the same ping times
as in tests run during normal speed.
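
The comparison above can be sketched in a few lines of Python. This is only an illustration. The function name, the labels it returns, and the rule that "slower" means more than twice the baseline are all assumptions made for this example, so adjust the factor to suit what normal looks like from your location.

```python
def compare_to_baseline(baseline_calan_ms: float,
                        baseline_reference_ms: float,
                        now_calan_ms: float,
                        now_reference_ms: float,
                        slow_factor: float = 2.0) -> str:
    """Say which of the two ping targets is slower than its own baseline.

    'reference' is the outside benchmark (for example Google's 8.8.8.8).
    A target counts as slow when its current time is more than
    slow_factor times its baseline; 2.0 is an assumed threshold.
    """
    calan_slow = now_calan_ms > baseline_calan_ms * slow_factor
    reference_slow = now_reference_ms > baseline_reference_ms * slow_factor

    if calan_slow and reference_slow:
        return "both"
    if calan_slow:
        return "calan_only"
    if reference_slow:
        return "reference_only"
    return "normal"


# Hypothetical numbers: calan has jumped from 40 ms to 180 ms while
# Google is still close to its usual 12 ms.
print(compare_to_baseline(40, 12, 180, 13))
```

A result of "calan_only" matches the case described above, where the connection to calan’s servers is the issue. A result of "both" suggests the slowness is not specific to calan.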
Note: If you have questions, feel free to reach out and
ask. Hope everyone finds this helpful.