Using New Relic Geography
Here at Kent State we have weekly meetings to review the performance of our internal systems. This includes systems like our ERP, LMS, general web apps, and student portal (FlashLine). In one of these meetings it was pointed out that our new student portal has a poor Apdex score in the middle of the night. Since our performance reports are generated from our New Relic account, I started my investigation there.
As you can see, there is a drop in the Apdex score between 2am and 6am. My immediate reaction was to think this is normal because we get so little traffic at that time of night. I even found a graph that shows a pages per minute (ppm) metric of less than 10! With so few page views, a handful of slow ones is enough to drag the score down. That's an easy explanation, but I wanted to see what else New Relic could tell me, because ultimately some people are having slow performance and we need to correct that.
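To make the low-traffic effect concrete, here is a minimal sketch of the standard Apdex formula: satisfied requests count fully, tolerating requests (up to four times the threshold) count half, and frustrated requests count for nothing. The 0.5 second threshold and the sample response times are assumptions for illustration, not our actual settings or data.

```python
def apdex(response_times, t=0.5):
    """Return the Apdex score for a list of response times in seconds.

    t is the "satisfied" threshold; 0.5s is an assumed value, not our real setting.
    """
    if not response_times:
        return None  # no traffic means there is no score to report
    satisfied = sum(1 for r in response_times if r <= t)
    tolerating = sum(1 for r in response_times if t < r <= 4 * t)
    return (satisfied + tolerating / 2) / len(response_times)


# Hypothetical daytime sample: 10 slow page views hidden among 1000.
daytime = [0.3] * 990 + [3.0] * 10
# Hypothetical overnight sample: 3 slow page views out of only 10.
overnight = [0.3] * 7 + [3.0] * 3

print(apdex(daytime))
print(apdex(overnight))
```

The same few slow page views barely register during the day but dominate the score overnight, which is why a low score at 3am is not proof of a problem on its own.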
I started to look into where that time was being spent and I noticed something interesting. The time was not being spent at the application server level; instead, there was a spike in network time.
I immediately started asking why there was an increase in network time. The obvious answer is geography, or cellphone network speeds. To prove this I started looking at the New Relic geography view and discovered that most of the slow traffic came from Asia, specifically India and China. Additionally, India had the majority of all page resets during that time span.
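The geography view does this grouping for you, but the idea behind it is simple enough to sketch. The example below assumes you have exported page views as dictionaries with hypothetical "country" and "network_s" fields; those names and the sample numbers are made up for illustration and are not New Relic's export format.

```python
from collections import defaultdict
from statistics import mean


def mean_network_time_by_country(page_views):
    """Rank countries by average network time, slowest first."""
    by_country = defaultdict(list)
    for view in page_views:
        by_country[view["country"]].append(view["network_s"])
    averages = {country: mean(times) for country, times in by_country.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Hypothetical overnight page views (field names and values are assumptions).
sample_views = [
    {"country": "IN", "network_s": 4.0},
    {"country": "IN", "network_s": 6.0},
    {"country": "CN", "network_s": 3.0},
    {"country": "US", "network_s": 0.5},
    {"country": "US", "network_s": 1.5},
]

ranking = mean_network_time_by_country(sample_views)
for country, seconds in ranking:
    print(country, seconds)
```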
This was a satisfying answer, but as I studied the graphs more I noticed a hotspot of slow requests in the United States, specifically Ashburn, Virginia. This was a particularly puzzling discovery given that our datacenter is also located in Virginia. I decided to check another night and realized that every night we get a large number of slow requests from Ashburn, Virginia.
Thanks to the handy pni module, we were logging the username to New Relic with our transaction data. When I looked at the users that were making these slow requests, I realized all the requests were coming from a single account. The account handles our synthetic monitoring, and because it is a “fake account” it was causing all of our data requests to time out.
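Once the username travels with each transaction, spotting a single noisy account is just a matter of counting. Here is a rough sketch of that step, assuming transactions exported as dictionaries with hypothetical "username" and "duration_s" fields; the account names, the 5 second cutoff, and the sample data are all invented for illustration.

```python
from collections import Counter


def slow_requests_by_user(transactions, threshold_s=5.0):
    """Count transactions slower than threshold_s per username, most frequent first."""
    counts = Counter(
        t["username"] for t in transactions if t["duration_s"] > threshold_s
    )
    return counts.most_common()


# Hypothetical transactions from one city (names and timings are assumptions).
sample_transactions = [
    {"username": "synthetic-monitor", "duration_s": 30.0},
    {"username": "synthetic-monitor", "duration_s": 31.5},
    {"username": "synthetic-monitor", "duration_s": 29.8},
    {"username": "student1", "duration_s": 1.2},
    {"username": "student2", "duration_s": 6.1},
]

offenders = slow_requests_by_user(sample_transactions)
print(offenders)
```

If one username accounts for nearly all of the slow transactions from a location, as happened here, the location is probably a symptom of that account rather than of the network.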
Finding the cause of these slow transactions was half the battle. I am not sure what the solution is going to be. I suspect we will have our hosting provider provision a couple of additional servers in the Asia region and implement an aggressive cache warming strategy for our external data. I hope this gives you some insight into how to use New Relic. New Relic is highly useful if you know what you're doing, and if you don't, it is well worth learning. New Relic gives you a lot of good information out of the box, but it can become an invaluable tool with a little tender loving care.