Infrastructure resource sizing: analyze operation metrics

Analyzing metrics collected by New Relic Infrastructure allows you to uncover opportunities to optimize your organization's operating environment, whether it is a physical datacenter or thousands of instances.

  • Using too few resources in key areas can lead to errors or performance problems.
  • Using too many resources can lead to unneeded costs.

For example, you may find that you can redistribute application instances to hosts that have extra memory and CPU resources, and terminate or repurpose the hosts those instances came from.

Use Infrastructure to ensure that your team is providing the right amount of compute power to meet customer expectations at appropriate costs.


This tutorial assumes you have reviewed New Relic's Establish objectives and baselines tutorial.

1. Evaluate your environment's current efficiency

To evaluate the current efficiency of your environment:

  1. Go to the Infrastructure Hosts page.
  2. Review the charts showing your environment's CPU percentage, load average, and memory used percentage. Keep an eye on the averages for each metric (represented by the dotted black line).
  3. Watch for outliers (low and high spikes).

These metrics provide a good overview of your environment’s capacity.
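
As a rough illustration of steps 2 and 3, the sketch below computes a fleet-wide average and flags hosts that sit far from it. The host names, the sample values, the find_outliers helper, and the two-standard-deviation threshold are all assumptions made for this example; they are not part of New Relic Infrastructure, which draws these charts for you.

```python
from statistics import mean, pstdev


def find_outliers(samples, threshold=2.0):
    """Return the fleet average and the hosts far from it.

    `samples` maps a host name to one metric value (for example CPU percent).
    A host is flagged when its value is more than `threshold` population
    standard deviations away from the average. The threshold is an assumption.
    """
    values = list(samples.values())
    average = mean(values)
    spread = pstdev(values)
    if spread == 0:
        # Every host reports the same value, so nothing stands out.
        return average, []
    outliers = [
        (host, value)
        for host, value in samples.items()
        if abs(value - average) > threshold * spread
    ]
    return average, outliers


# Hypothetical CPU percentages, one value per host.
cpu_percent = {
    "host-a": 22.0,
    "host-b": 25.0,
    "host-c": 24.0,
    "host-d": 23.0,
    "host-e": 95.0,
    "host-f": 21.0,
}

fleet_average, outliers = find_outliers(cpu_percent)
print(f"average CPU: {fleet_average:.1f}%")
for host, value in outliers:
    print(f"outlier: {host} at {value:.1f}%")
```

A single fixed threshold is a simplification. In practice you would likely look at values over a time window rather than one sample per host.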

Infrastructure: Optimize hosts > Hosts: To evaluate your operating environment's efficiency, review the Infrastructure metrics (including averages that show as dotted lines), watch for outliers, and expand and sort the table's Memory used column.

2. Identify underutilized hosts and applications

As you identify hosts that have extra capacity, start with memory usage, as that is a common limiting factor:

  1. From the Infrastructure Hosts page, expand the table, then sort the Memory used column in descending order.
  2. To identify good redistribution candidates, look for hosts using a small amount of memory that have a small number of applications deployed on them.
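
The selection rule in step 2 can be sketched as a simple filter. This is only an illustration: the Host record, the sample hosts, and the cutoffs (30 percent memory, two applications) are hypothetical and should be replaced with values that fit your own baselines.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    memory_used_percent: float
    app_count: int


def redistribution_candidates(hosts, max_memory_percent=30.0, max_apps=2):
    """Return lightly loaded hosts, least memory used first.

    Both cutoffs are assumptions for this sketch, not recommended values.
    """
    candidates = [
        host
        for host in hosts
        if host.memory_used_percent <= max_memory_percent
        and host.app_count <= max_apps
    ]
    return sorted(candidates, key=lambda host: host.memory_used_percent)


# Hypothetical fleet, as you might read it from the Hosts table.
fleet = [
    Host("host-a", 82.0, 6),
    Host("host-b", 18.0, 1),
    Host("host-c", 27.0, 2),
    Host("host-d", 12.0, 5),
    Host("host-e", 64.0, 1),
]

candidates = redistribution_candidates(fleet)
for host in candidates:
    print(f"{host.name}: {host.memory_used_percent}% memory, {host.app_count} apps")
```

Here host-d uses little memory but runs five applications, so it is not treated as an easy candidate to empty out.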

Also consider the health of an application before moving it to a different host.

  1. To ensure that an application has predictable performance, use New Relic APM.
  2. From the APM Overview page, track metrics for Apdex (user satisfaction) and average response time for transactions.
  3. Review other app performance details from the APM UI.
  4. Stabilize applications that are volatile before introducing other variables into their runtime performance.

3. Downsize hosts or add apps

Downsize hosts or add more applications to them where appropriate. This step may prove to be more art than science, because it is an instance of the classic bin packing problem. It is generally more cost-effective to consolidate applications onto larger hosts than it is to downsize hosts and run fewer applications on each smaller host.
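
To make the bin packing comparison concrete, here is a minimal sketch of the first-fit decreasing heuristic, a common approximation for that problem. It is not how New Relic or any orchestrator places workloads. The application names, memory figures, and the single-resource model (memory only, identical hosts) are assumptions for illustration.

```python
def first_fit_decreasing(app_memory_gb, host_capacity_gb):
    """Pack applications onto as few identical hosts as the heuristic finds.

    Applications are placed largest first, each onto the first host that
    still has room. Returns one list of application names per host.
    """
    hosts = []  # each entry: {"free": remaining capacity, "apps": [names]}
    by_size = sorted(app_memory_gb.items(), key=lambda item: item[1], reverse=True)
    for app, needed in by_size:
        if needed > host_capacity_gb:
            raise ValueError(f"{app} needs more memory than one host provides")
        for host in hosts:
            if host["free"] >= needed:
                host["apps"].append(app)
                host["free"] -= needed
                break
        else:
            # No existing host has room, so open a new one.
            hosts.append({"free": host_capacity_gb - needed, "apps": [app]})
    return [host["apps"] for host in hosts]


# Hypothetical memory needs in GB, packed onto hosts with 10 GB each.
apps = {"api": 6, "web": 4, "worker": 5, "cache": 3, "cron": 2}
placement = first_fit_decreasing(apps, host_capacity_gb=10)
for index, names in enumerate(placement):
    print(f"host {index}: {', '.join(names)}")
```

A real placement also has to account for CPU, disk, and the redundancy requirements discussed later in this tutorial, which is part of why the step is hard to fully automate.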

For more efficient use of computing resources, consider containerizing your applications. Orchestration services such as Kubernetes and Amazon Elastic Container Service (ECS) treat hosts as a collective pool of compute resources, redistributing container instances across clusters whenever they are started or restarted based on available capacity. New Relic Infrastructure includes many integration options, including Kubernetes, Amazon ECS, and more.

In a static host environment, you can track all of the details of your deployments; with containers, you may not know what the last deployment was or whether it completed. Use the Kubernetes integration with New Relic Infrastructure to monitor the health of your containerized infrastructure. In Infrastructure, go to Integrations > On host Integrations > Kubernetes. Once the integration is configured, you get a default dashboard that shows a wide range of information:

Container health dashboard > Integrations > On host Integrations > Kubernetes

To start, focus on:

  • Deployments/Pods: Ensure all desired pods in a deployment are running and healthy. An isolated container restart is not a problem, but multiple restarts indicate a larger problem.
  • Nodes: Monitor the CPU, memory, and disk utilization for Kubernetes workers and masters to ensure all nodes are healthy.

If there is too much pressure on your pods, add more resources to run your applications.
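
The pod checks described above can be expressed as a small rule set. The sketch below works on a hand-written dictionary rather than on data from the Kubernetes integration; the field names (desired_pods, ready_pods, restarts), the deployment names, and the restart limit of one are all hypothetical.

```python
def unhealthy_deployments(deployments, max_restarts=1):
    """Return a mapping of deployment name to the reasons it looks unhealthy.

    A deployment is reported when fewer pods are ready than desired, or when
    any pod has restarted more than `max_restarts` times (an assumed limit).
    """
    findings = {}
    for name, info in deployments.items():
        reasons = []
        if info["ready_pods"] < info["desired_pods"]:
            reasons.append(
                f"{info['ready_pods']} of {info['desired_pods']} desired pods ready"
            )
        noisy = sorted(
            pod for pod, count in info["restarts"].items() if count > max_restarts
        )
        if noisy:
            reasons.append(f"repeated restarts: {', '.join(noisy)}")
        if reasons:
            findings[name] = reasons
    return findings


# Hypothetical snapshot of two deployments.
snapshot = {
    "checkout": {
        "desired_pods": 3,
        "ready_pods": 3,
        "restarts": {"checkout-1": 0, "checkout-2": 1, "checkout-3": 0},
    },
    "search": {
        "desired_pods": 4,
        "ready_pods": 2,
        "restarts": {"search-1": 5, "search-2": 0},
    },
}

findings = unhealthy_deployments(snapshot)
for name, reasons in findings.items():
    print(f"{name}: {'; '.join(reasons)}")
```

The single restart on checkout-2 is tolerated as an isolated event, while the five restarts on search-1 are reported.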

Whether you are running applications in containers or traditional server hosts:

  • As you balance the load of each application across your environment, keep in mind redundancy and availability strategies.
  • After every redistribution operation, be sure to compare application performance and your operating environment's health to previously established baselines, to ensure that you continue to meet customer expectations for performance.
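
One way to make the baseline comparison repeatable is to check each metric against its baseline with a tolerance. The following is a sketch under stated assumptions: the metric names, the sample values, and the 10 percent tolerance are illustrative, and the only metric treated as "higher is better" is Apdex.

```python
# Metrics where a higher value is the better outcome (assumed set).
HIGHER_IS_BETTER = {"apdex"}


def find_regressions(baseline, current, tolerance=0.10):
    """Return metrics that got worse by more than `tolerance`.

    The result maps each regressed metric to its relative change from the
    baseline (0.3 means 30 percent higher). Metrics missing from `current`
    or with a zero baseline are skipped.
    """
    regressed = {}
    for metric, before in baseline.items():
        if metric not in current or before == 0:
            continue
        change = (current[metric] - before) / before
        worsening = -change if metric in HIGHER_IS_BETTER else change
        if worsening > tolerance:
            regressed[metric] = round(change, 3)
    return regressed


# Hypothetical values recorded before and after moving an application.
baseline = {"apdex": 0.95, "response_time_ms": 200.0, "error_rate_percent": 1.0}
after_move = {"apdex": 0.94, "response_time_ms": 260.0, "error_rate_percent": 1.05}

regressed = find_regressions(baseline, after_move)
for metric, change in regressed.items():
    print(f"{metric} changed by {change:+.0%} relative to baseline")
```

In this example only the response time is reported, because the small drop in Apdex and the small rise in error rate both stay within the tolerance.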
