Apache Hadoop integration

Our Apache Hadoop integration monitors the performance of your Hadoop cluster and applications.

After you set up our Apache Hadoop integration, we give you a dashboard for your Apache Hadoop metrics.

Install the infrastructure agent

To get data into New Relic, install our infrastructure agent. The agent collects and ingests data so you can keep track of your app's performance. It must be version 1.10.7 or higher to support the NRI-Flex integration.
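
Version strings are easy to misjudge with a plain string comparison (for example, 1.9.12 sorts after 1.10.7 alphabetically but is the older release). As a minimal sketch, assuming you already have the agent's version as a dotted string such as 1.10.7, a numeric comparison might look like this. The function names are hypothetical and not part of any New Relic tooling:

```python
# Minimum infrastructure agent version that supports NRI-Flex, per the text above.
MIN_FLEX_VERSION = (1, 10, 7)


def parse_version(text):
    """Turn a dotted version string like '1.10.7' into a tuple of ints.

    Assumes purely numeric components; suffixes such as '-beta' are not handled.
    """
    return tuple(int(part) for part in text.strip().split("."))


def supports_flex(agent_version):
    """Return True if the given agent version is 1.10.7 or higher."""
    return parse_version(agent_version) >= MIN_FLEX_VERSION
```

With this sketch, supports_flex("1.9.12") is False and supports_flex("1.10.7") is True, because the components are compared as numbers rather than as text.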

You can install the infrastructure agent in two different ways; see our infrastructure agent installation docs for details.

Configure NRI-Flex for Apache Hadoop

Flex comes bundled with the New Relic infrastructure agent. To create a Flex configuration file, follow these steps:

  1. Create a file named nri-flex-hadoop-config.yml in this path:

    /etc/newrelic-infra/integrations.d
  2. Use our configuration template to update the EVENT_TYPE and YOUR_DOMAIN fields in nri-flex-hadoop-config.yml. The event_type value is the name under which the metrics are stored in NRDB.

    Example:

    • EVENT_TYPE1 (the NameNode endpoint, port 9870 by default) can be updated to HadoopNameNodeSample
    • EVENT_TYPE2 (the ResourceManager endpoint, port 8088 by default) can be updated to HadoopResourceManagerSample

    Your nri-flex-hadoop-config.yml file should look like this:

    integrations:
      - name: nri-flex
        # interval: 30s
        config:
          name: hadoopMetrics
          apis:
            - event_type: EVENT_TYPE1
              commands:
                # run any command, you could cat .json file, or run some commands that produce a json output
                # the example just calls an API that returns json
                - run: curl -s https://YOUR_DOMAIN:9870/jmx # json output is retrieved from this command
            - event_type: EVENT_TYPE2
              commands:
                - run: curl -s https://YOUR_DOMAIN:8088/jmx?qry=Hadoop:*
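
Before relying on the Flex configuration, it can help to confirm that each JMX endpoint actually returns JSON. The sketch below is one way to do that in Python, assuming the endpoints respond with the usual Hadoop JMX servlet shape (a top-level beans array whose entries carry a name field). The function names are hypothetical, and this is not how Flex itself processes the output:

```python
import json
import sys
from urllib.request import urlopen


def summarize_beans(payload):
    """Map each JMX bean name to the number of numeric attributes it exposes."""
    beans = json.loads(payload).get("beans", [])
    summary = {}
    for bean in beans:
        numeric = [
            key for key, value in bean.items()
            # bool is a subclass of int in Python, so exclude it explicitly
            if isinstance(value, (int, float)) and not isinstance(value, bool)
        ]
        summary[bean.get("name", "<unnamed>")] = len(numeric)
    return summary


def fetch_jmx(url, timeout=5.0):
    """Fetch the raw response body from a JMX endpoint as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__" and len(sys.argv) > 1:
    # The URL is passed on the command line, e.g. https://YOUR_DOMAIN:9870/jmx
    for name, count in summarize_beans(fetch_jmx(sys.argv[1])).items():
        print(f"{name}: {count} numeric attributes")
```

Saved under a name of your choice (for example check_jmx.py), it could be run as python3 check_jmx.py https://YOUR_DOMAIN:9870/jmx. An empty result or a JSON parsing error suggests the URL or port needs adjusting before Flex can collect anything.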

Forward Apache Hadoop logs to New Relic

You can use our log forwarding to forward Apache Hadoop logs to New Relic.

On Linux machines, the log forwarding configuration file, logging.yml, should be in this directory:

/etc/newrelic-infra/logging.d/

After creating logging.yml, add the following configuration to it:

logs:
  - name: hadoop_secondarynamenode_log
    file: /usr/local/hadoop/logs/hadoop-hadoopuser-secondarynamenode-hadoop-master.log
    attributes:
      logtype: hadoop_secondarynamenode_logs
  - name: hadoop_resourcemanager_log
    file: /usr/local/hadoop/logs/hadoop-hadoopuser-resourcemanager-hadoop-master.log
    attributes:
      logtype: hadoop_hadoop_resourcemanager_logs
  - name: hadoop_namenode_log
    file: /usr/local/hadoop/logs/hadoop-hadoopuser-namenode-hadoop-master.log
    attributes:
      logtype: hadoop_namenode_logs
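
The file paths above appear to embed a user name and a host name (here hadoopuser and hadoop-master), so they will likely differ on your machines. As a minimal sketch, assuming you run it on the Hadoop host with the paths adjusted to your setup, the following Python snippet reports any configured log file that is missing or unreadable. The function name is hypothetical:

```python
import os

# Paths copied from the logging.yml example; adjust them to your installation.
LOG_FILES = [
    "/usr/local/hadoop/logs/hadoop-hadoopuser-secondarynamenode-hadoop-master.log",
    "/usr/local/hadoop/logs/hadoop-hadoopuser-resourcemanager-hadoop-master.log",
    "/usr/local/hadoop/logs/hadoop-hadoopuser-namenode-hadoop-master.log",
]


def unreadable_logs(paths):
    """Return the paths that are not regular files readable by the current user."""
    return [
        path for path in paths
        if not (os.path.isfile(path) and os.access(path, os.R_OK))
    ]


if __name__ == "__main__":
    for path in unreadable_logs(LOG_FILES):
        print(f"missing or unreadable: {path}")
```

The check runs with the permissions of whoever executes the script, which may differ from the user the infrastructure agent runs as.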

Restart the New Relic infrastructure agent

Before you can start reading your data, use the instructions in our infrastructure agent docs to restart your infrastructure agent.

sudo systemctl restart newrelic-infra.service

Within a couple of minutes, your Hadoop metrics should start appearing in one.newrelic.com.

Find your data

You can choose our pre-built dashboard template named Apache Hadoop to monitor your Apache Hadoop server metrics. Follow these steps to use our pre-built dashboard template:

  1. From one.newrelic.com, go to the + Add data page.
  2. Click on Dashboards.
  3. In the search bar, type apache hadoop.
  4. The Apache Hadoop dashboard should appear. Click on it to install it.

Your Apache Hadoop dashboard is considered a custom dashboard and can be found in the Dashboards UI. For docs on using and editing dashboards, see our dashboard docs.

Here is a NRQL query to check the active users from the resource manager:

SELECT latest(activeUsers)
FROM HadoopResourceManagerSample

Here is a NRQL query to view the number of active clients from the name node:

SELECT latest(numActiveClients)
FROM HadoopNameNodeSample

What's next?

To learn more about building NRQL queries and generating dashboards, check out our NRQL and dashboard docs.

Copyright © 2024 New Relic Inc.
