Planning Your Collector Deployment

The Collector is a machine on your network running Rapid7 software that either polls Event Sources for data or receives data pushed from them, and makes that data available for InsightIDR analysis. An Event Source represents a single device that sends logs to the Collector. For example, if you have three firewalls, you will have one Event Source for each firewall in the Collector. The Collector is the on-premises component of InsightIDR.

The Collector is responsible for gathering endpoint data. It is often more efficient to deploy multiple Collectors throughout an environment than to weaken firewall rules or overload a single Collector. Treat your Collectors as you would any other highly valuable asset: credentials for the various Event Sources you configure are stored on the Collector.

A Collector can be installed on a network server or virtual machine that meets the following requirements:

  • Operating system: Linux 64-bit or Windows 64-bit
  • Minimum hardware: 4 GB RAM and 60 GB disk space
  • CPU: 2 CPUs recommended; 1 CPU per 16,000 endpoints scanned by the Endpoint Scan
  • Network bandwidth: 100 Mbps minimum; 1000 Mbps strongly recommended
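
The CPU guidance in the list above can be turned into a quick sizing estimate. The following is a minimal sketch, not an official Rapid7 sizing tool: the function name `estimate_cpus` is hypothetical, and combining the "2 CPUs recommended" figure with the "1 CPU per 16,000 endpoints" rule by taking the larger of the two is an assumption.

```python
import math

ENDPOINTS_PER_CPU = 16_000   # from the requirements list
RECOMMENDED_MIN_CPUS = 2     # from the requirements list


def estimate_cpus(endpoint_count: int) -> int:
    """Estimate the CPU count for one Collector.

    Assumption: the two CPU figures are combined by taking the larger
    value. This is an illustration, not documented Rapid7 behavior.
    """
    if endpoint_count < 0:
        raise ValueError("endpoint_count must be non-negative")
    # Round up so a partially filled block of endpoints still gets a CPU.
    cpus_for_endpoints = math.ceil(endpoint_count / ENDPOINTS_PER_CPU)
    return max(RECOMMENDED_MIN_CPUS, cpus_for_endpoints)
```

For example, `estimate_cpus(50_000)` rounds 3.125 up to 4 CPUs, while any count up to 32,000 endpoints stays at the recommended 2.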

Only one Collector can be installed per machine on your network. Rapid7 strongly recommends that the machine (physical or virtual) be dedicated to running the Collector.

Begin by configuring multiple Event Sources on a single Collector. Later, you can add Collectors as needed. For example, you may need to distribute the bandwidth across your network if you have very high logging levels or if your network is geographically dispersed.

To plan your Collector deployment, have the following information available for each server or virtual machine where you will install the Collector:

  • Display name
  • Network location
  • Server host name and IP address

You must have administrator rights to install a service on the server.

The following process pairs the Collector installed in your network with Amazon Web Services (AWS), where the InsightIDR servers are hosted. Note that no credentials are stored in AWS, and the Collector strips raw logs in your environment so that no sensitive data (e.g., PII or medical records) is stored by Rapid7.

  1. Configure firewall/web proxy rules to allow the Collector to reach https://data.insight.rapid7.com and https://s3.amazonaws.com. If a firewall or web proxy restricts outgoing connections, you must allow the Collector to connect to these backend servers. Customers deployed in the Frankfurt, Germany instance need to be able to reach https://eu.data.insight.rapid7.com and https://s3.eu-central-1.amazonaws.com.
  2. All Collectors must be able to reach port 443 on https://endpoint.ingress.rapid7.com (US) or https://eu.endpoint.ingress.rapid7.com (EMEA).
  3. Disable the local firewall (if possible).
  4. From your desktop, navigate to https://insight.rapid7.com and log in with your InsightIDR credentials (if you do not have credentials, contact an InsightIDR Sales Representative).
  5. Download the Collector installer from https://insight.rapid7.com.
  6. Copy the installer to the machine where the Collector will run.
  7. Follow the installation wizard.
  8. Click Activate Collector, name the Collector, paste the Agent Key, and click Activate.
  9. All Collectors must be configured with a fully qualified domain name (e.g. idrcollector23.myorg.com).
  10. All endpoints need to be able to communicate back to the Collector via Collector ports:
    • 5508
    • 6608
    • 20000-30000
  11. Overlapping endpoint monitoring ranges are not allowed. IP addresses or IP ranges defined on Collector A must not be duplicated on Collector B. If overlaps exist, resolve them before the migration; otherwise, those ranges must be updated manually after the migration.
  12. Each Collector supports only one set of endpoint monitoring credentials. A separate Collector instance must be set up for each set of endpoint monitoring credentials.
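
Two of the steps above lend themselves to a scripted check before installation: outbound reachability on port 443 (steps 1 and 2) and non-overlapping endpoint monitoring ranges (step 11). The following is a minimal sketch using only the Python standard library. The hostnames come from the steps above; the region keys `"us"` and `"eu"`, the function names, and the input format for the ranges are hypothetical and are not part of any Rapid7 tooling. Note that the reachability check opens a direct TCP connection, so it does not exercise a web proxy.

```python
import ipaddress
import socket
from itertools import combinations

# Hostnames from steps 1 and 2; the "us"/"eu" keys are illustrative labels.
REQUIRED_HOSTS = {
    "us": [
        "data.insight.rapid7.com",
        "s3.amazonaws.com",
        "endpoint.ingress.rapid7.com",
    ],
    "eu": [
        "eu.data.insight.rapid7.com",
        "s3.eu-central-1.amazonaws.com",
        "eu.endpoint.ingress.rapid7.com",
    ],
}


def unreachable_hosts(region, port=443, timeout=5.0):
    """Return the required hosts that refuse a direct TCP connection."""
    failed = []
    for host in REQUIRED_HOSTS[region]:
        try:
            # Open and immediately close a TCP connection to the host.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            failed.append(host)
    return failed


def find_overlaps(ranges_by_collector):
    """Find monitoring ranges that overlap between different Collectors.

    ranges_by_collector maps a Collector name to a list of CIDR ranges
    or single IP addresses, given as strings (an assumed input format).
    """
    parsed = [
        (name, ipaddress.ip_network(text, strict=False))
        for name, ranges in ranges_by_collector.items()
        for text in ranges
    ]
    overlaps = []
    for (name_a, net_a), (name_b, net_b) in combinations(parsed, 2):
        # Only compare ranges on different Collectors and of the same IP version.
        if name_a == name_b or net_a.version != net_b.version:
            continue
        if net_a.overlaps(net_b):
            overlaps.append((name_a, str(net_a), name_b, str(net_b)))
    return overlaps
```

Calling `unreachable_hosts("us")` on the intended Collector machine returns an empty list when every host accepts a connection. Calling `find_overlaps` with the planned ranges for each Collector returns one tuple per conflicting pair, which can be resolved before any ranges are configured.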

What's Next?