Performance tuning the Ingest service

You can tune search performance at data ingest time by adjusting your NiFi settings, or at runtime by changing the memory options for Elasticsearch.

Tuning Apache NiFi

To maximize performance, NiFi divides incoming data using a zero-master clustering approach. Data is divided into chunks and each node in the cluster performs the same operation on the chunk it receives. ZooKeeper elects one node as the Cluster Coordinator, and all other nodes send heartbeat data to it. The Coordinator is responsible for disconnecting nodes that do not report on time, or connecting new nodes that prove they have the same configuration as the other nodes in the cluster.
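
The coordinator behaviour described above can be pictured with a small toy model. This is a minimal sketch, not NiFi's implementation: the class name, the timeout value, and the use of a configuration hash to compare nodes are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Coordinator:
    """Toy cluster coordinator that tracks node heartbeats (hypothetical model)."""
    timeout_s: float                      # assumed heartbeat deadline, in seconds
    config_hash: str                      # assumed fingerprint of the shared configuration
    last_seen: dict = field(default_factory=dict)

    def connect(self, node_id: str, node_config_hash: str, now: float) -> bool:
        # Admit a node only if its configuration matches the cluster's.
        if node_config_hash != self.config_hash:
            return False
        self.last_seen[node_id] = now
        return True

    def heartbeat(self, node_id: str, now: float) -> None:
        # Record a heartbeat from a node that is already connected.
        if node_id in self.last_seen:
            self.last_seen[node_id] = now

    def disconnect_late_nodes(self, now: float) -> list:
        # Drop every node whose last heartbeat is older than the timeout.
        late = sorted(n for n, t in self.last_seen.items() if now - t > self.timeout_s)
        for node_id in late:
            del self.last_seen[node_id]
        return late


coordinator = Coordinator(timeout_s=5.0, config_hash="abc")
coordinator.connect("node-1", "abc", now=0.0)
coordinator.connect("node-2", "abc", now=0.0)
rejected = not coordinator.connect("node-3", "xyz", now=0.0)  # mismatched configuration
coordinator.heartbeat("node-1", now=4.0)
late = coordinator.disconnect_late_nodes(now=6.0)             # node-2 never reported
```

In this model, a node with a different configuration is never admitted, and a node that stops reporting is removed on the next check.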

This architecture is sensitive to disk caching, JVM heap allocation and garbage collection efficiency issues. You can adjust the following configuration settings to improve NiFi performance.
Memory (Bootstrap.conf file settings)

JVM memory
  • Description: Minimum and maximum heap memory. Note that a very large heap may slow down garbage collection.
  • Default: 512 MB
  • Range/options: Set to 4 to 8 GB, for example:
      -Xms8g
      -Xmx8g

Garbage collector (-XX:+UseG1GC)
  • Description: Java 8 has issues when using the recommended writeAheadProvenance implementation introduced in Apache NiFi 1.2.0 (HDF 3.0.0).

Java 8 or later ( file settings)

-XX:ReservedCodeCacheSize
  • Description: NiFi stores its data on disk while processing it. Under conditions of high throughput, the default CodeCache settings may prove inadequate.
  • Default: Varies according to Java version; can be as low as 32 MB
  • Range/options: 256 MB

-XX:CodeCacheMinimumFreeSpace
  • Description: Uncomment to use.
  • Range/options: 10 MB

-XX:+UseCodeCacheFlushing
  • Description: Sets threshold for flushing cache.

Filesystem storage for NiFi internal repositories

Flow file

Tuning (per NiFi node)

Threads
  • Description: Number of timer-driven threads. Do not use event-driven threads.
  • Range/options: 2 to 4 times the number of cores on the host
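
The NiFi settings above can be collected into a small helper. This is a sketch under stated assumptions: the function names are hypothetical, the values are the examples from the table rather than recommendations for every host, and the numbered `java.arg.N` line format is assumed to match your Bootstrap.conf file.

```python
import os


def nifi_jvm_args(heap_gb: int = 8, code_cache_mb: int = 256, min_free_mb: int = 10) -> list:
    """Return JVM flags using the example values from the table above."""
    if not 4 <= heap_gb <= 8:
        raise ValueError("the table above suggests a heap of 4 to 8 GB")
    return [
        f"-Xms{heap_gb}g",                                # minimum heap
        f"-Xmx{heap_gb}g",                                # maximum heap, same as minimum
        f"-XX:ReservedCodeCacheSize={code_cache_mb}m",
        f"-XX:CodeCacheMinimumFreeSpace={min_free_mb}m",
        "-XX:+UseCodeCacheFlushing",
    ]


def timer_driven_thread_range(cores: int) -> tuple:
    """Suggested thread-count range: 2 to 4 times the host's core count."""
    return (2 * cores, 4 * cores)


def bootstrap_lines(args: list, first_index: int = 2) -> list:
    # Assumption: JVM flags are listed as numbered java.arg.N entries.
    return [f"java.arg.{first_index + i}={arg}" for i, arg in enumerate(args)]


cores = os.cpu_count() or 1
low, high = timer_driven_thread_range(cores)
lines = bootstrap_lines(nifi_jvm_args())
```

Printing `lines` gives entries you can compare against the existing file before changing anything; `low` and `high` bound the timer-driven thread count for the current host.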

Tuning Elasticsearch

Elasticsearch uses the same zero-master clustering approach as NiFi. The coordinating node receives write requests and routes them to other cluster instances (shards). By default, each shard refreshes its filesystem cache once per second and commits every five seconds. The shard keeps a transaction log and flushes the log every thirty minutes.

In the query phase of the search process, the coordinating node takes incoming searches and sends them to all the shards. Each shard performs its own search, locally. The shard prioritizes the results and returns information about the top fifty documents to the coordinating node. In the fetch phase the coordinating node determines the top ten documents from each shard's list, and requests that each shard send it those documents.
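
The two phases can be illustrated with a toy model. This is a simplified sketch, not Elasticsearch code: shards are plain dictionaries, the scoring function is a hypothetical word count, and it assumes the coordinating node merges the per-shard lists into a single final ranking.

```python
import heapq


def query_phase(shard_docs: dict, score, shard_top_n: int = 50) -> list:
    """Each shard scores its own documents and returns its best (score, doc_id) pairs."""
    scored = [(score(text), doc_id) for doc_id, text in shard_docs.items()]
    return heapq.nlargest(shard_top_n, scored)


def search(shards: list, score, final_top_n: int = 10) -> list:
    # Query phase: every shard searches locally and reports only IDs and scores.
    candidates = []
    for shard_index, shard_docs in enumerate(shards):
        for doc_score, doc_id in query_phase(shard_docs, score):
            candidates.append((doc_score, doc_id, shard_index))
    # The coordinating node merges the per-shard lists into one ranking.
    winners = heapq.nlargest(final_top_n, candidates)
    # Fetch phase: request only the winning documents, addressed by shard and ID.
    return [(doc_id, shards[shard_index][doc_id]) for _, doc_id, shard_index in winners]


shards = [
    {"a1": "nifi nifi tuning", "a2": "heap size"},
    {"b1": "nifi heap", "b2": "nifi nifi nifi"},
]
results = search(shards, score=lambda text: text.split().count("nifi"), final_top_n=2)
```

Note that the scoring work touches every document on every shard, while the fetch step is a direct lookup of a few known IDs.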

The query phase usually takes significantly longer than the fetch phase, because during the query phase the shards must match the search against a potentially long list of documents and determine a score for each. In contrast, the fetch phase can complete quickly because the coordinating node requests a subset of the documents using direct addresses.

The primary way of improving Elasticsearch performance is to increase the refresh interval. Each refresh creates a new Lucene segment that must be merged later, so refreshing less often keeps the total segment count and the merge workload down. In addition, avoid swapping if at all possible. Set bootstrap.memory_lock=true to facilitate this.
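
As an illustration, the refresh interval is an index-level setting that can be changed through the index settings endpoint. The sketch below only builds the request path and JSON body and does not send anything; the index name `my-index` and the `30s` value are hypothetical, so choose an interval that suits your own ingest load.

```python
import json


def refresh_interval_request(index: str, interval: str) -> tuple:
    """Build the path and JSON body for a PUT request that updates index settings."""
    body = {"index": {"refresh_interval": interval}}
    return (f"/{index}/_settings", json.dumps(body))


# Hypothetical index name and interval, for illustration only.
path, body = refresh_interval_request("my-index", "30s")
```

Send the body to the returned path with an HTTP PUT using whichever client you already use to talk to the cluster.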

Adjust the following settings to optimize the Elasticsearch working environment.

JVM heap
  • Description: Minimum and maximum heap memory.
  • Set the minimum and maximum to the same value to avoid resizing
  • Allocate only up to 50% of available memory
  • Do not exceed 32 GB
  • Disable OS swapping

Garbage collector (-XX:+UseG1GC)
  • Description: Use the Java 8 Garbage First collector for heap sizes below 4 GB.

Index buffer size
  • 10% of heap size

Filesystem cache
  • Filesystem storage type: use NIO FS (maps to Lucene NIOFSDirectory), which allows multiple threads to read from the same file concurrently
  • [ nvd, dvd, tim, doc, dim
  • 50% of Elasticsearch memory size

LRU cache

Node query cache
  • Description: Set with the indices.queries.cache.size parameter
  • 10% of heap size

Shard query cache
  • Description: Used for aggregation

Field data cache
  • Description: Set with the indices.fielddata.cache.size parameter
  • Limited to 30% of heap size

General settings

Threadpool
  • generic, index, get, bulk
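
The heap and cache percentages above can be turned into concrete numbers for a given host. This is a rough sketch: the function name is hypothetical, and the default cap of 31 GB is an assumption chosen to stay safely under the 32 GB limit.

```python
def elasticsearch_memory_plan(total_ram_gb: float, heap_cap_gb: float = 31.0) -> dict:
    """Apply the sizing rules from the table above to a host's RAM."""
    # Allocate at most 50% of available memory, and stay under the 32 GB limit.
    heap_gb = min(total_ram_gb * 0.5, heap_cap_gb)
    heap_mb = int(heap_gb * 1024)
    return {
        "heap_gb": heap_gb,
        # Minimum and maximum are set to the same value to avoid resizing.
        "jvm_flags": [f"-Xms{heap_mb}m", f"-Xmx{heap_mb}m"],
        "index_buffer_gb": heap_gb * 0.10,          # index buffer: 10% of heap
        "node_query_cache_gb": heap_gb * 0.10,      # indices.queries.cache.size
        "fielddata_cache_max_gb": heap_gb * 0.30,   # indices.fielddata.cache.size
    }


plan = elasticsearch_memory_plan(16)
```

For a 16 GB host this gives an 8 GB heap; for very large hosts the cap takes over, so the heap stops growing once half of the memory would exceed it.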