When do you need large heap for Elasticsearch?

  • I am running ES 1.5.2 on Java 1.8.0_45 and Windows 2008, with 4 nodes of 32 cores, 128 GB RAM and 5 TB of SSD each. My goal is to index about 2.5 billion documents, averaging 30k per doc. I am up to 810 million so far.

    I currently have ES_HEAP_SIZE=30g, but I have been experiencing a lot of memory pressure and stop-the-world (STW) pauses. For example, one node is currently always above 90% heap usage while the rest are coasting between 30% and 40%, so it seems that one node won't GC.

    Only two things are happening on the cluster: bulk indexing (no errors logged) and some scroll searches. I am using doc values where I can. There is currently no field data cache (except for Marvel, which is very small), and the filter cache is minimal, about 100 MB per node.

    The nodes are still trying to recover, so I am reluctant to stop the cluster fully and reset the heap to 10 GB.

    This is how I connect to the cluster for both bulk indexing and scroll searches:

        // Do this once at application startup and re-use the client instance.
        Settings settings = ImmutableSettings
            .settingsBuilder()
            .put("cluster.name", "xxxx")
            .build();
        client = new TransportClient(settings)
            .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300))
            .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300))
            .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300))
            .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300));

  • Answer:

    Don't send all the bulk requests to a single node, and the same goes for search requests. A bulk request is held in a memory buffer on the node that receives it, so directing every request at one node concentrates the memory pressure there. Round-robin the requests, either through a proxy server (if you have one) or through a client node (https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html) to which you send the requests; the client node knows how to round-robin them. You can also check whether the clients accessing the cluster support automatic round-robin or load balancing of requests.
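    As a rough illustration of the round-robin idea, here is a minimal sketch in plain Java. It is not part of the Elasticsearch API: the class name `RoundRobinNodes` and the node addresses in the usage note are hypothetical, and the sketch only shows how a caller could rotate through a fixed list of node addresses so that no single node receives every request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper (not an Elasticsearch class): hands out node
 * addresses in rotation so requests are spread across all nodes.
 */
final class RoundRobinNodes {
    private final List<String> nodes;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinNodes(List<String> nodes) {
        if (nodes == null || nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node address is required");
        }
        // Defensive copy so later changes to the caller's list do not affect rotation.
        this.nodes = new ArrayList<>(nodes);
    }

    /** Returns the next node address; safe to call from several threads. */
    String next() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), nodes.size());
        return nodes.get(index);
    }
}
```

    A caller would build the helper once, for example with the assumed addresses "node1:9200", "node2:9200" and "node3:9200", and call next() before each bulk or search request to pick the node that receives it. A proxy server or a client node does the same job without any code in the application.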

user432024 at Stack Overflow
