search: process_sparql_query work around OOM issues (take 2)
This job has been failing more and more often. Make a few changes to try to get it running more consistently:
- Increase output partitions to 4. A review of a few days of data shows typical outputs of 1.5-3GB. Split the output into 4 partitions so that we don't try to do as much work in one place.
- Add some memory overhead. This was previously at the default of 10%, or ~1.6GB. Increase it to 3GB, since YARN is killing executions for overrunning their memory allocation.
- Disable adaptive query execution (AQE). It is not certain that this is necessary, but in testing from a Jupyter notebook the job regularly failed with AQE enabled and passed with it disabled. This job is very simple and shouldn't need anything fancy from AQE.
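
For reference, a rough sketch of what the three settings amount to. This is not the actual job configuration: the helper names are hypothetical, and the 16GB executor memory is only inferred from the "10%, or ~1.6GB" figure above. The two config keys are standard Spark properties, and the default overhead rule (10% of executor memory, minimum 384MiB) is Spark's documented default.

```python
# Illustrative only: names here are hypothetical, not taken from the job.
OUTPUT_PARTITIONS = 4  # typical output is 1.5-3GB, so split it four ways


def default_overhead_mib(executor_memory_mib: int) -> int:
    """Spark's default overhead: 10% of executor memory, at least 384 MiB."""
    return max(384, int(executor_memory_mib * 0.10))


# Settings described above, keyed by their Spark property names.
SPARK_CONF = {
    "spark.executor.memoryOverhead": "3g",   # was the 10% default
    "spark.sql.adaptive.enabled": "false",   # turn AQE off
}


def to_submit_args(conf: dict) -> list:
    """Render a conf dict as spark-submit style `--conf key=value` pairs."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# Assumption: 16GB executors, inferred from the ~1.6GB default overhead.
ASSUMED_EXECUTOR_MEMORY_MIB = 16 * 1024

if __name__ == "__main__":
    print(default_overhead_mib(ASSUMED_EXECUTOR_MEMORY_MIB), "MiB default overhead")
    print(" ".join(to_submit_args(SPARK_CONF)))
```

Inside the job, the partition change would then be along the lines of `df.repartition(OUTPUT_PARTITIONS)` before the write, assuming the output is a DataFrame.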