Use adaptive mw spark dataframe function

Milimetric requested to merge update-consistency-check into main

This uses the new mediawiki-jdbc Spark datasource to update all of the MediaWiki fetching code in the consistency checker. Tested on some small wikis; bigger jobs may need tuning. The number of partitions is capped at 64, which could be made configurable with some more math.
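
As a rough illustration of the "more math" mentioned above, here is a minimal sketch of how a partition count might be derived from an estimated row count and then capped. This is not the code in this merge request: the function name `choose_num_partitions`, the `target_rows_per_partition` default, and the idea of sizing by row count are all assumptions made for the example. Only the cap of 64 comes from the description.

```python
import math

# Cap taken from the merge request description; everything else is hypothetical.
MAX_PARTITIONS = 64


def choose_num_partitions(
    estimated_rows: int,
    target_rows_per_partition: int = 500_000,  # assumed tuning knob, not from the MR
    max_partitions: int = MAX_PARTITIONS,
) -> int:
    """Pick a partition count proportional to table size, clamped to [1, max_partitions]."""
    if estimated_rows <= 0:
        # Empty or unknown size: a single partition is enough.
        return 1
    wanted = math.ceil(estimated_rows / target_rows_per_partition)
    return max(1, min(max_partitions, wanted))


if __name__ == "__main__":
    for rows in (0, 1_000_000, 10**9):
        print(rows, choose_num_partitions(rows))
```

With something like this, small wikis would get a handful of partitions while large ones would hit the cap, and raising `max_partitions` would be the single knob for bigger jobs.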

Bug: T372677