Port the standard wikitext list/table filter from section topics
- IMPORTANT: don't extract the lead section anymore. When coupled with the filter, doing so causes a steady increase in executor memory, and the lead section isn't used downstream anyway
- adapt filtering logic from https://gitlab.wikimedia.org/repos/structured-data/section-topics/-/blob/a11b6e70f2b00d039b05715167545d6abc284717/section_topics/pipeline.py#L158
- remove the `split_section` function
- `_process_sections` now extracts heading and content, skips null or empty content, and skips content with standard lists or tables
- isolate heading normalization logic
- update tests
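
For reviewers, the skip rules listed above can be sketched roughly as follows. This is an illustration only, not the code in this change: `process_sections_sketch`, `normalize_heading_sketch`, and the `LIST_OR_TABLE` regex are hypothetical stand-ins, and the assumption that a "standard" list line starts with `*`, `#`, `;` or `:` and a table opens with `{|` is mine, not taken from section topics.

```python
import re

# Assumption: a "standard" wikitext list line starts with *, #, ; or :
# and a table opens with {| at the start of a line.
LIST_OR_TABLE = re.compile(r"^(?:[*#;:]|\{\|)", re.MULTILINE)


def normalize_heading_sketch(heading: str) -> str:
    """Hypothetical normalization: drop '=' markers and whitespace, lowercase."""
    return heading.strip().strip("=").strip().lower()


def process_sections_sketch(sections):
    """Yield (normalized heading, content) for the sections that are kept.

    `sections` is an iterable of (heading, content) pairs; content may be None.
    """
    for heading, content in sections:
        # Skip null or empty content.
        if content is None or not content.strip():
            continue
        # Skip content that contains a list or a table.
        if LIST_OR_TABLE.search(content):
            continue
        yield normalize_heading_sketch(heading), content.strip()
```

With this sketch, a prose-only section is kept, while sections whose content is null, blank, a bullet list, or a table are dropped.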
Bug: T330841