Churn Without Fragmentation: How a Party-Label Bug Reversed My Headline Finding

Between 2018 and 2022, English urban councils became nearly twice as volatile. Median volatility rose from 12.0 to 22.5.

But the party system did not fragment.

That distinction became visible only after fixing a categorical data bug.

Here, volatility measures how much vote share moved between party families. Fragmentation measures how many effective parties competed. A council can be highly volatile without becoming more fragmented if one major party collapses and another absorbs most of the loss.

The effective number of parties increased in only 18 of 67 comparable authorities. The median change in the fragmentation index stayed slightly negative: -0.31. The vote moved sharply, but it mostly moved inside an already-consolidating party system.

The first version of this analysis looked dramatically different. It suggested fragmentation had risen in 66 of 67 councils and that median volatility had tripled. That was wrong. The error came from treating ballot labels such as “Labour Party” and “Labour and Co-operative Party” as separate analytical parties. Once party families were normalised before computing the metrics, the headline changed completely.

What looked like a party-label bug was really a category-modelling failure. And its consequences propagated through every downstream metric.

The corrected story is less sensational. It is also more useful.

Categories are part of the model

Before walking through the findings, it is worth explaining what went wrong, because this is the part that generalises most directly beyond elections.

Party labels are not neutral strings. They encode messy institutional reality: alliances, ballot wording, local party brands, national party rebrands, and inconsistent source coding. If those labels are grouped incorrectly, every downstream metric can look precise and still be wrong.

That is exactly what happened. Fragmentation was computed before normalising party families. In boroughs where “Labour Party” and “Labour and Co-operative Party” both appeared, the Laakso-Taagepera denominator treated them as separate parties. That artificially inflated the effective number of parties. The same risk applied to UKIP, Reform UK, and Brexit Party labels.
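To make the size of that inflation concrete, here is a minimal sketch with invented vote shares for one borough. The `FAMILY_MAP` table and the function names are assumptions for illustration, not the project's actual normalisation code.

```python
from collections import defaultdict

# Hypothetical label-to-family mapping; the project's real table may differ.
FAMILY_MAP = {
    "Labour Party": "Labour",
    "Labour and Co-operative Party": "Labour",
    "Conservative Party": "Conservative",
    "Liberal Democrats": "Liberal Democrats",
}

def effective_number_of_parties(shares):
    """Laakso-Taagepera index: 1 / sum of squared (renormalised) vote shares."""
    total = sum(shares.values())
    return 1.0 / sum((v / total) ** 2 for v in shares.values())

def normalise(shares_by_label):
    """Sum vote shares by analytical family before any metric is computed."""
    families = defaultdict(float)
    for label, share in shares_by_label.items():
        families[FAMILY_MAP[label]] += share
    return dict(families)

# Invented shares for one borough (not taken from the dataset).
raw = {
    "Labour Party": 0.30,
    "Labour and Co-operative Party": 0.25,
    "Conservative Party": 0.30,
    "Liberal Democrats": 0.15,
}

enp_raw = effective_number_of_parties(raw)                  # labels as parties
enp_family = effective_number_of_parties(normalise(raw))    # families as parties
print(round(enp_raw, 2), round(enp_family, 2))  # 3.77 2.41
```

In this toy case the same ballots look like a system of almost four effective parties when raw labels are used, and fewer than two and a half once the two Labour labels are merged.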

The fix was conceptually simple: compute analytical party families before metric aggregation.

The pipeline now separates three identities:

Metric party family: used for fragmentation, volatility, and swing calculations.
Challenger party family: used for scenario and challenger identification.
Display party label: used only for Tableau colour and labelling.

Do not let display labels leak into metric definitions. Do not let raw strings define analytical categories without an explicit contract.
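One way to express such a contract is sketched below. The `PartyIdentity` class, the `PARTY_CONTRACT` entries, and the `resolve` helper are all hypothetical names, and whether the challenger family ever differs from the metric family is a project decision this sketch does not try to guess. The point is only that an unmapped ballot label should fail loudly instead of silently becoming a new party.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PartyIdentity:
    metric_family: str      # used for fragmentation, volatility, swing
    challenger_family: str  # used for scenario and challenger identification
    display_label: str      # used only for dashboard colour and labelling

# Hypothetical contract entries; the real mapping lives in the project repo.
PARTY_CONTRACT = {
    "Labour Party": PartyIdentity("Labour", "Labour", "Labour"),
    "Labour and Co-operative Party": PartyIdentity("Labour", "Labour", "Labour (Co-op)"),
    "UKIP": PartyIdentity("Reform/UKIP", "Reform/UKIP", "UKIP"),
    "Reform UK": PartyIdentity("Reform/UKIP", "Reform/UKIP", "Reform UK"),
    "Brexit Party": PartyIdentity("Reform/UKIP", "Reform/UKIP", "Brexit Party"),
}

def resolve(raw_label: str) -> PartyIdentity:
    """Look up a ballot label; refuse to invent a category for unknown strings."""
    key = raw_label.strip()
    if key not in PARTY_CONTRACT:
        raise KeyError(f"Unmapped ballot label: {raw_label!r}")
    return PARTY_CONTRACT[key]

# Metrics read metric_family; the dashboard reads display_label.
coop = resolve("Labour and Co-operative Party")
print(coop.metric_family, "|", coop.display_label)
```

Raising on unknown labels turns a silent categorisation error into a pipeline failure that has to be resolved before any metric is computed.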

The difference between the original headline (“fragmentation rose in 66 of 67 councils”) and the corrected headline (“fragmentation rose in only 18 of 67”) is not a rounding error. It is a categorisation error that propagated through the entire pipeline. Every chart and every narrative conclusion shifted once the fix was applied.

The broader principle applies well beyond elections. Product categories, job titles, company names, diagnosis codes, and merchant names all have the same failure mode. If category normalisation happens after aggregation, it is too late. The story has already been distorted.

How the analysis works

The project follows a pattern-first approach: build the data pipeline, export the metrics, construct the visualisation, then let the data tell you which story it actually supports. The corrected fragmentation finding, the null turnout correlation, and the geographic shift in Green gains all emerged from diagnostic validation, not from the original project plan.

The pipeline ingests ward-level election results from the DCLEAPIL v1.0 dataset (Leman 2025), which draws on Andrew Teale’s LEAP archive and Democracy Club data. It normalises party families, aggregates vote shares to the authority level, computes fragmentation and volatility metrics, and exports structured CSVs for an interactive Tableau dashboard.

The analysis covers 68 English metropolitan borough, London borough, and West Yorkshire authorities across five regions. Of these, 67 have comparable fragmentation data across the 2018-to-2022 window.

The core metrics are:

Fragmentation Index: the Laakso-Taagepera effective number of parties, from authority-level vote shares.
Volatility Score: a composite metric combining a Pedersen-style absolute swing component with the change in fragmentation.
Turnout Delta: percentage-point change in turnout across the same window.
Party Swing: change in vote share by normalised party family.
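
For reference, the two standard building blocks can be written out as follows, where p_{i,t} is the vote share of party family i in an authority at election t (as a fraction for N, in percentage points for V). These are the textbook Laakso-Taagepera and Pedersen definitions; the project's "Pedersen-style" component may differ in detail, and the way the composite Volatility Score combines swing with fragmentation change is not specified here, so it is not reproduced.

```latex
N_t = \frac{1}{\sum_{i} p_{i,t}^{2}},
\qquad
V_{t-1 \to t} = \frac{1}{2} \sum_{i} \left| p_{i,t} - p_{i,t-1} \right|
```

Both sums run over normalised party families, which is why the family mapping has to be fixed before either quantity is computed.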

The approach generalises to any domain where you need to compute derived metrics from messy categorical data and present them in a validated, reproducible visualisation. The full pipeline, calculated fields, and Tableau build guide are open-source.

The headline: volatility rose, fragmentation did not

The first dashboard panel maps volatility by authority. Circle size represents the volatility score. Colour represents the change in fragmentation: teal where it rose, amber where it fell.

Figure 1: Volatility by authority, 2018 to 2022. Circle size is volatility score. Colour shows whether fragmentation rose (teal) or fell (amber). Higher churn without broad-based fragmentation.

The map shows two things at once. First, volatility genuinely increased: the median is about 1.9 times its level in the prior window. Second, fragmentation did not rise in most places. Only 18 of 67 comparable authorities had a higher effective number of parties in 2022 than in 2018.

The highest-volatility authorities were Solihull (67.6), Kingston upon Thames (60.3), Sutton (48.7), South Tyneside (47.4), and Havering (45.2). Five of the top eight are London boroughs, but the highest overall is Solihull. This is not simply a capital-city story.

Data science takeaway: when two related metrics (volatility and fragmentation) move in opposite directions, the analytical story changes completely. Always check whether your headline metric and your supporting metrics agree before publishing. The gap between the two is where the actual finding lives.

Brexit consolidated the vote. 2022 did not undo it.

The second view plots the effective number of parties across three points: each council’s last pre-2018 election, 2018, and 2022.

The old version described this chart as a V-shape: consolidation into 2018, then renewed fragmentation by 2022. The corrected data does not support that. The better reading is consolidation, then partial stabilisation.

Figure 2: Effective number of parties by authority. Faint lines are councils. Bold lines are tier medians. Consolidation into 2018 and no broad fragmentation rebound in 2022.

Tier medians show the pattern: London declined from 2.87 to 2.16. Metropolitan boroughs declined from 3.22 to 2.65 (with a slight uptick from the 2018 low of 2.62). West Yorkshire declined sharply from 4.13 to 2.01.

The 2022 cycle was disruptive, but it was not a generalised splintering of the party system.

The mechanism: Conservative collapse, uneven absorption

The party-swing chart explains how volatility can rise while fragmentation falls.

Across 67 councils, the median party-family swing between 2018 and 2022 was: Labour +8.5 percentage points, Conservative -8.3, Liberal Democrats -2.3. Every other party moved less than 0.3 points in either direction.

These swings are calculated on normalised party families. Labour and Labour Co-operative are grouped together, as are UKIP, Reform UK, and Brexit Party labels. Without this normalisation, the raw data would show misleading Labour Co-operative gains alongside Labour losses in the same borough. The normalisation logic is documented in the data source metadata.

At the median, this is a Conservative-loss and Labour-gain story, not a third-party surge. But medians flatten geography. Labour absorbed the typical Conservative loss, while Liberal Democrats and Greens surged in specific councils.

Using an insurgency filter of at least a 5-point gain from a 2018 base of at least 2%: Liberal Democrats surged in 9 councils, Greens in 7, and the Yorkshire Party in 1. Independents and Reform/UKIP did not clear the threshold in this window.

Figure 3: Median party swing (top) and local insurgency counts (bottom). Labour gained most at the median, Conservatives lost most, but LD and Green surges were geographically concentrated.

Data science takeaway: threshold selection in categorical filters deserves the same rigour as hyperparameter tuning. The initial insurgency filter (5pp swing, no baseline floor) produced 12 Green “surge” councils. Diagnostic inspection revealed 5 were low-base artifacts: parties going from 0.5% to 5.5%. Adding a 2% baseline floor reduced the count to 7 and changed the geographic composition entirely. The analytical finding (Northern metros, not inner London) only emerged after the filter was corrected. Any threshold applied before a headline finding should be stress-tested by inspecting the edge cases it admits.
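A minimal sketch of that stress test follows. The function name and the two "Example" rows are invented for illustration; the Calderdale and Barnsley figures are the Green shares reported in Table 2. Running the filter with and without the floor, then inspecting the difference, shows exactly which councils the looser threshold admits.

```python
def surge_councils(rows, min_gain=5.0, min_base=0.0):
    """Councils where the party gained >= min_gain points from a base >= min_base."""
    return [
        r["council"]
        for r in rows
        if r["share_2022"] - r["share_2018"] >= min_gain
        and r["share_2018"] >= min_base
    ]

rows = [
    {"council": "Calderdale", "share_2018": 4.2, "share_2022": 18.2},  # Table 2
    {"council": "Barnsley", "share_2018": 3.7, "share_2022": 9.3},     # Table 2
    {"council": "Example A", "share_2018": 0.5, "share_2022": 5.5},    # invented low-base case
    {"council": "Example B", "share_2018": 6.0, "share_2022": 8.0},    # invented, gain too small
]

no_floor = surge_councils(rows)                   # 5pp gain, no baseline floor
with_floor = surge_councils(rows, min_base=2.0)   # 5pp gain, 2% baseline floor

# The edge cases worth reading one by one before trusting the headline count.
low_base_only = [c for c in no_floor if c not in with_floor]
print(no_floor, with_floor, low_base_only)
```

Here the only council the floor removes is the invented 0.5% to 5.5% case, which is the low-base pattern described above.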

That is the mechanism: uneven absorption. Where Labour absorbed Conservative losses cleanly, volatility rose but fragmentation often fell. Where a third party absorbed part of the loss, local competition became more complex.
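A toy council makes the mechanism visible. All shares below are invented, and the swing measure is the classic Pedersen index, not the project's composite Volatility Score. In both 2022 scenarios the Conservatives lose 15 points; the only difference is who absorbs them.

```python
def effective_number_of_parties(shares):
    """Laakso-Taagepera index on renormalised shares."""
    total = sum(shares.values())
    return 1.0 / sum((v / total) ** 2 for v in shares.values())

def pedersen(before, after):
    """Classic Pedersen index: half the sum of absolute share changes."""
    parties = set(before) | set(after)
    return 0.5 * sum(abs(after.get(p, 0) - before.get(p, 0)) for p in parties)

# Invented vote shares in percent for one council.
y2018 = {"Labour": 45, "Conservative": 40, "Liberal Democrats": 10, "Green": 5}
clean = {"Labour": 60, "Conservative": 25, "Liberal Democrats": 10, "Green": 5}   # Labour absorbs all
split = {"Labour": 47, "Conservative": 25, "Liberal Democrats": 10, "Green": 18}  # Greens absorb most

for name, y2022 in [("clean", clean), ("split", split)]:
    print(
        name,
        "swing:", pedersen(y2018, y2022),
        "ENP 2018:", round(effective_number_of_parties(y2018), 2),
        "ENP 2022:", round(effective_number_of_parties(y2022), 2),
    )
```

Both scenarios produce the same swing of 15 points, but the effective number of parties falls from 2.67 to 2.30 under clean absorption and rises to 3.07 when a third party takes most of the loss.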

The Green story is geographic, not national

The Green median swing was +0.1 percentage points. That number is accurate and misleading.

It is accurate because the typical council did not see a large Green advance. It is misleading because Green support moved geographically.

In several inner London boroughs, Greens fell sharply:

Council	2018 Green %	2022 Green %	Swing
Islington	16.4	1.6	-14.8
Hackney	16.7	4.9	-11.9
Lambeth	18.8	7.8	-11.0

Table 1: Inner London Green retreat, 2018 to 2022. Three boroughs where Greens held double-digit vote share in 2018 saw sharp declines by 2022.

At the same time, Greens surged in Northern and Midlands authorities plus Westminster:

Council	2018 Green %	2022 Green %	Swing
Calderdale	4.2	18.2	+14.0
Bolton	2.6	12.1	+9.5
Westminster	2.1	11.5	+9.4
Bury	3.3	12.4	+9.1
Gateshead	4.3	12.2	+8.0
Wolverhampton	2.6	10.3	+7.8
Barnsley	3.7	9.3	+5.6

Table 2: Green surge councils, 2018 to 2022. Seven authorities where Greens gained 5+ percentage points from a base of at least 2%. Six are Northern and Midlands metropolitan boroughs. Westminster is the sole London borough on the list.

The inner London Green surge appears to have happened before 2018. Between 2018 and 2022, some of that vote moved back toward Labour. Meanwhile, Greens gained from lower bases in post-industrial metros.

The dataset cannot prove voter motivation. But it shows that a national Green median is the wrong level of analysis. A flat aggregate median can hide large offsetting movements across subgroups. The real pattern is redistribution across places, and you need the authority-level view to find it.
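The offsetting movement can be seen using only the swing columns of Tables 1 and 2. The group names below are labels chosen for this sketch, and the all-council median of +0.1 is quoted from the text, not recomputed, because the full 67-council data is not reproduced in this article.

```python
from statistics import median

# Green swings in percentage points, copied from Tables 1 and 2.
green_swing = {
    "inner London retreat": {"Islington": -14.8, "Hackney": -11.9, "Lambeth": -11.0},
    "surge councils": {
        "Calderdale": 14.0, "Bolton": 9.5, "Westminster": 9.4, "Bury": 9.1,
        "Gateshead": 8.0, "Wolverhampton": 7.8, "Barnsley": 5.6,
    },
}

REPORTED_ALL_COUNCIL_MEDIAN = 0.1  # quoted from the article, not computed here

group_medians = {group: median(s.values()) for group, s in green_swing.items()}
for group, value in group_medians.items():
    print(f"{group}: median swing {value:+.1f} (all-council median {REPORTED_ALL_COUNCIL_MEDIAN:+.1f})")
```

Two subgroups with medians of roughly -12 and +9 points sit underneath an aggregate that barely moves, which is why the authority-level view is needed.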

Regional volatility: group-level summaries are not explanations

Median volatility by region: North East 27.8, Yorkshire 25.7, London 22.0, North West 16.0, West Midlands 15.6.

Figure 4: Volatility by region. Each point is one authority. Horizontal markers show regional medians.

The West Midlands has the most volatile council in the dataset (Solihull at 67.6) but the lowest regional median. Aggregating by region helped orient the analysis, but it also showed why group-level summaries are not explanations. Council-level factors dominate regional geography.

Turnout and volatility moved independently

I expected volatile councils to have falling turnout.

Across 67 authorities, the Pearson correlation between turnout change and volatility is -0.12 (p = 0.35). Restricting to 64 election-active authorities: r = -0.15, p = 0.25. Neither correlation is statistically significant.
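As a rough sanity check that needs only the reported r and n, the Fisher z approximation can be used. This is a swapped-in approximation working from rounded coefficients, not the test that produced the p-values above, so it should land near the reported figures without matching them exactly. The function name is an assumption for this sketch.

```python
from math import atanh, sqrt
from statistics import NormalDist

def fisher_p_value(r, n):
    """Approximate two-sided p-value for H0: rho = 0 via the Fisher z transform."""
    z = atanh(r) * sqrt(n - 3)  # approximately standard normal under H0
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

p_all = fisher_p_value(-0.12, 67)      # all comparable authorities
p_active = fisher_p_value(-0.15, 64)   # election-active authorities only

for label, p in [("all", p_all), ("election-active", p_active)]:
    verdict = "not significant" if p > 0.05 else "significant"
    print(f"{label}: p is about {p:.2f} ({verdict} at the 5% level)")
```

Under either scope the approximate p-value stays well above 0.05, consistent with reading the relationship as null.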

Figure 5: Turnout change versus volatility. The trend line is shallow and insignificant.

Data science takeaway: publishing null findings prevents bad narratives from becoming defaults. The original wireframes assumed a negative turnout-volatility correlation. When the computation returned r = -0.12 (p = 0.35), the headline was rewritten rather than the data re-scoped. Both scopes are reported transparently. Null findings are undervalued in data analysis. Letting data override assumptions is simple to describe and genuinely hard to practise.

What the corrected story says

English councils experienced much higher voter churn between 2018 and 2022. Median volatility rose from 12.0 to 22.5. But the effective number of parties did not rise in most councils. Fragmentation increased in only 18 of 67 comparable authorities, and the median change remained slightly negative.

Local electoral churn can be intense without producing a more fragmented party system. Voters moved, but in many places they moved from one dominant pole to another. Where smaller parties advanced, they did so locally and unevenly, not as a uniform national wave.

The real lesson is upstream: categories are part of the model. Get them wrong, and every chart tells a convincing but incorrect story.

Data sources and licensing

The underlying election results come from the DCLEAPIL v1.0 dataset (Leman, Jason, 2025), released under CC BY-SA 4.0. Supplementary data from the House of Commons Library is used under Open Parliament Licence v3.0. Derived datasets and pipeline code are released under MIT licence. Data provenance is documented in DATA_SOURCE_METADATA.md.

Methodology notes

68 in-scope authorities. 67 with comparable fragmentation values for the 2018-to-2022 window (Rotherham excluded from FI comparisons). Fragmentation uses the Laakso-Taagepera index. Volatility is a composite of Pedersen-style swing and fragmentation change. Party swings use normalised analytical families. The insurgency filter excludes Labour and Conservatives and requires a 2% 2018 baseline floor. Causal language is interpretive; the data captures outcomes, not motivations.

What comes next

A companion analysis will explore 2026 scenarios: baseline continuity, Reform local-surge assumptions, and major-party reconsolidation. Those are scenarios under algebraic assumptions, not forecasts.

The central question: if Conservative losses continue, does Labour absorb them again, or do the geographically concentrated Liberal Democrat and Green surges spread to new councils? And does that absorption pattern finally push fragmentation upward, or does the party system continue to consolidate even as individual councils churn?

That difference between churn and fragmentation is what the project is designed to measure.

The interactive dashboard is published on Tableau Public and the full data pipeline is available at github.com/Wisabi-Analytics/civic-lens.

Obinna Iheanachor is a Senior AI/Data Engineer and founder of Wisabi Analytics, a UK-based data engineering and AI consultancy. He creates content around production AI systems, data pipelines, and applied analytics at @DataSenseiObi on X and Wisabi Analytics on YouTube. Civic Lens is an open-source political data project at github.com/Wisabi-Analytics/civic-lens.
