
Data Representation

In preferential or ranked voting, each voter expresses an ordered list of preferences over a set of candidates. When representing this data in tabular form, there are multiple valid approaches, each suited to different kinds of analysis.

Let's assume there are 4 candidates: A, B, C, and D.

Ranking Position as Columns

One common approach is the "ranking position as column" format. Each row represents a voter, and each column corresponds to a ranking position (e.g., first, second, third). The cell values name the candidate ranked at that position, so each row reads directly as the voter's preference order.

count  rank_1  rank_2  rank_3  rank_4
       C       A       B       D

Appropriate use cases:

  • Implementing ranked voting rules such as Instant Runoff, Borda Count, or Top-k approval.
  • Identifying the top-ranked candidate per voter.
  • Grouping or counting identical ballots.
  • Working with data that mirrors the structure of actual ballots.

Limitations:

  • Finding the rank assigned to a specific candidate requires scanning each row or reshaping the data, which makes candidate-centric analyses less efficient.
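The use cases above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the ballots and the helper names (`first_choice_counts`, `group_identical`, `borda_scores`) are hypothetical, each ballot is assumed to be a complete ranking, and the Borda scoring shown is one common variant in which the top position earns n - 1 points.

```python
from collections import Counter

# Hypothetical ballots in ranking-position format:
# index 0 holds the first choice, index 1 the second, and so on.
ballots = [
    ("C", "A", "B", "D"),
    ("A", "C", "B", "D"),
    ("C", "A", "B", "D"),
]

def first_choice_counts(ballots):
    """Count how often each candidate appears in the first position."""
    return Counter(ballot[0] for ballot in ballots)

def group_identical(ballots):
    """Collapse identical ballots into a mapping of ballot -> count."""
    return Counter(ballots)

def borda_scores(ballots):
    """Borda variant: with n candidates, 0-based position p earns n - 1 - p points."""
    scores = Counter()
    for ballot in ballots:
        n = len(ballot)
        for position, candidate in enumerate(ballot):
            scores[candidate] += n - 1 - position
    return scores
```

Because each row is already an ordered ballot, all three helpers read positions directly and need no reshaping.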

Candidate as Columns

An alternative approach is the "candidate as column" format. Each row still represents a voter, but each column corresponds to a candidate, and the cell values give the rank that voter assigned to that candidate. This format makes it easy to compare how each candidate was ranked across voters.

count  A  B  C  D
       1  2  0  3

Appropriate use cases:

  • Analyzing rankings from the perspective of each candidate (e.g., average or median rank).
  • Performing pairwise comparisons used in Condorcet methods.
  • Computing how frequently a specific candidate appears in a given rank.
  • Applying statistical or machine learning models where candidates are treated as variables.

Limitations:

  • It may require reshaping the data to apply position-based voting rules.
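A minimal sketch of candidate-centric analysis on this format, in plain Python. The rows and the helper names (`mean_rank`, `pairwise_wins`) are hypothetical. It assumes ranks are zero-based (0 is the top rank, as the example row above suggests) and that every ballot is a complete ranking without ties.

```python
from itertools import combinations
from statistics import mean

candidates = ["A", "B", "C", "D"]

# Hypothetical ballots in candidate-as-column format:
# each dict maps candidate -> rank, where 0 is the top rank (assumed).
rank_rows = [
    {"A": 1, "B": 2, "C": 0, "D": 3},
    {"A": 0, "B": 2, "C": 1, "D": 3},
    {"A": 1, "B": 2, "C": 0, "D": 3},
]

def mean_rank(rows, candidate):
    """Average rank a candidate received across all voters (lower is better)."""
    return mean(row[candidate] for row in rows)

def pairwise_wins(rows, candidates):
    """Return wins[(x, y)] = number of voters who ranked x above y.

    Assumes strict, complete rankings, so every voter prefers
    exactly one candidate of each pair.
    """
    wins = {}
    for x, y in combinations(candidates, 2):
        x_over_y = sum(1 for row in rows if row[x] < row[y])
        wins[(x, y)] = x_over_y
        wins[(y, x)] = len(rows) - x_over_y
    return wins
```

Here each pairwise comparison is a single column-to-column test, which is why this layout suits Condorcet-style analysis.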

Comparison

Choosing the right tabular format for ranked voting data depends on the task.

Task                                         Recommended Format
Implementing ranked-choice voting rules      Ranking position as columns
Counting first-choice votes                  Ranking position as columns
Computing statistics per candidate           Candidate as columns
Analyzing pairwise comparisons (Condorcet)   Candidate as columns
Grouping or aggregating identical ballots    Ranking position as columns
Applying statistical models to ranking data  Candidate as columns
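Since each format's main limitation is the need to reshape, it may help to see how small that step can be. The sketch below converts a single ballot between the two formats in plain Python. The function names are hypothetical, ranks are assumed to be zero-based, and ballots are assumed to be complete rankings without ties.

```python
def positions_to_ranks(ballot):
    """Ranking-position format -> candidate format.

    Maps each candidate to its 0-based position in the ballot.
    """
    return {candidate: position for position, candidate in enumerate(ballot)}

def ranks_to_positions(ranks):
    """Candidate format -> ranking-position format.

    Orders candidates by their assigned rank (lowest rank first).
    """
    return tuple(sorted(ranks, key=ranks.get))

ballot = ("C", "A", "B", "D")
ranks = positions_to_ranks(ballot)   # {'C': 0, 'A': 1, 'B': 2, 'D': 3}
restored = ranks_to_positions(ranks) # ('C', 'A', 'B', 'D')
```

With both directions available, data can be stored in whichever format matches its source and converted per task.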