
Getting started with votekit

This guide will help you get started with votekit, using real election data from the 2013 Minneapolis mayoral election. This election had 35 candidates running for one seat and used single-winner instant-runoff voting (IRV) to elect the winner. Voters were allowed to rank their top three candidates.

# these are the votekit functions we'll need access to
from votekit.cvr_loaders import load_csv
from votekit.elections import STV, fractional_transfer
from votekit.cleaning import remove_noncands

You can find the necessary csv file mn_2013_cast_vote_record.csv in the votekit/data folder of the GitHub repo. Alternatively, you can download the official cast vote record (CVR) here. Download a version of the file, and then edit the path below to point to where you placed it. The csv file has 3 columns we care about. The first, entitled '1ST CHOICE MAYOR MINNEAPOLIS' in the official CVR, tells us a voter's top choice, the second tells us their second choice, and the third their third choice.

The first thing we will do is create a PreferenceProfile object from our csv. A preference profile is a term from the social choice literature that represents the rankings of some set of candidates from some voters. Put another way, a preference profile stores the votes from an election, and is a collection of Ballot objects and candidates.

We give the load_csv function the path to the csv file. By default, each column of the csv should correspond to a ranking of a candidate, given in decreasing order (the first column is the voter's top choice, the last column their bottom choice). There are some other optional parameters which you can read about in the documentation, such as how to read a csv file that has more columns than just rankings.

# you'll need to edit this path!
minneapolis_profile = load_csv("../src/votekit/data/mn_2013_cast_vote_record.csv")
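To make the idea of a preference profile concrete, here is a minimal sketch in plain Python of how rows of rankings can be grouped into distinct ballots with weights. It is not how load_csv is implemented; the tiny three-row csv, the column names, and the variable names are all made up for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical three-column excerpt standing in for a cast vote record.
sample_csv = io.StringIO(
    "1ST CHOICE,2ND CHOICE,3RD CHOICE\n"
    "A,B,C\n"
    "A,B,C\n"
    "B,undervote,undervote\n"
)

reader = csv.reader(sample_csv)
next(reader)  # skip the header row

# Each row becomes a ranking (a tuple, top choice first);
# identical rankings are counted together to give a weight.
ballot_weights = Counter(tuple(row) for row in reader)

for ranking, weight in ballot_weights.most_common():
    print(ranking, weight)
```

Storing each distinct ranking once with a weight is what lets a profile summarize tens of thousands of votes in a few thousand rows.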

The PreferenceProfile object has lots of helpful methods that allow us to study our votes. Let's use some of them to explore the ballots that were submitted. This is crucial since our data was not preprocessed: there could be undervotes, overvotes, or defective or spoiled ballots.

The get_candidates method returns a unique list of candidates. The head method shows the top n ballots. In the first column, we see the ballot that was cast. In the second column, we see how many of that type of ballot were cast.

# returns a list of unique candidates
print(minneapolis_profile.get_candidates())

# returns the top n ballots
minneapolis_profile.head(n=5)
['JOHN LESLIE HARTWIG', 'ALICIA K. BENNETT', 'ABDUL M RAHAMAN "THE ROCK"', 'CAPTAIN JACK SPARROW', 'STEPHANIE WOODRUFF', 'JAMES EVERETT', 'JAMES "JIMMY" L. STROUD, JR.', 'DOUG MANN', 'CHRISTOPHER CLARK', 'TROY BENJEGERDES', 'JACKIE CHERRYHOMES', 'DON SAMUELS', 'KURTIS W. HANNA', 'overvote', 'MARK ANDREW', 'OLE SAVIOR', 'TONY LANE', 'JAYMIE KELLY', 'MIKE GOULD', 'CHRISTOPHER ROBIN ZIMMERMAN', 'GREGG A. IVERSON', 'DAN COHEN', 'CYD GORMAN', 'UWI', 'BILL KAHN', 'RAHN V. WORKCUFF', 'MERRILL ANDERSON', 'CAM WINTON', 'EDMUND BERNARD BRUYERE', 'BETSY HODGES', 'undervote', 'BOB FINE', 'JOHN CHARLES WILSON', 'JEFFREY ALAN WAGNER', 'JOSHUA REA', 'MARK V ANDERSON', 'NEAL BAXTER', 'BOB "AGAIN" CARNEY JR']
Ballots Weight
0 (MARK ANDREW, undervote, undervote) 3864
1 (BETSY HODGES, MARK ANDREW, DON SAMUELS) 3309
2 (BETSY HODGES, DON SAMUELS, MARK ANDREW) 3031
3 (MARK ANDREW, BETSY HODGES, DON SAMUELS) 2502
4 (BETSY HODGES, undervote, undervote) 2212

Whoa, that's a little funky! There are candidates called 'undervote', 'overvote', and 'UWI'. In this dataset, 'undervote' says that someone left a ranking blank. The 'overvote' candidate arises when someone lists two candidates in one ranking, and in our dataset, we lose any knowledge of their actual preference. 'UWI' stands for unregistered write-in.

It's really important to think carefully about how you want to handle cleaning up the ballots, as this depends entirely on the context of a given election. For now, let's assume that we want to get rid of the 'undervote', 'overvote', and 'UWI' candidates. The function remove_noncands will do this for us. If a ballot was "A B undervote", it would now be "A B". If a ballot was "A UWI B" it would now be "A B" as well. This might not be how you want to handle such things, but for now let's go with it.

minneapolis_profile = remove_noncands(minneapolis_profile, ["undervote", "overvote", "UWI"])
print(minneapolis_profile.get_candidates())
['NEAL BAXTER', 'JAYMIE KELLY', 'MIKE GOULD', 'CHRISTOPHER ROBIN ZIMMERMAN', 'GREGG A. IVERSON', 'DAN COHEN', 'JOHN LESLIE HARTWIG', 'ALICIA K. BENNETT', 'CYD GORMAN', 'BILL KAHN', 'RAHN V. WORKCUFF', 'MERRILL ANDERSON', 'CAPTAIN JACK SPARROW', 'CAM WINTON', 'STEPHANIE WOODRUFF', 'EDMUND BERNARD BRUYERE', 'JAMES EVERETT', 'BETSY HODGES', 'JAMES "JIMMY" L. STROUD, JR.', 'DOUG MANN', 'CHRISTOPHER CLARK', 'TROY BENJEGERDES', 'JACKIE CHERRYHOMES', 'BOB FINE', 'JOHN CHARLES WILSON', 'DON SAMUELS', 'JEFFREY ALAN WAGNER', 'KURTIS W. HANNA', 'JOSHUA REA', 'MARK ANDREW', 'OLE SAVIOR', 'MARK V ANDERSON', 'ABDUL M RAHAMAN "THE ROCK"', 'TONY LANE', 'BOB "AGAIN" CARNEY JR']
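The two cleaning examples described above can be reproduced with a short sketch in plain Python. This only illustrates the rule "drop the listed names and keep the remaining order"; it is not the implementation of remove_noncands, and the helper name strip_noncandidates is hypothetical.

```python
from collections import Counter


def strip_noncandidates(ranking, noncandidates):
    """Return the ranking with every listed non-candidate dropped, order preserved."""
    return tuple(name for name in ranking if name not in noncandidates)


noncands = {"undervote", "overvote", "UWI"}

# The two examples from the text both collapse to the same ranking.
first = strip_noncandidates(("A", "B", "undervote"), noncands)
second = strip_noncandidates(("A", "UWI", "B"), noncands)
print(first, second)

# Made-up weights: once two raw ballots clean to the same ranking,
# their weights can be added together.
raw_ballots = {("A", "B", "undervote"): 3, ("A", "UWI", "B"): 2}
cleaned = Counter()
for ranking, weight in raw_ballots.items():
    cleaned[strip_noncandidates(ranking, noncands)] += weight
print(cleaned)
```

Note that "A UWI B" becoming "A B" promotes B from third to second place, which is exactly the kind of choice you should make deliberately for your own data.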

Alright, things are looking a bit cleaner. Let's examine some of the ballots.

# returns the top n ballots
minneapolis_profile.head(n=5, percents=True)
Ballots Weight Percent
0 (MARK ANDREW,) 3864 4.87%
1 (BETSY HODGES, MARK ANDREW, DON SAMUELS) 3309 4.17%
2 (BETSY HODGES, DON SAMUELS, MARK ANDREW) 3031 3.82%
3 (MARK ANDREW, BETSY HODGES, DON SAMUELS) 2502 3.15%
4 (BETSY HODGES,) 2212 2.79%
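As a quick sanity check, the Percent column can be recomputed by hand as each weight divided by the total weight of the profile. The sketch below uses the five weights shown above and the total of 79378 that the profile reports; the variable names are just for illustration.

```python
# Weights of the five most common ballots, copied from the table above.
top_weights = [3864, 3309, 3031, 2502, 2212]

# Total weight of the cleaned profile, as reported by the profile itself.
total_weight = 79378

# Format each share as a percentage with two decimal places.
percents = [f"{weight / total_weight:.2%}" for weight in top_weights]
print(percents)
```

The five most common ballots together account for under a fifth of the vote, which hints at how varied the rankings are.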

We can similarly print the bottom n ballots. Here we set the optional percents argument to False, which hides the fraction of the total vote, and the optional totals argument to True, which sums the weights of the displayed ballots and compares them to the overall total.

# returns the bottom n ballots
minneapolis_profile.tail(n=5, percents=False, totals=True)
Ballots Weight
6916 (STEPHANIE WOODRUFF,) 1
6915 (DON SAMUELS, ABDUL M RAHAMAN "THE ROCK", MARK... 1
6914 (DON SAMUELS, ABDUL M RAHAMAN "THE ROCK", MIKE... 1
6913 (DON SAMUELS, ABDUL M RAHAMAN "THE ROCK", OLE ... 1
6912 (DON SAMUELS, ABDUL M RAHAMAN "THE ROCK", RAHN... 1
Totals 5 out of 79378

There are a few other methods you can read about in the documentation, but now let's run an election!

Just because we have a collection of ballots does not mean we have a winner. To convert a PreferenceProfile into a winner (or winners), we need to choose a method of election. The mayoral race was conducted as a single-winner IRV election, which in votekit is equivalent to an STV election with one seat. The transfer method tells us what to do if someone has a surplus of votes over the winning quota (which by default is the Droop quota).
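For reference, the Droop quota is usually defined as floor(votes / (seats + 1)) + 1. The sketch below applies that textbook formula to the 79378 ballots in our cleaned profile; the function name droop_quota is our own, and this is an illustration of the definition rather than votekit's internal code.

```python
def droop_quota(num_votes: int, seats: int) -> int:
    """Textbook Droop quota: the smallest whole number of votes that
    at most `seats` candidates can reach at the same time."""
    return num_votes // (seats + 1) + 1


# One seat and 79378 ballots: the quota is a bare majority.
quota = droop_quota(79378, 1)
print(quota)
```

With a single seat the quota reduces to a simple majority, which is why one-seat STV behaves like IRV.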

minn_election = STV(profile=minneapolis_profile, transfer=fractional_transfer, seats=1)
# the run_election method prints a dataframe showing the order in which candidates are eliminated under STV
minn_election.run_election()
Current Round: 35

                   Candidate     Status  Round
                BETSY HODGES    Elected     35
                 MARK ANDREW Eliminated     34
                 DON SAMUELS Eliminated     33
                  CAM WINTON Eliminated     32
          JACKIE CHERRYHOMES Eliminated     31
                    BOB FINE Eliminated     30
                   DAN COHEN Eliminated     29
          STEPHANIE WOODRUFF Eliminated     28
             MARK V ANDERSON Eliminated     27
                   DOUG MANN Eliminated     26
                  OLE SAVIOR Eliminated     25
               JAMES EVERETT Eliminated     24
           ALICIA K. BENNETT Eliminated     23
  ABDUL M RAHAMAN "THE ROCK" Eliminated     22
        CAPTAIN JACK SPARROW Eliminated     21
           CHRISTOPHER CLARK Eliminated     20
                   TONY LANE Eliminated     19
                JAYMIE KELLY Eliminated     18
                  MIKE GOULD Eliminated     17
             KURTIS W. HANNA Eliminated     16
 CHRISTOPHER ROBIN ZIMMERMAN Eliminated     15
         JEFFREY ALAN WAGNER Eliminated     14
                 NEAL BAXTER Eliminated     13
            TROY BENJEGERDES Eliminated     12
            GREGG A. IVERSON Eliminated     11
            MERRILL ANDERSON Eliminated     10
                  JOSHUA REA Eliminated      9
                   BILL KAHN Eliminated      8
         JOHN LESLIE HARTWIG Eliminated      7
      EDMUND BERNARD BRUYERE Eliminated      6
JAMES "JIMMY" L. STROUD, JR. Eliminated      5
            RAHN V. WORKCUFF Eliminated      4
       BOB "AGAIN" CARNEY JR Eliminated      3
                  CYD GORMAN Eliminated      2
         JOHN CHARLES WILSON Eliminated      1
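If you want to see the mechanics behind a table like this, here is a toy single-winner IRV loop in plain Python. It is a simplified sketch, not votekit's STV class: it declares a winner on a majority of the ballots still in play, breaks ties alphabetically (an arbitrary choice made for this example), and runs on three made-up ballots.

```python
from collections import Counter


def run_irv(ballots):
    """ballots maps ranking tuples to weights.
    Returns (winner, list of eliminated candidates in order)."""
    remaining = {name for ranking in ballots for name in ranking}
    eliminated = []
    while True:
        # Count each ballot for its highest-ranked candidate still in the race.
        tallies = Counter({name: 0 for name in remaining})
        for ranking, weight in ballots.items():
            top = next((name for name in ranking if name in remaining), None)
            if top is not None:
                tallies[top] += weight
        active_votes = sum(tallies.values())
        leader, leader_votes = max(tallies.items(), key=lambda kv: (kv[1], kv[0]))
        # Stop when the leader holds a majority or is the last one standing.
        if leader_votes * 2 > active_votes or len(remaining) == 1:
            return leader, eliminated
        # Otherwise eliminate the candidate with the fewest votes.
        loser = min(tallies.items(), key=lambda kv: (kv[1], kv[0]))[0]
        remaining.remove(loser)
        eliminated.append(loser)


toy_ballots = {("A", "B"): 4, ("B", "A"): 3, ("C", "B"): 2}
winner, order = run_irv(toy_ballots)
print(winner, order)
```

In the toy election A leads the first round, but once C is eliminated C's ballots move to B, who then holds a majority. The same kind of transfer happens round by round in the table above.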

And there you go! You've created a PreferenceProfile from real election data, done some cleaning, and then conducted an STV election. You can look at the official results and confirm that votekit elected the same candidate as in the real 2013 election.