Scores

Generally, the point of running the chain is comparing the properties of some existing plan to the properties of the ensemble of plans generated by the chain. Thus, we need a way to record information about the plans generated during the progression of the chain. This is the point of Scores: each score has a scoring function that is evaluated on the current districting plan at each step of the chain. The resulting values are saved in a dictionary, which is itself added to a growing ChainScoreData object at each step of the chain. The ChainScoreData returned by recom_chain() and flip_chain() is like a "history" of the scores at each step of the chain. (See get_scores_at_step() for information about how to retrieve the values of any subset of scores at a particular step in the chain.)

Score types

All Scores have the type AbstractScore. There are four categories of scores (note that DistrictAggregate is really a sub-category of DistrictScore):

Score Type	Fields	Description
`DistrictAggregate`	`name` (String), `key` (String)	A DistrictAggregate score is a simple sum of a particular property over all nodes in a given district.
`DistrictScore`	`name` (String), `score_fn` (Function)	A `DistrictScore` takes a user-supplied function that returns some quantity of interest given the nodes in a given district. The signature of `score_fn` should be as follows: `score_fn(graph::BaseGraph, district_nodes::BitSet, district::int)`
`PlanScore`	`name` (String), `score_fn` (Function)	A PlanScore takes a user-supplied function that returns some quantity of interest given a BaseGraph and corresponding Partition object. The signature of `score_fn` should be as follows: `score_fn(graph::BaseGraph, partition::Partition)`
`CompositeScore`	`name` (String), `scores` (Array{S, 1} where S<:AbstractScore)	A CompositeScore is just a group of scores that are run in sequence. CompositeScores are especially useful when the score functions depend upon/modify some shared state. The `Election` object is implemented as a `CompositeScore`.

So, when should you use which score? Here's a general breakdown:

`DistrictAggregate`

Use this score when you just want to sum an attribute over all nodes in a district.

GerryChain.DistrictAggregate — Type

DistrictAggregate(name::String,
                  key::String)

A DistrictAggregate score is a simple sum of a particular property over all nodes in a given district.

source

For example, if you wanted to count the number of Black people in each district at every step of the chain, you could use a DistrictAggregate score to do so. If the attribute in the nodes of the graph was called "BLACK" and you wanted to call your score "Black_Pop", you would initialize the score as

julia> DistrictAggregate("Black_Pop", "BLACK")
DistrictAggregate("Black_Pop", "BLACK")

If you wanted to name your score the same as the node attribute, you could also use

GerryChain.DistrictAggregate — Method

DistrictAggregate(key::String)

Initializes a DistrictAggregate score where the name and key are the same.

source

In this case, you would instantiate your score as

julia> DistrictAggregate("BLACK")
DistrictAggregate("BLACK", "BLACK")

`DistrictScore`

This score works best when you're interested in tracking a statistic for each district for all plans in the chain. For example, you might want to know the Polsby-Popper score for each district at every step of the chain.

GerryChain.DistrictScore — Type

DistrictScore(name::Union{String, Missing}
              score_fn::Function)

A DistrictScore takes a user-supplied function that returns some quantity of interest given the nodes in a given district. The signature of score_fn should be as follows: score_fn(graph::BaseGraph, district_nodes::BitSet, district::int)

source

The example below illustrates a simple example of setting up a DistrictScore. Perhaps somehow you find yourself interested in computing the number of edges within a district.

julia> function num_edges_within_dist(graph::BaseGraph,
                                      district_nodes::BitSet,
                                      district::Int)
           """ Computes the number of edges within each district of the plan.
               Arguments:
                   graph:          The underlying BaseGraph
                   district_nodes: A Set of nodes in district `district`. Each node
                                   is represented as an Int.
                   district:       Int representing the district
           """
           all_edges = []
           for node in district_nodes, neighbor in graph.neighbors[node]
               if neighbor in district_nodes
                   edge = graph.adj_matrix[node, neighbor]
                   push!(all_edges, edge)
               end
           end
           return length(unique(all_edges))
       end
num_edges_within_dist (generic function with 1 method)

julia> # instantiate this score
       DistrictScore("edges_within_dist", num_edges_within_dist)
DistrictScore("edges_within_dist", Main.ex-env.num_edges_within_dist)

`PlanScore`

This type of score is suited to statistics that are evaluated on an entire plan. For example, the number of districts won by a party for a given plan would be a PlanScore.

GerryChain.PlanScore — Type

PlanScore(name::Union{String, Missing},
          score_fn::Function)

A PlanScore takes a user-supplied function that returns some quantity of interest given a BaseGraph and corresponding Partition object.

The signature of score_fn should be as follows: score_fn(graph::BaseGraph, partition::Partition)

source

Below we illustrate a few examples on how to construct a PlanScore:

julia> function num_max_blocks(graph::BaseGraph,
                               partition::Partition)
           """ Returns the maximum number of blocks in each district.
               This function assumes that the nodes in `graph` are blocks.
           """
           return maximum([length(nodes) for nodes in partition.dist_nodes])
       end
num_max_blocks (generic function with 1 method)

julia> function higher_num_cut_edges_than_parent(graph::BaseGraph,
                                                 partition::Partition)
           """ Returns True if `partition` has a higher number of cut edges than the
               previous plan, False otherwise.
           """
           if partition.parent isa Partition & partition.num_cut_edges > partition.parent.num_cut_edges
               return True
           end
           return False
       end
higher_num_cut_edges_than_parent (generic function with 1 method)

julia> # how to formulate these scores
       PlanScore("num_max_blocks", num_max_blocks)
PlanScore("num_max_blocks", Main.ex-env.num_max_blocks)

julia> PlanScore("higher_num_cut_edges_than_parent", higher_num_cut_edges_than_parent)
PlanScore("higher_num_cut_edges_than_parent", Main.ex-env.higher_num_cut_edges_than_parent)

`CompositeScore`

This might be the most difficult type of score to wrap one's mind around. CompositeScores are best when you have a series of scores with some shared state or rely on the same computation. They allow you to "group" scores together. For example, we use CompositeScores for Elections, since almost all election-related scores rely on vote counts and vote shares.

GerryChain.CompositeScore — Type

CompositeScore(name::String,
               scores::Array{S,1}) where {S<:AbstractScore} # should be other AbstractScores

A CompositeScore is just a group of scores that are run in sequence. CompositeScores are especially useful when the score functions depend upon/modify some shared state.

source

Built-In Score Functions

Currently only one score function comes built-in with GerryChain, and it is the number of cut-edges in the plan.

GerryChain.num_cut_edges — Function

num_cut_edges(name::String)::PlanScore

Returns a PlanScore that tracks the number of cut edges in a particular plan.

source

A little more detail on score functions

Why do we differentiate between the different types of scores? The answer boils down to efficiency. Recall that each plan in the chain (whether it is uses ReCom proposals or Flip proposals) only differs from the previous plan by 2 districts. This means we can save space / eliminate redundant computation by only re-calculating district-level scores on the districts that were changed. However, for plan-level scores, we always have to re-run the score function on the entire partition.

Other Score Functions

GerryChain.coerce_aggregated_attributes! — Method

coerce_aggregated_attributes!(graph::BaseGraph,
                              scores::Array{S, 1}) where {S<:AbstractScore}

Coerces DistrictAggregate attributes in scores to be Floats if they are Strings.

source

GerryChain.eval_score_on_district — Method

eval_score_on_district(graph::BaseGraph,
                       partition::Partition,
                       score::DistrictAggregate,
                       district::Int)::Number

Evaluates a DistrictAggregate score on the nodes in a particular district.

source

GerryChain.eval_score_on_district — Method

eval_score_on_district(graph::BaseGraph,
                       partition::Partition,
                       score::DistrictScore,
                       district::Int)

Evaluates a user-supplied DistrictScore function on the nodes in a particular district.

source

GerryChain.eval_score_on_partition — Method

eval_score_on_partition(graph::BaseGraph,
                        partition::Partition,
                        composite::CompositeScore)

Evaluates the user-supplied functions in the CompositeScore scores array on each of the districts in the partition.

Returns an Dict of the form:

{
    # District-level scores
    d\_score₁.name :     [a₁, a₂, ..., aᵢ]
        ...
    d\_scoreᵤ.name :     [b₁, b₂, ..., bᵢ]
    # Partition-level scores
    p\_score₁.name :     c,
        ...
    p\_scoreᵥ.name :     d,
}

source

GerryChain.eval_score_on_partition — Method

eval_score_on_partition(graph::BaseGraph,
                        partition::Partition,
                        score::PlanScore)

Evaluates a user-supplied PlanScore function on the entire partition.

source

GerryChain.eval_score_on_partition — Method

Evaluates a user-supplied DistrictScore function or DistrictAggregate score on all districts in an entire plan.

Returns an array of the form [a₁, a₂, ..., aᵢ], where i is the number of districts in the plan.

source

GerryChain.score_initial_partition — Method

score_initial_partition(graph::BaseGraph,
                        partition::Partition,
                        scores::Array{S, 1}) where {S<:AbstractScore}

Returns a dictionary of scores for the initial partition. The dictionary has the following form (where n is the number of districts, u is the number of district-level scores, v is the number of partition-level scores, and i is the number of composite scores):

{
    # District-level scores
    d\_score₁.name :     [a₁, a₂, ..., aᵤ]
        ...
    d\_scoreᵤ.name :     [b₁, b₂, ..., bᵤ]
    # Partition-level scores
    p\_score₁.name :     c,
        ...
    p\_scoreᵥ.name :     d,
    # Composite scores
    c\_score₁.name :
        {
            c\_score₁.scores[1].name :     ...
                ...
        }
        ...
    c\_scoreᵢ.name :
        {
            ...
        }
}

source

GerryChain.score_partition_from_proposal — Method

score_partition_from_proposal(graph::BaseGraph,
                              partition::Partition,
                              proposal::AbstractProposal,
                              scores::Array{S, 1}) where {S<:AbstractScore}

Returns a Dictionary of

(a) updated district-level scores for districts that were altered

after proposal was accepted,

(b) partition-level scores, and

For example, suppose district 4's new White population is 43 and the new Sen2010_Dem population is 62, district 8's new White population is 22 and new Sen2010_Dem population is 66. The Δ scores would look like:

    {
        "num_cut_edges" : partition.num\_cut\_edges,
        "dists"         : (5, 8),
        "White"         : [43, 22],
        "Sen2010_Dem"   : [62, 66],
    }

source