Full Package Reference
This page contains the full reference for the gerrychain package, and
is included primarily for the sake of developers trying to hunt down errors
or understand the codebase. For a more user-friendly introduction to the
usage of the package, please see the API Reference.
Accept
This module provides the main acceptance function used in ReCom Markov chains.
Dependencies:
random: For random number generation for probabilistic acceptance.
Last Updated: 11 Jan 2024
- gerrychain.accept.cut_edge_accept(partition: Partition) bool[source]
Always accepts the flip if the number of cut_edges increases. Otherwise, uses the Metropolis criterion to decide.
frm: TODO: Documentation: Add documentation on what the “Metropolis criterion” is…
- Parameters:
partition (Partition) – The current partition to accept a flip from.
- Returns:
True if accepted, False to remain in place
- Return type:
bool
Chain
This module provides the MarkovChain class, which is designed to facilitate the creation and iteration of Markov chains in the context of political redistricting and gerrymandering analysis. It allows for the exploration of different districting plans based on specified constraints and acceptance criteria.
Key Components:
MarkovChain: The main class used for creating and iterating over Markov chain states.
Validator: A helper class for validating proposed states in the Markov chain. See
Validatorfor more details.
Usage: The primary use of this module is to create an instance of MarkovChain with appropriate parameters like proposal function, constraints, acceptance function, and initial state, and then to iterate through the states of the Markov chain, yielding a new proposal at each step.
Dependencies:
typing: Used for type hints.
Last Updated: 11 Jan 2024
- class gerrychain.chain.MarkovChain(proposal: Callable, constraints: Iterable[Callable] | Validator | Iterable[Bounds] | Callable, accept: Callable, initial_state: Partition, total_steps: int)[source]
Bases:
objectMarkovChain is a class that creates an iterator for iterating over the states of a Markov chain run in a gerrymandering analysis context.
It allows for the generation of a sequence of partitions (states) of a political districting plan, where each partition represents a possible state in the Markov chain.
Example usage:
chain = MarkovChain(proposal, constraints, accept, initial_state, total_steps) for state in chain: # Do whatever you want - print output, compute scores, ...
- Parameters:
proposal (Callable) – Function proposing the next state from the current state.
constraints (Union[Iterable[Callable], Validator, Iterable[Bounds], Callable]) – A function with signature
Partition -> booldetermining whether the proposed next state is valid (passes all binary constraints). Usually this is aValidatorclass instance.accept (Callable) – Function accepting or rejecting the proposed state. In the most basic use case, this always returns
True. But if the user wanted to use a Metropolis-Hastings acceptance rule, this is where you would implement it.initial_state (Partition) – Initial
gerrychain.partition.Partitionclass.total_steps (int) – Number of steps to run.
- Returns:
None
- Raises:
ValueError – If the initial_state is not valid according to the constraints.
- property constraints: Validator
Read_only alias for the is_valid property. Returns the constraints of the Markov chain.
- Returns:
The constraints of the Markov chain.
- Return type:
String
- with_progress_bar()[source]
Wraps the Markov chain in a tqdm progress bar.
Useful for long-running Markov chains where you want to keep track of the progress. Requires the tqdm package to be installed.
- Returns:
A tqdm-wrapped Markov chain.
Constraints
- class gerrychain.constraints.bounds.Bounds(func: Callable, bounds: Tuple[float, float])[source]
Bases:
objectWrapper for numeric-validators to enforce upper and lower limits.
This class is meant to be called as a function after instantiation; its return is
Trueif the numeric validator is within set limits, andFalseotherwise.- Parameters:
func (Callable) – Numeric validator function. Should return an iterable of values.
bounds (Tuple[float, float]) – Tuple of (lower, upper) numeric bounds.
- class gerrychain.constraints.bounds.LowerBound(func: Callable, bound: float)[source]
Bases:
objectWrapper for numeric-validators to enforce lower limits.
This class is meant to be called as a function after instantiation; its return is
Trueif the numeric validator is within a set lower limit, andFalseotherwise.- Parameters:
func (Callable) – Numeric validator function. Should return a comparable value.
bounds (float) – Comparable lower bound.
- class gerrychain.constraints.bounds.SelfConfiguringLowerBound(func: Callable, epsilon: float = 0.05)[source]
Bases:
objectWrapper for numeric-validators to enforce automatic lower limits.
When instantiated, the initial lower bound is set as the initial value of the numeric-validator minus some configurable ε.
This class is meant to be called as a function after instantiation; its return is
Trueif the numeric validator is within a set lower limit, andFalseotherwise.- Parameters:
func (Callable) – Numeric validator function.
epsilon (float, optional) – Initial population deviation allowable by the validator as a percentage of the ideal population. Defaults to 0.05.
- class gerrychain.constraints.bounds.SelfConfiguringUpperBound(func: Callable)[source]
Bases:
objectWrapper for numeric-validators to enforce automatic upper limits.
When instantiated, the initial upper bound is set as the initial value of the numeric-validator.
This class is meant to be called as a function after instantiation; its return is
Trueif the numeric validator is within a set upper limit, andFalseotherwise.- Parameters:
func (Callable) – Numeric validator function.
- class gerrychain.constraints.bounds.UpperBound(func: Callable, bound: float)[source]
Bases:
objectWrapper for numeric-validators to enforce upper limits.
This class is meant to be called as a function after instantiation; its return is
Trueif the numeric validator is within a set upper limit, andFalseotherwise.- Parameters:
func (Callable) – Numeric validator function. Should return a comparable value.
bounds (float) – Comparable upper bound.
- class gerrychain.constraints.bounds.WithinPercentRangeOfBounds(func: Callable, percent: float)[source]
Bases:
objectWrapper for numeric-validators to enforce upper and lower limits determined by a percentage of the initial value.
When instantiated, the initial upper and lower bounds are set as the initial value of the numeric-validator times (1 ± percent).
This class is meant to be called as a function after instantiation; its return is
Trueif the numeric validator is within the desired percentage range of the initial value, andFalseotherwise.- Parameters:
func (Callable) – Numeric validator function.
percent (float) – Percentage of the initial value to use as the bounds.
- Returns:
None
Warning
The percentage is assumed to be in the range [0.0, 100.0].
- gerrychain.constraints.compactness.L1_polsby_popper(partition: Partition) float[source]
Returns the \(L^1\) norm of the Polsby-Popper scores for the given partition
- Parameters:
partition (Partition) – Partition representing a districting plan
- Returns:
\(L^1\) norm of the reciprocal Polsby-Popper scores
- Return type:
float
- gerrychain.constraints.compactness.L1_reciprocal_polsby_popper(partition: Partition) float[source]
Returns the \(L^1\) norm of the reciprocal Polsby-Popper scores for the given partition
- Parameters:
partition (Partition) – Partition representing a districting plan
- Returns:
\(L^1\) norm of the reciprocal Polsby-Popper scores
- Return type:
float
- gerrychain.constraints.compactness.L2_polsby_popper(partition: Partition) float[source]
Returns the \(L^2\) norm of the Polsby-Popper scores for the given partition.
- Parameters:
partition (Partition) – Partition representing a districting plan
- Returns:
\(L^2\) norm of the Polsby-Popper scores
- Return type:
float
- gerrychain.constraints.compactness.L_minus_1_polsby_popper(partition)[source]
Returns the \(L^{-1}\) norm of the Polsby-Popper scores for the given partition.
- Parameters:
partition (Partition) – Partition representing a districting plan
- Returns:
\(L^{-1}\) norm of the Polsby-Popper scores
- Return type:
float
- gerrychain.constraints.contiguity.contiguous(partition: Partition) bool[source]
Check if the parts of a partition are connected
- gerrychain.constraints.contiguity.contiguous_bfs(partition: Partition) bool[source]
Checks that a given partition’s parts are connected as graphs using a simple breadth-first search.
- Parameters:
partition (Partition) – Instance of Partition
- Returns:
Whether the parts of this partition are connected
- Return type:
bool
- gerrychain.constraints.contiguity.contiguous_components(partition: Partition) Dict[int, list][source]
Return the connected components of each of the subgraphs of the parts of the partition.
- Parameters:
partition (Partition) – Instance of Partition; contains connected components.
- Returns:
dictionary mapping each part ID to a list holding the connected subgraphs of that part of the partition
- Return type:
dict
- gerrychain.constraints.contiguity.number_of_contiguous_parts(partition: Partition) int[source]
- Parameters:
partition (Partition) – Instance of Partition; contains connected components.
- Returns:
Number of contiguous parts in the partition.
- Return type:
int
- gerrychain.constraints.contiguity.single_flip_contiguous(partition: Partition) bool[source]
Check if swapping the given node from its old assignment disconnects the old assignment class.
- Parameters:
- Returns:
whether the partition is contiguous
- Return type:
bool
We assume that removed_node belonged to an assignment class that formed a connected subgraph. To see if its removal left the subgraph connected, we check that the neighbors of the removed node are still connected through the changed graph.
- class gerrychain.constraints.validity.Validator(constraints: List[Callable])[source]
Bases:
objectA single callable for checking that a partition passes a collection of constraints. Intended to be passed as the
is_validparameter when instantiatingMarkovChain.This class is meant to be called as a function after instantiation; its return is
Trueif all validators pass, andFalseif any one fails.Example usage:
is_valid = Validator([constraint1, constraint2, constraint3]) chain = MarkovChain(proposal, is_valid, accept, initial_state, total_steps)
- Variables:
constraints – List of validator functions that will check partitions.
- Parameters:
constraints (List[Callable]) – List of validator functions that will check partitions.
- gerrychain.constraints.validity.deviation_from_ideal(partition: Partition, attribute: str = 'population') Dict[int, float][source]
Computes the deviation of the given
attributefrom exact equality among parts of the partition. Usuallyattributeis the population, and this function is used to compute how far a districting plan is from exact population equality.By “deviation” we mean
(actual_value - ideal)/ideal(not the absolute value).
- gerrychain.constraints.validity.districts_within_tolerance(partition: Partition, attribute_name: str = 'population', percentage: float = 0.1) bool[source]
Check if all districts are within a certain percentage of the “smallest” district, as defined by the given attribute.
- Parameters:
partition (Partition) – Partition class instance
attrName (str, optional) – String that is the name of an updater in partition. Default is
"population".percentage (float, optional) – What percent (as a number between 0 and 1) difference is allowed. Default is 0.1.
- Returns:
Whether the districts are within specified tolerance
- Return type:
bool
- gerrychain.constraints.validity.no_vanishing_districts(partition: Partition) bool[source]
Require that no districts be completely consumed.
- Parameters:
partition (Partition) – Partition to check.
- Returns:
Whether no districts are completely consumed.
- Return type:
bool
- gerrychain.constraints.validity.refuse_new_splits(partition_county_field: str) Callable[[Partition], bool][source]
Refuse all proposals that split a county that was previous unsplit.
- Parameters:
partition_county_field (str) – Name of field for county information generated by
county_splits().- Returns:
Function that returns
Trueif the proposal does not split any new counties.- Return type:
Callable[[Partition], bool]
- gerrychain.constraints.validity.within_percent_of_ideal_population(initial_partition: Partition, percent: float = 0.01, pop_key: str = 'population') Bounds[source]
Require that all districts are within a certain percent of “ideal” (i.e., uniform) population.
Ideal population is defined as “total population / number of districts.”
- Parameters:
- Returns:
A
Boundsconstraint on the population attribute identified bypop_key.- Return type:
Graph
This module provides a set of functions to help determine and manipulate the adjacency of geometries within a particular graph. The functions in this module are used internally to ensure that the geometry data that we are working with is sufficiently well-defined to be used for analysis.
Some of the type hints in this module are intentionally left unspecified because of import issues.
- gerrychain.graph.adjacency.intersections_with_neighbors(geometries)[source]
Generator yielding tuples of the form (id, {neighbor_id: intersection}). The intersections may be empty!
- Parameters:
geometries (shapely.geometry.BaseGeometry) – A Shapeley geometry object.
- Returns:
A generator yielding tuples of the form (id, {neighbor_id: intersection})
- Return type:
Generator
- gerrychain.graph.adjacency.neighboring_geometries(geometries, tree=None)[source]
Generator yielding tuples of the form (id, (ids of neighbors)).
- Parameters:
geometries (shapely.geometry.BaseGeometry) – A Shapeley geometry object to construct the tree from
tree (shapely.strtree.STRtree, optional) – A Sort-Tile-Recursive tree for spatial indexing. Default is None.
- Returns:
A generator yielding tuples of the form (id, (ids of neighbors))
- Return type:
Generator
- gerrychain.graph.adjacency.neighbors(df: GeoDataFrame, adjacency: str) Dict[source]
- gerrychain.graph.adjacency.queen(geometries)[source]
- Parameters:
geometries (shapely.geometry.BaseGeometry) – A Shapeley geometry object.
- Returns:
The queen adjacency dictionary for the given collection of polygons.
- Return type:
Dict
- gerrychain.graph.adjacency.rook(geometries)[source]
- Parameters:
geometries (shapely.geometry.BaseGeometry) – A Shapeley geometry object.
- Returns:
The rook adjacency dictionary for the given collection of polygons.
- Return type:
Dict
- gerrychain.graph.adjacency.str_tree(geometries)[source]
Add ids to geometries and create a STR tree for spatial indexing. Use this for all spatial operations!
- Parameters:
geometries (shapely.geometry.BaseGeometry) – A Shapely geometry object to construct the tree from.
- Returns:
A Sort-Tile-Recursive tree for spatial indexing.
- Return type:
shapely.strtree.STRtree
- gerrychain.graph.adjacency.warn_for_overlaps(intersection_pairs)[source]
- Parameters:
intersection_pairs (Iterable) – An iterable of tuples of the form (id, {neighbor_id: intersection})
- Returns:
A generator yielding tuples of intersection pairs
- Return type:
Generator
- Raises:
UserWarning if there are overlaps among the given polygons
This module contains functions for working with GeoDataFrames and Shapely geometries. Specifically, it contains functions for handling and reprojecting the Universal Transverse Mercator projection, and for identifying bad geometries within a given GeoDataFrame.
- exception gerrychain.graph.geo.GeometryError[source]
Bases:
ExceptionWrapper error class for projection failures. Changing a map’s projection may create invalid geometries, which may or may not be repairable using the .buffer(0) trick.
- gerrychain.graph.geo.explain_validity(geo) str[source]
Given a geometry that is shapely interpretable, explain the validity. Light wrapper around shapely’s explain_validity.
- Parameters:
geo (shapely.geometry.BaseGeometry) – Shapely geometry object
- Returns:
Explanation for the validity of the geometry
- Return type:
str
- gerrychain.graph.geo.identify_utm_zone(df: GeoDataFrame) int[source]
Given a GeoDataFrame, identify the Universal Transverse Mercator zone number for the centroid of the geometries in the dataframe.
- gerrychain.graph.geo.invalid_geometries(df)[source]
Given a GeoDataFrame, returns a list of row indices with invalid geometries.
- Parameters:
df (
geopandas.GeoDataFrame) – The GeoDataFrame to examine- Returns:
List of row indices with invalid geometries
- Return type:
list of int
- gerrychain.graph.geo.reprojected(df)[source]
- Returns a copy of df, projected into the coordinate reference system of a suitable
- Parameters:
df (
geopandas.GeoDataFrame) – The GeoDataFrame to reproject- Returns:
A copy of df, projected into the coordinate reference system of a suitable UTM zone.
- Return type:
geopandas.GeoDataFrame
- gerrychain.graph.geo.utm_of_point(point)[source]
Given a point, return the Universal Transverse Mercator zone number for that point.
This module provides tools for working with graphs in the context of geographic data. It extends the functionality of the NetworkX library, adding support for spatial data structures, geographic projections, and serialization to and from JSON format.
This module is designed to be used in conjunction with geopandas, shapely, and pandas libraries, facilitating the integration of graph-based algorithms with geographic information systems (GIS).
Note: This module relies on NetworkX, pandas, and geopandas, which should be installed and imported as required.
TODO: Documentation: Update top-level documentation for graph.py
- class gerrychain.graph.graph.FrozenGraph(graph: Graph)[source]
Bases:
objectRepresents an immutable graph to be partitioned. It is based off
Graph.This speeds up chain runs and prevents having to deal with cache invalidation issues. This class behaves slightly differently than
Graphornetworkx.Graph.Not intended to be a part of the public API.
- Variables:
graph – The underlying graph.
size – The number of nodes in the graph.
Note
The class uses __slots__ for improved memory efficiency.
Initialize a FrozenGraph from a Graph.
- param graph:
The mutable Graph to be converted into an immutable graph
- type graph:
Graph
- returns:
None
- degree(n: Any) int[source]
- property edge_indices: Iterable[Any]
- graph
- neighbors(n: Any) tuple[Any, ...][source]
- property node_indices: Iterable[Any]
- size
- subgraph(nodes: Iterable[Any]) FrozenGraph[source]
- class gerrychain.graph.graph.Graph[source]
Bases:
objectfrm TODO: Documentation: Clean up this documentation
frm: this class encapsulates / hides the underlying graph which can either be a NetworkX graph or a RustworkX graph. The intent is that it provides the same external interface as a NetworkX graph (for all of the uses that GerryChain cares about, at least) so that legacy code that operated on NetworkX based Graph objects can continue to work unchanged.
When a graph is added to a partition, however, the NX graph will be converted into an RX graph and the NX graph will become unaccessible to the user. The RX graph may also be “frozen” the way the NX graph was “frozen” in the legacy code, but we have not yet gotten that far in the implementation.
It is not clear whether the code that does the heavy lifting on partitions will need to use the old NX syntax or whether it will be useful to allow unfettered access to the RX graph so that RX code can be used in these modules. TBD…
- add_data(df: DataFrame, columns: Iterable[str] | None = None) None[source]
Add columns of a DataFrame to a graph as node attributes by matching the DataFrame’s index to node ids.
- Parameters:
df (
pandas.DataFrame) – Dataframe containing given columns.columns (Optional[Iterable[str]], optional) – list of dataframe column names to add. Default is None.
- Returns:
None
- add_edge(node_id1: Any, node_id2: Any) None[source]
Add an edge to the graph from node_id1 to node_id2
- Parameters:
node_id1 (Any) – The node_id for one of the nodes in the edge
node_id2 (Any) – The node_id for one of the nodes in the edge
- Return type:
None
- convert_from_nx_to_rx() Graph[source]
Convert an NX-based graph object to be an RX-based graph object.
The primary use case for this routine is support for users constructing a graph using NetworkX functionality and then converting that NetworkX graph to RustworkX when creating a Partition object.
- Returns:
An RX-based graph that is “the same” as the given NX-based graph
- Return type:
“Graph”
- degree(node_id: Any) int[source]
Return the degree of the given node, that is, the number of other nodes directly connected to the given node.
- Parameters:
node_id (Any) – The ID of a node
- Returns:
Number of nodes directly connected to the given node
- Return type:
int
- edge_data(edge_id: Any) dict[Any, Any][source]
Return the data dictionary that contains the data for the given edge.
Note that in NetworkX an edge_id can be almost anything, for instance, a string or even a tuple. However, in RustworkX, an edge_id is an integer. This code handles both kinds of edge_ids - hence the type, Any.
- Parameters:
edge_id (Any) – The ID of the edge
- Returns:
The data dictionary for the given edge’s data
- Return type:
dict[Any, Any]
- property edge_indices: set[Any]
Return a set of the edge_ids in the graph
- Return type:
set[Any]
- property edges: set[tuple[Any, Any]]
Return a set of all of the edges in the graph, where each edge is a tuple of node_ids
- Return type:
set[tuple[Any, Any]]:
- classmethod from_file(filename: str, adjacency: str = 'rook', cols_to_add: list[str] | None = None, reproject: bool = False, ignore_errors: bool = False) Graph[source]
Create a
Graphfrom a shapefile (or GeoPackage, or GeoJSON, or any other library thatgeopandascan read. Seefrom_geodataframe()for more details.- Parameters:
filename (str) – Path to the shapefile / GeoPackage / GeoJSON / etc.
adjacency (str, optional) – The adjacency type to use (“rook” or “queen”). Default is “rook”
cols_to_add (Optional[list[str]], optional) – The names of the columns that you want to add to the graph as node attributes. Default is None.
reproject (bool, optional) – Whether to reproject to a UTM projection before creating the graph. Default is False.
ignore_errors (bool, optional) – Whether to ignore all invalid geometries and try to continue creating the graph. Default is False.
- Returns:
The Graph object of the geometries from filename.
- Return type:
Warning
This method requires the optional
geopandasdependency. So please installgerrychainwith thegeoextra via the command:pip install gerrychain[geo]or install
geopandasseparately.
- classmethod from_geodataframe(dataframe: DataFrame, adjacency: str = 'rook', cols_to_add: list[str] | None = None, reproject: bool = False, ignore_errors: bool = False, crs_override: str | int | None = None) Graph[source]
Create the adjacency
Graphof geometries described by dataframe. The areas of the polygons are included as node attributes (with key area). The shared perimeter of neighboring polygons are included as edge attributes (with key shared_perim). Nodes corresponding to polygons on the boundary of the union of all the geometries (e.g., the state, if your dataframe describes VTDs) have a boundary_node attribute (set to True) and a boundary_perim attribute with the length of this “exterior” boundary.By default, areas and lengths are computed in a UTM projection suitable for the geometries. This prevents the bizarro area and perimeter values that show up when you accidentally do computations in Longitude-Latitude coordinates. If the user specifies reproject=False, then the areas and lengths will be computed in the GeoDataFrame’s current coordinate reference system. This option is for users who have a preferred CRS they would like to use.
- Parameters:
dataframe (
geopandas.GeoDataFrame) – The GeoDateFrame to convertadjacency (str, optional) – The adjacency type to use (“rook” or “queen”). Default is “rook”.
cols_to_add (Optional[list[str]], optional) – The names of the columns that you want to add to the graph as node attributes. Default is None.
reproject (bool, optional) – Whether to reproject to a UTM projection before creating the graph. Default is
False.ignore_errors (bool, optional) – Whether to ignore all invalid geometries and attept to create the graph anyway. Default is
False.crs_override (Optional[Union[str,int]], optional) – Value to override the CRS of the GeoDataFrame. Default is None.
- Returns:
The adjacency graph of the geometries from dataframe.
- Return type:
- classmethod from_json(json_file_name: str) Graph[source]
Create a
Graphfrom a JSON file- Parameters:
json_file_name (str) – JSON file # frm: TODO: Documentation: more detail on contents of JSON file needed here
- Returns:
A GerryChain Graph object with data from JSON file
- Return type:
“Graph”
- classmethod from_networkx(nx_graph: Graph) Graph[source]
Create a
Graphfrom a NetworkX.Graph objectThis supports the use case of users creating a graph using NetworkX which is convenient - both for users of the previous implementation of a GerryChain object which was a subclass of NetworkX.Graph and for users more generally who are familiar with NetworkX.
Note that most users will not ever call this function directly, because they can create a GerryChain Partition object directly from a NetworkX graph, and the Partition initialization code will use this function to convert the NetworkX graph to a GerryChain Graph object.
- Parameters:
nx_graph (networkx.Graph) – A NetworkX.Graph object with node and edge data to be converted into a GerryChain Graph object.
- Returns:
…text…
- Return type:
<type>
- classmethod from_rustworkx(rx_graph: PyGraph) Graph[source]
<Overview text for what the function does>
Create a
Graphfrom a RustworkX.PyGraph objectThere are three primary use cases for this routine: 1) converting an NX-based Graph to be an RX-based Graph, 2) creating a subgraph of an RX-based Graph, and 3) creating a Graph whose node_ids do not need to be mapped to some previous graph’s node_ids.
In a little more detail:
1) A typical way to use GerryChain is to create a graph using NetworkX functionality and to then rely on the initialization code in the Partition class to create an RX-based Graph object. That initialization code constructs a RustworkX PyGraph and then uses this routine to create an RX-based Graph object, and it then creates maps from the node_ids of the resulting RX-based Graph back to the original NetworkX.Graph’s node_ids.
2) When creating a subgraph of a RustworkX PyGraph object, the node_ids of the subgraph are (in general) different from those of the parent graph. So we create a mapping from the subgraph’s node_ids to the node_ids of the parent. The subgraph() routine creates a RustworkX PyGraph subgraph, then uses this routine to create an RX-based Graph using that subgraph, and it then creates the mapping of subgraph node_ids to the parent (RX) graph’s node_ids.
3) In those cases where no node_id mapping is needed this routine provides a simple way to create an RX-based GerryChain graph object.
- Parameters:
rx_graph (rustworkx.PyGraph) – a RustworkX PyGraph object
- Returns:
a GerryChain Graph object with an embedded RustworkX.PyGraph object
- Return type:
“Graph”
- generic_bfs_edges(source, neighbors=None, depth_limit=None) Generator[tuple[Any, Any], None, None][source]
<Overview text for what the function does>
- Parameters:
<param_name> (<type>) – …text… …more text…
- Returns:
…text…
- Return type:
<type>
- generic_bfs_predecessors(root_node_id: Any) dict[Any, Any][source]
<Overview text for what the function does>
- Parameters:
<param_name> (<type>) – …text… …more text…
- Returns:
…text…
- Return type:
<type>
- generic_bfs_successors(root_node_id: Any) dict[slice(Any, Any, None)][source]
<Overview text for what the function does>
- Parameters:
<param_name> (<type>) – …text… …more text…
- Returns:
…text…
- Return type:
<type>
- generic_bfs_successors_generator(root_node_id: Any) Generator[tuple[Any, Any], None, None][source]
<Overview text for what the function does>
- Parameters:
<param_name> (<type>) – …text… …more text…
- Returns:
…text…
- Return type:
<type>
- get_edge_from_edge_id(edge_id: Any) tuple[Any, Any][source]
Return the edge (tuple of node_ids) corresponding to the given edge_id
Note that in NX, an edge_id is the same as an edge - it is just a tuple of node_ids. However, in RX, an edge_id is an integer, so if you want to get the tuple of node_ids you need to use the edge_id to get that tuple…
- Parameters:
edge_id (Any) – The ID of the desired edge
- Returns:
An edge, namely a tuple of node_ids
- Return type:
tuple[Any, Any]
- get_edge_id_from_edge(edge: tuple[Any, Any]) Any[source]
Get the edge_id that corresponds to the given edge.
In RX an edge_id is an integer that designates an edge (an edge is a tuple of node_ids). In NX, an edge_id IS the tuple of node_ids. So, in general, to support both NX and RX, if you want to get access to the edge data for an edge (tuple of node_ids), you need to ask for the edge_id.
This functionality is needed, for instance, when
- Parameters:
edge (tuple[Any, Any]) – A tuple of node_ids.
- Returns:
The ID associated with the given edge
- Return type:
Any
- get_nx_graph() Graph[source]
Return the embedded NX graph object
- Return type:
networkx.Graph
- get_nx_to_rx_node_id_map() dict[Any, Any][source]
Return the dict that maps NX node_ids to RX node_ids
The primary use case for this routine is to support automatically converting NX-based graph objects to be RX-based when creating a Partition object. The issue is that when you convert from NX to RX the node_ids change and so you need to update the Partition object’s Assignment to use the new RX node_ids. This routine is used to translate those NX node_ids to the new RX node_ids when initializing a Partition object.
- Return type:
dict[Any, Any]
- get_rx_graph() PyGraph[source]
Return the embedded RX graph object
- Return type:
rustworkx.PyGraph
- internal_node_id_for_original_nx_node_id(original_nx_node_id: Any) Any[source]
Discover the “internal” node_id in the current GerryChain graph that corresponds to the “original” node_id in the top-level graph (presumably an NX-based graph object).
This was originally created to facilitate testing where it was convenient to express the test success criteria in terms of “original” node_ids, but the actual test needed to be made using the “internal” (RX) node_ids.
- Parameters:
original_nx_node_id (Any) – The “original” node_id
- Returns:
The corresponding “internal” node_id
- Return type:
Any
- is_a_tree() bool[source]
Return whether the current graph is a tree - meaning that it is connected and that it has no cycles.
- Returns:
Whether the current graph is a tree
- Return type:
bool
- is_directed() bool[source]
- is_nx_graph() bool[source]
Determine if the graph is NX-based
- Return type:
bool
- is_rx_graph() bool[source]
Determine if the graph is RX-based
- Return type:
bool
- property islands: set[Any]
Return a set of all node_ids that are not connected via an edge to any other node in the graph - that is, nodes with degree = 0
- Returns:
A set of all node_ids for nodes of degree 0
- Return type:
set[Any]
- issue_warnings() None[source]
Issue any warnings concerning the content or structure of the graph.
- Return type:
None
- join(dataframe: DataFrame, columns: list[str] | None = None, left_index: str | None = None, right_index: str | None = None) None[source]
Add data from a dataframe to the graph, matching nodes to rows when the node’s left_index attribute equals the row’s right_index value.
- Parameters:
dataframe (
pandas.DataFrame) – DataFrame.- Columns:
The columns whose data you wish to add to the graph. If not provided, all columns are added. Default is None.
- Left_index:
The node attribute used to match nodes to rows. If not provided, node IDs are used. Default is None.
- Right_index:
The DataFrame column name to use to match rows to nodes. If not provided, the DataFrame’s index is used. Default is None.
- Returns:
None
- laplacian_matrix() csr_array[source]
<Overview text for what the function does>
- Parameters:
<param_name> (<type>) – …text… …more text…
- Returns:
…text…
- Return type:
<type>
- neighbors(node_id: Any) list[Any][source]
Return a list of the node_ids of the nodes that are neighbors of the given node - that is, all of the nodes that are directly connected to the given node by an edge.
- Parameters:
node_id (Any) – The ID of a node
- Returns:
A list of neighbor node_ids
- Return type:
list[Any]
- node_data(node_id: Any) dict[Any, Any][source]
Return the data dictionary that contains the given node’s data.
As docmented elsewhere, in GerryChain code before the conversion to RustworkX, users could access node data using the syntax:
graph.nodes[node_id][attribute_name]
This was because a GerryChain Graph object in that codebase was a subclass of NetworkX.Graph, and NetworkX was clever and implemented dict-like behavior for the syntax graph.nodes[]…
This Python cleverness was not carried over to the RustworkX implementation, so in the current GerryChain Graph implementation users need to access node data using the syntax:
graph.node_data(node_id)[attribute_name]
- Parameters:
node_id (Any) – The ID of a node
- Returns:
Data dictionary containing the given node’s data.
- Return type:
dict[Any, Any]
- property node_indices: set[Any]
Return a set of the node_ids in the graph
- Return type:
set[Any]
- property nodes: list[Any]
Return a list of all of the node_ids in the graph.
This routine still exists because there is a lot of legacy code that uses this syntax to iterate through all of the nodes in a graph.
There is another routine, node_indices(), which does essentially the same thing (it returns a set of node_ids, however, rather than a list).
Why have two routines that do the same thing? The answer is that with move to RX, it seemed appropriate to emphasize the distinction between objects and the IDs for objects, hence the introduction of node_indices() and edge_indices() routines. This distinction is critical for edges, but mostly not important for nodes. In fact this routine is implemented by just converting node_indices to a list. So, it is essentially a style issue - when referring to nodes, we are almost always really referring to node_ids, so why not use a routine called node_indices()?
Note that there is a subtle point to be made about node names vs. node_ids. It was common before the transition to RX to create nodes with IDs that were essentially names. That is, the ID had semantic weight. This is not true with RX node_ids. So, any code that relies on the semantics of a node’s ID (treating it like a name) is suspect in the new RX world.
- Returns:
A list of all of the node_ids in the graph
- Return type:
list[Any]
- normalized_laplacian_matrix() dia_array[source]
<Overview text for what the function does>
- Parameters:
<param_name> (<type>) – …text… …more text…
- Returns:
…text…
- Return type:
<type>
- num_connected_components() int[source]
Return the number of connected components.
Note: A connected component is a maximal subgraph where every vertex is reachable from every other vertex in that same subgraph. In a graph that is not fully connected, connected components are the separate, distinct “islands” of connected nodes. Every node in a graph belongs to exactly one connected component.
- Returns:
The number of connected components
- Return type:
int
- original_nx_node_id_for_internal_node_id(internal_node_id: Any) Any[source]
Translate a node_id to its “original” node_id.
- Parameters:
internal_node_id (Any) – A node_id to be translated
- Returns:
A translated node_id
- Return type:
Any
- original_nx_node_ids_for_list(list_of_node_ids: list[Any]) list[Any][source]
Translate a list of node_ids to their “original” node_ids.
- Parameters:
list_of_node_ids (list[Any]) – A list of node_ids to be translated
- Returns:
A list of translated node_ids
- Return type:
list[Any]
- original_nx_node_ids_for_set(set_of_node_ids: set[Any]) Any[source]
Translate a set of node_ids to their “original” node_ids.
- Parameters:
set_of_node_ids (set[Any]) – A set of node_ids to be translated
- Returns:
A set of translated node_ids
- Return type:
set[Any]
- predecessors(root_node_id: Any) dict[slice(Any, Any, None)][source]
<Overview text for what the function does>
- Parameters:
<param_name> (<type>) – …text… …more text…
- Returns:
…text…
- Return type:
<type>
- subgraph(nodes: Iterable[Any]) Graph[source]
Create a subgraph that contains the given nodes.
Note that creating a subgraph of an RustworkX (RX) graph renumbers the nodes, so that a node that had node_id: 4 in the parent graph might have node_id: 2 in the subgraph. This is a HUGE difference from the NX world where the node_ids in a subgraph do not change from those in the parent graph.
In order to make sense of the nodes in a subgraph in the RX world, we need to maintain mappings from the node_ids in the subgraph to the node_ids of the immediate parent graph and to the “original” top-level graph that contains all of the nodes. You will notice the creation of those maps in the code below.
- Parameters:
nodes (Iterable[Any]) – The nodes to be included in the subgraph
- Returns:
A subgraph containing the given nodes.
- Return type:
“Graph”
- subgraphs_for_connected_components() list[Graph][source]
Create and return a list of subgraphs for each set of nodes in the given graph that are connected.
Note that a connected graph is one in which there is a path from every node in the graph to every other node in the graph.
Note also that each of the subgraphs returned is a maximal subgraph of connected components, meaning that there is no other larger subgraph of connected components that includes it as a subset.
- Returns:
A list of “maximal” subgraphs each of which contains nodes that are connected.
- Return type:
list[“Graph”]
- successors(root_node_id: Any) dict[slice(Any, Any, None)][source]
<Overview text for what the function does>
- Parameters:
<param_name> (<type>) – …text… …more text…
- Returns:
…text…
- Return type:
<type>
- to_json(json_file_name: str, include_geometries_as_geojson: bool = False) None[source]
Dump a GerryChain Graph object to disk as a JSON file
- Parameters:
json_file_name (str) – name of JSON file to be created
- Return type:
None
- to_networkx_graph() Graph[source]
Create a NetworkX.Graph object that has the same nodes, edges, node_data, and edge_data as the GerryChain Graph object.
The intended purpose of this routine is to allow a user to run a MarkovChain - which uses an embedded RustworkX graph and then extract an equivalent version of that graph with all of its data as a NetworkX.Graph object - in order to use NetworkX routines to access and manipulate the graph.
In short, this routine allows users to use NetworkX functionality on a graph after running a MarkovChain.
If the GerryChain graph object is NX-based, then this routine merely returns the embedded NetworkX.Graph object.
- Returns:
A NetworkX.Graph object that is equivalent to the GerryChain Graph object (nodes, edges, node_data, edge_data)
- Return type:
networkx.Graph
- translate_subgraph_node_ids_for_flips(flips: dict[Any, int]) dict[Any, int][source]
Translate the given flips so that the subgraph node_ids in the flips have been translated to the appropriate node_ids in the parent graph.
The flips parameter is a dict mapping node_ids to parts (districts).
This routine is used when a computation that creates flips is made on a subgraph, but those flips want to be translated into the context of the parent graph at the end of the computation.
For more details, refer to the larger comment on subgraphs…
- Parameters:
flips (dict[Any, int]) – A dict containing “flips” which associate a node with a new part in a partition (a “part” is the same as a district in common parlance).
- Returns:
A dict containing “flips” that have been translated to have node_ids appropriate for the parent graph
- Return type:
dict[Any, int]
- translate_subgraph_node_ids_for_set_of_nodes(set_of_nodes: set[Any]) set[Any][source]
Translate the given set_of_nodes to have the appropriate node_ids for the parent graph.
This routine is used when a computation that creates a set of nodes is made on a subgraph, but those nodes want to be translated into the context of the parent graph at the end of the computation.
For more details, refer to the larger comment on subgraphs…
- Parameters:
set_of_nodes (set[Any]) – A set of node_ids in a subgraph
- Returns:
A set of node_ids that have been translated to have the node_ids appropriate for the parent graph
- Return type:
set[Any]
- verify_graph_is_valid() bool[source]
Verify that the graph is valid.
This may be overkill, but the idea is that at least in development mode, it would be prudent to check periodically to see that the graph data structure has not been corrupted.
- Returns:
True if the graph is deemed valid
- Return type:
bool
- warn_for_islands() None[source]
Issue a warning if there are any islands in the graph - that is, if there are any nodes in the graph that are not connected to any other node (degree = 0)
- Return type:
None
- gerrychain.graph.graph.add_boundary_perimeters(nx_graph: Graph, geometries: Series) None[source]
Add shared perimeter between nodes and the total geometry boundary.
- gerrychain.graph.graph.check_dataframe(df: DataFrame) None[source]
- Returns:
None
- Raises:
UserWarning if the dataframe has any NA values.
- gerrychain.graph.graph.convert_geometries_to_geojson(data: Graph) None[source]
Convert geometry attributes in a NetworkX adjacency data object to GeoJSON, so that they can be serialized. Mutates the
dataobject.Does nothing if no geometry attributes are found.
- Parameters:
data (networkx.Graph) – an adjacency data object (returned by
networkx.readwrite.json_graph.adjacency_data())- Returns:
None
- gerrychain.graph.graph.json_serialize(input_object: Any) int | None[source]
This function is used to handle one of the common issues that appears when trying to convert a pandas dataframe into a JSON serializable object. Specifically, it handles the issue of converting the pandas int64 to a python int so that JSON can serialize it. This is specifically used so that we can write graphs out to JSON files.
- Parameters:
input_object (Any (expected to be a pd.Int64Dtype)) – The object to be converted
- Returns:
The converted pandas object or None if input is not of type pd.Int64Dtype
- Return type:
Optional[int]
- gerrychain.graph.graph.remove_geometries(data: Graph) None[source]
Remove geometry attributes from NetworkX adjacency data object, because they are not serializable. Mutates the
dataobject.Does nothing if no geometry attributes are found.
- Parameters:
data (networkx.Graph) – an adjacency data object (returned by
networkx.readwrite.json_graph.adjacency_data())- Returns:
None
Grid
This module provides a Grid class used for creating and manipulating grid partitions. It’s part of the GerryChain suite, designed to facilitate experiments with redistricting plans without the need for extensive data processing. This module relies on NetworkX for graph operations and integrates with GerryChain’s Partition class.
Dependencies:
math: For math.floor() function.
- networkx: For graph operations with using the graph structure in
Graph.
typing: Used for type hints.
- class gerrychain.grid.Grid(dimensions: Tuple[int, int] | None = None, with_diagonals: bool = False, assignment: Dict | None = None, updaters: Dict[str, Callable] | None = None, parent: Grid | None = None, flips: Dict[Tuple[int, int], int] | None = None)[source]
Bases:
PartitionThe
Gridclass represents a grid partitioned into districts. It is useful for running little experiments with GerryChain without needing to do any data processing or cleaning to get started.Example usage:
grid = Grid((10,10))
The nodes of
grid.graphare labelled by tuples(i,j), for0 <= i <= 10and0 <= j <= 10. Each node has anareaof 1 and each edge hasshared_perim1.If the updaters are not specified, the default updaters are used, which are as follows:
default_updaters = { "cut_edges": cut_edges, "population": Tally("population"), "perimeter": perimeter, "exterior_boundaries": exterior_boundaries, "interior_boundaries": interior_boundaries, "boundary_nodes": boundary_nodes, "area": Tally("area", alias="area"), "polsby_popper": polsby_popper, "cut_edges_by_part": cut_edges_by_part, }
- Parameters:
dimensions (Tuple[int, int], optional) – The grid dimensions (rows, columns), defaults to None.
with_diagonals (bool, optional) – If True, includes diagonal connections, defaults to False.
assignment (Dict, optional) – Node-to-district assignments, defaults to None.
updaters (Dict[str, Callable], optional) – Custom updater functions, defaults to None.
parent (Grid, optional) – Parent Grid object for inheritance, defaults to None.
flips (Dict[Tuple[int, int], int], optional) – Node flips for partition changes, defaults to None.
- Raises:
Exception – If neither dimensions nor parent is provided.
- as_list_of_lists()[source]
Returns the grid as a list of lists (like a matrix), where the (i,j)th entry is the assigned district of the node in position (i,j) on the grid.
- Returns:
List of lists representing the grid.
- Return type:
List[List[int]]
- assignment
- default_updaters = {'area': <gerrychain.updaters.tally.Tally object>, 'boundary_nodes': <function boundary_nodes>, 'cut_edges': <function cut_edges>, 'cut_edges_by_part': <function cut_edges_by_part>, 'exterior_boundaries': <function exterior_boundaries>, 'interior_boundaries': <function interior_boundaries>, 'perimeter': <function perimeter>, 'polsby_popper': <function polsby_popper>, 'population': <gerrychain.updaters.tally.Tally object>}
- edge_flows
- flips
- flows
- graph
- parent
- subgraphs
- updaters
- gerrychain.grid.color_half(node: Tuple[int, int], threshold: int) int[source]
Assigns a color (as an integer) to a node based on its x-coordinate.
This function is used to partition the grid into two parts based on a given threshold. Nodes with an x-coordinate less than or equal to the threshold are assigned one color, and nodes with an x-coordinate greater than the threshold are assigned another.
- Parameters:
node (Tuple[int, int]) – The node to color, represented as a tuple of coordinates (x, y).
threshold (int) – The x-coordinate value that determines the color assignment.
- Returns:
An integer representing the color of the node. Returns 0 for nodes with x-coordinate less than or equal to the threshold, and 1 otherwise.
- Return type:
int
- gerrychain.grid.color_quadrants(node: Tuple[int, int], thresholds: Tuple[int, int]) int[source]
Assigns a color (as an integer) to a node based on its position relative to specified threshold coordinates, effectively dividing the grid into four quadrants.
The function uses two threshold values (one for each axis) to determine the color. Each combination of being higher or lower than the threshold on each axis results in a different color.
- Parameters:
node (Tuple[int, int]) – The node to color, represented as a tuple of coordinates (x, y).
thresholds (Tuple[int, int]) – A tuple of two integers representing the threshold coordinates (x_threshold, y_threshold).
- Returns:
An integer representing the color of the node, determined by its quadrant.
- Return type:
int
- gerrychain.grid.get_boundary_perim(node: Tuple[int, int], dimensions: Tuple[int, int]) int[source]
Determines the boundary perimeter of a node on the grid. The boundary perimeter is the number of sides of the node that are on the boundary of the grid.
- Parameters:
node (Tuple[int, int]) – The node to check.
dimensions (Tuple[int, int]) – The dimensions of the grid.
- Returns:
The boundary perimeter of the node.
- Return type:
int
- gerrychain.grid.give_constant_attribute(graph: Graph, attribute: Any, value: Any) None[source]
Sets the specified attribute to the specified value for all nodes in the graph.
- Parameters:
graph (Graph) – The graph to modify.
attribute (Any) – The attribute to set.
value (Any) – The value to set the attribute to.
- Returns:
None
Meta
Simple tooling to collect diversity stats on chain runs
- class gerrychain.meta.diversity.DiversityStats(unique_plans: int, unique_districts: int, steps_taken: int)[source]
Bases:
objectLightweight stats object that reports the diversity of a given chain.
- Variables:
unique_plans – The number of unique plans seen so far.
unique_districts – The number of unique districts seen so far.
steps_taken – The number of steps taken so far.
Example usage:
DiversityStats(unique_plans=44162, unique_districts=82992, steps_taken=100000)
- steps_taken: int
- unique_districts: int
- unique_plans: int
- gerrychain.meta.diversity.collect_diversity_stats(chain: Iterable[Partition]) Iterable[Tuple[Partition, DiversityStats]][source]
Report the diversity of the chain being run, live, as a drop-in wrapper. Requires the cut_edges updater on each Partition object. Plans/districts are considered distinct if they are not isomorphic. That is, relabled plans and districts are considered non-unique and counted as duplicate.
Example usage:
for partition, stats in collect_diversity_stats( Replay( graph, "sample-run.chain" ) ): print(stats) # normal chain stuff here
- Parameters:
chain (Iterable[Partition]) – A chain object to collect stats on.
- Returns:
An iterable of (partition, DiversityStat).
- Return type:
Iterable[Tuple[Partition, DiversityStats]]
Metagraph
This module provides the main tools for interacting with the metagraph of partitions. The metagraph of partitions is the set of partitions that are reachable from the current partition by a single flip.
Dependencies:
itertools: Used for product() function.
typing: Used for type hints.
Last Updated: 11 Jan 2024
- gerrychain.metagraph.all_cut_edge_flips(partition: Partition) Iterator[Dict][source]
Generate all possible flips of cut edges in a partition without any constraints.
This routine finds all edges on the boundary of districts - those that are “cut edges” where one node is in one district and the other node is in another district. These are all of the places where you could move the boundary between districts by moving a single node.
- Parameters:
partition (Partition) – The partition object.
- Returns:
An iterator that yields dictionaries representing the flipped edges.
- Return type:
Iterator[Dict]
- gerrychain.metagraph.all_valid_flips(partition: Partition, constraints: Iterable[Callable] | Callable) Iterator[Dict][source]
Generate all valid flips for a given partition subject to the prescribed constraints.
- Parameters:
partition (Partition) – The initial partition.
constraints (Union[Iterable[Callable], Callable]) – The constraints to be satisfied. Can be a single constraint or an iterable of constraints.
- Returns:
An iterator that yields dictionaries representing valid flips.
- Return type:
Iterator[Dict]
- gerrychain.metagraph.all_valid_states_one_flip_away(partition: Partition, constraints: Iterable[Callable] | Callable) Iterator[Partition][source]
Generates all valid Partitions that differ from the given partition by one flip. These are the given partition’s neighbors in the metagraph of partitions. (The metagraph of partitions is the set of partitions that is reachable from the given partition by a single flip under the prescribed constraints.)
- Parameters:
partition (Partition) – The initial partition.
constraints (Union[Iterable[Callable], Callable]) – Constraints to determine the validity of a partition. It can be a single callable or an iterable of callables.
- Returns:
An iterator that yields all valid partitions that differ from the given partition by one flip.
- Return type:
Iterator[Partition]
- gerrychain.metagraph.metagraph_degree(partition: Partition, constraints: Iterable[Callable] | Callable) int[source]
Calculate the degree of the node in the metagraph of the given partition. That is to say, compute how many possible valid states are reachable from the state given by partition in a single flip subject to the prescribed constraints.
- Parameters:
partition (Partition) – The partition object representing the current state.
constraints (Union[Iterable[Callable], Callable]) – The constraints to be applied to the partition. It can be a single constraint or an iterable of constraints.
- Returns:
The degree of the partition node in the metagraph.
- Return type:
int
Metrics
- gerrychain.metrics.compactness.compute_polsby_popper(area: float, perimeter: float) float[source]
Computes the Polsby-Popper score for a single district.
- Parameters:
area (float) – The area of the district
perimeter (float) – The perimeter of the district
- Returns:
The Polsby-Popper score for the district
- Return type:
float
- gerrychain.metrics.compactness.polsby_popper(partition) Dict[int, float][source]
Computes Polsby-Popper compactness scores for each district in the partition.
- Parameters:
partition (Partition) – The partition to compute scores for
- Returns:
A dictionary mapping each district ID to its Polsby-Popper score
- Return type:
Dict[int, float]
The partisan metrics in this file are later used in the module gerrychain.updaters.election.py. Thus, all of the election results objects here are implicilty typed as ElectionResults, but cannot be given an explicit type annotation due to problems with circular imports.
- gerrychain.metrics.partisan.efficiency_gap(election_results) float[source]
Computes the efficiency gap for the given ElectionResults. A positive value indicates an advantage for the first party listed in the Election’s party_names_to_node_attribute_names dictionary.
- Parameters:
election_results (ElectionResults) – An ElectionResults object
- Returns:
The efficiency gap for the given ElectionResults
- Return type:
float
- gerrychain.metrics.partisan.mean_median(election_results) float[source]
Computes the Mean-Median score for the given ElectionResults. A positive value indicates an advantage for the first party listed in the Election’s party_names_to_node_attribute_names dictionary.
- Parameters:
election_results (ElectionResults) – An ElectionResults object
- Returns:
The Mean-Median score for the given ElectionResults
- Return type:
float
- gerrychain.metrics.partisan.mean_thirdian(election_results) float[source]
Computes the Mean-Median score for the given ElectionResults. A positive value indicates an advantage for the first party listed in the Election’s party_names_to_node_attribute_names dictionary.
The motivation for this score is that the minority party in many states struggles to win even a third of the seats.
- Parameters:
election_results (ElectionResults) – An ElectionResults object
- Returns:
The Mean-Thirdian score for the given ElectionResults
- Return type:
float
- gerrychain.metrics.partisan.partisan_bias(election_results) float[source]
Computes the partisan bias for the given ElectionResults. The partisan bias is defined as the number of districts with above-mean vote share by the first party divided by the total number of districts, minus 1/2.
- Parameters:
election_results (ElectionResults) – An ElectionResults object
- Returns:
The partisan bias for the given ElectionResults
- Return type:
float
- gerrychain.metrics.partisan.partisan_gini(election_results) float[source]
Computes the partisan Gini score for the given ElectionResults. The partisan Gini score is defined as the area between the seats-votes curve and its reflection about (.5, .5).
For more information on the computation, see Definition 1 in: https://arxiv.org/pdf/2008.06930.pdf
- Parameters:
election_results (ElectionResults) – An ElectionResults object
- Returns:
The partisan Gini score for the given ElectionResults
- Return type:
float
- gerrychain.metrics.partisan.wasted_votes(party1_votes: int, party2_votes: int) Tuple[int, int][source]
Computes the wasted votes for each party in the given race. :param party1_votes: the number of votes party1 received in the race :type party1_votes: int :param party2_votes: the number of votes party2 received in the race :type party2_votes: int
- Returns:
a tuple of the wasted votes for each party
- Return type:
Tuple[int, int]
Optimization
- class gerrychain.optimization.optimization.SingleMetricOptimizer(proposal: Callable[[Partition], Partition], constraints: Callable[[Partition], bool] | List[Callable[[Partition], bool]], initial_state: Partition, optimization_metric: Callable[[Partition], Any], maximize: bool = True, step_indexer: str = 'step')[source]
Bases:
objectSingleMetricOptimizer represents the class of algorithms / chains that optimize plans with respect to a single metric. An instance of this class encapsulates the following state information:
the dual graph and updaters via the initial partition,
the constraints new proposals are subject to,
the metric over which to optimize,
and whether or not to seek maximal or minimal values of the metric.
- The SingleMetricOptimizer class implements the following common methods of optimization:
Short Bursts
Simulated Annealing
Tilted Runs
Both during and after a optimization run, the class properties best_part and best_score represent the optimal partition / corresponding score value observed. Note that these properties do NOT persist across multiple optimization runs, as they are reset each time an optimization run is invoked.
- Parameters:
proposal (Callable) – Function proposing the next state from the current state.
constraints (Union[Callable[[Partition], bool], List[Callable[[Partition], bool]]]) – A function, or lists of functions, determining whether the proposed next state is valid (passes all binary constraints). Usually this is a
Validatorclass instance.initial_state (Partition) – Initial state of the optimizer.
optimization_metric (Callable[[Partition], Any]) – The score function with which to optimize over. This should have the signature:
Partition -> 'awhere ‘a is comparable.maximize (bool, optional) – Boolean indicating whether to maximize or minimize the function. Defaults to True for maximize.
step_indexer (str, optional) – Name of the updater tracking the partitions step in the chain. If not implemented on the partition the constructor creates and adds it. Defaults to “step”.
- Returns:
A SingleMetricOptimizer object
- Return type:
SingleMetricOptimizer
- property best_part: Partition
Partition object corresponding to best scoring plan observed over the current (or most recent) optimization run.
- Returns:
Partition object with the best score.
- Return type:
- property best_score: Any
Value of score metric corresponding to best scoring plan observed over the current (or most recent) optimization run.
- Returns:
Value of the best score.
- Return type:
Any
- classmethod jumpcycle_beta_function(duration_hot: int, duration_cold: int) Callable[[int], float][source]
Class method that binds and return simple hot-cold cycle beta temperature function, where the chain runs hot for some given duration and then cold for some duration, and repeats that cycle.
- Parameters:
duration_hot (int) – Number of steps to run chain hot.
duration_cold (int) – Number of steps to run chain cold.
- Returns:
Beta function defining hot-cold cycle.
- Return type:
Callable[[int], float]
- classmethod linear_jumpcycle_beta_function(duration_hot: int, duration_cooldown, duration_cold: int)[source]
Class method that binds and returns a simple linear hot-cool cycle beta temperature function, where the chain runs hot for some given duration, cools down linearly for some duration, and then runs cold for some duration before jumping back to hot and repeating.
- Parameters:
duration_hot (int) – Number of steps to run chain hot.
duration_cooldown (int) – Number of steps needed to transition from hot to cold.
duration_cold (int) – Number of steps to run chain cold.
- Returns:
Beta function defining linear hot-cool cycle.
- Return type:
Callable[[int], float]
- classmethod linearcycle_beta_function(duration_hot: int, duration_cooldown: int, duration_cold: int) Callable[[int], float][source]
Class method that binds and returns a simple linear hot-cool cycle beta temperature function, where the chain runs hot for some given duration, cools down linearly for some duration, and then runs cold for some duration before warming up again and repeating.
- Parameters:
duration_hot (int) – Number of steps to run chain hot.
duration_cooldown (int) – Number of steps needed to transition from hot to cold or vice-versa.
duration_cold (int) – Number of steps to run chain cold.
- Returns:
Beta function defining linear hot-cool cycle.
- Return type:
Callable[[int], float]
- classmethod logit_jumpcycle_beta_function(duration_hot: int, duration_cooldown: int, duration_cold: int) Callable[[int], float][source]
Class method that binds and returns a logit hot-cool cycle beta temperature function, where the chain runs hot for some given duration, cools down according to the logit function
\(f(x) = (log(x/(1-x)) + 5)/10\)
for some duration, and then runs cold for some duration before jumping back to hot and repeating.
- Parameters:
duration_hot (int) – Number of steps to run chain hot.
duration_cooldown (int) – Number of steps needed to transition from hot to cold or vice-versa.
duration_cold (int) – Number of steps to run chain cold.
- classmethod logitcycle_beta_function(duration_hot: int, duration_cooldown: int, duration_cold: int) Callable[[int], float][source]
Class method that binds and returns a logit hot-cool cycle beta temperature function, where the chain runs hot for some given duration, cools down according to the logit function
\(f(x) = (log(x/(1-x)) + 5)/10\)
for some duration, and then runs cold for some duration before warming up again using the \(1-f(x)\) and repeating.
- Parameters:
duration_hot (int) – Number of steps to run chain hot.
duration_cooldown (int) – Number of steps needed to transition from hot to cold or vice-versa.
duration_cold (int) – Number of steps to run chain cold.
- property score: Callable[[Partition], Any]
The score function which is being optimized over.
- Returns:
The score function.
- Return type:
Callable[[Partition], Any]
- short_bursts(burst_length: int, num_bursts: int, accept: ~typing.Callable[[~gerrychain.partition.partition.Partition], bool] = <function always_accept>, with_progress_bar: bool = False)[source]
Performs a short burst run using the instance’s score function. Each burst starts at the best performing plan of the previous burst. If there’s a tie, the later observed one is selected.
- Parameters:
burst_length (int) – Number of steps to run within each burst.
num_bursts (int) – Number of bursts to perform.
accept (Callable[[Partition], bool], optional) – Function accepting or rejecting the proposed state. Defaults to
always_accept().with_progress_bar (bool, optional) – Whether or not to draw tqdm progress bar. Defaults to False.
- Returns:
Partition generator.
- Return type:
Generator[Partition]
- simulated_annealing(num_steps: int, beta_function: Callable[[int], float], beta_magnitude: float = 1, with_progress_bar: bool = False)[source]
Performs simulated annealing with respect to the class instance’s score function.
- Parameters:
num_steps (int) – Number of steps to run for.
beta_function (Callable[[int], float]) – Function (f: t -> beta, where beta is in [0,1]) defining temperature over time. f(t) = 0 the chain is hot and every proposal is accepted. At f(t) = 1 the chain is cold and worse proposal have a low probability of being accepted relative to the magnitude of change in score.
beta_magnitude (float, optional) – Scaling parameter for how much to weight changes in score. Defaults to 1.
with_progress_bar (bool, optional) – Whether or not to draw tqdm progress bar. Defaults to False.
- Returns:
Partition generator.
- Return type:
Generator[Partition]
- tilted_run(num_steps: int, p: float, with_progress_bar: bool = False)[source]
Performs a tilted run. A chain where the acceptance function always accepts better plans and accepts worse plans with some probability p.
- Parameters:
num_steps (int) – Number of steps to run for.
p (float) – The probability of accepting a plan with a worse score.
with_progress_bar (bool, optional) – Whether or not to draw tqdm progress bar. Defaults to False.
- Returns:
Partition generator.
- Return type:
Generator[Partition]
- tilted_short_bursts(burst_length: int, num_bursts: int, p: float, with_progress_bar: bool = False)[source]
Performs a short burst run using the instance’s score function. Each burst starts at the best performing plan of the previous burst. If there’s a tie, the later observed one is selected. Within each burst a tilted acceptance function is used where better scoring plans are always accepted and worse scoring plans are accepted with probability p.
- Parameters:
burst_length (int) – Number of steps to run within each burst.
num_bursts (int) – Number of bursts to perform.
p (float) – The probability of accepting a plan with a worse score.
with_progress_bar (bool, optional) – Whether or not to draw tqdm progress bar. Defaults to False.
- Returns:
Partition generator.
- Return type:
Generator[Partition]
- variable_length_short_bursts(num_steps: int, stuck_buffer: int, accept: ~typing.Callable[[~gerrychain.partition.partition.Partition], bool] = <function always_accept>, with_progress_bar: bool = False)[source]
Performs a short burst where the burst length is allowed to increase as it gets harder to find high scoring plans. The initial burst length is set to 2, and it is doubled each time there is no improvement over the passed number (stuck_buffer) of runs.
- Parameters:
num_steps (int) – Number of steps to run for.
stuck_buffer (int) – How many bursts of a given length with no improvement to allow before increasing the burst length.
accept (Callable[[Partition], bool], optional) – Function accepting or rejecting the proposed state. Defaults to
always_accept().with_progress_bar (bool, optional) – Whether or not to draw tqdm progress bar. Defaults to False.
- Returns:
Partition generator.
- Return type:
Generator[Partition]
- class gerrychain.optimization.gingleator.Gingleator(proposal: Callable, constraints: Iterable[Callable] | Validator | Iterable[Bounds] | Callable, initial_state: Partition, minority_perc_col: str | None = None, threshold: float = 0.5, score_function: Callable | None = None, minority_pop_col: str | None = None, total_pop_col: str = 'TOTPOP', min_perc_column_name: str = '_gingleator_auxiliary_helper_updater_min_perc_col')[source]
Bases:
SingleMetricOptimizerGingleator is a child class of SingleMetricOptimizer which can be used to search for plans with increased numbers of Gingles’ districts.
A gingles district (named for the Supreme Court case Thornburg v. Gingles) is a district that is majority-minority. aka 50% + 1 of some population subgroup. Demonstrating additional Gingles districts is one of the litmus tests used in bringing forth a VRA case.
- Parameters:
proposal (Callable) – Function proposing the next state from the current state.
constraints (Union[Iterable[Callable], Validator, Iterable[Bounds], Callable]) – A function with signature
Partition -> booldetermining whether the proposed next state is valid (passes all binary constraints). Usually this is aValidatorclass instance.initial_state (Partition) – Initial
gerrychain.partition.Partitionclass.minority_perc_col (Optional[str]) – The name of the updater mapping of district ids to the fraction of minority population within that district. If no updater is specified, one is made according to the
min_perc_column_nameparameter. Defaults to None.threshold (float, optional) – Fraction beyond which to consider something a “Gingles” (or opportunity) district. Defaults to 0.5.
score_function (Optional[Callable]) – The function to use during optimization. Should have the signature
Partition * str (minority_perc_col) * float (threshold) -> 'a where 'a is Comparable. This class implements a few potential choices as class methods. Defaults to None.minority_pop_col (Optional[str]) – If minority_perc_col is not defined, the minority population column with which to compute percentage. Defaults to None.
total_pop_col (str, optional) – If minority_perc_col is defined, the total population column with which to compute percentage. Defaults to “TOTPOP”.
min_perc_column_name (str, optional) – If minority_perc_col is not defined, the name to give the created percentage updater. Defaults to “_gingleator_auxiliary_helper_updater_min_perc_col”.
- classmethod num_opportunity_dists(part: Partition, minority_perc_col: str, threshold: float) int[source]
Given a partition, returns the number of opportunity districts.
- Parameters:
part (Partition) – Partition to score.
minority_perc_col (str) – The name of the updater mapping of district ids to the fraction of minority population within that district.
threshold (float) – Fraction beyond which to consider something a “Gingles” (or opportunity) district.
- Returns:
Number of opportunity districts.
- Return type:
int
- classmethod penalize_avg_over(part: Partition, minority_perc_col: str, threshold: float)[source]
Given a partition, returns the number of opportunity districts + (1 - the average excess) scaled to between 0 and 1.
- Parameters:
part (Partition) – Partition to score.
minority_perc_col (str) – The name of the updater mapping of district ids to the fraction of minority population within that district.
threshold (float) – Fraction beyond which to consider something a “Gingles” (or opportunity) district.
- Returns:
Number of opportunity districts + (1 - the average excess)
- Return type:
float
- classmethod penalize_maximum_over(part: Partition, minority_perc_col: str, threshold: float)[source]
Given a partition, returns the number of opportunity districts + (1 - the maximum excess) scaled to between 0 and 1.
- Parameters:
part (Partition) – Partition to score.
minority_perc_col (str) – The name of the updater mapping of district ids to the fraction of minority population within that district.
threshold (float) – Fraction beyond which to consider something a “Gingles” (or opportunity) district.
- Returns:
Number of opportunity districts + (1 - the maximum excess) / (1 - threshold)
- Return type:
float
- classmethod reward_next_highest_close(part: Partition, minority_perc_col: str, threshold: float)[source]
Given a partition, returns the number of opportunity districts, if no additional district is within 10% of reaching the threshold. If one is, the distance that district is from the threshold is scaled between 0 and 1 and added to the count of opportunity districts.
- Parameters:
part (Partition) – Partition to score.
minority_perc_col (str) – The name of the updater mapping of district ids to the fraction of minority population within that district.
threshold (float) – Fraction beyond which to consider something a “Gingles” (or opportunity) district.
- Returns:
Number of opportunity districts + (next highest district - (threshold - 0.1)) * 10
- Return type:
float
- classmethod reward_partial_dist(part: Partition, minority_perc_col: str, threshold: float) float[source]
Given a partition, returns the number of opportunity districts + the percentage of the next highest district.
- Parameters:
part (Partition) – Partition to score.
minority_perc_col (str) – The name of the updater mapping of district ids to the fraction of minority population within that district.
threshold (float) – Fraction beyond which to consider something a “Gingles” (or opportunity) district.
- Returns:
Number of opportunity districts + the percentage of the next highest district.
- Return type:
float
Partition
- class gerrychain.partition.assignment.Assignment(parts: Dict, mapping: Dict | None = None, validate: bool = True)[source]
Bases:
MappingAn assignment of nodes into parts.
The goal of
Assignmentis to provide an interface that mirrors a dictionary (what we have been using for assigning nodes to districts) while making it convenient/cheap to access the set of nodes in each part.An
Assignmenthas apartsproperty that is a dictionary of the form{part: <frozenset of nodes in part>}.- Parameters:
parts (Dict) – Dictionary mapping partition assignments frozensets of nodes.
mapping (Optional[Dict], optional) – Dictionary mapping nodes to partition assignments. Default is None.
validate (bool, optional) – Whether to validate the assignment. Default is True.
- Returns:
None
- Raises:
ValueError – if the keys of
partsare not uniqueTypeError – if the values of
partsare not frozensets
- copy()[source]
Returns a copy of the assignment. Does not duplicate the frozensets of nodes, just the parts dictionary.
- classmethod from_dict(assignment: Dict) Assignment[source]
Create an
Assignmentfrom a dictionary. This is probably the method you want to use to create a new assignment.This also works for
pandas.Series.- Parameters:
assignment (Dict) – dictionary mapping nodes to partition assignments
- Returns:
A new instance of
Assignmentwith the same assignments as the passed-in dictionary.- Return type:
Assignment
- items()[source]
Iterate over
(node, part)tuples, wherenodeis assigned topart.
- keys()[source]
- mapping
- new_assignment_convert_old_node_ids_to_new_node_ids(node_id_mapping: Dict) Assignment[source]
Create a new Assignment object from the one passed in, where the node_ids are changed according to the node_id_mapping from old node_ids to new node_ids.
This routine was motivated by the fact that node_ids are changed when converting from an NetworkX based graph to a RustworkX based graph. An Assignment based on the node_ids in the NetworkX based graph would need to be changed to use the new node_ids - the new Asignment would be semantically equivalent - just converted to use the new node_ids in the RX based graph.
The node_id_mapping is of the form {old_node_id: new_node_id}
- parts
- to_dict() Dict[source]
- Returns:
The assignment as a
{node: part}dictionary.- Return type:
Dict
- to_series() Series[source]
- Returns:
The assignment as a
pandas.Series.- Return type:
pandas.Series
- update_flows(flows)[source]
Update the assignment for some nodes using the given flows.
- update_parts(new_parts: Dict) None[source]
Update some parts of the assignment. Does not check that every node is still assigned to a part.
- Parameters:
new_parts (Dict) – dictionary mapping (some) parts to their new sets or frozensets of nodes
- Returns:
None
- values()[source]
- gerrychain.partition.assignment.get_assignment(part_assignment: str | Dict | Assignment, graph: Graph | None = None) Assignment[source]
Either extracts an
Assignmentobject from the input graph using the provided key or attempts to convert part_assignment into anAssignmentobject.- Parameters:
part_assignment (str) – A node attribute key, dictionary, or
Assignmentobject corresponding to the desired assignment.graph (Optional[Graph], optional) – The graph from which to extract the assignment. Default is None.
- Returns:
An
Assignmentobject containing the assignment corresponding to the part_assignment input- Return type:
Assignment
- Raises:
TypeError – If the part_assignment is a string and the graph is not provided.
TypeError – If the part_assignment is not a string or dictionary.
- gerrychain.partition.assignment.level_sets(mapping: ~typing.Dict, container: ~typing.Type[~typing.Set] = <class 'set'>) DefaultDict[source]
Inverts a dictionary.
{key: value}becomes{value: <container of keys that map to value>}.- Parameters:
mapping (Dict) – A dictionary to invert. Keys and values can be of any type.
container (Type[Set], optional) – A container type used to collect keys that map to the same value. By default, the container type is
set.
- Returns:
A dictionary where each key is a value from the original dictionary, and the corresponding value is a container (by default, a set) of keys from the original dictionary that mapped to this value.
- Return type:
DefaultDict
Example usage:
.. code_block:: python
>>> level_sets({'a': 1, 'b': 1, 'c': 2}) defaultdict(<class 'set'>, {1: {'a', 'b'}, 2: {'c'}})
- class gerrychain.partition.geographic.GeographicPartition(graph=None, assignment=None, updaters=None, parent=None, flips=None, use_default_updaters=True)[source]
Bases:
PartitionA
Partitionwith areas, perimeters, and boundary information included. These additional data allow you to compute compactness scores like Polsby-Popper.- Parameters:
graph – Underlying graph.
assignment – Dictionary assigning nodes to districts.
updaters – Dictionary of functions to track data about the partition. The keys are stored as attributes on the partition class, which the functions compute.
use_default_updaters – If False, do not include default updaters.
- assignment
- default_updaters = {'area': <gerrychain.updaters.tally.Tally object>, 'boundary_nodes': <function boundary_nodes>, 'cut_edges': <function cut_edges>, 'cut_edges_by_part': <function cut_edges_by_part>, 'exterior_boundaries': <function exterior_boundaries>, 'interior_boundaries': <function interior_boundaries>, 'perimeter': <function perimeter>}
- edge_flows
- flips
- flows
- graph
- parent
- subgraphs
- updaters
- class gerrychain.partition.partition.Partition(graph=None, assignment=None, updaters=None, parent=None, flips=None, use_default_updaters=True)[source]
Bases:
objectPartition represents a partition of the nodes of the graph. It will perform the first layer of computations at each step in the Markov chain - basic aggregations and calculations that we want to optimize.
- Variables:
graph – The underlying graph.
assignment – Maps node IDs to district IDs.
parts – Maps district IDs to the set of nodes in that district.
subgraphs – Maps district IDs to the induced subgraph of that district.
- Parameters:
graph (
Graph) – Underlying graph.assignment (
Assignment) – Dictionary assigning nodes to districts.updaters – Dictionary of functions to track data about the partition. The keys are stored as attributes on the partition class, which the functions compute.
use_default_updaters – If False, do not include default updaters.
- assignment
- crosses_parts(edge: Tuple) bool[source]
- Parameters:
edge (Tuple) – tuple of node IDs
- Returns:
True if the edge crosses from one part of the partition to another
- Return type:
bool
- default_updaters = {'cut_edges': <function cut_edges>}
- edge_flows
- flip(flips: Dict, use_original_nx_node_ids=False) Partition[source]
Returns the new partition obtained by performing the given flips on this partition.
- flips
- flows
- classmethod from_districtr_file(graph: Graph, districtr_file: str, updaters: Dict[str, Callable] | None = None) Partition[source]
Create a Partition from a districting plan created with Districtr, a free and open-source web app created by MGGG for drawing districts.
The provided
graphshould be created from the same shapefile as the Districtr module used to draw the districting plan. These shapefiles may be found in a repository in the mggg-states GitHub organization, or by request from MGGG.- Parameters:
graph (
Graph) – The graph to create the Partition fromdistrictr_file (str) – the path to the
.jsonfile exported from Districtrupdaters (Optional[Dict[str, Callable]], optional) – dictionary of updaters
- Returns:
The partition created from the Districtr file
- Return type:
- classmethod from_random_assignment(graph: ~gerrychain.graph.graph.Graph, n_parts: int, epsilon: float, pop_col: str, updaters: ~typing.Dict[str, ~typing.Callable] | None = None, use_default_updaters: bool = True, method: ~typing.Callable = <function recursive_tree_part>) Partition[source]
Create a Partition with a random assignment of nodes to districts.
- Parameters:
graph (
Graph) – The graph to create the Partition from.n_parts (int) – The number of districts to divide the nodes into.
epsilon (float population. Should be in [0,1].) – The maximum relative population deviation from the ideal
pop_col (str) – The column of the graph’s node data that holds the population data.
updaters (Optional[Dict[str, Callable]], optional) – Dictionary of updaters
use_default_updaters (bool, optional) – If False, do not include default updaters.
method (Callable, optional) – The function to use to partition the graph into
n_parts. Defaults torecursive_tree_part().
- Returns:
The partition created with a random assignment
- Return type:
- graph
- keys()[source]
- parent
- property parts
- plot(geometries=None, **kwargs)[source]
Plot the partition, using the provided geometries.
- Parameters:
geometries (geopandas.GeoDataFrame or geopandas.GeoSeries) – A
geopandas.GeoDataFrameorgeopandas.GeoSeriesholding the geometries to use for plotting. ItsIndexshould match the node labels of the partition’s underlyingGraph.**kwargs – Additional arguments to pass to
geopandas.GeoDataFrame.plot()to adjust the plot.
- Returns:
The matplotlib axes object. Which plots the Partition.
- Return type:
matplotlib.axes.Axes
- subgraphs
- updaters
- class gerrychain.partition.subgraphs.SubgraphView(graph: Graph, parts: List[List[Any]])[source]
Bases:
objectA view for accessing subgraphs of
Graphobjects.This class makes use of a subgraph cache to avoid recomputing subgraphs which can speed up computations when working with district assignments within a partition class.
- Variables:
graph – The parent graph from which subgraphs are derived.
parts – A list-of-lists dictionary (so a dict with key values indicated by the list index) mapping keys to subsets of nodes in the graph.
subgraphs_cache – Cache to store subgraph views for quick access.
- Parameters:
graph (Graph) – The parent graph from which subgraphs are derived.
parts (List[List[Any]]) – A list of lists of nodes corresponding the different parts of the partition of the graph.
- Returns:
None
- graph
- parts
- subgraphs_cache
Proposals
- gerrychain.proposals.proposals.flip(partition: Partition) Partition
Proposes a random boundary flip from the partition.
- gerrychain.proposals.proposals.propose_any_node_flip(partition: Partition) Partition[source]
Flip a random node (not necessarily on the boundary) to a random part
- gerrychain.proposals.proposals.propose_chunk_flip(partition: Partition) Partition[source]
Chooses a random boundary node and proposes to flip it and all of its neighbors
- gerrychain.proposals.proposals.propose_flip_every_district(partition: Partition) Partition[source]
Proposes a random boundary flip for each district in the partition.
- gerrychain.proposals.proposals.propose_random_flip(partition: Partition) Partition[source]
Proposes a random boundary flip from the partition.
- gerrychain.proposals.proposals.slow_reversible_propose(partition: Partition) Partition[source]
Proposes a random boundary flip from the partition in a reversible fasion by selecting uniformly from the (node, flip) pairs.
Temporary version until we make an updater for this set.
- gerrychain.proposals.proposals.slow_reversible_propose_bi(partition: Partition) Partition[source]
Proposes a random boundary flip from the partition in a reversible fashion for bipartitions by selecting a boundary node at random and uniformly picking one of its neighboring parts. For k-partitions this is not uniform since there might be multiple parts next to a single node.
Temporary version until we make an updater for this set.
- gerrychain.proposals.spectral_proposals.spectral_cut(subgraph: Graph, part_labels: Dict, weight_type: str, lap_type: str) Dict[source]
Spectral cut function.
Uses the signs of the elements in the Fiedler vector of a subgraph to partition into two components.
- Parameters:
subgraph (Graph) – The subgraph to be partitioned.
part_labels (Dict) – The current partition of the subgraph.
weight_type (str) – The type of weight to be used in the Laplacian.
lap_type (str) – The type of Laplacian to be used.
- Returns:
A dictionary assigning nodes of the subgraph to their new districts.
- Return type:
Dict
- gerrychain.proposals.spectral_proposals.spectral_recom(partition: Partition, weight_type: str | None = None, lap_type: str = 'normalized') Partition[source]
Spectral ReCom proposal.
Uses spectral clustering to bipartition a subgraph of the original graph formed by merging the nodes corresponding to two adjacent districts.
Example usage:
from functools import partial from gerrychain import MarkovChain from gerrychain.proposals import recom # ...define constraints, accept, partition, total_steps here... proposal = partial( spectral_recom, weight_type=None, lap_type="normalized" ) chain = MarkovChain(proposal, constraints, accept, partition, total_steps)
- Parameters:
partition (Partition) – The initial partition.
weight_type (Optional[str], optional) – The type of weight to be used in the Laplacian. Default is None.
lap_type (str, optional) – The type of Laplacian to be used. Default is “normalized”.
- Returns:
The new partition resulting from the spectral ReCom algorithm.
- Return type:
- exception gerrychain.proposals.tree_proposals.MetagraphError[source]
Bases:
ExceptionRaised when the partition we are trying to split is a low degree node in the metagraph.
- class gerrychain.proposals.tree_proposals.ReCom(pop_col: str, ideal_pop: int | float, epsilon: float, method: ~typing.Callable = <function bipartition_tree_random>)[source]
Bases:
objectReCom (short for ReCombination) is a class that represents a ReCom proposal for redistricting. It is used to create new partitions by recombining existing districts while maintaining population balance.
- Parameters:
pop_col (str) – The name of the column in the partition that contains the population data.
ideal_pop (Union[int,float]) – The ideal population for each district.
epsilon (float) – The epsilon value for population deviation as a percentage of the target population.
method (function, optional) – The method used for bipartitioning the tree. Defaults to bipartition_tree_random.
- exception gerrychain.proposals.tree_proposals.ReversibilityError(msg)[source]
Bases:
ExceptionRaised when the cut edge upper bound is violated.
- exception gerrychain.proposals.tree_proposals.ValueWarning[source]
Bases:
UserWarningRaised whe a particular value is technically valid, but may cause issues with the algorithm.
- gerrychain.proposals.tree_proposals.recom(partition: ~gerrychain.partition.partition.Partition, pop_col: str, pop_target: int | float, epsilon: float, node_repeats: int = 1, region_surcharge: ~typing.Dict | None = None, method: ~typing.Callable = <function bipartition_tree>) Partition[source]
ReCom (short for ReCombination) is a Markov Chain Monte Carlo (MCMC) algorithm used for redistricting. At each step of the algorithm, a pair of adjacent districts is selected at random and merged into a single district. The region is then split into two new districts by generating a spanning tree using the Kruskal/Karger algorithm and cutting an edge at random. The edge is checked to ensure that it separates the region into two new districts that are population balanced, and, if not, a new edge is selected at random and the process is repeated.
Example usage:
from functools import partial from gerrychain import MarkovChain from gerrychain.proposals import recom # ...define constraints, accept, partition, total_steps here... # Ideal population: pop_target = sum(partition["population"].values()) / len(partition) proposal = partial( recom, pop_col="POP10", pop_target=pop_target, epsilon=.05, node_repeats=10 ) chain = MarkovChain(proposal, constraints, accept, partition, total_steps)
- Parameters:
partition (Partition) – The initial partition.
pop_col (str) – The name of the population column.
pop_target (Union[int,float]) – The target population for each district.
epsilon (float) – The epsilon value for population deviation as a percentage of the target population.
node_repeats (int, optional) – The number of times to repeat the bipartitioning step. Default is 1.
region_surcharge (Optional[Dict], optional) – The surcharge dictionary for the graph used for region-aware partitioning of the grid. Default is None.
method (Callable, optional) – The method used for bipartitioning the tree. Default is
bipartition_tree().
- Returns:
The new partition resulting from the ReCom algorithm. print(“bipartition_tree: updating restarts and attempts”)
- Return type:
- gerrychain.proposals.tree_proposals.reversible_recom(partition: ~gerrychain.partition.partition.Partition, pop_col: str, pop_target: int | float, epsilon: float, balance_edge_fn: ~typing.Callable = <function find_balanced_edge_cuts_memoization>, M: int = 1, repeat_until_valid: bool = False, choice: ~typing.Callable = <bound method Random.choice of <random.Random object>>) Partition[source]
Reversible ReCom algorithm for redistricting.
This function performs the reversible ReCom algorithm, which is a Markov Chain Monte Carlo (MCMC) algorithm used for redistricting. For more information, see the paper “Spanning Tree Methods for Sampling Graph Partitions” by Cannon, et al. (2022) at https://arxiv.org/abs/2210.01401
- Parameters:
partition (Partition) – The initial partition.
pop_col (str) – The name of the population column.
pop_target (Union[int,float]) – The target population for each district.
epsilon (float) – The epsilon value for population deviation as a percentage of the target population.
balance_edge_fn (Callable, optional frm: it returns a list of Cuts - a named tuple defined in tree.py) – The balance edge function. Default is find_balanced_edge_cuts_memoization.
M (int, optional) – The maximum number of balance edges. Default is 1.
repeat_until_valid (bool, optional) – Flag indicating whether to repeat until a valid partition is found. Default is False.
choice (Callable, optional) – The choice function for selecting a random element. Default is random.choice.
- Returns:
The new partition resulting from the reversible ReCom algorithm.
- Return type:
Tree
This module provides tools and algorithms for manipulating and analyzing graphs, particularly focused on partitioning graphs based on population data. It leverages the NetworkX library to handle graph structures and implements various algorithms for graph partitioning and tree traversal.
Key functionalities include:
Predecessor and successor functions for graph traversal using breadth-first search.
Implementation of random and uniform spanning trees for graph partitioning.
The PopulatedGraph class, which represents a graph with additional population data, and methods for assessing and modifying this data.
Functions for finding balanced edge cuts in a populated graph, either through contraction or memoization techniques.
A suite of functions (bipartition_tree, recursive_tree_part, _get_seed_chunks, etc.) for partitioning graphs into balanced subsets based on population targets and tolerances.
Utility functions like get_max_prime_factor_less_than and _recursive_seed_part_inner to assist in complex partitioning tasks.
Dependencies:
networkx: Used for graph data structure and algorithms.
random: Provides random number generation for probabilistic approaches.
typing: Used for type hints.
Last Updated: 25 April 2024
- frm: This file, tree.py, needed to be modified to operate on new Graph
objects instead of NetworkX Graph objects because the routines are used by the Graph objects inside a Partion, which will soon be based on RustworkX. More specifically, these routines are used by Proposals, and we will soon switch to having the underlying Graph object used in Partitions and Proposals be based on RustworkX.
It may be the case that they are ONLY ever used by Proposals and hence could just have been rewritten to operate on RustworkX Graph objects, but there seemed to be no harm in having them work either way. It was also a good proving ground for testing whether the new Graph object could behave like a NetworkX Graph object (in terms of attribute access and syntax).
frm: RX Documentation
Many of the functions in this file operate on subgraphs which are different from NX subgraphs because the node_ids change in the subgraph. To deal with this, in graph.py we have a _node_id_to_parent_node_id_map data member for Graph objects which maps the node_ids in a subgraph to the corresponding node_id in its parent graph. This will allow routines operating on subgraphs to return results using the node_ids of the parent graph.
Note that for top-level graphs, we still define this _node_id_to_parent_node_id_map, but in this case it is an identity map that just maps each node_id to itself. This allows code to always translate correctly, even if operating on a top-level graph.
As a matter of coding convention, all calls to graph.subgraph() have been placed in the actual parameter list of function calls. This limits the scope of the subgraph node_ids to the called function - eliminating the risk of those node_ids leaking into surrounding code. Stated differently, this eliminates the cognitive load of trying to remember whether a node_id is a parent or a subgraph node_id.
- exception gerrychain.tree.BalanceError[source]
Bases:
ExceptionRaised when a balanced cut cannot be found.
- exception gerrychain.tree.BipartitionWarning[source]
Bases:
UserWarningGenerally raised when it is proving difficult to find a balanced cut.
- class gerrychain.tree.Cut(edge=None, weight=None, subset=None)
Bases:
tupleRepresents a cut in a graph.
Create new instance of Cut(edge, weight, subset)
- edge
The edge where the cut is made. Defaults to None.
- subset
The (frozen) subset of nodes on one side of the cut. Defaults to None.
- weight
The weight assigned to the edge (if any). Defaults to None.
- class gerrychain.tree.PopulatedGraph(graph: Graph, populations: Dict, ideal_pop: float | int, epsilon: float)[source]
Bases:
objectA class representing a graph with population information.
- Variables:
graph – The underlying graph structure.
subsets – A dictionary mapping nodes to their subsets.
population – A dictionary mapping nodes to their populations.
tot_pop – The total population of the graph.
ideal_pop – The ideal population for each district.
epsilon – The tolerance for population deviation from the ideal population within each district.
- Parameters:
graph (Graph) – The underlying graph structure.
populations (Dict) – A dictionary mapping nodes to their populations.
ideal_pop (Union[float, int]) – The ideal population for each district.
epsilon (float) – The tolerance for population deviation as a percentage of the ideal population within each district.
- contract_node(node, parent) None[source]
- degree(node) int[source]
- has_ideal_population(node, one_sided_cut: bool = False) bool[source]
Checks if a node has an ideal population within the graph up to epsilon.
- Parameters:
node (Any) – The node to check.
one_sided_cut (bool, optional) – Whether or not we are cutting off a single district. When set to False, we check if the node we are cutting and the remaining graph are both within epsilon of the ideal population. When set to True, we only check if the node we are cutting is within epsilon of the ideal population. Defaults to False.
- Returns:
True if the node has an ideal population within the graph up to epsilon.
- Return type:
bool
- exception gerrychain.tree.PopulationBalanceError[source]
Bases:
ExceptionRaised when the population of a district is outside the acceptable epsilon range.
- exception gerrychain.tree.ReselectException[source]
Bases:
ExceptionRaised when the tree-splitting algorithm is unable to find a balanced cut after some maximum number of attempts, but the user has allowed the algorithm to reselect the pair of districts from parent graph to try and recombine.
- gerrychain.tree.bipartition_tree(subgraph_to_split: ~gerrychain.graph.graph.Graph, pop_col: str, pop_target: int | float, epsilon: float, node_repeats: int = 1, spanning_tree: ~gerrychain.graph.graph.Graph | None = None, spanning_tree_fn: ~typing.Callable = <function random_spanning_tree>, region_surcharge: ~typing.Dict | None = None, balance_edge_fn: ~typing.Callable = <function find_balanced_edge_cuts_memoization>, one_sided_cut: bool = False, choice: ~typing.Callable = <bound method Random.choice of <random.Random object>>, max_attempts: int | None = 100000, warn_attempts: int = 1000, allow_pair_reselection: bool = False, cut_choice: ~typing.Callable = <function _region_preferred_max_weight_choice>) Set[source]
This function finds a balanced 2 partition of a graph by drawing a spanning tree and finding an edge to cut that leaves at most an epsilon imbalance between the populations of the parts. If a root fails, new roots are tried until node_repeats in which case a new tree is drawn.
Builds up a connected subgraph with a connected complement whose population is
epsilon * pop_targetaway frompop_target.- Parameters:
graph (Graph) – The graph to partition.
pop_col (str) – The node attribute holding the population of each node.
pop_target (Union[int, float]) – The target population for the returned subset of nodes.
epsilon (float) – The allowable deviation from
pop_target(as a percentage ofpop_target) for the subgraph’s population.node_repeats (int, optional) – A parameter for the algorithm: how many different choices of root to use before drawing a new spanning tree. Defaults to 1.
spanning_tree (Optional[Graph], optional) – The spanning tree for the algorithm to use (used when the algorithm chooses a new root and for testing).
spanning_tree_fn (Callable, optional) – The random spanning tree algorithm to use if a spanning tree is not provided. Defaults to
random_spanning_tree().region_surcharge (Optional[Dict], optional) – A dictionary of surcharges for the spanning tree algorithm. Defaults to None.
balance_edge_fn (Callable, optional) – The function to find balanced edge cuts. Defaults to
find_balanced_edge_cuts_memoization().one_sided_cut (bool, optional) – Passed to the
balance_edge_fn. Determines whether or not we are cutting off a single district when partitioning the tree. When set to False, we check if the node we are cutting and the remaining graph are both within epsilon of the ideal population. When set to True, we only check if the node we are cutting is within epsilon of the ideal population. Defaults to False.choice (Callable, optional) – The function to make a random choice of root node for the population tree. Passed to
balance_edge_fn. Can be substituted for testing. Defaults torandom.random().max_attempts (Optional[int], optional) – The maximum number of attempts that should be made to bipartition. Defaults to 10000.
warn_attempts (int, optional) – The number of attempts after which a warning is issued if a balanced cut cannot be found. Defaults to 1000.
allow_pair_reselection (bool, optional) – Whether we would like to return an error to the calling function to ask it to reselect the pair of nodes to try and recombine. Defaults to False.
cut_choice (Callable, optional) – The function used to select the cut edge from the list of possible balanced cuts. Defaults to
_region_preferred_max_weight_choice().
- Returns:
A subset of nodes of
graph(whose induced subgraph is connected). The other part of the partition is the complement of this subset.- Return type:
Set
- Raises:
BipartitionWarning – If a possible cut cannot be found after 1000 attempts.
RuntimeError – If a possible cut cannot be found after the maximum number of attempts given by
max_attempts.
- gerrychain.tree.bipartition_tree_random(subgraph_to_split: ~gerrychain.graph.graph.Graph, pop_col: str, pop_target: int | float, epsilon: float, node_repeats: int = 1, repeat_until_valid: bool = True, spanning_tree: ~gerrychain.graph.graph.Graph | None = None, spanning_tree_fn: ~typing.Callable = <function random_spanning_tree>, balance_edge_fn: ~typing.Callable = <function find_balanced_edge_cuts_memoization>, one_sided_cut: bool = False, choice: ~typing.Callable = <bound method Random.choice of <random.Random object>>, max_attempts: int | None = 100000) Set[Any] | None[source]
This is like
bipartition_tree()except it chooses a random balanced cut, rather than the first cut it finds.This function finds a balanced 2 partition of a graph by drawing a spanning tree and finding an edge to cut that leaves at most an epsilon imbalance between the populations of the parts. If a root fails, new roots are tried until node_repeats in which case a new tree is drawn.
Builds up a connected subgraph with a connected complement whose population is
epsilon * pop_targetaway frompop_target.- Parameters:
graph (Graph) – The graph to partition.
pop_col (str) – The node attribute holding the population of each node.
pop_target (Union[int, float]) – The target population for the returned subset of nodes.
epsilon (float) – The allowable deviation from
pop_target(as a percentage ofpop_target) for the subgraph’s population.node_repeats (int) – A parameter for the algorithm: how many different choices of root to use before drawing a new spanning tree. Defaults to 1.
repeat_until_valid (bool, optional) – Determines whether to keep drawing spanning trees until a tree with a balanced cut is found. If True, a set of nodes will always be returned; if False, None will be returned if a valid spanning tree is not found on the first try. Defaults to True.
spanning_tree (Optional[Graph], optional) – The spanning tree for the algorithm to use (used when the algorithm chooses a new root and for testing). Defaults to None.
spanning_tree_fn (Callable, optional) – The random spanning tree algorithm to use if a spanning tree is not provided. Defaults to
random_spanning_tree().balance_edge_fn (Callable, optional) – The algorithm used to find balanced cut edges. Defaults to
find_balanced_edge_cuts_memoization().one_sided_cut (bool, optional) – Passed to the
balance_edge_fn. Determines whether or not we are cutting off a single district when partitioning the tree. When set to False, we check if the node we are cutting and the remaining graph are both within epsilon of the ideal population. When set to True, we only check if the node we are cutting is within epsilon of the ideal population. Defaults to False.choice (Callable, optional) – The random choice function. Can be substituted for testing. Defaults to
random.choice().max_attempts (Optional[int], optional) – The max number of attempts that should be made to bipartition. Defaults to None.
- Returns:
A subset of nodes of
graph(whose induced subgraph is connected) or None if a valid spanning tree is not found.- Return type:
Union[Set[Any], None]
- gerrychain.tree.bipartition_tree_random_with_num_cuts(graph: ~gerrychain.graph.graph.Graph, pop_col: str, pop_target: int | float, epsilon: float, node_repeats: int = 1, repeat_until_valid: bool = True, spanning_tree: ~gerrychain.graph.graph.Graph | None = None, spanning_tree_fn: ~typing.Callable = <function random_spanning_tree>, balance_edge_fn: ~typing.Callable = <function find_balanced_edge_cuts_memoization>, one_sided_cut: bool = False, choice: ~typing.Callable = <bound method Random.choice of <random.Random object>>, max_attempts: int | None = 100000) Set[Any] | None[source]
This is like
bipartition_tree()except it chooses a random balanced cut, rather than the first cut it finds.This function finds a balanced 2 partition of a graph by drawing a spanning tree and finding an edge to cut that leaves at most an epsilon imbalance between the populations of the parts. If a root fails, new roots are tried until node_repeats in which case a new tree is drawn.
Builds up a connected subgraph with a connected complement whose population is
epsilon * pop_targetaway frompop_target.- Parameters:
graph (Graph) – The graph to partition.
pop_col (str) – The node attribute holding the population of each node.
pop_target (Union[int, float]) – The target population for the returned subset of nodes.
epsilon (float) – The allowable deviation from
pop_target(as a percentage ofpop_target) for the subgraph’s population.node_repeats (int) – A parameter for the algorithm: how many different choices of root to use before drawing a new spanning tree. Defaults to 1.
repeat_until_valid (bool, optional) – Determines whether to keep drawing spanning trees until a tree with a balanced cut is found. If True, a set of nodes will always be returned; if False, None will be returned if a valid spanning tree is not found on the first try. Defaults to True.
spanning_tree (Optional[Graph], optional) – The spanning tree for the algorithm to use (used when the algorithm chooses a new root and for testing). Defaults to None.
spanning_tree_fn (Callable, optional) – The random spanning tree algorithm to use if a spanning tree is not provided. Defaults to
random_spanning_tree().balance_edge_fn (Callable, optional) – The algorithm used to find balanced cut edges. Defaults to
find_balanced_edge_cuts_memoization().one_sided_cut (bool, optional) – Passed to the
balance_edge_fn. Determines whether or not we are cutting off a single district when partitioning the tree. When set to False, we check if the node we are cutting and the remaining graph are both within epsilon of the ideal population. When set to True, we only check if the node we are cutting is within epsilon of the ideal population. Defaults to False.choice (Callable, optional) – The random choice function. Can be substituted for testing. Defaults to
random.choice().max_attempts (Optional[int], optional) – The max number of attempts that should be made to bipartition. Defaults to None.
- Returns:
A subset of nodes of
graph(whose induced subgraph is connected) or None if a valid spanning tree is not found.- Return type:
Union[Set[Any], None]
- gerrychain.tree.epsilon_tree_bipartition(subgraph_to_split: ~gerrychain.graph.graph.Graph, parts: ~typing.Sequence, pop_target: float | int, pop_col: str, epsilon: float, node_repeats: int = 1, method: ~typing.Callable = functools.partial(<function bipartition_tree>, max_attempts=10000)) Dict[source]
Uses
bipartition_tree()to partition a tree into two parts of populationpop_target(withinepsilon).- param graph:
The graph to partition into two :math:`
- arepsilon`-balanced parts.
- type graph:
Graph
- param parts:
Iterable of part (district) labels (like
[0,1,2]orrange(4)).- type parts:
Sequence
- param pop_target:
Target population for each part of the partition.
- type pop_target:
Union[float, int]
- param pop_col:
Node attribute key holding population data.
- type pop_col:
str
- param epsilon:
How far (as a percentage of
pop_target) frompop_targetthe parts of the partition can be.- type epsilon:
float
- param node_repeats:
Parameter for
bipartition_tree()to use. Defaults to 1.- type node_repeats:
int, optional
- param method:
The partition method to use. Defaults to partial(bipartition_tree, max_attempts=10000).
- type method:
Callable, optional
- returns:
New assignments for the nodes of
graph.- rtype:
dict
- gerrychain.tree.find_balanced_edge_cuts_contraction(h: ~gerrychain.tree.PopulatedGraph, one_sided_cut: bool = False, choice: ~typing.Callable = <bound method Random.choice of <random.Random object>>) List[Cut][source]
Find balanced edge cuts using contraction.
- Parameters:
h (PopulatedGraph) – The populated graph.
one_sided_cut (bool, optional) – Whether or not we are cutting off a single district. When set to False, we check if the node we are cutting and the remaining graph are both within epsilon of the ideal population. When set to True, we only check if the node we are cutting is within epsilon of the ideal population. Defaults to False.
choice (Callable, optional) – The function used to make random choices.
- Returns:
A list of balanced edge cuts.
- Return type:
List[Cut]
- gerrychain.tree.find_balanced_edge_cuts_memoization(h: ~gerrychain.tree.PopulatedGraph, one_sided_cut: bool = False, choice: ~typing.Callable = <bound method Random.choice of <random.Random object>>) List[Cut][source]
Find balanced edge cuts using memoization.
This function takes a PopulatedGraph object and a choice function as input and returns a list of balanced edge cuts. A balanced edge cut is defined as a cut that divides the graph into two subsets, such that the population of each subset is close to the ideal population defined by the PopulatedGraph object.
- Parameters:
h (PopulatedGraph) – The PopulatedGraph object representing the graph.
one_sided_cut (bool, optional) – Whether or not we are cutting off a single district. When set to False, we check if the node we are cutting and the remaining graph are both within epsilon of the ideal population. When set to True, we only check if the node we are cutting is within epsilon of the ideal population. Defaults to False.
choice (Callable, optional) – The choice function used to select the root node.
- Returns:
A list of balanced edge cuts.
- Return type:
List[Cut]
- gerrychain.tree.get_max_prime_factor_less_than(n: int, ceil: int) int | None[source]
Helper function for _recursive_seed_part_inner. Returns the largest prime factor of
nless thanceil, or None if all are greater than ceil.- Parameters:
n (int) – The number to find the largest prime factor for.
ceil (int) – The upper limit for the largest prime factor.
- Returns:
The largest prime factor of
nless thanceil, or None if all are greater than ceil.- Return type:
Optional[int]
- gerrychain.tree.random_spanning_tree(graph: Graph, region_surcharge: Dict | None = None) Graph[source]
Builds a spanning tree chosen by Kruskal’s method using random weights.
- gerrychain.tree.recursive_seed_part(graph: ~gerrychain.graph.graph.Graph, parts: ~typing.Sequence, pop_target: float | int, pop_col: str, epsilon: float, method: ~typing.Callable = functools.partial(<function bipartition_tree>, max_attempts=10000), node_repeats: int = 1, n: int | None = None, ceil: int | None = None) Dict[source]
Returns a partition with
num_distsdistricts balanced withinepsilonofpop_targetby recursively splitting graph using _recursive_seed_part_inner.- Parameters:
graph (Graph) – The graph
parts (Sequence) – Iterable of part labels (like
[0,1,2]orrange(4)pop_target (Union[float, int]) – Target population for each part of the partition
pop_col (str) – Node attribute key holding population data
epsilon (float) – How far (as a percentage of
pop_target) frompop_targetthe parts of the partition can bemethod (Callable, optional) – Function used to find balanced partitions at the 2-district level Defaults to
bipartition_tree()node_repeats (int, optional) – Parameter for
bipartition_tree()to use. Defaults to 1.n (Optional[int], optional) – Either a positive integer (greater than 1) or None. If n is a positive integer, this function will recursively create a seed plan by either biting off districts from graph or dividing graph into n chunks and recursing into each of these. If n is None, this function prime factors ``num_dists``=n_1*n_2*…*n_k (n_1 > n_2 > … n_k) and recursively partitions graph into n_1 chunks. Defaults to None.
ceil (Optional[int], optional) – Either a positive integer (at least 2) or None. Relevant only if n is None. If
ceilis a positive integer then finds the largest factor ofnum_distsless than or equal toceil, and recursively splits graph into that number of chunks, or bites off a district if that number is 1. Defaults to None.
- Returns:
New assignments for the nodes of
graph.- Return type:
dict
- gerrychain.tree.recursive_tree_part(graph: ~gerrychain.graph.graph.Graph, parts: ~typing.Sequence, pop_target: float | int, pop_col: str, epsilon: float, node_repeats: int = 1, method: ~typing.Callable = functools.partial(<function bipartition_tree>, max_attempts=10000)) Dict[source]
Uses
bipartition_tree()recursively to partition a tree intolen(parts)parts of populationpop_target(withinepsilon). Can be used to generate initial seed plans or to implement ReCom-like “merge walk” proposals.- param graph:
The graph to partition into
len(parts):math:`
- arepsilon`-balanced parts.
- type graph:
Graph
- param parts:
Iterable of part (district) labels (like
[0,1,2]orrange(4)).- type parts:
Sequence
- param pop_target:
Target population for each part of the partition.
- type pop_target:
Union[float, int]
- param pop_col:
Node attribute key holding population data.
- type pop_col:
str
- param epsilon:
How far (as a percentage of
pop_target) frompop_targetthe parts of the partition can be.- type epsilon:
float
- param node_repeats:
Parameter for
bipartition_tree()to use. Defaluts to 1.- type node_repeats:
int, optional
- param method:
The partition method to use. Defaults to partial(bipartition_tree, max_attempts=10000).
- type method:
Callable, optional
- returns:
New assignments for the nodes of
graph.- rtype:
dict
Updaters
- gerrychain.updaters.compactness.boundary_nodes(partition, alias: str = 'boundary_nodes') Set[source]
- Parameters:
partition (
Partition) – A partition of a Graphalias (str, optional) – The name of the attribute that the boundary nodes are stored under. Default is ‘boundary_nodes’.
- Returns:
The set of nodes in the partition that are on the boundary.
- Return type:
Set
- gerrychain.updaters.compactness.exterior_boundaries(partition, previous: Set, inflow: Set, outflow: Set) Dict[source]
Updater function that responds to the flow of nodes between different partitions.
- Parameters:
partition (
Partition) – A partition of a Graphprevious (Set) – The previous set of exterior boundary nodes for a fixed part of the given partition.
inflow (Set) – The set of nodes that have flowed into the given part of the partition.
outflow (Set) – The set of nodes that have flowed out of the given part of the partition.
- Returns:
A dict mapping each part of the partition to the new exterior boundary of that part.
- Return type:
Dict
- gerrychain.updaters.compactness.exterior_boundaries_as_a_set(partition, previous: Set, inflow: Set, outflow: Set) Set[source]
Updater function that responds to the flow of nodes between different partitions.
- Parameters:
partition (
Partition) – A partition of a Graphprevious (Set) – The previous set of exterior boundary nodes for a fixed part of the given partition.
inflow (Set) – The set of nodes that have flowed into the given part of the partition.
outflow (Set) – The set of nodes that have flowed out of the given part of the partition.
- Returns:
The new set of exterior boundary nodes for the given part of the partition.
- Return type:
Set
- gerrychain.updaters.compactness.flips(partition) Dict[source]
- Parameters:
partition (
Partition) – A partition of a Graph- Returns:
The flips that were made to get from the parent partition to the given partition.
- Return type:
Dict
- gerrychain.updaters.compactness.initialize_exterior_boundaries(partition) Dict[int, float][source]
- Parameters:
partition (
Partition) – A partition of a Graph- Returns:
A dictionary mapping each part of a partition to the total perimeter of the boundary nodes in that part.
- Return type:
Dict[int, float]
- gerrychain.updaters.compactness.initialize_exterior_boundaries_as_a_set(partition) Dict[int, Set][source]
- Parameters:
partition (
Partition) – A partition of a Graph- Returns:
A dictionary mapping each part of a partition to the set of nodes in that part that are on the boundary.
- Return type:
Dict[int, Set]
- gerrychain.updaters.compactness.initialize_interior_boundaries(partition)[source]
- Parameters:
partition (
Partition) – A partition of a Graph- Returns:
A dictionary mapping each part of a partition to the total perimeter the given part shares with other parts.
- Return type:
Dict[int, float]
- gerrychain.updaters.compactness.interior_boundaries(partition, previous: Set, new_edges: Set, old_edges: Set) Dict[source]
Updater function that responds to the flow of nodes between different partitions.
- Parameters:
partition (
Partition) – A partition of a Graphprevious (Set) – The previous set of exterior boundary nodes for a fixed part of the given partition.
new_edges (Set) – The set of edges that have flowed into the given part of the partition.
old_edges (Set) – The set of edges that have flowed out of the given part of the partition.
- Returns:
A dict mapping each part of the partition to the new interior boundary of that part.
- Return type:
Dict
- gerrychain.updaters.compactness.perimeter(partition) Dict[int, float][source]
- Parameters:
partition (
Partition) – A partition of a Graph- Returns:
A dictionary mapping each part of a partition to its perimeter.
- Return type:
Dict[int, float]
- gerrychain.updaters.compactness.perimeter_of_part(partition, part: int) float[source]
Totals up the perimeter of the part in the partition.
Warning
frm: TODO: Refactoring: Add code to enforce this warning…
Requires that ‘boundary_perim’ be a node attribute, ‘shared_perim’ be an edge attribute, ‘cut_edges’ be an updater, and ‘exterior_boundaries’ be an updater.
- Parameters:
partition (
Partition) – A partition of a Graphpart (int) – The id of the part of the partition whose perimeter we want to compute.
- Returns:
The perimeter of the desired part.
- Return type:
float
- class gerrychain.updaters.county_splits.CountyInfo(split, nodes, contains)
Bases:
tupleA named tuple to store county split information.
- Parameters:
split (int) – The county split status. Makes use of
CountySplitenum to compute.nodes (List) – The nodes that are contained in the county.
contains (Set) – The assignment IDs that are contained in the county.
- contains
Alias for field number 2
- nodes
Alias for field number 1
- split
Alias for field number 0
- class gerrychain.updaters.county_splits.CountySplit(value)[source]
Bases:
EnumEnum to track county splits in a partition.
- Variables:
NOT_SPLIT – The county is not split.
NEW_SPLIT – The county is split in the current partition.
OLD_SPLIT – The county is split in the parent partition.
- NEW_SPLIT = 1
- NOT_SPLIT = 0
- OLD_SPLIT = 2
- gerrychain.updaters.county_splits.compute_county_splits(partition, county_field: str, partition_field: str) Dict[str, CountyInfo][source]
Track nodes in counties and information about their splitting.
- Parameters:
partition (
Partition) – The partition object to compute county splits for.county_field (str) – Name of county ID field on the graph.
partition_field (str) – Name of the attribute in the graph that stores the partition information. The county split information will be computed with respect to this division of the graph.
- Returns:
A dict containing the information on how counties changed between the parent and child partitions. If there is no parent partition, then only the OLD_SPLIT and NOT_SPLIT values will be used.
- Return type:
Dict[str, CountyInfo]
- gerrychain.updaters.county_splits.county_splits(partition_name: str, county_field_name: str) Callable[source]
Update that allows for the tracking of county splits.
- Parameters:
partition_name (str) – Name that the
Partitioninstance will store.county_field_name (str) – Name of county ID field on the graph.
- Returns:
The tracked data is a dictionary keyed on the county ID. The stored values are tuples of the form (split, nodes, seen). split is a
CountySplitenum, nodes is a list of node IDs, and seen is a list of assignment IDs that are contained in the county.- Return type:
Callable
- gerrychain.updaters.county_splits.tally_region_splits(reg_attr_lst)[source]
A naive updater for tallying the number of times a region attribute is split. for each region attribute in reg_attr_lst.
- Parameters:
reg_attr_lst (List[str]) – A list of region names to tally splits for.
- Returns:
A function that takes a partition and returns a dictionary which maps the region name to the number of times that it is split in a a particular partition.
- Return type:
Callable
- gerrychain.updaters.county_splits.total_reg_splits(partition, reg_attr)[source]
Returns the total number of times that reg_attr is split in the partition.
- gerrychain.updaters.cut_edges.cut_edges(partition)[source]
- Parameters:
partition (
Partition) – A partition of a Graph- Returns:
The set of edges that are cut by the given partition.
- Return type:
Set[Tuple]
- gerrychain.updaters.cut_edges.cut_edges_by_part(partition, previous: Set[Tuple], new_edges: Set[Tuple], old_edges: Set[Tuple]) Set[Tuple][source]
Updater function that responds to the flow of edges between different partitions.
- Parameters:
partition (
Partition) – A partition of a Graphprevious (Set[Tuple]) – The previous set of edges for a fixed part of the given partition.
new_edges (Set[Tuple]) – The set of edges that have flowed into the given part of the partition.
old_edges (Set[Tuple]) – The set of cut edges in the previous partition.
- Returns:
The new set of cut edges for the newly generated partition.
- Return type:
Set
- gerrychain.updaters.cut_edges.initialize_cut_edges(partition)[source]
- Parameters:
partition (
Partition) – A partition of a Graph
- frm: TODO: Documentation This description should be updated. Cut_edges are edges that touch
two different parts (districts). They are the internal boundaries between parts (districts). This routine finds all of the cut_edges in the graph and then creates a dict that stores all of the cut_edges for each part (district). This dict becomes the value of partition[“cut_edges”].
- Peter agreed:
Ah, you are correct. It maps parts to cut edges, not just any edges in the partition
- Returns:
A dictionary mapping each part of a partition to the set of edges in that part.
- Return type:
Dict
- class gerrychain.updaters.election.Election(name: str, party_names_to_node_attribute_names: Dict | List, alias: str | None = None)[source]
Bases:
objectRepresents the data of one election, with races conducted in each part of the partition.
As we vary the districting plan, we can use the same node-level vote totals to tabulate hypothetical elections. To do this manually with tallies, we would have to maintain tallies for each party, as well as the total number of votes, and then compute the electoral results and percentages from scratch every time. To make this simpler, this class provides an
ElectionUpdaterto manage these tallies. The updater returns anElectionResultsclass giving a convenient view of the election results, with methods likewins()orpercent()for common queries the user might make on election results.Example usage:
# Assuming your nodes have attributes "2008_D", "2008_R" # with (for example) 2008 senate election vote totals election = Election( "2008 Senate", {"Democratic": "2008_D", "Republican": "2008_R"}, alias="2008_Sen" ) # Assuming you already have a graph and assignment: partition = Partition( graph, assignment, updaters={"2008_Sen": election} ) # The updater returns an ElectionResults instance, which # we can use (for example) to see how many seats a given # party would win in this partition using this election's # vote distribution: partition["2008_Sen"].wins("Republican")
- Variables:
name – The name of the election. (e.g. “2008 Presidential”)
parties – A list of the names of the parties in the election.
node_attribute_names – A list of the node_attribute_names in the graph’s node data that hold the vote totals for each party.
party_names_to_node_attribute_names – A dictionary mapping party names to the node_attribute_names in the graph’s node data that hold the vote totals for that party.
tallies – A dictionary mapping party names to
DataTallyobjects that manage the vote totals for that party.updater – An
ElectionUpdaterobject that manages the tallies and returns anElectionResultsobject.alias – The name that the election is registered under in the partition’s dictionary of updaters.
- Parameters:
name (str) – The name of the election. (e.g. “2008 Presidential”)
party_names_to_node_attribute_names (Dict[str, str]) – A mapping from the name of a party to the name of an attribute of a node that contains the vote totals for that party. This parameter can be either a list or a dict. If a list, then the name of the party and the name of the node attribute are the same, for instance: [“Dem”, “Rep”] would indicate that the “Dem” party vote totals are stored in the “Dem” node attribute. If a list, then there are two possibilities.
- A dictionary matching party names to their
data node_attribute_names, either as actual node_attribute_names (list-like, indexed by nodes) or as string keys for the node attributes that hold the party’s vote totals. Or, a list of strings which will serve as both the party names and the node attribute keys.
- Parameters:
alias (Optional[str], optional) – Alias that the election is registered under in the Partition’s dictionary of updaters.
- class gerrychain.updaters.election.ElectionResults(election: Election, counts: Dict[str, Dict[int, float]], regions: List[int])[source]
Bases:
objectRepresents the results of an election. Provides helpful methods to answer common questions you might have about an election (Who won? How many seats?, etc.).
- Variables:
election – The
Electionobject that these results are associated with.totals_for_party – A dictionary mapping party names to the total number of votes that party received in each part of the partition.
regions – A list of regions that we would like the results for.
totals – A dictionary mapping each part of the partition to the total number of votes cast in that part.
percents_for_party – A dictionary mapping party names to the percentage of votes that party received in each part of the partition.
Note
The variable “regions” is generally called “parts” in other sections of the codebase, but we have changed it here to avoid confusion with the parameter “party” that often appears within the class.
- Parameters:
- Counts:
A dictionary mapping party names to the total number of votes that party received in each part of the partition.
- Returns:
None
- count(party: str, region: str | None = None) int[source]
- Parameters:
party (str) – Party ID.
region (Optional[int], optional) – ID of the part of the partition whose votes we want to tally.
- Returns:
The total number of votes that
partyreceived in a given region (part of the partition). Ifregionis omitted, returns the overall vote total ofparty.- Return type:
int
- counts(party: str) Tuple[source]
- Parameters:
party (str) – Party ID
- Returns:
tuple of the total votes cast for
partyin each part of the partition- Return type:
Tuple
- efficiency_gap() float[source]
Computes the efficiency gap for this ElectionResults object.
See:
efficiency_gap()- Returns:
The efficiency gap for this election.
- Return type:
float
- mean_median() float[source]
Computes the mean-median score for this ElectionResults object.
See:
mean_median()- Returns:
The mean-median score for this election.
- Return type:
float
- mean_thirdian() float[source]
Computes the mean-thirdian score for this ElectionResults object.
See:
mean_thirdian()- Returns:
The mean-thirdian score for this election.
- Return type:
float
- partisan_bias() float[source]
Computes the partisan bias for this ElectionResults object.
See:
partisan_bias()- Returns:
The partisan bias for this election.
- Return type:
float
- partisan_gini() float[source]
Computes the Gini score for this ElectionResults object.
See:
partisan_gini()- Returns:
The partisan Gini score for this election.
- Return type:
float
- percent(party: str, region: int | None = None) float[source]
- Parameters:
party (str) – Party ID.
region (Optional[int], optional) – ID of the part of the partition whose votes we want to tally.
- Returns:
The percentage of the vote that
partyreceived in a given region (part of the partition). Ifregionis omitted, returns the overall vote share ofparty.- Return type:
float
- percents(party: str) Tuple[source]
- Parameters:
party (str) – Party ID
- Returns:
The tuple of the percentage of votes that
partyreceived in each part of the partition- Return type:
Tuple
- seats(party: str) int[source]
- Parameters:
party (str) – Party name
- Returns:
The number of seats that
partywon.- Return type:
int
- total_votes() int[source]
- Returns:
The total number of votes cast in the election.
- Return type:
int
- votes(party: str) Tuple[source]
An alias for
counts().- Parameters:
party (str) – Party ID
- Returns:
tuple of the total votes cast for
partyin each part of the partition- Return type:
Tuple
- wins(party: str) int[source]
An alias for
seats().- Parameters:
party (str) – Party name
- Returns:
The number of seats that
partywon.- Return type:
int
- won(party: str, region: str) bool[source]
- Parameters:
party (str) – Party ID
region (str) – ID of the part of the partition whose votes we want to tally.
- Returns:
Answer to “Did
partywin the region in partregion?”- Return type:
bool
- class gerrychain.updaters.election.ElectionUpdater(election: Election)[source]
Bases:
objectThe updater for computing the election results in each part of the partition after each step in the Markov chain. The actual results are returned to the user as an
ElectionResultsinstance.- Variables:
election – The
Electionobject that this updater is associated with.
- get_previous_values(partition) Dict[str, Dict[int, float]][source]
- Parameters:
partition (
Partition) – The partition whose parent we want to obtain the previous vote totals from.- Returns:
A dictionary mapping party names to the vote totals that party received in each part of the parent of the current partition.
- Return type:
Dict[str, Dict[int, float]]
- gerrychain.updaters.election.format_part_results(percents_for_party: Dict[str, Dict[int, float]], part: int) str[source]
- Parameters:
percents_for_party (Dict[str, Dict[int, float]]) – A dictionary mapping party names to a dict containing the percentage of votes that party received in each part of the partition.
part (int) – The part of the partition whose results we want to format.
- Returns:
A formatted string containing the results for the given part of the partition.
- Return type:
str
- gerrychain.updaters.election.get_percents(counts: Dict, totals: Dict) Dict[source]
- Parameters:
counts (Dict) – A dictionary mapping each part in a partition to the count of the number of votes that a party received in that part.
totals (Dict) – A dictionary mapping each part in a partition to the total number of votes cast in that part.
- Returns:
A dictionary mapping each part in a partition to the percentage
- Return type:
Dict
- gerrychain.updaters.flows.compute_edge_flows(partition) Dict[source]
- Parameters:
partition (
Partition) – A partition of a Graph- Returns:
A flow dictionary containing the flow from the parent of this partition to this partition. This dictionary is of the form {part: {‘in’: <set of edges that flowed in>, ‘out’: <set of edges that flowed out>}}.
- Return type:
Dict
- gerrychain.updaters.flows.create_flow()[source]
- gerrychain.updaters.flows.flows_from_changes(old_partition, new_partition) Dict[source]
- Parameters:
- Returns:
A dictionary mapping each node that changed assignment between the previous and current partitions to a dictionary of the form {‘in’: <set of nodes that flowed in>, ‘out’: <set of nodes that flowed out>}.
- Return type:
Dict
- gerrychain.updaters.flows.neighbor_flips(partition) Set[Tuple][source]
- Parameters:
partition (
Partition) – A partition of a Graph- Returns:
The set of edges that were flipped in the given partition.
- Return type:
Set[Tuple]
- gerrychain.updaters.flows.on_edge_flow(initializer: Callable, alias: str) Callable[source]
Use this decorator to create an updater that responds to flows of cut edges between parts of the partition.
Decorate a function that takes: - The partition - The previous value of the updater for a fixed part P_i - The new cut edges that are just joining P_i at this step - The old cut edges that are just leaving P_i at this step and returns: - The new value of the updater for the fixed part P_i.
This will create an updater whose values are dictionaries of the form {part: <value of the given function on the part>}.
The initializer, by contrast, should take the entire partition and return the entire {part: <value>} dictionary.
Example:
@on_edge_flow(initializer, alias='my_updater') def my_updater(partition, previous, new_edges, old_edges): # return new value of the part
- Parameters:
initializer (Callable) – A function that takes the partition and returns a dictionary of the form {part: <value>}.
alias (str) – The name of the updater to be created.
- Returns:
A decorator that takes a function as input and returns a wrapped function.
- Return type:
Callable
- gerrychain.updaters.flows.on_flow(initializer: Callable, alias: str) Callable[source]
Use this decorator to create an updater that responds to flows of nodes between parts of the partition.
Decorate a function that takes: - The partition - The previous value of the updater on a fixed part P_i - The new nodes that are just joining P_i at this step - The old nodes that are just leaving P_i at this step and returns: - The new value of the updater for the fixed part P_i.
This will create an updater whose values are dictionaries of the form {part: <value of the given function on the part>}.
The initializer, by contrast, should take the entire partition and return the entire {part: <value>} dictionary.
Example:
@on_flow(initializer, alias='my_updater') def my_updater(partition, previous, new_nodes, old_nodes): # return new value for the part
- Parameters:
initializer (Callable) – A function that takes the partition and returns a dictionary of the form {part: <value>}.
alias (str) – The name of the updater to be created.
- Returns:
A decorator that takes a function as input and returns a wrapped function.
- Return type:
Callable
- class gerrychain.updaters.locality_split_scores.LocalitySplits(name: str, col_id: str, pop_col: str, scores_to_compute: List[str] = ['num_parts'], pent_alpha: float = 0.05)[source]
Bases:
objectComputes various splitting measures for a partition
Can be used to compute how a districting plan splits against any static attribute. The prototypical example is to consider how a districting plan subdivides counties or municipalities, but other units, such as city neighborhoods, state legislative districts, or Census tracts could be treated as ‘localities’
Example usage:
# Assuming your nodes have attributes "countyID" # with (for example) the name of the county that # node lies in and a population attribute "pop": county_splits = LocalitySplits( "countysplits", "countyID", "pop", ["num_parts", "symmetric_entropy","power_entropy"], pent_alpha = 0.8 ) # Assuming you already have a graph and assignment: partition = Partition( graph, assignment, updaters={"county_splits" : county_splits} ) # The updater returns an dictionary instance, which # at each step of the chain has the name of the score # and its value at that step
- Variables:
name – The name of the updater (e.g. “countysplits”)
col_id – The name of the column containing the locality attribute (i.e. county ids, municipality names, etc.)
pop_col – The name of the column containing population counts.
scores_to_compute – A list/tuple/set of strings naming the score functions to compute at each step. This will generally be some subcollection of
`['num_parts', 'num_pieces', 'naked_boundary', 'shannon_entropy', 'power_entropy', 'symmetric_entropy', 'num_split_localities']`pent_alpha – A number between 0 and 1 which is passed as the exponent to
power_entropy()localities – A list containing the unique locality identifiers (e.g. county names, municipality names, etc.) for the partition. This list is populated using the locality data stored on each of the nodes in the graph.
localitydict – A dictionary mapping node IDs to locality IDs. This is used to quickly look up the locality of a given node.
locality_splits – A dictionary mapping district IDs to a counter of localities in that district. That is to say, this tells us how many nodes in each district are of the given locality type.
locality_splits_inv – The inverted dictionary of locality_splits
allowed_pieces – A dictionary that maps each locality to the minimum number of districts that locality must touch. This is computed using the ideal district population. NOT CURRENTLY USED.
scores – A dictionary initialized with the key values from the initializer’s scores_to_compute parameter. The initial values are set to none and are updated in each call to store the compted score value for each metric of interest.
- Parameters:
name (str) – The name of the updater (e.g. “countysplits”)
col_id (str) – The name of the column containing the locality attribute (i.e. county ids, municipality names, etc.)
pop_col (str) – The name of the column containing population counts.
scores_to_compute (List[str], optional) – A list/tuple/set of strings naming the score functions to compute at each step. This should be some subcollection of
`['num_parts', 'num_pieces', 'naked_boundary', 'shannon_entropy', 'power_entropy', 'symmetric_entropy', 'num_split_localities']`. Default is [“num_parts”].pent_alpha (float, optional) – A number between 0 and 1 which is passed as the exponent to
power_entropy(). Default is 0.05.
- naked_boundary(partition) int[source]
- Computes the number of cut edges inside localities (i.e. the
number of cut edges with both endpoints in the same locality).
- Parameters:
partition (
Partition) – The partition to be scored.- Returns:
The number of cut edges within a locality.
- Return type:
int
- num_parts(partition) int[source]
Calculates the number of unique locality-district pairs.
- Parameters:
partition (
Partition) – The partition to be scored.- Returns:
The number of parts, i.e. the number of unique locality-district pairs.
- Return type:
int
- num_pieces(partition) int[source]
Calculates the number of pieces.
- Parameters:
partition (
Partition) – The partition to be scored.- Returns:
Number of pieces, where each piece is formed by cutting the graph by both locality and district boundaries.
- Return type:
int
- num_split_localities(partition) int[source]
Calculates the number of localities touching 2 or more districts.
- Parameters:
partition (
Partition) – The partition to be scored.- Returns:
The number of split localities, i.e. the number of localities touching 2 or more districts.
- Return type:
int
- power_entropy(partition) float[source]
Computes the power entropy score of a district plan.
- Parameters:
partition (
Partition) – The partition to be scored.- Returns:
Power entropy score.
- Return type:
float
- shannon_entropy(partition) float[source]
Computes the shannon entropy score of a district plan.
- Parameters:
partition (
Partition) – The partition to be scored.- Returns:
Shannon entropy score.
- Return type:
float
- symmetric_entropy(partition) float[source]
Calculates the symmetric entropy score.
Warning:
This function is previously marked incomplete.
- Parameters:
partition (
Partition) – The partition to be scored.- Returns:
The symmetric square root entropy score.
- Return type:
float
Updaters that compute spanning tree statistics.
- gerrychain.updaters.spanning_trees.num_spanning_trees(partition) Dict[int, int][source]
- Returns:
The number of spanning trees in each part (district) of a partition.
- Return type:
Dict[int, int]
- class gerrychain.updaters.tally.DataTally(data: Dict | Series | str, alias: str)[source]
Bases:
objectAn updater for tallying numerical data that is not necessarily stored as node attributes
- Variables:
data – A Dict or Series indexed by the graph’s nodes, or the string key for a node attribute containing the Tally’s data.
alias – The name of the tally in the Partition’s updaters dictionary
- Parameters:
data (Union[Dict, pandas.Series, str]) – A Dict or Series indexed by the graph’s nodes, or the string key for a node attribute containing the Tally’s data.
alias (str) – The name of the tally in the Partition’s updaters dictionary
- Returns:
None
- alias
- data
- class gerrychain.updaters.tally.Tally(fields: str | ~typing.List[str], alias: str | None = None, dtype: ~typing.Type = <class 'int'>)[source]
Bases:
objectAn updater for keeping a tally of one or more node attributes.
- Variables:
fields – The list of node attributes that you want to tally. Or just a single attribute name as a string.
alias – The aliased name of this Tally (meaning, the key corresponding to this Tally in the Partition’s updaters dictionary)
dtype – The type (int, float, etc.) that you want the tally to have
- Parameters:
fields (Union[str, List[str]]) – The list of node attributes that you want to tally. Or a just a single attribute name as a string.
alias (Optional[str], optional) – The aliased name of this Tally (meaning, the key corresponding to this Tally in the Partition’s updaters dictionary). Default is None.
dtype (Any, optional) – The type (int, float, etc.) that you want the tally to have. Default is int.
- Returns:
None
- alias
- dtype
- fields
- gerrychain.updaters.tally.compute_in_flow(graph, fields: str | List[str], flow: Dict) int[source]
- Parameters:
graph (
Graph) – The graph that the partition is defined on.fields (Union[str, List[str]]) – The list of node attributes that you want to tally. Or just a single attribute name as a string.
flow (Dict) – A dictionary containing the flow from the parent of this partition to this partition. This dictionary is of the form {part: {‘in’: <set of nodes that flowed in>, ‘out’: <set of nodes that flowed out>}}.
- Returns:
The sum of the “field” attribute of nodes in the “in” set of the flow.
- Return type:
int
- gerrychain.updaters.tally.compute_out_flow(graph, fields: str | List[str], flow: Dict) int[source]
- Parameters:
graph (
Graph) – The graph that the partition is defined on.fields (Union[str, List[str]]) – The list of node attributes that you want to tally. Or just a single attribute name as a string.
flow (Dict) – A dictionary containing the flow from the parent of this partition to this partition. This dictionary is of the form {part: {‘in’: <set of nodes that flowed in>, ‘out’: <set of nodes that flowed out>}}.
- Returns:
The sum of the “field” attribute of nodes in the “out” set of the flow.
- Return type:
int