Categorical ordinal

map_ordered_values(ordered_values: Sequence[Any] | ndarray) → tuple[dict[Any, int], int | None, int | None]

Map the passed ordered values to consecutive integer ranks.

Parameters:

ordered_values (Sequence[Any] | np.ndarray) – A sequence of categorical values, given in their defined order.

Returns:

  • ranks_mapping: A dictionary mapping each unique value to its rank.

  • min_rank: The minimum rank (or None if no categories).

  • max_rank: The maximum rank (or None if no categories).

Return type:

tuple[dict[Any, int], int | None, int | None]
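As a rough illustration of the return shape described above, here is a minimal sketch. The function `sketch_map_ordered_values` is a hypothetical stand-in, not the library implementation. It assumes that ranks start at 0 and that a repeated value keeps the rank of its first occurrence; neither detail is stated in this reference.

```python
from typing import Any, Optional, Sequence


def sketch_map_ordered_values(
    ordered_values: Sequence[Any],
    start: int = 0,  # assumption: the real starting rank is not documented here
) -> tuple[dict[Any, int], Optional[int], Optional[int]]:
    """Hypothetical stand-in that mirrors the documented return tuple."""
    ranks_mapping: dict[Any, int] = {}
    for value in ordered_values:
        # Assumption: a repeated value keeps its first rank, so ranks stay consecutive.
        if value not in ranks_mapping:
            ranks_mapping[value] = start + len(ranks_mapping)
    if not ranks_mapping:
        # No categories: both rank bounds are None, as the Returns list describes.
        return ranks_mapping, None, None
    return ranks_mapping, start, start + len(ranks_mapping) - 1


mapping, min_rank, max_rank = sketch_map_ordered_values(["low", "medium", "high"])
print(mapping, min_rank, max_rank)  # {'low': 0, 'medium': 1, 'high': 2} 0 2
```

An empty input yields `({}, None, None)` in this sketch, matching the "None if no categories" wording.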

get_cardinalities_mapping(column: Sequence[Any] | ndarray) → tuple[dict[Any, int], list[int]]

Count occurrences of each category value in an ordinal column.

Parameters:

column (Sequence[Any] | np.ndarray) – A sequence of ordinal values (may include NaN). NaN values are ignored in counting.

Returns:

  • counts_map: Mapping from each unique category value (excluding NaN) to its count.

  • counts_list: List of counts corresponding to each category value, ordered by sorted category values.

Return type:

tuple[dict[Any, int], list[int]]
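The counting behavior described above can be sketched in plain Python. `sketch_get_cardinalities_mapping` and `_is_nan` are hypothetical names used only for illustration. The sketch assumes NaN appears as a Python `float` (or a subclass such as `np.float64`) and that the category values are mutually comparable so they can be sorted.

```python
import math
from collections import Counter
from typing import Any, Sequence


def _is_nan(value: Any) -> bool:
    # Only float-like values can be NaN; strings and ints pass through untouched.
    return isinstance(value, float) and math.isnan(value)


def sketch_get_cardinalities_mapping(
    column: Sequence[Any],
) -> tuple[dict[Any, int], list[int]]:
    """Hypothetical stand-in that mirrors the documented return tuple."""
    # Count every non-NaN value.
    counts_map = dict(Counter(v for v in column if not _is_nan(v)))
    # Order the counts by sorted category value, as the Returns list describes.
    counts_list = [counts_map[key] for key in sorted(counts_map)]
    return counts_map, counts_list


counts_map, counts_list = sketch_get_cardinalities_mapping(
    [2, 1, float("nan"), 2, 3, 2]
)
print(counts_map)   # {2: 3, 1: 1, 3: 1}
print(counts_list)  # [1, 3, 1]
```

Note that `counts_map` keeps first-seen order in this sketch, while `counts_list` follows the sorted categories 1, 2, 3.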

collect_ordinal_cardinalities(data: ndarray) → list[ndarray]

Process a 2D array of ordinal columns to get counts per level for each column.

Parameters:

data (np.ndarray) – Two-dimensional array with shape (n_samples, n_ordinal_columns). Each column may contain NaN and ordinal categorical values.

Returns:

  • ordinals_cardinality: A list with one element per column. Each element is a 1D NumPy array of integer counts, where element i is the number of occurrences of the i-th sorted category in that column.

Return type:

list[np.ndarray]
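A small sketch of the per-column counting may help make the output shape concrete. `sketch_collect_ordinal_cardinalities` is a hypothetical stand-in, not the library code. It assumes a numeric (float) array and assumes that NaN entries are dropped before counting, which this reference only states explicitly for get_cardinalities_mapping.

```python
import numpy as np


def sketch_collect_ordinal_cardinalities(data: np.ndarray) -> list[np.ndarray]:
    """Hypothetical stand-in: one array of counts per ordinal column."""
    cardinalities = []
    for j in range(data.shape[1]):
        column = data[:, j]
        # Assumption: NaN marks a missing value and is excluded from the counts.
        present = column[~np.isnan(column)]
        # np.unique returns sorted unique values, so counts follow sorted categories.
        _, counts = np.unique(present, return_counts=True)
        cardinalities.append(counts)
    return cardinalities


data = np.array([
    [0.0, 1.0],
    [1.0, 1.0],
    [1.0, np.nan],
    [2.0, 3.0],
])
result = sketch_collect_ordinal_cardinalities(data)
print([c.tolist() for c in result])  # [[1, 2, 1], [2, 1]]
```

The arrays can differ in length, because each column may contain a different number of distinct categories.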