folktexts package
Subpackages
- folktexts.acs package
- folktexts.classifier package
- Submodules
- folktexts.classifier.base module
LLMClassifier
LLMClassifier.DEFAULT_INFERENCE_KWARGS
LLMClassifier.compute_risk_estimates_for_dataframe()
LLMClassifier.compute_risk_estimates_for_dataset()
LLMClassifier.correct_order_bias
LLMClassifier.custom_prompt_prefix
LLMClassifier.encode_row
LLMClassifier.fit()
LLMClassifier.inference_kwargs
LLMClassifier.model_name
LLMClassifier.predict()
LLMClassifier.predict_proba()
LLMClassifier.seed
LLMClassifier.set_fit_request()
LLMClassifier.set_inference_kwargs()
LLMClassifier.set_predict_proba_request()
LLMClassifier.set_predict_request()
LLMClassifier.set_score_request()
LLMClassifier.task
LLMClassifier.threshold
- folktexts.classifier.transformers_classifier module
- folktexts.classifier.web_api_classifier module
- Module contents
- folktexts.cli package
Submodules
folktexts.benchmark module
A benchmark class for measuring and evaluating LLM calibration.
- class folktexts.benchmark.Benchmark(llm_clf, dataset, config=BenchmarkConfig(numeric_risk_prompting=False, few_shot=None, reuse_few_shot_examples=False, batch_size=None, context_size=None, correct_order_bias=True, feature_subset=None, population_filter=None, seed=42))[source]
Bases: object
A benchmark object to measure and evaluate risk scores produced by an LLM.
- Parameters:
llm_clf (LLMClassifier) – A language model classifier object (can be local or web-hosted).
dataset (Dataset) – The dataset object to use for the benchmark.
config (BenchmarkConfig, optional) – The configuration object used to create the benchmark parameters. NOTE: This is used to uniquely identify the benchmark object for reproducibility; it will not be used to change the benchmark behavior. To configure the benchmark, pass a configuration object to the Benchmark.make_benchmark method.
- ACS_DATASET_CONFIGS = {'horizon': '1-Year', 'seed': 42, 'subsampling': None, 'survey': 'person', 'survey_year': '2018', 'test_size': 0.1, 'val_size': 0.1}
- property configs_dict: dict
- classmethod make_acs_benchmark(task_name, *, model, tokenizer=None, data_dir=None, max_api_rpm=None, config=BenchmarkConfig(numeric_risk_prompting=False, few_shot=None, reuse_few_shot_examples=False, batch_size=None, context_size=None, correct_order_bias=True, feature_subset=None, population_filter=None, seed=42), **kwargs)[source]
Create a standardized calibration benchmark on ACS data.
- Parameters:
task_name (str) – The name of the ACS task to use.
model (AutoModelForCausalLM | str) – The transformers language model to use, or the model ID for a webAPI hosted model (e.g., “openai/gpt-4o-mini”).
tokenizer (AutoTokenizer, optional) – The tokenizer used to train the model (if using a transformers model). Not required for webAPI models.
data_dir (str | Path, optional) – Path to the directory to load data from and save data in.
max_api_rpm (int, optional) – The maximum number of API requests per minute for webAPI models.
config (BenchmarkConfig, optional) – Extra benchmark configurations, by default will use BenchmarkConfig.default_config().
**kwargs – Additional arguments passed to ACSDataset and BenchmarkConfig. By default will use a set of standardized configurations for reproducibility.
- Returns:
bench – The ACS calibration benchmark object.
- Return type:
Benchmark
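A usage sketch; the task name “ACSIncome” and the model ID “openai/gpt-4o-mini” below are illustrative values, not the only supported ones:
```python
from folktexts.benchmark import Benchmark

# Hedged sketch: task name and model ID are example values.
bench = Benchmark.make_acs_benchmark(
    task_name="ACSIncome",
    model="openai/gpt-4o-mini",  # web-API model ID, or a transformers model
)
ece = bench.run(results_root_dir="results")  # by default returns the ECE score
```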
- classmethod make_benchmark(*, task, dataset, model, tokenizer=None, max_api_rpm=None, config=BenchmarkConfig(numeric_risk_prompting=False, few_shot=None, reuse_few_shot_examples=False, batch_size=None, context_size=None, correct_order_bias=True, feature_subset=None, population_filter=None, seed=42), **kwargs)[source]
Create a calibration benchmark from a given configuration.
- Parameters:
task (TaskMetadata | str) – The task metadata object or name of the task to use.
dataset (Dataset) – The dataset to use for the benchmark.
model (AutoModelForCausalLM | str) – The transformers language model to use, or the model ID for a webAPI hosted model (e.g., “openai/gpt-4o-mini”).
tokenizer (AutoTokenizer, optional) – The tokenizer used to train the model (if using a transformers model). Not required for webAPI models.
max_api_rpm (int, optional) – The maximum number of API requests per minute for webAPI models.
config (BenchmarkConfig, optional) – Extra benchmark configurations, by default will use BenchmarkConfig.default_config().
**kwargs – Additional arguments for easier configuration of the benchmark. Will simply use these values to update the config object.
- Returns:
bench – The calibration benchmark object.
- Return type:
Benchmark
- property model_name
- plot_results(*, show_plots=True)[source]
Render evaluation plots and save to disk.
- Parameters:
show_plots (bool, optional) – Whether to show plots, by default True.
- Returns:
plots_paths – The paths to the saved plots.
- Return type:
dict[str, str]
- property results
- property results_dir: Path
Get the results directory for this benchmark.
- property results_root_dir: Path
- run(results_root_dir, fit_threshold=0)[source]
Run the calibration benchmark experiment.
- Parameters:
results_root_dir (str | Path) – Path to root directory under which results will be saved.
fit_threshold (int | bool, optional) – Whether to fit the binarization threshold on a given number of training samples, by default 0 (will not fit the threshold).
- Returns:
The benchmark metric value. By default this is the ECE score.
- Return type:
float
- save_results(results_root_dir=None)[source]
Save the benchmark results to disk.
- Parameters:
results_root_dir (str | Path, optional) – Path to root directory under which results will be saved. By default will use self.results_root_dir.
- property task
- class folktexts.benchmark.BenchmarkConfig(numeric_risk_prompting=False, few_shot=None, reuse_few_shot_examples=False, batch_size=None, context_size=None, correct_order_bias=True, feature_subset=None, population_filter=None, seed=42)[source]
Bases: object
A dataclass to hold the configuration for risk-score benchmark.
- numeric_risk_prompting
Whether to prompt for numeric risk-estimates instead of multiple-choice Q&A, by default False.
- Type:
bool, optional
- few_shot
Whether to use few-shot prompting with a given number of examples, by default None.
- Type:
int | None, optional
- reuse_few_shot_examples
Whether to reuse the same samples for few-shot prompting (or sample new ones every time), by default False.
- Type:
bool, optional
- batch_size
The batch size to use for inference.
- Type:
int | None, optional
- context_size
The maximum context size when prompting the LLM.
- Type:
int | None, optional
- correct_order_bias
Whether to correct the ordering bias in multiple-choice Q&A when prompting the LLM, by default True.
- Type:
bool, optional
- feature_subset
An optional subset of the standard feature set to use for the task. The list should contain the names of the feature columns to use.
- Type:
list[str] | None, optional
- population_filter
Optional population filter for this benchmark; must follow the format {"column_name": "value"}.
- Type:
dict | None, optional
- seed
Random seed, set for reproducibility.
- Type:
int, optional
- batch_size: int | None = None
- context_size: int | None = None
- correct_order_bias: bool = True
- classmethod default_config(**changes)[source]
Returns the default configuration with optional changes.
- feature_subset: list[str] | None = None
- few_shot: int | None = None
- numeric_risk_prompting: bool = False
- population_filter: dict | None = None
- reuse_few_shot_examples: bool = False
- seed: int = 42
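For example, default_config() can produce the standard configuration with selected fields overridden (the overrides below are illustrative):
```python
from folktexts.benchmark import BenchmarkConfig

# Start from the defaults and override a few fields:
config = BenchmarkConfig.default_config(
    numeric_risk_prompting=True,  # verbalized numeric risk estimates
    few_shot=5,                   # 5 example Q&As per prompt
)
```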
folktexts.col_to_text module
- class folktexts.col_to_text.ColumnToText(name, short_description, value_map=None, question=None, connector_verb='is:', missing_value_fill='N/A', use_value_map_only=False)[source]
Bases: object
Maps a single column’s values to natural text.
Constructs a ColumnToText object.
- Parameters:
name (str) – The column’s name.
short_description (str) – A short description of the column to be used before different values. For example, short_description=”yearly income” will result in “The yearly income is […]”.
value_map (dict[int | str, str] | Callable, optional) – A map between column values and their textual meaning. If not provided, will try to infer a mapping from the question.
question (QAInterface, optional) – A question associated with the column. If not provided, will try to infer a multiple-choice question from the value_map.
connector_verb (str, optional) – Which verb to use when connecting the column’s description to its value; by default “is”.
missing_value_fill (str, optional) – The value to use when the column’s value is not found in the value_map, by default “N/A”.
use_value_map_only (bool, optional) – Whether to only use the value_map for mapping values to text, or whether natural language representation should be generated using the connector_verb and short_description as well. By default (False) will construct a natural language representation of the form: “The [short_description] [connector_verb] [value_map.get(val)]”.
- get_text(value)[source]
Returns the natural text representation of the given data value.
- Return type:
str
- property name: str
- property question: QAInterface
- property short_description: str
- property value_map: Callable
Returns the value map function for this column.
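A minimal sketch of constructing a ColumnToText and rendering a value; the column name and value map below are illustrative:
```python
from folktexts.col_to_text import ColumnToText

# Map the ACS "AGEP" column (age in years) to natural text:
age_col = ColumnToText(
    name="AGEP",
    short_description="age",
    value_map=lambda x: f"{int(x)} years old",
    connector_verb="is",
)

# Following the documented format
# "The [short_description] [connector_verb] [value_map.get(val)]":
age_col.get_text(42)  # -> "The age is 42 years old."
```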
folktexts.dataset module
General Dataset functionality for text-based datasets.
- class folktexts.dataset.Dataset(data, task, test_size=0.1, val_size=0.1, subsampling=None, seed=42)[source]
Bases: object
Construct a Dataset object.
- Parameters:
data (pd.DataFrame) – The dataset’s data in pandas DataFrame format.
task (TaskMetadata) – The metadata for the prediction task.
test_size (float, optional) – The size of the test set, as a fraction of the total dataset size, by default 0.1.
val_size (float, optional) – The size of the validation set, as a fraction of the total dataset size, by default 0.1.
subsampling (float, optional) – The fraction of the data to keep when sub-sampling. By default will not use sub-sampling (subsampling=None).
seed (int, optional) – The random state seed, by default 42.
- property data: DataFrame
- property name: str
A unique name for this dataset.
- sample_n_train_examples(n, reuse_examples=False)[source]
Return a set of samples from the training set.
- Parameters:
n (int) – The number of example rows to return.
reuse_examples (bool, optional) – Whether to reuse the same examples for consistency. By default will sample new examples each time (reuse_examples=False).
- Returns:
X, y – The features and target data for the sampled examples.
- Return type:
tuple[pd.DataFrame, pd.Series]
- property seed: int
- property subsampling: float
- property task: TaskMetadata
- property test_size: float
- property train_size: float
- property val_size: float
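A construction sketch, assuming the “ACSIncome” task has already been registered (e.g., via the folktexts.acs subpackage) and that the hypothetical data file below holds that task's columns:
```python
import pandas as pd
from folktexts.dataset import Dataset
from folktexts.task import TaskMetadata

task = TaskMetadata.get_task("ACSIncome")          # assumed to be registered
df = pd.read_parquet("acs_income_sample.parquet")  # hypothetical data file

dataset = Dataset(data=df, task=task, test_size=0.1, val_size=0.1, seed=42)
X_shots, y_shots = dataset.sample_n_train_examples(n=3, reuse_examples=True)
```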
folktexts.evaluation module
Module to map risk-estimates to a variety of evaluation metrics.
Notes
Code based on the error_parity.evaluation module, at: https://github.com/socialfoundations/error-parity/blob/main/error_parity/evaluation.py
- folktexts.evaluation.bootstrap_estimate(eval_func, *, y_true, y_pred_scores, sensitive_attribute=None, k=200, confidence_pct=95, seed=42)[source]
Computes bootstrap estimates of the given evaluation function.
- Parameters:
eval_func (Callable[[np.ndarray, np.ndarray, np.ndarray], dict[str, float]]) – The evaluation function to run for each bootstrap sample. Must follow the signature eval_func(y_true, y_pred_scores, sensitive_attribute).
y_true (np.ndarray) – The true labels.
y_pred_scores (np.ndarray) – The predicted scores.
sensitive_attribute (np.ndarray, optional) – Optionally, provide the sensitive attribute data to compute fairness metrics, by default None.
k (int, optional) – How many bootstrap samples to draw, by default 200.
confidence_pct (float, optional) – The confidence interval to use, in percentage, by default 95.
seed (int, optional) – The random seed, by default 42.
- Returns:
results – A dictionary containing bootstrap estimates for a variety of metrics.
- Return type:
dict[str, float]
- folktexts.evaluation.compute_best_threshold(y_true, y_pred_scores, *, false_pos_cost=1.0, false_neg_cost=1.0)[source]
Computes the binarization threshold that maximizes accuracy.
- Parameters:
y_true (np.ndarray) – The true class labels.
y_pred_scores (np.ndarray) – The predicted risk scores.
false_pos_cost (float, optional) – The cost of a false positive error, by default 1.0.
false_neg_cost (float, optional) – The cost of a false negative error, by default 1.0.
- Returns:
best_threshold – The threshold value that maximizes accuracy for the given predictions.
- Return type:
float
- folktexts.evaluation.evaluate_binary_predictions(y_true, y_pred)[source]
Evaluates the provided binary predictions on common performance metrics.
- Parameters:
y_true (np.ndarray) – The true class labels.
y_pred (np.ndarray) – The binary predictions.
- Returns:
A dictionary with key-value pairs of (metric name, metric value).
- Return type:
dict
- folktexts.evaluation.evaluate_binary_predictions_fairness(y_true, y_pred, sensitive_attribute, return_groupwise_metrics=False, min_group_size=0.04)[source]
Evaluates fairness of the given predictions.
Fairness metrics are computed as the ratios between group-wise performance metrics.
- Parameters:
y_true (np.ndarray) – The true class labels.
y_pred (np.ndarray) – The discretized predictions.
sensitive_attribute (np.ndarray) – The sensitive attribute (protected group membership).
return_groupwise_metrics (bool, optional) – Whether to return group-wise performance metrics (True) or only the ratios between these metrics (False), by default False.
min_group_size (float, optional) – The minimum fraction of samples (as a fraction of the total number of samples) that a group must have to be considered for fairness evaluation, by default 0.04. This is meant to avoid evaluating metrics on very small groups which leads to noisy and inconsistent results.
- Returns:
A dictionary with key-value pairs of (metric name, metric value).
- Return type:
dict
- folktexts.evaluation.evaluate_predictions(y_true, y_pred_scores, *, sensitive_attribute=None, threshold='best', model_name=None)[source]
Evaluates predictions on common performance and fairness metrics.
- Parameters:
y_true (np.ndarray) – The true class labels.
y_pred_scores (np.ndarray) – The predicted scores.
sensitive_attribute (np.ndarray, optional) – The sensitive attribute data. Will compute fairness metrics if provided.
threshold (float | str, optional) – The threshold to use for binarizing the predictions, or “best” to infer which threshold maximizes accuracy.
model_name (str, optional) – The name of the model to be used on the plots, by default None.
- Returns:
results – A dictionary with key-value pairs of (metric name, metric value).
- Return type:
dict
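A self-contained sketch on synthetic labels and scores:
```python
import numpy as np
from folktexts.evaluation import evaluate_predictions

rng = np.random.default_rng(42)
y_true = rng.integers(0, 2, size=1_000)
# Noisy scores that are mildly informative of the true label:
y_scores = 0.5 * y_true + 0.5 * rng.random(1_000)

results = evaluate_predictions(y_true, y_scores, threshold="best")
# `results` maps metric names to values (accuracy, ECE, etc.).
```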
- folktexts.evaluation.evaluate_predictions_bootstrap(y_true, y_pred_scores, *, sensitive_attribute=None, threshold='best', k=200, confidence_pct=95, seed=42)[source]
Computes bootstrap estimates of classification metrics for the given predictions.
- Parameters:
y_true (np.ndarray) – The true labels.
y_pred_scores (np.ndarray) – The score predictions.
sensitive_attribute (np.ndarray, optional) – The sensitive attribute data. Will compute fairness metrics if provided.
threshold (float | str, optional) – The threshold to use for binarizing the predictions, or “best” to infer which threshold maximizes accuracy, by default “best”.
k (int, optional) – How many bootstrap samples to draw, by default 200.
confidence_pct (float, optional) – How large of a confidence interval to use when reporting lower and upper bounds, by default 95 (i.e., 2.5 to 97.5 percentile of results).
seed (int, optional) – The random seed, by default 42.
- Returns:
results – A dictionary containing bootstrap estimates for a variety of metrics.
- Return type:
dict[str, float]
folktexts.llm_utils module
Common functions to use with transformer LLMs.
- folktexts.llm_utils.add_pad_token(tokenizer)[source]
Add a pad token to the model and tokenizer if one doesn't already exist.
Here we’re using the end-of-sentence token as the pad token. Both the model weights and tokenizer vocabulary are untouched.
Another possible way would be to add a new token [PAD] to the tokenizer and update the tokenizer vocabulary and model weight embeddings accordingly. The embedding for the new pad token would be the average of all other embeddings.
- folktexts.llm_utils.get_model_folder_path(model_name, root_dir='/tmp')[source]
Returns the folder where the model is saved.
- Return type:
str
- folktexts.llm_utils.get_model_size_B(model_name, default=None)[source]
Get the model size from the model name, in billions of parameters.
- Return type:
int
- folktexts.llm_utils.is_bf16_compatible()[source]
Checks if the current environment is bfloat16 compatible.
- Return type:
bool
- folktexts.llm_utils.load_model_tokenizer(model_name_or_path, **kwargs)[source]
Load a model and tokenizer from the given local path (or using the model name).
- Parameters:
model_name_or_path (str | Path) – Model name or local path to the model folder.
kwargs (dict) – Additional keyword arguments to pass to the model from_pretrained call.
- Returns:
The loaded model and tokenizer, respectively.
- Return type:
tuple[AutoModelForCausalLM, AutoTokenizer]
- folktexts.llm_utils.query_model_batch(text_inputs, model, tokenizer, context_size)[source]
Queries the model with a batch of text inputs.
- Parameters:
text_inputs (list[str]) – The inputs to the model as a list of strings.
model (AutoModelForCausalLM) – The model to query.
tokenizer (AutoTokenizer) – The tokenizer used to encode the text inputs.
context_size (int) – The maximum context size to consider for each input (in tokens).
- Returns:
last_token_probs – Model’s last token linear probabilities for each input as an np.array of shape (batch_size, vocab_size).
- Return type:
np.array
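A short sketch combining load_model_tokenizer and query_model_batch; the "gpt2" checkpoint is illustrative, and any causal LM should work:
```python
from folktexts.llm_utils import load_model_tokenizer, query_model_batch

model, tokenizer = load_model_tokenizer("gpt2")  # example checkpoint
last_token_probs = query_model_batch(
    ["The capital of France is"], model, tokenizer, context_size=64,
)
print(last_token_probs.shape)  # (1, vocab_size)
```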
- folktexts.llm_utils.query_model_batch_multiple_passes(text_inputs, model, tokenizer, context_size, n_passes, digits_only=False)[source]
Queries an LM for multiple forward passes.
Greedy token search over multiple forward passes: each forward pass takes the highest-likelihood token from the previous pass.
NOTE: could use model.generate in the future!
- Parameters:
text_inputs (list[str]) – The batch inputs to the model as a list of strings.
model (AutoModelForCausalLM) – The model to query.
tokenizer (AutoTokenizer) – The tokenizer used to encode the text inputs.
context_size (int) – The maximum context size to consider for each input (in tokens).
n_passes (int) – The number of forward passes to run.
digits_only (bool, optional) – Whether to only sample for digit tokens.
- Returns:
last_token_probs – Last token linear probabilities for each forward pass, for each text in the input batch. The output has shape (batch_size, n_passes, vocab_size).
- Return type:
np.array
folktexts.plotting module
Module to plot evaluation results.
- folktexts.plotting.render_evaluation_plots(y_true, y_pred_scores, *, eval_results={}, model_name=None, imgs_dir=None, show_plots=False)[source]
Renders evaluation plots for the given predictions.
- Return type:
dict
folktexts.prompting module
Module to map risk-estimation questions to different prompting techniques.
E.g., multiple-choice Q&A vs. direct numeric Q&A; zero-shot vs. few-shot vs. chain-of-thought (CoT).
- folktexts.prompting.apply_chat_template(tokenizer, user_prompt, system_prompt=None, chat_prompt='If had to select one of the options, my answer would be', **kwargs)[source]
- Return type:
str
- folktexts.prompting.encode_row_prompt(row, task, question=None, custom_prompt_prefix=None, add_task_description=True)[source]
Encode a question regarding a given row.
- Return type:
str
- folktexts.prompting.encode_row_prompt_chat(row, task, tokenizer, question=None, **chat_template_kwargs)[source]
- Return type:
str
- folktexts.prompting.encode_row_prompt_few_shot(row, task, dataset, n_shots, question=None, reuse_examples=False, custom_prompt_prefix=None)[source]
Encode a question regarding a given row using few-shot prompting.
- Parameters:
row (pd.Series) – The row that the question will be about.
task (TaskMetadata) – The task that the row belongs to.
n_shots (int) – The number of example questions and answers to use before prompting about the given row.
reuse_examples (bool, optional) – Whether to reuse the same examples for consistency. By default will resample new examples each time (reuse_examples=False).
- Returns:
prompt – The encoded few-shot prompt.
- Return type:
str
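A sketch re-using the task and dataset objects from the folktexts.dataset example above, with row taken as one test row; all three are assumed to exist already:
```python
from folktexts.prompting import encode_row_prompt_few_shot

# `row`, `task`, and `dataset` are assumed to exist (see folktexts.dataset):
prompt = encode_row_prompt_few_shot(
    row, task, dataset=dataset, n_shots=3, reuse_examples=False,
)
```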
folktexts.qa_interface module
Interface for question-answering with LLMs.
Create different types of questions (direct numeric, multiple-choice).
Encode questions and decode model outputs.
Compute risk-estimate from model outputs.
- class folktexts.qa_interface.Choice(text, data_value, numeric_value=None)[source]
Bases: object
Represents a choice in multiple-choice Q&A.
- text
The text of the choice. E.g., “25-34 years old”.
- Type:
str
- data_value
The categorical value corresponding to this choice in the data.
- Type:
object
- numeric_value
A meaningful numeric value for the choice. E.g., if the choice is “25-34 years old”, the numeric value could be 30. The choice with the highest numeric value can be used as a proxy for the positive class. If not provided, will try to use the choice.value.
- Type:
float, optional
- data_value: object
- numeric_value: float = None
- text: str
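For instance (values are illustrative):
```python
from folktexts.qa_interface import Choice

# An age-bucket choice whose midpoint serves as the numeric value:
choice = Choice(text="25-34 years old", data_value=2, numeric_value=30)
```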
- class folktexts.qa_interface.DirectNumericQA(column, text, num_forward_passes=2, answer_probability=True)[source]
Bases: QAInterface
Represents a direct numeric question.
Notes
For example, the prompt could be “ Q: What is 2 + 2? A: ”, with the expected answer being “4”.
If looking for a direct numeric probability, the answer prompt will be framed like so: “ Q: What is the probability, between 0 and 1, of getting heads on a coin flip? A: 0.”, so that a numeric answer can be extracted with at most 2 forward passes. This is done automatically by passing the kwarg answer_probability=True.
Note that some models have multi-digit tokens in their vocabulary, so we need to correctly assess which tokens in the vocabulary correspond to valid numeric answers.
- answer_probability: bool = True
- get_answer_from_model_output(last_token_probs, tokenizer_vocab)[source]
Outputs a numeric answer inferred from the model’s output.
- Parameters:
last_token_probs (np.ndarray) – The last token probabilities of the model for the question. The first dimension must correspond to the number of forward passes as specified by num_forward_passes.
tokenizer_vocab (dict[str, int]) – The tokenizer's vocabulary.
- Returns:
answer – The numeric answer to the question.
- Return type:
float | int
Notes
Eventually we could run a search algorithm to find the most likely answer over multiple forward passes, but for now we’ll just take the argmax on each forward pass.
- num_forward_passes: int = 2
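A construction sketch; the column name and question text are hypothetical:
```python
from folktexts.qa_interface import DirectNumericQA

question = DirectNumericQA(
    column="PINCP",
    text="What is the probability that this person's yearly income is above $50,000?",
    answer_probability=True,  # frame the answer prompt as "A: 0." + digits
)
```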
- class folktexts.qa_interface.MultipleChoiceQA(column, text, num_forward_passes=1, choices=<factory>, _answer_keys_source=<factory>)[source]
Bases: QAInterface
Represents a multiple-choice question and its answer keys.
- property answer_keys: tuple[str, ...]
- classmethod create_answer_keys_permutations(question)[source]
Yield questions with all permutations of answer keys.
- Parameters:
question (Question) – The template question whose answer keys will be permuted.
- Returns:
permutations – A generator of questions with all permutations of answer keys.
- Return type:
Iterator[Question]
- classmethod create_question_from_value_map(column, value_map, attribute, **kwargs)[source]
Constructs a question from a value map.
- Return type:
MultipleChoiceQA
- get_answer_from_model_output(last_token_probs, tokenizer_vocab)[source]
Decodes the model’s output into an answer for the given question.
- Parameters:
last_token_probs (np.ndarray) – The model’s last token probabilities for the question. The first dimension corresponds to the number of forward passes as specified by self.num_forward_passes.
tokenizer_vocab (dict[str, int]) – The tokenizer's vocabulary.
- Returns:
answer – The answer to the question.
- Return type:
float
- get_answer_key_from_value(value)[source]
Returns the answer key corresponding to the given data value.
- Return type:
str
- get_value_to_text_map()[source]
Returns the map from choice data value to choice textual representation.
- Return type:
dict[object, str]
- num_forward_passes: int = 1
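A construction sketch; the column and choices below are hypothetical:
```python
from folktexts.qa_interface import Choice, MultipleChoiceQA

question = MultipleChoiceQA(
    column="HICOV",
    text="Does this person have health insurance coverage?",
    choices=[
        Choice("Yes, has health insurance", data_value=1, numeric_value=1),
        Choice("No, does not have health insurance", data_value=2, numeric_value=0),
    ],
)
```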
- class folktexts.qa_interface.QAInterface(column, text, num_forward_passes)[source]
Bases: ABC
An interface for a question-answering system.
- column: str
- get_answer_from_model_output(last_token_probs, tokenizer_vocab)[source]
Decodes the model’s output into an answer for the given question.
- Parameters:
last_token_probs (np.ndarray) – The model’s last token probabilities for the question. The first dimension corresponds to the number of forward passes as specified by self.num_forward_passes.
tokenizer_vocab (dict[str, int]) – The tokenizer's vocabulary.
- Returns:
answer – The answer to the question.
- Return type:
float
- num_forward_passes: int
- text: str
folktexts.task module
Definition of a generic TaskMetadata class.
- class folktexts.task.TaskMetadata(name, features, target, cols_to_text, sensitive_attribute=None, target_threshold=None, multiple_choice_qa=None, direct_numeric_qa=None, description=None, _use_numeric_qa=False)[source]
Bases: object
A base class to hold information on a prediction task.
- check_task_columns_are_available(available_cols, raise_=True)[source]
Checks if all columns required by this task are available.
- Parameters:
available_cols (list[str]) – The list of column names available in the dataset.
raise_ (bool, optional) – Whether to raise an error if some columns are missing, by default True.
- Returns:
all_available – True if all required columns are present in the given list of available columns, False otherwise.
- Return type:
bool
- cols_to_text: dict[str, ColumnToText]
A mapping between column names and their textual descriptions.
- create_task_with_feature_subset(feature_subset)[source]
Creates a new task with a subset of the original features.
- description: str = None
A description of the task, including the population to which the task pertains.
- direct_numeric_qa: DirectNumericQA = None
The direct numeric question and answer interface for this task.
- features: list[str]
The names of the features used in the task.
- get_row_description(row)[source]
Encode a description of a given data row in textual form.
- Return type:
str
- get_target()[source]
Resolves the name of the target column depending on self.target_threshold.
- Return type:
str
- classmethod get_task(name, use_numeric_qa=False)[source]
Fetches a previously created task by its name.
- Parameters:
name (str) – The name of the task to fetch.
use_numeric_qa (bool, optional) – Whether to set the retrieved task to use verbalized numeric Q&A instead of the default multiple-choice Q&A prompts. Default is False.
- Returns:
task – The task object with the given name.
- Return type:
TaskMetadata
- Raises:
ValueError – Raised if the task with the given name has not been created yet.
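For example, assuming the “ACSIncome” task has been registered (e.g., by importing the folktexts.acs subpackage):
```python
import folktexts.acs  # noqa: F401 -- assumed to register the ACS tasks
from folktexts.task import TaskMetadata

task = TaskMetadata.get_task("ACSIncome")
task.get_target()  # resolved (possibly thresholded) target column name
```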
- multiple_choice_qa: MultipleChoiceQA = None
The multiple-choice question and answer interface for this task.
- name: str
The name of the task.
- property question: QAInterface
Getter for the Q&A interface for this task.
- sensitive_attribute: str = None
The name of the column used as the sensitive attribute data (if provided).
- sensitive_attribute_value_map()[source]
Returns a mapping between sensitive attribute values and their descriptions.
- Return type:
Callable
- target: str
The name of the target column.
- property use_numeric_qa: bool
Getter for whether to use numeric Q&A instead of multiple-choice Q&A prompts.
folktexts.threshold module
Helper function for defining binarization thresholds.
- class folktexts.threshold.Threshold(value, op)[source]
Bases: object
A class to represent a threshold value and its comparison operator.
- value
The threshold value to compare against.
- Type:
float | int
- op
The comparison operator to use. One of '>', '<', '>=', '<=', '==', '!='.
- Type:
str
- apply_to_column_data(data)[source]
Applies the threshold operation to a pandas Series or scalar value.
- Return type:
int | Series
- apply_to_column_name(column_name)[source]
Standardizes naming of thresholded columns.
- Return type:
str
- op: str
- valid_ops: ClassVar[dict] = {'!=': <built-in function ne>, '<': <built-in function lt>, '<=': <built-in function le>, '==': <built-in function eq>, '>': <built-in function gt>, '>=': <built-in function ge>}
- value: float | int
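A minimal sketch:
```python
import pandas as pd
from folktexts.threshold import Threshold

# Binarize income at 50000, i.e. "income > 50000":
t = Threshold(value=50_000, op=">")

t.apply_to_column_data(pd.Series([30_000, 60_000]))  # -> binary indicator Series
t.apply_to_column_name("PINCP")  # standardized name for the thresholded column
```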