Nathan Gary Glenn
and 3 contributors


analogize - classify data with AM from the command line


version 3.12


    analogize --format <format> [--exemplars <file>] [--test <file>] [--project <dir>] [--print <config_info,statistical_summary,analogical_set_summary,gang_summary,gang_detailed>] [--help]


Classify data with analogical modeling from the command line. Required arguments are format and either exemplars or project. You can use old AM::Parallel projects (a directory containing data and test files) or specify individual data and test files. By default, only the accuracy of the predicted outcomes is printed. More detail may be printed using the print option.



format

specify either commas or nocommas format for exemplar and test data files (= should be used for "null" variables). See "dataset_from_file" in Algorithm::AM::DataSet for details on the two formats.

exemplars, data or train

path to the file containing the exemplar/training data


project

path to an AM::Parallel-style project directory (any 'outcome' file is ignored). The directory should contain a file called data with the known exemplars and, optionally, a file called test with the test exemplars. If the test file does not exist, leave-one-out testing is performed on the exemplars in the data file.
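The directory layout described above can be checked before running a classification. The following is only a sketch: the function name `project_mode` and its output strings are hypothetical and are not part of analogize. It reports which testing mode a given project directory would lead to, based solely on the presence of the data and test files.

```shell
# Hypothetical helper (not part of analogize): inspect a project directory
# and report which testing mode the description above implies.
project_mode() {
    dir=$1
    if [ ! -f "$dir/data" ]; then
        # Without a data file there are no known exemplars to work with.
        echo "invalid: no data file"
    elif [ -f "$dir/test" ]; then
        # Both files present: test exemplars are classified against data.
        echo "data + test"
    else
        # Only data present: leave-one-out over the exemplars in data.
        echo "leave-one-out"
    fi
}

project_mode datasets/soybean
```

The printed result depends on the files actually present in the directory passed in.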


test

path to the file containing the test data. If none is specified, performs leave-one-out classification with the exemplar set.


print

reports to print, separated by commas (be careful not to add spaces between report names!). For example, --print analogical_set_summary,gang_summary would print analogical sets and gang summaries.

Available options are:


config_info

Describes the configuration used and gives some simple information about the data, e.g. cardinality.


statistical_summary

A statistical summary of the classification results, including all predicted outcomes with their scores and percentages and the total score for all outcomes. Whether the predicted class is correct, incorrect, or a tie is also included, if the test item had a known class.


analogical_set_summary

The analogical set, showing all items that contributed to the predicted outcome, along with the amount contributed by each item (score and percentage overall).


gang_summary

A summary of the gang effects on the outcome prediction.


gang_detailed

Same as gang_summary, but also includes lists of exemplars for each gang.
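Because the report names must be joined by commas with no spaces, it can help to build the value for the print option programmatically. The following is a minimal sketch; the helper name `join_reports` is hypothetical and the script only prints the resulting command line rather than running it.

```shell
# Hypothetical helper: join all arguments with commas and no spaces,
# which is the form the print option expects.
join_reports() {
    old_ifs=$IFS
    IFS=,
    # "$*" joins the positional parameters with the first character of IFS.
    printf '%s\n' "$*"
    IFS=$old_ifs
}

print_arg=$(join_reports statistical_summary analogical_set_summary gang_summary)

# Show the command that would be run; remove the echo to run it for real.
echo "analogize --project datasets/soybean --format commas --print $print_arg"
```

Any of the report names listed above can be passed to the helper in the same way.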


Allow a test item to be included in the data set during classification. If false (default), test items will be removed from the dataset during classification.


Treat null variables in a test item as regular variables. If false (default), these variables will be excluded and not considered during classification.


Calculate scores using occurrences (linearly) instead of using pointers (quadratically).

help or ?

print help message


This distribution comes with a sample dataset in the datasets/soybean directory. Data exemplars are in data and a single test exemplar is in test. The files are in the commas format. The following two commands are equivalent and will analyze the test exemplar and output a summary of gang effects to gang.txt:

    analogize --exemplars datasets/soybean/data --test datasets/soybean/test --format commas --print gang_summary > gang.txt

    analogize --project datasets/soybean --format commas --print gang_summary > gang.txt

The resulting files are best viewed in a text editor with word wrap turned off.
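To keep reports apart, each one can be written to its own file. The loop below is a sketch under a few assumptions: the `DRY_RUN` variable and the `<report>.txt` output names are choices made for illustration, not features of analogize. By default it only prints the commands; setting `DRY_RUN=0` runs them, which requires analogize to be installed and on the PATH.

```shell
# DRY_RUN is a convention of this sketch only: 1 prints the commands,
# 0 actually runs them (analogize must then be available on the PATH).
DRY_RUN=${DRY_RUN:-1}

for report in gang_summary gang_detailed; do
    out="$report.txt"   # one output file per report (naming is arbitrary)
    if [ "$DRY_RUN" = 1 ]; then
        echo "analogize --project datasets/soybean --format commas --print $report > $out"
    else
        analogize --project datasets/soybean --format commas --print "$report" > "$out"
    fi
done
```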


Theron Stanford, Nathan Glenn


This software is copyright (c) 2021 by Royal Skousen.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.