SubString, licensed under the EUPL V.1.1.

The SubString package is an open-source set of Unix shell scripts used for
substring reduction and frequency consolidation of word n-grams of different
lengths. In the process, the frequency of each substring is reduced by the
frequencies of its superstrings, and a consolidated list of n-grams of
different lengths is produced without inflating the overall word count.
The functions performed by SubString will primarily be of interest to
linguists working on formulaic language, multi-word sequences and similar
phraseological phenomena.
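
As a rough illustration of the idea, the following Python sketch reduces the
frequency of each n-gram by the consolidated frequencies of the longer
n-grams that contain it. It is not the SubString implementation (which is
written as shell scripts) and may differ from it in detail; the function name
consolidate, the input format (a mapping from space-separated n-grams to
counts) and the example counts are all assumptions made for this example.

```python
from collections import Counter


def consolidate(freqs):
    """Hypothetical helper: subtract superstring counts from their substrings.

    freqs maps space-separated word n-grams to raw frequencies.
    Returns a dict of consolidated frequencies, dropping n-grams
    whose count falls to zero or below.
    """
    result = Counter(freqs)
    # Visit the longest n-grams first, so that each count is final
    # (already reduced by its own superstrings) before it is subtracted.
    for ngram in sorted(freqs, key=lambda g: len(g.split()), reverse=True):
        count = result[ngram]
        if count <= 0:
            continue
        words = ngram.split()
        n = len(words)
        # Subtract from every shorter contiguous word sequence in the list.
        for size in range(n - 1, 0, -1):
            for start in range(n - size + 1):
                sub = " ".join(words[start:start + size])
                if sub in result:
                    result[sub] -= count
    return {g: c for g, c in result.items() if c > 0}


# Invented example counts, for illustration only.
raw = {
    "at the end of": 5,
    "at the end": 8,
    "the end of": 6,
    "at the": 10,
    "the end": 12,
    "end of": 7,
}
consolidated = consolidate(raw)
```

With these invented counts, "the end" drops from 12 to 3: five of its
occurrences are inside "at the end of", three inside the remaining instances
of "at the end", and one inside the remaining instance of "the end of". Each
occurrence is therefore attributed only to the longest listed n-gram that
contains it.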

SubString is cross-platform and currently available as beta software. It was
tested under MacOS X and Xubuntu Linux, but should work well on any
platform that can run the bash shell.


last update: 2017-06-14