Hi r/linguistics!
I'm a member of the Phonological CorpusTools (PCT) team, and we just released version 1.0, with Windows and Mac binaries available here and the source code (in Python) available here.
PCT runs phonological analyses from the literature on large corpora. You can use it to calculate lexical properties such as the phonotactic probability and neighborhood density of individual words, which have been used in psycholinguistic studies. You can also calculate segment-based properties such as the functional load of two segments in a corpus, or how predictable two segments are from their environments. Finally, you can calculate acoustic similarity between individual wav files in a directory or between groups of wav files in different directories.
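For anyone unfamiliar with these measures, here is a rough sketch of one of them, neighborhood density, in plain Python. This is not PCT's code or API. It assumes the common definition where a word's neighbors are the other words in the lexicon that differ from it by exactly one segment substitution, insertion, or deletion; the function names and the toy lexicon are made up for illustration.

```python
def is_neighbor(a, b):
    """Return True if segment tuples a and b differ by exactly one
    substitution, insertion, or deletion (assumed neighbor definition)."""
    if a == b:
        return False
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    # Lengths differ by one: make `a` the shorter sequence.
    if len(a) > len(b):
        a, b = b, a
    # Deleting one segment from the longer word must yield the shorter one.
    return any(b[:i] + b[i + 1:] == a for i in range(len(b)))


def neighborhood_density(word, lexicon):
    """Count the words in the lexicon that are neighbors of `word`."""
    return sum(is_neighbor(word, other) for other in lexicon)


# Hypothetical toy lexicon; each word is a tuple of segments.
toy_lexicon = [
    ("k", "æ", "t"),
    ("b", "æ", "t"),
    ("k", "æ", "p"),
    ("æ", "t"),
    ("k", "æ", "t", "s"),
    ("d", "ɔ", "g"),
]

density = neighborhood_density(("k", "æ", "t"), toy_lexicon)
print(density)
```

Words are stored as tuples of segments rather than strings so that multi-character transcription symbols count as a single segment.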
We currently support loading corpora from running text files, column-delimited files with headers (like CSV), Praat TextGrids, and the formats used by the Buckeye corpus and TIMIT. You can also download example corpora (a toy corpus and IPHOD) from within the application. Likewise, you can download feature matrices in the style of Chomsky and Halle (SPE) and in the style of Hayes for several transcription systems. Custom column-delimited feature matrices can also be loaded.
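To give a sense of what a column-delimited corpus file with headers might look like, here is a small sketch that reads one using only Python's standard `csv` module. It does not use PCT itself, and the column names (`spelling`, `transcription`, `frequency`) and the `.` separator between segments are assumptions chosen for the example, not requirements of the application.

```python
import csv
import io

# Hypothetical file contents; a real corpus would be read from disk.
sample = io.StringIO(
    "spelling,transcription,frequency\n"
    "cat,k.æ.t,120\n"
    "bat,b.æ.t,45\n"
)


def load_corpus(handle, delimiter=",", segment_separator="."):
    """Read a headered, column-delimited corpus into a list of dicts."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    corpus = []
    for row in reader:
        corpus.append({
            "spelling": row["spelling"],
            # Split the transcription into a tuple of segments.
            "segments": tuple(row["transcription"].split(segment_separator)),
            "frequency": float(row["frequency"]),
        })
    return corpus


corpus = load_corpus(sample)
for entry in corpus:
    print(entry["spelling"], entry["segments"], entry["frequency"])
```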
We hope you find it helpful!