Notes on Kyoto Corpus installation and format

Posted on Thu 21 January 2016 in Notes


  • Unpack KyotoCorpus4.0.tar.gz (available here)
  • If you do not have a CD drive, copy mai95.txt from the Mainichi Shinbun 1995 CD-ROM into the KyotoCorpus4.0 directory you just unpacked (USB stick from a friend who has an old PC ...), then run ./auto_conv -d . so that the install script looks for mai95.txt in the current directory.
  • If you have a CD drive, just run ./auto_conv; the install script should find the file on the disc automatically.
  • On Windows, you can install the Kyoto Corpus via Cygwin.
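
The two install paths above can be sketched as a small shell helper. This is only an illustration: auto_conv_args is a hypothetical function name that is not part of the corpus distribution, and the sketch assumes the archive unpacks into a directory called KyotoCorpus4.0.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the Kyoto Corpus): print the arguments
# to pass to ./auto_conv for a given corpus directory.
#   mai95.txt present -> "-d ." (look in the current directory)
#   mai95.txt absent  -> nothing (let the script look on the CD drive)
auto_conv_args() {
    corpus_dir=$1
    if [ -f "$corpus_dir/mai95.txt" ]; then
        printf '%s\n' '-d .'
    else
        printf '\n'
    fi
}

# Typical use, left commented out so the snippet has no side effects:
#   tar -xzf KyotoCorpus4.0.tar.gz      # assumed to create KyotoCorpus4.0/
#   cd KyotoCorpus4.0
#   ./auto_conv $(auto_conv_args .)     # unquoted on purpose: "-d ." must split into two words
```

The helper only decides which form of the command to use; the actual conversion is still done by the auto_conv script that ships with the corpus.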

The install script relies heavily on Perl's encode function, which is deprecated. Expect lots of warnings! The script will probably not run on Perl 6, and it only runs on versions of Perl newer than 5.8 (5.18 on OS X 10.11 works fine!).
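
Before running the install script, it may be worth checking which Perl is on your path. The following is a minimal sketch, assuming "newer than 5.8" is read strictly (5.10 and later pass, 5.8.x does not); perl_version_ok is a hypothetical helper name, not something shipped with the corpus.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a version string such as "5.18.2"
# names a Perl 5 release newer than 5.8 (strict reading, an assumption).
perl_version_ok() {
    major=${1%%.*}      # text before the first dot, e.g. "5"
    rest=${1#*.}        # text after the first dot, e.g. "18.2"
    minor=${rest%%.*}   # e.g. "18"
    [ "$major" -eq 5 ] && [ "$minor" -gt 8 ]
}

# Typical use, left commented out so the snippet runs without Perl installed:
#   if perl_version_ok "$(perl -e 'printf "%vd", $^V')"; then
#       ./auto_conv -d .
#   else
#       echo "This Perl is probably too old or too new for auto_conv" >&2
#   fi
```

The check only looks at the major and minor version numbers, so it says nothing about the deprecation warnings themselves; those will still appear on a Perl that passes.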