Notes about the Keras deep learning framework

Posted on Mon 27 June 2016 in Notes

Short introduction to Keras

Keras can be thought of as 4 steps

  • Prepare input and output tensors
  • Create first layer (input)
  • Create output layer
  • Build any model in between

Everything is a layer!

Layers are minimally defined as output dimensions (input dimensions are optional, typically only required for first layer)

return_sequences: if True, the layer outputs a whole sequence, which can be fed to another RNN layer. If False, feed the output to fully connected layers.

Even dropout is a layer. This makes sense, as dropout can be seen as a random matrix of ones and zeros that is multiplied element-wise with the inputs.
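The "random matrix of 1s and 0s" view of dropout can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how Keras implements its Dropout layer; the function name dropout and its parameters are made up for this example, and the rescaling of kept values that real implementations usually apply is left out.

```python
import random


def dropout(inputs, drop_prob, rng=random):
    """Zero each input independently with probability drop_prob."""
    # One 0/1 entry per input: 0 drops the value, 1 keeps it.
    mask = [0 if rng.random() < drop_prob else 1 for _ in inputs]
    # Element-wise multiplication of the inputs with the mask.
    return [x * m for x, m in zip(inputs, mask)], mask


rng = random.Random(0)  # fixed seed so the run is repeatable
values = [0.5, 1.5, 2.0, 3.0]
dropped, mask = dropout(values, drop_prob=0.5, rng=rng)
print(mask)
print(dropped)
```

Each run with a different seed gives a different mask, which is the whole point: the network cannot rely on any single input being present.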

Models

Instantiate a model:

model = Sequential()

Expand a model:

model.add([layer type])

Check a model:

model.summary()

A neural net is implemented as a model

  • Layers are all contained within a model

Sequential model

  • Regular run-of-the-mill NN
  • Set up input and output layers
  • one layer feeds into the next
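The "one layer feeds into the next" idea can be shown without Keras at all. The sketch below is a toy stand-in: TinySequential, scale and relu are hypothetical names invented for this example, and nothing here is the Keras API or does any training.

```python
class TinySequential:
    """Hypothetical stand-in for a sequential model; not the Keras class."""

    def __init__(self):
        self.layers = []

    def add(self, layer):
        # Layers are kept in the order they are added.
        self.layers.append(layer)

    def predict(self, x):
        # The output of each layer becomes the input of the next one.
        for layer in self.layers:
            x = layer(x)
        return x


def scale(factor):
    """A toy 'layer' that multiplies every input by a constant."""
    return lambda xs: [factor * v for v in xs]


def relu(xs):
    """A toy activation 'layer' that clips negative values to zero."""
    return [max(0.0, v) for v in xs]


model = TinySequential()
model.add(scale(2.0))
model.add(relu)
model.add(scale(0.5))
print(model.predict([-1.0, 3.0]))  # prints [0.0, 3.0]
```

A graph-style model would differ only in that one layer's output could be handed to several downstream layers instead of exactly one.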

Graph

  • One layer can split into several layers

model.compile() sets up your model, specifying which loss function and optimizer will be used. This compiles the model into machine code via Theano or TensorFlow.

model.fit() is the training function.


Hiragana <-> katakana transliteration in 4 lines of Python

Posted on Thu 26 May 2016 in Notes

This is a quick script that does proper hiragana <-> katakana conversion in just 4 lines of Python.

This code will make it easy to convert かたかな to カタカナ and ヒラガナ to ひらがな without any dependencies. It even handles mixed script correctly.

If you don't need romaji transliteration and want to reduce your script's dependencies, you can forgo pip installing some surprisingly large libraries just to convert from hiragana to katakana. Simply copy and paste the 4 lines below (and preferably a link to my homepage or GitHub) and you are good to go.

Tested in Python 3.x; it doesn't seem to work in Python 2.7.

Download it off my github here

How it works

I use the built-in string method translate, which converts characters to corresponding characters in a translation table. The table is easily created with another string method, maketrans. See documentation here

We simply create our hiragana and katakana translation tables and use the str.translate() function to do the heavy lifting.
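As a side note, str.translate() also accepts a plain dict that maps code points to code points, so a table can be built from Unicode ranges instead of literal strings. The sketch below is an alternative illustration, not the approach used in this post; the names OFFSET, hira_to_kata and kata_to_hira are made up here, and the range only covers ぁ..ゖ (it skips the iteration marks ゝゞ that the charts in this post include).

```python
# Hiragana ぁ..ゖ occupy U+3041..U+3096; the matching katakana ァ..ヶ
# sit exactly 0x60 code points higher, at U+30A1..U+30F6.
OFFSET = 0x60
hira_to_kata = {cp: cp + OFFSET for cp in range(0x3041, 0x3097)}
kata_to_hira = {cp + OFFSET: cp for cp in range(0x3041, 0x3097)}

# Characters that are not keys in the table are passed through unchanged.
print("ひらがな".translate(hira_to_kata))  # prints ヒラガナ
print("コーヒー".translate(kata_to_hira))  # prints こーひー
```

Because ー lies outside the mapped range, it survives the conversion in both directions.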

I've used Mark Rogoyski's list of hiragana and katakana Unicode code points and removed characters I don't want transliterated. For example, I want to be able to convert コーヒ to hiragana and back. If I had naively used the table, then would be converted into , which wouldn't make any sense.

The magic happens in these 4 lines of code:

katakana_chart = "ァアィイゥウェエォオカガキギクグケゲコゴサザシジスズセゼソゾタダチヂッツヅテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモャヤュユョヨラリルレロヮワヰヱヲンヴヵヶヽヾ"
hiragana_chart = "ぁあぃいぅうぇえぉおかがきぎくぐけげこごさざしじすずせぜそぞただちぢっつづてでとどなにぬねのはばぱひびぴふぶぷへべぺほぼぽまみむめもゃやゅゆょよらりるれろゎわゐゑをんゔゕゖゝゞ" 
hir2kat = str.maketrans(hiragana_chart, katakana_chart)
kat2hir = str.maketrans(katakana_chart, hiragana_chart)

And it is used like so:

mixed = 'きゃりーぱみゅぱみゅは日本の歌手です。'
print(mixed.translate(hir2kat))
# out: キャリーパミュパミュハ日本ノ歌手デス。

# transliterate back and forth
print(mixed.translate(hir2kat).translate(kat2hir))
# out: きゃりーぱみゅぱみゅは日本の歌手です。

Notice how kanji and special characters are left alone.


PyCNN install notes (for OSX, but also in general)

Posted on Tue 19 April 2016 in Notes

The guide available at https://github.com/clab/cnn/blob/master/INSTALL.md will mostly get you through installing PyCNN with CNN and Eigen (necessary for CNN).

Here are a few notes:

  • The PyCNN install guide recommends the latest stable version of Eigen, while the README file for CNN recommends the latest development build. You might have to try both. Either can be installed from http://eigen.tuxfamily.org/ (just unzip it, move it to the cnn folder, and rename it to eigen; then you don't have to mess with Mercurial/hg)
  • The PyCNN install guide hints that you should use Python 2.x. This is not just a hint: Python 2.7 is required for PyCNN
  • You can add extra compiler flags in setup.py. On OSX you may need to include the following extra_compile_args (remember the trailing comma; these are function arguments!): ['-mmacosx-version-min=10.7', '-std=c++11', '-stdlib=libc++'],

  • You might need to do things several times. Be patient and be stubborn!
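To make the compiler-flag note above concrete, here is a rough sketch of how those flags could sit among other keyword arguments in setup.py. The three flag values are the ones quoted above; the variable name osx_extension_options and the language="c++" entry are assumptions for illustration, and the real setup.py in the PyCNN repository may be laid out differently.

```python
# Hypothetical keyword arguments for the extension definition in setup.py.
# Only the three flags come from the note above; everything else is assumed.
osx_extension_options = dict(
    language="c++",
    extra_compile_args=[
        "-mmacosx-version-min=10.7",
        "-std=c++11",
        "-stdlib=libc++",
    ],  # trailing comma: this is one argument among several
)

# Show the flags as they would appear on the compiler command line.
print(" ".join(osx_extension_options["extra_compile_args"]))
```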


Fix "ValueError: unknown locale: UTF-8" under Mac OS X

Posted on Mon 04 April 2016 in Notes

This is a problem I've been having after switching from OS X's default bash to oh-my-zsh.

When importing things like matplotlib in Python I get the following error:

ValueError: unknown locale: UTF-8

The problem is that the locale has not been set and UTF-8 is not a valid locale, as it is only an encoding.

In bash or zsh run

$ locale

If the output looks like this:

LANG=
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=

then you are in trouble. You want it to look something like the following if you are using a US locale:

LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"

If you browse around the web for a solution you will be told to add

export LANG=en_US.UTF-8
export LC_ALL=en_US.UTF-8

to all sorts of places: ~/.bash, ~/.profile, /etc/.profile, and the list goes on. If you are running oh-my-zsh, you need to edit ~/.zshrc, add the above two lines, and restart your terminal and Python as well.
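To check from inside Python whether the environment still carries a problematic value, a small heuristic like the one below may help. It assumes the error is triggered by a variable holding a bare encoding name such as UTF-8; the function name suspicious_locale_vars, the set of variables checked and the regular expression are all choices made for this sketch, not part of Python's locale module.

```python
import os
import re

# Heuristic for names such as en_US.UTF-8 or da_DK; "C" and "POSIX" are also fine.
LOCALE_NAME = re.compile(r"^[a-z]{2,3}_[A-Z]{2}(\.[A-Za-z0-9-]+)?(@\w+)?$")
LOCALE_VARS = ("LANG", "LC_ALL", "LC_CTYPE")


def suspicious_locale_vars(env=None):
    """Return the checked variables whose value does not look like a locale name."""
    env = os.environ if env is None else env
    bad = {}
    for name in LOCALE_VARS:
        value = env.get(name, "")
        # Unset or empty variables are skipped; only odd-looking values are reported.
        if value and value not in ("C", "POSIX") and not LOCALE_NAME.match(value):
            bad[name] = value
    return bad


# An empty dict means none of the checked variables looks wrong.
print(suspicious_locale_vars())
```

The pattern is deliberately simple and may flag unusual but valid names, so treat a non-empty result as a hint to look closer rather than as proof.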


Installing Tensorflow in Python 3.5 with Anaconda

Posted on Fri 29 January 2016 in Notes

As of TensorFlow 0.6, Python 3.3+ is finally supported.

However, if you are trying to install TensorFlow into an Anaconda install with conda, you may come across this command that is floating around the web:

# Old tensorflow version
$ conda install -c https://conda.anaconda.org/jjhelmus tensorflow

However, this is a packaged version of TensorFlow 0.5, and it won't run on Python 3.3+.

Instead you can install via pip into your Anaconda installation. Activate the environment you want to install into, or just install into root, and then use the following commands:

$ sudo easy_install --upgrade six
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.0-py3-none-any.whl

In the official documentation [1] they link to ...tensorflow-0.5.0-py2-none-any.whl, which is an older version of TensorFlow for Python 2.x. At the time of writing the newest version is 0.7, so I just manually updated the numbers in the URL to fetch and install the version I want.
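The "edit the numbers in the URL" trick amounts to filling in two fields of the wheel file name. The sketch below just spells that pattern out; the helper wheel_url and the constant BASE_URL are made up for this example, and only the 0.7.0/py3 combination shown in this post is known to exist, so any other combination is a guess that may well return a 404.

```python
# Base of the download URL used in the pip command above.
BASE_URL = "https://storage.googleapis.com/tensorflow/mac"


def wheel_url(version, python_tag):
    """Build a wheel URL following the file-name pattern seen in this post."""
    return "{}/tensorflow-{}-{}-none-any.whl".format(BASE_URL, version, python_tag)


print(wheel_url("0.7.0", "py3"))
# prints https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.0-py3-none-any.whl
```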

On OSX El Capitan with Anaconda 2.5, the above does throw an exception, but it happens late enough in the install script that TensorFlow is installed anyway. I suppose this is part of working with pre-1.0 software.

Before installing TensorFlow, consider updating your Anaconda installation by running:

$ conda update anaconda

[1] https://www.tensorflow.org/versions/0.6.0/get_started/os_setup.html#pip_install