Quote on Color vs. Colour

Posted on Thu 26 March 2015 in Notes

COLOUR and COLOR are phonetically similar and have the same meaning. COLOUR, COLLAR and CARLA are phonetically similar but differ in meaning. Yet meaning is determined by context. We might COLLAR in a picture, using bright KILLERS; or buy a CARLA for our dog.

Context assists with meaningful interpretation of what has been said or read, and we happily substitute 'preferred forms' for words or 'concepts' we consider to be in error. Utter the words out of context, however, and many variant spellings may be offered in return. This is particularly true of names; for example CAREL, CARRELL, CARROL, CURRALL, CURRELL, KAREEL, KAREL, KAROL and KARIEL.

The word 'phonetic' refers to spoken sounds and not to the spelling of words. Different spelling sequences often represent similar sounds; for example, the 'PH' in PHONE is similar to the 'F' in FOAM. What are defined as vowels and consonants in the written alphabet are not consistent with spoken language, and are unlikely to be since language is continually changing (the written form taking longer to change than the spoken). The longer documents and records are accumulated, the harder it becomes to justify changing those alphabets.

T.N. Gadd. Fisching Fore Werds, 1988

I was very impressed with this concise explanation of phonetics and homonyms that Gadd gave when he first introduced Phonix, his improvement to the Soundex algorithm for phonetic retrieval of names.

Check out my implementation of Phonix.



Phonix.py – phonetic name search in Python

Posted on Tue 24 March 2015 in Notes

  • ICL, good morning!

  • Good morning, my name is George Parson. I’m looking for one of your employees, Philip Carlquist from Engsboda.

  • Let me see, Filip Karlkvist you said ... no, I’m sorry I can’t find anybody here with that name. I’ll just look up the location, ... Ängsboda, no, we don’t even seem to have an office there. Are you sure he works for ICL sir?

  • Quite sure. I met him only last week.

  • Well, I’m sorry, he is not in the computer, he must have left the company since.

  • But ...

Looking up names can be difficult, because there are many possible ways to spell otherwise similar-sounding names and places, and we can't expect the user to know of — or even think to try — all possible spellings.

Phonetic search algorithms date back to the early 1900s, when Russell & Odell invented the Soundex algorithm in order to simplify and improve an index "wherein names are to be entered and grouped phonetically rather than in accordance with the alphabetical construction of the names" (Patent, 1918).

Soundex grew to fame when it was later used for a genealogical analysis of the US Census data from 1890-1920. Today, despite its limitations, it is found everywhere from SQL to The Art of Computer Programming.

The algorithm maps every letter in a name into 1 of 8 groups, each represented by a number 0-7. All zeros and consecutive duplicate numbers are pruned, the first letter of the original name is reinstated, and if the result is less than 4 characters long, it is padded with zeros.
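As a concrete sketch, here is a minimal Soundex in Python. It uses the widely cited seven-group letter table (digits 0-6) and omits refinements such as the H/W separator rule, so treat it as an illustration of the coding steps rather than a faithful reproduction of Russell & Odell's patent.

```python
def soundex(name: str) -> str:
    """A minimal Soundex sketch: seven letter groups, no H/W rule."""
    groups = ["AEIOUHWY", "BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"]
    codes = {c: d for d, letters in enumerate(groups) for c in letters}
    name = name.upper()
    digits = [codes[c] for c in name if c in codes]
    # Prune consecutive duplicate digits, then drop the zeros.
    deduped = [d for i, d in enumerate(digits) if i == 0 or d != digits[i - 1]]
    nonzero = [d for d in deduped if d != 0]
    # Reinstate the first letter in place of its own code (if it had one).
    if digits and digits[0] != 0:
        nonzero = nonzero[1:]
    # Pad with zeros and truncate to 4 characters.
    return (name[0] + "".join(map(str, nonzero)) + "000")[:4]

print(soundex("Robert"))  # -> R163
print(soundex("Rupert"))  # -> R163
```

Note how "Robert" and "Rupert" collide on the same code — exactly the grouping behavior a phonetic index wants.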

Not content with the obvious shortcomings of this approach, T. N. Gadd sat down and painstakingly constructed no fewer than 160 rules for how to change the spelling of names to make them more phonetic. He also optimized the grouping and published the resulting algorithm, Phonix, in the third 1988 issue of 'Information Systems', under the catchy title "Fisching Fore Werds" — which, if run through the Phonix algorithm, produces the same result as "Fishing for words".
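The rule-rewriting idea can be sketched in a few lines. The rules below are illustrative examples only, not Gadd's actual rule set — his 160 rules also constrain where in a word a pattern may match (start, middle, or end), while this toy version applies each rule everywhere:

```python
# A few illustrative spelling-to-sound rewrite rules. Phonix applies ~160
# such rules (many restricted to word start/middle/end) before a
# Soundex-style coding step.
RULES = [
    ("sch", "sh"),  # "Fisching" -> "fishing"
    ("ph", "f"),    # "Philip"   -> "filip"
    ("kn", "n"),    # "Knight"   -> "night"
]

def apply_rules(word: str) -> str:
    """Normalize a word's spelling before phonetic coding."""
    word = word.lower()
    for pattern, replacement in RULES:
        word = word.replace(pattern, replacement)
    return word

print(apply_rules("Fisching"))  # -> fishing
```

After this normalization step, variant spellings like "Philip" and "Filip" collapse to the same string and therefore receive the same phonetic code.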

Since the Phonix algorithm is at the heart of several more advanced phonetic search algorithms, and no openly available implementation of it exists, I decided to reimplement it in Python and release it publicly under a BSD license.


On science and engineering

Posted on Fri 27 February 2015 in Notes

Science: If you know what you are doing, you are doing it wrong.

Engineering: If you don't know what you are doing, you are doing it wrong.

Dr. Richard W. Hamming, 1995, "Learning to Learn"

Measuring success of Voice Activity Detection algorithms: HR0 and HR1

Posted on Fri 06 February 2015 in Notes

When measuring the effectiveness of a Voice Activity Detection algorithm (VAD) looking at 0-1 accuracy is rarely enough. We typically also look at Nonspeech Hit Rate (HR0) and Speech Hit Rate (HR1).

  1. HR0 is computed as the ratio of the number of correctly detected nonspeech frames to the number of real nonspeech frames.
  2. HR1 is computed as the ratio of the number of correctly detected speech frames to the number of real speech frames.

Park et al. 2014 [1]

Another way to put it: HR0 and HR1 are the percentages of nonspeech and speech frames, respectively, that are correctly predicted. In Python, this can be calculated in the following way:

import numpy as np
import our_vad_library as VAD  # placeholder for your own VAD module

X = VAD.load_data()
y = VAD.load_targets()

y_hat = VAD.predict(X)

# Find nonspeech and speech hit rates:
index0 = np.where(y == 0)
index1 = np.where(y == 1)

hr0 = (y_hat[index0] == y[index0]).mean()
hr1 = (y_hat[index1] == y[index1]).mean()

First we create two indexes into y using NumPy's where() function. index0 is a vector of all the positions of y that represent a silent frame in our data. Say y = [0,0,0,1,1,0]; then index0 = [0,1,2,5], since y[0] = y[1] = y[2] = y[5] = 0.

This means that

print(y[index0])
# -> [0 0 0 0]

Which in and of itself is not interesting. However, we can use the same index to pull out all the predictions in ŷ and compare them to the ground truth in y:

y_hat[index0] == y[index0]
# -> array([ True,  True, False, ...], dtype=bool)

This gives us a new array of the same dimensions with boolean True or False values. Each True represents a correct prediction and each False an incorrect one. A neat Python trick is that boolean values are treated as 0 and 1, so taking the mean of this boolean result array with .mean() gives the fraction of correct predictions — which is exactly the hit rate.
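Putting the pieces together, here is a self-contained toy example with made-up ground truth and predictions (the arrays are invented purely for illustration):

```python
import numpy as np

# Toy ground truth and predictions (1 = speech frame, 0 = nonspeech frame).
y = np.array([0, 0, 0, 1, 1, 0])
y_hat = np.array([0, 1, 0, 1, 0, 0])

index0 = np.where(y == 0)  # positions of the 4 nonspeech frames
index1 = np.where(y == 1)  # positions of the 2 speech frames

hr0 = (y_hat[index0] == y[index0]).mean()  # 3 of 4 nonspeech frames correct
hr1 = (y_hat[index1] == y[index1]).mean()  # 1 of 2 speech frames correct

print(hr0)  # -> 0.75
print(hr1)  # -> 0.5
```

Reporting HR0 and HR1 separately makes it obvious when a model looks accurate overall only because one class (usually nonspeech) dominates the data.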

[1] Park, Jinsoo, Wooil Kim, David K. Han, and Hanseok Ko. “Voice Activity Detection in Noisy Environments Based on Double-Combined Fourier Transform and Line Fitting.” The Scientific World Journal 2014 (August 6, 2014): e146040. doi:10.1155/2014/146040.


Posted on Mon 26 January 2015 in Notes

If [the curse of dimensionality] problem didn't exist, we would use the nearest neighbour averaging as the sole basis for doing estimation.

Trevor Hastie, from the "Dimensionality and Structured Models" lecture of his Statistical Learning course at Stanford.