Naive Bayes Classifier in Python v1.0.4

I just finished work on a Naive Bayes classifier in Python. I was interested in benchmarking Python's performance with large data sets, and it also gave me the chance to get to know Cython. As a C extension, it did increase performance. This project started from my own implementation in PHP here. As it turns out, the PHP version is faster than the Python one as of version 1.0.4 of this library, but there are differences. [Read More]

Naive Bayes Classifier - Revisited

Over the last week, I've been following up on a side project to do machine learning with Urbanesia's comprehensive data. A lot of late-night reading and fiddling with unfamiliar code were the highlights of the week. I wanted to elaborate on my implementations and on how several technologies affect the benchmarks, particularly classification performance. The repo for the code is on GitHub here. Between the first batch of code and now, I have made a lot of changes to both the code and the data store. [Read More]

Simple Naive Bayes Classifier for PHP

Recently Hacker News has been flooded with articles discussing, or at least mentioning, the Naive Bayes classifier algorithm. It's a basic algorithm that classifies a set of words into a category (set) based on previously learned words and their probabilities. It sounds simple enough, but without an actual technical guide it's far from trivial, since most of the information out there is too messy for newbies like myself. [Read More]
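
To make the idea of "learning words and their probabilities" a bit more concrete, here is a minimal sketch in Python. It is not the code from any of the libraries mentioned in these posts: the class name `TinyNaiveBayes`, the whitespace tokenizer, the add-one (Laplace) smoothing and the use of log probabilities are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Hypothetical illustration only, not the API of the libraries above."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of training texts
        self.vocabulary = set()                  # every word seen during training

    def train(self, text, category):
        # Assumption: lowercase + whitespace split is enough as a tokenizer.
        words = text.lower().split()
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            # Start from the log prior: how common the category is overall.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[category].values())
            for word in words:
                # Add-one smoothing so an unseen word does not zero out the score.
                count = self.word_counts[category][word]
                score += math.log((count + 1) / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


clf = TinyNaiveBayes()
clf.train("cheap pills buy now", "spam")
clf.train("meeting agenda for monday", "ham")
print(clf.classify("buy cheap pills"))
```

Summing logarithms instead of multiplying raw probabilities is a common choice because it avoids numeric underflow on long texts; a real implementation would also want a proper tokenizer and a persistent data store.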