Just finished work on a Naive Bayes classifier in Python. I was interested in benchmarking Python performance with large data sets. I also had the chance to get to know more about Cython. Indeed, compiling to a C extension with it did increase performance.
This project all started from my own implementation in PHP here. As it turns out, as of version 1.0.4 of this library, the PHP implementation is faster than the Python one. There are differences between the two, though.
The Python redis module available on PyPI is not compiled as a C extension, while its PHP counterpart definitely is. So I suspect the bottleneck here is the Redis client. Expect some more enhancements around the Redis client in future versions.
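If you want to check a suspicion like this on your own machine, a rough sketch along these lines might help. It is not code from the library: `FakeStore`, `train` and `profile_training` are hypothetical names made up for illustration, and the `time.sleep` call only stands in for a client round trip. The idea is to run the training loop under the standard library's `cProfile` and see which calls dominate the cumulative time.

```python
import cProfile
import io
import pstats
import time


class FakeStore:
    """Hypothetical stand-in for a Redis client; sleeps to mimic a slow round trip."""

    def __init__(self, delay=0.001):
        self.delay = delay
        self.counts = {}

    def incr(self, key):
        # Pretend every increment costs one network round trip.
        time.sleep(self.delay)
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]


def train(store, category, text):
    """Toy training step: one counter increment per lowercased word."""
    for word in text.lower().split():
        store.incr("%s:%s" % (category, word))


def profile_training(store, samples):
    """Run train() over the samples under cProfile and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    for category, text in samples:
        train(store, category, text)
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(10)
    return out.getvalue()


if __name__ == "__main__":
    samples = [("good", "tall handsome rich"), ("bad", "short poor rude")] * 20
    print(profile_training(FakeStore(), samples))
```

With the fake store, the sleep inside `incr` should dominate the report. Swapping in a real client object that exposes an `incr` method would show whether the same holds for the actual Redis calls.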
Long story short, why not give it a go at https://github.com/tistaharahap/python-bayes-redis. I would love feedback on how to further optimize the code. I'm still very fresh with Python at the moment.