For the past month, I’ve been happily hacking on my Cubieboard to try out several different things. This time, I wanna know how performant the Cubieboard is. The benchmarks are set up to resemble a real web application.
Preparing
Here are the specs for my Cubieboard:
- AllWinner A10 ARM Single Core CPU
- 1 GB DDR3 @ 480 MHz
- 5V / 2A = 10 Watts
- SATA HD - 5400 RPM
- Cubian r7 - http://cubian.org
- 2 GB Swapfile (SATA HD)
- US$ 49
I’ve written a tutorial on how to install Cubian to a SATA HD here. The main reason why I’m doing a SATA HD install for this is to preserve my micro-SD’s lifetime. We’re gonna be compiling and doing I/O intensive tasks so it’s best to keep that load off the micro-SD card.
Besides preserving the micro-SD, these cheap ARM boards have limited I/O performance, so a 5400 RPM SATA HD should do better than the card. Ideally an SSD should be used, since the Cubieboard is pitted against SSD-equipped opponents, but I don’t have a spare SSD so this will have to do.
Benchmarking
The Cubieboard is pitted against my Macbook Pro and Digital Ocean’s lowest spec VM.
The Macbook Pro shows how the Cubieboard performs compared with a typical development machine, while the Digital Ocean VM represents a real-world server.
DISCLAIMER: This is not an apples-to-apples comparison. Please keep in mind that there are gaps in specs between the 3 systems. The benchmarks presented here are good for reference only.
Macbook Pro
- Early 2011
- Dual Core i5 @ 2.3 GHz
- 8 GB DDR3 @ 1333 MHz
- SSD HD
- OS X Mavericks 10.9.1
- US$ 1199
Digital Ocean
- Single Core
- 512 MB Memory
- SSD HD
- Ubuntu 12.04 LTS
- 2 GB Swapfile
- US$ 5/month
Compiler
We need to install a build system so that the Cubieboard will be able to do the benchmarks.
$ sudo -i
$ apt-get install -y build-essential git gcc-arm-linux-gnueabihf python-dev
Redis
Surprisingly the Cubieboard compiled Redis successfully! I wasn’t expecting much from this piece of hardware, but then it’s not just a toy, it’s a full-blown computer with a very small form factor.
Compile, Configure & Install
For this blog post, I’m compiling Redis 2.8.4. Let’s go ahead.
$ wget http://download.redis.io/releases/redis-2.8.4.tar.gz
$ tar xfz redis-2.8.4.tar.gz
$ cd redis-2.8.4
$ make
$ mkdir -p /usr/local/redis/bin
$ mkdir -p /usr/local/redis/conf
$ cp src/redis-{benchmark,check-aof,check-dump,cli,sentinel,server} /usr/local/redis/bin
$ cp redis.conf /usr/local/redis/conf
$ ln -s /usr/local/redis/bin/redis-{benchmark,check-aof,check-dump,cli,sentinel,server} /usr/local/bin/
$ ln -s /usr/local/redis/conf/redis.conf /etc/redis.conf
Now that it’s compiled and installed appropriately, let’s go ahead and change some configurations to our needs.
$ vim /etc/redis.conf
Edit the file so that it reflects the values below and leave the rest at their defaults.
daemonize yes
bind 127.0.0.1
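As a quick sanity check on that edit, here is a minimal Python sketch that reads directives out of redis.conf-style text and confirms the two values are set. The helper name parse_redis_conf and the sample text are my own illustration, not part of Redis, and the sketch assumes plain "directive value" lines with # comments.

```python
def parse_redis_conf(text):
    """Return a dict of directive -> value from redis.conf-style text.

    Assumption: one "directive value" pair per line. Directives that may
    repeat (for example "save") keep only their last value here.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        directive, _, value = line.partition(" ")
        settings[directive] = value.strip()
    return settings


# Illustrative excerpt only, not the full default redis.conf.
sample_conf = """
# illustrative excerpt only
daemonize yes
bind 127.0.0.1
port 6379
"""

conf = parse_redis_conf(sample_conf)
print(conf["daemonize"], conf["bind"])
```

To check the real file, pass in open("/etc/redis.conf").read() instead of the sample text.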
Now let’s start the server.
$ redis-server /etc/redis.conf
redis-benchmark
redis-benchmark comes with the Redis we just compiled. It benchmarks throughput for various Redis commands. The ones that I’m particularly interested in are GET, SET and INCR. Those 3 are the commands I would normally use within a web application.
For this benchmark, I’m gonna test using 1, 2, 3, 4, 8 and 20 concurrent connections. The results are CSVs.
$ redis-benchmark -h 127.0.0.1 -p 6379 -n 1000 --csv -c 1 > reds-bench-1.csv
$ redis-benchmark -h 127.0.0.1 -p 6379 -n 1000 --csv -c 2 > reds-bench-2.csv
$ redis-benchmark -h 127.0.0.1 -p 6379 -n 1000 --csv -c 3 > reds-bench-3.csv
$ redis-benchmark -h 127.0.0.1 -p 6379 -n 1000 --csv -c 4 > reds-bench-4.csv
$ redis-benchmark -h 127.0.0.1 -p 6379 -n 1000 --csv -c 8 > reds-bench-8.csv
$ redis-benchmark -h 127.0.0.1 -p 6379 -n 1000 --csv -c 20 > reds-bench-20.csv
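To pull just the GET, SET and INCR numbers out of those CSV files, something like the Python sketch below could work. It assumes each row is a quoted test name followed by requests per second, which is how I understand the --csv output of this Redis version. The function name read_throughput is mine, and the sample numbers are made up purely to show the shape of the data.

```python
import csv
import io


def read_throughput(csv_text, commands=("GET", "SET", "INCR")):
    """Map command name -> requests per second for the commands of interest."""
    results = {}
    for row in csv.reader(io.StringIO(csv_text)):
        # Assumed row shape: "TEST_NAME","requests per second"
        if len(row) >= 2 and row[0] in commands:
            results[row[0]] = float(row[1])
    return results


# Made-up numbers, only to show the shape of the data.
sample = (
    '"PING_INLINE","1000.00"\n'
    '"SET","2000.50"\n'
    '"GET","2100.25"\n'
    '"INCR","1900.75"\n'
)

throughput = read_throughput(sample)
print(throughput)
```

For a real run, read one of the files produced above, for example read_throughput(open("reds-bench-1.csv").read()).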
Results
Cubieboard
The higher the better.
Macbook Pro
The higher the better.
Digital Ocean
The higher the better.
Full Results CSV - Pending Upload
Conclusion
This benchmark stresses the raw CPU power of the device and its memory bandwidth.
The Cubieboard is obviously the underachiever, which is expected. However, with its very low power usage, I believe ARM processors do have a market in low-end servers.
The most disappointing performer here, I think, is Digital Ocean. It performs at around 60% of my Macbook Pro and is only roughly 3 times as fast as my Cubieboard.
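Those two ratios also imply how the Cubieboard compares with the Macbook Pro directly. A rough calculation, using only the approximate figures quoted in this section rather than measured values:

```python
# Approximate ratios quoted in the text, not measured values.
do_vs_mbp = 0.60    # Digital Ocean at about 60% of the Macbook Pro
do_vs_cubie = 3.0   # Digital Ocean about 3 times as fast as the Cubieboard

# If DO = 0.6 * MBP and DO = 3 * Cubieboard, then Cubieboard = (0.6 / 3) * MBP.
cubie_vs_mbp = do_vs_mbp / do_vs_cubie
print(f"Cubieboard is at roughly {cubie_vs_mbp:.0%} of the Macbook Pro")
```

So on this benchmark the Cubieboard lands at about a fifth of the Macbook Pro’s throughput.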
BayesRedis
This is a small Python library I developed to train and classify sets of text. It is available on Github here. Installation on the Cubieboard is a bit tricky because Cubian doesn’t come with pip by default. We will have to install it manually.
Installation
On your Cubieboard, follow the steps below. On other platforms you can skip them if you already have pip installed.
$ cd /usr/local/src
$ wget https://pypi.python.org/packages/source/s/setuptools/setuptools-2.1.tar.gz#md5=2044725530450d0517393882dc4b7508
$ tar xfz setuptools-2.1.tar.gz
$ cd setuptools-2.1
$ python setup.py install
$ cd ..
$ wget https://pypi.python.org/packages/source/p/pip/pip-1.5.tar.gz#md5=6969b8a8adc4c7f7c5eb1707118f0686
$ tar xfz pip-1.5.tar.gz
$ cd pip-1.5
$ python setup.py install
BayesRedis is written in Python but upon installation it will be compiled natively. Here are the steps.
$ pip install redis hiredis bayesredis
If you take a look at the Github repo, there’s a test.py file in the root directory. I’ve made a Github Gist of the file so let’s go ahead and customize it to our needs.
$ wget https://gist.github.com/tistaharahap/8446592/raw/ac7350b7e7e17c07c7beff89affd7c3766633077/test.py
$ vim test.py
As you can see, the training examples are all commented out. For our first run we must uncomment them by removing the triple single quotes ''' before and after the examples.
Benchmarking
Now let’s run the benchmark.
$ redis-cli flushdb # This deletes every key in the currently selected Redis database, use with care
$ python test.py
Results
The lower the better.
Looking at the chart above, the Cubieboard is definitely miles behind. This particular benchmark stresses the memory bandwidth of the system, and wider memory bandwidth is key to doing well in it.
This kind of workload is run by servers around the world every day. The benchmark is synthetic, but it gives an idea of how real-world use would exercise the hardware underneath.
A more familiar example would be recommending items for you to purchase by analyzing your purchase history.
Judging by the result of this benchmark, the Cubieboard is not ready for this kind of real-world use. It is practically limited by its memory bandwidth.
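To give a feel for the kind of work a train-and-classify benchmark does, here is a minimal naive Bayes text classifier in plain Python. This is only a sketch under assumptions: it takes the library’s name to mean a naive Bayes approach, it keeps its counters in memory instead of Redis, and the TinyBayes class and its methods are hypothetical names of mine, not BayesRedis’s actual API.

```python
import math
from collections import Counter, defaultdict


class TinyBayes:
    """Hypothetical in-memory naive Bayes classifier (not the BayesRedis API)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> training texts seen

    def train(self, category, text):
        self.doc_counts[category] += 1
        self.word_counts[category].update(text.lower().split())

    def classify(self, text):
        # Vocabulary across all categories, used for add-one smoothing.
        vocab = {w for counts in self.word_counts.values() for w in counts}
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for category, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods per word.
            score = math.log(self.doc_counts[category] / total_docs)
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            scores[category] = score
        return max(scores, key=scores.get)


classifier = TinyBayes()
classifier.train("good", "great fast board")
classifier.train("bad", "slow terrible board")
print(classifier.classify("fast great"))
```

Every training and classification step boils down to lots of small counter reads and writes, which is why a workload like this leans on memory rather than raw CPU.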
nginx - Static Files
nginx is on the rise right now. It’s steadily taking over web servers around the world. The most obvious reason why nginx is so successful is simply that it’s lightning fast.
Just by reverse proxying traffic with nginx in front, I have seen a performance increase of at least 30%. Handling requests asynchronously with a very lightweight memory footprint makes nginx the de facto choice for performance-hungry websites.
Compile, Configure & Run
For this benchmark we only want to test synthetic raw performance. In the real world many factors influence a web server’s perceived speed, with network latency as the usual suspect.
Benchmarks will be executed from another machine on the same network.
Compile
Our nginx installation is gonna be located at /usr/local/nginx. The only dependency we’re gonna need is libpcre3-dev.
$ cd /usr/local/src
$ wget http://nginx.org/download/nginx-1.4.4.tar.gz
$ tar xfz nginx-1.4.4.tar.gz
$ cd nginx-1.4.4
$ apt-get install -y libpcre3 libpcre3-dev
$ ./configure --prefix=/usr/local/nginx
$ make && make install
Configure & Run
We’re gonna benchmark using only the default parameters of nginx.conf. In addition, I want to benchmark image serving.
$ /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
$ cd /usr/local/nginx/html
$ wget http://nginx.org/nginx.gif
Check that nginx is running by opening a web browser and typing in the IP of your Cubieboard.
Benchmark
This benchmark will be run with 2 applications: ab (ApacheBench) and wrk.
HTML
$ ab -n 10000 -c 10 http://192.168.1.134/
$ ab -n 10000 -c 100 http://192.168.1.134/
$ ab -n 10000 -c 250 http://192.168.1.134/
$ ab -n 10000 -c 500 http://192.168.1.134/
$ ab -n 10000 -c 1000 http://192.168.1.134/
$ wrk -r 10000 -t 1 -c 10 http://192.168.1.134/
$ wrk -r 10000 -t 1 -c 100 http://192.168.1.134/
$ wrk -r 10000 -t 1 -c 250 http://192.168.1.134/
$ wrk -r 10000 -t 1 -c 500 http://192.168.1.134/
$ wrk -r 10000 -t 1 -c 1000 http://192.168.1.134/
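If you save the text report that ab prints, the two numbers that matter most here (throughput and failed requests) can be pulled out with a few lines of Python. This is a rough sketch, not how the charts in this post were produced. It relies on the "Requests per second:" and "Failed requests:" lines of ab’s report, the function name parse_ab_output is my own, and the sample report is a made-up excerpt.

```python
import re


def parse_ab_output(text):
    """Pull requests/sec and the failed-request count out of ab's text report."""
    rps = re.search(r"Requests per second:\s+([\d.]+)", text)
    failed = re.search(r"Failed requests:\s+(\d+)", text)
    return {
        "requests_per_second": float(rps.group(1)) if rps else None,
        "failed_requests": int(failed.group(1)) if failed else None,
    }


# Made-up excerpt in the shape of an ab report, for illustration only.
sample_report = """
Complete requests:      10000
Failed requests:        12
Requests per second:    1234.56 [#/sec] (mean)
"""

summary = parse_ab_output(sample_report)
print(summary)
```

Redirecting each ab run to its own file makes it easy to loop over the concurrency levels and collect one summary per run.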
Results
The higher the better.
The higher the better.
The higher the better.
Image
$ ab -n 10000 -c 10 http://192.168.1.134/nginx.gif
$ ab -n 10000 -c 100 http://192.168.1.134/nginx.gif
$ ab -n 10000 -c 250 http://192.168.1.134/nginx.gif
$ ab -n 10000 -c 500 http://192.168.1.134/nginx.gif
$ ab -n 10000 -c 1000 http://192.168.1.134/nginx.gif
$ wrk -r 10000 -t 1 -c 10 http://192.168.1.134/nginx.gif
$ wrk -r 10000 -t 1 -c 100 http://192.168.1.134/nginx.gif
$ wrk -r 10000 -t 1 -c 250 http://192.168.1.134/nginx.gif
$ wrk -r 10000 -t 1 -c 500 http://192.168.1.134/nginx.gif
$ wrk -r 10000 -t 1 -c 1000 http://192.168.1.134/nginx.gif
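Typing out twenty near-identical commands gets old fast, so here is a small Python sketch that generates the same test matrix for both the HTML page and the image. The flags are copied verbatim from the commands in this post, so check them against the ab and wrk versions you have installed. The function name build_commands is mine.

```python
HOST = "http://192.168.1.134"            # the Cubieboard's address used in this post
PATHS = ["/", "/nginx.gif"]              # HTML page and image
CONCURRENCY = [10, 100, 250, 500, 1000]  # concurrent connections per run
REQUESTS = 10000                         # requests per run


def build_commands():
    """Return the ab and wrk command lines for every path and concurrency level."""
    commands = []
    for path in PATHS:
        for c in CONCURRENCY:
            commands.append(f"ab -n {REQUESTS} -c {c} {HOST}{path}")
        for c in CONCURRENCY:
            commands.append(f"wrk -r {REQUESTS} -t 1 -c {c} {HOST}{path}")
    return commands


commands = build_commands()
for command in commands:
    print(command)
```

The printed lines can be pasted into a shell on the machine doing the benchmarking, or run from Python if you prefer to capture the output per run.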
Results
The higher the better.
The higher the better.
The higher the better.
Cubieboard as a Web Server
The results are in and the Cubieboard proved to be a resilient piece of hardware. It is outperformed in every test by its opponents, but surprisingly, when compared with my Macbook Pro, the Cubieboard proves to be the more reliable machine, with fewer HTTP response errors.
Lack of OS tuning is one reason why my Macbook Pro is spitting out lots of HTTP response errors. I tested the Macbook Pro and the other opponents as is, without any OS tuning.
In this benchmark my Macbook Pro gives the highest throughput of the three up until 250 concurrent connections, where HTTP response errors start building up.
The Digital Ocean server held up very well, only spitting out HTTP response errors at 1000 concurrent connections. It’s definitely a web server.
Conclusion
The Cubieboard is great at handling CPU intensive tasks. Not so great when memory is vital to performance. I can see myself rigging a cluster of Cubieboards to do data science, and doing it cheap with only 10 watts of power usage and a US$ 49 price tag per board.
A typical unbranded server here in Indonesia can cost upwards of US$ 1000, which is the price of about 20 Cubieboards. Once you add the electricity cost of, let’s say, a 450 watt PSU, monthly usage is more cost efficient with a Cubieboard cluster.
To be fair, a Cubieboard configured as a server would cost at least US$ 49 + US$ 50 for a 5400 RPM SATA HD. If an SSD is preferred, it could run to US$ 165 for a 64GB SSD, totalling US$ 214 per rig.
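The back-of-the-envelope numbers above can be worked through in a few lines of Python. This is a rough sketch using only the prices and wattages quoted in this post. It treats the 450 watt PSU rating as a worst-case constant draw, ignores the power used by the hard drives, and assumes a 30-day month, so read the output as a ballpark.

```python
BOARD_PRICE = 49     # US$ per Cubieboard
HD_PRICE = 50        # US$ for a 5400 RPM SATA HD
BOARD_WATTS = 10     # 5V / 2A
SERVER_PRICE = 1000  # US$ for an unbranded server
SERVER_WATTS = 450   # PSU rating, treated as worst-case draw (assumption)

boards = SERVER_PRICE // BOARD_PRICE  # bare boards for the price of one server
cluster_watts = boards * BOARD_WATTS  # power draw of that many boards
rig_price = BOARD_PRICE + HD_PRICE    # one board plus its hard drive

HOURS_PER_MONTH = 24 * 30             # assumed 30-day month, running non-stop
cluster_kwh = cluster_watts * HOURS_PER_MONTH / 1000
server_kwh = SERVER_WATTS * HOURS_PER_MONTH / 1000

print(f"{boards} boards, {cluster_watts} W, US$ {rig_price} per HD rig")
print(f"{cluster_kwh} kWh vs {server_kwh} kWh per month")
```

Under these assumptions a 20-board cluster uses less than half the monthly energy of the single server, although adding a drive to every board eats into the price advantage.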
All in all, I am pleased with the benchmark results for my Cubieboard, and I believe that in the future an ARM board such as the Cubieboard will be performant and cost efficient enough to power a data science computing cluster.