Mar16

More performance results...

Posted by jmspaggi on 16/03/13  ~  Posted in: Uncategorized

HBase release 0.94.6 is coming soon. RC2 is out, and I used it to re-run the performance tests.

I have also included all previous HBase releases from 0.90.x.

As you can see in THIS PDF file, while some results are pretty stable, others show good improvements.

All those tests were run on the same computer, the same way and without any other process running, as I explained in the previous post.

Going forward, I will update this PDF file each time a new HBase version comes out.

If you have any questions or need more details, feel free to ask.

Enjoy.

 

Update: Now available with the TRUNK results.

Feb28

HBase ships with a few tools to measure the performance of its operations.

As I discussed in "HBase performances/load tests", one of them is PerformanceEvaluation. To show the progress made over the HBase 0.94.x history, I ran all those tests against all the 0.94.x versions. As you will see below, some big improvements have been made recently! Another reason to migrate to a newer release if you are running a version older than 0.94.3.

You can also download THIS PDF with all the charts. The units are usually lines per second, but I sometimes had to multiply or divide the results by 10 to make them more readable. The idea here is not to see how fast it is, but to compare one release to another. All those tests were run on a dedicated computer where absolutely nothing else was running at the same time. For those who will notice it, the computer was upgraded just before the last test. That's why the sequentialWrite test is way faster than the read one.

I will now run the same tests for the 0.90.x and 0.92.x versions, add the HFilePerformanceEvaluation test too, and then add everything to the results below.

 

[Charts, one per test (images not reproduced here): filterScan, randomRead, randomScan, randomWrite, scanRange10, scanRange100, scanRange1000, sequentialRead, sequentialWrite]

Feb10

Hadoop HDFS-FUSE installation

Posted by jmspaggi on 10/02/13  ~  Posted in: Uncategorized  ~  6 comments

It has been a looooooong day!!!

Today I tried to configure FUSE with Hadoop so that some batch scripts could back up data into HDFS. It took me the entire day to figure out how to make it work! There are many procedures described on the internet, but none of them really worked for me. The solution I built works, even if it's most probably not the most elegant one. These are the notes I took during the installation, which required multiple tries, so I may have missed some steps. You will also probably have to adjust the directories based on your own installation.

So everything started from there: http://wiki.apache.org/hadoop/MountableHDFS

 

Let's see how to adjust that to make it work now.

First, you need to download the Hadoop source files and decompress them locally. Make sure to take the same version as the one you are already using. After many tries, I figured out that some files were not created by the build process, so before starting, simply copy your existing Hadoop installation files into the newly downloaded folder. Basically, it should look like this:
cd ~
svn checkout http://svn.apache.org/repos/asf/hadoop/common/tags/release-1.0.3/ hadoop-common-1.0.3
cp -r hadoop-1.0.3/* hadoop-common-1.0.3/

 

Now, go into your newly created hadoop folder and compile it.
cd hadoop-common-1.0.3
ant compile jar

This will build almost all the files you need and prepare for the next steps. You can start the libhdfs compilation:
ant compile-c++-libhdfs -Dlibhdfs=1

And do the packaging:
ant package

But this will fail... You will see a compilation error for the Gridmix class.

 
[echo] contrib: gridmix
[javac] Compiling 31 source files to /home/hadoop/branch-1.0_0427/build/contrib/gridmix/classes
[javac] /home/hadoop/branch-1.0_0427/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/Gridmix.java:396: error: type argument ? extends T is not within bounds of type-variable E
[javac] private <T> String getEnumValues(Enum<? extends T>[] e) {
[javac] ^
[javac] where T,E are type-variables:
[javac] T extends Object declared in method <T>getEnumValues(Enum<? extends T>[])
[javac] E extends Enum<E> declared in class Enum
[javac] /home/hadoop/branch-1.0_0427/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/Gridmix.java:399: error: type argument ? extends T is not within bounds of type-variable E
[javac] for (Enum<? extends T> v : e) {
[javac] ^
[javac] where T,E are type-variables:
[javac] T extends Object declared in method <T>getEnumValues(Enum<? extends T>[])
[javac] E extends Enum<E> declared in class Enum
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 2 errors

To fix that, you need to apply https://issues.apache.org/jira/browse/HADOOP-8329 manually. Simply edit the Gridmix file, and do the small updates.

Start the packaging again... It fails one more time. This time it's most probably because you did not give it a java5 path. I tried to download a 32-bit Java 5 package but it did not work. So simply give it your regular Java path, like this:
ant package -Djava5.home=/usr/local/jdk1.7.0_05/

But it's failing again! You need to download and install Apache Forrest. And even then, it will still fail! The key is to remove docs and cn-docs from the package target in the build.xml file. Open build.xml with vi, search for 'target name="package"' and, on the same line, remove 'docs, cn-docs'. Then start the packaging again.
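If you prefer not to edit build.xml by hand, the same change can be scripted. This is only a minimal sketch: the function name drop_dependencies is mine, the sample line is a hypothetical excerpt (the real depends list in your build.xml will be longer and may differ), and it assumes the name attribute comes before depends on the target line.

```python
import re

def drop_dependencies(build_xml, target, unwanted):
    """Remove the names in `unwanted` from the depends list of one Ant target.

    Assumes the target tag has the form <target name="..." depends="...">,
    with name appearing before depends.
    """
    pattern = re.compile(
        r'(<target\b[^>]*\bname="' + re.escape(target) + r'"[^>]*\bdepends=")([^"]*)(")'
    )

    def rewrite(match):
        names = [d.strip() for d in match.group(2).split(",")]
        kept = [d for d in names if d and d not in unwanted]
        return match.group(1) + ", ".join(kept) + match.group(3)

    # Only the first matching target is rewritten.
    return pattern.sub(rewrite, build_xml, count=1)

# Hypothetical excerpt, for illustration only.
sample = '<target name="package" depends="compile, jar, docs, cn-docs, examples">'
print(drop_dependencies(sample, "package", {"docs", "cn-docs"}))
```

To apply it for real, read build.xml into a string, pass it through the function and write the result back, after keeping a backup copy of the original file.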

This time it should work. So you can go to the contrib compilation:
ant compile-contrib -Dlibhdfs=1 -Dfusedfs=1

Guess what: it will fail again! It's looking for the .so files. It seems they should have been generated by the first compilation, but they are not. However, you should still have those files from your original installation. So simply link the two together with something like this:
ln -s /home/hadoop/hadoop-1.0.3/c++/Linux-amd64-64/lib /home/hadoop/hadoop-common-1.0.3/build/c++/Linux-amd64-64/lib

This gets you a few steps further, but the same command will fail again! This time, to solve the issue, I copied the libhdfs.so file directly into the JDK lib folder, to make sure it will always be found:

cp c++/Linux-amd64-64/lib/libhdfs.so /usr/local/jdk1.7.0_05//jre/lib/amd64/server/
ant compile-contrib -Dlibhdfs=1 -Dfusedfs=1

Now it's working! To mount hdfs on your system, you will need to type this (-d is for debug. You will be able to remove it later):
./fuse_dfs_wrapper.sh dfs://node3:9000 /hadoop -d

You need to be in the right folder, since fuse_dfs_wrapper.sh searches for some other files. You might face some permission errors, and you might also need to add some libs to your LD_LIBRARY_PATH.

So first, make sure you have the right library path and then the classpath:
export LD_LIBRARY_PATH=/home/hadoop/hadoop-common-1.0.3/c++/Linux-amd64-64/lib/:/usr/local/jdk1.7.0_05/jre/lib/amd64/server/
export CLASSPATH=`/home/hadoop/hadoop-1.0.3/bin/hadoop classpath`
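When the mount still fails, it can help to check which directories of LD_LIBRARY_PATH actually contain the shared libraries. Here is a small sketch for that; find_library is a name I made up, and the two library names checked (libhdfs.so, plus libjvm.so from the JDK server folder) are my assumption about what the wrapper needs, based on the two paths exported above.

```python
import os

def find_library(name, search_path):
    """Return the directories of a path list (':'-separated on Linux) that contain `name`."""
    hits = []
    for directory in search_path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, name)):
            hits.append(directory)
    return hits

# Report where each library is found in the current environment.
path = os.environ.get("LD_LIBRARY_PATH", "")
for lib in ("libhdfs.so", "libjvm.so"):
    found = find_library(lib, path)
    print(lib, "->", found if found else "not found on LD_LIBRARY_PATH")
```

Run it in the same shell where you exported the variables, otherwise it will not see them.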

Then make sure the user you want to give HDFS access to is in the fuse group:
adduser jmspaggi fuse

Then using the standard hadoop tools, create a folder for this user:
bin/hadoop fs -mkdir /user/jmspaggi
bin/hadoop fs -chown jmspaggi /user/jmspaggi

And enjoy!

You can also try to update your /etc/fstab. I have not been able to do that, and I can do without it, so I will not comment on this step here. And as I said, this is really not the cleanest way to get it working. There might be much simpler ways to do the same thing.

Feb07

HBase performances/load tests.

Posted by jmspaggi on 07/02/13  ~  Posted in: Performance testing  ~  3 comments

There are multiple ways to measure HBase performance: tools included in HBase, external tools, or even home-made scripts.

Let's try to list them first.

Tools included in HBase:

  • org.apache.hadoop.hbase.PerformanceEvaluation
  • org.apache.hadoop.hbase.util.LoadTestTool

 

External tools:

  • YCSB

Home-made scripts:

  • DIY

 


If you know other performance/load test tools for HBase, feel free to let me know so I can give them a try and add them to this list.

Why performance tests?

It's important to have a way to measure the performance of your cluster so that you can compare its overall performance with different settings or different hardware. Some clusters are mainly used for writes, some others for reads, and some for both. Having a good baseline of your cluster speed will allow you to decide how to configure it to fit your needs and will show you the performance impact of configuration/hardware changes.

Performance testing is also used in quality control to ensure that newly added features or APIs do not degrade the baseline performance.


 PerformanceEvaluation

This first post is about org.apache.hadoop.hbase.PerformanceEvaluation. The other tools will come in other posts, and a table of contents will link to them directly.

The first and simplest way to measure HBase performance is to use the PerformanceEvaluation tool shipped with HBase in the test jar. Simply download HBase, extract the .tar.gz file, start HBase and run the test. In standalone mode, no configuration or special setting is required. Of course you can still run this test against your own cluster to measure its overall performance. But if you simply want to compare the performance of 2 different computers, then nothing more is required.

You can call this tool with this command line:
bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation

And you will get something like this:

Usage: java org.apache.hadoop.hbase.PerformanceEvaluation \
[--miniCluster] [--nomapred] [--rows=ROWS]

Options:
miniCluster Run the test on an HBaseMiniCluster
nomapred Run multiple clients using threads (rather than use mapreduce)
rows Rows each client runs. Default: One million
flushCommits Used to determine if the test should flush the table. Default: false
writeToWAL Set writeToWAL on puts. Default: True
presplit Create presplit table. Recommended for accurate perf analysis (see guide). Default: disabled

Command:
filterScan Run scan test using a filter to find a specific row based on it's value (make sure to use --rows=20)
randomRead Run random read test
randomSeekScan Run random seek and scan 100 test
randomWrite Run random write test
scan Run scan test (read every row)
scanRange10 Run random seek scan with both start and stop row (max 10 rows)
scanRange100 Run random seek scan with both start and stop row (max 100 rows)
scanRange1000 Run random seek scan with both start and stop row (max 1000 rows)
scanRange10000 Run random seek scan with both start and stop row (max 10000 rows)

sequentialRead Run sequential read test
sequentialWrite Run sequential write test

Args:
nclients Integer. Required. Total number of clients (and HRegionServers)
running: 1
Examples:
To run a single evaluation client:
$ bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1

 

As you can see, PerformanceEvaluation allows you to test the read and write methods under different scenarios. It will start a local MapReduce job to do so, but you can also ask it to use threads instead (--nomapred). You just have to follow the given example to start it. PerformanceEvaluation will create a table called TestTable and use it for its needs.
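If you plan to script many runs, it helps to build the command line in one place. The sketch below only assembles the argument list from the options shown in the usage text (--nomapred, --rows); the helper name pe_command is mine, and placing the options before the command name is my reading of the usage line.

```python
def pe_command(command, nclients, rows=None, nomapred=False):
    """Build the argument list for one PerformanceEvaluation run."""
    argv = ["bin/hbase", "org.apache.hadoop.hbase.PerformanceEvaluation"]
    if nomapred:
        argv.append("--nomapred")        # use threads rather than MapReduce
    if rows is not None:
        argv.append("--rows=%d" % rows)  # rows each client runs
    argv += [command, str(nclients)]
    return argv

print(" ".join(pe_command("sequentialWrite", 1)))
```

The returned list can be handed to subprocess.run(argv, check=True) from the HBase installation folder.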

Here is an extract from the sequentialWrite results with HBase 0.94.0:

13/02/05 22:08:36 INFO hbase.PerformanceEvaluation: Start class org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest at offset 0 for 1048576 rows
13/02/05 22:09:23 INFO hbase.PerformanceEvaluation: 0/104857/1048576
13/02/05 22:11:16 INFO hbase.PerformanceEvaluation: 0/209714/1048576
13/02/05 22:12:33 INFO hbase.PerformanceEvaluation: 0/314571/1048576
13/02/05 22:13:39 INFO hbase.PerformanceEvaluation: 0/419428/1048576
13/02/05 22:15:36 INFO hbase.PerformanceEvaluation: 0/524285/1048576
13/02/05 22:16:30 INFO hbase.PerformanceEvaluation: 0/629142/1048576
13/02/05 22:17:27 INFO hbase.PerformanceEvaluation: 0/733999/1048576
13/02/05 22:18:24 INFO hbase.PerformanceEvaluation: 0/838856/1048576
13/02/05 22:19:20 INFO hbase.PerformanceEvaluation: 0/943713/1048576
13/02/05 22:21:03 INFO hbase.PerformanceEvaluation: 0/1048570/1048576
13/02/05 22:21:03 INFO hbase.PerformanceEvaluation: Finished class org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest in 747156ms at offset 0 for 1048576 rows

How to interpret that? 1048576 is the number of rows the PerformanceEvaluation tool writes for its test. Basically, it's 1024*1024 rows. In the example above, it took 747 seconds to write those 1048576 lines in my standalone test environment. That means performance for this specific test in my environment is 1403 lines per second. Now, to compare the performance of different versions of HBase, or of different settings, you just need to run the exact same test on the same computer. I ran the same test with HBase 0.94.1 and got 1409 lines per second. That's less than 1% difference, so we can say the results are pretty much identical.
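The lines-per-second figure is simply the row count divided by the elapsed time. As a quick sketch, it can be extracted from the "Finished" log line like this (the regular expression and the function name rows_per_second are mine, written against the log format shown above):

```python
import re

FINISHED = re.compile(r"Finished class \S+ in (\d+)ms at offset \d+ for (\d+) rows")

def rows_per_second(log_line):
    """Compute the throughput from a PerformanceEvaluation 'Finished' log line."""
    match = FINISHED.search(log_line)
    if match is None:
        raise ValueError("not a 'Finished' line: %r" % log_line)
    elapsed_ms, rows = int(match.group(1)), int(match.group(2))
    return rows / (elapsed_ms / 1000.0)

line = ("13/02/05 22:21:03 INFO hbase.PerformanceEvaluation: Finished class "
        "org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest "
        "in 747156ms at offset 0 for 1048576 rows")
print(int(rows_per_second(line)))  # 1403
```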

Also, you can't rely on those results if you run the test only once. To be accurate, you need to run it multiple times so you have a strong baseline. What I recommend is to run the test at least 10 times, then remove the fastest run, remove the slowest run, and average the 8 remaining results. That will give you a much more accurate picture of the test you run. But you need to be aware that some of those tests can be long. RandomSeekScanTest took 1h20 on my computer... Running it 10 times will take more than 13h.

Now, I really want to emphasize the "exact same test on the same computer" part... PerformanceEvaluation is a good example of what I mean by "exact same test". As I said previously, the test creates a table named "TestTable". This table is not pre-split, and when running with a single client, nothing is done if the table already exists. Some tests write into it, some read from it. That means each time you use this table, some values are put into the cache. And in the end, you might end up with a big table, with many regions, full of rows added by the write tests. Before each test you start, you need to make sure that your environment is EXACTLY as it was for the previous test... "Exact same test on the same computer"...

Just to illustrate: running SequentialWriteTest 10 times wrote the same 1024*1024 rows into my TestTable 10 times, because I did not remove the table between tests. The table was initially empty. After the first run, there were 1024*1024 rows in the table, with one version of each. The 2nd run wrote another 1024*1024 rows with the same keys, so I had 2 versions of each row, and so on. After the 3rd run, the number of versions goes over the configured default number of versions (3), and some compactions might occur while rows are added in run 4. You will see that in the performance test results below. So to run the tests properly, you need to remove the created data and restart HBase to make sure things are in the same state before each test starts.

Another thing you need to understand about performance tests: the longer the test is, the more accurate the results are. If your test lasts 10 seconds and, unfortunately, a 100ms log rotation takes place just at that time, or a 100ms write to disk occurs, or your daily ntpdate cron entry starts, that disturbance will represent 1% or more of your total test duration! Any activity by the operating system or any other process might impact your results. Now, if you take the same log rotation or ntpdate but dilute it over a 10-minute test, it no longer represents 1%, but something like 0.02%.
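The arithmetic behind this dilution effect is a one-liner. A tiny sketch (the function name is mine, durations are in milliseconds):

```python
def disturbance_share(disturbance_ms, test_duration_ms):
    """Percentage of the test duration taken by one external disturbance."""
    return 100.0 * disturbance_ms / test_duration_ms

print(disturbance_share(100, 10 * 1000))                 # 10-second test
print(round(disturbance_share(100, 10 * 60 * 1000), 3))  # 10-minute test
```

The first call gives 1%, the second one roughly 0.017%, which is the "something like 0.02%" mentioned above.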

 

Here are the results for the same test run on another computer (without cleaning between each run):
13/02/06 20:02:37 INFO ... SequentialWriteTest in 231577ms at offset 0 for 1048576 rows => 4528
13/02/06 20:06:13 INFO ... SequentialWriteTest in 202305ms at offset 0 for 1048576 rows => 5183 (Max)
13/02/06 20:12:04 INFO ... SequentialWriteTest in 261183ms at offset 0 for 1048576 rows => 4015
13/02/06 20:17:58 INFO ... SequentialWriteTest in 336621ms at offset 0 for 1048576 rows => 3155
13/02/06 20:22:53 INFO ... SequentialWriteTest in 239211ms at offset 0 for 1048576 rows => 4383
13/02/06 20:29:39 INFO ... SequentialWriteTest in 398487ms at offset 0 for 1048576 rows => 2631 (Min)
13/02/06 20:35:59 INFO ... SequentialWriteTest in 330099ms at offset 0 for 1048576 rows => 3177
13/02/06 20:42:13 INFO ... SequentialWriteTest in 332859ms at offset 0 for 1048576 rows => 3150
13/02/06 20:47:07 INFO ... SequentialWriteTest in 284928ms at offset 0 for 1048576 rows => 3680
13/02/06 20:54:14 INFO ... SequentialWriteTest in 398410ms at offset 0 for 1048576 rows => 2632

The average, once the Min and Max runs are removed, is 3590; standard deviation is 322 (9%).

For those tests, I did not stop HBase and I did not clear the table between tests. This is mainly why it's getting slower: the table needs to be compacted (due to the number of versions).

Now, here are the results with a cleanup between each test. The command line used for that was:
for i in {1..10}; do rm -rf /tmp/*; bin/start-hbase.sh; sleep 60; bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1; bin/stop-hbase.sh; done

13/02/06 22:31:46 INFO ... SequentialWriteTest in 184663ms at offset 0 for 1048576 rows => 5678
13/02/06 22:36:46 INFO ... SequentialWriteTest in 187087ms at offset 0 for 1048576 rows => 5605
13/02/06 22:41:20 INFO ... SequentialWriteTest in 190496ms at offset 0 for 1048576 rows => 5504
13/02/06 22:45:54 INFO ... SequentialWriteTest in 189181ms at offset 0 for 1048576 rows => 5543
13/02/06 22:50:41 INFO ... SequentialWriteTest in 197229ms at offset 0 for 1048576 rows => 5317 (Min)
13/02/06 22:55:20 INFO ... SequentialWriteTest in 187134ms at offset 0 for 1048576 rows => 5603
13/02/06 22:59:49 INFO ... SequentialWriteTest in 185768ms at offset 0 for 1048576 rows => 5645
13/02/06 23:04:25 INFO ... SequentialWriteTest in 191631ms at offset 0 for 1048576 rows => 5472
13/02/06 23:08:53 INFO ... SequentialWriteTest in 184092ms at offset 0 for 1048576 rows => 5696
13/02/06 23:13:23 INFO ... SequentialWriteTest in 183385ms at offset 0 for 1048576 rows => 5718 (Max)

Average is 5593, standard deviation is 65 (1.17%).

As you can see, this time the tests are more consistent: the standard deviation is only 65. This is way better.

If I remove the best and the worst runs and average the others, HBase 0.94.4 gives me an average of 5593 lines/second. Doing exactly the same in the same environment, but with HBase 0.94.0, gives me an average of 5088 lines/second (standard deviation 1.13%). What we can deduce from those tests is that 0.94.4 is almost 10% faster than 0.94.0 for SequentialWriteTest (which doesn't mean it's the same for the other tests).
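For reference, here is a short sketch of that "drop the best and the worst, average the rest" calculation, applied to the ten 0.94.4 runs listed above. The function name trimmed_mean is mine; 5088 is the 0.94.0 average quoted in the text, since the individual 0.94.0 runs are not listed here.

```python
def trimmed_mean(values):
    """Average after removing the single lowest and single highest value."""
    if len(values) < 3:
        raise ValueError("need at least 3 runs")
    kept = sorted(values)[1:-1]
    return sum(kept) / len(kept)

# Lines/second for the ten clean 0.94.4 runs.
runs_0_94_4 = [5678, 5605, 5504, 5543, 5317, 5603, 5645, 5472, 5696, 5718]

baseline = trimmed_mean(runs_0_94_4)
gain_pct = (baseline / 5088 - 1) * 100  # relative to the quoted 0.94.0 average
print(baseline)            # 5593.25
print(round(gain_pct, 1))  # 9.9
```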

As an example of the impact of configuration, I activated the HBase checksums on 0.94.4 and re-ran the same test (again, 10 times). The result is now 5524 lines/second (standard deviation 0.852%). Since both series are accurate (less than 2% standard deviation), we can conclude that checksums reduce HBase performance by about 1.25% for this test. This shows the performance impact a configuration change can have.

Conclusion

What you have to retain from those tests is:

  • It's very important to make sure you have the exact same conditions for your tests if you want to be able to compare them with other runs.
  • org.apache.hadoop.hbase.PerformanceEvaluation provides you an easy way to get quick numbers out of your installation.

 

Jan31

How to activate HBase ShortCircuit.

Posted by jmspaggi on 31/01/13  ~  Posted in: Uncategorized  ~  2 comments

Recent versions of HBase (0.94 has it) come with an option to "short-circuit" Hadoop when reading data. This is supposed to increase performance. I tried to activate it recently and faced some issues. Therefore I have decided to share with you how to activate it properly.

First, let's do a rowcount in a 10M line table: 11m1.013s. This will be our baseline.

Basically, there are only 2 mandatory things, and one recommended.

The first mandatory thing is to update your hdfs-site.xml to add something like this:

  <property>
    <name>dfs.block.local-path-access.user</name>
    <value>hbase</value>
  </property>

Where hbase is the user ID running your HBase process. BUT... if you are running MR jobs under the hadoop user, and those MR jobs use HBase too, you will have to run those jobs as the hbase user instead, because otherwise they will be denied access to HDFS... Not doing that will give you errors like:

org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Can't continue with getBlockLocalPathInfo() authorization. The user hadoop is not allowed to call getBlockLocalPathInfo

The 2nd thing you need to do is to update your HBase configuration. In the hbase-site.xml file,  add this:

   <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>

Depending on the way your users are configured, you might need to add them to the other group with something like:

usermod -a -G hbase hadoop

With those 2 entries modified, you can already restart your HBase and your Hadoop and try...

I re-ran the rowcount baseline and got this response time: 6m27.983s

That's 41% less time!!!! Significant improvement!

Now, there is another thing to look at. Hadoop maintains a checksum on its side, and it's recommended to deactivate it and move it to the HBase side.

This is done by updating hbase-site.xml to add:

  <property>
    <name>hbase.regionserver.checksum.verify</name>
    <value>true</value>
  </property>

This tells HBase to verify the data checksums itself instead of asking Hadoop to do it, which reduces I/O.
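Before restarting everything, you may want to double check that the properties really ended up in the file. Here is a small sketch that reads a Hadoop-style configuration document and prints the two HBase-side settings discussed in this post. The function name read_properties is mine, and the sample string stands in for a real hbase-site.xml (where the properties sit inside a configuration element).

```python
import xml.etree.ElementTree as ET

def read_properties(xml_text):
    """Return {name: value} for every <property> of a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}

# Stand-in for the content of hbase-site.xml.
sample = """<configuration>
  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.regionserver.checksum.verify</name>
    <value>true</value>
  </property>
</configuration>"""

props = read_properties(sample)
for key in ("dfs.client.read.shortcircuit", "hbase.regionserver.checksum.verify"):
    print(key, "=", props.get(key, "<missing>"))
```

For a file on disk, ET.parse(path).getroot() gives the same root element to iterate over.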

I ran a major compaction again to make sure all the checksums are added by HBase, and the new response time is now: 5m56.803s

That is now 46% less time than the baseline!
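For anyone who wants to redo the math on their own timings, here is a sketch that converts the output format of the time command (for example 11m1.013s) into seconds and computes the share of elapsed time saved. The helper names are mine.

```python
import re

def to_seconds(elapsed):
    """Convert a duration such as '11m1.013s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", elapsed)
    if match is None:
        raise ValueError("unexpected duration format: %r" % elapsed)
    return int(match.group(1)) * 60 + float(match.group(2))

def time_saved_pct(before, after):
    """Percentage of elapsed time saved when going from `before` to `after`."""
    start = to_seconds(before)
    return 100.0 * (start - to_seconds(after)) / start

print(round(time_saved_pct("11m1.013s", "6m27.983s")))  # 41
print(round(time_saved_pct("11m1.013s", "5m56.803s")))  # 46
```

Note that these percentages are reductions in elapsed time. Expressed as throughput, the same measurements correspond to roughly 1.7x and 1.85x the baseline speed.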

Based on those results, I highly recommend activating this if the versions of HBase and Hadoop you are using permit it.
