Feb23

Like for almost all 0.94 and 0.96 HBase releases, I spent some time over the last few days testing the HBase 0.94.17RC2 release. Since I have not provided performance numbers for a while, I have decided to share here some information about what I test, along with some of the resulting numbers. HBase release candidates are actively tested by many HBase contributors, which ensures their stability.

To test a release, I proceed along 2 main tracks. The first track is stability: making sure the release runs correctly, there are no errors in the logs, the documentation is correct, features behave as expected, and so on. The second track is performance: I ensure the release performs similarly to the previous release from the same branch. After each release I test, I try to find something to add to my testing plan. When a release is sunk because someone else finds something wrong, I add a test to my plan so that this case is covered for the next release.

So here are the results of the 0.94.17RC2 testing. For each step I give an overall description with the result, then more details on what was done.

  • Jars and signatures checked => Passed

Checked the secured and non-secured jars by calculating the MD5 signatures and comparing them with the provided ones. Also un-compressed the 2 archives to make sure they extract correctly.
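
As an illustration, the kind of commands involved look like this, assuming the candidate ships as tar.gz archives (the file names are hypothetical placeholders; use the actual files provided with the candidate):

md5sum hbase-0.94.17.tar.gz hbase-0.94.17-security.tar.gz
tar -tzf hbase-0.94.17.tar.gz > /dev/null && echo "archive OK"
tar -tzf hbase-0.94.17-security.tar.gz > /dev/null && echo "security archive OK"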

  • Content => Passed

Checked the CHANGES.TXT file to make sure the fixes included in the release are correctly listed. Randomly checked some documentation pages.

  • Run default test suite => Passed

Ran the default HBase test suite using Maven and verified that all tests ran correctly.
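
For reference, a minimal sketch of the command, run from the root of the unpacked source tree (options and profiles may vary by release):

mvn clean test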

  • Start the release and check => Passed

Started the release on a 4-node cluster (1 master, 3 region servers). Checked the logs, checked the web interface, and checked consistency using HBCK.
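
The consistency check uses the hbck tool shipped with HBase, for example:

bin/hbase hbck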

  • Ran default/performances tests and compared => Passed

HBase comes with multiple tools to test performance, such as PerformanceEvaluation, LoadTestTool, IntegrationTestLoadAndVerify, HLogPerformanceEvaluation and IntegrationTestBigLinkedList (at least). I use all of those tools to compare the stability and the performance of the candidate release against the previous stable one from the same branch. For this release candidate, I ran all those tests with HBase 0.94.16 + Hadoop 1.2.1 and HBase 0.94.17RC2 + Hadoop 1.2.1, and below are the results for those 2 releases. The numbers below are operations per second for PerformanceEvaluation. For example, for RandomScanWithRange100Test, the numbers mean we can do 147 scans of 100 rows per second.

Test Name 0.94.16 0.94.17
FilteredScanTest 0.22 0.25
RandomReadTest 463.70 388.84
RandomSeekScanTest 173.17 161.83
RandomScanWithRange10Test 283.40 285.62
RandomScanWithRange100Test 147.01 147.06
RandomScanWithRange1000Test 38.22 37.00
SequentialReadTest 1239.94 1238.21
SequentialWriteTest 13709.37 14095.54
RandomWriteTest 14047.49 13555.70
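
For reference, these numbers come from the PerformanceEvaluation tool. A minimal sketch of an invocation with a single client is below (the test names are examples; the exact list of available tests varies by release):

bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation randomRead 1
bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1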


IntegrationTestLoadAndVerify still gives me an incorrect result. This test is supposed to report REFERENCES_CHECKED=10000000, however I still get results like REFERENCES_CHECKED=9855584 whatever version I try (HBase 0.94.16, HBase 0.94.17, HBase 0.96.1).

LoadTestTool results are below. They are very close between the 2 versions.

Test: LoadTestTool

0.94.16:
real 19m15.945s
user 35m45.028s
sys 11m44.004s

0.94.17:
real 19m2.060s
user 35m14.672s
sys 11m47.820s


HLogPerformanceEvaluation results are also very similar between the 2 releases. Values are ops/s. I tried to compare with HBase 0.96.1 + Hadoop 2.2.0, but the 2 releases do not return the same metrics, so I have not been able to compare them correctly.

Test 0.94.16 0.94.17
HLogPerformanceEvaluation 103,066 103,379


The last automated test is IntegrationTestBigLinkedList, where I got the right result for the 2 releases (REFERENCED=6000000).

  • Run manual tests => Passed

Using the shell: created a table, put some values, scanned, flushed, compacted, validated VERSIONS behaviour, dropped the table. All passed successfully.
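
A minimal sketch of such a session (the table and column family names are just examples):

bin/hbase shell
create 't1', {NAME => 'f1', VERSIONS => 3}
put 't1', 'row1', 'f1:c1', 'value1'
scan 't1'
flush 't1'
major_compact 't1'
disable 't1'
drop 't1'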

  • Test splits, transition and balancing => Passed.

On an 8-node cluster, offline merged a 17-region table into a single region. Ran HBCK to validate everything was fine. Then ran a manual split and compaction to make sure the table is split by HBase back into multiple regions. Validated region transitions and balancing, and ran HBCK again. Everything worked fine.

  • Run some MapReduce jobs => Passed

Ran the simple RowCounter MapReduce job first, because this one doesn't write anything into the cluster and so is safer. Since it passed, ran another MapReduce job doing reads, writes, lookups and deletes. All passed successfully.
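
For reference, the RowCounter job is launched like this (the table name is an example):

bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter mytable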

Throughout the process, HBCK and the logs are checked multiple times to make sure no exceptions are found and no inconsistencies are produced.


As a result, all tests with 0.94.17 give results similar to 0.94.16. Performance is equivalent, and all feature tests passed and are stable. I have provided a +1 on the HBase mailing list based on my results.

Dec25


Almost a year ago I published some lines about Snappy installation on HBase 0.94.x. Since both Hadoop 2.2.0 and HBase 0.96.0 are now out, I have decided to install a new cluster with those 2 versions.

Installation was quite simple, but when I tried to move the data into this new cluster, things did not work well since I was missing Snappy, again.

Since installing it was not all that straightforward, the goal of this post is to provide some feedback on my experience.

Again, and first thing, the command line to confirm whether Snappy is working or not is the following:

bin/hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/test.txt snappy

The goal is to get this final output:

2013-12-25 18:08:02,820 INFO  [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2013-12-25 18:08:03,903 INFO  [main] util.ChecksumType: Checksum using org.apache.hadoop.util.PureJavaCrc32
2013-12-25 18:08:03,905 INFO  [main] util.ChecksumType: Checksum can use org.apache.hadoop.util.PureJavaCrc32C
2013-12-25 18:08:04,143 INFO  [main] compress.CodecPool: Got brand-new compressor [.snappy]
2013-12-25 18:08:04,149 INFO  [main] compress.CodecPool: Got brand-new compressor [.snappy]
2013-12-25 18:08:04,685 INFO  [main] compress.CodecPool: Got brand-new decompressor [.snappy]
SUCCESS

And here are the steps to get it. The whole difficulty in getting the Snappy libs is getting the right versions of the right tools. To get the Snappy libs for your infrastructure, you need to compile both Snappy and Hadoop (since Hadoop doesn't ship with 64-bit native libs). To compile the Snappy lib and get the related .so files, you can refer to the previous post. Getting the .so files for Hadoop is a bit more complicated. Hadoop 2.2.0 depends on Maven 3.0.5 and on protobuf 2.5. Debian Wheezy (stable) doesn't include those versions of those tools; only Jessie (testing) contains them. So you have 2 options to compile it: either you switch your distribution to the testing version, or you compile it in a VM and then deploy the result. I chose the 2nd option.

Maven and protobuf require a recent version of libc6. You will need to install this version on all the servers you will install the libs on.

First, those packages are required:
- subversion
- maven
- build-essential
- cmake
- zlib1g-dev
- libsnappy-dev
- pkg-config
- libssl-dev

To install them, just run, as root:
apt-get install subversion maven build-essential cmake zlib1g-dev libsnappy-dev pkg-config libssl-dev


Make sure your JAVA_HOME is set up correctly. I tried with Sun JDK 1.7.0u45.
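
For example (the path below is an assumption; adjust it to where your JDK is installed):

export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45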

When everything is installed correctly, you can start with the core of the operations. First, you will need to check out the Hadoop source code. Make sure you check out the right tag; adjust it based on the Hadoop version you will install those libraries to.

svn checkout http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0/

If you are using 2.2.0, some files will fail to compile. They are related to the tests. We don't need them, so simply delete them:

release-2.2.0/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/AuthenticatorTestCase.java
release-2.2.0/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/TestPseudoAuthenticator.java
release-2.2.0/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/TestKerberosAuthenticator.java
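
For example, from the directory containing the checkout:

rm release-2.2.0/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/AuthenticatorTestCase.java
rm release-2.2.0/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/TestPseudoAuthenticator.java
rm release-2.2.0/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/TestKerberosAuthenticator.java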

Now the tricky part. Hadoop 2.2.0 depends on protobuf 2.5. However, this version is only available in Debian experimental! So you will need to update your apt sources.list file to add the experimental repository and run this command:

apt-get -t experimental install protobuf-compiler

This will also upgrade the following packages to their experimental versions:

libc-dev-bin libc6 libc6-dev locales

So you will also need to install those versions on the servers you are going to deploy the libs to.
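
For reference, the experimental repository entry added to /etc/apt/sources.list looks like the line below (the mirror URL is an assumption; reuse the one from your existing entries):

deb http://ftp.debian.org/debian/ experimental main

Then refresh the package lists with "apt-get update" before running the install command above.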

Now move into the release-2.2.0 folder and build using the following command:

mvn package -Drequire.snappy -Pdist,native,src -DskipTests -Dtar

If everything goes as expected, you should find your .so files under hadoop-dist/target/hadoop-2.2.0/lib/native/. Look for both libhdfs.so and libhadoop.so.

You should now have 3 files:

- libsnappy.so
- libhdfs.so
- libhadoop.so

Restore your sources.list to its initial version (remove the testing and experimental lines), and copy those 3 files under hbase/lib/native/Linux-amd64-64 (or under hbase/lib/native/Linux-i386-32 if you are not running 64 bits).

With your 3 files now in place, you can re-run the initial command and Snappy should be working fine. Don't forget to copy those files to all your region servers, as sketched below.
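
A minimal sketch of that copy, assuming hypothetical region server names node1 to node3 and HBase installed under /opt/hbase on a 64-bit system:

for h in node1 node2 node3; do
  scp libsnappy.so libhdfs.so libhadoop.so $h:/opt/hbase/lib/native/Linux-amd64-64/
done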



Sep09

HBase vs Network Topology.


You have been thinking for a few weeks about giving HBase a try and have decided to move forward. You already have some computers available, some switches and cables, and have assigned some of them to your tests to see how fast HBase can be. The case described below is a real one I faced recently. Let's use it as an example to give you some hints about what you might face.

Let’s consider a network with 4 low price switches configured that way.

Numbers and letters inside the switches are the node identifiers. Only servers assigned to the cluster are shown. All the network interface cards are 1Gb/s. Let's start with a naive quick look at the network performance.

We can see that the worst network case will be from the edge node E to one of the top nodes. Let's pick node 2. Using netperf from a command line, you can evaluate the bandwidth between the 2 nodes. Logged into node E, the netperf command line is the following:

@hbasetest:~$ sleep 120; netperf -D 10 -l 300 -H node2

This will test the network bandwidth between the 2 servers for 5 minutes (300 seconds), displaying results every 10 seconds. And the output is the following:

@e:~$ sleep 1; netperf -D 10 -l 300 -H node2
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to node2 (192.168.23.6) port 0 AF_INET : demo
Interim result:  935.11 10^6bits/s over 10.00 seconds
Interim result:  937.60 10^6bits/s over 10.00 seconds
Interim result:  937.41 10^6bits/s over 10.00 seconds
Interim result:  938.06 10^6bits/s over 10.00 seconds
Interim result:  936.74 10^6bits/s over 10.01 seconds
Interim result:  937.31 10^6bits/s over 10.00 seconds
Interim result:  939.09 10^6bits/s over 10.00 seconds
Interim result:  939.09 10^6bits/s over 10.00 seconds
Interim result:  936.80 10^6bits/s over 10.02 seconds
Interim result:  936.51 10^6bits/s over 10.00 seconds
Interim result:  938.97 10^6bits/s over 10.00 seconds
Interim result:  936.95 10^6bits/s over 10.02 seconds
Interim result:  937.15 10^6bits/s over 10.00 seconds
Interim result:  938.40 10^6bits/s over 10.00 seconds
Interim result:  937.89 10^6bits/s over 10.01 seconds
Interim result:  937.87 10^6bits/s over 10.00 seconds
Interim result:  937.37 10^6bits/s over 10.01 seconds
Interim result:  937.24 10^6bits/s over 10.00 seconds
Interim result:  936.39 10^6bits/s over 10.01 seconds
Interim result:  938.52 10^6bits/s over 10.00 seconds
Interim result:  938.20 10^6bits/s over 10.00 seconds
Interim result:  937.97 10^6bits/s over 10.01 seconds
Interim result:  938.06 10^6bits/s over 10.00 seconds
Interim result:  938.77 10^6bits/s over 10.00 seconds
Interim result:  937.71 10^6bits/s over 10.01 seconds
Interim result:  934.79 10^6bits/s over 10.03 seconds
Interim result:  938.93 10^6bits/s over 10.00 seconds
Interim result:  938.20 10^6bits/s over 10.01 seconds
Interim result:  938.84 10^6bits/s over 10.01 seconds
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec 
87380  16384  16384    300.01    937.65  

The average bandwidth between those 2 servers is 937 Mb/s, which is very close to the NIC capacity. That's good. Let's try another link, this time between nodes B and 5. I will remove the interim results for readability.

Results are:
@b:~# netperf -D 10 -l 300 -H node5
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to node5 (192.168.23.9) port 0 AF_INET : demo
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec 
87380  16384  16384    300.01    941.38  

This, again, is expected.

Let’s do a last one with node 1 to node 3.

Again, results are:
@node1:~#  netperf -D 10 -l 300 -H node3
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to node3 (192.168.23.7) port 0
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec 
87380  16384  16384    300.02    940.47

So the network seems to be fine between all the nodes in the cluster. Transfers are fast and all the servers are responding quickly. Now, let's see why this is wrong.

We have done the following bandwidth tests.
E -> 2
B -> 5
1 -> 3

Here is the same topology schema, but this time, each arrow between the switches has been scaled based on the number of nodes using it for this test.

And as you can see, the link between the first 2 switches is used a lot. How does it impact the bandwidth? Let's give that another try. This time, let's run the 3 tests together. To highlight the impact, they will be started with a 60-second delay between each.

Here are the results for the 3 tests.

@node1:~# netperf -D 10 -l 300 -H node3
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to node3 (192.168.23.7) port 0 AF_INET : demo
Interim result:  939.00 10^6bits/s over 10.01 seconds
Interim result:  940.53 10^6bits/s over 10.00 seconds
Interim result:  941.84 10^6bits/s over 10.01 seconds
Interim result:  940.20 10^6bits/s over 10.02 seconds
Interim result:  942.38 10^6bits/s over 10.01 seconds
Interim result:  941.74 10^6bits/s over 10.01 seconds
Interim result:  479.64 10^6bits/s over 19.63 seconds
Interim result:  470.53 10^6bits/s over 10.19 seconds
Interim result:  471.76 10^6bits/s over 10.01 seconds
Interim result:  471.41 10^6bits/s over 10.01 seconds
Interim result:  470.75 10^6bits/s over 10.01 seconds
Interim result:  470.51 10^6bits/s over 10.00 seconds
Interim result:  470.11 10^6bits/s over 10.01 seconds
Interim result:  471.35 10^6bits/s over 10.01 seconds
Interim result:  470.59 10^6bits/s over 10.02 seconds
Interim result:  471.74 10^6bits/s over 10.01 seconds
Interim result:  471.16 10^6bits/s over 10.01 seconds
Interim result:  470.36 10^6bits/s over 10.02 seconds
Interim result:  470.78 10^6bits/s over 10.02 seconds
Interim result:  470.22 10^6bits/s over 10.01 seconds
Interim result:  471.11 10^6bits/s over 10.01 seconds
Interim result:  470.64 10^6bits/s over 10.01 seconds
Interim result:  471.51 10^6bits/s over 10.01 seconds
Interim result:  471.11 10^6bits/s over 10.01 seconds
Interim result:  470.62 10^6bits/s over 10.01 seconds
Interim result:  470.10 10^6bits/s over 10.01 seconds
Interim result:  470.99 10^6bits/s over 10.01 seconds
Interim result:  470.58 10^6bits/s over 10.01 seconds
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec 
87380  16384  16384    300.04    565.50  

@b:~# sleep 60; netperf -D 10 -l 300 -H node5
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to node5 (192.168.23.9) port 0 AF_INET : demo
Interim result:  471.24 10^6bits/s over 10.00 seconds
Interim result:  470.81 10^6bits/s over 10.01 seconds
Interim result:  470.56 10^6bits/s over 10.01 seconds
Interim result:  470.46 10^6bits/s over 10.00 seconds
Interim result:  470.63 10^6bits/s over 10.00 seconds
Interim result:  470.57 10^6bits/s over 10.00 seconds
Interim result:  239.79 10^6bits/s over 19.62 seconds
Interim result:  235.03 10^6bits/s over 10.20 seconds
Interim result:  235.18 10^6bits/s over 10.02 seconds
Interim result:  235.25 10^6bits/s over 10.01 seconds
Interim result:  235.38 10^6bits/s over 10.01 seconds
Interim result:  235.29 10^6bits/s over 10.00 seconds
Interim result:  234.97 10^6bits/s over 10.02 seconds
Interim result:  235.29 10^6bits/s over 10.01 seconds
Interim result:  235.10 10^6bits/s over 10.01 seconds
Interim result:  235.38 10^6bits/s over 10.01 seconds
Interim result:  235.13 10^6bits/s over 10.01 seconds
Interim result:  235.42 10^6bits/s over 10.00 seconds
Interim result:  235.21 10^6bits/s over 10.01 seconds
Interim result:  235.01 10^6bits/s over 10.01 seconds
Interim result:  235.36 10^6bits/s over 10.00 seconds
Interim result:  235.29 10^6bits/s over 10.00 seconds
Interim result:  243.72 10^6bits/s over 10.01 seconds
Interim result:  470.73 10^6bits/s over 10.01 seconds
Interim result:  470.99 10^6bits/s over 10.00 seconds
Interim result:  470.35 10^6bits/s over 10.01 seconds
Interim result:  470.67 10^6bits/s over 10.00 seconds
Interim result:  470.48 10^6bits/s over 10.00 seconds
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

87380  16384  16384    300.02    330.01  

@e:~$ sleep 120; netperf -D 10 -l 300 -H node2
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.23.6 (192.168.23.6) port 0 AF_INET : demo
Interim result:  235.96 10^6bits/s over 10.01 seconds
Interim result:  235.77 10^6bits/s over 10.01 seconds
Interim result:  235.06 10^6bits/s over 10.03 seconds
Interim result:  235.54 10^6bits/s over 10.01 seconds
Interim result:  235.35 10^6bits/s over 10.01 seconds
Interim result:  235.27 10^6bits/s over 10.01 seconds
Interim result:  235.51 10^6bits/s over 10.01 seconds
Interim result:  235.33 10^6bits/s over 10.01 seconds
Interim result:  235.53 10^6bits/s over 10.00 seconds
Interim result:  235.39 10^6bits/s over 10.01 seconds
Interim result:  235.45 10^6bits/s over 10.01 seconds
Interim result:  235.21 10^6bits/s over 10.01 seconds
Interim result:  235.47 10^6bits/s over 10.00 seconds
Interim result:  235.46 10^6bits/s over 10.00 seconds
Interim result:  235.41 10^6bits/s over 10.01 seconds
Interim result:  235.24 10^6bits/s over 10.01 seconds
Interim result:  235.53 10^6bits/s over 10.00 seconds
Interim result:  257.75 10^6bits/s over 10.01 seconds
Interim result:  471.03 10^6bits/s over 10.00 seconds
Interim result:  470.56 10^6bits/s over 10.01 seconds
Interim result:  470.71 10^6bits/s over 10.00 seconds
Interim result:  470.89 10^6bits/s over 10.00 seconds
Interim result:  470.84 10^6bits/s over 10.00 seconds
Interim result:  496.46 10^6bits/s over 10.00 seconds
Interim result:  937.86 10^6bits/s over 10.00 seconds
Interim result:  936.67 10^6bits/s over 10.01 seconds
Interim result:  937.65 10^6bits/s over 10.00 seconds
Interim result:  938.53 10^6bits/s over 10.00 seconds
Interim result:  939.26 10^6bits/s over 10.00 seconds
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec 
87380  16384  16384    300.01    424.25  

And let’s put them together on a single chart.


As you can see, the performance of each of the subsequent tests is highly reduced. This is because all 3 tests are sharing the same link between the last 2 switches. The available network bandwidth between those 2 switches is 1Gb/s, and that's why we see a limitation close to this value. So even if each node appears to be fast enough, when they all start to work together, you will see huge performance degradation because of this topology.

Now, what is the impact of this on HBase?

You will see the main impact when all the nodes are communicating or transferring data to each other. This can occur when you are running complex MapReduce jobs where tasks need to look up data on other regions/servers. This can also occur when you are running a major compaction on tables with low data locality, or when you are doing a bulk load. Finally, this can also occur when you lose a node and need all its regions to be re-assigned and Hadoop to replicate the missing blocks.

Since at some point some links are going to be used by 3 servers at a time, the maximum bandwidth of your network will be divided by 3. For bigger clusters, it can be divided by even more. This occurs particularly when you have 2 racks of servers linked by a single network cable. If you have 10 servers in each rack, since Hadoop replication tries to place the second replica on another rack, that means all writes will cross over to the other rack. If all your servers are doing that at the same time, the network bandwidth of each rack is going to be divided by 20!

Just to show you the difference between before and after, I replaced the 4 switches with a single 24-port enterprise-grade switch. Here are the results for the same tests:

jmspaggi@node6:~$ netperf -D 10 -l 300 -H node4
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to node4 (192.168.23.8) port 0 AF_INET : demo
Interim result:  922.08 10^6bits/s over 10.46 seconds
Interim result:  922.92 10^6bits/s over 10.00 seconds
Interim result:  921.17 10^6bits/s over 10.02 seconds
Interim result:  922.78 10^6bits/s over 10.00 seconds
Interim result:  922.64 10^6bits/s over 10.00 seconds
Interim result:  924.90 10^6bits/s over 10.00 seconds
Interim result:  925.01 10^6bits/s over 10.00 seconds
Interim result:  925.40 10^6bits/s over 10.00 seconds
Interim result:  925.15 10^6bits/s over 10.00 seconds
Interim result:  925.37 10^6bits/s over 10.00 seconds
Interim result:  925.00 10^6bits/s over 10.00 seconds
Interim result:  925.27 10^6bits/s over 10.00 seconds
Interim result:  924.91 10^6bits/s over 10.00 seconds
Interim result:  924.94 10^6bits/s over 10.00 seconds
Interim result:  924.93 10^6bits/s over 10.00 seconds
Interim result:  925.25 10^6bits/s over 10.00 seconds
Interim result:  925.17 10^6bits/s over 10.00 seconds
Interim result:  925.10 10^6bits/s over 10.00 seconds
Interim result:  925.27 10^6bits/s over 10.00 seconds
Interim result:  925.23 10^6bits/s over 10.00 seconds
Interim result:  925.35 10^6bits/s over 10.00 seconds
Interim result:  925.30 10^6bits/s over 10.00 seconds
Interim result:  925.15 10^6bits/s over 10.00 seconds
Interim result:  925.05 10^6bits/s over 10.00 seconds
Interim result:  924.97 10^6bits/s over 10.00 seconds
Interim result:  924.91 10^6bits/s over 10.00 seconds
Interim result:  925.14 10^6bits/s over 10.00 seconds
Interim result:  924.90 10^6bits/s over 10.00 seconds
Interim result:  925.15 10^6bits/s over 10.00 seconds
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

87380  16384  16384    300.01    924.63  

root@buldo:/home/jmspaggi# sleep 60; netperf -D 10 -l 300 -H node5
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to node5 (192.168.23.9) port 0 AF_INET : demo
Interim result:  941.53 10^6bits/s over 10.00 seconds
Interim result:  940.23 10^6bits/s over 10.01 seconds
Interim result:  940.71 10^6bits/s over 10.00 seconds
Interim result:  940.30 10^6bits/s over 10.00 seconds
Interim result:  940.50 10^6bits/s over 10.00 seconds
Interim result:  940.49 10^6bits/s over 10.00 seconds
Interim result:  940.55 10^6bits/s over 10.00 seconds
Interim result:  940.65 10^6bits/s over 10.00 seconds
Interim result:  940.58 10^6bits/s over 10.00 seconds
Interim result:  940.36 10^6bits/s over 10.00 seconds
Interim result:  940.72 10^6bits/s over 10.00 seconds
Interim result:  940.32 10^6bits/s over 10.00 seconds
Interim result:  940.57 10^6bits/s over 10.00 seconds
Interim result:  940.50 10^6bits/s over 10.00 seconds
Interim result:  940.81 10^6bits/s over 10.00 seconds
Interim result:  940.57 10^6bits/s over 10.00 seconds
Interim result:  940.20 10^6bits/s over 10.00 seconds
Interim result:  940.72 10^6bits/s over 10.00 seconds
Interim result:  940.68 10^6bits/s over 10.00 seconds
Interim result:  940.33 10^6bits/s over 10.00 seconds
Interim result:  940.52 10^6bits/s over 10.00 seconds
Interim result:  940.46 10^6bits/s over 10.00 seconds
Interim result:  940.49 10^6bits/s over 10.00 seconds
Interim result:  940.62 10^6bits/s over 10.00 seconds
Interim result:  940.81 10^6bits/s over 10.00 seconds
Interim result:  940.23 10^6bits/s over 10.01 seconds
Interim result:  940.60 10^6bits/s over 10.00 seconds
Interim result:  940.40 10^6bits/s over 10.00 seconds
Interim result:  940.60 10^6bits/s over 10.00 seconds
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

87380  16384  16384    300.01    940.52
root@node1:/home/jmspaggi# sleep 120; netperf -D 10 -l 300 -H node3
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to node3 (192.168.23.7) port 0 AF_INET : demo
Interim result:  942.59 10^6bits/s over 10.01 seconds
Interim result:  941.09 10^6bits/s over 10.02 seconds
Interim result:  940.58 10^6bits/s over 10.01 seconds
Interim result:  942.08 10^6bits/s over 10.00 seconds
Interim result:  941.72 10^6bits/s over 10.00 seconds
Interim result:  941.08 10^6bits/s over 10.01 seconds
Interim result:  940.68 10^6bits/s over 10.00 seconds
Interim result:  942.37 10^6bits/s over 10.00 seconds
Interim result:  940.16 10^6bits/s over 10.02 seconds
Interim result:  941.94 10^6bits/s over 10.00 seconds
Interim result:  941.55 10^6bits/s over 10.01 seconds
Interim result:  940.86 10^6bits/s over 10.01 seconds
Interim result:  941.74 10^6bits/s over 10.01 seconds
Interim result:  941.56 10^6bits/s over 10.00 seconds
Interim result:  940.93 10^6bits/s over 10.01 seconds
Interim result:  941.14 10^6bits/s over 10.02 seconds
Interim result:  941.03 10^6bits/s over 10.01 seconds
Interim result:  938.26 10^6bits/s over 10.03 seconds
Interim result:  936.22 10^6bits/s over 10.02 seconds
Interim result:  938.81 10^6bits/s over 10.01 seconds
Interim result:  938.18 10^6bits/s over 10.01 seconds
Interim result:  938.00 10^6bits/s over 10.02 seconds
Interim result:  939.35 10^6bits/s over 10.01 seconds
Interim result:  940.84 10^6bits/s over 10.01 seconds
Interim result:  941.94 10^6bits/s over 10.01 seconds
Interim result:  941.42 10^6bits/s over 10.01 seconds
Interim result:  941.31 10^6bits/s over 10.00 seconds
Interim result:  940.59 10^6bits/s over 10.01 seconds
Interim result:  941.49 10^6bits/s over 10.01 seconds
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

87380  16384  16384    300.02    940.68  

And here is the graph for those results.


As you can see, the total bandwidth is now no longer limited, and each server can talk to any other server at full speed even if 2 other servers are already talking together. For HBase, in the end, that will improve the overall performance. So you might want to take a look at the way your network is configured... Take a look at the internal switch maximum bandwidth to see whether all the nodes are going to be able to talk to each other at maximum speed.

So keep an eye on your network topology. Your HBase cluster's performance can be impacted.


May30

This article can also be titled "How to create a local VM cluster in 10 minutes..."

Basically, if you want to try Hadoop or HBase tools easily and locally, you can install a single instance and run your tests with it. But if you want to go a step further, you will need to run more than one virtual machine locally, and I have not found any easy step-by-step documentation to help with that. The intent of this article is to help you achieve this. I posted it without the screenshots for early adopters. Screenshots will come later.

  • First, the best and easiest way to install and manage a cluster is to use Cloudera Manager. Since there is a free version available, let's use it to greatly simplify all the steps.
  • Second, if you want to build a local VM cluster, you will need to install Linux on it and play with the parameters and the configuration. To reduce your work, you will find below 2 VMs where almost everything is already done. There are still some configuration steps required based on your network specifications, but let's try to keep that very simple.


So let's start.

The 2 VMs are configured to run with 2GB of ram, but you can adjust that based on your needs.

The first one hosts Cloudera Manager. There is a graphical interface to allow you to access Cloudera Manager with Firefox. You will use this instance to access your cluster, monitor it, configure it, add some nodes, etc. This will be your main access to your cluster.

The second VM is your first node. This is where all the hadoop related applications run. We are talking here about all the applications deployed by Cloudera Manager, which are HBase, Hive, Impala, Hue, HDFS, MapReduce, Oozie, and Zookeeper.

As there is only one node at the beginning, all the roles/applications are running on this single node. I will show you later how to change that when more nodes are added. This is an important step, otherwise you will have multiple nodes but will use only one.

The only software you will need to install your cluster is VirtualBox. You also need to have a DHCP server on your local network. Also, at any time, if anything is not working as described below, don't go further! Stop there and investigate. You can also restart from the beginning: just shut down the VM, remove it and re-import it (please do not re-download it...).

I recommend downloading the 2 VMs below once and then backing them up for future tries.

Cloudera Manager.ova.torrent (Size: 3.3G)
node1.ova.torrent (Size: 2.6G)

The users on the VMs are root and user. The passwords for those users are the same as the user names, so root's password is root and user's password is user.

One node cluster installation steps.

This is the easiest one. Let's see the steps overall.

  1. Import the 2 VMs into VirtualBox and start them
  2. Configure the network
  3. Enjoy


Now, in details.

  1. The VMs you have downloaded are .OVA files. In VirtualBox, go to File -> Import -> Choose and select the VM you downloaded. Then click on Next, then on Import, and you are done. This should take about a minute. Repeat the same thing for the node1 image.
  2. a) Now, start the 2 VMs by clicking on the VM and then on "Start". If you get an error like "Could not start the machine... the following physical network interfaces were not found", then click on "Change Network Settings", wait a few seconds for the popup to load completely and just press "OK". Each VM will connect to your DHCP server to get an IP. But you will have to let each VM know what the IP of the other one is. Let's see how to do that.
    b) On the node1 VM, login as root and type "ifconfig". This will show you your IP address. Note it. On the manager VM, login as user. Open a root console and type "ifconfig" again. Look at the IP, note it.
    c) Now that you have the IPs, you need to configure the hosts file. If you have "vi" skills, just edit the /etc/hosts file and add those hosts. Otherwise, here is an easy way to achieve the same thing (the commands are also gathered in the sketch after this list). You need to do this as root in a console. First, remove the hosts file using "rm /etc/hosts". Now add the hosts back to the file using "echo 192.168.1.1 manager >> /etc/hosts", replacing 192.168.1.1 with your manager IP. Do the same thing for your node1: "echo 192.168.1.2 node1 >> /etc/hosts", again replacing the IP address with the node1 IP address. And last, add localhost: "echo 127.0.0.1 localhost >> /etc/hosts". Finally, verify that it's working fine. From your manager console, type "ping -c 1 node1". You should see "0% packet loss". From your node1, type "ping -c 1 manager". Same thing, you should get "0% packet loss". If this is not the case, don't move forward and solve that first. If this is the case, bingo! Close the consoles and go to step 3.
  3. On the manager VM, you have a Firefox icon on the top bar. Just click on it. The login/password to connect to Cloudera Manager is admin/admin. You should see all the CDH services deployed and stopped. You can start the ones you need, modify the configuration, etc. Just do whatever you want with it! Enjoy.
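
For reference, here are the hosts-file commands from step 2 gathered in one place, assuming the example IPs 192.168.1.1 (manager) and 192.168.1.2 (node1); run them as root on both VMs, then verify with ping:

rm /etc/hosts
echo 192.168.1.1 manager >> /etc/hosts
echo 192.168.1.2 node1 >> /etc/hosts
echo 127.0.0.1 localhost >> /etc/hosts
ping -c 1 node1
ping -c 1 manager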


Additional node installation steps



Now, you have one node and one manager but you want more? Let's see the steps to duplicate node1 into a new node. You can follow those steps as many times as you want to create as many new nodes as you want.

First, let's talk about the issues. When you duplicate node1, the new node will get the same host name and the same MAC address (network). As a result, it will not get an IP and will not be able to communicate with the manager. There are simple ways to fix that, and I will show you how.

Basically, here are the required steps:

  1. Re-import the node1 under another name;
  2. Reconfigure the host name
  3. Reconfigure the network
  4. Add the new node to the cluster
  5. Add some services to this node
  6. Enjoy


Now in details. You can do the steps below while manager and node1 are still running.

  1. You need to re-import node1 into VirtualBox but call it nodex (replace x by 2 or more). For that, go to File -> Import Appliance -> Choose. Select node1.ova, then Next. On the next screen, change the node name: replace "node1_1" by what you need (nodex). Also check the "Reinitialize the MAC address" box. You need to do that to make sure VirtualBox provides a new MAC address to this VM, so that it does not conflict with the original node1 interface. The issue is that the node will not find its network interface anymore, but we will solve that later. Click on Import and wait. Your new node should now be in VirtualBox.
  2. Start your newly created node and wait for it to be ready. At the login prompt, log in with "root/root" and do the following: "echo nodex > /etc/hostname", where you replace nodex with the name you want, like node2. This will update the host name.
  3. As I said previously, since we changed the MAC address of the interface, the VM will lose its network interface. The easiest way to fix that is to simply reset the interfaces list by doing "rm /etc/udev/rules.d/70-persistent-net.rules". You can now restart it with "init 6". Now the VM will find the interface and will get an IP from the DHCP server. Log in again as "root/root" and type "ifconfig" to get the IP. Now we have to update the hosts file the same way we did in step 2 c) of the one node cluster section. First, remove the hosts file using "rm /etc/hosts". Then add the hosts back to the file using "echo 192.168.1.1 manager >> /etc/hosts", replacing 192.168.1.1 with your manager IP. Do the same thing for your nodex: "echo 192.168.1.x nodex >> /etc/hosts", again replacing the IP address with the nodex IP address. And last, add localhost: "echo 127.0.0.1 localhost >> /etc/hosts". You need to make sure that all the nodes have all the other nodes' IPs and names in their hosts files, so add your new node information to the other already installed nodes too. On the manager side, add the new node to the hosts file with "echo 192.168.1.x nodex >> /etc/hosts" as well. Don't forget to run the same ping test as in step 2 c) between this new node and the manager. In the end, all the hosts files on all the nodes, including the manager, should be identical, with the same host names referring to the same IPs. You can also copy the hosts file from the manager to the other nodes using "scp /etc/hosts nodex:/etc/" in a manager shell. If you do so, you just need to make sure the hosts file on the manager side is complete and then copy it to all the installed nodes.
  4. Let's go back to the manager, in Firefox, to add our new node to the cluster. Re-login if required. At the top, click on "Hosts". If you have done things correctly, you should already see your new node there with no CDH version and no roles. If it's not the case, don't go further and search for the cause. If the node is there, simply click on "Add New Host to Cluster" then "Continue". There you should have a "Currently Managed Hosts" tab. Select it, select your new node and click on "Continue". Cloudera Manager will now install CDH and Impala on this new node (step 1). Click on "Continue" when it's done. Then simply follow all the next steps up to step 4. Now you are done: the node has been added to the cluster.
  5. You now have another node in your cluster, but it's not serving anything yet. So let's give it some work. Click on "Services" at the top of the screen and select hdfs1. Click on Instances, then Add. Check the mark on the new node line in the datanode column to add a datanode role to this node. Then click on Continue. Your node is now also a datanode! Add a tasktracker (mapreduce) and a regionserver (hbase) to it too. You can now start all of that, or restart, or start only the new services, by going to "Services" -> "All Services" -> "Actions" (the one at the top) -> "Start". Wait some time, and enjoy your new cluster!
  6. Enjoy ;)


How to use your newly installed cluster

Your manager instance will help you manage your cluster, but the most convenient way to use the Hadoop applications, if you need them, is to connect to node1 with a shell and work from there.

To do that, open a shell on your manager and type "ssh node1". The password is "user". And then simply use what you need, like "hadoop fs -ls /" and you will get:

user@node1:~$ hadoop fs -ls /
Found 3 items
drwxr-xr-x   - hbase hbase               0 2013-05-30 19:00 /hbase
drwxrwxrwt   - hdfs  supergroup          0 2013-05-30 15:01 /tmp
drwxr-xr-x   - hdfs  supergroup          0 2013-05-30 15:03 /user


That's all.

If there is anything that is not working for you, please let me know so I can improve this documentation. I plan to add some screenshots, and I will make the required updates to make it even simpler/better if needed.

Mar16

More performance results...


HBase release 0.94.6 is coming soon. RC2 is out, and I used it to re-run the performance tests.

I have also included all previous HBase releases from 0.90.x.

As you can see in THIS PDF file, even if some performance numbers are pretty stable, some others show good improvements.

All those tests are run on the same computer, in the same way, without any other processes running, as I explained in the previous post.

Going forward, I will update this PDF file each time a new HBase version comes out.

If you have any questions or need any more details, feel free to ask.

Enjoy.


Update: Now available with the TRUNK results.
