HBase Write Performance Test vs. Single-Machine Files

This post compares the write performance of HBase with that of writing to a plain file on the local file system.

Test background:


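月小升 once built a simple data-monitoring system on top of the file system to track product impressions and clicks. It wrote data such as the ID of each product a customer viewed and the cookie to a file, one record per write, and a Java program read that file later. I wondered whether HBase would do a better job.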

Large-scale write test:


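1. If the file ends up with fewer lines than the number of writes (say, 100000), concurrent writes caused errors.
2. If no rows are missing in HBase, the writes had no errors.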
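The check in the list above can be sketched in a few lines of PHP. This is only a minimal sketch: `count_lines` and `missing_writes` are hypothetical helper names, not part of the original test, which simply used `wc -l`.

```php
<?php
// Hypothetical helper: count the lines in a log file.
// A final line without a trailing newline is counted as well.
function count_lines(string $path): int
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("cannot open $path");
    }
    $count = 0;
    while (fgets($handle) !== false) {
        $count++;
    }
    fclose($handle);
    return $count;
}

// Hypothetical helper: how many of the expected writes never reached the file.
function missing_writes(string $path, int $expected): int
{
    return $expected - count_lines($path);
}
```

A result of 0 from `missing_writes('f/a.txt', 5000)` would mean that every one of the 5000 requests left a line in the file.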

Test results

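Errors:
With 5000 records written concurrently, neither the file system nor HBase had any errors.
With 10000 records written concurrently, the file system had no errors, while HBase lost data (9989 of 10000 rows were stored).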

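Performance:
Overall the file system wins on speed, but only by a very small margin. In the 10000-request runs it reached 32.33 requests per second against 31.27 for HBase; in the 5000-request runs HBase was in fact marginally ahead, 31.87 against 30.18.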

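Thoughts:
In hindsight, the file system is part of the underlying Linux system, so it is to be expected that it is faster than HBase. HBase gives up a little performance, but its real value lies in later data reads and in clustering. HBase is also built to work with Java, so calling it through Thrift as an intermediary may cost some performance as well. The errors in the test were SSL errors, possibly because my server is not powerful enough.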

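Overall, this test dispelled my doubts about the file system. I also no longer put blind faith in NoSQL: every database has its own strengths, and it comes down to what we use it for. After all, if I had to organize data across 50 machines, the file system would look quite unreliable and HBase would be far more convenient. On a single machine, though, the simpler something is, the faster it runs.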

Test procedure


1. PHP writing 5000 records to HBase

ab -n 5000 -c 100 https://java-er.com/phptest/test.php

Performance metrics

Document Path:          /phptest/test.php
Document Length:        49 bytes
 
Concurrency Level:      100
Time taken for tests:   156.906 seconds
Complete requests:      5000
Failed requests:        965
   (Connect: 0, Receive: 0, Length: 965, Exceptions: 0)
Total transferred:      1058657 bytes
HTML transferred:       243820 bytes
Requests per second:    31.87 [#/sec] (mean)
Time per request:       3138.111 [ms] (mean)
Time per request:       31.381 [ms] (mean, across all concurrent requests)
Transfer rate:          6.59 [Kbytes/sec] received

Core PHP code. For the details of calling HBase from PHP, see the previous blog post on calling HBase from PHP via Thrift.

// Random number helper
function rand_number ($min, $max) {
    // Pads to the width of $max with spaces; callers trim() the result
    return sprintf("%".strlen($max)."d", mt_rand($min,$max));
}

// Random row suffix: two random numbers plus the current Unix timestamp
$rand = trim(rand_number(1,10000000)).'-'.trim(rand_number(1,10000000)).'-'.time();

$tablename='test';
// Write the data ($client is the Thrift HBase client from the previous post)
echo "----write data $rand ----\n";
$row = 'rw'.$rand; // row key
$attributes = array();
$mutations = array(
    new Mutation(array(
        'column' => 'cf:name',
        'value' => 'name3'
    )),
);
try {
    $client->mutateRow($tablename, $row, $mutations, $attributes);
} catch (Exception $e) {
    var_dump($e); // write your own log here
}
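The catch block above only dumps the exception and leaves logging to the reader. A minimal sketch of what that logging might look like follows; `format_write_error` is a hypothetical helper name, and the log line format is my own assumption.

```php
<?php
// Hypothetical helper: format a failed HBase write as a single log line.
function format_write_error(string $table, string $row, Throwable $e): string
{
    return sprintf(
        "[%s] mutateRow failed table=%s row=%s error=%s: %s",
        date('c'),
        $table,
        $row,
        get_class($e),
        $e->getMessage()
    );
}

// Inside the catch block one could then call, for example:
//   error_log(format_write_error($tablename, $row, $e));
```

Logging the row key makes it possible to find out afterwards which writes were lost, rather than only how many.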

Counting the rows in the HBase shell afterwards:

count 'test'
Current count: 1000, row: rw2871772-367852-1581336848                           
Current count: 2000, row: rw463782-8150364-1581336838                           
Current count: 3000, row: rw6450359-1512426-1581336945                          
Current count: 4000, row: rw8201124-9401600-1581336824                          
Current count: 5000, row: rw9999031-1386008-1581336827                          
5000 row(s)
Took 0.2436 seconds                                                             
=> 5000

The writes completed, and not a single row is missing.
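The 965 failed requests in the ab report are all of type Length, even though no rows are missing. ab counts a response as a Length failure when its body size differs from that of the first response, and test.php echoes a row key whose length depends on the random numbers, so this is the likely cause rather than failed writes. A minimal sketch, assuming one wanted constant-length output: zero-pad both numbers. `fixed_width_key` is a hypothetical name, not part of the original test.

```php
<?php
// Hypothetical variant of the row-key generator: zero-pad each random
// number to 8 digits so that every key has the same length
// (as long as the timestamp keeps the same number of digits).
function fixed_width_key(int $a, int $b, int $timestamp): string
{
    return sprintf('%08d-%08d-%d', $a, $b, $timestamp);
}

$key = fixed_width_key(mt_rand(1, 10000000), mt_rand(1, 10000000), time());
echo strlen($key), "\n";
```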

2. Testing PHP file-write performance

<?php
// File write test, 2020-02-11
 
 
// Random number helper
function rand_number ($min, $max) {
    return sprintf("%".strlen($max)."d", mt_rand($min,$max));
 }
 
$rand = trim(rand_number(1,10000000)).'-'.trim(rand_number(1,10000000)).'-'.time();
 
writeover("f/a.txt",$rand."\n","a+");
echo 'OK';
 
/*
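'r'  Open for reading only; place the file pointer at the beginning of the file.
'r+' Open for reading and writing; place the file pointer at the beginning of the file.
'w'  Open for writing only; place the file pointer at the beginning of the file and truncate the file to zero length. If the file does not exist, attempt to create it.
'w+' Open for reading and writing; place the file pointer at the beginning of the file and truncate the file to zero length. If the file does not exist, attempt to create it.
'a'  Open for writing only; place the file pointer at the end of the file. If the file does not exist, attempt to create it.
'a+' Open for reading and writing; place the file pointer at the end of the file. If the file does not exist, attempt to create it.
 
LOCK_SH Acquire a shared lock (reader).
LOCK_EX Acquire an exclusive lock (writer).
LOCK_UN Release a lock (shared or exclusive).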
*/
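function writeover($filename,$data,$method="rb+",$iflock=1) 
{ 
    @touch($filename); /* Create the file if it does not exist. A file_exists check plus another file-creation function would also work; in testing the efficiency was about the same. */ 
    $handle=@fopen($filename,$method); 
    if($handle===false) return; // could not open the file, so there is nothing to write to
    if($iflock){ 
        flock($handle,LOCK_EX); 
    } 
    fwrite($handle,$data); 
    if($method=="rb+") ftruncate($handle,strlen($data)); 
    fclose($handle); 
}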
?>
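For the append-only case used in this test, PHP's built-in `file_put_contents` with the `FILE_APPEND` and `LOCK_EX` flags does the open, lock, write and close steps in one call. The following is a sketch of an alternative, not the code that was benchmarked; `append_record` is a hypothetical name.

```php
<?php
// Hypothetical helper: append one record to a log file under an exclusive lock.
// Returns true only when the whole record, including the newline, was written.
function append_record(string $filename, string $record): bool
{
    $bytes = file_put_contents($filename, $record . "\n", FILE_APPEND | LOCK_EX);
    return $bytes === strlen($record) + 1;
}
```

Checking the return value gives the script a way to notice a failed write, which the `writeover` function above does not report.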

Put this script (testf.php) in the same directory as test.php.

Server Software:        nginx/1.12.2
Server Hostname:        java-er.com
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
TLS Server Name:        java-er.com
 
Document Path:          /phptest/testf.php
Document Length:        4 bytes
 
Concurrency Level:      100
Time taken for tests:   165.680 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      835000 bytes
HTML transferred:       20000 bytes
Requests per second:    30.18 [#/sec] (mean)
Time per request:       3313.597 [ms] (mean)
Time per request:       33.136 [ms] (mean, across all concurrent requests)
Transfer rate:          4.92 [Kbytes/sec] received
cat f/a.txt | wc -l
5000

Not a single line is missing.

Performance looks about the same.
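The throughput figures can be checked by hand: ab's "Requests per second" is the number of completed requests divided by the total time taken. A small sketch using the figures from the two 5000-request reports above (`requests_per_second` is a hypothetical helper name):

```php
<?php
// Throughput as ab reports it: completed requests divided by total time.
function requests_per_second(int $completed, float $seconds): float
{
    return $completed / $seconds;
}

// Figures copied from the two 5000-request ab reports.
$hbase = requests_per_second(5000, 156.906);
$file  = requests_per_second(5000, 165.680);

printf("HBase: %.2f req/s, file: %.2f req/s\n", $hbase, $file);
```

This reproduces the 31.87 and 30.18 requests per second that ab printed.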

3. File test with 10000 writes

ab -n 10000 -c 200 https://java-er.com/phptest/testf.php
 
Concurrency Level:      200
Time taken for tests:   309.332 seconds
Complete requests:      10000
Failed requests:        3
   (Connect: 0, Receive: 0, Length: 3, Exceptions: 0)
Total transferred:      1670000 bytes
HTML transferred:       40000 bytes
Requests per second:    32.33 [#/sec] (mean)
Time per request:       6186.648 [ms] (mean)
Time per request:       30.933 [ms] (mean, across all concurrent requests)
Transfer rate:          5.27 [Kbytes/sec] received
 
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0 3845 1962.5   3581   39088
Processing:     7 2164 1178.4   2316   61974
Waiting:        5 1079 293.7   1159    1693
Total:        105 6009 2290.3   5945   61974
 
Percentage of the requests served within a certain time (ms)
  50%   5945
  66%   6211
  75%   6299
  80%   6360
  90%   6704
  95%   7649
  98%  11434
  99%  13240
 100%  61974 (longest request)
cat a.txt | wc -l
10000

4. HBase write test with 10000 writes

ab -n 10000 -c 200 https://java-er.com/phptest/test.php
Server Software:        nginx/1.12.2
Server Hostname:        java-er.com
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
TLS Server Name:        java-er.com
 
Document Path:          /phptest/test.php
Document Length:        49 bytes
 
Concurrency Level:      200
Time taken for tests:   319.749 seconds
Complete requests:      10000
Failed requests:        1952
   (Connect: 0, Receive: 0, Length: 1952, Exceptions: 0)
Total transferred:      2112044 bytes
HTML transferred:       486445 bytes
Requests per second:    31.27 [#/sec] (mean)
Time per request:       6394.974 [ms] (mean)
Time per request:       31.975 [ms] (mean, across all concurrent requests)
Transfer rate:          6.45 [Kbytes/sec] received
 
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0 3822 4170.5   3237   60249
Processing:     8 1653 4049.9   1270   62795
Waiting:        0  702 420.5    650    3031
Total:        101 5475 5669.7   5267   62795
 
Percentage of the requests served within a certain time (ms)
  50%   5267
  66%   5763
  75%   5979
  80%   6152
  90%   7164
  95%  10471
  98%  18708
  99%  33536
 100%  62795 (longest request)
 count 'test'
Current count: 1000, row: rw1892143-1221244-1581386448                          
Current count: 2000, row: rw2771495-7809247-1581386480                          
Current count: 3000, row: rw3689320-8531023-1581386665                          
Current count: 4000, row: rw4585476-8798631-1581386402                          
Current count: 5000, row: rw549959-7823089-1581386653                           
Current count: 6000, row: rw6359363-5793712-1581386555                          
Current count: 7000, row: rw7264484-6396171-1581386491                          
Current count: 8000, row: rw820720-5596950-1581386641                           
Current count: 9000, row: rw9106369-9643-1581386690                             
9989 row(s)
Took 1.1088 seconds                                                             
=> 9989

11 rows are missing. SSL handshake failed (5) occurred during the run.

SSL handshake failed (5).
Completed 9000 requests
SSL handshake failed (5).
SSL handshake failed (5).
SSL handshake failed (5).
SSL handshake failed (5).
SSL handshake failed (5).


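First published on the 月小升 blog: https://java-er.com/blog/hbase-vs-file-write/
Unless otherwise noted, all articles are original work by 月小升. Reposting is welcome; please cite this article's address when you do. Thank you.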