HBase CRUD: insert, update, query, and delete

Posted by 张映 on 2019-09-27

Category: hadoop/spark/scala

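Ordinary (non-transactional) tables created through Hive or Spark SQL cannot update or delete a single row, but HBase can. For table-level operations, see my earlier post on creating HBase tables and adding or removing columns.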

I. Inserting data

1. Insert syntax

put '<table name>','<rowkey>','<colfamily:colname>','<value>'

2. Example

hbase(main):011:0* put 'test_ns:user','1000320190926','login:username','test2'
0 row(s) in 0.0070 seconds

hbase(main):033:0> get 'test_ns:user',1000320190926,'login:username'
COLUMN CELL
 login:username timestamp=1569461617502, value=test2
1 row(s) in 0.0300 seconds

II. Updating data

1. Update syntax

put '<table name>','<rowkey>','<colfamily:colname>','<newvalue>'

2. Example

hbase(main):034:0> put 'test_ns:user','1000320190926','login:username','newtest2'
0 row(s) in 0.0100 seconds

hbase(main):035:0> get 'test_ns:user',1000320190926,'login:username'
COLUMN CELL
 login:username timestamp=1569463065916, value=newtest2
1 row(s) in 0.0080 seconds

III. Reading data with get

1. get syntax

get '<table name>','<rowkey>','<colfamily:colname>','<colfamily>',....
get '<table name>','<rowkey>',{options}

2. Examples

// Get all data in the login column family for rowkey 1000320190926 in the user table
hbase(main):037:0> get 'test_ns:user',1000320190926,'login'
COLUMN CELL
 login:password timestamp=1569461617521, value=pass2
 login:username timestamp=1569463065916, value=newtest2
1 row(s) in 0.0070 seconds

// Get all data in the login column family, plus the tel column of the contact column family, for rowkey 1000320190926 in the user table
hbase(main):038:0> get 'test_ns:user',1000320190926,'login','contact:tel'
COLUMN CELL
 contact:tel timestamp=1569461617594, value=02132345678
 login:password timestamp=1569461617521, value=pass2
 login:username timestamp=1569463065916, value=newtest2
1 row(s) in 0.0060 seconds

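// Get all data in the login, info, and contact column families for rowkey 1000320190926 in the user table, time range [1569161617594, 1569463065916) (the upper bound is exclusive), at most 1 version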
hbase(main):049:0> get 'test_ns:user',1000320190926, {COLUMN => ['login','info','contact'], \
hbase(main):050:1* TIMERANGE => [1569161617594, 1569463065916], VERSIONS => 1}
COLUMN CELL
 contact:mobile timestamp=1569461617577, value=15822345678
 contact:tel timestamp=1569461617594, value=02132345678
 info:age timestamp=1569461617561, value=36
 info:sex timestamp=1569461617538, value=male2
 login:password timestamp=1569461617521, value=pass2
1 row(s) in 0.0150 seconds

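// Get all data in the login and info column families, plus the tel column of the contact column family, for rowkey 1000320190926 in the user table, time range [1569161617594, 1569463065916) (the upper bound is exclusive), at most 1 version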
hbase(main):051:0> get 'test_ns:user',1000320190926, {COLUMN => ['login','info','contact:tel'], \
hbase(main):052:1* TIMERANGE => [1569161617594, 1569463065916], VERSIONS => 1}
COLUMN CELL
 contact:tel timestamp=1569461617594, value=02132345678
 info:age timestamp=1569461617561, value=36
 info:sex timestamp=1569461617538, value=male2
 login:password timestamp=1569461617521, value=pass2
1 row(s) in 0.0100 seconds
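
Note that login:username is missing from both TIMERANGE results. One plausible reading is that TIMERANGE is a half-open interval, [min, max), so the newest username cell, whose timestamp equals the upper bound, falls outside it. The Python sketch below replays that filter over the cells shown in this section. It is a toy model, not the HBase API; in_time_range is a made-up helper, and it assumes only one stored version per column.

```python
# Newest stored version of each column for rowkey 1000320190926,
# copied from the shell output in this section: column -> (timestamp, value).
cells = {
    "contact:mobile": (1569461617577, "15822345678"),
    "contact:tel": (1569461617594, "02132345678"),
    "info:age": (1569461617561, "36"),
    "info:sex": (1569461617538, "male2"),
    "login:password": (1569461617521, "pass2"),
    "login:username": (1569463065916, "newtest2"),
}


def in_time_range(cells, min_ts, max_ts):
    """Keep cells with min_ts <= timestamp < max_ts (upper bound exclusive)."""
    return {col: cell for col, cell in cells.items() if min_ts <= cell[0] < max_ts}


selected = in_time_range(cells, 1569161617594, 1569463065916)
print(sorted(selected))
# ['contact:mobile', 'contact:tel', 'info:age', 'info:sex', 'login:password']
```

To include the newest username cell, the upper bound would have to be at least one millisecond larger than its timestamp.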

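IV. Reading data with scan

1. scan syntax

scan '<table name>',{options}

2. Examples

// Get all data in the user table
scan 'test_ns:user'

// Get all data in the tel column of the contact column family in the user table
scan 'test_ns:user', {COLUMNS => 'contact:tel'}

// Get 2 rows from the login and info column families in the user table
scan 'test_ns:user', {COLUMNS => ['login', 'info'], LIMIT => 2}

// Show the data in the user table in reverse rowkey order
scan 'test_ns:user', {REVERSED => true}

// Raw scan: show at most 1 version per column, including deleted data
scan 'test_ns:user', {RAW => true, VERSIONS => 1}

// From the user table, get 2 rows from the login and info column families plus the contact:tel column, restricted by time range and rowkey range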
hbase(main):056:0> scan 'test_ns:user', {COLUMNS => ['login','info','contact:tel'], LIMIT => 2, \
hbase(main):057:1* TIMERANGE => [1569401820233, 1569461709184],STARTROW => '1000120190925',ENDROW=> '1000520190925'}
ROW COLUMN+CELL
 1000120190925 column=contact:tel, timestamp=1569401821139, value=02112345678
 1000220190926 column=contact:tel, timestamp=1569461617469, value=02122345678
 1000220190926 column=info:age, timestamp=1569461617408, value=35
 1000220190926 column=info:sex, timestamp=1569461617380, value=male1
 1000220190926 column=login:password, timestamp=1569461617362, value=pass1
 1000220190926 column=login:username, timestamp=1569461617333, value=test1
2 row(s) in 0.0160 seconds

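The biggest difference between get and scan is whether you fetch a single row or multiple rows. get requires a rowkey, which uniquely identifies a row in the table. There are many more query operations, so I will cover them in a separate post.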
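
To make the difference concrete, here is a small Python sketch: get is a point lookup by rowkey, while scan walks rowkeys in sorted order from a start row (inclusive) to a stop row (exclusive), optionally stopping after a limit. This is a toy model, not the HBase API. The get and scan functions are made up for illustration, and the row 1000520190925 with its value is hypothetical.

```python
from bisect import bisect_left

# Toy table: rowkey -> login:username. HBase keeps rows sorted by rowkey.
# The last row is hypothetical, added only to show the exclusive stop row.
rows = {
    "1000120190925": "Tank",
    "1000220190926": "test1",
    "1000320190926": "newtest2",
    "1000520190925": "example",
}
sorted_keys = sorted(rows)


def get(rowkey):
    """Point lookup: at most one row, and the rowkey is mandatory."""
    return rows.get(rowkey)


def scan(start_row="", stop_row=None, limit=None):
    """Range read: start_row is inclusive, stop_row is exclusive."""
    result = []
    for key in sorted_keys[bisect_left(sorted_keys, start_row):]:
        if stop_row is not None and key >= stop_row:
            break
        if limit is not None and len(result) >= limit:
            break
        result.append(key)
    return result


print(get("1000320190926"))
# newtest2
print(scan("1000120190925", "1000520190925", limit=2))
# ['1000120190925', '1000220190926']
print(scan("1000120190925", "1000520190925"))
# ['1000120190925', '1000220190926', '1000320190926']
```

The last call shows that the stop row itself is never returned, which is why STARTROW and ENDROW in the shell example above bracket the rows rather than name them.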

V. Appending data with append

1. append syntax

append '<table name>','<rowkey>','<colfamily:colname>','<value>'

2. Example

hbase(main):061:0> append 'test_ns:user','1000120190925','login:username','_111'
CURRENT VALUE = Tank_111
0 row(s) in 0.0460 seconds
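
In other words, append concatenates the new bytes onto the end of whatever the cell currently holds. A minimal Python sketch of that behavior follows. The append function here is a toy stand-in, not the HBase API, and treating a missing cell as empty is an assumption of the sketch.

```python
def append(cells, key, suffix):
    """Concatenate suffix onto the stored bytes; a missing cell counts as empty."""
    cells[key] = cells.get(key, b"") + suffix
    return cells[key]


# The cell held "Tank" before the append, as in the example above.
cells = {("1000120190925", "login:username"): b"Tank"}
print(append(cells, ("1000120190925", "login:username"), b"_111"))  # b'Tank_111'
```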

VI. Deleting data with delete

1. Syntax

delete '<table name>', '<rowkey>', '<colfamily:colname>', <timestamp>

2. Example

hbase(main):068:0> get 'test_ns:user','1000120190925','login:username'
COLUMN CELL
 login:username timestamp=1569399395200, value=Tank
1 row(s) in 0.0100 seconds

// In the user table, delete the login:username column for rowkey 1000120190925 at timestamp 1569399395200
hbase(main):069:0> delete 'test_ns:user','1000120190925','login:username',1569399395200
0 row(s) in 0.0070 seconds

hbase(main):070:0> get 'test_ns:user','1000120190925','login:username'
COLUMN CELL
0 row(s) in 0.0070 seconds

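VII. Deleting data with deleteall

1. Syntax

deleteall '<table name>', '<rowkey>'
deleteall '<table name>', '<rowkey>', '<column>', <timestamp>

2. Examples

// Delete all data for rowkey 1000120190925 in the user table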
hbase(main):082:0> deleteall 'test_ns:user','1000120190925'
0 row(s) in 0.0120 seconds

// Delete all data in the login column family for rowkey 1000220190926 in the user table
hbase(main):085:0> deleteall 'test_ns:user','1000220190926','login'
0 row(s) in 0.0050 seconds

// Delete all data in the sex column of the info column family for rowkey 1000220190926 in the user table
hbase(main):086:0> deleteall 'test_ns:user','1000220190926','info:sex'
0 row(s) in 0.0070 seconds

// Delete all data in the contact column family for rowkey 1000220190926 in the user table, at timestamp 1569461617380
hbase(main):087:0> deleteall 'test_ns:user','1000220190926','contact',1569461617380
0 row(s) in 0.0060 seconds

VIII. Counting rows

hbase(main):089:0> count 'test_ns:user'
4 row(s) in 0.0300 seconds

=> 4

// Count the rows in the user table, reporting progress every 10 rows, with a scanner cache of 1000
hbase(main):090:0> count 'test_ns:user', INTERVAL => 10, CACHE => 1000
4 row(s) in 0.0130 seconds

=> 4

Please credit the source when reposting
Author: 海底苍鹰
URL: http://blog.51yip.com/hadoop/2188.html