
HBase intra-row scanning

Source: 懂视网 | Editor: 小采 | Date: 2020-11-09 13:31:51

By Lars Hofhansl

Updated (again) Wednesday, January 25th, 2012.

As I painfully worked through HBASE-5229 I realized that HBase already has all the building blocks needed for complex (local) transactions.

What's important here is that (see my introduction to HBase):

  1. HBase ensures atomicity for operations for the same row key
  2. HBase keys have internal structure: (row-key, column family, column, ...)

The missing piece was ColumnRangeFilter. With this filter it is possible to retrieve all columns whose identifier starts with "abc", or all columns whose identifier sorts after "test". For example:

// all columns whose identifier starts with "abc":
// the half-open range ["abc", "abd")
Filter abcPrefix = new ColumnRangeFilter(Bytes.toBytes("abc"), true,
                                         Bytes.toBytes("abd"), false);

// all columns whose identifier sorts strictly after "test":
// minimum exclusive, null maximum = no upper bound
Filter afterTest = new ColumnRangeFilter(Bytes.toBytes("test"), false,
                                         null, true);
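
The upper bound "abd" in the first filter is simply the prefix "abc" with its last byte incremented. A minimal sketch of that derivation follows, in plain Java with no HBase dependency. The class `PrefixBounds` and the method `stopKeyForPrefix` are hypothetical names chosen for illustration; they are not part of the HBase API.

```java
import java.util.Arrays;

/** Hypothetical helper (not HBase API): exclusive upper bound for a prefix range. */
final class PrefixBounds {
    private PrefixBounds() {}

    /**
     * Returns the smallest byte array that sorts after every array starting
     * with the given prefix, or null if there is no such bound (empty prefix
     * or a prefix consisting only of 0xFF bytes).
     */
    static byte[] stopKeyForPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return null; // no finite upper bound exists
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // e.g. 'c' becomes 'd', so "abc" becomes "abd"
        return stop;
    }
}
```

A null result could be passed on as the maximum column of a ColumnRangeFilter, where, as in the second filter above, it stands for "no upper bound".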


ColumnRangeFilter thus makes it possible to search (scan) inside a row by column identifier, just as HBase allows searching by row key.

A client application can exploit this to achieve transactions by grouping all entities that can participate in the same transaction into a single row (and a single column family).
Prefixes of the column identifiers can then be used to define rows inside that group. In effect, the search criteria for keys are moved one level down, to the column identifier.

Say we wanted to implement a store with transactional tables that contain rows and columns. One way of doing this with HBase is as follows:

  • the HBase row-key/column-family maps to a "table"
  • a prefix of the HBase column identifier maps to a "row"
  • the rest of the HBase column identifier identifies the "column"

This is in fact similar to what Google's Megastore (pdf) does.
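
The mapping in the list above can be sketched as a small helper that packs a logical row and column into one HBase column identifier and unpacks it again. This is only an illustration in plain Java: the class `LogicalKey` and its methods are hypothetical names, and the `|` separator is assumed from the key layout `rowA|column1` used in this post.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical key layout "<row>|<column>" inside one HBase column identifier. */
final class LogicalKey {
    static final char SEPARATOR = '|';

    private LogicalKey() {}

    /** Builds the HBase column identifier for a logical (row, column) pair. */
    static byte[] qualifier(String row, String column) {
        // The row part must not contain the separator, otherwise parsing is ambiguous.
        if (row.indexOf(SEPARATOR) >= 0) {
            throw new IllegalArgumentException("row must not contain " + SEPARATOR);
        }
        return (row + SEPARATOR + column).getBytes(StandardCharsets.UTF_8);
    }

    /** Extracts the logical row: everything before the first separator. */
    static String rowOf(byte[] qualifier) {
        String s = new String(qualifier, StandardCharsets.UTF_8);
        return s.substring(0, separatorIndex(s));
    }

    /** Extracts the logical column: everything after the first separator. */
    static String columnOf(byte[] qualifier) {
        String s = new String(qualifier, StandardCharsets.UTF_8);
        return s.substring(separatorIndex(s) + 1);
    }

    private static int separatorIndex(String s) {
        int i = s.indexOf(SEPARATOR);
        if (i < 0) {
            throw new IllegalArgumentException("missing separator in " + s);
        }
        return i;
    }
}
```

Because the logical row comes first in the identifier, all columns of one logical row sort next to each other inside the HBase row, which is what makes a column range scan over them possible.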

This leads to potentially wide HBase rows with many columns. The missing piece is allowing a Scan to efficiently retrieve a slice of a wide row.

This is where ColumnRangeFilter comes into play. The filter seeks efficiently into the row by seeking ahead to the first HBase block that contains the first KeyValue (or cell) in the requested column range.

Let's model a table "pets" this way, and let's say a pet has a name and a species. The HBase key for entries would look like this:
(table, CF1, rowA|column1) -> value for column1 in rowA
The code would look something like this:
(apologies for the initial incorrect code that I had posted here)

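// imports needed by this fragment
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.ColumnRangeFilter;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.util.Bytes;

HTable t = ...;
Scan s = ...;
// start row == stop row: only the single HBase row "pets" is scanned
s.setStartRow(Bytes.toBytes("pets"));
s.setStopRow(Bytes.toBytes("pets"));
// get all columns for my pet "fluffy"
Filter f = new ColumnRangeFilter(Bytes.toBytes("fluffy"), true,
                                 Bytes.toBytes("fluffz"), false);
s.setFilter(f);
s.setBatch(20); // avoid getting all columns for the HBase row at once
ResultScanner rs = t.getScanner(s);
try {
  for (Result r = rs.next(); r != null; r = rs.next()) {
    // r now holds HBase columns that start with "fluffy" (at most 20 per
    // Result because of setBatch), which together represent a single row
    for (KeyValue kv : r.raw()) {
      // each kv represents - the latest version of - a column
    }
  }
} finally {
  rs.close(); // release the scanner's server-side resources
}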
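
To see which cells such a column range selects, the wide row can be modelled with an ordinary sorted map. The sketch below is an in-memory stand-in only, not HBase code: `WideRowModel` is a hypothetical class, the `|` separator is taken from the key layout above, and plain string order is assumed to match HBase's byte order because all identifiers are ASCII.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

/** In-memory stand-in for one wide HBase row (hypothetical model, not HBase API). */
final class WideRowModel {
    // Column identifier -> value, kept sorted like the cells of an HBase row.
    private final NavigableMap<String, String> cells = new TreeMap<>();

    /** Stores one logical cell under the identifier "<row>|<column>". */
    void put(String logicalRow, String column, String value) {
        cells.put(logicalRow + "|" + column, value);
    }

    /** Returns the cells of one logical row: the range [row + "|", row + "}"). */
    NavigableMap<String, String> slice(String logicalRow) {
        // '}' directly follows '|' in ASCII, so row + "}" is an exclusive
        // upper bound for every identifier that starts with row + "|".
        return cells.subMap(logicalRow + "|", true, logicalRow + "}", false);
    }
}
```

With pets "fluffy", "fluffy2" and "rex" stored in the model, `slice("fluffy")` returns only the cells of "fluffy". The range used in the scan above, from "fluffy" up to "fluffz", would also match identifiers such as "fluffy2|name", so including the separator in the bounds may be preferable when one pet name can be a prefix of another.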

The downside of this approach is that HBase achieves atomicity by collocating all cells with the same row-key, so the entire group of entities has to be hosted by a single region server.
