Presto (11): data, when is data pulled from Hive

In the previous post we saw that PageSourceProvider supplies what is needed to read a Hive split. In this post, let's look at who actually uses it.

1. What is a Page

What a Page looks like

public class Page
{
    private final Block[] blocks;
    private final int positionCount;
    private final AtomicLong sizeInBytes = new AtomicLong(-1);
    private final AtomicLong retainedSizeInBytes = new AtomicLong(-1);
    // remaining members omitted
}

A Page holds multiple Blocks, and each Block holds the data of one column (across many rows).
This can be seen from the following code:

blocks[fieldId] = dataPage.getBlock(columnMappings.get(fieldId).getIndex());

public static List<ColumnMapping> buildColumnMappings(List<HivePartitionKey> partitionKeys, List<HiveColumnHandle> columns, Path path)
{
    ImmutableList.Builder<ColumnMapping> columnMappings = ImmutableList.builder();
    for (int i = 0; i < columns.size(); i++) {
        HiveColumnHandle column = columns.get(i);
        // omitted; the code above fetches each column, the line below adds it to the column mapping
        columnMappings.add(new ColumnMapping(column, prefilledValue, currentIndex));
    }
    return columnMappings.build();
}
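To make the column-oriented layout concrete, here is a minimal sketch of a page built from per-column blocks, plus the index remapping done by the `blocks[fieldId] = dataPage.getBlock(...)` line above. `SimpleBlock`, `SimplePage`, and `remap` are hypothetical stand-ins written for illustration, not Presto's actual classes.

```java
final class PageSketch {
    /** One column of values; a hypothetical stand-in for Presto's Block. */
    static final class SimpleBlock {
        private final Object[] values;

        SimpleBlock(Object... values) {
            this.values = values.clone();
        }

        int getPositionCount() {
            return values.length;
        }

        Object get(int position) {
            return values[position];
        }
    }

    /** A batch of rows stored column by column; a hypothetical stand-in for Page. */
    static final class SimplePage {
        private final SimpleBlock[] blocks;
        private final int positionCount;

        SimplePage(SimpleBlock... blocks) {
            if (blocks.length == 0) {
                throw new IllegalArgumentException("a page needs at least one block");
            }
            this.positionCount = blocks[0].getPositionCount();
            for (SimpleBlock block : blocks) {
                if (block.getPositionCount() != positionCount) {
                    throw new IllegalArgumentException("all blocks must have the same row count");
                }
            }
            this.blocks = blocks.clone();
        }

        int getChannelCount() {
            return blocks.length;
        }

        int getPositionCount() {
            return positionCount;
        }

        SimpleBlock getBlock(int channel) {
            return blocks[channel];
        }
    }

    /** Output field i is taken from data column sourceIndexes[i]. */
    static SimplePage remap(SimplePage dataPage, int[] sourceIndexes) {
        SimpleBlock[] blocks = new SimpleBlock[sourceIndexes.length];
        for (int fieldId = 0; fieldId < sourceIndexes.length; fieldId++) {
            blocks[fieldId] = dataPage.getBlock(sourceIndexes[fieldId]);
        }
        return new SimplePage(blocks);
    }

    public static void main(String[] args) {
        SimplePage dataPage = new SimplePage(
                new SimpleBlock(1, 2, 3),
                new SimpleBlock("a", "b", "c"));
        // Swap the two columns, the way a column mapping might reorder them.
        SimplePage out = remap(dataPage, new int[] {1, 0});
        System.out.println(out.getBlock(0).get(2));
    }
}
```

Note that remapping only moves references to whole columns; no row data is copied, which is one reason a columnar page is convenient for projections.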

2. Two classes: ScanFilterAndProjectOperator and TableScanOperator
In both classes, the getOutput and isFinished methods call createSourceIfNecessary().

private void createSourceIfNecessary()
{
    if ((split != null) && (source == null)) {
        source = pageSourceProvider.createPageSource(operatorContext.getSession(), split, columns);
    }
}
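The effect of this guard is that the page source is created lazily: nothing is opened until a split has been assigned, and it is opened only once. The sketch below illustrates that pattern under simplifying assumptions. `ScanOperator`, `PageSource`, `addSplit`, and `singlePageSource` are hypothetical names for this example, with a split reduced to a plain String; they are not Presto's real signatures.

```java
import java.util.function.Function;

final class LazySourceSketch {
    /** Hypothetical, simplified page source: hands out pages until finished. */
    interface PageSource {
        String nextPage();

        boolean isFinished();
    }

    /** A source that yields exactly one page for the given split. */
    static PageSource singlePageSource(String split) {
        return new PageSource() {
            private boolean done;

            @Override
            public String nextPage() {
                if (done) {
                    return null;
                }
                done = true;
                return "page-of-" + split;
            }

            @Override
            public boolean isFinished() {
                return done;
            }
        };
    }

    /** Hypothetical operator that opens its source on first use. */
    static final class ScanOperator {
        private final Function<String, PageSource> provider;
        private String split;
        private PageSource source;

        ScanOperator(Function<String, PageSource> provider) {
            this.provider = provider;
        }

        void addSplit(String split) {
            this.split = split;
        }

        // Same guard as in the text: create only when a split exists and no source yet.
        private void createSourceIfNecessary() {
            if (split != null && source == null) {
                source = provider.apply(split);
            }
        }

        String getOutput() {
            createSourceIfNecessary();
            return source == null ? null : source.nextPage();
        }

        boolean isFinished() {
            createSourceIfNecessary();
            return source != null && source.isFinished();
        }
    }

    public static void main(String[] args) {
        ScanOperator operator = new ScanOperator(LazySourceSketch::singlePageSource);
        System.out.println(operator.getOutput()); // no split yet, so no source is created
        operator.addSplit("split-1");
        System.out.println(operator.getOutput());
        System.out.println(operator.isFinished());
    }
}
```

Calling the guard from both getOutput and isFinished means whichever method the driver invokes first triggers the read, so the operator never needs a separate "open" step.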

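3. As shown above, Hive data is read from inside the Operator
Different SQL statements ultimately produce different concrete operators. Which Operator implementation is created is decided by two things defined in PlanFragment: symbol and type. These two parameters are delivered over the RESTful interface to LocalExecutionPlan, which creates the concrete Operator. LocalExecutionPlan is produced by SqlTaskExecution.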
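As a rough illustration of how a plan shape can decide which of the two scan operators is used, here is a small sketch. It assumes, as a simplification, that a bare table scan maps to TableScanOperator and that a filter or projection sitting directly on a table scan is fused into ScanFilterAndProjectOperator. `NodeKind`, `PlanNode`, and `chooseOperator` are hypothetical names for this example and do not mirror Presto's planner code.

```java
final class OperatorChoiceSketch {
    enum NodeKind { TABLE_SCAN, FILTER, PROJECT }

    /** Hypothetical plan node: a kind plus an optional single source node. */
    static final class PlanNode {
        final NodeKind kind;
        final PlanNode source;

        PlanNode(NodeKind kind, PlanNode source) {
            this.kind = kind;
            this.source = source;
        }
    }

    /** Returns the name of the operator this sketch would pick for the node. */
    static String chooseOperator(PlanNode node) {
        if (node.kind == NodeKind.TABLE_SCAN) {
            return "TableScanOperator";
        }
        // Filter or project directly above a scan: read and filter/project in one operator.
        if (node.source != null && node.source.kind == NodeKind.TABLE_SCAN) {
            return "ScanFilterAndProjectOperator";
        }
        // Otherwise the filter/project runs on pages produced by some other operator.
        return "FilterAndProjectOperator";
    }

    public static void main(String[] args) {
        PlanNode scan = new PlanNode(NodeKind.TABLE_SCAN, null);
        PlanNode filter = new PlanNode(NodeKind.FILTER, scan);
        System.out.println(chooseOperator(scan));
        System.out.println(chooseOperator(filter));
    }
}
```

Either way, the operator that ends up owning the scan is the one that calls createSourceIfNecessary() and therefore the one that pulls data from Hive.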

TODO: later posts in this series will cover how everything works from parsing the SQL statement to reading data from Hive.
