Hive

Run Hive 1 and 2 Together

Overview: The latest HDP 2.6.x has both Hive 1 and Hive 2 installed side by side. However, it does not let users run the Hive 2 command directly; Hive 2 is reachable only through beeline. The lab_dev repository provides a demo VirtualBox image with both Hive versions configured properly.

Conf. Changes: The trick to making both Hive versions work is to stop adding any Hive settings to .profile. See below, where I commented out all previous Hive settings.

Hive Get the Max/Min

in article

February 2, 2018

Big Data Books Reviews

in review

January 10, 2018

Learning Spark SQL. Level: Ent., Mid., Adv. Published in Sep. 2017. Start reading it.

Learning Apache Flink. Level: Ent., Mid. There are very few books about Apache Flink. Besides the official documentation, this is a good one for people who want to get to know Flink quickly. Published in early 2017, this book covers most of Flink's core topics with examples.

Hive RowID Generation

in article

November 2, 2017

Introduction: We often need a unique identifier for each row in an Apache Hive table. Such a column is useful as a surrogate key in a data warehouse, as a primary key for the data, or as a system natural key. There are several ways of doing this in Hive.

ROW_NUMBER(): Hive has a couple of built-in functions to achieve this. The ROW_NUMBER function generates a row number for each row within a partition of data.
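As a rough illustration of the ROW_NUMBER approach, here is a minimal HiveQL sketch. The table `employee` and its columns `emp_name` and `dept` are hypothetical names made up for this example; they do not come from the original post.

```sql
-- Hypothetical table and columns, for illustration only.
SELECT
  -- Numbering restarts at 1 inside each dept partition.
  ROW_NUMBER() OVER (PARTITION BY dept ORDER BY emp_name) AS row_in_dept,
  -- No PARTITION BY: one running number across the whole result set.
  ROW_NUMBER() OVER (ORDER BY emp_name) AS row_id,
  emp_name,
  dept
FROM employee;
```

The first column is only unique within its partition, so it would need to be combined with the partition key to serve as an identifier. The second column is unique across the result, but an unpartitioned window generally has to be computed in a single reducer, which may be slow on large tables.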
