This is the code repository for Apache Hive Essentials - Second Edition, published by Packt.
Essential techniques to help you process, and get unique insights from, big data
In this book, we prepare you for your journey into big data by first introducing the background of the big data domain, along with the process of setting up and getting familiar with your Hive working environment.
This book covers the following exciting features:
- Create and set up the Hive environment
- Discover how to use Hive's definition language to describe data
- Discover interesting data by joining and filtering datasets in Hive
- Transform data by using Hive sorting, ordering, and functions
- Aggregate and sample data in different ways
If you feel this book is for you, get your copy today!
All of the code is organized into folders. For example, Chapter02.
The code will look like the following:
export HADOOP_HOME=/opt/hadoop
export HADOOP_CONF_DIR=/opt/hadoop/conf
export HIVE_HOME=/opt/hive
export HIVE_CONF_DIR=/opt/hive/conf
export PATH=$PATH:$HIVE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
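A quick way to confirm that exports like the ones above took effect is to check that each `bin` directory actually appears in `PATH`. The following is a minimal sketch, not code from the book: `path_contains` is a hypothetical helper written for this illustration, and the `/opt/hadoop` and `/opt/hive` locations are simply the sample values shown above, so adjust them to your own installation.

```shell
# Sample install locations taken from the exports above (assumed; adjust as needed).
export HADOOP_HOME=/opt/hadoop
export HIVE_HOME=/opt/hive
export PATH=$PATH:$HIVE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# path_contains DIR: succeed if DIR is one of the colon-separated PATH entries.
# Hypothetical helper for this sketch; it is not part of Hive or Hadoop.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report whether each expected directory is on PATH.
for dir in "$HIVE_HOME/bin" "$HADOOP_HOME/bin" "$HADOOP_HOME/sbin"; do
  if path_contains "$dir"; then
    echo "on PATH: $dir"
  else
    echo "missing from PATH: $dir"
  fi
done
```

The check only inspects the `PATH` string; it does not verify that the directories exist or that Hive and Hadoop are installed in them.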
Following is what you need for this book: If you are a data analyst, developer, or simply someone who wants to quickly get started with Hive to explore and analyze big data in Hadoop, this is the book for you. Since Hive's query language is SQL-like, some previous experience with SQL will help you get the most out of this book.
With the following software and hardware list, you can run all the code files present in the book (Chapters 1-10).
Chapter | Software required | OS required |
---|---|---|
2,3,4,5,6,7,8 | NA | Windows, Mac OS X, and Linux (Any) |
8 | Eclipse | Windows, Mac OS X, and Linux (Any) |
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. Click here to download it.
Dayong Du is a big data practitioner, author, and coach with over 10 years' experience in technology consulting, designing, and implementing enterprise big data architecture and analytics in various industries, including finance, media, travel, and telecoms. He has a master's degree in computer science from Dalhousie University and is a Cloudera certified Hadoop developer. He is a cofounder of the Toronto Big Data Professional Association and the founder of the DataFiber website.
Click here if you have any feedback or suggestions.