I worked extensively with the Hive language on Amazon Elastic MapReduce (EMR) these last two days. The effort was not trivial, and I came away with a lot more insight into Unix, Terminal, and the EMR technologies. Putting in the required amount of work is key to arriving at a comprehensive understanding of those technologies.
I previously completed other EMR examples using the Pig, Python, and Ruby languages. Those experiences and others are documented in this blog’s Technology category.
Before performing the steps below, I suggest that you:
- Make sure your Key Pair files are located on a Unix-friendly file path and have the correct permissions. I set mine with ‘og-rwx’ (without the single quotes) using the chmod command.
- Recognize that the command prompt is not part of the provided scripts or statements when you enter them in the command-line interface or in the other documents the steps ask for.
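For anyone unsure what that permission change looks like, here is a minimal sketch. It assumes a throwaway temporary file standing in for a real Key Pair file (the `key_file` variable is my own placeholder, not something from the tutorials), so you would substitute the path to your own file. Notice that no prompt character is typed; only the commands themselves are.

```shell
# Tutorials often show a prompt such as "$ " before each command.
# That prompt is printed by the shell; it is not part of what you type.

# Hypothetical stand-in for a real Key Pair file (an assumption for this demo).
key_file="$(mktemp)"

# Start from a deliberately loose mode so the change is visible.
chmod 644 "$key_file"

# Remove read, write, and execute permission from group and others.
chmod og-rwx "$key_file"

# Characters 5-10 of the "ls -l" mode string are the group and other bits.
perms="$(ls -l "$key_file" | cut -c5-10)"
echo "group/other bits: $perms"
```

The owner's own permissions are left untouched; only the group and other bits are cleared, which is what SSH-style tools generally expect of a private key file.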
Here are my suggested steps:
- Become familiar with Hive by watching the video, Video Tutorial: Getting started with Hive on Amazon Elastic MapReduce.
- Read, Running Hive on Amazon Elastic MapReduce.
- Consider using these references as needed:
- Read, then sign up for, Elastic MapReduce, and carefully proceed through all steps:
- Work the, Create a Job Flow Using Hive, example. For a deeper explanation see the expanded version, Contextual Advertising using Apache Hive and Amazon Elastic MapReduce with High Performance Computing instances.
- Work the, Operating a Data Warehouse with Hive, Amazon Elastic MapReduce and Amazon SimpleDB, example. To see the results, you can use Firefox SDB Tool [To install, open Firefox and point to this Add-on]. It’s easy to use. Look under the SimpleDB domain created by the example. When finished, delete the domain(s) listed using SDB Tool.
I’m happy to have completed those exercises after many attempts at debugging the process. Researching online as part of the effort made me aware of more tools and ideas. That’s all part of learning these days, no matter what the discipline. 😉
As an extra for Mac OS X Snow Leopard users, consider installing SymbolicLinker, a contextual menu plugin. It’s not needed for the steps in this post, but it comes in handy when you have multiple versions of programming languages installed on your computer. Why have multiple versions, you ask? The very short answer is so that your lesson tools work as expected by using the indicated program version. A word to the wise! 🙂
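If you prefer Terminal, the same kind of symbolic link can be made with the `ln -s` command. The sketch below is only an illustration: the directory names (`ruby-1.8.7`, `ruby-1.9.2`, `ruby-current`) are hypothetical placeholders inside a temporary folder, not real installations.

```shell
# Work inside a temporary folder so nothing on the system is changed.
demo_dir="$(mktemp -d)"

# Hypothetical stand-ins for two installed versions of a language.
mkdir "$demo_dir/ruby-1.8.7" "$demo_dir/ruby-1.9.2"

# Point a stable name at the version a lesson asks for.
ln -s "$demo_dir/ruby-1.8.7" "$demo_dir/ruby-current"
echo "now using: $(readlink "$demo_dir/ruby-current")"

# Switch versions by removing the link and recreating it.
rm "$demo_dir/ruby-current"
ln -s "$demo_dir/ruby-1.9.2" "$demo_dir/ruby-current"
echo "now using: $(readlink "$demo_dir/ruby-current")"
```

Removing the link only removes the pointer; the version folders themselves are untouched.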