In my previous post on Amazon MapReduce software, I mentioned that I was using the CLI (Command Line Interface) to manage MapReduce instances with Job Flows. Over the last few days, I’ve been working with the Python and Ruby examples and using S3 Tools from the command line, which is a separate package that has to be installed.
Most of the examples worked as expected or needed only minor tweaks, and I learned the most from the ones that ran smoothly. This example, Word Count, worked fine from the CLI.
I found some of the examples from the PDF document, Getting Started with Amazon Elastic MapReduce, easier to work with than others. For the two examples listed next, I spent hours searching online for fixes, to no avail:
- Job Flow #1a – Storing names in Amazon SimpleDB
- Using the Amazon Elastic MapReduce CLI
- On February 5, 2011, I was able to make changes to the “freebase_jobflow.json” file that allowed Step 1 to complete. However, Step 2 still fails. 🙁
I’m sure that a support person from Amazon’s MapReduce team could fix the issues and post updated examples that run correctly as described.
While researching fixes, I came across this page discussing one part of the fix: an update to the Ruby code for the MapReduce Ruby CLI. I found it online only after spending about an hour debugging why the JSON file wouldn’t parse correctly. 🙁
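When a JSON file refuses to parse, one quick sanity check is to run it through a standalone parser that reports where it gave up. The snippet below is only a minimal sketch in Python, not the Ruby CLI’s own parser. The helper name `locate_json_error` and the sample fragment are made up for illustration and are not taken from the real “freebase_jobflow.json” file. It also cannot tell you whether the problem is in the file or in the tool reading it.

```python
import json


def locate_json_error(text):
    """Return None if text is valid JSON, else a (line, column, message) tuple."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno are 1-based positions where the parser stopped.
        return (err.lineno, err.colno, err.msg)
    return None


# Hypothetical fragment with a trailing comma, a common hand-editing mistake.
broken = '[\n  {"Name": "step one",}\n]'
print(locate_json_error(broken))

# A well-formed document returns None.
print(locate_json_error('[{"Name": "step one"}]'))
```

To check a real file, read its contents with `open(path).read()` and pass the string to the helper. If the file parses cleanly here but the CLI still rejects it, that points toward the tool rather than the file.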
I enjoyed learning these new skills even with the buggy examples. I’ll continue investigating more uses of Amazon’s MapReduce technologies, which are available at extremely low cost.
In closing, I would like to suggest that code examples always include the “as required” usage syntax and not just the specification syntax. In other words, show the literal character sequence for a command in its entirety, exactly as it needs to be entered on the command line, so that the program runs without error. For example, in this hypothetical command, instead of something like,
Thank you. 🙂