{"id":1799,"date":"2014-04-28T15:22:34","date_gmt":"2014-04-28T22:22:34","guid":{"rendered":"http:\/\/3.209.169.194\/blogs\/bobb\/?p=1799"},"modified":"2014-04-28T16:06:00","modified_gmt":"2014-04-28T23:06:00","slug":"installing-running-hdp-2-1-windows","status":"publish","type":"post","link":"https:\/\/www.sqlskills.com\/blogs\/bobb\/installing-running-hdp-2-1-windows\/","title":{"rendered":"Installing and Running HDP 2.1 on Windows"},"content":{"rendered":"<p>And now, for something completely different. Hadoop.<\/p>\n<p>I&#8217;ve been playing with everything I can get my hands on (and can find time for) with Hadoop for a while now. Have tried Hortonworks on CentOS (seems like the Hortonworks &#8220;reference&#8221; platform), Hortonworks HDP on Windows, and HDInsight. I even think I may have a clue about what big data is and why it matters. But that&#8217;s a topic for another time. It&#8217;s intriguing that Hadoop on Linux has built-in GUI dev tools (like Hue) but the Windows versions (put out only by Hortonworks\/Microsoft) have no GUI except for the NameNode\/YARN\/HBase status pages, and almost mandate that you work in PowerShell. I thought the Linux folks were the bigger command-line wonks. I haven&#8217;t tried any other distros (e.g. Cloudera) yet (or tried building it myself, for that matter).<\/p>\n<p>Over the weekend, I installed Hortonworks&#8217; new GA 2.1.1.0 distro for Windows. The main reason I installed it was to try out Hive 0.13 (with vectorized query &#8211; think SQL Server columnstore batch mode, with ORC files as the columnstore) and Hive under Tez. Think of Tez as a more flexible, powerful, faster MapReduce. I did a single-node install for simplicity. Ran into some problems and anomalies that I&#8217;ll mention here. But did get most of what there is working. 
I say &#8220;what there is&#8221; because the reference diagram at <a href=\"http:\/\/hortonworks.com\/hdp\/whats-new\/\">http:\/\/hortonworks.com\/hdp\/whats-new\/<\/a> mentions all the things I do see, but the Windows distro doesn&#8217;t include Accumulo (another fast data storage offering based on BigTable) or Solr (a search offering). Perhaps the CentOS distro does. And, of course, no Hue. And, although there&#8217;s an exe and it&#8217;s in the list of Windows services, HWI (Hive Web Interface) doesn&#8217;t start up. That&#8217;s been reported by others too.<\/p>\n<p>BTW, this version isn&#8217;t available on HDInsight yet. Maybe soon. Microsoft allocated some engineers to Hive 0.13 (and other phases of the <a href=\"http:\/\/hortonworks.com\/labs\/stinger\/\">Stinger Initiative<\/a>), so I&#8217;m pretty sure they want it in HDInsight too.<\/p>\n<p>There were the usual potholes in the docs, if you&#8217;re trying to follow them by rote. First, the docs warn you not to use path names with spaces in the Java or Python installs (<a href=\"http:\/\/docs.hortonworks.com\/HDPDocuments\/HDP2\/HDP-2.1.1-Win\/bk_installing_hdp_for_windows\/content\/win-software-install-gui.html\">http:\/\/docs.hortonworks.com\/HDPDocuments\/HDP2\/HDP-2.1.1-Win\/bk_installing_hdp_for_windows\/content\/win-software-install-gui.html<\/a>). Good idea in any case (you don&#8217;t have to put quotes around names in command lines) but especially with Unix ports to Windows. But the warning appears in the Python directions, and Python 2.7 doesn&#8217;t install in &#8220;Program Files&#8221;. Java does. And one of the instructions (step 5c) says you should add e.g. C:\\Java\\jdk1.7.0_45\\bin as the value for JAVA_HOME. Don&#8217;t include the &#8220;bin&#8221;. Step 6b is wrong too; the path should include the &#8220;bin&#8221; directory. Easy to fix. The install reports &#8220;JAVA_HOME is not set&#8221; if you mess it up. 
Ask me how I know&#8230;I should have been paying attention.<\/p>\n<p>If you choose &#8220;Run Hive Under Tez&#8221; you must perform Section 6.1 &#8220;Setting up Tez for Hive&#8221;. But there&#8217;s a glitch there too. The command in Step 5 (&#8220;Copy the Tez home directory on the local machine into the HDFS \/apps\/tez directory&#8221;) is wrong.<br \/>\n%HADOOP_HOME%\\bin\\hadoop.cmd dfs -put %TEZ_HOME%* \/apps\/tez<br \/>\nshould be:<br \/>\n%HADOOP_HOME%\\bin\\hadoop.cmd dfs -put %TEZ_HOME%\/* \/apps\/tez<br \/>\n(note the slash after %TEZ_HOME% and before the *)<\/p>\n<p>Until you do that, the Hive command line app fails at startup because \/apps\/tez\/lib is not found.<\/p>\n<p>And it&#8217;s likely that step 6 should be:<br \/>\n%HADOOP_HOME%\\bin\\hadoop.cmd dfs -rmr -skipTrash \/apps\/tez\/conf<br \/>\nbecause there is no conf3 directory.<\/p>\n<p>With the right copy command the Tez smoke test and Hive under Tez run just fine. I did have trouble with the HCatalog and Hive smoke tests at first (something about the hadoop user needing a GRANT; GRANTs are expanded in Hive 0.13), but after enough tries (3 reinstalls of the distro and 7-8 runs of smoke tests) it just started working. Don&#8217;t know exactly what I did to make that start happening. So all the smoke tests (including those for optional components) now succeed.<\/p>\n<p>So THANKS folks! Lots more to try. There&#8217;s a lovely <a href=\"http:\/\/hortonworks.com\/hadoop-tutorial\/supercharging-interactive-queries-hive-tez\/\">Hive\/Tez\/vectorized query tutorial <\/a>(do it sans-GUI) that shows the new features off. And I&#8217;m glad that I can specify precision\/scale for decimal types. And glad for all the other Stinger features and additional ecosystem components too.<\/p>\n<p>BTW, I ran this on Windows 2012 R2 OS, even though the docs say only Windows 2008 R2 and 2012 OS are officially supported. 
They list these under minimum requirements.<\/p>\n<p>Hopefully this information will be helpful to someone else.<\/p>\n<p>Cheers, @bobbeauch<\/p>\n","protected":false},"excerpt":{"rendered":"<p>And now, for something completely different. Hadoop. I&#8217;ve been playing with everything I can get my hands on (and can find time for) with Hadoop for a while now. Have tried Hortonworks on CentOS (seems like the Hortonworks &#8220;reference&#8221; platform), Hortonworks HDP on Windows, and HDInsight. I even think I may have a clue about [&hellip;]<\/p>\n","protected":false},"author":7,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[44],"tags":[],"class_list":["post-1799","post","type-post","status-publish","format-standard","hentry","category-hadoop"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v21.9.1 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Installing and Running HDP 2.1 on Windows - Bob Beauchemin<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.sqlskills.com\/blogs\/bobb\/installing-running-hdp-2-1-windows\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Installing and Running HDP 2.1 on Windows - Bob Beauchemin\" \/>\n<meta property=\"og:description\" content=\"And now, for something completely different. Hadoop. I&#8217;ve been playing with everything I can get my hands on (and can find time for) with Hadoop for a while now. Have tried Hortonworks on CentOS (seems like the Hortonworks &#8220;reference&#8221; platform), Hortonworks HDP on Windows, and HDInsight. 
I even think I may have a clue about [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.sqlskills.com\/blogs\/bobb\/installing-running-hdp-2-1-windows\/\" \/>\n<meta property=\"og:site_name\" content=\"Bob Beauchemin\" \/>\n<meta property=\"article:published_time\" content=\"2014-04-28T22:22:34+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2014-04-28T23:06:00+00:00\" \/>\n<meta name=\"author\" content=\"Bob Beauchemin\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Bob Beauchemin\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/bobb\/installing-running-hdp-2-1-windows\/\",\"url\":\"https:\/\/www.sqlskills.com\/blogs\/bobb\/installing-running-hdp-2-1-windows\/\",\"name\":\"Installing and Running HDP 2.1 on Windows - Bob 
Beauchemin\",\"isPartOf\":{\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/bobb\/#website\"},\"datePublished\":\"2014-04-28T22:22:34+00:00\",\"dateModified\":\"2014-04-28T23:06:00+00:00\",\"author\":{\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/bobb\/#\/schema\/person\/62bfa986c5b5d28fcffd8b4fc409c73e\"},\"breadcrumb\":{\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/bobb\/installing-running-hdp-2-1-windows\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.sqlskills.com\/blogs\/bobb\/installing-running-hdp-2-1-windows\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/bobb\/installing-running-hdp-2-1-windows\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.sqlskills.com\/blogs\/bobb\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Hadoop\",\"item\":\"https:\/\/www.sqlskills.com\/blogs\/bobb\/category\/hadoop\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Installing and Running HDP 2.1 on Windows\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/bobb\/#website\",\"url\":\"https:\/\/www.sqlskills.com\/blogs\/bobb\/\",\"name\":\"Bob Beauchemin\",\"description\":\"SQL Server Blog\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.sqlskills.com\/blogs\/bobb\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/bobb\/#\/schema\/person\/62bfa986c5b5d28fcffd8b4fc409c73e\",\"name\":\"Bob 
Beauchemin\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/bobb\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/6f80e6cc667410857fa6a21931dc528b8092f4d112bf7a8ff7c267674d44ee37?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/6f80e6cc667410857fa6a21931dc528b8092f4d112bf7a8ff7c267674d44ee37?s=96&d=mm&r=g\",\"caption\":\"Bob Beauchemin\"},\"sameAs\":[\"http:\/www.sqlskills.com\/blogs\/bobb\/\"],\"url\":\"https:\/\/www.sqlskills.com\/blogs\/bobb\/author\/bobb\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Installing and Running HDP 2.1 on Windows - Bob Beauchemin","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.sqlskills.com\/blogs\/bobb\/installing-running-hdp-2-1-windows\/","og_locale":"en_US","og_type":"article","og_title":"Installing and Running HDP 2.1 on Windows - Bob Beauchemin","og_description":"And now, for something completely different. Hadoop. I&#8217;ve been playing with everything I can get my hands on (and can find time for) with Hadoop for a while now. Have tried Hortonworks on CentOS (seems like the Hortonworks &#8220;reference&#8221; platform), Hortonworks HDP on Windows, and HDInisght. I even think I may have a clue about [&hellip;]","og_url":"https:\/\/www.sqlskills.com\/blogs\/bobb\/installing-running-hdp-2-1-windows\/","og_site_name":"Bob Beauchemin","article_published_time":"2014-04-28T22:22:34+00:00","article_modified_time":"2014-04-28T23:06:00+00:00","author":"Bob Beauchemin","twitter_misc":{"Written by":"Bob Beauchemin","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.sqlskills.com\/blogs\/bobb\/installing-running-hdp-2-1-windows\/","url":"https:\/\/www.sqlskills.com\/blogs\/bobb\/installing-running-hdp-2-1-windows\/","name":"Installing and Running HDP 2.1 on Windows - Bob Beauchemin","isPartOf":{"@id":"https:\/\/www.sqlskills.com\/blogs\/bobb\/#website"},"datePublished":"2014-04-28T22:22:34+00:00","dateModified":"2014-04-28T23:06:00+00:00","author":{"@id":"https:\/\/www.sqlskills.com\/blogs\/bobb\/#\/schema\/person\/62bfa986c5b5d28fcffd8b4fc409c73e"},"breadcrumb":{"@id":"https:\/\/www.sqlskills.com\/blogs\/bobb\/installing-running-hdp-2-1-windows\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.sqlskills.com\/blogs\/bobb\/installing-running-hdp-2-1-windows\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.sqlskills.com\/blogs\/bobb\/installing-running-hdp-2-1-windows\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.sqlskills.com\/blogs\/bobb\/"},{"@type":"ListItem","position":2,"name":"Hadoop","item":"https:\/\/www.sqlskills.com\/blogs\/bobb\/category\/hadoop\/"},{"@type":"ListItem","position":3,"name":"Installing and Running HDP 2.1 on Windows"}]},{"@type":"WebSite","@id":"https:\/\/www.sqlskills.com\/blogs\/bobb\/#website","url":"https:\/\/www.sqlskills.com\/blogs\/bobb\/","name":"Bob Beauchemin","description":"SQL Server Blog","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.sqlskills.com\/blogs\/bobb\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.sqlskills.com\/blogs\/bobb\/#\/schema\/person\/62bfa986c5b5d28fcffd8b4fc409c73e","name":"Bob 
Beauchemin","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.sqlskills.com\/blogs\/bobb\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/6f80e6cc667410857fa6a21931dc528b8092f4d112bf7a8ff7c267674d44ee37?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/6f80e6cc667410857fa6a21931dc528b8092f4d112bf7a8ff7c267674d44ee37?s=96&d=mm&r=g","caption":"Bob Beauchemin"},"sameAs":["http:\/www.sqlskills.com\/blogs\/bobb\/"],"url":"https:\/\/www.sqlskills.com\/blogs\/bobb\/author\/bobb\/"}]}},"_links":{"self":[{"href":"https:\/\/www.sqlskills.com\/blogs\/bobb\/wp-json\/wp\/v2\/posts\/1799","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.sqlskills.com\/blogs\/bobb\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sqlskills.com\/blogs\/bobb\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/bobb\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/bobb\/wp-json\/wp\/v2\/comments?post=1799"}],"version-history":[{"count":0,"href":"https:\/\/www.sqlskills.com\/blogs\/bobb\/wp-json\/wp\/v2\/posts\/1799\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.sqlskills.com\/blogs\/bobb\/wp-json\/wp\/v2\/media?parent=1799"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/bobb\/wp-json\/wp\/v2\/categories?post=1799"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/bobb\/wp-json\/wp\/v2\/tags?post=1799"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}