<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>LectureMaker, LLC &#187; Michael E. Driscoll</title>
	<atom:link href="http://www.lecturemaker.com/tag/michael-e-driscoll/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.lecturemaker.com</link>
	<description>Teach, Market, and Sell From a Broadcast Video Studio, Supported by Influential Speech Direction, and Distributed on Smart Internet Video Publishing Platforms</description>
	<lastBuildDate>Sat, 04 Feb 2012 04:39:59 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>RHIPE: An Interface to Hadoop and R for Large and Complex Data Analysis</title>
		<link>http://www.lecturemaker.com/2011/02/rhipe/</link>
		<comments>http://www.lecturemaker.com/2011/02/rhipe/#comments</comments>
		<pubDate>Sun, 13 Feb 2011 20:17:06 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[R users group]]></category>
		<category><![CDATA[Videos]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Bay Area useR Group]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Incremental Algorithms]]></category>
		<category><![CDATA[Jyotsna Paintal]]></category>
		<category><![CDATA[Large Data Sets]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[Michael E. Driscoll]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Revolution Analytics]]></category>
		<category><![CDATA[Ron Fredericks]]></category>
		<category><![CDATA[Saptarshi Guha]]></category>
		<category><![CDATA[streaming]]></category>
		<category><![CDATA[Terabytes]]></category>
		<category><![CDATA[William S. Cleveland]]></category>

		<guid isPermaLink="false">http://www.lecturemaker.com/?p=3005</guid>
		<description><![CDATA[LectureMaker's video hosting platform includes random access video playback with navigation topics, text to math symbol publishing, and source code display with both language highlighting and links to language references. As an example...<br />
Dr. Saptarshi Guha's event video demonstrates his open-source interface between R and Hadoop called RHIPE to a packed Bay Area R User Group audience at Facebook. <a href="http://www.lecturemaker.com/2011/02/rhipe/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><script type="text/javascript" src="/scripts/highlight_me.js"></script><br />
<script type="text/javascript" src="/scripts/lmplay_support_v10.js"></script><br />
Ron Fredericks writes: <a href="http://www.stat.purdue.edu/~sguha/">Dr. Saptarshi Guha</a> created an open-source interface between R and Hadoop called the <em>R and Hadoop Integrated Processing Environment</em>, or RHIPE for short. LectureMaker was on the scene filming Saptarshi&#8217;s RHIPE presentation to the <a href="http://www.meetup.com/R-Users/events/11203642/">Bay Area&#8217;s useR Group</a>, introduced by <a href="http://www.dataspora.com/blog/about/">Michael E. Driscoll</a> and hosted at Facebook&#8217;s Palo Alto office on March 9th, 2010. Special thanks to <a href="http://www.meetup.com/R-Users/members/11576660/">Jyotsna Paintal</a> for helping me film the event.</p>
<p>Saptarshi received his Ph.D. from Purdue University in 2010, where he was advised by <a href="http://www.stat.purdue.edu/people/faculty/wsc">Dr. William S. Cleveland</a>. As of the last update to this blog post, Saptarshi works at <a href="http://blog.revolutionanalytics.com/2010/10/the-r-files-saptarshi-guha.html">Revolution Analytics</a> in Palo Alto.</p>
<blockquote><p>Hadoop is an open source implementation of both the MapReduce programming model, and the underlying file system Google developed to support web scale data.</p>
<p>The MapReduce programming model was designed by Google to enable a clean abstraction between large scale data analysis tasks and the underlying systems challenges involved in ensuring reliable large-scale computation. By adhering to the MapReduce model, your data processing job can be easily parallelized and the programmer doesn’t have to think about the system level details of synchronization, concurrency, hardware failure, etc.</p>
<p>Reference: &#8220;<a href="http://www.cloudera.com/blog/2009/05/5-common-questions-about-hadoop/">5 common questions about Hadoop</a>,&#8221; a Cloudera blog post by Christophe Bisciglia, May 2009.
</p></blockquote>
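<p>To make the programming model in the quote concrete, here is a minimal sketch in plain base R that counts words with a map step, a shuffle step, and a reduce step. It does not use Hadoop or RHIPE; the input lines and the names <code>map_fn</code>, <code>pairs</code>, and <code>grouped</code> are made up for illustration.</p>

```r
# Toy illustration of the MapReduce model in base R (no Hadoop involved).
# The input lines and all function names below are invented for this sketch.
lines <- c("r and hadoop", "hadoop and mapreduce", "r")

# Map: emit one (key, value) pair per word, with value 1.
map_fn <- function(line) {
  words <- strsplit(line, " +")[[1]]
  lapply(words, function(w) list(key = w, value = 1L))
}
pairs <- unlist(lapply(lines, map_fn), recursive = FALSE)

# Shuffle: group the emitted values by key.
keys    <- vapply(pairs, function(p) p$key, character(1))
values  <- vapply(pairs, function(p) p$value, integer(1))
grouped <- split(values, keys)

# Reduce: collapse each key's values to a single result.
counts <- vapply(grouped, sum, integer(1))
print(counts)
```

<p>On a real cluster the map and reduce functions would run on many machines at once, and the framework would handle the grouping between them. Only the two functions are the programmer&#8217;s concern.</p>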
<p>RHIPE lets the R programmer submit large datasets to Hadoop, where Map, Combine, Shuffle, and Reduce steps run the analysis in parallel at high speed. The figure below gives an overview of the video&#8217;s key points and use cases.<br />
<a href="http://www.lecturemaker.com/wp-content/uploads/2011/02/mapreduceDIagram_v2.jpg"><img src="http://www.lecturemaker.com/wp-content/uploads/2011/02/mapreduceDIagram_v2.jpg" alt="" title="mapreduceDIagram_v2" width="640" height="330" class="alignleft size-full wp-image-3137" /></a></p>
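<p>The role of the Combine step can be sketched locally in base R. This is a toy model under stated assumptions, not RHIPE code: the two data frames stand in for input blocks handled by separate map tasks, and the connection names and byte counts are invented.</p>

```r
# Toy sketch of Map, Combine, Shuffle, Reduce in base R (no Hadoop, no RHIPE).
# The two "splits" stand in for input blocks handled by separate map tasks;
# the connection names and byte counts are invented for illustration.
splits <- list(
  data.frame(conn = c("a", "b", "a"), bytes = c(200, 120, 200)),
  data.frame(conn = c("b", "a"),      bytes = c(80, 100))
)

# Map + Combine: each task sums bytes per key locally, so only one
# partial sum per key leaves the task instead of one value per record.
combine_task <- function(df) vapply(split(df$bytes, df$conn), sum, numeric(1))
partials <- lapply(splits, combine_task)

# Shuffle: bring together all partial sums that share a key.
all_partials <- unlist(partials)
shuffled <- split(unname(all_partials), names(all_partials))

# Reduce: merge the partial sums into the final per-key total.
totals <- vapply(shuffled, sum, numeric(1))
print(totals)
```

<p>Because addition is associative, summing the partial sums gives the same totals as summing every record in the reducer, while less data crosses the network during the shuffle.</p>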
<p>
<a name="video"></a></p>
<h2><a href="http://www.lecturemaker.com/2011/02/rhipe/#video" title="use this link to reference the video">The RHIPE Video</a></h2>
<blockquote><p>Note: On August 5th, 2011, LectureMaker&#8217;s new video player, version 4.1, was put into use on this page. &#8220;Now you can hot link to video content, just like the <a href="#table" title="jump to the table">Video Topics and Navigation Table</a> below suggests.&#8221; (Ron Fredericks, Co-founder, LectureMaker LLC)</p></blockquote>

<object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
			id="fm_lmv_lm_v41_733148030"
			class="flashmovie"
			width="858"
			height="524">
	<param name="movie" value="http://www.lecturemaker.com/scripts/lmv_lm_v41.swf" />
	<param name="flashvars" value="vidName=rhipe_update1a_meta.flv&amp;lecID=1&amp;lecSubDir=/RMeetUp2010/&amp;imgPreLoad=LMPreloadImage_854_530.jpg&amp;imgSubDir=/RMeetUp2010/&amp;gaFID=2011/02/rhipe&amp;winPtr=http%3A%2F%2Fwww.google.com%2Fsearch%3Fq%3Dhadoop%2Brhipe%2Br%2Bprogramming%26hl%3Den%26num%3D10%26lr%3D%26ft%3Di%26cr%3D%26safe%3Dimages%26tbs%3Dqdr%253Am%2Cqdr%3Am&amp;winTip=google search on Hadoop+R+RHIPE+programming (for the past month)&amp;navDotParam={Credits}{.9835}, {Q: What optimization methods were used?}{.937}, {RHIPE Lessons learned}{.854}, {RHIPE Todo list}{.7998}, {RHIPE on EC2:nSimulation timing}{.778}, {Q: What is the discrepancy between sampled data?}{.720}, {RHIPE on EC2:nIndiana bio-terrorism project}{.613}, {Another example:nDept. of Homeland Security}{.540 }, {Case study: VOIP summary}{.5269}, {Case study, step 5:nStatistical routines across subsets}{.4895}, {Case study, step 4:nCreate new objects}{.457 }, {Case study, step 3:nCompute summaries}{.4155}, {Case study, step 2:nFeed data to a reduce}{.354 }, {Case study, step 1:nConvert raw data to R dataset}{.303 }, {Case study: VOIP}{.244 }, {High performance computing with RHIPEnMapReduce and R}{.155 }, {High performance computing with existing R packages}{.136 }, {Overview of Hadoop}{.081 }, {Analysis of very large data sets}{.043 }, {Introduction:nRHIPE}{.030 }, {Introduction:nSaptarshi Guha}{.0095}, {Beginning}{.000 }" />
	<!--[if !IE]>-->
	<object	type="application/x-shockwave-flash"
			data="http://www.lecturemaker.com/scripts/lmv_lm_v41.swf"
			name="fm_lmv_lm_v41_733148030"
			width="858"
			height="524">
		<param name="flashvars" value="vidName=rhipe_update1a_meta.flv&amp;lecID=1&amp;lecSubDir=/RMeetUp2010/&amp;imgPreLoad=LMPreloadImage_854_530.jpg&amp;imgSubDir=/RMeetUp2010/&amp;gaFID=2011/02/rhipe&amp;winPtr=http%3A%2F%2Fwww.google.com%2Fsearch%3Fq%3Dhadoop%2Brhipe%2Br%2Bprogramming%26hl%3Den%26num%3D10%26lr%3D%26ft%3Di%26cr%3D%26safe%3Dimages%26tbs%3Dqdr%253Am%2Cqdr%3Am&amp;winTip=google search on Hadoop+R+RHIPE+programming (for the past month)&amp;navDotParam={Credits}{.9835}, {Q: What optimization methods were used?}{.937}, {RHIPE Lessons learned}{.854}, {RHIPE Todo list}{.7998}, {RHIPE on EC2:nSimulation timing}{.778}, {Q: What is the discrepancy between sampled data?}{.720}, {RHIPE on EC2:nIndiana bio-terrorism project}{.613}, {Another example:nDept. of Homeland Security}{.540 }, {Case study: VOIP summary}{.5269}, {Case study, step 5:nStatistical routines across subsets}{.4895}, {Case study, step 4:nCreate new objects}{.457 }, {Case study, step 3:nCompute summaries}{.4155}, {Case study, step 2:nFeed data to a reduce}{.354 }, {Case study, step 1:nConvert raw data to R dataset}{.303 }, {Case study: VOIP}{.244 }, {High performance computing with RHIPEnMapReduce and R}{.155 }, {High performance computing with existing R packages}{.136 }, {Overview of Hadoop}{.081 }, {Analysis of very large data sets}{.043 }, {Introduction:nRHIPE}{.030 }, {Introduction:nSaptarshi Guha}{.0095}, {Beginning}{.000 }" />
	<!--<![endif]-->
		
<p><a href="http://adobe.com/go/getflashplayer"><img src="http://www.adobe.com/images/shared/download_buttons/get_flash_player.gif" alt="Get Adobe Flash player" /></a></p>

	<!--[if !IE]>-->
	</object>
	<!--<![endif]-->
</object>
<p>&nbsp;</p>
<p><a name="table"></a></p>
<h3>Video Topics and Navigation Table</h3>

<table id="wp-table-reloaded-id-2-no-1" class="wp-table-reloaded wp-table-reloaded-id-2">
<thead>
	<tr class="row-1 odd">
		<th class="column-1">Elapsed Time</th><th class="column-2">Description of Topics (plus hot links into the video)</th>
	</tr>
</thead>
<tbody>
	<tr class="row-2 even">
		<td class="column-1">0.00%</td><td class="column-2">Beginning <script type="text/javascript">displayNavDotLink(22, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-3 odd">
		<td class="column-1">1.00%</td><td class="column-2">Introduction to Dr. Saptarshi Guha <script type="text/javascript">displayNavDotLink(21, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-4 even">
		<td class="column-1">3.00%</td><td class="column-2">Introduction to RHIPE <script type="text/javascript">displayNavDotLink(20, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-5 odd">
		<td class="column-1">4.40%</td><td class="column-2">Analysis of very large data sets <script type="text/javascript">displayNavDotLink(19, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-6 even">
		<td class="column-1">8.10%</td><td class="column-2">Overview of Hadoop <script type="text/javascript">displayNavDotLink(18, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-7 odd">
		<td class="column-1">13.6%</td><td class="column-2">High performance computing with existing R packages <script type="text/javascript">displayNavDotLink(17, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-8 even">
		<td class="column-1">15.6%</td><td class="column-2">High performance computing with RHIPE: MapReduce interface to R <script type="text/javascript">displayNavDotLink(16, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-9 odd">
		<td class="column-1">24.4%</td><td class="column-2">Case study: VOIP <script type="text/javascript">displayNavDotLink(15, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-10 even">
		<td class="column-1">30.3%</td><td class="column-2">Case study, step 1: Convert raw data to R dataset <script type="text/javascript">displayNavDotLink(14, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-11 odd">
		<td class="column-1">35.4%</td><td class="column-2">Case study, step 2: Feed data to a reducer <script type="text/javascript">displayNavDotLink(13, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-12 even">
		<td class="column-1">41.6%</td><td class="column-2">Case study, step 3: Compute summaries <script type="text/javascript">displayNavDotLink(12, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-13 odd">
		<td class="column-1">45.7%</td><td class="column-2">Case study, step 4: Create new objects <script type="text/javascript">displayNavDotLink(11, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-14 even">
		<td class="column-1">49.3%</td><td class="column-2">Case study, step 5: Statistical routines across subsets <script type="text/javascript">displayNavDotLink(10, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-15 odd">
		<td class="column-1">52.7%</td><td class="column-2">Case study: VOIP summary <script type="text/javascript">displayNavDotLink(9, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-16 even">
		<td class="column-1">54.6%</td><td class="column-2">Another example: Dept. of Homeland Security <script type="text/javascript">displayNavDotLink(8, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-17 odd">
		<td class="column-1">62.0%</td><td class="column-2">RHIPE on EC2: Indiana bio-terrorism project <script type="text/javascript">displayNavDotLink(7, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-18 even">
		<td class="column-1">72.8%</td><td class="column-2">Q: What is the discrepancy between sampled data? <script type="text/javascript">displayNavDotLink(6, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-19 odd">
		<td class="column-1">78.7%</td><td class="column-2">RHIPE on EC2: Simulation timing <script type="text/javascript">displayNavDotLink(5, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-20 even">
		<td class="column-1">80.6%</td><td class="column-2">RHIPE Todo list <script type="text/javascript">displayNavDotLink(4, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-21 odd">
		<td class="column-1">86.3%</td><td class="column-2">RHIPE Lessons learned <script type="text/javascript">displayNavDotLink(3, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-22 even">
		<td class="column-1">95.0%</td><td class="column-2">Q: What optimization methods were used? <script type="text/javascript">displayNavDotLink(2, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-23 odd">
		<td class="column-1">99.9%</td><td class="column-2">Credits <script type="text/javascript">displayNavDotLink(1, " go there", "", "#video", 1);</script></td>
	</tr>
	<tr class="row-24 even">
		<td class="column-1"></td><td class="column-2"> <script type="text/javascript">displayNavDotLink(0, " Reset video", "", "#video");</script></td>
	</tr>
</tbody>
</table>

<p>&nbsp;</p>
<p><a name="code"></a></p>
<h2>Code Examples from the Video</h2>
<p>Source code highlighter note: R and RHIPE language constructs are color coded and hot-linked to appropriate online resources. Click on these links to learn more about these programming features. I manage the R/RHIPE source code highlighter project on my engineering site here: <a href="http://www.embeddedcomponents.com/blogs/geshi-language-highlighting/r/">R highlighter</a>. </p>
<h3>Move Raw Data Into Hadoop File System for Use In R Data Frames</h3>
<div class="ch_code_container" style="font-family: monospace;white-space: nowrap;height:300px;">
<div style="">Code (r)</div>
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">## Case Study &#8211; VoIP</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">## Copy text data to HDFS</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.stat.purdue.edu/~sguha/rhipe/doc/html/functions.html#rhput-copying-to-the-hdf"><span style="color: #993300;">rhput</span></a><span style="color: #080;">&#40;</span>'<span style="color:#A020F0;">/home/sguha/pres/voip/text/<span style="color: #ff0000;">20040312</span><span style="color: #ff0000;">-105951</span><span style="color: #ff0000;">-0</span>.<span style="">iprtp</span>.<span style="">out</span></span>','<span style="color:#A020F0;">/pres/voip/text</span>'<span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">## Use RHIPE to convert text data :</span></div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">##&nbsp; &nbsp;1079089238.075950 IP UDP 200 67.17.54.213 6086 67.17.50.213 15074 0</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">##&nbsp; &nbsp;&#8230;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">##&nbsp; &nbsp;to R data frames</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">input &lt; &#8211; <a href="http://astrostatistics.psu.edu/datasets/R/html/base/html/eval.html"><span style="color: #0000FF;">expression</span></a><span style="color: #080;">&#40;</span><span style="color: #080;">&#123;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">## create components (direction, id.ip,id.port) from Sys.getenv(&quot;mapred.input.file&quot;)</span></div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">v &lt;- <a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/lapply.html"><span style="color: #0000FF;">lapply</span></a><span style="color: #080;">&#40;</span><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/seq.html"><span style="color: #0000FF;">seq_along</span></a><span style="color: #080;">&#40;</span>map.<span style="">values</span><span style="color: #080;">&#41;</span>,<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/function.html"><span style="color: #0000FF;">function</span></a><span style="color: #080;">&#40;</span>r<span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">value0 &lt;- <a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/strsplit.html"><span style="color: #0000FF;">strsplit</span></a><span style="color: #080;">&#40;</span>map.<span style="">values</span><span style="color: #080;">&#91;</span><span style="color: #080;">&#91;</span>r<span style="color: #080;">&#93;</span><span style="color: #080;">&#93;</span>,&quot; +&quot;<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#93;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">key &lt;- <a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/paste.html"><span style="color: #0000FF;">paste</span></a><span style="color: #080;">&#40;</span>value0<span style="color: #080;">&#91;</span>id.<span style="">ip</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#93;</span>,value0<span style="color: #080;">&#91;</span>id.<span style="">port</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#93;</span>,value0<span style="color: #080;">&#91;</span>id.<span style="">ip</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#93;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp;,value0<span style="color: #080;">&#91;</span>id.<span style="">port</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#93;</span>,direction,sep=&quot;.&quot;<span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.stat.purdue.edu/~sguha/rhipe/doc/html/mr.html#rhcollect-writing-data-to-hadoop-mapreduce"><span style="color: #993300;">rhcollect</span></a><span style="color: #080;">&#40;</span>key,value0<span style="color: #080;">&#91;</span><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/c.html"><span style="color: #0000FF;">c</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>,<span style="color: #ff0000;">9</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;<span style="color: #080;">&#125;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#125;</span><span style="color: #080;">&#41;</span></div>
</li>
</ol>
</div>
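<p>To see what the map expression above does to a single record, the following sketch runs the same split-and-paste logic locally on the sample line from the comments. The listing does not show how <code>id.ip</code>, <code>id.port</code>, and <code>direction</code> are built (they come from <code>Sys.getenv("mapred.input.file")</code>), so the values below are assumptions chosen to match the sample line&#8217;s layout.</p>

```r
# Local sketch of what the map expression does to ONE input line.
# ASSUMPTIONS (not shown in the talk): the column positions in id.ip and
# id.port, and the value of `direction`. In the original they are derived
# from Sys.getenv("mapred.input.file").
id.ip     <- c(5, 7)   # assumed positions of the two IP addresses
id.port   <- c(6, 8)   # assumed positions of the two ports
direction <- "out"     # assumed; the talk takes it from the input file name

line <- "1079089238.075950 IP UDP 200 67.17.54.213 6086 67.17.50.213 15074 0"

# Split on runs of spaces, as the map expression does.
value0 <- strsplit(line, " +")[[1]]

# Key: first IP and port, second IP and port, direction, joined by ".".
key <- paste(value0[id.ip[1]], value0[id.port[1]],
             value0[id.ip[2]], value0[id.port[2]],
             direction, sep = ".")

# Value: fields 1 and 9 (here, the timestamp and the last column).
value <- value0[c(1, 9)]

print(key)
print(value)
```

<p>In the real job, <code>rhcollect(key, value0[c(1,9)])</code> hands each such pair to Hadoop, which groups all values that share a key before the reduce step.</p>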
<p>&nbsp;</p>
<h3>Submit a MapReduce Job, Then Retrieve Semi-Calls</h3>
<div class="ch_code_container" style="font-family: monospace;white-space: nowrap;height:300px;">
<div style="">Code (r)</div>
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">## Case Study - VoIP</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">## We can run this from within R:</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">mr&lt; -<a href="http://www.stat.purdue.edu/~sguha/rhipe/doc/html/mr.html#rhmr-creating-the-mapreduce-object"><span style="color: #993300;">rhmr</span></a><span style="color: #080;">&#40;</span>map=input,reduce=reduce, inout=<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/c.html"><span style="color: #0000FF;">c</span></a><span style="color: #080;">&#40;</span>'<span style="color:#A020F0;">text</span>','<span style="color:#A020F0;">map</span>'<span style="color: #080;">&#41;</span>, ifolder='<span style="color:#A020F0;">/pres/voip/text</span>', ofolder='<span style="color:#A020F0;">/pres/voip/df</span>',jobname='<span style="color:#A020F0;">create</span>'</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp;,mapred=<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/list.html"><span style="color: #0000FF;">list</span></a><span style="color: #080;">&#40;</span>mapred.<span style="">reduce</span>.<span style="">tasks</span>=<span style="color: #ff0000;">5</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">mr &lt;- <a href="http://www.stat.purdue.edu/~sguha/rhipe/doc/html/functions.html#rhex-submitting-a-mapreduce-r-object-to-hadoop"><span style="color: #993300;">rhex</span></a><span style="color: #080;">&#40;</span>mr<span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">## This takes 40 minutes for 70 gigabytes across 8 computers(72 cores).</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">##&nbsp; &nbsp; Saved as 277K data frames(semi-calls) across 14 gigabytes.</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">## We can retrieve semi-calls:</span></div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.stat.purdue.edu/~sguha/rhipe/doc/html/functions.html#rhgetkeys-reading-values-from-map-files"><span style="color: #993300;">rhgetkey</span></a><span style="color: #080;">&#40;</span><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/list.html"><span style="color: #0000FF;">list</span></a><span style="color: #080;">&#40;</span>'<span style="color:#A020F0;"><span style="color: #ff0000;">67.17</span><span style="color: #ff0000;">.50</span><span style="color: #ff0000;">.213</span><span style="color: #ff0000;">.5002</span><span style="color: #ff0000;">.67</span><span style="color: #ff0000;">.17</span><span style="color: #ff0000;">.50</span><span style="color: #ff0000;">.6</span><span style="color: #ff0000;">.5896</span>.<span style="">out</span></span>'<span style="color: #080;">&#41;</span>,paths='<span style="color:#A020F0;">/pres/voip/df/p*</span>'<span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">## a list of lists(key,value pairs)</span></div>
</li>
</ol>
</div>
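<p>The listing above passes a <code>reduce</code> expression that is not shown, and its output is described as data frames (semi-calls) that can be fetched by key. As a rough local stand-in, the sketch below groups a few invented records into one data frame per key and looks one up. The record values, column names, and the helper <code>get_key</code> are hypothetical and are not part of RHIPE.</p>

```r
# Local stand-in for the reduce output and key lookup (no Hadoop, no RHIPE).
# The records are invented; the talk's reduce expression is not shown.
records <- data.frame(
  key  = c("k1.out", "k2.in", "k1.out"),
  time = c(1079089238.07, 1079089238.09, 1079089238.11),
  flag = c(0, 0, 1),
  stringsAsFactors = FALSE
)

# "Reduce": gather every value that shares a key into one data frame.
semi_calls <- lapply(split(records[c("time", "flag")], records$key),
                     function(df) { rownames(df) <- NULL; df })

# Hypothetical lookup helper: returns a list of (key, value) pairs,
# the shape that the final comment in the listing describes.
get_key <- function(keys, store) {
  lapply(keys, function(k) list(key = k, value = store[[k]]))
}
result <- get_key(list("k1.out"), semi_calls)
print(result[[1]]$value)
```

<p>The real <code>rhgetkey</code> call reads from map files on HDFS under <code>/pres/voip/df</code>; the in-memory list here only mimics the shape of the result.</p>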
<p>&nbsp;</p>
<h3>Compute Summaries With MapReduce</h3>
<div class="ch_code_container" style="font-family: monospace;white-space: nowrap;height:300px;">
<div style="">Code (r)</div>
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">## Case Study - VoIP</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">m&lt; -<a href="http://astrostatistics.psu.edu/datasets/R/html/base/html/eval.html"><span style="color: #0000FF;">expression</span></a><span style="color: #080;">&#40;</span><span style="color: #080;">&#123;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/lapply.html"><span style="color: #0000FF;">lapply</span></a><span style="color: #080;">&#40;</span><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/seq.html"><span style="color: #0000FF;">seq_along</span></a><span style="color: #080;">&#40;</span>map.<span style="">values</span><span style="color: #080;">&#41;</span>,<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/function.html"><span style="color: #0000FF;">function</span></a><span style="color: #080;">&#40;</span>i<span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">## make key from map.keys[[i]]</span></div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">value&lt;-<a href="http://www.astrostatistics.psu.edu/datasets/R/html/base/html/Control.html"><span style="color: #0000FF;">if</span></a><span style="color: #080;">&#40;</span>tmp<span style="color: #080;">&#91;</span><span style="color: #ff0000;">11</span><span style="color: #080;">&#93;</span>==&quot;in&quot;<span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/c.html"><span style="color: #0000FF;">c</span></a><span style="color: #080;">&#40;</span>in.<span style="">start</span>=start,in.<span style="">end</span>=end,in.<span style="">dur</span>=dur,in.<span style="">pkt</span>=n.<span style="">pkt</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.astrostatistics.psu.edu/datasets/R/html/base/html/Control.html"><span style="color: #0000FF;">else</span></a></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/c.html"><span style="color: #0000FF;">c</span></a><span style="color: #080;">&#40;</span>out.<span style="">start</span>=start,out.<span style="">end</span>=end,out.<span style="">dur</span>=dur,out.<span style="">pkt</span>=n.<span style="">pkt</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.stat.purdue.edu/~sguha/rhipe/doc/html/mr.html#rhcollect-writing-data-to-hadoop-mapreduce"><span style="color: #993300;">rhcollect</span></a><span style="color: #080;">&#40;</span>key,value<span style="color: #080;">&#41;</span></div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #080;">&#125;</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #080;">&#125;</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">r&lt;-<a href="http://astrostatistics.psu.edu/datasets/R/html/base/html/eval.html"><span style="color: #0000FF;">expression</span></a><span style="color: #080;">&#40;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">pre=<span style="color: #080;">&#123;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">mydata&lt;-<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/list.html"><span style="color: #0000FF;">list</span></a><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">ifnull &lt;- <a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/function.html"><span style="color: #0000FF;">function</span></a><span style="color: #080;">&#40;</span>r,def=NA<span style="color: #080;">&#41;</span> <a href="http://www.astrostatistics.psu.edu/datasets/R/html/base/html/Control.html"><span style="color: #0000FF;">if</span></a><span style="color: #080;">&#40;</span>!<a href="http://www.astrostatistics.psu.edu/datasets/R/html/base/html/NULL.html"><span style="color: #003300;">is.<span style="">null</span></span></a><span style="color: #080;">&#40;</span>r<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> r <a href="http://www.astrostatistics.psu.edu/datasets/R/html/base/html/Control.html"><span style="color: #0000FF;">else</span></a> def</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #080;">&#125;</span>,reduce=<span style="color: #080;">&#123;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">mydata&lt;-<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/append.html"><span style="color: #0000FF;">append</span></a><span style="color: #080;">&#40;</span>mydata,reduce.<span style="">values</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #080;">&#125;</span>,post=<span style="color: #080;">&#123;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">mydata&lt;-<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/unlist.html"><span style="color: #0000FF;">unlist</span></a><span style="color: #080;">&#40;</span>mydata<span style="color: #080;">&#41;</span></div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">in.<span style="">start</span>&lt;-ifnull<span style="color: #080;">&#40;</span>mydata<span style="color: #080;">&#91;</span>'<span style="color:#A020F0;">in.<span style="">start</span></span>'<span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">.....</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="">out</span>.<span style="">end</span>&lt;-ifnull<span style="color: #080;">&#40;</span> mydata<span style="color: #080;">&#91;</span>'<span style="color:#A020F0;">out.<span style="">end</span></span>'<span style="color: #080;">&#93;</span> <span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">out.<span style="">start</span>&lt;-ifnull<span style="color: #080;">&#40;</span>mydata<span style="color: #080;">&#91;</span>'<span style="color:#A020F0;">out.<span style="">start</span></span>'<span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">value&lt;-<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/c.html"><span style="color: #0000FF;">c</span></a><span style="color: #080;">&#40;</span>in.<span style="">start</span>,in.<span style="">end</span>,in.<span style="">dur</span>,in.<span style="">pkt</span>,out.<span style="">start</span>,out.<span style="">end</span>,out.<span style="">dur</span>,out.<span style="">pkt</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.stat.purdue.edu/~sguha/rhipe/doc/html/mr.html#rhcollect-writing-data-to-hadoop-mapreduce"><span style="color: #993300;">rhcollect</span></a><span style="color: #080;">&#40;</span>reduce.<span style="">key</span>,value<span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #080;">&#125;</span><span style="color: #080;">&#41;</span></div>
</li>
</ol>
</div>
</pre>
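<p>The reduce step above gathers the named values for one key and looks up each field by name. As a minimal sketch of that lookup in plain R, with no Hadoop or RHIPE involved, the snippet below uses made-up values for <code>reduce.values</code> and a slightly different <code>ifnull</code> that also treats <code>NA</code> as missing, because indexing an atomic vector by an absent name yields <code>NA</code> rather than <code>NULL</code>. The field list and the numbers are assumptions for illustration only.</p>

```r
## Plain-R sketch (no Hadoop): assemble one output record from named pieces.
## This ifnull() variant returns `def` when the lookup yields NULL or NA.
ifnull <- function(r, def = NA_real_) if (is.null(r) || is.na(r)) def else r

## Hypothetical input: only the "in" half of a connection arrived for this key.
reduce.values <- list(
  c(in.start = 10, in.end = 25, in.dur = 15, in.pkt = 4)
)

mydata <- unlist(reduce.values)   # one named numeric vector
fields <- c("in.start", "in.end", "in.dur", "in.pkt",
            "out.start", "out.end", "out.dur", "out.pkt")

## Look up every field by name; missing "out" fields fall back to NA.
value <- vapply(fields, function(f) ifnull(mydata[f]), numeric(1))
print(value)
```

<p>Fields that never arrived stay <code>NA</code> in the combined record, so a later step can tell a one-sided connection from a complete one.</p>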
<p>&nbsp;</p>
<h3>Compute Summaries With MapReduce Across HTTP and SSH Connections</h3>
<div class="ch_code_container" style="font-family: monospace;white-space: nowrap;height:300px;">
<div style="">Code (r)</div>
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;">## Example: compute total bytes and total packets across all HTTP and SSH connections.</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">m &lt; - <a href="http://astrostatistics.psu.edu/datasets/R/html/base/html/eval.html"><span style="color: #0000FF;">expression</span></a><span style="color: #080;">&#40;</span><span style="color: #080;">&#123;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">w &lt;- <a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/lapply.html"><span style="color: #0000FF;">lapply</span></a><span style="color: #080;">&#40;</span>map.<span style="">values</span>,<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/function.html"><span style="color: #0000FF;">function</span></a><span style="color: #080;">&#40;</span>r<span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.astrostatistics.psu.edu/datasets/R/html/base/html/Control.html"><span style="color: #0000FF;">if</span></a><span style="color: #080;">&#40;</span><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/any.html"><span style="color: #0000FF;">any</span></a><span style="color: #080;">&#40;</span>r<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span>,<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/c.html"><span style="color: #0000FF;">c</span></a><span style="color: #080;">&#40;</span>'<span style="color:#A020F0;">sport</span>','<span style="color:#A020F0;">dport</span>'<span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span> %in% <a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/c.html"><span style="color: #0000FF;">c</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">22</span>,<span style="color: #ff0000;">80</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> T <a href="http://www.astrostatistics.psu.edu/datasets/R/html/base/html/Control.html"><span style="color: #0000FF;">else</span></a> F<span style="color: #080;">&#41;</span></div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/lapply.html"><span style="color: #0000FF;">lapply</span></a><span style="color: #080;">&#40;</span>map.<span style="">values</span><span style="color: #080;">&#91;</span>w<span style="color: #080;">&#93;</span>,<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/function.html"><span style="color: #0000FF;">function</span></a><span style="color: #080;">&#40;</span>v<span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">key &lt;- <a href="http://www.astrostatistics.psu.edu/datasets/R/html/base/html/Control.html"><span style="color: #0000FF;">if</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">22</span> %in% v<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span>,<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/c.html"><span style="color: #0000FF;">c</span></a><span style="color: #080;">&#40;</span>'<span style="color:#A020F0;">dport</span>','<span style="color:#A020F0;">sport</span>'<span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span> <span style="color: #ff0000;">22</span> <a href="http://www.astrostatistics.psu.edu/datasets/R/html/base/html/Control.html"><span style="color: #0000FF;">else</span></a> <span style="color: #ff0000;">80</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.stat.purdue.edu/~sguha/rhipe/doc/html/mr.html#rhcollect-writing-data-to-hadoop-mapreduce"><span style="color: #993300;">rhcollect</span></a><span style="color: #080;">&#40;</span>key, <a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/c.html"><span style="color: #0000FF;">c</span></a><span style="color: #080;">&#40;</span><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/sum.html"><span style="color: #0000FF;">sum</span></a><span style="color: #080;">&#40;</span>v<span style="color: #080;">&#91;</span>,'<span style="color:#A020F0;">datasize</span>'<span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>,<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/nrow.html"><span style="color: #0000FF;">nrow</span></a><span style="color: #080;">&#40;</span>v<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #080;">&#125;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#125;</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">r &lt; - <a href="http://astrostatistics.psu.edu/datasets/R/html/base/html/eval.html"><span style="color: #0000FF;">expression</span></a><span style="color: #080;">&#40;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">pre &lt;- <span style="color: #080;">&#123;</span> sums &lt;- <span style="color: #ff0000;">0</span> <span style="color: #080;">&#125;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">reduce &lt;-<span style="color: #080;">&#123;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">v &lt;- <a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/do.call.html"><span style="color: #0000FF;">do.<span style="">call</span></span></a><span style="color: #080;">&#40;</span>&quot;rbind&quot;,reduce.<span style="">values</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">sums &lt;- sums+apply<span style="color: #080;">&#40;</span>v,<span style="color: #ff0000;">2</span>,<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/sum.html"><span style="color: #0000FF;">sum</span></a><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #080;">&#125;</span>,post=<span style="color: #080;">&#123;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.stat.purdue.edu/~sguha/rhipe/doc/html/mr.html#rhcollect-writing-data-to-hadoop-mapreduce"><span style="color: #993300;">rhcollect</span></a><span style="color: #080;">&#40;</span>reduce.<span style="">key</span>,<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/c.html"><span style="color: #0000FF;">c</span></a><span style="color: #080;">&#40;</span>bytes=sums<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>,pkts=sums<span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #080;">&#125;</span><span style="color: #080;">&#41;</span></div>
</li>
</ol>
</div>
</pre>
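<p>To see what the map and reduce expressions above compute, here is a minimal single-process sketch in plain R, with no Hadoop or RHIPE involved. It assumes each element of <code>map.values</code> is a data frame of packets for one connection with columns <code>sport</code>, <code>dport</code>, and <code>datasize</code>; the sample data and the <code>split</code>-based grouping that stands in for the shuffle phase are illustrative assumptions, not part of the talk.</p>

```r
## Plain-R sketch (no Hadoop): total bytes and packets for SSH (22) and HTTP (80).
## Hypothetical input: one data frame of packets per connection.
map.values <- list(
  data.frame(sport = 5123, dport = 80, datasize = c(100, 200)),
  data.frame(sport = 22, dport = 6001, datasize = c(50, 60, 70)),
  data.frame(sport = 7000, dport = 443, datasize = 999),
  data.frame(sport = 5200, dport = 80, datasize = 40)
)

## Map: keep connections that touch port 22 or 80, emit (key, c(bytes, packets)).
keep <- vapply(map.values, function(r)
  any(unlist(r[1, c("sport", "dport")]) %in% c(22, 80)), logical(1))
emitted <- lapply(map.values[keep], function(v) {
  key <- if (22 %in% unlist(v[1, c("dport", "sport")])) 22 else 80
  list(key = key, value = c(sum(v[, "datasize"]), nrow(v)))
})

## Shuffle and reduce: group emitted values by key and add them column-wise.
keys <- vapply(emitted, function(e) e$key, numeric(1))
totals <- lapply(split(emitted, keys), function(group) {
  v <- do.call("rbind", lapply(group, function(e) e$value))
  sums <- colSums(v)
  c(bytes = sums[1], pkts = sums[2])
})
print(totals)
```

<p>The logical vector <code>keep</code> plays the role of <code>w</code> in the listing: it has to be a plain vector, not a list, for <code>map.values[keep]</code> to select the matching connections.</p>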
<p>&nbsp;</p>
<h3>Load RHIPE on an EC2 Cloud</h3>
<div class="ch_code_container" style="font-family: monospace;white-space: nowrap;height:300px;">
<div style="">Code (r)</div>
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/library.html"><span style="color: #0000FF;">library</span></a><span style="color: #080;">&#40;</span>Rhipe<span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/load.html"><span style="color: #0000FF;">load</span></a><span style="color: #080;">&#40;</span>&quot;ccsim.<span style="">Rdata</span>&quot;<span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.stat.purdue.edu/~sguha/rhipe/doc/html/functions.html#rhput-copying-to-the-hdf"><span style="color: #993300;">rhput</span></a><span style="color: #080;">&#40;</span>&quot;/root/ccsim.<span style="">Rdata</span>&quot;,&quot;/tmp/&quot;<span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">setup &lt; - <a href="http://astrostatistics.psu.edu/datasets/R/html/base/html/eval.html"><span style="color: #0000FF;">expression</span></a><span style="color: #080;">&#40;</span><span style="color: #080;">&#123;</span></div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/load.html"><span style="color: #0000FF;">load</span></a><span style="color: #080;">&#40;</span>&quot;ccsim.<span style="">Rdata</span>&quot;<span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">suppressMessages<span style="color: #080;">&#40;</span><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/library.html"><span style="color: #0000FF;">library</span></a><span style="color: #080;">&#40;</span>survstl<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">suppressMessages<span style="color: #080;">&#40;</span><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/library.html"><span style="color: #0000FF;">library</span></a><span style="color: #080;">&#40;</span>stl2<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #080;">&#125;</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">chunk &lt;- <a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/Round.html"><span style="color: #003300;">floor</span></a><span style="color: #080;">&#40;</span><a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/length.html"><span style="color: #0000FF;">length</span></a><span style="color: #080;">&#40;</span>simlist<span style="color: #080;">&#41;</span>/ <span style="color: #ff0000;">141</span><span style="color: #080;">&#41;</span></div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">z &lt;- <a href="http://www.stat.purdue.edu/~sguha/rhipe/doc/html/rhlapply.html"><span style="color: #993300;">rhlapply</span></a><span style="color: #080;">&#40;</span>a,cc_sim, setup=setup,N=chunk,shared=&quot;/tmp/ccsim.<span style="">Rdata</span>&quot;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp;,aggr=<a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/function.html"><span style="color: #0000FF;">function</span></a><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span> <a href="http://www.iiap.res.in/astrostat/School07/R/html/base/html/do.call.html"><span style="color: #0000FF;">do.<span style="">call</span></span></a><span style="color: #080;">&#40;</span>&quot;rbind&quot;,x<span style="color: #080;">&#41;</span>,doLoc=TRUE<span style="color: #080;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.stat.purdue.edu/~sguha/rhipe/doc/html/functions.html#rhex-submitting-a-mapreduce-r-object-to-hadoop"><span style="color: #993300;">rhex</span></a><span style="color: #080;">&#40;</span>z<span style="color: #080;">&#41;</span></div>
</li>
</ol>
</div>
</pre>
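<p>The listing above applies a function to every element of a list on the cluster and stacks the results with <code>do.call("rbind", x)</code>. A local, single-process analogue using base R's <code>lapply</code> may help when testing the function before submitting a job. In this sketch <code>simlist</code> and <code>cc_sim</code> are hypothetical stand-ins for the objects stored in <code>ccsim.Rdata</code>, and nothing here reproduces how RHIPE splits the work into chunks.</p>

```r
## Local, single-process analogue (no Hadoop, no RHIPE).
## simlist and cc_sim are hypothetical stand-ins, not the objects from the talk.
simlist <- list(c(n = 10, mean = 0), c(n = 20, mean = 1), c(n = 30, mean = 2))

cc_sim <- function(p) {
  set.seed(1)                                  # reproducible draws
  x <- rnorm(p[["n"]], mean = p[["mean"]])
  c(n = p[["n"]], estimate = mean(x))
}

results <- lapply(simlist, cc_sim)             # one result per list element
z <- do.call("rbind", results)                 # stack results row by row
print(dim(z))
```

<p>Each row of <code>z</code> corresponds to one element of <code>simlist</code>, which is the shape the <code>aggr</code> function in the listing produces from the distributed results.</p>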
<p>&nbsp;</p>
<h2>References:</h2>
<blockquote><p>“I just watched the Saptarshi Guha video. It looks great!! Thank you! The picture is incredibly crisp, and the timeline tab is a nice touch for reviewing the film. Thank you!” -- Matt Bascom</p></blockquote>
<p>VMware's open-source partnership with Cloudera offers you a virtual machine with Hadoop, Pig, and Hive - <a href="http://www.vmware.com/appliances/directory/va/78133/download">download</a></p>
<p>Purdue University hosts the documentation and open-source code base for RHIPE - <a href="http://www.stat.purdue.edu/~sguha/rhipe/doc/html/index.html">download</a></p>
<p><script type="text/javascript">
<!--
highlight_me("h3", "tt", "http://www.lecturemaker.com/2011/02/rhipe/#video",
			 "RHIPE: An Interface Between Hadoop and R",
			 "Presented by Saptarshi Guha",
			 "http://www.lecturemaker.com/lectures/RMeetUp2010/RHIPE_Lecture.jpg",
			 "300", "169", "80", "5"
			 );
//--></script></p>
<p><strong>This is what the share text above looks like...</strong></p>
<div><strong><a href="http://www.lecturemaker.com/2011/02/rhipe/#video" title="Click link to go to the video page">RHIPE: An Interface Between Hadoop and R</a></strong><br />Presented by Saptarshi Guha</div>
<p><a href="http://www.lecturemaker.com/2011/02/rhipe/#video"><img src="http://www.lecturemaker.com/lectures/RMeetUp2010/RHIPE_Lecture.jpg" alt="Video Link" width="300" height="169" border="0" title="Click image to go to the video page" /></a>                           </p>

]]></content:encoded>
			<wfw:commentRss>http://www.lecturemaker.com/2011/02/rhipe/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Bay Area R User Group 2009 Kickoff Video</title>
		<link>http://www.lecturemaker.com/2009/02/r-kickoff-video/</link>
		<comments>http://www.lecturemaker.com/2009/02/r-kickoff-video/#comments</comments>
		<pubDate>Wed, 25 Feb 2009 23:39:28 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Marketing]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[R users group]]></category>
		<category><![CDATA[Software Platforms]]></category>
		<category><![CDATA[Videos]]></category>
		<category><![CDATA[Website Management]]></category>
		<category><![CDATA[Bo Cowgill]]></category>
		<category><![CDATA[CRAN]]></category>
		<category><![CDATA[Data Evolution]]></category>
		<category><![CDATA[Dataspora LLC]]></category>
		<category><![CDATA[David Smith]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Itamar Rosenn]]></category>
		<category><![CDATA[Jim Porzak]]></category>
		<category><![CDATA[LectureMaker]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[Michael E. Driscoll]]></category>
		<category><![CDATA[Octave]]></category>
		<category><![CDATA[Power Law Distribution]]></category>
		<category><![CDATA[Predictive Analytics]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[R User Group]]></category>
		<category><![CDATA[Revolution Computing]]></category>
		<category><![CDATA[Ron Fredericks]]></category>
		<category><![CDATA[RPy]]></category>
		<category><![CDATA[Sampling]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[The Generations Network]]></category>
		<category><![CDATA[Virtual World]]></category>

		<guid isPermaLink="false">http://www.lecturemaker.com/2009/02/r-and-science-of-predictive-analytics-lecture-production/</guid>
		<description><![CDATA[Ron Fredericks writes: In February I attended the Bay Area R User Group meeting recently held at Predictive Analytics World 2009. Michael E. Driscoll, one of the two kickoff meeting co-chairs, was gracious enough to let LectureMaker capture the video &#8230; <a href="http://www.lecturemaker.com/2009/02/r-kickoff-video/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>
Ron Fredericks writes: In February I attended the <a href="http://www.meetup.com/R-Users/">Bay Area R User Group</a> meeting held at <a href="http://www.predictiveanalyticsworld.com/">Predictive Analytics World 2009</a>. Michael E. Driscoll, one of the two kickoff meeting co-chairs, was gracious enough to let LectureMaker capture video of the event as a <a href="http://www.lecturemaker.com/products_services/">technical marketing</a> &#8220;lighthouse&#8221; project.</p>
<p><span style="color:grey">Editor&#8217;s Note: New Bay Area useR Group Video can be found here:<br />
<a href="http://www.lecturemaker.com/2011/02/rhipe/">RHIPE: An Interface Between Hadoop and R for Large and Complex Data Analysis</a></span></p>
<p>Today I am happy to present this video to my readers.</p>
<blockquote><p>If you manage the marketing, feature roll-out, or web site design for a social network or professional ecosystem, then you need the techniques presented in this video.</p></blockquote>
<p>Watch this video to learn about:</p>
<ol>
<li>The open-source analytics programming language called R</li>
<li>How Google and Facebook approach analytics to predict their web user community&#8217;s behavior</li>
<li>Where to download R and get enterprise level support</li>
<li>How the meeting co-chairs use R</li>
</ol>
<p><a name="media_link"></a></p>
<table width="800" border="0" cellpadding="4" >
<tr>
<td>
<strong>The R and Science of Predictive Analytics: Four Case Studies in R &#8211; <em>the Video</em></strong>
</td>
</tr>
<tr>
<td>

<object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
			id="fm_lmpremovie_425395161"
			class="flashmovie"
			width="800"
			height="600">
	<param name="movie" value="http://www.lecturemaker.com/wp-content/uploads/2009/02/lmpremovie.swf" />
	<!--[if !IE]>-->
	<object	type="application/x-shockwave-flash"
			data="http://www.lecturemaker.com/wp-content/uploads/2009/02/lmpremovie.swf"
			name="fm_lmpremovie_425395161"
			width="800"
			height="600">
	<!--<![endif]-->
		
<p><font color="red"><a href="http://adobe.com/go/getflashplayer"><img src="http://www.lecturemaker.com/wp-content/uploads/2009/02/downloadflashbutton800_600.gif" alt="Get Adobe Flash Player 10" title="hummm..." /></a> If you prefer not to upgrade to Flash 10, you may try playing the Flash movie from this link: <a href="http://www.lecturemaker.com/wp-content/uploads/2009/02/lmpremovie.swf">lmpremovie.swf</a></font></p>

	<!--[if !IE]>-->
	</object>
	<!--<![endif]-->
</object>
</td>
</tr>
<tr>
<td>
<table width="800"  bgcolor="#cccccc">
<tr>
<td align="left">
Email  <a href="mailto: ?subject=Bay Area R User Group 2009 Kickoff Video&#038;body=I found this video interesting, here's the link: http://www.lecturemaker.com/2009/02/r-kickoff-video/#media_link">this video</a>
</td>
<td align="center">
Link <a href="http://www.lecturemaker.com/2009/02/r-kickoff-video/#media_link"> to this video</a>
</td>
<td align="right">
LectureMaker Video Player <a href="http://www.lecturemaker.com/products_services/video-player/#support">product and support page</a>
</td>
</tr>
</table>
</td>
</tr>
</table>
<p>Panel of four recognized R users from industry:</p>
<ul>
<li>Bo Cowgill, Google</li>
<li>Itamar Rosenn, Facebook</li>
<li>David Smith, Revolution Computing</li>
<li>Jim Porzak, The Generations Network</li>
</ul>
<p>Moderator and co-chair of Bay Area R User Group:</p>
<ul>
<li>Michael E. Driscoll, Dataspora LLC</li>
</ul>
<p><span id="more-23"></span></p>
<p><a name="r_code_highlight"></a><br />
<strong><a href="http://www.lecturemaker.com/2009/02/r-kickoff-video/#r_code_highlight">A live R demo</a></strong><br />
The co-chairs, Michael and Jim, presented a great overview of the R language. Here at LectureMaker, source code highlighting is supported with automatic links back to language documentation for <a href="http://qbnz.com/highlighter/news.php?id=122">132 languages</a> plus my own <a href="http://www.embeddedcomponents.com/blogs/geshi-language-highlighting/r/">R highlighter</a>. </p>
<p>For example, here is a short R program:</p>
<div class="ch_code_container" style="font-family: monospace;white-space: nowrap;height:300px;">
<div style="">Code (r)</div>
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">x &lt;- <a href="http://astrostatistics.psu.edu/su07/R/html/stats/html/Normal.html"><span style="color: #0000FF;">rnorm</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span><span style="color: #080;">&#41;</span>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #228B22;"># 100 random numbers from a normal(0,1) distribution</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">y &lt;- <a href="http://astrostatistics.psu.edu/su07/R/html/base/html/Log.html"><span style="color: #0000FF;">exp</span></a><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span> + <a href="http://astrostatistics.psu.edu/su07/R/html/stats/html/Normal.html"><span style="color: #0000FF;">rnorm</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span><span style="color: #080;">&#41;</span>&nbsp; &nbsp;<span style="color: #228B22;"># an exponential function with error</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">result &lt;- <a href="http://astrostatistics.psu.edu/su07/R/html/stats/html/lm.html"><span style="color: #0000FF;">lm</span></a><span style="color: #080;">&#40;</span>y ~ x<span style="color: #080;">&#41;</span>&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #228B22;"># regress y on x and store the results</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://astrostatistics.psu.edu/su07/R/html/stats/html/summary.lm.html"><span style="color: #0000FF;">summary</span></a><span style="color: #080;">&#40;</span>result<span style="color: #080;">&#41;</span>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #228B22;"># print the regression results</span></div>
</li>
<li style="font-weight: bold;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/plot.html"><span style="color: #0000FF;">plot</span></a><span style="color: #080;">&#40;</span>x,y<span style="color: #080;">&#41;</span>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #228B22;"># scatter plot of y against x</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/abline.html"><span style="color: #0000FF;">abline</span></a><span style="color: #080;">&#40;</span>result<span style="color: #080;">&#41;</span>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="color: #228B22;"># add the regression line to the plot</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/lines.html"><span style="color: #0000FF;">lines</span></a><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/stats/html/lowess.html"><span style="color: #0000FF;">lowess</span></a><span style="color: #080;">&#40;</span>x,y<span style="color: #080;">&#41;</span>, col=<span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span>&nbsp; <span style="color: #228B22;"># add a nonparametric regression line (a smoother)</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/hist.html"><span style="color: #0000FF;">hist</span></a><span style="color: #080;">&#40;</span>result$residuals<span style="color: #080;">&#41;</span>&nbsp; &nbsp; &nbsp;<span style="color: #228B22;"># histogram of the residuals from the regression</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #228B22;"># source code reference: http://bayes.math.montana.edu/Rweb/Rweb.general.html</span></div>
</li>
</ol>
</div>
<p>You can click on the highlighted functions to open their documentation, or copy the code above and paste it into a web version of R called <a href="http://bayes.math.montana.edu/Rweb/Rweb.general.html">RWeb</a>.</p>
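<p>As a rough sketch of what can be done with the fitted model beyond printing it, the snippet below repeats the fit from the program above and pulls the coefficients and residuals out of the <code>result</code> object. The call to <code>set.seed(42)</code> and the names <code>fit_coefs</code> and <code>fit_resid</code> are assumptions added for illustration; they were not part of the demo shown at the meeting.</p>

```r
set.seed(42)                    # assumption: fixed seed so reruns give the same numbers
x <- rnorm(100)                 # same simulated data as in the program above
y <- exp(x) + rnorm(100)
result <- lm(y ~ x)             # regress y on x

fit_coefs <- coef(result)       # named vector: "(Intercept)" and "x"
fit_resid <- residuals(result)  # same values as result$residuals

print(fit_coefs)
print(summary(fit_resid))       # min, quartiles, median, mean, and max of the residuals
```

<p>Because the model includes an intercept, the residuals average to zero (up to rounding), which makes a quick sanity check before looking at the histogram.</p>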
<p><strong>How I developed the video</strong><br />
The LectureMaker video capture, editing, publishing, and technical marketing service starts with the on-location video capture of a live event. During the process I become immersed in the presenters and the audience participation. The empathy I acquire during the project, combined with my own professional background, allows me to turn video into technical marketing.</p>
<p>I watched the video presented here several times myself, starting with the content through the lens of my <a href="http://www.amazon.com/gp/product/B0017SR4JO?ie=UTF8&#038;tag=lectur-20&#038;linkCode=as2&#038;camp=1789&#038;creative=9325&#038;creativeASIN=B0017SR4JO">Canon XL-H1S 3CCD HDV High Definition Professional Camcorder with 20x HD Video Zoom Lens III</a><img src="http://www.assoc-amazon.com/e/ir?t=lectur-20&#038;l=as2&#038;o=1&#038;a=B0017SR4JO" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /><br />
 prosumer digital video camera during the event, followed by several editing iterations on my <a href="http://www.amazon.com/gp/product/B001B1SFIQ?ie=UTF8&#038;tag=lectur-20&#038;linkCode=as2&#038;camp=1789&#038;creative=9325&#038;creativeASIN=B001B1SFIQ">eVGA e-GeForce GTX280 1GB DDR3 PCI-Express 2.0 Graphics Card-Lifetime Warranty with Free Special Edition EVGA Precision Overclocking Utility</a><img src="http://www.assoc-amazon.com/e/ir?t=lectur-20&#038;l=as2&#038;o=1&#038;a=B001B1SFIQ" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /><br />
 graphics-powered PC, and ending with comprehensive testing from several Linux web servers connected to Mac, Linux, and Windows client PCs.</p>
<p>I took the following steps to edit and publish the video, primarily with <a href="http://www.amazon.com/gp/product/B001EUCTPE?ie=UTF8&#038;tag=lectur-20&#038;linkCode=as2&#038;camp=1789&#038;creative=9325&#038;creativeASIN=B001EUCTPE">Adobe Creative Suite 4 Master Collection</a><img src="http://www.assoc-amazon.com/e/ir?t=lectur-20&#038;l=as2&#038;o=1&#038;a=B001EUCTPE" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /><br />
 software: </p>
<ul>
<li>Edit the video with Adobe After Effects and Premiere Pro CS4. I incorporated the <a href="http://www.neatvideo.com/">Neat Video Pro</a> noise reduction plug-in to correct for low-light, high-gain issues in the raw camera footage</li>
<li>Edit and optimize the soundtrack separately using Adobe Soundbooth CS4, brightening the speech itself as only <a href="http://www.toastmasters.org/ToastmastersMagazine/ToastmasterArchive/2007/August/Articles/Charisma.aspx">Toastmasters International experience</a> can provide</li>
<li>Design and develop a state-of-the-art Video Player with custom hot-spot navigation dots, an intelligent preloader, and client-server progressive content management using Adobe Flash Professional CS4 and ActionScript 3.0</li>
<li>Customize the LectureMaker Video Player with an external preloader graphic, a personalized playback faceplate, a selected playback control skin, and custom buttons using Adobe Illustrator and Photoshop CS4 Extended</li>
<li>Compress the 75-minute, 200 GB high-definition 1440 x 1080 MPEG video down to an 85 MB 800 x 600 Flash video using both Adobe Media Encoder CS4 and <a href="http://www.sorensonmedia.com/products/?pageID=1&#038;ppc=3&#038;p=12">Sorenson Squeeze 5 Pro</a></li>
<li>Lastly, I hosted the video inside this WordPress.org blog package, currently at version 2.7.1, along with the <a href="http://wordpress.org/extend/plugins/kimili-flash-embed/other_notes/">KFE SWFObject</a> Flash movie publisher plug-in</li>
</ul>
<p>Some thoughts for next time: I did not have ideal placement for the video camera, so I struggled with noise, light, tracking who was talking when, and tripod stability. I also made a few mistakes with sound and white balance during live experiments with the 100 or so buttons on this new camera. I have since solved these issues with more experience, an after-market <a href="http://www.amazon.com/gp/product/B00126W14O?ie=UTF8&#038;tag=lectur-20&#038;linkCode=as2&#038;camp=1789&#038;creative=9325&#038;creativeASIN=B00126W14O">Sony XEL-1 11-Inch OLED Digital TV</a><img src="http://www.assoc-amazon.com/e/ir?t=lectur-20&#038;l=as2&#038;o=1&#038;a=B00126W14O" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /><br />
 and a wired remote control. I apologize to you the viewer, the R presenters, and the attendees who participated with their questions. Hopefully these errors will seem small thanks to the excellent delivery of great material during the R User Group event itself. This was one of the best panel discussions I have attended in either engineering or science.</p>
<p><strong>Contact LectureMaker to develop a video enhanced technical marketing program</strong><br />
This blog post&#8217;s content and its long tail of moderated comments over time demonstrate LectureMaker&#8217;s live video capture, video editing, social networking outreach, and technical marketing service. The service might be based on a single live event, or on a series of events integrated into a year-long community outreach program. What&#8217;s more, most of the cost for the LectureMaker service is already built into your current editorial events calendar! Yet adding the LectureMaker service to your existing events program could make all the difference in meeting your goals around ecosystem or sales development. You can get a brief overview here:<br />
<a href="http://www.lecturemaker.com/products_services/">http://www.lecturemaker.com/products_services/</a></p>
<blockquote><p>LectureMaker targets this service to exceptional people, brands, and products. Turn your high-value live but localized events into reusable global marketing and education programs.</p></blockquote>
<p><strong>How the panelists use R for predictive analytics</strong><br />
Each panelist came prepared to discuss R&#8217;s strengths and weaknesses as a tool, along with example case studies. Mike has written a great blog post summarizing this meeting, so I invite those who would like to learn more about the techniques presented in this video to jump over and read it now:</p>
<p>How Google and Facebook are using R<br />
by Michael E. Driscoll | February 19, 2009<br />
<a href="http://dataspora.com/blog/predictive-analytics-using-r/">http://dataspora.com/blog/predictive-analytics-using-r/</a></p>
<p>By the end of this process I had learned a great deal about the material presented, including <a href="http://cran.r-project.org/web/packages/">R packages</a>, where to <a href="http://www.revolution-computing.com/">download R</a>, whom to contract as project mentors, who can lead group training, and some additional open-source packages such as <a href="http://www.gnu.org/software/octave/">Octave</a> and <a href="http://rpy.sourceforge.net/">R&#8217;s interface with Python</a>. I developed a passion for the techniques discussed in this video, as I do with every video product I put together. In this case I also plan to get started in the near future with R and some of the integrated tools that were discussed.</p>
<p><strong>Want to learn R? Here is a link to the book recommended by the R User Group:</strong><br />
<a href="http://www.amazon.com/gp/product/0387759352?ie=UTF8&#038;tag=lectur-20&#038;linkCode=as2&#038;camp=1789&#038;creative=9325&#038;creativeASIN=0387759352">Software for Data Analysis: Programming with R (Statistics and Computing)</a><img src="http://www.assoc-amazon.com/e/ir?t=lectur-20&#038;l=as2&#038;o=1&#038;a=0387759352" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /></p>

]]></content:encoded>
			<wfw:commentRss>http://www.lecturemaker.com/2009/02/r-kickoff-video/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
	</channel>
</rss>

