Rpad

Rpad is an interactive, web-based analysis program. Rpad pages are interactive workbook-type sheets based on R, an open-source implementation of the S language. Rpad is an analysis package, a web-page designer, and a GUI designer all wrapped in one. Rpad makes it easy to develop powerful data analysis applications that can be shared with others (most likely on an intranet). The user doesn't have to install anything--everything's done through a browser.

From the user's point of view, Rpad shouldn't really need any documentation. It just looks like a web page. The user can run a pre-canned analysis and poke around with the inputs and scripts and outputs. For designers of Rpad pages, some more background is helpful.

Rpad is available in two versions: a local version and an intranet/internet version. The local version works with the user's local installation of R through the user's web browser. The intranet/internet version works in client-server fashion, with the user accessing a remote server through a standard web browser.

Browser Support: Mozilla and IE

Rpad is most mature on Internet Explorer. It's been tested on version 6.0 and may work in 5.5 (untested). The main advantage of IE is more mature support of WYSIWYG editing. IE has many negatives though: the HTML that the editor generates is awful, the browser is inferior to the Mozilla browser in many ways, and it's Windows-only.

Rpad works with Mozilla (tested on v1.5) and with Firefox (tested on v1.0) with the Mozile extension [downloads]. Rpad pages will work without the Mozile extension, but the pages are "read-only": none of the page is editable. The user can still enter data in input boxes, operate GUI widgets, and calculate the page. If all editable areas are kept as INPUTs or TEXTAREAs, then such pages should work fine even without WYSIWYG editing capability. Mozile adds the contentEditable feature to allow editing of anything on the page. This works pretty well, but Mozile is beta software and is not quite as smooth as the IE contentEditable. While Mozilla support is still somewhat immature, it should be great as it matures.

Rpad actually has two editors: one based on HTMLArea, and a "lite" editor that doesn't use HTMLArea (although it does use some utility routines taken from HTMLArea). The lite editor is especially useful with Mozilla browsers. It does not embed an Iframe, so some things work better (like internal links in Rpad pages). Startup should be faster as well. The lite editor is the default on Mozilla browsers, but it can also be used in IE (you need to change the first line in Rpad_loader.js). Likewise, you can use the HTMLArea editor with Mozilla by changing Rpad_loader.js, but many of the buttons on the HTMLArea toolbar won't work (the buttons on the Mozile toolbar should still work).

HTMLArea has options for many plugins that can also be used with Rpad. Extra capabilities include better table support and context menu options.

Rpad doesn't work on other browsers (Opera, Konqueror, etc.). To work, a browser needs the contentEditable feature, and it needs xmlHttp.
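The two requirements above can be checked with ordinary feature detection. The following is a minimal sketch, not Rpad's own loader code: the function name supportsRpad and the idea of passing in a window-like object are assumptions made so the check can be run outside a browser.

```javascript
// Sketch only: report whether an environment offers the two features the
// text names. `win` is a window-like object; passing it in instead of
// reading the global `window` is an assumption made for testability.
function supportsRpad(win) {
  // Mozilla exposes XMLHttpRequest directly; IE 5.5/6 exposes it via ActiveXObject.
  const hasXmlHttp =
    typeof win.XMLHttpRequest !== "undefined" ||
    typeof win.ActiveXObject !== "undefined";

  // When contentEditable is supported, it appears as a property on elements.
  const body = win.document && win.document.body;
  const hasContentEditable = !!body && "contentEditable" in body;

  return {
    hasXmlHttp,
    hasContentEditable,
    ok: hasXmlHttp && hasContentEditable,
  };
}
```

In a page, this would be called as supportsRpad(window). A browser with xmlHttp but without contentEditable corresponds to the "read-only" case described earlier for Mozilla without Mozile.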

Editing and Running

The whole page is editable as a simple WYSIWYG html editor.

The simplest way to set up a page is to use one or more Rpad script regions. Scripts can be inside div, span, or textarea elements. Scripts can be R, shell, or javascript. Server-side shell scripts are useful for running other scripting languages or command-line tools (Python, Octave, ImageMagick, and so on). Javascript is useful for local processing, interactivity, or for debugging Rpad problems.

To "run" or "calculate" a page, press the F9 key or use the mouse to press the "Calculate" menu option. If you want to "calculate" just one region, press F8 when in the Rpad_input region or the Rpad_results region (this has no menu option). All scripts are run from top to bottom. There is no automatic cross-referencing like a spreadsheet. While a web page is up, Rpad is communicating with the same R process. That means all the variables are still there (even if you change the script). This can have some desirable or undesirable effects, so it's good to be aware of it. Refreshing the page (make sure to save first) will restart the R process.
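The persistence described above is easy to trip over, so here is a toy model of it. This is not Rpad code: the session object simply stands in for the long-lived R process, and each script region is modelled as a function that reads and writes shared variables.

```javascript
// Toy model, not Rpad code: `session` stands in for the R process that
// stays alive for as long as the page is open.
function makeSession() {
  const vars = {};
  return {
    vars,
    // "Calculate": run every script region in page order, top to bottom.
    calculate(scripts) {
      for (const script of scripts) script(vars);
    },
  };
}

const session = makeSession();

// First calculation: one region defines x, the next one uses it.
session.calculate([
  (v) => { v.x = 10; },
  (v) => { v.y = v.x * 2; },
]);

// The author deletes the first region and recalculates.
// x is still defined, because the same process is still running.
session.calculate([
  (v) => { v.y = v.x * 3; },
]);

console.log(session.vars); // { x: 10, y: 30 }
```

A page that only works because of a leftover variable like x will fail after a refresh, which is one reason to recalculate from a fresh page before sharing it.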

Scripts can be hidden from the user. Press F2 to toggle between hidden and revealed Rpad_input sections. There's also a menu option to reveal all hidden scripts.

A special type of Rpad_input section is executed just at startup. These sections have rpad_run="init" set in the HTML code. For now, there's no GUI to specify this. You have to enter this code directly in the HTML (via the <> toolbar button in IE), copy and paste it from another Rpad page, or use another text editor to add it. These "init" blocks are useful for reading in files or data and also for setting up GUI elements from server data (they only run once, so the user can change the GUI elements generated on the server). If rpad_run="none", then the script doesn't run until the user explicitly calculates that block. This is also useful in some situations.

Rpad output sections are normally rendered as plain text with a fixed-width font, so text script outputs are formatted properly. The output blocks can also contain HTML codes for displaying images or for formatted blocks. To change to HTML formatting send '' to the output. This is cat('') in R scripts. To turn HTML mode off, send ''. Within an R block, the same switch can be made by calling HTMLon() or HTMLoff().

The builtin page editor in Rpad is somewhat limited. It doesn't generate very good HTML, and saving has some bugs in it (it doesn't save anything in the HEAD, and on some browsers, changes to INPUT blocks don't get saved right). For more advanced and reliable editing, use an external text or HTML editor. The builtin page editor is best used for online changes and exploration.

Rpad package

The Rpad package provides several R utility routines to help the user do graphs and generate HTML output and HTML GUI elements.

GUIs

There are several ways to add interactivity to an Rpad page. From simplest to complex, some possibilities are:

The basic GUI elements that are "named" are translated into variables with the same name in R. The variable name appears in a box on the upper right of this window when you select a GUI element.

HTML File Structure

You can turn any HTML file into an editable Rpad page. Simply insert the following code into the page header:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<script type="text/javascript" src="editor/Rpad_loader.js"></script>

Within the body of an HTML file, any element can be tagged with 'class="Rpad_input"'. Normally, this is either a div (fills the width of the element) or a span (which can be used within a paragraph), but textarea and input are also useful.

There are several kinds of Rpad_input variations. The attribute 'rpad_type' specifies the kind; the options include:

R: R scripts, of course
Rvariable: an R expression assigned to a variable
Rstring: an R string
shellscript: a server shell script
javascript: the local javascript interpreter

Another option available on Rpad_input blocks is rpad_run="init". With this, the script will only run when the page first loads and not after (unless the user explicitly requests calculation by hitting F8). Another option is rpad_run="all" where the block runs initially and every time the page is calculated.
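The rpad_run values can be summarized as a small decision function. This is a sketch of the rules as described in this document, not Rpad's implementation. The function name shouldRun and the event names "load", "calculate" (F9), and "explicit" (F8 on the block) are made up for illustration. The behaviour of a block with no rpad_run attribute is an assumption: it is taken to run on every page calculation but not at load.

```javascript
// Sketch of the rpad_run rules described in the text; names are illustrative.
// event: "load" (page opens), "calculate" (F9), "explicit" (F8 on this block).
function shouldRun(rpadRun, event) {
  // An explicit request always runs the block, whatever rpad_run says.
  if (event === "explicit") return true;

  switch (rpadRun) {
    case "init":
      return event === "load"; // once, when the page first loads
    case "none":
      return false; // only on explicit request
    case "all":
      return event === "load" || event === "calculate";
    default:
      // ASSUMPTION: a block without rpad_run runs on each page calculation.
      return event === "calculate";
  }
}
```

For example, shouldRun("init", "calculate") is false, which is why the user can change GUI elements generated by an "init" block without having them regenerated on the next calculation.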

Another optional attribute available on Rpad_input blocks is rpad_output, which has options of 'none', 'html', or the default 'text'. 'html' and 'text' determine how the output is interpreted. Within an R block, the formatting can be switched by HTMLon() or HTMLoff(). There is no GUI to adjust this yet.

HTML GUI elements (input, select, radio, and checkbox) with a name attribute are automatically sent to R when calculate is run. The 'name' is translated into an R variable name. To make the GUI elements selectable, they must be surrounded by <span contenteditable="false"></span> (the same applies to URL links and buttons).
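To make the name-to-variable translation concrete, the sketch below turns a description of a named element into the R assignment it might correspond to. This is a guess at the general shape, not Rpad's actual code. The function name toRAssignment, the plain-object element description, the TRUE/FALSE mapping for checkboxes, and the treatment of untyped inputs as strings are all assumptions. Only the Rvariable and Rstring meanings come from the list of rpad_type options.

```javascript
// Hypothetical sketch: build the R assignment a named GUI element stands for.
// `el` is a plain object: { name, type, rpadType, value, checked }.
function toRAssignment(el) {
  if (el.type === "checkbox") {
    // ASSUMPTION: a checkbox becomes an R logical.
    return `${el.name} <- ${el.checked ? "TRUE" : "FALSE"}`;
  }
  if (el.rpadType === "Rvariable") {
    // Rvariable: the value is an R expression assigned to the variable.
    return `${el.name} <- ${el.value}`;
  }
  // Rstring, and (ASSUMPTION) any untyped element: the value is an R string.
  // JSON.stringify supplies the double quotes and backslash escapes.
  return `${el.name} <- ${JSON.stringify(String(el.value))}`;
}

console.log(toRAssignment({ name: "favoriteNumber", rpadType: "Rvariable", value: "44" }));
// favoriteNumber <- 44
```

The practical consequence is that element names should be valid R variable names, since they are used as such in the scripts on the page.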

Here is an example of a complete Rpad HTML file:

<html>
  <head>
    <title>GUI Examples</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <script type="text/javascript" src="editor/Rpad_loader.js"></script>
  </head>

  <body>
    <span contenteditable="false"><input type="checkbox" name="isPepperoni"></span>Pepperoni <br/>
    <span contenteditable="false">
    <input class="Rpad_input" name="favoriteNumber" rpad_type="Rvariable" value="44">
    </span>Favorite number<br/>
    <br/>

    <div class="Rpad_input" rpad_type="R" rpad_run="init">
      HTMLon()<br/>
      data(state)<br/>
      select("favoriteAmericanState", state.name) # generate the select box
    </div>

    <div class="Rpad_input" rpad_type="R">
      cat("isPepperoni=",isPepperoni,"\n")<br/>
      cat("favoriteAmericanState=",favoriteAmericanState,"\n")
    </div>
  </body>
</html>

In Rpad_input sections, everything must be valid HTML, so "<-" must be "&lt;-". Also, <br/> tags are needed for line breaks, except for textarea sections.
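When pasting existing R code into a div or span, the escaping and line-break rules above can be applied mechanically. The helper below is a minimal sketch; rToRpadHtml is a made-up name and is not part of Rpad.

```javascript
// Sketch: convert plain R source into HTML suitable for the body of an
// Rpad_input div or span (not needed for textarea sections).
function rToRpadHtml(rSource) {
  return rSource
    .split("\n")
    .map((line) =>
      line
        .replace(/&/g, "&amp;") // escape ampersands first
        .replace(/</g, "&lt;") // so "<-" becomes "&lt;-"
        .replace(/>/g, "&gt;")
    )
    .join("<br/>\n"); // explicit line breaks between R statements
}

console.log(rToRpadHtml("x <- c(1, 2)\nif (x[1] < x[2]) print(x)"));
// x &lt;- c(1, 2)<br/>
// if (x[1] &lt; x[2]) print(x)
```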

You can use a <PRE> tag to make the HTML easier as follows. This makes it easier to paste R code into an HTML file: you don't need to use <br/>'s to separate lines, and whitespace is preserved. One disadvantage is that Mozile (the editing plugin for Mozilla) is a little buggy when editing PRE sections, so the user might get a little confused (the cursor doesn't work right all of the time).

  <div class="Rpad_input" rpad_type="R" rpad_run="init">
    <pre>
      HTMLon()
      data(state)
      select("favoriteAmericanState", state.name) # generate the select box
    </pre>
  </div>

Server Setup in the Client/Server Version

The files in Rpad are laid out as follows on the server:

In Rpad scripts, although you can access files in the main directory (the directory of the page) with "../../whatever", you are better off using RpadBaseFile("whatever") to maintain compatibility with the local version of Rpad. You can access files in the R process's current directory as you would normally, but if you want to specify it as a URL for output in the browser (for a link or an embedded file), use RpadBaseURL("whatever").

In the client/server version, each Rpad page gets its own R process on the server, so it is rather resource intensive. Threading and Duncan Temple Lang's idea of "multiple interpreters" would greatly help reduce the resource usage on the server.

Browser-Server Communications

Rpad uses Xmlhttp to communicate between the browser and the server, so browser refreshes are not needed to update data. This is available for both Mozilla and IE. Rpad uses standard HTTP GET or PUT requests, and all data is transmitted as ASCII. Xmlhttp allows both synchronous and asynchronous communications. The best implementation (I've tried both) appears to be asynchronous. Synchronous communications can lock up the browser for long periods if the server becomes unavailable or there's a communications glitch. Asynchronous communication avoids this, but the logic is more contorted internally.
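The asynchronous style looks roughly like the sketch below. It is a generic illustration of an asynchronous Xmlhttp request, not Rpad's own communication code. The function name sendAsync is made up, and the constructor is passed in as an argument (an assumption made so the sketch can be exercised without a browser). In a page it would be window.XMLHttpRequest.

```javascript
// Generic sketch of an asynchronous request; not taken from Rpad.
// `Xhr` is the request constructor (window.XMLHttpRequest in a browser).
function sendAsync(Xhr, url, onDone, onError) {
  const req = new Xhr();
  req.open("GET", url, true); // third argument true = asynchronous

  // The callback fires on every state change; the browser stays responsive
  // in the meantime, even if the server is slow or unreachable.
  req.onreadystatechange = function () {
    if (req.readyState !== 4) return; // 4 = request finished
    if (req.status === 200) onDone(req.responseText);
    else onError(req.status);
  };

  req.send(null);
}
```

The "contorted" logic mentioned above comes from the callbacks: whatever should happen after the server replies has to live inside onDone rather than on the line after the request.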

Security

There is no built-in security in Rpad. The user has complete access to any command in R and also in the system shell. I consider this a feature, not a bug. For protection, the system needs to be locked down on the server. Write-protect any files and databases that are a concern, and lock the server user out of other parts of the system.

With the base setup, Rpad users can stomp on each other's files. If a user saves a file in the Rpad directory, another user can write over it. To make it more permanent, the user would have to request that the system administrator write-protect the file and any data that went with the file.

Because of the relatively lax security, Rpad is best suited for intranets or other relatively controlled situations.

Rpad could be integrated into a more advanced content management system that would better handle multiple users, passwords, and so on.

Server Installation

For notes on server installations, click here.

Differences Between the Local Version and the Client/Server Version

In the client/server version, each Rpad page gets its own R process on the server, so multiple users won't interfere with each other. In the local version of Rpad, the browser connects with one R process, so there is more chance of interaction between multiple Rpad pages. Each time the user starts a new page in the client-server version, the local environment is clean. But in the local version of Rpad, this is not true: the local environment may have objects from previous Rpad pages or from interaction at the command line.

There are several differences in directory layouts between the local and client-server versions of Rpad. In the local version, the current working directory of the R process is the directory of the Rpad page. In the client-server version, the current directory of the R process is two levels below the Rpad page (something like /var/www/Rpad/server/ddjieyfewx/ for files with a root of /var/www/Rpad/). In both versions, graphics are created in R's current working directory. But pages may read or write data differently depending on whether they run in the local or the client-server version. See the Rpad functions RpadURL, RpadBaseURL, and RpadBaseFile for convenience utilities to make pages portable between the local version and the client-server version.
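The path arithmetic behind this portability problem can be worked through with the directories given above. The sketch below is only an illustration: resolvePath is a small hand-written helper, and data.csv is a hypothetical file sitting next to the Rpad page.

```javascript
// Illustration only: resolve a relative path against a working directory.
function resolvePath(base, rel) {
  const parts = base.split("/").filter(Boolean);
  for (const seg of rel.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") parts.pop(); // go up one directory
    else parts.push(seg);
  }
  return "/" + parts.join("/");
}

// Client-server version: R runs two levels below the page directory.
const serverCwd = "/var/www/Rpad/server/ddjieyfewx/";
console.log(resolvePath(serverCwd, "../../data.csv")); // /var/www/Rpad/data.csv

// Local version: R runs in the page directory itself, so the same
// relative path now points two levels above the page.
const localCwd = "/var/www/Rpad/";
console.log(resolvePath(localCwd, "../../data.csv")); // /var/data.csv
```

The same relative path lands in two different places, which is why a page that hard-codes "../../" is not portable and why the convenience functions named above are the safer choice.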

In the local version of Rpad, the default graphics device is R's builtin png device. On the client-server version, the default graphics device is the ghostscript-based pngalpha device. Therefore, graphic files may look different between versions. The pngalpha device has several advantages: it is antialiased, you can link to an eps file, and you have more fonts. Its disadvantages are that you have to have ghostscript installed, and it's slower. On the server side, if you want to use R's png device on Unix, you must have X running.




by Tom Short, tshort@eprisolutions.com, Copyright 2005. EPRI Solutions, Inc., license: GNU GPL v2 or greater