Tutorial: Shiny Application to Visualize Data Generated On-the-Fly


--D. Thiebaut (talk) 08:31, 18 July 2015 (EDT)


This tutorial deals with visualizing data in R. It illustrates one way (there are others, probably simpler, faster, and more efficient) to generate data on the fly on a server and to display them in a Shiny application. When the user activates one of the input widgets in Shiny, an HTTP request is sent to a server, along with the value of the parameter defined by the widget. A Python script in the cgi-bin directory on the server is activated, generates new data, and sends them back to the Shiny app as plain text. The Shiny app then displays the data. You can try the resulting app here.
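The round trip described above can be sketched outside of R as well. The following Python fragment is only an illustration, not part of the tutorial's code: the helper names parse_population_text and fetch_population are hypothetical, and the parameter names and reply format are taken from the listings further down this page.

```python
import csv
import io
import urllib.parse
import urllib.request


def parse_population_text(text):
    """Turn the plain-text reply ("Id, time, pop" rows) into a list of int tuples."""
    # skipinitialspace drops the blank that follows each comma in the reply
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return [(int(r["Id"]), int(r["time"]), int(r["pop"])) for r in reader]


def fetch_population(base_url, proportion, simulations):
    """Send the widget values as URL parameters and parse the text that comes back."""
    query = urllib.parse.urlencode({"proportion": proportion,
                                    "simulations": simulations})
    with urllib.request.urlopen(base_url + "?" + query) as reply:
        return parse_population_text(reply.read().decode("utf-8"))


# Offline check with a few lines in the format the CGI script prints
sample = "Id, time, pop\n1, 0, 0\n1, 2, 1\n1, 4, 2\n"
rows = parse_population_text(sample)
print(rows)
```

Only the parsing step is exercised here; fetch_population needs a reachable server and is shown to make the request side of the exchange concrete.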


Other tutorials on R and various technologies can be found here.

Video Demo



Full Markdown


The raw, full markdown resulting from the CVC workshop is available here. Use for background information.

Shiny App


Python CGI script


This script is written in Python 3. It must reside in a cgi-bin directory on a server you have access to, on which the Apache server has been configured to allow CGI scripts.

The reason I use a cgi-bin solution is that the computational load for generating the data can be quite high, and using CGI allows me to put the computation on any server that has enough CPU power and memory, including AWS clusters.


dominique@hadoop0:~/R$ cat /usr/lib/cgi-bin/generatePop.py
#! /usr/bin/env python3
# generatePop.py
# D. Thiebaut
# CGI script that lives in /usr/lib/cgi-bin on remote server.
# 

import random, sys
import cgi

print( "Content-Type: text/plain" )
print()

# define parameters
dataFileName = "pop%02d_%04d.dat"
maxT         = 2          # how fast we progress in time
T            = 3 * 31     # max time frame (3 months)
maxPop       = 2400       # max # of students
proportion   = 0.50       # how much of the population contributes
                          # to new cases of infection
noSimulations= 200                    
oneBigFile   = True
severalFiles = False
printOut     = True
dico         = {}

def getParams():
    dico = {}
    arguments = cgi.FieldStorage()
    for i in arguments.keys():
        #print( i, "-->", arguments[i].value )
        dico[i] = arguments[i].value
    return dico

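def getProportion( dico ):
    # the Shiny slider moves in steps of 0.5, so the value may be "50.5":
    # float() accepts it, whereas int() would raise a ValueError
    if "param" in dico:
        prop = float(dico["param"])/100.0
    elif "proportion" in dico:
        prop = float(dico["proportion"])/100.0
    else:
        prop = 0.50

    return prop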

def getNoSimulations( dico ):

    if "simulations" in dico:
        noSim = int( dico["simulations"])
    else:
        noSim = 200
    return noSim

def generateOneInfectionHistory( Id ):
    global dataFileName
    
    # iterate and generate population
    pop = 0             # starting pop
    t = 0
    out = ""
    while t <= T:
        out += "%d, %d, %d\n" % ( Id, t, pop )
        if pop < maxPop / 2:
            pop += 1 + random.randrange( int( pop*proportion) +1 )
        else:
            pop += 1 + random.randrange( int( (maxPop - pop)*proportion) + 1 )
        pop = min( maxPop, pop )    
        t += 1 + random.randrange( maxT )
    return out


def main():
    global noSimulations, proportion

    # get the parameters from the URL
    dico = getParams()

    # get proportion parameter from URL
    proportion = getProportion( dico )
    noSimulations = getNoSimulations( dico )

    #print( "proportion = ", proportion )
    #print( "simulations = ", noSimulations )
    #return

    allOut = "Id, time, pop\n"
    if printOut:
        print( allOut, end="" )

    for i in range( noSimulations ):
        out = generateOneInfectionHistory( i+1 )       
        if printOut: 
            print( out, end="" )

        allOut += out
        if severalFiles:
            out = "Id, time, pop\n" + out
            open( "pop%02d_%04d.dat" % (int(proportion*100),i+1), 'w' ).write( out )
            #print( dataFileName % (int(proportion*100), i+1), "created" )

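    if oneBigFile:
        # use noSimulations rather than the loop variable i, which is
        # undefined if the loop above never ran
        with open( "pop%02d_%04d_%04d.dat" % (int(proportion*100),0,noSimulations), 'w' ) as f:
            f.write( allOut )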


        
main()
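A note for newer Python versions: the cgi module used above was deprecated in Python 3.11 and removed in Python 3.13. A minimal sketch of the same parameter handling with only urllib.parse follows. It assumes the web server passes the request parameters in the QUERY_STRING environment variable, as the CGI convention specifies. The function names get_params and get_proportion are hypothetical replacements, not the author's code.

```python
import os
from urllib.parse import parse_qs


def get_params(query_string=None):
    """Return the URL parameters as a dict of strings (last value wins)."""
    if query_string is None:
        # CGI convention: the part of the URL after '?' arrives in QUERY_STRING
        query_string = os.environ.get("QUERY_STRING", "")
    return {key: values[-1] for key, values in parse_qs(query_string).items()}


def get_proportion(params, default=0.50):
    """Read 'param' or 'proportion' as a percentage; fall back to the default."""
    raw = params.get("param", params.get("proportion"))
    try:
        # float() accepts "50" as well as "50.5"
        return float(raw) / 100.0
    except (TypeError, ValueError):
        return default


params = get_params("proportion=62.5&simulations=100")
print(params, get_proportion(params))
```

Falling back to the default on a missing or malformed value keeps the script from failing with a server error when the URL is typed by hand.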



The text returned by generatePop.py starts like this:

Id, time, pop
1, 0, 0
1, 2, 1
1, 4, 2
1, 5, 4
1, 7, 7
1, 8, 11
1, 10, 17
1, 12, 20
1, 14, 29
1, 16, 38
...


Shiny App


# app.R
# D. Thiebaut
# reads data from files on a Web server using cgi-bin
# uses a slider
library( "ggplot2" )
library( "shiny" )

# ===================================================
#                                                 S E R V E R
# ===================================================
server <- function( input, output ) {
  
  dataSet <- reactive( {
    fileName <- paste0( "http://hadoop0.dyndns.org/cgi-bin/generatePop.py?proportion=", input$n_breaks,
                        "&simulations=", input$noSimulations )
    read.csv( url( fileName ) )
  } )
  
  output$main_plot <- renderPlot({
    ggplot( data=dataSet(), 
            aes( x = time, y = pop, color = Id ) ) + 
      geom_point( ) +  
      scale_colour_gradientn(colours=rainbow(4)) + 
      stat_summary(fun = mean, geom = 'line', color = 'blue' )  # "fun" replaces "fun.y", deprecated since ggplot2 3.3.0
  } )
}

# ===================================================
#                                                          U I
# ===================================================

ui <- fluidPage(

  #---  TITLE  ---
  titlePanel( "Infected Population", windowTitle = "Growth of Infected Population" ),
  
  #---  MAIN PANEL ---
  mainPanel( h2( "Description"), p( "This graph shows the result of 200 simulations of the growth of a population of infected students on a campus, as a function of some 'magic' parameter controlled by the slider."),
             p( "The points show the growth resulting from the 200 simulations, and the line shows the average of the points over bins of 3 time periods." ),
             p( "The data are read from a URL where a server generates data on the fly.  The value of the slider is sent as a suffix to the URL (e.g. http://hadoop0.dyndns.org/cgi-bin/generatePop.py?param=71) and the server generates 200 different simulations." )
                
  ),

  #--- INPUT BOX ---
  selectInput(inputId = "noSimulations",
              label = "Number of Simulations:",
              choices = c(20, 100, 250, 500, 1000),
              selected = 250),
  
  #--- SLIDER ---
  sliderInput( inputId = "n_breaks", label = "Population Growth (magic param):",
               min = 1, max = 99, step = 0.5, value = 50 ),
  
  plotOutput(outputId = "main_plot", height = "300px")
)

shinyApp(ui = ui, server = server)
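To make the blue line less mysterious: this sketch assumes that stat_summary, as documented for ggplot2, summarizes the y values found at each distinct x value, so that with mean it averages the pop values sharing the same time. The Python fragment below reproduces that computation on a handful of made-up rows. The data and the function name mean_pop_by_time are illustrative assumptions, not output of the real simulation.

```python
from collections import defaultdict
from statistics import mean


def mean_pop_by_time(rows):
    """Average pop over all simulations for each distinct time value."""
    groups = defaultdict(list)
    for _sim_id, time, pop in rows:
        groups[time].append(pop)
    return {time: mean(pops) for time, pops in sorted(groups.items())}


# (Id, time, pop) rows from two made-up simulations
rows = [(1, 0, 0), (1, 2, 1), (1, 4, 2),
        (2, 0, 0), (2, 2, 3), (2, 3, 5)]
averages = mean_pop_by_time(rows)
print(averages)
```

Because each simulation advances time by a random step, some time values are reached by only a few simulations, which is one reason the averaged line can look jagged.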


Output


  • This Shiny App has been published to shinyapps.io, and is available here.
  • A static version of the app is shown below:


ShinyPopGrowth.png


Getting the Data Using the RCurl Package


Rather than pasting together the URL and its parameters by hand, we can use the RCurl package, whose getForm function takes the parameters as named arguments. The code is cleaner, and the behavior is the same.

# app.R
# D. Thiebaut
# reads data from files on a Web server using cgi-bin
# uses a slider
library( "ggplot2" )
library( "shiny" )
library( "RCurl")

server <- function( input, output ) {
  dataSet <- reactive( {
      URL <-  "http://hadoop0.dyndns.org/cgi-bin/generatePop.py"
      tt <- getForm( URL, param = toString(input$n_breaks), simulations = toString(input$noSimulations ) )
      read.csv( textConnection( tt ) )
  } ) 
  
  output$main_plot <- renderPlot({
    ggplot( data=dataSet(), 
            aes( x = time, y = pop, color = Id ) ) + 
      geom_point( ) +  
      scale_colour_gradientn(colours=rainbow(4)) + 
      stat_summary(fun = mean, geom = 'line', color = 'blue' )  # "fun" replaces "fun.y", deprecated since ggplot2 3.3.0
  } )
}

ui <- fluidPage(

  titlePanel( "Infected Population", windowTitle = "Growth of Infected Population" ),
  
  mainPanel( h2( "Description"), p( "This graph shows the result of 200 simulations of the growth of a population of infected students on a campus, as a function of some 'magic' parameter controlled by the slider."),
             p( "The points show the growth resulting from the 200 simulations, and the line shows the average of the points over bins of 3 time periods." ),
             p( "The data are read from a URL where a server generates data on the fly.  The value of the slider is sent as a suffix to the URL (e.g. http://hadoop0.dyndns.org/cgi-bin/generatePop.py?param=71) and the server generates 200 different simulations." )
                
  ),
  selectInput(inputId = "noSimulations",
              label = "Number of Simulations:",
              choices = c(20, 100, 250, 500, 1000),
              selected = 250),
  
  sliderInput( inputId = "n_breaks", label = "Population Growth (magic param):",
               min = 1, max = 99, step = 0.5, value = 50 ),
  
  plotOutput(outputId = "main_plot", height = "300px")
)

shinyApp(ui = ui, server = server)
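The appeal of passing parameters as named arguments is that the library assembles the query string for you. For comparison, the same idea in Python uses urllib.parse.urlencode, which also escapes characters that are not allowed in a URL. This is an illustrative sketch, and build_query_url is a hypothetical helper.

```python
from urllib.parse import urlencode

URL = "http://hadoop0.dyndns.org/cgi-bin/generatePop.py"


def build_query_url(base_url, **params):
    """Append the named parameters to base_url as an escaped query string."""
    return base_url + "?" + urlencode(params)


full_url = build_query_url(URL, param=50.5, simulations=250)
print(full_url)
```

With string concatenation, a value containing a space or an ampersand would silently corrupt the request; urlencode turns them into "+" and "%26".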


Food for Thought


  • The RCurl package supports ssh and scp, which makes it possible to directly copy data from a remote server, and run an application on the remote server, without it having to be a CGI script.
  • See http://stackoverflow.com/ for some ideas on how to do this...
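One way to experiment with this idea outside of R is to call the standard ssh client from a script. The Python sketch below is an illustration under several assumptions: the host user@example.org and the remote command are placeholders, key-based login is assumed to be configured, and the remote program is assumed to print its data on standard output.

```python
import subprocess


def ssh_command(host, remote_command):
    """Build the argument list for running remote_command on host through ssh."""
    # BatchMode=yes makes ssh fail instead of prompting for a password
    return ["ssh", "-o", "BatchMode=yes", host, remote_command]


def run_remote(host, remote_command, timeout=60):
    """Run the command remotely and return whatever it printed on stdout."""
    result = subprocess.run(ssh_command(host, remote_command),
                            capture_output=True, text=True,
                            timeout=timeout, check=True)
    return result.stdout


# Placeholder host and command; nothing is executed here
cmd = ssh_command("user@example.org", "python3 generatePop.py")
print(cmd)
```

Passing the command as a list rather than a single shell string avoids local shell quoting problems; run_remote is defined but not called, since it needs a real host.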



...


Publishing the Shiny App on Smith's Shiny Server


  • ssh to studio.smith.edu with your Smith credentials
  • Pick a name for your Shiny app. I've chosen dft_app1. This will be part of the URL of the shiny app, once uploaded to the Shiny server.
  • Enter the following commands at the Linux prompt:
mkdir /srv/shiny-server/dft_app1
cd  /srv/shiny-server/dft_app1
emacs -nw server.R

  • Enter the following code in the file server.R:


library(shiny)
library( "ggplot2" )

# Define server logic required to fetch the data and draw the scatter plot
shinyServer(
  function( input, output ) {
  
  dataSet <- reactive( {
    fileName <- paste0( "http://hadoop0.dyndns.org/cgi-bin/generatePop.py?proportion=", input$n_breaks,
                        "&simulations=", input$noSimulations )
    read.csv( url( fileName ) )
  } )
  
  output$main_plot <- renderPlot({
    ggplot( data=dataSet(), 
            aes( x = time, y = pop, color = Id ) ) + 
      geom_point( ) +  
      scale_colour_gradientn(colours=rainbow(4)) + 
      stat_summary(fun = mean, geom = 'line', color = 'blue' )  # "fun" replaces "fun.y", deprecated since ggplot2 3.3.0
  } )
} )


  • Save the file you just created, and create a new one:
emacs -nw ui.R

  • Enter the following code:
library( "shiny" )

shinyUI( fluidPage(
  
  #---  TITLE  ---
  #runtime: shiny
  #---
  mainPanel( h2( "Description"), p( "This graph shows the result of 200 simulations of the growth of a population of infected students on a campus, as a function of some 'magic' parameter controlled by the slider."),
             p( "The points show the growth resulting from the 200 simulations, and the line shows the average of the points over bins of 3 time periods." ),
             p( "The data are read from a URL where a server generates data on the fly.  The value of the slider is sent as a suffix to the URL (e.g. http://hadoop0.dyndns.org/cgi-bin/generatePop.py?param=71) and the server generates 200 different simulations." )
             
  ),
  
  #--- INPUT BOX ---
  selectInput(inputId = "noSimulations",
              label = "Number of Simulations:",
              choices = c(20, 100, 250, 500, 1000),
              selected = 250),
  
  #--- SLIDER ---
  sliderInput( inputId = "n_breaks", label = "Population Growth (magic param):",
               min = 1, max = 99, step = 0.5, value = 50 ),
  
  plotOutput(outputId = "main_plot", height = "300px")
) 
)
  • Save the file. One more command to make the files readable by all:
chmod a+r *
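The shell steps above (create the app directory, add server.R and ui.R, make the files readable by all) can also be scripted. This Python sketch is an illustration under assumptions: deploy_app is a hypothetical helper, the file contents are placeholders, and the demonstration writes to a temporary directory rather than to /srv/shiny-server.

```python
import stat
import tempfile
from pathlib import Path


def deploy_app(server_root, app_name, files):
    """Create server_root/app_name, write the given files, make them readable by all."""
    app_dir = Path(server_root) / app_name
    app_dir.mkdir(parents=True, exist_ok=True)
    for name, content in files.items():
        path = app_dir / name
        path.write_text(content)
        # same effect as "chmod a+r": add read permission for user, group, others
        path.chmod(path.stat().st_mode | stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return app_dir


# Demonstration in a throwaway directory with placeholder file contents
with tempfile.TemporaryDirectory() as root:
    app = deploy_app(root, "dft_app1", {"server.R": "# server code\n",
                                        "ui.R": "# ui code\n"})
    created = sorted(p.name for p in app.iterdir())
    print(created)
```

On the real server the first argument would be /srv/shiny-server, and the directory name chosen becomes part of the app's URL.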


SmithShinyServer.png