Budget and User Feedback Control Strategy-Based PRMS Scenario Web Application
The Precipitation-Runoff Modeling System (PRMS) is used to study and simulate hydrological systems. It is common for an environmental scientist to execute hundreds of PRMS model runs to explore different scenarios for a study area. For a complex study case this procedure can be very time-consuming, and creating different scenarios is difficult without an efficient method. In this paper, we propose a PRMS scenario web application. It can execute multiple model runs in parallel and automatically rent extra servers as needed. The control strategy introduced in the paper guarantees that expenses stay within the planned budget and alerts the system manager if the quantified user feedback score crosses a predefined threshold. The application has user-friendly interfaces, and any user can create and execute different PRMS model scenarios by simply clicking buttons. The application can also support environmental models other than PRMS by filling in a blueprint file.
Keywords: PRMS · Budget control · User feedback · Web application
64.1 Introduction
The U.S. Geological Survey developed the Precipitation-Runoff Modeling System (PRMS) in the 1980s [8, 9, 10]. The model is widely used in hydrological research. A PRMS model run can take a very long time on a regular personal computer, especially if the user chooses to generate a PRMS animation file. This problem becomes worse if a user builds different scenarios and starts multiple PRMS model runs. Some similar systems already exist; however, most of them do not consider how to change the server size based on demand. Another problem is that PRMS is executed only in a terminal, without friendly user interfaces, and all the input data is stored in text files. A new user can spend a long time learning how to start a model run and modify the input parameters.
To solve these problems, we built a web-based PRMS scenario tool. It executes PRMS models in parallel, changes the server size based on the budget, takes user opinions into account, and provides user-friendly interfaces. The tool contains two parts: server and client. The server contains multiple Docker workers that process the PRMS model run requests. These workers contain the execution files and are independent of each other. A server usually has better hardware than a normal personal computer, so performance is guaranteed. If there are more PRMS model run requests than the owned servers can handle, the system can automatically rent machines to increase its capacity. It can also quantify user feedback and warn the system manager if users are not satisfied with the system. The client provides user-friendly interfaces: a user can execute model runs, check modifications, and create different scenarios with simple mouse clicks.
Our proposed method controls the system through the budget and user feedback. The budget to set up a server is crucial: unlike industry companies, academia usually has limited funding, and the budget should follow the plan. User feedback can be used to test whether users are satisfied; however, a huge amount of user feedback can be hard to process. In our opinion, user feedback should be quantified, which simplifies the analysis procedure. A prototype system has been set up and is running .
In the rest of the paper, Sect. 64.2 introduces related work; Sect. 64.3 shows the system design and how the components are connected; Sect. 64.4 presents how the system changes its size automatically based on the budget and quantifies user opinions; Sect. 64.5 explains how to create a PRMS scenario and describes other services.
64.2 Related Work
Numerous studies have been conducted in the field of dynamic provisioning of computing resources in cloud environments. Some successful works are briefly discussed in this section.
Calheiros et al.  proposed adaptive provisioning of computing resources based on workload information and analytical performance to offer end users guaranteed Quality of Service (QoS). The QoS targets were application specific and were based on request service time, the request rejection rate, and the utilization of available resources. The proposed model could estimate the number of VM instances to allocate for each application by analyzing the observed system performance and the predicted load information. The efficiency of the provisioning approach was tested using application-specific workloads, and the model could dynamically provision resources to meet the predefined QoS targets by analyzing variations in the workload intensity. However, the approach offers no control over expenses, as it considers neither budget constraints nor user feedback while provisioning resources.
Zhu and Agrawal  proposed a dynamic resource provisioning algorithm based on feedback control and budget constraints to allocate computational resources. The goal of the study was to maximize the application QoS while meeting both time and budget constraints. CPU cycles and memory were dynamically provisioned between multiple virtual machines inside a cluster to meet the application QoS targets. The proposed approach worked better than static scheduling methods and conserving strategies for resource provisioning. Its flaw was that it requires reconfiguring computing resources within the machine instances, which is not recommended in the current cloud environment, where resources can be managed efficiently by adding and removing virtual machines from the cloud host providers. Moreover, dynamic allocation based on CPU cycles and memory usage can often be inaccurate, as these parameters do not truly indicate the need for more resources; the virtual machine may just be busy with low-CPU or low-network jobs.
In this paper, we propose a server-usage optimization approach to facilitate on-demand provisioning of computing resources, ensuring consistently reduced waiting times for jobs over a predefined period of time within the allocated budget constraints. The proposed approach uses a modified queuing model to estimate waiting time and queue length from the budget amount, which helps the admin make budget decisions more easily. User feedback is continuously monitored in the system, and automatic alert emails notify the admin when users experience severe performance issues.
There has been much research in the field of environmental modeling by different interdisciplinary research groups. The Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI)  is a research organization comprising universities in the United States that develops services and infrastructure for understanding and exploring questions in water science. HydroShare  is one of its projects, aimed at providing cyberinfrastructure for an online collaborative environment for sharing hydrological models and data. HydroShare offers various web apps to share, visualize, analyze, and run hydrological models; its goal is to enhance collaboration in the research community by helping researchers discover and access data and models published by others. The Geographic Storage, Transformation and Retrieval Engine (GSToRE)  is a data management framework supporting the storage, management, discovery, and sharing of scientific and geographic data. It was developed at the Earth Data Analysis Center, University of New Mexico, with the goal of offering a flexible and scalable data management platform for scientific research. GSToRE offers a REST API for the storage, retrieval, and removal of geospatial data and associated metadata.
64.3 System Design
64.3.1 Queue Master
To arrange resources reasonably and in order, our system has a queue component. All user requests are handled by a Docker container named “Queue Master” (see Fig. 64.1). This container classifies the requests into groups based on the required service types, and the different requests then enter different queues. For example, if a user wants to run a PRMS scenario, the request is handled by the PRMS scenario service and therefore enters the corresponding queue. The system can offer an estimated waiting time based on the method introduced in this paper, and the user can choose a queue based on their needs.
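As a toy illustration of the dispatch logic described above, the per-service queues might be sketched as follows. The class and method names, and the simple backlog-based waiting-time formula, are our own simplifications, not the prototype's actual API:

```python
from collections import deque

class QueueMaster:
    """Sketch of the queue master: one FIFO queue per service type."""

    def __init__(self, service_types):
        self.queues = {s: deque() for s in service_types}
        self.avg_job_time = {s: 0.0 for s in service_types}  # seconds, from history

    def enqueue(self, service_type, request):
        # Classify the request by its required service type.
        self.queues[service_type].append(request)

    def estimated_wait(self, service_type, n_workers):
        # Rough estimate: jobs ahead in the queue, divided across the
        # workers serving this queue, times the average job duration.
        backlog = len(self.queues[service_type])
        return backlog * self.avg_job_time[service_type] / max(n_workers, 1)

qm = QueueMaster(["prms_scenario", "data_convert"])
qm.avg_job_time["prms_scenario"] = 120.0
for i in range(4):
    qm.enqueue("prms_scenario", {"job": i})
print(qm.estimated_wait("prms_scenario", 2))  # 4 jobs * 120 s / 2 workers = 240.0
```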
64.3.2 Server Master
The server master inspects the health information of all host machines, such as the CPU, memory, and network usage percentages, as well as server failures and budget information. It can automatically rent another host machine based on the control strategy; the renting event is triggered by the rules stored in the Rule Manager container.
To connect Docker containers on different host machines, we set up a key-value store node. Containers on the same host machine can ping each other directly without further configuration; containers on different host machines cannot. The key-value store node stores the different host machine IPs, networks, and endpoints. After the node is set up, the host machines can discover each other, and it is then easy to create a Docker overlay network across them.
For each group, there should be a task manager node and a feedback collector. The task manager node creates worker containers on different host machines and deletes a worker container after its job is finished. We did not use an orchestration tool such as Docker Swarm  or Apache Mesos  because, to our knowledge, these tools cannot stop a container on a specific machine. They can change the server size based on events such as CPU and memory usage percentages, but only at a high level (the server as a whole). In our experience, only the worker container itself knows what happens inside it, so it is not reasonable to shrink the server based on CPU usage or elapsed time; sometimes CPU usage is low while the container is still busy. For example, file transfer jobs mainly use I/O bandwidth instead of CPU or memory. In our opinion, a container should be stopped only after it finishes the last line of its script. Therefore, each worker in our prototype system sends a termination request to the task manager after it finishes its job, and the task manager then stops the container. For a large cluster, many containers may report termination events to the server master container at the same time, which can cause problems if the network bandwidth is insufficient. There are two solutions: (1) set up more than one server master container to process the reports, with each master container in charge of only a group of containers, which reduces the burden; (2) apply jittered API calls: instead of sending the termination information to the server master immediately after finishing its job, a worker container waits a random time before sending it, which avoids too much information occupying the network at once.
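The jittered reporting in solution (2) can be sketched as below. The function name, the jitter bound, and the injected `sleep`/`rand` hooks (which make the sketch testable) are illustrative stand-ins, not the prototype's real API:

```python
import random
import time

def report_termination(task_manager_notify, worker_id, max_jitter_s=5.0,
                       sleep=time.sleep, rand=random.uniform):
    """After finishing the last line of its script, a worker waits a
    random jitter before notifying the task manager, so that many
    workers finishing together do not flood the network at once.
    `task_manager_notify` is a stand-in for the real API call."""
    delay = rand(0.0, max_jitter_s)
    sleep(delay)  # spread termination reports over the jitter window
    task_manager_notify({"worker": worker_id, "event": "terminate"})
    return delay
```

Injecting `sleep` and `rand` also makes the behavior deterministic under test, which is why they are parameters rather than hard-coded calls.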
The relationships between the server master node, host machines, and worker containers are displayed in Fig. 64.2.
64.3.3 Servers
Servers are the physical machines in the system. These machines are divided into groups based on the needs of the different services; for example, since there are more PRMS scenario requests than other requests, more machines are placed in the PRMS scenario group. The server master runs Docker configuration files to download Docker images and set up services on the different machines, and it may automatically rent more machines based on the rules stored in the rule manager.
64.3.4 Feedback Collector
User feedback is very important for the project manager when setting up reasonable rules. In our prototype system, a survey is used to collect the feedback. The system can turn the survey results into a feedback score and then change the server size automatically, or offer suggestions to the project manager, based on that score.
The feedback collector sends a survey invitation to each user after he/she uses a service. Each question has a different weight, and the options of each question carry different points. The project manager can set a threshold for each question; if the score passes the threshold, the project manager needs to act. Here is an example: the project manager wants to know the users' opinion about the service performance, so he puts two questions in the survey: (1) What is your opinion about the waiting time? (question weight 0.8). Options: A. Too long (2 points), B. Long (1 point), C. Not sure (0 points), D. Short (−1 point), E. Very short (−2 points). (2) Do you want to pay more to have a faster service? (question weight 1.2). Options: A. Strongly agree (2 points), B. Agree (1 point), C. Not sure (0 points), D. Disagree (−1 point), E. Strongly disagree (−2 points). If a user chooses A for the first question and D for the second, the user contributes 0.8 * 2 + 1.2 * (−1) = 0.4 to the global feedback score. If the global feedback score passes the threshold, the users believe the server is slow and are willing to pay for a faster service, so the project manager may need to allow the server master to rent more machines from third-party companies. Each service has a feedback collector containing a survey predefined by the project manager. Based on the feedback score, the feedback collector can affect the rules stored in the rule manager.
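The scoring scheme in the example can be expressed directly in code. The question keys and the survey layout below are hypothetical; only the weights and option points come from the example:

```python
# Survey from the example: question weights and per-option points.
SURVEY = {
    "waiting_time": {"weight": 0.8,
                     "points": {"A": 2, "B": 1, "C": 0, "D": -1, "E": -2}},
    "pay_more":     {"weight": 1.2,
                     "points": {"A": 2, "B": 1, "C": 0, "D": -1, "E": -2}},
}

def feedback_score(answers, survey=SURVEY):
    """Weighted sum of the option points for one user's answers."""
    return sum(survey[q]["weight"] * survey[q]["points"][opt]
               for q, opt in answers.items())

# The user in the example: A on question 1, D on question 2.
score = feedback_score({"waiting_time": "A", "pay_more": "D"})
print(round(score, 1))  # 0.8*2 + 1.2*(-1) = 0.4
```

Per-user contributions like this can then be summed into the global feedback score that is compared against the manager's threshold.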
64.3.5 Rule Manager
The queue master and server master follow rules stored in the rule manager; the rules are applied to these two masters through RESTful APIs. When the project managers want to add a new rule, they may also need to modify the RESTful APIs of the queue master and server master, because the APIs may not yet include the functions required by the rule. For example, suppose a rule requires that the average job waiting time in the Service 1 queue be 10 s, but the queue master has no RESTful API to change the average job waiting time; then the project managers should work on the API first.
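A minimal sketch of how rules might be represented and checked, assuming a hypothetical rule shape (the paper does not specify the rule manager's actual format); a rule whose metric the target master does not expose corresponds to the missing-API situation above:

```python
# Hypothetical rule shape: each rule names a metric exposed by a master's
# RESTful API, a comparison operator, and a target value.
RULES = [
    {"target": "queue_master",  "metric": "avg_wait_s", "op": "<=", "value": 10},
    {"target": "server_master", "metric": "cpu_pct",    "op": "<",  "value": 90},
]

OPS = {"<=": lambda a, b: a <= b, "<": lambda a, b: a < b,
       ">=": lambda a, b: a >= b, ">": lambda a, b: a > b}

def violated_rules(metrics, rules=RULES):
    """Return the rules whose metric is missing or out of bounds.
    A missing metric means the master's API does not expose it yet,
    so the project manager must extend that API first."""
    bad = []
    for r in rules:
        value = metrics.get(r["target"], {}).get(r["metric"])
        if value is None or not OPS[r["op"]](value, r["value"]):
            bad.append(r)
    return bad
```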
64.4 Control Strategy
The control strategy is stored in the rule manager of the prototype system (introduced in Sect. 64.3.5). The strategy covers how service requests are handled in a queue and how the server is managed based on user feedback. This section introduces the main ideas; more details on the algorithm and its validation can be found in our previous work [5, 17].
Algorithm 1: Create rented worker
Algorithm 1 illustrates the logic for creating new rented workers in the proposed system. Given the budget amount (B) and the cost of rented instances (P per hour), the system estimates the total available rented time from the cloud provider as T = B/P. The average execution time of a job (T_rent) is already available in the system from previous job executions, so the total number of jobs that can be processed with rented workers is N = T/T_rent. A counter variable RW (Rented Workers) keeps track of the total number of rented jobs. To use the rented resources judiciously, the rented workers are distributed uniformly across the budget time period T_b: the system should rent a worker at most once per interval T_int = (T_b * T_rent * P)/B, and only if no owned server is available to process the job. At every T_int interval, the system inspects whether new workers are needed. If owned workers are not available (i.e., jobs are waiting in the queue), the system creates a new worker on one of the rented machines in the worker pool and increments RW by one. If an owned worker is available at T_int, the system records the occasion in a counter variable UR (Unused Rentals), so that later, if a job arrives and no owned worker is available, the system can immediately create a new rented worker instead of waiting for the next T_int interval. The project manager can increase or decrease the budget amount in the middle of the execution, and the system updates N accordingly. Because the execution time of a job varies with the workload on the host machine, the value of N is also updated on completion of each job based on the actual running time the job took. This process repeats until the number of rented jobs equals N, i.e., the maximum number of jobs that can be processed with rented containers for the given budget. More details about the budget control strategy can be found in our previous work .
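The derived quantities and the per-interval decision of Algorithm 1 can be sketched as below. Function and variable names are ours, and the sketch omits the queue and worker-pool plumbing; it only shows the arithmetic and the branch taken at each T_int tick:

```python
def rent_schedule(budget, price_per_hr, t_rent_hr, t_budget_hr):
    """Quantities from Algorithm 1 (symbols follow the text):
    T = B/P total rented hours, N = T/T_rent jobs affordable,
    T_int = (T_b * T_rent * P)/B spacing between rentals (= T_b/N)."""
    total_rented_hr = budget / price_per_hr                      # T
    n_jobs = total_rented_hr / t_rent_hr                         # N
    t_int = (t_budget_hr * t_rent_hr * price_per_hr) / budget    # T_int
    return total_rented_hr, n_jobs, t_int

def decide_at_interval(owned_worker_free, jobs_waiting, rw, ur, n_jobs):
    """One T_int tick: rent a worker only if jobs are waiting, no owned
    worker is free, and the budget still covers it; otherwise bank the
    interval as an unused rental (UR) that can be spent immediately
    when a later job finds no owned worker free."""
    if rw >= n_jobs:
        return "budget_exhausted", rw, ur
    if jobs_waiting and not owned_worker_free:
        return "rent", rw + 1, ur
    return "skip", rw, ur + 1
```

For example, a $100 budget at $2/hr with 0.5 hr average jobs over a 200 hr budget period gives T = 50 hr, N = 100 jobs, and T_int = 2 hr, consistent with T_int = T_b/N.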
The system also includes a dashboard where the user can view the details of finished jobs in real time. When a model simulation job finishes, the dashboard displays details such as the task id, the cost of the job execution, the run time of the job, its waiting time in the queue, the name of the worker that processed the job, and the category to which the worker belongs (owned or rented). The dashboard also shows the total number of finished jobs, the numbers of jobs processed by rented and by owned workers, and the remaining amount available to spend. The prototype has a feedback survey form where the user can provide feedback on the performance of the service. The manager can view the results of the survey on the Survey Results page, which displays a separate bar graph for each question in the questionnaire showing the total votes obtained for each option. This helps the manager easily understand how efficiently the system serves its users and supports decisions on increasing or decreasing the budget amount.
64.5 PRMS Scenario Tool
The PRMS Model Scenario component enables researchers to modify existing model simulations and re-run models with modified input files to analyze user-defined model scenarios. It also provides a data conversion service that allows a user to change the format of model input and output files. No programming skills are required to use the model modification component: the user interface is intuitive and user-friendly, so the user can perform model modifications through simple mouse clicks.
To create a user-defined simulation scenario, the user chooses one of the existing model simulations, or a default simulation, for modification. Then the user determines which parameters to modify to obtain the desired model scenario. For example, the user can change the vegetation type of the study area from “forest” to “bare ground” to study what happens if too many trees are cut in the field. Once the parameters are decided, the user specifies the Hydrologic Response Units (HRUs) of the study area that should be changed, so the system knows where and what to modify.
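A minimal sketch of the parameter-modification step, assuming a simplified in-memory parameter table (the actual PRMS parameter file format is more involved, and the vegetation-type codes shown are illustrative):

```python
def modify_scenario(params, name, hru_ids, new_value):
    """Return a copy of the parameter table with parameter `name`
    changed to `new_value` for the selected HRUs only. `params` maps
    a parameter name to a list indexed by HRU id (a simplification
    of the real PRMS parameter file)."""
    out = {k: list(v) for k, v in params.items()}  # leave the base run intact
    for hru in hru_ids:
        out[name][hru] = new_value
    return out

# Example: change vegetation type ("cov_type", codes illustrative) from
# forest (3) to bare ground (0) on HRUs 1 and 2 of a four-HRU model.
base = {"cov_type": [3, 3, 3, 3]}
scenario = modify_scenario(base, "cov_type", [1, 2], 0)
print(scenario["cov_type"])  # [3, 0, 0, 3]
```

Copying the table rather than editing in place mirrors the tool's behavior of deriving a new scenario from an existing simulation without touching it.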
64.5.1 HRU Selection Methods
64.5.1.1 Manual Selection
64.5.1.2 Parameter Selection
64.5.2 Other Services
64.5.2.1 Data Convertor
In the PRMS Scenario tool, data is stored in NetCDF format. NetCDF is a self-describing and machine-independent data format  widely used in climate research. However, this format may not be supported by other tools used by a modeler. Therefore, the scenario tool contains a data convertor that can convert a NetCDF file into text and a text file into NetCDF. More details are introduced in .
The system offers a one-time authentication service: a user only needs to log in once and can then use all the services in the system. This is done with JWT (JSON Web Token). It is safer because the system passes a JWT instead of the user's username and password. More details can be found in our previous work .
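For illustration, a minimal HS256 JWT can be built with the Python standard library alone. This is a sketch of the token mechanism, not the system's actual implementation, which would typically use a library such as PyJWT:

```python
import base64, hashlib, hmac, json

def _b64(data: bytes) -> str:
    # JWT uses unpadded base64url encoding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_jwt(payload: dict, secret: bytes) -> str:
    """Minimal HS256 JWT: header.payload.signature."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = _b64(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify_jwt(token: str, secret: bytes) -> dict:
    """Recompute the signature; return the payload only if it matches."""
    header, body, sig = token.split(".")
    signing_input = f"{header}.{body}".encode()
    expected = _b64(hmac.new(secret, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    pad = body + "=" * (-len(body) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(pad))
```

After a single login, the server hands the client such a token; each later request presents the token instead of the username and password, which the server verifies without re-authenticating.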
64.6 Conclusion and Future Work
In this paper, we have proposed a PRMS scenario web application and a budget and user feedback control framework. It allows a modeler to create different scenarios and execute them in parallel. The application's server can automatically rent extra machines to increase its computing power based on needs and warn the system manager based on the quantified user feedback. The budget control strategy also makes sure the renting cost stays within the plan. The proposed design makes it easy to extend the system with other models and control rules. In the future, we want to extend the tools with more environmental models and add a payment component. We would also like to improve the proposed queuing model by considering server starting time; this can be challenging because the server starting time is usually not fixed and can be very long. Last but not least, the proposed tools should be validated with programs other than environmental models.
This material is based upon work supported by the National Science Foundation under grant numbers IIA-1329469 and IIA-1301726. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
- 1. R.N. Calheiros, R. Ranjan, R. Buyya, Virtual machine provisioning based on analytical performance and QoS in cloud computing environments, in Proceedings of the 2011 International Conference on Parallel Processing, ICPP '11 (IEEE Computer Society, Washington, DC, 2011), pp. 295–304
- 2. Docker, Docker – build, ship and run any app. https://www.docker.com/. Accessed 18 July 2017
- 3. Docker, Docker Swarm – Docker. https://www.docker.com/products/dockerswarm. Accessed 18 July 2017
- 4. M.M. Hossain et al., Web-service framework for environmental models, in Seventh International Conference on Internet Technologies & Applications (ITA) (IEEE, New York, 2017)
- 5. M. Hossain et al., Becoming DataONE tier-4 member node: steps taken by the Nevada Research Data Center, in 2017 International Conference on Optimization of Electrical and Electronic Equipment (OPTIM) & 2017 International Aegean Conference on Electrical Machines and Power Electronics (ACEMP) (IEEE, New York, 2017), pp. 1089–1094
- 6. Consortium of Universities for the Advancement of Hydrologic Science (CUAHSI). https://www.cuahsi.org/. Accessed 18 July 2017
- 7. S. Karlin, J. McGregor, Many server queueing processes with Poisson input and exponential service times. Pacific J. Math. 8(1), 87–118 (1958)
- 8. G.H. Leavesley et al., Precipitation-runoff modeling system: user's manual. Water-Resources Investigations Report 83-4238 (1983)
- 9. S.L. Markstrom, R.G. Niswonger et al., GSFLOW – coupled ground-water and surface-water flow model based on the integration of the Precipitation-Runoff Modeling System (PRMS) and the Modular Ground-Water Flow Model. Water-Resources Investigations Report (2005)
- 10. S.L. Markstrom et al., The Precipitation-Runoff Modeling System, Version 4. U.S. Geological Survey Techniques and Methods, Book 6, Chap. B7 (2015), p. 158. http://dx.doi.org/10.3133/tm6B7
- 11. Apache Mesos. http://mesos.apache.org/. Accessed 18 July 2017
- 12. Open Geospatial Consortium, OGC network common data form (netCDF) standards suite (2014)
- 13. L. Palathingal et al., Data processing toolset for the virtual watershed, in 2016 International Conference on Collaboration Technologies and Systems (CTS) (IEEE, New York, 2016), pp. 281–287
- 14. D.G. Tarboton et al., HydroShare: an online, collaborative environment for the sharing of hydrologic data and models (invited). AGU Fall Meeting Abstracts (2013)
- 16. R. Wu, Virtual watershed platform. https://virtualwatershed.org/. Accessed 18 May 2017
- 17. R. Wu et al., Self-managed elastic scale hybrid server using budget input and user feedback, in 12th FC: Workshop on Feedback Computing, ICAC 2017, Columbus, OH (2017)
- 18. Q. Zhu, G. Agrawal, Resource provisioning with budget constraints for adaptive applications in cloud environments, in Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing (ACM, New York, 2010), pp. 304–307