1 Introduction

The Web was originally developed to support collaboration in science. Although scientists benefit from many forms of collaboration on the Web (e.g., blogs, wikis, forums, and code sharing), most collaborative projects are still coordinated over email, phone calls, and in-person meetings.

We are interested in supporting scientific collaborations where joint work occurs on a concrete problem of interest, with many participants, and over a long period of time. Although the Web may be used to share information, it offers no explicit support for the shared tasks involved. Instead, these tasks are discussed through email, phone calls, and occasional face-to-face meetings.

2 Organic Data Science

We are developing an Organic Data Science framework that is based on a task-centered organization of the collaboration and that incorporates principles from the social sciences for successful on-line communities. Figure 1 illustrates the representation of a task, which includes properties such as the owner, participants, start and end times, and expertise required. Users can create additional task properties, as is typical in semantic wikis. They can also add subtasks and sign up as participants in tasks created by others. These tasks record the what, who, when, and how of the activities pursued by the collaboration, making explicit a form of science process that has not been captured before.
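The task representation described above can be sketched as a small data structure. This is a minimal illustration, not the framework's actual schema: the class name `Task`, its field names, the `sign_up`, `add_subtask`, and `people` methods, and the example users and task titles are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List


@dataclass
class Task:
    """Hypothetical in-memory model of a task page (not the framework's schema)."""
    title: str
    owner: str
    start: date
    end: date
    participants: List[str] = field(default_factory=list)
    expertise: List[str] = field(default_factory=list)
    extra: Dict[str, str] = field(default_factory=dict)  # user-defined properties
    subtasks: List["Task"] = field(default_factory=list)

    def sign_up(self, user: str) -> None:
        """Add a participant once; the owner is not listed again as a participant."""
        if user != self.owner and user not in self.participants:
            self.participants.append(user)

    def add_subtask(self, task: "Task") -> "Task":
        """Attach a subtask and return it so callers can keep working with it."""
        self.subtasks.append(task)
        return task

    def people(self) -> List[str]:
        """Everyone involved in this task or any subtask, sorted and de-duplicated."""
        names = {self.owner, *self.participants}
        for sub in self.subtasks:
            names.update(sub.people())
        return sorted(names)


# Illustrative data only: the names, titles, and dates are made up.
root = Task("Estimate age of water", owner="alice",
            start=date(2014, 10, 1), end=date(2014, 12, 10),
            expertise=["hydrology"])
sub = root.add_subtask(Task("Collect lake data", owner="bob",
                            start=date(2014, 10, 1), end=date(2014, 11, 1)))
sub.sign_up("carol")
sub.sign_up("carol")                      # signing up twice has no effect
root.extra["Review status"] = "pending"   # an additional user-defined property
```

Keeping user-defined properties in an open dictionary mirrors the semantic-wiki habit of letting users add properties as needed, while the fixed fields cover the properties named in the text.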

Our Organic Data Science framework is implemented as an extension of a semantic wiki, in particular the Semantic MediaWiki platform [1]. Users can add properties to tasks as needed, and can describe any entity of interest to the collaboration (datasets, software, papers, etc.) using semantic properties of the wiki. Every task has its own page, and therefore a unique URL, which gives users a way to refer to the task from any other page in the site as well as from outside it. Semantic wikis provide an easy-to-use interface where users can define structured properties, which are then represented in RDF and exported as linked data. The framework is still under development, and it evolves to accommodate user feedback and to incorporate new collaboration features.
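To make the idea of page properties represented as RDF concrete, the following sketch turns the properties of one task page into subject-predicate-object statements written as N-Triples lines. It is only an illustration under stated assumptions: the base URL `http://example.org/wiki/`, the `Property:` namespace URI, and the helper names `page_uri` and `to_ntriples` are hypothetical, and the export produced by Semantic MediaWiki itself uses its own vocabulary and serialization.

```python
from urllib.parse import quote

BASE = "http://example.org/wiki/"            # hypothetical site URL
PROP = "http://example.org/wiki/Property:"   # hypothetical property namespace


def page_uri(title: str) -> str:
    """Build a URI from a page title; spaces become underscores as in wiki URLs."""
    return BASE + quote(title.replace(" ", "_"))


def to_ntriples(title, properties, page_valued=()):
    """Return one N-Triples line per (property, value) pair of a page.

    Properties named in `page_valued` point at other wiki pages and become
    URIs; all other values are emitted as plain string literals.
    """
    lines = []
    for name, value in properties:
        pred = PROP + quote(name.replace(" ", "_"))
        if name in page_valued:
            obj = f"<{page_uri(value)}>"
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{page_uri(title)}> <{pred}> {obj} .")
    return lines


# Illustrative page and values only.
triples = to_ntriples(
    "Collect lake data",
    [("Owner", "Bob"), ("Expertise", "limnology")],
    page_valued={"Owner"},
)
```

Because every task page has its own URL, the page itself can serve as the subject of each statement, which is what lets other pages and outside sites refer to the same task.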

We view the scientific collaboration as an on-line community, and have designed the Organic Data Science framework following social design principles uncovered by research on successful on-line communities [2]. The community design aspects are described in [3]. The semantic aspects of the design are described in [4].

The Organic Data Science framework captures science processes that are not made explicit in publications, supports the formation of ad-hoc groups to work on tasks of interest, enables anyone to contribute to tasks that match their interests, and advertises ongoing work to potential newcomers.

Fig. 1. Collaboration is organized around tasks in the Organic Data Science framework, represented through RDF properties in the underlying semantic wiki. The system uses these semantic properties to generate content for other pages, such as the user page shown on the right side, which is based on the tasks that this user is participating in. Status icons, shown as pie charts next to tasks, are also derived from the semantic properties that specify task type and deadlines.
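As a rough illustration of how such derived content could be computed from task properties, the sketch below lists the tasks a user is involved in and computes a progress value for a status icon. The rule used here, the fraction of the task's time window that has elapsed, is an assumption made for this example; the text states only that the icons are derived from task type and deadlines. The function names and the dictionary keys are likewise hypothetical.

```python
from datetime import date


def tasks_for_user(tasks, user):
    """Titles of tasks where `user` is owner or participant, in input order."""
    return [t["title"] for t in tasks
            if user == t["owner"] or user in t["participants"]]


def elapsed_fraction(start, end, today):
    """Share of the task's time window that has passed, clamped to [0, 1].

    Assumed rule for illustration only, not the framework's documented one.
    """
    total = (end - start).days
    if total <= 0:
        return 1.0 if today >= end else 0.0
    return min(1.0, max(0.0, (today - start).days / total))


# Illustrative data only.
tasks = [
    {"title": "Collect lake data", "owner": "bob", "participants": ["carol"]},
    {"title": "Calibrate model", "owner": "alice", "participants": ["bob"]},
    {"title": "Write report", "owner": "alice", "participants": []},
]
bobs_tasks = tasks_for_user(tasks, "bob")
halfway = elapsed_fraction(date(2014, 10, 1), date(2014, 10, 11), date(2014, 10, 6))
```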

Table 1 illustrates the features of the Organic Data Science framework (shown at the bottom), compared with other collaborative tools on the Web that scientists use. Our framework is the only one that is designed to support on-line communities, is organized around tasks, and captures semantic structures for the entities involved in the collaboration.

Table 1. An overview of features supported by existing online collaboration tools.

3 Supporting Scientific Collaboration

The main user of the Organic Data Science framework is a community of hydrologists and limnologists who are studying the age of water in an ecosystem. Other communities are beginning to use the framework for neuroscience and geosciences research.

Table 2 shows the tasks defined so far in the age of water collaboration. In a 10-week period starting on 1 October 2014, all task pages together were accessed more than 2,900 times, and person pages were accessed 328 times. In total, we logged more than 19,000 events as users interacted with the system. Users have defined a total of 1,047 RDF triples.

Table 2. Distribution of owners and participants across task types.

The Organic Data Science framework provides every community with a dashboard that evaluates live collaboration data. A community dashboard includes a collaboration graph and statistical task metadata. The oldest ODS community is the age of water community (footnote 1), followed by the Organic Data Science framework development community, the geosciences community (footnote 2), and the private neuroscience community.
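One plausible way to derive a collaboration graph from task data is to connect two people whenever they appear on the same task and to weight the edge by the number of tasks they share. The sketch below follows that assumption; the text does not specify how the dashboard builds its graph, and the function name `collaboration_graph`, the dictionary keys, and the example users are hypothetical.

```python
from collections import Counter
from itertools import combinations


def collaboration_graph(tasks):
    """Count, for each pair of people, how many tasks they share.

    Each key is an alphabetically ordered (person, person) pair, so the
    graph is undirected; the value is the number of shared tasks.
    """
    edges = Counter()
    for task in tasks:
        people = sorted({task["owner"], *task["participants"]})
        for a, b in combinations(people, 2):
            edges[(a, b)] += 1
    return edges


# Illustrative data only.
example_tasks = [
    {"owner": "alice", "participants": ["bob", "carol"]},
    {"owner": "bob", "participants": ["alice"]},
]
graph = collaboration_graph(example_tasks)
```

Sorting each task's people before pairing them keeps one key per pair regardless of who owns the task, so repeated collaborations accumulate on the same edge.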