Introduction

Lee Dirks, Microsoft Corporation
Tony Hey, Microsoft Corporation

 

By now, it is a well-observed fact that scholarly communication is in the midst of tremendous upheaval. That is as exciting to many as it is terrifying to others. What is less obvious is exactly what this dramatic change will mean for the academic world – specifically what influence it will have on the research community – and the advancement of science overall. In an effort to better grasp the trends and the potential impact in these areas, we’ve assembled an impressive constellation of top names in the field – as well as some new, important voices – and asked them to address the key issues for the future of scholarly communications resulting from the intersecting concepts of cyberinfrastructure, scientific research, and Open Access. All of the hallmarks of sea-change are apparent: attitudes are changing, roles are adjusting, business models are shifting – but, perhaps most significantly, individual and collective behaviors are very slow to evolve – far slower than expected. That said, each of the authors in this CTWatch Quarterly issue puts forward a variety of visions and approaches, some practical considerations, and in several cases, specific prototypes or break-through projects already underway to help point the way.

Leading off is Clifford Lynch’s excellent overview (“The Shape of the Scientific Article in the Developing Cyberinfrastructure”) – an outstanding entry point to the broad range of issues raised by the reality of cyberinfrastructure and the impact it will have on scientific publishing in the near term. His paper is an effective preview of the fundamental shift in how scholarly communication will work, namely how the role of the author is changing in a Web 2.0 environment. A core element in this new world is the growing potential benefit of including data in submissions (or links to data sets). Lynch thoughtfully addresses the many implications arising in this new paradigm (e.g., papers + data) – and how policies and behaviors will need to adapt – most especially the impact this will have on the concept of peer review. He astutely raises the issue of the importance of software/middleware in this new ecosystem – namely in the areas of viewing/reading and visualization. This is a critical point for accurate dissemination to facilitate further research – and is also integral to discoverability as well as the ability to aggregate across multiple articles.

In his piece, “Next-Generation Implications of Open Access,” Paul Ginsparg provides an invaluable perspective on the current state of affairs – a “long view” – as one of the originators of the Open Access movement. Having in essence invented the Open Access central repository when he launched arXiv.org in 1991, Ginsparg offers a brief retrospective and forward-looking assessment of this space, a useful look at the features and functionality that open repositories must consider to stay relevant and to add value in this changing environment. Indeed, it is notable that arXiv.org has been able to remain true to its original tenets of being low-cost, selective, and complementary/supplemental to other publishers or repositories. However, Ginsparg’s treatment hints at several new directions and areas for enhancement/improvement relating to (a) storage and access of documents/articles at scale, (b) the social networking implications for large-scale repositories, and (c) how to handle compound objects, data and other related supporting documentation. Also insightful are Ginsparg’s musings on the economics of Open Access, and he surfaces the important theme highlighted by several of the authors in this issue – the notion that a generational shift is required to enable the necessary behavioral change, and the recognition that our field(s) may not progress until this reality is brought about.

Timo Hannay’s extremely useful survey of the Web 2.0 landscape serves as an especially valuable map of the terrain. In this environmental scan, Hannay takes a snapshot of the current state-of-the-art and provides not only definitions but also definitive examples/applications that demonstrate the reality, the potential, and the remaining hurdles faced by the social-networking phenomenon. Now that we’ve finally begun to realize the power and potential that had been promised us with the “web-as-platform” – we’re also understanding the many benefits and the driving force of the network effect: the more who participate, the richer the experience. (Yet, Hannay also points out the cruel truth that the scientific community has been miserably late to the game, when it should have been first – considering the Internet was initially constructed to facilitate the sharing of scientific data.) As exciting as it might be at this point in time, a core tenet of this article is that – as a community – we have yet to realize the full potential of Web 2.0, as we are still so very early in the initial phase. Considering that the very medium we are using changes/alters the methods we employ, Hannay stresses that it is “impossible to predict” the future, but the hints he provides promise us a very exciting journey.

 

Lynn Fink and Phil Bourne’s “Reinventing Scholarly Communication in the Electronic Age” is an especially compelling article in that it lays out examples of research and projects currently in progress to enact Web 2.0 principles. Echoing the irony from Hannay’s paper, the authors note that scientific and scholarly articles have not evolved at the same pace as other developments on the Internet. In an effort to change that, the University of California, San Diego is undertaking two projects to catalyze developments in this space: (1) the “BioLit” project relating to the semantic mark-up of journal articles enhanced during the authoring stage – not after the fact – and (2) the launch of “SciVee.com”, a new online resource for augmenting scientific papers with brief video presentations. Also striking is that the article raises a theme that appears in several of the other papers: the “generational change” that is a crucial underlying factor in the success of these types of projects. There is clearly agreement among the authors that the newer, younger generation is going to carry the scientific world forward in a way that the existing, established community cannot. So, changes that are being implemented now will begin to have greater and greater impact as this new generation of scientists and scholars shifts the behavior of the community. It is an exciting prospect – but the groundwork to ensure this occurs is only being laid now with enabling research projects such as these from UCSD.

Herbert Van de Sompel and Carl Lagoze’s “Interoperability for the Discovery, Use, and Re-Use of Units of Scholarly Communication” is an exciting look at the seminal work now underway related to the Open Archives Initiative’s “Object Reuse & Exchange” (ORE) project. Building upon a concept initially referenced in Hannay’s paper, they point out the need to understand “the change in the nature of the unit of scholarly communication” – meaning we can no longer effectively think about an article or paper as the primary vessel for conveying knowledge within the academic world or across scientific disciplines. The transmission of knowledge has grown, broadened, and now spans a range from atomic data to entire datasets, from a single paragraph to a series of articles about a specific concept, to presentations, videos, or other modes/formats related to the dissemination of a given concept. Since our ability to communicate information has exploded, we must likewise evolve our effort to describe, find, and utilize these new “compound objects.” This paper is an in-depth “nuts-and-bolts” presentation, and Van de Sompel and Lagoze explain why there is a crucial need to re-architect scholarship on the internet and how they propose to enact this to ensure that we maximize the intent and the value of what is made available for scholars and researchers. Accompanying their article is an illustrative, online demonstration of their prototype ORE implementation in the form of a [screencast] referenced in the Appendix of their article; this companion piece provides useful context and a quick overview with examples of their work in progress.

In the article by Stevan Harnad et al. entitled “Incentivizing the Open Access Research Web: Publication-Archiving, Data-Archiving and Scientometrics,” we see a bold proposal for employing a new application of research metrics across multiple sources (defined as “scientometrics” – the collection, measurement and analysis of full-text, metadata, download and citation metrics) to drive author self-archiving, encourage data-archiving, and enhance research quality overall. Like other articles in this issue, this piece also notes that the behavior and current practices of researchers and scholars need to change to match the potential that technology has provided academia. Namely, the authors propose a system to help accomplish this systemic and behavioral change – and a substantial transformation this would be. The system they present is based around three core components: (1) functionality of a network infrastructure, (2) established and agreed upon metrics to provide the necessary incentive(s), and (3) mandates from authoritative organizations that promote publishing into an Open Access system that maps directly into #1 and #2. In an effort to provide concrete examples of how such a system could evolve, the authors point to Citebase and the UK’s “Research Assessment Exercise” (RAE) as tangible case studies that could be modeled/expanded to achieve a vision of complementary components meshing together. Indeed, the tenet here is that, with this system in place, the “Open Access Impact Advantage” can be realized – where self-archiving driving citation impact becomes a virtuous cycle delivering more value than publication into closed/proprietary journals.

Brian Fitzgerald and Kylie Pappalardo’s important review of the legal options for treatment of open access implications is a focused piece homing in on how the law is working in a positive way to enable knowledge sharing in this new world of Open Access – which is made possible by the “collective endeavor through networked cyberinfrastructure.” Addressing the larger issue of “open licensing” models, Fitzgerald and Pappalardo provide multiple case studies/examples of how legal concepts have been applied to complex issues to produce simple tools (like Creative Commons licenses) to enable researchers and academics to protect themselves and their work. The easier the system is, the more likely it is to be used and promulgated. Based on the uptake of the Creative Commons license, it is clear that scholars are leveraging this resource to protect their intellectual property – but in the spirit of sharing it as broadly as possible in the process. This is certainly a welcome trend and one that promises to encourage others to follow.

This issue also includes two special articles in the “Perspectives” section:

In John Wilbanks’ piece, “Cyberinfrastructure for Knowledge Sharing,” we see an intriguing outline of the many painful issues currently faced in achieving true scientific research in our current information environment. The core thesis behind Wilbanks’ article is that “…we aren’t sharing knowledge as efficiently as we could be” – meaning that even though the potential is there, we are not yet realizing everything that cyberinfrastructure offers us. The content is there, the data is there, but the entire system and network is not yet fully “wired” and functioning for optimal efficiency. Indeed, Wilbanks posits, we’re not even close. To address this opportunity space, Science Commons was created to overcome hurdles related to (1) access to literature, (2) access to experimental materials, and (3) data sharing. Wilbanks describes some projects currently underway (e.g., Neurocommons), but also charges the community to address the challenge and make the most of the tools around them to push forward faster.

And finally, Peter Suber’s “Trends Favoring Open Access” article is a personal closing look at where things stand with Open Access. From the opening lines of this strongly opinionated piece, you glean the fact that Suber clearly views the Open Access movement as a joint endeavor (evidenced by the many mentions of “us” across the article), a community undertaking with this as a mid-term report card. In this assessment, he acknowledges the shortcomings and inadequacies to date (namely, low deposit rates), but he also calls out the significant achievements – specifically the solid momentum and high-arching trajectory of Open Access. What is undeniable is the dramatic progress that has been made on nearly all fronts – progress that would have been nearly unimaginable 5-10 years ago. That said, Suber is also quick to point out the areas where for-profit companies are continuing to consolidate their positions. There is a battle raging and a victory for Open Access is not yet assured – although the tide would appear to be turning.

As evidenced by the breadth of trends and topics addressed in this issue and the progress we are seeing in the environment overall, we are obviously at an inflection point in the world of Open Access + Cyberinfrastructure. Perhaps not at the fabled tipping point – yet – but we have clearly summited the crest of one range and find ourselves peering anxiously at the next (last?) set of mountains to conquer. There can be no question: it is no longer IF, but WHEN…and some would argue the coming revolution has already arrived.