Wednesday, January 26, 2011

Open Access and Publishing

The report from the SOAP Symposium provided by Derek Haank, CEO of Springer (see my older posting).

The key findings of the project are:

* The number of OA articles published in “full” or “hybrid” OA journals was around 120’000 in 2009, some 8-10% of the estimated yearly global scientific output (see also http://arxiv.org/abs/1010.0506). Journals offering a “hybrid” OA option had a take-up of around 2%.
* OA journals in several disciplines (including Life Sciences, Medicine, and Earth Sciences) are of outstanding quality, and have Impact Factors in the top 1-2% of their disciplines.
* Out of some 40’000 published scholars who answered a large-scale online survey, approximately 90% are convinced that OA journals are or would be beneficial for their field. The main reasons given for this view are: benefit for the scientific community as a whole; financial issues; public good; and benefit to the individual scientist. The vast majority disagrees with the idea that OA journals are either of low quality or undermine the process of peer review.
* A separate survey of scientists who published in OA journals reveals that their drivers for this choice were the free availability of the content to readers and the quality of the journal, as well as the speed of publication and, in some cases, the fact that no fee had to be paid directly by the author.
* The main barriers encountered by 5’000 scientists who would like to publish in OA journals but did not manage to do so are funding (for 39% of them) and the lack of journals of sufficient quality in their field (for 30%).

The latter is the most often heard argument against OA in the domain of taxonomy, where authors assume that publishing involves no costs. US authors are more used to page charges, but when it comes to paying the very modest page charges in taxonomic hybrid open access journals (e.g. Zootaxa), they too tend not to pay them. The share of authors paying is below 25%.

Tuesday, January 25, 2011

"I like Wikipedia - they hate me"

In his lecture "why aren't we there yet?" at the recent ViBRANT Scripting Life meeting, Rod Page talked about his encounter with Wikipedia, where he tried to link, or actually to automate the linking between, NCBI and Wikipedia. Whilst NCBI was all for it, a lively discussion developed, leading to a very personal and very negative exchange (see slides 48-56 in Rod's talk and the actual exchange).

This is actually an interesting point, since we had the same experience when trying to create a Wikipedia entry for each new, and eventually every known, species. The idea is based on the notion that each species deserves an entry on Wikipedia, not just the famous hairy, feathery and flowery ones, and that such a service would give the broad community a starting point. We could extract all the descriptive data from the original publications, and this could then become the starting point for modifying, rewriting and amending those descriptions, which are at the moment the only source that exists. This would also give the community a chance to enhance such descriptions, whilst the original could always be seen on Plazi or similar sources.

But as in Rod's case, this idea was smashed by a few editors who, in my view, had little understanding of the issues and no authority grounded in such understanding, and who instead exerted their power within the Wikisystem.

Here is a visualization of what happened to one of our species, Monomorium dentatum, which can also be followed in the history of this species.

Subspecies again

Charruau et al. state in their recent paper on cheetahs in Molecular Ecology (DOI: 10.1111/j.1365-294X.2010.04986.x, Phylogeography, genetic structure and population divergence time of cheetahs in Africa and Asia: evidence for long-term geographic isolates) that

Mitochondrial DNA monophyly and overall levels of genetic differentiation support the distinctiveness of Northern-East African cheetahs (Acinonyx jubatus soemmeringii). Moreover, combining archaeozoological and contemporary samples, we show that Asiatic cheetahs (Acinonyx jubatus venaticus) are unambiguously separated from African subspecies.


With this evidence, why aren't they treating those taxa as species, since they are not only geographically but obviously also genetically separated? It seems that there is also evidence that the Egyptian population is about 30% smaller than the sub-Saharan population. They did not do any morphological work.

Friday, January 21, 2011

ViBRANT to provide the tools to measure decline of species within IPBES?

What follows is my reading of IPBES documents and needs to be followed up with a direct exchange. So, keep this in mind.

IPBES will provide global assessments of the fate of biodiversity and ecosystem services. Those will be based on the analysis of reports, which means there will be no IPBES-internal data collecting activities, but a lot of data mining and analysis of existing reports. Changes in species will only be included if there is information on them.

This is the chance for http://scratchpads.eu/. Why not provide a standard report form that suits IPBES, and link this to activities to update the reports according to the review cycle in IPBES? Let's say the cycle is four years; it would then be interesting to motivate scratchpad users to make an effort to update their pages by that point in time.

It would also be an interesting experiment to see how much information needs to be available to make statements on distribution for the study of the global dynamics of biodiversity.

The impact might not be with those species that are widely known and documented, nor with those that are very rare. But there ought to be a middle ground that is not yet used in such exercises and that could also be re-collected.

http://scratchpads.eu/ could play a role by opening up such assessments for groups that are not yet covered with such data. For example, if one were to do ants, all the most recent publications could be taken, the treatments and materials examined extracted and made accessible, and in a subsequent step, more material could be added through the scratchpads.

I could imagine, for example, that this could complement what IUCN's SSC does by assessing the global diversity and threat of mammals and amphibians.

Friday, January 14, 2011

Open Access and publishing: an interview with Derek Haank, CEO Springer Science+Business Media

Richard Poynder interviewed Derek Haank and provides an interesting overview of publishing with many provocative statements.

Almost cynical comments, with a core that points out the weakness of the academic information infrastructure and ingenuity.

Some of the comments, such as those about the growth of OA, contradict trends reported by Peter Hendriks, Springer's president STM Publishing & Marketing, who predicts an OA growth rate of 20% (total article growth 3.5%) and a share of ca. 25% in 2020. Obviously, the CEO doesn't look very much into the future.
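The gap between the two Springer statements can be made concrete with a little compound-growth arithmetic. The sketch below is my own illustration, not a calculation from either source: it assumes that the 20% and 3.5% figures are constant annual rates, and the function name `project_oa_share` and the starting values in the usage lines are hypothetical.

```python
def project_oa_share(start_share: float, oa_growth: float,
                     total_growth: float, years: int) -> float:
    """Project the OA share of all articles after `years` years.

    Assumes constant annual growth rates (an assumption, not stated
    in the interview or the talk): OA articles grow by `oa_growth`
    per year and all articles by `total_growth` per year.
    """
    if not 0.0 <= start_share <= 1.0:
        raise ValueError("start_share must be between 0 and 1")
    ratio = (1.0 + oa_growth) / (1.0 + total_growth)
    # The share cannot exceed 100% of the output.
    return min(1.0, start_share * ratio ** years)


if __name__ == "__main__":
    # Hypothetical starting point: the lower SOAP estimate (8%) for 2009.
    for years in (0, 5, 11):
        share = project_oa_share(0.08, 0.20, 0.035, years)
        print(f"after {years:2d} years: {share:.1%}")
```

The result is very sensitive to the starting share and starting year, so it is only a rough check. What does not depend on those choices is the direction: as long as OA grows faster than total output, the share keeps rising and cannot stay capped in a fixed band.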


Q: We have focused on OA publishing or Gold OA. There is also Green OA, or self-archiving, where researchers continue publishing in subscription journals but then make copies of their papers freely available in an institutional or central repository such as PubMed Central. Some argue that this is a faster and more effective way of providing OA. And most subscription publishers consent to some form of self-archiving. As I understand it, Springer allows authors to self-archive the "author-created" versions of their papers in their institutional repository, but not Springer's PDF.
A: We have always tried to take a balanced view on this, so we are of the opinion that, in principle, author archiving is fine. Were self-archiving ever to become sufficiently professional that it began to mimic our journals, however, it could create a lot of problems. If that were to happen why would people continue to take out a subscription?

But we are such a long way from that situation today that we are very easy going about author archiving. Since we cannot see it destroying the system, we see no reason to make life miserable for our valued authors.


... on access to data

Q: Some researchers argue that OA is not enough; they also want open data. The Cambridge-based chemist Peter Murray-Rust, for instance, wants scholarly publishers not only to make research papers freely available, but all the supplementary data associated with scientific papers too, even if the paper is published in a subscription journal. Moreover, he wants that data to be made available free of copyright restrictions so that others can re-use it. Is that a reasonable request in your view?
A: I have some sympathy for the request. However, I am not convinced that, even if every publisher were to make all such data freely available tomorrow, open data would take off very quickly, not least because only a tiny minority of articles have this supplementary information.

So I am not against the principle that if we publish an article, be it subscription-based or OA—any relevant data attached to it should be made freely available.

Q: And you are happy for the data to be made available on a reuse basis so that people can, say, mash it up?
A: I am. I see very little downside to doing it, because at the end of the day, it would progress scientific research, which is what we are all here for, and from which we will all ultimately benefit. And I am not worried about the commercial aspects because in reality, we are only talking about a tiny subset of the total number of articles we publish. But it is just not a pressing issue today. Occasionally, the topic is raised, and we all say, "Yes, we should definitely do something about that"; and then the issue goes away again.



The future of print

Q: Many assumed that OA publishing would prove less expensive than subscription-based publishing. Is that not the case?
A: OA makes no material difference to pricing because most of the functions remain exactly the same. You could argue that OA allows you to dispense with print costs, but even under the subscription model, there is hardly any print any more.

Likewise, you could argue that you don't have to sell subscriptions with OA publishing. But OA requires selling institutional memberships. So whether the library pays for the system through subscriptions, or the institution pays author charges via membership schemes, it makes not a jot of difference to the overall costs in the end.


The future of OA
Q: How large a niche do you envisage OA being?
A: I expect it to remain between 5% and 10% at a maximum.


The future and reason of increasing costs
Q: Again, I doubt librarians would agree that publishers are showing price restraint. As you will know, the University of California (UC) Libraries recently released a public statement complaining that Nature was seeking to increase its license for 67 journals by 400%. For that reason, UC Libraries said, they were considering boycotting NPG.
A: I would not want to comment on this particular example; there might be a reason for it. You know, what is sometimes forgotten in discussions like this is that we operate in a growth industry: For the last couple of hundred years, we have seen a constant growth in research. More research means more researchers (because it is very labor intensive activity), and around every 2 years, each researcher produces a scientific article. That is the volume problem, and there is nothing we can do about it. So this is an enormous growth industry, and it is just not realistic to assume that there will be no price increase in the next 10 years for our growing database. One thing we have learned, however, is that the days when publishers could just say, "This is the price increase and you have to pay us" are no more.


On the amount of new output
After all, we are seeding our database with 13,000 to 14,000 new pages of information every day.


About the innovativeness and inertia of a publisher
Q: Looking to the future, what new developments can we expect from Springer?
A: Our first priority is to continue as we are. When you talk about all the new things going on, there is a temptation to forget that. But it is my job to think of what more can be done. As we have discussed, there is great pressure on the traditional library market. So we need to look at nontraditional markets.

(...)
But the first priority is to make our products more accessible. That means developing a better search engine, and improving the formatting of our data.


On the future of non library access
Q: Scholarly publishers tend to sell single articles for around $25 a time. That is not an iPad pricing model, and it is not a price I suspect many individuals would be willing to pay.
A: So maybe we will need to realign our prices there.

Tuesday, January 11, 2011

BHL Modern

The reaction to the post on IPBES made me think about why we are all so stuck in the legacy literature. It is clearly a quagmire, and getting from it what we need today, that is access to its content, is very expensive. Working on it requires a lot of tools that need to be customized not only to different languages but also to different styles, fonts, almost anything one can imagine. What we experienced with our little effort to make all the ant literature available for the Malagasy ants is that the conversion of a scanned image of a text into a semantically enhanced TaxonX document is complicated and needs quite some domain expertise. This is its own science, a science very different from publishing semantically enhanced publications from scratch.

What this really means is that, apart from shared best practices and pitfalls, working on prospective publishing is something very different. For that reason, I think we should create a BHL-Modern.

BHL-Modern would not be centered around articles but around treatments, the stuff the taxonomist and most of the users want. The main archive would not be scans of books or publications but their content: treatments that are linked to the original source. It could also have a department of old literature extracted from BHL, either linked to the digital source where that is publicly available, or accessible to BHL-Modern members only for material that is under copyright. Special sections would include ontologies, definitions of measurements, bibliographic records and abbreviations, some of which can be shared with BHL.
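To make the idea of a treatment-centered archive a little more tangible, here is a minimal sketch of what a single record might look like. It is only an illustration: the class, the field names and the helper function are all hypothetical and are not taken from TaxonX or from any existing BHL or Plazi schema, and the sample entries use placeholder text rather than real descriptions or citations.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Treatment:
    """One taxonomic treatment, the unit BHL-Modern would be built around.

    All field names are hypothetical; they are not taken from TaxonX
    or from any existing BHL or Plazi schema.
    """
    taxon_name: str
    text: str
    source_citation: str
    source_url: Optional[str] = None   # link to the original publication, if any
    under_copyright: bool = False
    external_links: List[str] = field(default_factory=list)

    def access_level(self) -> str:
        """Public unless the source is under copyright."""
        return "members-only" if self.under_copyright else "public"


def find_by_taxon(archive: List[Treatment], name: str) -> List[Treatment]:
    """Return every treatment of `name`, ignoring case and outer spaces."""
    wanted = name.strip().lower()
    return [t for t in archive if t.taxon_name.lower() == wanted]


if __name__ == "__main__":
    # Placeholder records, for illustration only.
    archive = [
        Treatment("Monomorium dentatum", "Placeholder description text.",
                  "Placeholder citation"),
        Treatment("Acinonyx jubatus", "Placeholder description text.",
                  "Placeholder citation", under_copyright=True),
    ]
    for t in find_by_taxon(archive, "monomorium dentatum"):
        print(t.taxon_name, t.access_level())
```

The point of the design is that the treatment, not the article or the scanned page, is the thing one queries and links to; the source citation and link travel with it, and the copyright flag decides who may see the text.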

Monday, January 10, 2011

IPBES a challenge to biodiversity informatics and publishing

The recent adoption by the United Nations General Assembly of the Intergovernmental Platform on Biodiversity and Ecosystem Services (IPBES; December 21, 2010) is an interesting challenge to the way we taxonomists publish our data: will it be integrated into the IPBES reporting? It could be, since they will base their reports on the assessment of scientific papers and reports.

The IPBES will achieve this in part by prioritizing, making sense of and bringing consistency to the great variety of reports and assessments conducted by United Nations bodies, research centres, universities and others as they relate to biodiversity and ecosystem services.


I assume, taking an optimistic view, that this ought to be based on the assessment of publications such as taxonomic reports. With the parallel development of using publications as seeds for growing knowledge on particular taxa, and of publishing tools that allow this to happen, this might not just be a dream.

So we need to think of how we can demonstrate this, and how we can draw the attention of the IPBES to this potential.

See also IPCC for Nature: IPBES

Sunday, January 09, 2011

The Future of Taxonomic Publishing (2)

What is it that the user wants?

What is the user looking for?

One clear long-term trend is that smaller pieces of information are being published. Considering just modern digital forms of publishing, there is a roughly chronological progression toward smaller publications: emails, Usenet postings, web pages, blog posts, blog comments, tweets, tags.

Terry Jones

This is followed by smaller size publishing.

A second trend is a reduction in friction. As access to easy-to-use and inexpensive publishing technology increases, it becomes economically feasible to publish smaller and less valuable pieces of content. We have reached the point where anyone with access to the Internet can easily and cheaply publish trivial, tiny pieces of information -- even single words.

Terry Jones

Finally, this is followed by the desire to publish personal comments, about ourselves as much as about the piece in front of us.

A solution to that is to make the content come alive on the Internet, so that people can not just read it but can do things with it, such as adding external links, reusing it and following links.

Another solution is to break away from the traditional publication altogether and serve the snippet the user wants: a single treatment in the context of the whole.

The Future of Taxonomic Publishing

Following are some thoughts about the future of taxonomic publishing.

This is a thought by Hugh McGuire

Defining a book by what you cannot do

What's striking about this state of affairs -- though not surprising, given the conservative nature of the publishing business, and the complete unknowns about business models -- is that we define ebooks by a laundry list of things one cannot do with them:

* You cannot deep link into an ebook -- say to a specific page or paragraph chapter or image or table
* Indeed you cannot really "link" to an ebook, only various access points to instances of that ebook, because there is no canonical "ebook" to link to ... there is no permalink for a chapter, and no Uniform Resource Locator (url) for an ebook itself
* You (usually) cannot copy and paste text, the most obvious thing one might wish to do
* You cannot query across, say, all books about Montreal, written in 1942 -- even if they are from the same publisher

You cannot do any of these things, because we still consider that books -- the information, words, and data inside of them -- live outside of the Internet, even if they are of the e-flavor. You might be able to buy them on the Internet, but the stuff contained within them is not hooked in. Ebooks are an attempt to make it easier for people to buy and read books, without changing this fundamental fact, without letting ebooks become part of the Internet.

Many people don't want books to become part of the Internet, because we just don't know what business would look like if they were.


Unfortunately, the "many people" are not just a few authors and many publishers but, explicitly or implicitly, a majority of taxonomists (authors as much as users) who don't understand what it means to have a piece of a description (treatment) online with links to external resources, which can also be reused and linked to.