Wednesday, February 29, 2012

antbase.org shut down: Lessons to be learned



Antbase.org was shut down on February 1, 2012. The story leading to it will be published once the time comes. At the moment, it is important to ensure that this source of digital literature for the ant taxonomic community goes live again as soon as possible. The technical issues are being dealt with in my antbase.org blog.
Below, I am beginning to write up the lessons being learned during this process. The points will be explained in due course.

There are lessons to be learned from the shutdown of antbase.

You are not important.

Policies can change.

There is no institutional commitment to maintain digital work, as there is for libraries and physical collections.

Have multiple copies.

Keep as much as possible under your own control, even though it comes at an (insurmountable?) cost.

React at the slightest indication of change.

Digital collections are not for eternity (maybe with the exception of images).

Our institutional policies shift, and are not necessarily in accordance with the scientists' needs.

Do not even think about the ramifications of building a repository.

As a scientist, stay within institutions.

Read up on how to manage and operate within voluntary organizations.

Tuesday, February 28, 2012

e-publications and ICZN: this time Psyche

Psyche is another fantastic example of two issues relating to e-publication and the Code: The delay in handling the issue of e-publishing, and the deposition of hard copies.

Psyche, the journal of the Cambridge Entomological Club based at Harvard, has been converted into an open access journal run by the publisher Hindawi. Recently, they published a special edition, “Advances in Neotropical Myrmecology”, which includes descriptions of new species. Since this is normally an e-only publication, they followed the Code and mentioned at the very end, in the acknowledgements, that they had deposited hard copies (see e.g. the Tatuidris paper).

“In accordance with Section 8.6 of the ICZN’s International Code of Zoological Nomenclature, printed copies of the edition of Psyche containing this article are deposited at the following six publicly accessible libraries: Green Library (Stanford University), Bayerische Staatsbibliothek, Library—ECORC (Agriculture & Agri-Food Canada), Library—Bibliotheek (Royal Belgium Institute of Natural Sciences), Koebenhavns Universitetsbibliotek, University of Hawaii Library”

This is remarkable in several ways.
1. Psyche is published from within the MCZ and Harvard. No copy has been deposited at Harvard.
2. None of the libraries has anything at all to do with the myrmecological world.
3. The citation is somewhat unlike that of something printed: "Psyche, Volume 2012, Article ID 926089, 6 pages, doi:10.1155/2012/926089", but one could use it as "Psyche 2012 (926089): 1-6".
4. The indication that hard copies have been deposited is hidden in a very unusual place, the acknowledgments at the end of the article.

The remarkable point here is that this is not an obscure renegade journal, but a very successful OA publisher and a very old journal from within the heart of taxonomy. Or have the centers of gravity shifted?

Monday, March 14, 2011

Taxonomy in the News: Taxonomy, the naming crisis

It is always a good thing to be covered by the news, so no complaint about today's article about taxonomy in the Independent.

Besides that, it is as usual depressing to read an obituary of a science that seems to make the news only with morbid reports, such as the declining number of taxonomists, the increasing underfunding, and the lack of universities training taxonomists.

There is so much happening in and around taxonomy, but the speakers for taxonomy seem to be conditioned to mourn and paint everything black, complaining about the technophile funders. A little more creativity and optimism in the communication of our science would not do any damage.

Sunday, March 06, 2011

Global Names Index (GNI) or don't we learn anything?!


The other day I was looking up Panacedechis papuanus trevorhawkeswoodi on Google to see what is known about this species and whether I could find some images. What I found was the link to the Global Names Index, I guess because there is not much online about this particular species.
What frustrated me immediately is that there are just name strings. Names attached to nothing. Not even an author. And then there are name strings with an author and year, but they are not linked, no reference is given, and when one clicks through to the source, it is just a dead end that does not actually display the citation.
I think this is incredible for a new tool that wants to deal with all names and is supported by the Global Biodiversity Information Facility (GBIF). Already we often don't know what is behind a particular name in a particular publication. But now, we do not even know which usage of the name is meant.
I always hoped that institutions like GBIF especially, one of the main players in the field of biodiversity informatics, would push for names to be linked to a publication, which in turn provides links to the materials examined, so that one can understand the species concept behind this particular usage.

But no, they seem to be even more ignorant of what the Internet provides: Linkage.

I hope I am wrong.

Saturday, March 05, 2011

Makhan, Hawkeswood and Calodema: What a strange set-up

The journal Calodema has become the red herring in taxonomic publishing because of the very low standards of its publications. The editor of Calodema, Hawkeswood, seems to be a very competitive fellow, as is seen in this exchange from one of his papers, commenting on two letters that Chadwick mailed him:
I do not need to say much more and I will now continue with publishing papers overseas in entomological journals, without worrying any further about C.E. Chadwick and his cronies. Into the dustbin of history he and his research go!
The current debate on the Taxacom listserver refers to a very recent description of a new spider family, the Hawkeswoodidae Makhan & Ezzatpanah, published in "A new spider family, Hawkeswoodidae fam. nov. and Amrishoonops amrishi gen. et sp. nov. (Araneae) from Suriname"; however, when opening the link, a different paper appears: "Aschnaoonops aschnae gen. et sp. nov. from Suriname (Araneae: Oonopidae)".
But then, as Thorpe points out:
Hawkeswoodidae was proposed with Amrishoonops as type genus! As it is not formed from the stem of an available generic name, it is not available [sensu ICZN 11.7.1.1. be a noun in the nominative plural formed from the stem of an available generic name] ...

When you actually read through the description, it is extremely short and makes no mention of why this species and genus are different from any existing one.
Description (male): Total length 1.8 mm. Palp with a C-shaped projection. Underside of projection strongly sclerotised, upper side soft, open and seed-like inside. Palp with large brown setae on dorsal side. Carapace brown, with brown setae, widest at posterior side. Abdomen on dorsal side light brown with brown setae, ventral side brown, with brown setae. Spinnerets light brown, with white setae. Legs light brown, with white hairs and with large thick spines.

And here is the generic description:

Type species: Aschnaoonops aschnae Makhan & Ezzatpanah sp. nov.
Description: Small brown species. Carapace round. Palp with a C-shaped projection. Underside of the projection strongly sclerotised and upper side soft, open and seed-like inside. Legs with large thick spines and hairs.


Maybe such a simple description is like Einstein's E=mc², and we just don't get it. That at least seems to be the message on the main page of Calodema:
"All truth passes through three stages: first, it is ridiculed; second, it is violently opposed; third, it is accepted as self-evident!"
- Arthur Schopenhauer

"Great spirits have always encountered violent opposition from mediocre minds."
- Albert Einstein


P.S. Looking at some additional publications, it seems that Calodema is in fact the journal for highly combative authors, like Ghahari in his checklist of Iranian braconid wasps:
The results of this research indicate that the braconid fauna of north-western Iran is diverse and comprises some very interesting species. The mentioned region is very vast and includes diverse flora and fauna, and also has boundaries with Turkey, Armenia and Azerbaijan. Therefore, this small research paper, which is restricted to some areas, is not an extensive work, and the conducting of other surveys is necessary for determining many other species in this region. Since Iran is a large country with various geographical regions and climates, faunistic surveys in different regions of Iran is necessary for determining the extent of Iranian Braconidae step by step. A checklist of Iranian Braconidae was published by allahzadeh & Saghaei (2009) without perfect attention to all the resources available on Iranian Braconidae, e.g. Ghahari et al. (2009a, b, c, d) and many others. A checklist is a type of informational aid used to reduce failure by compensating for potential limits of human memory and attention.
Therefore, it is expected that a checklist would contain all the data on the subject and a checklist with deficiencies is not usable and helpful for researchers. This is main reason that all the systematic checklists must be prepared by the authorized specialists or at least edited and/or refereed by them carefully.

Caught in your History, mired in Garbage

Eli Pariser made a plea at his Junk Food Algorithms and the World They Feed Us TED lecture to Google, Facebook and other social networks not to cage the users (all of us) into their own history, and essentially make them unaware of different opinions.

This is similar to an analysis of Ghaddafi and his peculiar style of dressing and acting: Nobody tells him anymore that this or that might be questioned.

Whilst telling him might end up close to capital punishment, Google et al. seem rather stuck in a wrong philosophical approach: that somebody can be defined by a few mouse clicks.

I would argue even further. Google not only puts blinders on my eyes; it is mainly garbage that I am being fed. Most of what I really want does not show up, because the algorithms operate without context.

For example, if I want to know something about "Formica", a group (genus) of ants living in the northern temperate region, I get Formica the material and, beyond that, a huge array of pages that somehow happen to have Formica in them ("About 6,770,000 results").

Does this make sense? Google is the victim of its own success, and I think a very stubborn company with the same symptoms as all successful companies: becoming stupid by insisting on its own past, on the search algorithms, web crawlers and server capacity that were at the beginning of its success.

So, what is missing here is context in the search.

And this might not be just Google's mistake. As long as we stick to context-insensitive HTML, we do not deliver Google the "food for search" they need.

Alas, we need to move into the semantic tagging of content, and that is where I believe the success of our approach lies: adding domain-specific elements to publishing XML such as the NLM Journal Archiving and Interchange Tag Suite, which then becomes TaxPub. TaxPub is now used by Pensoft, with all the advantages of disseminating its content as widely as possible and having it harvested. One such small harvester is Plazi, which allows searching within the context of treatments (the descriptions of species, such as species of Formica) or other elements that have been tagged, resulting for example in a list of species of a particular country.
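To make the difference between a context-free search and a search within tagged content concrete, here is a minimal sketch in Python. The element names (`treatment`, `taxon-name`, `genus`, `species`, `country`) and the sample records are simplified assumptions made up for illustration; they are not the actual TaxPub schema, nor Plazi's implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: element names and records are
# illustrative only and do not reproduce the real TaxPub schema.
DOC = """
<article>
  <p>The kitchen counter was covered in Formica laminate.</p>
  <treatment>
    <taxon-name><genus>Formica</genus> <species>rufa</species></taxon-name>
    <country>Switzerland</country>
  </treatment>
  <treatment>
    <taxon-name><genus>Lasius</genus> <species>niger</species></taxon-name>
    <country>Switzerland</country>
  </treatment>
</article>
""".strip()


def plain_text_hits(xml_text, word):
    """Context-free search: count every occurrence of the word in the text."""
    root = ET.fromstring(xml_text)
    return "".join(root.itertext()).count(word)


def species_of_genus(xml_text, genus):
    """Context-aware search: only look inside tagged taxon names."""
    root = ET.fromstring(xml_text)
    names = []
    for name in root.iter("taxon-name"):
        if name.findtext("genus") == genus:
            names.append(f"{genus} {name.findtext('species')}")
    return names


def species_in_country(xml_text, country):
    """List the taxon names of all treatments tagged with a given country."""
    root = ET.fromstring(xml_text)
    result = []
    for treatment in root.iter("treatment"):
        if treatment.findtext("country") == country:
            name = treatment.find("taxon-name")
            result.append(f"{name.findtext('genus')} {name.findtext('species')}")
    return result


print(plain_text_hits(DOC, "Formica"))
print(species_of_genus(DOC, "Formica"))
print(species_in_country(DOC, "Switzerland"))
```

The plain-text search cannot tell the kitchen counter from the ant, whereas the tagged search only returns names that an author or editor has marked up as taxon names, and the same markup yields the species list for a country almost for free.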

Sunday, February 13, 2011

Kurt Pickett dead

After a long illness, my colleague and friend Kurt died a couple of days ago.

The last time I listened to Kurt was when he presented his lecture at the meeting of the International Society of Hymenopterists in Kőszeg. He wasn't able to fly and delivered it via a remote link, a lecture in which he experimented with both content and style. Even in that moment he was innovative and driven by his endless curiosity - it is sad that he is no longer among us, an inspiring and outspoken person with a formidable mix of intellect, field- and labwork and style. I will miss him.

Kurt's lab

Saturday, February 12, 2011

Watch this: Zoological Nomenclature as it unfolds

There is a new flurry of debate on the ICZN list about the gender of the species epithet. (This is the list of the International Commission on Zoological Nomenclature, the institution that supposedly should take care of the scientific naming of zoological taxa, such as species and genera, and of all that is related to it, in order to provide stability in names.) This is a discussion which breaks loose every now and then, and it wouldn't be so strange if there actually were a system that provides the stability.

But there isn't anything like that, and we have been waiting for a long time now to get there. We have Zoobank, which supposedly gets yet another overhaul to be part of an even bigger system (Global Name Architecture as part of the Global Names Usage Bank). In this corner of the biodiversity informatics world three creeds exist: we need to provide the all-encompassing system, the shell that can harbor all the names; there are people out there (the community or crowd) who do the work and want to chip in; the relevant stuff is in the past.

It seems to me that all this is a misconception of Facebook, Flickr and Google, which provide exactly the environment that these people envision. They are all chasing names that are legacy - but are they really that important?

Taxonomy publishes approximately 17,000 new species and, I guess, ca. 85,000 redescriptions per year; the biomedical field uses a few hundred species names in its domain, which produces millions of papers a year. But with very few exceptions there is no system in place that is set up so that new names are automatically collected and added to those databases.

Zoobank has at best a very tedious interface that allows adding data manually, something that does not happen, as we know from a GBIF-sponsored project to add Zootaxa names. Instead of all the effort being spent on getting this system up to date, an overhaul of Zoobank has now been under way for several years with no end in sight.

There is no debate about this; instead, all the effort is focused on gender agreement and similar matters. And it could not be more abstruse:

If a proper genus and species group name combination was UNIQUE and STABLE, why would we need LSID?
To a computer,
267B9A8B-372C-45EC-BFE5-661AF13CABC8
and
Stomosis arachnophila
are both UNIQUE and Stable codes.

So, why does a computer have to have a long unrecognizeable (to humans) LSID? [Chris Thompson]


(the answer is simple: so the computer knows when we talk about something for which we can not agree the proper name because we can not agree on the ending of the species epithet)
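A small sketch in Python of what this answer means in practice. It pairs the identifier and the name from the quote above, on the assumption (made only for this illustration) that they denote the same taxon; the second spelling, "Stomosis arachnophilus", is a made-up variant standing in for a disputed gender ending, and the function names are hypothetical.

```python
# Identifier and first spelling are taken from the quoted example; the
# second spelling is a made-up variant used only for illustration.
RECORDS = {
    "267B9A8B-372C-45EC-BFE5-661AF13CABC8": {
        "Stomosis arachnophila",
        "Stomosis arachnophilus",  # hypothetical gender-ending variant
    },
}


def resolve(name_string, records=RECORDS):
    """Return the identifier a name string is registered under, if any."""
    for identifier, spellings in records.items():
        if name_string in spellings:
            return identifier
    return None


def same_taxon(a, b, records=RECORDS):
    """Two name strings refer to the same thing if they share an identifier."""
    id_a = resolve(a, records)
    return id_a is not None and id_a == resolve(b, records)


# As plain strings the two spellings are simply different ...
print("Stomosis arachnophila" == "Stomosis arachnophilus")
# ... but through the identifier the computer knows they mean the same thing.
print(same_taxon("Stomosis arachnophila", "Stomosis arachnophilus"))
```

The name string is only stable as long as everybody agrees on its spelling; the opaque identifier stays the same while we keep arguing about the ending.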

The only systematic collector of names, Zoological Record, has struggled all along with, if I am right, 25 employees (zoologists etc.) to decipher the cryptic and well-hidden information in our taxonomic work in order to produce their reference work of zoological literature.
The other large initiative, the Biodiversity Heritage Library, whose important goal is to convert as much legacy print publication as possible into the digital world, struggles as an aside to collect names; because of copyright issues, though, these come mostly from works older than 70 years and are thus irrelevant for the user on the street.

Something different is needed, like PubMed and PubMed Central, where publishers submit their papers to be archived and made discoverable via a form that includes all the relevant information.

We have to create something like a BHL-Modern for our field that does exactly this: Whenever something is published, the document is published in a form that can easily be read into a dedicated database, the content is checked for validity during the submission process (in case nomenclatural acts are included), and then all the information becomes available, including treatments, names, bibliographic records and links to external resources, such as DNA, images, etc.

We now have a system that is close to this. Just the Zoobank part is missing, and it should probably be dealt with differently, by doing this part without Zoobank, either internally at BHL-Modern or at Zoological Record, which has the manpower to operate it.
A prototype of such a system is Pensoft and its suite of journals, such as Zookeys and the Journal of Hymenoptera Research, which produce TaxPub output based on the NLM DTD, with all the taxonomic elements semantically marked up. This output is then read into Plazi, where the treatments are available for use by EOL, GBIF and whoever requests them. Zoobank is included, but only through an awkward interface that is fed manually. Right now this is not complete, but it is open to criticism: the schema can be modified, and the elements to be included in a publication can be defined to fulfill the purposes of ICZN, Zoological Record and others.
The good thing about this development is that there is something alive that is growing; it is a real system that is being used, fed by authors, and paid for, and since it is open access, it is open to all sorts of experiments and, most importantly, to all users without limits.