Thursday, June 9, 2016

The many shades of citizen science

Everyone is a citizen but not all have the same kind of grounding in the methods of science. Someone with a training in science should find it especially easy to separate pomp from substance. The phrase "citizen science" is a fairly recent one which has been pompously marketed without enough clarity.

In India, the label of "scientist" is a status symbol; indeed, many proceed on the academic path just to earn status. In many of the key professions (for example, medicine and law) authority is gained mainly by guarded membership, initiation rituals, symbolism and hierarchies. At its roots, science differs in being egalitarian, but the profession is at odds with that ideal and its institutions are replete with tribal ritual and power hierarchies. Indian science might tend to carry more than the ordinary share of ritual.

Long before the creation of the profession of science, "Victorian scientists" (who of course never called themselves that) pursued the quest for knowledge (i.e. science) and were for the most part quite good as citizens. In the field of taxonomy, specimens came to be the reliable carriers of information and they became a key aspect of most of zoology and botany. After all, what could you write or talk about if you did not have a name for the subject under study? Specimens became currency. Victorian scientists collaborated in various ways that involved sharing information, sharing or exchanging specimens, debating ideas, and tapping a network of friends and relatives for gathering more "facts". Learned societies and their journals helped the participants meet and share knowledge across time and geographic boundaries. Specimens, the key carriers of unquestionable information, were acquired for a price, and a niche economy was created with wealthy collectors, not-so-wealthy field collectors and various agencies bridging them. That economy also included the publishers of monographs, field guides and catalogues, who grew in power along with organizations such as museums and, later, universities. Along with political changes, there was also a move of power from private wealthy citizens to state-supported organizations. Power brings disparity and the Victorian brand of science had its share of issues, but has there been progress in the way of doing science?

Looking at the natural world can be completely absorbing. The kinds of sights, sounds, textures, smells and maybe tastes can keep one completely occupied. The need to communicate our observations and reactions almost immediately makes one look for existing structure and framework, and that is where organized knowledge, a.k.a. science, comes in. While the pursuit of science might be seen by individuals as being value-neutral and objective, the settings of organized and professional science are decidedly not. There are political and social aspects to science and, at least in India, the tendency is to view these aspects as undesirable and not to be talked about lest one "appears" un-professional.

Silent diplomacy probably adds to the problem. Not engaging in conversation or debate with "outsiders" (a.k.a. mere citizens) probably fuels the growing claims of the "arrogance" of scientists (or even science itself). Once the egalitarian ideal of science is tossed out of the window, you can be sure that "citizen science" moves from useful and harmless territory to a region of conflict and potential danger. Many years ago I saw a bit of this tone in a publication boasting the virtues of Cornell's ebird and commented on it. Ebird was not particularly novel (it was not the first either by idea or implementation; lots of us would have tinkered with such ideas, such as this one - BirdSpot - which aimed to be federated and peer-to-peer, ideally something like torrent) but Cornell obviously is well-funded to run PR campaigns. I think it is extremely easy to set up a basic software system that captures a specific set of data, but fitting it to grander visions and wider geographical scales, so that it meets more than the needs of a few American scientists, takes much more than mere software construction. I commented in 2007 that the wording used in ebird publicity sounded more like "scientists using citizens rather than looking upon citizens as scientists", the latter being in my view the nobler aim to achieve. Over time, ebird has gained global coverage, but it has remained "closed" code-wise and vision-wise. There are no open and public discussions on software construction and the average contributor is not regarded as a stakeholder. It has, on the other hand, upheld traditional political hierarchies and processes that ensure conflict and lack of progress. Indeed it reflects political and cultural systems based on hierarchies.
(There is an adage in software engineering, Conway's law, that the architecture of a piece of software mirrors the organization that built it.) As someone who has watched and appreciated the growth of systems like Wikipedia, it is hard not to see the philosophical differences - almost as stark as right-wing versus left-wing politics.

Do projects like ebird see the politics in "citizen-science"?
Arnstein's ladder is a nice guide to judge the philosophy behind a project.
I write this while noting that criticisms of ebird are slowly becoming more commonplace (after the initial glowing accounts). There are comments on how it is reviewed by self-appointed police (the problem is not just in the appointment - why could the software designers not have allowed anyone to question any record, put in methods to suggest alternative identifications, and gathered measures of confidence based on community queries and opinions?); there is supposedly a class of user who manages something called "filters" (the problem here is not just with the idea of creating user classes but also with the idea of using manually-defined "filters"; to an outsider like me who has some insight into software engineering, poor software construction is symptomatic of poor vision and guiding philosophy, and probably of issues in project governance); there are issues with taxonomic changes (I heard someone complain about a user being asked to verify an identification because of a taxonomic split, and that too a split that allows one to unambiguously relabel older records based on geography - these could have been resolved automatically, but developers tend to avoid fixing problems and obviously prefer to get users to manage them by changing their way of using the system - trust me, I have seen how professional software development works); and there are now dangers to birds themselves. There are also issues and conflicts associated with licensing, intellectual property and so on. Now it is easy to fix all these problems piecemeal but that does not make the system better; fixing the underlying processes and philosophies is the big thing to aim for. So how do you go from a system designed for gathering data to one where you want the stakeholders to be enlightened? Well, a start could be made by first discussing in the open.

I guess many of us who have seen and discussed ebird privately could have just said I told you so, but sadly many of the problems were easily foreseeable. One merely needs to read the history of ornithology to see how conflicts worked out between the center and the periphery (conflicts between museum workers and collectors); the troubles of peer-review and open-ness; the conflicts between the rich and the poor (not just measured by wealth); or perhaps the haves and the have-nots. And then of course there are scientific issues - the conflicts between species concepts not to mention conservation issues - local versus global thinking. Conflicting aims may not be entirely solved but you cannot have an isolated software development team, a bunch of "scientists" and citizens at large expected merely to key in data and be gone. There is perhaps a lot to learn from other open-source projects and I think the lessons in the culture, politics of Wikipedia are especially interesting for citizen science projects like ebird. I am yet to hear of an organization where the head is forced to resign by the long tail that has traditionally been powerless in decision making and allowing for that is where a brighter future lies. Even better would be where the head and tail cannot be told apart.


There is an interesting study of fieldguides and their users in Nature - which essentially shows that everyone is quite equal in making misidentifications - just another reason why ebird developers ought to just remove this whole system creating an uber class involved in rating observations/observers.

Additionally one needs to examine how much of ebird data is actually from locals (perhaps definable as living within walking distance of the area being observed). India has a legacy of tourism-based research (not to mention, governance) - in fact there are entire institutions where students travel far afield to study when even their own campuses remain scientific blanks.

23 December 2016 - For a refreshingly honest and deep reflection on analyzing a citizen science project see -  Caroline Gottschalk Druschke & Carrie E. Seltzer (2012) Failures of Engagement: Lessons Learned from a Citizen Science Pilot Study, Applied Environmental Education & Communication, 11:178-188.
20 January 2017 - An excellent and very balanced review (unlike my opinions) can be found here -  Kimura, Aya H.; Abby Kinchy (2016) Citizen Science: Probing the Virtues and Contexts of Participatory Research Engaging Science, Technology, and Society 2:331-361.

Sunday, June 5, 2016

Ordering chicken

A few weeks ago, I was asked a few questions by a couple of friends relating to why the bird-groups in bird-books are ordered the way they are. The groupings themselves were not so much in question, it was only the sequence. Why do the larger birds come before the smaller birds? It then led to further questions on why the Galliformes (for example chicken or junglefowl) are considered representatives of an older branch of birds (simply sometimes stated as "primitive", and termed "basal" by cladists) compared to say crows. Part of the question was also the understanding that the sequence to a large extent has been around since the first bird-guides for the Indian region. It was hard to make a clean, coherent, non-anachronistic reconstruction particularly since the sequence has to a large extent been followed long before molecular biology took root. In trying to clarify this, at least for myself, I have been forced to look at some historic literature that few read in modern times. [Needless to say my research also led me to improve some Wikipedia biographies.]

Max Fürbringer (1846-1920)
We can skip the pre-Darwinian state largely because the birds were then put in groups whose order was largely decided by tradition and never questioned (at least not within birds which were themselves placed in the scala naturae / "ladder") - and in this we have already seen ideas such as Quinarianism that were followed by Jerdon in India. But why order birds? It appears that everyone wants some order when listing out all the birds of the world and like dictionaries they initially followed some kind of convention that did not need to be questioned. These lists beginning with that of Linnaeus include those of G.R. Gray and R.B.Sharpe. The sequence that Gray and Sharpe followed was based on one established by N.A.Vigors - again a quinarian. Whether the desire to get away from this sequence was related to the unpopularity of quinarianism, I do not know but the sequence followed in  modern bird books has its roots in a system that was established by two ornithologists who are sadly somewhat lesser known perhaps because their writings were in German. (Interestingly Jerdon's contemporary, Edward Blyth, taught himself German and had little tolerance for Jerdon's scheme that followed Swainson). The two Germans who matter for our analysis are Max Fürbringer (1846-1920) and Hans Gadow (1855–1928) [and they had a counterpart in botany Adolf Engler (1844-1930)]. Gadow by virtue of moving to Britain and writing in English is somewhat better known but his work draws greatly on a lot of hard work and thinking on the part of Fürbringer. 

Pierre Belon's comparative anatomy (1555)
After Darwin, the idea of genealogical trees was well adopted and it was also quite clear that evolutionary processes decidedly lacked order: the ragged bush representing the birds of the world had some bushy branches while others were skeletal, and many were difficult to place. Now flattening out this bush (or at least the leaves on a 2-dimensional representation of the bush) and reducing it to a linear list can be done in many ways (computer literates will recognize only two - a breadth-first and a depth-first approach!). There were numerous ways in which the tree itself was being re-arranged (phylogenetics), starting with methods that relied on intelligent guesswork on the basis of morphological and anatomical characters and moving to methods that reduced guesswork and attempted to reconstruct evolutionary history on the basis of DNA sequences. Ernst Mayr and Walter Bock referred to the "standard sequence" as one based on Gadow-Wetmore-Peters. Mayr and Bock also went to the extent of suggesting that the sequence be maintained independent of matters of phylogeny (then already showing signs of fluidity) so as to make communication easier. Modern bird-guidebook authors and publishers have obviously given that suggestion a pass. Mayr and Greenway in 1956 set three principles for the taxonomic sequence to be followed - (A) To follow as closely as possible the traditional arrangements, except where subsequent work has shown conclusively that a change is advisable (B) To place families near each other which are presumably closely related (C) To place the more primitive families near the beginning and the more advanced families near the end.
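The two ways of flattening a tree into a list mentioned above can be sketched in Python. This is only an illustration: the tiny tree, its node names and both function names are made up for the example and do not come from any published bird sequence.

```python
from collections import deque

# Toy tree (illustrative only): each internal node maps to its children.
# Anything that is not a key in the dict is treated as a leaf.
TREE = {
    "birds": ["ostrich", "neognaths"],
    "neognaths": ["landfowl_waterfowl", "crow"],
    "landfowl_waterfowl": ["junglefowl", "duck"],
}


def depth_first_leaves(tree, root):
    """List the leaves by fully exploring each branch before the next one."""
    children = tree.get(root, [])
    if not children:
        return [root]
    leaves = []
    for child in children:
        leaves.extend(depth_first_leaves(tree, child))
    return leaves


def breadth_first_leaves(tree, root):
    """List the leaves level by level, nearest to the root first."""
    leaves, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        children = tree.get(node, [])
        if children:
            queue.extend(children)
        else:
            leaves.append(node)
    return leaves


dfs_order = depth_first_leaves(TREE, "birds")
bfs_order = breadth_first_leaves(TREE, "birds")
print(dfs_order)  # ['ostrich', 'junglefowl', 'duck', 'crow']
print(bfs_order)  # ['ostrich', 'crow', 'junglefowl', 'duck']
```

The depth-first walk keeps the members of each branch next to each other in the list, which is what principle (B) asks for; the breadth-first walk instead orders leaves by how far they sit from the root and can separate close relatives.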

Back to Max Fürbringer, who was a student of Carl Gegenbaur and a great comparative anatomist. Comparative anatomy at this point had evolved from its early origins as an area of amateur investigation in medical studies. It was not just about looking at gross skeletal similarities but looked at minutiae such as the twisting of the tendons of the foot and the bones of the skull. Fürbringer made use of 51 characters, mostly of internal anatomy but also including some others, such as the state in which the young are born. He had worked earlier on reptiles and their musculature and was an expert on fossils and osteology as well. He gave special importance to the muscles and tendons of the shoulder. It is quite mind-boggling to think of the time and effort it would take to dissect and examine the shoulders of so many kinds of birds, let alone obtain the specimens needed for it. Given that it has to be done over a significant length of time, it involves meticulous note making and sketching. Fürbringer identified the key characters for each of the bird groups and then compared every pair of bird groups, noting the number of common characters and the number of differing characters. He used this pair of numbers (matches and mismatches) to decide a measure of distance between the groups (what we would now call phenetics - but all this was done before Hennig and the formal birth of cladistics). Gadow would, four years later in 1892, comment that Fürbringer was being a bit too precise (read "German"!) in doing this pair-wise distance computation and that this was unneeded overkill. Gadow also made some alterations: he emphasised that not all characters were equal and that giving equal weight to all characters was inappropriate. So he decided, based on his expertise, that some of the relationships that Fürbringer saw were spurious. It is worth reading his original text:

The anatomical portion has been written with the view of abstracting therefrom a classification. In the meantime (after Huxley, Garrod, Forbes, Sclater, and Reichenow's systems) have appeared several other classifications: one each by Prof. Newton, Dr. Elliott Coues, Dr. Stejneger, Prof. Fuerbringer, Dr. R. B. Sharpe, and two or three by Mr. Seebohm. Some of these systems or classifications give no reasoning, and seem to be based upon either ornithological matters or upon inclination, in other words, upon personal convictions. Fuerbringer's volumes of ponderous size have ushered in a new epoch of scientific ornithology. No praise can be high enough for this work, and no blame can be greater than that it is too long and far too cautiously expressed. For instance, the introduction of "intermediate" groups (be they suborders or gentes) cannot be accepted in a system which, if it is to be a working one, must appear in a fixed form. In several important points I do not agree with my friend; moreover, I was naturally anxious to see what my own resources would enable me to find out. This is my apology for the new classification which I propose in the following pages.

The author of a new classification ought to state the reasons which have led him to the separation and grouping together of the birds known to him. This means not simply to enumerate the characters which he has employed, but also to say why and how he has used them. Of course there are characters and characters. Some are probably of little value, and others are equivalent to half a dozen of them. Some are sure to break down unexpectedly somewhere, others run through many families and even orders;  but the former characters are not necessarily bad and the latter are not necessarily good. The objection has frequently been made that we have no criterion to determine the value of characters in any given group, and that therefore any classification based upon any number of characters however large (but always arbitrary, since composed of non-equivalent units) must necessarily be artificial and therefore be probably a failure. This is quite true if we take all these characters, treat them as all alike, and by a simple process of plus or minus, i. e. present or absent, large or small, 1, 2, 3, 4, &c, produce a "Key," but certainly not a natural classification.

To avoid this evil, we have to sift or weigh  the same characters every time anew and in different ways, whenever we inquire into the degree of affinity between two or more species, genera, families, or larger groups of creatures.

Of my 40 characters about half occur also in Fuerbringer's table, which contains 51 characters. A number of skeletal characters I have adopted from Mr. Lydekker's 'Catalogue of Fossil Birds' after having convinced myself, from a study of that excellent book, of their taxonomic value. Certain others referring to the formation of the rhamphotheca, the structure and distribution of the down in the young and in the adult, the syringeal muscles, the intestinal convolutions, and the nares, have not hitherto been employed in the Class of Birds.
Of course this merely mathematical principle is scientifically faulty, because the characters are decidedly not all equivalent. It may happen that a great numerical agreement between two families rests upon unimportant characters only, and a small number of coincidences may be due to fundamentally valuable structures, and in either case the  true affinities would  be obscured.

Of the 26 positive points not less than 19 are common to Falconidae, Psittaci, and Coccyges. In the remaining 7 points Psittaci and Falconidae agree together against Coccyges, namely nestlings, downs of young and adult, fifth cubital, temporal fossa, fleshy tongue, convolutions of intestines. Most of these characters seem important, especially the woolly nestlings, considering that Psittaci breed in holes, and agree in the convolutions in spite of the totally different food.
On the other hand, the sifting of the 14 negative characters shows that in 13 of them the Parrots agree with Cuculidae or with Musophagidae, or with both, and differ along with the Coccyges from the Falconidae. The syrinx is an absolute specialization. Fuerbringer remarks that powder-downs, ceroma, and beak speak for Falconidae against Coccyges. Again, Psittaci and Falconidae differ greatly in the formation of the furcula, in nearly the whole of the muscular system, and in the bones of the wings and legs.
Conclusion.—The Psittaci are much more nearly allied to the Coccyges than to the Falconidae, and of the Coccyges the Musophagidae are nearer than the  Cuculidae because of the vegetable food, ventral pterylosis, presence of aftershaft, tufted oil-gland, absence of vomer, truncated mandible and absence of caeca.

Gadow's weighing and sifting probably went wrong there as a 2011 study re-established the closeness between the parrots and falcons. (Fürbringer had carefully compared them but he too had them branching apart widely).
Suh A, Paus M, Kiefmann M, et al. Mesozoic retroposons reveal parrots as the closest living relatives of passerine birds. Nature Communications. 2011;2:443. doi:10.1038/ncomms1448.
It is somewhat sad that Fürbringer is still hardly known in ornithological circles. Mayr and Bock call the bird-sequence used for so many years the Gadow-Wetmore-Peters sequence (this despite Mayr being a historian of biology!). I saw with delight, however, that Tim Birkhead in his Ten Thousand Birds (2014) puts 1888 on the ornithological timeline to mark this landmark work.

Fürbringer's work is also remarkable because he finally produced a graphical summary of his entire work: an evolutionary tree and, wait, it was a three-dimensional tree! He tried to represent it with side views from two opposite points and horizontal cross-sections at three levels. The cross-sections indicate phenetic distances between the groups. He seems to have hit upon some kind of manual equivalent of what we might produce today using canonical correspondence analysis. (It would be amazing if someone-who-knows-German could recreate his three-dimensional rendition and compare his own distance matrix with what a CCA algorithm would produce - Heidelberg University would do well to make a three-dimensional tree model as a tribute.)
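The pairwise counting of matches and mismatches described earlier, and Gadow's objection that characters should not all weigh the same, can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the three groups "A", "B" and "C", their character states, the weights, and the choice of "weighted mismatches divided by weighted total" as the distance are all hypothetical, since the text above does not say how Fürbringer combined his two numbers.

```python
from itertools import combinations

# Hypothetical character states for three made-up groups (not real data).
GROUPS = {
    "A": {"young": "nidifugous", "aftershaft": "present",
          "oil_gland": "tufted", "caeca": "functional"},
    "B": {"young": "nidicolous", "aftershaft": "present",
          "oil_gland": "tufted", "caeca": "functional"},
    "C": {"young": "nidicolous", "aftershaft": "absent",
          "oil_gland": "nude", "caeca": "functional"},
}


def match_mismatch(a, b):
    """Count the shared characters on which two groups agree and disagree."""
    shared = a.keys() & b.keys()
    matches = sum(1 for k in shared if a[k] == b[k])
    return matches, len(shared) - matches


def distance(a, b, weights=None):
    """Weighted share of differing characters; every weight defaults to 1."""
    weights = weights or {}
    shared = a.keys() & b.keys()
    total = sum(weights.get(k, 1.0) for k in shared)
    differing = sum(weights.get(k, 1.0) for k in shared if a[k] != b[k])
    return differing / total


# Equal weights for all characters: a plain tally of mismatches.
equal = {pair: distance(GROUPS[pair[0]], GROUPS[pair[1]])
         for pair in combinations(sorted(GROUPS), 2)}

# An assumed weighting in which one character counts three times as much.
WEIGHTS = {"young": 3.0}
weighted = {pair: distance(GROUPS[pair[0]], GROUPS[pair[1]], WEIGHTS)
            for pair in combinations(sorted(GROUPS), 2)}

print(equal)
print(weighted)
```

With equal weights, group B comes out nearer to A than to C; once the condition of the young is given extra weight, B comes out nearer to C. The same table of characters can therefore support different groupings depending on how the characters are weighed, which is the crux of the disagreement between the two approaches.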


Front views of the avian tree.
Rear view of tree

Okay, so we now hopefully have a historical view of how the bird relationships were established. We still have a part of the original question hanging: why are chicken considered "primitive", or "basal" to use the more accurate phylogenetic term? The answer again lies in Fürbringer's scientific past - he had worked extensively on reptilian anatomy and he saw more of the older traits in parts of his bird-tree. Remember also that he tried to place extinct birds into the tree. Today, the way a tree is rooted or oriented is by comparing with an outgroup - a specimen that you know from prior knowledge to be distant enough to have a common shared ancestor with all the others that are in focus.

The specific characters that Gadow listed for the Galliformes (in which he also included the hoatzin) while he placed them as the 14th group (after the ratites, herons, seabirds and falcons but before the cranes) are :
  • Galliformes- Phytophagous. Nares impervious. Furcula with hypocleidium. Plagiocoelous type V. Caeca large. Crop globular. 10 primaries. 
    • Galli -  16 or more cervical vertebrae. Holorhinal. Coracoids touching each other. Flexors of type I. Hallux large. Neck without lateral apteria.
      • Gallidae - 16 cervical vertebrae. Nidifugous. Spina communis sterni. Sternum with long posterior lateral processes and with oblique processes. Hypotarsus complex.
      • Opisthocomidae - 18 or 19 cervical vertebrae. Nidicolous. Spina externa only present. Sternum with small notches or fenestra only; no oblique process. Oil-gland tufted.
The Hoatzin has since been moved elsewhere, but interestingly the claws on the hand, which it uses for clambering up vegetation, are not even among the characters used.

That leaves one other question, which is whether size matters in this sequence. It appears that the Galloanserae, which appear early in the sequence, are in general somewhat large-sized, and the ratites and flightless birds also tend to be large. At the other end of the spectrum the passerines tend to be small, but it appears that there is no strong evolutionary trend in size.

Note: Thanks to Emmanuel Theophilus and Ashish Kothari for the original questions and discussions.
Postscript: Note that there were many other comparative anatomists in the period and many pieces of bird and reptile evolution had been figured out by several others, including Alfred Henry Garrod (1873-74, on muscles, part 1 and part 2) and W.H. Flower.
I have also found this very nice interactive site on comparative anatomy of birds that uses chicken as a model. 
Note that I had mistakenly attributed the parrot-falcon affinity to Fürbringer, turns out that he did not think much about it.
9 June 2016 - I have also found an interesting review by R.W. Shufeldt (that infamous racist!) which also summarizes the work of Professor William Kitchen Parker. Parker (1862) is quoted "I will first show, in two parallel columns, how both the Fowls and the Rails run insensibly through certain leading genera into the lowest (reptilian) types of diving-birds" 1862, William Kitchen Parker "On the Osteology of Gallinaceous Birds and Tinamous" in Shufeldt, R.W. (1904) An Arrangement of the Families and the Higher Groups of Birds. The American Naturalist 38(455/456):833-857.
Appendix - a list of the characters used by Gadow.

A.   Development.
Condition of young when hatched: whether nidifugous or nidicolous; whether naked or downy, or whether passing through a downy stage.
B.  Integument.
Structure and distribution of the first downs, and where distributed.
Structure and distribution of the downs in the adult: whether absent, or present on pteryls or on apteria or on both.
Lateral cervical pterylosis : whether solid or with apteria.
Dorso-spinal pterylosis : whether solid or with apterium, and whether forked or not.
Ventral pterylosis: extent of the median apterium.
Aftershaft:  whether present, rudimentary, or absent.
Number of primary remiges.
Cubital or secondary remiges: whether quinto-or aquinto-cubital.
Oil-gland: present or absent, nude or tufted.
Rhamphotheca: whether simple or compound, i. e. consisting of more than two pieces on the upper bill.
C. Skeleton.
Palate: Schizo-desmognathous.   Nares, whether pervious or impervious, i. e. with or without a complete solid naso-ethmoidal septum.
Basipterygoid processes: whether present, rudimentary, or absent: and their position.
Temporal fossa, whether deep or shallow.
Mandible: os angulare, whether truncated or produced ; long and straight or recurved.
Number of cervical vertebrae.
Haemapophyses of cervical and of thoracic vertebrae: occurrence and shape.
Spina externa and spina interna sterni: occurrence, size, and shape.
Posterior margin of the sternum, shape of.
Position of the basal ends  of the  coracoids: whether separate, touching, or overlapping.
Procoracoid process: its size and the mode of its combination with acrocoracoid.
Furcula: shape; presence or absence of hypocleidium and of interclavicular process.
Groove on the humerus for the humero-coracoidal ligament: its occurrence and depth.
Humerus, with or without ectepicondylar process.
Tibia: with bony or only with ligamentous bridge, near its distal tibio-tarsal end, for the long extensor tendons of the toes : occurrence and position of an intercondylar tubercle, in vicinity of the bridge.
Hypotarsus : formation with reference to the tendons of the long toe-muscles:—(1) simple, if having only one broad groove; (2) complex, if grooved and perforated ; (3) deeply grooved and to what extent, although not perforated.
Toes :   number and position, and connexions
D.  Muscles.
Garrod's symbols of thigh-muscles A B X Y,—used, however, in the negative sense.
Formation of the tendons of the m. flexor perforans digitorum : the number of modifications of which is 8 (I.-VIII.) according to the numbering in Bronn's Vogel, p. 195, and Fuerbringer, p. 1587.
E.   Syrinx.
Tracheal, broncho-tracheal, or bronchial.
Number and mode of insertion of syringeal muscles.
F.   Carotids.
If both right and left present, typical: or whether only left present, and the range of the modifications.
G.   Digestive Organs
Convolutions of the intestinal canal. Eight types, numbered I.-VIII., according to Bronn's Vogel, p. 708, and P.Z.S. 1889, pp. 303-
Caeca: whether functional or not.
Tongue: its shape.
Food.—Two principal divisions, i. e. Phytophagous or Zoophagous, with occasional subdivisions such as Herbivorous, Frugivorous, Piscivorous, Insectivorous, etc.

List of Characters employed occasionally.
Shape of bill.
Pattern of colour. Number of rectrices ; and mode of overlapping of wing-coverts, according to Goodchild (P.Z.S. 1886, pp. 184-203).
Vomer.    Pneumatic foramen of humerus.
Supraorbital glands.
Certain wing-muscles according to Fuerbringer.
Mode of life: Aquatic, Terrestrial, Aerial, Diurnal, Nocturnal, Rapacious, etc.
Mode of nesting: breeding in holes.
Structure of eggs.
Geographical distribution.

Postscript: I have subsequently come to learn of Stigler's Law of Eponymy.