Responses to discussion paper: How can we improve agriculture, food and nutrition with open data?

Medha Devare, Data and Knowledge Manager, CGIAR Consortium

Medicine, open data and the smallholder farmer

It is important to recognize that smallholders typically do not interact with data, particularly in developing countries. Whether data is open or not therefore has little ramification for smallholders directly, but will have an indirect impact on them as the ultimate beneficiaries of research and data on agriculture. While small farmers will not find data itself of much use, they do need demand-driven information that is reliable and relevant—which means that it must be useful, applicable, timely and accurate at their local scale.

The trail-blazing medical informatics domain has seen accelerated innovation in health care solutions concurrent with publications and data being made open, fueling a translational medicine revolution. Innovation in agriculture primarily involves successes in the areas of data collection, analysis, interpretation, and technologies--but in large part, a movement towards “translational agriculture” is still lacking. Translational medicine has benefitted patients (the ultimate beneficiaries) via data sets being discoverable, accessible, interoperable, and reusable—that is, open—but this has happened not through patients relating to data, but researchers interacting with a rich landscape of datasets to combine and reinterpret, develop and publish new medical interventions that would not have been possible without access to a largesse of data. These interventions are then converted to easy-to-follow recommendations for the patient by “translators”, the doctors (whether real or virtual) who interact with patients.

Similarly, for our smallholder “patients” to benefit from agronomic research in the face of daunting challenges, researchers first need to develop technologies and predictions that are applicable at a variety of scales (from landscape to field) by easily accessing and reinterpreting multiple coherently described data sets. For innovations to have impact on the ground and in the livelihoods of farmers, these solutions then need to be translated from science-speak (including model outputs and scientific publications) to understandable, actionable language by the NGO, extension, and other “doctors”. These might include virtual means such as web sites and mobile solutions. However, to carry the doctor-patient analogy a little further, for a solution to be useful to the smallholder it must respond to a specific need, as a doctor responds to a patient issue. Given the large proportion of semi-literate or illiterate small farmers in developing countries, a sophistication in algorithms and interactivity is often required that is generally difficult to achieve. An exception to this is a highly pictorial web and smartphone-based Crop Manager tool, developed by CGIAR’s International Rice Research Institute. It provides location-specific rice, maize, and wheat nutrient management advice and can be used by small farmers as well as others (such as extension and NGO personnel) via the web and smartphone, including an Interactive Voice Response (IVR) option for literacy-hampered farmers.

It is perhaps easier and more realistic to conceive of devising automated solutions to provide simpler information to smallholders using data such as market prices, weather forecasts, and input location and stocks. The availability and reliability of these data diminish from the beginning to the end of this list, but they are still likely to be easier to collect, manage, query, and disseminate via appropriate interfaces or IVR than location-specific information on specific technologies.

Elisabeth Fischer, CHBS, Global Project Manager – The Good Growth Plan, Syngenta

Congratulations on this excellent white paper. It will be an important reference for our open data work inside and outside Syngenta. Please find below a summary of our thoughts on themes and points which could be improved or added. Thank you, Derek, for putting it together. Apologies for the late response to your call for comments. I hope you can still consider it.

(1) Strategic Global Policy: The report quite rightly focuses on use cases, and emphasizes the importance of using real problems as the justification for open data initiatives. However, there is a slight risk that this can lead to some silo thinking, especially when projects or initiatives have a particular passion to deliver solutions. One risk with this is that vital and important meta-data might be "re-invented", or new meta-data created by a project may not be shared for future use. This is an important global consideration:

(a) Frequently, data integration can be over 80% of the cost of a data exploitation project. Having good and accessible meta-data means that funds can be used more effectively.

(b) A global policy stipulating that reusable meta-data should be an outcome of publicly and privately funded projects would mean that benefits to growers can be delivered more promptly, and it would create a virtuous circle of quicker delivery and greater benefits over time.

To enable this requires a body or organization that has a mandate to deliver open data sharing in a non-partisan way and to strongly advocate this approach. From our perspective, GODAN should fulfill this role. The paper does not state this as a stand-alone point. We think it would benefit from stating the importance of meta-data and strategic thinking in this area.

(2) Again (on strategic meta-data support): there is no use case which shows the benefit of standardized vocabularies and ontologies in this space. The significant progress that industry and public research have made in the last three years in exploiting public and private data sets would not have been possible without the large corpus of ontologies at www.obofoundry.org. These have been used as a lingua franca for the community, which means that its members have been able to share datasets which use this resource. We are missing that same level of cohesive thinking about vocabularies and ontologies in the agri space. We do have some: for example, the CIARD ring and AGRIS exemplify this, but there is nothing as strong as obofoundry.org or bioportal.org. The nearest is a combination of the AIMS (FAO) meta-data sets.

This is another area where some global and strategic direction is needed. It could help head off an explosion of harmful meta-data "noise" in emergent areas such as the Internet of Things, agricultural sensor systems and earth observation in agriculture. This is a great opportunity for the GODAN project.

(3) Using common terms and relationships. In The Good Growth Plan Progress Data and in some of our research datasets we have machine-readable statements like this:

syn:Maize -- skos:exactMatch -- agrovoc:c_12332

- As a human, to understand this you have to know that agrovoc:c_12332 is the identifier that FAO uses for Maize.

In English this statement is saying that whenever you come across the term "Maize" in the data, what it means is the same as the FAO definition of Maize.

This is a simple, but very important thing to consider.

(a) As an organization this means that we do not have to maintain a whole series of vocabularies and terms. It can save money.

(b) By using a common external and shared vocabulary, we have made our data easily integrable by other people and organizations. That FAO definition of Maize is available for anyone else to use. That means that if an SME creates useful applications, or exploits the data for public benefit, it does not have to carry the burden of resolving identifiers and terms.

Using reliable and recognized machine-readable references in this way should be planned for by all organizations publishing their data and should be considered at the experimental design stage. This will help agility in the sector.
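To make the integration benefit in point (3) concrete, here is a minimal sketch of how an exact-match statement lets two independently published datasets be combined without anyone maintaining a bespoke term list. It uses plain Python tuples instead of an RDF library. The SKOS and AGROVOC URIs are the published ones; the expansion of the "syn:" prefix, the second publisher, its "Corn" term and all record contents are assumptions invented purely for illustration.

```python
# Published URIs for the SKOS exactMatch property and the AGROVOC namespace.
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"
AGROVOC = "http://aims.fao.org/aos/agrovoc/"

# Hypothetical namespaces: the real expansion of "syn:" is not given in the
# text, and the second publisher is invented for this example.
SYN = "http://example.org/syn/"
OTHER = "http://example.org/other/"

# Mapping statements as (subject, predicate, object) triples.
MAPPINGS = [
    (SYN + "Maize", SKOS_EXACT_MATCH, AGROVOC + "c_12332"),
    (OTHER + "Corn", SKOS_EXACT_MATCH, AGROVOC + "c_12332"),
]


def shared_id(local_term, mappings=MAPPINGS):
    """Return the shared identifier a local term is declared equivalent to,
    or the local term itself when no exactMatch statement exists."""
    for subject, predicate, obj in mappings:
        if subject == local_term and predicate == SKOS_EXACT_MATCH:
            return obj
    return local_term


def merge_by_shared_id(*datasets):
    """Group records from several datasets under their shared crop identifier."""
    merged = {}
    for records in datasets:
        for record in records:
            merged.setdefault(shared_id(record["crop"]), []).append(record)
    return merged


# Made-up records; they carry no real-world meaning.
syn_records = [{"crop": SYN + "Maize", "source": "syn"}]
other_records = [{"crop": OTHER + "Corn", "source": "other"}]

merged = merge_by_shared_id(syn_records, other_records)
```

In this sketch both records end up under the single AGROVOC identifier, even though the two publishers used different local terms. A consumer of the data only needs the shared vocabulary, not each publisher's internal term list.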

Ajit Maru, DDNG 

  1.   The most important advocacy point for GODAN, especially for economically developing countries, in Opening Access to agricultural and nutrition data is the huge potential of open data access in generating employment, particularly for educated youth in rural areas, through enabling value added knowledge services for the Agri-food sector.
  2.   There is at present a “Wild West” situation, with very few international and national regulatory mechanisms governing the generation, management, control and exploitation of data collected from a variety of sources. There are significant encroachments on property and privacy rights and on national and individual security, as well as unethical practices and “robber baron” tactics used to gain control and use of open data sets, for example by corporate bodies owned by or related to Google and Monsanto. The discussion paper does not mention or discuss this situation.
  3.   The paper does not discuss that most of the agricultural and nutritional data will be generated by communities and how these communities, who technically and legally should own this data, will be included in management and use of the data and data sets.
  4.   Some of the examples used do not give a balanced perspective on the downside of opening data. For example, weather data, especially rainfall, can also be used to predict food grain production and enable forecasting of prices in international markets. The rise in food prices during the food price crises of 2007-2009 has now been attributed to stock market speculators, especially in the Middle East. Open data on agriculture can be used against the interests of farmers and consumers, for private profit, by such speculators analyzing weather data. Similarly, soil analysis done to “help” farmers can also, when aggregated and mapped geographically, be used to predict fertilizer use very precisely. Because fertilizer use is seasonal and time-bound for farmers, this can enable fertilizer producers and sellers to increase profits through cartelization, hoarding, creating shortages, black marketing and increased pricing. This has been seen in India. Open land holding data, again in India, led to land grabbing. Supermarkets and fast food restaurant chains collect data on food consumption patterns and aggregate them with other public data to create advertisements and pricing structures that maximize their profits at the cost of, and against the interests of, consumers. Several business ventures have recently started which, in the guise of aiding “poor”, small farmers, offer “free” or “low cost” services by providing automated weather stations, soil sensors, laboratory analysis, etc. These ventures aggregate the data and sell the data sets for profit, without any regulatory mechanisms and without sharing profits with the farmers to whom the data should legally belong.
  5.   With new technologies such as sensors and the Internet of Things in machinery, tools and equipment, a large volume of agricultural and nutritional data is already being collected, with very little information given to, or control exercised by, those who “own” the tools or use them. John Deere now considers its tractors and other equipment to be legally “software” and not machines. It does not sell a machine but leases, as Microsoft or Apple does, the use of its software. This, John Deere claims, gives it the right to use data generated as “feedback” from its machinery. There is a growing farmers’ movement asserting their rights to the data generated on their farms. The paper does not discuss this issue or how it can affect the opening of agricultural data.
  6.   The discussion paper advocates opening agricultural and nutritional data without any consideration of how to support the costs in generating, managing and enabling its effective use by countries and communities who do not have the capacities to generate, manage and use it. Opening data without enabling effective, equitable use can be considered a form of piracy. GODAN should consider these issues in formulating its advocacy.
  7.   Advocacy for opening data must also include advocacy for open ICT tools and techniques such as storage, search and analytical algorithms and open technologies for agriculture and improving nutrition.

Nathan Deutsch1 and Iain Davidson-Hunt2

1 Research consultant, Belluno, IT; Theme on Sustainable Livelihoods, CEESP, IUCN

2 Natural Resources Institute, University of Manitoba, CA; Co-lead, Theme on Sustainable Livelihoods, CEESP, IUCN

A vast amount of data has been collected on human use of biodiversity for food and nutrition. Much of this data is publicly available but is poorly accessible. A more systematic approach to analysis of linkages between biodiversity and nutrition is required for better safeguarding of the full and potential role of biodiversity in human health.

An identifiable gap in the scoping of the GODAN discussion paper concerns information on agrobiodiversity, from genes to ecosystems, concerning traditional crop varieties, “wild” plant and animal species that are typically harvested and utilized across cultivated, semi-cultivated and uncultivated landscapes.

Addressing this gap would require linking data across policy domains of biodiversity protection and human nutrition using data sources such as disparate databases, but also knowledge in scientific papers, books, reports and archives. The Cross-Cutting Initiative on Biodiversity for Food and Nutrition provides a precedent for renewed efforts to link data across biodiversity and nutrition as distinct domains of knowledge.

Efforts to link open data across biodiversity and nutrition domains of knowledge would address a gap in terms of the role of biodiversity in risks to nutrition security, particularly regarding micro-nutrients that are essential to maternal and child health. Micro-nutrient content varies tremendously across varieties, indigenous crops and non-cultivated species. There is a clear need to link information on healthy and accessible diets for indigenous peoples and local communities, which are often located in remote areas and significantly rely on biodiversity to meet these needs.

The quantity and quality of information readily available in openly accessible digital formats constitutes an important challenge, alongside the extensive work that will likely be needed in order to find relevant data and make linkages between sources. Data in food composition and nutrition studies and databases often focus on foods, and are not readily linkable to biodiversity data on species or varieties. Furthermore, data on nutrition has not been effectively linked to information on threats to species and varieties in order to allow for assessment of stability and availability over time, and under shocks and stresses.

IUCN has recently launched a project, currently titled Human Dependency on Nature. An important component of this work has been to consider possibilities for linking data on threats to biodiversity with the nutritional composition of wild foods and traditional landraces. There are likely synergies to be explored in linking efforts, both of which are attempting to better understand the contribution of nature to human health.

Michelle Zelli, AidData

It is my pleasure to submit AidData's response to the GODAN Discussion Paper. AidData will also be in attendance at the GODAN meeting in Ottawa this week. We look forward to contributing during the meeting and appreciate the opportunity to provide insight and feedback on the discussion paper. The following is in response to the recently announced ODI/GODAN Discussion Paper, “How can we improve agriculture, food and nutrition with open data?”.  As AidData is a new partner of the GODAN network, we seek to respond to the main points addressed in the discussion paper through the lens of our area of expertise: tracking geo-referenced development finance data at the sub-national level to inform policies and promote evidence-based decision-making.

Ease of Use and Accessibility

While the discussion paper cannot include every aspect of the use of open data for agriculture, food and nutrition policy initiatives, an area that we would like to highlight is the use of decision-support tools and dashboards to improve the accessibility and ease of use of open data. The aim should be not only to publish data, but also to render data more accessible and usable for a wide audience through the use of decision-support tools and dashboards. These tools should be enhanced to also incorporate ways to make data interfaces more user-friendly and accessible through different formats such as mobile phones and apps. Open data sources should also incorporate a design thinking approach in order to create platforms for accessible, open data whose features and functionality are built with end-users in mind.

Open Data Feedback Loops

The impact of open data for agriculture, food and nutrition initiatives can be extended beyond capacity building by incorporating ways that farmers, civil society organizations and citizens can contribute towards the collection and maintenance of open data resources. The inclusion of increased feedback mechanisms has two benefits: 1) It increases opportunities for target users to provide comments and suggestions based on their needs and experiences interacting with the data, 2) It increases country ownership of the data thereby improving uptake and maintenance of user contributed data. Increasing feedback loops in the collection and maintenance processes can significantly add value to open data resources.

Nutrition Sensitive Investments Approach

Particularly with regard to agriculture, food and nutrition funding, the use of activity-level – rather than project-level – information can allow for a more precise classification of “nutrition sensitive” financial flows that incorporate projects with nutrition and food security aspects. The use of nutrition sensitive tracking allows for a standardized system across sectors and donors while taking into account funding designated for projects with nutrition outcomes. An example of the use of “nutrition sensitive” tracking is described in the following working paper: Ickes, B. et al. 2015. “Building a Stronger System for Tracking Nutrition Sensitive Spending: A methodology and estimate of global spending for nutrition sensitive foreign aid,” AidData Workshop Paper 7, April 2015.

Additional opportunities for the use of open data for agriculture, food and nutrition initiatives include addressing factors such as the interoperability of existing datasets, the ability to analyze nutrition and agriculture data vis-à-vis the SDG framework, and integrating gender outcomes. An example of the use of spatial data to inform gender-responsive development outcomes is “Gender and GIS: Using Maps to Improve Food Security in Uganda”, which could be used as an additional use case example. The use of open data to improve agriculture, food and nutrition outcomes has many applications. Examples of its potential uses extend beyond the identification of data sources to improving accessibility and usability for local decision-makers and policymakers.

Jaime Adams, OSEC, USDA

While the revolution to open data around the globe moves forward at warp speed, many still ask: why, what is the value of opening data?  Here it is: 13 case studies for opening agriculture and nutrition data, just the tip of the iceberg.   Congratulations to the authors and contributors of the discussion paper, “How can we improve agriculture, food and nutrition with open data?”  It is exciting to see the fruits of our labor to assemble over 120 partners to advocate for open agriculture and nutrition data around the world.  My hope is that this is only the first of many publications from GODAN.

As one of the GODAN architects, I challenge the “owners” of the case studies and those who will endeavor similar efforts in the future – first, look around.  What is already being done?  Motivation behind creating GODAN not only included bringing partners together to advocate for open agriculture and nutrition data, but also to encourage partners to share knowledge and collaborate.  By sharing knowledge and collaborating, together we will reduce duplication of efforts and fast track toward solutions for global challenges.

Since this is a discussion paper, I would encourage a discussion around current and future case studies to include:  Did the project owners explore whether there were similar efforts underway?  Did the project owners seek collaborators through initiatives such as GODAN?  Are project owners now educating others on their work to reduce future duplication?   These are the steps we encourage GODAN partners to take.  Talk to your GODAN colleagues before starting a new project in opening agriculture and nutrition data.  Seek collaborators to achieve economies of scale.   Share the knowledge you gain from your projects by making the results open.

I am often asked: why do you think the agriculture sector and opening agriculture and nutrition data is so important?  My answer is simple.  Agriculture = food.  No food = no human race.  No human race = no other sectors.  Kudos to a job well done and looking forward to the next publication!

Savania Chinamaringa, Department of Environment, Food and Rural Affairs (DEFRA), UK (Personal response)

The GODAN paper touches on some fundamental issues regarding open data and its role in agriculture, food and nutrition. The authors should be commended for bringing together a diverse collection of use cases to tell the story of open data. While the paper is comprehensive I think that there are a few additions or clarifications that, as GODAN, we should seek to explore further as we proceed to support and implement the approaches put forward in this paper.

My first observation is about how we frame use cases and case studies. The paper starts and finishes by emphasising the need to always start from the viewpoint of real-world problems that we seek to solve with open data. In line with that thinking, it would be good to frame our case studies in that way. Doing so will make it easier to communicate GODAN’s key messages effectively. I suggest we structure the use cases as follows: what was the real-world problem, how were data used, and what were the outcome and the impact. The suggestion would be to use a uniform template for all use cases.

The second point I want to comment on is the level at which we assess the impact of open data. Most use cases and benefits are framed at the level of the individual farmer (as a private and/or business entity), consumer and citizen.  I believe that misses a very important level: the institutional. What I mean here is that in developing economies, where agricultural practices may not be advanced enough for individual farmers to fully exploit data, there is a need for strong institutions. I propose that GODAN looks at this issue separately, i.e. how open data initiatives can be used to build strong institutions to support agriculture, food and nutrition at local, regional and global level.

The third point I want to make is about too much focus on the ‘open’ in open data – not on the ‘data’. Understandably the focus of GODAN is open data, but when we are looking at how open data can be used to promote agriculture, food and nutrition we need to look for success stories where data in general, not just open data, has been used to improve outcomes. In cases where such data were not open we can then argue that if it were open the impact would certainly multiply many times over.

The fourth and final point I want to make is about forging partnerships with other initiatives. There are numerous other groupings and initiatives that are also trying to solve the numerous challenges in agriculture, food and nutrition. Instead of appearing to be pushing a separate data-driven agenda, GODAN should seek synergies with such initiatives and promote the adoption and use of data, and its pivotal role, as one of the many potential means to some ends.

Jerry Janes, Janes Consulting

As a GODAN partner, noting the reference to software tools for collecting, cleaning, scraping and publishing data, I agree that you need that framework in place. It was in place when I built it last year: a highly scalable and functional Open Source solution, awaiting data. I applied for the position to continue this work but never got a reply until I pressed, and then was dismissed because I am remote, which was an option per the job description.

I know I am beating a dead horse, but you already had in place what you continue to say you need, and it exceeded your needs and goals. I assume you hired someone, but I have not received such notification after I asked, so I will again offer my assistance as I see none of this work has been done since mine.

If you would like a similar case study showing what my deleted project entailed (I would still like a copy of my work, if nothing else), feel free to register and kick the tires on another project I developed, http://iri.li. By the way, that is one of the core features of ISCedu.org, which itself is aligned to your goals.

What I suggest is you hire me immediately as a consultant so I can rebuild the framework if the backup never was made, or if no longer available, because without it, none of the rest of your proposal can happen.