The aim of Nature Navigator is to provide detailed analysis and insights on individual research topics to aid your strategic and tactical decision making. Below is an explanation of how Springer Nature processes the data from different raw data sources. It also explains how to contact us if you find a mistake in our data, or if you would like to request the rectification or removal of information on this website that conflicts with your legitimate personal interests.
Nature Navigator does not actively source data. We process different raw data sources to provide the information you can find on topic pages. Which data source is used for a specific topic can be subject to individual licence terms. In general, the available data sources include, but are not limited to, the following:
The raw data on which Nature Navigator insights are based is a derivative of standard bibliographic metadata of scientific content. To provide meaningful and accurate summaries of current research landscapes, we evaluate and process metadata from the above sources. This metadata contains information about individuals, e.g. authors of scientific content and their affiliations. You might therefore appear in topic overviews if you have authored one or more content pieces.
To summarise and analyse content for a specific research landscape, we process several metadata attributes of scientific content, including but not limited to the following:
Scientific fields are shaped and influenced by the research and, more importantly, by the researchers conducting it. To summarise how a research field or topic is evolving, and to predict how it might evolve in future, it is therefore key to look at who is contributing what to the research landscape. Our aim is to provide our users with an objective understanding of the state of certain fields.
There are two main steps in creating topic overviews and summaries for Nature Navigator:
The selection of relevant content is based on selection criteria specified by the creator of a topic. The creator can be an editor, data specialist or scientist at Springer Nature, or an individual end user of the platform. There are two basic modes of content selection.
Direct content selection refers to defining a set of metadata-related criteria that are used to filter all available content. These criteria relate directly to the attributes we source, which are described in "What data attributes is Nature Navigator using?". The following are a few examples of possible direct selection criteria, written in non-technical terms:
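In more technical terms, direct selection amounts to filtering records on their metadata attributes. The following is a minimal sketch only, not the implementation used by Nature Navigator: the `Record` fields, the `select_direct` function and the sample data are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical bibliographic record; the field names are illustrative."""
    title: str
    year: int
    countries: frozenset
    keywords: frozenset


def select_direct(records, *, min_year=None, country=None, keyword=None):
    """Keep the records that satisfy every criterion that was supplied."""
    selected = []
    for rec in records:
        if min_year is not None and rec.year < min_year:
            continue
        if country is not None and country not in rec.countries:
            continue
        if keyword is not None and keyword not in rec.keywords:
            continue
        selected.append(rec)
    return selected


# Made-up sample data, for illustration only.
RECORDS = [
    Record("Perovskite solar cells", 2021, frozenset({"DE"}), frozenset({"photovoltaics"})),
    Record("Silicon tandem devices", 2017, frozenset({"US", "DE"}), frozenset({"photovoltaics"})),
    Record("Soil microbiomes", 2022, frozenset({"BR"}), frozenset({"ecology"})),
]

recent_pv = select_direct(RECORDS, min_year=2020, keyword="photovoltaics")
print([rec.title for rec in recent_pv])
```

Each criterion that is added narrows the selection further, which is why a filter that is unrelated to the research question can silently exclude relevant content.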
Another, more powerful way for our creators to select content is indirect selection. In this case, example content from the desired field of interest is used to select similar content using state-of-the-art classification techniques. Creators provide sample content as well as the type of similarity they would like the machine to use. Indirect selection via similarity is in principle more susceptible to bias, because the result depends on the choices the creator made when seeding the process. Please see "Improper selection criteria" for more information.
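The idea of selecting content by similarity to seed examples can be illustrated with a deliberately simple stand-in. The sketch below uses bag-of-words cosine similarity, which is an assumption made for illustration and not the classification technique Nature Navigator actually applies; the function names, the threshold and the sample texts are hypothetical.

```python
import math
from collections import Counter


def vectorise(text):
    """Turn text into a bag-of-words count vector (lower-cased, split on whitespace)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_similar(seeds, candidates, threshold=0.3):
    """Keep candidates whose similarity to the combined seed profile meets the threshold."""
    profile = Counter()
    for seed in seeds:
        profile.update(vectorise(seed))
    return [c for c in candidates if cosine(profile, vectorise(c)) >= threshold]


# Made-up seed and candidate titles, for illustration only.
SEEDS = ["perovskite solar cell efficiency", "tandem solar cell stability"]
CANDIDATES = ["solar cell degradation under heat", "gut microbiome diversity in mice"]

similar = select_similar(SEEDS, CANDIDATES)
print(similar)
```

Because the seed profile is built only from the sample content, a narrow or one-sided seed list shifts which candidates pass the threshold. This is the seeding bias described above.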
The selected content is then analysed at different levels.
We offer standard ways of aggregating all selected content into meaningful charts for analysis. The simplest example of a statistical analysis is an aggregation of all relevant content by one or more metadata criteria, e.g. the publication output in the topic per year.
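The publication-output example can be sketched in a few lines. This is an illustrative sketch under the assumption that each selected record carries a `year` attribute; the function name and the sample records are hypothetical.

```python
from collections import Counter


def output_per_year(records):
    """Count the selected records per publication year, ordered by year."""
    counts = Counter(rec["year"] for rec in records)
    return dict(sorted(counts.items()))


# Made-up selection, for illustration only.
SELECTED = [
    {"title": "Paper A", "year": 2021},
    {"title": "Paper B", "year": 2020},
    {"title": "Paper C", "year": 2021},
]

per_year = output_per_year(SELECTED)
print(per_year)
```

The resulting mapping of year to count is the kind of series that can be plotted directly as a bar or line chart.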
Certain metadata enable the creation of relationships and connections between individual content pieces, such as co-authorship (two authors or affiliations contributing to the same content piece), co-citation (two content pieces cited by the same third content piece) or usage of a similar subset of concepts. We use common network analysis techniques to identify relationships, and the strength of those relationships, within the content selection. The result of such an analysis can be, for example, the network of authors which surfaces and visualises
In principle, there are several sources of bias in the content processing described above. We are continuously working on minimising the impact of biases where we are in direct control.
Any analysis is influenced directly by the raw data that is fed into it. Hence, the selection criteria are a common source of bias, because they can ignore or suppress a relevant amount of content. Creators therefore have to be mindful of the selection criteria they use. For topics created by Springer Nature, we try to use only filter criteria that are directly related to the question, e.g. applying a country filter only makes sense when the research question is about the research of a given country or region. When working with sample content for indirect selection, we aim to use large input lists with diverse content, e.g. content from many publishers.
Another source of bias is incomplete data. We try to address this by constantly improving and enriching our raw data sources, and by avoiding analysis on attributes that are known to be incomplete. In cases where the current data landscape does not allow for more complete data and we believe this can have an impact on the analysis, we aim to indicate this to users.
If you see a mistake in our content, or if you are of the opinion that it infringes your own interests, please do not hesitate to contact us. We look forward to hearing from you and to finding a suitable way forward. Please send an email to navigator@nature.com. It will help us tremendously if you send us a link and a screenshot of the detail you would like us to look at.
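One simple way to decide whether an attribute is complete enough to analyse is to measure how often it is filled in. The sketch below is an assumption made for illustration, not a description of Nature Navigator's own checks; the function names, the 0.9 threshold and the sample records are hypothetical.

```python
def completeness(records, attribute):
    """Return the fraction of records in which the attribute has a non-empty value."""
    if not records:
        return 0.0
    filled = sum(1 for rec in records if rec.get(attribute) not in (None, "", [], ()))
    return filled / len(records)


def usable_attributes(records, attributes, minimum=0.9):
    """Keep only the attributes whose completeness reaches the minimum threshold."""
    return [attr for attr in attributes if completeness(records, attr) >= minimum]


# Made-up records, for illustration only; one affiliation is missing.
SAMPLE = [
    {"year": 2020, "affiliation": "University A"},
    {"year": 2021, "affiliation": ""},
    {"year": 2021, "affiliation": "University B"},
    {"year": 2022, "affiliation": "University C"},
]

usable = usable_attributes(SAMPLE, ["year", "affiliation"])
print(usable)
```

In this sample, the affiliation attribute is filled in for only three of four records, so an analysis by affiliation would be skipped or flagged while an analysis by year would proceed.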