diff --git a/.github/ISSUE_TEMPLATE/general-report.md b/.github/ISSUE_TEMPLATE/general-report.md deleted file mode 100644 index 85ac8935b5..0000000000 --- a/.github/ISSUE_TEMPLATE/general-report.md +++ /dev/null @@ -1,15 +0,0 @@ ---- -name: General report -about: Create a report to help us improve -title: '' -labels: '' -assignees: '' ---- - -Before creating a new issue, make sure you had a look at the [official documentation](https://grobid.readthedocs.com). For specific questions, you can try the [Mendable Q/A chat](https://www.mendable.ai/demo/723cfc12-fdd6-4631-9a9e-21b80241131b) (**NOTE**: This is rather experimental, if not sure, make sure you double-check using the official documentation.) - -- What is your OS and architecture? Windows is not supported and Mac OS arm64 is experimentally supported. For non-supported OS, you can use Docker (https://grobid.readthedocs.io/en/latest/Grobid-docker/) - -- What is your Java version (`java --version`)? - -- In case of build or run errors, please submit the error while running gradlew with ``--stacktrace`` and ``--info`` for better log traces (e.g. `./gradlew run --stacktrace --info`) or attach the log file `logs/grobid-service.log` or the console log. diff --git a/.github/ISSUE_TEMPLATE/general-report.yml b/.github/ISSUE_TEMPLATE/general-report.yml new file mode 100644 index 0000000000..d12ad8fdd6 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/general-report.yml @@ -0,0 +1,36 @@ +name: General report +description: Create a report to help us improve +body: + - type: markdown + attributes: + value: | + Thanks for taking the time to fill out this bug report! Before creating a new issue, make sure you had a look at the [official documentation](https://grobid.readthedocs.com) or with the **experimental** [Mendable Q/A chat](https://www.mendable.ai/demo/723cfc12-fdd6-4631-9a9e-21b80241131b). **NOTE**: the suggested method of running grobid is through Docker (https://grobid.readthedocs.io/en/latest/Grobid-docker/). 
+ - type: input + id: os + attributes: + label: Operating System and architecture (arm64, amd64, x86, etc.) + description: Please remember that Windows is not supported and Mac OS arm64 is still experimental. + validations: + required: false + - type: input + id: java + attributes: + label: What is your Java version + description: "java --version" + validations: + required: false + - type: textarea + id: logs + attributes: + label: Log and information + description: In case of build or run errors, please submit the error while running gradlew with ``--stacktrace`` and ``--info`` for better log traces (e.g. `./gradlew run --stacktrace --info`) or attach the log file `logs/grobid-service.log` or the console log. + validations: + required: false + - type: textarea + id: what-happened + attributes: + label: Further information + description: Please give us any information that could be of help + validations: + required: false + diff --git a/.github/workflows/ci-build-manual-crf.yml b/.github/workflows/ci-build-manual-crf.yml index 6606f0bfd1..d794f4d663 100644 --- a/.github/workflows/ci-build-manual-crf.yml +++ b/.github/workflows/ci-build-manual-crf.yml @@ -3,10 +3,11 @@ name: Build and push a CRF-only docker image on: workflow_dispatch: inputs: - suffix: + custom_tag: type: string - description: Docker image suffix (e.g. 
develop, crf, full) - required: false + description: Docker image tag + required: true + default: "latest-crf" jobs: build: @@ -42,6 +43,6 @@ jobs: registry: docker.io pushImage: true tags: | - latest-develop, latest-crf${{ github.event.inputs.suffix != '' && '-' || '' }}${{ github.event.inputs.suffix }} + latest-develop, ${{ github.event.inputs.custom_tag}} - name: Image digest run: echo ${{ steps.docker_build.outputs.digest }} diff --git a/.github/workflows/ci-build-manual-full.yml b/.github/workflows/ci-build-manual-full.yml index ce1a0a175b..9ba70d16dc 100644 --- a/.github/workflows/ci-build-manual-full.yml +++ b/.github/workflows/ci-build-manual-full.yml @@ -1,7 +1,13 @@ name: Build and push a full docker image -on: "workflow_dispatch" - +on: + workflow_dispatch: + inputs: + custom_tag: + type: string + description: Docker image tag + required: true + default: "latest-full" jobs: build: @@ -35,7 +41,7 @@ jobs: image: lfoppiano/grobid registry: docker.io pushImage: true - tags: latest-full + tags: latest-full, ${{ github.event.inputs.custom_tag}} dockerfile: Dockerfile.delft - name: Image digest run: echo ${{ steps.docker_build.outputs.digest }} diff --git a/.github/workflows/ci-build-unstable.yml b/.github/workflows/ci-build-unstable.yml index 808931f3d7..558933a0e9 100644 --- a/.github/workflows/ci-build-unstable.yml +++ b/.github/workflows/ci-build-unstable.yml @@ -4,7 +4,7 @@ on: [ push ] concurrency: group: gradle - cancel-in-progress: true + cancel-in-progress: false jobs: diff --git a/.gitignore b/.gitignore index 4bcec65bcd..35526b5247 100644 --- a/.gitignore +++ b/.gitignore @@ -8,6 +8,7 @@ Thumbs.db .settings .classpath .idea +.vscode .gradle **/build */out/ diff --git a/CHANGELOG.md b/CHANGELOG.md index e5df60bb81..bbf5ba3459 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,6 +4,22 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/). 
+## [0.8.2] - TBD + +### Added +- New model specialisation/variants (flavors) mechanism #1151 +- New specialised models for a lightweight processing that covers other types of scientific articles that are not following the general segmentation schema (e.g. corrections, editorial letters, etc.) #1202 +- Additional training data covering edge cases where the Data Availability statements are over multiple pages #1200 +- Added a flag that allows outputting the raw copyright information in TEI #1181 + +### Changed + +### Fixed +- Fix URL identification for certain edge cases #1190, #1191, #1185 +- Fix fulltext model training data #1107 +- Fix header model training data #1128 +- Updated the docker image's packages to reduce the vulnerabilities #1173 + ## [0.8.1] - 2024-09-14 ### Added diff --git a/Dockerfile.delft b/Dockerfile.delft index ab96ac3e09..d9c0f27f2f 100644 --- a/Dockerfile.delft +++ b/Dockerfile.delft @@ -87,6 +87,7 @@ ENTRYPOINT ["/tini", "-s", "--"] # install JRE, python and other dependencies RUN apt-get update && \ + apt-mark hold libcudnn8 && \ apt-get -y upgrade && \ apt-get -y --no-install-recommends install apt-utils build-essential gcc libxml2 libfontconfig unzip curl \ openjdk-17-jre-headless ca-certificates-java \ @@ -141,7 +142,7 @@ RUN python3 preload_embeddings.py --registry ./resources-registry.json && \ RUN mkdir delft && \ cp ./resources-registry.json delft/ -ENV GROBID_SERVICE_OPTS "--add-opens java.base/java.lang=ALL-UNNAMED" +ENV GROBID_SERVICE_OPTS "--add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/sun.nio.ch=ALL-UNNAMED --add-opens java.base/java.io=ALL-UNNAMED" CMD ["./grobid-service/bin/grobid-service"] diff --git a/Readme.md b/Readme.md index 66b4dd6791..989bc5c3ca 100644 --- a/Readme.md +++ b/Readme.md @@ -78,7 +78,11 @@ For facilitating the usage GROBID service at scale, we provide clients written i - Java GROBID client - Node.js GROBID client -All these clients will take advantage of the multi-threading for scaling large 
set of PDF processing. As a consequence, they will be much more efficient than the [batch command lines](https://grobid.readthedocs.io/en/latest/Grobid-batch/) (which use only one thread) and should be preferred. +A third party client for Go is available offering functionality similar to the Python client: + +- Go GROBID client + +All these clients will take advantage of the multi-threading for scaling large set of PDF processing. As a consequence, they will be much more efficient than the [batch command lines](https://grobid.readthedocs.io/en/latest/Grobid-batch/) (which use only one thread) and should be preferred. For example, we have been able to run the complete full-text processing at around 10.6 PDF per second (around 915,000 PDF per day, around 20M pages per day) with the node.js client listed above during one week on one 16 CPU machine (16 threads, 32GB RAM, no SDD, articles from mainstream publishers), see [here](https://github.com/kermitt2/grobid/issues/443#issuecomment-505208132) (11.3M PDF were processed in 6 days by 2 servers without interruption). diff --git a/doc/Deep-Learning-models.md b/doc/Deep-Learning-models.md index c3db143110..886597af46 100644 --- a/doc/Deep-Learning-models.md +++ b/doc/Deep-Learning-models.md @@ -18,7 +18,7 @@ Current neural models can be up to 50 times slower than CRF, depending on the ar ## Recommended Deep Learning models -By default, only CRF models are used by Grobid. You need to select the Deep Learning models you would like to use in the GROBID configuration yaml file (`grobid/grobid-home/config/grobid.yaml`). See [here](https://grobid.readthedocs.io/en/latest/Configuration/#configuring-the-models) for more details on how to select these models. The most convenient way to use the Deep Learning models is to use the full GROBID Docker image and pass a configuration file at launch of the container describing the selected models to be used instead of the default CRF ones. 
Note that the full GROBID Docker image is already configured to use Deep Learning models for bibliographical reference and affiliation-address parsing. +By default, only CRF models are used by Grobid. You need to select the Deep Learning models you would like to use in the GROBID configuration yaml file (`grobid/grobid-home/config/grobid.yaml`). See [here](Configuration.md#configuring-the-models) for more details on how to select these models. The most convenient way to use the Deep Learning models is to use the full GROBID Docker image and pass a configuration file at launch of the container describing the selected models to be used instead of the default CRF ones. Note that the full GROBID Docker image is already configured to use Deep Learning models for bibliographical reference and affiliation-address parsing. For current GROBID version 0.8.1, we recommend considering the usage of the following Deep Learning models: @@ -46,7 +46,7 @@ However, if you need a "local" library installation and build, prepare a lot of #### Classic python and Virtualenv -0. Install GROBID as indicated [here](https://grobid.readthedocs.io/en/latest/Install-Grobid/). +0. Install GROBID as indicated [here](Install-Grobid.md). The following was tested with Java version up to 17. @@ -130,7 +130,7 @@ INFO [2020-10-30 23:04:07,756] org.grobid.core.jni.DeLFTModel: Loading DeLFT mo INFO [2020-10-30 23:04:07,758] org.grobid.core.jni.JEPThreadPool: Creating JEP instance for thread 44 ``` -It is then possible to [benchmark end-to-end](https://grobid.readthedocs.io/en/latest/End-to-end-evaluation/) the selected Deep Learning models as any usual GROBID benchmarking exercise. In practice, the CRF models should be mixed with Deep Learning models to keep the process reasonably fast and memory-hungry. 
In addition, note that, currently, due to the limited amount of training data, Deep Learning models perform significantly better than CRF only for a few models (`citation`, `affiliation-address`, `reference-segmenter`). This should of course certainly change in the future! +It is then possible to [benchmark end-to-end](End-to-end-evaluation.md) the selected Deep Learning models as any usual GROBID benchmarking exercise. In practice, the CRF models should be mixed with Deep Learning models to keep the process reasonably fast and memory-hungry. In addition, note that, currently, due to the limited amount of training data, Deep Learning models perform significantly better than CRF only for a few models (`citation`, `affiliation-address`, `reference-segmenter`). This should of course certainly change in the future! #### Anaconda diff --git a/doc/End-to-end-evaluation.md b/doc/End-to-end-evaluation.md index 24284141c7..4e8505629a 100644 --- a/doc/End-to-end-evaluation.md +++ b/doc/End-to-end-evaluation.md @@ -12,7 +12,7 @@ For actual benchmarks, see the [Benchmarking page](Benchmarking.md). We describe ## Datasets -The corpus used for the end-to-end evaluation of Grobid are all available in a single place on Zenodo: https://zenodo.org/record/7708580. Some of these datasets have been further annotated to make the evaluation of certain sub-structures possible (in particular code and data availability sections & funding sections). +The corpus used for the end-to-end evaluation of Grobid are all available in a single place on Zenodo: [https://zenodo.org/record/7708580](https://zenodo.org/record/7708580). Some of these datasets have been further annotated to make the evaluation of certain sub-structures possible (in particular code and data availability sections & funding sections). These resources are originally published under CC-BY license. Our additional annotations are similarly under CC-BY. 
We thank NIH, bioRxiv, PLOS and eLife for making these resources Open Access and reusable. diff --git a/doc/Grobid-batch.md b/doc/Grobid-batch.md index d856126eab..563ec60819 100644 --- a/doc/Grobid-batch.md +++ b/doc/Grobid-batch.md @@ -1,6 +1,6 @@
"); tei.append(TextUtilities.HTMLEncode(biblio.getCopyright())); - tei.append("\n"); + tei.append("
\n"); } tei.append("\t\t\t\t\n"); @@ -315,7 +315,7 @@ public StringBuilder toTEIHeader(BiblioItem biblio, if (config.getIncludeRawCopyrights() && biblio.getCopyright() != null && biblio.getCopyright().length()>0) { tei.append("\t\t\t\t\t"); tei.append(TextUtilities.HTMLEncode(biblio.getCopyright())); - tei.append("\n"); + tei.append("
\n"); } tei.append("\t\t\t\t\n"); diff --git a/grobid-core/src/main/java/org/grobid/core/engines/AffiliationAddressParser.java b/grobid-core/src/main/java/org/grobid/core/engines/AffiliationAddressParser.java index 95798c36c9..1e97e411a1 100755 --- a/grobid-core/src/main/java/org/grobid/core/engines/AffiliationAddressParser.java +++ b/grobid-core/src/main/java/org/grobid/core/engines/AffiliationAddressParser.java @@ -1,21 +1,22 @@ package org.grobid.core.engines; -import org.chasen.crfpp.Tagger; +import org.apache.commons.collections4.CollectionUtils; +import org.apache.commons.lang3.StringUtils; +import org.grobid.core.GrobidModel; import org.grobid.core.GrobidModels; import org.grobid.core.data.Affiliation; +import org.grobid.core.engines.label.TaggingLabel; +import org.grobid.core.engines.label.TaggingLabels; import org.grobid.core.exceptions.GrobidException; import org.grobid.core.features.FeaturesVectorAffiliationAddress; import org.grobid.core.layout.LayoutToken; import org.grobid.core.lexicon.Lexicon; +import org.grobid.core.tokenization.TaggingTokenCluster; +import org.grobid.core.tokenization.TaggingTokenClusteror; +import org.grobid.core.utilities.LayoutTokensUtil; import org.grobid.core.utilities.OffsetPosition; import org.grobid.core.utilities.TextUtilities; import org.grobid.core.utilities.UnicodeUtil; -import org.grobid.core.utilities.LayoutTokensUtil; -import org.grobid.core.engines.tagging.GenericTaggerUtils; -import org.grobid.core.tokenization.TaggingTokenCluster; -import org.grobid.core.tokenization.TaggingTokenClusteror; -import org.grobid.core.engines.label.TaggingLabel; -import org.grobid.core.engines.label.TaggingLabels; import java.util.ArrayList; import java.util.List; @@ -24,8 +25,12 @@ public class AffiliationAddressParser extends AbstractParser { public Lexicon lexicon = Lexicon.getInstance(); + protected AffiliationAddressParser(GrobidModel model) { + super(model); + } + public AffiliationAddressParser() { - 
super(GrobidModels.AFFILIATION_ADDRESS); + this(GrobidModels.AFFILIATION_ADDRESS); } public List
     * Note: due to an older bug, kr is currently mapped to Korean too - this should
* disappear at some point in the future after retraining of models
*
@@ -847,7 +849,7 @@ public List
* For example "The car is in Milan" as Milan is a location, would return OffsetPosition(14,19)
*
* @param s the input string
@@ -1009,7 +1011,7 @@ public List
* For example "The car is in Milan" as Milan is a location, would return OffsetPosition(14,19)
*
* @param s the input list of LayoutToken
@@ -1023,7 +1025,7 @@ public List The thermodynamic parameters of the LaH 10 superconductor were calculated by means of Eliashberg equations and The symbols Δ = Δ Ω where ρ(0) denotes the value of electronic density of states at Fermi surface; Z n The difference in the specific heat between the superconducting and the normal state (ΔC = C S − C N ) is given by: The most convenient way of estimation the specific heat for the normal state is using the expression: Nevertheless, a sensible qualitative analysis can be made with respect to the influence of the atomic mass of the where λ La , λ X , and λ H are the contributions to the electron-phonon coupling constant derived from both metals while the symbols appearing in Eq. (8) are defined in Table 1. Let us calculate explicitly the relevant quantities: and We are going to consider the case Ω
+
diff --git a/grobid-trainer/resources/dataset/fulltext/corpus/tei/s41598-020-58065-9.training.fulltext.tei.xml b/grobid-trainer/resources/dataset/fulltext/corpus/tei/s41598-020-58065-9.training.fulltext.tei.xml
index 59bdd0ce98..39251fe8a6 100644
--- a/grobid-trainer/resources/dataset/fulltext/corpus/tei/s41598-020-58065-9.training.fulltext.tei.xml
+++ b/grobid-trainer/resources/dataset/fulltext/corpus/tei/s41598-020-58065-9.training.fulltext.tei.xml
@@ -14,11 +14,11 @@
on the imaginary axis 23 :
π
μ
Δ
=
Ω − Ω −
Ω
Ω + Δ
Δ
=−
Z
k T
K
[ (
)
( )]
,
n n
B
m
M
M
n
m
m
m
m
m
2
2
π
μ
Δ
=
Ω − Ω −
Ω
Ω + Δ
Δ
=−
Z
k T
K
[ (
)
( )]
,
()
n n
B
m
M
M
n
m
m
m
m
m
2
2
π
= +
Ω − Ω
Ω + Δ
Ω
Ω
.
=−
Z
kT
K
Z
1
(
)
n
B
m
M
M
n
m
m
m
m
n
m
2
2
π
= +
Ω − Ω
Ω + Δ
Ω
Ω
.
=−
Z
kT
K
Z
1
(
)
()
n
B
m
M
M
n
m
m
m
m
n
m
2
2
i
( )
n
n and =
Ω
Z
Z i
( )
n
n denote the order parameter and the wave function renormalization
factor, respectively. The quantity Ω n represents the Matsubara frequency:
π
Ω =
−
k T n
( 2
1)
n
B
, where k B is the
Boltzmann constant. The pairing kernel is defined by:
λ
Ω − Ω =
Ω
Ω − Ω
+Ω
K(
)
n
m
(
)
C
n
m
C
2
2
2 , where λ denotes the elec-
tron-phonon coupling constant. We determined the value of λ on the basis of experimental data 20,21 and the
condition: Δ
=
= =
[
]
0
n
T T
1
C
. The fitting between the theory and the experimental results is presented in Fig. 1. We
obtained λ a = 2.187 for p a = 150 GPa and λ b = 2.818 for p b = 190 GPa. The symbol Ω C represents the character-
istic phonon frequency, its value being assumed as Ω C = 100 meV.
ρ
π
Δ = −
Ω + Δ − Ω
×
−
Ω
Ω + Δ
=
F
k T
Z
Z
(0)
2
(
)
,
B
n
M
n
n
n
n
S
n
N
n
n
n
1
2
2
2
2
ρ
π
Δ = −
Ω + Δ − Ω
×
−
Ω
Ω + Δ
=
F
k T
Z
Z
(0)
2
(
)
,
()
B
n
M
n
n
n
n
S
n
N
n
n
n
1
2
2
2
2
S and Z n
N are the wave function
normalization factors for the superconducting and the normal state, respectively. Note that ΔF is equal to zero
exactly for T = T C . This fact results from the overt dependence of free energy on solutions of Eliashberg equations
(Δ n and Z n ) that have been adjusted to the experimental value of critical temperature by appropriate selection of
electron-phonon coupling constant (see Fig. 1). Thermodynamic critical field should be calculated from the
formula:
π
ρ
= − Δ
.
H
F
(0)
8 [ / (0)]
C
π
ρ
= − Δ
.
H
F
(0)
8 [ / (0)]
()
C
ρ
Δ
= −
Δ
.
C T
k
k T
d
F
d k T
( )
(0)
[ / (0)]
(
)
B
B
B
2
2
ρ
Δ
= −
Δ
.
C T
k
k T
d
F
d k T
( )
(0)
[ / (0)]
(
)
()
B
B
B
2
2
γ
=
.
C T
k
k T
( )
(0)
N
B
B
γ
=
.
C T
k
k T
( )
(0)
()
N
B
B
X element on a value of the critical temperature (since the mass of the X element determines Ω max ). In this regard,
let us refer to the theoretical results obtained within the Eliashberg formalism for H 2 S and H 3 S superconduc-
tors 5,6 . They prove that contributions to the Eliashberg function (α Ω
F( )
2
) coming from sulphur and from hydro-
gen are separated due to a huge difference between atomic masses of these two elements. To be precise, the
electron-phonon interaction derived from sulphur is crucial in the frequency range from 0 meV to Ω max
S
equal to
about 70 meV, while the contribution derived from hydrogen (Ω
= 220
max
H
meV) is significant above ~100 meV.
It is noteworthy that we come upon a similar situation in the case of the LaH 10 compound 30 . Therefore the follow-
ing factorization of the Eliashberg function for the LaXH compound can be assumed:
λ
θ
λ
θ
λ
θ
Ω =
Ω
Ω
Ω
− Ω +
Ω
Ω
Ω
− Ω
+
Ω
Ω
Ω
− Ω
F( )
(
)
(
)
(
) ,
- 2
L a
max
La 2
max
La
X
max
X
2
max
X
H
max
H
2
max
H
2
L a
max
La 2
max
La
X
max
X
2
max
X
H
max
H
2
max
H
(La, X) and hydrogen, respectively. Similarly, the symbols Ω max
La , Ω max
X , and Ω max
H represent the respective maxi-
mum phonon frequencies. The value of the critical temperature can be assessed from the generalised formula of
the BCS theory 7 :
λ
λμ
=
Ω
.
− .
+
− + .
k T
f f
1 27
exp
1 14(1
)
(1 0 163 )
,
B C
1 2
ln
λ
λμ
=
Ω
.
− .
+
− + .
k T
f f
1 27
exp
1 14(1
)
(1 0 163 )
,
()
B C
1 2
ln
λ
λ
=
+
+ ,
λ
λ
=
+
+ ,
()
λ
λ
λ
λ
λ
λ
λ
λ
λ
λ
λ
Ω =
+
+
Ω
−
×
+
+
Ω
−
×
+
+
Ω
−
exp
l n(
)
1
2
exp
l n(
)
1
2
exp
l n(
)
1
2
,
-
+ ()
ln
La
La
X
H
max
La
X
La
X
H
max
X
H
La
X
H
max
H
λ
λ
λ
λ
λ
λ
λ
λ
λ
λ
λ
Ω =
+
+
Ω
+
+
+
Ω
+
+
+
Ω
.
(
)
2
(
)
2
(
)
2
2
La
La
X
H
max
La 2
X
La
X
H
max
X
2
H
La
X
H
max
H
2
λ
λ
λ
λ
λ
λ
λ
λ
λ
λ
λ
Ω =
+
+
Ω
+
+
+
Ω
+
+
+
Ω
.
(
)
2
(
)
2
(
)
2
()
2
La
La
X
H
max
La 2
X
La
X
H
max
X
2
H
La
X
H
max
H
2
< Ω
<
~40 meV
100 meV
max
La
max
X
. It means that we are interested in
such an X element, the contribution of which to the Eliashberg function fills the gap between contributions com-
ing from lanthanum and hydrogen. It can be assumed that 0 < λ X < 1, while keeping in mind that λ La = 0.68 31 .
Additionally, the previous calculations discussed in the work allow to write that λ La + λ H is equal to λ a = 2.187
for p a = 150 GPa or to λ b = 2.818 for p b = 190 GPa. The quantity
μ occurring in the Eq. (8) serves now as the
fitting parameter. One should remember that the formula for the critical temperature given by the Eq. (8) was
derived with the use of significant simplifying assumptions (the value of the cut-off frequency is neglected, as well
as the retardation effects modeled by the Matsubara frequency). Therefore the value of the Coulomb pseudopo-
tential determined from the full Eliashberg equations usually differs from the value of
μ calculated analytically.
The experimental data for the LaH 10 superconductor can be reproduced using Eq. (8) and assuming that
μ = .
0 170
a
and μ = .
0 276
b
.
open access article distributed under the terms of
the Creative Commons Attribution License, which
permits unrestricted use, distribution, and
reproduction in any medium, provided the original
author and source are credited.
conducted in India obtained from ClinicalTrials.gov
is made available in the corresponding Supporting
Information file. Other authors can also access this
information through ClinicalTrials.gov. We obtained
publication data from the Scopus database which
is a proprietary database (www.scopus.com).
Researchers interested in replicating our study can
access data on trial-related publications following
the search procedure described in the paper.
Researchers do not need special privileges to
access the Scopus database, however, a
subscription may be required. The authors did not
have special access privileges to the data.
conducted in India obtained from ClinicalTrials.gov
is made available in the corresponding Supporting
Information file. Other authors can also access this
information through ClinicalTrials.gov. We obtained
publication data from the Scopus database which
is a proprietary database (www.scopus.com).
Researchers interested in replicating our study can
access data on trial-related publications following
the search procedure described in the paper.
Researchers do not need special privileges to
access the Scopus database, however, a
subscription may be required. The authors did not
have special access privileges to the data.
for this work.