---
title: "Semantic Scholar"
output: html_notebook
---
Open Research Corpus
https://api.semanticscholar.org/
https://labs.semanticscholar.org/corpus/
https://github.com/allenai/citeomatic
https://github.com/Santosh-Gupta/Research2Vec
http://www.jaist.ac.jp/event/SCIDOCA/2018/files/Semantic-Scholar-SCIDOCA-2018.pdf
https://open.semanticscholar.org
https://www.springernature.com/gp/researchers/scigraph
https://github.com/allenai/science-parse
http://pdffigures2.allenai.org
http://flourishoa.org
http://babel.eigenfactor.org
http://viziometrics.org/about/
https://www.openacademic.ai
http://eigenfactor.org
http://labs.semanticscholar.org/citeomatic/
---
Example
The Open Research Corpus is a subset of the full Semantic Scholar corpus: papers crawled from the Web and subjected to a number of filters. A sample record:
{
  "id": "4cd223df721b722b1c40689caa52932a41fcc223",
  "title": "Knowledge-rich, computer-assisted composition of Chinese couplets",
  "paperAbstract": "Recent research effort in poem composition has focused on the use of automatic language generation...",
  "entities": [
    "Conformance testing",
    "Natural language generation",
    "Natural language processing",
    "Parallel computing",
    "Stochastic grammar",
    "Web application"
  ],
  "s2Url": "https://semanticscholar.org/paper/4cd223df721b722b1c40689caa52932a41fcc223",
  "s2PdfUrl": "",
  "pdfUrls": [
    "https://doi.org/10.1093/llc/fqu052"
  ],
  "authors": [
    {
      "name": "John Lee",
      "ids": [
        "3362353"
      ]
    },
    "..."
  ],
  "inCitations": [
    "c789e333fdbb963883a0b5c96c648bf36b8cd242"
  ],
  "outCitations": [
    "abe213ed63c426a089bdf4329597137751dbb3a0",
    "..."
  ],
  "year": 2016,
  "venue": "DSH",
  "journalName": "DSH",
  "journalVolume": "31",
  "journalPages": "152-163",
  "sources": [
    "DBLP"
  ],
  "doi": "10.1093/llc/fqu052",
  "doiUrl": "https://doi.org/10.1093/llc/fqu052",
  "pmid": ""
}
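
As a rough illustration of how a record like the one above might be read, here is a minimal Python sketch. It assumes each record arrives as one JSON object in a string; the `summarize` helper and the choice to treat empty strings as missing values are hypothetical conveniences, not part of the corpus documentation. The embedded record is a trimmed copy of the sample, with the elided `"..."` entries left out.

```python
import json

# Trimmed copy of the sample record; elided entries ("...") are omitted.
record_json = """
{
  "id": "4cd223df721b722b1c40689caa52932a41fcc223",
  "title": "Knowledge-rich, computer-assisted composition of Chinese couplets",
  "authors": [{"name": "John Lee", "ids": ["3362353"]}],
  "inCitations": ["c789e333fdbb963883a0b5c96c648bf36b8cd242"],
  "outCitations": ["abe213ed63c426a089bdf4329597137751dbb3a0"],
  "year": 2016,
  "venue": "DSH",
  "doi": "10.1093/llc/fqu052",
  "pmid": ""
}
"""


def summarize(record):
    """Return a small summary dict (hypothetical helper).

    Empty strings such as "pmid": "" are mapped to None.
    """
    return {
        "id": record["id"],
        "title": record["title"],
        "authors": [a["name"] for a in record.get("authors", [])],
        "year": record.get("year"),
        "n_in": len(record.get("inCitations", [])),
        "n_out": len(record.get("outCitations", [])),
        "doi": record.get("doi") or None,
        "pmid": record.get("pmid") or None,
    }


summary = summarize(json.loads(record_json))
print(summary)
```

Using `.get` with a default keeps the sketch tolerant of records that lack optional fields; whether any fields are in fact optional is an assumption here.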
Attributes
id string
S2-generated research paper ID.
title string
Research paper title.
paperAbstract string
Extracted abstract of the paper.
entities list
S2-extracted list of relevant entities or topics.
s2Url string
URL to S2 research paper details page.
s2PdfUrl string
URL to PDF on S2 if available.
pdfUrls list
URLs related to this PDF scraped from the web.
authors list
List of authors, each with a name and an S2-generated author ID.
inCitations list
List of S2 paper IDs that cited this paper.
outCitations list
List of S2 paper IDs that this paper cited.
year int
Year this paper was published, as an integer.
venue string
Extracted publication venue.
journalName string
Name of the journal that published this paper.
journalVolume string
The volume of the journal where this paper was published.
journalPages string
The pages of the journal where this paper was published.
sources list
Identifies papers sourced from DBLP or Medline.
doi string
Digital Object Identifier registered at doi.org.
doiUrl string
DOI link for registered objects.
pmid string
Unique identifier used by PubMed.
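
The attribute list above can be turned into a simple type check, and the two citation fields into directed edges. The following is a sketch under stated assumptions: the mapping from the listed types (string, list, int) to Python types, and the helper names `type_errors` and `citation_edges`, are illustrative choices, not an official schema or API. The sample values are placeholders.

```python
# Attribute names and types as listed above, mapped to Python types (assumed).
EXPECTED_TYPES = {
    "id": str, "title": str, "paperAbstract": str, "entities": list,
    "s2Url": str, "s2PdfUrl": str, "pdfUrls": list, "authors": list,
    "inCitations": list, "outCitations": list, "year": int, "venue": str,
    "journalName": str, "journalVolume": str, "journalPages": str,
    "sources": list, "doi": str, "doiUrl": str, "pmid": str,
}


def type_errors(record):
    """Return (field, problem) pairs for missing or mistyped fields."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in record:
            problems.append((field, "missing"))
            continue
        value = record[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(
                (field, f"expected {expected.__name__}, got {type(value).__name__}")
            )
    return problems


def citation_edges(record):
    """Return (citing, cited) paper ID pairs implied by one record."""
    edges = [(citing, record["id"]) for citing in record.get("inCitations", [])]
    edges += [(record["id"], cited) for cited in record.get("outCitations", [])]
    return edges


# Placeholder record with every attribute present (values are illustrative).
good = {
    "id": "p1", "title": "t", "paperAbstract": "", "entities": [],
    "s2Url": "", "s2PdfUrl": "", "pdfUrls": [], "authors": [],
    "inCitations": ["p0"], "outCitations": ["p2"], "year": 2016,
    "venue": "", "journalName": "", "journalVolume": "", "journalPages": "",
    "sources": [], "doi": "", "doiUrl": "", "pmid": "",
}

# Same record with a string year and no pmid, to show what gets reported.
bad = {k: v for k, v in good.items() if k != "pmid"}
bad["year"] = "2016"

print(type_errors(good))
print(type_errors(bad))
print(citation_edges(good))
```

Collecting the edges from many records would give a directed citation graph keyed on S2 paper IDs.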