VACCINATIONS: how to add new countries or entries to our data #230

Open
edomt opened this issue Dec 25, 2020 · 1,076 comments
Labels
dom:vaccinations Related to COVID-19 vaccination

Comments

@edomt
Collaborator

edomt commented Dec 25, 2020

This is a centralized issue to let our users suggest sources to add new countries, or new entries for existing countries, to our data on COVID-19 vaccinations.

  • We only take into account numbers that are announced by official sources (head of government, ministry of health, public health agencies, public officials in charge of the vaccination campaign, etc.).
  • We only count administered doses, not distributed doses.
  • We will not include participants in the vaccine arm of clinical trials, as this data is not available for many of the hundreds of trials currently taking place.
  • Our current sources are visible in vaccinations/locations.csv.

For countries that are already included in our data, most of our imports are automated and will collect the latest number when we update our dataset. If you don't see the latest number appear in our data, please wait at least 24 hours before suggesting it here.

Note that contributions via pull requests are not possible due to the way our data pipeline is set up.

Emoji system

@lucasrodes and @edomt use emojis to track comments in this issue.

  • 👍 means that the comment has been read and looked into, and the data will be added during our next update (our vaccination dataset is updated each morning, London time)
  • 👀 means that the comment has been read and looked into, but that the data (or some of the data) will not be added. This can be for a number of reasons: because we already have this data, because there is something wrong with the source, because the numbers contradict what other sources are showing, etc. If you have questions about this, you can read our contribution guidelines or open a new issue referencing the comment.

Remarks

  • Please read our "how to contribute" section for more details.
  • This thread is meant to be used only to report new data points. For specific questions or suggestions, please consider opening a new issue. If your comment is not a data report, a new thread may be opened to follow the discussion (see example).
@edomt added the dom:vaccinations (Related to COVID-19 vaccination) label on Dec 25, 2020
@edomt pinned this issue on Dec 25, 2020
@edomt changed the title from "VACCINATIONS: adding new countries and/or entries to our data" to "VACCINATIONS: how to add new countries or entries to our data" on Dec 25, 2020
@EladHeller

EladHeller commented Dec 28, 2020

Minister of Health of Israel:
On December 27, approximately 98,900 people were vaccinated in Israel.
In total, about 379,000 people were vaccinated in Israel.
https://twitter.com/YuliEdelstein/status/1343423578205794305

@artdgn

artdgn commented Dec 30, 2020

Some additional sources might be found at https://www.bloomberg.com/graphics/covid-vaccine-tracker-global-distribution/. It seems to have very similar data, except that Russia is assigned almost an order of magnitude more vaccinations (440K) than in your data (55K).

@edomt
Collaborator Author

edomt commented Dec 30, 2020

@artdgn Thanks! We know about that 440k estimate from Bloomberg for Russia, but we can't identify a source that would confirm this. In fact, we realized a few days ago that the total number of doses administered so far in Russia was much lower than previously thought: https://twitter.com/redouad/status/1343544952052133888

@philiprusinov

Status update for Bulgaria, 30-12-2020
3844 already vaccinated. The information was presented by the Ministry of Health at today's government meeting.
https://www.gov.bg/bg/prestsentar/novini/pravitelstvoto-otpusna-oshte-125-miliona-leva-za-vaksini-sreshtu-COVID%E2%80%9319

@philiprusinov

Hello,
Official statistics for vaccination in Bulgaria are now available. Please check the table on the right on the government COVID portal. It seems this could be automated. So far, 4608 people have been vaccinated.
Best regards,

https://coronavirus.bg/bg/statistika

@edomt
Collaborator Author

edomt commented Dec 31, 2020

Thank you @philiprusinov, that's very useful! We'll be able to automate the collection with this.

@Daavide

Daavide commented Dec 31, 2020

Italy:
https://app.powerbi.com/view?r=eyJrIjoiMzg4YmI5NDQtZDM5ZC00ZTIyLTgxN2MtOTBkMWM4MTUyYTg0IiwidCI6ImFmZDBhNzVjLTg2NzEtNGNjZS05MDYxLTJjYTBkOTJlNDIyZiIsImMiOjh9

Trying to find it in a better format.

@RozaGkliva

Hello,
Thank you for this massive effort.
Regarding sourcing data for Greece, the source you used for vaccinations is a regional newspaper. A more official source would be the periodic announcements from the National Public Health Organization. In particular for Dec 30 they gave the specific numbers in an answer to a journalist's question. The announcement's transcript can be found here: https://eody.gov.gr/enimerosi-20201230/

@edomt
Collaborator Author

edomt commented Jan 1, 2021

Thank you @RozaGkliva!

@tsdgeos

tsdgeos commented Jan 3, 2021

How would you feel about adding stats for Catalonia?

You have England/Scotland/Wales/Northern Ireland in addition to the UK, so I think that maybe adding Catalonia is OK?

Total number is available as text at https://dadescovid.cat/ (the Vacunats field), as a daily number in https://dadescovid.cat/diari (the vacunats column) and as a csv at https://dadescovid.cat/static/csv/catalunya_diari_total_pob.zip

Maybe by adding Catalonia we can convince Spain and the rest of the regions to start publishing vaccination numbers?

@edomt
Collaborator Author

edomt commented Jan 3, 2021

How would you feel about adding stats for Catalonia?

@tsdgeos We've considered the opportunity to add partial data for Spain, but it would likely lead to requests to add many more subnational regions, which we simply don't have the resources to handle right now. My hope is that the Spanish government will soon publish aggregated data for the whole country. If that's still not the case by January 10, we'll probably have to use some manually-aggregated numbers.

@tsdgeos

tsdgeos commented Jan 3, 2021

Can you please explain what's your rationale to include Wales but not Catalonia?

I mean, don't get me wrong, this is your dataset and you can do whatever you want with it, but I sincerely would like to understand why one subnational region gets included and another one doesn't.

@edomt
Collaborator Author

edomt commented Jan 3, 2021

Can you please explain what's your rationale to include Wales but not Catalonia?

@tsdgeos Of course! That's a completely legitimate question. There are 3 main reasons:

  • Timing: we added UK subnational data very early, when everyone was very eager to see vaccination data and the UK was the only country in the world administering the vaccine.
  • Automation: maintaining this data is basically painless as the collection is fully automated via the official API. In many countries, gathering subnational data involves manually collecting data from dashboards on a daily basis. (Catalonia is of course a counter-example since the collection could be automated, but I'm guessing that it's sadly not the case for some of the other 16 autonomous regions.)
  • Size: while there are only 4 nations in the UK, adding subnational data for Spain would mean collecting data for 17 locations, the United States would add 50 locations, etc. This is simply something we can't see ourselves doing with our current resources.

That's not to say that we'll never do this—but national data itself is taking the bulk of our time right now.

@michael404

michael404 commented Jan 4, 2021

Norway:
The director of the Norwegian Institute of Public Health announced "over 2200" persons vaccinated in a press conference on Jan 3rd, as reported here: https://www.tv2.no/nyheter/11869599/

@edomt
Collaborator Author

edomt commented Jan 4, 2021

Thanks @michael404 !

@kevloral

kevloral commented Jan 5, 2021

Regarding the information about vaccination in Spain, an update:

However, I have just seen that they have added a new file in ODS format with the same information as in the PDF, but in a more easily parseable format. It is available here:

https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov/documentos/Informe_Comunicacion.ods

@edomt
Collaborator Author

edomt commented Jan 5, 2021

Thanks @kevloral—you're right that the ODS file will likely become much more usable in the future, but I'm tempted to wait for its next update to see whether the format changes (for example the name of the sheet Hoja3 sounds very temporary).

lucasrodes added a commit that referenced this issue Mar 22, 2022
@aywi
Contributor

aywi commented Mar 25, 2022

China Update on 2022-03-24

Source: http://www.nhc.gov.cn/xcs/s3574/202203/7ae455e5e5db4512a471ab4f9500e8e7.shtml

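……截至3月24日,全国累计报告接种新冠疫苗32亿4359.9万剂次,疫苗接种总人数达12亿7554.1万,已完成全程接种12亿4077.7万人,覆盖人数占全国总人口的90.47%,全程接种人数占全国总人口的88.01%。完成加强免疫接种6亿7127万人,其中序贯加强免疫接种1219.1万人。……
(English: ... As of March 24, a cumulative total of 3,243,599,000 COVID-19 vaccine doses had been reported administered nationwide. A total of 1,275,541,000 people had been vaccinated, of whom 1,240,777,000 had completed the full course; people vaccinated account for 90.47% of the national population and people fully vaccinated for 88.01%. 671,270,000 people had received a booster dose, including 12,191,000 sequential boosters. ...)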

Summary:

location  date        total_vaccinations  people_vaccinated  people_fully_vaccinated  total_boosters
China     2022-02-07  3010669000          1266426000         1228340000               459843000
China     2022-02-18  3075752000          1268180000         1232543000
China     2022-02-25  3114622000          1269302000         1234540000               554728000
China     2022-03-14  3198272000          1272537000         1239171000
China     2022-03-17  3213773000          1273470000         1239570000               644680000
China     2022-03-18  3218716000          1273811000         1239706000               649156000
China     2022-03-21  3230367000          1274734000         1240413000               659200000
China     2022-03-24  3243599000          1275541000         1240777000               671270000

@edomt @lucasrodes

aywi added a commit to aywi/covid-19-data that referenced this issue Mar 26, 2022
Date mismatch for China Update on 2022-03-24. See: owid#230 (comment)
@aywi aywi mentioned this issue Mar 26, 2022
@aywi
Contributor

aywi commented Apr 1, 2022

China Update on 2022-03-31

Source: http://www.nhc.gov.cn/xcs/s3574/202204/ddca56754d524ccf8d4a3c3d1834772b.shtml

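……截至3月31日,全国累计报告接种新冠疫苗32亿7087.4万剂次,接种总人数达12亿7770.9万,已完成全程接种12亿4228.1万人,覆盖人数占全国总人口的90.63%,全程接种人数占全国总人口的88.11%。完成加强免疫接种6亿9493.6万人,其中序贯加强免疫接种1734.4万人。……
(English: ... As of March 31, a cumulative total of 3,270,874,000 COVID-19 vaccine doses had been reported administered nationwide. A total of 1,277,709,000 people had been vaccinated, of whom 1,242,281,000 had completed the full course; people vaccinated account for 90.63% of the national population and people fully vaccinated for 88.11%. 694,936,000 people had received a booster dose, including 17,344,000 sequential boosters. ...)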

Summary:

location  date        total_vaccinations  people_vaccinated  people_fully_vaccinated  total_boosters
China     2022-02-07  3010669000          1266426000         1228340000               459843000
China     2022-02-18  3075752000          1268180000         1232543000
China     2022-02-25  3114622000          1269302000         1234540000               554728000
China     2022-03-14  3198272000          1272537000         1239171000
China     2022-03-17  3213773000          1273470000         1239570000               644680000
China     2022-03-18  3218716000          1273811000         1239706000               649156000
China     2022-03-21  3230367000          1274734000         1240413000               659200000
China     2022-03-24  3243599000          1275541000         1240777000               671270000
China     2022-03-31  3270874000          1277709000         1242281000               694936000

@edomt @lucasrodes

lucasrodes added a commit that referenced this issue Apr 5, 2022
@aywi
Contributor

aywi commented Apr 6, 2022

China Update on 2022-04-05

Source: http://www.nhc.gov.cn/xcs/s3574/202204/fe15261f598d46a6ae4f0bdf0b102532.shtml

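……截至4月5日,全国累计报告接种新冠疫苗32亿8358.6万剂次,接种总人数达12亿7872.4万人,已完成全程接种12亿4322.6万人,覆盖人数占全国总人口的90.70%,全程接种人数占全国总人口的88.18%。完成加强免疫接种7亿569.3万人,其中序贯加强免疫接种1951.2万人。……
(English: ... As of April 5, a cumulative total of 3,283,586,000 COVID-19 vaccine doses had been reported administered nationwide. A total of 1,278,724,000 people had been vaccinated, of whom 1,243,226,000 had completed the full course; people vaccinated account for 90.70% of the national population and people fully vaccinated for 88.18%. 705,693,000 people had received a booster dose, including 19,512,000 sequential boosters. ...)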

Summary:

location  date        total_vaccinations  people_vaccinated  people_fully_vaccinated  total_boosters
China     2022-02-07  3010669000          1266426000         1228340000               459843000
China     2022-02-18  3075752000          1268180000         1232543000
China     2022-02-25  3114622000          1269302000         1234540000               554728000
China     2022-03-14  3198272000          1272537000         1239171000
China     2022-03-17  3213773000          1273470000         1239570000               644680000
China     2022-03-18  3218716000          1273811000         1239706000               649156000
China     2022-03-21  3230367000          1274734000         1240413000               659200000
China     2022-03-24  3243599000          1275541000         1240777000               671270000
China     2022-03-31  3270874000          1277709000         1242281000               694936000
China     2022-04-05  3283586000          1278724000         1243226000               705693000

@edomt @lucasrodes

aywi added a commit to aywi/covid-19-data that referenced this issue Apr 6, 2022
Minor correction for 2022-03-31 (see owid#230 (comment)), and update for 2022-04-05 (see owid#230 (comment)).
aywi added a commit to aywi/covid-19-data that referenced this issue Apr 12, 2022
@aywi
Contributor

aywi commented Apr 12, 2022

It is possible to automatically fetch the full vaccination data for China (which is updated irregularly). The following approach matched the data for 2022-03-24, 2022-03-31, 2022-04-05, and 2022-04-11:

  1. Fetch the new article on the list of http://www.nhc.gov.cn/xcs/s2906/new_list.shtml whose title matches 国务院联防联控机制($yyyy)年($M1)月($d1)日新闻发布会文字实录 and get $url;
  2. Open $url and fetch the following text segments:
  • 截至($M2)月($d2)日,($any_characters)累计报告接种新冠疫苗($vaccinations1)亿($vaccinations2)万剂次
  • 接种总人数达($vaccinated1)亿($vaccinated2)万
  • 已完成全程接种($fully_vaccinated1)亿($fully_vaccinated2)万人,覆盖人数占全国总人口的
  • 完成加强免疫接种($boosters1)亿($boosters2)万人,其中序贯加强免疫接种
  3. Check whether '$M1-$d1' is the next day of '$M2-$d2', and then get $MM and $dd from $M2 and $d2.
  4. Get the final data:
$date = '$yyyy-$MM-$dd'
$source_url = '$url'
$total_vaccinations = int($vaccinations1 * 1e8 + $vaccinations2 * 1e4)
$people_vaccinated = int($vaccinated1 * 1e8 + $vaccinated2 * 1e4)
$people_fully_vaccinated = int($fully_vaccinated1 * 1e8 + $fully_vaccinated2 * 1e4)
$total_boosters = int($boosters1 * 1e8 + $boosters2 * 1e4)

This will only work as long as NHC China does not change the data format.
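
The segment matching, date check, and conversion described above can be sketched in Python as follows. This is only an illustration run against the 2022-04-05 excerpt quoted earlier in this thread: the names `PATTERNS`, `to_int`, and `parse_report` are hypothetical, and `Decimal` is used here as an assumption to avoid floating-point truncation that `int(a * 1e8 + b * 1e4)` could introduce. Fetching the article list is left out.

```python
import re
from datetime import date, timedelta
from decimal import Decimal

# Hypothetical patterns mirroring the text segments listed above;
# each captures the 亿 (1e8) part and the 万 (1e4) part.
PATTERNS = {
    "total_vaccinations": r"累计报告接种新冠疫苗([\d.]+)亿([\d.]+)万剂次",
    "people_vaccinated": r"接种总人数达([\d.]+)亿([\d.]+)万",
    "people_fully_vaccinated": r"已完成全程接种([\d.]+)亿([\d.]+)万人",
    "total_boosters": r"完成加强免疫接种([\d.]+)亿([\d.]+)万人",
}
AS_OF = r"截至(\d{1,2})月(\d{1,2})日"


def to_int(yi: str, wan: str) -> int:
    """Combine the 亿 and 万 parts into an exact integer."""
    return int(Decimal(yi) * 10**8 + Decimal(wan) * 10**4)


def parse_report(text: str, published: date) -> dict:
    """Extract one data row; the data date must be the day before publication."""
    m = re.search(AS_OF, text)
    if m is None:
        raise ValueError("no data date found")
    expected = published - timedelta(days=1)
    if (expected.month, expected.day) != (int(m.group(1)), int(m.group(2))):
        raise ValueError("publication date is not the day after the data date")
    row = {"date": expected.isoformat()}
    for field, pattern in PATTERNS.items():
        found = re.search(pattern, text)
        if found is None:
            raise ValueError(f"segment for {field} not found")
        row[field] = to_int(found.group(1), found.group(2))
    return row


sample = (
    "截至4月5日,全国累计报告接种新冠疫苗32亿8358.6万剂次,"
    "接种总人数达12亿7872.4万人,已完成全程接种12亿4322.6万人,"
    "覆盖人数占全国总人口的90.70%,全程接种人数占全国总人口的88.18%。"
    "完成加强免疫接种7亿569.3万人,其中序贯加强免疫接种1951.2万人。"
)
row = parse_report(sample, date(2022, 4, 6))
print(row)
```

Deriving the data date from the publication date, rather than from the current year, keeps the check valid across a year boundary.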

@aywi
Contributor

aywi commented Apr 13, 2022

It is possible to automatically fetch the full vaccination data for China (which is updated irregularly). The following approach matched the data for 2022-03-24, 2022-03-31, 2022-04-05, and 2022-04-11:

  1. Fetch the new article on the list of http://www.nhc.gov.cn/xcs/s2906/new_list.shtml whose title matches 国务院联防联控机制($yyyy)年($M1)月($d1)日新闻发布会文字实录 and get $url;
  2. Open $url and fetch the following text segments:
  • 截至($M2)月($d2)日,($any_characters)累计报告接种新冠疫苗($vaccinations1)亿($vaccinations2)万剂次
  • 接种总人数达($vaccinated1)亿($vaccinated2)万
  • 已完成全程接种($fully_vaccinated1)亿($fully_vaccinated2)万人,覆盖人数占全国总人口的
  • 完成加强免疫接种($boosters1)亿($boosters2)万人,其中序贯加强免疫接种
  3. Check whether '$M1-$d1' is the next day of '$M2-$d2', and then get $MM and $dd from $M2 and $d2.
  4. Get the final data:
$date = '$yyyy-$MM-$dd'
$source_url = '$url'
$total_vaccinations = int($vaccinations1 * 1e8 + $vaccinations2 * 1e4)
$people_vaccinated = int($vaccinated1 * 1e8 + $vaccinated2 * 1e4)
$people_fully_vaccinated = int($fully_vaccinated1 * 1e8 + $fully_vaccinated2 * 1e4)
$total_boosters = int($boosters1 * 1e8 + $boosters2 * 1e4)

This will only work as long as NHC China does not change the data format.

A much more robust version of regex which works for the latest 16 matched articles:

  • summary(month, day, total_vaccinations, people_fully_vaccinated): 截至(\d{1,2})月(\d{1,2})日.*疫苗([\d\.亿零]+万)剂次.*全程接种的?人数(?:为|.{0,9}达到)([\d\.亿零]+万)人
  • people_vaccinated: (?:接种|疫苗)的?总人数(?:达到?|为)([\d\.亿零]+万)
  • total_boosters: 加强免疫(?:已经)?接种的?是?([\d\.亿零]+万)人

Update on 2022-04-19:
Fix regex for people_vaccinated on 2022-04-18, see #2601
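
As a rough illustration of how the more robust patterns could be used, the Python sketch below applies the people_vaccinated and total_boosters regexes to the 2022-04-05 excerpt quoted earlier in this thread and converts the captured strings to integers. The helper names `cn_to_int` and `extract` are hypothetical, and treating 零 as a filler character that can simply be dropped is an assumption. The summary regex is not exercised, since it targets the press-conference transcript wording and no such sample appears in this thread.

```python
import re
from decimal import Decimal

# Regexes copied from the list above.
PEOPLE_VACCINATED_RE = re.compile(r"(?:接种|疫苗)的?总人数(?:达到?|为)([\d\.亿零]+万)")
TOTAL_BOOSTERS_RE = re.compile(r"加强免疫(?:已经)?接种的?是?([\d\.亿零]+万)人")


def cn_to_int(number: str) -> int:
    """Convert a captured string such as '7亿569.3万' to an integer."""
    body = number.replace("零", "")  # assumption: 零 only pads the number
    if not body.endswith("万"):
        raise ValueError(f"unexpected number format: {number!r}")
    # Split into the optional 亿 (1e8) part and the 万 (1e4) part.
    yi, _, wan = body[:-1].rpartition("亿")
    total = Decimal(wan or "0") * 10**4
    if yi:
        total += Decimal(yi) * 10**8
    return int(total)


def extract(text: str) -> dict:
    """Return the fields whose regex matches the text."""
    result = {}
    for field, regex in (
        ("people_vaccinated", PEOPLE_VACCINATED_RE),
        ("total_boosters", TOTAL_BOOSTERS_RE),
    ):
        match = regex.search(text)
        if match:
            result[field] = cn_to_int(match.group(1))
    return result


sample = (
    "截至4月5日,全国累计报告接种新冠疫苗32亿8358.6万剂次,"
    "接种总人数达12亿7872.4万人,已完成全程接种12亿4322.6万人。"
    "完成加强免疫接种7亿569.3万人,其中序贯加强免疫接种1951.2万人。"
)
figures = extract(sample)
print(figures)
```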

@edomt @lucasrodes

@beansrowning

United Republic of Tanzania data is out of date (3/23).

See latest figures from WHO COVID-19 Dashboard, which is cited in locations.csv


Total Vaccine Doses Administered per 100 population: 8.42
Persons Fully Vaccinated with Last Dose of Primary Series: 5.14

@lucasrodes
Member

lucasrodes commented Apr 27, 2022

@beansrowning We are reporting the numbers from the latest WHO update, which dates back to the 23rd of March:

  • people vaccinated: 3,941,772
  • people fully vaccinated: 3,067,877

The differences that you note occur in per-capita metrics because different population numbers might have been used. You can find more details here: https://ourworldindata.org/covid-vaccinations#frequently-asked-questions
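
To make the point about denominators concrete, here is a small Python sketch. The fully-vaccinated count is the one quoted in this comment; the two population values are hypothetical placeholders chosen for illustration, not the estimates actually used by WHO or by this dataset.

```python
def per_hundred(count: int, population: int) -> float:
    """Return the count per 100 people, rounded to two decimals."""
    return round(100 * count / population, 2)


people_fully_vaccinated = 3_067_877  # figure quoted in this comment

# Hypothetical population estimates, for illustration only.
population_a = 59_700_000
population_b = 61_500_000

rate_a = per_hundred(people_fully_vaccinated, population_a)
rate_b = per_hundred(people_fully_vaccinated, population_b)
print(rate_a, rate_b)
```

With these placeholder populations, the same absolute count gives roughly 5.14 and 4.99 per 100 people, so two sources can agree on the count and still disagree on the per-capita figure.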

@beansrowning

Thanks, @lucasrodes. I'll follow up with them to see that they're updated there.

@aywi
Contributor

aywi commented Apr 29, 2022

New Manual Entry: China Update on 2022-04-28

Source: http://www.nhc.gov.cn/xcs/s3574/202204/96a54177c4f84f418cd5c1e86fe1ca2c.shtml

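……截至4月28日,全国累计报告接种新冠疫苗33亿4071.1万剂次,接种总人数达到12亿8493.5万,已完成全程接种12亿4968.8万人,覆盖人数和全程接种人数分别占全国总人口的91.14%和88.64%。完成加强免疫接种7亿5018.9万人,其中序贯加强免疫接种2996.7万人,……
(English: ... As of April 28, a cumulative total of 3,340,711,000 COVID-19 vaccine doses had been reported administered nationwide. A total of 1,284,935,000 people had been vaccinated, of whom 1,249,688,000 had completed the full course; people vaccinated and people fully vaccinated account for 91.14% and 88.64% of the national population, respectively. 750,189,000 people had received a booster dose, including 29,967,000 sequential boosters, ...)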

Summary:

location  date        total_vaccinations  people_vaccinated  people_fully_vaccinated  total_boosters
China     2022-03-18  3218716000          1273811000         1239706000               649156000
China     2022-03-21  3230367000          1274734000         1240413000               659200000
China     2022-03-24  3243599000          1275541000         1240777000               671270000
China     2022-03-31  3270874000          1277709000         1242281000               694936000
China     2022-04-05  3283586000          1278724000         1243226000               705693000
China     2022-04-11  3300328000          1280156000         1244923000               719324000
China     2022-04-18  3317463000          1282120000         1246769000               732659000
China     2022-04-27  3338555000          1284646000         1249413000               748596000
China     2022-04-28  3340711000          1284935000         1249688000               750189000

The "国务院新闻办公室(20\d{2})年(\d{1,2})月(\d{1,2})日新闻发布会文字实录" transcript still doesn't look like a regular source (it appears less than once a month), so I will not consider adding it to the automation at this point.
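
For reference, the title pattern above could be matched in Python roughly as follows. The function name `title_date` is hypothetical, and the sample title is constructed to fit the pattern rather than copied from the NHC site.

```python
import re
from datetime import date
from typing import Optional

# Title pattern quoted above, with the year, month and day as capture groups.
TITLE_RE = re.compile(
    r"国务院新闻办公室(20\d{2})年(\d{1,2})月(\d{1,2})日新闻发布会文字实录"
)


def title_date(title: str) -> Optional[date]:
    """Return the press-conference date encoded in the title, or None if it does not match."""
    match = TITLE_RE.search(title)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


# Constructed example title, for illustration only.
example = "国务院新闻办公室2022年4月29日新闻发布会文字实录"
parsed = title_date(example)
print(parsed)
```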

I have added this to #2621.

@edomt @lucasrodes

@dstaermans

Vaccination data for Latvia is no longer updated. I have reached out to our local authority, and it turns out there is a new data table and the old one will no longer be updated.
The old one: https://data.gov.lv/dati/eng/dataset/covid19-vakcinacijas/resource/9320d913-a4a2-4172-b521-73e58c2cfe83
The new one: https://data.gov.lv/dati/eng/dataset/covid19-vakcinacijas/resource/bd59f38b-3698-45d8-ae63-19d15c2640ec

I attach the official response (in Latvian) to this comment.

Could you please consider changing the data source?

Thank you!

Latvia_vaccinations_new_data

@CKingX

CKingX commented Dec 21, 2022

Oops, sorry, I created separate issues. For India, we can get doses administered by vaccine manufacturer here: https://dashboard.cowin.gov.in/

It might be harder to script, however, as the data shows up as an interactive graph where you have to hover to see the values.
You can even cross out some entries to get numbers for smaller vaccine manufacturers.

For Bangladesh, that data is here: http://dashboard.dghs.gov.bd/webportal/pages/covid19-vaccination-update.php

Finally, for Thailand, the data is here (though it tends to round): https://dashboard-vaccine.moph.go.th

These 3 data sources make up 20.4% of total administered doses, with India alone making up 16.7%! This brings the share of administered doses covered by manufacturer-level data from 18.4% to 38.8%.

