Speeches dataset

To assist researchers in the field of central bank communication, we offer a precompiled dataset containing the content of all speeches together with limited metadata.

We hope that by making this dataset freely available, we will stimulate natural language processing research on the impact of our speeches on the market and beyond.

Download all speeches (CSV file). Last update: 14 July 2020

For licence details, please see the copyright section of our disclaimer.

Frequently asked questions

How often is the dataset updated?

The dataset is currently updated every two months.

How do I cite the dataset?

The dataset can be cited as follows:

European Central Bank. (25 October 2019). Speeches dataset. Retrieved from: https://www.ecb.europa.eu/press/key/html/downloads.en.html.

What is the format of the CSV file?

The CSV file is encoded in UTF-8 (without a BOM). Lines end with CRLF (Windows format). Columns are separated by the vertical bar: “|”.

The following columns are included:

date
The original publication date of the speech on the ECB website. Dates are given in the format YYYY-MM-DD. For example, 3 October 2019 would be written as “2019-10-03”.
speakers
Comma-separated list of speakers. Only ECB Executive Board members’ speeches are included. If a speech, or part of a speech, is given by a speaker who is not an Executive Board member, that speaker’s name is not listed.
title
The title of the speech.
subtitle
The subtitle of the speech. Usually in the format “TYPE by SPEAKER, ROLE, at OCCASION”.
contents
The contents of the speech are given in full including footnotes.
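
As an illustration of the column formats described above, the following sketch parses the date column and splits the speakers column in R. The sample rows, speaker names and variable names are invented for this example and are not taken from the dataset. Whether a space follows the comma between speakers is an assumption, so the names are trimmed in either case.

```r
# Hypothetical sample that mimics the documented layout:
# columns separated by "|", dates written as YYYY-MM-DD.
sample_lines <- c(
  "date|speakers|title|subtitle|contents",
  "2019-10-03|Speaker A,Speaker B|Example title|Speech by Speaker A|Example text",
  "2019-09-12|Speaker C|Another title|Remarks by Speaker C|More example text"
)

speeches <- read.table(
  text = sample_lines,
  sep = "|",
  quote = "",
  comment.char = "",
  header = TRUE,
  stringsAsFactors = FALSE
)

# Convert the date strings into Date objects
speeches$date <- as.Date(speeches$date, format = "%Y-%m-%d")

# Split the comma-separated speakers into one character vector per speech
# and remove any surrounding whitespace
speaker_list <- lapply(strsplit(speeches$speakers, ",", fixed = TRUE), trimws)
```

After this step, speeches$date can be used for sorting or filtering by period, and speaker_list holds the individual speaker names for each row.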
Why do some speeches not have content?

Speeches that were only published as slides do not include any content.
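
Such rows may need to be excluded before text analysis. The sketch below shows one possible way to do this in R. It assumes that a missing content field appears as an empty string, whitespace or NA after loading, which may differ depending on how the file is read. The data frame is a made-up stand-in for the loaded dataset.

```r
# Hypothetical stand-in for the loaded dataset
speeches <- data.frame(
  title = c("Speech with text", "Slides only", "Whitespace only"),
  contents = c("Full text of the speech.", "", "   "),
  stringsAsFactors = FALSE
)

# TRUE where the contents column holds actual text
has_content <- !is.na(speeches$contents) & nzchar(trimws(speeches$contents))

# Keep only the speeches that have text
with_text <- speeches[has_content, ]
```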

Does the dataset include all speeches?

Speeches published on this website after the last update of the dataset are not included. Coverage of speeches given at the time of the European Monetary Institute, before the ECB existed, is incomplete. From that period, the dataset includes a subset of seven speeches by Alexandre Lamfalussy, which are also available on this website under “speeches by date”.

Why are some of the speeches not from the European Central Bank?

Some of the oldest speeches predate the existence of the ECB and were given by the president of the European Monetary Institute.

How do I load the dataset in my favourite app?

Below we provide a snippet of code for loading the dataset in R. It is assumed that you have already saved the dataset on your personal computer.

This snippet is provided solely as inspiration. No support can be provided for this snippet or for any other language or tool.

R

Importing the CSV file in R as a data frame can be done with the following code:

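# file.choose() opens a file dialog: navigate to the dataset,
# select it and click the "Open" button
dataset <- read.table(
  file.choose(),
  sep = "|",
  quote = "",           # quotation marks in the text are not field delimiters
  comment.char = "",    # do not treat "#" in the text as the start of a comment
  fill = TRUE,
  header = TRUE,
  encoding = "UTF-8",
  stringsAsFactors = FALSE
)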