October 7, 2021
ACPAS is a dataset of aligned audio and scores for classical piano music. It contains 497 distinct music scores aligned with 2189 performances, totalling 179.78 hours. For each performance, we provide the corresponding performance audio (real or synthesized recording), performance MIDI, and MIDI score, together with rhythm and key annotations.
The ACPAS dataset is composed of a Real recording subset and a Synthetic subset. The two subsets are published separately on Zenodo; please follow the links below to download them.
Real recording subset:
Synthetic subset:
We provide an additional .csv file listing all the distinct pieces covered in this dataset, available at distinct_pieces.csv.
For downloading without the audio recordings: ACPAS dataset (no audio recording)
The ACPAS dataset is created from several source piano datasets: the MAPS dataset [1], the A-MAPS dataset [2], the Classical Piano MIDI (CPM) database, and the ASAP dataset [3]. Additional audio recordings are synthesized from the performance MIDIs using the Native Instruments Kontakt Player.
A detailed description of how this dataset is created and its contents can be found at:
The real recording subset covers 578 real recording performances from the following sources:
The synthetic subset covers 1611 performances with synthetic audio recordings from the following three sources:
Detailed statistics of the dataset are given below:
| Subset | Source | Split | Distinct Pieces | Performances | Duration (hours) |
|---|---|---|---|---|---|
| Real recording | MAPS | test | 52 | 59 | 4.28 |
| | ASAP | train | 109 | 368 | 32.74 |
| | ASAP | validation | 17 | 49 | 2.52 |
| | ASAP | test | 44 | 102 | 9.42 |
| | All | Total | 215 | 578 | 48.96 |
| Synthetic | -- | train | 359 | 1155 | 94.96 |
| | -- | validation | 49 | 135 | 8.67 |
| | -- | test | 89 | 321 | 27.18 |
| | -- | Total | 497 | 1611 | 130.81 |
| Both | -- | train | 359 | 1523 | 127.70 |
| | -- | validation | 49 | 184 | 11.19 |
| | -- | test | 89 | 482 | 40.88 |
| | -- | Total | 497 | 2189 | 179.78 |
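The per-split counts and durations above can be reproduced directly from the metadata files. The sketch below assumes pandas and uses a small hypothetical DataFrame in place of the real `metadata_X.csv` contents; the column names (`source`, `split`, `duration`) are the ones documented in the metadata section below, but the rows here are invented for illustration.

```python
import pandas as pd

# Toy rows standing in for the real metadata_X.csv contents (hypothetical data).
rows = [
    {"source": "MAPS", "split": "test", "duration": 120.0},
    {"source": "ASAP", "split": "train", "duration": 300.0},
    {"source": "ASAP", "split": "train", "duration": 180.0},
    {"source": "ASAP", "split": "validation", "duration": 90.0},
]
meta = pd.DataFrame(rows)

# Number of performances and total duration (in hours) per split,
# mirroring the "Performances" and "Duration (hours)" columns of the table.
stats = meta.groupby("split")["duration"].agg(
    performances="count",
    hours=lambda s: s.sum() / 3600,
)
print(stats)
```

With the real files, replacing the toy DataFrame with `pd.read_csv("metadata_R.csv")` (or `metadata_S.csv`) should recover the statistics reported in the table.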
A chart of the number of music pieces by composer (or Christmas songs):
The dataset metadata is provided in three files:

- `metadata_R.csv` provides the metadata for all performances in the Real recording subset.
- `metadata_S.csv` provides the metadata for all performances in the Synthetic subset.
- `distinct_pieces.csv` lists the distinct pieces in this dataset, together with the allocated train/validation/test split.

The parameters in the two `metadata_X.csv` files are:

- `performance_id`: ID of the performance in this dataset. Performances from the Real recording subset have IDs starting with `R_`, and those from the Synthetic subset have IDs starting with `S_`.
- `composer`: composer of the music piece.
- `piece_id`: ID of the corresponding music piece, in line with the piece ID provided in `distinct_pieces.csv`.
- `title`: title of the music piece, in line with the title in `distinct_pieces.csv`.
- `source`: source dataset of the performance; one of "MAPS", "ASAP", or "CPM".
- `performance_audio_external`: path to the performance audio in the source dataset.
- `performance_MIDI_external`: path to the performance MIDI in the source dataset.
- `MIDI_score_external`: path to the MIDI score in the source dataset.
- `performance_annotation_external`: path to the performance annotation in the source dataset.
- `score_annotation_external`: path to the score annotation in the source dataset.
- `folder`: folder containing the audio, MIDI, and annotation files.
- `performance_audio`: performance audio file.
- `performance_MIDI`: performance MIDI file.
- `MIDI_score`: MIDI score file.
- `aligned`: True if the performance and score are aligned. 30 performances are not aligned with their corresponding scores because of errors made during the performance.
- `performance_annotation`: performance annotation file.
- `score_annotation`: score annotation file.
- `duration`: duration of the performance in seconds.
- `split`: train/validation/test split.

Please note that the voice and hand annotations in the scores are not checked, and we do not suggest using them as ground truth. In addition, due to the limitations of scores in MIDI format, the dataset is not suitable for tasks such as score formatting.
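As a usage sketch, the metadata can be filtered to select, say, the aligned training performances. This assumes pandas; the toy DataFrame below stands in for `pd.read_csv("metadata_R.csv")`, and its rows are invented for illustration (note that when reading the real CSV, the `aligned` column may arrive as the strings "True"/"False" rather than booleans, depending on how the file was written).

```python
import pandas as pd

# Hypothetical rows using the documented columns performance_id, aligned, split.
meta = pd.DataFrame([
    {"performance_id": "R_001", "aligned": True, "split": "train"},
    {"performance_id": "R_002", "aligned": False, "split": "train"},
    {"performance_id": "R_003", "aligned": True, "split": "test"},
])

# Keep only performances that are in the train split AND aligned with their score.
train = meta[(meta["split"] == "train") & (meta["aligned"])]
print(train["performance_id"].tolist())
```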
For any questions, suggestions, or comments, please do not hesitate to contact lele.liu@qmul.ac.uk.
References
[1] V. Emiya et al., "Multipitch Estimation of Piano Sounds Using a New Probabilistic Spectral Smoothness Principle," IEEE Transactions on Audio, Speech and Language Processing, vol. 18, no. 6, pp. 1643-1654, 2010.
[2] A. Ycart and E. Benetos, "A-MAPS: Augmented MAPS Dataset with Rhythm and Key Annotations," in International Society for Music Information Retrieval (ISMIR) Conference, Late Breaking Demo, 2018.
[3] F. Foscarin et al., "ASAP: A Dataset of Aligned Scores and Performances for Piano Transcription," in International Society for Music Information Retrieval (ISMIR) Conference, 2020.
[4] C. Hawthorne et al., "Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset," in International Conference on Learning Representations (ICLR), 2019.