ACPAS dataset: Aligned Classical Piano Audio and Score

October 7, 2021


ACPAS is a dataset of aligned audio and scores for classical piano music, containing 497 distinct music scores aligned with 2189 performances, 179.77 hours in total. For each performance, we provide the corresponding performance audio (a real or synthesized recording), performance MIDI, and MIDI score, together with rhythm and key annotations.

Download


The ACPAS dataset is composed of a Real recording subset and a Synthetic subset. The two subsets are published separately on Zenodo; please follow the links below to download.

Real recording subset:

  • metadata_R.csv
  • Download subset

Synthetic subset:

  • metadata_S.csv
  • Download subset

We provide an additional .csv file listing all the distinct pieces covered in this dataset, available at distinct_pieces.csv.

For downloading without the audio recordings: ACPAS dataset (no audio recording)

Description


The ACPAS dataset is created from several source piano datasets: the MAPS dataset [1], the A-MAPS dataset [2], the Classical Piano MIDI (CPM) database, and the ASAP dataset [3]. Additional audio recordings are synthesized from the performance MIDIs using the Native Instruments Kontakt Player.

A detailed description of how this dataset was created and its contents can be found in:

  • Lele Liu, Veronica Morfi and Emmanouil Benetos, "ACPAS: A Dataset of Aligned Classical Piano Audio and Scores for Audio-to-Score Transcription," submitted to ISMIR Late-breaking Demo, 2021.

Content

The Real recording subset covers 578 real-recording performances from:

1. the two real recording subsets ("ENSTDkCl" and "ENSTDkAm") of the MAPS dataset, with corresponding MIDI scores from the A-MAPS dataset;
2. the performances and scores from the ASAP dataset, with audio recordings obtained from the MAESTRO dataset [4].

The Synthetic subset covers 1611 performances with synthetic audio recordings from the following three sources:

1. performances from the MAPS synthetic subsets, with MIDI scores from the A-MAPS dataset;
2. MIDI performances and scores from the ASAP dataset, with audio synthesized from the performance MIDIs using the Native Instruments Kontakt Player;
3. MIDI performances and scores from the CPM database, with audio synthesized from the performance MIDIs using the Native Instruments Kontakt Player.

Statistics

Detailed statistics of the dataset are given below:

Subset          Source  Split       Distinct Pieces  Performances  Duration (hours)
Real recording  MAPS    test        52               59            4.28
                ASAP    train       109              368           32.74
                ASAP    validation  17               49            2.52
                ASAP    test        44               102           9.42
                All     Total       215              578           48.96
Synthetic       --      train       359              1155          94.96
                --      validation  49               135           8.67
                --      test        89               321           27.18
                --      Total       497              1611          130.81
Both            --      train       359              1523          127.70
                --      validation  49               184           11.19
                --      test        89               482           40.88
                --      Total       497              2189          179.78
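As an illustration of how per-split statistics like those above can be derived from the metadata, here is a minimal pandas sketch. The rows below are invented stand-ins for the real CSV contents; only the column names (performance_id, piece_id, duration, split) follow the Metadata section of this page.

```python
import pandas as pd

# Toy stand-in for metadata_R.csv / metadata_S.csv rows; the real files
# carry many more columns (see the Metadata section).
meta = pd.DataFrame(
    {
        "performance_id": ["R_1", "R_2", "R_3", "S_1"],
        "piece_id": [1, 1, 2, 3],
        "duration": [120.0, 130.0, 200.0, 90.0],  # seconds
        "split": ["train", "train", "test", "train"],
    }
)

# Per-split statistics in the style of the table above:
# distinct pieces, number of performances, and total duration in hours.
stats = meta.groupby("split").agg(
    distinct_pieces=("piece_id", "nunique"),
    performances=("performance_id", "count"),
    duration_hours=("duration", lambda s: s.sum() / 3600),
)
print(stats)
```

Note that the same piece can have several performances, so distinct_pieces uses nunique rather than a row count.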

A chart of the number of music pieces by composer (or Christmas song):

[Figure: composer distribution]

Metadata

The dataset metadata is provided in three files:

  • metadata_R.csv provides the metadata for all the performances in the Real recording subset.
  • metadata_S.csv provides the metadata for all the performances in the Synthetic subset.
  • distinct_pieces.csv lists the distinct pieces in this dataset, together with the allocated train/validation/test split.
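Since piece-level information (title, split) lives in distinct_pieces.csv while performance-level information lives in the two metadata files, a typical first step is to join them on piece_id. A minimal pandas sketch, using miniature invented DataFrames in place of the real CSVs and assuming only the column names described on this page:

```python
import pandas as pd

# Miniature stand-ins for the three files; real column sets are
# described in the Metadata section.
distinct_pieces = pd.DataFrame(
    {"piece_id": [1, 2], "title": ["Sonata", "Etude"], "split": ["train", "test"]}
)
metadata_R = pd.DataFrame({"performance_id": ["R_1"], "piece_id": [1]})
metadata_S = pd.DataFrame({"performance_id": ["S_1"], "piece_id": [2]})

# Concatenate the two subsets, then attach piece-level info via piece_id.
performances = pd.concat([metadata_R, metadata_S], ignore_index=True)
merged = performances.merge(distinct_pieces, on="piece_id", how="left")
```

In practice one would replace the inline DataFrames with pd.read_csv(...) on the downloaded files.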

The columns in the two metadata_X.csv files are:

  • performance_id: ID of the performance in this dataset. Performances from the Real recording subset have IDs starting with R_, and those from the Synthetic subset have IDs starting with S_.
  • composer: composer of the music piece.
  • piece_id: ID of the corresponding music piece, in line with the piece ID provided in distinct_pieces.csv.
  • title: title of the music piece, in line with the title in distinct_pieces.csv.
  • source: the source dataset of the performance; can be "MAPS", "ASAP" or "CPM".
  • performance_audio_external: path to the performance audio in the source dataset.
  • performance_MIDI_external: path to the performance MIDI in the source dataset.
  • MIDI_score_external: path to the MIDI score in the source dataset.
  • performance_annotation_external: path to the performance annotation in the source dataset.
  • score_annotation_external: path to the score annotation in the source dataset.
  • folder: folder containing the audio, MIDI, and annotation files.
  • performance_audio: performance audio file.
  • performance_MIDI: performance MIDI file.
  • MIDI_score: MIDI score file.
  • aligned: True if the performance and score are aligned. There are 30 performances that are not aligned with their corresponding scores, due to errors made during the performance.
  • performance_annotation: performance annotation file.
  • score_annotation: score annotation file.
  • duration: duration of the performance in seconds.
  • split: the allocated train/validation/test split.
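As a usage sketch (not part of the dataset itself), the columns above make it easy to select, say, the aligned performances in the train split, and to tell the two subsets apart by the performance_id prefix. The rows below are invented stand-ins for the real CSV contents:

```python
import pandas as pd

# Toy metadata rows; the real metadata_X.csv files carry the full
# column set listed above (paths, annotations, etc.).
meta = pd.DataFrame(
    {
        "performance_id": ["R_1", "S_1", "S_2"],
        "aligned": [True, True, False],
        "split": ["train", "train", "test"],
    }
)

# Keep only performances whose score alignment is valid, in the
# train split (a typical selection for audio-to-score training).
train = meta[(meta["aligned"]) & (meta["split"] == "train")]

# The subset can also be recovered from the performance_id prefix.
is_real = train["performance_id"].str.startswith("R_")
```

When loading the real files with pd.read_csv, it is worth checking that the aligned column was parsed as booleans rather than strings before filtering on it.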

Limitations

Please note that the voice and hand annotations in the scores have not been checked, and we do not recommend using them as ground truth. In addition, due to the limitations of scores in MIDI format, the dataset is not suitable for tasks such as score formatting.

Questions

For any questions, suggestions, or comments, please do not hesitate to contact lele.liu@qmul.ac.uk.


References

[1] V. Emiya et al., "Multipitch Estimation of Piano Sounds Using a New Probabilistic Spectral Smoothness Principle," IEEE Transactions on Audio, Speech and Language Processing, vol. 18, no. 6, pp. 1643-1654, 2010.

[2] A. Ycart and E. Benetos, "A-MAPS: Augmented MAPS Dataset with Rhythm and Key Annotations," in International Society for Music Information Retrieval (ISMIR) Conference, Late Breaking Demo, 2018.

[3] F. Foscarin et al., "ASAP: A Dataset of Aligned Scores and Performances for Piano Transcription," in International Society for Music Information Retrieval (ISMIR) Conference, 2020.

[4] C. Hawthorne et al., "Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset," in International Conference on Learning Representations (ICLR), 2019.