Help With Mass Downloads

• May 23, 2024 - 16:05

Hi all! I'm working on a research project with the University of Virginia to help computers learn to read printed sheet music. For this, we need to create a training dataset, so we want to download a massive number of MuseScore XML and PDF files. Would anyone know of ways to automate this download process?



Unfortunately, I cannot help you with your request.

Are you planning to train only on SVG/vector-based music, which is what MuseScore-generated PDFs normally contain? Or are you also generating "pixel-based" images such as JPG or PNG?

It would be interesting to know what "correct answer" you are planning to use for the training. Is it a MuseScore file, a MusicXML file, or something else?

As I'm sure you know, there are lots of printed music at as well. And a substantial part has been copied/"transcribed" :-) into MuseScore scores. That might be possible to use as training material as well.

I would be surprised if there is a way to automate such a process. You're going to have to wander around the 'Net finding the scores. They're not gathered in any one place where you could automate the download. One exception to that statement: IMSLP. Talking to them would probably be your first step. I believe there are other similar libraries of music; I just don't know where they are.

