and it is found to have high accuracy. It uses image-based spectral information to train the network and to understand the frequency changes, the evolution of crests and troughs, and the cycle of the waves. In this paper, however, we use MuseGAN, another GAN-based approach to music generation. This approach has a more complex structure: the model is trained on data from the Lakh Pianoroll Dataset to generate pop song phrases consisting of bass, drums, guitar, piano and strings tracks.

III. METHODOLOGY

GAN (Generative Adversarial Network):
The GAN consists of two neural networks, the Generator and the Discriminator, which together define the overall structure of the GAN. Their structures and functions are entirely different, but the two are interlinked: each depends on the other for training, as shown in Fig 1.

Generator:
The Generator produces the required output. It receives an input noise vector, passes it through a series of neural network layers, and produces values based on that input vector.

Discriminator:
The Discriminator guides the Generator toward producing output that resembles the original data. The values produced by the Generator are passed to the Discriminator, which is trained on both the ground-truth values and the generated (fake) values. By learning the difference between real and fake samples, the Discriminator provides the signal used to update the weights of the Generator.

MUSEGAN:

a) Jamming Model:
As shown in Fig 2(a), multiple generators work independently, and each generates the music of its own track from a private random vector z_i, i = 1, 2, ..., M, where M denotes the number of generators (or tracks). Therefore, generating M tracks requires M generators and M discriminators.

Fig 2(a): Jamming Model

b) Composer Model:
One single generator creates a multichannel piano-roll, with each channel representing a specific track, as shown in Fig 2(b). Hence the entire sequence is generated by a single generator and evaluated by a single discriminator.

Fig 2(b): Composer Model

c) Hybrid Model:
Combining the ideas of jamming and composing, we further propose the hybrid model. As illustrated in Fig 2(c), each of the M generators takes as inputs an inter-track random vector z and an intra-track random vector z_i. This model relies on a single discriminator, which evaluates the tracks produced by all of the generators.
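The three MuseGAN generator arrangements (jamming, composer and hybrid) differ mainly in which random vectors each generator receives. The sketch below illustrates only that difference and is not the MuseGAN implementation. The latent size `LATENT_DIM`, the standard normal sampling, the helper names, and the concatenation of z with z_i in the hybrid case are all assumptions made for illustration.

```python
import random

M = 5           # number of tracks: bass, drums, guitar, piano, strings
LATENT_DIM = 4  # hypothetical latent size, chosen only for illustration


def sample_vector(rng, dim=LATENT_DIM):
    """Draw one latent vector (assumed here to be standard normal)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def jamming_inputs(rng, num_tracks=M):
    """Jamming: generator i receives its own private vector z_i."""
    return [sample_vector(rng) for _ in range(num_tracks)]


def composer_inputs(rng):
    """Composer: a single generator receives one vector z for all tracks."""
    return [sample_vector(rng)]


def hybrid_inputs(rng, num_tracks=M):
    """Hybrid: generator i receives the shared inter-track vector z together
    with its own intra-track vector z_i (joined by concatenation, an assumption)."""
    z = sample_vector(rng)
    return [z + sample_vector(rng) for _ in range(num_tracks)]


rng = random.Random(0)
jam = jamming_inputs(rng)
comp = composer_inputs(rng)
hyb = hybrid_inputs(rng)

# Number of generator inputs per arrangement, then the size of one hybrid input.
print(len(jam), len(comp), len(hyb))
print(len(hyb[0]))
```

In the hybrid case, the first `LATENT_DIM` entries are identical across all generators and carry the shared inter-track information, while the remaining entries differ per track.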
DATASET:
The piano-roll dataset we use in this work is derived from the Lakh MIDI dataset (LMD) (Raffel 2016), a large collection of 176,581 unique MIDI files. We convert the MIDI files to multi-track piano-rolls. For each bar, we set the height to 128 and the width (time resolution) to 96 in order to model common temporal patterns such as triplets and 16th notes. We use the Python library pretty_midi (Raffel and Ellis 2014) to parse and process the MIDI files. The resulting piano-rolls consist only of binary values. Hence our final output is also a binary piano-roll, which has to be converted back into a MIDI file.

IV. EXPERIMENTAL RESULTS

We started from a pretrained model, importing its weights and model files. Using this pretrained model, we generated several samples of music. The resulting music was grouped according to the binarization strategy used, Bernoulli sampling or hard thresholding. The generated music was found to resemble human-composed music. Fig 4(a), (b), (c) and (d) show the piano-roll representation of the music generated by MuseGAN.

Fig 4(a): Piano-roll representation with hard thresholding

A web application was developed in which the user clicks a generate button to generate music. It plays the music generated by the GAN, so the user can listen to music produced by the machine. The Flask framework was used for this purpose.

V. CONCLUSION

The objective metrics and the subjective user study show that the proposed models can start to learn something about music. Although musically and aesthetically the results may still fall behind the level of human musicians, the proposed model has a few desirable properties, and we hope follow-up research can further improve it.

We have produced music without any prior knowledge of music or instruments, which is the appeal of neural networks: we did not learn how to generate music ourselves, and the whole task was carried out by the network. We have found that machines, and not only humans, can be trained to compose music, and that a machine can do so within a few minutes. This is a major advantage of neural networks, since they can produce music to our liking. This work can be extended to generate music based on genre. Music could also be generated from the songs in a user's playlists, so that the GAN produces music matching the user's taste.

VI. ACKNOWLEDGMENT
REFERENCE