Simple LSTM Music Generator
@ -1,28 +1,32 @@
This is pdfTeX, Version 3.14159265-2.6-1.40.19 (TeX Live 2018/W32TeX) (preloaded format=pdflatex 2019.2.21) 28 MAY 2019 12:32
This is pdfTeX, Version 3.14159265-2.6-1.40.19 (TeX Live 2018/W32TeX) (preloaded format=pdflatex 2019.2.16) 13 JUN 2019 10:04
entering extended mode
restricted \write18 enabled.
file:line:error style messages enabled.
%&-line parsing enabled.
LaTeX2e <2018-12-01>
Document Class: article 2018/09/03 v1.4i Standard LaTeX document class
Document Class: mwbk 2017/05/13 v0.75 A LaTeX document class (MW)
*** Beta version. Formatting may change
File: size10.clo 2018/09/03 v1.4i Standard LaTeX file (size option)
*** in future versions of this class.
File: mwbk12.clo 2017/05/13 v0.75 A document class size option (MW)
) (c:/software/latex/texmf-dist/tex/latex/polski/polski.sty
Package: polski 2017/05/04 v1.3.4 Polish language package
Switching to Polish text encoding and Polish maths fonts.
@ -32,8 +36,6 @@ Now handling font encoding OT4 ...
... no UTF-8 mapping file for font encoding OT4
LaTeX Font Info: Try loading font information for OT4+cmr on input line 360.
File: ot4cmr.fd 2008/02/24 v1.2.1 Font defs for fonts PL (MW)
@ -51,7 +53,6 @@ LaTeX Font Info: Overwriting math alphabet `\mathit' in version `bold'
(Font) OT1/cmr/bx/it --> OT4/cmr/bx/it on input line 360.
LaTeX Font Info: Encoding `OT1' has changed to `OT4' for symbol font
(Font) `operators' in the math version `normal' on input line 360.
LaTeX Font Info: Overwriting symbol font `operators' in version `normal'
(Font) OT1/cmr/m/n --> OT4/cmr/m/n on input line 360.
LaTeX Font Info: Overwriting symbol font `letters' in version `normal'
@ -68,84 +69,294 @@ LaTeX Font Info: Overwriting symbol font `letters' in version `bold'
(Font) OML/cmm/b/it --> OML/plm/b/it on input line 360.
LaTeX Font Info: Overwriting symbol font `symbols' in version `bold'
(Font) OMS/cmsy/b/n --> OMS/plsy/b/n on input line 360.
) (c:/software/latex/texmf-dist/tex/latex/base/inputenc.sty
Package: inputenc 2018/08/11 v1.3c Input encoding file
) (c:/software/latex/texmf-dist/tex/latex/amsmath/amsmath.sty
Package: amsmath 2018/12/01 v2.17b AMS math features
For additional information on amsmath, use the `?' option.
Package: amstext 2000/06/29 v2.01 AMS text
File: amsgen.sty 1999/11/30 v2.0 generic functions
)) (c:/software/latex/texmf-dist/tex/latex/amsmath/amsbsy.sty
Package: amsbsy 1999/11/29 v1.2d Bold Symbols
) (c:/software/latex/texmf-dist/tex/latex/amsmath/amsopn.sty
Package: amsopn 2016/03/08 v2.02 operator names
LaTeX Info: Redefining \frac on input line 223.
LaTeX Info: Redefining \overline on input line 385.
LaTeX Info: Redefining \ldots on input line 482.
LaTeX Info: Redefining \dots on input line 485.
LaTeX Info: Redefining \cdots on input line 606.
LaTeX Font Info: Redeclaring font encoding OML on input line 729.
LaTeX Font Info: Redeclaring font encoding OMS on input line 730.
LaTeX Info: Redefining \[ on input line 2844.
LaTeX Info: Redefining \] on input line 2845.
) (c:/software/latex/texmf-dist/tex/latex/amsfonts/amsfonts.sty
Package: amsfonts 2013/01/14 v3.01 Basic AMSFonts support
LaTeX Font Info: Overwriting math alphabet `\mathfrak' in version `bold'
(Font) U/euf/m/n --> U/euf/b/n on input line 106.
) (c:/software/latex/texmf-dist/tex/latex/base/makeidx.sty
Package: makeidx 2014/09/29 v1.0m Standard LaTeX package
) (c:/software/latex/texmf-dist/tex/latex/graphics/graphicx.sty
Package: graphicx 2017/06/01 v1.1a Enhanced LaTeX Graphics (DPC,SPQR)
Package: keyval 2014/10/28 v1.15 key=value parser (DPC)
) (c:/software/latex/texmf-dist/tex/latex/graphics/graphics.sty
Package: graphics 2017/06/25 v1.2c Standard LaTeX Graphics (DPC,SPQR)
Package: trig 2016/01/03 v1.10 sin cos tan (DPC)
) (c:/software/latex/texmf-dist/tex/latex/graphics-cfg/graphics.cfg
File: graphics.cfg 2016/06/04 v1.11 sample graphics configuration
Package graphics Info: Driver file: pdftex.def on input line 99.
File: pdftex.def 2018/01/08 v1.0l Graphics/color driver for pdftex
) (c:/software/latex/texmf-dist/tex/latex/fancyhdr/fancyhdr.sty
Package: fancyhdr 2019/01/31 v3.10 Extensive control of page headers and footers
) (./chapter-style.sty (c:/software/latex/texmf-dist/tex/latex/graphics/color.sty
Package: color 2016/07/10 v1.1e Standard LaTeX Color (DPC)
File: color.cfg 2016/01/02 v1.6 sample color configuration
Package color Info: Driver file: pdftex.def on input line 147.
\openout3 = `document.idx'.
LaTeX Warning: Unused global option(s):
Writing index file document.idx
LaTeX Warning: Label `' multiply defined.
LaTeX Warning: Label `' multiply defined.
\openout1 = `document.aux'.
LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 18.
LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 42.
LaTeX Font Info: ... okay on input line 18.
LaTeX Font Info: ... okay on input line 42.
LaTeX Font Info: Checking defaults for T1/cmr/m/n on input line 18.
LaTeX Font Info: Checking defaults for T1/cmr/m/n on input line 42.
LaTeX Font Info: ... okay on input line 18.
LaTeX Font Info: ... okay on input line 42.
LaTeX Font Info: Checking defaults for OT1/cmr/m/n on input line 18.
LaTeX Font Info: Checking defaults for OT1/cmr/m/n on input line 42.
LaTeX Font Info: ... okay on input line 18.
LaTeX Font Info: ... okay on input line 42.
LaTeX Font Info: Checking defaults for OMS/cmsy/m/n on input line 18.
LaTeX Font Info: Checking defaults for OMS/cmsy/m/n on input line 42.
LaTeX Font Info: ... okay on input line 18.
LaTeX Font Info: ... okay on input line 42.
LaTeX Font Info: Checking defaults for OMX/cmex/m/n on input line 18.
LaTeX Font Info: Checking defaults for OMX/cmex/m/n on input line 42.
LaTeX Font Info: ... okay on input line 18.
LaTeX Font Info: ... okay on input line 42.
LaTeX Font Info: Checking defaults for U/cmr/m/n on input line 18.
LaTeX Font Info: Checking defaults for U/cmr/m/n on input line 42.
LaTeX Font Info: ... okay on input line 18.
LaTeX Font Info: ... okay on input line 42.
LaTeX Font Info: Checking defaults for OT4/cmr/m/n on input line 18.
LaTeX Font Info: Checking defaults for OT4/cmr/m/n on input line 42.
LaTeX Font Info: ... okay on input line 18.
LaTeX Font Info: ... okay on input line 42.
LaTeX Font Info: Try loading font information for OML+plm on input line 20.
[Loading MPS to PDF converter (version 2006.09.02).]
) (c:/software/latex/texmf-dist/tex/latex/oberdiek/epstopdf-base.sty
Package: epstopdf-base 2016/05/15 v2.6 Base part for package epstopdf
Package: infwarerr 2016/05/16 v1.4 Providing info/warning/error messages (HO)
) (c:/software/latex/texmf-dist/tex/latex/oberdiek/grfext.sty
Package: grfext 2016/05/16 v1.2 Manage graphics extensions (HO)
Package: kvdefinekeys 2016/05/16 v1.4 Define keys (HO)
Package: ltxcmds 2016/05/16 v1.23 LaTeX kernel commands for general use (HO)
))) (c:/software/latex/texmf-dist/tex/latex/oberdiek/kvoptions.sty
Package: kvoptions 2016/05/16 v3.12 Key value format for package options (HO)
Package: kvsetkeys 2016/05/16 v1.17 Key value parser (HO)
Package: etexcmds 2016/05/16 v1.6 Avoid name clashes with e-TeX commands (HO)
Package: ifluatex 2016/05/16 v1.4 Provides the ifluatex switch (HO)
Package ifluatex Info: LuaTeX not detected.
Package etexcmds Info: Could not find \expanded.
(etexcmds) That can mean that you are not using pdfTeX 1.50 or
(etexcmds) that some package has redefined \expanded.
(etexcmds) In the latter case, load this package earlier.
))) (c:/software/latex/texmf-dist/tex/generic/oberdiek/pdftexcmds.sty
Package: pdftexcmds 2018/09/10 v0.29 Utility functions of pdfTeX for LuaTeX (HO)
Package: ifpdf 2018/09/07 v3.3 Provides the ifpdf switch
Package pdftexcmds Info: LuaTeX not detected.
Package pdftexcmds Info: \pdf@primitive is available.
Package pdftexcmds Info: \pdf@ifprimitive is available.
Package pdftexcmds Info: \pdfdraftmode found.
Package epstopdf-base Info: Redefining graphics rule for `.eps' on input line 438.
Package grfext Info: Graphics extension search list:
(grfext) [.pdf,.png,.jpg,.mps,.jpeg,.jbig2,.jb2,.PDF,.PNG,.JPG,.JPEG,.JBIG2,.JB2,.eps]
(grfext) \AppendGraphicsExtensions on input line 456.
File: epstopdf-sys.cfg 2010/07/13 v1.3 Configuration of (r)epstopdf for TeX Live
Underfull \hbox (badness 10000) in paragraph at lines 83--84
Underfull \hbox (badness 10000) in paragraph at lines 85--86
{c:/software/latex/texmf-var/fonts/map/pdftex/updmap/}] [2
LaTeX Font Info: Try loading font information for OML+plm on input line 123.
File: omlplm.fd 2008/02/24 v1.2.1 Font defs for fonts PL (MW)
LaTeX Font Info: Try loading font information for OMS+plsy on input line 20.
LaTeX Font Info: Try loading font information for OMS+plsy on input line 123.
File: omsplsy.fd 2008/02/24 v1.2.1 Font defs for fonts PL (MW)
LaTeX Font Info: Try loading font information for OMX+plex on input line 20.
LaTeX Font Info: Try loading font information for OMX+plex on input line 123.
File: omxplex.fd 2008/02/24 v1.2.1 Font defs for fonts PL (MW)
LaTeX Font Info: External font `plex10' loaded for size
LaTeX Font Info: External font `plex10' loaded for size
(Font) <12> on input line 20.
(Font) <12> on input line 123.
LaTeX Font Info: External font `plex10' loaded for size
LaTeX Font Info: External font `plex10' loaded for size
(Font) <8> on input line 20.
(Font) <8> on input line 123.
LaTeX Font Info: External font `plex10' loaded for size
LaTeX Font Info: External font `plex10' loaded for size
(Font) <6> on input line 20.
(Font) <6> on input line 123.
LaTeX Font Info: Try loading font information for U+msa on input line 123.
{c:/software/latex/texmf-var/fonts/map/pdftex/updmap/}] (./document.t
File: umsa.fd 2013/01/14 v3.01 AMS symbols A
LaTeX Font Info: External font `plex10' loaded for size
(Font) <10> on input line 2.
LaTeX Font Info: External font `plex10' loaded for size
(Font) <7> on input line 2.
LaTeX Font Info: External font `plex10' loaded for size
(Font) <5> on input line 2.
LaTeX Font Info: Try loading font information for U+msb on input line 123.
\openout3 = `document.toc'.
File: umsb.fd 2013/01/14 v3.01 AMS symbols B
LaTeX Font Info: External font `plex10' loaded for size
(Font) <10.95> on input line 131.
[3] [4
[1] [2]
[3] (./document.aux) )
] (./document.toc)
\openout4 = `document.toc'.
] [6
] [7] [8
] [9] [10
Overfull \vbox (15.90912pt too high) detected at line 155
Rozdzia\PlPrIeC {\l } 1.
[11] [12]
Overfull \vbox (16.08192pt too high) detected at line 174
Rozdzia\PlPrIeC {\l } 2.
] [14] [15] [16
Overfull \vbox (16.08192pt too high) detected at line 205
Rozdzia\PlPrIeC {\l } 3.
[17] [18] [19
] [20] [21
] [22
] (./document.ind) [23] (./document.aux)
LaTeX Warning: There were multiply-defined labels.
Here is how much of TeX's memory you used:
474 strings out of 492616
2757 strings out of 492616
5750 string characters out of 6131816
37267 string characters out of 6131816
66703 words of memory out of 5000000
104328 words of memory out of 5000000
4428 multiletter control sequences out of 15000+600000
6598 multiletter control sequences out of 15000+600000
13779 words of font info for 39 fonts, out of 8000000 for 9000
15784 words of font info for 47 fonts, out of 8000000 for 9000
1141 hyphenation exceptions out of 8191
23i,7n,25p,379b,252s stack positions out of 5000i,500n,10000p,200000b,80000s
41i,11n,43p,781b,258s stack positions out of 5000i,500n,10000p,200000b,80000s
Output written on document.pdf (23 pages, 96674 bytes).
Output written on document.pdf (4 pages, 61011 bytes).
PDF statistics:
42 PDF objects out of 1000 (max. 8388607)
114 PDF objects out of 1000 (max. 8388607)
29 compressed objects within 1 object stream
80 compressed objects within 1 object stream
0 named destinations out of 1000 (max. 500000)
1 words of extra memory for PDF output out of 10000 (max. 10000000)
@ -1,80 +1,247 @@
\textheight 21.1 cm
Generowanie muzyki \\
przy pomocy głębokiego uczenia \\
\large Music generation with deep learning}
\voffset = 1.2 cm
\textwidth 14 cm
Cezary Pukownik \\
\small Supervisor:\\
dr hab. Tomasz Górecki}
\hoffset = -0.5 cm
\oddsidemargin = 1.4 cm
%\renewcommand{\sectionmark}[1]{\markright{\thesection\ #1}}
\fancyhf{} \fancyhead[LE,RO]{\small\thepage}
\fancyheadoffset[LO]{0 cm}
\fancyheadoffset[RE]{0 cm}
% Title page
\vglue 0.1 cm
\vglue 2.1 cm
{\LARGE \bf Cezary Adam Pukownik}
\vglue 1cm
{\large Field of study: Data analysis and processing}
{\large Specialization: Machine learning}
{\large Student ID number: 444337}
{\Huge \bf Generowanie muzyki \\[4pt] przy pomocy głębokiego uczenia\\}
{\large \bf Music generation with deep learning\\}
\hspace{7.5cm}{Bachelor's thesis}\\[-12pt]
\hspace{7.5cm}{written under the supervision of}\\[-12pt]
\hspace{7.5cm}{dr hab. Tomasz Górecki}
\textsc{POZNAŃ 2020}
% End of title page
\clearpage \thispagestyle{empty} \cleardoublepage
% Declaration
Poznań, date .....................
\vglue 2.4 cm
\vglue 1.2 cm
I, the undersigned Cezary Pukownik, a student of the Faculty of Mathematics and Computer Science of Adam Mickiewicz University in Poznań, declare that I wrote the submitted diploma thesis entitled "Generowanie muzyki przy pomocy głębokiego uczenia" on my own. This means that in writing the thesis, apart from the necessary consultations, I did not use the help of other persons, and in particular I did not commission other persons to prepare the thesis or any part of it, nor did I copy the thesis or any part of it from other persons.
I also declare that the printed copy of the diploma thesis is fully consistent with the electronic copy.
At the same time, I acknowledge that claiming authorship, in a diploma thesis, of a substantial part or other elements of someone else's work or scientific finding constitutes grounds for invalidating the proceedings for the award of the professional title.
\noindent $[YES]^{\star}$ - I consent to my thesis being made available in the reading room of the UAM Archive
\noindent $[YES]^{\star}$ - I consent to my thesis being made available to the extent necessary to protect my right of authorship or the rights of third parties
\vglue 1.2 cm
\noindent{\small $^{\star}$Enter YES to consent to the thesis being made available in the reading room of the UAM Archive, or NO to withhold consent. Leaving the field blank means that consent is not given.}
\vglue 2 cm
\hglue 6cm ............................................................
% End of declaration
\clearpage \thispagestyle{empty} \cleardoublepage
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut fermentum lorem libero. Duis a magna arcu. Nam sit amet porta odio. Cras sit amet euismod elit. Etiam a turpis eget magna pharetra malesuada. Vivamus accumsan leo eget turpis efficitur, non interdum tortor pretium. Maecenas at massa nec elit imperdiet sagittis. Maecenas pellentesque libero et risus aliquam consectetur.
Proin ac dui orci. Cras nec elit eleifend lacus eleifend gravida. Ut placerat lacinia dolor non viverra. Curabitur rhoncus sit amet nibh sed malesuada. Integer iaculis eros venenatis, tempor enim non, sollicitudin sapien. Vestibulum eu scelerisque erat. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla at felis massa. Ut est arcu, rhoncus ac tincidunt vel, consequat eu sem. Aliquam neque orci, lacinia molestie enim fermentum, ullamcorper congue mauris. Phasellus pellentesque, ante nec ultricies porta, erat erat placerat ante, vitae vehicula ipsum enim id ante. Donec malesuada tortor id ornare mattis. Nulla nec augue at augue dictum aliquet.
Machine learning has gained considerable popularity in recent years. The applications and capabilities of machine learning algorithms sometimes exceed our expectations of what a computer can do, and some applications genuinely surprise their users. They include systems that forecast stock prices, recognize objects in video in real time, or even drive a car. Trained algorithms offer us personalized advertisements or products based on our preferences. The most common applications concern image or text processing, whereas applications in music processing are niche and rarely encountered.
\section{Applications of machine learning in music}
Among the most developed applications of machine learning in music are the track recommendation algorithms of streaming services such as Spotify and Tidal. These algorithms find tracks that are similar to one another and recommend them to us based on our preferences.
\section{Generative models}
Generative networks are among the newest neural network models. They go beyond the standard applications of classification and regression. Generative models learn the most important, yet general, features of the training set and can reproduce similar results.
\section{Symbolic music versus performed music}
Two concepts that differ in their basic assumptions must be distinguished: symbolic music and performed music. Symbolic music is a piece that has been composed and written down, but exists only on paper. It presents music as a concept, a kind of recipe for the piece. Such music is traditionally notated on a staff, or stored on a computer using the MIDI protocol. The second kind is music as it is heard: it stores no information about how to play or reproduce the piece, only the sound of the piece as a sound wave. This thesis deals primarily with the generation of symbolic music.
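The distinction can be illustrated with a minimal Python sketch. It is not part of the thesis code: the note list, the (pitch, start, duration) layout, the 8 kHz sample rate and all names are assumptions chosen for illustration, and the pitch-to-frequency conversion assumes twelve-tone equal temperament with A4 at 440 Hz. The symbolic form holds three events that say what to play, while the rendered form holds thousands of amplitude samples that only describe how it sounds.

```python
import math

# Symbolic form: a "recipe" listing what to play.
# Each tuple is (MIDI pitch number, start time in seconds, duration in seconds).
symbolic_melody = [(60, 0.0, 0.5), (64, 0.5, 0.5), (67, 1.0, 1.0)]

SAMPLE_RATE = 8000  # samples per second; an arbitrary choice for illustration


def render_to_audio(notes, sample_rate=SAMPLE_RATE):
    """Turn note events into a list of amplitude samples (one sine wave per note)."""
    total_seconds = max(start + duration for _, start, duration in notes)
    samples = [0.0] * int(round(total_seconds * sample_rate))
    for pitch, start, duration in notes:
        # Assumed tuning: equal temperament with A4 (MIDI 69) at 440 Hz.
        frequency = 440.0 * 2 ** ((pitch - 69) / 12)
        first = int(round(start * sample_rate))
        count = int(round(duration * sample_rate))
        for i in range(count):
            t = i / sample_rate
            samples[first + i] += math.sin(2 * math.pi * frequency * t)
    return samples


audio = render_to_audio(symbolic_melody)
print(len(symbolic_melody), "note events vs", len(audio), "audio samples")
```

Rendering symbolic music to audio is mechanical, whereas recovering the notes from the samples is in general a much harder problem.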
\section{Aims of this thesis}
The aim of this thesis is to apply deep learning techniques to music generation. This is a very broad aim, because music can be a simple melody built on a few notes and played by a single instrument, but also an orchestral arrangement for many instruments that play together and sound as one complete piece.
\chapter{Music representation}
Music is
\section{Basic concepts}
This is the introduction to the master's thesis.
Every piece of music is made up of notes. The note is the basic object in the musical vocabulary. Each note has two parameters: value and pitch. The value of a note determines how long it lasts relative to the other notes. The pitch of a note determines the frequency at which its sound wave is to sound. These frequencies are named with the letters A, B, C, D, E, F, G or, in Polish notation, A, H, C, D, E, F, G.
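As a rough illustration of both parameters, the Python sketch below converts a note name to a frequency and a note value to a duration. It rests on assumptions not stated in the text: twelve-tone equal temperament, the reference A4 = 440 Hz, octave numbering as in scientific pitch notation, and a beat equal to a quarter note. All function names are hypothetical. In Polish notation the letter H names the pitch that English notation calls B, while the Polish B denotes B-flat and is left out of this sketch.

```python
# Semitone distance of each natural note from C within one octave.
SEMITONES_FROM_C = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_frequency(name, octave, polish_notation=False):
    """Frequency in Hz of a natural note under the assumed tuning (A4 = 440 Hz)."""
    if polish_notation:
        if name == "B":
            raise ValueError("Polish B means B-flat, which this sketch omits")
        if name == "H":
            name = "B"  # Polish H is the English B
    # Signed distance in semitones from the reference note A4.
    distance = SEMITONES_FROM_C[name] - SEMITONES_FROM_C["A"] + 12 * (octave - 4)
    return 440.0 * 2 ** (distance / 12)


def note_duration(value, tempo_bpm):
    """Duration in seconds of a note value given as a fraction of a whole note."""
    beats = value * 4  # assumes one beat is a quarter note
    return beats * 60.0 / tempo_bpm


print(round(note_frequency("A", 4), 2))
print(round(note_frequency("H", 3, polish_notation=True), 2))
print(note_duration(1 / 4, tempo_bpm=120))
```

The value is expressed as a fraction of a whole note, so the same note lasts longer or shorter depending on the tempo, which matches the idea that a value is relative rather than absolute.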
Now I will say a little about music and why it is difficult to generate, what I think about it, and whether artificial intelligence will replace musicians in the future.
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut fermentum lorem libero. Duis a magna arcu. Nam sit amet porta odio. Cras sit amet euismod elit. Etiam a turpis eget magna pharetra malesuada. Vivamus accumsan leo eget turpis efficitur, non interdum tortor pretium. Maecenas at massa nec elit imperdiet sagittis. Maecenas pellentesque libero et risus aliquam consectetur.
\section{MIDI: music as information}
Here I will describe how music is stored as computer data, the MIDI protocol, and the representation of music as piano rolls.
Here I will describe the MIDI protocol.
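A piano roll can be sketched in a few lines of Python. The example below is an illustration only, not necessarily the representation used later in the thesis: it assumes notes given as (pitch, start step, length in steps) tuples, a fixed time grid, and the 128 MIDI note numbers (0 to 127) as the pitch axis. Each row is one time step and each column one pitch; a 1 marks a sounding note.

```python
NUM_PITCHES = 128  # MIDI note numbers run from 0 to 127


def notes_to_piano_roll(notes, num_steps):
    """Return a num_steps x NUM_PITCHES grid with 1 where a note sounds."""
    roll = [[0] * NUM_PITCHES for _ in range(num_steps)]
    for pitch, start, length in notes:
        # Clip notes that would run past the end of the grid.
        for step in range(start, min(start + length, num_steps)):
            roll[step][pitch] = 1
    return roll


def render(roll, low, high):
    """Draw pitches low..high as text, highest pitch on top, time left to right."""
    lines = []
    for pitch in range(high, low - 1, -1):
        cells = "".join("#" if row[pitch] else "." for row in roll)
        lines.append(f"{pitch:3d} {cells}")
    return "\n".join(lines)


# Hypothetical three-note melody: C4, E4, G4 (MIDI 60, 64, 67).
melody = [(60, 0, 4), (64, 4, 4), (67, 8, 8)]
roll = notes_to_piano_roll(melody, num_steps=16)
print(render(roll, low=60, high=67))
```

The text rendering reads like a paper piano roll turned on its side: time runs from left to right and higher pitches appear higher up.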
Proin ac dui orci. Cras nec elit eleifend lacus eleifend gravida. Ut placerat lacinia dolor non viverra. Curabitur rhoncus sit amet nibh sed malesuada. Integer iaculis eros venenatis, tempor enim non, sollicitudin sapien. Vestibulum eu scelerisque erat. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla at felis massa. Ut est arcu, rhoncus ac tincidunt vel, consequat eu sem. Aliquam neque orci, lacinia molestie enim fermentum, ullamcorper congue mauris. Phasellus pellentesque, ante nec ultricies porta, erat erat placerat ante, vitae vehicula ipsum enim id ante. Donec malesuada tortor id ornare mattis. Nulla nec augue at augue dictum aliquet.
Aenean malesuada interdum hendrerit. Integer quis nisl et neque iaculis dapibus at in metus. Cras pretium bibendum magna at aliquet. Integer aliquet cursus augue, efficitur sollicitudin felis fringilla efficitur. Vivamus euismod bibendum justo, vitae suscipit nunc mattis a. Sed egestas porttitor velit, sit amet volutpat tortor suscipit vitae. Nulla nec dignissim mauris. Curabitur maximus viverra mollis. Suspendisse molestie turpis sit amet turpis interdum viverra ac eu lorem. Suspendisse iaculis ultricies ante, a condimentum odio congue nec. Integer varius lobortis diam, eget scelerisque nisl mattis at.
\section{Music representation}
\subsection{Classical notation: the staff}
Proin ac dui orci. Cras nec elit eleifend lacus eleifend gravida. Ut placerat lacinia dolor non viverra. Curabitur rhoncus sit amet nibh sed malesuada. Integer iaculis eros venenatis, tempor enim non, sollicitudin sapien. Vestibulum eu scelerisque erat. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla at felis massa. Ut est arcu, rhoncus ac tincidunt vel, consequat eu sem. Aliquam neque orci, lacinia molestie enim fermentum, ullamcorper congue mauris. Phasellus pellentesque, ante nec ultricies porta, erat erat placerat ante, vitae vehicula ipsum enim id ante. Donec malesuada tortor id ornare mattis. Nulla nec augue at augue dictum aliquet.
Here I will describe what piano rolls are, how to read them, and what they are used for.
Proin ac dui orci. Cras nec elit eleifend lacus eleifend gravida. Ut placerat lacinia dolor non viverra. Curabitur rhoncus sit amet nibh sed malesuada. Integer iaculis eros venenatis, tempor enim non, sollicitudin sapien. Vestibulum eu scelerisque erat. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla at felis massa. Ut est arcu, rhoncus ac tincidunt vel, consequat eu sem. Aliquam neque orci, lacinia molestie enim fermentum, ullamcorper congue mauris. Phasellus pellentesque, ante nec ultricies porta, erat erat placerat ante, vitae vehicula ipsum enim id ante. Donec malesuada tortor id ornare mattis. Nulla nec augue at augue dictum aliquet.
\subsection{Music as a three-dimensional array}
Here I will describe why music can be represented as a three-dimensional array.
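One possible reading of the three dimensions, assumed here purely for illustration since the text does not fix them, is (track, time step, pitch): one piano roll per instrument, stacked along a new axis. The Python sketch below uses nested lists to stay self-contained; all names and the two example tracks are hypothetical.

```python
NUM_PITCHES = 128  # MIDI note numbers run from 0 to 127


def stack_tracks(tracks, num_steps):
    """Build a nested list indexed as tensor[track][step][pitch].

    tracks is a list of note lists; each note is (pitch, start_step, length_in_steps).
    """
    tensor = []
    for notes in tracks:
        roll = [[0] * NUM_PITCHES for _ in range(num_steps)]
        for pitch, start, length in notes:
            # Clip notes that would run past the end of the grid.
            for step in range(start, min(start + length, num_steps)):
                roll[step][pitch] = 1
        tensor.append(roll)
    return tensor


def shape(tensor):
    """Return (tracks, steps, pitches) for a non-empty tensor."""
    return (len(tensor), len(tensor[0]), len(tensor[0][0]))


melody = [(72, 0, 4), (74, 4, 4)]  # hypothetical upper voice
bass = [(36, 0, 8)]                # hypothetical bass line
tensor = stack_tracks([melody, bass], num_steps=8)
print(shape(tensor))
```

With a numerical library the same structure would typically be held in a single array of shape (tracks, steps, pitches), which is a convenient input format for a neural network.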
Proin ac dui orci. Cras nec elit eleifend lacus eleifend gravida. Ut placerat lacinia dolor non viverra. Curabitur rhoncus sit amet nibh sed malesuada. Integer iaculis eros venenatis, tempor enim non, sollicitudin sapien. Vestibulum eu scelerisque erat. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla at felis massa. Ut est arcu, rhoncus ac tincidunt vel, consequat eu sem. Aliquam neque orci, lacinia molestie enim fermentum, ullamcorper congue mauris. Phasellus pellentesque, ante nec ultricies porta, erat erat placerat ante, vitae vehicula ipsum enim id ante. Donec malesuada tortor id ornare mattis. Nulla nec augue at augue dictum aliquet.
\section{Generative neural networks: GANs, VAEs, LSTMs}
\chapter{Neural networks}
Here I will explain why neural networks produce music better than other models, and which models suit which applications: LSTM for jazz, VAE for more structured music, and so on.
\subsection{Autoencoders, VAE}
\subsection{Introduction to neural networks: definitions, formulas, etc.}
Now I will say a little about music and why it is difficult to generate.
Proin ac dui orci. Cras nec elit eleifend lacus eleifend gravida. Ut placerat lacinia dolor non viverra. Curabitur rhoncus sit amet nibh sed malesuada. Integer iaculis eros venenatis, tempor enim non, sollicitudin sapien. Vestibulum eu scelerisque erat. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla at felis massa. Ut est arcu, rhoncus ac tincidunt vel, consequat eu sem. Aliquam neque orci, lacinia molestie enim fermentum, ullamcorper congue mauris. Phasellus pellentesque, ante nec ultricies porta, erat erat placerat ante, vitae vehicula ipsum enim id ante. Donec malesuada tortor id ornare mattis. Nulla nec augue at augue dictum aliquet.
Now I will say a little about music and why it is difficult to generate.
Proin ac dui orci. Cras nec elit eleifend lacus eleifend gravida. Ut placerat lacinia dolor non viverra. Curabitur rhoncus sit amet nibh sed malesuada. Integer iaculis eros venenatis, tempor enim non, sollicitudin sapien. Vestibulum eu scelerisque erat. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla at felis massa. Ut est arcu, rhoncus ac tincidunt vel, consequat eu sem. Aliquam neque orci, lacinia molestie enim fermentum, ullamcorper congue mauris. Phasellus pellentesque, ante nec ultricies porta, erat erat placerat ante, vitae vehicula ipsum enim id ante. Donec malesuada tortor id ornare mattis. Nulla nec augue at augue dictum aliquet.
\section{Generative models used in music generation}
Examples of existing approaches to music generation, which models were used, and why.
Proin ac dui orci. Cras nec elit eleifend lacus eleifend gravida. Ut placerat lacinia dolor non viverra. Curabitur rhoncus sit amet nibh sed malesuada. Integer iaculis eros venenatis, tempor enim non, sollicitudin sapien. Vestibulum eu scelerisque erat. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla at felis massa. Ut est arcu, rhoncus ac tincidunt vel, consequat eu sem. Aliquam neque orci, lacinia molestie enim fermentum, ullamcorper congue mauris. Phasellus pellentesque, ante nec ultricies porta, erat erat placerat ante, vitae vehicula ipsum enim id ante. Donec malesuada tortor id ornare mattis. Nulla nec augue at augue dictum aliquet.
\subsection{Project Magenta}
Now I will say a little about music and why it is difficult to generate.
\section{Building a music generator}
\chapter*{Building a music generator}
In this chapter I describe how I built my own music generator, how the training process went, and which samples I managed to generate, together with a description of the code I wrote.
\subsection{Extracting data from MIDI files}
\subsection{Data preparation}
\subsection{Preparing the GAN model}
Proin ac dui orci. Cras nec elit eleifend lacus eleifend gravida. Ut placerat lacinia dolor non viverra. Curabitur rhoncus sit amet nibh sed malesuada. Integer iaculis eros venenatis, tempor enim non, sollicitudin sapien. Vestibulum eu scelerisque erat. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla at felis massa. Ut est arcu, rhoncus ac tincidunt vel, consequat eu sem. Aliquam neque orci, lacinia molestie enim fermentum, ullamcorper congue mauris. Phasellus pellentesque, ante nec ultricies porta, erat erat placerat ante, vitae vehicula ipsum enim id ante. Donec malesuada tortor id ornare mattis. Nulla nec augue at augue dictum aliquet.
\subsection{Training process: samples every few epochs, cost/loss plot}
\subsection{Neural network architecture}
\subsection{Final samples: what music can be generated}
Proin ac dui orci. Cras nec elit eleifend lacus eleifend gravida. Ut placerat lacinia dolor non viverra. Curabitur rhoncus sit amet nibh sed malesuada. Integer iaculis eros venenatis, tempor enim non, sollicitudin sapien. Vestibulum eu scelerisque erat. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla at felis massa. Ut est arcu, rhoncus ac tincidunt vel, consequat eu sem. Aliquam neque orci, lacinia molestie enim fermentum, ullamcorper congue mauris. Phasellus pellentesque, ante nec ultricies porta, erat erat placerat ante, vitae vehicula ipsum enim id ante. Donec malesuada tortor id ornare mattis. Nulla nec augue at augue dictum aliquet.
\subsection{Training process}
Proin ac dui orci. Cras nec elit eleifend lacus eleifend gravida. Ut placerat lacinia dolor non viverra. Curabitur rhoncus sit amet nibh sed malesuada. Integer iaculis eros venenatis, tempor enim non, sollicitudin sapien. Vestibulum eu scelerisque erat. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla at felis massa. Ut est arcu, rhoncus ac tincidunt vel, consequat eu sem. Aliquam neque orci, lacinia molestie enim fermentum, ullamcorper congue mauris. Phasellus pellentesque, ante nec ultricies porta, erat erat placerat ante, vitae vehicula ipsum enim id ante. Donec malesuada tortor id ornare mattis. Nulla nec augue at augue dictum aliquet.
\subsection{Examples of generated music}
Proin ac dui orci. Cras nec elit eleifend lacus eleifend gravida. Ut placerat lacinia dolor non viverra. Curabitur rhoncus sit amet nibh sed malesuada. Integer iaculis eros venenatis, tempor enim non, sollicitudin sapien. Vestibulum eu scelerisque erat. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla at felis massa. Ut est arcu, rhoncus ac tincidunt vel, consequat eu sem. Aliquam neque orci, lacinia molestie enim fermentum, ullamcorper congue mauris. Phasellus pellentesque, ante nec ultricies porta, erat erat placerat ante, vitae vehicula ipsum enim id ante. Donec malesuada tortor id ornare mattis. Nulla nec augue at augue dictum aliquet.
Final conclusions: can computer-generated music be enjoyable? Will it have a positive effect on the music industry? Yes and no. It can serve musicians as a source of inspiration and as a supporting tool. On the other hand, it may lower the cost of producing pop music, which is already highly repetitive. Will neural networks learn to produce hits?
\bibitem{} Briot, J.P., Hadjeres, G., Pachet, F.D. (2019): {\em Deep Learning Techniques for Music Generation - A Survey}. arXiv:1709.01620v3.
\bibitem{} Goodfellow, I., Bengio, Y., Courville, A. (2016): {\em Deep Learning}. MIT Press.
\bibitem{} Zocca, V., Spacagna, G., Slater, D., Roelants, P. (2018): {\em Deep Learning. Uczenie głębokie z językiem Python}. Helion.
@ -1,19 +1,30 @@
\contentsline {section}{\numberline {1}Wst\IeC {\k e}p}{2}%
\contentsline {chapter}{Streszczenie}{7}%
\contentsline {subsection}{\numberline {1.1}Muzyka}{2}%
\contentsline {chapter}{Abstract}{9}%
\contentsline {section}{\numberline {2}MIDI, Muzyka jako Informacje}{2}%
\contentsline {chapter}{Rozdzia\PlPrIeC {\l }\ 1\relax .\leavevmode@ifvmode \kern .5em Wst\IeC {\k e}p}{11}%
\contentsline {subsection}{\numberline {2.1}MIDI}{2}%
\contentsline {section}{\numberline {1.1\relax .\leavevmode@ifvmode \kern .5em }Zastosowania uczenia maszynowego w muzyce}{11}%
\contentsline {subsection}{\numberline {2.2}Pianoroll}{2}%
\contentsline {section}{\numberline {1.2\relax .\leavevmode@ifvmode \kern .5em }Modele generatywne}{11}%
\contentsline {subsection}{\numberline {2.3}Muzyka jako tr\IeC {\'o}jwymiarowa tablica}{2}%
\contentsline {section}{\numberline {1.3\relax .\leavevmode@ifvmode \kern .5em }Muzyka symboliczna, a muzyka}{11}%
\contentsline {section}{\numberline {3}Generatwne sieci neuronowe - GANy, VAE, LSTMy}{2}%
\contentsline {section}{\numberline {1.4\relax .\leavevmode@ifvmode \kern .5em }Cele tej pracy.}{12}%
\contentsline {subsection}{\numberline {3.1}Autoencodery, VAE}{2}%
\contentsline {chapter}{Rozdzia\PlPrIeC {\l }\ 2\relax .\leavevmode@ifvmode \kern .5em Reprezentacja muzyki}{13}%
\contentsline {subsection}{\numberline {3.2}LSTM}{2}%
\contentsline {section}{\numberline {2.1\relax .\leavevmode@ifvmode \kern .5em }Podstawowe koncepcje}{13}%
\contentsline {section}{\numberline {4}Modele generatywne stosowane w generowaniu muzyki}{3}%
\contentsline {subsection}{\numberline {2.1.1\relax .\leavevmode@ifvmode \kern .5em }Nuta}{13}%
\contentsline {subsection}{\numberline {4.1}Project Magenta}{3}%
\contentsline {subsection}{\numberline {2.1.2\relax .\leavevmode@ifvmode \kern .5em }Skala}{13}%
\contentsline {subsection}{\numberline {4.2}MuseGAN}{3}%
\contentsline {subsection}{\numberline {2.1.3\relax .\leavevmode@ifvmode \kern .5em }Akord}{13}%
\contentsline {subsection}{\numberline {4.3}VAE-MIDI}{3}%
\contentsline {subsection}{\numberline {2.1.4\relax .\leavevmode@ifvmode \kern .5em }Utw\IeC {\'o}r}{14}%
\contentsline {section}{\numberline {5}Budowanie generatora muzyki}{3}%
\contentsline {section}{\numberline {2.2\relax .\leavevmode@ifvmode \kern .5em }Reprezentacja muzyki}{14}%
\contentsline {subsection}{\numberline {5.1}Wyodr\IeC {\k e}bnienie danych z plik\IeC {\'o}w MIDI}{3}%
\contentsline {subsection}{\numberline {2.2.1\relax .\leavevmode@ifvmode \kern .5em }Zapis klasyczny - pi\IeC {\k e}ciolinia}{14}%
\contentsline {subsection}{\numberline {5.2}Przygotowanie Modelu GAN}{3}%
\contentsline {subsection}{\numberline {2.2.2\relax .\leavevmode@ifvmode \kern .5em }Tabulatura}{14}%
\contentsline {subsection}{\numberline {5.3}Proces uczenia, pr\IeC {\'o}bki co kilka epoch\IeC {\'o}w, costloss wykres}{3}%
\contentsline {subsection}{\numberline {2.2.3\relax .\leavevmode@ifvmode \kern .5em }Pianoroll}{14}%
\contentsline {subsection}{\numberline {5.4}Pr\IeC {\'o}bki ko\IeC {\'n}cowe, jak\IeC {\k a} muzyk\IeC {\k e} da si\IeC {\k e} z tego wygenerowa\IeC {\'c}}{3}%
\contentsline {subsection}{\numberline {2.2.4\relax .\leavevmode@ifvmode \kern .5em }Tekstowa}{15}%
\contentsline {section}{\numberline {6}Podsumowanie}{3}%
\contentsline {chapter}{Rozdzia\PlPrIeC {\l }\ 3\relax .\leavevmode@ifvmode \kern .5em Sieci neuronowe}{17}%
\contentsline {subsection}{\numberline {3.0.1\relax .\leavevmode@ifvmode \kern .5em }Wst\IeC {\k e}p do sieci neuronowych, definicje wzory itp.}{17}%
\contentsline {subsection}{\numberline {3.0.2\relax .\leavevmode@ifvmode \kern .5em }Autoencodery}{17}%
\contentsline {subsection}{\numberline {3.0.3\relax .\leavevmode@ifvmode \kern .5em }LSTM}{17}%
\contentsline {subsection}{\numberline {3.0.4\relax .\leavevmode@ifvmode \kern .5em }GAN}{18}%
\contentsline {chapter}{Budowanie generatora muzyki}{19}%
\contentsline {subsection}{\numberline {3.0.5\relax .\leavevmode@ifvmode \kern .5em }Przygotowanie danych}{19}%
\contentsline {subsection}{\numberline {3.0.6\relax .\leavevmode@ifvmode \kern .5em }Architektura sieci neuronowej}{19}%
\contentsline {subsection}{\numberline {3.0.7\relax .\leavevmode@ifvmode \kern .5em }Proces treningowy}{19}%
\contentsline {subsection}{\numberline {3.0.8\relax .\leavevmode@ifvmode \kern .5em }Przyk\IeC {\l }ady wygenerowanej muzyki}{20}%
\contentsline {chapter}{Podsumowanie}{21}%
\contentsline {chapter}{Bibliografia}{23}%
@ -1,26 +1,96 @@
#!/usr/bin/env python3
''' This module generates a sample and creates a midi file from it.
>>> ./ [trained_model_path] [output_path]
'''
import settings
import sys
import random
import pickle
import numpy as np
import numpy as np
import tensorflow as tf
import tensorflow as tf
import pypianoroll as roll
import matplotlib.pyplot as plt
from tqdm import trange, tqdm
from music21 import converter, instrument, note, chord, stream
from keras.layers import Input, Dense, Conv2D
from keras.layers import Input, Dense, Conv2D
from keras.models import Model
from keras.models import Model
import settings
from keras.layers import Input, Dense, Conv2D, Flatten, LSTM, Dropout, TimeDistributed, RepeatVector
from keras.models import Model, Sequential
input_shape = settings.midi_resolution*128
input_img = tf.keras.layers.Input(shape=(input_shape,))
encoded = tf.keras.layers.Dense(160, activation='relu')(input_img)
decoded = tf.keras.layers.Dense(input_shape, activation='sigmoid')(encoded)
autoencoder = tf.keras.models.Model(input_img, decoded)
def choose_by_prob(list_of_probs):
''' This function takes a list of values and assumes
that the bigger a value is, the more often its index should be returned.
# load weights into new model
It was created to give more options to choose from than the argmax function,
so there is more than one way in which a melody can develop.
print("Loaded model from {}".format(settings.model_path))
# generate_seed = np.random.rand(12288).reshape(1,12288)
Returns the index of the chosen value in the given list.
'''
generate_seed = np.load(settings.samples_path)['arr_0'][15].reshape(1,12288)
sum_prob = np.array(list_of_probs).sum()
prob_normalized = [x/sum_prob for x in list_of_probs]
cumsum = np.array(prob_normalized).cumsum()
prob_cum = cumsum.tolist()
random_x = random.random()
for i, x in enumerate(prob_cum):
if random_x < x:
return i
generated_sample = autoencoder.predict(generate_seed)
trained_model_path = sys.argv[1]
np.savez_compressed(settings.generated_sample_path, generated_sample)
output_path = sys.argv[2]
# load model and dictionary that can translate back index_numbers to notes
# this dictionary is generated with model
print('Loading... {}'.format(trained_model_path))
model = pickle.load(open(trained_model_path, 'rb'))
int_to_note, n_vocab, seq_len = pickle.load(open('{}_dict'.format(trained_model_path), 'rb'))
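# random.randint is inclusive on both ends, so the upper bound is n_vocab - 1
seed = [random.randint(0, n_vocab - 1) for _ in range(seq_len)]
music = []
for i in trange(124):
    predicted_vector = model.predict(np.array(seed).reshape(1, seq_len, 1))
    # using best fitted note
    # predicted_index = np.argmax(predicted_vector)
    # using probability distribution for choosing note
    # to prevent looping
    predicted_index = choose_by_prob(predicted_vector)
    # translate the index back to a note and slide the seed window by one step
    music.append(int_to_note[predicted_index])
    seed.append(predicted_index)
    seed = seed[1:1+seq_len]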
offset = 0
output_notes = []
for _event in tqdm(music):
    event, note_len = _event.split(';')
    if (' ' in event) or event.isdigit():
        # the event is a chord: build one Note per pitch
        notes_in_chord = event.split(' ')
        notes = []
        for current_note in notes_in_chord:
            new_note = note.Note(current_note)
            new_note.storedInstrument = instrument.Piano()
            notes.append(new_note)
        new_chord = chord.Chord(notes)
        new_chord.offset = offset
        output_notes.append(new_chord)
    else:
        # the event is a single note
        new_note = note.Note(event)
        new_note.offset = offset
        new_note.storedInstrument = instrument.Piano()
        output_notes.append(new_note)
    # move forward by the length of the event that was just placed
    offset += float(note_len)
midi_stream = stream.Stream(output_notes)
midi_stream.write('midi', fp='{}.mid'.format(output_path))
@ -1,82 +1,133 @@
#!/usr/bin/env python3
#!/usr/bin/env python3
''' This module contains functions for encoding midi files into data samples
that are prepared for model training.
midi_folder_path - the path to the directory containing midi files
output_path - the output path where the data samples will be created
>>> ./ <midi_folder_path> <output_path> <sequence_length>
'''
import settings
import settings
import pypianoroll as roll
import pypianoroll as roll
import matplotlib.pyplot as plt
import numpy as np
import numpy as np
import os
import os
from tqdm import tqdm
from tqdm import tqdm
from math import floor
from math import floor
import sys
import sys
from collections import defaultdict
import pickle
from music21 import converter, instrument, note, chord, stream
import music21
def to_samples(midi_file_path, midi_res=settings.midi_resolution):
class MidiParseError(Exception):
    """Error that is raised when a midi file cannot be parsed"""
# this function export a samples from midi file:
# and for every track in midi file chopped pianoroll
# for a samples of given beat_lenth (midi_res)
# every track is single line
print('Exporting samples from: {}'.format(midi_file_path))
all_beats = np.empty((0, settings.midi_resolution, 128))
for track in roll.Multitrack(midi_file_path).tracks:
print('Track: {}'.format(
if not track.is_drum:
number_of_beats = floor(track.pianoroll.shape[0] / midi_res)
track_pianoroll = track.pianoroll[: number_of_beats * midi_res]
track_beats = track_pianoroll.reshape(number_of_beats, midi_res, 128)
all_beats = np.concatenate([track_beats, all_beats], axis=0)
print('Exported {} samples of {}'.format(number_of_beats, settings.midi_program[track.program]))
# add code for drums samples
return all_beats
def to_midi(samples, output_path=settings.generated_midi_path, program=0, tempo=120, beat_resolution=settings.beat_resolution):
tracks = [roll.Track(samples, program=program)]
return_midi = roll.Multitrack(tracks=tracks, tempo=tempo, downbeat=[0, 96, 192, 288], beat_resolution=beat_resolution)
roll.write(return_midi, settings.generated_midi_path)
# todo: this function is running too slow.
def parse_argv(argv):
def delete_empty_samples(sample_pack):
'''This function is parsing given arguments when running a midi script.
print('Deleting empty samples...')
Returns a tuple consisting of midi_folder_path, output_path, seq_len'''
temp_sample_pack = sample_pack
index_manipulator = 1
midi_folder_path = argv[1]
for index, sample in enumerate(sample_pack):
output_path = argv[2]
if sample.sum() == 0:
seq_len = int(argv[3])
temp_sample_pack = np.delete(temp_sample_pack, index-index_manipulator, axis=0)
return midi_folder_path, output_path, seq_len
index_manipulator = index_manipulator + 1
except IndexError:
print('Deleted {} empty samples'.format(index_manipulator-1))
raise AttributeError('You probably did not pass the parameters needed to run the script.\
return temp_sample_pack
>>> ./ <midi_folder_path> <output_path> <sequence_length>')
def to_sequence(midi_path, seq_len):
    ''' This function is meant to be called on one midi file inside a loop over a directory.
    It encodes the midi file into sequences of the given length as train_X
    and the next note as train_y. It also splits the midi samples by
    instrument group.
    Use for LSTM neural network.
    - midi_path: path to midi file
    - seq_len: length of the sequence before prediction
    Returns: Tuple of train_X, train_y dictionaries consisting of samples of the song grouped by instruments
    '''
    seq_by_instrument = defaultdict( lambda : [] )
    try:
        midi_file = music21.converter.parse(midi_path)
    except music21.midi.MidiException:
        raise MidiParseError
    parts = music21.instrument.partitionByInstrument(midi_file)
    for part in parts:
        for event in part:
            if part.partName is not None:
                # every event is stored as '<pitch or pitches>;<length in quarter notes>'
                if isinstance(event, music21.note.Note):
                    to_export_event = '{};{}'.format(str(event.pitch), float(event.quarterLength))
                    seq_by_instrument[part.partName].append(to_export_event)
                elif isinstance(event, music21.chord.Chord):
                    to_export_event = '{};{}'.format(' '.join(str(note) for note in event.pitches), float(event.quarterLength))
                    seq_by_instrument[part.partName].append(to_export_event)
    X_train_by_instrument = defaultdict( lambda : [] )
    y_train_by_instrument = defaultdict( lambda : [] )
    for instrument, sequence in seq_by_instrument.items():
        for i in range(len(sequence) - seq_len):
            # seq_len consecutive events are the input
            X_train_by_instrument[instrument].append(np.array(sequence[i:i+seq_len]))
            # the event right after them is the target
            y_train_by_instrument[instrument].append(np.array(sequence[i+seq_len]))
    return X_train_by_instrument, y_train_by_instrument
def colect_samples(midi_folder_path, seq_len):
    '''This function loops through the given directory tree and
    collects samples from midi files.
    Parameters: midi_folder_path - a path to the directory with midi files
    seq_len - the length of a train_X sample, which tells
    how many notes are given to the LSTM to predict the next note.
    Returns: Tuple of train_X, train_y dictionaries consisting
    of samples of all songs in the directory grouped by instruments.
    '''
    print('Collecting samples...')
    train_X = defaultdict( lambda : [] )
    train_y = defaultdict( lambda : [] )
    for directory, subdirectories, files in os.walk(midi_folder_path):
        for midi_file in tqdm(files):
            midi_file_path = os.path.join(directory, midi_file)
            try:
                _X_train, _y_train = to_sequence(midi_file_path, seq_len)
            except MidiParseError:
                # skip files that cannot be parsed
                continue
            for (X_key, X_value), (y_key, y_value) in zip(_X_train.items(), _y_train.items()):
                train_X[X_key].extend(X_value)
                train_y[y_key].extend(y_value)
    return train_X, train_y
def save_samples(output_path, samples):
    '''This function saves samples to npz packages, split by instrument.'''
    if not os.path.exists(output_path):
        os.makedirs(output_path)
    train_X, train_y = samples
    for (X_key, X_value), (y_key, y_value) in tqdm(zip(train_X.items(), train_y.items())):
        if X_key == y_key:
            # one file per instrument: arr_0 holds the inputs, arr_1 the targets
            np.savez_compressed('{}/{}.npz'.format(output_path, X_key), np.array(X_value), np.array(y_value))
def main():
def main():
midi_folder_path, output_path, seq_len = parse_argv(sys.argv)
if sys.argv[1]=='export':
save_samples(output_path, colect_samples(midi_folder_path, seq_len))
print('Exporting started...')
sample_pack = np.empty((0,settings.midi_resolution,128))
for midi_file in os.listdir(settings.midi_dir):
midi_file_path = '{}/{}'.format(settings.midi_dir, midi_file)
midi_samples = to_samples(midi_file_path)
if midi_samples is None:
sample_pack = np.concatenate((midi_samples, sample_pack), axis=0)
# I commented out this line, because it was too slow
# sample_pack = delete_empty_samples(sample_pack)
np.savez_compressed(settings.samples_dir, sample_pack)
print('Exported {} samples'.format(sample_pack.shape[0]))
fig, axes = plt.subplots(nrows=10, ncols=10, figsize=(20, 20))
for idx, ax in enumerate(axes.ravel()):
n = np.random.randint(0, sample_pack.shape[0])
sample = sample_pack[n]
ax.imshow(sample, cmap = plt.get_cmap('gray'))
if __name__ == '__main__':
if __name__ == '__main__':
@ -0,0 +1,26 @@
- - code for data extraction and midi conversion
- - code for model definition and training session
- - code for model loading, predicting and saving to midi_dir
- - file where default settings are stored
- - this file
- data/midi - directory where input midi are stored
- data/models - directory where trained models are stored
- data/output - directory where generated music is stored
- data/samples - directory where extracted data from midi is stored
- data/samples.npz - deprecated
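
The per-instrument files under data/samples can be inspected directly with NumPy. The sketch below assumes each file holds two unnamed arrays, which NumPy stores under the keys `arr_0` (input sequences) and `arr_1` (the event that follows each sequence). The toy events and the `Piano.npz` file name are made up for illustration.

```python
import os
import tempfile

import numpy as np

# Hypothetical toy data: three windows of four events each,
# every event written as '<pitch>;<length in quarter notes>'.
X = np.array([
    ['C4;1.0', 'E4;1.0', 'G4;1.0', 'C5;2.0'],
    ['E4;1.0', 'G4;1.0', 'C5;2.0', 'G4;1.0'],
    ['G4;1.0', 'C5;2.0', 'G4;1.0', 'E4;1.0'],
])
# The event that follows each window.
y = np.array(['G4;1.0', 'E4;1.0', 'C4;2.0'])

with tempfile.TemporaryDirectory() as samples_dir:
    path = os.path.join(samples_dir, 'Piano.npz')
    # Arrays passed positionally are saved as arr_0, arr_1, ...
    np.savez_compressed(path, X, y)
    with np.load(path, allow_pickle=True) as data:
        loaded_X = data['arr_0']
        loaded_y = data['arr_1']

print(loaded_X.shape, loaded_y.shape)
```

Passing the arrays as keyword arguments to `np.savez_compressed` instead would give the keys readable names, at the cost of changing every place that reads `arr_0` and `arr_1`.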
# How to use:
1. Use to export data from midi files
>>> ./ [midi_folder_path] [output_path] [sequence_length]
2. Use to train a model (this can take a while)
>>> ./ [input_training_data] [model_save_path] [epochs]
3. Use to generate music from trained models
>>> ./ [trained_model_path] [output_path] [threshold]
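
The two ideas behind steps 1 and 3 can be sketched in a few lines of plain Python. This is a minimal illustration, not the code of the scripts themselves: `make_windows` and `sample_index` are hypothetical names, and the note list and weights are made up. Export cuts each song into windows of `seq_len` events plus the event that follows, and generation picks the next note at random in proportion to the predicted probabilities instead of always taking the most likely one.

```python
import bisect
import itertools
import random


def make_windows(events, seq_len):
    """Return (X, y): every run of seq_len events and the event that follows it."""
    n_windows = len(events) - seq_len
    X = [events[i:i + seq_len] for i in range(n_windows)]
    y = [events[i + seq_len] for i in range(n_windows)]
    return X, y


def sample_index(weights, rng=random):
    """Pick an index with probability proportional to its non-negative weight."""
    cumulative = list(itertools.accumulate(weights))
    threshold = rng.random() * cumulative[-1]
    # bisect_right finds the first cumulative total strictly above the threshold;
    # min() guards against floating-point rounding at the upper edge.
    return min(bisect.bisect_right(cumulative, threshold), len(weights) - 1)


if __name__ == '__main__':
    song = ['C4', 'E4', 'G4', 'C5', 'G4']
    X, y = make_windows(song, seq_len=3)
    for window, target in zip(X, y):
        print(window, '->', target)
    # Hypothetical model output over a three-note vocabulary.
    print(sample_index([0.1, 0.7, 0.2]))
```

Sampling keeps generated melodies from getting stuck in a loop, which is what tends to happen when the most likely note is chosen at every step.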
plt.imshow(instruments.T, cmap='gray')
@ -15,11 +15,11 @@ beats_per_sample = 1
ignore_note_lenght = False
ignore_note_lenght = False
epochs = 1000
epochs = 1
midi_program = {
midi_program = {
0 : 'Perc',
# Piano
1 : 'Acoustic Grand Piano',
1 : 'Acoustic Grand Piano',
2 : 'Bright Acoustic Piano',
2 : 'Bright Acoustic Piano',
3 : 'Electric Grand Piano',
3 : 'Electric Grand Piano',
@ -28,6 +28,7 @@ midi_program = {
6 : 'Electric Piano 2',
6 : 'Electric Piano 2',
7 : 'Harpsichord',
7 : 'Harpsichord',
8 : 'Clavi',
8 : 'Clavi',
# Chromatic Percussion
9 : 'Celesta',
9 : 'Celesta',
10 : 'Glockenspiel',
10 : 'Glockenspiel',
11 : 'Music Box',
11 : 'Music Box',
@ -36,6 +37,7 @@ midi_program = {
14 : 'Xylophone',
14 : 'Xylophone',
15 : 'Tubular Bells',
15 : 'Tubular Bells',
16 : 'Dulcimer',
16 : 'Dulcimer',
# Organ
17 : 'Drawbar Organ',
17 : 'Drawbar Organ',
18 : 'Percussive Organ',
18 : 'Percussive Organ',
19 : 'Rock Organ',
19 : 'Rock Organ',
@ -44,6 +46,7 @@ midi_program = {
22 : 'Accordion',
22 : 'Accordion',
23 : 'Harmonica',
23 : 'Harmonica',
24 : 'Tango Accordion',
24 : 'Tango Accordion',
# Guitar
25 : 'Acoustic Guitar (nylon)',
25 : 'Acoustic Guitar (nylon)',
26 : 'Acoustic Guitar (steel)',
26 : 'Acoustic Guitar (steel)',
27 : 'Electric Guitar (jazz)',
27 : 'Electric Guitar (jazz)',
@ -52,6 +55,7 @@ midi_program = {
30 : 'Overdriven Guitar',
30 : 'Overdriven Guitar',
31 : 'Distortion Guitar',
31 : 'Distortion Guitar',
32 : 'Guitar harmonics',
32 : 'Guitar harmonics',
# Bass
33 : 'Acoustic Bass',
33 : 'Acoustic Bass',
34 : 'Electric Bass (finger)',
34 : 'Electric Bass (finger)',
35 : 'Electric Bass (pick)',
35 : 'Electric Bass (pick)',
@ -60,6 +64,7 @@ midi_program = {
38 : 'Slap Bass 2',
38 : 'Slap Bass 2',
39 : 'Synth Bass 1',
39 : 'Synth Bass 1',
40 : 'Synth Bass 2',
40 : 'Synth Bass 2',
# Strings
41 : 'Violin',
41 : 'Violin',
42 : 'Viola',
42 : 'Viola',
43 : 'Cello',
43 : 'Cello',
@ -68,6 +73,7 @@ midi_program = {
46 : 'Pizzicato Strings',
46 : 'Pizzicato Strings',
47 : 'Orchestral Harp',
47 : 'Orchestral Harp',
48 : 'Timpani',
48 : 'Timpani',
# Ensemble
49 : 'String Ensemble 1',
49 : 'String Ensemble 1',
50 : 'String Ensemble 2',
50 : 'String Ensemble 2',
51 : 'SynthStrings 1',
51 : 'SynthStrings 1',
@ -76,6 +82,7 @@ midi_program = {
54 : 'Voice Oohs',
54 : 'Voice Oohs',
55 : 'Synth Voice',
55 : 'Synth Voice',
56 : 'Orchestra Hit',
56 : 'Orchestra Hit',
# Brass
57 : 'Trumpet',
57 : 'Trumpet',
58 : 'Trombone',
58 : 'Trombone',
59 : 'Tuba',
59 : 'Tuba',
@ -84,6 +91,7 @@ midi_program = {
62 : 'Brass Section',
62 : 'Brass Section',
63 : 'SynthBrass 1',
63 : 'SynthBrass 1',
64 : 'SynthBrass 2',
64 : 'SynthBrass 2',
# Reed
65 : 'Soprano Sax',
65 : 'Soprano Sax',
66 : 'Alto Sax',
66 : 'Alto Sax',
67 : 'Tenor Sax',
67 : 'Tenor Sax',
@ -92,6 +100,7 @@ midi_program = {
70 : 'English Horn',
70 : 'English Horn',
71 : 'Bassoon',
71 : 'Bassoon',
72 : 'Clarinet',
72 : 'Clarinet',
# Pipe
73 : 'Piccolo',
73 : 'Piccolo',
74 : 'Flute',
74 : 'Flute',
75 : 'Recorder',
75 : 'Recorder',
@ -100,6 +109,7 @@ midi_program = {
78 : 'Shakuhachi',
78 : 'Shakuhachi',
79 : 'Whistle',
79 : 'Whistle',
80 : 'Ocarina',
80 : 'Ocarina',
# Synth Lead
81 : 'Lead 1 (square)',
81 : 'Lead 1 (square)',
82 : 'Lead 2 (sawtooth)',
82 : 'Lead 2 (sawtooth)',
83 : 'Lead 3 (calliope)',
83 : 'Lead 3 (calliope)',
@ -108,6 +118,7 @@ midi_program = {
86 : 'Lead 6 (voice)',
86 : 'Lead 6 (voice)',
87 : 'Lead 7 (fifths)',
87 : 'Lead 7 (fifths)',
88 : 'Lead 8 (bass + lead)',
88 : 'Lead 8 (bass + lead)',
# Synth Pad
89 : 'Pad 1 (new age)',
89 : 'Pad 1 (new age)',
90 : 'Pad 2 (warm)',
90 : 'Pad 2 (warm)',
91 : 'Pad 3 (polysynth)',
91 : 'Pad 3 (polysynth)',
@ -116,6 +127,7 @@ midi_program = {
94 : 'Pad 6 (metallic)',
94 : 'Pad 6 (metallic)',
95 : 'Pad 7 (halo)',
95 : 'Pad 7 (halo)',
96 : 'Pad 8 (sweep)',
96 : 'Pad 8 (sweep)',
# Synth Effects
97 : 'FX 1 (rain)',
97 : 'FX 1 (rain)',
98 : 'FX 2 (soundtrack)',
98 : 'FX 2 (soundtrack)',
99 : 'FX 3 (crystal)',
99 : 'FX 3 (crystal)',
@ -124,6 +136,7 @@ midi_program = {
102 : 'FX 6 (goblins)',
102 : 'FX 6 (goblins)',
103 : 'FX 7 (echoes)',
103 : 'FX 7 (echoes)',
104 : 'FX 8 (sci-fi)',
104 : 'FX 8 (sci-fi)',
# Ethnic
105 : 'Sitar',
105 : 'Sitar',
106 : 'Banjo',
106 : 'Banjo',
107 : 'Shamisen',
107 : 'Shamisen',
@ -132,6 +145,7 @@ midi_program = {
110 : 'Bag pipe',
110 : 'Bag pipe',
111 : 'Fiddle',
111 : 'Fiddle',
112 : 'Shanai',
112 : 'Shanai',
# Percussive
113 : 'Tinkle Bell',
113 : 'Tinkle Bell',
114 : 'Agogo',
114 : 'Agogo',
115 : 'Steel Drums',
115 : 'Steel Drums',
@ -140,6 +154,7 @@ midi_program = {
118 : 'Melodic Tom',
118 : 'Melodic Tom',
119 : 'Synth Drum',
119 : 'Synth Drum',
120 : 'Reverse Cymbal',
120 : 'Reverse Cymbal',
# Sound Effects
121 : 'Guitar Fret Noise',
121 : 'Guitar Fret Noise',
122 : 'Breath Noise',
122 : 'Breath Noise',
123 : 'Seashore',
123 : 'Seashore',
@ -149,3 +164,150 @@ midi_program = {
127 : 'Applause',
127 : 'Applause',
128 : 'Gunshot'
128 : 'Gunshot'
midi_group = {
# Piano
1 : 'Piano',
2 : 'Piano',
3 : 'Piano',
4 : 'Piano',
5 : 'Piano',
6 : 'Piano',
7 : 'Piano',
8 : 'Piano',
# Chromatic Percussion
9 : 'Chromatic_Percussion',
10 : 'Chromatic_Percussion',
11 : 'Chromatic_Percussion',
12 : 'Chromatic_Percussion',
13 : 'Chromatic_Percussion',
14 : 'Chromatic_Percussion',
15 : 'Chromatic_Percussion',
16 : 'Chromatic_Percussion',
# Organ
17 : 'Organ',
18 : 'Organ',
19 : 'Organ',
20 : 'Organ',
21 : 'Organ',
22 : 'Organ',
23 : 'Organ',
24 : 'Organ',
# Guitar
25 : 'Guitar',
26 : 'Guitar',
27 : 'Guitar',
28 : 'Guitar',
29 : 'Guitar',
30 : 'Guitar',
31 : 'Guitar',
32 : 'Guitar',
# Bass
33 : 'Bass',
34 : 'Bass',
35 : 'Bass',
36 : 'Bass',
37 : 'Bass',
38 : 'Bass',
39 : 'Bass',
40 : 'Bass',
# Strings
41 : 'Strings',
42 : 'Strings',
43 : 'Strings',
44 : 'Strings',
45 : 'Strings',
46 : 'Strings',
47 : 'Strings',
48 : 'Strings',
# Ensemble
49 : 'Ensemble',
50 : 'Ensemble',
51 : 'Ensemble',
52 : 'Ensemble',
53 : 'Ensemble',
54 : 'Ensemble',
55 : 'Ensemble',
56 : 'Ensemble',
# Brass
57 : 'Brass',
58 : 'Brass',
59 : 'Brass',
60 : 'Brass',
61 : 'Brass',
62 : 'Brass',
63 : 'Brass',
64 : 'Brass',
# Reed
65 : 'Reed',
66 : 'Reed',
67 : 'Reed',
68 : 'Reed',
69 : 'Reed',
70 : 'Reed',
71 : 'Reed',
72 : 'Reed',
# Pipe
73 : 'Pipe',
74 : 'Pipe',
75 : 'Pipe',
76 : 'Pipe',
77 : 'Pipe',
78 : 'Pipe',
79 : 'Pipe',
80 : 'Pipe',
# Synth Lead
81 : 'Synth_Lead',
82 : 'Synth_Lead',
83 : 'Synth_Lead',
84 : 'Synth_Lead',
85 : 'Synth_Lead',
86 : 'Synth_Lead',
87 : 'Synth_Lead',
88 : 'Synth_Lead',
# Synth Pad
89 : 'Synth_Pad',
90 : 'Synth_Pad',
91 : 'Synth_Pad',
92 : 'Synth_Pad',
93 : 'Synth_Pad',
94 : 'Synth_Pad',
95 : 'Synth_Pad',
96 : 'Synth_Pad',
# Synth Effects
97 : 'Synth_Effects',
98 : 'Synth_Effects',
99 : 'Synth_Effects',
100 : 'Synth_Effects',
101 : 'Synth_Effects',
102 : 'Synth_Effects',
103 : 'Synth_Effects',
104 : 'Synth_Effects',
# Ethnic
105 : 'Ethnic',
106 : 'Ethnic',
107 : 'Ethnic',
108 : 'Ethnic',
109 : 'Ethnic',
110 : 'Ethnic',
111 : 'Ethnic',
112 : 'Ethnic',
# Percussive
113 : 'Percussive',
114 : 'Percussive',
115 : 'Percussive',
116 : 'Percussive',
117 : 'Percussive',
118 : 'Percussive',
119 : 'Percussive',
120 : 'Percussive',
# Sound Effects
121 : 'Sound_Effects',
122 : 'Sound_Effects',
123 : 'Sound_Effects',
124 : 'Sound_Effects',
125 : 'Sound_Effects',
126 : 'Sound_Effects',
127 : 'Sound_Effects',
128 : 'Sound_Effects'
@ -1,33 +1,68 @@
#!/usr/bin/env python3
#!/usr/bin/env python3
import sys
import tensorflow as tf
import settings
from tensorflow.keras import layers
from keras.layers import Input, Dense, Conv2D, Flatten
from keras.models import Model, Sequential
import numpy as np
from sys import exit
import pickle
import pickle
import settings
print('Reading samples from: {}'.format(settings.samples_path))
import numpy as np
from keras.layers import Input, Dense, Conv2D, Flatten, LSTM, Dropout, TimeDistributed, RepeatVector, Activation, Bidirectional, Reshape
from keras.models import Model, Sequential
from keras.utils.np_utils import to_categorical
train_X = np.load(settings.samples_path)['arr_0']
n_samples = train_X.shape[0]
def load_data(train_data_path):
input_shape = settings.midi_resolution*128
print('Loading... {}'.format(train_data_path))
train_X = train_X.reshape(n_samples, input_shape)
train_X = np.load(train_data_path, allow_pickle=True)['arr_0']
train_y = np.load(train_data_path, allow_pickle=True)['arr_1']
return train_X, train_y
# encoder model
# TODO: make transformer class with fit, transform and reverse definitions
input_img = tf.keras.layers.Input(shape=(input_shape,))
def preprocess_samples(train_X, train_y):
encoded = tf.keras.layers.Dense(160, activation='relu')(input_img)
vocab_X = np.unique(train_X)
decoded = tf.keras.layers.Dense(input_shape, activation='sigmoid')(encoded)
vocab_y = np.unique(train_y)
autoencoder = tf.keras.models.Model(input_img, decoded)
vocab = np.unique(np.concatenate([vocab_X, vocab_y]))
n_vocab = vocab.shape[0]
note_to_int = dict((note, number) for number, note in enumerate(vocab))
int_to_note = dict((number, note) for number, note in enumerate(vocab))
_train_X = []
_train_y = []
for sample in train_X:
# TODO: add normalization
_train_X.append([note_to_int[note] for note in sample])
train_X = np.array(_train_X).reshape(train_X.shape[0], train_X.shape[1], 1)
train_y = np.array([note_to_int[note] for note in train_y]).reshape(-1,1)
train_y = to_categorical(train_y)
autoencoder.fit(train_X, train_X, epochs=settings.epochs, batch_size=32)
return train_X, train_y, n_vocab, int_to_note
train_data_path = sys.argv[1]
print("Model save to {}".format(settings.model_path))
train_X, train_y = load_data(train_data_path)
train_X, train_y, n_vocab, int_to_note = preprocess_samples(train_X, train_y)
save_model_path = sys.argv[2]
epochs = int(sys.argv[3])
model = Sequential()
model.add(LSTM(512, input_shape=(train_X.shape[1], train_X.shape[2]), return_sequences=True))
model.add(LSTM(512, return_sequences=True))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
# Train the model for the number of epochs given as a parameter
model.fit(train_X, train_y, epochs=epochs, batch_size=64)
# Save the model together with the additional information
# that is needed to generate music from it
pickle.dump(model, open(save_model_path,'wb'))
pickle.dump((int_to_note, n_vocab, train_X.shape[1]), open('{}_dict'.format(save_model_path),'wb'))
print("Model saved to: {}".format(save_model_path))