epoch 0
train loss: 0.10178063126471196 valid loss scaled: 0.08694887030887369 valid loss: 14.393254678046581 valid loss clipped: 14.393254678046581
train loss: 0.08234276634246003 valid loss scaled: 0.0941885423619173 valid loss: 15.591687502080195 valid loss clipped: 15.591687502080195
train loss: 0.07840566897624576 valid loss scaled: 0.10380323876951889 valid loss: 17.18327815406788 valid loss clipped: 17.18327815406788
train loss: 0.07422883822886046 valid loss scaled: 0.10199038650759494 valid loss: 16.883180520434045 valid loss clipped: 16.883180520434045
train loss: 0.0715406045230223 valid loss scaled: 0.08943446505286708 valid loss: 14.8047131664572 valid loss clipped: 14.8047131664572
train loss: 0.07021876769740436 valid loss scaled: 0.0936252853835668 valid loss: 15.4984476046049 valid loss clipped: 15.498044701389192
train loss: 0.0698000478439966 valid loss scaled: 0.0732450713218563 valid loss: 12.124766595640176 valid loss clipped: 12.1245106015308
train loss: 0.06787076175194112 valid loss scaled: 0.08167600706479376 valid loss: 13.520399589246514 valid loss clipped: 13.520399589246514
train loss: 0.06687028820999129 valid loss scaled: 0.07860967955406604 valid loss: 13.012812666808495 valid loss clipped: 13.012812666808495
train loss: 0.06485297715840126 valid loss scaled: 0.06548883445499123 valid loss: 10.840826069687282 valid loss clipped: 10.840826069687282
train loss: 0.06536078446808377 valid loss scaled: 0.06285778998197525 valid loss: 10.405289848890389 valid loss clipped: 10.402398559859241
train loss: 0.06282014927278873 valid loss scaled: 0.0596510515677964 valid loss: 9.874458507430498 valid loss clipped: 9.87346211871712
train loss: 0.06211401904304488 valid loss scaled: 0.08676563866997164 valid loss: 14.362925503357548 valid loss clipped: 14.362736109627358
train loss: 0.06245454523502525 valid loss scaled: 0.061674062543468834 valid loss: 10.209341911143197 valid loss clipped: 10.209341911143197
train loss: 0.06125578036664975 valid loss scaled: 0.07651956140507003 valid loss: 12.666819321757812 valid loss clipped: 12.665205579153804
train loss: 0.059952159220873956 valid loss scaled: 0.060298738548431624 valid loss: 9.981674467979698 valid loss clipped: 9.981674467979698
train loss: 0.058864656505384534 valid loss scaled: 0.06224854018969625 valid loss: 10.304438001377102 valid loss clipped: 10.304438001377102
train loss: 0.06033644216377098 valid loss scaled: 0.06860725018131864 valid loss: 11.357033358430243 valid loss clipped: 11.355784159072005
train loss: 0.05914900476107049 valid loss scaled: 0.06558627462957999 valid loss: 10.856954814387644 valid loss clipped: 10.855340320357298
train loss: 0.05801354450611692 valid loss scaled: 0.05499361709851459 valid loss: 9.103479951662178 valid loss clipped: 9.1033016162492
train loss: 0.05722296967684818 valid loss scaled: 0.057276662331433935 valid loss: 9.481407818900754 valid loss clipped: 9.481123608784491
train loss: 0.05615422017549816 valid loss scaled: 0.058946735056267775 valid loss: 9.757867643250142 valid loss clipped: 9.757867643250142
train loss: 0.0568488666286758 valid loss scaled: 0.05493375345527444 valid loss: 9.093570558704585 valid loss clipped: 9.093570558704585
train loss: 0.056790975754785165 valid loss scaled: 0.05718280344549578 valid loss: 9.465868729560718 valid loss clipped: 9.465666290138364
train loss: 0.056136715463394736 valid loss scaled: 0.053061117528704765 valid loss: 8.783576806818123 valid loss clipped: 8.783576806818123
train loss: 0.0553076835671036 valid loss scaled: 0.05756376005144751 valid loss: 9.528932262971411 valid loss clipped: 9.528929186851473
train loss: 0.05608131737157289 valid loss scaled: 0.05928617719392673 valid loss: 9.814053006356623 valid loss clipped: 9.813086909431942
train loss: 0.054209274725265544 valid loss scaled: 0.059807781220226314 valid loss: 9.900402019838845 valid loss clipped: 9.899803739726012
train loss: 0.05412653998906017 valid loss scaled: 0.05289933204229587 valid loss: 8.75679344111035 valid loss clipped: 8.75679344111035
train loss: 0.05518407843528995 valid loss scaled: 0.05688392864725153 valid loss: 9.416393057892193 valid loss clipped: 9.416393057892193
train loss: 0.054482056935617085 valid loss scaled: 0.05216977017299728 valid loss: 8.636023503699716 valid loss clipped: 8.636023503699716
train loss: 0.05399237402441963 valid loss scaled: 0.05409103409701342 valid loss: 8.954068173354056 valid loss clipped: 8.954068173354056
train loss: 0.05278416502551686 valid loss scaled: 0.05505897657177228 valid loss: 9.114298554400836 valid loss clipped: 9.114128444401343
train loss: 0.05283459274043561 valid loss scaled: 0.05330468837007633 valid loss: 8.823897722342918 valid loss clipped: 8.823469673021748
train loss: 0.05349978882139668 valid loss scaled: 0.05340672519610164 valid loss: 8.840791683139289 valid loss clipped: 8.840791683139289
train loss: 0.05244616329396176 valid loss scaled: 0.05313163808650701 valid loss: 8.795251288181513 valid loss clipped: 8.795251288181513
train loss: 0.05458822252512801 valid loss scaled: 0.05058266187593807 valid loss: 8.373303749237403 valid loss clipped: 8.372144978003309
train loss: 0.052178348999718266 valid loss scaled: 0.05477028984676988 valid loss: 9.066509788893315 valid loss clipped: 9.066509788893315
train loss: 0.05264708004419216 valid loss scaled: 0.05061255628492586 valid loss: 8.378252333491037 valid loss clipped: 8.378252333491037
epoch 0 done, EVALUATION ON FULL DEV:
valid loss scaled: 0.05304482890568026 valid loss: 8.780882018378279 valid loss clipped: 8.780282676016194
evaluation done
epoch 1
train loss: 0.05193906537483793 valid loss scaled: 0.05525216534943099 valid loss: 9.146276710629154 valid loss clipped: 9.146276710629154
train loss: 0.05052301682721696 valid loss scaled: 0.05013342859224921 valid loss: 8.298936329028248 valid loss clipped: 8.298936329028248
train loss: 0.05098356105392436 valid loss scaled: 0.05003979649887808 valid loss: 8.283437402089449 valid loss clipped: 8.283437402089449
train loss: 0.050088782917609716 valid loss scaled: 0.049286247248487895 valid loss: 8.15869774398909 valid loss clipped: 8.15869774398909
train loss: 0.04976562104718302 valid loss scaled: 0.049223300781573644 valid loss: 8.148276889380307 valid loss clipped: 8.148276889380307
train loss: 0.04991738679144195 valid loss scaled: 0.057702729077406346 valid loss: 9.551934437355772 valid loss clipped: 9.550634254813586
train loss: 0.049231458019302426 valid loss scaled: 0.04882268479584859 valid loss: 8.081957979949655 valid loss clipped: 8.081957979949655
train loss: 0.04899937393720371 valid loss scaled: 0.05770907825550645 valid loss: 9.552989047406287 valid loss clipped: 9.552741958522342
train loss: 0.04908777283465607 valid loss scaled: 0.05134136848123586 valid loss: 8.498893804182858 valid loss clipped: 8.49871165029239
train loss: 0.04764838316443966 valid loss scaled: 0.05261156122562749 valid loss: 8.709157880799538 valid loss clipped: 8.709138480005972
train loss: 0.048582658014159445 valid loss scaled: 0.058413308832729716 valid loss: 9.669566298033944 valid loss clipped: 9.668254240676646
train loss: 0.04743617173694683 valid loss scaled: 0.051687093210397744 valid loss: 8.556127664837945 valid loss clipped: 8.556127664837945
train loss: 0.04656181514429912 valid loss scaled: 0.0507906026516647 valid loss: 8.407722978198628 valid loss clipped: 8.407722978198628
train loss: 0.04762343910575039 valid loss scaled: 0.04810272429882445 valid loss: 7.962782613217737 valid loss clipped: 7.962782613217737
train loss: 0.048021365618576804 valid loss scaled: 0.051694526258934385 valid loss: 8.55735450556255 valid loss clipped: 8.55735450556255
train loss: 0.04710475399633604 valid loss scaled: 0.04621696889011772 valid loss: 7.650616536714813 valid loss clipped: 7.650616536714813
train loss: 0.04589946658188546 valid loss scaled: 0.04597697057310632 valid loss: 7.610890791241026 valid loss clipped: 7.610890791241026
train loss: 0.047086027235787566 valid loss scaled: 0.04816656697145074 valid loss: 7.973350046367094 valid loss clipped: 7.973350046367094
train loss: 0.04701215419012016