epoch 1
train loss: 0.04859849993016349 valid loss scaled: 0.049774259863821814 valid loss: 8.239480139682152 valid loss clipped: 8.239480139682152
train loss: 0.048566044032315264 valid loss scaled: 0.05388942821210208 valid loss: 8.920693268651688 valid loss clipped: 8.920693268651688
train loss: 0.048645840325942186 valid loss scaled: 0.05395363235295337 valid loss: 8.931319168623325 valid loss clipped: 8.931319168623325
train loss: 0.04814379534040526 valid loss scaled: 0.0505393534476585 valid loss: 8.366129820820378 valid loss clipped: 8.366112817712773
train loss: 0.04762666338418628 valid loss scaled: 0.0559599088881008 valid loss: 9.263433639650358 valid loss clipped: 9.263433639650358
train loss: 0.047656307189366594 valid loss scaled: 0.05023614907590337 valid loss: 8.315944918367261 valid loss clipped: 8.315944918367261
train loss: 0.04783966918808628 valid loss scaled: 0.05002744041130589 valid loss: 8.281391949540064 valid loss clipped: 8.281391949540064
train loss: 0.0470857543381118 valid loss scaled: 0.05007423239511511 valid loss: 8.289137880244128 valid loss clipped: 8.289137880244128
train loss: 0.04748013890490943 valid loss scaled: 0.048425306807186 valid loss: 8.01617955284889 valid loss clipped: 8.01617955284889
train loss: 0.04756126793491012 valid loss scaled: 0.05838172127632288 valid loss: 9.664334319586205 valid loss clipped: 9.664334319586205
train loss: 0.04653691036105432 valid loss scaled: 0.04918750869152843 valid loss: 8.14234734821747 valid loss clipped: 8.14225860100907
train loss: 0.047854342213726093 valid loss scaled: 0.04825916678037553 valid loss: 7.988678617267059 valid loss clipped: 7.988678617267059
train loss: 0.04773736108071371 valid loss scaled: 0.04693122838927694 valid loss: 7.768854512974581 valid loss clipped: 7.768764906393828
train loss: 0.04762865603151333 valid loss scaled: 0.05211583033031525 valid loss: 8.627096447506986 valid loss clipped: 8.627096447506986
train loss: 0.04704306549118133 valid loss scaled: 0.048404824273360625 valid loss: 8.012792317972876 valid loss clipped: 8.012792317972876
train loss: 0.047052753271308353 valid loss scaled: 0.047940905819710194 valid loss: 7.935991222098737 valid loss clipped: 7.935991222098737
train loss: 0.047227875012458086 valid loss scaled: 0.04819175795767938 valid loss: 7.977517309701107 valid loss clipped: 7.977427346370881
train loss: 0.04710633027268819 valid loss scaled: 0.06383742791518349 valid loss: 10.567450933423165 valid loss clipped: 10.567450933423165
train loss: 0.04722847116132579 valid loss scaled: 0.050155074146505475 valid loss: 8.302519880598078 valid loss clipped: 8.302519880598078
train loss: 0.04704662696396088 valid loss scaled: 0.049731758847849555 valid loss: 8.232442667326458 valid loss clipped: 8.231381949146131
train loss: 0.04640111281869779 valid loss scaled: 0.04439513521075078 valid loss: 7.349034015457206 valid loss clipped: 7.349034015457206
train loss: 0.04608356887790085 valid loss scaled: 0.04921432936453068 valid loss: 8.146788631177284 valid loss clipped: 8.146788631177284
train loss: 0.04667254772723141 valid loss scaled: 0.04490475325651902 valid loss: 7.433397045419523 valid loss clipped: 7.433397045419523
train loss: 0.04732138091412676 valid loss scaled: 0.050358595447874285 valid loss: 8.336210587795785 valid loss clipped: 8.336210587795785
train loss: 0.046673220858461255 valid loss scaled: 0.05013948923902952 valid loss: 8.29994005436984 valid loss clipped: 8.29994005436984
train loss: 0.04553470963703993 valid loss scaled: 0.04682243262330682 valid loss: 7.750845583038412 valid loss clipped: 7.750845583038412
train loss: 0.046360710414935315 valid loss scaled: 0.045567268129456086 valid loss: 7.543068199989672 valid loss clipped: 7.543068199989672
train loss: 0.04552500833500083 valid loss scaled: 0.04640085875333069 valid loss: 7.681058866001286 valid loss clipped: 7.681058866001286
train loss: 0.0452476728574542 valid loss scaled: 0.046615049866360565 valid loss: 7.716513190971443 valid loss clipped: 7.716513190971443
train loss: 0.045669575255858894 valid loss scaled: 0.04750326795304539 valid loss: 7.8635460404556 valid loss clipped: 7.8635460404556
train loss: 0.04647987461932483 valid loss scaled: 0.04860825732568691 valid loss: 8.046460836562115 valid loss clipped: 8.046460836562115
train loss: 0.046793957499830896 valid loss scaled: 0.04575770256354883 valid loss: 7.574592460104707 valid loss clipped: 7.574592460104707
train loss: 0.04585571441358016 valid loss scaled: 0.04501184464469181 valid loss: 7.451126169626989 valid loss clipped: 7.451126169626989
train loss: 0.04610028065682122 valid loss scaled: 0.046051780212043245 valid loss: 7.623273458385178 valid loss clipped: 7.623083864668157
train loss: 0.046177845466100334 valid loss scaled: 0.06025220972281953 valid loss: 9.973968222380467 valid loss clipped: 9.973968222380467
train loss: 0.044866837962671394 valid loss scaled: 0.04543754842407298 valid loss: 7.5215926794190535 valid loss clipped: 7.5215926794190535
train loss: 0.044414652746250796 valid loss scaled: 0.044867695763598454 valid loss: 7.4272631776866636 valid loss clipped: 7.427128339884669
train loss: 0.04534780122814873 valid loss scaled: 0.045269020539499126 valid loss: 7.493695395829885 valid loss clipped: 7.493620279412681
train loss: 0.04439006437582844 valid loss scaled: 0.04507342677914177 valid loss: 7.461320119658475 valid loss clipped: 7.461320119658475
epoch 1 done, EVALUATION ON FULL DEV: valid loss scaled: 0.04480323708981578 valid loss: 7.416593515309324 valid loss clipped: 7.416387303935181
evaluation done
epoch 2
train loss: 0.04156288899572168 valid loss scaled: 0.04604067304159161 valid loss: 7.621432637256123 valid loss clipped: 7.621432637256123
train loss: 0.04100707919178113 valid loss scaled: 0.052927161282336356 valid loss: 8.761404249024789 valid loss clipped: 8.761404249024789
train loss: 0.04102466219659839 valid loss scaled: 0.0449000373685904 valid loss: 7.432613525687441 valid loss clipped: 7.432613525687441
train loss: 0.04056007982480108 valid loss scaled: 0.04401167129402493 valid loss: 7.285561051000317 valid loss clipped: 7.285561051000317
train loss: 0.04043435892664155 valid loss scaled: 0.0482961730545656 valid loss: 7.9948042408749025 valid loss clipped: 7.9948042408749025
train loss: 0.040083140595004084 valid loss scaled: 0.05474324524206743 valid loss: 9.062036144331996 valid loss clipped: 9.061846768141493
train loss: 0.040820699932255504 valid loss scaled: 0.0489683078886907 valid loss: 8.106066491790516 valid loss clipped: 8.106061770281157
train loss: 0.04011373971574097 valid loss scaled: 0.04683979676309513 valid loss: 7.75372067780672 valid loss clipped: 7.75372067780672
train loss: 0.040472179534386435 valid loss scaled: 0.04310859235241129 valid loss: 7.13606575431109 valid loss clipped: 7.13606575431109
train loss: 0.040441015502005524 valid loss scaled: 0.05050364392711698 valid loss: 8.360220506489952 valid loss clipped: 8.360220506489952
train loss: 0.03942231891194775 valid loss scaled: 0.0488088141837757 valid loss: 8.079661712550005 valid loss clipped: 8.079636855330948
train loss: 0.04015028625372404 valid loss scaled: 0.04516411367366714 valid loss: 7.476330067311157 valid loss clipped: 7.476182879344117
train loss: 0.04084930318361188 valid loss scaled: 0.04687766826105318 valid loss: 7.7599889055078215 valid loss clipped: 7.759971606690415
train loss: 0.04066393936852439 valid loss scaled: 0.04189752550714794 valid loss: 6.935592127831488 valid loss clipped: 6.935592127831488
train loss: 0.03997029890541135 valid loss scaled: 0.044517524088250444 valid loss: 7.369297241211859 valid loss clipped: 7.369000766181205
train loss: 0.03993799675444034 valid loss scaled: 0.04544862752523738 valid loss: 7.523431609784254 valid loss clipped: 7.523394388992367
train loss: 0.040077832674748126 valid loss scaled: 0.04967513279215436 valid loss: 8.223067645108816 valid loss clipped: 8.223067645108816
train loss: 0.03990507605068726 valid loss scaled: 0.05784423713065072 valid loss: 9.575362301216071 valid loss clipped: 9.575185345956458
train loss: 0.040589384846144605 valid loss scaled: 0.0496796067080015 valid loss: 8.223812362251003 valid loss clipped: 8.223812362251003
train loss: 0.04017873370107866 valid loss scaled: 0.046199073425154856 valid loss: 7.647655631709408 valid loss clipped: 7.647218458815767
train loss: 0.04044891785946012 valid loss scaled: 0.04150905843239381 valid loss: 6.871281648789468 valid loss clipped: 6.871281648789468
train loss: 0.03947431928291259 valid loss scaled: 0.05190902248718164 valid loss: 8.592867224099864 valid loss clipped: 8.592867224099864
train loss: 0.03990533801732083 valid loss scaled: 0.04276042828750439 valid loss: 7.078432226253377 valid loss clipped: 7.078432226253377
train loss: 0.040009430050001356 valid loss scaled: 0.04785424872916424 valid loss: 7.92164959531741 valid loss clipped: 7.92164959531741
train loss: 0.03973275025215787 valid loss scaled: 0.0463738630632847 valid loss: 7.676589189399943 valid loss clipped: 7.67643024694617
train loss: 0.038768686683078965 valid loss scaled: 0.04392041103001954 valid loss: 7.270451050508478 valid loss clipped: 7.270451050508478
train loss: 0.039348882183647814 valid loss scaled: 0.04542625701933914 valid loss: 7.519721720555645 valid loss clipped: 7.519588671593333
train loss: 0.03904904227063566 valid loss scaled: 0.04735228840997471 valid loss: 7.838551573747202 valid loss clipped: 7.8383747004264395
train loss: 0.03885343865845939 valid loss scaled: 0.04507136710901525 valid loss: 7.460976738161493 valid loss clipped: 7.460948397077222
train loss: 0.039375428725538786 valid loss scaled: 0.049669896283577444 valid loss: 8.222201176726562 valid loss clipped: 8.222057108632997
train loss: 0.03995204566972607 valid loss scaled: 0.048435733190966335 valid loss: 8.017900004946629 valid loss clipped: 8.017900004946629
train loss: 0.0398491290780488 valid loss scaled: 0.04720521208137839 valid loss: 7.814206857210747 valid loss clipped: 7.814180725425085
train loss: 0.039705720615770324 valid loss scaled: 0.04476360371587965 valid loss: 7.410029695071655 valid loss clipped: 7.410029695071655
train loss: 0.03922532949685088 valid loss scaled: 0.043110160856005976 valid loss: 7.136322356669672 valid loss clipped: 7.136322356669672
train loss: 0.03944114218638488 valid loss scaled: 0.05162181950303874 valid loss: 8.54532049172949 valid loss clipped: 8.545318087656034
train loss: 0.038248045063862056 valid loss scaled: 0.045729077826710006 valid loss: 7.5698496787836405 valid loss clipped: 7.5698496787836405
train loss: 0.038367600972177264 valid loss scaled: 0.045730969331967096 valid loss: 7.570170291346379 valid loss clipped: 7.570170291346379
train loss: 0.03882607528483637 valid loss scaled: 0.04907676118577929 valid loss: 8.124020217147429 valid loss clipped: 8.124020217147429
train loss: 0.03804710767993723 valid loss scaled: 0.043570846150311024 valid loss: 7.2125900868664665 valid loss clipped: 7.2125900868664665
epoch 2 done, EVALUATION ON FULL DEV: valid loss scaled: 0.04305710652707711 valid loss: 7.127543370821652 valid loss clipped: 7.127414565514862
evaluation done
epoch 3
train loss: 0.03567875976729889 valid loss scaled: 0.0435919204035305 valid loss: 7.216075509096017 valid loss clipped: 7.216075509096017
train loss: 0.03612129883597582 valid loss scaled: 0.05422117357473846 valid loss: 8.975609535535733 valid loss clipped: 8.975609535535733
train loss: 0.03610084946473125 valid loss scaled: 0.04305326329624392 valid loss: 7.126904973342827 valid loss clipped: 7.126904973342827
train loss: 0.0354239810019526 valid loss scaled: 0.041925094025886615 valid loss: 6.940151276296263 valid loss clipped: 6.940151276296263
train loss: 0.0352127675835109 valid loss scaled: 0.0480450296753359 valid loss: 7.9532327819071025 valid loss clipped: 7.9532327819071025
train loss: 0.03504781431121313 valid loss scaled: 0.05108940815500588 valid loss: 8.457186233906421 valid loss clipped: 8.45717667639358
train loss: 0.034990980345574955 valid loss scaled: 0.05037606369447504 valid loss: 8.339104845822392 valid loss clipped: 8.339104845822392
train loss: 0.034820230333430666 valid loss scaled: 0.04548966136491771 valid loss: 7.530221839492527 valid loss clipped: 7.530221839492527
train loss: 0.034814962549952325 valid loss scaled: 0.044887413795706835 valid loss: 7.430525821074159 valid loss clipped: 7.430525821074159
train loss: 0.035407486527384235 valid loss scaled: 0.04845567947436388 valid loss: 8.021209744958515 valid loss clipped: 8.021177381420596