[2022-01-14 18:11:57] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-14 18:11:57] [marian] Running on s470607-gpu as process 1795 with command line: [2022-01-14 18:11:57] [marian] ../marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings-all --exponential-smoothing --log /home/wmi/train.log --vocabs /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 --after-epochs 1 [2022-01-14 18:11:57] [config] after: 0e [2022-01-14 18:11:57] [config] after-batches: 0 [2022-01-14 18:11:57] [config] after-epochs: 1 [2022-01-14 18:11:57] [config] all-caps-every: 0 [2022-01-14 18:11:57] [config] allow-unk: false [2022-01-14 18:11:57] [config] authors: false [2022-01-14 18:11:57] [config] beam-size: 6 [2022-01-14 18:11:57] [config] bert-class-symbol: "[CLS]" [2022-01-14 18:11:57] [config] bert-mask-symbol: "[MASK]" [2022-01-14 18:11:57] [config] bert-masking-fraction: 0.15 [2022-01-14 18:11:57] [config] bert-sep-symbol: "[SEP]" [2022-01-14 18:11:57] [config] bert-train-type-embeddings: true [2022-01-14 18:11:57] [config] bert-type-vocab-size: 2 [2022-01-14 18:11:57] [config] build-info: "" [2022-01-14 18:11:57] [config] cite: false [2022-01-14 18:11:57] [config] clip-norm: 5 [2022-01-14 18:11:57] [config] cost-scaling: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] cost-type: ce-sum [2022-01-14 18:11:57] [config] cpu-threads: 0 [2022-01-14 18:11:57] [config] data-weighting: "" [2022-01-14 18:11:57] [config] data-weighting-type: sentence [2022-01-14 18:11:57] [config] dec-cell: gru [2022-01-14 18:11:57] [config] dec-cell-base-depth: 2 [2022-01-14 18:11:57] [config] dec-cell-high-depth: 1 [2022-01-14 18:11:57] [config] dec-depth: 6 [2022-01-14 18:11:57] [config] devices: [2022-01-14 18:11:57] [config] - 0 [2022-01-14 18:11:57] [config] dim-emb: 512 [2022-01-14 18:11:57] [config] dim-rnn: 1024 [2022-01-14 18:11:57] [config] dim-vocabs: [2022-01-14 18:11:57] [config] - 0 [2022-01-14 18:11:57] [config] - 0 [2022-01-14 18:11:57] [config] disp-first: 0 [2022-01-14 18:11:57] [config] disp-freq: 500 [2022-01-14 18:11:57] [config] disp-label-counts: true [2022-01-14 18:11:57] [config] dropout-rnn: 0 [2022-01-14 18:11:57] [config] dropout-src: 0 [2022-01-14 18:11:57] [config] dropout-trg: 0 [2022-01-14 18:11:57] [config] dump-config: "" [2022-01-14 18:11:57] [config] early-stopping: 10 [2022-01-14 18:11:57] [config] embedding-fix-src: false [2022-01-14 18:11:57] [config] embedding-fix-trg: false [2022-01-14 18:11:57] [config] embedding-normalization: false [2022-01-14 18:11:57] [config] embedding-vectors: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] enc-cell: gru [2022-01-14 18:11:57] [config] enc-cell-depth: 1 [2022-01-14 18:11:57] [config] enc-depth: 6 [2022-01-14 18:11:57] [config] enc-type: bidirectional [2022-01-14 18:11:57] [config] english-title-case-every: 0 [2022-01-14 18:11:57] [config] 
exponential-smoothing: 0.0001 [2022-01-14 18:11:57] [config] factor-weight: 1 [2022-01-14 18:11:57] [config] grad-dropping-momentum: 0 [2022-01-14 18:11:57] [config] grad-dropping-rate: 0 [2022-01-14 18:11:57] [config] grad-dropping-warmup: 100 [2022-01-14 18:11:57] [config] gradient-checkpointing: false [2022-01-14 18:11:57] [config] guided-alignment: none [2022-01-14 18:11:57] [config] guided-alignment-cost: mse [2022-01-14 18:11:57] [config] guided-alignment-weight: 0.1 [2022-01-14 18:11:57] [config] ignore-model-config: false [2022-01-14 18:11:57] [config] input-types: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] interpolate-env-vars: false [2022-01-14 18:11:57] [config] keep-best: false [2022-01-14 18:11:57] [config] label-smoothing: 0.1 [2022-01-14 18:11:57] [config] layer-normalization: false [2022-01-14 18:11:57] [config] learn-rate: 0.0003 [2022-01-14 18:11:57] [config] lemma-dim-emb: 0 [2022-01-14 18:11:57] [config] log: /home/wmi/train.log [2022-01-14 18:11:57] [config] log-level: info [2022-01-14 18:11:57] [config] log-time-zone: "" [2022-01-14 18:11:57] [config] logical-epoch: [2022-01-14 18:11:57] [config] - 1e [2022-01-14 18:11:57] [config] - 0 [2022-01-14 18:11:57] [config] lr-decay: 0 [2022-01-14 18:11:57] [config] lr-decay-freq: 50000 [2022-01-14 18:11:57] [config] lr-decay-inv-sqrt: [2022-01-14 18:11:57] [config] - 16000 [2022-01-14 18:11:57] [config] lr-decay-repeat-warmup: false [2022-01-14 18:11:57] [config] lr-decay-reset-optimizer: false [2022-01-14 18:11:57] [config] lr-decay-start: [2022-01-14 18:11:57] [config] - 10 [2022-01-14 18:11:57] [config] - 1 [2022-01-14 18:11:57] [config] lr-decay-strategy: epoch+stalled [2022-01-14 18:11:57] [config] lr-report: true [2022-01-14 18:11:57] [config] lr-warmup: 16000 [2022-01-14 18:11:57] [config] lr-warmup-at-reload: false [2022-01-14 18:11:57] [config] lr-warmup-cycle: false [2022-01-14 18:11:57] [config] lr-warmup-start-rate: 0 [2022-01-14 18:11:57] [config] max-length: 100 [2022-01-14 18:11:57] [config] max-length-crop: false [2022-01-14 18:11:57] [config] max-length-factor: 3 [2022-01-14 18:11:57] [config] maxi-batch: 1000 [2022-01-14 18:11:57] [config] maxi-batch-sort: trg [2022-01-14 18:11:57] [config] mini-batch: 64 [2022-01-14 18:11:57] [config] mini-batch-fit: true [2022-01-14 18:11:57] [config] mini-batch-fit-step: 10 [2022-01-14 18:11:57] [config] mini-batch-track-lr: false [2022-01-14 18:11:57] [config] mini-batch-warmup: 0 [2022-01-14 18:11:57] [config] mini-batch-words: 0 [2022-01-14 18:11:57] [config] mini-batch-words-ref: 0 [2022-01-14 18:11:57] [config] model: model.npz [2022-01-14 18:11:57] [config] multi-loss-type: sum [2022-01-14 18:11:57] [config] multi-node: false [2022-01-14 18:11:57] [config] multi-node-overlap: true [2022-01-14 18:11:57] [config] n-best: false [2022-01-14 18:11:57] [config] no-nccl: false [2022-01-14 18:11:57] [config] no-reload: false [2022-01-14 18:11:57] [config] no-restore-corpus: false [2022-01-14 18:11:57] [config] normalize: 0.6 [2022-01-14 18:11:57] [config] normalize-gradient: false [2022-01-14 18:11:57] [config] num-devices: 0 [2022-01-14 18:11:57] [config] optimizer: adam [2022-01-14 18:11:57] [config] optimizer-delay: 1 [2022-01-14 18:11:57] [config] optimizer-params: [2022-01-14 18:11:57] [config] - 0.9 [2022-01-14 18:11:57] [config] - 0.98 [2022-01-14 18:11:57] [config] - 1e-09 [2022-01-14 18:11:57] [config] output-omit-bias: false [2022-01-14 18:11:57] [config] overwrite: true [2022-01-14 18:11:57] [config] precision: [2022-01-14 18:11:57] 
[config] - float32 [2022-01-14 18:11:57] [config] - float32 [2022-01-14 18:11:57] [config] - float32 [2022-01-14 18:11:57] [config] pretrained-model: "" [2022-01-14 18:11:57] [config] quantize-biases: false [2022-01-14 18:11:57] [config] quantize-bits: 0 [2022-01-14 18:11:57] [config] quantize-log-based: false [2022-01-14 18:11:57] [config] quantize-optimization-steps: 0 [2022-01-14 18:11:57] [config] quiet: false [2022-01-14 18:11:57] [config] quiet-translation: false [2022-01-14 18:11:57] [config] relative-paths: false [2022-01-14 18:11:57] [config] right-left: false [2022-01-14 18:11:57] [config] save-freq: 5000 [2022-01-14 18:11:57] [config] seed: 0 [2022-01-14 18:11:57] [config] sentencepiece-alphas: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] sentencepiece-max-lines: 2000000 [2022-01-14 18:11:57] [config] sentencepiece-options: "" [2022-01-14 18:11:57] [config] shuffle: data [2022-01-14 18:11:57] [config] shuffle-in-ram: false [2022-01-14 18:11:57] [config] sigterm: save-and-exit [2022-01-14 18:11:57] [config] skip: false [2022-01-14 18:11:57] [config] sqlite: "" [2022-01-14 18:11:57] [config] sqlite-drop: false [2022-01-14 18:11:57] [config] sync-sgd: false [2022-01-14 18:11:57] [config] tempdir: /tmp [2022-01-14 18:11:57] [config] tied-embeddings: false [2022-01-14 18:11:57] [config] tied-embeddings-all: true [2022-01-14 18:11:57] [config] tied-embeddings-src: false [2022-01-14 18:11:57] [config] train-embedder-rank: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] train-sets: [2022-01-14 18:11:57] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en [2022-01-14 18:11:57] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl [2022-01-14 18:11:57] [config] transformer-aan-activation: swish [2022-01-14 18:11:57] [config] transformer-aan-depth: 2 [2022-01-14 18:11:57] [config] transformer-aan-nogate: false [2022-01-14 18:11:57] [config] transformer-decoder-autoreg: self-attention [2022-01-14 18:11:57] [config] transformer-depth-scaling: false [2022-01-14 18:11:57] [config] transformer-dim-aan: 2048 [2022-01-14 18:11:57] [config] transformer-dim-ffn: 2048 [2022-01-14 18:11:57] [config] transformer-dropout: 0.1 [2022-01-14 18:11:57] [config] transformer-dropout-attention: 0 [2022-01-14 18:11:57] [config] transformer-dropout-ffn: 0 [2022-01-14 18:11:57] [config] transformer-ffn-activation: swish [2022-01-14 18:11:57] [config] transformer-ffn-depth: 2 [2022-01-14 18:11:57] [config] transformer-guided-alignment-layer: last [2022-01-14 18:11:57] [config] transformer-heads: 8 [2022-01-14 18:11:57] [config] transformer-no-projection: false [2022-01-14 18:11:57] [config] transformer-pool: false [2022-01-14 18:11:57] [config] transformer-postprocess: dan [2022-01-14 18:11:57] [config] transformer-postprocess-emb: d [2022-01-14 18:11:57] [config] transformer-postprocess-top: "" [2022-01-14 18:11:57] [config] transformer-preprocess: "" [2022-01-14 18:11:57] [config] transformer-tied-layers: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] transformer-train-position-embeddings: false [2022-01-14 18:11:57] [config] tsv: false [2022-01-14 18:11:57] [config] tsv-fields: 0 [2022-01-14 18:11:57] [config] type: transformer [2022-01-14 18:11:57] [config] ulr: false [2022-01-14 18:11:57] [config] ulr-dim-emb: 0 [2022-01-14 18:11:57] [config] ulr-dropout: 0 [2022-01-14 18:11:57] [config] ulr-keys-vectors: "" [2022-01-14 18:11:57] [config] ulr-query-vectors: "" [2022-01-14 18:11:57] [config] ulr-softmax-temperature: 1 
[2022-01-14 18:11:57] [config] ulr-trainable-transformation: false [2022-01-14 18:11:57] [config] unlikelihood-loss: false [2022-01-14 18:11:57] [config] valid-freq: 5000 [2022-01-14 18:11:57] [config] valid-log: "" [2022-01-14 18:11:57] [config] valid-max-length: 1000 [2022-01-14 18:11:57] [config] valid-metrics: [2022-01-14 18:11:57] [config] - cross-entropy [2022-01-14 18:11:57] [config] valid-mini-batch: 32 [2022-01-14 18:11:57] [config] valid-reset-stalled: false [2022-01-14 18:11:57] [config] valid-script-args: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] valid-script-path: "" [2022-01-14 18:11:57] [config] valid-sets: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] valid-translation-output: "" [2022-01-14 18:11:57] [config] vocabs: [2022-01-14 18:11:57] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-14 18:11:57] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 [2022-01-14 18:11:57] [config] word-penalty: 0 [2022-01-14 18:11:57] [config] word-scores: false [2022-01-14 18:11:57] [config] workspace: 10000 [2022-01-14 18:11:57] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-14 18:11:57] [training] Using single-device training [2022-01-14 18:11:57] [data] Loading vocabulary from text file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-14 18:11:57] Error: DefaultVocabulary file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 is expected to contain an entry for [2022-01-14 18:11:57] Error: Aborted from marian::DefaultVocab::addRequiredVocabulary(const string&, bool):: in /home/wmi/Workspace/marian/src/data/default_vocab.cpp:199 [CALL STACK] [0x55eb3f84e0d8] marian::DefaultVocab::addRequiredVocabulary(std::__cxx11::basic_string,std::allocator> const&,bool)::{lambda(std::__cxx11::basic_string,std::allocator> const&,std::__cxx11::basic_string,std::allocator> const&,marian::Word)#1}:: operator() (std::__cxx11::basic_string,std::allocator> const&, std::__cxx11::basic_string,std::allocator> const&, marian::Word) const + 0x4a8 [0x55eb3f84e5d9] marian::DefaultVocab:: addRequiredVocabulary (std::__cxx11::basic_string,std::allocator> const&, bool) + 0x59 [0x55eb3f85168b] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0xb9b [0x55eb3f840e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x55eb3f841728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x55eb3f88d189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x55eb3f8a0084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x55eb3f6fef8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x55eb3f78694b] marian::Train:: run () + 0x19cb [0x55eb3f68d389] mainTrainer (int, char**) + 0x5e9 [0x55eb3f64b1bc] main + 0x3c [0x7feb9a0910b3] __libc_start_main + 0xf3 [0x55eb3f68bb0e] _start + 0x2e [2022-01-14 18:57:10] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-14 18:57:10] [marian] Running on s470607-gpu as process 1959 with command line: [2022-01-14 18:57:10] [marian] ../marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en 
/home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings-all --exponential-smoothing --log /home/wmi/train.log --vocabs /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 --after-epochs 1 [2022-01-14 18:57:10] [config] after: 0e [2022-01-14 18:57:10] [config] after-batches: 0 [2022-01-14 18:57:10] [config] after-epochs: 1 [2022-01-14 18:57:10] [config] all-caps-every: 0 [2022-01-14 18:57:10] [config] allow-unk: false [2022-01-14 18:57:10] [config] authors: false [2022-01-14 18:57:10] [config] beam-size: 6 [2022-01-14 18:57:10] [config] bert-class-symbol: "[CLS]" [2022-01-14 18:57:10] [config] bert-mask-symbol: "[MASK]" [2022-01-14 18:57:10] [config] bert-masking-fraction: 0.15 [2022-01-14 18:57:10] [config] bert-sep-symbol: "[SEP]" [2022-01-14 18:57:10] [config] bert-train-type-embeddings: true [2022-01-14 18:57:10] [config] bert-type-vocab-size: 2 [2022-01-14 18:57:10] [config] build-info: "" [2022-01-14 18:57:10] [config] cite: false [2022-01-14 18:57:10] [config] clip-norm: 5 [2022-01-14 18:57:10] [config] cost-scaling: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] cost-type: ce-sum [2022-01-14 18:57:10] [config] cpu-threads: 0 [2022-01-14 18:57:10] [config] data-weighting: "" [2022-01-14 18:57:10] [config] data-weighting-type: sentence [2022-01-14 18:57:10] [config] dec-cell: gru [2022-01-14 18:57:10] [config] dec-cell-base-depth: 2 [2022-01-14 18:57:10] [config] dec-cell-high-depth: 1 [2022-01-14 18:57:10] [config] dec-depth: 6 [2022-01-14 18:57:10] [config] devices: [2022-01-14 18:57:10] [config] - 0 [2022-01-14 18:57:10] [config] dim-emb: 512 [2022-01-14 18:57:10] [config] dim-rnn: 1024 [2022-01-14 18:57:10] [config] dim-vocabs: [2022-01-14 18:57:10] [config] - 0 [2022-01-14 18:57:10] [config] - 0 [2022-01-14 18:57:10] [config] disp-first: 0 [2022-01-14 18:57:10] [config] disp-freq: 500 [2022-01-14 18:57:10] [config] disp-label-counts: true [2022-01-14 18:57:10] [config] dropout-rnn: 0 [2022-01-14 18:57:10] [config] dropout-src: 0 [2022-01-14 18:57:10] [config] dropout-trg: 0 [2022-01-14 18:57:10] [config] dump-config: "" [2022-01-14 18:57:10] [config] early-stopping: 10 [2022-01-14 18:57:10] [config] embedding-fix-src: false [2022-01-14 18:57:10] [config] embedding-fix-trg: false [2022-01-14 18:57:10] [config] embedding-normalization: false [2022-01-14 18:57:10] [config] embedding-vectors: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] enc-cell: gru [2022-01-14 18:57:10] [config] enc-cell-depth: 1 [2022-01-14 18:57:10] [config] enc-depth: 6 [2022-01-14 18:57:10] [config] enc-type: bidirectional [2022-01-14 18:57:10] [config] english-title-case-every: 0 [2022-01-14 18:57:10] [config] exponential-smoothing: 0.0001 [2022-01-14 18:57:10] [config] factor-weight: 1 [2022-01-14 18:57:10] [config] grad-dropping-momentum: 0 [2022-01-14 18:57:10] [config] grad-dropping-rate: 0 [2022-01-14 18:57:10] [config] grad-dropping-warmup: 100 [2022-01-14 18:57:10] [config] gradient-checkpointing: false [2022-01-14 18:57:10] [config] 
guided-alignment: none [2022-01-14 18:57:10] [config] guided-alignment-cost: mse [2022-01-14 18:57:10] [config] guided-alignment-weight: 0.1 [2022-01-14 18:57:10] [config] ignore-model-config: false [2022-01-14 18:57:10] [config] input-types: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] interpolate-env-vars: false [2022-01-14 18:57:10] [config] keep-best: false [2022-01-14 18:57:10] [config] label-smoothing: 0.1 [2022-01-14 18:57:10] [config] layer-normalization: false [2022-01-14 18:57:10] [config] learn-rate: 0.0003 [2022-01-14 18:57:10] [config] lemma-dim-emb: 0 [2022-01-14 18:57:10] [config] log: /home/wmi/train.log [2022-01-14 18:57:10] [config] log-level: info [2022-01-14 18:57:10] [config] log-time-zone: "" [2022-01-14 18:57:10] [config] logical-epoch: [2022-01-14 18:57:10] [config] - 1e [2022-01-14 18:57:10] [config] - 0 [2022-01-14 18:57:10] [config] lr-decay: 0 [2022-01-14 18:57:10] [config] lr-decay-freq: 50000 [2022-01-14 18:57:10] [config] lr-decay-inv-sqrt: [2022-01-14 18:57:10] [config] - 16000 [2022-01-14 18:57:10] [config] lr-decay-repeat-warmup: false [2022-01-14 18:57:10] [config] lr-decay-reset-optimizer: false [2022-01-14 18:57:10] [config] lr-decay-start: [2022-01-14 18:57:10] [config] - 10 [2022-01-14 18:57:10] [config] - 1 [2022-01-14 18:57:10] [config] lr-decay-strategy: epoch+stalled [2022-01-14 18:57:10] [config] lr-report: true [2022-01-14 18:57:10] [config] lr-warmup: 16000 [2022-01-14 18:57:10] [config] lr-warmup-at-reload: false [2022-01-14 18:57:10] [config] lr-warmup-cycle: false [2022-01-14 18:57:10] [config] lr-warmup-start-rate: 0 [2022-01-14 18:57:10] [config] max-length: 100 [2022-01-14 18:57:10] [config] max-length-crop: false [2022-01-14 18:57:10] [config] max-length-factor: 3 [2022-01-14 18:57:10] [config] maxi-batch: 1000 [2022-01-14 18:57:10] [config] maxi-batch-sort: trg [2022-01-14 18:57:10] [config] mini-batch: 64 [2022-01-14 18:57:10] [config] mini-batch-fit: true [2022-01-14 18:57:10] [config] mini-batch-fit-step: 10 [2022-01-14 18:57:10] [config] mini-batch-track-lr: false [2022-01-14 18:57:10] [config] mini-batch-warmup: 0 [2022-01-14 18:57:10] [config] mini-batch-words: 0 [2022-01-14 18:57:10] [config] mini-batch-words-ref: 0 [2022-01-14 18:57:10] [config] model: model.npz [2022-01-14 18:57:10] [config] multi-loss-type: sum [2022-01-14 18:57:10] [config] multi-node: false [2022-01-14 18:57:10] [config] multi-node-overlap: true [2022-01-14 18:57:10] [config] n-best: false [2022-01-14 18:57:10] [config] no-nccl: false [2022-01-14 18:57:10] [config] no-reload: false [2022-01-14 18:57:10] [config] no-restore-corpus: false [2022-01-14 18:57:10] [config] normalize: 0.6 [2022-01-14 18:57:10] [config] normalize-gradient: false [2022-01-14 18:57:10] [config] num-devices: 0 [2022-01-14 18:57:10] [config] optimizer: adam [2022-01-14 18:57:10] [config] optimizer-delay: 1 [2022-01-14 18:57:10] [config] optimizer-params: [2022-01-14 18:57:10] [config] - 0.9 [2022-01-14 18:57:10] [config] - 0.98 [2022-01-14 18:57:10] [config] - 1e-09 [2022-01-14 18:57:10] [config] output-omit-bias: false [2022-01-14 18:57:10] [config] overwrite: true [2022-01-14 18:57:10] [config] precision: [2022-01-14 18:57:10] [config] - float32 [2022-01-14 18:57:10] [config] - float32 [2022-01-14 18:57:10] [config] - float32 [2022-01-14 18:57:10] [config] pretrained-model: "" [2022-01-14 18:57:10] [config] quantize-biases: false [2022-01-14 18:57:10] [config] quantize-bits: 0 [2022-01-14 18:57:10] [config] quantize-log-based: false [2022-01-14 18:57:10] [config] 
quantize-optimization-steps: 0 [2022-01-14 18:57:10] [config] quiet: false [2022-01-14 18:57:10] [config] quiet-translation: false [2022-01-14 18:57:10] [config] relative-paths: false [2022-01-14 18:57:10] [config] right-left: false [2022-01-14 18:57:10] [config] save-freq: 5000 [2022-01-14 18:57:10] [config] seed: 0 [2022-01-14 18:57:10] [config] sentencepiece-alphas: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] sentencepiece-max-lines: 2000000 [2022-01-14 18:57:10] [config] sentencepiece-options: "" [2022-01-14 18:57:10] [config] shuffle: data [2022-01-14 18:57:10] [config] shuffle-in-ram: false [2022-01-14 18:57:10] [config] sigterm: save-and-exit [2022-01-14 18:57:10] [config] skip: false [2022-01-14 18:57:10] [config] sqlite: "" [2022-01-14 18:57:10] [config] sqlite-drop: false [2022-01-14 18:57:10] [config] sync-sgd: false [2022-01-14 18:57:10] [config] tempdir: /tmp [2022-01-14 18:57:10] [config] tied-embeddings: false [2022-01-14 18:57:10] [config] tied-embeddings-all: true [2022-01-14 18:57:10] [config] tied-embeddings-src: false [2022-01-14 18:57:10] [config] train-embedder-rank: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] train-sets: [2022-01-14 18:57:10] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en [2022-01-14 18:57:10] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl [2022-01-14 18:57:10] [config] transformer-aan-activation: swish [2022-01-14 18:57:10] [config] transformer-aan-depth: 2 [2022-01-14 18:57:10] [config] transformer-aan-nogate: false [2022-01-14 18:57:10] [config] transformer-decoder-autoreg: self-attention [2022-01-14 18:57:10] [config] transformer-depth-scaling: false [2022-01-14 18:57:10] [config] transformer-dim-aan: 2048 [2022-01-14 18:57:10] [config] transformer-dim-ffn: 2048 [2022-01-14 18:57:10] [config] transformer-dropout: 0.1 [2022-01-14 18:57:10] [config] transformer-dropout-attention: 0 [2022-01-14 18:57:10] [config] transformer-dropout-ffn: 0 [2022-01-14 18:57:10] [config] transformer-ffn-activation: swish [2022-01-14 18:57:10] [config] transformer-ffn-depth: 2 [2022-01-14 18:57:10] [config] transformer-guided-alignment-layer: last [2022-01-14 18:57:10] [config] transformer-heads: 8 [2022-01-14 18:57:10] [config] transformer-no-projection: false [2022-01-14 18:57:10] [config] transformer-pool: false [2022-01-14 18:57:10] [config] transformer-postprocess: dan [2022-01-14 18:57:10] [config] transformer-postprocess-emb: d [2022-01-14 18:57:10] [config] transformer-postprocess-top: "" [2022-01-14 18:57:10] [config] transformer-preprocess: "" [2022-01-14 18:57:10] [config] transformer-tied-layers: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] transformer-train-position-embeddings: false [2022-01-14 18:57:10] [config] tsv: false [2022-01-14 18:57:10] [config] tsv-fields: 0 [2022-01-14 18:57:10] [config] type: transformer [2022-01-14 18:57:10] [config] ulr: false [2022-01-14 18:57:10] [config] ulr-dim-emb: 0 [2022-01-14 18:57:10] [config] ulr-dropout: 0 [2022-01-14 18:57:10] [config] ulr-keys-vectors: "" [2022-01-14 18:57:10] [config] ulr-query-vectors: "" [2022-01-14 18:57:10] [config] ulr-softmax-temperature: 1 [2022-01-14 18:57:10] [config] ulr-trainable-transformation: false [2022-01-14 18:57:10] [config] unlikelihood-loss: false [2022-01-14 18:57:10] [config] valid-freq: 5000 [2022-01-14 18:57:10] [config] valid-log: "" [2022-01-14 18:57:10] [config] valid-max-length: 1000 [2022-01-14 18:57:10] [config] valid-metrics: [2022-01-14 18:57:10] 
[config] - cross-entropy [2022-01-14 18:57:10] [config] valid-mini-batch: 32 [2022-01-14 18:57:10] [config] valid-reset-stalled: false [2022-01-14 18:57:10] [config] valid-script-args: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] valid-script-path: "" [2022-01-14 18:57:10] [config] valid-sets: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] valid-translation-output: "" [2022-01-14 18:57:10] [config] vocabs: [2022-01-14 18:57:10] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-14 18:57:10] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 [2022-01-14 18:57:10] [config] word-penalty: 0 [2022-01-14 18:57:10] [config] word-scores: false [2022-01-14 18:57:10] [config] workspace: 10000 [2022-01-14 18:57:10] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-14 18:57:10] [training] Using single-device training [2022-01-14 18:57:10] [data] Loading vocabulary from text file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-14 18:57:10] Error: DefaultVocabulary file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 is expected to contain an entry for [2022-01-14 18:57:10] Error: Aborted from marian::DefaultVocab::addRequiredVocabulary(const string&, bool):: in /home/wmi/Workspace/marian/src/data/default_vocab.cpp:199 [CALL STACK] [0x560bb35130d8] marian::DefaultVocab::addRequiredVocabulary(std::__cxx11::basic_string,std::allocator> const&,bool)::{lambda(std::__cxx11::basic_string,std::allocator> const&,std::__cxx11::basic_string,std::allocator> const&,marian::Word)#1}:: operator() (std::__cxx11::basic_string,std::allocator> const&, std::__cxx11::basic_string,std::allocator> const&, marian::Word) const + 0x4a8 [0x560bb35135d9] marian::DefaultVocab:: addRequiredVocabulary (std::__cxx11::basic_string,std::allocator> const&, bool) + 0x59 [0x560bb351668b] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0xb9b [0x560bb3505e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x560bb3506728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x560bb3552189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x560bb3565084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x560bb33c3f8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x560bb344b94b] marian::Train:: run () + 0x19cb [0x560bb3352389] mainTrainer (int, char**) + 0x5e9 [0x560bb33101bc] main + 0x3c [0x7fbe70a860b3] __libc_start_main + 0xf3 [0x560bb3350b0e] _start + 0x2e [2022-01-14 19:20:18] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-14 19:20:18] [marian] Running on s470607-gpu as process 2041 with command line: [2022-01-14 19:20:18] [marian] ../marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 
--lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings-all --exponential-smoothing --log /home/wmi/train.log --vocabs /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 --after-epochs 1 [2022-01-14 19:20:18] [config] after: 0e [2022-01-14 19:20:18] [config] after-batches: 0 [2022-01-14 19:20:18] [config] after-epochs: 1 [2022-01-14 19:20:18] [config] all-caps-every: 0 [2022-01-14 19:20:18] [config] allow-unk: false [2022-01-14 19:20:18] [config] authors: false [2022-01-14 19:20:18] [config] beam-size: 6 [2022-01-14 19:20:18] [config] bert-class-symbol: "[CLS]" [2022-01-14 19:20:18] [config] bert-mask-symbol: "[MASK]" [2022-01-14 19:20:18] [config] bert-masking-fraction: 0.15 [2022-01-14 19:20:18] [config] bert-sep-symbol: "[SEP]" [2022-01-14 19:20:18] [config] bert-train-type-embeddings: true [2022-01-14 19:20:18] [config] bert-type-vocab-size: 2 [2022-01-14 19:20:18] [config] build-info: "" [2022-01-14 19:20:18] [config] cite: false [2022-01-14 19:20:18] [config] clip-norm: 5 [2022-01-14 19:20:18] [config] cost-scaling: [2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] [config] cost-type: ce-sum [2022-01-14 19:20:18] [config] cpu-threads: 0 [2022-01-14 19:20:18] [config] data-weighting: "" [2022-01-14 19:20:18] [config] data-weighting-type: sentence [2022-01-14 19:20:18] [config] dec-cell: gru [2022-01-14 19:20:18] [config] dec-cell-base-depth: 2 [2022-01-14 19:20:18] [config] dec-cell-high-depth: 1 [2022-01-14 19:20:18] [config] dec-depth: 6 [2022-01-14 19:20:18] [config] devices: [2022-01-14 19:20:18] [config] - 0 [2022-01-14 19:20:18] [config] dim-emb: 512 [2022-01-14 19:20:18] [config] dim-rnn: 1024 [2022-01-14 19:20:18] [config] dim-vocabs: [2022-01-14 19:20:18] [config] - 0 [2022-01-14 19:20:18] [config] - 0 [2022-01-14 19:20:18] [config] disp-first: 0 [2022-01-14 19:20:18] [config] disp-freq: 500 [2022-01-14 19:20:18] [config] disp-label-counts: true [2022-01-14 19:20:18] [config] dropout-rnn: 0 [2022-01-14 19:20:18] [config] dropout-src: 0 [2022-01-14 19:20:18] [config] dropout-trg: 0 [2022-01-14 19:20:18] [config] dump-config: "" [2022-01-14 19:20:18] [config] early-stopping: 10 [2022-01-14 19:20:18] [config] embedding-fix-src: false [2022-01-14 19:20:18] [config] embedding-fix-trg: false [2022-01-14 19:20:18] [config] embedding-normalization: false [2022-01-14 19:20:18] [config] embedding-vectors: [2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] [config] enc-cell: gru [2022-01-14 19:20:18] [config] enc-cell-depth: 1 [2022-01-14 19:20:18] [config] enc-depth: 6 [2022-01-14 19:20:18] [config] enc-type: bidirectional [2022-01-14 19:20:18] [config] english-title-case-every: 0 [2022-01-14 19:20:18] [config] exponential-smoothing: 0.0001 [2022-01-14 19:20:18] [config] factor-weight: 1 [2022-01-14 19:20:18] [config] grad-dropping-momentum: 0 [2022-01-14 19:20:18] [config] grad-dropping-rate: 0 [2022-01-14 19:20:18] [config] grad-dropping-warmup: 100 [2022-01-14 19:20:18] [config] gradient-checkpointing: false [2022-01-14 19:20:18] [config] guided-alignment: none [2022-01-14 19:20:18] [config] guided-alignment-cost: mse [2022-01-14 19:20:18] [config] guided-alignment-weight: 0.1 [2022-01-14 19:20:18] [config] ignore-model-config: false [2022-01-14 19:20:18] [config] input-types: [2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] [config] interpolate-env-vars: false [2022-01-14 19:20:18] [config] keep-best: 
false [2022-01-14 19:20:18] [config] label-smoothing: 0.1 [2022-01-14 19:20:18] [config] layer-normalization: false [2022-01-14 19:20:18] [config] learn-rate: 0.0003 [2022-01-14 19:20:18] [config] lemma-dim-emb: 0 [2022-01-14 19:20:18] [config] log: /home/wmi/train.log [2022-01-14 19:20:18] [config] log-level: info [2022-01-14 19:20:18] [config] log-time-zone: "" [2022-01-14 19:20:18] [config] logical-epoch: [2022-01-14 19:20:18] [config] - 1e [2022-01-14 19:20:18] [config] - 0 [2022-01-14 19:20:18] [config] lr-decay: 0 [2022-01-14 19:20:18] [config] lr-decay-freq: 50000 [2022-01-14 19:20:18] [config] lr-decay-inv-sqrt: [2022-01-14 19:20:18] [config] - 16000 [2022-01-14 19:20:18] [config] lr-decay-repeat-warmup: false [2022-01-14 19:20:18] [config] lr-decay-reset-optimizer: false [2022-01-14 19:20:18] [config] lr-decay-start: [2022-01-14 19:20:18] [config] - 10 [2022-01-14 19:20:18] [config] - 1 [2022-01-14 19:20:18] [config] lr-decay-strategy: epoch+stalled [2022-01-14 19:20:18] [config] lr-report: true [2022-01-14 19:20:18] [config] lr-warmup: 16000 [2022-01-14 19:20:18] [config] lr-warmup-at-reload: false [2022-01-14 19:20:18] [config] lr-warmup-cycle: false [2022-01-14 19:20:18] [config] lr-warmup-start-rate: 0 [2022-01-14 19:20:18] [config] max-length: 100 [2022-01-14 19:20:18] [config] max-length-crop: false [2022-01-14 19:20:18] [config] max-length-factor: 3 [2022-01-14 19:20:18] [config] maxi-batch: 1000 [2022-01-14 19:20:18] [config] maxi-batch-sort: trg [2022-01-14 19:20:18] [config] mini-batch: 64 [2022-01-14 19:20:18] [config] mini-batch-fit: true [2022-01-14 19:20:18] [config] mini-batch-fit-step: 10 [2022-01-14 19:20:18] [config] mini-batch-track-lr: false [2022-01-14 19:20:18] [config] mini-batch-warmup: 0 [2022-01-14 19:20:18] [config] mini-batch-words: 0 [2022-01-14 19:20:18] [config] mini-batch-words-ref: 0 [2022-01-14 19:20:18] [config] model: model.npz [2022-01-14 19:20:18] [config] multi-loss-type: sum [2022-01-14 19:20:18] [config] multi-node: false [2022-01-14 19:20:18] [config] multi-node-overlap: true [2022-01-14 19:20:18] [config] n-best: false [2022-01-14 19:20:18] [config] no-nccl: false [2022-01-14 19:20:18] [config] no-reload: false [2022-01-14 19:20:18] [config] no-restore-corpus: false [2022-01-14 19:20:18] [config] normalize: 0.6 [2022-01-14 19:20:18] [config] normalize-gradient: false [2022-01-14 19:20:18] [config] num-devices: 0 [2022-01-14 19:20:18] [config] optimizer: adam [2022-01-14 19:20:18] [config] optimizer-delay: 1 [2022-01-14 19:20:18] [config] optimizer-params: [2022-01-14 19:20:18] [config] - 0.9 [2022-01-14 19:20:18] [config] - 0.98 [2022-01-14 19:20:18] [config] - 1e-09 [2022-01-14 19:20:18] [config] output-omit-bias: false [2022-01-14 19:20:18] [config] overwrite: true [2022-01-14 19:20:18] [config] precision: [2022-01-14 19:20:18] [config] - float32 [2022-01-14 19:20:18] [config] - float32 [2022-01-14 19:20:18] [config] - float32 [2022-01-14 19:20:18] [config] pretrained-model: "" [2022-01-14 19:20:18] [config] quantize-biases: false [2022-01-14 19:20:18] [config] quantize-bits: 0 [2022-01-14 19:20:18] [config] quantize-log-based: false [2022-01-14 19:20:18] [config] quantize-optimization-steps: 0 [2022-01-14 19:20:18] [config] quiet: false [2022-01-14 19:20:18] [config] quiet-translation: false [2022-01-14 19:20:18] [config] relative-paths: false [2022-01-14 19:20:18] [config] right-left: false [2022-01-14 19:20:18] [config] save-freq: 5000 [2022-01-14 19:20:18] [config] seed: 0 [2022-01-14 19:20:18] [config] sentencepiece-alphas: 
[2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] [config] sentencepiece-max-lines: 2000000 [2022-01-14 19:20:18] [config] sentencepiece-options: "" [2022-01-14 19:20:18] [config] shuffle: data [2022-01-14 19:20:18] [config] shuffle-in-ram: false [2022-01-14 19:20:18] [config] sigterm: save-and-exit [2022-01-14 19:20:18] [config] skip: false [2022-01-14 19:20:18] [config] sqlite: "" [2022-01-14 19:20:18] [config] sqlite-drop: false [2022-01-14 19:20:18] [config] sync-sgd: false [2022-01-14 19:20:18] [config] tempdir: /tmp [2022-01-14 19:20:18] [config] tied-embeddings: false [2022-01-14 19:20:18] [config] tied-embeddings-all: true [2022-01-14 19:20:18] [config] tied-embeddings-src: false [2022-01-14 19:20:18] [config] train-embedder-rank: [2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] [config] train-sets: [2022-01-14 19:20:18] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en [2022-01-14 19:20:18] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl [2022-01-14 19:20:18] [config] transformer-aan-activation: swish [2022-01-14 19:20:18] [config] transformer-aan-depth: 2 [2022-01-14 19:20:18] [config] transformer-aan-nogate: false [2022-01-14 19:20:18] [config] transformer-decoder-autoreg: self-attention [2022-01-14 19:20:18] [config] transformer-depth-scaling: false [2022-01-14 19:20:18] [config] transformer-dim-aan: 2048 [2022-01-14 19:20:18] [config] transformer-dim-ffn: 2048 [2022-01-14 19:20:18] [config] transformer-dropout: 0.1 [2022-01-14 19:20:18] [config] transformer-dropout-attention: 0 [2022-01-14 19:20:18] [config] transformer-dropout-ffn: 0 [2022-01-14 19:20:18] [config] transformer-ffn-activation: swish [2022-01-14 19:20:18] [config] transformer-ffn-depth: 2 [2022-01-14 19:20:18] [config] transformer-guided-alignment-layer: last [2022-01-14 19:20:18] [config] transformer-heads: 8 [2022-01-14 19:20:18] [config] transformer-no-projection: false [2022-01-14 19:20:18] [config] transformer-pool: false [2022-01-14 19:20:18] [config] transformer-postprocess: dan [2022-01-14 19:20:18] [config] transformer-postprocess-emb: d [2022-01-14 19:20:18] [config] transformer-postprocess-top: "" [2022-01-14 19:20:18] [config] transformer-preprocess: "" [2022-01-14 19:20:18] [config] transformer-tied-layers: [2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] [config] transformer-train-position-embeddings: false [2022-01-14 19:20:18] [config] tsv: false [2022-01-14 19:20:18] [config] tsv-fields: 0 [2022-01-14 19:20:18] [config] type: transformer [2022-01-14 19:20:18] [config] ulr: false [2022-01-14 19:20:18] [config] ulr-dim-emb: 0 [2022-01-14 19:20:18] [config] ulr-dropout: 0 [2022-01-14 19:20:18] [config] ulr-keys-vectors: "" [2022-01-14 19:20:18] [config] ulr-query-vectors: "" [2022-01-14 19:20:18] [config] ulr-softmax-temperature: 1 [2022-01-14 19:20:18] [config] ulr-trainable-transformation: false [2022-01-14 19:20:18] [config] unlikelihood-loss: false [2022-01-14 19:20:18] [config] valid-freq: 5000 [2022-01-14 19:20:18] [config] valid-log: "" [2022-01-14 19:20:18] [config] valid-max-length: 1000 [2022-01-14 19:20:18] [config] valid-metrics: [2022-01-14 19:20:18] [config] - cross-entropy [2022-01-14 19:20:18] [config] valid-mini-batch: 32 [2022-01-14 19:20:18] [config] valid-reset-stalled: false [2022-01-14 19:20:18] [config] valid-script-args: [2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] [config] valid-script-path: "" [2022-01-14 19:20:18] [config] valid-sets: [2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] 
[config] valid-translation-output: "" [2022-01-14 19:20:18] [config] vocabs: [2022-01-14 19:20:18] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-14 19:20:18] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 [2022-01-14 19:20:18] [config] word-penalty: 0 [2022-01-14 19:20:18] [config] word-scores: false [2022-01-14 19:20:18] [config] workspace: 10000 [2022-01-14 19:20:18] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-14 19:20:18] [training] Using single-device training [2022-01-14 19:20:18] [data] Loading vocabulary from text file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-14 19:20:18] Error: Duplicate vocabulary entry - [2022-01-14 19:20:18] Error: Aborted from virtual size_t marian::DefaultVocab::load(const string&, size_t) in /home/wmi/Workspace/marian/src/data/default_vocab.cpp:116 [CALL STACK] [0x56321bb861ed] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x6fd [0x56321bb75e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x56321bb76728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x56321bbc2189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x56321bbd5084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x56321ba33f8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x56321babb94b] marian::Train:: run () + 0x19cb [0x56321b9c2389] mainTrainer (int, char**) + 0x5e9 [0x56321b9801bc] main + 0x3c [0x7f3235c160b3] __libc_start_main + 0xf3 [0x56321b9c0b0e] _start + 0x2e [2022-01-15 14:02:43] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 14:02:43] [marian] Running on s470607-gpu as process 2586 with command line: [2022-01-15 14:02:43] [marian] ../marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings-all --exponential-smoothing --log /home/wmi/train.log --vocabs /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 --after-epochs 1 [2022-01-15 14:02:43] [config] after: 0e [2022-01-15 14:02:43] [config] after-batches: 0 [2022-01-15 14:02:43] [config] after-epochs: 1 [2022-01-15 14:02:43] [config] all-caps-every: 0 [2022-01-15 14:02:43] [config] allow-unk: false [2022-01-15 14:02:43] [config] authors: false [2022-01-15 14:02:43] [config] beam-size: 6 [2022-01-15 14:02:43] [config] bert-class-symbol: "[CLS]" [2022-01-15 14:02:43] [config] bert-mask-symbol: "[MASK]" [2022-01-15 14:02:43] [config] bert-masking-fraction: 0.15 [2022-01-15 14:02:43] [config] bert-sep-symbol: "[SEP]" [2022-01-15 14:02:43] [config] bert-train-type-embeddings: true [2022-01-15 14:02:43] [config] 
bert-type-vocab-size: 2 [2022-01-15 14:02:43] [config] build-info: "" [2022-01-15 14:02:43] [config] cite: false [2022-01-15 14:02:43] [config] clip-norm: 5 [2022-01-15 14:02:43] [config] cost-scaling: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] cost-type: ce-sum [2022-01-15 14:02:43] [config] cpu-threads: 0 [2022-01-15 14:02:43] [config] data-weighting: "" [2022-01-15 14:02:43] [config] data-weighting-type: sentence [2022-01-15 14:02:43] [config] dec-cell: gru [2022-01-15 14:02:43] [config] dec-cell-base-depth: 2 [2022-01-15 14:02:43] [config] dec-cell-high-depth: 1 [2022-01-15 14:02:43] [config] dec-depth: 6 [2022-01-15 14:02:43] [config] devices: [2022-01-15 14:02:43] [config] - 0 [2022-01-15 14:02:43] [config] dim-emb: 512 [2022-01-15 14:02:43] [config] dim-rnn: 1024 [2022-01-15 14:02:43] [config] dim-vocabs: [2022-01-15 14:02:43] [config] - 0 [2022-01-15 14:02:43] [config] - 0 [2022-01-15 14:02:43] [config] disp-first: 0 [2022-01-15 14:02:43] [config] disp-freq: 500 [2022-01-15 14:02:43] [config] disp-label-counts: true [2022-01-15 14:02:43] [config] dropout-rnn: 0 [2022-01-15 14:02:43] [config] dropout-src: 0 [2022-01-15 14:02:43] [config] dropout-trg: 0 [2022-01-15 14:02:43] [config] dump-config: "" [2022-01-15 14:02:43] [config] early-stopping: 10 [2022-01-15 14:02:43] [config] embedding-fix-src: false [2022-01-15 14:02:43] [config] embedding-fix-trg: false [2022-01-15 14:02:43] [config] embedding-normalization: false [2022-01-15 14:02:43] [config] embedding-vectors: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] enc-cell: gru [2022-01-15 14:02:43] [config] enc-cell-depth: 1 [2022-01-15 14:02:43] [config] enc-depth: 6 [2022-01-15 14:02:43] [config] enc-type: bidirectional [2022-01-15 14:02:43] [config] english-title-case-every: 0 [2022-01-15 14:02:43] [config] exponential-smoothing: 0.0001 [2022-01-15 14:02:43] [config] factor-weight: 1 [2022-01-15 14:02:43] [config] grad-dropping-momentum: 0 [2022-01-15 14:02:43] [config] grad-dropping-rate: 0 [2022-01-15 14:02:43] [config] grad-dropping-warmup: 100 [2022-01-15 14:02:43] [config] gradient-checkpointing: false [2022-01-15 14:02:43] [config] guided-alignment: none [2022-01-15 14:02:43] [config] guided-alignment-cost: mse [2022-01-15 14:02:43] [config] guided-alignment-weight: 0.1 [2022-01-15 14:02:43] [config] ignore-model-config: false [2022-01-15 14:02:43] [config] input-types: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] interpolate-env-vars: false [2022-01-15 14:02:43] [config] keep-best: false [2022-01-15 14:02:43] [config] label-smoothing: 0.1 [2022-01-15 14:02:43] [config] layer-normalization: false [2022-01-15 14:02:43] [config] learn-rate: 0.0003 [2022-01-15 14:02:43] [config] lemma-dim-emb: 0 [2022-01-15 14:02:43] [config] log: /home/wmi/train.log [2022-01-15 14:02:43] [config] log-level: info [2022-01-15 14:02:43] [config] log-time-zone: "" [2022-01-15 14:02:43] [config] logical-epoch: [2022-01-15 14:02:43] [config] - 1e [2022-01-15 14:02:43] [config] - 0 [2022-01-15 14:02:43] [config] lr-decay: 0 [2022-01-15 14:02:43] [config] lr-decay-freq: 50000 [2022-01-15 14:02:43] [config] lr-decay-inv-sqrt: [2022-01-15 14:02:43] [config] - 16000 [2022-01-15 14:02:43] [config] lr-decay-repeat-warmup: false [2022-01-15 14:02:43] [config] lr-decay-reset-optimizer: false [2022-01-15 14:02:43] [config] lr-decay-start: [2022-01-15 14:02:43] [config] - 10 [2022-01-15 14:02:43] [config] - 1 [2022-01-15 14:02:43] [config] lr-decay-strategy: epoch+stalled [2022-01-15 14:02:43] 
[config] lr-report: true [2022-01-15 14:02:43] [config] lr-warmup: 16000 [2022-01-15 14:02:43] [config] lr-warmup-at-reload: false [2022-01-15 14:02:43] [config] lr-warmup-cycle: false [2022-01-15 14:02:43] [config] lr-warmup-start-rate: 0 [2022-01-15 14:02:43] [config] max-length: 100 [2022-01-15 14:02:43] [config] max-length-crop: false [2022-01-15 14:02:43] [config] max-length-factor: 3 [2022-01-15 14:02:43] [config] maxi-batch: 1000 [2022-01-15 14:02:43] [config] maxi-batch-sort: trg [2022-01-15 14:02:43] [config] mini-batch: 64 [2022-01-15 14:02:43] [config] mini-batch-fit: true [2022-01-15 14:02:43] [config] mini-batch-fit-step: 10 [2022-01-15 14:02:43] [config] mini-batch-track-lr: false [2022-01-15 14:02:43] [config] mini-batch-warmup: 0 [2022-01-15 14:02:43] [config] mini-batch-words: 0 [2022-01-15 14:02:43] [config] mini-batch-words-ref: 0 [2022-01-15 14:02:43] [config] model: model.npz [2022-01-15 14:02:43] [config] multi-loss-type: sum [2022-01-15 14:02:43] [config] multi-node: false [2022-01-15 14:02:43] [config] multi-node-overlap: true [2022-01-15 14:02:43] [config] n-best: false [2022-01-15 14:02:43] [config] no-nccl: false [2022-01-15 14:02:43] [config] no-reload: false [2022-01-15 14:02:43] [config] no-restore-corpus: false [2022-01-15 14:02:43] [config] normalize: 0.6 [2022-01-15 14:02:43] [config] normalize-gradient: false [2022-01-15 14:02:43] [config] num-devices: 0 [2022-01-15 14:02:43] [config] optimizer: adam [2022-01-15 14:02:43] [config] optimizer-delay: 1 [2022-01-15 14:02:43] [config] optimizer-params: [2022-01-15 14:02:43] [config] - 0.9 [2022-01-15 14:02:43] [config] - 0.98 [2022-01-15 14:02:43] [config] - 1e-09 [2022-01-15 14:02:43] [config] output-omit-bias: false [2022-01-15 14:02:43] [config] overwrite: true [2022-01-15 14:02:43] [config] precision: [2022-01-15 14:02:43] [config] - float32 [2022-01-15 14:02:43] [config] - float32 [2022-01-15 14:02:43] [config] - float32 [2022-01-15 14:02:43] [config] pretrained-model: "" [2022-01-15 14:02:43] [config] quantize-biases: false [2022-01-15 14:02:43] [config] quantize-bits: 0 [2022-01-15 14:02:43] [config] quantize-log-based: false [2022-01-15 14:02:43] [config] quantize-optimization-steps: 0 [2022-01-15 14:02:43] [config] quiet: false [2022-01-15 14:02:43] [config] quiet-translation: false [2022-01-15 14:02:43] [config] relative-paths: false [2022-01-15 14:02:43] [config] right-left: false [2022-01-15 14:02:43] [config] save-freq: 5000 [2022-01-15 14:02:43] [config] seed: 0 [2022-01-15 14:02:43] [config] sentencepiece-alphas: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] sentencepiece-max-lines: 2000000 [2022-01-15 14:02:43] [config] sentencepiece-options: "" [2022-01-15 14:02:43] [config] shuffle: data [2022-01-15 14:02:43] [config] shuffle-in-ram: false [2022-01-15 14:02:43] [config] sigterm: save-and-exit [2022-01-15 14:02:43] [config] skip: false [2022-01-15 14:02:43] [config] sqlite: "" [2022-01-15 14:02:43] [config] sqlite-drop: false [2022-01-15 14:02:43] [config] sync-sgd: false [2022-01-15 14:02:43] [config] tempdir: /tmp [2022-01-15 14:02:43] [config] tied-embeddings: false [2022-01-15 14:02:43] [config] tied-embeddings-all: true [2022-01-15 14:02:43] [config] tied-embeddings-src: false [2022-01-15 14:02:43] [config] train-embedder-rank: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] train-sets: [2022-01-15 14:02:43] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en [2022-01-15 14:02:43] [config] - 
/home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl [2022-01-15 14:02:43] [config] transformer-aan-activation: swish [2022-01-15 14:02:43] [config] transformer-aan-depth: 2 [2022-01-15 14:02:43] [config] transformer-aan-nogate: false [2022-01-15 14:02:43] [config] transformer-decoder-autoreg: self-attention [2022-01-15 14:02:43] [config] transformer-depth-scaling: false [2022-01-15 14:02:43] [config] transformer-dim-aan: 2048 [2022-01-15 14:02:43] [config] transformer-dim-ffn: 2048 [2022-01-15 14:02:43] [config] transformer-dropout: 0.1 [2022-01-15 14:02:43] [config] transformer-dropout-attention: 0 [2022-01-15 14:02:43] [config] transformer-dropout-ffn: 0 [2022-01-15 14:02:43] [config] transformer-ffn-activation: swish [2022-01-15 14:02:43] [config] transformer-ffn-depth: 2 [2022-01-15 14:02:43] [config] transformer-guided-alignment-layer: last [2022-01-15 14:02:43] [config] transformer-heads: 8 [2022-01-15 14:02:43] [config] transformer-no-projection: false [2022-01-15 14:02:43] [config] transformer-pool: false [2022-01-15 14:02:43] [config] transformer-postprocess: dan [2022-01-15 14:02:43] [config] transformer-postprocess-emb: d [2022-01-15 14:02:43] [config] transformer-postprocess-top: "" [2022-01-15 14:02:43] [config] transformer-preprocess: "" [2022-01-15 14:02:43] [config] transformer-tied-layers: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] transformer-train-position-embeddings: false [2022-01-15 14:02:43] [config] tsv: false [2022-01-15 14:02:43] [config] tsv-fields: 0 [2022-01-15 14:02:43] [config] type: transformer [2022-01-15 14:02:43] [config] ulr: false [2022-01-15 14:02:43] [config] ulr-dim-emb: 0 [2022-01-15 14:02:43] [config] ulr-dropout: 0 [2022-01-15 14:02:43] [config] ulr-keys-vectors: "" [2022-01-15 14:02:43] [config] ulr-query-vectors: "" [2022-01-15 14:02:43] [config] ulr-softmax-temperature: 1 [2022-01-15 14:02:43] [config] ulr-trainable-transformation: false [2022-01-15 14:02:43] [config] unlikelihood-loss: false [2022-01-15 14:02:43] [config] valid-freq: 5000 [2022-01-15 14:02:43] [config] valid-log: "" [2022-01-15 14:02:43] [config] valid-max-length: 1000 [2022-01-15 14:02:43] [config] valid-metrics: [2022-01-15 14:02:43] [config] - cross-entropy [2022-01-15 14:02:43] [config] valid-mini-batch: 32 [2022-01-15 14:02:43] [config] valid-reset-stalled: false [2022-01-15 14:02:43] [config] valid-script-args: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] valid-script-path: "" [2022-01-15 14:02:43] [config] valid-sets: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] valid-translation-output: "" [2022-01-15 14:02:43] [config] vocabs: [2022-01-15 14:02:43] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-15 14:02:43] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 [2022-01-15 14:02:43] [config] word-penalty: 0 [2022-01-15 14:02:43] [config] word-scores: false [2022-01-15 14:02:43] [config] workspace: 10000 [2022-01-15 14:02:43] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 14:02:43] [training] Using single-device training [2022-01-15 14:02:43] [data] Loading vocabulary from text file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-15 14:02:43] Error: DefaultVocabulary file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 is expected to contain an entry for [2022-01-15 14:02:43] Error: Aborted from 
marian::DefaultVocab::addRequiredVocabulary(const string&, bool):: in /home/wmi/Workspace/marian/src/data/default_vocab.cpp:199 [CALL STACK] [0x56222e7700d8] marian::DefaultVocab::addRequiredVocabulary(std::__cxx11::basic_string,std::allocator> const&,bool)::{lambda(std::__cxx11::basic_string,std::allocator> const&,std::__cxx11::basic_string,std::allocator> const&,marian::Word)#1}:: operator() (std::__cxx11::basic_string,std::allocator> const&, std::__cxx11::basic_string,std::allocator> const&, marian::Word) const + 0x4a8 [0x56222e7705d9] marian::DefaultVocab:: addRequiredVocabulary (std::__cxx11::basic_string,std::allocator> const&, bool) + 0x59 [0x56222e77368b] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0xb9b [0x56222e762e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x56222e763728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x56222e7af189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x56222e7c2084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x56222e620f8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x56222e6a894b] marian::Train:: run () + 0x19cb [0x56222e5af389] mainTrainer (int, char**) + 0x5e9 [0x56222e56d1bc] main + 0x3c [0x7fbc01e290b3] __libc_start_main + 0xf3 [0x56222e5adb0e] _start + 0x2e [2022-01-15 14:24:00] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 14:24:00] [marian] Running on s470607-gpu as process 2853 with command line: [2022-01-15 14:24:00] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings-all --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 14:24:00] [config] after: 0e [2022-01-15 14:24:00] [config] after-batches: 0 [2022-01-15 14:24:00] [config] after-epochs: 1 [2022-01-15 14:24:00] [config] all-caps-every: 0 [2022-01-15 14:24:00] [config] allow-unk: false [2022-01-15 14:24:00] [config] authors: false [2022-01-15 14:24:00] [config] beam-size: 6 [2022-01-15 14:24:00] [config] bert-class-symbol: "[CLS]" [2022-01-15 14:24:00] [config] bert-mask-symbol: "[MASK]" [2022-01-15 14:24:00] [config] bert-masking-fraction: 0.15 [2022-01-15 14:24:00] [config] bert-sep-symbol: "[SEP]" [2022-01-15 14:24:00] [config] bert-train-type-embeddings: true [2022-01-15 14:24:00] [config] bert-type-vocab-size: 2 [2022-01-15 14:24:00] [config] build-info: "" [2022-01-15 14:24:00] [config] cite: false [2022-01-15 14:24:00] [config] clip-norm: 5 [2022-01-15 14:24:00] [config] cost-scaling: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] cost-type: ce-sum [2022-01-15 14:24:00] [config] cpu-threads: 0 [2022-01-15 14:24:00] [config] data-weighting: "" [2022-01-15 14:24:00] [config] data-weighting-type: sentence [2022-01-15 14:24:00] [config] 
dec-cell: gru [2022-01-15 14:24:00] [config] dec-cell-base-depth: 2 [2022-01-15 14:24:00] [config] dec-cell-high-depth: 1 [2022-01-15 14:24:00] [config] dec-depth: 6 [2022-01-15 14:24:00] [config] devices: [2022-01-15 14:24:00] [config] - 0 [2022-01-15 14:24:00] [config] dim-emb: 512 [2022-01-15 14:24:00] [config] dim-rnn: 1024 [2022-01-15 14:24:00] [config] dim-vocabs: [2022-01-15 14:24:00] [config] - 0 [2022-01-15 14:24:00] [config] - 0 [2022-01-15 14:24:00] [config] disp-first: 0 [2022-01-15 14:24:00] [config] disp-freq: 500 [2022-01-15 14:24:00] [config] disp-label-counts: true [2022-01-15 14:24:00] [config] dropout-rnn: 0 [2022-01-15 14:24:00] [config] dropout-src: 0 [2022-01-15 14:24:00] [config] dropout-trg: 0 [2022-01-15 14:24:00] [config] dump-config: "" [2022-01-15 14:24:00] [config] early-stopping: 10 [2022-01-15 14:24:00] [config] embedding-fix-src: false [2022-01-15 14:24:00] [config] embedding-fix-trg: false [2022-01-15 14:24:00] [config] embedding-normalization: false [2022-01-15 14:24:00] [config] embedding-vectors: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] enc-cell: gru [2022-01-15 14:24:00] [config] enc-cell-depth: 1 [2022-01-15 14:24:00] [config] enc-depth: 6 [2022-01-15 14:24:00] [config] enc-type: bidirectional [2022-01-15 14:24:00] [config] english-title-case-every: 0 [2022-01-15 14:24:00] [config] exponential-smoothing: 0.0001 [2022-01-15 14:24:00] [config] factor-weight: 1 [2022-01-15 14:24:00] [config] grad-dropping-momentum: 0 [2022-01-15 14:24:00] [config] grad-dropping-rate: 0 [2022-01-15 14:24:00] [config] grad-dropping-warmup: 100 [2022-01-15 14:24:00] [config] gradient-checkpointing: false [2022-01-15 14:24:00] [config] guided-alignment: none [2022-01-15 14:24:00] [config] guided-alignment-cost: mse [2022-01-15 14:24:00] [config] guided-alignment-weight: 0.1 [2022-01-15 14:24:00] [config] ignore-model-config: false [2022-01-15 14:24:00] [config] input-types: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] interpolate-env-vars: false [2022-01-15 14:24:00] [config] keep-best: false [2022-01-15 14:24:00] [config] label-smoothing: 0.1 [2022-01-15 14:24:00] [config] layer-normalization: false [2022-01-15 14:24:00] [config] learn-rate: 0.0003 [2022-01-15 14:24:00] [config] lemma-dim-emb: 0 [2022-01-15 14:24:00] [config] log: /home/wmi/train.log [2022-01-15 14:24:00] [config] log-level: info [2022-01-15 14:24:00] [config] log-time-zone: "" [2022-01-15 14:24:00] [config] logical-epoch: [2022-01-15 14:24:00] [config] - 1e [2022-01-15 14:24:00] [config] - 0 [2022-01-15 14:24:00] [config] lr-decay: 0 [2022-01-15 14:24:00] [config] lr-decay-freq: 50000 [2022-01-15 14:24:00] [config] lr-decay-inv-sqrt: [2022-01-15 14:24:00] [config] - 16000 [2022-01-15 14:24:00] [config] lr-decay-repeat-warmup: false [2022-01-15 14:24:00] [config] lr-decay-reset-optimizer: false [2022-01-15 14:24:00] [config] lr-decay-start: [2022-01-15 14:24:00] [config] - 10 [2022-01-15 14:24:00] [config] - 1 [2022-01-15 14:24:00] [config] lr-decay-strategy: epoch+stalled [2022-01-15 14:24:00] [config] lr-report: true [2022-01-15 14:24:00] [config] lr-warmup: 16000 [2022-01-15 14:24:00] [config] lr-warmup-at-reload: false [2022-01-15 14:24:00] [config] lr-warmup-cycle: false [2022-01-15 14:24:00] [config] lr-warmup-start-rate: 0 [2022-01-15 14:24:00] [config] max-length: 100 [2022-01-15 14:24:00] [config] max-length-crop: false [2022-01-15 14:24:00] [config] max-length-factor: 3 [2022-01-15 14:24:00] [config] maxi-batch: 1000 [2022-01-15 14:24:00] [config] 
maxi-batch-sort: trg [2022-01-15 14:24:00] [config] mini-batch: 64 [2022-01-15 14:24:00] [config] mini-batch-fit: true [2022-01-15 14:24:00] [config] mini-batch-fit-step: 10 [2022-01-15 14:24:00] [config] mini-batch-track-lr: false [2022-01-15 14:24:00] [config] mini-batch-warmup: 0 [2022-01-15 14:24:00] [config] mini-batch-words: 0 [2022-01-15 14:24:00] [config] mini-batch-words-ref: 0 [2022-01-15 14:24:00] [config] model: model.npz [2022-01-15 14:24:00] [config] multi-loss-type: sum [2022-01-15 14:24:00] [config] multi-node: false [2022-01-15 14:24:00] [config] multi-node-overlap: true [2022-01-15 14:24:00] [config] n-best: false [2022-01-15 14:24:00] [config] no-nccl: false [2022-01-15 14:24:00] [config] no-reload: false [2022-01-15 14:24:00] [config] no-restore-corpus: false [2022-01-15 14:24:00] [config] normalize: 0.6 [2022-01-15 14:24:00] [config] normalize-gradient: false [2022-01-15 14:24:00] [config] num-devices: 0 [2022-01-15 14:24:00] [config] optimizer: adam [2022-01-15 14:24:00] [config] optimizer-delay: 1 [2022-01-15 14:24:00] [config] optimizer-params: [2022-01-15 14:24:00] [config] - 0.9 [2022-01-15 14:24:00] [config] - 0.98 [2022-01-15 14:24:00] [config] - 1e-09 [2022-01-15 14:24:00] [config] output-omit-bias: false [2022-01-15 14:24:00] [config] overwrite: true [2022-01-15 14:24:00] [config] precision: [2022-01-15 14:24:00] [config] - float32 [2022-01-15 14:24:00] [config] - float32 [2022-01-15 14:24:00] [config] - float32 [2022-01-15 14:24:00] [config] pretrained-model: "" [2022-01-15 14:24:00] [config] quantize-biases: false [2022-01-15 14:24:00] [config] quantize-bits: 0 [2022-01-15 14:24:00] [config] quantize-log-based: false [2022-01-15 14:24:00] [config] quantize-optimization-steps: 0 [2022-01-15 14:24:00] [config] quiet: false [2022-01-15 14:24:00] [config] quiet-translation: false [2022-01-15 14:24:00] [config] relative-paths: false [2022-01-15 14:24:00] [config] right-left: false [2022-01-15 14:24:00] [config] save-freq: 5000 [2022-01-15 14:24:00] [config] seed: 0 [2022-01-15 14:24:00] [config] sentencepiece-alphas: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] sentencepiece-max-lines: 2000000 [2022-01-15 14:24:00] [config] sentencepiece-options: "" [2022-01-15 14:24:00] [config] shuffle: data [2022-01-15 14:24:00] [config] shuffle-in-ram: false [2022-01-15 14:24:00] [config] sigterm: save-and-exit [2022-01-15 14:24:00] [config] skip: false [2022-01-15 14:24:00] [config] sqlite: "" [2022-01-15 14:24:00] [config] sqlite-drop: false [2022-01-15 14:24:00] [config] sync-sgd: false [2022-01-15 14:24:00] [config] tempdir: /tmp [2022-01-15 14:24:00] [config] tied-embeddings: false [2022-01-15 14:24:00] [config] tied-embeddings-all: true [2022-01-15 14:24:00] [config] tied-embeddings-src: false [2022-01-15 14:24:00] [config] train-embedder-rank: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] train-sets: [2022-01-15 14:24:00] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en [2022-01-15 14:24:00] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl [2022-01-15 14:24:00] [config] transformer-aan-activation: swish [2022-01-15 14:24:00] [config] transformer-aan-depth: 2 [2022-01-15 14:24:00] [config] transformer-aan-nogate: false [2022-01-15 14:24:00] [config] transformer-decoder-autoreg: self-attention [2022-01-15 14:24:00] [config] transformer-depth-scaling: false [2022-01-15 14:24:00] [config] transformer-dim-aan: 2048 [2022-01-15 14:24:00] [config] transformer-dim-ffn: 2048 [2022-01-15 
14:24:00] [config] transformer-dropout: 0.1 [2022-01-15 14:24:00] [config] transformer-dropout-attention: 0 [2022-01-15 14:24:00] [config] transformer-dropout-ffn: 0 [2022-01-15 14:24:00] [config] transformer-ffn-activation: swish [2022-01-15 14:24:00] [config] transformer-ffn-depth: 2 [2022-01-15 14:24:00] [config] transformer-guided-alignment-layer: last [2022-01-15 14:24:00] [config] transformer-heads: 8 [2022-01-15 14:24:00] [config] transformer-no-projection: false [2022-01-15 14:24:00] [config] transformer-pool: false [2022-01-15 14:24:00] [config] transformer-postprocess: dan [2022-01-15 14:24:00] [config] transformer-postprocess-emb: d [2022-01-15 14:24:00] [config] transformer-postprocess-top: "" [2022-01-15 14:24:00] [config] transformer-preprocess: "" [2022-01-15 14:24:00] [config] transformer-tied-layers: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] transformer-train-position-embeddings: false [2022-01-15 14:24:00] [config] tsv: false [2022-01-15 14:24:00] [config] tsv-fields: 0 [2022-01-15 14:24:00] [config] type: transformer [2022-01-15 14:24:00] [config] ulr: false [2022-01-15 14:24:00] [config] ulr-dim-emb: 0 [2022-01-15 14:24:00] [config] ulr-dropout: 0 [2022-01-15 14:24:00] [config] ulr-keys-vectors: "" [2022-01-15 14:24:00] [config] ulr-query-vectors: "" [2022-01-15 14:24:00] [config] ulr-softmax-temperature: 1 [2022-01-15 14:24:00] [config] ulr-trainable-transformation: false [2022-01-15 14:24:00] [config] unlikelihood-loss: false [2022-01-15 14:24:00] [config] valid-freq: 5000 [2022-01-15 14:24:00] [config] valid-log: "" [2022-01-15 14:24:00] [config] valid-max-length: 1000 [2022-01-15 14:24:00] [config] valid-metrics: [2022-01-15 14:24:00] [config] - cross-entropy [2022-01-15 14:24:00] [config] valid-mini-batch: 32 [2022-01-15 14:24:00] [config] valid-reset-stalled: false [2022-01-15 14:24:00] [config] valid-script-args: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] valid-script-path: "" [2022-01-15 14:24:00] [config] valid-sets: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] valid-translation-output: "" [2022-01-15 14:24:00] [config] vocabs: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] word-penalty: 0 [2022-01-15 14:24:00] [config] word-scores: false [2022-01-15 14:24:00] [config] workspace: 10000 [2022-01-15 14:24:00] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 14:24:00] [training] Using single-device training [2022-01-15 14:24:00] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 14:24:00] [data] Vocabularies will be built separately for each file. 
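Note: the run below has no --vocabs argument, so Marian builds a separate vocabulary over each training file (861,279 English and 1,296,038 Polish entries). With --tied-embeddings-all the source, target and output embeddings share a single matrix (the parameter 'Wemb' in the error that follows), which only works when both sides use one vocabulary of the same size; the mismatched sizes are what trigger the shape error. A minimal sketch of one way around this, assuming the marian-vocab tool from the same build and a joint vocabulary passed for both sides (paths and the --max-size value are illustrative; check marian-vocab --help):

    cat /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en \
        /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl \
        | /home/wmi/marian/build/marian-vocab --max-size 32000 > train.joint.yml
    # then train with: ... --tied-embeddings-all --vocabs train.joint.yml train.joint.yml ...

The later runs in this log take the other route: they keep separate vocabularies and replace --tied-embeddings-all with --tied-embeddings, which ties only the target input and output embeddings and therefore tolerates different source and target vocabulary sizes.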
[2022-01-15 14:24:00] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en
[2022-01-15 14:24:00] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en
[2022-01-15 14:24:00] [data] Creating vocabulary /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.yml from /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en
[2022-01-15 14:24:15] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.yml
[2022-01-15 14:24:19] [data] Setting vocabulary size for input 0 to 861,279
[2022-01-15 14:24:19] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl
[2022-01-15 14:24:19] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl
[2022-01-15 14:24:19] [data] Creating vocabulary /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.yml from /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl
[2022-01-15 14:24:40] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.yml
[2022-01-15 14:24:45] [data] Setting vocabulary size for input 1 to 1,296,038
[2022-01-15 14:24:45] [comm] Compiled without MPI support. Running as a single process on s470607-gpu
[2022-01-15 14:24:45] [batching] Collecting statistics for batch fitting with step size 10
[2022-01-15 14:24:46] [memory] Extending reserved space to 10112 MB (device gpu0)
[2022-01-15 14:24:46] Error: Requested shape shape=1296038x512 size=663571456 for existing parameter 'Wemb' does not match original shape shape=861279x512 size=440974848
[2022-01-15 14:24:46] Error: Aborted from marian::Expr marian::ExpressionGraph::param(const string&, const marian::Shape&, marian::Ptr&, marian::Type, bool, bool) in /home/wmi/Workspace/marian/src/graph/expression_graph.h:314
[CALL STACK]
[0x564e61ee3642] marian::ExpressionGraph:: param (std::__cxx11::basic_string,std::allocator> const&, marian::Shape const&, std::shared_ptr const&, marian::Type, bool, bool) + 0x992
[0x564e625cd899] marian::Embedding:: Embedding (std::shared_ptr, std::shared_ptr) + 0x4c9
[0x564e625de475] std::shared_ptr marian:: New const&,std::shared_ptr&>(std::shared_ptr const&, std::shared_ptr&) + 0x85
[0x564e625ce27d] marian::EncoderDecoderLayerBase:: createEmbeddingLayer () const + 0x59d
[0x564e625cebe5] marian::EncoderDecoderLayerBase:: getEmbeddingLayer (bool) const + 0x145
[0x564e6219502f] marian::DecoderBase:: embeddingsFromBatch (std::shared_ptr, std::shared_ptr, std::shared_ptr) + 0x8f
[0x564e62219766] marian::EncoderDecoder:: stepAll (std::shared_ptr, std::shared_ptr, bool) + 0x196
[0x564e621cfc69] marian::models::EncoderDecoderCECost:: apply (std::shared_ptr, std::shared_ptr, std::shared_ptr, bool) + 0x119
[0x564e61e22c82] marian::models::Trainer:: build (std::shared_ptr, std::shared_ptr, bool) + 0xb2
[0x564e622b75f4] marian::GraphGroup:: collectStats (std::shared_ptr, std::shared_ptr, std::vector,std::allocator>> const&, double) + 0xb84
[0x564e61ef9269] marian::Train:: run () + 0x2e9
[0x564e61e01389] mainTrainer (int, char**) + 0x5e9
[0x564e61dbf1bc] main + 0x3c
[0x7f906e48b0b3] __libc_start_main + 0xf3
[0x564e61dffb0e] _start + 0x2e
[2022-01-15 14:34:04] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800
[2022-01-15
14:34:04] [marian] Running on s470607-gpu as process 3011 with command line: [2022-01-15 14:34:04] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings-all --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 14:34:04] [config] after: 0e [2022-01-15 14:34:04] [config] after-batches: 0 [2022-01-15 14:34:04] [config] after-epochs: 1 [2022-01-15 14:34:04] [config] all-caps-every: 0 [2022-01-15 14:34:04] [config] allow-unk: false [2022-01-15 14:34:04] [config] authors: false [2022-01-15 14:34:04] [config] beam-size: 6 [2022-01-15 14:34:04] [config] bert-class-symbol: "[CLS]" [2022-01-15 14:34:04] [config] bert-mask-symbol: "[MASK]" [2022-01-15 14:34:04] [config] bert-masking-fraction: 0.15 [2022-01-15 14:34:04] [config] bert-sep-symbol: "[SEP]" [2022-01-15 14:34:04] [config] bert-train-type-embeddings: true [2022-01-15 14:34:04] [config] bert-type-vocab-size: 2 [2022-01-15 14:34:04] [config] build-info: "" [2022-01-15 14:34:04] [config] cite: false [2022-01-15 14:34:04] [config] clip-norm: 5 [2022-01-15 14:34:04] [config] cost-scaling: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] cost-type: ce-sum [2022-01-15 14:34:04] [config] cpu-threads: 0 [2022-01-15 14:34:04] [config] data-weighting: "" [2022-01-15 14:34:04] [config] data-weighting-type: sentence [2022-01-15 14:34:04] [config] dec-cell: gru [2022-01-15 14:34:04] [config] dec-cell-base-depth: 2 [2022-01-15 14:34:04] [config] dec-cell-high-depth: 1 [2022-01-15 14:34:04] [config] dec-depth: 6 [2022-01-15 14:34:04] [config] devices: [2022-01-15 14:34:04] [config] - 0 [2022-01-15 14:34:04] [config] dim-emb: 512 [2022-01-15 14:34:04] [config] dim-rnn: 1024 [2022-01-15 14:34:04] [config] dim-vocabs: [2022-01-15 14:34:04] [config] - 0 [2022-01-15 14:34:04] [config] - 0 [2022-01-15 14:34:04] [config] disp-first: 0 [2022-01-15 14:34:04] [config] disp-freq: 500 [2022-01-15 14:34:04] [config] disp-label-counts: true [2022-01-15 14:34:04] [config] dropout-rnn: 0 [2022-01-15 14:34:04] [config] dropout-src: 0 [2022-01-15 14:34:04] [config] dropout-trg: 0 [2022-01-15 14:34:04] [config] dump-config: "" [2022-01-15 14:34:04] [config] early-stopping: 10 [2022-01-15 14:34:04] [config] embedding-fix-src: false [2022-01-15 14:34:04] [config] embedding-fix-trg: false [2022-01-15 14:34:04] [config] embedding-normalization: false [2022-01-15 14:34:04] [config] embedding-vectors: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] enc-cell: gru [2022-01-15 14:34:04] [config] enc-cell-depth: 1 [2022-01-15 14:34:04] [config] enc-depth: 6 [2022-01-15 14:34:04] [config] enc-type: bidirectional [2022-01-15 14:34:04] [config] english-title-case-every: 0 [2022-01-15 14:34:04] [config] exponential-smoothing: 0.0001 [2022-01-15 14:34:04] [config] factor-weight: 1 [2022-01-15 14:34:04] [config] grad-dropping-momentum: 0 [2022-01-15 14:34:04] [config] grad-dropping-rate: 0 [2022-01-15 14:34:04] [config] 
grad-dropping-warmup: 100 [2022-01-15 14:34:04] [config] gradient-checkpointing: false [2022-01-15 14:34:04] [config] guided-alignment: none [2022-01-15 14:34:04] [config] guided-alignment-cost: mse [2022-01-15 14:34:04] [config] guided-alignment-weight: 0.1 [2022-01-15 14:34:04] [config] ignore-model-config: false [2022-01-15 14:34:04] [config] input-types: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] interpolate-env-vars: false [2022-01-15 14:34:04] [config] keep-best: false [2022-01-15 14:34:04] [config] label-smoothing: 0.1 [2022-01-15 14:34:04] [config] layer-normalization: false [2022-01-15 14:34:04] [config] learn-rate: 0.0003 [2022-01-15 14:34:04] [config] lemma-dim-emb: 0 [2022-01-15 14:34:04] [config] log: /home/wmi/train.log [2022-01-15 14:34:04] [config] log-level: info [2022-01-15 14:34:04] [config] log-time-zone: "" [2022-01-15 14:34:04] [config] logical-epoch: [2022-01-15 14:34:04] [config] - 1e [2022-01-15 14:34:04] [config] - 0 [2022-01-15 14:34:04] [config] lr-decay: 0 [2022-01-15 14:34:04] [config] lr-decay-freq: 50000 [2022-01-15 14:34:04] [config] lr-decay-inv-sqrt: [2022-01-15 14:34:04] [config] - 16000 [2022-01-15 14:34:04] [config] lr-decay-repeat-warmup: false [2022-01-15 14:34:04] [config] lr-decay-reset-optimizer: false [2022-01-15 14:34:04] [config] lr-decay-start: [2022-01-15 14:34:04] [config] - 10 [2022-01-15 14:34:04] [config] - 1 [2022-01-15 14:34:04] [config] lr-decay-strategy: epoch+stalled [2022-01-15 14:34:04] [config] lr-report: true [2022-01-15 14:34:04] [config] lr-warmup: 16000 [2022-01-15 14:34:04] [config] lr-warmup-at-reload: false [2022-01-15 14:34:04] [config] lr-warmup-cycle: false [2022-01-15 14:34:04] [config] lr-warmup-start-rate: 0 [2022-01-15 14:34:04] [config] max-length: 100 [2022-01-15 14:34:04] [config] max-length-crop: false [2022-01-15 14:34:04] [config] max-length-factor: 3 [2022-01-15 14:34:04] [config] maxi-batch: 1000 [2022-01-15 14:34:04] [config] maxi-batch-sort: trg [2022-01-15 14:34:04] [config] mini-batch: 64 [2022-01-15 14:34:04] [config] mini-batch-fit: true [2022-01-15 14:34:04] [config] mini-batch-fit-step: 10 [2022-01-15 14:34:04] [config] mini-batch-track-lr: false [2022-01-15 14:34:04] [config] mini-batch-warmup: 0 [2022-01-15 14:34:04] [config] mini-batch-words: 0 [2022-01-15 14:34:04] [config] mini-batch-words-ref: 0 [2022-01-15 14:34:04] [config] model: model.npz [2022-01-15 14:34:04] [config] multi-loss-type: sum [2022-01-15 14:34:04] [config] multi-node: false [2022-01-15 14:34:04] [config] multi-node-overlap: true [2022-01-15 14:34:04] [config] n-best: false [2022-01-15 14:34:04] [config] no-nccl: false [2022-01-15 14:34:04] [config] no-reload: false [2022-01-15 14:34:04] [config] no-restore-corpus: false [2022-01-15 14:34:04] [config] normalize: 0.6 [2022-01-15 14:34:04] [config] normalize-gradient: false [2022-01-15 14:34:04] [config] num-devices: 0 [2022-01-15 14:34:04] [config] optimizer: adam [2022-01-15 14:34:04] [config] optimizer-delay: 1 [2022-01-15 14:34:04] [config] optimizer-params: [2022-01-15 14:34:04] [config] - 0.9 [2022-01-15 14:34:04] [config] - 0.98 [2022-01-15 14:34:04] [config] - 1e-09 [2022-01-15 14:34:04] [config] output-omit-bias: false [2022-01-15 14:34:04] [config] overwrite: true [2022-01-15 14:34:04] [config] precision: [2022-01-15 14:34:04] [config] - float32 [2022-01-15 14:34:04] [config] - float32 [2022-01-15 14:34:04] [config] - float32 [2022-01-15 14:34:04] [config] pretrained-model: "" [2022-01-15 14:34:04] [config] quantize-biases: false [2022-01-15 
14:34:04] [config] quantize-bits: 0 [2022-01-15 14:34:04] [config] quantize-log-based: false [2022-01-15 14:34:04] [config] quantize-optimization-steps: 0 [2022-01-15 14:34:04] [config] quiet: false [2022-01-15 14:34:04] [config] quiet-translation: false [2022-01-15 14:34:04] [config] relative-paths: false [2022-01-15 14:34:04] [config] right-left: false [2022-01-15 14:34:04] [config] save-freq: 5000 [2022-01-15 14:34:04] [config] seed: 0 [2022-01-15 14:34:04] [config] sentencepiece-alphas: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] sentencepiece-max-lines: 2000000 [2022-01-15 14:34:04] [config] sentencepiece-options: "" [2022-01-15 14:34:04] [config] shuffle: data [2022-01-15 14:34:04] [config] shuffle-in-ram: false [2022-01-15 14:34:04] [config] sigterm: save-and-exit [2022-01-15 14:34:04] [config] skip: false [2022-01-15 14:34:04] [config] sqlite: "" [2022-01-15 14:34:04] [config] sqlite-drop: false [2022-01-15 14:34:04] [config] sync-sgd: false [2022-01-15 14:34:04] [config] tempdir: /tmp [2022-01-15 14:34:04] [config] tied-embeddings: false [2022-01-15 14:34:04] [config] tied-embeddings-all: true [2022-01-15 14:34:04] [config] tied-embeddings-src: false [2022-01-15 14:34:04] [config] train-embedder-rank: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] train-sets: [2022-01-15 14:34:04] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 [2022-01-15 14:34:04] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 [2022-01-15 14:34:04] [config] transformer-aan-activation: swish [2022-01-15 14:34:04] [config] transformer-aan-depth: 2 [2022-01-15 14:34:04] [config] transformer-aan-nogate: false [2022-01-15 14:34:04] [config] transformer-decoder-autoreg: self-attention [2022-01-15 14:34:04] [config] transformer-depth-scaling: false [2022-01-15 14:34:04] [config] transformer-dim-aan: 2048 [2022-01-15 14:34:04] [config] transformer-dim-ffn: 2048 [2022-01-15 14:34:04] [config] transformer-dropout: 0.1 [2022-01-15 14:34:04] [config] transformer-dropout-attention: 0 [2022-01-15 14:34:04] [config] transformer-dropout-ffn: 0 [2022-01-15 14:34:04] [config] transformer-ffn-activation: swish [2022-01-15 14:34:04] [config] transformer-ffn-depth: 2 [2022-01-15 14:34:04] [config] transformer-guided-alignment-layer: last [2022-01-15 14:34:04] [config] transformer-heads: 8 [2022-01-15 14:34:04] [config] transformer-no-projection: false [2022-01-15 14:34:04] [config] transformer-pool: false [2022-01-15 14:34:04] [config] transformer-postprocess: dan [2022-01-15 14:34:04] [config] transformer-postprocess-emb: d [2022-01-15 14:34:04] [config] transformer-postprocess-top: "" [2022-01-15 14:34:04] [config] transformer-preprocess: "" [2022-01-15 14:34:04] [config] transformer-tied-layers: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] transformer-train-position-embeddings: false [2022-01-15 14:34:04] [config] tsv: false [2022-01-15 14:34:04] [config] tsv-fields: 0 [2022-01-15 14:34:04] [config] type: transformer [2022-01-15 14:34:04] [config] ulr: false [2022-01-15 14:34:04] [config] ulr-dim-emb: 0 [2022-01-15 14:34:04] [config] ulr-dropout: 0 [2022-01-15 14:34:04] [config] ulr-keys-vectors: "" [2022-01-15 14:34:04] [config] ulr-query-vectors: "" [2022-01-15 14:34:04] [config] ulr-softmax-temperature: 1 [2022-01-15 14:34:04] [config] ulr-trainable-transformation: false [2022-01-15 14:34:04] [config] unlikelihood-loss: false [2022-01-15 14:34:04] [config] valid-freq: 5000 [2022-01-15 14:34:04] [config] 
valid-log: "" [2022-01-15 14:34:04] [config] valid-max-length: 1000 [2022-01-15 14:34:04] [config] valid-metrics: [2022-01-15 14:34:04] [config] - cross-entropy [2022-01-15 14:34:04] [config] valid-mini-batch: 32 [2022-01-15 14:34:04] [config] valid-reset-stalled: false [2022-01-15 14:34:04] [config] valid-script-args: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] valid-script-path: "" [2022-01-15 14:34:04] [config] valid-sets: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] valid-translation-output: "" [2022-01-15 14:34:04] [config] vocabs: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] word-penalty: 0 [2022-01-15 14:34:04] [config] word-scores: false [2022-01-15 14:34:04] [config] workspace: 10000 [2022-01-15 14:34:04] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 14:34:04] [training] Using single-device training [2022-01-15 14:34:04] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 14:34:04] [data] Vocabularies will be built separately for each file. [2022-01-15 14:34:04] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 [2022-01-15 14:34:04] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 [2022-01-15 14:34:04] [data] Creating vocabulary /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000.yml from /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 [2022-01-15 14:34:12] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000.yml [2022-01-15 14:34:12] [data] Setting vocabulary size for input 0 to 18,703 [2022-01-15 14:34:12] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 [2022-01-15 14:34:12] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 [2022-01-15 14:34:12] [data] Creating vocabulary /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000.yml from /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 [2022-01-15 14:34:20] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000.yml [2022-01-15 14:34:20] [data] Setting vocabulary size for input 1 to 27,729 [2022-01-15 14:34:20] [comm] Compiled without MPI support. 
Running as a single process on s470607-gpu
[2022-01-15 14:34:20] [batching] Collecting statistics for batch fitting with step size 10
[2022-01-15 14:34:20] [memory] Extending reserved space to 10112 MB (device gpu0)
[2022-01-15 14:34:20] Error: Requested shape shape=27729x512 size=14197248 for existing parameter 'Wemb' does not match original shape shape=18703x512 size=9575936
[2022-01-15 14:34:20] Error: Aborted from marian::Expr marian::ExpressionGraph::param(const string&, const marian::Shape&, marian::Ptr&, marian::Type, bool, bool) in /home/wmi/Workspace/marian/src/graph/expression_graph.h:314
[CALL STACK]
[0x563b98926642] marian::ExpressionGraph:: param (std::__cxx11::basic_string,std::allocator> const&, marian::Shape const&, std::shared_ptr const&, marian::Type, bool, bool) + 0x992
[0x563b99010899] marian::Embedding:: Embedding (std::shared_ptr, std::shared_ptr) + 0x4c9
[0x563b99021475] std::shared_ptr marian:: New const&,std::shared_ptr&>(std::shared_ptr const&, std::shared_ptr&) + 0x85
[0x563b9901127d] marian::EncoderDecoderLayerBase:: createEmbeddingLayer () const + 0x59d
[0x563b99011be5] marian::EncoderDecoderLayerBase:: getEmbeddingLayer (bool) const + 0x145
[0x563b98bd802f] marian::DecoderBase:: embeddingsFromBatch (std::shared_ptr, std::shared_ptr, std::shared_ptr) + 0x8f
[0x563b98c5c766] marian::EncoderDecoder:: stepAll (std::shared_ptr, std::shared_ptr, bool) + 0x196
[0x563b98c12c69] marian::models::EncoderDecoderCECost:: apply (std::shared_ptr, std::shared_ptr, std::shared_ptr, bool) + 0x119
[0x563b98865c82] marian::models::Trainer:: build (std::shared_ptr, std::shared_ptr, bool) + 0xb2
[0x563b98cfa5f4] marian::GraphGroup:: collectStats (std::shared_ptr, std::shared_ptr, std::vector,std::allocator>> const&, double) + 0xb84
[0x563b9893c269] marian::Train:: run () + 0x2e9
[0x563b98844389] mainTrainer (int, char**) + 0x5e9
[0x563b988021bc] main + 0x3c
[0x7ffa93f970b3] __libc_start_main + 0xf3
[0x563b98842b0e] _start + 0x2e
[2022-01-15 14:38:37] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800
[2022-01-15 14:38:37] [marian] Running on s470607-gpu as process 3044 with command line:
[2022-01-15 14:38:37] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1
[2022-01-15 14:38:37] [config] after: 0e
[2022-01-15 14:38:37] [config] after-batches: 0
[2022-01-15 14:38:37] [config] after-epochs: 1
[2022-01-15 14:38:37] [config] all-caps-every: 0
[2022-01-15 14:38:37] [config] allow-unk: false
[2022-01-15 14:38:37] [config] authors: false
[2022-01-15 14:38:37] [config] beam-size: 6
[2022-01-15 14:38:37] [config] bert-class-symbol: "[CLS]"
[2022-01-15 14:38:37] [config] bert-mask-symbol: "[MASK]"
[2022-01-15 14:38:37] [config] bert-masking-fraction: 0.15
[2022-01-15 14:38:37] [config] bert-sep-symbol: "[SEP]"
[2022-01-15 14:38:37] [config] bert-train-type-embeddings: true
[2022-01-15 14:38:37] [config]
bert-type-vocab-size: 2 [2022-01-15 14:38:37] [config] build-info: "" [2022-01-15 14:38:37] [config] cite: false [2022-01-15 14:38:37] [config] clip-norm: 5 [2022-01-15 14:38:37] [config] cost-scaling: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] cost-type: ce-sum [2022-01-15 14:38:37] [config] cpu-threads: 0 [2022-01-15 14:38:37] [config] data-weighting: "" [2022-01-15 14:38:37] [config] data-weighting-type: sentence [2022-01-15 14:38:37] [config] dec-cell: gru [2022-01-15 14:38:37] [config] dec-cell-base-depth: 2 [2022-01-15 14:38:37] [config] dec-cell-high-depth: 1 [2022-01-15 14:38:37] [config] dec-depth: 6 [2022-01-15 14:38:37] [config] devices: [2022-01-15 14:38:37] [config] - 0 [2022-01-15 14:38:37] [config] dim-emb: 512 [2022-01-15 14:38:37] [config] dim-rnn: 1024 [2022-01-15 14:38:37] [config] dim-vocabs: [2022-01-15 14:38:37] [config] - 0 [2022-01-15 14:38:37] [config] - 0 [2022-01-15 14:38:37] [config] disp-first: 0 [2022-01-15 14:38:37] [config] disp-freq: 500 [2022-01-15 14:38:37] [config] disp-label-counts: true [2022-01-15 14:38:37] [config] dropout-rnn: 0 [2022-01-15 14:38:37] [config] dropout-src: 0 [2022-01-15 14:38:37] [config] dropout-trg: 0 [2022-01-15 14:38:37] [config] dump-config: "" [2022-01-15 14:38:37] [config] early-stopping: 10 [2022-01-15 14:38:37] [config] embedding-fix-src: false [2022-01-15 14:38:37] [config] embedding-fix-trg: false [2022-01-15 14:38:37] [config] embedding-normalization: false [2022-01-15 14:38:37] [config] embedding-vectors: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] enc-cell: gru [2022-01-15 14:38:37] [config] enc-cell-depth: 1 [2022-01-15 14:38:37] [config] enc-depth: 6 [2022-01-15 14:38:37] [config] enc-type: bidirectional [2022-01-15 14:38:37] [config] english-title-case-every: 0 [2022-01-15 14:38:37] [config] exponential-smoothing: 0.0001 [2022-01-15 14:38:37] [config] factor-weight: 1 [2022-01-15 14:38:37] [config] grad-dropping-momentum: 0 [2022-01-15 14:38:37] [config] grad-dropping-rate: 0 [2022-01-15 14:38:37] [config] grad-dropping-warmup: 100 [2022-01-15 14:38:37] [config] gradient-checkpointing: false [2022-01-15 14:38:37] [config] guided-alignment: none [2022-01-15 14:38:37] [config] guided-alignment-cost: mse [2022-01-15 14:38:37] [config] guided-alignment-weight: 0.1 [2022-01-15 14:38:37] [config] ignore-model-config: false [2022-01-15 14:38:37] [config] input-types: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] interpolate-env-vars: false [2022-01-15 14:38:37] [config] keep-best: false [2022-01-15 14:38:37] [config] label-smoothing: 0.1 [2022-01-15 14:38:37] [config] layer-normalization: false [2022-01-15 14:38:37] [config] learn-rate: 0.0003 [2022-01-15 14:38:37] [config] lemma-dim-emb: 0 [2022-01-15 14:38:37] [config] log: /home/wmi/train.log [2022-01-15 14:38:37] [config] log-level: info [2022-01-15 14:38:37] [config] log-time-zone: "" [2022-01-15 14:38:37] [config] logical-epoch: [2022-01-15 14:38:37] [config] - 1e [2022-01-15 14:38:37] [config] - 0 [2022-01-15 14:38:37] [config] lr-decay: 0 [2022-01-15 14:38:37] [config] lr-decay-freq: 50000 [2022-01-15 14:38:37] [config] lr-decay-inv-sqrt: [2022-01-15 14:38:37] [config] - 16000 [2022-01-15 14:38:37] [config] lr-decay-repeat-warmup: false [2022-01-15 14:38:37] [config] lr-decay-reset-optimizer: false [2022-01-15 14:38:37] [config] lr-decay-start: [2022-01-15 14:38:37] [config] - 10 [2022-01-15 14:38:37] [config] - 1 [2022-01-15 14:38:37] [config] lr-decay-strategy: epoch+stalled [2022-01-15 14:38:37] 
[config] lr-report: true [2022-01-15 14:38:37] [config] lr-warmup: 16000 [2022-01-15 14:38:37] [config] lr-warmup-at-reload: false [2022-01-15 14:38:37] [config] lr-warmup-cycle: false [2022-01-15 14:38:37] [config] lr-warmup-start-rate: 0 [2022-01-15 14:38:37] [config] max-length: 100 [2022-01-15 14:38:37] [config] max-length-crop: false [2022-01-15 14:38:37] [config] max-length-factor: 3 [2022-01-15 14:38:37] [config] maxi-batch: 1000 [2022-01-15 14:38:37] [config] maxi-batch-sort: trg [2022-01-15 14:38:37] [config] mini-batch: 64 [2022-01-15 14:38:37] [config] mini-batch-fit: true [2022-01-15 14:38:37] [config] mini-batch-fit-step: 10 [2022-01-15 14:38:37] [config] mini-batch-track-lr: false [2022-01-15 14:38:37] [config] mini-batch-warmup: 0 [2022-01-15 14:38:37] [config] mini-batch-words: 0 [2022-01-15 14:38:37] [config] mini-batch-words-ref: 0 [2022-01-15 14:38:37] [config] model: model.npz [2022-01-15 14:38:37] [config] multi-loss-type: sum [2022-01-15 14:38:37] [config] multi-node: false [2022-01-15 14:38:37] [config] multi-node-overlap: true [2022-01-15 14:38:37] [config] n-best: false [2022-01-15 14:38:37] [config] no-nccl: false [2022-01-15 14:38:37] [config] no-reload: false [2022-01-15 14:38:37] [config] no-restore-corpus: false [2022-01-15 14:38:37] [config] normalize: 0.6 [2022-01-15 14:38:37] [config] normalize-gradient: false [2022-01-15 14:38:37] [config] num-devices: 0 [2022-01-15 14:38:37] [config] optimizer: adam [2022-01-15 14:38:37] [config] optimizer-delay: 1 [2022-01-15 14:38:37] [config] optimizer-params: [2022-01-15 14:38:37] [config] - 0.9 [2022-01-15 14:38:37] [config] - 0.98 [2022-01-15 14:38:37] [config] - 1e-09 [2022-01-15 14:38:37] [config] output-omit-bias: false [2022-01-15 14:38:37] [config] overwrite: true [2022-01-15 14:38:37] [config] precision: [2022-01-15 14:38:37] [config] - float32 [2022-01-15 14:38:37] [config] - float32 [2022-01-15 14:38:37] [config] - float32 [2022-01-15 14:38:37] [config] pretrained-model: "" [2022-01-15 14:38:37] [config] quantize-biases: false [2022-01-15 14:38:37] [config] quantize-bits: 0 [2022-01-15 14:38:37] [config] quantize-log-based: false [2022-01-15 14:38:37] [config] quantize-optimization-steps: 0 [2022-01-15 14:38:37] [config] quiet: false [2022-01-15 14:38:37] [config] quiet-translation: false [2022-01-15 14:38:37] [config] relative-paths: false [2022-01-15 14:38:37] [config] right-left: false [2022-01-15 14:38:37] [config] save-freq: 5000 [2022-01-15 14:38:37] [config] seed: 0 [2022-01-15 14:38:37] [config] sentencepiece-alphas: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] sentencepiece-max-lines: 2000000 [2022-01-15 14:38:37] [config] sentencepiece-options: "" [2022-01-15 14:38:37] [config] shuffle: data [2022-01-15 14:38:37] [config] shuffle-in-ram: false [2022-01-15 14:38:37] [config] sigterm: save-and-exit [2022-01-15 14:38:37] [config] skip: false [2022-01-15 14:38:37] [config] sqlite: "" [2022-01-15 14:38:37] [config] sqlite-drop: false [2022-01-15 14:38:37] [config] sync-sgd: false [2022-01-15 14:38:37] [config] tempdir: /tmp [2022-01-15 14:38:37] [config] tied-embeddings: true [2022-01-15 14:38:37] [config] tied-embeddings-all: false [2022-01-15 14:38:37] [config] tied-embeddings-src: false [2022-01-15 14:38:37] [config] train-embedder-rank: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] train-sets: [2022-01-15 14:38:37] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 [2022-01-15 14:38:37] [config] - 
/home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 [2022-01-15 14:38:37] [config] transformer-aan-activation: swish [2022-01-15 14:38:37] [config] transformer-aan-depth: 2 [2022-01-15 14:38:37] [config] transformer-aan-nogate: false [2022-01-15 14:38:37] [config] transformer-decoder-autoreg: self-attention [2022-01-15 14:38:37] [config] transformer-depth-scaling: false [2022-01-15 14:38:37] [config] transformer-dim-aan: 2048 [2022-01-15 14:38:37] [config] transformer-dim-ffn: 2048 [2022-01-15 14:38:37] [config] transformer-dropout: 0.1 [2022-01-15 14:38:37] [config] transformer-dropout-attention: 0 [2022-01-15 14:38:37] [config] transformer-dropout-ffn: 0 [2022-01-15 14:38:37] [config] transformer-ffn-activation: swish [2022-01-15 14:38:37] [config] transformer-ffn-depth: 2 [2022-01-15 14:38:37] [config] transformer-guided-alignment-layer: last [2022-01-15 14:38:37] [config] transformer-heads: 8 [2022-01-15 14:38:37] [config] transformer-no-projection: false [2022-01-15 14:38:37] [config] transformer-pool: false [2022-01-15 14:38:37] [config] transformer-postprocess: dan [2022-01-15 14:38:37] [config] transformer-postprocess-emb: d [2022-01-15 14:38:37] [config] transformer-postprocess-top: "" [2022-01-15 14:38:37] [config] transformer-preprocess: "" [2022-01-15 14:38:37] [config] transformer-tied-layers: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] transformer-train-position-embeddings: false [2022-01-15 14:38:37] [config] tsv: false [2022-01-15 14:38:37] [config] tsv-fields: 0 [2022-01-15 14:38:37] [config] type: transformer [2022-01-15 14:38:37] [config] ulr: false [2022-01-15 14:38:37] [config] ulr-dim-emb: 0 [2022-01-15 14:38:37] [config] ulr-dropout: 0 [2022-01-15 14:38:37] [config] ulr-keys-vectors: "" [2022-01-15 14:38:37] [config] ulr-query-vectors: "" [2022-01-15 14:38:37] [config] ulr-softmax-temperature: 1 [2022-01-15 14:38:37] [config] ulr-trainable-transformation: false [2022-01-15 14:38:37] [config] unlikelihood-loss: false [2022-01-15 14:38:37] [config] valid-freq: 5000 [2022-01-15 14:38:37] [config] valid-log: "" [2022-01-15 14:38:37] [config] valid-max-length: 1000 [2022-01-15 14:38:37] [config] valid-metrics: [2022-01-15 14:38:37] [config] - cross-entropy [2022-01-15 14:38:37] [config] valid-mini-batch: 32 [2022-01-15 14:38:37] [config] valid-reset-stalled: false [2022-01-15 14:38:37] [config] valid-script-args: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] valid-script-path: "" [2022-01-15 14:38:37] [config] valid-sets: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] valid-translation-output: "" [2022-01-15 14:38:37] [config] vocabs: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] word-penalty: 0 [2022-01-15 14:38:37] [config] word-scores: false [2022-01-15 14:38:37] [config] workspace: 10000 [2022-01-15 14:38:37] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 14:38:37] [training] Using single-device training [2022-01-15 14:38:37] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 14:38:37] [data] Vocabularies will be built separately for each file. 
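Note on the L.r. column in the progress lines below: with --learn-rate 0.0003 and --lr-warmup 16000 the learning rate ramps linearly with the update count, so update 500 gives 0.0003 * 500 / 16000 = 9.375e-06 and update 1000 gives 1.875e-05, which matches the logged values exactly. A quick way to reproduce the warmup values (the linear-warmup formula is confirmed by the log itself; how --lr-decay-inv-sqrt 16000 scales the rate after update 16000 is not exercised in this excerpt, so it is left out):

    awk -v up=500  'BEGIN { print 0.0003 * up / 16000 }'   # -> 9.375e-06, matches "L.r. 9.3750e-06"
    awk -v up=1000 'BEGIN { print 0.0003 * up / 16000 }'   # -> 1.875e-05, matches "L.r. 1.8750e-05"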
[2022-01-15 14:38:37] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000
[2022-01-15 14:38:37] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000.yml
[2022-01-15 14:38:37] [data] Setting vocabulary size for input 0 to 18,703
[2022-01-15 14:38:37] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000
[2022-01-15 14:38:37] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000.yml
[2022-01-15 14:38:37] [data] Setting vocabulary size for input 1 to 27,729
[2022-01-15 14:38:37] [comm] Compiled without MPI support. Running as a single process on s470607-gpu
[2022-01-15 14:38:37] [batching] Collecting statistics for batch fitting with step size 10
[2022-01-15 14:38:37] [memory] Extending reserved space to 10112 MB (device gpu0)
[2022-01-15 14:38:37] [logits] Applying loss function for 1 factor(s)
[2022-01-15 14:38:37] [memory] Reserving 259 MB, device gpu0
[2022-01-15 14:38:39] [gpu] 16-bit TensorCores enabled for float32 matrix operations
[2022-01-15 14:38:39] [memory] Reserving 259 MB, device gpu0
[2022-01-15 14:38:49] [batching] Done. Typical MB size is 9,199 target words
[2022-01-15 14:38:49] [memory] Extending reserved space to 10112 MB (device gpu0)
[2022-01-15 14:38:49] Training started
[2022-01-15 14:38:49] [data] Shuffling data
[2022-01-15 14:38:51] [data] Done reading 3,103,819 sentences
[2022-01-15 14:39:07] [data] Done shuffling 3,103,819 sentences to temp files
[2022-01-15 14:39:08] [memory] Reserving 259 MB, device gpu0
[2022-01-15 14:39:08] [memory] Reserving 259 MB, device gpu0
[2022-01-15 14:39:08] [memory] Reserving 518 MB, device gpu0
[2022-01-15 14:39:08] [memory] Reserving 259 MB, device gpu0
[2022-01-15 14:40:24] Ep. 1 : Up. 500 : Sen. 112,080 : Cost 9.74822521 * 3,311,741 @ 5,138 after 3,311,741 : Time 94.75s : 34952.55 words/s : L.r. 9.3750e-06
[2022-01-15 14:41:41] Ep. 1 : Up. 1000 : Sen. 222,538 : Cost 8.78999043 * 3,264,698 @ 6,432 after 6,576,439 : Time 76.66s : 42584.21 words/s : L.r. 1.8750e-05
[2022-01-15 14:42:58] Ep. 1 : Up. 1500 : Sen. 335,655 : Cost 8.44620609 * 3,307,791 @ 4,368 after 9,884,230 : Time 77.38s : 42747.19 words/s : L.r. 2.8125e-05
[2022-01-15 14:44:15] Ep. 1 : Up. 2000 : Sen. 445,738 : Cost 8.12953186 * 3,248,721 @ 6,930 after 13,132,951 : Time 76.93s : 42228.52 words/s : L.r. 3.7500e-05
[2022-01-15 14:45:33] Ep. 1 : Up. 2500 : Sen. 557,935 : Cost 7.74989128 * 3,303,644 @ 6,360 after 16,436,595 : Time 77.69s : 42520.98 words/s : L.r.
4.6875e-05 [2022-01-15 15:22:51] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 15:22:51] [marian] Running on s470607-gpu as process 3060 with command line: [2022-01-15 15:22:51] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 15:22:51] [config] after: 0e [2022-01-15 15:22:51] [config] after-batches: 0 [2022-01-15 15:22:51] [config] after-epochs: 1 [2022-01-15 15:22:51] [config] all-caps-every: 0 [2022-01-15 15:22:51] [config] allow-unk: false [2022-01-15 15:22:51] [config] authors: false [2022-01-15 15:22:51] [config] beam-size: 6 [2022-01-15 15:22:51] [config] bert-class-symbol: "[CLS]" [2022-01-15 15:22:51] [config] bert-mask-symbol: "[MASK]" [2022-01-15 15:22:51] [config] bert-masking-fraction: 0.15 [2022-01-15 15:22:51] [config] bert-sep-symbol: "[SEP]" [2022-01-15 15:22:51] [config] bert-train-type-embeddings: true [2022-01-15 15:22:51] [config] bert-type-vocab-size: 2 [2022-01-15 15:22:51] [config] build-info: "" [2022-01-15 15:22:51] [config] cite: false [2022-01-15 15:22:51] [config] clip-norm: 5 [2022-01-15 15:22:51] [config] cost-scaling: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] cost-type: ce-sum [2022-01-15 15:22:51] [config] cpu-threads: 0 [2022-01-15 15:22:51] [config] data-weighting: "" [2022-01-15 15:22:51] [config] data-weighting-type: sentence [2022-01-15 15:22:51] [config] dec-cell: gru [2022-01-15 15:22:51] [config] dec-cell-base-depth: 2 [2022-01-15 15:22:51] [config] dec-cell-high-depth: 1 [2022-01-15 15:22:51] [config] dec-depth: 6 [2022-01-15 15:22:51] [config] devices: [2022-01-15 15:22:51] [config] - 0 [2022-01-15 15:22:51] [config] dim-emb: 512 [2022-01-15 15:22:51] [config] dim-rnn: 1024 [2022-01-15 15:22:51] [config] dim-vocabs: [2022-01-15 15:22:51] [config] - 0 [2022-01-15 15:22:51] [config] - 0 [2022-01-15 15:22:51] [config] disp-first: 0 [2022-01-15 15:22:51] [config] disp-freq: 500 [2022-01-15 15:22:51] [config] disp-label-counts: true [2022-01-15 15:22:51] [config] dropout-rnn: 0 [2022-01-15 15:22:51] [config] dropout-src: 0 [2022-01-15 15:22:51] [config] dropout-trg: 0 [2022-01-15 15:22:51] [config] dump-config: "" [2022-01-15 15:22:51] [config] early-stopping: 10 [2022-01-15 15:22:51] [config] embedding-fix-src: false [2022-01-15 15:22:51] [config] embedding-fix-trg: false [2022-01-15 15:22:51] [config] embedding-normalization: false [2022-01-15 15:22:51] [config] embedding-vectors: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] enc-cell: gru [2022-01-15 15:22:51] [config] enc-cell-depth: 1 [2022-01-15 15:22:51] [config] enc-depth: 6 [2022-01-15 15:22:51] [config] enc-type: bidirectional [2022-01-15 15:22:51] [config] english-title-case-every: 0 [2022-01-15 15:22:51] [config] exponential-smoothing: 0.0001 [2022-01-15 15:22:51] [config] factor-weight: 1 [2022-01-15 15:22:51] [config] grad-dropping-momentum: 0 
[2022-01-15 15:22:51] [config] grad-dropping-rate: 0 [2022-01-15 15:22:51] [config] grad-dropping-warmup: 100 [2022-01-15 15:22:51] [config] gradient-checkpointing: false [2022-01-15 15:22:51] [config] guided-alignment: none [2022-01-15 15:22:51] [config] guided-alignment-cost: mse [2022-01-15 15:22:51] [config] guided-alignment-weight: 0.1 [2022-01-15 15:22:51] [config] ignore-model-config: false [2022-01-15 15:22:51] [config] input-types: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] interpolate-env-vars: false [2022-01-15 15:22:51] [config] keep-best: false [2022-01-15 15:22:51] [config] label-smoothing: 0.1 [2022-01-15 15:22:51] [config] layer-normalization: false [2022-01-15 15:22:51] [config] learn-rate: 0.0003 [2022-01-15 15:22:51] [config] lemma-dim-emb: 0 [2022-01-15 15:22:51] [config] log: /home/wmi/train.log [2022-01-15 15:22:51] [config] log-level: info [2022-01-15 15:22:51] [config] log-time-zone: "" [2022-01-15 15:22:51] [config] logical-epoch: [2022-01-15 15:22:51] [config] - 1e [2022-01-15 15:22:51] [config] - 0 [2022-01-15 15:22:51] [config] lr-decay: 0 [2022-01-15 15:22:51] [config] lr-decay-freq: 50000 [2022-01-15 15:22:51] [config] lr-decay-inv-sqrt: [2022-01-15 15:22:51] [config] - 16000 [2022-01-15 15:22:51] [config] lr-decay-repeat-warmup: false [2022-01-15 15:22:51] [config] lr-decay-reset-optimizer: false [2022-01-15 15:22:51] [config] lr-decay-start: [2022-01-15 15:22:51] [config] - 10 [2022-01-15 15:22:51] [config] - 1 [2022-01-15 15:22:51] [config] lr-decay-strategy: epoch+stalled [2022-01-15 15:22:51] [config] lr-report: true [2022-01-15 15:22:51] [config] lr-warmup: 16000 [2022-01-15 15:22:51] [config] lr-warmup-at-reload: false [2022-01-15 15:22:51] [config] lr-warmup-cycle: false [2022-01-15 15:22:51] [config] lr-warmup-start-rate: 0 [2022-01-15 15:22:51] [config] max-length: 100 [2022-01-15 15:22:51] [config] max-length-crop: false [2022-01-15 15:22:51] [config] max-length-factor: 3 [2022-01-15 15:22:51] [config] maxi-batch: 1000 [2022-01-15 15:22:51] [config] maxi-batch-sort: trg [2022-01-15 15:22:51] [config] mini-batch: 64 [2022-01-15 15:22:51] [config] mini-batch-fit: true [2022-01-15 15:22:51] [config] mini-batch-fit-step: 10 [2022-01-15 15:22:51] [config] mini-batch-track-lr: false [2022-01-15 15:22:51] [config] mini-batch-warmup: 0 [2022-01-15 15:22:51] [config] mini-batch-words: 0 [2022-01-15 15:22:51] [config] mini-batch-words-ref: 0 [2022-01-15 15:22:51] [config] model: model.npz [2022-01-15 15:22:51] [config] multi-loss-type: sum [2022-01-15 15:22:51] [config] multi-node: false [2022-01-15 15:22:51] [config] multi-node-overlap: true [2022-01-15 15:22:51] [config] n-best: false [2022-01-15 15:22:51] [config] no-nccl: false [2022-01-15 15:22:51] [config] no-reload: false [2022-01-15 15:22:51] [config] no-restore-corpus: false [2022-01-15 15:22:51] [config] normalize: 0.6 [2022-01-15 15:22:51] [config] normalize-gradient: false [2022-01-15 15:22:51] [config] num-devices: 0 [2022-01-15 15:22:51] [config] optimizer: adam [2022-01-15 15:22:51] [config] optimizer-delay: 1 [2022-01-15 15:22:51] [config] optimizer-params: [2022-01-15 15:22:51] [config] - 0.9 [2022-01-15 15:22:51] [config] - 0.98 [2022-01-15 15:22:51] [config] - 1e-09 [2022-01-15 15:22:51] [config] output-omit-bias: false [2022-01-15 15:22:51] [config] overwrite: true [2022-01-15 15:22:51] [config] precision: [2022-01-15 15:22:51] [config] - float32 [2022-01-15 15:22:51] [config] - float32 [2022-01-15 15:22:51] [config] - float32 [2022-01-15 15:22:51] [config] 
pretrained-model: "" [2022-01-15 15:22:51] [config] quantize-biases: false [2022-01-15 15:22:51] [config] quantize-bits: 0 [2022-01-15 15:22:51] [config] quantize-log-based: false [2022-01-15 15:22:51] [config] quantize-optimization-steps: 0 [2022-01-15 15:22:51] [config] quiet: false [2022-01-15 15:22:51] [config] quiet-translation: false [2022-01-15 15:22:51] [config] relative-paths: false [2022-01-15 15:22:51] [config] right-left: false [2022-01-15 15:22:51] [config] save-freq: 5000 [2022-01-15 15:22:51] [config] seed: 0 [2022-01-15 15:22:51] [config] sentencepiece-alphas: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] sentencepiece-max-lines: 2000000 [2022-01-15 15:22:51] [config] sentencepiece-options: "" [2022-01-15 15:22:51] [config] shuffle: data [2022-01-15 15:22:51] [config] shuffle-in-ram: false [2022-01-15 15:22:51] [config] sigterm: save-and-exit [2022-01-15 15:22:51] [config] skip: false [2022-01-15 15:22:51] [config] sqlite: "" [2022-01-15 15:22:51] [config] sqlite-drop: false [2022-01-15 15:22:51] [config] sync-sgd: false [2022-01-15 15:22:51] [config] tempdir: /tmp [2022-01-15 15:22:51] [config] tied-embeddings: true [2022-01-15 15:22:51] [config] tied-embeddings-all: false [2022-01-15 15:22:51] [config] tied-embeddings-src: false [2022-01-15 15:22:51] [config] train-embedder-rank: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] train-sets: [2022-01-15 15:22:51] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 [2022-01-15 15:22:51] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 [2022-01-15 15:22:51] [config] transformer-aan-activation: swish [2022-01-15 15:22:51] [config] transformer-aan-depth: 2 [2022-01-15 15:22:51] [config] transformer-aan-nogate: false [2022-01-15 15:22:51] [config] transformer-decoder-autoreg: self-attention [2022-01-15 15:22:51] [config] transformer-depth-scaling: false [2022-01-15 15:22:51] [config] transformer-dim-aan: 2048 [2022-01-15 15:22:51] [config] transformer-dim-ffn: 2048 [2022-01-15 15:22:51] [config] transformer-dropout: 0.1 [2022-01-15 15:22:51] [config] transformer-dropout-attention: 0 [2022-01-15 15:22:51] [config] transformer-dropout-ffn: 0 [2022-01-15 15:22:51] [config] transformer-ffn-activation: swish [2022-01-15 15:22:51] [config] transformer-ffn-depth: 2 [2022-01-15 15:22:51] [config] transformer-guided-alignment-layer: last [2022-01-15 15:22:51] [config] transformer-heads: 8 [2022-01-15 15:22:51] [config] transformer-no-projection: false [2022-01-15 15:22:51] [config] transformer-pool: false [2022-01-15 15:22:51] [config] transformer-postprocess: dan [2022-01-15 15:22:51] [config] transformer-postprocess-emb: d [2022-01-15 15:22:51] [config] transformer-postprocess-top: "" [2022-01-15 15:22:51] [config] transformer-preprocess: "" [2022-01-15 15:22:51] [config] transformer-tied-layers: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] transformer-train-position-embeddings: false [2022-01-15 15:22:51] [config] tsv: false [2022-01-15 15:22:51] [config] tsv-fields: 0 [2022-01-15 15:22:51] [config] type: transformer [2022-01-15 15:22:51] [config] ulr: false [2022-01-15 15:22:51] [config] ulr-dim-emb: 0 [2022-01-15 15:22:51] [config] ulr-dropout: 0 [2022-01-15 15:22:51] [config] ulr-keys-vectors: "" [2022-01-15 15:22:51] [config] ulr-query-vectors: "" [2022-01-15 15:22:51] [config] ulr-softmax-temperature: 1 [2022-01-15 15:22:51] [config] ulr-trainable-transformation: false [2022-01-15 15:22:51] [config] unlikelihood-loss: 
false [2022-01-15 15:22:51] [config] valid-freq: 5000 [2022-01-15 15:22:51] [config] valid-log: "" [2022-01-15 15:22:51] [config] valid-max-length: 1000 [2022-01-15 15:22:51] [config] valid-metrics: [2022-01-15 15:22:51] [config] - cross-entropy [2022-01-15 15:22:51] [config] valid-mini-batch: 32 [2022-01-15 15:22:51] [config] valid-reset-stalled: false [2022-01-15 15:22:51] [config] valid-script-args: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] valid-script-path: "" [2022-01-15 15:22:51] [config] valid-sets: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] valid-translation-output: "" [2022-01-15 15:22:51] [config] vocabs: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] word-penalty: 0 [2022-01-15 15:22:51] [config] word-scores: false [2022-01-15 15:22:51] [config] workspace: 10000 [2022-01-15 15:22:51] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 15:22:51] [training] Using single-device training [2022-01-15 15:22:51] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 15:22:51] [data] Vocabularies will be built separately for each file. [2022-01-15 15:22:51] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 [2022-01-15 15:22:51] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000.yml [2022-01-15 15:22:51] [data] Setting vocabulary size for input 0 to 18,703 [2022-01-15 15:22:51] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 [2022-01-15 15:22:51] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000.yml [2022-01-15 15:22:51] [data] Setting vocabulary size for input 1 to 27,729 [2022-01-15 15:22:51] [comm] Compiled without MPI support. Running as a single process on s470607-gpu [2022-01-15 15:22:51] [batching] Collecting statistics for batch fitting with step size 10 [2022-01-15 15:22:52] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 15:22:52] [logits] Applying loss function for 1 factor(s) [2022-01-15 15:22:52] [memory] Reserving 259 MB, device gpu0 [2022-01-15 15:22:52] [gpu] 16-bit TensorCores enabled for float32 matrix operations [2022-01-15 15:22:52] [memory] Reserving 259 MB, device gpu0 [2022-01-15 15:23:03] [batching] Done. Typical MB size is 9,199 target words [2022-01-15 15:23:03] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 15:23:03] Training started [2022-01-15 15:23:03] [data] Shuffling data [2022-01-15 15:23:05] [data] Done reading 3,103,819 sentences [2022-01-15 15:23:20] [data] Done shuffling 3,103,819 sentences to temp files [2022-01-15 15:23:21] [memory] Reserving 259 MB, device gpu0 [2022-01-15 15:23:21] [memory] Reserving 259 MB, device gpu0 [2022-01-15 15:23:21] [memory] Reserving 518 MB, device gpu0 [2022-01-15 15:23:21] [memory] Reserving 259 MB, device gpu0 [2022-01-15 15:24:37] Ep. 1 : Up. 500 : Sen. 109,644 : Cost 9.73378372 * 3,294,320 @ 7,260 after 3,294,320 : Time 94.41s : 34893.02 words/s : L.r. 9.3750e-06 [2022-01-15 15:25:54] Ep. 1 : Up. 1000 : Sen. 226,634 : Cost 8.77455235 * 3,280,847 @ 8,930 after 6,575,167 : Time 76.53s : 42867.68 words/s : L.r. 1.8750e-05 [2022-01-15 15:27:11] Ep. 1 : Up. 1500 : Sen. 
335,958 : Cost 8.43428230 * 3,298,838 @ 8,325 after 9,874,005 : Time 77.55s : 42538.42 words/s : L.r. 2.8125e-05 [2022-01-15 15:28:29] Ep. 1 : Up. 2000 : Sen. 447,084 : Cost 8.11272335 * 3,301,579 @ 6,290 after 13,175,584 : Time 77.65s : 42519.08 words/s : L.r. 3.7500e-05 [2022-01-15 17:18:23] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:18:23] [marian] Running on s470607-gpu as process 3435 with command line: [2022-01-15 17:18:23] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 --vocabs /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:18:23] [config] after: 0e [2022-01-15 17:18:23] [config] after-batches: 0 [2022-01-15 17:18:23] [config] after-epochs: 1 [2022-01-15 17:18:23] [config] all-caps-every: 0 [2022-01-15 17:18:23] [config] allow-unk: false [2022-01-15 17:18:23] [config] authors: false [2022-01-15 17:18:23] [config] beam-size: 6 [2022-01-15 17:18:23] [config] bert-class-symbol: "[CLS]" [2022-01-15 17:18:23] [config] bert-mask-symbol: "[MASK]" [2022-01-15 17:18:23] [config] bert-masking-fraction: 0.15 [2022-01-15 17:18:23] [config] bert-sep-symbol: "[SEP]" [2022-01-15 17:18:23] [config] bert-train-type-embeddings: true [2022-01-15 17:18:23] [config] bert-type-vocab-size: 2 [2022-01-15 17:18:23] [config] build-info: "" [2022-01-15 17:18:23] [config] cite: false [2022-01-15 17:18:23] [config] clip-norm: 5 [2022-01-15 17:18:23] [config] cost-scaling: [2022-01-15 17:18:23] [config] [] [2022-01-15 17:18:23] [config] cost-type: ce-sum [2022-01-15 17:18:23] [config] cpu-threads: 0 [2022-01-15 17:18:23] [config] data-weighting: "" [2022-01-15 17:18:23] [config] data-weighting-type: sentence [2022-01-15 17:18:23] [config] dec-cell: gru [2022-01-15 17:18:23] [config] dec-cell-base-depth: 2 [2022-01-15 17:18:23] [config] dec-cell-high-depth: 1 [2022-01-15 17:18:23] [config] dec-depth: 6 [2022-01-15 17:18:23] [config] devices: [2022-01-15 17:18:23] [config] - 0 [2022-01-15 17:18:23] [config] dim-emb: 512 [2022-01-15 17:18:23] [config] dim-rnn: 1024 [2022-01-15 17:18:23] [config] dim-vocabs: [2022-01-15 17:18:23] [config] - 0 [2022-01-15 17:18:23] [config] - 0 [2022-01-15 17:18:23] [config] disp-first: 0 [2022-01-15 17:18:23] [config] disp-freq: 500 [2022-01-15 17:18:23] [config] disp-label-counts: true [2022-01-15 17:18:23] [config] dropout-rnn: 0 [2022-01-15 17:18:23] [config] dropout-src: 0 [2022-01-15 17:18:23] [config] dropout-trg: 0 [2022-01-15 17:18:23] [config] dump-config: "" [2022-01-15 17:18:23] [config] early-stopping: 10 [2022-01-15 17:18:23] [config] embedding-fix-src: false [2022-01-15 17:18:23] [config] embedding-fix-trg: false [2022-01-15 17:18:23] [config] embedding-normalization: false [2022-01-15 17:18:23] [config] embedding-vectors: [2022-01-15 
17:18:23] [config] [] [2022-01-15 17:18:23] [config] enc-cell: gru [2022-01-15 17:18:23] [config] enc-cell-depth: 1 [2022-01-15 17:18:23] [config] enc-depth: 6 [2022-01-15 17:18:23] [config] enc-type: bidirectional [2022-01-15 17:18:23] [config] english-title-case-every: 0 [2022-01-15 17:18:23] [config] exponential-smoothing: 0.0001 [2022-01-15 17:18:23] [config] factor-weight: 1 [2022-01-15 17:18:23] [config] grad-dropping-momentum: 0 [2022-01-15 17:18:23] [config] grad-dropping-rate: 0 [2022-01-15 17:18:23] [config] grad-dropping-warmup: 100 [2022-01-15 17:18:23] [config] gradient-checkpointing: false [2022-01-15 17:18:23] [config] guided-alignment: none [2022-01-15 17:18:23] [config] guided-alignment-cost: mse [2022-01-15 17:18:23] [config] guided-alignment-weight: 0.1 [2022-01-15 17:18:23] [config] ignore-model-config: false [2022-01-15 17:18:23] [config] input-types: [2022-01-15 17:18:23] [config] [] [2022-01-15 17:18:23] [config] interpolate-env-vars: false [2022-01-15 17:18:23] [config] keep-best: false [2022-01-15 17:18:23] [config] label-smoothing: 0.1 [2022-01-15 17:18:23] [config] layer-normalization: false [2022-01-15 17:18:23] [config] learn-rate: 0.0003 [2022-01-15 17:18:23] [config] lemma-dim-emb: 0 [2022-01-15 17:18:23] [config] log: /home/wmi/train.log [2022-01-15 17:18:23] [config] log-level: info [2022-01-15 17:18:23] [config] log-time-zone: "" [2022-01-15 17:18:23] [config] logical-epoch: [2022-01-15 17:18:23] [config] - 1e [2022-01-15 17:18:23] [config] - 0 [2022-01-15 17:18:23] [config] lr-decay: 0 [2022-01-15 17:18:23] [config] lr-decay-freq: 50000 [2022-01-15 17:18:23] [config] lr-decay-inv-sqrt: [2022-01-15 17:18:23] [config] - 16000 [2022-01-15 17:18:23] [config] lr-decay-repeat-warmup: false [2022-01-15 17:18:23] [config] lr-decay-reset-optimizer: false [2022-01-15 17:18:23] [config] lr-decay-start: [2022-01-15 17:18:23] [config] - 10 [2022-01-15 17:18:23] [config] - 1 [2022-01-15 17:18:23] [config] lr-decay-strategy: epoch+stalled [2022-01-15 17:18:23] [config] lr-report: true [2022-01-15 17:18:23] [config] lr-warmup: 16000 [2022-01-15 17:18:23] [config] lr-warmup-at-reload: false [2022-01-15 17:18:23] [config] lr-warmup-cycle: false [2022-01-15 17:18:23] [config] lr-warmup-start-rate: 0 [2022-01-15 17:18:23] [config] max-length: 100 [2022-01-15 17:18:23] [config] max-length-crop: false [2022-01-15 17:18:23] [config] max-length-factor: 3 [2022-01-15 17:18:23] [config] maxi-batch: 1000 [2022-01-15 17:18:23] [config] maxi-batch-sort: trg [2022-01-15 17:18:23] [config] mini-batch: 64 [2022-01-15 17:18:23] [config] mini-batch-fit: true [2022-01-15 17:18:23] [config] mini-batch-fit-step: 10 [2022-01-15 17:18:23] [config] mini-batch-track-lr: false [2022-01-15 17:18:23] [config] mini-batch-warmup: 0 [2022-01-15 17:18:23] [config] mini-batch-words: 0 [2022-01-15 17:18:23] [config] mini-batch-words-ref: 0 [2022-01-15 17:18:23] [config] model: model.npz [2022-01-15 17:18:23] [config] multi-loss-type: sum [2022-01-15 17:18:23] [config] multi-node: false [2022-01-15 17:18:23] [config] multi-node-overlap: true [2022-01-15 17:18:23] [config] n-best: false [2022-01-15 17:18:23] [config] no-nccl: false [2022-01-15 17:18:23] [config] no-reload: false [2022-01-15 17:18:23] [config] no-restore-corpus: false [2022-01-15 17:18:23] [config] normalize: 0.6 [2022-01-15 17:18:23] [config] normalize-gradient: false [2022-01-15 17:18:23] [config] num-devices: 0 [2022-01-15 17:18:23] [config] optimizer: adam [2022-01-15 17:18:23] [config] optimizer-delay: 1 [2022-01-15 17:18:23] [config] 
optimizer-params: [2022-01-15 17:18:23] [config] - 0.9 [2022-01-15 17:18:23] [config] - 0.98 [2022-01-15 17:18:23] [config] - 1e-09 [2022-01-15 17:18:23] [config] output-omit-bias: false [2022-01-15 17:18:23] [config] overwrite: true [2022-01-15 17:18:23] [config] precision: [2022-01-15 17:18:23] [config] - float32 [2022-01-15 17:18:23] [config] - float32 [2022-01-15 17:18:23] [config] - float32 [2022-01-15 17:18:23] [config] pretrained-model: "" [2022-01-15 17:18:23] [config] quantize-biases: false [2022-01-15 17:18:23] [config] quantize-bits: 0 [2022-01-15 17:18:23] [config] quantize-log-based: false [2022-01-15 17:18:23] [config] quantize-optimization-steps: 0 [2022-01-15 17:18:23] [config] quiet: false [2022-01-15 17:18:23] [config] quiet-translation: false [2022-01-15 17:18:23] [config] relative-paths: false [2022-01-15 17:18:23] [config] right-left: false [2022-01-15 17:18:23] [config] save-freq: 5000 [2022-01-15 17:18:23] [config] seed: 0 [2022-01-15 17:18:23] [config] sentencepiece-alphas: [2022-01-15 17:18:23] [config] [] [2022-01-15 17:18:23] [config] sentencepiece-max-lines: 2000000 [2022-01-15 17:18:23] [config] sentencepiece-options: "" [2022-01-15 17:18:23] [config] shuffle: data [2022-01-15 17:18:23] [config] shuffle-in-ram: false [2022-01-15 17:18:23] [config] sigterm: save-and-exit [2022-01-15 17:18:23] [config] skip: false [2022-01-15 17:18:23] [config] sqlite: "" [2022-01-15 17:18:23] [config] sqlite-drop: false [2022-01-15 17:18:23] [config] sync-sgd: false [2022-01-15 17:18:23] [config] tempdir: /tmp [2022-01-15 17:18:23] [config] tied-embeddings: true [2022-01-15 17:18:23] [config] tied-embeddings-all: false [2022-01-15 17:18:23] [config] tied-embeddings-src: false [2022-01-15 17:18:23] [config] train-embedder-rank: [2022-01-15 17:18:23] [config] [] [2022-01-15 17:18:23] [config] train-sets: [2022-01-15 17:18:23] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv [2022-01-15 17:18:23] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv [2022-01-15 17:18:23] [config] transformer-aan-activation: swish [2022-01-15 17:18:23] [config] transformer-aan-depth: 2 [2022-01-15 17:18:23] [config] transformer-aan-nogate: false [2022-01-15 17:18:23] [config] transformer-decoder-autoreg: self-attention [2022-01-15 17:18:23] [config] transformer-depth-scaling: false [2022-01-15 17:18:23] [config] transformer-dim-aan: 2048 [2022-01-15 17:18:23] [config] transformer-dim-ffn: 2048 [2022-01-15 17:18:23] [config] transformer-dropout: 0.1 [2022-01-15 17:18:23] [config] transformer-dropout-attention: 0 [2022-01-15 17:18:23] [config] transformer-dropout-ffn: 0 [2022-01-15 17:18:23] [config] transformer-ffn-activation: swish [2022-01-15 17:18:23] [config] transformer-ffn-depth: 2 [2022-01-15 17:18:23] [config] transformer-guided-alignment-layer: last [2022-01-15 17:18:23] [config] transformer-heads: 8 [2022-01-15 17:18:23] [config] transformer-no-projection: false [2022-01-15 17:18:23] [config] transformer-pool: false [2022-01-15 17:18:23] [config] transformer-postprocess: dan [2022-01-15 17:18:23] [config] transformer-postprocess-emb: d [2022-01-15 17:18:23] [config] transformer-postprocess-top: "" [2022-01-15 17:18:23] [config] transformer-preprocess: "" [2022-01-15 17:18:23] [config] transformer-tied-layers: [2022-01-15 17:18:23] [config] [] [2022-01-15 17:18:23] [config] transformer-train-position-embeddings: false [2022-01-15 17:18:23] [config] tsv: false [2022-01-15 17:18:23] [config] tsv-fields: 0 [2022-01-15 17:18:23] [config] 
type: transformer [2022-01-15 17:18:23] [config] ulr: false [2022-01-15 17:18:23] [config] ulr-dim-emb: 0 [2022-01-15 17:18:23] [config] ulr-dropout: 0 [2022-01-15 17:18:23] [config] ulr-keys-vectors: "" [2022-01-15 17:18:23] [config] ulr-query-vectors: "" [2022-01-15 17:18:23] [config] ulr-softmax-temperature: 1 [2022-01-15 17:18:23] [config] ulr-trainable-transformation: false [2022-01-15 17:18:23] [config] unlikelihood-loss: false [2022-01-15 17:18:23] [config] valid-freq: 5000 [2022-01-15 17:18:23] [config] valid-log: "" [2022-01-15 17:18:23] [config] valid-max-length: 1000 [2022-01-15 17:18:23] [config] valid-metrics: [2022-01-15 17:18:23] [config] - cross-entropy [2022-01-15 17:18:23] [config] valid-mini-batch: 32 [2022-01-15 17:18:23] [config] valid-reset-stalled: false [2022-01-15 17:18:23] [config] valid-script-args: [2022-01-15 17:18:23] [config] [] [2022-01-15 17:18:23] [config] valid-script-path: "" [2022-01-15 17:18:23] [config] valid-sets: [2022-01-15 17:18:23] [config] [] [2022-01-15 17:18:23] [config] valid-translation-output: "" [2022-01-15 17:18:23] [config] vocabs: [2022-01-15 17:18:23] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:18:23] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:18:23] [config] word-penalty: 0 [2022-01-15 17:18:23] [config] word-scores: false [2022-01-15 17:18:23] [config] workspace: 10000 [2022-01-15 17:18:23] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:18:23] [training] Using single-device training [2022-01-15 17:18:23] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:18:23] Error: Unhandled exception of type 'N4YAML18TypedBadConversionINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE': yaml-cpp: error at line 1, column 1: bad conversion [2022-01-15 17:18:23] Error: Aborted from void unhandledException() in /home/wmi/Workspace/marian/src/common/logging.cpp:113 [CALL STACK] [0x55bd8ec9a5e6] + 0x29c5e6 [0x7ff56152938c] + 0xaa38c [0x7ff5615293f7] + 0xaa3f7 [0x7ff5615296a9] + 0xaa6a9 [0x55bd8ef26c20] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x1130 [0x55bd8ef15e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x55bd8ef16728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x55bd8ef62189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x55bd8ef75084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x55bd8edd3f8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x55bd8ee5b94b] marian::Train:: run () + 0x19cb [0x55bd8ed62389] mainTrainer (int, char**) + 0x5e9 [0x55bd8ed201bc] main + 0x3c [0x7ff56114a0b3] __libc_start_main + 0xf3 [0x55bd8ed60b0e] _start + 0x2e [2022-01-15 17:26:24] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:26:24] [marian] Running on s470607-gpu as process 3591 with command line: [2022-01-15 17:26:24] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv --max-length 100 --mini-batch-fit -w 
10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 --vocabs /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:26:24] [config] after: 0e [2022-01-15 17:26:24] [config] after-batches: 0 [2022-01-15 17:26:24] [config] after-epochs: 1 [2022-01-15 17:26:24] [config] all-caps-every: 0 [2022-01-15 17:26:24] [config] allow-unk: false [2022-01-15 17:26:24] [config] authors: false [2022-01-15 17:26:24] [config] beam-size: 6 [2022-01-15 17:26:24] [config] bert-class-symbol: "[CLS]" [2022-01-15 17:26:24] [config] bert-mask-symbol: "[MASK]" [2022-01-15 17:26:24] [config] bert-masking-fraction: 0.15 [2022-01-15 17:26:24] [config] bert-sep-symbol: "[SEP]" [2022-01-15 17:26:24] [config] bert-train-type-embeddings: true [2022-01-15 17:26:24] [config] bert-type-vocab-size: 2 [2022-01-15 17:26:24] [config] build-info: "" [2022-01-15 17:26:24] [config] cite: false [2022-01-15 17:26:24] [config] clip-norm: 5 [2022-01-15 17:26:24] [config] cost-scaling: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] cost-type: ce-sum [2022-01-15 17:26:24] [config] cpu-threads: 0 [2022-01-15 17:26:24] [config] data-weighting: "" [2022-01-15 17:26:24] [config] data-weighting-type: sentence [2022-01-15 17:26:24] [config] dec-cell: gru [2022-01-15 17:26:24] [config] dec-cell-base-depth: 2 [2022-01-15 17:26:24] [config] dec-cell-high-depth: 1 [2022-01-15 17:26:24] [config] dec-depth: 6 [2022-01-15 17:26:24] [config] devices: [2022-01-15 17:26:24] [config] - 0 [2022-01-15 17:26:24] [config] dim-emb: 512 [2022-01-15 17:26:24] [config] dim-rnn: 1024 [2022-01-15 17:26:24] [config] dim-vocabs: [2022-01-15 17:26:24] [config] - 0 [2022-01-15 17:26:24] [config] - 0 [2022-01-15 17:26:24] [config] disp-first: 0 [2022-01-15 17:26:24] [config] disp-freq: 500 [2022-01-15 17:26:24] [config] disp-label-counts: true [2022-01-15 17:26:24] [config] dropout-rnn: 0 [2022-01-15 17:26:24] [config] dropout-src: 0 [2022-01-15 17:26:24] [config] dropout-trg: 0 [2022-01-15 17:26:24] [config] dump-config: "" [2022-01-15 17:26:24] [config] early-stopping: 10 [2022-01-15 17:26:24] [config] embedding-fix-src: false [2022-01-15 17:26:24] [config] embedding-fix-trg: false [2022-01-15 17:26:24] [config] embedding-normalization: false [2022-01-15 17:26:24] [config] embedding-vectors: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] enc-cell: gru [2022-01-15 17:26:24] [config] enc-cell-depth: 1 [2022-01-15 17:26:24] [config] enc-depth: 6 [2022-01-15 17:26:24] [config] enc-type: bidirectional [2022-01-15 17:26:24] [config] english-title-case-every: 0 [2022-01-15 17:26:24] [config] exponential-smoothing: 0.0001 [2022-01-15 17:26:24] [config] factor-weight: 1 [2022-01-15 17:26:24] [config] grad-dropping-momentum: 0 [2022-01-15 17:26:24] [config] grad-dropping-rate: 0 [2022-01-15 17:26:24] [config] grad-dropping-warmup: 100 [2022-01-15 17:26:24] [config] gradient-checkpointing: false [2022-01-15 17:26:24] [config] guided-alignment: none [2022-01-15 17:26:24] [config] guided-alignment-cost: 
mse [2022-01-15 17:26:24] [config] guided-alignment-weight: 0.1 [2022-01-15 17:26:24] [config] ignore-model-config: false [2022-01-15 17:26:24] [config] input-types: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] interpolate-env-vars: false [2022-01-15 17:26:24] [config] keep-best: false [2022-01-15 17:26:24] [config] label-smoothing: 0.1 [2022-01-15 17:26:24] [config] layer-normalization: false [2022-01-15 17:26:24] [config] learn-rate: 0.0003 [2022-01-15 17:26:24] [config] lemma-dim-emb: 0 [2022-01-15 17:26:24] [config] log: /home/wmi/train.log [2022-01-15 17:26:24] [config] log-level: info [2022-01-15 17:26:24] [config] log-time-zone: "" [2022-01-15 17:26:24] [config] logical-epoch: [2022-01-15 17:26:24] [config] - 1e [2022-01-15 17:26:24] [config] - 0 [2022-01-15 17:26:24] [config] lr-decay: 0 [2022-01-15 17:26:24] [config] lr-decay-freq: 50000 [2022-01-15 17:26:24] [config] lr-decay-inv-sqrt: [2022-01-15 17:26:24] [config] - 16000 [2022-01-15 17:26:24] [config] lr-decay-repeat-warmup: false [2022-01-15 17:26:24] [config] lr-decay-reset-optimizer: false [2022-01-15 17:26:24] [config] lr-decay-start: [2022-01-15 17:26:24] [config] - 10 [2022-01-15 17:26:24] [config] - 1 [2022-01-15 17:26:24] [config] lr-decay-strategy: epoch+stalled [2022-01-15 17:26:24] [config] lr-report: true [2022-01-15 17:26:24] [config] lr-warmup: 16000 [2022-01-15 17:26:24] [config] lr-warmup-at-reload: false [2022-01-15 17:26:24] [config] lr-warmup-cycle: false [2022-01-15 17:26:24] [config] lr-warmup-start-rate: 0 [2022-01-15 17:26:24] [config] max-length: 100 [2022-01-15 17:26:24] [config] max-length-crop: false [2022-01-15 17:26:24] [config] max-length-factor: 3 [2022-01-15 17:26:24] [config] maxi-batch: 1000 [2022-01-15 17:26:24] [config] maxi-batch-sort: trg [2022-01-15 17:26:24] [config] mini-batch: 64 [2022-01-15 17:26:24] [config] mini-batch-fit: true [2022-01-15 17:26:24] [config] mini-batch-fit-step: 10 [2022-01-15 17:26:24] [config] mini-batch-track-lr: false [2022-01-15 17:26:24] [config] mini-batch-warmup: 0 [2022-01-15 17:26:24] [config] mini-batch-words: 0 [2022-01-15 17:26:24] [config] mini-batch-words-ref: 0 [2022-01-15 17:26:24] [config] model: model.npz [2022-01-15 17:26:24] [config] multi-loss-type: sum [2022-01-15 17:26:24] [config] multi-node: false [2022-01-15 17:26:24] [config] multi-node-overlap: true [2022-01-15 17:26:24] [config] n-best: false [2022-01-15 17:26:24] [config] no-nccl: false [2022-01-15 17:26:24] [config] no-reload: false [2022-01-15 17:26:24] [config] no-restore-corpus: false [2022-01-15 17:26:24] [config] normalize: 0.6 [2022-01-15 17:26:24] [config] normalize-gradient: false [2022-01-15 17:26:24] [config] num-devices: 0 [2022-01-15 17:26:24] [config] optimizer: adam [2022-01-15 17:26:24] [config] optimizer-delay: 1 [2022-01-15 17:26:24] [config] optimizer-params: [2022-01-15 17:26:24] [config] - 0.9 [2022-01-15 17:26:24] [config] - 0.98 [2022-01-15 17:26:24] [config] - 1e-09 [2022-01-15 17:26:24] [config] output-omit-bias: false [2022-01-15 17:26:24] [config] overwrite: true [2022-01-15 17:26:24] [config] precision: [2022-01-15 17:26:24] [config] - float32 [2022-01-15 17:26:24] [config] - float32 [2022-01-15 17:26:24] [config] - float32 [2022-01-15 17:26:24] [config] pretrained-model: "" [2022-01-15 17:26:24] [config] quantize-biases: false [2022-01-15 17:26:24] [config] quantize-bits: 0 [2022-01-15 17:26:24] [config] quantize-log-based: false [2022-01-15 17:26:24] [config] quantize-optimization-steps: 0 [2022-01-15 17:26:24] [config] quiet: false 
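Aside (illustrative, not part of the original log): the learning rates reported in the training entries above are consistent with Marian's linear warmup, i.e. learn-rate * step / lr-warmup with --learn-rate 0.0003 and --lr-warmup 16000; under that reading, 2.8125e-05 corresponds to update 1500 and 3.7500e-05 to update 2000. A minimal Python check of that assumption:

```python
# Sketch only: reproduce the "L.r." values reported during warmup,
# assuming the linear schedule base_lr * step / warmup_steps.
base_lr = 3e-4        # --learn-rate 0.0003
warmup = 16000        # --lr-warmup 16000

for step in (1500, 2000):
    print(f"update {step}: lr = {base_lr * step / warmup:.4e}")
# update 1500: lr = 2.8125e-05
# update 2000: lr = 3.7500e-05
```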
[2022-01-15 17:26:24] [config] quiet-translation: false [2022-01-15 17:26:24] [config] relative-paths: false [2022-01-15 17:26:24] [config] right-left: false [2022-01-15 17:26:24] [config] save-freq: 5000 [2022-01-15 17:26:24] [config] seed: 0 [2022-01-15 17:26:24] [config] sentencepiece-alphas: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] sentencepiece-max-lines: 2000000 [2022-01-15 17:26:24] [config] sentencepiece-options: "" [2022-01-15 17:26:24] [config] shuffle: data [2022-01-15 17:26:24] [config] shuffle-in-ram: false [2022-01-15 17:26:24] [config] sigterm: save-and-exit [2022-01-15 17:26:24] [config] skip: false [2022-01-15 17:26:24] [config] sqlite: "" [2022-01-15 17:26:24] [config] sqlite-drop: false [2022-01-15 17:26:24] [config] sync-sgd: false [2022-01-15 17:26:24] [config] tempdir: /tmp [2022-01-15 17:26:24] [config] tied-embeddings: true [2022-01-15 17:26:24] [config] tied-embeddings-all: false [2022-01-15 17:26:24] [config] tied-embeddings-src: false [2022-01-15 17:26:24] [config] train-embedder-rank: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] train-sets: [2022-01-15 17:26:24] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv [2022-01-15 17:26:24] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv [2022-01-15 17:26:24] [config] transformer-aan-activation: swish [2022-01-15 17:26:24] [config] transformer-aan-depth: 2 [2022-01-15 17:26:24] [config] transformer-aan-nogate: false [2022-01-15 17:26:24] [config] transformer-decoder-autoreg: self-attention [2022-01-15 17:26:24] [config] transformer-depth-scaling: false [2022-01-15 17:26:24] [config] transformer-dim-aan: 2048 [2022-01-15 17:26:24] [config] transformer-dim-ffn: 2048 [2022-01-15 17:26:24] [config] transformer-dropout: 0.1 [2022-01-15 17:26:24] [config] transformer-dropout-attention: 0 [2022-01-15 17:26:24] [config] transformer-dropout-ffn: 0 [2022-01-15 17:26:24] [config] transformer-ffn-activation: swish [2022-01-15 17:26:24] [config] transformer-ffn-depth: 2 [2022-01-15 17:26:24] [config] transformer-guided-alignment-layer: last [2022-01-15 17:26:24] [config] transformer-heads: 8 [2022-01-15 17:26:24] [config] transformer-no-projection: false [2022-01-15 17:26:24] [config] transformer-pool: false [2022-01-15 17:26:24] [config] transformer-postprocess: dan [2022-01-15 17:26:24] [config] transformer-postprocess-emb: d [2022-01-15 17:26:24] [config] transformer-postprocess-top: "" [2022-01-15 17:26:24] [config] transformer-preprocess: "" [2022-01-15 17:26:24] [config] transformer-tied-layers: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] transformer-train-position-embeddings: false [2022-01-15 17:26:24] [config] tsv: false [2022-01-15 17:26:24] [config] tsv-fields: 0 [2022-01-15 17:26:24] [config] type: transformer [2022-01-15 17:26:24] [config] ulr: false [2022-01-15 17:26:24] [config] ulr-dim-emb: 0 [2022-01-15 17:26:24] [config] ulr-dropout: 0 [2022-01-15 17:26:24] [config] ulr-keys-vectors: "" [2022-01-15 17:26:24] [config] ulr-query-vectors: "" [2022-01-15 17:26:24] [config] ulr-softmax-temperature: 1 [2022-01-15 17:26:24] [config] ulr-trainable-transformation: false [2022-01-15 17:26:24] [config] unlikelihood-loss: false [2022-01-15 17:26:24] [config] valid-freq: 5000 [2022-01-15 17:26:24] [config] valid-log: "" [2022-01-15 17:26:24] [config] valid-max-length: 1000 [2022-01-15 17:26:24] [config] valid-metrics: [2022-01-15 17:26:24] [config] - cross-entropy [2022-01-15 17:26:24] [config] 
valid-mini-batch: 32 [2022-01-15 17:26:24] [config] valid-reset-stalled: false [2022-01-15 17:26:24] [config] valid-script-args: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] valid-script-path: "" [2022-01-15 17:26:24] [config] valid-sets: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] valid-translation-output: "" [2022-01-15 17:26:24] [config] vocabs: [2022-01-15 17:26:24] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:26:24] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:26:24] [config] word-penalty: 0 [2022-01-15 17:26:24] [config] word-scores: false [2022-01-15 17:26:24] [config] workspace: 10000 [2022-01-15 17:26:24] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:26:24] [training] Using single-device training [2022-01-15 17:26:24] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:26:24] Error: Unhandled exception of type 'N4YAML18TypedBadConversionINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE': yaml-cpp: error at line 1, column 1: bad conversion [2022-01-15 17:26:24] Error: Aborted from void unhandledException() in /home/wmi/Workspace/marian/src/common/logging.cpp:113 [CALL STACK] [0x55e28e7915e6] + 0x29c5e6 [0x7f06c3a4d38c] + 0xaa38c [0x7f06c3a4d3f7] + 0xaa3f7 [0x7f06c3a4d6a9] + 0xaa6a9 [0x55e28ea1dc20] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x1130 [0x55e28ea0ce2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x55e28ea0d728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x55e28ea59189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x55e28ea6c084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x55e28e8caf8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x55e28e95294b] marian::Train:: run () + 0x19cb [0x55e28e859389] mainTrainer (int, char**) + 0x5e9 [0x55e28e8171bc] main + 0x3c [0x7f06c366e0b3] __libc_start_main + 0xf3 [0x55e28e857b0e] _start + 0x2e [2022-01-15 17:35:32] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:35:32] [marian] Running on s470607-gpu as process 3646 with command line: [2022-01-15 17:35:32] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 --vocabs /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:35:32] [config] after: 0e [2022-01-15 
17:35:32] [config] after-batches: 0 [2022-01-15 17:35:32] [config] after-epochs: 1 [2022-01-15 17:35:32] [config] all-caps-every: 0 [2022-01-15 17:35:32] [config] allow-unk: false [2022-01-15 17:35:32] [config] authors: false [2022-01-15 17:35:32] [config] beam-size: 6 [2022-01-15 17:35:32] [config] bert-class-symbol: "[CLS]" [2022-01-15 17:35:32] [config] bert-mask-symbol: "[MASK]" [2022-01-15 17:35:32] [config] bert-masking-fraction: 0.15 [2022-01-15 17:35:32] [config] bert-sep-symbol: "[SEP]" [2022-01-15 17:35:32] [config] bert-train-type-embeddings: true [2022-01-15 17:35:32] [config] bert-type-vocab-size: 2 [2022-01-15 17:35:32] [config] build-info: "" [2022-01-15 17:35:32] [config] cite: false [2022-01-15 17:35:32] [config] clip-norm: 5 [2022-01-15 17:35:32] [config] cost-scaling: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] cost-type: ce-sum [2022-01-15 17:35:32] [config] cpu-threads: 0 [2022-01-15 17:35:32] [config] data-weighting: "" [2022-01-15 17:35:32] [config] data-weighting-type: sentence [2022-01-15 17:35:32] [config] dec-cell: gru [2022-01-15 17:35:32] [config] dec-cell-base-depth: 2 [2022-01-15 17:35:32] [config] dec-cell-high-depth: 1 [2022-01-15 17:35:32] [config] dec-depth: 6 [2022-01-15 17:35:32] [config] devices: [2022-01-15 17:35:32] [config] - 0 [2022-01-15 17:35:32] [config] dim-emb: 512 [2022-01-15 17:35:32] [config] dim-rnn: 1024 [2022-01-15 17:35:32] [config] dim-vocabs: [2022-01-15 17:35:32] [config] - 0 [2022-01-15 17:35:32] [config] - 0 [2022-01-15 17:35:32] [config] disp-first: 0 [2022-01-15 17:35:32] [config] disp-freq: 500 [2022-01-15 17:35:32] [config] disp-label-counts: true [2022-01-15 17:35:32] [config] dropout-rnn: 0 [2022-01-15 17:35:32] [config] dropout-src: 0 [2022-01-15 17:35:32] [config] dropout-trg: 0 [2022-01-15 17:35:32] [config] dump-config: "" [2022-01-15 17:35:32] [config] early-stopping: 10 [2022-01-15 17:35:32] [config] embedding-fix-src: false [2022-01-15 17:35:32] [config] embedding-fix-trg: false [2022-01-15 17:35:32] [config] embedding-normalization: false [2022-01-15 17:35:32] [config] embedding-vectors: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] enc-cell: gru [2022-01-15 17:35:32] [config] enc-cell-depth: 1 [2022-01-15 17:35:32] [config] enc-depth: 6 [2022-01-15 17:35:32] [config] enc-type: bidirectional [2022-01-15 17:35:32] [config] english-title-case-every: 0 [2022-01-15 17:35:32] [config] exponential-smoothing: 0.0001 [2022-01-15 17:35:32] [config] factor-weight: 1 [2022-01-15 17:35:32] [config] grad-dropping-momentum: 0 [2022-01-15 17:35:32] [config] grad-dropping-rate: 0 [2022-01-15 17:35:32] [config] grad-dropping-warmup: 100 [2022-01-15 17:35:32] [config] gradient-checkpointing: false [2022-01-15 17:35:32] [config] guided-alignment: none [2022-01-15 17:35:32] [config] guided-alignment-cost: mse [2022-01-15 17:35:32] [config] guided-alignment-weight: 0.1 [2022-01-15 17:35:32] [config] ignore-model-config: false [2022-01-15 17:35:32] [config] input-types: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] interpolate-env-vars: false [2022-01-15 17:35:32] [config] keep-best: false [2022-01-15 17:35:32] [config] label-smoothing: 0.1 [2022-01-15 17:35:32] [config] layer-normalization: false [2022-01-15 17:35:32] [config] learn-rate: 0.0003 [2022-01-15 17:35:32] [config] lemma-dim-emb: 0 [2022-01-15 17:35:32] [config] log: /home/wmi/train.log [2022-01-15 17:35:32] [config] log-level: info [2022-01-15 17:35:32] [config] log-time-zone: "" [2022-01-15 17:35:32] [config] 
logical-epoch: [2022-01-15 17:35:32] [config] - 1e [2022-01-15 17:35:32] [config] - 0 [2022-01-15 17:35:32] [config] lr-decay: 0 [2022-01-15 17:35:32] [config] lr-decay-freq: 50000 [2022-01-15 17:35:32] [config] lr-decay-inv-sqrt: [2022-01-15 17:35:32] [config] - 16000 [2022-01-15 17:35:32] [config] lr-decay-repeat-warmup: false [2022-01-15 17:35:32] [config] lr-decay-reset-optimizer: false [2022-01-15 17:35:32] [config] lr-decay-start: [2022-01-15 17:35:32] [config] - 10 [2022-01-15 17:35:32] [config] - 1 [2022-01-15 17:35:32] [config] lr-decay-strategy: epoch+stalled [2022-01-15 17:35:32] [config] lr-report: true [2022-01-15 17:35:32] [config] lr-warmup: 16000 [2022-01-15 17:35:32] [config] lr-warmup-at-reload: false [2022-01-15 17:35:32] [config] lr-warmup-cycle: false [2022-01-15 17:35:32] [config] lr-warmup-start-rate: 0 [2022-01-15 17:35:32] [config] max-length: 100 [2022-01-15 17:35:32] [config] max-length-crop: false [2022-01-15 17:35:32] [config] max-length-factor: 3 [2022-01-15 17:35:32] [config] maxi-batch: 1000 [2022-01-15 17:35:32] [config] maxi-batch-sort: trg [2022-01-15 17:35:32] [config] mini-batch: 64 [2022-01-15 17:35:32] [config] mini-batch-fit: true [2022-01-15 17:35:32] [config] mini-batch-fit-step: 10 [2022-01-15 17:35:32] [config] mini-batch-track-lr: false [2022-01-15 17:35:32] [config] mini-batch-warmup: 0 [2022-01-15 17:35:32] [config] mini-batch-words: 0 [2022-01-15 17:35:32] [config] mini-batch-words-ref: 0 [2022-01-15 17:35:32] [config] model: model.npz [2022-01-15 17:35:32] [config] multi-loss-type: sum [2022-01-15 17:35:32] [config] multi-node: false [2022-01-15 17:35:32] [config] multi-node-overlap: true [2022-01-15 17:35:32] [config] n-best: false [2022-01-15 17:35:32] [config] no-nccl: false [2022-01-15 17:35:32] [config] no-reload: false [2022-01-15 17:35:32] [config] no-restore-corpus: false [2022-01-15 17:35:32] [config] normalize: 0.6 [2022-01-15 17:35:32] [config] normalize-gradient: false [2022-01-15 17:35:32] [config] num-devices: 0 [2022-01-15 17:35:32] [config] optimizer: adam [2022-01-15 17:35:32] [config] optimizer-delay: 1 [2022-01-15 17:35:32] [config] optimizer-params: [2022-01-15 17:35:32] [config] - 0.9 [2022-01-15 17:35:32] [config] - 0.98 [2022-01-15 17:35:32] [config] - 1e-09 [2022-01-15 17:35:32] [config] output-omit-bias: false [2022-01-15 17:35:32] [config] overwrite: true [2022-01-15 17:35:32] [config] precision: [2022-01-15 17:35:32] [config] - float32 [2022-01-15 17:35:32] [config] - float32 [2022-01-15 17:35:32] [config] - float32 [2022-01-15 17:35:32] [config] pretrained-model: "" [2022-01-15 17:35:32] [config] quantize-biases: false [2022-01-15 17:35:32] [config] quantize-bits: 0 [2022-01-15 17:35:32] [config] quantize-log-based: false [2022-01-15 17:35:32] [config] quantize-optimization-steps: 0 [2022-01-15 17:35:32] [config] quiet: false [2022-01-15 17:35:32] [config] quiet-translation: false [2022-01-15 17:35:32] [config] relative-paths: false [2022-01-15 17:35:32] [config] right-left: false [2022-01-15 17:35:32] [config] save-freq: 5000 [2022-01-15 17:35:32] [config] seed: 0 [2022-01-15 17:35:32] [config] sentencepiece-alphas: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] sentencepiece-max-lines: 2000000 [2022-01-15 17:35:32] [config] sentencepiece-options: "" [2022-01-15 17:35:32] [config] shuffle: data [2022-01-15 17:35:32] [config] shuffle-in-ram: false [2022-01-15 17:35:32] [config] sigterm: save-and-exit [2022-01-15 17:35:32] [config] skip: false [2022-01-15 17:35:32] [config] sqlite: "" [2022-01-15 
17:35:32] [config] sqlite-drop: false [2022-01-15 17:35:32] [config] sync-sgd: false [2022-01-15 17:35:32] [config] tempdir: /tmp [2022-01-15 17:35:32] [config] tied-embeddings: true [2022-01-15 17:35:32] [config] tied-embeddings-all: false [2022-01-15 17:35:32] [config] tied-embeddings-src: false [2022-01-15 17:35:32] [config] train-embedder-rank: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] train-sets: [2022-01-15 17:35:32] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv [2022-01-15 17:35:32] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv [2022-01-15 17:35:32] [config] transformer-aan-activation: swish [2022-01-15 17:35:32] [config] transformer-aan-depth: 2 [2022-01-15 17:35:32] [config] transformer-aan-nogate: false [2022-01-15 17:35:32] [config] transformer-decoder-autoreg: self-attention [2022-01-15 17:35:32] [config] transformer-depth-scaling: false [2022-01-15 17:35:32] [config] transformer-dim-aan: 2048 [2022-01-15 17:35:32] [config] transformer-dim-ffn: 2048 [2022-01-15 17:35:32] [config] transformer-dropout: 0.1 [2022-01-15 17:35:32] [config] transformer-dropout-attention: 0 [2022-01-15 17:35:32] [config] transformer-dropout-ffn: 0 [2022-01-15 17:35:32] [config] transformer-ffn-activation: swish [2022-01-15 17:35:32] [config] transformer-ffn-depth: 2 [2022-01-15 17:35:32] [config] transformer-guided-alignment-layer: last [2022-01-15 17:35:32] [config] transformer-heads: 8 [2022-01-15 17:35:32] [config] transformer-no-projection: false [2022-01-15 17:35:32] [config] transformer-pool: false [2022-01-15 17:35:32] [config] transformer-postprocess: dan [2022-01-15 17:35:32] [config] transformer-postprocess-emb: d [2022-01-15 17:35:32] [config] transformer-postprocess-top: "" [2022-01-15 17:35:32] [config] transformer-preprocess: "" [2022-01-15 17:35:32] [config] transformer-tied-layers: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] transformer-train-position-embeddings: false [2022-01-15 17:35:32] [config] tsv: false [2022-01-15 17:35:32] [config] tsv-fields: 0 [2022-01-15 17:35:32] [config] type: transformer [2022-01-15 17:35:32] [config] ulr: false [2022-01-15 17:35:32] [config] ulr-dim-emb: 0 [2022-01-15 17:35:32] [config] ulr-dropout: 0 [2022-01-15 17:35:32] [config] ulr-keys-vectors: "" [2022-01-15 17:35:32] [config] ulr-query-vectors: "" [2022-01-15 17:35:32] [config] ulr-softmax-temperature: 1 [2022-01-15 17:35:32] [config] ulr-trainable-transformation: false [2022-01-15 17:35:32] [config] unlikelihood-loss: false [2022-01-15 17:35:32] [config] valid-freq: 5000 [2022-01-15 17:35:32] [config] valid-log: "" [2022-01-15 17:35:32] [config] valid-max-length: 1000 [2022-01-15 17:35:32] [config] valid-metrics: [2022-01-15 17:35:32] [config] - cross-entropy [2022-01-15 17:35:32] [config] valid-mini-batch: 32 [2022-01-15 17:35:32] [config] valid-reset-stalled: false [2022-01-15 17:35:32] [config] valid-script-args: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] valid-script-path: "" [2022-01-15 17:35:32] [config] valid-sets: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] valid-translation-output: "" [2022-01-15 17:35:32] [config] vocabs: [2022-01-15 17:35:32] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:35:32] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:35:32] [config] word-penalty: 0 [2022-01-15 17:35:32] [config] 
word-scores: false [2022-01-15 17:35:32] [config] workspace: 10000 [2022-01-15 17:35:32] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:35:32] [training] Using single-device training [2022-01-15 17:35:32] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:35:32] Error: Unhandled exception of type 'N4YAML18TypedBadConversionINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE': yaml-cpp: error at line 1, column 1: bad conversion [2022-01-15 17:35:32] Error: Aborted from void unhandledException() in /home/wmi/Workspace/marian/src/common/logging.cpp:113 [CALL STACK] [0x561c3cd7b5e6] + 0x29c5e6 [0x7f4b0819138c] + 0xaa38c [0x7f4b081913f7] + 0xaa3f7 [0x7f4b081916a9] + 0xaa6a9 [0x561c3d007c20] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x1130 [0x561c3cff6e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x561c3cff7728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x561c3d043189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x561c3d056084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x561c3ceb4f8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x561c3cf3c94b] marian::Train:: run () + 0x19cb [0x561c3ce43389] mainTrainer (int, char**) + 0x5e9 [0x561c3ce011bc] main + 0x3c [0x7f4b07db20b3] __libc_start_main + 0xf3 [0x561c3ce41b0e] _start + 0x2e [2022-01-15 17:40:11] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:40:11] [marian] Running on s470607-gpu as process 3743 with command line: [2022-01-15 17:40:11] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 --vocabs /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:40:11] [config] after: 0e [2022-01-15 17:40:11] [config] after-batches: 0 [2022-01-15 17:40:11] [config] after-epochs: 1 [2022-01-15 17:40:11] [config] all-caps-every: 0 [2022-01-15 17:40:11] [config] allow-unk: false [2022-01-15 17:40:11] [config] authors: false [2022-01-15 17:40:11] [config] beam-size: 6 [2022-01-15 17:40:11] [config] bert-class-symbol: "[CLS]" [2022-01-15 17:40:11] [config] bert-mask-symbol: "[MASK]" [2022-01-15 17:40:11] [config] bert-masking-fraction: 0.15 [2022-01-15 17:40:11] [config] bert-sep-symbol: "[SEP]" [2022-01-15 17:40:11] [config] bert-train-type-embeddings: true [2022-01-15 17:40:11] [config] bert-type-vocab-size: 2 [2022-01-15 17:40:11] [config] build-info: "" [2022-01-15 17:40:11] [config] cite: false 
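Aside (illustrative, not part of the original log): the repeated "yaml-cpp: error at line 1, column 1: bad conversion" abort above is thrown while DefaultVocab loads in.tsv.vocab.10000.yml, and it typically means the file's top-level node is not the flat token-to-integer-id mapping Marian expects. A rough way to inspect the file outside Marian, assuming PyYAML is available (this only approximates yaml-cpp's stricter conversion checks):

```python
# Sketch only: parse the vocabulary file and confirm it is a flat
# {token: integer id} mapping, which is what Marian's DefaultVocab expects.
import yaml  # PyYAML; an approximation of what yaml-cpp accepts

path = "/home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.vocab.10000.yml"

with open(path, encoding="utf-8") as f:
    data = yaml.safe_load(f)

if not isinstance(data, dict):
    raise SystemExit(f"top-level node is {type(data).__name__}, expected a mapping")

non_int = [k for k, v in data.items() if not isinstance(v, int)]
print(f"{len(data)} entries, {len(non_int)} with non-integer ids")
```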
[2022-01-15 17:40:11] [config] clip-norm: 5 [2022-01-15 17:40:11] [config] cost-scaling: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] cost-type: ce-sum [2022-01-15 17:40:11] [config] cpu-threads: 0 [2022-01-15 17:40:11] [config] data-weighting: "" [2022-01-15 17:40:11] [config] data-weighting-type: sentence [2022-01-15 17:40:11] [config] dec-cell: gru [2022-01-15 17:40:11] [config] dec-cell-base-depth: 2 [2022-01-15 17:40:11] [config] dec-cell-high-depth: 1 [2022-01-15 17:40:11] [config] dec-depth: 6 [2022-01-15 17:40:11] [config] devices: [2022-01-15 17:40:11] [config] - 0 [2022-01-15 17:40:11] [config] dim-emb: 512 [2022-01-15 17:40:11] [config] dim-rnn: 1024 [2022-01-15 17:40:11] [config] dim-vocabs: [2022-01-15 17:40:11] [config] - 0 [2022-01-15 17:40:11] [config] - 0 [2022-01-15 17:40:11] [config] disp-first: 0 [2022-01-15 17:40:11] [config] disp-freq: 500 [2022-01-15 17:40:11] [config] disp-label-counts: true [2022-01-15 17:40:11] [config] dropout-rnn: 0 [2022-01-15 17:40:11] [config] dropout-src: 0 [2022-01-15 17:40:11] [config] dropout-trg: 0 [2022-01-15 17:40:11] [config] dump-config: "" [2022-01-15 17:40:11] [config] early-stopping: 10 [2022-01-15 17:40:11] [config] embedding-fix-src: false [2022-01-15 17:40:11] [config] embedding-fix-trg: false [2022-01-15 17:40:11] [config] embedding-normalization: false [2022-01-15 17:40:11] [config] embedding-vectors: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] enc-cell: gru [2022-01-15 17:40:11] [config] enc-cell-depth: 1 [2022-01-15 17:40:11] [config] enc-depth: 6 [2022-01-15 17:40:11] [config] enc-type: bidirectional [2022-01-15 17:40:11] [config] english-title-case-every: 0 [2022-01-15 17:40:11] [config] exponential-smoothing: 0.0001 [2022-01-15 17:40:11] [config] factor-weight: 1 [2022-01-15 17:40:11] [config] grad-dropping-momentum: 0 [2022-01-15 17:40:11] [config] grad-dropping-rate: 0 [2022-01-15 17:40:11] [config] grad-dropping-warmup: 100 [2022-01-15 17:40:11] [config] gradient-checkpointing: false [2022-01-15 17:40:11] [config] guided-alignment: none [2022-01-15 17:40:11] [config] guided-alignment-cost: mse [2022-01-15 17:40:11] [config] guided-alignment-weight: 0.1 [2022-01-15 17:40:11] [config] ignore-model-config: false [2022-01-15 17:40:11] [config] input-types: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] interpolate-env-vars: false [2022-01-15 17:40:11] [config] keep-best: false [2022-01-15 17:40:11] [config] label-smoothing: 0.1 [2022-01-15 17:40:11] [config] layer-normalization: false [2022-01-15 17:40:11] [config] learn-rate: 0.0003 [2022-01-15 17:40:11] [config] lemma-dim-emb: 0 [2022-01-15 17:40:11] [config] log: /home/wmi/train.log [2022-01-15 17:40:11] [config] log-level: info [2022-01-15 17:40:11] [config] log-time-zone: "" [2022-01-15 17:40:11] [config] logical-epoch: [2022-01-15 17:40:11] [config] - 1e [2022-01-15 17:40:11] [config] - 0 [2022-01-15 17:40:11] [config] lr-decay: 0 [2022-01-15 17:40:11] [config] lr-decay-freq: 50000 [2022-01-15 17:40:11] [config] lr-decay-inv-sqrt: [2022-01-15 17:40:11] [config] - 16000 [2022-01-15 17:40:11] [config] lr-decay-repeat-warmup: false [2022-01-15 17:40:11] [config] lr-decay-reset-optimizer: false [2022-01-15 17:40:11] [config] lr-decay-start: [2022-01-15 17:40:11] [config] - 10 [2022-01-15 17:40:11] [config] - 1 [2022-01-15 17:40:11] [config] lr-decay-strategy: epoch+stalled [2022-01-15 17:40:11] [config] lr-report: true [2022-01-15 17:40:11] [config] lr-warmup: 16000 [2022-01-15 17:40:11] [config] 
lr-warmup-at-reload: false [2022-01-15 17:40:11] [config] lr-warmup-cycle: false [2022-01-15 17:40:11] [config] lr-warmup-start-rate: 0 [2022-01-15 17:40:11] [config] max-length: 100 [2022-01-15 17:40:11] [config] max-length-crop: false [2022-01-15 17:40:11] [config] max-length-factor: 3 [2022-01-15 17:40:11] [config] maxi-batch: 1000 [2022-01-15 17:40:11] [config] maxi-batch-sort: trg [2022-01-15 17:40:11] [config] mini-batch: 64 [2022-01-15 17:40:11] [config] mini-batch-fit: true [2022-01-15 17:40:11] [config] mini-batch-fit-step: 10 [2022-01-15 17:40:11] [config] mini-batch-track-lr: false [2022-01-15 17:40:11] [config] mini-batch-warmup: 0 [2022-01-15 17:40:11] [config] mini-batch-words: 0 [2022-01-15 17:40:11] [config] mini-batch-words-ref: 0 [2022-01-15 17:40:11] [config] model: model.npz [2022-01-15 17:40:11] [config] multi-loss-type: sum [2022-01-15 17:40:11] [config] multi-node: false [2022-01-15 17:40:11] [config] multi-node-overlap: true [2022-01-15 17:40:11] [config] n-best: false [2022-01-15 17:40:11] [config] no-nccl: false [2022-01-15 17:40:11] [config] no-reload: false [2022-01-15 17:40:11] [config] no-restore-corpus: false [2022-01-15 17:40:11] [config] normalize: 0.6 [2022-01-15 17:40:11] [config] normalize-gradient: false [2022-01-15 17:40:11] [config] num-devices: 0 [2022-01-15 17:40:11] [config] optimizer: adam [2022-01-15 17:40:11] [config] optimizer-delay: 1 [2022-01-15 17:40:11] [config] optimizer-params: [2022-01-15 17:40:11] [config] - 0.9 [2022-01-15 17:40:11] [config] - 0.98 [2022-01-15 17:40:11] [config] - 1e-09 [2022-01-15 17:40:11] [config] output-omit-bias: false [2022-01-15 17:40:11] [config] overwrite: true [2022-01-15 17:40:11] [config] precision: [2022-01-15 17:40:11] [config] - float32 [2022-01-15 17:40:11] [config] - float32 [2022-01-15 17:40:11] [config] - float32 [2022-01-15 17:40:11] [config] pretrained-model: "" [2022-01-15 17:40:11] [config] quantize-biases: false [2022-01-15 17:40:11] [config] quantize-bits: 0 [2022-01-15 17:40:11] [config] quantize-log-based: false [2022-01-15 17:40:11] [config] quantize-optimization-steps: 0 [2022-01-15 17:40:11] [config] quiet: false [2022-01-15 17:40:11] [config] quiet-translation: false [2022-01-15 17:40:11] [config] relative-paths: false [2022-01-15 17:40:11] [config] right-left: false [2022-01-15 17:40:11] [config] save-freq: 5000 [2022-01-15 17:40:11] [config] seed: 0 [2022-01-15 17:40:11] [config] sentencepiece-alphas: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] sentencepiece-max-lines: 2000000 [2022-01-15 17:40:11] [config] sentencepiece-options: "" [2022-01-15 17:40:11] [config] shuffle: data [2022-01-15 17:40:11] [config] shuffle-in-ram: false [2022-01-15 17:40:11] [config] sigterm: save-and-exit [2022-01-15 17:40:11] [config] skip: false [2022-01-15 17:40:11] [config] sqlite: "" [2022-01-15 17:40:11] [config] sqlite-drop: false [2022-01-15 17:40:11] [config] sync-sgd: false [2022-01-15 17:40:11] [config] tempdir: /tmp [2022-01-15 17:40:11] [config] tied-embeddings: true [2022-01-15 17:40:11] [config] tied-embeddings-all: false [2022-01-15 17:40:11] [config] tied-embeddings-src: false [2022-01-15 17:40:11] [config] train-embedder-rank: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] train-sets: [2022-01-15 17:40:11] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv [2022-01-15 17:40:11] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv [2022-01-15 17:40:11] [config] transformer-aan-activation: swish 
[2022-01-15 17:40:11] [config] transformer-aan-depth: 2 [2022-01-15 17:40:11] [config] transformer-aan-nogate: false [2022-01-15 17:40:11] [config] transformer-decoder-autoreg: self-attention [2022-01-15 17:40:11] [config] transformer-depth-scaling: false [2022-01-15 17:40:11] [config] transformer-dim-aan: 2048 [2022-01-15 17:40:11] [config] transformer-dim-ffn: 2048 [2022-01-15 17:40:11] [config] transformer-dropout: 0.1 [2022-01-15 17:40:11] [config] transformer-dropout-attention: 0 [2022-01-15 17:40:11] [config] transformer-dropout-ffn: 0 [2022-01-15 17:40:11] [config] transformer-ffn-activation: swish [2022-01-15 17:40:11] [config] transformer-ffn-depth: 2 [2022-01-15 17:40:11] [config] transformer-guided-alignment-layer: last [2022-01-15 17:40:11] [config] transformer-heads: 8 [2022-01-15 17:40:11] [config] transformer-no-projection: false [2022-01-15 17:40:11] [config] transformer-pool: false [2022-01-15 17:40:11] [config] transformer-postprocess: dan [2022-01-15 17:40:11] [config] transformer-postprocess-emb: d [2022-01-15 17:40:11] [config] transformer-postprocess-top: "" [2022-01-15 17:40:11] [config] transformer-preprocess: "" [2022-01-15 17:40:11] [config] transformer-tied-layers: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] transformer-train-position-embeddings: false [2022-01-15 17:40:11] [config] tsv: false [2022-01-15 17:40:11] [config] tsv-fields: 0 [2022-01-15 17:40:11] [config] type: transformer [2022-01-15 17:40:11] [config] ulr: false [2022-01-15 17:40:11] [config] ulr-dim-emb: 0 [2022-01-15 17:40:11] [config] ulr-dropout: 0 [2022-01-15 17:40:11] [config] ulr-keys-vectors: "" [2022-01-15 17:40:11] [config] ulr-query-vectors: "" [2022-01-15 17:40:11] [config] ulr-softmax-temperature: 1 [2022-01-15 17:40:11] [config] ulr-trainable-transformation: false [2022-01-15 17:40:11] [config] unlikelihood-loss: false [2022-01-15 17:40:11] [config] valid-freq: 5000 [2022-01-15 17:40:11] [config] valid-log: "" [2022-01-15 17:40:11] [config] valid-max-length: 1000 [2022-01-15 17:40:11] [config] valid-metrics: [2022-01-15 17:40:11] [config] - cross-entropy [2022-01-15 17:40:11] [config] valid-mini-batch: 32 [2022-01-15 17:40:11] [config] valid-reset-stalled: false [2022-01-15 17:40:11] [config] valid-script-args: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] valid-script-path: "" [2022-01-15 17:40:11] [config] valid-sets: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] valid-translation-output: "" [2022-01-15 17:40:11] [config] vocabs: [2022-01-15 17:40:11] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:40:11] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:40:11] [config] word-penalty: 0 [2022-01-15 17:40:11] [config] word-scores: false [2022-01-15 17:40:11] [config] workspace: 10000 [2022-01-15 17:40:11] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:40:11] [training] Using single-device training [2022-01-15 17:40:11] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:40:11] Error: Unhandled exception of type 'N4YAML18TypedBadConversionINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE': yaml-cpp: error at line 1, column 1: bad conversion [2022-01-15 17:40:11] Error: Aborted from void unhandledException() in 
/home/wmi/Workspace/marian/src/common/logging.cpp:113 [CALL STACK] [0x564653fb55e6] + 0x29c5e6 [0x7f476d23938c] + 0xaa38c [0x7f476d2393f7] + 0xaa3f7 [0x7f476d2396a9] + 0xaa6a9 [0x564654241c20] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x1130 [0x564654230e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x564654231728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x56465427d189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x564654290084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x5646540eef8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x56465417694b] marian::Train:: run () + 0x19cb [0x56465407d389] mainTrainer (int, char**) + 0x5e9 [0x56465403b1bc] main + 0x3c [0x7f476ce5a0b3] __libc_start_main + 0xf3 [0x56465407bb0e] _start + 0x2e [2022-01-15 17:45:28] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:45:28] [marian] Running on s470607-gpu as process 3792 with command line: [2022-01-15 17:45:28] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 --vocabs /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:45:28] [config] after: 0e [2022-01-15 17:45:28] [config] after-batches: 0 [2022-01-15 17:45:28] [config] after-epochs: 1 [2022-01-15 17:45:28] [config] all-caps-every: 0 [2022-01-15 17:45:28] [config] allow-unk: false [2022-01-15 17:45:28] [config] authors: false [2022-01-15 17:45:28] [config] beam-size: 6 [2022-01-15 17:45:28] [config] bert-class-symbol: "[CLS]" [2022-01-15 17:45:28] [config] bert-mask-symbol: "[MASK]" [2022-01-15 17:45:28] [config] bert-masking-fraction: 0.15 [2022-01-15 17:45:28] [config] bert-sep-symbol: "[SEP]" [2022-01-15 17:45:28] [config] bert-train-type-embeddings: true [2022-01-15 17:45:28] [config] bert-type-vocab-size: 2 [2022-01-15 17:45:28] [config] build-info: "" [2022-01-15 17:45:28] [config] cite: false [2022-01-15 17:45:28] [config] clip-norm: 5 [2022-01-15 17:45:28] [config] cost-scaling: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] cost-type: ce-sum [2022-01-15 17:45:28] [config] cpu-threads: 0 [2022-01-15 17:45:28] [config] data-weighting: "" [2022-01-15 17:45:28] [config] data-weighting-type: sentence [2022-01-15 17:45:28] [config] dec-cell: gru [2022-01-15 17:45:28] [config] dec-cell-base-depth: 2 [2022-01-15 17:45:28] [config] dec-cell-high-depth: 1 [2022-01-15 17:45:28] [config] dec-depth: 6 [2022-01-15 17:45:28] [config] devices: [2022-01-15 17:45:28] [config] - 0 [2022-01-15 17:45:28] [config] dim-emb: 512 
[2022-01-15 17:45:28] [config] dim-rnn: 1024 [2022-01-15 17:45:28] [config] dim-vocabs: [2022-01-15 17:45:28] [config] - 0 [2022-01-15 17:45:28] [config] - 0 [2022-01-15 17:45:28] [config] disp-first: 0 [2022-01-15 17:45:28] [config] disp-freq: 500 [2022-01-15 17:45:28] [config] disp-label-counts: true [2022-01-15 17:45:28] [config] dropout-rnn: 0 [2022-01-15 17:45:28] [config] dropout-src: 0 [2022-01-15 17:45:28] [config] dropout-trg: 0 [2022-01-15 17:45:28] [config] dump-config: "" [2022-01-15 17:45:28] [config] early-stopping: 10 [2022-01-15 17:45:28] [config] embedding-fix-src: false [2022-01-15 17:45:28] [config] embedding-fix-trg: false [2022-01-15 17:45:28] [config] embedding-normalization: false [2022-01-15 17:45:28] [config] embedding-vectors: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] enc-cell: gru [2022-01-15 17:45:28] [config] enc-cell-depth: 1 [2022-01-15 17:45:28] [config] enc-depth: 6 [2022-01-15 17:45:28] [config] enc-type: bidirectional [2022-01-15 17:45:28] [config] english-title-case-every: 0 [2022-01-15 17:45:28] [config] exponential-smoothing: 0.0001 [2022-01-15 17:45:28] [config] factor-weight: 1 [2022-01-15 17:45:28] [config] grad-dropping-momentum: 0 [2022-01-15 17:45:28] [config] grad-dropping-rate: 0 [2022-01-15 17:45:28] [config] grad-dropping-warmup: 100 [2022-01-15 17:45:28] [config] gradient-checkpointing: false [2022-01-15 17:45:28] [config] guided-alignment: none [2022-01-15 17:45:28] [config] guided-alignment-cost: mse [2022-01-15 17:45:28] [config] guided-alignment-weight: 0.1 [2022-01-15 17:45:28] [config] ignore-model-config: false [2022-01-15 17:45:28] [config] input-types: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] interpolate-env-vars: false [2022-01-15 17:45:28] [config] keep-best: false [2022-01-15 17:45:28] [config] label-smoothing: 0.1 [2022-01-15 17:45:28] [config] layer-normalization: false [2022-01-15 17:45:28] [config] learn-rate: 0.0003 [2022-01-15 17:45:28] [config] lemma-dim-emb: 0 [2022-01-15 17:45:28] [config] log: /home/wmi/train.log [2022-01-15 17:45:28] [config] log-level: info [2022-01-15 17:45:28] [config] log-time-zone: "" [2022-01-15 17:45:28] [config] logical-epoch: [2022-01-15 17:45:28] [config] - 1e [2022-01-15 17:45:28] [config] - 0 [2022-01-15 17:45:28] [config] lr-decay: 0 [2022-01-15 17:45:28] [config] lr-decay-freq: 50000 [2022-01-15 17:45:28] [config] lr-decay-inv-sqrt: [2022-01-15 17:45:28] [config] - 16000 [2022-01-15 17:45:28] [config] lr-decay-repeat-warmup: false [2022-01-15 17:45:28] [config] lr-decay-reset-optimizer: false [2022-01-15 17:45:28] [config] lr-decay-start: [2022-01-15 17:45:28] [config] - 10 [2022-01-15 17:45:28] [config] - 1 [2022-01-15 17:45:28] [config] lr-decay-strategy: epoch+stalled [2022-01-15 17:45:28] [config] lr-report: true [2022-01-15 17:45:28] [config] lr-warmup: 16000 [2022-01-15 17:45:28] [config] lr-warmup-at-reload: false [2022-01-15 17:45:28] [config] lr-warmup-cycle: false [2022-01-15 17:45:28] [config] lr-warmup-start-rate: 0 [2022-01-15 17:45:28] [config] max-length: 100 [2022-01-15 17:45:28] [config] max-length-crop: false [2022-01-15 17:45:28] [config] max-length-factor: 3 [2022-01-15 17:45:28] [config] maxi-batch: 1000 [2022-01-15 17:45:28] [config] maxi-batch-sort: trg [2022-01-15 17:45:28] [config] mini-batch: 64 [2022-01-15 17:45:28] [config] mini-batch-fit: true [2022-01-15 17:45:28] [config] mini-batch-fit-step: 10 [2022-01-15 17:45:28] [config] mini-batch-track-lr: false [2022-01-15 17:45:28] [config] mini-batch-warmup: 0 
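Aside (illustrative, not part of the original log): if the *.vocab.10000.yml files passed via --vocabs turn out to be malformed, one option is to rebuild them from the training data; Marian ships a marian-vocab tool for this, and the sketch below shows the same idea in plain Python (assumptions not shown in the log: whitespace tokenisation, a 10,000-entry cap matching the file names, and Marian's default special tokens </s> = 0 and <unk> = 1).

```python
# Sketch only: regenerate a flat "token": id vocabulary in the YAML form
# Marian reads, capped at 10,000 entries to match *.vocab.10000.yml.
from collections import Counter

SRC = "/home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv"
OUT = "/home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.vocab.10000.yml"
MAX_SIZE = 10000

counts = Counter()
with open(SRC, encoding="utf-8") as f:
    for line in f:
        counts.update(line.split())  # assumes whitespace-tokenised text

tokens = ["</s>", "<unk>"] + [t for t, _ in counts.most_common(MAX_SIZE - 2)]

with open(OUT, "w", encoding="utf-8") as f:
    for idx, tok in enumerate(tokens):
        # Double-quote and escape each token so YAML special characters
        # cannot break the mapping.
        escaped = tok.replace("\\", "\\\\").replace('"', '\\"')
        f.write(f'"{escaped}": {idx}\n')
```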
[2022-01-15 17:45:28] [config] mini-batch-words: 0 [2022-01-15 17:45:28] [config] mini-batch-words-ref: 0 [2022-01-15 17:45:28] [config] model: model.npz [2022-01-15 17:45:28] [config] multi-loss-type: sum [2022-01-15 17:45:28] [config] multi-node: false [2022-01-15 17:45:28] [config] multi-node-overlap: true [2022-01-15 17:45:28] [config] n-best: false [2022-01-15 17:45:28] [config] no-nccl: false [2022-01-15 17:45:28] [config] no-reload: false [2022-01-15 17:45:28] [config] no-restore-corpus: false [2022-01-15 17:45:28] [config] normalize: 0.6 [2022-01-15 17:45:28] [config] normalize-gradient: false [2022-01-15 17:45:28] [config] num-devices: 0 [2022-01-15 17:45:28] [config] optimizer: adam [2022-01-15 17:45:28] [config] optimizer-delay: 1 [2022-01-15 17:45:28] [config] optimizer-params: [2022-01-15 17:45:28] [config] - 0.9 [2022-01-15 17:45:28] [config] - 0.98 [2022-01-15 17:45:28] [config] - 1e-09 [2022-01-15 17:45:28] [config] output-omit-bias: false [2022-01-15 17:45:28] [config] overwrite: true [2022-01-15 17:45:28] [config] precision: [2022-01-15 17:45:28] [config] - float32 [2022-01-15 17:45:28] [config] - float32 [2022-01-15 17:45:28] [config] - float32 [2022-01-15 17:45:28] [config] pretrained-model: "" [2022-01-15 17:45:28] [config] quantize-biases: false [2022-01-15 17:45:28] [config] quantize-bits: 0 [2022-01-15 17:45:28] [config] quantize-log-based: false [2022-01-15 17:45:28] [config] quantize-optimization-steps: 0 [2022-01-15 17:45:28] [config] quiet: false [2022-01-15 17:45:28] [config] quiet-translation: false [2022-01-15 17:45:28] [config] relative-paths: false [2022-01-15 17:45:28] [config] right-left: false [2022-01-15 17:45:28] [config] save-freq: 5000 [2022-01-15 17:45:28] [config] seed: 0 [2022-01-15 17:45:28] [config] sentencepiece-alphas: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] sentencepiece-max-lines: 2000000 [2022-01-15 17:45:28] [config] sentencepiece-options: "" [2022-01-15 17:45:28] [config] shuffle: data [2022-01-15 17:45:28] [config] shuffle-in-ram: false [2022-01-15 17:45:28] [config] sigterm: save-and-exit [2022-01-15 17:45:28] [config] skip: false [2022-01-15 17:45:28] [config] sqlite: "" [2022-01-15 17:45:28] [config] sqlite-drop: false [2022-01-15 17:45:28] [config] sync-sgd: false [2022-01-15 17:45:28] [config] tempdir: /tmp [2022-01-15 17:45:28] [config] tied-embeddings: true [2022-01-15 17:45:28] [config] tied-embeddings-all: false [2022-01-15 17:45:28] [config] tied-embeddings-src: false [2022-01-15 17:45:28] [config] train-embedder-rank: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] train-sets: [2022-01-15 17:45:28] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv [2022-01-15 17:45:28] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv [2022-01-15 17:45:28] [config] transformer-aan-activation: swish [2022-01-15 17:45:28] [config] transformer-aan-depth: 2 [2022-01-15 17:45:28] [config] transformer-aan-nogate: false [2022-01-15 17:45:28] [config] transformer-decoder-autoreg: self-attention [2022-01-15 17:45:28] [config] transformer-depth-scaling: false [2022-01-15 17:45:28] [config] transformer-dim-aan: 2048 [2022-01-15 17:45:28] [config] transformer-dim-ffn: 2048 [2022-01-15 17:45:28] [config] transformer-dropout: 0.1 [2022-01-15 17:45:28] [config] transformer-dropout-attention: 0 [2022-01-15 17:45:28] [config] transformer-dropout-ffn: 0 [2022-01-15 17:45:28] [config] transformer-ffn-activation: swish [2022-01-15 17:45:28] [config] 
transformer-ffn-depth: 2 [2022-01-15 17:45:28] [config] transformer-guided-alignment-layer: last [2022-01-15 17:45:28] [config] transformer-heads: 8 [2022-01-15 17:45:28] [config] transformer-no-projection: false [2022-01-15 17:45:28] [config] transformer-pool: false [2022-01-15 17:45:28] [config] transformer-postprocess: dan [2022-01-15 17:45:28] [config] transformer-postprocess-emb: d [2022-01-15 17:45:28] [config] transformer-postprocess-top: "" [2022-01-15 17:45:28] [config] transformer-preprocess: "" [2022-01-15 17:45:28] [config] transformer-tied-layers: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] transformer-train-position-embeddings: false [2022-01-15 17:45:28] [config] tsv: false [2022-01-15 17:45:28] [config] tsv-fields: 0 [2022-01-15 17:45:28] [config] type: transformer [2022-01-15 17:45:28] [config] ulr: false [2022-01-15 17:45:28] [config] ulr-dim-emb: 0 [2022-01-15 17:45:28] [config] ulr-dropout: 0 [2022-01-15 17:45:28] [config] ulr-keys-vectors: "" [2022-01-15 17:45:28] [config] ulr-query-vectors: "" [2022-01-15 17:45:28] [config] ulr-softmax-temperature: 1 [2022-01-15 17:45:28] [config] ulr-trainable-transformation: false [2022-01-15 17:45:28] [config] unlikelihood-loss: false [2022-01-15 17:45:28] [config] valid-freq: 5000 [2022-01-15 17:45:28] [config] valid-log: "" [2022-01-15 17:45:28] [config] valid-max-length: 1000 [2022-01-15 17:45:28] [config] valid-metrics: [2022-01-15 17:45:28] [config] - cross-entropy [2022-01-15 17:45:28] [config] valid-mini-batch: 32 [2022-01-15 17:45:28] [config] valid-reset-stalled: false [2022-01-15 17:45:28] [config] valid-script-args: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] valid-script-path: "" [2022-01-15 17:45:28] [config] valid-sets: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] valid-translation-output: "" [2022-01-15 17:45:28] [config] vocabs: [2022-01-15 17:45:28] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:45:28] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:45:28] [config] word-penalty: 0 [2022-01-15 17:45:28] [config] word-scores: false [2022-01-15 17:45:28] [config] workspace: 10000 [2022-01-15 17:45:28] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:45:28] [training] Using single-device training [2022-01-15 17:45:28] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:45:28] Error: Unhandled exception of type 'N4YAML15ParserExceptionE': yaml-cpp: error at line 14, column 1: end of map not found [2022-01-15 17:45:28] Error: Aborted from void unhandledException() in /home/wmi/Workspace/marian/src/common/logging.cpp:113 [CALL STACK] [0x5633dffce5e6] + 0x29c5e6 [0x7f5448b7938c] + 0xaa38c [0x7f5448b793f7] + 0xaa3f7 [0x7f5448b796a9] + 0xaa6a9 [0x5633e001b7c7] + 0x2e97c7 [0x5633e060b658] YAML::SingleDocParser:: HandleNode (YAML::EventHandler&) + 0x278 [0x5633e060bbcc] YAML::SingleDocParser:: HandleDocument (YAML::EventHandler&) + 0x5c [0x5633e05f0dcd] YAML::Parser:: HandleNextDocument (YAML::EventHandler&) + 0x7d [0x5633e05ed6d9] YAML:: Load (std::istream&) + 0x49 [0x5633e025a328] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x838 [0x5633e0249e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a 
[0x5633e024a728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x5633e0296189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x5633e02a9084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x5633e0107f8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x5633e018f94b] marian::Train:: run () + 0x19cb [0x5633e0096389] mainTrainer (int, char**) + 0x5e9 [0x5633e00541bc] main + 0x3c [0x7f544879a0b3] __libc_start_main + 0xf3 [0x5633e0094b0e] _start + 0x2e [2022-01-15 17:51:29] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:51:29] [marian] Running on s470607-gpu as process 3840 with command line: [2022-01-15 17:51:29] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 --vocabs /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.vocab.10000.yml /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.vocab.10000.yml [2022-01-15 17:51:29] [config] after: 0e [2022-01-15 17:51:29] [config] after-batches: 0 [2022-01-15 17:51:29] [config] after-epochs: 1 [2022-01-15 17:51:29] [config] all-caps-every: 0 [2022-01-15 17:51:29] [config] allow-unk: false [2022-01-15 17:51:29] [config] authors: false [2022-01-15 17:51:29] [config] beam-size: 6 [2022-01-15 17:51:29] [config] bert-class-symbol: "[CLS]" [2022-01-15 17:51:29] [config] bert-mask-symbol: "[MASK]" [2022-01-15 17:51:29] [config] bert-masking-fraction: 0.15 [2022-01-15 17:51:29] [config] bert-sep-symbol: "[SEP]" [2022-01-15 17:51:29] [config] bert-train-type-embeddings: true [2022-01-15 17:51:29] [config] bert-type-vocab-size: 2 [2022-01-15 17:51:29] [config] build-info: "" [2022-01-15 17:51:29] [config] cite: false [2022-01-15 17:51:29] [config] clip-norm: 5 [2022-01-15 17:51:29] [config] cost-scaling: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] cost-type: ce-sum [2022-01-15 17:51:29] [config] cpu-threads: 0 [2022-01-15 17:51:29] [config] data-weighting: "" [2022-01-15 17:51:29] [config] data-weighting-type: sentence [2022-01-15 17:51:29] [config] dec-cell: gru [2022-01-15 17:51:29] [config] dec-cell-base-depth: 2 [2022-01-15 17:51:29] [config] dec-cell-high-depth: 1 [2022-01-15 17:51:29] [config] dec-depth: 6 [2022-01-15 17:51:29] [config] devices: [2022-01-15 17:51:29] [config] - 0 [2022-01-15 17:51:29] [config] dim-emb: 512 [2022-01-15 17:51:29] [config] dim-rnn: 1024 [2022-01-15 17:51:29] [config] dim-vocabs: [2022-01-15 17:51:29] [config] - 0 [2022-01-15 17:51:29] [config] - 0 [2022-01-15 17:51:29] [config] disp-first: 0 [2022-01-15 17:51:29] [config] disp-freq: 500 [2022-01-15 17:51:29] [config] disp-label-counts: true [2022-01-15 17:51:29] [config] dropout-rnn: 0 [2022-01-15 17:51:29] [config] dropout-src: 0 [2022-01-15 17:51:29] 
[config] dropout-trg: 0 [2022-01-15 17:51:29] [config] dump-config: "" [2022-01-15 17:51:29] [config] early-stopping: 10 [2022-01-15 17:51:29] [config] embedding-fix-src: false [2022-01-15 17:51:29] [config] embedding-fix-trg: false [2022-01-15 17:51:29] [config] embedding-normalization: false [2022-01-15 17:51:29] [config] embedding-vectors: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] enc-cell: gru [2022-01-15 17:51:29] [config] enc-cell-depth: 1 [2022-01-15 17:51:29] [config] enc-depth: 6 [2022-01-15 17:51:29] [config] enc-type: bidirectional [2022-01-15 17:51:29] [config] english-title-case-every: 0 [2022-01-15 17:51:29] [config] exponential-smoothing: 0.0001 [2022-01-15 17:51:29] [config] factor-weight: 1 [2022-01-15 17:51:29] [config] grad-dropping-momentum: 0 [2022-01-15 17:51:29] [config] grad-dropping-rate: 0 [2022-01-15 17:51:29] [config] grad-dropping-warmup: 100 [2022-01-15 17:51:29] [config] gradient-checkpointing: false [2022-01-15 17:51:29] [config] guided-alignment: none [2022-01-15 17:51:29] [config] guided-alignment-cost: mse [2022-01-15 17:51:29] [config] guided-alignment-weight: 0.1 [2022-01-15 17:51:29] [config] ignore-model-config: false [2022-01-15 17:51:29] [config] input-types: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] interpolate-env-vars: false [2022-01-15 17:51:29] [config] keep-best: false [2022-01-15 17:51:29] [config] label-smoothing: 0.1 [2022-01-15 17:51:29] [config] layer-normalization: false [2022-01-15 17:51:29] [config] learn-rate: 0.0003 [2022-01-15 17:51:29] [config] lemma-dim-emb: 0 [2022-01-15 17:51:29] [config] log: /home/wmi/train.log [2022-01-15 17:51:29] [config] log-level: info [2022-01-15 17:51:29] [config] log-time-zone: "" [2022-01-15 17:51:29] [config] logical-epoch: [2022-01-15 17:51:29] [config] - 1e [2022-01-15 17:51:29] [config] - 0 [2022-01-15 17:51:29] [config] lr-decay: 0 [2022-01-15 17:51:29] [config] lr-decay-freq: 50000 [2022-01-15 17:51:29] [config] lr-decay-inv-sqrt: [2022-01-15 17:51:29] [config] - 16000 [2022-01-15 17:51:29] [config] lr-decay-repeat-warmup: false [2022-01-15 17:51:29] [config] lr-decay-reset-optimizer: false [2022-01-15 17:51:29] [config] lr-decay-start: [2022-01-15 17:51:29] [config] - 10 [2022-01-15 17:51:29] [config] - 1 [2022-01-15 17:51:29] [config] lr-decay-strategy: epoch+stalled [2022-01-15 17:51:29] [config] lr-report: true [2022-01-15 17:51:29] [config] lr-warmup: 16000 [2022-01-15 17:51:29] [config] lr-warmup-at-reload: false [2022-01-15 17:51:29] [config] lr-warmup-cycle: false [2022-01-15 17:51:29] [config] lr-warmup-start-rate: 0 [2022-01-15 17:51:29] [config] max-length: 100 [2022-01-15 17:51:29] [config] max-length-crop: false [2022-01-15 17:51:29] [config] max-length-factor: 3 [2022-01-15 17:51:29] [config] maxi-batch: 1000 [2022-01-15 17:51:29] [config] maxi-batch-sort: trg [2022-01-15 17:51:29] [config] mini-batch: 64 [2022-01-15 17:51:29] [config] mini-batch-fit: true [2022-01-15 17:51:29] [config] mini-batch-fit-step: 10 [2022-01-15 17:51:29] [config] mini-batch-track-lr: false [2022-01-15 17:51:29] [config] mini-batch-warmup: 0 [2022-01-15 17:51:29] [config] mini-batch-words: 0 [2022-01-15 17:51:29] [config] mini-batch-words-ref: 0 [2022-01-15 17:51:29] [config] model: model.npz [2022-01-15 17:51:29] [config] multi-loss-type: sum [2022-01-15 17:51:29] [config] multi-node: false [2022-01-15 17:51:29] [config] multi-node-overlap: true [2022-01-15 17:51:29] [config] n-best: false [2022-01-15 17:51:29] [config] no-nccl: false [2022-01-15 17:51:29] 
[config] no-reload: false [2022-01-15 17:51:29] [config] no-restore-corpus: false [2022-01-15 17:51:29] [config] normalize: 0.6 [2022-01-15 17:51:29] [config] normalize-gradient: false [2022-01-15 17:51:29] [config] num-devices: 0 [2022-01-15 17:51:29] [config] optimizer: adam [2022-01-15 17:51:29] [config] optimizer-delay: 1 [2022-01-15 17:51:29] [config] optimizer-params: [2022-01-15 17:51:29] [config] - 0.9 [2022-01-15 17:51:29] [config] - 0.98 [2022-01-15 17:51:29] [config] - 1e-09 [2022-01-15 17:51:29] [config] output-omit-bias: false [2022-01-15 17:51:29] [config] overwrite: true [2022-01-15 17:51:29] [config] precision: [2022-01-15 17:51:29] [config] - float32 [2022-01-15 17:51:29] [config] - float32 [2022-01-15 17:51:29] [config] - float32 [2022-01-15 17:51:29] [config] pretrained-model: "" [2022-01-15 17:51:29] [config] quantize-biases: false [2022-01-15 17:51:29] [config] quantize-bits: 0 [2022-01-15 17:51:29] [config] quantize-log-based: false [2022-01-15 17:51:29] [config] quantize-optimization-steps: 0 [2022-01-15 17:51:29] [config] quiet: false [2022-01-15 17:51:29] [config] quiet-translation: false [2022-01-15 17:51:29] [config] relative-paths: false [2022-01-15 17:51:29] [config] right-left: false [2022-01-15 17:51:29] [config] save-freq: 5000 [2022-01-15 17:51:29] [config] seed: 0 [2022-01-15 17:51:29] [config] sentencepiece-alphas: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] sentencepiece-max-lines: 2000000 [2022-01-15 17:51:29] [config] sentencepiece-options: "" [2022-01-15 17:51:29] [config] shuffle: data [2022-01-15 17:51:29] [config] shuffle-in-ram: false [2022-01-15 17:51:29] [config] sigterm: save-and-exit [2022-01-15 17:51:29] [config] skip: false [2022-01-15 17:51:29] [config] sqlite: "" [2022-01-15 17:51:29] [config] sqlite-drop: false [2022-01-15 17:51:29] [config] sync-sgd: false [2022-01-15 17:51:29] [config] tempdir: /tmp [2022-01-15 17:51:29] [config] tied-embeddings: true [2022-01-15 17:51:29] [config] tied-embeddings-all: false [2022-01-15 17:51:29] [config] tied-embeddings-src: false [2022-01-15 17:51:29] [config] train-embedder-rank: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] train-sets: [2022-01-15 17:51:29] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 17:51:29] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 17:51:29] [config] transformer-aan-activation: swish [2022-01-15 17:51:29] [config] transformer-aan-depth: 2 [2022-01-15 17:51:29] [config] transformer-aan-nogate: false [2022-01-15 17:51:29] [config] transformer-decoder-autoreg: self-attention [2022-01-15 17:51:29] [config] transformer-depth-scaling: false [2022-01-15 17:51:29] [config] transformer-dim-aan: 2048 [2022-01-15 17:51:29] [config] transformer-dim-ffn: 2048 [2022-01-15 17:51:29] [config] transformer-dropout: 0.1 [2022-01-15 17:51:29] [config] transformer-dropout-attention: 0 [2022-01-15 17:51:29] [config] transformer-dropout-ffn: 0 [2022-01-15 17:51:29] [config] transformer-ffn-activation: swish [2022-01-15 17:51:29] [config] transformer-ffn-depth: 2 [2022-01-15 17:51:29] [config] transformer-guided-alignment-layer: last [2022-01-15 17:51:29] [config] transformer-heads: 8 [2022-01-15 17:51:29] [config] transformer-no-projection: false [2022-01-15 17:51:29] [config] transformer-pool: false [2022-01-15 17:51:29] [config] transformer-postprocess: dan [2022-01-15 17:51:29] [config] transformer-postprocess-emb: d [2022-01-15 17:51:29] [config] 
transformer-postprocess-top: "" [2022-01-15 17:51:29] [config] transformer-preprocess: "" [2022-01-15 17:51:29] [config] transformer-tied-layers: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] transformer-train-position-embeddings: false [2022-01-15 17:51:29] [config] tsv: false [2022-01-15 17:51:29] [config] tsv-fields: 0 [2022-01-15 17:51:29] [config] type: transformer [2022-01-15 17:51:29] [config] ulr: false [2022-01-15 17:51:29] [config] ulr-dim-emb: 0 [2022-01-15 17:51:29] [config] ulr-dropout: 0 [2022-01-15 17:51:29] [config] ulr-keys-vectors: "" [2022-01-15 17:51:29] [config] ulr-query-vectors: "" [2022-01-15 17:51:29] [config] ulr-softmax-temperature: 1 [2022-01-15 17:51:29] [config] ulr-trainable-transformation: false [2022-01-15 17:51:29] [config] unlikelihood-loss: false [2022-01-15 17:51:29] [config] valid-freq: 5000 [2022-01-15 17:51:29] [config] valid-log: "" [2022-01-15 17:51:29] [config] valid-max-length: 1000 [2022-01-15 17:51:29] [config] valid-metrics: [2022-01-15 17:51:29] [config] - cross-entropy [2022-01-15 17:51:29] [config] valid-mini-batch: 32 [2022-01-15 17:51:29] [config] valid-reset-stalled: false [2022-01-15 17:51:29] [config] valid-script-args: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] valid-script-path: "" [2022-01-15 17:51:29] [config] valid-sets: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] valid-translation-output: "" [2022-01-15 17:51:29] [config] vocabs: [2022-01-15 17:51:29] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.vocab.10000.yml [2022-01-15 17:51:29] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.vocab.10000.yml [2022-01-15 17:51:29] [config] word-penalty: 0 [2022-01-15 17:51:29] [config] word-scores: false [2022-01-15 17:51:29] [config] workspace: 10000 [2022-01-15 17:51:29] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:51:29] [training] Using single-device training [2022-01-15 17:51:29] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.vocab.10000.yml [2022-01-15 17:51:29] Error: Unhandled exception of type 'N4YAML15ParserExceptionE': yaml-cpp: error at line 14, column 1: end of map not found [2022-01-15 17:51:29] Error: Aborted from void unhandledException() in /home/wmi/Workspace/marian/src/common/logging.cpp:113 [CALL STACK] [0x564a5ea8c5e6] + 0x29c5e6 [0x7f08ed7f338c] + 0xaa38c [0x7f08ed7f33f7] + 0xaa3f7 [0x7f08ed7f36a9] + 0xaa6a9 [0x564a5ead97c7] + 0x2e97c7 [0x564a5f0c9658] YAML::SingleDocParser:: HandleNode (YAML::EventHandler&) + 0x278 [0x564a5f0c9bcc] YAML::SingleDocParser:: HandleDocument (YAML::EventHandler&) + 0x5c [0x564a5f0aedcd] YAML::Parser:: HandleNextDocument (YAML::EventHandler&) + 0x7d [0x564a5f0ab6d9] YAML:: Load (std::istream&) + 0x49 [0x564a5ed18328] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x838 [0x564a5ed07e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x564a5ed08728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x564a5ed54189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x564a5ed67084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x564a5ebc5f8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c 
[0x564a5ec4d94b] marian::Train:: run () + 0x19cb [0x564a5eb54389] mainTrainer (int, char**) + 0x5e9 [0x564a5eb121bc] main + 0x3c [0x7f08ed4140b3] __libc_start_main + 0xf3 [0x564a5eb52b0e] _start + 0x2e [2022-01-15 17:53:26] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:53:26] [marian] Running on s470607-gpu as process 3870 with command line: [2022-01-15 17:53:26] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 17:53:26] [config] after: 0e [2022-01-15 17:53:26] [config] after-batches: 0 [2022-01-15 17:53:26] [config] after-epochs: 1 [2022-01-15 17:53:26] [config] all-caps-every: 0 [2022-01-15 17:53:26] [config] allow-unk: false [2022-01-15 17:53:26] [config] authors: false [2022-01-15 17:53:26] [config] beam-size: 6 [2022-01-15 17:53:26] [config] bert-class-symbol: "[CLS]" [2022-01-15 17:53:26] [config] bert-mask-symbol: "[MASK]" [2022-01-15 17:53:26] [config] bert-masking-fraction: 0.15 [2022-01-15 17:53:26] [config] bert-sep-symbol: "[SEP]" [2022-01-15 17:53:26] [config] bert-train-type-embeddings: true [2022-01-15 17:53:26] [config] bert-type-vocab-size: 2 [2022-01-15 17:53:26] [config] build-info: "" [2022-01-15 17:53:26] [config] cite: false [2022-01-15 17:53:26] [config] clip-norm: 5 [2022-01-15 17:53:26] [config] cost-scaling: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] cost-type: ce-sum [2022-01-15 17:53:26] [config] cpu-threads: 0 [2022-01-15 17:53:26] [config] data-weighting: "" [2022-01-15 17:53:26] [config] data-weighting-type: sentence [2022-01-15 17:53:26] [config] dec-cell: gru [2022-01-15 17:53:26] [config] dec-cell-base-depth: 2 [2022-01-15 17:53:26] [config] dec-cell-high-depth: 1 [2022-01-15 17:53:26] [config] dec-depth: 6 [2022-01-15 17:53:26] [config] devices: [2022-01-15 17:53:26] [config] - 0 [2022-01-15 17:53:26] [config] dim-emb: 512 [2022-01-15 17:53:26] [config] dim-rnn: 1024 [2022-01-15 17:53:26] [config] dim-vocabs: [2022-01-15 17:53:26] [config] - 0 [2022-01-15 17:53:26] [config] - 0 [2022-01-15 17:53:26] [config] disp-first: 0 [2022-01-15 17:53:26] [config] disp-freq: 500 [2022-01-15 17:53:26] [config] disp-label-counts: true [2022-01-15 17:53:26] [config] dropout-rnn: 0 [2022-01-15 17:53:26] [config] dropout-src: 0 [2022-01-15 17:53:26] [config] dropout-trg: 0 [2022-01-15 17:53:26] [config] dump-config: "" [2022-01-15 17:53:26] [config] early-stopping: 10 [2022-01-15 17:53:26] [config] embedding-fix-src: false [2022-01-15 17:53:26] [config] embedding-fix-trg: false [2022-01-15 17:53:26] [config] embedding-normalization: false [2022-01-15 17:53:26] [config] embedding-vectors: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] enc-cell: gru [2022-01-15 17:53:26] [config] enc-cell-depth: 1 [2022-01-15 17:53:26] [config] enc-depth: 6 [2022-01-15 17:53:26] [config] enc-type: bidirectional [2022-01-15 17:53:26] [config] 
english-title-case-every: 0 [2022-01-15 17:53:26] [config] exponential-smoothing: 0.0001 [2022-01-15 17:53:26] [config] factor-weight: 1 [2022-01-15 17:53:26] [config] grad-dropping-momentum: 0 [2022-01-15 17:53:26] [config] grad-dropping-rate: 0 [2022-01-15 17:53:26] [config] grad-dropping-warmup: 100 [2022-01-15 17:53:26] [config] gradient-checkpointing: false [2022-01-15 17:53:26] [config] guided-alignment: none [2022-01-15 17:53:26] [config] guided-alignment-cost: mse [2022-01-15 17:53:26] [config] guided-alignment-weight: 0.1 [2022-01-15 17:53:26] [config] ignore-model-config: false [2022-01-15 17:53:26] [config] input-types: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] interpolate-env-vars: false [2022-01-15 17:53:26] [config] keep-best: false [2022-01-15 17:53:26] [config] label-smoothing: 0.1 [2022-01-15 17:53:26] [config] layer-normalization: false [2022-01-15 17:53:26] [config] learn-rate: 0.0003 [2022-01-15 17:53:26] [config] lemma-dim-emb: 0 [2022-01-15 17:53:26] [config] log: /home/wmi/train.log [2022-01-15 17:53:26] [config] log-level: info [2022-01-15 17:53:26] [config] log-time-zone: "" [2022-01-15 17:53:26] [config] logical-epoch: [2022-01-15 17:53:26] [config] - 1e [2022-01-15 17:53:26] [config] - 0 [2022-01-15 17:53:26] [config] lr-decay: 0 [2022-01-15 17:53:26] [config] lr-decay-freq: 50000 [2022-01-15 17:53:26] [config] lr-decay-inv-sqrt: [2022-01-15 17:53:26] [config] - 16000 [2022-01-15 17:53:26] [config] lr-decay-repeat-warmup: false [2022-01-15 17:53:26] [config] lr-decay-reset-optimizer: false [2022-01-15 17:53:26] [config] lr-decay-start: [2022-01-15 17:53:26] [config] - 10 [2022-01-15 17:53:26] [config] - 1 [2022-01-15 17:53:26] [config] lr-decay-strategy: epoch+stalled [2022-01-15 17:53:26] [config] lr-report: true [2022-01-15 17:53:26] [config] lr-warmup: 16000 [2022-01-15 17:53:26] [config] lr-warmup-at-reload: false [2022-01-15 17:53:26] [config] lr-warmup-cycle: false [2022-01-15 17:53:26] [config] lr-warmup-start-rate: 0 [2022-01-15 17:53:26] [config] max-length: 100 [2022-01-15 17:53:26] [config] max-length-crop: false [2022-01-15 17:53:26] [config] max-length-factor: 3 [2022-01-15 17:53:26] [config] maxi-batch: 1000 [2022-01-15 17:53:26] [config] maxi-batch-sort: trg [2022-01-15 17:53:26] [config] mini-batch: 64 [2022-01-15 17:53:26] [config] mini-batch-fit: true [2022-01-15 17:53:26] [config] mini-batch-fit-step: 10 [2022-01-15 17:53:26] [config] mini-batch-track-lr: false [2022-01-15 17:53:26] [config] mini-batch-warmup: 0 [2022-01-15 17:53:26] [config] mini-batch-words: 0 [2022-01-15 17:53:26] [config] mini-batch-words-ref: 0 [2022-01-15 17:53:26] [config] model: model.npz [2022-01-15 17:53:26] [config] multi-loss-type: sum [2022-01-15 17:53:26] [config] multi-node: false [2022-01-15 17:53:26] [config] multi-node-overlap: true [2022-01-15 17:53:26] [config] n-best: false [2022-01-15 17:53:26] [config] no-nccl: false [2022-01-15 17:53:26] [config] no-reload: false [2022-01-15 17:53:26] [config] no-restore-corpus: false [2022-01-15 17:53:26] [config] normalize: 0.6 [2022-01-15 17:53:26] [config] normalize-gradient: false [2022-01-15 17:53:26] [config] num-devices: 0 [2022-01-15 17:53:26] [config] optimizer: adam [2022-01-15 17:53:26] [config] optimizer-delay: 1 [2022-01-15 17:53:26] [config] optimizer-params: [2022-01-15 17:53:26] [config] - 0.9 [2022-01-15 17:53:26] [config] - 0.98 [2022-01-15 17:53:26] [config] - 1e-09 [2022-01-15 17:53:26] [config] output-omit-bias: false [2022-01-15 17:53:26] [config] overwrite: true [2022-01-15 
17:53:26] [config] precision: [2022-01-15 17:53:26] [config] - float32 [2022-01-15 17:53:26] [config] - float32 [2022-01-15 17:53:26] [config] - float32 [2022-01-15 17:53:26] [config] pretrained-model: "" [2022-01-15 17:53:26] [config] quantize-biases: false [2022-01-15 17:53:26] [config] quantize-bits: 0 [2022-01-15 17:53:26] [config] quantize-log-based: false [2022-01-15 17:53:26] [config] quantize-optimization-steps: 0 [2022-01-15 17:53:26] [config] quiet: false [2022-01-15 17:53:26] [config] quiet-translation: false [2022-01-15 17:53:26] [config] relative-paths: false [2022-01-15 17:53:26] [config] right-left: false [2022-01-15 17:53:26] [config] save-freq: 5000 [2022-01-15 17:53:26] [config] seed: 0 [2022-01-15 17:53:26] [config] sentencepiece-alphas: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] sentencepiece-max-lines: 2000000 [2022-01-15 17:53:26] [config] sentencepiece-options: "" [2022-01-15 17:53:26] [config] shuffle: data [2022-01-15 17:53:26] [config] shuffle-in-ram: false [2022-01-15 17:53:26] [config] sigterm: save-and-exit [2022-01-15 17:53:26] [config] skip: false [2022-01-15 17:53:26] [config] sqlite: "" [2022-01-15 17:53:26] [config] sqlite-drop: false [2022-01-15 17:53:26] [config] sync-sgd: false [2022-01-15 17:53:26] [config] tempdir: /tmp [2022-01-15 17:53:26] [config] tied-embeddings: true [2022-01-15 17:53:26] [config] tied-embeddings-all: false [2022-01-15 17:53:26] [config] tied-embeddings-src: false [2022-01-15 17:53:26] [config] train-embedder-rank: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] train-sets: [2022-01-15 17:53:26] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 17:53:26] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 17:53:26] [config] transformer-aan-activation: swish [2022-01-15 17:53:26] [config] transformer-aan-depth: 2 [2022-01-15 17:53:26] [config] transformer-aan-nogate: false [2022-01-15 17:53:26] [config] transformer-decoder-autoreg: self-attention [2022-01-15 17:53:26] [config] transformer-depth-scaling: false [2022-01-15 17:53:26] [config] transformer-dim-aan: 2048 [2022-01-15 17:53:26] [config] transformer-dim-ffn: 2048 [2022-01-15 17:53:26] [config] transformer-dropout: 0.1 [2022-01-15 17:53:26] [config] transformer-dropout-attention: 0 [2022-01-15 17:53:26] [config] transformer-dropout-ffn: 0 [2022-01-15 17:53:26] [config] transformer-ffn-activation: swish [2022-01-15 17:53:26] [config] transformer-ffn-depth: 2 [2022-01-15 17:53:26] [config] transformer-guided-alignment-layer: last [2022-01-15 17:53:26] [config] transformer-heads: 8 [2022-01-15 17:53:26] [config] transformer-no-projection: false [2022-01-15 17:53:26] [config] transformer-pool: false [2022-01-15 17:53:26] [config] transformer-postprocess: dan [2022-01-15 17:53:26] [config] transformer-postprocess-emb: d [2022-01-15 17:53:26] [config] transformer-postprocess-top: "" [2022-01-15 17:53:26] [config] transformer-preprocess: "" [2022-01-15 17:53:26] [config] transformer-tied-layers: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] transformer-train-position-embeddings: false [2022-01-15 17:53:26] [config] tsv: false [2022-01-15 17:53:26] [config] tsv-fields: 0 [2022-01-15 17:53:26] [config] type: transformer [2022-01-15 17:53:26] [config] ulr: false [2022-01-15 17:53:26] [config] ulr-dim-emb: 0 [2022-01-15 17:53:26] [config] ulr-dropout: 0 [2022-01-15 17:53:26] [config] ulr-keys-vectors: "" [2022-01-15 17:53:26] [config] 
ulr-query-vectors: "" [2022-01-15 17:53:26] [config] ulr-softmax-temperature: 1 [2022-01-15 17:53:26] [config] ulr-trainable-transformation: false [2022-01-15 17:53:26] [config] unlikelihood-loss: false [2022-01-15 17:53:26] [config] valid-freq: 5000 [2022-01-15 17:53:26] [config] valid-log: "" [2022-01-15 17:53:26] [config] valid-max-length: 1000 [2022-01-15 17:53:26] [config] valid-metrics: [2022-01-15 17:53:26] [config] - cross-entropy [2022-01-15 17:53:26] [config] valid-mini-batch: 32 [2022-01-15 17:53:26] [config] valid-reset-stalled: false [2022-01-15 17:53:26] [config] valid-script-args: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] valid-script-path: "" [2022-01-15 17:53:26] [config] valid-sets: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] valid-translation-output: "" [2022-01-15 17:53:26] [config] vocabs: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] word-penalty: 0 [2022-01-15 17:53:26] [config] word-scores: false [2022-01-15 17:53:26] [config] workspace: 10000 [2022-01-15 17:53:26] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:53:26] [training] Using single-device training [2022-01-15 17:53:26] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 17:53:26] [data] Vocabularies will be built separately for each file. [2022-01-15 17:53:26] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 17:53:26] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 17:53:26] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 17:53:55] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.yml [2022-01-15 17:54:06] [data] Setting vocabulary size for input 0 to 2,393,556 [2022-01-15 17:54:06] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 17:54:06] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 17:54:06] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 17:54:31] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.yml [2022-01-15 17:54:41] [data] Setting vocabulary size for input 1 to 2,113,516 [2022-01-15 17:54:41] [comm] Compiled without MPI support. Running as a single process on s470607-gpu [2022-01-15 17:54:41] [batching] Collecting statistics for batch fitting with step size 10 [2022-01-15 17:54:41] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 17:54:41] [logits] Applying loss function for 1 factor(s) [2022-01-15 17:54:41] Error: Labels not matching logits shape (10821201920 != -2063699968, shape=1x10x512x2113516 size=-2063699968)?? 
[2022-01-15 17:54:41] Error: Aborted from marian::Expr marian::Logits::applyLossFunction(const Words&, const std::function > >(IntrusivePtr > >, IntrusivePtr > >)>&) const in /home/wmi/Workspace/marian/src/layers/generic.cpp:26 [CALL STACK] [0x5620559b48a5] marian::Logits:: applyLossFunction (std::vector> const&, std::function>> (IntrusivePtr>>,IntrusivePtr>>)> const&) const + 0xc35 [0x5620559d0f32] marian::CrossEntropyLoss:: compute (marian::Logits, std::vector> const&, IntrusivePtr>>, IntrusivePtr>>) + 0x82 [0x5620559cfde9] marian::LabelwiseLoss:: apply (marian::Logits, std::vector> const&, IntrusivePtr>>, IntrusivePtr>>) + 0x339 [0x5620555ba0db] marian::models::EncoderDecoderCECost:: apply (std::shared_ptr, std::shared_ptr, std::shared_ptr, bool) + 0x58b [0x56205520cc82] marian::models::Trainer:: build (std::shared_ptr, std::shared_ptr, bool) + 0xb2 [0x5620556a15f4] marian::GraphGroup:: collectStats (std::shared_ptr, std::shared_ptr, std::vector,std::allocator>> const&, double) + 0xb84 [0x5620552e3269] marian::Train:: run () + 0x2e9 [0x5620551eb389] mainTrainer (int, char**) + 0x5e9 [0x5620551a91bc] main + 0x3c [0x7f9abc80f0b3] __libc_start_main + 0xf3 [0x5620551e9b0e] _start + 0x2e [2022-01-15 18:02:17] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 18:02:17] [marian] Running on s470607-gpu as process 3955 with command line: [2022-01-15 18:02:17] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 18:02:17] [config] after: 0e [2022-01-15 18:02:17] [config] after-batches: 0 [2022-01-15 18:02:17] [config] after-epochs: 1 [2022-01-15 18:02:17] [config] all-caps-every: 0 [2022-01-15 18:02:17] [config] allow-unk: false [2022-01-15 18:02:17] [config] authors: false [2022-01-15 18:02:17] [config] beam-size: 6 [2022-01-15 18:02:17] [config] bert-class-symbol: "[CLS]" [2022-01-15 18:02:17] [config] bert-mask-symbol: "[MASK]" [2022-01-15 18:02:17] [config] bert-masking-fraction: 0.15 [2022-01-15 18:02:17] [config] bert-sep-symbol: "[SEP]" [2022-01-15 18:02:17] [config] bert-train-type-embeddings: true [2022-01-15 18:02:17] [config] bert-type-vocab-size: 2 [2022-01-15 18:02:17] [config] build-info: "" [2022-01-15 18:02:17] [config] cite: false [2022-01-15 18:02:17] [config] clip-norm: 5 [2022-01-15 18:02:17] [config] cost-scaling: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] cost-type: ce-sum [2022-01-15 18:02:17] [config] cpu-threads: 0 [2022-01-15 18:02:17] [config] data-weighting: "" [2022-01-15 18:02:17] [config] data-weighting-type: sentence [2022-01-15 18:02:17] [config] dec-cell: gru [2022-01-15 18:02:17] [config] dec-cell-base-depth: 2 [2022-01-15 18:02:17] [config] dec-cell-high-depth: 1 [2022-01-15 18:02:17] [config] dec-depth: 6 [2022-01-15 18:02:17] [config] devices: [2022-01-15 18:02:17] [config] - 0 [2022-01-15 18:02:17] [config] dim-emb: 512 [2022-01-15 18:02:17] 
[config] dim-rnn: 1024 [2022-01-15 18:02:17] [config] dim-vocabs: [2022-01-15 18:02:17] [config] - 0 [2022-01-15 18:02:17] [config] - 0 [2022-01-15 18:02:17] [config] disp-first: 0 [2022-01-15 18:02:17] [config] disp-freq: 500 [2022-01-15 18:02:17] [config] disp-label-counts: true [2022-01-15 18:02:17] [config] dropout-rnn: 0 [2022-01-15 18:02:17] [config] dropout-src: 0 [2022-01-15 18:02:17] [config] dropout-trg: 0 [2022-01-15 18:02:17] [config] dump-config: "" [2022-01-15 18:02:17] [config] early-stopping: 10 [2022-01-15 18:02:17] [config] embedding-fix-src: false [2022-01-15 18:02:17] [config] embedding-fix-trg: false [2022-01-15 18:02:17] [config] embedding-normalization: false [2022-01-15 18:02:17] [config] embedding-vectors: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] enc-cell: gru [2022-01-15 18:02:17] [config] enc-cell-depth: 1 [2022-01-15 18:02:17] [config] enc-depth: 6 [2022-01-15 18:02:17] [config] enc-type: bidirectional [2022-01-15 18:02:17] [config] english-title-case-every: 0 [2022-01-15 18:02:17] [config] exponential-smoothing: 0.0001 [2022-01-15 18:02:17] [config] factor-weight: 1 [2022-01-15 18:02:17] [config] grad-dropping-momentum: 0 [2022-01-15 18:02:17] [config] grad-dropping-rate: 0 [2022-01-15 18:02:17] [config] grad-dropping-warmup: 100 [2022-01-15 18:02:17] [config] gradient-checkpointing: false [2022-01-15 18:02:17] [config] guided-alignment: none [2022-01-15 18:02:17] [config] guided-alignment-cost: mse [2022-01-15 18:02:17] [config] guided-alignment-weight: 0.1 [2022-01-15 18:02:17] [config] ignore-model-config: false [2022-01-15 18:02:17] [config] input-types: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] interpolate-env-vars: false [2022-01-15 18:02:17] [config] keep-best: false [2022-01-15 18:02:17] [config] label-smoothing: 0.1 [2022-01-15 18:02:17] [config] layer-normalization: false [2022-01-15 18:02:17] [config] learn-rate: 0.0003 [2022-01-15 18:02:17] [config] lemma-dim-emb: 0 [2022-01-15 18:02:17] [config] log: /home/wmi/train.log [2022-01-15 18:02:17] [config] log-level: info [2022-01-15 18:02:17] [config] log-time-zone: "" [2022-01-15 18:02:17] [config] logical-epoch: [2022-01-15 18:02:17] [config] - 1e [2022-01-15 18:02:17] [config] - 0 [2022-01-15 18:02:17] [config] lr-decay: 0 [2022-01-15 18:02:17] [config] lr-decay-freq: 50000 [2022-01-15 18:02:17] [config] lr-decay-inv-sqrt: [2022-01-15 18:02:17] [config] - 16000 [2022-01-15 18:02:17] [config] lr-decay-repeat-warmup: false [2022-01-15 18:02:17] [config] lr-decay-reset-optimizer: false [2022-01-15 18:02:17] [config] lr-decay-start: [2022-01-15 18:02:17] [config] - 10 [2022-01-15 18:02:17] [config] - 1 [2022-01-15 18:02:17] [config] lr-decay-strategy: epoch+stalled [2022-01-15 18:02:17] [config] lr-report: true [2022-01-15 18:02:17] [config] lr-warmup: 16000 [2022-01-15 18:02:17] [config] lr-warmup-at-reload: false [2022-01-15 18:02:17] [config] lr-warmup-cycle: false [2022-01-15 18:02:17] [config] lr-warmup-start-rate: 0 [2022-01-15 18:02:17] [config] max-length: 100 [2022-01-15 18:02:17] [config] max-length-crop: false [2022-01-15 18:02:17] [config] max-length-factor: 3 [2022-01-15 18:02:17] [config] maxi-batch: 1000 [2022-01-15 18:02:17] [config] maxi-batch-sort: trg [2022-01-15 18:02:17] [config] mini-batch: 64 [2022-01-15 18:02:17] [config] mini-batch-fit: true [2022-01-15 18:02:17] [config] mini-batch-fit-step: 10 [2022-01-15 18:02:17] [config] mini-batch-track-lr: false [2022-01-15 18:02:17] [config] mini-batch-warmup: 0 [2022-01-15 18:02:17] [config] 
mini-batch-words: 0 [2022-01-15 18:02:17] [config] mini-batch-words-ref: 0 [2022-01-15 18:02:17] [config] model: model.npz [2022-01-15 18:02:17] [config] multi-loss-type: sum [2022-01-15 18:02:17] [config] multi-node: false [2022-01-15 18:02:17] [config] multi-node-overlap: true [2022-01-15 18:02:17] [config] n-best: false [2022-01-15 18:02:17] [config] no-nccl: false [2022-01-15 18:02:17] [config] no-reload: false [2022-01-15 18:02:17] [config] no-restore-corpus: false [2022-01-15 18:02:17] [config] normalize: 0.6 [2022-01-15 18:02:17] [config] normalize-gradient: false [2022-01-15 18:02:17] [config] num-devices: 0 [2022-01-15 18:02:17] [config] optimizer: adam [2022-01-15 18:02:17] [config] optimizer-delay: 1 [2022-01-15 18:02:17] [config] optimizer-params: [2022-01-15 18:02:17] [config] - 0.9 [2022-01-15 18:02:17] [config] - 0.98 [2022-01-15 18:02:17] [config] - 1e-09 [2022-01-15 18:02:17] [config] output-omit-bias: false [2022-01-15 18:02:17] [config] overwrite: true [2022-01-15 18:02:17] [config] precision: [2022-01-15 18:02:17] [config] - float32 [2022-01-15 18:02:17] [config] - float32 [2022-01-15 18:02:17] [config] - float32 [2022-01-15 18:02:17] [config] pretrained-model: "" [2022-01-15 18:02:17] [config] quantize-biases: false [2022-01-15 18:02:17] [config] quantize-bits: 0 [2022-01-15 18:02:17] [config] quantize-log-based: false [2022-01-15 18:02:17] [config] quantize-optimization-steps: 0 [2022-01-15 18:02:17] [config] quiet: false [2022-01-15 18:02:17] [config] quiet-translation: false [2022-01-15 18:02:17] [config] relative-paths: false [2022-01-15 18:02:17] [config] right-left: false [2022-01-15 18:02:17] [config] save-freq: 5000 [2022-01-15 18:02:17] [config] seed: 0 [2022-01-15 18:02:17] [config] sentencepiece-alphas: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] sentencepiece-max-lines: 2000000 [2022-01-15 18:02:17] [config] sentencepiece-options: "" [2022-01-15 18:02:17] [config] shuffle: data [2022-01-15 18:02:17] [config] shuffle-in-ram: false [2022-01-15 18:02:17] [config] sigterm: save-and-exit [2022-01-15 18:02:17] [config] skip: false [2022-01-15 18:02:17] [config] sqlite: "" [2022-01-15 18:02:17] [config] sqlite-drop: false [2022-01-15 18:02:17] [config] sync-sgd: false [2022-01-15 18:02:17] [config] tempdir: /tmp [2022-01-15 18:02:17] [config] tied-embeddings: true [2022-01-15 18:02:17] [config] tied-embeddings-all: false [2022-01-15 18:02:17] [config] tied-embeddings-src: false [2022-01-15 18:02:17] [config] train-embedder-rank: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] train-sets: [2022-01-15 18:02:17] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 18:02:17] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 18:02:17] [config] transformer-aan-activation: swish [2022-01-15 18:02:17] [config] transformer-aan-depth: 2 [2022-01-15 18:02:17] [config] transformer-aan-nogate: false [2022-01-15 18:02:17] [config] transformer-decoder-autoreg: self-attention [2022-01-15 18:02:17] [config] transformer-depth-scaling: false [2022-01-15 18:02:17] [config] transformer-dim-aan: 2048 [2022-01-15 18:02:17] [config] transformer-dim-ffn: 2048 [2022-01-15 18:02:17] [config] transformer-dropout: 0.1 [2022-01-15 18:02:17] [config] transformer-dropout-attention: 0 [2022-01-15 18:02:17] [config] transformer-dropout-ffn: 0 [2022-01-15 18:02:17] [config] transformer-ffn-activation: swish [2022-01-15 18:02:17] [config] transformer-ffn-depth: 2 [2022-01-15 18:02:17] 
[config] transformer-guided-alignment-layer: last [2022-01-15 18:02:17] [config] transformer-heads: 8 [2022-01-15 18:02:17] [config] transformer-no-projection: false [2022-01-15 18:02:17] [config] transformer-pool: false [2022-01-15 18:02:17] [config] transformer-postprocess: dan [2022-01-15 18:02:17] [config] transformer-postprocess-emb: d [2022-01-15 18:02:17] [config] transformer-postprocess-top: "" [2022-01-15 18:02:17] [config] transformer-preprocess: "" [2022-01-15 18:02:17] [config] transformer-tied-layers: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] transformer-train-position-embeddings: false [2022-01-15 18:02:17] [config] tsv: false [2022-01-15 18:02:17] [config] tsv-fields: 0 [2022-01-15 18:02:17] [config] type: transformer [2022-01-15 18:02:17] [config] ulr: false [2022-01-15 18:02:17] [config] ulr-dim-emb: 0 [2022-01-15 18:02:17] [config] ulr-dropout: 0 [2022-01-15 18:02:17] [config] ulr-keys-vectors: "" [2022-01-15 18:02:17] [config] ulr-query-vectors: "" [2022-01-15 18:02:17] [config] ulr-softmax-temperature: 1 [2022-01-15 18:02:17] [config] ulr-trainable-transformation: false [2022-01-15 18:02:17] [config] unlikelihood-loss: false [2022-01-15 18:02:17] [config] valid-freq: 5000 [2022-01-15 18:02:17] [config] valid-log: "" [2022-01-15 18:02:17] [config] valid-max-length: 1000 [2022-01-15 18:02:17] [config] valid-metrics: [2022-01-15 18:02:17] [config] - cross-entropy [2022-01-15 18:02:17] [config] valid-mini-batch: 32 [2022-01-15 18:02:17] [config] valid-reset-stalled: false [2022-01-15 18:02:17] [config] valid-script-args: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] valid-script-path: "" [2022-01-15 18:02:17] [config] valid-sets: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] valid-translation-output: "" [2022-01-15 18:02:17] [config] vocabs: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] word-penalty: 0 [2022-01-15 18:02:17] [config] word-scores: false [2022-01-15 18:02:17] [config] workspace: 10000 [2022-01-15 18:02:17] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 18:02:17] [training] Using single-device training [2022-01-15 18:02:17] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 18:02:17] [data] Vocabularies will be built separately for each file. 
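Two notes before the vocabulary-building entries that follow. First, the earlier attempts that pointed --vocabs at the hand-supplied *.vocab.10000.yml files died inside yaml-cpp ("end of map not found" at line 14), i.e. those files are not parseable YAML vocabularies; this run, like the 17:53 one, therefore drops --vocabs and lets Marian find or build vocabularies itself. Second, without --vocabs Marian builds one vocabulary per training file, and the default .yml vocabulary appears to be word-level (one entry per whitespace-separated surface form), so on raw unsegmented text the type count explodes, which is what the "Setting vocabulary size ... to 2,393,556 / 2,113,516" lines below show. A small pre-flight check along these lines would have caught both problems before training; PyYAML and the whitespace-token heuristic are assumptions for illustration, not anything Marian ships:

```python
import sys
import yaml  # PyYAML, assumed to be installed

def check_vocab_yaml(path):
    """Try to parse a token->id YAML vocabulary and report where parsing fails."""
    try:
        with open(path, encoding="utf-8") as f:
            vocab = yaml.safe_load(f)
        print(f"{path}: parsed OK, {len(vocab):,} entries")
    except yaml.YAMLError as err:
        print(f"{path}: YAML error: {err}")

def count_word_types(path):
    """Rough preview of the size of a whitespace-token vocabulary built from `path`."""
    types = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            types.update(line.split())
    return len(types)

if __name__ == "__main__":
    for p in sys.argv[1:]:
        if p.endswith((".yml", ".yaml")):
            check_vocab_yaml(p)
        else:
            print(f"{p}: ~{count_word_types(p):,} whitespace-separated types")
```

A type count in the millions is a strong hint that subword segmentation (SentencePiece/BPE) or a frequency cut-off is needed before the output layer can be built at a sane size.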
[2022-01-15 18:02:17] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 18:02:17] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 18:02:17] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 18:02:47] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.yml [2022-01-15 18:02:58] [data] Setting vocabulary size for input 0 to 2,393,556 [2022-01-15 18:02:58] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 18:02:58] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 18:02:58] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 18:03:23] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.yml [2022-01-15 18:03:33] [data] Setting vocabulary size for input 1 to 2,113,516 [2022-01-15 18:03:33] [comm] Compiled without MPI support. Running as a single process on s470607-gpu [2022-01-15 18:03:33] [batching] Collecting statistics for batch fitting with step size 10 [2022-01-15 18:03:33] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 18:03:34] [logits] Applying loss function for 1 factor(s) [2022-01-15 18:03:34] Error: Labels not matching logits shape (10821201920 != -2063699968, shape=1x10x512x2113516 size=-2063699968)?? 
[2022-01-15 18:03:34] Error: Aborted from marian::Expr marian::Logits::applyLossFunction(const Words&, const std::function > >(IntrusivePtr > >, IntrusivePtr > >)>&) const in /home/wmi/Workspace/marian/src/layers/generic.cpp:26 [CALL STACK] [0x55a61109a8a5] marian::Logits:: applyLossFunction (std::vector> const&, std::function>> (IntrusivePtr>>,IntrusivePtr>>)> const&) const + 0xc35 [0x55a6110b6f32] marian::CrossEntropyLoss:: compute (marian::Logits, std::vector> const&, IntrusivePtr>>, IntrusivePtr>>) + 0x82 [0x55a6110b5de9] marian::LabelwiseLoss:: apply (marian::Logits, std::vector> const&, IntrusivePtr>>, IntrusivePtr>>) + 0x339 [0x55a610ca00db] marian::models::EncoderDecoderCECost:: apply (std::shared_ptr, std::shared_ptr, std::shared_ptr, bool) + 0x58b [0x55a6108f2c82] marian::models::Trainer:: build (std::shared_ptr, std::shared_ptr, bool) + 0xb2 [0x55a610d875f4] marian::GraphGroup:: collectStats (std::shared_ptr, std::shared_ptr, std::vector,std::allocator>> const&, double) + 0xb84 [0x55a6109c9269] marian::Train:: run () + 0x2e9 [0x55a6108d1389] mainTrainer (int, char**) + 0x5e9 [0x55a61088f1bc] main + 0x3c [0x7fab12a310b3] __libc_start_main + 0xf3 [0x55a6108cfb0e] _start + 0x2e [2022-01-15 18:10:00] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 18:10:00] [marian] Running on s470607-gpu as process 4042 with command line: [2022-01-15 18:10:00] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 18:10:00] [config] after: 0e [2022-01-15 18:10:00] [config] after-batches: 0 [2022-01-15 18:10:00] [config] after-epochs: 1 [2022-01-15 18:10:00] [config] all-caps-every: 0 [2022-01-15 18:10:00] [config] allow-unk: false [2022-01-15 18:10:00] [config] authors: false [2022-01-15 18:10:00] [config] beam-size: 6 [2022-01-15 18:10:00] [config] bert-class-symbol: "[CLS]" [2022-01-15 18:10:00] [config] bert-mask-symbol: "[MASK]" [2022-01-15 18:10:00] [config] bert-masking-fraction: 0.15 [2022-01-15 18:10:00] [config] bert-sep-symbol: "[SEP]" [2022-01-15 18:10:00] [config] bert-train-type-embeddings: true [2022-01-15 18:10:00] [config] bert-type-vocab-size: 2 [2022-01-15 18:10:00] [config] build-info: "" [2022-01-15 18:10:00] [config] cite: false [2022-01-15 18:10:00] [config] clip-norm: 5 [2022-01-15 18:10:00] [config] cost-scaling: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] cost-type: ce-sum [2022-01-15 18:10:00] [config] cpu-threads: 0 [2022-01-15 18:10:00] [config] data-weighting: "" [2022-01-15 18:10:00] [config] data-weighting-type: sentence [2022-01-15 18:10:00] [config] dec-cell: gru [2022-01-15 18:10:00] [config] dec-cell-base-depth: 2 [2022-01-15 18:10:00] [config] dec-cell-high-depth: 1 [2022-01-15 18:10:00] [config] dec-depth: 6 [2022-01-15 18:10:00] [config] devices: [2022-01-15 18:10:00] [config] - 0 [2022-01-15 18:10:00] [config] dim-emb: 512 
[2022-01-15 18:10:00] [config] dim-rnn: 1024 [2022-01-15 18:10:00] [config] dim-vocabs: [2022-01-15 18:10:00] [config] - 0 [2022-01-15 18:10:00] [config] - 0 [2022-01-15 18:10:00] [config] disp-first: 0 [2022-01-15 18:10:00] [config] disp-freq: 500 [2022-01-15 18:10:00] [config] disp-label-counts: true [2022-01-15 18:10:00] [config] dropout-rnn: 0 [2022-01-15 18:10:00] [config] dropout-src: 0 [2022-01-15 18:10:00] [config] dropout-trg: 0 [2022-01-15 18:10:00] [config] dump-config: "" [2022-01-15 18:10:00] [config] early-stopping: 10 [2022-01-15 18:10:00] [config] embedding-fix-src: false [2022-01-15 18:10:00] [config] embedding-fix-trg: false [2022-01-15 18:10:00] [config] embedding-normalization: false [2022-01-15 18:10:00] [config] embedding-vectors: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] enc-cell: gru [2022-01-15 18:10:00] [config] enc-cell-depth: 1 [2022-01-15 18:10:00] [config] enc-depth: 6 [2022-01-15 18:10:00] [config] enc-type: bidirectional [2022-01-15 18:10:00] [config] english-title-case-every: 0 [2022-01-15 18:10:00] [config] exponential-smoothing: 0.0001 [2022-01-15 18:10:00] [config] factor-weight: 1 [2022-01-15 18:10:00] [config] grad-dropping-momentum: 0 [2022-01-15 18:10:00] [config] grad-dropping-rate: 0 [2022-01-15 18:10:00] [config] grad-dropping-warmup: 100 [2022-01-15 18:10:00] [config] gradient-checkpointing: false [2022-01-15 18:10:00] [config] guided-alignment: none [2022-01-15 18:10:00] [config] guided-alignment-cost: mse [2022-01-15 18:10:00] [config] guided-alignment-weight: 0.1 [2022-01-15 18:10:00] [config] ignore-model-config: false [2022-01-15 18:10:00] [config] input-types: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] interpolate-env-vars: false [2022-01-15 18:10:00] [config] keep-best: false [2022-01-15 18:10:00] [config] label-smoothing: 0.1 [2022-01-15 18:10:00] [config] layer-normalization: false [2022-01-15 18:10:00] [config] learn-rate: 0.0003 [2022-01-15 18:10:00] [config] lemma-dim-emb: 0 [2022-01-15 18:10:00] [config] log: /home/wmi/train.log [2022-01-15 18:10:00] [config] log-level: info [2022-01-15 18:10:00] [config] log-time-zone: "" [2022-01-15 18:10:00] [config] logical-epoch: [2022-01-15 18:10:00] [config] - 1e [2022-01-15 18:10:00] [config] - 0 [2022-01-15 18:10:00] [config] lr-decay: 0 [2022-01-15 18:10:00] [config] lr-decay-freq: 50000 [2022-01-15 18:10:00] [config] lr-decay-inv-sqrt: [2022-01-15 18:10:00] [config] - 16000 [2022-01-15 18:10:00] [config] lr-decay-repeat-warmup: false [2022-01-15 18:10:00] [config] lr-decay-reset-optimizer: false [2022-01-15 18:10:00] [config] lr-decay-start: [2022-01-15 18:10:00] [config] - 10 [2022-01-15 18:10:00] [config] - 1 [2022-01-15 18:10:00] [config] lr-decay-strategy: epoch+stalled [2022-01-15 18:10:00] [config] lr-report: true [2022-01-15 18:10:00] [config] lr-warmup: 16000 [2022-01-15 18:10:00] [config] lr-warmup-at-reload: false [2022-01-15 18:10:00] [config] lr-warmup-cycle: false [2022-01-15 18:10:00] [config] lr-warmup-start-rate: 0 [2022-01-15 18:10:00] [config] max-length: 100 [2022-01-15 18:10:00] [config] max-length-crop: false [2022-01-15 18:10:00] [config] max-length-factor: 3 [2022-01-15 18:10:00] [config] maxi-batch: 1000 [2022-01-15 18:10:00] [config] maxi-batch-sort: trg [2022-01-15 18:10:00] [config] mini-batch: 64 [2022-01-15 18:10:00] [config] mini-batch-fit: true [2022-01-15 18:10:00] [config] mini-batch-fit-step: 10 [2022-01-15 18:10:00] [config] mini-batch-track-lr: false [2022-01-15 18:10:00] [config] mini-batch-warmup: 0 
[2022-01-15 18:10:00] [config] mini-batch-words: 0 [2022-01-15 18:10:00] [config] mini-batch-words-ref: 0 [2022-01-15 18:10:00] [config] model: model.npz [2022-01-15 18:10:00] [config] multi-loss-type: sum [2022-01-15 18:10:00] [config] multi-node: false [2022-01-15 18:10:00] [config] multi-node-overlap: true [2022-01-15 18:10:00] [config] n-best: false [2022-01-15 18:10:00] [config] no-nccl: false [2022-01-15 18:10:00] [config] no-reload: false [2022-01-15 18:10:00] [config] no-restore-corpus: false [2022-01-15 18:10:00] [config] normalize: 0.6 [2022-01-15 18:10:00] [config] normalize-gradient: false [2022-01-15 18:10:00] [config] num-devices: 0 [2022-01-15 18:10:00] [config] optimizer: adam [2022-01-15 18:10:00] [config] optimizer-delay: 1 [2022-01-15 18:10:00] [config] optimizer-params: [2022-01-15 18:10:00] [config] - 0.9 [2022-01-15 18:10:00] [config] - 0.98 [2022-01-15 18:10:00] [config] - 1e-09 [2022-01-15 18:10:00] [config] output-omit-bias: false [2022-01-15 18:10:00] [config] overwrite: true [2022-01-15 18:10:00] [config] precision: [2022-01-15 18:10:00] [config] - float32 [2022-01-15 18:10:00] [config] - float32 [2022-01-15 18:10:00] [config] - float32 [2022-01-15 18:10:00] [config] pretrained-model: "" [2022-01-15 18:10:00] [config] quantize-biases: false [2022-01-15 18:10:00] [config] quantize-bits: 0 [2022-01-15 18:10:00] [config] quantize-log-based: false [2022-01-15 18:10:00] [config] quantize-optimization-steps: 0 [2022-01-15 18:10:00] [config] quiet: false [2022-01-15 18:10:00] [config] quiet-translation: false [2022-01-15 18:10:00] [config] relative-paths: false [2022-01-15 18:10:00] [config] right-left: false [2022-01-15 18:10:00] [config] save-freq: 5000 [2022-01-15 18:10:00] [config] seed: 0 [2022-01-15 18:10:00] [config] sentencepiece-alphas: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] sentencepiece-max-lines: 2000000 [2022-01-15 18:10:00] [config] sentencepiece-options: "" [2022-01-15 18:10:00] [config] shuffle: data [2022-01-15 18:10:00] [config] shuffle-in-ram: false [2022-01-15 18:10:00] [config] sigterm: save-and-exit [2022-01-15 18:10:00] [config] skip: false [2022-01-15 18:10:00] [config] sqlite: "" [2022-01-15 18:10:00] [config] sqlite-drop: false [2022-01-15 18:10:00] [config] sync-sgd: false [2022-01-15 18:10:00] [config] tempdir: /tmp [2022-01-15 18:10:00] [config] tied-embeddings: true [2022-01-15 18:10:00] [config] tied-embeddings-all: false [2022-01-15 18:10:00] [config] tied-embeddings-src: false [2022-01-15 18:10:00] [config] train-embedder-rank: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] train-sets: [2022-01-15 18:10:00] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 18:10:00] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 18:10:00] [config] transformer-aan-activation: swish [2022-01-15 18:10:00] [config] transformer-aan-depth: 2 [2022-01-15 18:10:00] [config] transformer-aan-nogate: false [2022-01-15 18:10:00] [config] transformer-decoder-autoreg: self-attention [2022-01-15 18:10:00] [config] transformer-depth-scaling: false [2022-01-15 18:10:00] [config] transformer-dim-aan: 2048 [2022-01-15 18:10:00] [config] transformer-dim-ffn: 2048 [2022-01-15 18:10:00] [config] transformer-dropout: 0.1 [2022-01-15 18:10:00] [config] transformer-dropout-attention: 0 [2022-01-15 18:10:00] [config] transformer-dropout-ffn: 0 [2022-01-15 18:10:00] [config] transformer-ffn-activation: swish [2022-01-15 18:10:00] [config] 
transformer-ffn-depth: 2 [2022-01-15 18:10:00] [config] transformer-guided-alignment-layer: last [2022-01-15 18:10:00] [config] transformer-heads: 8 [2022-01-15 18:10:00] [config] transformer-no-projection: false [2022-01-15 18:10:00] [config] transformer-pool: false [2022-01-15 18:10:00] [config] transformer-postprocess: dan [2022-01-15 18:10:00] [config] transformer-postprocess-emb: d [2022-01-15 18:10:00] [config] transformer-postprocess-top: "" [2022-01-15 18:10:00] [config] transformer-preprocess: "" [2022-01-15 18:10:00] [config] transformer-tied-layers: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] transformer-train-position-embeddings: false [2022-01-15 18:10:00] [config] tsv: false [2022-01-15 18:10:00] [config] tsv-fields: 0 [2022-01-15 18:10:00] [config] type: transformer [2022-01-15 18:10:00] [config] ulr: false [2022-01-15 18:10:00] [config] ulr-dim-emb: 0 [2022-01-15 18:10:00] [config] ulr-dropout: 0 [2022-01-15 18:10:00] [config] ulr-keys-vectors: "" [2022-01-15 18:10:00] [config] ulr-query-vectors: "" [2022-01-15 18:10:00] [config] ulr-softmax-temperature: 1 [2022-01-15 18:10:00] [config] ulr-trainable-transformation: false [2022-01-15 18:10:00] [config] unlikelihood-loss: false [2022-01-15 18:10:00] [config] valid-freq: 5000 [2022-01-15 18:10:00] [config] valid-log: "" [2022-01-15 18:10:00] [config] valid-max-length: 1000 [2022-01-15 18:10:00] [config] valid-metrics: [2022-01-15 18:10:00] [config] - cross-entropy [2022-01-15 18:10:00] [config] valid-mini-batch: 32 [2022-01-15 18:10:00] [config] valid-reset-stalled: false [2022-01-15 18:10:00] [config] valid-script-args: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] valid-script-path: "" [2022-01-15 18:10:00] [config] valid-sets: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] valid-translation-output: "" [2022-01-15 18:10:00] [config] vocabs: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] word-penalty: 0 [2022-01-15 18:10:00] [config] word-scores: false [2022-01-15 18:10:00] [config] workspace: 10000 [2022-01-15 18:10:00] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 18:10:00] [training] Using single-device training [2022-01-15 18:10:00] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 18:10:00] [data] Vocabularies will be built separately for each file. 
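The command line and config dump above set --learn-rate 0.0003 with --lr-warmup 16000 and --lr-decay-inv-sqrt 16000, and every update reported in the progress lines further down is still inside the warmup window, so the logged L.r. values grow linearly with the update count. A minimal sketch of that schedule, assuming linear warmup followed by inverse-square-root decay after update 16,000 (only the warmup phase is actually observable in this log), reproduces the reported values:

    def lr_schedule(step, base_lr=3e-4, warmup=16000, decay_ref=16000):
        # Linear warmup up to `warmup` updates, then inverse-sqrt decay.
        # The post-warmup form is an assumption; this log never reaches it.
        if step <= warmup:
            return base_lr * step / warmup
        return base_lr * (decay_ref / step) ** 0.5

    # Logged values: Up. 500 -> 9.3750e-06, Up. 1000 -> 1.8750e-05, ...
    for step in (500, 1000, 1500, 2000, 2500):
        print(step, format(lr_schedule(step), ".4e"))

Running this prints 9.3750e-06, 1.8750e-05, 2.8125e-05, 3.7500e-05 and 4.6875e-05, matching the L.r. column of the first five progress reports of this run.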
[2022-01-15 18:10:00] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 18:10:00] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 18:10:00] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 18:10:04] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000.yml [2022-01-15 18:10:04] [data] Setting vocabulary size for input 0 to 14,881 [2022-01-15 18:10:04] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 18:10:04] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 18:10:04] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 18:10:08] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000.yml [2022-01-15 18:10:08] [data] Setting vocabulary size for input 1 to 14,891 [2022-01-15 18:10:08] [comm] Compiled without MPI support. Running as a single process on s470607-gpu [2022-01-15 18:10:08] [batching] Collecting statistics for batch fitting with step size 10 [2022-01-15 18:10:08] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 18:10:08] [logits] Applying loss function for 1 factor(s) [2022-01-15 18:10:08] [memory] Reserving 226 MB, device gpu0 [2022-01-15 18:10:10] [gpu] 16-bit TensorCores enabled for float32 matrix operations [2022-01-15 18:10:10] [memory] Reserving 226 MB, device gpu0 [2022-01-15 18:10:20] [batching] Done. Typical MB size is 10,112 target words [2022-01-15 18:10:21] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 18:10:21] Training started [2022-01-15 18:10:21] [data] Shuffling data [2022-01-15 18:10:21] [data] Done reading 1,514,371 sentences [2022-01-15 18:10:28] [data] Done shuffling 1,514,371 sentences to temp files [2022-01-15 18:10:29] [memory] Reserving 226 MB, device gpu0 [2022-01-15 18:10:30] [memory] Reserving 226 MB, device gpu0 [2022-01-15 18:10:30] [memory] Reserving 453 MB, device gpu0 [2022-01-15 18:10:30] [memory] Reserving 226 MB, device gpu0 [2022-01-15 18:11:53] Ep. 1 : Up. 500 : Sen. 123,531 : Cost 9.24570084 * 4,202,148 @ 7,371 after 4,202,148 : Time 92.22s : 45564.85 words/s : L.r. 9.3750e-06 [2022-01-15 18:13:17] Ep. 1 : Up. 1000 : Sen. 247,289 : Cost 8.30709076 * 4,212,393 @ 9,174 after 8,414,541 : Time 84.57s : 49812.25 words/s : L.r. 1.8750e-05 [2022-01-15 18:14:42] Ep. 1 : Up. 1500 : Sen. 371,249 : Cost 7.93976021 * 4,214,955 @ 8,424 after 12,629,496 : Time 84.95s : 49616.76 words/s : L.r. 2.8125e-05 [2022-01-15 18:16:07] Ep. 1 : Up. 2000 : Sen. 494,238 : Cost 7.66895914 * 4,181,637 @ 6,976 after 16,811,133 : Time 84.54s : 49463.52 words/s : L.r. 3.7500e-05 [2022-01-15 18:17:32] Ep. 1 : Up. 2500 : Sen. 620,120 : Cost 7.33166742 * 4,211,519 @ 8,932 after 21,022,652 : Time 85.05s : 49517.60 words/s : L.r. 4.6875e-05 [2022-01-15 18:18:57] Ep. 1 : Up. 
3000 : Sen. 742,304 : Cost 6.73455620 * 4,224,299 @ 7,037 after 25,246,951 : Time 85.22s : 49569.79 words/s : L.r. 5.6250e-05 [2022-01-15 18:20:22] Ep. 1 : Up. 3500 : Sen. 867,022 : Cost 5.82840490 * 4,186,159 @ 8,820 after 29,433,110 : Time 84.70s : 49423.21 words/s : L.r. 6.5625e-05 [2022-01-15 19:15:06] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 19:15:06] [marian] Running on s470607-gpu as process 4149 with command line: [2022-01-15 19:15:06] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 19:15:06] [config] after: 0e [2022-01-15 19:15:06] [config] after-batches: 0 [2022-01-15 19:15:06] [config] after-epochs: 1 [2022-01-15 19:15:06] [config] all-caps-every: 0 [2022-01-15 19:15:06] [config] allow-unk: false [2022-01-15 19:15:06] [config] authors: false [2022-01-15 19:15:06] [config] beam-size: 6 [2022-01-15 19:15:06] [config] bert-class-symbol: "[CLS]" [2022-01-15 19:15:06] [config] bert-mask-symbol: "[MASK]" [2022-01-15 19:15:06] [config] bert-masking-fraction: 0.15 [2022-01-15 19:15:06] [config] bert-sep-symbol: "[SEP]" [2022-01-15 19:15:06] [config] bert-train-type-embeddings: true [2022-01-15 19:15:06] [config] bert-type-vocab-size: 2 [2022-01-15 19:15:06] [config] build-info: "" [2022-01-15 19:15:06] [config] cite: false [2022-01-15 19:15:06] [config] clip-norm: 5 [2022-01-15 19:15:06] [config] cost-scaling: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] cost-type: ce-sum [2022-01-15 19:15:06] [config] cpu-threads: 0 [2022-01-15 19:15:06] [config] data-weighting: "" [2022-01-15 19:15:06] [config] data-weighting-type: sentence [2022-01-15 19:15:06] [config] dec-cell: gru [2022-01-15 19:15:06] [config] dec-cell-base-depth: 2 [2022-01-15 19:15:06] [config] dec-cell-high-depth: 1 [2022-01-15 19:15:06] [config] dec-depth: 6 [2022-01-15 19:15:06] [config] devices: [2022-01-15 19:15:06] [config] - 0 [2022-01-15 19:15:06] [config] dim-emb: 512 [2022-01-15 19:15:06] [config] dim-rnn: 1024 [2022-01-15 19:15:06] [config] dim-vocabs: [2022-01-15 19:15:06] [config] - 0 [2022-01-15 19:15:06] [config] - 0 [2022-01-15 19:15:06] [config] disp-first: 0 [2022-01-15 19:15:06] [config] disp-freq: 500 [2022-01-15 19:15:06] [config] disp-label-counts: true [2022-01-15 19:15:06] [config] dropout-rnn: 0 [2022-01-15 19:15:06] [config] dropout-src: 0 [2022-01-15 19:15:06] [config] dropout-trg: 0 [2022-01-15 19:15:06] [config] dump-config: "" [2022-01-15 19:15:06] [config] early-stopping: 10 [2022-01-15 19:15:06] [config] embedding-fix-src: false [2022-01-15 19:15:06] [config] embedding-fix-trg: false [2022-01-15 19:15:06] [config] embedding-normalization: false [2022-01-15 19:15:06] [config] embedding-vectors: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] enc-cell: gru [2022-01-15 19:15:06] [config] enc-cell-depth: 1 [2022-01-15 19:15:06] [config] 
enc-depth: 6 [2022-01-15 19:15:06] [config] enc-type: bidirectional [2022-01-15 19:15:06] [config] english-title-case-every: 0 [2022-01-15 19:15:06] [config] exponential-smoothing: 0.0001 [2022-01-15 19:15:06] [config] factor-weight: 1 [2022-01-15 19:15:06] [config] grad-dropping-momentum: 0 [2022-01-15 19:15:06] [config] grad-dropping-rate: 0 [2022-01-15 19:15:06] [config] grad-dropping-warmup: 100 [2022-01-15 19:15:06] [config] gradient-checkpointing: false [2022-01-15 19:15:06] [config] guided-alignment: none [2022-01-15 19:15:06] [config] guided-alignment-cost: mse [2022-01-15 19:15:06] [config] guided-alignment-weight: 0.1 [2022-01-15 19:15:06] [config] ignore-model-config: false [2022-01-15 19:15:06] [config] input-types: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] interpolate-env-vars: false [2022-01-15 19:15:06] [config] keep-best: false [2022-01-15 19:15:06] [config] label-smoothing: 0.1 [2022-01-15 19:15:06] [config] layer-normalization: false [2022-01-15 19:15:06] [config] learn-rate: 0.0003 [2022-01-15 19:15:06] [config] lemma-dim-emb: 0 [2022-01-15 19:15:06] [config] log: /home/wmi/train.log [2022-01-15 19:15:06] [config] log-level: info [2022-01-15 19:15:06] [config] log-time-zone: "" [2022-01-15 19:15:06] [config] logical-epoch: [2022-01-15 19:15:06] [config] - 1e [2022-01-15 19:15:06] [config] - 0 [2022-01-15 19:15:06] [config] lr-decay: 0 [2022-01-15 19:15:06] [config] lr-decay-freq: 50000 [2022-01-15 19:15:06] [config] lr-decay-inv-sqrt: [2022-01-15 19:15:06] [config] - 16000 [2022-01-15 19:15:06] [config] lr-decay-repeat-warmup: false [2022-01-15 19:15:06] [config] lr-decay-reset-optimizer: false [2022-01-15 19:15:06] [config] lr-decay-start: [2022-01-15 19:15:06] [config] - 10 [2022-01-15 19:15:06] [config] - 1 [2022-01-15 19:15:06] [config] lr-decay-strategy: epoch+stalled [2022-01-15 19:15:06] [config] lr-report: true [2022-01-15 19:15:06] [config] lr-warmup: 16000 [2022-01-15 19:15:06] [config] lr-warmup-at-reload: false [2022-01-15 19:15:06] [config] lr-warmup-cycle: false [2022-01-15 19:15:06] [config] lr-warmup-start-rate: 0 [2022-01-15 19:15:06] [config] max-length: 100 [2022-01-15 19:15:06] [config] max-length-crop: false [2022-01-15 19:15:06] [config] max-length-factor: 3 [2022-01-15 19:15:06] [config] maxi-batch: 1000 [2022-01-15 19:15:06] [config] maxi-batch-sort: trg [2022-01-15 19:15:06] [config] mini-batch: 64 [2022-01-15 19:15:06] [config] mini-batch-fit: true [2022-01-15 19:15:06] [config] mini-batch-fit-step: 10 [2022-01-15 19:15:06] [config] mini-batch-track-lr: false [2022-01-15 19:15:06] [config] mini-batch-warmup: 0 [2022-01-15 19:15:06] [config] mini-batch-words: 0 [2022-01-15 19:15:06] [config] mini-batch-words-ref: 0 [2022-01-15 19:15:06] [config] model: model.npz [2022-01-15 19:15:06] [config] multi-loss-type: sum [2022-01-15 19:15:06] [config] multi-node: false [2022-01-15 19:15:06] [config] multi-node-overlap: true [2022-01-15 19:15:06] [config] n-best: false [2022-01-15 19:15:06] [config] no-nccl: false [2022-01-15 19:15:06] [config] no-reload: false [2022-01-15 19:15:06] [config] no-restore-corpus: false [2022-01-15 19:15:06] [config] normalize: 0.6 [2022-01-15 19:15:06] [config] normalize-gradient: false [2022-01-15 19:15:06] [config] num-devices: 0 [2022-01-15 19:15:06] [config] optimizer: adam [2022-01-15 19:15:06] [config] optimizer-delay: 1 [2022-01-15 19:15:06] [config] optimizer-params: [2022-01-15 19:15:06] [config] - 0.9 [2022-01-15 19:15:06] [config] - 0.98 [2022-01-15 19:15:06] [config] - 1e-09 [2022-01-15 
19:15:06] [config] output-omit-bias: false [2022-01-15 19:15:06] [config] overwrite: true [2022-01-15 19:15:06] [config] precision: [2022-01-15 19:15:06] [config] - float32 [2022-01-15 19:15:06] [config] - float32 [2022-01-15 19:15:06] [config] - float32 [2022-01-15 19:15:06] [config] pretrained-model: "" [2022-01-15 19:15:06] [config] quantize-biases: false [2022-01-15 19:15:06] [config] quantize-bits: 0 [2022-01-15 19:15:06] [config] quantize-log-based: false [2022-01-15 19:15:06] [config] quantize-optimization-steps: 0 [2022-01-15 19:15:06] [config] quiet: false [2022-01-15 19:15:06] [config] quiet-translation: false [2022-01-15 19:15:06] [config] relative-paths: false [2022-01-15 19:15:06] [config] right-left: false [2022-01-15 19:15:06] [config] save-freq: 5000 [2022-01-15 19:15:06] [config] seed: 0 [2022-01-15 19:15:06] [config] sentencepiece-alphas: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] sentencepiece-max-lines: 2000000 [2022-01-15 19:15:06] [config] sentencepiece-options: "" [2022-01-15 19:15:06] [config] shuffle: data [2022-01-15 19:15:06] [config] shuffle-in-ram: false [2022-01-15 19:15:06] [config] sigterm: save-and-exit [2022-01-15 19:15:06] [config] skip: false [2022-01-15 19:15:06] [config] sqlite: "" [2022-01-15 19:15:06] [config] sqlite-drop: false [2022-01-15 19:15:06] [config] sync-sgd: false [2022-01-15 19:15:06] [config] tempdir: /tmp [2022-01-15 19:15:06] [config] tied-embeddings: true [2022-01-15 19:15:06] [config] tied-embeddings-all: false [2022-01-15 19:15:06] [config] tied-embeddings-src: false [2022-01-15 19:15:06] [config] train-embedder-rank: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] train-sets: [2022-01-15 19:15:06] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 19:15:06] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 19:15:06] [config] transformer-aan-activation: swish [2022-01-15 19:15:06] [config] transformer-aan-depth: 2 [2022-01-15 19:15:06] [config] transformer-aan-nogate: false [2022-01-15 19:15:06] [config] transformer-decoder-autoreg: self-attention [2022-01-15 19:15:06] [config] transformer-depth-scaling: false [2022-01-15 19:15:06] [config] transformer-dim-aan: 2048 [2022-01-15 19:15:06] [config] transformer-dim-ffn: 2048 [2022-01-15 19:15:06] [config] transformer-dropout: 0.1 [2022-01-15 19:15:06] [config] transformer-dropout-attention: 0 [2022-01-15 19:15:06] [config] transformer-dropout-ffn: 0 [2022-01-15 19:15:06] [config] transformer-ffn-activation: swish [2022-01-15 19:15:06] [config] transformer-ffn-depth: 2 [2022-01-15 19:15:06] [config] transformer-guided-alignment-layer: last [2022-01-15 19:15:06] [config] transformer-heads: 8 [2022-01-15 19:15:06] [config] transformer-no-projection: false [2022-01-15 19:15:06] [config] transformer-pool: false [2022-01-15 19:15:06] [config] transformer-postprocess: dan [2022-01-15 19:15:06] [config] transformer-postprocess-emb: d [2022-01-15 19:15:06] [config] transformer-postprocess-top: "" [2022-01-15 19:15:06] [config] transformer-preprocess: "" [2022-01-15 19:15:06] [config] transformer-tied-layers: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] transformer-train-position-embeddings: false [2022-01-15 19:15:06] [config] tsv: false [2022-01-15 19:15:06] [config] tsv-fields: 0 [2022-01-15 19:15:06] [config] type: transformer [2022-01-15 19:15:06] [config] ulr: false [2022-01-15 19:15:06] [config] ulr-dim-emb: 0 [2022-01-15 19:15:06] [config] 
ulr-dropout: 0 [2022-01-15 19:15:06] [config] ulr-keys-vectors: "" [2022-01-15 19:15:06] [config] ulr-query-vectors: "" [2022-01-15 19:15:06] [config] ulr-softmax-temperature: 1 [2022-01-15 19:15:06] [config] ulr-trainable-transformation: false [2022-01-15 19:15:06] [config] unlikelihood-loss: false [2022-01-15 19:15:06] [config] valid-freq: 5000 [2022-01-15 19:15:06] [config] valid-log: "" [2022-01-15 19:15:06] [config] valid-max-length: 1000 [2022-01-15 19:15:06] [config] valid-metrics: [2022-01-15 19:15:06] [config] - cross-entropy [2022-01-15 19:15:06] [config] valid-mini-batch: 32 [2022-01-15 19:15:06] [config] valid-reset-stalled: false [2022-01-15 19:15:06] [config] valid-script-args: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] valid-script-path: "" [2022-01-15 19:15:06] [config] valid-sets: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] valid-translation-output: "" [2022-01-15 19:15:06] [config] vocabs: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] word-penalty: 0 [2022-01-15 19:15:06] [config] word-scores: false [2022-01-15 19:15:06] [config] workspace: 10000 [2022-01-15 19:15:06] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 19:15:06] [training] Using single-device training [2022-01-15 19:15:06] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 19:15:06] [data] Vocabularies will be built separately for each file. [2022-01-15 19:15:06] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 19:15:06] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 19:15:06] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 19:15:10] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000.yml [2022-01-15 19:15:10] [data] Setting vocabulary size for input 0 to 14,891 [2022-01-15 19:15:10] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 19:15:10] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 19:15:10] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 19:15:14] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000.yml [2022-01-15 19:15:14] [data] Setting vocabulary size for input 1 to 14,901 [2022-01-15 19:15:14] [comm] Compiled without MPI support. 
Running as a single process on s470607-gpu [2022-01-15 19:15:14] [batching] Collecting statistics for batch fitting with step size 10 [2022-01-15 19:15:15] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 19:15:15] [logits] Applying loss function for 1 factor(s) [2022-01-15 19:15:15] [memory] Reserving 226 MB, device gpu0 [2022-01-15 19:15:16] [gpu] 16-bit TensorCores enabled for float32 matrix operations [2022-01-15 19:15:16] [memory] Reserving 226 MB, device gpu0 [2022-01-15 19:15:27] [batching] Done. Typical MB size is 10,112 target words [2022-01-15 19:15:27] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 19:15:27] Training started [2022-01-15 19:15:27] [data] Shuffling data [2022-01-15 19:15:28] [data] Done reading 1,514,371 sentences [2022-01-15 19:15:35] [data] Done shuffling 1,514,371 sentences to temp files [2022-01-15 19:15:36] [memory] Reserving 226 MB, device gpu0 [2022-01-15 19:15:36] [memory] Reserving 226 MB, device gpu0 [2022-01-15 19:15:36] [memory] Reserving 453 MB, device gpu0 [2022-01-15 19:15:36] [memory] Reserving 226 MB, device gpu0 [2022-01-15 19:17:00] Ep. 1 : Up. 500 : Sen. 124,245 : Cost 9.25954151 * 4,225,979 @ 4,563 after 4,225,979 : Time 92.46s : 45708.10 words/s : L.r. 9.3750e-06 [2022-01-15 19:18:09] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 19:18:09] [marian] Running on s470607-gpu as process 4189 with command line: [2022-01-15 19:18:09] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 19:18:09] [config] after: 0e [2022-01-15 19:18:09] [config] after-batches: 0 [2022-01-15 19:18:09] [config] after-epochs: 1 [2022-01-15 19:18:09] [config] all-caps-every: 0 [2022-01-15 19:18:09] [config] allow-unk: false [2022-01-15 19:18:09] [config] authors: false [2022-01-15 19:18:09] [config] beam-size: 6 [2022-01-15 19:18:09] [config] bert-class-symbol: "[CLS]" [2022-01-15 19:18:09] [config] bert-mask-symbol: "[MASK]" [2022-01-15 19:18:09] [config] bert-masking-fraction: 0.15 [2022-01-15 19:18:09] [config] bert-sep-symbol: "[SEP]" [2022-01-15 19:18:09] [config] bert-train-type-embeddings: true [2022-01-15 19:18:09] [config] bert-type-vocab-size: 2 [2022-01-15 19:18:09] [config] build-info: "" [2022-01-15 19:18:09] [config] cite: false [2022-01-15 19:18:09] [config] clip-norm: 5 [2022-01-15 19:18:09] [config] cost-scaling: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] cost-type: ce-sum [2022-01-15 19:18:09] [config] cpu-threads: 0 [2022-01-15 19:18:09] [config] data-weighting: "" [2022-01-15 19:18:09] [config] data-weighting-type: sentence [2022-01-15 19:18:09] [config] dec-cell: gru [2022-01-15 19:18:09] [config] dec-cell-base-depth: 2 [2022-01-15 19:18:09] [config] dec-cell-high-depth: 1 [2022-01-15 19:18:09] [config] dec-depth: 6 [2022-01-15 19:18:09] [config] devices: [2022-01-15 
19:18:09] [config] - 0 [2022-01-15 19:18:09] [config] dim-emb: 512 [2022-01-15 19:18:09] [config] dim-rnn: 1024 [2022-01-15 19:18:09] [config] dim-vocabs: [2022-01-15 19:18:09] [config] - 0 [2022-01-15 19:18:09] [config] - 0 [2022-01-15 19:18:09] [config] disp-first: 0 [2022-01-15 19:18:09] [config] disp-freq: 500 [2022-01-15 19:18:09] [config] disp-label-counts: true [2022-01-15 19:18:09] [config] dropout-rnn: 0 [2022-01-15 19:18:09] [config] dropout-src: 0 [2022-01-15 19:18:09] [config] dropout-trg: 0 [2022-01-15 19:18:09] [config] dump-config: "" [2022-01-15 19:18:09] [config] early-stopping: 10 [2022-01-15 19:18:09] [config] embedding-fix-src: false [2022-01-15 19:18:09] [config] embedding-fix-trg: false [2022-01-15 19:18:09] [config] embedding-normalization: false [2022-01-15 19:18:09] [config] embedding-vectors: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] enc-cell: gru [2022-01-15 19:18:09] [config] enc-cell-depth: 1 [2022-01-15 19:18:09] [config] enc-depth: 6 [2022-01-15 19:18:09] [config] enc-type: bidirectional [2022-01-15 19:18:09] [config] english-title-case-every: 0 [2022-01-15 19:18:09] [config] exponential-smoothing: 0.0001 [2022-01-15 19:18:09] [config] factor-weight: 1 [2022-01-15 19:18:09] [config] grad-dropping-momentum: 0 [2022-01-15 19:18:09] [config] grad-dropping-rate: 0 [2022-01-15 19:18:09] [config] grad-dropping-warmup: 100 [2022-01-15 19:18:09] [config] gradient-checkpointing: false [2022-01-15 19:18:09] [config] guided-alignment: none [2022-01-15 19:18:09] [config] guided-alignment-cost: mse [2022-01-15 19:18:09] [config] guided-alignment-weight: 0.1 [2022-01-15 19:18:09] [config] ignore-model-config: false [2022-01-15 19:18:09] [config] input-types: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] interpolate-env-vars: false [2022-01-15 19:18:09] [config] keep-best: false [2022-01-15 19:18:09] [config] label-smoothing: 0.1 [2022-01-15 19:18:09] [config] layer-normalization: false [2022-01-15 19:18:09] [config] learn-rate: 0.0003 [2022-01-15 19:18:09] [config] lemma-dim-emb: 0 [2022-01-15 19:18:09] [config] log: /home/wmi/train.log [2022-01-15 19:18:09] [config] log-level: info [2022-01-15 19:18:09] [config] log-time-zone: "" [2022-01-15 19:18:09] [config] logical-epoch: [2022-01-15 19:18:09] [config] - 1e [2022-01-15 19:18:09] [config] - 0 [2022-01-15 19:18:09] [config] lr-decay: 0 [2022-01-15 19:18:09] [config] lr-decay-freq: 50000 [2022-01-15 19:18:09] [config] lr-decay-inv-sqrt: [2022-01-15 19:18:09] [config] - 16000 [2022-01-15 19:18:09] [config] lr-decay-repeat-warmup: false [2022-01-15 19:18:09] [config] lr-decay-reset-optimizer: false [2022-01-15 19:18:09] [config] lr-decay-start: [2022-01-15 19:18:09] [config] - 10 [2022-01-15 19:18:09] [config] - 1 [2022-01-15 19:18:09] [config] lr-decay-strategy: epoch+stalled [2022-01-15 19:18:09] [config] lr-report: true [2022-01-15 19:18:09] [config] lr-warmup: 16000 [2022-01-15 19:18:09] [config] lr-warmup-at-reload: false [2022-01-15 19:18:09] [config] lr-warmup-cycle: false [2022-01-15 19:18:09] [config] lr-warmup-start-rate: 0 [2022-01-15 19:18:09] [config] max-length: 100 [2022-01-15 19:18:09] [config] max-length-crop: false [2022-01-15 19:18:09] [config] max-length-factor: 3 [2022-01-15 19:18:09] [config] maxi-batch: 1000 [2022-01-15 19:18:09] [config] maxi-batch-sort: trg [2022-01-15 19:18:09] [config] mini-batch: 64 [2022-01-15 19:18:09] [config] mini-batch-fit: true [2022-01-15 19:18:09] [config] mini-batch-fit-step: 10 [2022-01-15 19:18:09] [config] mini-batch-track-lr: 
false [2022-01-15 19:18:09] [config] mini-batch-warmup: 0 [2022-01-15 19:18:09] [config] mini-batch-words: 0 [2022-01-15 19:18:09] [config] mini-batch-words-ref: 0 [2022-01-15 19:18:09] [config] model: model.npz [2022-01-15 19:18:09] [config] multi-loss-type: sum [2022-01-15 19:18:09] [config] multi-node: false [2022-01-15 19:18:09] [config] multi-node-overlap: true [2022-01-15 19:18:09] [config] n-best: false [2022-01-15 19:18:09] [config] no-nccl: false [2022-01-15 19:18:09] [config] no-reload: false [2022-01-15 19:18:09] [config] no-restore-corpus: false [2022-01-15 19:18:09] [config] normalize: 0.6 [2022-01-15 19:18:09] [config] normalize-gradient: false [2022-01-15 19:18:09] [config] num-devices: 0 [2022-01-15 19:18:09] [config] optimizer: adam [2022-01-15 19:18:09] [config] optimizer-delay: 1 [2022-01-15 19:18:09] [config] optimizer-params: [2022-01-15 19:18:09] [config] - 0.9 [2022-01-15 19:18:09] [config] - 0.98 [2022-01-15 19:18:09] [config] - 1e-09 [2022-01-15 19:18:09] [config] output-omit-bias: false [2022-01-15 19:18:09] [config] overwrite: true [2022-01-15 19:18:09] [config] precision: [2022-01-15 19:18:09] [config] - float32 [2022-01-15 19:18:09] [config] - float32 [2022-01-15 19:18:09] [config] - float32 [2022-01-15 19:18:09] [config] pretrained-model: "" [2022-01-15 19:18:09] [config] quantize-biases: false [2022-01-15 19:18:09] [config] quantize-bits: 0 [2022-01-15 19:18:09] [config] quantize-log-based: false [2022-01-15 19:18:09] [config] quantize-optimization-steps: 0 [2022-01-15 19:18:09] [config] quiet: false [2022-01-15 19:18:09] [config] quiet-translation: false [2022-01-15 19:18:09] [config] relative-paths: false [2022-01-15 19:18:09] [config] right-left: false [2022-01-15 19:18:09] [config] save-freq: 5000 [2022-01-15 19:18:09] [config] seed: 0 [2022-01-15 19:18:09] [config] sentencepiece-alphas: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] sentencepiece-max-lines: 2000000 [2022-01-15 19:18:09] [config] sentencepiece-options: "" [2022-01-15 19:18:09] [config] shuffle: data [2022-01-15 19:18:09] [config] shuffle-in-ram: false [2022-01-15 19:18:09] [config] sigterm: save-and-exit [2022-01-15 19:18:09] [config] skip: false [2022-01-15 19:18:09] [config] sqlite: "" [2022-01-15 19:18:09] [config] sqlite-drop: false [2022-01-15 19:18:09] [config] sync-sgd: false [2022-01-15 19:18:09] [config] tempdir: /tmp [2022-01-15 19:18:09] [config] tied-embeddings: true [2022-01-15 19:18:09] [config] tied-embeddings-all: false [2022-01-15 19:18:09] [config] tied-embeddings-src: false [2022-01-15 19:18:09] [config] train-embedder-rank: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] train-sets: [2022-01-15 19:18:09] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 19:18:09] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 19:18:09] [config] transformer-aan-activation: swish [2022-01-15 19:18:09] [config] transformer-aan-depth: 2 [2022-01-15 19:18:09] [config] transformer-aan-nogate: false [2022-01-15 19:18:09] [config] transformer-decoder-autoreg: self-attention [2022-01-15 19:18:09] [config] transformer-depth-scaling: false [2022-01-15 19:18:09] [config] transformer-dim-aan: 2048 [2022-01-15 19:18:09] [config] transformer-dim-ffn: 2048 [2022-01-15 19:18:09] [config] transformer-dropout: 0.1 [2022-01-15 19:18:09] [config] transformer-dropout-attention: 0 [2022-01-15 19:18:09] [config] transformer-dropout-ffn: 0 [2022-01-15 19:18:09] [config] 
transformer-ffn-activation: swish [2022-01-15 19:18:09] [config] transformer-ffn-depth: 2 [2022-01-15 19:18:09] [config] transformer-guided-alignment-layer: last [2022-01-15 19:18:09] [config] transformer-heads: 8 [2022-01-15 19:18:09] [config] transformer-no-projection: false [2022-01-15 19:18:09] [config] transformer-pool: false [2022-01-15 19:18:09] [config] transformer-postprocess: dan [2022-01-15 19:18:09] [config] transformer-postprocess-emb: d [2022-01-15 19:18:09] [config] transformer-postprocess-top: "" [2022-01-15 19:18:09] [config] transformer-preprocess: "" [2022-01-15 19:18:09] [config] transformer-tied-layers: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] transformer-train-position-embeddings: false [2022-01-15 19:18:09] [config] tsv: false [2022-01-15 19:18:09] [config] tsv-fields: 0 [2022-01-15 19:18:09] [config] type: transformer [2022-01-15 19:18:09] [config] ulr: false [2022-01-15 19:18:09] [config] ulr-dim-emb: 0 [2022-01-15 19:18:09] [config] ulr-dropout: 0 [2022-01-15 19:18:09] [config] ulr-keys-vectors: "" [2022-01-15 19:18:09] [config] ulr-query-vectors: "" [2022-01-15 19:18:09] [config] ulr-softmax-temperature: 1 [2022-01-15 19:18:09] [config] ulr-trainable-transformation: false [2022-01-15 19:18:09] [config] unlikelihood-loss: false [2022-01-15 19:18:09] [config] valid-freq: 5000 [2022-01-15 19:18:09] [config] valid-log: "" [2022-01-15 19:18:09] [config] valid-max-length: 1000 [2022-01-15 19:18:09] [config] valid-metrics: [2022-01-15 19:18:09] [config] - cross-entropy [2022-01-15 19:18:09] [config] valid-mini-batch: 32 [2022-01-15 19:18:09] [config] valid-reset-stalled: false [2022-01-15 19:18:09] [config] valid-script-args: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] valid-script-path: "" [2022-01-15 19:18:09] [config] valid-sets: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] valid-translation-output: "" [2022-01-15 19:18:09] [config] vocabs: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] word-penalty: 0 [2022-01-15 19:18:09] [config] word-scores: false [2022-01-15 19:18:09] [config] workspace: 10000 [2022-01-15 19:18:09] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 19:18:09] [training] Using single-device training [2022-01-15 19:18:09] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 19:18:09] [data] Vocabularies will be built separately for each file. [2022-01-15 19:18:09] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 19:18:09] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000.yml [2022-01-15 19:18:09] [data] Setting vocabulary size for input 0 to 14,891 [2022-01-15 19:18:09] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 19:18:09] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000.yml [2022-01-15 19:18:09] [data] Setting vocabulary size for input 1 to 14,901 [2022-01-15 19:18:09] [comm] Compiled without MPI support. 
Running as a single process on s470607-gpu [2022-01-15 19:18:09] [batching] Collecting statistics for batch fitting with step size 10 [2022-01-15 19:18:09] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 19:18:09] [logits] Applying loss function for 1 factor(s) [2022-01-15 19:18:09] [memory] Reserving 226 MB, device gpu0 [2022-01-15 19:18:10] [gpu] 16-bit TensorCores enabled for float32 matrix operations [2022-01-15 19:18:10] [memory] Reserving 226 MB, device gpu0 [2022-01-15 19:18:21] [batching] Done. Typical MB size is 10,112 target words [2022-01-15 19:18:21] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 19:18:21] Training started [2022-01-15 19:18:21] [data] Shuffling data [2022-01-15 19:18:22] [data] Done reading 1,514,371 sentences [2022-01-17 14:28:25] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-17 14:28:25] [marian] Running on s470607-gpu as process 22804 with command line: [2022-01-17 14:28:25] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/train/in.tsv.32000 /home/wmi/mt-summit-corpora/train/expected.tsv.32000 --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-17 14:28:25] [config] after: 0e [2022-01-17 14:28:25] [config] after-batches: 0 [2022-01-17 14:28:25] [config] after-epochs: 1 [2022-01-17 14:28:25] [config] all-caps-every: 0 [2022-01-17 14:28:25] [config] allow-unk: false [2022-01-17 14:28:25] [config] authors: false [2022-01-17 14:28:25] [config] beam-size: 6 [2022-01-17 14:28:25] [config] bert-class-symbol: "[CLS]" [2022-01-17 14:28:25] [config] bert-mask-symbol: "[MASK]" [2022-01-17 14:28:25] [config] bert-masking-fraction: 0.15 [2022-01-17 14:28:25] [config] bert-sep-symbol: "[SEP]" [2022-01-17 14:28:25] [config] bert-train-type-embeddings: true [2022-01-17 14:28:25] [config] bert-type-vocab-size: 2 [2022-01-17 14:28:25] [config] build-info: "" [2022-01-17 14:28:25] [config] cite: false [2022-01-17 14:28:25] [config] clip-norm: 5 [2022-01-17 14:28:25] [config] cost-scaling: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] cost-type: ce-sum [2022-01-17 14:28:25] [config] cpu-threads: 0 [2022-01-17 14:28:25] [config] data-weighting: "" [2022-01-17 14:28:25] [config] data-weighting-type: sentence [2022-01-17 14:28:25] [config] dec-cell: gru [2022-01-17 14:28:25] [config] dec-cell-base-depth: 2 [2022-01-17 14:28:25] [config] dec-cell-high-depth: 1 [2022-01-17 14:28:25] [config] dec-depth: 6 [2022-01-17 14:28:25] [config] devices: [2022-01-17 14:28:25] [config] - 0 [2022-01-17 14:28:25] [config] dim-emb: 512 [2022-01-17 14:28:25] [config] dim-rnn: 1024 [2022-01-17 14:28:25] [config] dim-vocabs: [2022-01-17 14:28:25] [config] - 0 [2022-01-17 14:28:25] [config] - 0 [2022-01-17 14:28:25] [config] disp-first: 0 [2022-01-17 14:28:25] [config] disp-freq: 500 [2022-01-17 14:28:25] [config] disp-label-counts: true [2022-01-17 14:28:25] [config] dropout-rnn: 0 [2022-01-17 14:28:25] [config] dropout-src: 0 [2022-01-17 14:28:25] [config] dropout-trg: 0 [2022-01-17 
14:28:25] [config] dump-config: "" [2022-01-17 14:28:25] [config] early-stopping: 10 [2022-01-17 14:28:25] [config] embedding-fix-src: false [2022-01-17 14:28:25] [config] embedding-fix-trg: false [2022-01-17 14:28:25] [config] embedding-normalization: false [2022-01-17 14:28:25] [config] embedding-vectors: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] enc-cell: gru [2022-01-17 14:28:25] [config] enc-cell-depth: 1 [2022-01-17 14:28:25] [config] enc-depth: 6 [2022-01-17 14:28:25] [config] enc-type: bidirectional [2022-01-17 14:28:25] [config] english-title-case-every: 0 [2022-01-17 14:28:25] [config] exponential-smoothing: 0.0001 [2022-01-17 14:28:25] [config] factor-weight: 1 [2022-01-17 14:28:25] [config] grad-dropping-momentum: 0 [2022-01-17 14:28:25] [config] grad-dropping-rate: 0 [2022-01-17 14:28:25] [config] grad-dropping-warmup: 100 [2022-01-17 14:28:25] [config] gradient-checkpointing: false [2022-01-17 14:28:25] [config] guided-alignment: none [2022-01-17 14:28:25] [config] guided-alignment-cost: mse [2022-01-17 14:28:25] [config] guided-alignment-weight: 0.1 [2022-01-17 14:28:25] [config] ignore-model-config: false [2022-01-17 14:28:25] [config] input-types: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] interpolate-env-vars: false [2022-01-17 14:28:25] [config] keep-best: false [2022-01-17 14:28:25] [config] label-smoothing: 0.1 [2022-01-17 14:28:25] [config] layer-normalization: false [2022-01-17 14:28:25] [config] learn-rate: 0.0003 [2022-01-17 14:28:25] [config] lemma-dim-emb: 0 [2022-01-17 14:28:25] [config] log: /home/wmi/train.log [2022-01-17 14:28:25] [config] log-level: info [2022-01-17 14:28:25] [config] log-time-zone: "" [2022-01-17 14:28:25] [config] logical-epoch: [2022-01-17 14:28:25] [config] - 1e [2022-01-17 14:28:25] [config] - 0 [2022-01-17 14:28:25] [config] lr-decay: 0 [2022-01-17 14:28:25] [config] lr-decay-freq: 50000 [2022-01-17 14:28:25] [config] lr-decay-inv-sqrt: [2022-01-17 14:28:25] [config] - 16000 [2022-01-17 14:28:25] [config] lr-decay-repeat-warmup: false [2022-01-17 14:28:25] [config] lr-decay-reset-optimizer: false [2022-01-17 14:28:25] [config] lr-decay-start: [2022-01-17 14:28:25] [config] - 10 [2022-01-17 14:28:25] [config] - 1 [2022-01-17 14:28:25] [config] lr-decay-strategy: epoch+stalled [2022-01-17 14:28:25] [config] lr-report: true [2022-01-17 14:28:25] [config] lr-warmup: 16000 [2022-01-17 14:28:25] [config] lr-warmup-at-reload: false [2022-01-17 14:28:25] [config] lr-warmup-cycle: false [2022-01-17 14:28:25] [config] lr-warmup-start-rate: 0 [2022-01-17 14:28:25] [config] max-length: 100 [2022-01-17 14:28:25] [config] max-length-crop: false [2022-01-17 14:28:25] [config] max-length-factor: 3 [2022-01-17 14:28:25] [config] maxi-batch: 1000 [2022-01-17 14:28:25] [config] maxi-batch-sort: trg [2022-01-17 14:28:25] [config] mini-batch: 64 [2022-01-17 14:28:25] [config] mini-batch-fit: true [2022-01-17 14:28:25] [config] mini-batch-fit-step: 10 [2022-01-17 14:28:25] [config] mini-batch-track-lr: false [2022-01-17 14:28:25] [config] mini-batch-warmup: 0 [2022-01-17 14:28:25] [config] mini-batch-words: 0 [2022-01-17 14:28:25] [config] mini-batch-words-ref: 0 [2022-01-17 14:28:25] [config] model: model.npz [2022-01-17 14:28:25] [config] multi-loss-type: sum [2022-01-17 14:28:25] [config] multi-node: false [2022-01-17 14:28:25] [config] multi-node-overlap: true [2022-01-17 14:28:25] [config] n-best: false [2022-01-17 14:28:25] [config] no-nccl: false [2022-01-17 14:28:25] [config] no-reload: false [2022-01-17 
14:28:25] [config] no-restore-corpus: false [2022-01-17 14:28:25] [config] normalize: 0.6 [2022-01-17 14:28:25] [config] normalize-gradient: false [2022-01-17 14:28:25] [config] num-devices: 0 [2022-01-17 14:28:25] [config] optimizer: adam [2022-01-17 14:28:25] [config] optimizer-delay: 1 [2022-01-17 14:28:25] [config] optimizer-params: [2022-01-17 14:28:25] [config] - 0.9 [2022-01-17 14:28:25] [config] - 0.98 [2022-01-17 14:28:25] [config] - 1e-09 [2022-01-17 14:28:25] [config] output-omit-bias: false [2022-01-17 14:28:25] [config] overwrite: true [2022-01-17 14:28:25] [config] precision: [2022-01-17 14:28:25] [config] - float32 [2022-01-17 14:28:25] [config] - float32 [2022-01-17 14:28:25] [config] - float32 [2022-01-17 14:28:25] [config] pretrained-model: "" [2022-01-17 14:28:25] [config] quantize-biases: false [2022-01-17 14:28:25] [config] quantize-bits: 0 [2022-01-17 14:28:25] [config] quantize-log-based: false [2022-01-17 14:28:25] [config] quantize-optimization-steps: 0 [2022-01-17 14:28:25] [config] quiet: false [2022-01-17 14:28:25] [config] quiet-translation: false [2022-01-17 14:28:25] [config] relative-paths: false [2022-01-17 14:28:25] [config] right-left: false [2022-01-17 14:28:25] [config] save-freq: 5000 [2022-01-17 14:28:25] [config] seed: 0 [2022-01-17 14:28:25] [config] sentencepiece-alphas: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] sentencepiece-max-lines: 2000000 [2022-01-17 14:28:25] [config] sentencepiece-options: "" [2022-01-17 14:28:25] [config] shuffle: data [2022-01-17 14:28:25] [config] shuffle-in-ram: false [2022-01-17 14:28:25] [config] sigterm: save-and-exit [2022-01-17 14:28:25] [config] skip: false [2022-01-17 14:28:25] [config] sqlite: "" [2022-01-17 14:28:25] [config] sqlite-drop: false [2022-01-17 14:28:25] [config] sync-sgd: false [2022-01-17 14:28:25] [config] tempdir: /tmp [2022-01-17 14:28:25] [config] tied-embeddings: true [2022-01-17 14:28:25] [config] tied-embeddings-all: false [2022-01-17 14:28:25] [config] tied-embeddings-src: false [2022-01-17 14:28:25] [config] train-embedder-rank: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] train-sets: [2022-01-17 14:28:25] [config] - /home/wmi/mt-summit-corpora/train/in.tsv.32000 [2022-01-17 14:28:25] [config] - /home/wmi/mt-summit-corpora/train/expected.tsv.32000 [2022-01-17 14:28:25] [config] transformer-aan-activation: swish [2022-01-17 14:28:25] [config] transformer-aan-depth: 2 [2022-01-17 14:28:25] [config] transformer-aan-nogate: false [2022-01-17 14:28:25] [config] transformer-decoder-autoreg: self-attention [2022-01-17 14:28:25] [config] transformer-depth-scaling: false [2022-01-17 14:28:25] [config] transformer-dim-aan: 2048 [2022-01-17 14:28:25] [config] transformer-dim-ffn: 2048 [2022-01-17 14:28:25] [config] transformer-dropout: 0.1 [2022-01-17 14:28:25] [config] transformer-dropout-attention: 0 [2022-01-17 14:28:25] [config] transformer-dropout-ffn: 0 [2022-01-17 14:28:25] [config] transformer-ffn-activation: swish [2022-01-17 14:28:25] [config] transformer-ffn-depth: 2 [2022-01-17 14:28:25] [config] transformer-guided-alignment-layer: last [2022-01-17 14:28:25] [config] transformer-heads: 8 [2022-01-17 14:28:25] [config] transformer-no-projection: false [2022-01-17 14:28:25] [config] transformer-pool: false [2022-01-17 14:28:25] [config] transformer-postprocess: dan [2022-01-17 14:28:25] [config] transformer-postprocess-emb: d [2022-01-17 14:28:25] [config] transformer-postprocess-top: "" [2022-01-17 14:28:25] [config] transformer-preprocess: "" 
[2022-01-17 14:28:25] [config] transformer-tied-layers: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] transformer-train-position-embeddings: false [2022-01-17 14:28:25] [config] tsv: false [2022-01-17 14:28:25] [config] tsv-fields: 0 [2022-01-17 14:28:25] [config] type: transformer [2022-01-17 14:28:25] [config] ulr: false [2022-01-17 14:28:25] [config] ulr-dim-emb: 0 [2022-01-17 14:28:25] [config] ulr-dropout: 0 [2022-01-17 14:28:25] [config] ulr-keys-vectors: "" [2022-01-17 14:28:25] [config] ulr-query-vectors: "" [2022-01-17 14:28:25] [config] ulr-softmax-temperature: 1 [2022-01-17 14:28:25] [config] ulr-trainable-transformation: false [2022-01-17 14:28:25] [config] unlikelihood-loss: false [2022-01-17 14:28:25] [config] valid-freq: 5000 [2022-01-17 14:28:25] [config] valid-log: "" [2022-01-17 14:28:25] [config] valid-max-length: 1000 [2022-01-17 14:28:25] [config] valid-metrics: [2022-01-17 14:28:25] [config] - cross-entropy [2022-01-17 14:28:25] [config] valid-mini-batch: 32 [2022-01-17 14:28:25] [config] valid-reset-stalled: false [2022-01-17 14:28:25] [config] valid-script-args: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] valid-script-path: "" [2022-01-17 14:28:25] [config] valid-sets: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] valid-translation-output: "" [2022-01-17 14:28:25] [config] vocabs: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] word-penalty: 0 [2022-01-17 14:28:25] [config] word-scores: false [2022-01-17 14:28:25] [config] workspace: 10000 [2022-01-17 14:28:25] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-17 14:28:25] [training] Using single-device training [2022-01-17 14:28:25] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-17 14:28:25] [data] Vocabularies will be built separately for each file. [2022-01-17 14:28:25] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/train/in.tsv.32000 [2022-01-17 14:28:25] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/mt-summit-corpora/train/in.tsv.32000 [2022-01-17 14:28:25] [data] Creating vocabulary /home/wmi/mt-summit-corpora/train/in.tsv.32000.yml from /home/wmi/mt-summit-corpora/train/in.tsv.32000 [2022-01-17 14:28:33] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/train/in.tsv.32000.yml [2022-01-17 14:28:33] [data] Setting vocabulary size for input 0 to 18,703 [2022-01-17 14:28:33] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/train/expected.tsv.32000 [2022-01-17 14:28:33] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/mt-summit-corpora/train/expected.tsv.32000 [2022-01-17 14:28:33] [data] Creating vocabulary /home/wmi/mt-summit-corpora/train/expected.tsv.32000.yml from /home/wmi/mt-summit-corpora/train/expected.tsv.32000 [2022-01-17 14:28:41] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/train/expected.tsv.32000.yml [2022-01-17 14:28:41] [data] Setting vocabulary size for input 1 to 27,729 [2022-01-17 14:28:41] [comm] Compiled without MPI support. 
Running as a single process on s470607-gpu [2022-01-17 14:28:41] [batching] Collecting statistics for batch fitting with step size 10 [2022-01-17 14:28:41] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-17 14:28:41] [logits] Applying loss function for 1 factor(s) [2022-01-17 14:28:41] [memory] Reserving 259 MB, device gpu0 [2022-01-17 14:28:43] [gpu] 16-bit TensorCores enabled for float32 matrix operations [2022-01-17 14:28:43] [memory] Reserving 259 MB, device gpu0 [2022-01-17 14:28:53] [batching] Done. Typical MB size is 9,199 target words [2022-01-17 14:28:53] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-17 14:28:53] Training started [2022-01-17 14:28:53] [data] Shuffling data [2022-01-17 14:28:55] [data] Done reading 3,103,819 sentences [2022-01-17 14:29:10] [data] Done shuffling 3,103,819 sentences to temp files [2022-01-17 14:29:11] [memory] Reserving 259 MB, device gpu0 [2022-01-17 14:29:11] [memory] Reserving 259 MB, device gpu0 [2022-01-17 14:29:11] [memory] Reserving 518 MB, device gpu0 [2022-01-17 14:29:11] [memory] Reserving 259 MB, device gpu0 [2022-01-17 14:30:28] Ep. 1 : Up. 500 : Sen. 112,183 : Cost 9.75393009 * 3,315,817 @ 6,804 after 3,315,817 : Time 94.27s : 35172.46 words/s : L.r. 9.3750e-06 [2022-01-17 14:31:45] Ep. 1 : Up. 1000 : Sen. 226,207 : Cost 8.78037071 * 3,301,332 @ 7,600 after 6,617,149 : Time 77.09s : 42824.10 words/s : L.r. 1.8750e-05 [2022-01-17 14:33:01] Ep. 1 : Up. 1500 : Sen. 333,345 : Cost 8.45282364 * 3,230,947 @ 6,045 after 9,848,096 : Time 76.47s : 42251.10 words/s : L.r. 2.8125e-05 [2022-01-17 14:34:19] Ep. 1 : Up. 2000 : Sen. 447,334 : Cost 8.13397980 * 3,314,800 @ 9,540 after 13,162,896 : Time 77.69s : 42668.69 words/s : L.r. 3.7500e-05 [2022-01-17 14:35:37] Ep. 1 : Up. 2500 : Sen. 559,333 : Cost 7.77041626 * 3,308,454 @ 6,374 after 16,471,350 : Time 78.02s : 42405.44 words/s : L.r. 4.6875e-05 [2022-01-17 14:36:54] Ep. 1 : Up. 3000 : Sen. 671,385 : Cost 7.36499834 * 3,279,893 @ 8,930 after 19,751,243 : Time 77.38s : 42386.00 words/s : L.r. 5.6250e-05 [2022-01-17 14:38:13] Ep. 1 : Up. 3500 : Sen. 782,003 : Cost 7.04253387 * 3,350,646 @ 8,610 after 23,101,889 : Time 78.33s : 42776.03 words/s : L.r. 6.5625e-05 [2022-01-17 14:39:30] Ep. 1 : Up. 4000 : Sen. 895,728 : Cost 6.76005077 * 3,280,081 @ 6,688 after 26,381,970 : Time 77.36s : 42399.72 words/s : L.r. 7.5000e-05 [2022-01-17 14:40:47] Ep. 1 : Up. 4500 : Sen. 1,006,825 : Cost 6.52075815 * 3,274,292 @ 8,904 after 29,656,262 : Time 77.27s : 42375.09 words/s : L.r. 8.4375e-05 [2022-01-17 14:42:04] Ep. 1 : Up. 5000 : Sen. 1,117,568 : Cost 6.29208374 * 3,270,540 @ 7,950 after 32,926,802 : Time 77.07s : 42436.39 words/s : L.r. 9.3750e-05 [2022-01-17 14:42:04] Saving model weights and runtime parameters to model.npz.orig.npz [2022-01-17 14:42:05] Saving model weights and runtime parameters to model.npz [2022-01-17 14:42:06] Saving Adam parameters to model.npz.optimizer.npz [2022-01-17 14:43:26] Ep. 1 : Up. 5500 : Sen. 1,229,053 : Cost 6.08049774 * 3,283,922 @ 6,844 after 36,210,724 : Time 81.29s : 40395.28 words/s : L.r. 1.0313e-04 [2022-01-17 14:44:44] Ep. 1 : Up. 6000 : Sen. 1,341,287 : Cost 5.86857510 * 3,331,301 @ 6,475 after 39,542,025 : Time 78.21s : 42593.68 words/s : L.r. 1.1250e-04 [2022-01-17 14:46:02] Ep. 1 : Up. 6500 : Sen. 1,453,894 : Cost 5.67484188 * 3,316,106 @ 6,912 after 42,858,131 : Time 78.19s : 42412.79 words/s : L.r. 1.2188e-04 [2022-01-17 14:47:20] Ep. 1 : Up. 7000 : Sen. 
1,566,550 : Cost 5.44415712 * 3,295,317 @ 5,587 after 46,153,448 : Time 77.57s : 42484.00 words/s : L.r. 1.3125e-04 [2022-01-17 14:48:37] Ep. 1 : Up. 7500 : Sen. 1,679,713 : Cost 5.21833611 * 3,305,387 @ 7,314 after 49,458,835 : Time 77.69s : 42545.45 words/s : L.r. 1.4063e-04 [2022-01-17 14:49:55] Ep. 1 : Up. 8000 : Sen. 1,790,607 : Cost 5.01008797 * 3,303,487 @ 5,952 after 52,762,322 : Time 77.89s : 42412.55 words/s : L.r. 1.5000e-04 [2022-01-17 14:51:13] Ep. 1 : Up. 8500 : Sen. 1,903,180 : Cost 4.78658533 * 3,305,121 @ 6,475 after 56,067,443 : Time 77.74s : 42514.82 words/s : L.r. 1.5938e-04 [2022-01-17 14:52:30] Ep. 1 : Up. 9000 : Sen. 2,012,146 : Cost 4.59634590 * 3,285,628 @ 2,944 after 59,353,071 : Time 77.55s : 42366.37 words/s : L.r. 1.6875e-04 [2022-01-17 14:53:48] Ep. 1 : Up. 9500 : Sen. 2,126,354 : Cost 4.37838125 * 3,299,705 @ 7,770 after 62,652,776 : Time 77.47s : 42590.84 words/s : L.r. 1.7813e-04 [2022-01-17 14:55:06] Ep. 1 : Up. 10000 : Sen. 2,237,464 : Cost 4.18511820 * 3,293,490 @ 6,282 after 65,946,266 : Time 77.79s : 42337.42 words/s : L.r. 1.8750e-04 [2022-01-17 14:55:06] Saving model weights and runtime parameters to model.npz.orig.npz [2022-01-17 14:55:06] Saving model weights and runtime parameters to model.npz [2022-01-17 14:55:07] Saving Adam parameters to model.npz.optimizer.npz [2022-01-17 14:56:27] Ep. 1 : Up. 10500 : Sen. 2,348,017 : Cost 4.03138113 * 3,310,483 @ 5,556 after 69,256,749 : Time 80.94s : 40899.10 words/s : L.r. 1.9688e-04 [2022-01-17 14:57:44] Ep. 1 : Up. 11000 : Sen. 2,459,529 : Cost 3.87746334 * 3,264,732 @ 8,325 after 72,521,481 : Time 77.20s : 42287.32 words/s : L.r. 2.0625e-04 [2022-01-17 14:59:02] Ep. 1 : Up. 11500 : Sen. 2,573,487 : Cost 3.74331832 * 3,306,391 @ 7,400 after 75,827,872 : Time 77.86s : 42466.56 words/s : L.r. 2.1563e-04 [2022-01-17 15:00:20] Ep. 1 : Up. 12000 : Sen. 2,685,390 : Cost 3.64547110 * 3,317,964 @ 6,688 after 79,145,836 : Time 78.15s : 42457.47 words/s : L.r. 2.2500e-04 [2022-01-17 15:01:37] Ep. 1 : Up. 12500 : Sen. 2,797,494 : Cost 3.54736304 * 3,298,276 @ 7,808 after 82,444,112 : Time 77.68s : 42457.41 words/s : L.r. 2.3438e-04 [2022-01-17 15:02:55] Ep. 1 : Up. 13000 : Sen. 2,908,256 : Cost 3.47763109 * 3,295,994 @ 5,773 after 85,740,106 : Time 77.68s : 42433.03 words/s : L.r. 2.4375e-04 [2022-01-17 15:04:12] Ep. 1 : Up. 13500 : Sen. 3,019,771 : Cost 3.40706968 * 3,254,919 @ 4,995 after 88,995,025 : Time 76.75s : 42409.32 words/s : L.r. 2.5313e-04 [2022-01-17 15:04:53] Seen 3078439 samples [2022-01-17 15:04:53] Starting data epoch 2 in logical epoch 2 [2022-01-17 15:04:53] Training finished [2022-01-17 15:04:53] Saving model weights and runtime parameters to model.npz.orig.npz [2022-01-17 15:04:53] Saving model weights and runtime parameters to model.npz [2022-01-17 15:04:54] Saving Adam parameters to model.npz.optimizer.npz
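Each progress line in the log follows the pattern "Ep. E : Up. U : Sen. S : Cost C * L @ B after T : Time ts : W words/s : L.r. R", where L is the number of target labels processed since the previous report and T the running total; the reported throughput is L divided by the window time, and the Cost column, which starts near the natural log of the target vocabulary size and falls as training progresses, is consistent with an average cross-entropy per target label under ce-sum with disp-label-counts. A small sketch, assuming only that field layout, recomputes the throughput from the last full progress line of the 2022-01-17 run:

    import re

    # Field layout assumed from the progress lines printed above.
    line = ("Ep. 1 : Up. 13500 : Sen. 3,019,771 : Cost 3.40706968 * 3,254,919 "
            "@ 4,995 after 88,995,025 : Time 76.75s : 42409.32 words/s : L.r. 2.5313e-04")

    m = re.search(r"Cost ([\d.]+) \* ([\d,]+) @ ([\d,]+) after ([\d,]+) : "
                  r"Time ([\d.]+)s : ([\d.]+) words/s", line)
    cost, window_labels, batch_labels, total_labels, seconds, reported = m.groups()
    window_labels = int(window_labels.replace(",", ""))

    print("labels in window  :", window_labels)                             # 3,254,919
    print("recomputed words/s:", round(window_labels / float(seconds), 2))  # ~42409.4
    print("reported words/s  :", reported)                                  # 42409.32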
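At updates 5000 and 10000, and again when the run stops after one epoch, the trainer writes model.npz together with model.npz.orig.npz and model.npz.optimizer.npz; the extra .orig.npz copy appears because --exponential-smoothing is enabled, so one file carries the smoothed parameters and the other the raw ones (the log itself does not say which is which). Assuming only that these are ordinary NumPy .npz archives, a quick way to see what was saved is:

    import numpy as np

    # Assumes the checkpoint is a standard NumPy .npz archive.
    # Lists every stored array (model weights plus the embedded
    # "runtime parameters" entry mentioned in the save messages).
    ckpt = np.load("model.npz")
    for name in sorted(ckpt.files):
        arr = ckpt[name]
        print(f"{name:50s} {str(arr.dtype):10s} {arr.shape}")

The same inspection works for the .orig.npz and .optimizer.npz files; for actually decoding with the trained model, the two .yml vocabularies created earlier in this log would be needed alongside model.npz.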