[2022-01-14 18:11:57] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-14 18:11:57] [marian] Running on s470607-gpu as process 1795 with command line: [2022-01-14 18:11:57] [marian] ../marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings-all --exponential-smoothing --log /home/wmi/train.log --vocabs /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 --after-epochs 1 [2022-01-14 18:11:57] [config] after: 0e [2022-01-14 18:11:57] [config] after-batches: 0 [2022-01-14 18:11:57] [config] after-epochs: 1 [2022-01-14 18:11:57] [config] all-caps-every: 0 [2022-01-14 18:11:57] [config] allow-unk: false [2022-01-14 18:11:57] [config] authors: false [2022-01-14 18:11:57] [config] beam-size: 6 [2022-01-14 18:11:57] [config] bert-class-symbol: "[CLS]" [2022-01-14 18:11:57] [config] bert-mask-symbol: "[MASK]" [2022-01-14 18:11:57] [config] bert-masking-fraction: 0.15 [2022-01-14 18:11:57] [config] bert-sep-symbol: "[SEP]" [2022-01-14 18:11:57] [config] bert-train-type-embeddings: true [2022-01-14 18:11:57] [config] bert-type-vocab-size: 2 [2022-01-14 18:11:57] [config] build-info: "" [2022-01-14 18:11:57] [config] cite: false [2022-01-14 18:11:57] [config] clip-norm: 5 [2022-01-14 18:11:57] [config] cost-scaling: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] cost-type: ce-sum [2022-01-14 18:11:57] [config] cpu-threads: 0 [2022-01-14 18:11:57] [config] data-weighting: "" [2022-01-14 18:11:57] [config] data-weighting-type: sentence [2022-01-14 18:11:57] [config] dec-cell: gru [2022-01-14 18:11:57] [config] dec-cell-base-depth: 2 [2022-01-14 18:11:57] [config] dec-cell-high-depth: 1 [2022-01-14 18:11:57] [config] dec-depth: 6 [2022-01-14 18:11:57] [config] devices: [2022-01-14 18:11:57] [config] - 0 [2022-01-14 18:11:57] [config] dim-emb: 512 [2022-01-14 18:11:57] [config] dim-rnn: 1024 [2022-01-14 18:11:57] [config] dim-vocabs: [2022-01-14 18:11:57] [config] - 0 [2022-01-14 18:11:57] [config] - 0 [2022-01-14 18:11:57] [config] disp-first: 0 [2022-01-14 18:11:57] [config] disp-freq: 500 [2022-01-14 18:11:57] [config] disp-label-counts: true [2022-01-14 18:11:57] [config] dropout-rnn: 0 [2022-01-14 18:11:57] [config] dropout-src: 0 [2022-01-14 18:11:57] [config] dropout-trg: 0 [2022-01-14 18:11:57] [config] dump-config: "" [2022-01-14 18:11:57] [config] early-stopping: 10 [2022-01-14 18:11:57] [config] embedding-fix-src: false [2022-01-14 18:11:57] [config] embedding-fix-trg: false [2022-01-14 18:11:57] [config] embedding-normalization: false [2022-01-14 18:11:57] [config] embedding-vectors: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] enc-cell: gru [2022-01-14 18:11:57] [config] enc-cell-depth: 1 [2022-01-14 18:11:57] [config] enc-depth: 6 [2022-01-14 18:11:57] [config] enc-type: bidirectional [2022-01-14 18:11:57] [config] english-title-case-every: 0 [2022-01-14 18:11:57] [config] 
exponential-smoothing: 0.0001 [2022-01-14 18:11:57] [config] factor-weight: 1 [2022-01-14 18:11:57] [config] grad-dropping-momentum: 0 [2022-01-14 18:11:57] [config] grad-dropping-rate: 0 [2022-01-14 18:11:57] [config] grad-dropping-warmup: 100 [2022-01-14 18:11:57] [config] gradient-checkpointing: false [2022-01-14 18:11:57] [config] guided-alignment: none [2022-01-14 18:11:57] [config] guided-alignment-cost: mse [2022-01-14 18:11:57] [config] guided-alignment-weight: 0.1 [2022-01-14 18:11:57] [config] ignore-model-config: false [2022-01-14 18:11:57] [config] input-types: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] interpolate-env-vars: false [2022-01-14 18:11:57] [config] keep-best: false [2022-01-14 18:11:57] [config] label-smoothing: 0.1 [2022-01-14 18:11:57] [config] layer-normalization: false [2022-01-14 18:11:57] [config] learn-rate: 0.0003 [2022-01-14 18:11:57] [config] lemma-dim-emb: 0 [2022-01-14 18:11:57] [config] log: /home/wmi/train.log [2022-01-14 18:11:57] [config] log-level: info [2022-01-14 18:11:57] [config] log-time-zone: "" [2022-01-14 18:11:57] [config] logical-epoch: [2022-01-14 18:11:57] [config] - 1e [2022-01-14 18:11:57] [config] - 0 [2022-01-14 18:11:57] [config] lr-decay: 0 [2022-01-14 18:11:57] [config] lr-decay-freq: 50000 [2022-01-14 18:11:57] [config] lr-decay-inv-sqrt: [2022-01-14 18:11:57] [config] - 16000 [2022-01-14 18:11:57] [config] lr-decay-repeat-warmup: false [2022-01-14 18:11:57] [config] lr-decay-reset-optimizer: false [2022-01-14 18:11:57] [config] lr-decay-start: [2022-01-14 18:11:57] [config] - 10 [2022-01-14 18:11:57] [config] - 1 [2022-01-14 18:11:57] [config] lr-decay-strategy: epoch+stalled [2022-01-14 18:11:57] [config] lr-report: true [2022-01-14 18:11:57] [config] lr-warmup: 16000 [2022-01-14 18:11:57] [config] lr-warmup-at-reload: false [2022-01-14 18:11:57] [config] lr-warmup-cycle: false [2022-01-14 18:11:57] [config] lr-warmup-start-rate: 0 [2022-01-14 18:11:57] [config] max-length: 100 [2022-01-14 18:11:57] [config] max-length-crop: false [2022-01-14 18:11:57] [config] max-length-factor: 3 [2022-01-14 18:11:57] [config] maxi-batch: 1000 [2022-01-14 18:11:57] [config] maxi-batch-sort: trg [2022-01-14 18:11:57] [config] mini-batch: 64 [2022-01-14 18:11:57] [config] mini-batch-fit: true [2022-01-14 18:11:57] [config] mini-batch-fit-step: 10 [2022-01-14 18:11:57] [config] mini-batch-track-lr: false [2022-01-14 18:11:57] [config] mini-batch-warmup: 0 [2022-01-14 18:11:57] [config] mini-batch-words: 0 [2022-01-14 18:11:57] [config] mini-batch-words-ref: 0 [2022-01-14 18:11:57] [config] model: model.npz [2022-01-14 18:11:57] [config] multi-loss-type: sum [2022-01-14 18:11:57] [config] multi-node: false [2022-01-14 18:11:57] [config] multi-node-overlap: true [2022-01-14 18:11:57] [config] n-best: false [2022-01-14 18:11:57] [config] no-nccl: false [2022-01-14 18:11:57] [config] no-reload: false [2022-01-14 18:11:57] [config] no-restore-corpus: false [2022-01-14 18:11:57] [config] normalize: 0.6 [2022-01-14 18:11:57] [config] normalize-gradient: false [2022-01-14 18:11:57] [config] num-devices: 0 [2022-01-14 18:11:57] [config] optimizer: adam [2022-01-14 18:11:57] [config] optimizer-delay: 1 [2022-01-14 18:11:57] [config] optimizer-params: [2022-01-14 18:11:57] [config] - 0.9 [2022-01-14 18:11:57] [config] - 0.98 [2022-01-14 18:11:57] [config] - 1e-09 [2022-01-14 18:11:57] [config] output-omit-bias: false [2022-01-14 18:11:57] [config] overwrite: true [2022-01-14 18:11:57] [config] precision: [2022-01-14 18:11:57] 
[config] - float32 [2022-01-14 18:11:57] [config] - float32 [2022-01-14 18:11:57] [config] - float32 [2022-01-14 18:11:57] [config] pretrained-model: "" [2022-01-14 18:11:57] [config] quantize-biases: false [2022-01-14 18:11:57] [config] quantize-bits: 0 [2022-01-14 18:11:57] [config] quantize-log-based: false [2022-01-14 18:11:57] [config] quantize-optimization-steps: 0 [2022-01-14 18:11:57] [config] quiet: false [2022-01-14 18:11:57] [config] quiet-translation: false [2022-01-14 18:11:57] [config] relative-paths: false [2022-01-14 18:11:57] [config] right-left: false [2022-01-14 18:11:57] [config] save-freq: 5000 [2022-01-14 18:11:57] [config] seed: 0 [2022-01-14 18:11:57] [config] sentencepiece-alphas: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] sentencepiece-max-lines: 2000000 [2022-01-14 18:11:57] [config] sentencepiece-options: "" [2022-01-14 18:11:57] [config] shuffle: data [2022-01-14 18:11:57] [config] shuffle-in-ram: false [2022-01-14 18:11:57] [config] sigterm: save-and-exit [2022-01-14 18:11:57] [config] skip: false [2022-01-14 18:11:57] [config] sqlite: "" [2022-01-14 18:11:57] [config] sqlite-drop: false [2022-01-14 18:11:57] [config] sync-sgd: false [2022-01-14 18:11:57] [config] tempdir: /tmp [2022-01-14 18:11:57] [config] tied-embeddings: false [2022-01-14 18:11:57] [config] tied-embeddings-all: true [2022-01-14 18:11:57] [config] tied-embeddings-src: false [2022-01-14 18:11:57] [config] train-embedder-rank: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] train-sets: [2022-01-14 18:11:57] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en [2022-01-14 18:11:57] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl [2022-01-14 18:11:57] [config] transformer-aan-activation: swish [2022-01-14 18:11:57] [config] transformer-aan-depth: 2 [2022-01-14 18:11:57] [config] transformer-aan-nogate: false [2022-01-14 18:11:57] [config] transformer-decoder-autoreg: self-attention [2022-01-14 18:11:57] [config] transformer-depth-scaling: false [2022-01-14 18:11:57] [config] transformer-dim-aan: 2048 [2022-01-14 18:11:57] [config] transformer-dim-ffn: 2048 [2022-01-14 18:11:57] [config] transformer-dropout: 0.1 [2022-01-14 18:11:57] [config] transformer-dropout-attention: 0 [2022-01-14 18:11:57] [config] transformer-dropout-ffn: 0 [2022-01-14 18:11:57] [config] transformer-ffn-activation: swish [2022-01-14 18:11:57] [config] transformer-ffn-depth: 2 [2022-01-14 18:11:57] [config] transformer-guided-alignment-layer: last [2022-01-14 18:11:57] [config] transformer-heads: 8 [2022-01-14 18:11:57] [config] transformer-no-projection: false [2022-01-14 18:11:57] [config] transformer-pool: false [2022-01-14 18:11:57] [config] transformer-postprocess: dan [2022-01-14 18:11:57] [config] transformer-postprocess-emb: d [2022-01-14 18:11:57] [config] transformer-postprocess-top: "" [2022-01-14 18:11:57] [config] transformer-preprocess: "" [2022-01-14 18:11:57] [config] transformer-tied-layers: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] transformer-train-position-embeddings: false [2022-01-14 18:11:57] [config] tsv: false [2022-01-14 18:11:57] [config] tsv-fields: 0 [2022-01-14 18:11:57] [config] type: transformer [2022-01-14 18:11:57] [config] ulr: false [2022-01-14 18:11:57] [config] ulr-dim-emb: 0 [2022-01-14 18:11:57] [config] ulr-dropout: 0 [2022-01-14 18:11:57] [config] ulr-keys-vectors: "" [2022-01-14 18:11:57] [config] ulr-query-vectors: "" [2022-01-14 18:11:57] [config] ulr-softmax-temperature: 1 
[2022-01-14 18:11:57] [config] ulr-trainable-transformation: false [2022-01-14 18:11:57] [config] unlikelihood-loss: false [2022-01-14 18:11:57] [config] valid-freq: 5000 [2022-01-14 18:11:57] [config] valid-log: "" [2022-01-14 18:11:57] [config] valid-max-length: 1000 [2022-01-14 18:11:57] [config] valid-metrics: [2022-01-14 18:11:57] [config] - cross-entropy [2022-01-14 18:11:57] [config] valid-mini-batch: 32 [2022-01-14 18:11:57] [config] valid-reset-stalled: false [2022-01-14 18:11:57] [config] valid-script-args: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] valid-script-path: "" [2022-01-14 18:11:57] [config] valid-sets: [2022-01-14 18:11:57] [config] [] [2022-01-14 18:11:57] [config] valid-translation-output: "" [2022-01-14 18:11:57] [config] vocabs: [2022-01-14 18:11:57] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-14 18:11:57] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 [2022-01-14 18:11:57] [config] word-penalty: 0 [2022-01-14 18:11:57] [config] word-scores: false [2022-01-14 18:11:57] [config] workspace: 10000 [2022-01-14 18:11:57] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-14 18:11:57] [training] Using single-device training [2022-01-14 18:11:57] [data] Loading vocabulary from text file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-14 18:11:57] Error: DefaultVocabulary file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 is expected to contain an entry for [2022-01-14 18:11:57] Error: Aborted from marian::DefaultVocab::addRequiredVocabulary(const string&, bool):: in /home/wmi/Workspace/marian/src/data/default_vocab.cpp:199 [CALL STACK] [0x55eb3f84e0d8] marian::DefaultVocab::addRequiredVocabulary(std::__cxx11::basic_string,std::allocator> const&,bool)::{lambda(std::__cxx11::basic_string,std::allocator> const&,std::__cxx11::basic_string,std::allocator> const&,marian::Word)#1}:: operator() (std::__cxx11::basic_string,std::allocator> const&, std::__cxx11::basic_string,std::allocator> const&, marian::Word) const + 0x4a8 [0x55eb3f84e5d9] marian::DefaultVocab:: addRequiredVocabulary (std::__cxx11::basic_string,std::allocator> const&, bool) + 0x59 [0x55eb3f85168b] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0xb9b [0x55eb3f840e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x55eb3f841728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x55eb3f88d189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x55eb3f8a0084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x55eb3f6fef8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x55eb3f78694b] marian::Train:: run () + 0x19cb [0x55eb3f68d389] mainTrainer (int, char**) + 0x5e9 [0x55eb3f64b1bc] main + 0x3c [0x7feb9a0910b3] __libc_start_main + 0xf3 [0x55eb3f68bb0e] _start + 0x2e [2022-01-14 18:57:10] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-14 18:57:10] [marian] Running on s470607-gpu as process 1959 with command line: [2022-01-14 18:57:10] [marian] ../marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en 
/home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings-all --exponential-smoothing --log /home/wmi/train.log --vocabs /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 --after-epochs 1 [2022-01-14 18:57:10] [config] after: 0e [2022-01-14 18:57:10] [config] after-batches: 0 [2022-01-14 18:57:10] [config] after-epochs: 1 [2022-01-14 18:57:10] [config] all-caps-every: 0 [2022-01-14 18:57:10] [config] allow-unk: false [2022-01-14 18:57:10] [config] authors: false [2022-01-14 18:57:10] [config] beam-size: 6 [2022-01-14 18:57:10] [config] bert-class-symbol: "[CLS]" [2022-01-14 18:57:10] [config] bert-mask-symbol: "[MASK]" [2022-01-14 18:57:10] [config] bert-masking-fraction: 0.15 [2022-01-14 18:57:10] [config] bert-sep-symbol: "[SEP]" [2022-01-14 18:57:10] [config] bert-train-type-embeddings: true [2022-01-14 18:57:10] [config] bert-type-vocab-size: 2 [2022-01-14 18:57:10] [config] build-info: "" [2022-01-14 18:57:10] [config] cite: false [2022-01-14 18:57:10] [config] clip-norm: 5 [2022-01-14 18:57:10] [config] cost-scaling: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] cost-type: ce-sum [2022-01-14 18:57:10] [config] cpu-threads: 0 [2022-01-14 18:57:10] [config] data-weighting: "" [2022-01-14 18:57:10] [config] data-weighting-type: sentence [2022-01-14 18:57:10] [config] dec-cell: gru [2022-01-14 18:57:10] [config] dec-cell-base-depth: 2 [2022-01-14 18:57:10] [config] dec-cell-high-depth: 1 [2022-01-14 18:57:10] [config] dec-depth: 6 [2022-01-14 18:57:10] [config] devices: [2022-01-14 18:57:10] [config] - 0 [2022-01-14 18:57:10] [config] dim-emb: 512 [2022-01-14 18:57:10] [config] dim-rnn: 1024 [2022-01-14 18:57:10] [config] dim-vocabs: [2022-01-14 18:57:10] [config] - 0 [2022-01-14 18:57:10] [config] - 0 [2022-01-14 18:57:10] [config] disp-first: 0 [2022-01-14 18:57:10] [config] disp-freq: 500 [2022-01-14 18:57:10] [config] disp-label-counts: true [2022-01-14 18:57:10] [config] dropout-rnn: 0 [2022-01-14 18:57:10] [config] dropout-src: 0 [2022-01-14 18:57:10] [config] dropout-trg: 0 [2022-01-14 18:57:10] [config] dump-config: "" [2022-01-14 18:57:10] [config] early-stopping: 10 [2022-01-14 18:57:10] [config] embedding-fix-src: false [2022-01-14 18:57:10] [config] embedding-fix-trg: false [2022-01-14 18:57:10] [config] embedding-normalization: false [2022-01-14 18:57:10] [config] embedding-vectors: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] enc-cell: gru [2022-01-14 18:57:10] [config] enc-cell-depth: 1 [2022-01-14 18:57:10] [config] enc-depth: 6 [2022-01-14 18:57:10] [config] enc-type: bidirectional [2022-01-14 18:57:10] [config] english-title-case-every: 0 [2022-01-14 18:57:10] [config] exponential-smoothing: 0.0001 [2022-01-14 18:57:10] [config] factor-weight: 1 [2022-01-14 18:57:10] [config] grad-dropping-momentum: 0 [2022-01-14 18:57:10] [config] grad-dropping-rate: 0 [2022-01-14 18:57:10] [config] grad-dropping-warmup: 100 [2022-01-14 18:57:10] [config] gradient-checkpointing: false [2022-01-14 18:57:10] [config] 
guided-alignment: none [2022-01-14 18:57:10] [config] guided-alignment-cost: mse [2022-01-14 18:57:10] [config] guided-alignment-weight: 0.1 [2022-01-14 18:57:10] [config] ignore-model-config: false [2022-01-14 18:57:10] [config] input-types: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] interpolate-env-vars: false [2022-01-14 18:57:10] [config] keep-best: false [2022-01-14 18:57:10] [config] label-smoothing: 0.1 [2022-01-14 18:57:10] [config] layer-normalization: false [2022-01-14 18:57:10] [config] learn-rate: 0.0003 [2022-01-14 18:57:10] [config] lemma-dim-emb: 0 [2022-01-14 18:57:10] [config] log: /home/wmi/train.log [2022-01-14 18:57:10] [config] log-level: info [2022-01-14 18:57:10] [config] log-time-zone: "" [2022-01-14 18:57:10] [config] logical-epoch: [2022-01-14 18:57:10] [config] - 1e [2022-01-14 18:57:10] [config] - 0 [2022-01-14 18:57:10] [config] lr-decay: 0 [2022-01-14 18:57:10] [config] lr-decay-freq: 50000 [2022-01-14 18:57:10] [config] lr-decay-inv-sqrt: [2022-01-14 18:57:10] [config] - 16000 [2022-01-14 18:57:10] [config] lr-decay-repeat-warmup: false [2022-01-14 18:57:10] [config] lr-decay-reset-optimizer: false [2022-01-14 18:57:10] [config] lr-decay-start: [2022-01-14 18:57:10] [config] - 10 [2022-01-14 18:57:10] [config] - 1 [2022-01-14 18:57:10] [config] lr-decay-strategy: epoch+stalled [2022-01-14 18:57:10] [config] lr-report: true [2022-01-14 18:57:10] [config] lr-warmup: 16000 [2022-01-14 18:57:10] [config] lr-warmup-at-reload: false [2022-01-14 18:57:10] [config] lr-warmup-cycle: false [2022-01-14 18:57:10] [config] lr-warmup-start-rate: 0 [2022-01-14 18:57:10] [config] max-length: 100 [2022-01-14 18:57:10] [config] max-length-crop: false [2022-01-14 18:57:10] [config] max-length-factor: 3 [2022-01-14 18:57:10] [config] maxi-batch: 1000 [2022-01-14 18:57:10] [config] maxi-batch-sort: trg [2022-01-14 18:57:10] [config] mini-batch: 64 [2022-01-14 18:57:10] [config] mini-batch-fit: true [2022-01-14 18:57:10] [config] mini-batch-fit-step: 10 [2022-01-14 18:57:10] [config] mini-batch-track-lr: false [2022-01-14 18:57:10] [config] mini-batch-warmup: 0 [2022-01-14 18:57:10] [config] mini-batch-words: 0 [2022-01-14 18:57:10] [config] mini-batch-words-ref: 0 [2022-01-14 18:57:10] [config] model: model.npz [2022-01-14 18:57:10] [config] multi-loss-type: sum [2022-01-14 18:57:10] [config] multi-node: false [2022-01-14 18:57:10] [config] multi-node-overlap: true [2022-01-14 18:57:10] [config] n-best: false [2022-01-14 18:57:10] [config] no-nccl: false [2022-01-14 18:57:10] [config] no-reload: false [2022-01-14 18:57:10] [config] no-restore-corpus: false [2022-01-14 18:57:10] [config] normalize: 0.6 [2022-01-14 18:57:10] [config] normalize-gradient: false [2022-01-14 18:57:10] [config] num-devices: 0 [2022-01-14 18:57:10] [config] optimizer: adam [2022-01-14 18:57:10] [config] optimizer-delay: 1 [2022-01-14 18:57:10] [config] optimizer-params: [2022-01-14 18:57:10] [config] - 0.9 [2022-01-14 18:57:10] [config] - 0.98 [2022-01-14 18:57:10] [config] - 1e-09 [2022-01-14 18:57:10] [config] output-omit-bias: false [2022-01-14 18:57:10] [config] overwrite: true [2022-01-14 18:57:10] [config] precision: [2022-01-14 18:57:10] [config] - float32 [2022-01-14 18:57:10] [config] - float32 [2022-01-14 18:57:10] [config] - float32 [2022-01-14 18:57:10] [config] pretrained-model: "" [2022-01-14 18:57:10] [config] quantize-biases: false [2022-01-14 18:57:10] [config] quantize-bits: 0 [2022-01-14 18:57:10] [config] quantize-log-based: false [2022-01-14 18:57:10] [config] 
quantize-optimization-steps: 0 [2022-01-14 18:57:10] [config] quiet: false [2022-01-14 18:57:10] [config] quiet-translation: false [2022-01-14 18:57:10] [config] relative-paths: false [2022-01-14 18:57:10] [config] right-left: false [2022-01-14 18:57:10] [config] save-freq: 5000 [2022-01-14 18:57:10] [config] seed: 0 [2022-01-14 18:57:10] [config] sentencepiece-alphas: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] sentencepiece-max-lines: 2000000 [2022-01-14 18:57:10] [config] sentencepiece-options: "" [2022-01-14 18:57:10] [config] shuffle: data [2022-01-14 18:57:10] [config] shuffle-in-ram: false [2022-01-14 18:57:10] [config] sigterm: save-and-exit [2022-01-14 18:57:10] [config] skip: false [2022-01-14 18:57:10] [config] sqlite: "" [2022-01-14 18:57:10] [config] sqlite-drop: false [2022-01-14 18:57:10] [config] sync-sgd: false [2022-01-14 18:57:10] [config] tempdir: /tmp [2022-01-14 18:57:10] [config] tied-embeddings: false [2022-01-14 18:57:10] [config] tied-embeddings-all: true [2022-01-14 18:57:10] [config] tied-embeddings-src: false [2022-01-14 18:57:10] [config] train-embedder-rank: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] train-sets: [2022-01-14 18:57:10] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en [2022-01-14 18:57:10] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl [2022-01-14 18:57:10] [config] transformer-aan-activation: swish [2022-01-14 18:57:10] [config] transformer-aan-depth: 2 [2022-01-14 18:57:10] [config] transformer-aan-nogate: false [2022-01-14 18:57:10] [config] transformer-decoder-autoreg: self-attention [2022-01-14 18:57:10] [config] transformer-depth-scaling: false [2022-01-14 18:57:10] [config] transformer-dim-aan: 2048 [2022-01-14 18:57:10] [config] transformer-dim-ffn: 2048 [2022-01-14 18:57:10] [config] transformer-dropout: 0.1 [2022-01-14 18:57:10] [config] transformer-dropout-attention: 0 [2022-01-14 18:57:10] [config] transformer-dropout-ffn: 0 [2022-01-14 18:57:10] [config] transformer-ffn-activation: swish [2022-01-14 18:57:10] [config] transformer-ffn-depth: 2 [2022-01-14 18:57:10] [config] transformer-guided-alignment-layer: last [2022-01-14 18:57:10] [config] transformer-heads: 8 [2022-01-14 18:57:10] [config] transformer-no-projection: false [2022-01-14 18:57:10] [config] transformer-pool: false [2022-01-14 18:57:10] [config] transformer-postprocess: dan [2022-01-14 18:57:10] [config] transformer-postprocess-emb: d [2022-01-14 18:57:10] [config] transformer-postprocess-top: "" [2022-01-14 18:57:10] [config] transformer-preprocess: "" [2022-01-14 18:57:10] [config] transformer-tied-layers: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] transformer-train-position-embeddings: false [2022-01-14 18:57:10] [config] tsv: false [2022-01-14 18:57:10] [config] tsv-fields: 0 [2022-01-14 18:57:10] [config] type: transformer [2022-01-14 18:57:10] [config] ulr: false [2022-01-14 18:57:10] [config] ulr-dim-emb: 0 [2022-01-14 18:57:10] [config] ulr-dropout: 0 [2022-01-14 18:57:10] [config] ulr-keys-vectors: "" [2022-01-14 18:57:10] [config] ulr-query-vectors: "" [2022-01-14 18:57:10] [config] ulr-softmax-temperature: 1 [2022-01-14 18:57:10] [config] ulr-trainable-transformation: false [2022-01-14 18:57:10] [config] unlikelihood-loss: false [2022-01-14 18:57:10] [config] valid-freq: 5000 [2022-01-14 18:57:10] [config] valid-log: "" [2022-01-14 18:57:10] [config] valid-max-length: 1000 [2022-01-14 18:57:10] [config] valid-metrics: [2022-01-14 18:57:10] 
[config] - cross-entropy [2022-01-14 18:57:10] [config] valid-mini-batch: 32 [2022-01-14 18:57:10] [config] valid-reset-stalled: false [2022-01-14 18:57:10] [config] valid-script-args: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] valid-script-path: "" [2022-01-14 18:57:10] [config] valid-sets: [2022-01-14 18:57:10] [config] [] [2022-01-14 18:57:10] [config] valid-translation-output: "" [2022-01-14 18:57:10] [config] vocabs: [2022-01-14 18:57:10] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-14 18:57:10] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 [2022-01-14 18:57:10] [config] word-penalty: 0 [2022-01-14 18:57:10] [config] word-scores: false [2022-01-14 18:57:10] [config] workspace: 10000 [2022-01-14 18:57:10] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-14 18:57:10] [training] Using single-device training [2022-01-14 18:57:10] [data] Loading vocabulary from text file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-14 18:57:10] Error: DefaultVocabulary file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 is expected to contain an entry for [2022-01-14 18:57:10] Error: Aborted from marian::DefaultVocab::addRequiredVocabulary(const string&, bool):: in /home/wmi/Workspace/marian/src/data/default_vocab.cpp:199 [CALL STACK] [0x560bb35130d8] marian::DefaultVocab::addRequiredVocabulary(std::__cxx11::basic_string,std::allocator> const&,bool)::{lambda(std::__cxx11::basic_string,std::allocator> const&,std::__cxx11::basic_string,std::allocator> const&,marian::Word)#1}:: operator() (std::__cxx11::basic_string,std::allocator> const&, std::__cxx11::basic_string,std::allocator> const&, marian::Word) const + 0x4a8 [0x560bb35135d9] marian::DefaultVocab:: addRequiredVocabulary (std::__cxx11::basic_string,std::allocator> const&, bool) + 0x59 [0x560bb351668b] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0xb9b [0x560bb3505e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x560bb3506728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x560bb3552189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x560bb3565084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x560bb33c3f8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x560bb344b94b] marian::Train:: run () + 0x19cb [0x560bb3352389] mainTrainer (int, char**) + 0x5e9 [0x560bb33101bc] main + 0x3c [0x7fbe70a860b3] __libc_start_main + 0xf3 [0x560bb3350b0e] _start + 0x2e [2022-01-14 19:20:18] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-14 19:20:18] [marian] Running on s470607-gpu as process 2041 with command line: [2022-01-14 19:20:18] [marian] ../marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 
--lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings-all --exponential-smoothing --log /home/wmi/train.log --vocabs /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 --after-epochs 1 [2022-01-14 19:20:18] [config] after: 0e [2022-01-14 19:20:18] [config] after-batches: 0 [2022-01-14 19:20:18] [config] after-epochs: 1 [2022-01-14 19:20:18] [config] all-caps-every: 0 [2022-01-14 19:20:18] [config] allow-unk: false [2022-01-14 19:20:18] [config] authors: false [2022-01-14 19:20:18] [config] beam-size: 6 [2022-01-14 19:20:18] [config] bert-class-symbol: "[CLS]" [2022-01-14 19:20:18] [config] bert-mask-symbol: "[MASK]" [2022-01-14 19:20:18] [config] bert-masking-fraction: 0.15 [2022-01-14 19:20:18] [config] bert-sep-symbol: "[SEP]" [2022-01-14 19:20:18] [config] bert-train-type-embeddings: true [2022-01-14 19:20:18] [config] bert-type-vocab-size: 2 [2022-01-14 19:20:18] [config] build-info: "" [2022-01-14 19:20:18] [config] cite: false [2022-01-14 19:20:18] [config] clip-norm: 5 [2022-01-14 19:20:18] [config] cost-scaling: [2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] [config] cost-type: ce-sum [2022-01-14 19:20:18] [config] cpu-threads: 0 [2022-01-14 19:20:18] [config] data-weighting: "" [2022-01-14 19:20:18] [config] data-weighting-type: sentence [2022-01-14 19:20:18] [config] dec-cell: gru [2022-01-14 19:20:18] [config] dec-cell-base-depth: 2 [2022-01-14 19:20:18] [config] dec-cell-high-depth: 1 [2022-01-14 19:20:18] [config] dec-depth: 6 [2022-01-14 19:20:18] [config] devices: [2022-01-14 19:20:18] [config] - 0 [2022-01-14 19:20:18] [config] dim-emb: 512 [2022-01-14 19:20:18] [config] dim-rnn: 1024 [2022-01-14 19:20:18] [config] dim-vocabs: [2022-01-14 19:20:18] [config] - 0 [2022-01-14 19:20:18] [config] - 0 [2022-01-14 19:20:18] [config] disp-first: 0 [2022-01-14 19:20:18] [config] disp-freq: 500 [2022-01-14 19:20:18] [config] disp-label-counts: true [2022-01-14 19:20:18] [config] dropout-rnn: 0 [2022-01-14 19:20:18] [config] dropout-src: 0 [2022-01-14 19:20:18] [config] dropout-trg: 0 [2022-01-14 19:20:18] [config] dump-config: "" [2022-01-14 19:20:18] [config] early-stopping: 10 [2022-01-14 19:20:18] [config] embedding-fix-src: false [2022-01-14 19:20:18] [config] embedding-fix-trg: false [2022-01-14 19:20:18] [config] embedding-normalization: false [2022-01-14 19:20:18] [config] embedding-vectors: [2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] [config] enc-cell: gru [2022-01-14 19:20:18] [config] enc-cell-depth: 1 [2022-01-14 19:20:18] [config] enc-depth: 6 [2022-01-14 19:20:18] [config] enc-type: bidirectional [2022-01-14 19:20:18] [config] english-title-case-every: 0 [2022-01-14 19:20:18] [config] exponential-smoothing: 0.0001 [2022-01-14 19:20:18] [config] factor-weight: 1 [2022-01-14 19:20:18] [config] grad-dropping-momentum: 0 [2022-01-14 19:20:18] [config] grad-dropping-rate: 0 [2022-01-14 19:20:18] [config] grad-dropping-warmup: 100 [2022-01-14 19:20:18] [config] gradient-checkpointing: false [2022-01-14 19:20:18] [config] guided-alignment: none [2022-01-14 19:20:18] [config] guided-alignment-cost: mse [2022-01-14 19:20:18] [config] guided-alignment-weight: 0.1 [2022-01-14 19:20:18] [config] ignore-model-config: false [2022-01-14 19:20:18] [config] input-types: [2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] [config] interpolate-env-vars: false [2022-01-14 19:20:18] [config] keep-best: 
false [2022-01-14 19:20:18] [config] label-smoothing: 0.1 [2022-01-14 19:20:18] [config] layer-normalization: false [2022-01-14 19:20:18] [config] learn-rate: 0.0003 [2022-01-14 19:20:18] [config] lemma-dim-emb: 0 [2022-01-14 19:20:18] [config] log: /home/wmi/train.log [2022-01-14 19:20:18] [config] log-level: info [2022-01-14 19:20:18] [config] log-time-zone: "" [2022-01-14 19:20:18] [config] logical-epoch: [2022-01-14 19:20:18] [config] - 1e [2022-01-14 19:20:18] [config] - 0 [2022-01-14 19:20:18] [config] lr-decay: 0 [2022-01-14 19:20:18] [config] lr-decay-freq: 50000 [2022-01-14 19:20:18] [config] lr-decay-inv-sqrt: [2022-01-14 19:20:18] [config] - 16000 [2022-01-14 19:20:18] [config] lr-decay-repeat-warmup: false [2022-01-14 19:20:18] [config] lr-decay-reset-optimizer: false [2022-01-14 19:20:18] [config] lr-decay-start: [2022-01-14 19:20:18] [config] - 10 [2022-01-14 19:20:18] [config] - 1 [2022-01-14 19:20:18] [config] lr-decay-strategy: epoch+stalled [2022-01-14 19:20:18] [config] lr-report: true [2022-01-14 19:20:18] [config] lr-warmup: 16000 [2022-01-14 19:20:18] [config] lr-warmup-at-reload: false [2022-01-14 19:20:18] [config] lr-warmup-cycle: false [2022-01-14 19:20:18] [config] lr-warmup-start-rate: 0 [2022-01-14 19:20:18] [config] max-length: 100 [2022-01-14 19:20:18] [config] max-length-crop: false [2022-01-14 19:20:18] [config] max-length-factor: 3 [2022-01-14 19:20:18] [config] maxi-batch: 1000 [2022-01-14 19:20:18] [config] maxi-batch-sort: trg [2022-01-14 19:20:18] [config] mini-batch: 64 [2022-01-14 19:20:18] [config] mini-batch-fit: true [2022-01-14 19:20:18] [config] mini-batch-fit-step: 10 [2022-01-14 19:20:18] [config] mini-batch-track-lr: false [2022-01-14 19:20:18] [config] mini-batch-warmup: 0 [2022-01-14 19:20:18] [config] mini-batch-words: 0 [2022-01-14 19:20:18] [config] mini-batch-words-ref: 0 [2022-01-14 19:20:18] [config] model: model.npz [2022-01-14 19:20:18] [config] multi-loss-type: sum [2022-01-14 19:20:18] [config] multi-node: false [2022-01-14 19:20:18] [config] multi-node-overlap: true [2022-01-14 19:20:18] [config] n-best: false [2022-01-14 19:20:18] [config] no-nccl: false [2022-01-14 19:20:18] [config] no-reload: false [2022-01-14 19:20:18] [config] no-restore-corpus: false [2022-01-14 19:20:18] [config] normalize: 0.6 [2022-01-14 19:20:18] [config] normalize-gradient: false [2022-01-14 19:20:18] [config] num-devices: 0 [2022-01-14 19:20:18] [config] optimizer: adam [2022-01-14 19:20:18] [config] optimizer-delay: 1 [2022-01-14 19:20:18] [config] optimizer-params: [2022-01-14 19:20:18] [config] - 0.9 [2022-01-14 19:20:18] [config] - 0.98 [2022-01-14 19:20:18] [config] - 1e-09 [2022-01-14 19:20:18] [config] output-omit-bias: false [2022-01-14 19:20:18] [config] overwrite: true [2022-01-14 19:20:18] [config] precision: [2022-01-14 19:20:18] [config] - float32 [2022-01-14 19:20:18] [config] - float32 [2022-01-14 19:20:18] [config] - float32 [2022-01-14 19:20:18] [config] pretrained-model: "" [2022-01-14 19:20:18] [config] quantize-biases: false [2022-01-14 19:20:18] [config] quantize-bits: 0 [2022-01-14 19:20:18] [config] quantize-log-based: false [2022-01-14 19:20:18] [config] quantize-optimization-steps: 0 [2022-01-14 19:20:18] [config] quiet: false [2022-01-14 19:20:18] [config] quiet-translation: false [2022-01-14 19:20:18] [config] relative-paths: false [2022-01-14 19:20:18] [config] right-left: false [2022-01-14 19:20:18] [config] save-freq: 5000 [2022-01-14 19:20:18] [config] seed: 0 [2022-01-14 19:20:18] [config] sentencepiece-alphas: 
[2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] [config] sentencepiece-max-lines: 2000000 [2022-01-14 19:20:18] [config] sentencepiece-options: "" [2022-01-14 19:20:18] [config] shuffle: data [2022-01-14 19:20:18] [config] shuffle-in-ram: false [2022-01-14 19:20:18] [config] sigterm: save-and-exit [2022-01-14 19:20:18] [config] skip: false [2022-01-14 19:20:18] [config] sqlite: "" [2022-01-14 19:20:18] [config] sqlite-drop: false [2022-01-14 19:20:18] [config] sync-sgd: false [2022-01-14 19:20:18] [config] tempdir: /tmp [2022-01-14 19:20:18] [config] tied-embeddings: false [2022-01-14 19:20:18] [config] tied-embeddings-all: true [2022-01-14 19:20:18] [config] tied-embeddings-src: false [2022-01-14 19:20:18] [config] train-embedder-rank: [2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] [config] train-sets: [2022-01-14 19:20:18] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en [2022-01-14 19:20:18] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl [2022-01-14 19:20:18] [config] transformer-aan-activation: swish [2022-01-14 19:20:18] [config] transformer-aan-depth: 2 [2022-01-14 19:20:18] [config] transformer-aan-nogate: false [2022-01-14 19:20:18] [config] transformer-decoder-autoreg: self-attention [2022-01-14 19:20:18] [config] transformer-depth-scaling: false [2022-01-14 19:20:18] [config] transformer-dim-aan: 2048 [2022-01-14 19:20:18] [config] transformer-dim-ffn: 2048 [2022-01-14 19:20:18] [config] transformer-dropout: 0.1 [2022-01-14 19:20:18] [config] transformer-dropout-attention: 0 [2022-01-14 19:20:18] [config] transformer-dropout-ffn: 0 [2022-01-14 19:20:18] [config] transformer-ffn-activation: swish [2022-01-14 19:20:18] [config] transformer-ffn-depth: 2 [2022-01-14 19:20:18] [config] transformer-guided-alignment-layer: last [2022-01-14 19:20:18] [config] transformer-heads: 8 [2022-01-14 19:20:18] [config] transformer-no-projection: false [2022-01-14 19:20:18] [config] transformer-pool: false [2022-01-14 19:20:18] [config] transformer-postprocess: dan [2022-01-14 19:20:18] [config] transformer-postprocess-emb: d [2022-01-14 19:20:18] [config] transformer-postprocess-top: "" [2022-01-14 19:20:18] [config] transformer-preprocess: "" [2022-01-14 19:20:18] [config] transformer-tied-layers: [2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] [config] transformer-train-position-embeddings: false [2022-01-14 19:20:18] [config] tsv: false [2022-01-14 19:20:18] [config] tsv-fields: 0 [2022-01-14 19:20:18] [config] type: transformer [2022-01-14 19:20:18] [config] ulr: false [2022-01-14 19:20:18] [config] ulr-dim-emb: 0 [2022-01-14 19:20:18] [config] ulr-dropout: 0 [2022-01-14 19:20:18] [config] ulr-keys-vectors: "" [2022-01-14 19:20:18] [config] ulr-query-vectors: "" [2022-01-14 19:20:18] [config] ulr-softmax-temperature: 1 [2022-01-14 19:20:18] [config] ulr-trainable-transformation: false [2022-01-14 19:20:18] [config] unlikelihood-loss: false [2022-01-14 19:20:18] [config] valid-freq: 5000 [2022-01-14 19:20:18] [config] valid-log: "" [2022-01-14 19:20:18] [config] valid-max-length: 1000 [2022-01-14 19:20:18] [config] valid-metrics: [2022-01-14 19:20:18] [config] - cross-entropy [2022-01-14 19:20:18] [config] valid-mini-batch: 32 [2022-01-14 19:20:18] [config] valid-reset-stalled: false [2022-01-14 19:20:18] [config] valid-script-args: [2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] [config] valid-script-path: "" [2022-01-14 19:20:18] [config] valid-sets: [2022-01-14 19:20:18] [config] [] [2022-01-14 19:20:18] 
[config] valid-translation-output: "" [2022-01-14 19:20:18] [config] vocabs: [2022-01-14 19:20:18] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-14 19:20:18] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 [2022-01-14 19:20:18] [config] word-penalty: 0 [2022-01-14 19:20:18] [config] word-scores: false [2022-01-14 19:20:18] [config] workspace: 10000 [2022-01-14 19:20:18] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-14 19:20:18] [training] Using single-device training [2022-01-14 19:20:18] [data] Loading vocabulary from text file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-14 19:20:18] Error: Duplicate vocabulary entry - [2022-01-14 19:20:18] Error: Aborted from virtual size_t marian::DefaultVocab::load(const string&, size_t) in /home/wmi/Workspace/marian/src/data/default_vocab.cpp:116 [CALL STACK] [0x56321bb861ed] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x6fd [0x56321bb75e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x56321bb76728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x56321bbc2189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x56321bbd5084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x56321ba33f8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x56321babb94b] marian::Train:: run () + 0x19cb [0x56321b9c2389] mainTrainer (int, char**) + 0x5e9 [0x56321b9801bc] main + 0x3c [0x7f3235c160b3] __libc_start_main + 0xf3 [0x56321b9c0b0e] _start + 0x2e [2022-01-15 14:02:43] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 14:02:43] [marian] Running on s470607-gpu as process 2586 with command line: [2022-01-15 14:02:43] [marian] ../marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings-all --exponential-smoothing --log /home/wmi/train.log --vocabs /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 --after-epochs 1 [2022-01-15 14:02:43] [config] after: 0e [2022-01-15 14:02:43] [config] after-batches: 0 [2022-01-15 14:02:43] [config] after-epochs: 1 [2022-01-15 14:02:43] [config] all-caps-every: 0 [2022-01-15 14:02:43] [config] allow-unk: false [2022-01-15 14:02:43] [config] authors: false [2022-01-15 14:02:43] [config] beam-size: 6 [2022-01-15 14:02:43] [config] bert-class-symbol: "[CLS]" [2022-01-15 14:02:43] [config] bert-mask-symbol: "[MASK]" [2022-01-15 14:02:43] [config] bert-masking-fraction: 0.15 [2022-01-15 14:02:43] [config] bert-sep-symbol: "[SEP]" [2022-01-15 14:02:43] [config] bert-train-type-embeddings: true [2022-01-15 14:02:43] [config] 
bert-type-vocab-size: 2 [2022-01-15 14:02:43] [config] build-info: "" [2022-01-15 14:02:43] [config] cite: false [2022-01-15 14:02:43] [config] clip-norm: 5 [2022-01-15 14:02:43] [config] cost-scaling: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] cost-type: ce-sum [2022-01-15 14:02:43] [config] cpu-threads: 0 [2022-01-15 14:02:43] [config] data-weighting: "" [2022-01-15 14:02:43] [config] data-weighting-type: sentence [2022-01-15 14:02:43] [config] dec-cell: gru [2022-01-15 14:02:43] [config] dec-cell-base-depth: 2 [2022-01-15 14:02:43] [config] dec-cell-high-depth: 1 [2022-01-15 14:02:43] [config] dec-depth: 6 [2022-01-15 14:02:43] [config] devices: [2022-01-15 14:02:43] [config] - 0 [2022-01-15 14:02:43] [config] dim-emb: 512 [2022-01-15 14:02:43] [config] dim-rnn: 1024 [2022-01-15 14:02:43] [config] dim-vocabs: [2022-01-15 14:02:43] [config] - 0 [2022-01-15 14:02:43] [config] - 0 [2022-01-15 14:02:43] [config] disp-first: 0 [2022-01-15 14:02:43] [config] disp-freq: 500 [2022-01-15 14:02:43] [config] disp-label-counts: true [2022-01-15 14:02:43] [config] dropout-rnn: 0 [2022-01-15 14:02:43] [config] dropout-src: 0 [2022-01-15 14:02:43] [config] dropout-trg: 0 [2022-01-15 14:02:43] [config] dump-config: "" [2022-01-15 14:02:43] [config] early-stopping: 10 [2022-01-15 14:02:43] [config] embedding-fix-src: false [2022-01-15 14:02:43] [config] embedding-fix-trg: false [2022-01-15 14:02:43] [config] embedding-normalization: false [2022-01-15 14:02:43] [config] embedding-vectors: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] enc-cell: gru [2022-01-15 14:02:43] [config] enc-cell-depth: 1 [2022-01-15 14:02:43] [config] enc-depth: 6 [2022-01-15 14:02:43] [config] enc-type: bidirectional [2022-01-15 14:02:43] [config] english-title-case-every: 0 [2022-01-15 14:02:43] [config] exponential-smoothing: 0.0001 [2022-01-15 14:02:43] [config] factor-weight: 1 [2022-01-15 14:02:43] [config] grad-dropping-momentum: 0 [2022-01-15 14:02:43] [config] grad-dropping-rate: 0 [2022-01-15 14:02:43] [config] grad-dropping-warmup: 100 [2022-01-15 14:02:43] [config] gradient-checkpointing: false [2022-01-15 14:02:43] [config] guided-alignment: none [2022-01-15 14:02:43] [config] guided-alignment-cost: mse [2022-01-15 14:02:43] [config] guided-alignment-weight: 0.1 [2022-01-15 14:02:43] [config] ignore-model-config: false [2022-01-15 14:02:43] [config] input-types: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] interpolate-env-vars: false [2022-01-15 14:02:43] [config] keep-best: false [2022-01-15 14:02:43] [config] label-smoothing: 0.1 [2022-01-15 14:02:43] [config] layer-normalization: false [2022-01-15 14:02:43] [config] learn-rate: 0.0003 [2022-01-15 14:02:43] [config] lemma-dim-emb: 0 [2022-01-15 14:02:43] [config] log: /home/wmi/train.log [2022-01-15 14:02:43] [config] log-level: info [2022-01-15 14:02:43] [config] log-time-zone: "" [2022-01-15 14:02:43] [config] logical-epoch: [2022-01-15 14:02:43] [config] - 1e [2022-01-15 14:02:43] [config] - 0 [2022-01-15 14:02:43] [config] lr-decay: 0 [2022-01-15 14:02:43] [config] lr-decay-freq: 50000 [2022-01-15 14:02:43] [config] lr-decay-inv-sqrt: [2022-01-15 14:02:43] [config] - 16000 [2022-01-15 14:02:43] [config] lr-decay-repeat-warmup: false [2022-01-15 14:02:43] [config] lr-decay-reset-optimizer: false [2022-01-15 14:02:43] [config] lr-decay-start: [2022-01-15 14:02:43] [config] - 10 [2022-01-15 14:02:43] [config] - 1 [2022-01-15 14:02:43] [config] lr-decay-strategy: epoch+stalled [2022-01-15 14:02:43] 
[config] lr-report: true [2022-01-15 14:02:43] [config] lr-warmup: 16000 [2022-01-15 14:02:43] [config] lr-warmup-at-reload: false [2022-01-15 14:02:43] [config] lr-warmup-cycle: false [2022-01-15 14:02:43] [config] lr-warmup-start-rate: 0 [2022-01-15 14:02:43] [config] max-length: 100 [2022-01-15 14:02:43] [config] max-length-crop: false [2022-01-15 14:02:43] [config] max-length-factor: 3 [2022-01-15 14:02:43] [config] maxi-batch: 1000 [2022-01-15 14:02:43] [config] maxi-batch-sort: trg [2022-01-15 14:02:43] [config] mini-batch: 64 [2022-01-15 14:02:43] [config] mini-batch-fit: true [2022-01-15 14:02:43] [config] mini-batch-fit-step: 10 [2022-01-15 14:02:43] [config] mini-batch-track-lr: false [2022-01-15 14:02:43] [config] mini-batch-warmup: 0 [2022-01-15 14:02:43] [config] mini-batch-words: 0 [2022-01-15 14:02:43] [config] mini-batch-words-ref: 0 [2022-01-15 14:02:43] [config] model: model.npz [2022-01-15 14:02:43] [config] multi-loss-type: sum [2022-01-15 14:02:43] [config] multi-node: false [2022-01-15 14:02:43] [config] multi-node-overlap: true [2022-01-15 14:02:43] [config] n-best: false [2022-01-15 14:02:43] [config] no-nccl: false [2022-01-15 14:02:43] [config] no-reload: false [2022-01-15 14:02:43] [config] no-restore-corpus: false [2022-01-15 14:02:43] [config] normalize: 0.6 [2022-01-15 14:02:43] [config] normalize-gradient: false [2022-01-15 14:02:43] [config] num-devices: 0 [2022-01-15 14:02:43] [config] optimizer: adam [2022-01-15 14:02:43] [config] optimizer-delay: 1 [2022-01-15 14:02:43] [config] optimizer-params: [2022-01-15 14:02:43] [config] - 0.9 [2022-01-15 14:02:43] [config] - 0.98 [2022-01-15 14:02:43] [config] - 1e-09 [2022-01-15 14:02:43] [config] output-omit-bias: false [2022-01-15 14:02:43] [config] overwrite: true [2022-01-15 14:02:43] [config] precision: [2022-01-15 14:02:43] [config] - float32 [2022-01-15 14:02:43] [config] - float32 [2022-01-15 14:02:43] [config] - float32 [2022-01-15 14:02:43] [config] pretrained-model: "" [2022-01-15 14:02:43] [config] quantize-biases: false [2022-01-15 14:02:43] [config] quantize-bits: 0 [2022-01-15 14:02:43] [config] quantize-log-based: false [2022-01-15 14:02:43] [config] quantize-optimization-steps: 0 [2022-01-15 14:02:43] [config] quiet: false [2022-01-15 14:02:43] [config] quiet-translation: false [2022-01-15 14:02:43] [config] relative-paths: false [2022-01-15 14:02:43] [config] right-left: false [2022-01-15 14:02:43] [config] save-freq: 5000 [2022-01-15 14:02:43] [config] seed: 0 [2022-01-15 14:02:43] [config] sentencepiece-alphas: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] sentencepiece-max-lines: 2000000 [2022-01-15 14:02:43] [config] sentencepiece-options: "" [2022-01-15 14:02:43] [config] shuffle: data [2022-01-15 14:02:43] [config] shuffle-in-ram: false [2022-01-15 14:02:43] [config] sigterm: save-and-exit [2022-01-15 14:02:43] [config] skip: false [2022-01-15 14:02:43] [config] sqlite: "" [2022-01-15 14:02:43] [config] sqlite-drop: false [2022-01-15 14:02:43] [config] sync-sgd: false [2022-01-15 14:02:43] [config] tempdir: /tmp [2022-01-15 14:02:43] [config] tied-embeddings: false [2022-01-15 14:02:43] [config] tied-embeddings-all: true [2022-01-15 14:02:43] [config] tied-embeddings-src: false [2022-01-15 14:02:43] [config] train-embedder-rank: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] train-sets: [2022-01-15 14:02:43] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en [2022-01-15 14:02:43] [config] - 
/home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl [2022-01-15 14:02:43] [config] transformer-aan-activation: swish [2022-01-15 14:02:43] [config] transformer-aan-depth: 2 [2022-01-15 14:02:43] [config] transformer-aan-nogate: false [2022-01-15 14:02:43] [config] transformer-decoder-autoreg: self-attention [2022-01-15 14:02:43] [config] transformer-depth-scaling: false [2022-01-15 14:02:43] [config] transformer-dim-aan: 2048 [2022-01-15 14:02:43] [config] transformer-dim-ffn: 2048 [2022-01-15 14:02:43] [config] transformer-dropout: 0.1 [2022-01-15 14:02:43] [config] transformer-dropout-attention: 0 [2022-01-15 14:02:43] [config] transformer-dropout-ffn: 0 [2022-01-15 14:02:43] [config] transformer-ffn-activation: swish [2022-01-15 14:02:43] [config] transformer-ffn-depth: 2 [2022-01-15 14:02:43] [config] transformer-guided-alignment-layer: last [2022-01-15 14:02:43] [config] transformer-heads: 8 [2022-01-15 14:02:43] [config] transformer-no-projection: false [2022-01-15 14:02:43] [config] transformer-pool: false [2022-01-15 14:02:43] [config] transformer-postprocess: dan [2022-01-15 14:02:43] [config] transformer-postprocess-emb: d [2022-01-15 14:02:43] [config] transformer-postprocess-top: "" [2022-01-15 14:02:43] [config] transformer-preprocess: "" [2022-01-15 14:02:43] [config] transformer-tied-layers: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] transformer-train-position-embeddings: false [2022-01-15 14:02:43] [config] tsv: false [2022-01-15 14:02:43] [config] tsv-fields: 0 [2022-01-15 14:02:43] [config] type: transformer [2022-01-15 14:02:43] [config] ulr: false [2022-01-15 14:02:43] [config] ulr-dim-emb: 0 [2022-01-15 14:02:43] [config] ulr-dropout: 0 [2022-01-15 14:02:43] [config] ulr-keys-vectors: "" [2022-01-15 14:02:43] [config] ulr-query-vectors: "" [2022-01-15 14:02:43] [config] ulr-softmax-temperature: 1 [2022-01-15 14:02:43] [config] ulr-trainable-transformation: false [2022-01-15 14:02:43] [config] unlikelihood-loss: false [2022-01-15 14:02:43] [config] valid-freq: 5000 [2022-01-15 14:02:43] [config] valid-log: "" [2022-01-15 14:02:43] [config] valid-max-length: 1000 [2022-01-15 14:02:43] [config] valid-metrics: [2022-01-15 14:02:43] [config] - cross-entropy [2022-01-15 14:02:43] [config] valid-mini-batch: 32 [2022-01-15 14:02:43] [config] valid-reset-stalled: false [2022-01-15 14:02:43] [config] valid-script-args: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] valid-script-path: "" [2022-01-15 14:02:43] [config] valid-sets: [2022-01-15 14:02:43] [config] [] [2022-01-15 14:02:43] [config] valid-translation-output: "" [2022-01-15 14:02:43] [config] vocabs: [2022-01-15 14:02:43] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-15 14:02:43] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.vocab.32000 [2022-01-15 14:02:43] [config] word-penalty: 0 [2022-01-15 14:02:43] [config] word-scores: false [2022-01-15 14:02:43] [config] workspace: 10000 [2022-01-15 14:02:43] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 14:02:43] [training] Using single-device training [2022-01-15 14:02:43] [data] Loading vocabulary from text file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 [2022-01-15 14:02:43] Error: DefaultVocabulary file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.vocab.32000 is expected to contain an entry for [2022-01-15 14:02:43] Error: Aborted from 
marian::DefaultVocab::addRequiredVocabulary(const string&, bool):: in /home/wmi/Workspace/marian/src/data/default_vocab.cpp:199 [CALL STACK] [0x56222e7700d8] marian::DefaultVocab::addRequiredVocabulary(std::__cxx11::basic_string,std::allocator> const&,bool)::{lambda(std::__cxx11::basic_string,std::allocator> const&,std::__cxx11::basic_string,std::allocator> const&,marian::Word)#1}:: operator() (std::__cxx11::basic_string,std::allocator> const&, std::__cxx11::basic_string,std::allocator> const&, marian::Word) const + 0x4a8 [0x56222e7705d9] marian::DefaultVocab:: addRequiredVocabulary (std::__cxx11::basic_string,std::allocator> const&, bool) + 0x59 [0x56222e77368b] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0xb9b [0x56222e762e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x56222e763728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x56222e7af189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x56222e7c2084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x56222e620f8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x56222e6a894b] marian::Train:: run () + 0x19cb [0x56222e5af389] mainTrainer (int, char**) + 0x5e9 [0x56222e56d1bc] main + 0x3c [0x7fbc01e290b3] __libc_start_main + 0xf3 [0x56222e5adb0e] _start + 0x2e [2022-01-15 14:24:00] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 14:24:00] [marian] Running on s470607-gpu as process 2853 with command line: [2022-01-15 14:24:00] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings-all --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 14:24:00] [config] after: 0e [2022-01-15 14:24:00] [config] after-batches: 0 [2022-01-15 14:24:00] [config] after-epochs: 1 [2022-01-15 14:24:00] [config] all-caps-every: 0 [2022-01-15 14:24:00] [config] allow-unk: false [2022-01-15 14:24:00] [config] authors: false [2022-01-15 14:24:00] [config] beam-size: 6 [2022-01-15 14:24:00] [config] bert-class-symbol: "[CLS]" [2022-01-15 14:24:00] [config] bert-mask-symbol: "[MASK]" [2022-01-15 14:24:00] [config] bert-masking-fraction: 0.15 [2022-01-15 14:24:00] [config] bert-sep-symbol: "[SEP]" [2022-01-15 14:24:00] [config] bert-train-type-embeddings: true [2022-01-15 14:24:00] [config] bert-type-vocab-size: 2 [2022-01-15 14:24:00] [config] build-info: "" [2022-01-15 14:24:00] [config] cite: false [2022-01-15 14:24:00] [config] clip-norm: 5 [2022-01-15 14:24:00] [config] cost-scaling: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] cost-type: ce-sum [2022-01-15 14:24:00] [config] cpu-threads: 0 [2022-01-15 14:24:00] [config] data-weighting: "" [2022-01-15 14:24:00] [config] data-weighting-type: sentence [2022-01-15 14:24:00] [config] 
dec-cell: gru [2022-01-15 14:24:00] [config] dec-cell-base-depth: 2 [2022-01-15 14:24:00] [config] dec-cell-high-depth: 1 [2022-01-15 14:24:00] [config] dec-depth: 6 [2022-01-15 14:24:00] [config] devices: [2022-01-15 14:24:00] [config] - 0 [2022-01-15 14:24:00] [config] dim-emb: 512 [2022-01-15 14:24:00] [config] dim-rnn: 1024 [2022-01-15 14:24:00] [config] dim-vocabs: [2022-01-15 14:24:00] [config] - 0 [2022-01-15 14:24:00] [config] - 0 [2022-01-15 14:24:00] [config] disp-first: 0 [2022-01-15 14:24:00] [config] disp-freq: 500 [2022-01-15 14:24:00] [config] disp-label-counts: true [2022-01-15 14:24:00] [config] dropout-rnn: 0 [2022-01-15 14:24:00] [config] dropout-src: 0 [2022-01-15 14:24:00] [config] dropout-trg: 0 [2022-01-15 14:24:00] [config] dump-config: "" [2022-01-15 14:24:00] [config] early-stopping: 10 [2022-01-15 14:24:00] [config] embedding-fix-src: false [2022-01-15 14:24:00] [config] embedding-fix-trg: false [2022-01-15 14:24:00] [config] embedding-normalization: false [2022-01-15 14:24:00] [config] embedding-vectors: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] enc-cell: gru [2022-01-15 14:24:00] [config] enc-cell-depth: 1 [2022-01-15 14:24:00] [config] enc-depth: 6 [2022-01-15 14:24:00] [config] enc-type: bidirectional [2022-01-15 14:24:00] [config] english-title-case-every: 0 [2022-01-15 14:24:00] [config] exponential-smoothing: 0.0001 [2022-01-15 14:24:00] [config] factor-weight: 1 [2022-01-15 14:24:00] [config] grad-dropping-momentum: 0 [2022-01-15 14:24:00] [config] grad-dropping-rate: 0 [2022-01-15 14:24:00] [config] grad-dropping-warmup: 100 [2022-01-15 14:24:00] [config] gradient-checkpointing: false [2022-01-15 14:24:00] [config] guided-alignment: none [2022-01-15 14:24:00] [config] guided-alignment-cost: mse [2022-01-15 14:24:00] [config] guided-alignment-weight: 0.1 [2022-01-15 14:24:00] [config] ignore-model-config: false [2022-01-15 14:24:00] [config] input-types: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] interpolate-env-vars: false [2022-01-15 14:24:00] [config] keep-best: false [2022-01-15 14:24:00] [config] label-smoothing: 0.1 [2022-01-15 14:24:00] [config] layer-normalization: false [2022-01-15 14:24:00] [config] learn-rate: 0.0003 [2022-01-15 14:24:00] [config] lemma-dim-emb: 0 [2022-01-15 14:24:00] [config] log: /home/wmi/train.log [2022-01-15 14:24:00] [config] log-level: info [2022-01-15 14:24:00] [config] log-time-zone: "" [2022-01-15 14:24:00] [config] logical-epoch: [2022-01-15 14:24:00] [config] - 1e [2022-01-15 14:24:00] [config] - 0 [2022-01-15 14:24:00] [config] lr-decay: 0 [2022-01-15 14:24:00] [config] lr-decay-freq: 50000 [2022-01-15 14:24:00] [config] lr-decay-inv-sqrt: [2022-01-15 14:24:00] [config] - 16000 [2022-01-15 14:24:00] [config] lr-decay-repeat-warmup: false [2022-01-15 14:24:00] [config] lr-decay-reset-optimizer: false [2022-01-15 14:24:00] [config] lr-decay-start: [2022-01-15 14:24:00] [config] - 10 [2022-01-15 14:24:00] [config] - 1 [2022-01-15 14:24:00] [config] lr-decay-strategy: epoch+stalled [2022-01-15 14:24:00] [config] lr-report: true [2022-01-15 14:24:00] [config] lr-warmup: 16000 [2022-01-15 14:24:00] [config] lr-warmup-at-reload: false [2022-01-15 14:24:00] [config] lr-warmup-cycle: false [2022-01-15 14:24:00] [config] lr-warmup-start-rate: 0 [2022-01-15 14:24:00] [config] max-length: 100 [2022-01-15 14:24:00] [config] max-length-crop: false [2022-01-15 14:24:00] [config] max-length-factor: 3 [2022-01-15 14:24:00] [config] maxi-batch: 1000 [2022-01-15 14:24:00] [config] 
maxi-batch-sort: trg [2022-01-15 14:24:00] [config] mini-batch: 64 [2022-01-15 14:24:00] [config] mini-batch-fit: true [2022-01-15 14:24:00] [config] mini-batch-fit-step: 10 [2022-01-15 14:24:00] [config] mini-batch-track-lr: false [2022-01-15 14:24:00] [config] mini-batch-warmup: 0 [2022-01-15 14:24:00] [config] mini-batch-words: 0 [2022-01-15 14:24:00] [config] mini-batch-words-ref: 0 [2022-01-15 14:24:00] [config] model: model.npz [2022-01-15 14:24:00] [config] multi-loss-type: sum [2022-01-15 14:24:00] [config] multi-node: false [2022-01-15 14:24:00] [config] multi-node-overlap: true [2022-01-15 14:24:00] [config] n-best: false [2022-01-15 14:24:00] [config] no-nccl: false [2022-01-15 14:24:00] [config] no-reload: false [2022-01-15 14:24:00] [config] no-restore-corpus: false [2022-01-15 14:24:00] [config] normalize: 0.6 [2022-01-15 14:24:00] [config] normalize-gradient: false [2022-01-15 14:24:00] [config] num-devices: 0 [2022-01-15 14:24:00] [config] optimizer: adam [2022-01-15 14:24:00] [config] optimizer-delay: 1 [2022-01-15 14:24:00] [config] optimizer-params: [2022-01-15 14:24:00] [config] - 0.9 [2022-01-15 14:24:00] [config] - 0.98 [2022-01-15 14:24:00] [config] - 1e-09 [2022-01-15 14:24:00] [config] output-omit-bias: false [2022-01-15 14:24:00] [config] overwrite: true [2022-01-15 14:24:00] [config] precision: [2022-01-15 14:24:00] [config] - float32 [2022-01-15 14:24:00] [config] - float32 [2022-01-15 14:24:00] [config] - float32 [2022-01-15 14:24:00] [config] pretrained-model: "" [2022-01-15 14:24:00] [config] quantize-biases: false [2022-01-15 14:24:00] [config] quantize-bits: 0 [2022-01-15 14:24:00] [config] quantize-log-based: false [2022-01-15 14:24:00] [config] quantize-optimization-steps: 0 [2022-01-15 14:24:00] [config] quiet: false [2022-01-15 14:24:00] [config] quiet-translation: false [2022-01-15 14:24:00] [config] relative-paths: false [2022-01-15 14:24:00] [config] right-left: false [2022-01-15 14:24:00] [config] save-freq: 5000 [2022-01-15 14:24:00] [config] seed: 0 [2022-01-15 14:24:00] [config] sentencepiece-alphas: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] sentencepiece-max-lines: 2000000 [2022-01-15 14:24:00] [config] sentencepiece-options: "" [2022-01-15 14:24:00] [config] shuffle: data [2022-01-15 14:24:00] [config] shuffle-in-ram: false [2022-01-15 14:24:00] [config] sigterm: save-and-exit [2022-01-15 14:24:00] [config] skip: false [2022-01-15 14:24:00] [config] sqlite: "" [2022-01-15 14:24:00] [config] sqlite-drop: false [2022-01-15 14:24:00] [config] sync-sgd: false [2022-01-15 14:24:00] [config] tempdir: /tmp [2022-01-15 14:24:00] [config] tied-embeddings: false [2022-01-15 14:24:00] [config] tied-embeddings-all: true [2022-01-15 14:24:00] [config] tied-embeddings-src: false [2022-01-15 14:24:00] [config] train-embedder-rank: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] train-sets: [2022-01-15 14:24:00] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en [2022-01-15 14:24:00] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl [2022-01-15 14:24:00] [config] transformer-aan-activation: swish [2022-01-15 14:24:00] [config] transformer-aan-depth: 2 [2022-01-15 14:24:00] [config] transformer-aan-nogate: false [2022-01-15 14:24:00] [config] transformer-decoder-autoreg: self-attention [2022-01-15 14:24:00] [config] transformer-depth-scaling: false [2022-01-15 14:24:00] [config] transformer-dim-aan: 2048 [2022-01-15 14:24:00] [config] transformer-dim-ffn: 2048 [2022-01-15 
14:24:00] [config] transformer-dropout: 0.1 [2022-01-15 14:24:00] [config] transformer-dropout-attention: 0 [2022-01-15 14:24:00] [config] transformer-dropout-ffn: 0 [2022-01-15 14:24:00] [config] transformer-ffn-activation: swish [2022-01-15 14:24:00] [config] transformer-ffn-depth: 2 [2022-01-15 14:24:00] [config] transformer-guided-alignment-layer: last [2022-01-15 14:24:00] [config] transformer-heads: 8 [2022-01-15 14:24:00] [config] transformer-no-projection: false [2022-01-15 14:24:00] [config] transformer-pool: false [2022-01-15 14:24:00] [config] transformer-postprocess: dan [2022-01-15 14:24:00] [config] transformer-postprocess-emb: d [2022-01-15 14:24:00] [config] transformer-postprocess-top: "" [2022-01-15 14:24:00] [config] transformer-preprocess: "" [2022-01-15 14:24:00] [config] transformer-tied-layers: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] transformer-train-position-embeddings: false [2022-01-15 14:24:00] [config] tsv: false [2022-01-15 14:24:00] [config] tsv-fields: 0 [2022-01-15 14:24:00] [config] type: transformer [2022-01-15 14:24:00] [config] ulr: false [2022-01-15 14:24:00] [config] ulr-dim-emb: 0 [2022-01-15 14:24:00] [config] ulr-dropout: 0 [2022-01-15 14:24:00] [config] ulr-keys-vectors: "" [2022-01-15 14:24:00] [config] ulr-query-vectors: "" [2022-01-15 14:24:00] [config] ulr-softmax-temperature: 1 [2022-01-15 14:24:00] [config] ulr-trainable-transformation: false [2022-01-15 14:24:00] [config] unlikelihood-loss: false [2022-01-15 14:24:00] [config] valid-freq: 5000 [2022-01-15 14:24:00] [config] valid-log: "" [2022-01-15 14:24:00] [config] valid-max-length: 1000 [2022-01-15 14:24:00] [config] valid-metrics: [2022-01-15 14:24:00] [config] - cross-entropy [2022-01-15 14:24:00] [config] valid-mini-batch: 32 [2022-01-15 14:24:00] [config] valid-reset-stalled: false [2022-01-15 14:24:00] [config] valid-script-args: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] valid-script-path: "" [2022-01-15 14:24:00] [config] valid-sets: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] valid-translation-output: "" [2022-01-15 14:24:00] [config] vocabs: [2022-01-15 14:24:00] [config] [] [2022-01-15 14:24:00] [config] word-penalty: 0 [2022-01-15 14:24:00] [config] word-scores: false [2022-01-15 14:24:00] [config] workspace: 10000 [2022-01-15 14:24:00] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 14:24:00] [training] Using single-device training [2022-01-15 14:24:00] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 14:24:00] [data] Vocabularies will be built separately for each file. 
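Note: the run below has no --vocabs argument, so Marian builds a separate vocabulary over each training file (861,279 English and 1,296,038 Polish entries). With --tied-embeddings-all the source, target and output embeddings share a single matrix (the parameter 'Wemb' in the error that follows), which only works when both sides use one vocabulary of the same size; the mismatched sizes are what trigger the shape error. A minimal sketch of one way around this, assuming the marian-vocab tool from the same build and a joint vocabulary passed for both sides (paths and the --max-size value are illustrative; check marian-vocab --help):

    cat /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en \
        /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl \
        | /home/wmi/marian/build/marian-vocab --max-size 32000 > train.joint.yml
    # then train with: ... --tied-embeddings-all --vocabs train.joint.yml train.joint.yml ...

The later runs in this log take the other route: they keep separate vocabularies and replace --tied-embeddings-all with --tied-embeddings, which ties only the target input and output embeddings and therefore tolerates different source and target vocabulary sizes.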
[2022-01-15 14:24:00] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en
[2022-01-15 14:24:00] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en
[2022-01-15 14:24:00] [data] Creating vocabulary /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.yml from /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en
[2022-01-15 14:24:15] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.yml
[2022-01-15 14:24:19] [data] Setting vocabulary size for input 0 to 861,279
[2022-01-15 14:24:19] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl
[2022-01-15 14:24:19] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl
[2022-01-15 14:24:19] [data] Creating vocabulary /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.yml from /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl
[2022-01-15 14:24:40] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.yml
[2022-01-15 14:24:45] [data] Setting vocabulary size for input 1 to 1,296,038
[2022-01-15 14:24:45] [comm] Compiled without MPI support. Running as a single process on s470607-gpu
[2022-01-15 14:24:45] [batching] Collecting statistics for batch fitting with step size 10
[2022-01-15 14:24:46] [memory] Extending reserved space to 10112 MB (device gpu0)
[2022-01-15 14:24:46] Error: Requested shape shape=1296038x512 size=663571456 for existing parameter 'Wemb' does not match original shape shape=861279x512 size=440974848
[2022-01-15 14:24:46] Error: Aborted from marian::Expr marian::ExpressionGraph::param(const string&, const marian::Shape&, marian::Ptr&, marian::Type, bool, bool) in /home/wmi/Workspace/marian/src/graph/expression_graph.h:314
[CALL STACK]
[0x564e61ee3642] marian::ExpressionGraph:: param (std::__cxx11::basic_string,std::allocator> const&, marian::Shape const&, std::shared_ptr const&, marian::Type, bool, bool) + 0x992
[0x564e625cd899] marian::Embedding:: Embedding (std::shared_ptr, std::shared_ptr) + 0x4c9
[0x564e625de475] std::shared_ptr marian:: New const&,std::shared_ptr&>(std::shared_ptr const&, std::shared_ptr&) + 0x85
[0x564e625ce27d] marian::EncoderDecoderLayerBase:: createEmbeddingLayer () const + 0x59d
[0x564e625cebe5] marian::EncoderDecoderLayerBase:: getEmbeddingLayer (bool) const + 0x145
[0x564e6219502f] marian::DecoderBase:: embeddingsFromBatch (std::shared_ptr, std::shared_ptr, std::shared_ptr) + 0x8f
[0x564e62219766] marian::EncoderDecoder:: stepAll (std::shared_ptr, std::shared_ptr, bool) + 0x196
[0x564e621cfc69] marian::models::EncoderDecoderCECost:: apply (std::shared_ptr, std::shared_ptr, std::shared_ptr, bool) + 0x119
[0x564e61e22c82] marian::models::Trainer:: build (std::shared_ptr, std::shared_ptr, bool) + 0xb2
[0x564e622b75f4] marian::GraphGroup:: collectStats (std::shared_ptr, std::shared_ptr, std::vector,std::allocator>> const&, double) + 0xb84
[0x564e61ef9269] marian::Train:: run () + 0x2e9
[0x564e61e01389] mainTrainer (int, char**) + 0x5e9
[0x564e61dbf1bc] main + 0x3c
[0x7f906e48b0b3] __libc_start_main + 0xf3
[0x564e61dffb0e] _start + 0x2e
[2022-01-15 14:34:04] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800
[2022-01-15
14:34:04] [marian] Running on s470607-gpu as process 3011 with command line: [2022-01-15 14:34:04] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings-all --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 14:34:04] [config] after: 0e [2022-01-15 14:34:04] [config] after-batches: 0 [2022-01-15 14:34:04] [config] after-epochs: 1 [2022-01-15 14:34:04] [config] all-caps-every: 0 [2022-01-15 14:34:04] [config] allow-unk: false [2022-01-15 14:34:04] [config] authors: false [2022-01-15 14:34:04] [config] beam-size: 6 [2022-01-15 14:34:04] [config] bert-class-symbol: "[CLS]" [2022-01-15 14:34:04] [config] bert-mask-symbol: "[MASK]" [2022-01-15 14:34:04] [config] bert-masking-fraction: 0.15 [2022-01-15 14:34:04] [config] bert-sep-symbol: "[SEP]" [2022-01-15 14:34:04] [config] bert-train-type-embeddings: true [2022-01-15 14:34:04] [config] bert-type-vocab-size: 2 [2022-01-15 14:34:04] [config] build-info: "" [2022-01-15 14:34:04] [config] cite: false [2022-01-15 14:34:04] [config] clip-norm: 5 [2022-01-15 14:34:04] [config] cost-scaling: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] cost-type: ce-sum [2022-01-15 14:34:04] [config] cpu-threads: 0 [2022-01-15 14:34:04] [config] data-weighting: "" [2022-01-15 14:34:04] [config] data-weighting-type: sentence [2022-01-15 14:34:04] [config] dec-cell: gru [2022-01-15 14:34:04] [config] dec-cell-base-depth: 2 [2022-01-15 14:34:04] [config] dec-cell-high-depth: 1 [2022-01-15 14:34:04] [config] dec-depth: 6 [2022-01-15 14:34:04] [config] devices: [2022-01-15 14:34:04] [config] - 0 [2022-01-15 14:34:04] [config] dim-emb: 512 [2022-01-15 14:34:04] [config] dim-rnn: 1024 [2022-01-15 14:34:04] [config] dim-vocabs: [2022-01-15 14:34:04] [config] - 0 [2022-01-15 14:34:04] [config] - 0 [2022-01-15 14:34:04] [config] disp-first: 0 [2022-01-15 14:34:04] [config] disp-freq: 500 [2022-01-15 14:34:04] [config] disp-label-counts: true [2022-01-15 14:34:04] [config] dropout-rnn: 0 [2022-01-15 14:34:04] [config] dropout-src: 0 [2022-01-15 14:34:04] [config] dropout-trg: 0 [2022-01-15 14:34:04] [config] dump-config: "" [2022-01-15 14:34:04] [config] early-stopping: 10 [2022-01-15 14:34:04] [config] embedding-fix-src: false [2022-01-15 14:34:04] [config] embedding-fix-trg: false [2022-01-15 14:34:04] [config] embedding-normalization: false [2022-01-15 14:34:04] [config] embedding-vectors: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] enc-cell: gru [2022-01-15 14:34:04] [config] enc-cell-depth: 1 [2022-01-15 14:34:04] [config] enc-depth: 6 [2022-01-15 14:34:04] [config] enc-type: bidirectional [2022-01-15 14:34:04] [config] english-title-case-every: 0 [2022-01-15 14:34:04] [config] exponential-smoothing: 0.0001 [2022-01-15 14:34:04] [config] factor-weight: 1 [2022-01-15 14:34:04] [config] grad-dropping-momentum: 0 [2022-01-15 14:34:04] [config] grad-dropping-rate: 0 [2022-01-15 14:34:04] [config] 
grad-dropping-warmup: 100 [2022-01-15 14:34:04] [config] gradient-checkpointing: false [2022-01-15 14:34:04] [config] guided-alignment: none [2022-01-15 14:34:04] [config] guided-alignment-cost: mse [2022-01-15 14:34:04] [config] guided-alignment-weight: 0.1 [2022-01-15 14:34:04] [config] ignore-model-config: false [2022-01-15 14:34:04] [config] input-types: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] interpolate-env-vars: false [2022-01-15 14:34:04] [config] keep-best: false [2022-01-15 14:34:04] [config] label-smoothing: 0.1 [2022-01-15 14:34:04] [config] layer-normalization: false [2022-01-15 14:34:04] [config] learn-rate: 0.0003 [2022-01-15 14:34:04] [config] lemma-dim-emb: 0 [2022-01-15 14:34:04] [config] log: /home/wmi/train.log [2022-01-15 14:34:04] [config] log-level: info [2022-01-15 14:34:04] [config] log-time-zone: "" [2022-01-15 14:34:04] [config] logical-epoch: [2022-01-15 14:34:04] [config] - 1e [2022-01-15 14:34:04] [config] - 0 [2022-01-15 14:34:04] [config] lr-decay: 0 [2022-01-15 14:34:04] [config] lr-decay-freq: 50000 [2022-01-15 14:34:04] [config] lr-decay-inv-sqrt: [2022-01-15 14:34:04] [config] - 16000 [2022-01-15 14:34:04] [config] lr-decay-repeat-warmup: false [2022-01-15 14:34:04] [config] lr-decay-reset-optimizer: false [2022-01-15 14:34:04] [config] lr-decay-start: [2022-01-15 14:34:04] [config] - 10 [2022-01-15 14:34:04] [config] - 1 [2022-01-15 14:34:04] [config] lr-decay-strategy: epoch+stalled [2022-01-15 14:34:04] [config] lr-report: true [2022-01-15 14:34:04] [config] lr-warmup: 16000 [2022-01-15 14:34:04] [config] lr-warmup-at-reload: false [2022-01-15 14:34:04] [config] lr-warmup-cycle: false [2022-01-15 14:34:04] [config] lr-warmup-start-rate: 0 [2022-01-15 14:34:04] [config] max-length: 100 [2022-01-15 14:34:04] [config] max-length-crop: false [2022-01-15 14:34:04] [config] max-length-factor: 3 [2022-01-15 14:34:04] [config] maxi-batch: 1000 [2022-01-15 14:34:04] [config] maxi-batch-sort: trg [2022-01-15 14:34:04] [config] mini-batch: 64 [2022-01-15 14:34:04] [config] mini-batch-fit: true [2022-01-15 14:34:04] [config] mini-batch-fit-step: 10 [2022-01-15 14:34:04] [config] mini-batch-track-lr: false [2022-01-15 14:34:04] [config] mini-batch-warmup: 0 [2022-01-15 14:34:04] [config] mini-batch-words: 0 [2022-01-15 14:34:04] [config] mini-batch-words-ref: 0 [2022-01-15 14:34:04] [config] model: model.npz [2022-01-15 14:34:04] [config] multi-loss-type: sum [2022-01-15 14:34:04] [config] multi-node: false [2022-01-15 14:34:04] [config] multi-node-overlap: true [2022-01-15 14:34:04] [config] n-best: false [2022-01-15 14:34:04] [config] no-nccl: false [2022-01-15 14:34:04] [config] no-reload: false [2022-01-15 14:34:04] [config] no-restore-corpus: false [2022-01-15 14:34:04] [config] normalize: 0.6 [2022-01-15 14:34:04] [config] normalize-gradient: false [2022-01-15 14:34:04] [config] num-devices: 0 [2022-01-15 14:34:04] [config] optimizer: adam [2022-01-15 14:34:04] [config] optimizer-delay: 1 [2022-01-15 14:34:04] [config] optimizer-params: [2022-01-15 14:34:04] [config] - 0.9 [2022-01-15 14:34:04] [config] - 0.98 [2022-01-15 14:34:04] [config] - 1e-09 [2022-01-15 14:34:04] [config] output-omit-bias: false [2022-01-15 14:34:04] [config] overwrite: true [2022-01-15 14:34:04] [config] precision: [2022-01-15 14:34:04] [config] - float32 [2022-01-15 14:34:04] [config] - float32 [2022-01-15 14:34:04] [config] - float32 [2022-01-15 14:34:04] [config] pretrained-model: "" [2022-01-15 14:34:04] [config] quantize-biases: false [2022-01-15 
14:34:04] [config] quantize-bits: 0 [2022-01-15 14:34:04] [config] quantize-log-based: false [2022-01-15 14:34:04] [config] quantize-optimization-steps: 0 [2022-01-15 14:34:04] [config] quiet: false [2022-01-15 14:34:04] [config] quiet-translation: false [2022-01-15 14:34:04] [config] relative-paths: false [2022-01-15 14:34:04] [config] right-left: false [2022-01-15 14:34:04] [config] save-freq: 5000 [2022-01-15 14:34:04] [config] seed: 0 [2022-01-15 14:34:04] [config] sentencepiece-alphas: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] sentencepiece-max-lines: 2000000 [2022-01-15 14:34:04] [config] sentencepiece-options: "" [2022-01-15 14:34:04] [config] shuffle: data [2022-01-15 14:34:04] [config] shuffle-in-ram: false [2022-01-15 14:34:04] [config] sigterm: save-and-exit [2022-01-15 14:34:04] [config] skip: false [2022-01-15 14:34:04] [config] sqlite: "" [2022-01-15 14:34:04] [config] sqlite-drop: false [2022-01-15 14:34:04] [config] sync-sgd: false [2022-01-15 14:34:04] [config] tempdir: /tmp [2022-01-15 14:34:04] [config] tied-embeddings: false [2022-01-15 14:34:04] [config] tied-embeddings-all: true [2022-01-15 14:34:04] [config] tied-embeddings-src: false [2022-01-15 14:34:04] [config] train-embedder-rank: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] train-sets: [2022-01-15 14:34:04] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 [2022-01-15 14:34:04] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 [2022-01-15 14:34:04] [config] transformer-aan-activation: swish [2022-01-15 14:34:04] [config] transformer-aan-depth: 2 [2022-01-15 14:34:04] [config] transformer-aan-nogate: false [2022-01-15 14:34:04] [config] transformer-decoder-autoreg: self-attention [2022-01-15 14:34:04] [config] transformer-depth-scaling: false [2022-01-15 14:34:04] [config] transformer-dim-aan: 2048 [2022-01-15 14:34:04] [config] transformer-dim-ffn: 2048 [2022-01-15 14:34:04] [config] transformer-dropout: 0.1 [2022-01-15 14:34:04] [config] transformer-dropout-attention: 0 [2022-01-15 14:34:04] [config] transformer-dropout-ffn: 0 [2022-01-15 14:34:04] [config] transformer-ffn-activation: swish [2022-01-15 14:34:04] [config] transformer-ffn-depth: 2 [2022-01-15 14:34:04] [config] transformer-guided-alignment-layer: last [2022-01-15 14:34:04] [config] transformer-heads: 8 [2022-01-15 14:34:04] [config] transformer-no-projection: false [2022-01-15 14:34:04] [config] transformer-pool: false [2022-01-15 14:34:04] [config] transformer-postprocess: dan [2022-01-15 14:34:04] [config] transformer-postprocess-emb: d [2022-01-15 14:34:04] [config] transformer-postprocess-top: "" [2022-01-15 14:34:04] [config] transformer-preprocess: "" [2022-01-15 14:34:04] [config] transformer-tied-layers: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] transformer-train-position-embeddings: false [2022-01-15 14:34:04] [config] tsv: false [2022-01-15 14:34:04] [config] tsv-fields: 0 [2022-01-15 14:34:04] [config] type: transformer [2022-01-15 14:34:04] [config] ulr: false [2022-01-15 14:34:04] [config] ulr-dim-emb: 0 [2022-01-15 14:34:04] [config] ulr-dropout: 0 [2022-01-15 14:34:04] [config] ulr-keys-vectors: "" [2022-01-15 14:34:04] [config] ulr-query-vectors: "" [2022-01-15 14:34:04] [config] ulr-softmax-temperature: 1 [2022-01-15 14:34:04] [config] ulr-trainable-transformation: false [2022-01-15 14:34:04] [config] unlikelihood-loss: false [2022-01-15 14:34:04] [config] valid-freq: 5000 [2022-01-15 14:34:04] [config] 
valid-log: "" [2022-01-15 14:34:04] [config] valid-max-length: 1000 [2022-01-15 14:34:04] [config] valid-metrics: [2022-01-15 14:34:04] [config] - cross-entropy [2022-01-15 14:34:04] [config] valid-mini-batch: 32 [2022-01-15 14:34:04] [config] valid-reset-stalled: false [2022-01-15 14:34:04] [config] valid-script-args: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] valid-script-path: "" [2022-01-15 14:34:04] [config] valid-sets: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] valid-translation-output: "" [2022-01-15 14:34:04] [config] vocabs: [2022-01-15 14:34:04] [config] [] [2022-01-15 14:34:04] [config] word-penalty: 0 [2022-01-15 14:34:04] [config] word-scores: false [2022-01-15 14:34:04] [config] workspace: 10000 [2022-01-15 14:34:04] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 14:34:04] [training] Using single-device training [2022-01-15 14:34:04] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 14:34:04] [data] Vocabularies will be built separately for each file. [2022-01-15 14:34:04] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 [2022-01-15 14:34:04] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 [2022-01-15 14:34:04] [data] Creating vocabulary /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000.yml from /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 [2022-01-15 14:34:12] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000.yml [2022-01-15 14:34:12] [data] Setting vocabulary size for input 0 to 18,703 [2022-01-15 14:34:12] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 [2022-01-15 14:34:12] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 [2022-01-15 14:34:12] [data] Creating vocabulary /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000.yml from /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 [2022-01-15 14:34:20] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000.yml [2022-01-15 14:34:20] [data] Setting vocabulary size for input 1 to 27,729 [2022-01-15 14:34:20] [comm] Compiled without MPI support. 
Running as a single process on s470607-gpu
[2022-01-15 14:34:20] [batching] Collecting statistics for batch fitting with step size 10
[2022-01-15 14:34:20] [memory] Extending reserved space to 10112 MB (device gpu0)
[2022-01-15 14:34:20] Error: Requested shape shape=27729x512 size=14197248 for existing parameter 'Wemb' does not match original shape shape=18703x512 size=9575936
[2022-01-15 14:34:20] Error: Aborted from marian::Expr marian::ExpressionGraph::param(const string&, const marian::Shape&, marian::Ptr&, marian::Type, bool, bool) in /home/wmi/Workspace/marian/src/graph/expression_graph.h:314
[CALL STACK]
[0x563b98926642] marian::ExpressionGraph:: param (std::__cxx11::basic_string,std::allocator> const&, marian::Shape const&, std::shared_ptr const&, marian::Type, bool, bool) + 0x992
[0x563b99010899] marian::Embedding:: Embedding (std::shared_ptr, std::shared_ptr) + 0x4c9
[0x563b99021475] std::shared_ptr marian:: New const&,std::shared_ptr&>(std::shared_ptr const&, std::shared_ptr&) + 0x85
[0x563b9901127d] marian::EncoderDecoderLayerBase:: createEmbeddingLayer () const + 0x59d
[0x563b99011be5] marian::EncoderDecoderLayerBase:: getEmbeddingLayer (bool) const + 0x145
[0x563b98bd802f] marian::DecoderBase:: embeddingsFromBatch (std::shared_ptr, std::shared_ptr, std::shared_ptr) + 0x8f
[0x563b98c5c766] marian::EncoderDecoder:: stepAll (std::shared_ptr, std::shared_ptr, bool) + 0x196
[0x563b98c12c69] marian::models::EncoderDecoderCECost:: apply (std::shared_ptr, std::shared_ptr, std::shared_ptr, bool) + 0x119
[0x563b98865c82] marian::models::Trainer:: build (std::shared_ptr, std::shared_ptr, bool) + 0xb2
[0x563b98cfa5f4] marian::GraphGroup:: collectStats (std::shared_ptr, std::shared_ptr, std::vector,std::allocator>> const&, double) + 0xb84
[0x563b9893c269] marian::Train:: run () + 0x2e9
[0x563b98844389] mainTrainer (int, char**) + 0x5e9
[0x563b988021bc] main + 0x3c
[0x7ffa93f970b3] __libc_start_main + 0xf3
[0x563b98842b0e] _start + 0x2e
[2022-01-15 14:38:37] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800
[2022-01-15 14:38:37] [marian] Running on s470607-gpu as process 3044 with command line:
[2022-01-15 14:38:37] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1
[2022-01-15 14:38:37] [config] after: 0e
[2022-01-15 14:38:37] [config] after-batches: 0
[2022-01-15 14:38:37] [config] after-epochs: 1
[2022-01-15 14:38:37] [config] all-caps-every: 0
[2022-01-15 14:38:37] [config] allow-unk: false
[2022-01-15 14:38:37] [config] authors: false
[2022-01-15 14:38:37] [config] beam-size: 6
[2022-01-15 14:38:37] [config] bert-class-symbol: "[CLS]"
[2022-01-15 14:38:37] [config] bert-mask-symbol: "[MASK]"
[2022-01-15 14:38:37] [config] bert-masking-fraction: 0.15
[2022-01-15 14:38:37] [config] bert-sep-symbol: "[SEP]"
[2022-01-15 14:38:37] [config] bert-train-type-embeddings: true
[2022-01-15 14:38:37] [config]
bert-type-vocab-size: 2 [2022-01-15 14:38:37] [config] build-info: "" [2022-01-15 14:38:37] [config] cite: false [2022-01-15 14:38:37] [config] clip-norm: 5 [2022-01-15 14:38:37] [config] cost-scaling: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] cost-type: ce-sum [2022-01-15 14:38:37] [config] cpu-threads: 0 [2022-01-15 14:38:37] [config] data-weighting: "" [2022-01-15 14:38:37] [config] data-weighting-type: sentence [2022-01-15 14:38:37] [config] dec-cell: gru [2022-01-15 14:38:37] [config] dec-cell-base-depth: 2 [2022-01-15 14:38:37] [config] dec-cell-high-depth: 1 [2022-01-15 14:38:37] [config] dec-depth: 6 [2022-01-15 14:38:37] [config] devices: [2022-01-15 14:38:37] [config] - 0 [2022-01-15 14:38:37] [config] dim-emb: 512 [2022-01-15 14:38:37] [config] dim-rnn: 1024 [2022-01-15 14:38:37] [config] dim-vocabs: [2022-01-15 14:38:37] [config] - 0 [2022-01-15 14:38:37] [config] - 0 [2022-01-15 14:38:37] [config] disp-first: 0 [2022-01-15 14:38:37] [config] disp-freq: 500 [2022-01-15 14:38:37] [config] disp-label-counts: true [2022-01-15 14:38:37] [config] dropout-rnn: 0 [2022-01-15 14:38:37] [config] dropout-src: 0 [2022-01-15 14:38:37] [config] dropout-trg: 0 [2022-01-15 14:38:37] [config] dump-config: "" [2022-01-15 14:38:37] [config] early-stopping: 10 [2022-01-15 14:38:37] [config] embedding-fix-src: false [2022-01-15 14:38:37] [config] embedding-fix-trg: false [2022-01-15 14:38:37] [config] embedding-normalization: false [2022-01-15 14:38:37] [config] embedding-vectors: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] enc-cell: gru [2022-01-15 14:38:37] [config] enc-cell-depth: 1 [2022-01-15 14:38:37] [config] enc-depth: 6 [2022-01-15 14:38:37] [config] enc-type: bidirectional [2022-01-15 14:38:37] [config] english-title-case-every: 0 [2022-01-15 14:38:37] [config] exponential-smoothing: 0.0001 [2022-01-15 14:38:37] [config] factor-weight: 1 [2022-01-15 14:38:37] [config] grad-dropping-momentum: 0 [2022-01-15 14:38:37] [config] grad-dropping-rate: 0 [2022-01-15 14:38:37] [config] grad-dropping-warmup: 100 [2022-01-15 14:38:37] [config] gradient-checkpointing: false [2022-01-15 14:38:37] [config] guided-alignment: none [2022-01-15 14:38:37] [config] guided-alignment-cost: mse [2022-01-15 14:38:37] [config] guided-alignment-weight: 0.1 [2022-01-15 14:38:37] [config] ignore-model-config: false [2022-01-15 14:38:37] [config] input-types: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] interpolate-env-vars: false [2022-01-15 14:38:37] [config] keep-best: false [2022-01-15 14:38:37] [config] label-smoothing: 0.1 [2022-01-15 14:38:37] [config] layer-normalization: false [2022-01-15 14:38:37] [config] learn-rate: 0.0003 [2022-01-15 14:38:37] [config] lemma-dim-emb: 0 [2022-01-15 14:38:37] [config] log: /home/wmi/train.log [2022-01-15 14:38:37] [config] log-level: info [2022-01-15 14:38:37] [config] log-time-zone: "" [2022-01-15 14:38:37] [config] logical-epoch: [2022-01-15 14:38:37] [config] - 1e [2022-01-15 14:38:37] [config] - 0 [2022-01-15 14:38:37] [config] lr-decay: 0 [2022-01-15 14:38:37] [config] lr-decay-freq: 50000 [2022-01-15 14:38:37] [config] lr-decay-inv-sqrt: [2022-01-15 14:38:37] [config] - 16000 [2022-01-15 14:38:37] [config] lr-decay-repeat-warmup: false [2022-01-15 14:38:37] [config] lr-decay-reset-optimizer: false [2022-01-15 14:38:37] [config] lr-decay-start: [2022-01-15 14:38:37] [config] - 10 [2022-01-15 14:38:37] [config] - 1 [2022-01-15 14:38:37] [config] lr-decay-strategy: epoch+stalled [2022-01-15 14:38:37] 
[config] lr-report: true [2022-01-15 14:38:37] [config] lr-warmup: 16000 [2022-01-15 14:38:37] [config] lr-warmup-at-reload: false [2022-01-15 14:38:37] [config] lr-warmup-cycle: false [2022-01-15 14:38:37] [config] lr-warmup-start-rate: 0 [2022-01-15 14:38:37] [config] max-length: 100 [2022-01-15 14:38:37] [config] max-length-crop: false [2022-01-15 14:38:37] [config] max-length-factor: 3 [2022-01-15 14:38:37] [config] maxi-batch: 1000 [2022-01-15 14:38:37] [config] maxi-batch-sort: trg [2022-01-15 14:38:37] [config] mini-batch: 64 [2022-01-15 14:38:37] [config] mini-batch-fit: true [2022-01-15 14:38:37] [config] mini-batch-fit-step: 10 [2022-01-15 14:38:37] [config] mini-batch-track-lr: false [2022-01-15 14:38:37] [config] mini-batch-warmup: 0 [2022-01-15 14:38:37] [config] mini-batch-words: 0 [2022-01-15 14:38:37] [config] mini-batch-words-ref: 0 [2022-01-15 14:38:37] [config] model: model.npz [2022-01-15 14:38:37] [config] multi-loss-type: sum [2022-01-15 14:38:37] [config] multi-node: false [2022-01-15 14:38:37] [config] multi-node-overlap: true [2022-01-15 14:38:37] [config] n-best: false [2022-01-15 14:38:37] [config] no-nccl: false [2022-01-15 14:38:37] [config] no-reload: false [2022-01-15 14:38:37] [config] no-restore-corpus: false [2022-01-15 14:38:37] [config] normalize: 0.6 [2022-01-15 14:38:37] [config] normalize-gradient: false [2022-01-15 14:38:37] [config] num-devices: 0 [2022-01-15 14:38:37] [config] optimizer: adam [2022-01-15 14:38:37] [config] optimizer-delay: 1 [2022-01-15 14:38:37] [config] optimizer-params: [2022-01-15 14:38:37] [config] - 0.9 [2022-01-15 14:38:37] [config] - 0.98 [2022-01-15 14:38:37] [config] - 1e-09 [2022-01-15 14:38:37] [config] output-omit-bias: false [2022-01-15 14:38:37] [config] overwrite: true [2022-01-15 14:38:37] [config] precision: [2022-01-15 14:38:37] [config] - float32 [2022-01-15 14:38:37] [config] - float32 [2022-01-15 14:38:37] [config] - float32 [2022-01-15 14:38:37] [config] pretrained-model: "" [2022-01-15 14:38:37] [config] quantize-biases: false [2022-01-15 14:38:37] [config] quantize-bits: 0 [2022-01-15 14:38:37] [config] quantize-log-based: false [2022-01-15 14:38:37] [config] quantize-optimization-steps: 0 [2022-01-15 14:38:37] [config] quiet: false [2022-01-15 14:38:37] [config] quiet-translation: false [2022-01-15 14:38:37] [config] relative-paths: false [2022-01-15 14:38:37] [config] right-left: false [2022-01-15 14:38:37] [config] save-freq: 5000 [2022-01-15 14:38:37] [config] seed: 0 [2022-01-15 14:38:37] [config] sentencepiece-alphas: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] sentencepiece-max-lines: 2000000 [2022-01-15 14:38:37] [config] sentencepiece-options: "" [2022-01-15 14:38:37] [config] shuffle: data [2022-01-15 14:38:37] [config] shuffle-in-ram: false [2022-01-15 14:38:37] [config] sigterm: save-and-exit [2022-01-15 14:38:37] [config] skip: false [2022-01-15 14:38:37] [config] sqlite: "" [2022-01-15 14:38:37] [config] sqlite-drop: false [2022-01-15 14:38:37] [config] sync-sgd: false [2022-01-15 14:38:37] [config] tempdir: /tmp [2022-01-15 14:38:37] [config] tied-embeddings: true [2022-01-15 14:38:37] [config] tied-embeddings-all: false [2022-01-15 14:38:37] [config] tied-embeddings-src: false [2022-01-15 14:38:37] [config] train-embedder-rank: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] train-sets: [2022-01-15 14:38:37] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 [2022-01-15 14:38:37] [config] - 
/home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 [2022-01-15 14:38:37] [config] transformer-aan-activation: swish [2022-01-15 14:38:37] [config] transformer-aan-depth: 2 [2022-01-15 14:38:37] [config] transformer-aan-nogate: false [2022-01-15 14:38:37] [config] transformer-decoder-autoreg: self-attention [2022-01-15 14:38:37] [config] transformer-depth-scaling: false [2022-01-15 14:38:37] [config] transformer-dim-aan: 2048 [2022-01-15 14:38:37] [config] transformer-dim-ffn: 2048 [2022-01-15 14:38:37] [config] transformer-dropout: 0.1 [2022-01-15 14:38:37] [config] transformer-dropout-attention: 0 [2022-01-15 14:38:37] [config] transformer-dropout-ffn: 0 [2022-01-15 14:38:37] [config] transformer-ffn-activation: swish [2022-01-15 14:38:37] [config] transformer-ffn-depth: 2 [2022-01-15 14:38:37] [config] transformer-guided-alignment-layer: last [2022-01-15 14:38:37] [config] transformer-heads: 8 [2022-01-15 14:38:37] [config] transformer-no-projection: false [2022-01-15 14:38:37] [config] transformer-pool: false [2022-01-15 14:38:37] [config] transformer-postprocess: dan [2022-01-15 14:38:37] [config] transformer-postprocess-emb: d [2022-01-15 14:38:37] [config] transformer-postprocess-top: "" [2022-01-15 14:38:37] [config] transformer-preprocess: "" [2022-01-15 14:38:37] [config] transformer-tied-layers: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] transformer-train-position-embeddings: false [2022-01-15 14:38:37] [config] tsv: false [2022-01-15 14:38:37] [config] tsv-fields: 0 [2022-01-15 14:38:37] [config] type: transformer [2022-01-15 14:38:37] [config] ulr: false [2022-01-15 14:38:37] [config] ulr-dim-emb: 0 [2022-01-15 14:38:37] [config] ulr-dropout: 0 [2022-01-15 14:38:37] [config] ulr-keys-vectors: "" [2022-01-15 14:38:37] [config] ulr-query-vectors: "" [2022-01-15 14:38:37] [config] ulr-softmax-temperature: 1 [2022-01-15 14:38:37] [config] ulr-trainable-transformation: false [2022-01-15 14:38:37] [config] unlikelihood-loss: false [2022-01-15 14:38:37] [config] valid-freq: 5000 [2022-01-15 14:38:37] [config] valid-log: "" [2022-01-15 14:38:37] [config] valid-max-length: 1000 [2022-01-15 14:38:37] [config] valid-metrics: [2022-01-15 14:38:37] [config] - cross-entropy [2022-01-15 14:38:37] [config] valid-mini-batch: 32 [2022-01-15 14:38:37] [config] valid-reset-stalled: false [2022-01-15 14:38:37] [config] valid-script-args: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] valid-script-path: "" [2022-01-15 14:38:37] [config] valid-sets: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] valid-translation-output: "" [2022-01-15 14:38:37] [config] vocabs: [2022-01-15 14:38:37] [config] [] [2022-01-15 14:38:37] [config] word-penalty: 0 [2022-01-15 14:38:37] [config] word-scores: false [2022-01-15 14:38:37] [config] workspace: 10000 [2022-01-15 14:38:37] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 14:38:37] [training] Using single-device training [2022-01-15 14:38:37] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 14:38:37] [data] Vocabularies will be built separately for each file. 
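Note on the L.r. column in the progress lines below: with --learn-rate 0.0003 and --lr-warmup 16000 the learning rate ramps linearly with the update count, so update 500 gives 0.0003 * 500 / 16000 = 9.375e-06 and update 1000 gives 1.875e-05, which matches the logged values exactly. A quick way to reproduce the warmup values (the linear-warmup formula is confirmed by the log itself; how --lr-decay-inv-sqrt 16000 scales the rate after update 16000 is not exercised in this excerpt, so it is left out):

    awk -v up=500  'BEGIN { print 0.0003 * up / 16000 }'   # -> 9.375e-06, matches "L.r. 9.3750e-06"
    awk -v up=1000 'BEGIN { print 0.0003 * up / 16000 }'   # -> 1.875e-05, matches "L.r. 1.8750e-05"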
[2022-01-15 14:38:37] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000
[2022-01-15 14:38:37] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000.yml
[2022-01-15 14:38:37] [data] Setting vocabulary size for input 0 to 18,703
[2022-01-15 14:38:37] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000
[2022-01-15 14:38:37] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000.yml
[2022-01-15 14:38:37] [data] Setting vocabulary size for input 1 to 27,729
[2022-01-15 14:38:37] [comm] Compiled without MPI support. Running as a single process on s470607-gpu
[2022-01-15 14:38:37] [batching] Collecting statistics for batch fitting with step size 10
[2022-01-15 14:38:37] [memory] Extending reserved space to 10112 MB (device gpu0)
[2022-01-15 14:38:37] [logits] Applying loss function for 1 factor(s)
[2022-01-15 14:38:37] [memory] Reserving 259 MB, device gpu0
[2022-01-15 14:38:39] [gpu] 16-bit TensorCores enabled for float32 matrix operations
[2022-01-15 14:38:39] [memory] Reserving 259 MB, device gpu0
[2022-01-15 14:38:49] [batching] Done. Typical MB size is 9,199 target words
[2022-01-15 14:38:49] [memory] Extending reserved space to 10112 MB (device gpu0)
[2022-01-15 14:38:49] Training started
[2022-01-15 14:38:49] [data] Shuffling data
[2022-01-15 14:38:51] [data] Done reading 3,103,819 sentences
[2022-01-15 14:39:07] [data] Done shuffling 3,103,819 sentences to temp files
[2022-01-15 14:39:08] [memory] Reserving 259 MB, device gpu0
[2022-01-15 14:39:08] [memory] Reserving 259 MB, device gpu0
[2022-01-15 14:39:08] [memory] Reserving 518 MB, device gpu0
[2022-01-15 14:39:08] [memory] Reserving 259 MB, device gpu0
[2022-01-15 14:40:24] Ep. 1 : Up. 500 : Sen. 112,080 : Cost 9.74822521 * 3,311,741 @ 5,138 after 3,311,741 : Time 94.75s : 34952.55 words/s : L.r. 9.3750e-06
[2022-01-15 14:41:41] Ep. 1 : Up. 1000 : Sen. 222,538 : Cost 8.78999043 * 3,264,698 @ 6,432 after 6,576,439 : Time 76.66s : 42584.21 words/s : L.r. 1.8750e-05
[2022-01-15 14:42:58] Ep. 1 : Up. 1500 : Sen. 335,655 : Cost 8.44620609 * 3,307,791 @ 4,368 after 9,884,230 : Time 77.38s : 42747.19 words/s : L.r. 2.8125e-05
[2022-01-15 14:44:15] Ep. 1 : Up. 2000 : Sen. 445,738 : Cost 8.12953186 * 3,248,721 @ 6,930 after 13,132,951 : Time 76.93s : 42228.52 words/s : L.r. 3.7500e-05
[2022-01-15 14:45:33] Ep. 1 : Up. 2500 : Sen. 557,935 : Cost 7.74989128 * 3,303,644 @ 6,360 after 16,436,595 : Time 77.69s : 42520.98 words/s : L.r.
4.6875e-05 [2022-01-15 15:22:51] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 15:22:51] [marian] Running on s470607-gpu as process 3060 with command line: [2022-01-15 15:22:51] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 15:22:51] [config] after: 0e [2022-01-15 15:22:51] [config] after-batches: 0 [2022-01-15 15:22:51] [config] after-epochs: 1 [2022-01-15 15:22:51] [config] all-caps-every: 0 [2022-01-15 15:22:51] [config] allow-unk: false [2022-01-15 15:22:51] [config] authors: false [2022-01-15 15:22:51] [config] beam-size: 6 [2022-01-15 15:22:51] [config] bert-class-symbol: "[CLS]" [2022-01-15 15:22:51] [config] bert-mask-symbol: "[MASK]" [2022-01-15 15:22:51] [config] bert-masking-fraction: 0.15 [2022-01-15 15:22:51] [config] bert-sep-symbol: "[SEP]" [2022-01-15 15:22:51] [config] bert-train-type-embeddings: true [2022-01-15 15:22:51] [config] bert-type-vocab-size: 2 [2022-01-15 15:22:51] [config] build-info: "" [2022-01-15 15:22:51] [config] cite: false [2022-01-15 15:22:51] [config] clip-norm: 5 [2022-01-15 15:22:51] [config] cost-scaling: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] cost-type: ce-sum [2022-01-15 15:22:51] [config] cpu-threads: 0 [2022-01-15 15:22:51] [config] data-weighting: "" [2022-01-15 15:22:51] [config] data-weighting-type: sentence [2022-01-15 15:22:51] [config] dec-cell: gru [2022-01-15 15:22:51] [config] dec-cell-base-depth: 2 [2022-01-15 15:22:51] [config] dec-cell-high-depth: 1 [2022-01-15 15:22:51] [config] dec-depth: 6 [2022-01-15 15:22:51] [config] devices: [2022-01-15 15:22:51] [config] - 0 [2022-01-15 15:22:51] [config] dim-emb: 512 [2022-01-15 15:22:51] [config] dim-rnn: 1024 [2022-01-15 15:22:51] [config] dim-vocabs: [2022-01-15 15:22:51] [config] - 0 [2022-01-15 15:22:51] [config] - 0 [2022-01-15 15:22:51] [config] disp-first: 0 [2022-01-15 15:22:51] [config] disp-freq: 500 [2022-01-15 15:22:51] [config] disp-label-counts: true [2022-01-15 15:22:51] [config] dropout-rnn: 0 [2022-01-15 15:22:51] [config] dropout-src: 0 [2022-01-15 15:22:51] [config] dropout-trg: 0 [2022-01-15 15:22:51] [config] dump-config: "" [2022-01-15 15:22:51] [config] early-stopping: 10 [2022-01-15 15:22:51] [config] embedding-fix-src: false [2022-01-15 15:22:51] [config] embedding-fix-trg: false [2022-01-15 15:22:51] [config] embedding-normalization: false [2022-01-15 15:22:51] [config] embedding-vectors: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] enc-cell: gru [2022-01-15 15:22:51] [config] enc-cell-depth: 1 [2022-01-15 15:22:51] [config] enc-depth: 6 [2022-01-15 15:22:51] [config] enc-type: bidirectional [2022-01-15 15:22:51] [config] english-title-case-every: 0 [2022-01-15 15:22:51] [config] exponential-smoothing: 0.0001 [2022-01-15 15:22:51] [config] factor-weight: 1 [2022-01-15 15:22:51] [config] grad-dropping-momentum: 0 
[2022-01-15 15:22:51] [config] grad-dropping-rate: 0 [2022-01-15 15:22:51] [config] grad-dropping-warmup: 100 [2022-01-15 15:22:51] [config] gradient-checkpointing: false [2022-01-15 15:22:51] [config] guided-alignment: none [2022-01-15 15:22:51] [config] guided-alignment-cost: mse [2022-01-15 15:22:51] [config] guided-alignment-weight: 0.1 [2022-01-15 15:22:51] [config] ignore-model-config: false [2022-01-15 15:22:51] [config] input-types: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] interpolate-env-vars: false [2022-01-15 15:22:51] [config] keep-best: false [2022-01-15 15:22:51] [config] label-smoothing: 0.1 [2022-01-15 15:22:51] [config] layer-normalization: false [2022-01-15 15:22:51] [config] learn-rate: 0.0003 [2022-01-15 15:22:51] [config] lemma-dim-emb: 0 [2022-01-15 15:22:51] [config] log: /home/wmi/train.log [2022-01-15 15:22:51] [config] log-level: info [2022-01-15 15:22:51] [config] log-time-zone: "" [2022-01-15 15:22:51] [config] logical-epoch: [2022-01-15 15:22:51] [config] - 1e [2022-01-15 15:22:51] [config] - 0 [2022-01-15 15:22:51] [config] lr-decay: 0 [2022-01-15 15:22:51] [config] lr-decay-freq: 50000 [2022-01-15 15:22:51] [config] lr-decay-inv-sqrt: [2022-01-15 15:22:51] [config] - 16000 [2022-01-15 15:22:51] [config] lr-decay-repeat-warmup: false [2022-01-15 15:22:51] [config] lr-decay-reset-optimizer: false [2022-01-15 15:22:51] [config] lr-decay-start: [2022-01-15 15:22:51] [config] - 10 [2022-01-15 15:22:51] [config] - 1 [2022-01-15 15:22:51] [config] lr-decay-strategy: epoch+stalled [2022-01-15 15:22:51] [config] lr-report: true [2022-01-15 15:22:51] [config] lr-warmup: 16000 [2022-01-15 15:22:51] [config] lr-warmup-at-reload: false [2022-01-15 15:22:51] [config] lr-warmup-cycle: false [2022-01-15 15:22:51] [config] lr-warmup-start-rate: 0 [2022-01-15 15:22:51] [config] max-length: 100 [2022-01-15 15:22:51] [config] max-length-crop: false [2022-01-15 15:22:51] [config] max-length-factor: 3 [2022-01-15 15:22:51] [config] maxi-batch: 1000 [2022-01-15 15:22:51] [config] maxi-batch-sort: trg [2022-01-15 15:22:51] [config] mini-batch: 64 [2022-01-15 15:22:51] [config] mini-batch-fit: true [2022-01-15 15:22:51] [config] mini-batch-fit-step: 10 [2022-01-15 15:22:51] [config] mini-batch-track-lr: false [2022-01-15 15:22:51] [config] mini-batch-warmup: 0 [2022-01-15 15:22:51] [config] mini-batch-words: 0 [2022-01-15 15:22:51] [config] mini-batch-words-ref: 0 [2022-01-15 15:22:51] [config] model: model.npz [2022-01-15 15:22:51] [config] multi-loss-type: sum [2022-01-15 15:22:51] [config] multi-node: false [2022-01-15 15:22:51] [config] multi-node-overlap: true [2022-01-15 15:22:51] [config] n-best: false [2022-01-15 15:22:51] [config] no-nccl: false [2022-01-15 15:22:51] [config] no-reload: false [2022-01-15 15:22:51] [config] no-restore-corpus: false [2022-01-15 15:22:51] [config] normalize: 0.6 [2022-01-15 15:22:51] [config] normalize-gradient: false [2022-01-15 15:22:51] [config] num-devices: 0 [2022-01-15 15:22:51] [config] optimizer: adam [2022-01-15 15:22:51] [config] optimizer-delay: 1 [2022-01-15 15:22:51] [config] optimizer-params: [2022-01-15 15:22:51] [config] - 0.9 [2022-01-15 15:22:51] [config] - 0.98 [2022-01-15 15:22:51] [config] - 1e-09 [2022-01-15 15:22:51] [config] output-omit-bias: false [2022-01-15 15:22:51] [config] overwrite: true [2022-01-15 15:22:51] [config] precision: [2022-01-15 15:22:51] [config] - float32 [2022-01-15 15:22:51] [config] - float32 [2022-01-15 15:22:51] [config] - float32 [2022-01-15 15:22:51] [config] 
pretrained-model: "" [2022-01-15 15:22:51] [config] quantize-biases: false [2022-01-15 15:22:51] [config] quantize-bits: 0 [2022-01-15 15:22:51] [config] quantize-log-based: false [2022-01-15 15:22:51] [config] quantize-optimization-steps: 0 [2022-01-15 15:22:51] [config] quiet: false [2022-01-15 15:22:51] [config] quiet-translation: false [2022-01-15 15:22:51] [config] relative-paths: false [2022-01-15 15:22:51] [config] right-left: false [2022-01-15 15:22:51] [config] save-freq: 5000 [2022-01-15 15:22:51] [config] seed: 0 [2022-01-15 15:22:51] [config] sentencepiece-alphas: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] sentencepiece-max-lines: 2000000 [2022-01-15 15:22:51] [config] sentencepiece-options: "" [2022-01-15 15:22:51] [config] shuffle: data [2022-01-15 15:22:51] [config] shuffle-in-ram: false [2022-01-15 15:22:51] [config] sigterm: save-and-exit [2022-01-15 15:22:51] [config] skip: false [2022-01-15 15:22:51] [config] sqlite: "" [2022-01-15 15:22:51] [config] sqlite-drop: false [2022-01-15 15:22:51] [config] sync-sgd: false [2022-01-15 15:22:51] [config] tempdir: /tmp [2022-01-15 15:22:51] [config] tied-embeddings: true [2022-01-15 15:22:51] [config] tied-embeddings-all: false [2022-01-15 15:22:51] [config] tied-embeddings-src: false [2022-01-15 15:22:51] [config] train-embedder-rank: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] train-sets: [2022-01-15 15:22:51] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 [2022-01-15 15:22:51] [config] - /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 [2022-01-15 15:22:51] [config] transformer-aan-activation: swish [2022-01-15 15:22:51] [config] transformer-aan-depth: 2 [2022-01-15 15:22:51] [config] transformer-aan-nogate: false [2022-01-15 15:22:51] [config] transformer-decoder-autoreg: self-attention [2022-01-15 15:22:51] [config] transformer-depth-scaling: false [2022-01-15 15:22:51] [config] transformer-dim-aan: 2048 [2022-01-15 15:22:51] [config] transformer-dim-ffn: 2048 [2022-01-15 15:22:51] [config] transformer-dropout: 0.1 [2022-01-15 15:22:51] [config] transformer-dropout-attention: 0 [2022-01-15 15:22:51] [config] transformer-dropout-ffn: 0 [2022-01-15 15:22:51] [config] transformer-ffn-activation: swish [2022-01-15 15:22:51] [config] transformer-ffn-depth: 2 [2022-01-15 15:22:51] [config] transformer-guided-alignment-layer: last [2022-01-15 15:22:51] [config] transformer-heads: 8 [2022-01-15 15:22:51] [config] transformer-no-projection: false [2022-01-15 15:22:51] [config] transformer-pool: false [2022-01-15 15:22:51] [config] transformer-postprocess: dan [2022-01-15 15:22:51] [config] transformer-postprocess-emb: d [2022-01-15 15:22:51] [config] transformer-postprocess-top: "" [2022-01-15 15:22:51] [config] transformer-preprocess: "" [2022-01-15 15:22:51] [config] transformer-tied-layers: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] transformer-train-position-embeddings: false [2022-01-15 15:22:51] [config] tsv: false [2022-01-15 15:22:51] [config] tsv-fields: 0 [2022-01-15 15:22:51] [config] type: transformer [2022-01-15 15:22:51] [config] ulr: false [2022-01-15 15:22:51] [config] ulr-dim-emb: 0 [2022-01-15 15:22:51] [config] ulr-dropout: 0 [2022-01-15 15:22:51] [config] ulr-keys-vectors: "" [2022-01-15 15:22:51] [config] ulr-query-vectors: "" [2022-01-15 15:22:51] [config] ulr-softmax-temperature: 1 [2022-01-15 15:22:51] [config] ulr-trainable-transformation: false [2022-01-15 15:22:51] [config] unlikelihood-loss: 
false [2022-01-15 15:22:51] [config] valid-freq: 5000 [2022-01-15 15:22:51] [config] valid-log: "" [2022-01-15 15:22:51] [config] valid-max-length: 1000 [2022-01-15 15:22:51] [config] valid-metrics: [2022-01-15 15:22:51] [config] - cross-entropy [2022-01-15 15:22:51] [config] valid-mini-batch: 32 [2022-01-15 15:22:51] [config] valid-reset-stalled: false [2022-01-15 15:22:51] [config] valid-script-args: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] valid-script-path: "" [2022-01-15 15:22:51] [config] valid-sets: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] valid-translation-output: "" [2022-01-15 15:22:51] [config] vocabs: [2022-01-15 15:22:51] [config] [] [2022-01-15 15:22:51] [config] word-penalty: 0 [2022-01-15 15:22:51] [config] word-scores: false [2022-01-15 15:22:51] [config] workspace: 10000 [2022-01-15 15:22:51] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 15:22:51] [training] Using single-device training [2022-01-15 15:22:51] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 15:22:51] [data] Vocabularies will be built separately for each file. [2022-01-15 15:22:51] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000 [2022-01-15 15:22:51] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.en.32000.yml [2022-01-15 15:22:51] [data] Setting vocabulary size for input 0 to 18,703 [2022-01-15 15:22:51] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000 [2022-01-15 15:22:51] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/mt-summit-corpora/train/train.pl.32000.yml [2022-01-15 15:22:51] [data] Setting vocabulary size for input 1 to 27,729 [2022-01-15 15:22:51] [comm] Compiled without MPI support. Running as a single process on s470607-gpu [2022-01-15 15:22:51] [batching] Collecting statistics for batch fitting with step size 10 [2022-01-15 15:22:52] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 15:22:52] [logits] Applying loss function for 1 factor(s) [2022-01-15 15:22:52] [memory] Reserving 259 MB, device gpu0 [2022-01-15 15:22:52] [gpu] 16-bit TensorCores enabled for float32 matrix operations [2022-01-15 15:22:52] [memory] Reserving 259 MB, device gpu0 [2022-01-15 15:23:03] [batching] Done. Typical MB size is 9,199 target words [2022-01-15 15:23:03] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 15:23:03] Training started [2022-01-15 15:23:03] [data] Shuffling data [2022-01-15 15:23:05] [data] Done reading 3,103,819 sentences [2022-01-15 15:23:20] [data] Done shuffling 3,103,819 sentences to temp files [2022-01-15 15:23:21] [memory] Reserving 259 MB, device gpu0 [2022-01-15 15:23:21] [memory] Reserving 259 MB, device gpu0 [2022-01-15 15:23:21] [memory] Reserving 518 MB, device gpu0 [2022-01-15 15:23:21] [memory] Reserving 259 MB, device gpu0 [2022-01-15 15:24:37] Ep. 1 : Up. 500 : Sen. 109,644 : Cost 9.73378372 * 3,294,320 @ 7,260 after 3,294,320 : Time 94.41s : 34893.02 words/s : L.r. 9.3750e-06 [2022-01-15 15:25:54] Ep. 1 : Up. 1000 : Sen. 226,634 : Cost 8.77455235 * 3,280,847 @ 8,930 after 6,575,167 : Time 76.53s : 42867.68 words/s : L.r. 1.8750e-05 [2022-01-15 15:27:11] Ep. 1 : Up. 1500 : Sen. 
335,958 : Cost 8.43428230 * 3,298,838 @ 8,325 after 9,874,005 : Time 77.55s : 42538.42 words/s : L.r. 2.8125e-05 [2022-01-15 15:28:29] Ep. 1 : Up. 2000 : Sen. 447,084 : Cost 8.11272335 * 3,301,579 @ 6,290 after 13,175,584 : Time 77.65s : 42519.08 words/s : L.r. 3.7500e-05 [2022-01-15 17:18:23] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:18:23] [marian] Running on s470607-gpu as process 3435 with command line: [2022-01-15 17:18:23] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 --vocabs /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:18:23] [config] after: 0e [2022-01-15 17:18:23] [config] after-batches: 0 [2022-01-15 17:18:23] [config] after-epochs: 1 [2022-01-15 17:18:23] [config] all-caps-every: 0 [2022-01-15 17:18:23] [config] allow-unk: false [2022-01-15 17:18:23] [config] authors: false [2022-01-15 17:18:23] [config] beam-size: 6 [2022-01-15 17:18:23] [config] bert-class-symbol: "[CLS]" [2022-01-15 17:18:23] [config] bert-mask-symbol: "[MASK]" [2022-01-15 17:18:23] [config] bert-masking-fraction: 0.15 [2022-01-15 17:18:23] [config] bert-sep-symbol: "[SEP]" [2022-01-15 17:18:23] [config] bert-train-type-embeddings: true [2022-01-15 17:18:23] [config] bert-type-vocab-size: 2 [2022-01-15 17:18:23] [config] build-info: "" [2022-01-15 17:18:23] [config] cite: false [2022-01-15 17:18:23] [config] clip-norm: 5 [2022-01-15 17:18:23] [config] cost-scaling: [2022-01-15 17:18:23] [config] [] [2022-01-15 17:18:23] [config] cost-type: ce-sum [2022-01-15 17:18:23] [config] cpu-threads: 0 [2022-01-15 17:18:23] [config] data-weighting: "" [2022-01-15 17:18:23] [config] data-weighting-type: sentence [2022-01-15 17:18:23] [config] dec-cell: gru [2022-01-15 17:18:23] [config] dec-cell-base-depth: 2 [2022-01-15 17:18:23] [config] dec-cell-high-depth: 1 [2022-01-15 17:18:23] [config] dec-depth: 6 [2022-01-15 17:18:23] [config] devices: [2022-01-15 17:18:23] [config] - 0 [2022-01-15 17:18:23] [config] dim-emb: 512 [2022-01-15 17:18:23] [config] dim-rnn: 1024 [2022-01-15 17:18:23] [config] dim-vocabs: [2022-01-15 17:18:23] [config] - 0 [2022-01-15 17:18:23] [config] - 0 [2022-01-15 17:18:23] [config] disp-first: 0 [2022-01-15 17:18:23] [config] disp-freq: 500 [2022-01-15 17:18:23] [config] disp-label-counts: true [2022-01-15 17:18:23] [config] dropout-rnn: 0 [2022-01-15 17:18:23] [config] dropout-src: 0 [2022-01-15 17:18:23] [config] dropout-trg: 0 [2022-01-15 17:18:23] [config] dump-config: "" [2022-01-15 17:18:23] [config] early-stopping: 10 [2022-01-15 17:18:23] [config] embedding-fix-src: false [2022-01-15 17:18:23] [config] embedding-fix-trg: false [2022-01-15 17:18:23] [config] embedding-normalization: false [2022-01-15 17:18:23] [config] embedding-vectors: [2022-01-15 
17:18:23] [config] [] [2022-01-15 17:18:23] [config] enc-cell: gru [2022-01-15 17:18:23] [config] enc-cell-depth: 1 [2022-01-15 17:18:23] [config] enc-depth: 6 [2022-01-15 17:18:23] [config] enc-type: bidirectional [2022-01-15 17:18:23] [config] english-title-case-every: 0 [2022-01-15 17:18:23] [config] exponential-smoothing: 0.0001 [2022-01-15 17:18:23] [config] factor-weight: 1 [2022-01-15 17:18:23] [config] grad-dropping-momentum: 0 [2022-01-15 17:18:23] [config] grad-dropping-rate: 0 [2022-01-15 17:18:23] [config] grad-dropping-warmup: 100 [2022-01-15 17:18:23] [config] gradient-checkpointing: false [2022-01-15 17:18:23] [config] guided-alignment: none [2022-01-15 17:18:23] [config] guided-alignment-cost: mse [2022-01-15 17:18:23] [config] guided-alignment-weight: 0.1 [2022-01-15 17:18:23] [config] ignore-model-config: false [2022-01-15 17:18:23] [config] input-types: [2022-01-15 17:18:23] [config] [] [2022-01-15 17:18:23] [config] interpolate-env-vars: false [2022-01-15 17:18:23] [config] keep-best: false [2022-01-15 17:18:23] [config] label-smoothing: 0.1 [2022-01-15 17:18:23] [config] layer-normalization: false [2022-01-15 17:18:23] [config] learn-rate: 0.0003 [2022-01-15 17:18:23] [config] lemma-dim-emb: 0 [2022-01-15 17:18:23] [config] log: /home/wmi/train.log [2022-01-15 17:18:23] [config] log-level: info [2022-01-15 17:18:23] [config] log-time-zone: "" [2022-01-15 17:18:23] [config] logical-epoch: [2022-01-15 17:18:23] [config] - 1e [2022-01-15 17:18:23] [config] - 0 [2022-01-15 17:18:23] [config] lr-decay: 0 [2022-01-15 17:18:23] [config] lr-decay-freq: 50000 [2022-01-15 17:18:23] [config] lr-decay-inv-sqrt: [2022-01-15 17:18:23] [config] - 16000 [2022-01-15 17:18:23] [config] lr-decay-repeat-warmup: false [2022-01-15 17:18:23] [config] lr-decay-reset-optimizer: false [2022-01-15 17:18:23] [config] lr-decay-start: [2022-01-15 17:18:23] [config] - 10 [2022-01-15 17:18:23] [config] - 1 [2022-01-15 17:18:23] [config] lr-decay-strategy: epoch+stalled [2022-01-15 17:18:23] [config] lr-report: true [2022-01-15 17:18:23] [config] lr-warmup: 16000 [2022-01-15 17:18:23] [config] lr-warmup-at-reload: false [2022-01-15 17:18:23] [config] lr-warmup-cycle: false [2022-01-15 17:18:23] [config] lr-warmup-start-rate: 0 [2022-01-15 17:18:23] [config] max-length: 100 [2022-01-15 17:18:23] [config] max-length-crop: false [2022-01-15 17:18:23] [config] max-length-factor: 3 [2022-01-15 17:18:23] [config] maxi-batch: 1000 [2022-01-15 17:18:23] [config] maxi-batch-sort: trg [2022-01-15 17:18:23] [config] mini-batch: 64 [2022-01-15 17:18:23] [config] mini-batch-fit: true [2022-01-15 17:18:23] [config] mini-batch-fit-step: 10 [2022-01-15 17:18:23] [config] mini-batch-track-lr: false [2022-01-15 17:18:23] [config] mini-batch-warmup: 0 [2022-01-15 17:18:23] [config] mini-batch-words: 0 [2022-01-15 17:18:23] [config] mini-batch-words-ref: 0 [2022-01-15 17:18:23] [config] model: model.npz [2022-01-15 17:18:23] [config] multi-loss-type: sum [2022-01-15 17:18:23] [config] multi-node: false [2022-01-15 17:18:23] [config] multi-node-overlap: true [2022-01-15 17:18:23] [config] n-best: false [2022-01-15 17:18:23] [config] no-nccl: false [2022-01-15 17:18:23] [config] no-reload: false [2022-01-15 17:18:23] [config] no-restore-corpus: false [2022-01-15 17:18:23] [config] normalize: 0.6 [2022-01-15 17:18:23] [config] normalize-gradient: false [2022-01-15 17:18:23] [config] num-devices: 0 [2022-01-15 17:18:23] [config] optimizer: adam [2022-01-15 17:18:23] [config] optimizer-delay: 1 [2022-01-15 17:18:23] [config] 
optimizer-params: [2022-01-15 17:18:23] [config] - 0.9 [2022-01-15 17:18:23] [config] - 0.98 [2022-01-15 17:18:23] [config] - 1e-09 [2022-01-15 17:18:23] [config] output-omit-bias: false [2022-01-15 17:18:23] [config] overwrite: true [2022-01-15 17:18:23] [config] precision: [2022-01-15 17:18:23] [config] - float32 [2022-01-15 17:18:23] [config] - float32 [2022-01-15 17:18:23] [config] - float32 [2022-01-15 17:18:23] [config] pretrained-model: "" [2022-01-15 17:18:23] [config] quantize-biases: false [2022-01-15 17:18:23] [config] quantize-bits: 0 [2022-01-15 17:18:23] [config] quantize-log-based: false [2022-01-15 17:18:23] [config] quantize-optimization-steps: 0 [2022-01-15 17:18:23] [config] quiet: false [2022-01-15 17:18:23] [config] quiet-translation: false [2022-01-15 17:18:23] [config] relative-paths: false [2022-01-15 17:18:23] [config] right-left: false [2022-01-15 17:18:23] [config] save-freq: 5000 [2022-01-15 17:18:23] [config] seed: 0 [2022-01-15 17:18:23] [config] sentencepiece-alphas: [2022-01-15 17:18:23] [config] [] [2022-01-15 17:18:23] [config] sentencepiece-max-lines: 2000000 [2022-01-15 17:18:23] [config] sentencepiece-options: "" [2022-01-15 17:18:23] [config] shuffle: data [2022-01-15 17:18:23] [config] shuffle-in-ram: false [2022-01-15 17:18:23] [config] sigterm: save-and-exit [2022-01-15 17:18:23] [config] skip: false [2022-01-15 17:18:23] [config] sqlite: "" [2022-01-15 17:18:23] [config] sqlite-drop: false [2022-01-15 17:18:23] [config] sync-sgd: false [2022-01-15 17:18:23] [config] tempdir: /tmp [2022-01-15 17:18:23] [config] tied-embeddings: true [2022-01-15 17:18:23] [config] tied-embeddings-all: false [2022-01-15 17:18:23] [config] tied-embeddings-src: false [2022-01-15 17:18:23] [config] train-embedder-rank: [2022-01-15 17:18:23] [config] [] [2022-01-15 17:18:23] [config] train-sets: [2022-01-15 17:18:23] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv [2022-01-15 17:18:23] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv [2022-01-15 17:18:23] [config] transformer-aan-activation: swish [2022-01-15 17:18:23] [config] transformer-aan-depth: 2 [2022-01-15 17:18:23] [config] transformer-aan-nogate: false [2022-01-15 17:18:23] [config] transformer-decoder-autoreg: self-attention [2022-01-15 17:18:23] [config] transformer-depth-scaling: false [2022-01-15 17:18:23] [config] transformer-dim-aan: 2048 [2022-01-15 17:18:23] [config] transformer-dim-ffn: 2048 [2022-01-15 17:18:23] [config] transformer-dropout: 0.1 [2022-01-15 17:18:23] [config] transformer-dropout-attention: 0 [2022-01-15 17:18:23] [config] transformer-dropout-ffn: 0 [2022-01-15 17:18:23] [config] transformer-ffn-activation: swish [2022-01-15 17:18:23] [config] transformer-ffn-depth: 2 [2022-01-15 17:18:23] [config] transformer-guided-alignment-layer: last [2022-01-15 17:18:23] [config] transformer-heads: 8 [2022-01-15 17:18:23] [config] transformer-no-projection: false [2022-01-15 17:18:23] [config] transformer-pool: false [2022-01-15 17:18:23] [config] transformer-postprocess: dan [2022-01-15 17:18:23] [config] transformer-postprocess-emb: d [2022-01-15 17:18:23] [config] transformer-postprocess-top: "" [2022-01-15 17:18:23] [config] transformer-preprocess: "" [2022-01-15 17:18:23] [config] transformer-tied-layers: [2022-01-15 17:18:23] [config] [] [2022-01-15 17:18:23] [config] transformer-train-position-embeddings: false [2022-01-15 17:18:23] [config] tsv: false [2022-01-15 17:18:23] [config] tsv-fields: 0 [2022-01-15 17:18:23] [config] 
type: transformer [2022-01-15 17:18:23] [config] ulr: false [2022-01-15 17:18:23] [config] ulr-dim-emb: 0 [2022-01-15 17:18:23] [config] ulr-dropout: 0 [2022-01-15 17:18:23] [config] ulr-keys-vectors: "" [2022-01-15 17:18:23] [config] ulr-query-vectors: "" [2022-01-15 17:18:23] [config] ulr-softmax-temperature: 1 [2022-01-15 17:18:23] [config] ulr-trainable-transformation: false [2022-01-15 17:18:23] [config] unlikelihood-loss: false [2022-01-15 17:18:23] [config] valid-freq: 5000 [2022-01-15 17:18:23] [config] valid-log: "" [2022-01-15 17:18:23] [config] valid-max-length: 1000 [2022-01-15 17:18:23] [config] valid-metrics: [2022-01-15 17:18:23] [config] - cross-entropy [2022-01-15 17:18:23] [config] valid-mini-batch: 32 [2022-01-15 17:18:23] [config] valid-reset-stalled: false [2022-01-15 17:18:23] [config] valid-script-args: [2022-01-15 17:18:23] [config] [] [2022-01-15 17:18:23] [config] valid-script-path: "" [2022-01-15 17:18:23] [config] valid-sets: [2022-01-15 17:18:23] [config] [] [2022-01-15 17:18:23] [config] valid-translation-output: "" [2022-01-15 17:18:23] [config] vocabs: [2022-01-15 17:18:23] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:18:23] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:18:23] [config] word-penalty: 0 [2022-01-15 17:18:23] [config] word-scores: false [2022-01-15 17:18:23] [config] workspace: 10000 [2022-01-15 17:18:23] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:18:23] [training] Using single-device training [2022-01-15 17:18:23] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:18:23] Error: Unhandled exception of type 'N4YAML18TypedBadConversionINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE': yaml-cpp: error at line 1, column 1: bad conversion [2022-01-15 17:18:23] Error: Aborted from void unhandledException() in /home/wmi/Workspace/marian/src/common/logging.cpp:113 [CALL STACK] [0x55bd8ec9a5e6] + 0x29c5e6 [0x7ff56152938c] + 0xaa38c [0x7ff5615293f7] + 0xaa3f7 [0x7ff5615296a9] + 0xaa6a9 [0x55bd8ef26c20] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x1130 [0x55bd8ef15e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x55bd8ef16728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x55bd8ef62189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x55bd8ef75084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x55bd8edd3f8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x55bd8ee5b94b] marian::Train:: run () + 0x19cb [0x55bd8ed62389] mainTrainer (int, char**) + 0x5e9 [0x55bd8ed201bc] main + 0x3c [0x7ff56114a0b3] __libc_start_main + 0xf3 [0x55bd8ed60b0e] _start + 0x2e [2022-01-15 17:26:24] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:26:24] [marian] Running on s470607-gpu as process 3591 with command line: [2022-01-15 17:26:24] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv --max-length 100 --mini-batch-fit -w 
10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 --vocabs /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:26:24] [config] after: 0e [2022-01-15 17:26:24] [config] after-batches: 0 [2022-01-15 17:26:24] [config] after-epochs: 1 [2022-01-15 17:26:24] [config] all-caps-every: 0 [2022-01-15 17:26:24] [config] allow-unk: false [2022-01-15 17:26:24] [config] authors: false [2022-01-15 17:26:24] [config] beam-size: 6 [2022-01-15 17:26:24] [config] bert-class-symbol: "[CLS]" [2022-01-15 17:26:24] [config] bert-mask-symbol: "[MASK]" [2022-01-15 17:26:24] [config] bert-masking-fraction: 0.15 [2022-01-15 17:26:24] [config] bert-sep-symbol: "[SEP]" [2022-01-15 17:26:24] [config] bert-train-type-embeddings: true [2022-01-15 17:26:24] [config] bert-type-vocab-size: 2 [2022-01-15 17:26:24] [config] build-info: "" [2022-01-15 17:26:24] [config] cite: false [2022-01-15 17:26:24] [config] clip-norm: 5 [2022-01-15 17:26:24] [config] cost-scaling: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] cost-type: ce-sum [2022-01-15 17:26:24] [config] cpu-threads: 0 [2022-01-15 17:26:24] [config] data-weighting: "" [2022-01-15 17:26:24] [config] data-weighting-type: sentence [2022-01-15 17:26:24] [config] dec-cell: gru [2022-01-15 17:26:24] [config] dec-cell-base-depth: 2 [2022-01-15 17:26:24] [config] dec-cell-high-depth: 1 [2022-01-15 17:26:24] [config] dec-depth: 6 [2022-01-15 17:26:24] [config] devices: [2022-01-15 17:26:24] [config] - 0 [2022-01-15 17:26:24] [config] dim-emb: 512 [2022-01-15 17:26:24] [config] dim-rnn: 1024 [2022-01-15 17:26:24] [config] dim-vocabs: [2022-01-15 17:26:24] [config] - 0 [2022-01-15 17:26:24] [config] - 0 [2022-01-15 17:26:24] [config] disp-first: 0 [2022-01-15 17:26:24] [config] disp-freq: 500 [2022-01-15 17:26:24] [config] disp-label-counts: true [2022-01-15 17:26:24] [config] dropout-rnn: 0 [2022-01-15 17:26:24] [config] dropout-src: 0 [2022-01-15 17:26:24] [config] dropout-trg: 0 [2022-01-15 17:26:24] [config] dump-config: "" [2022-01-15 17:26:24] [config] early-stopping: 10 [2022-01-15 17:26:24] [config] embedding-fix-src: false [2022-01-15 17:26:24] [config] embedding-fix-trg: false [2022-01-15 17:26:24] [config] embedding-normalization: false [2022-01-15 17:26:24] [config] embedding-vectors: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] enc-cell: gru [2022-01-15 17:26:24] [config] enc-cell-depth: 1 [2022-01-15 17:26:24] [config] enc-depth: 6 [2022-01-15 17:26:24] [config] enc-type: bidirectional [2022-01-15 17:26:24] [config] english-title-case-every: 0 [2022-01-15 17:26:24] [config] exponential-smoothing: 0.0001 [2022-01-15 17:26:24] [config] factor-weight: 1 [2022-01-15 17:26:24] [config] grad-dropping-momentum: 0 [2022-01-15 17:26:24] [config] grad-dropping-rate: 0 [2022-01-15 17:26:24] [config] grad-dropping-warmup: 100 [2022-01-15 17:26:24] [config] gradient-checkpointing: false [2022-01-15 17:26:24] [config] guided-alignment: none [2022-01-15 17:26:24] [config] guided-alignment-cost: 
mse [2022-01-15 17:26:24] [config] guided-alignment-weight: 0.1 [2022-01-15 17:26:24] [config] ignore-model-config: false [2022-01-15 17:26:24] [config] input-types: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] interpolate-env-vars: false [2022-01-15 17:26:24] [config] keep-best: false [2022-01-15 17:26:24] [config] label-smoothing: 0.1 [2022-01-15 17:26:24] [config] layer-normalization: false [2022-01-15 17:26:24] [config] learn-rate: 0.0003 [2022-01-15 17:26:24] [config] lemma-dim-emb: 0 [2022-01-15 17:26:24] [config] log: /home/wmi/train.log [2022-01-15 17:26:24] [config] log-level: info [2022-01-15 17:26:24] [config] log-time-zone: "" [2022-01-15 17:26:24] [config] logical-epoch: [2022-01-15 17:26:24] [config] - 1e [2022-01-15 17:26:24] [config] - 0 [2022-01-15 17:26:24] [config] lr-decay: 0 [2022-01-15 17:26:24] [config] lr-decay-freq: 50000 [2022-01-15 17:26:24] [config] lr-decay-inv-sqrt: [2022-01-15 17:26:24] [config] - 16000 [2022-01-15 17:26:24] [config] lr-decay-repeat-warmup: false [2022-01-15 17:26:24] [config] lr-decay-reset-optimizer: false [2022-01-15 17:26:24] [config] lr-decay-start: [2022-01-15 17:26:24] [config] - 10 [2022-01-15 17:26:24] [config] - 1 [2022-01-15 17:26:24] [config] lr-decay-strategy: epoch+stalled [2022-01-15 17:26:24] [config] lr-report: true [2022-01-15 17:26:24] [config] lr-warmup: 16000 [2022-01-15 17:26:24] [config] lr-warmup-at-reload: false [2022-01-15 17:26:24] [config] lr-warmup-cycle: false [2022-01-15 17:26:24] [config] lr-warmup-start-rate: 0 [2022-01-15 17:26:24] [config] max-length: 100 [2022-01-15 17:26:24] [config] max-length-crop: false [2022-01-15 17:26:24] [config] max-length-factor: 3 [2022-01-15 17:26:24] [config] maxi-batch: 1000 [2022-01-15 17:26:24] [config] maxi-batch-sort: trg [2022-01-15 17:26:24] [config] mini-batch: 64 [2022-01-15 17:26:24] [config] mini-batch-fit: true [2022-01-15 17:26:24] [config] mini-batch-fit-step: 10 [2022-01-15 17:26:24] [config] mini-batch-track-lr: false [2022-01-15 17:26:24] [config] mini-batch-warmup: 0 [2022-01-15 17:26:24] [config] mini-batch-words: 0 [2022-01-15 17:26:24] [config] mini-batch-words-ref: 0 [2022-01-15 17:26:24] [config] model: model.npz [2022-01-15 17:26:24] [config] multi-loss-type: sum [2022-01-15 17:26:24] [config] multi-node: false [2022-01-15 17:26:24] [config] multi-node-overlap: true [2022-01-15 17:26:24] [config] n-best: false [2022-01-15 17:26:24] [config] no-nccl: false [2022-01-15 17:26:24] [config] no-reload: false [2022-01-15 17:26:24] [config] no-restore-corpus: false [2022-01-15 17:26:24] [config] normalize: 0.6 [2022-01-15 17:26:24] [config] normalize-gradient: false [2022-01-15 17:26:24] [config] num-devices: 0 [2022-01-15 17:26:24] [config] optimizer: adam [2022-01-15 17:26:24] [config] optimizer-delay: 1 [2022-01-15 17:26:24] [config] optimizer-params: [2022-01-15 17:26:24] [config] - 0.9 [2022-01-15 17:26:24] [config] - 0.98 [2022-01-15 17:26:24] [config] - 1e-09 [2022-01-15 17:26:24] [config] output-omit-bias: false [2022-01-15 17:26:24] [config] overwrite: true [2022-01-15 17:26:24] [config] precision: [2022-01-15 17:26:24] [config] - float32 [2022-01-15 17:26:24] [config] - float32 [2022-01-15 17:26:24] [config] - float32 [2022-01-15 17:26:24] [config] pretrained-model: "" [2022-01-15 17:26:24] [config] quantize-biases: false [2022-01-15 17:26:24] [config] quantize-bits: 0 [2022-01-15 17:26:24] [config] quantize-log-based: false [2022-01-15 17:26:24] [config] quantize-optimization-steps: 0 [2022-01-15 17:26:24] [config] quiet: false 
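Aside (illustrative, not part of the original log): the learning rates reported in the training entries above are consistent with Marian's linear warmup, i.e. learn-rate * step / lr-warmup with --learn-rate 0.0003 and --lr-warmup 16000; under that reading, 2.8125e-05 corresponds to update 1500 and 3.7500e-05 to update 2000. A minimal Python check of that assumption:

```python
# Sketch only: reproduce the "L.r." values reported during warmup,
# assuming the linear schedule base_lr * step / warmup_steps.
base_lr = 3e-4        # --learn-rate 0.0003
warmup = 16000        # --lr-warmup 16000

for step in (1500, 2000):
    print(f"update {step}: lr = {base_lr * step / warmup:.4e}")
# update 1500: lr = 2.8125e-05
# update 2000: lr = 3.7500e-05
```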
[2022-01-15 17:26:24] [config] quiet-translation: false [2022-01-15 17:26:24] [config] relative-paths: false [2022-01-15 17:26:24] [config] right-left: false [2022-01-15 17:26:24] [config] save-freq: 5000 [2022-01-15 17:26:24] [config] seed: 0 [2022-01-15 17:26:24] [config] sentencepiece-alphas: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] sentencepiece-max-lines: 2000000 [2022-01-15 17:26:24] [config] sentencepiece-options: "" [2022-01-15 17:26:24] [config] shuffle: data [2022-01-15 17:26:24] [config] shuffle-in-ram: false [2022-01-15 17:26:24] [config] sigterm: save-and-exit [2022-01-15 17:26:24] [config] skip: false [2022-01-15 17:26:24] [config] sqlite: "" [2022-01-15 17:26:24] [config] sqlite-drop: false [2022-01-15 17:26:24] [config] sync-sgd: false [2022-01-15 17:26:24] [config] tempdir: /tmp [2022-01-15 17:26:24] [config] tied-embeddings: true [2022-01-15 17:26:24] [config] tied-embeddings-all: false [2022-01-15 17:26:24] [config] tied-embeddings-src: false [2022-01-15 17:26:24] [config] train-embedder-rank: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] train-sets: [2022-01-15 17:26:24] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv [2022-01-15 17:26:24] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv [2022-01-15 17:26:24] [config] transformer-aan-activation: swish [2022-01-15 17:26:24] [config] transformer-aan-depth: 2 [2022-01-15 17:26:24] [config] transformer-aan-nogate: false [2022-01-15 17:26:24] [config] transformer-decoder-autoreg: self-attention [2022-01-15 17:26:24] [config] transformer-depth-scaling: false [2022-01-15 17:26:24] [config] transformer-dim-aan: 2048 [2022-01-15 17:26:24] [config] transformer-dim-ffn: 2048 [2022-01-15 17:26:24] [config] transformer-dropout: 0.1 [2022-01-15 17:26:24] [config] transformer-dropout-attention: 0 [2022-01-15 17:26:24] [config] transformer-dropout-ffn: 0 [2022-01-15 17:26:24] [config] transformer-ffn-activation: swish [2022-01-15 17:26:24] [config] transformer-ffn-depth: 2 [2022-01-15 17:26:24] [config] transformer-guided-alignment-layer: last [2022-01-15 17:26:24] [config] transformer-heads: 8 [2022-01-15 17:26:24] [config] transformer-no-projection: false [2022-01-15 17:26:24] [config] transformer-pool: false [2022-01-15 17:26:24] [config] transformer-postprocess: dan [2022-01-15 17:26:24] [config] transformer-postprocess-emb: d [2022-01-15 17:26:24] [config] transformer-postprocess-top: "" [2022-01-15 17:26:24] [config] transformer-preprocess: "" [2022-01-15 17:26:24] [config] transformer-tied-layers: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] transformer-train-position-embeddings: false [2022-01-15 17:26:24] [config] tsv: false [2022-01-15 17:26:24] [config] tsv-fields: 0 [2022-01-15 17:26:24] [config] type: transformer [2022-01-15 17:26:24] [config] ulr: false [2022-01-15 17:26:24] [config] ulr-dim-emb: 0 [2022-01-15 17:26:24] [config] ulr-dropout: 0 [2022-01-15 17:26:24] [config] ulr-keys-vectors: "" [2022-01-15 17:26:24] [config] ulr-query-vectors: "" [2022-01-15 17:26:24] [config] ulr-softmax-temperature: 1 [2022-01-15 17:26:24] [config] ulr-trainable-transformation: false [2022-01-15 17:26:24] [config] unlikelihood-loss: false [2022-01-15 17:26:24] [config] valid-freq: 5000 [2022-01-15 17:26:24] [config] valid-log: "" [2022-01-15 17:26:24] [config] valid-max-length: 1000 [2022-01-15 17:26:24] [config] valid-metrics: [2022-01-15 17:26:24] [config] - cross-entropy [2022-01-15 17:26:24] [config] 
valid-mini-batch: 32 [2022-01-15 17:26:24] [config] valid-reset-stalled: false [2022-01-15 17:26:24] [config] valid-script-args: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] valid-script-path: "" [2022-01-15 17:26:24] [config] valid-sets: [2022-01-15 17:26:24] [config] [] [2022-01-15 17:26:24] [config] valid-translation-output: "" [2022-01-15 17:26:24] [config] vocabs: [2022-01-15 17:26:24] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:26:24] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:26:24] [config] word-penalty: 0 [2022-01-15 17:26:24] [config] word-scores: false [2022-01-15 17:26:24] [config] workspace: 10000 [2022-01-15 17:26:24] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:26:24] [training] Using single-device training [2022-01-15 17:26:24] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:26:24] Error: Unhandled exception of type 'N4YAML18TypedBadConversionINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE': yaml-cpp: error at line 1, column 1: bad conversion [2022-01-15 17:26:24] Error: Aborted from void unhandledException() in /home/wmi/Workspace/marian/src/common/logging.cpp:113 [CALL STACK] [0x55e28e7915e6] + 0x29c5e6 [0x7f06c3a4d38c] + 0xaa38c [0x7f06c3a4d3f7] + 0xaa3f7 [0x7f06c3a4d6a9] + 0xaa6a9 [0x55e28ea1dc20] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x1130 [0x55e28ea0ce2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x55e28ea0d728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x55e28ea59189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x55e28ea6c084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x55e28e8caf8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x55e28e95294b] marian::Train:: run () + 0x19cb [0x55e28e859389] mainTrainer (int, char**) + 0x5e9 [0x55e28e8171bc] main + 0x3c [0x7f06c366e0b3] __libc_start_main + 0xf3 [0x55e28e857b0e] _start + 0x2e [2022-01-15 17:35:32] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:35:32] [marian] Running on s470607-gpu as process 3646 with command line: [2022-01-15 17:35:32] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 --vocabs /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:35:32] [config] after: 0e [2022-01-15 
17:35:32] [config] after-batches: 0 [2022-01-15 17:35:32] [config] after-epochs: 1 [2022-01-15 17:35:32] [config] all-caps-every: 0 [2022-01-15 17:35:32] [config] allow-unk: false [2022-01-15 17:35:32] [config] authors: false [2022-01-15 17:35:32] [config] beam-size: 6 [2022-01-15 17:35:32] [config] bert-class-symbol: "[CLS]" [2022-01-15 17:35:32] [config] bert-mask-symbol: "[MASK]" [2022-01-15 17:35:32] [config] bert-masking-fraction: 0.15 [2022-01-15 17:35:32] [config] bert-sep-symbol: "[SEP]" [2022-01-15 17:35:32] [config] bert-train-type-embeddings: true [2022-01-15 17:35:32] [config] bert-type-vocab-size: 2 [2022-01-15 17:35:32] [config] build-info: "" [2022-01-15 17:35:32] [config] cite: false [2022-01-15 17:35:32] [config] clip-norm: 5 [2022-01-15 17:35:32] [config] cost-scaling: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] cost-type: ce-sum [2022-01-15 17:35:32] [config] cpu-threads: 0 [2022-01-15 17:35:32] [config] data-weighting: "" [2022-01-15 17:35:32] [config] data-weighting-type: sentence [2022-01-15 17:35:32] [config] dec-cell: gru [2022-01-15 17:35:32] [config] dec-cell-base-depth: 2 [2022-01-15 17:35:32] [config] dec-cell-high-depth: 1 [2022-01-15 17:35:32] [config] dec-depth: 6 [2022-01-15 17:35:32] [config] devices: [2022-01-15 17:35:32] [config] - 0 [2022-01-15 17:35:32] [config] dim-emb: 512 [2022-01-15 17:35:32] [config] dim-rnn: 1024 [2022-01-15 17:35:32] [config] dim-vocabs: [2022-01-15 17:35:32] [config] - 0 [2022-01-15 17:35:32] [config] - 0 [2022-01-15 17:35:32] [config] disp-first: 0 [2022-01-15 17:35:32] [config] disp-freq: 500 [2022-01-15 17:35:32] [config] disp-label-counts: true [2022-01-15 17:35:32] [config] dropout-rnn: 0 [2022-01-15 17:35:32] [config] dropout-src: 0 [2022-01-15 17:35:32] [config] dropout-trg: 0 [2022-01-15 17:35:32] [config] dump-config: "" [2022-01-15 17:35:32] [config] early-stopping: 10 [2022-01-15 17:35:32] [config] embedding-fix-src: false [2022-01-15 17:35:32] [config] embedding-fix-trg: false [2022-01-15 17:35:32] [config] embedding-normalization: false [2022-01-15 17:35:32] [config] embedding-vectors: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] enc-cell: gru [2022-01-15 17:35:32] [config] enc-cell-depth: 1 [2022-01-15 17:35:32] [config] enc-depth: 6 [2022-01-15 17:35:32] [config] enc-type: bidirectional [2022-01-15 17:35:32] [config] english-title-case-every: 0 [2022-01-15 17:35:32] [config] exponential-smoothing: 0.0001 [2022-01-15 17:35:32] [config] factor-weight: 1 [2022-01-15 17:35:32] [config] grad-dropping-momentum: 0 [2022-01-15 17:35:32] [config] grad-dropping-rate: 0 [2022-01-15 17:35:32] [config] grad-dropping-warmup: 100 [2022-01-15 17:35:32] [config] gradient-checkpointing: false [2022-01-15 17:35:32] [config] guided-alignment: none [2022-01-15 17:35:32] [config] guided-alignment-cost: mse [2022-01-15 17:35:32] [config] guided-alignment-weight: 0.1 [2022-01-15 17:35:32] [config] ignore-model-config: false [2022-01-15 17:35:32] [config] input-types: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] interpolate-env-vars: false [2022-01-15 17:35:32] [config] keep-best: false [2022-01-15 17:35:32] [config] label-smoothing: 0.1 [2022-01-15 17:35:32] [config] layer-normalization: false [2022-01-15 17:35:32] [config] learn-rate: 0.0003 [2022-01-15 17:35:32] [config] lemma-dim-emb: 0 [2022-01-15 17:35:32] [config] log: /home/wmi/train.log [2022-01-15 17:35:32] [config] log-level: info [2022-01-15 17:35:32] [config] log-time-zone: "" [2022-01-15 17:35:32] [config] 
logical-epoch: [2022-01-15 17:35:32] [config] - 1e [2022-01-15 17:35:32] [config] - 0 [2022-01-15 17:35:32] [config] lr-decay: 0 [2022-01-15 17:35:32] [config] lr-decay-freq: 50000 [2022-01-15 17:35:32] [config] lr-decay-inv-sqrt: [2022-01-15 17:35:32] [config] - 16000 [2022-01-15 17:35:32] [config] lr-decay-repeat-warmup: false [2022-01-15 17:35:32] [config] lr-decay-reset-optimizer: false [2022-01-15 17:35:32] [config] lr-decay-start: [2022-01-15 17:35:32] [config] - 10 [2022-01-15 17:35:32] [config] - 1 [2022-01-15 17:35:32] [config] lr-decay-strategy: epoch+stalled [2022-01-15 17:35:32] [config] lr-report: true [2022-01-15 17:35:32] [config] lr-warmup: 16000 [2022-01-15 17:35:32] [config] lr-warmup-at-reload: false [2022-01-15 17:35:32] [config] lr-warmup-cycle: false [2022-01-15 17:35:32] [config] lr-warmup-start-rate: 0 [2022-01-15 17:35:32] [config] max-length: 100 [2022-01-15 17:35:32] [config] max-length-crop: false [2022-01-15 17:35:32] [config] max-length-factor: 3 [2022-01-15 17:35:32] [config] maxi-batch: 1000 [2022-01-15 17:35:32] [config] maxi-batch-sort: trg [2022-01-15 17:35:32] [config] mini-batch: 64 [2022-01-15 17:35:32] [config] mini-batch-fit: true [2022-01-15 17:35:32] [config] mini-batch-fit-step: 10 [2022-01-15 17:35:32] [config] mini-batch-track-lr: false [2022-01-15 17:35:32] [config] mini-batch-warmup: 0 [2022-01-15 17:35:32] [config] mini-batch-words: 0 [2022-01-15 17:35:32] [config] mini-batch-words-ref: 0 [2022-01-15 17:35:32] [config] model: model.npz [2022-01-15 17:35:32] [config] multi-loss-type: sum [2022-01-15 17:35:32] [config] multi-node: false [2022-01-15 17:35:32] [config] multi-node-overlap: true [2022-01-15 17:35:32] [config] n-best: false [2022-01-15 17:35:32] [config] no-nccl: false [2022-01-15 17:35:32] [config] no-reload: false [2022-01-15 17:35:32] [config] no-restore-corpus: false [2022-01-15 17:35:32] [config] normalize: 0.6 [2022-01-15 17:35:32] [config] normalize-gradient: false [2022-01-15 17:35:32] [config] num-devices: 0 [2022-01-15 17:35:32] [config] optimizer: adam [2022-01-15 17:35:32] [config] optimizer-delay: 1 [2022-01-15 17:35:32] [config] optimizer-params: [2022-01-15 17:35:32] [config] - 0.9 [2022-01-15 17:35:32] [config] - 0.98 [2022-01-15 17:35:32] [config] - 1e-09 [2022-01-15 17:35:32] [config] output-omit-bias: false [2022-01-15 17:35:32] [config] overwrite: true [2022-01-15 17:35:32] [config] precision: [2022-01-15 17:35:32] [config] - float32 [2022-01-15 17:35:32] [config] - float32 [2022-01-15 17:35:32] [config] - float32 [2022-01-15 17:35:32] [config] pretrained-model: "" [2022-01-15 17:35:32] [config] quantize-biases: false [2022-01-15 17:35:32] [config] quantize-bits: 0 [2022-01-15 17:35:32] [config] quantize-log-based: false [2022-01-15 17:35:32] [config] quantize-optimization-steps: 0 [2022-01-15 17:35:32] [config] quiet: false [2022-01-15 17:35:32] [config] quiet-translation: false [2022-01-15 17:35:32] [config] relative-paths: false [2022-01-15 17:35:32] [config] right-left: false [2022-01-15 17:35:32] [config] save-freq: 5000 [2022-01-15 17:35:32] [config] seed: 0 [2022-01-15 17:35:32] [config] sentencepiece-alphas: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] sentencepiece-max-lines: 2000000 [2022-01-15 17:35:32] [config] sentencepiece-options: "" [2022-01-15 17:35:32] [config] shuffle: data [2022-01-15 17:35:32] [config] shuffle-in-ram: false [2022-01-15 17:35:32] [config] sigterm: save-and-exit [2022-01-15 17:35:32] [config] skip: false [2022-01-15 17:35:32] [config] sqlite: "" [2022-01-15 
17:35:32] [config] sqlite-drop: false [2022-01-15 17:35:32] [config] sync-sgd: false [2022-01-15 17:35:32] [config] tempdir: /tmp [2022-01-15 17:35:32] [config] tied-embeddings: true [2022-01-15 17:35:32] [config] tied-embeddings-all: false [2022-01-15 17:35:32] [config] tied-embeddings-src: false [2022-01-15 17:35:32] [config] train-embedder-rank: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] train-sets: [2022-01-15 17:35:32] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv [2022-01-15 17:35:32] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv [2022-01-15 17:35:32] [config] transformer-aan-activation: swish [2022-01-15 17:35:32] [config] transformer-aan-depth: 2 [2022-01-15 17:35:32] [config] transformer-aan-nogate: false [2022-01-15 17:35:32] [config] transformer-decoder-autoreg: self-attention [2022-01-15 17:35:32] [config] transformer-depth-scaling: false [2022-01-15 17:35:32] [config] transformer-dim-aan: 2048 [2022-01-15 17:35:32] [config] transformer-dim-ffn: 2048 [2022-01-15 17:35:32] [config] transformer-dropout: 0.1 [2022-01-15 17:35:32] [config] transformer-dropout-attention: 0 [2022-01-15 17:35:32] [config] transformer-dropout-ffn: 0 [2022-01-15 17:35:32] [config] transformer-ffn-activation: swish [2022-01-15 17:35:32] [config] transformer-ffn-depth: 2 [2022-01-15 17:35:32] [config] transformer-guided-alignment-layer: last [2022-01-15 17:35:32] [config] transformer-heads: 8 [2022-01-15 17:35:32] [config] transformer-no-projection: false [2022-01-15 17:35:32] [config] transformer-pool: false [2022-01-15 17:35:32] [config] transformer-postprocess: dan [2022-01-15 17:35:32] [config] transformer-postprocess-emb: d [2022-01-15 17:35:32] [config] transformer-postprocess-top: "" [2022-01-15 17:35:32] [config] transformer-preprocess: "" [2022-01-15 17:35:32] [config] transformer-tied-layers: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] transformer-train-position-embeddings: false [2022-01-15 17:35:32] [config] tsv: false [2022-01-15 17:35:32] [config] tsv-fields: 0 [2022-01-15 17:35:32] [config] type: transformer [2022-01-15 17:35:32] [config] ulr: false [2022-01-15 17:35:32] [config] ulr-dim-emb: 0 [2022-01-15 17:35:32] [config] ulr-dropout: 0 [2022-01-15 17:35:32] [config] ulr-keys-vectors: "" [2022-01-15 17:35:32] [config] ulr-query-vectors: "" [2022-01-15 17:35:32] [config] ulr-softmax-temperature: 1 [2022-01-15 17:35:32] [config] ulr-trainable-transformation: false [2022-01-15 17:35:32] [config] unlikelihood-loss: false [2022-01-15 17:35:32] [config] valid-freq: 5000 [2022-01-15 17:35:32] [config] valid-log: "" [2022-01-15 17:35:32] [config] valid-max-length: 1000 [2022-01-15 17:35:32] [config] valid-metrics: [2022-01-15 17:35:32] [config] - cross-entropy [2022-01-15 17:35:32] [config] valid-mini-batch: 32 [2022-01-15 17:35:32] [config] valid-reset-stalled: false [2022-01-15 17:35:32] [config] valid-script-args: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] valid-script-path: "" [2022-01-15 17:35:32] [config] valid-sets: [2022-01-15 17:35:32] [config] [] [2022-01-15 17:35:32] [config] valid-translation-output: "" [2022-01-15 17:35:32] [config] vocabs: [2022-01-15 17:35:32] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:35:32] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:35:32] [config] word-penalty: 0 [2022-01-15 17:35:32] [config] 
word-scores: false [2022-01-15 17:35:32] [config] workspace: 10000 [2022-01-15 17:35:32] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:35:32] [training] Using single-device training [2022-01-15 17:35:32] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:35:32] Error: Unhandled exception of type 'N4YAML18TypedBadConversionINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE': yaml-cpp: error at line 1, column 1: bad conversion [2022-01-15 17:35:32] Error: Aborted from void unhandledException() in /home/wmi/Workspace/marian/src/common/logging.cpp:113 [CALL STACK] [0x561c3cd7b5e6] + 0x29c5e6 [0x7f4b0819138c] + 0xaa38c [0x7f4b081913f7] + 0xaa3f7 [0x7f4b081916a9] + 0xaa6a9 [0x561c3d007c20] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x1130 [0x561c3cff6e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x561c3cff7728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x561c3d043189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x561c3d056084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x561c3ceb4f8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x561c3cf3c94b] marian::Train:: run () + 0x19cb [0x561c3ce43389] mainTrainer (int, char**) + 0x5e9 [0x561c3ce011bc] main + 0x3c [0x7f4b07db20b3] __libc_start_main + 0xf3 [0x561c3ce41b0e] _start + 0x2e [2022-01-15 17:40:11] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:40:11] [marian] Running on s470607-gpu as process 3743 with command line: [2022-01-15 17:40:11] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 --vocabs /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:40:11] [config] after: 0e [2022-01-15 17:40:11] [config] after-batches: 0 [2022-01-15 17:40:11] [config] after-epochs: 1 [2022-01-15 17:40:11] [config] all-caps-every: 0 [2022-01-15 17:40:11] [config] allow-unk: false [2022-01-15 17:40:11] [config] authors: false [2022-01-15 17:40:11] [config] beam-size: 6 [2022-01-15 17:40:11] [config] bert-class-symbol: "[CLS]" [2022-01-15 17:40:11] [config] bert-mask-symbol: "[MASK]" [2022-01-15 17:40:11] [config] bert-masking-fraction: 0.15 [2022-01-15 17:40:11] [config] bert-sep-symbol: "[SEP]" [2022-01-15 17:40:11] [config] bert-train-type-embeddings: true [2022-01-15 17:40:11] [config] bert-type-vocab-size: 2 [2022-01-15 17:40:11] [config] build-info: "" [2022-01-15 17:40:11] [config] cite: false 
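Aside (illustrative, not part of the original log): the repeated "yaml-cpp: error at line 1, column 1: bad conversion" abort above is thrown while DefaultVocab loads in.tsv.vocab.10000.yml, and it typically means the file's top-level node is not the flat token-to-integer-id mapping Marian expects. A rough way to inspect the file outside Marian, assuming PyYAML is available (this only approximates yaml-cpp's stricter conversion checks):

```python
# Sketch only: parse the vocabulary file and confirm it is a flat
# {token: integer id} mapping, which is what Marian's DefaultVocab expects.
import yaml  # PyYAML; an approximation of what yaml-cpp accepts

path = "/home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.vocab.10000.yml"

with open(path, encoding="utf-8") as f:
    data = yaml.safe_load(f)

if not isinstance(data, dict):
    raise SystemExit(f"top-level node is {type(data).__name__}, expected a mapping")

non_int = [k for k, v in data.items() if not isinstance(v, int)]
print(f"{len(data)} entries, {len(non_int)} with non-integer ids")
```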
[2022-01-15 17:40:11] [config] clip-norm: 5 [2022-01-15 17:40:11] [config] cost-scaling: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] cost-type: ce-sum [2022-01-15 17:40:11] [config] cpu-threads: 0 [2022-01-15 17:40:11] [config] data-weighting: "" [2022-01-15 17:40:11] [config] data-weighting-type: sentence [2022-01-15 17:40:11] [config] dec-cell: gru [2022-01-15 17:40:11] [config] dec-cell-base-depth: 2 [2022-01-15 17:40:11] [config] dec-cell-high-depth: 1 [2022-01-15 17:40:11] [config] dec-depth: 6 [2022-01-15 17:40:11] [config] devices: [2022-01-15 17:40:11] [config] - 0 [2022-01-15 17:40:11] [config] dim-emb: 512 [2022-01-15 17:40:11] [config] dim-rnn: 1024 [2022-01-15 17:40:11] [config] dim-vocabs: [2022-01-15 17:40:11] [config] - 0 [2022-01-15 17:40:11] [config] - 0 [2022-01-15 17:40:11] [config] disp-first: 0 [2022-01-15 17:40:11] [config] disp-freq: 500 [2022-01-15 17:40:11] [config] disp-label-counts: true [2022-01-15 17:40:11] [config] dropout-rnn: 0 [2022-01-15 17:40:11] [config] dropout-src: 0 [2022-01-15 17:40:11] [config] dropout-trg: 0 [2022-01-15 17:40:11] [config] dump-config: "" [2022-01-15 17:40:11] [config] early-stopping: 10 [2022-01-15 17:40:11] [config] embedding-fix-src: false [2022-01-15 17:40:11] [config] embedding-fix-trg: false [2022-01-15 17:40:11] [config] embedding-normalization: false [2022-01-15 17:40:11] [config] embedding-vectors: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] enc-cell: gru [2022-01-15 17:40:11] [config] enc-cell-depth: 1 [2022-01-15 17:40:11] [config] enc-depth: 6 [2022-01-15 17:40:11] [config] enc-type: bidirectional [2022-01-15 17:40:11] [config] english-title-case-every: 0 [2022-01-15 17:40:11] [config] exponential-smoothing: 0.0001 [2022-01-15 17:40:11] [config] factor-weight: 1 [2022-01-15 17:40:11] [config] grad-dropping-momentum: 0 [2022-01-15 17:40:11] [config] grad-dropping-rate: 0 [2022-01-15 17:40:11] [config] grad-dropping-warmup: 100 [2022-01-15 17:40:11] [config] gradient-checkpointing: false [2022-01-15 17:40:11] [config] guided-alignment: none [2022-01-15 17:40:11] [config] guided-alignment-cost: mse [2022-01-15 17:40:11] [config] guided-alignment-weight: 0.1 [2022-01-15 17:40:11] [config] ignore-model-config: false [2022-01-15 17:40:11] [config] input-types: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] interpolate-env-vars: false [2022-01-15 17:40:11] [config] keep-best: false [2022-01-15 17:40:11] [config] label-smoothing: 0.1 [2022-01-15 17:40:11] [config] layer-normalization: false [2022-01-15 17:40:11] [config] learn-rate: 0.0003 [2022-01-15 17:40:11] [config] lemma-dim-emb: 0 [2022-01-15 17:40:11] [config] log: /home/wmi/train.log [2022-01-15 17:40:11] [config] log-level: info [2022-01-15 17:40:11] [config] log-time-zone: "" [2022-01-15 17:40:11] [config] logical-epoch: [2022-01-15 17:40:11] [config] - 1e [2022-01-15 17:40:11] [config] - 0 [2022-01-15 17:40:11] [config] lr-decay: 0 [2022-01-15 17:40:11] [config] lr-decay-freq: 50000 [2022-01-15 17:40:11] [config] lr-decay-inv-sqrt: [2022-01-15 17:40:11] [config] - 16000 [2022-01-15 17:40:11] [config] lr-decay-repeat-warmup: false [2022-01-15 17:40:11] [config] lr-decay-reset-optimizer: false [2022-01-15 17:40:11] [config] lr-decay-start: [2022-01-15 17:40:11] [config] - 10 [2022-01-15 17:40:11] [config] - 1 [2022-01-15 17:40:11] [config] lr-decay-strategy: epoch+stalled [2022-01-15 17:40:11] [config] lr-report: true [2022-01-15 17:40:11] [config] lr-warmup: 16000 [2022-01-15 17:40:11] [config] 
lr-warmup-at-reload: false [2022-01-15 17:40:11] [config] lr-warmup-cycle: false [2022-01-15 17:40:11] [config] lr-warmup-start-rate: 0 [2022-01-15 17:40:11] [config] max-length: 100 [2022-01-15 17:40:11] [config] max-length-crop: false [2022-01-15 17:40:11] [config] max-length-factor: 3 [2022-01-15 17:40:11] [config] maxi-batch: 1000 [2022-01-15 17:40:11] [config] maxi-batch-sort: trg [2022-01-15 17:40:11] [config] mini-batch: 64 [2022-01-15 17:40:11] [config] mini-batch-fit: true [2022-01-15 17:40:11] [config] mini-batch-fit-step: 10 [2022-01-15 17:40:11] [config] mini-batch-track-lr: false [2022-01-15 17:40:11] [config] mini-batch-warmup: 0 [2022-01-15 17:40:11] [config] mini-batch-words: 0 [2022-01-15 17:40:11] [config] mini-batch-words-ref: 0 [2022-01-15 17:40:11] [config] model: model.npz [2022-01-15 17:40:11] [config] multi-loss-type: sum [2022-01-15 17:40:11] [config] multi-node: false [2022-01-15 17:40:11] [config] multi-node-overlap: true [2022-01-15 17:40:11] [config] n-best: false [2022-01-15 17:40:11] [config] no-nccl: false [2022-01-15 17:40:11] [config] no-reload: false [2022-01-15 17:40:11] [config] no-restore-corpus: false [2022-01-15 17:40:11] [config] normalize: 0.6 [2022-01-15 17:40:11] [config] normalize-gradient: false [2022-01-15 17:40:11] [config] num-devices: 0 [2022-01-15 17:40:11] [config] optimizer: adam [2022-01-15 17:40:11] [config] optimizer-delay: 1 [2022-01-15 17:40:11] [config] optimizer-params: [2022-01-15 17:40:11] [config] - 0.9 [2022-01-15 17:40:11] [config] - 0.98 [2022-01-15 17:40:11] [config] - 1e-09 [2022-01-15 17:40:11] [config] output-omit-bias: false [2022-01-15 17:40:11] [config] overwrite: true [2022-01-15 17:40:11] [config] precision: [2022-01-15 17:40:11] [config] - float32 [2022-01-15 17:40:11] [config] - float32 [2022-01-15 17:40:11] [config] - float32 [2022-01-15 17:40:11] [config] pretrained-model: "" [2022-01-15 17:40:11] [config] quantize-biases: false [2022-01-15 17:40:11] [config] quantize-bits: 0 [2022-01-15 17:40:11] [config] quantize-log-based: false [2022-01-15 17:40:11] [config] quantize-optimization-steps: 0 [2022-01-15 17:40:11] [config] quiet: false [2022-01-15 17:40:11] [config] quiet-translation: false [2022-01-15 17:40:11] [config] relative-paths: false [2022-01-15 17:40:11] [config] right-left: false [2022-01-15 17:40:11] [config] save-freq: 5000 [2022-01-15 17:40:11] [config] seed: 0 [2022-01-15 17:40:11] [config] sentencepiece-alphas: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] sentencepiece-max-lines: 2000000 [2022-01-15 17:40:11] [config] sentencepiece-options: "" [2022-01-15 17:40:11] [config] shuffle: data [2022-01-15 17:40:11] [config] shuffle-in-ram: false [2022-01-15 17:40:11] [config] sigterm: save-and-exit [2022-01-15 17:40:11] [config] skip: false [2022-01-15 17:40:11] [config] sqlite: "" [2022-01-15 17:40:11] [config] sqlite-drop: false [2022-01-15 17:40:11] [config] sync-sgd: false [2022-01-15 17:40:11] [config] tempdir: /tmp [2022-01-15 17:40:11] [config] tied-embeddings: true [2022-01-15 17:40:11] [config] tied-embeddings-all: false [2022-01-15 17:40:11] [config] tied-embeddings-src: false [2022-01-15 17:40:11] [config] train-embedder-rank: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] train-sets: [2022-01-15 17:40:11] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv [2022-01-15 17:40:11] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv [2022-01-15 17:40:11] [config] transformer-aan-activation: swish 
[2022-01-15 17:40:11] [config] transformer-aan-depth: 2 [2022-01-15 17:40:11] [config] transformer-aan-nogate: false [2022-01-15 17:40:11] [config] transformer-decoder-autoreg: self-attention [2022-01-15 17:40:11] [config] transformer-depth-scaling: false [2022-01-15 17:40:11] [config] transformer-dim-aan: 2048 [2022-01-15 17:40:11] [config] transformer-dim-ffn: 2048 [2022-01-15 17:40:11] [config] transformer-dropout: 0.1 [2022-01-15 17:40:11] [config] transformer-dropout-attention: 0 [2022-01-15 17:40:11] [config] transformer-dropout-ffn: 0 [2022-01-15 17:40:11] [config] transformer-ffn-activation: swish [2022-01-15 17:40:11] [config] transformer-ffn-depth: 2 [2022-01-15 17:40:11] [config] transformer-guided-alignment-layer: last [2022-01-15 17:40:11] [config] transformer-heads: 8 [2022-01-15 17:40:11] [config] transformer-no-projection: false [2022-01-15 17:40:11] [config] transformer-pool: false [2022-01-15 17:40:11] [config] transformer-postprocess: dan [2022-01-15 17:40:11] [config] transformer-postprocess-emb: d [2022-01-15 17:40:11] [config] transformer-postprocess-top: "" [2022-01-15 17:40:11] [config] transformer-preprocess: "" [2022-01-15 17:40:11] [config] transformer-tied-layers: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] transformer-train-position-embeddings: false [2022-01-15 17:40:11] [config] tsv: false [2022-01-15 17:40:11] [config] tsv-fields: 0 [2022-01-15 17:40:11] [config] type: transformer [2022-01-15 17:40:11] [config] ulr: false [2022-01-15 17:40:11] [config] ulr-dim-emb: 0 [2022-01-15 17:40:11] [config] ulr-dropout: 0 [2022-01-15 17:40:11] [config] ulr-keys-vectors: "" [2022-01-15 17:40:11] [config] ulr-query-vectors: "" [2022-01-15 17:40:11] [config] ulr-softmax-temperature: 1 [2022-01-15 17:40:11] [config] ulr-trainable-transformation: false [2022-01-15 17:40:11] [config] unlikelihood-loss: false [2022-01-15 17:40:11] [config] valid-freq: 5000 [2022-01-15 17:40:11] [config] valid-log: "" [2022-01-15 17:40:11] [config] valid-max-length: 1000 [2022-01-15 17:40:11] [config] valid-metrics: [2022-01-15 17:40:11] [config] - cross-entropy [2022-01-15 17:40:11] [config] valid-mini-batch: 32 [2022-01-15 17:40:11] [config] valid-reset-stalled: false [2022-01-15 17:40:11] [config] valid-script-args: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] valid-script-path: "" [2022-01-15 17:40:11] [config] valid-sets: [2022-01-15 17:40:11] [config] [] [2022-01-15 17:40:11] [config] valid-translation-output: "" [2022-01-15 17:40:11] [config] vocabs: [2022-01-15 17:40:11] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:40:11] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:40:11] [config] word-penalty: 0 [2022-01-15 17:40:11] [config] word-scores: false [2022-01-15 17:40:11] [config] workspace: 10000 [2022-01-15 17:40:11] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:40:11] [training] Using single-device training [2022-01-15 17:40:11] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:40:11] Error: Unhandled exception of type 'N4YAML18TypedBadConversionINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE': yaml-cpp: error at line 1, column 1: bad conversion [2022-01-15 17:40:11] Error: Aborted from void unhandledException() in 
/home/wmi/Workspace/marian/src/common/logging.cpp:113 [CALL STACK] [0x564653fb55e6] + 0x29c5e6 [0x7f476d23938c] + 0xaa38c [0x7f476d2393f7] + 0xaa3f7 [0x7f476d2396a9] + 0xaa6a9 [0x564654241c20] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x1130 [0x564654230e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x564654231728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x56465427d189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x564654290084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x5646540eef8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x56465417694b] marian::Train:: run () + 0x19cb [0x56465407d389] mainTrainer (int, char**) + 0x5e9 [0x56465403b1bc] main + 0x3c [0x7f476ce5a0b3] __libc_start_main + 0xf3 [0x56465407bb0e] _start + 0x2e [2022-01-15 17:45:28] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:45:28] [marian] Running on s470607-gpu as process 3792 with command line: [2022-01-15 17:45:28] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 --vocabs /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:45:28] [config] after: 0e [2022-01-15 17:45:28] [config] after-batches: 0 [2022-01-15 17:45:28] [config] after-epochs: 1 [2022-01-15 17:45:28] [config] all-caps-every: 0 [2022-01-15 17:45:28] [config] allow-unk: false [2022-01-15 17:45:28] [config] authors: false [2022-01-15 17:45:28] [config] beam-size: 6 [2022-01-15 17:45:28] [config] bert-class-symbol: "[CLS]" [2022-01-15 17:45:28] [config] bert-mask-symbol: "[MASK]" [2022-01-15 17:45:28] [config] bert-masking-fraction: 0.15 [2022-01-15 17:45:28] [config] bert-sep-symbol: "[SEP]" [2022-01-15 17:45:28] [config] bert-train-type-embeddings: true [2022-01-15 17:45:28] [config] bert-type-vocab-size: 2 [2022-01-15 17:45:28] [config] build-info: "" [2022-01-15 17:45:28] [config] cite: false [2022-01-15 17:45:28] [config] clip-norm: 5 [2022-01-15 17:45:28] [config] cost-scaling: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] cost-type: ce-sum [2022-01-15 17:45:28] [config] cpu-threads: 0 [2022-01-15 17:45:28] [config] data-weighting: "" [2022-01-15 17:45:28] [config] data-weighting-type: sentence [2022-01-15 17:45:28] [config] dec-cell: gru [2022-01-15 17:45:28] [config] dec-cell-base-depth: 2 [2022-01-15 17:45:28] [config] dec-cell-high-depth: 1 [2022-01-15 17:45:28] [config] dec-depth: 6 [2022-01-15 17:45:28] [config] devices: [2022-01-15 17:45:28] [config] - 0 [2022-01-15 17:45:28] [config] dim-emb: 512 
[2022-01-15 17:45:28] [config] dim-rnn: 1024 [2022-01-15 17:45:28] [config] dim-vocabs: [2022-01-15 17:45:28] [config] - 0 [2022-01-15 17:45:28] [config] - 0 [2022-01-15 17:45:28] [config] disp-first: 0 [2022-01-15 17:45:28] [config] disp-freq: 500 [2022-01-15 17:45:28] [config] disp-label-counts: true [2022-01-15 17:45:28] [config] dropout-rnn: 0 [2022-01-15 17:45:28] [config] dropout-src: 0 [2022-01-15 17:45:28] [config] dropout-trg: 0 [2022-01-15 17:45:28] [config] dump-config: "" [2022-01-15 17:45:28] [config] early-stopping: 10 [2022-01-15 17:45:28] [config] embedding-fix-src: false [2022-01-15 17:45:28] [config] embedding-fix-trg: false [2022-01-15 17:45:28] [config] embedding-normalization: false [2022-01-15 17:45:28] [config] embedding-vectors: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] enc-cell: gru [2022-01-15 17:45:28] [config] enc-cell-depth: 1 [2022-01-15 17:45:28] [config] enc-depth: 6 [2022-01-15 17:45:28] [config] enc-type: bidirectional [2022-01-15 17:45:28] [config] english-title-case-every: 0 [2022-01-15 17:45:28] [config] exponential-smoothing: 0.0001 [2022-01-15 17:45:28] [config] factor-weight: 1 [2022-01-15 17:45:28] [config] grad-dropping-momentum: 0 [2022-01-15 17:45:28] [config] grad-dropping-rate: 0 [2022-01-15 17:45:28] [config] grad-dropping-warmup: 100 [2022-01-15 17:45:28] [config] gradient-checkpointing: false [2022-01-15 17:45:28] [config] guided-alignment: none [2022-01-15 17:45:28] [config] guided-alignment-cost: mse [2022-01-15 17:45:28] [config] guided-alignment-weight: 0.1 [2022-01-15 17:45:28] [config] ignore-model-config: false [2022-01-15 17:45:28] [config] input-types: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] interpolate-env-vars: false [2022-01-15 17:45:28] [config] keep-best: false [2022-01-15 17:45:28] [config] label-smoothing: 0.1 [2022-01-15 17:45:28] [config] layer-normalization: false [2022-01-15 17:45:28] [config] learn-rate: 0.0003 [2022-01-15 17:45:28] [config] lemma-dim-emb: 0 [2022-01-15 17:45:28] [config] log: /home/wmi/train.log [2022-01-15 17:45:28] [config] log-level: info [2022-01-15 17:45:28] [config] log-time-zone: "" [2022-01-15 17:45:28] [config] logical-epoch: [2022-01-15 17:45:28] [config] - 1e [2022-01-15 17:45:28] [config] - 0 [2022-01-15 17:45:28] [config] lr-decay: 0 [2022-01-15 17:45:28] [config] lr-decay-freq: 50000 [2022-01-15 17:45:28] [config] lr-decay-inv-sqrt: [2022-01-15 17:45:28] [config] - 16000 [2022-01-15 17:45:28] [config] lr-decay-repeat-warmup: false [2022-01-15 17:45:28] [config] lr-decay-reset-optimizer: false [2022-01-15 17:45:28] [config] lr-decay-start: [2022-01-15 17:45:28] [config] - 10 [2022-01-15 17:45:28] [config] - 1 [2022-01-15 17:45:28] [config] lr-decay-strategy: epoch+stalled [2022-01-15 17:45:28] [config] lr-report: true [2022-01-15 17:45:28] [config] lr-warmup: 16000 [2022-01-15 17:45:28] [config] lr-warmup-at-reload: false [2022-01-15 17:45:28] [config] lr-warmup-cycle: false [2022-01-15 17:45:28] [config] lr-warmup-start-rate: 0 [2022-01-15 17:45:28] [config] max-length: 100 [2022-01-15 17:45:28] [config] max-length-crop: false [2022-01-15 17:45:28] [config] max-length-factor: 3 [2022-01-15 17:45:28] [config] maxi-batch: 1000 [2022-01-15 17:45:28] [config] maxi-batch-sort: trg [2022-01-15 17:45:28] [config] mini-batch: 64 [2022-01-15 17:45:28] [config] mini-batch-fit: true [2022-01-15 17:45:28] [config] mini-batch-fit-step: 10 [2022-01-15 17:45:28] [config] mini-batch-track-lr: false [2022-01-15 17:45:28] [config] mini-batch-warmup: 0 
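Aside (illustrative, not part of the original log): if the *.vocab.10000.yml files passed via --vocabs turn out to be malformed, one option is to rebuild them from the training data; Marian ships a marian-vocab tool for this, and the sketch below shows the same idea in plain Python (assumptions not shown in the log: whitespace tokenisation, a 10,000-entry cap matching the file names, and Marian's default special tokens </s> = 0 and <unk> = 1).

```python
# Sketch only: regenerate a flat "token": id vocabulary in the YAML form
# Marian reads, capped at 10,000 entries to match *.vocab.10000.yml.
from collections import Counter

SRC = "/home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv"
OUT = "/home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.vocab.10000.yml"
MAX_SIZE = 10000

counts = Counter()
with open(SRC, encoding="utf-8") as f:
    for line in f:
        counts.update(line.split())  # assumes whitespace-tokenised text

tokens = ["</s>", "<unk>"] + [t for t, _ in counts.most_common(MAX_SIZE - 2)]

with open(OUT, "w", encoding="utf-8") as f:
    for idx, tok in enumerate(tokens):
        # Double-quote and escape each token so YAML special characters
        # cannot break the mapping.
        escaped = tok.replace("\\", "\\\\").replace('"', '\\"')
        f.write(f'"{escaped}": {idx}\n')
```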
[2022-01-15 17:45:28] [config] mini-batch-words: 0 [2022-01-15 17:45:28] [config] mini-batch-words-ref: 0 [2022-01-15 17:45:28] [config] model: model.npz [2022-01-15 17:45:28] [config] multi-loss-type: sum [2022-01-15 17:45:28] [config] multi-node: false [2022-01-15 17:45:28] [config] multi-node-overlap: true [2022-01-15 17:45:28] [config] n-best: false [2022-01-15 17:45:28] [config] no-nccl: false [2022-01-15 17:45:28] [config] no-reload: false [2022-01-15 17:45:28] [config] no-restore-corpus: false [2022-01-15 17:45:28] [config] normalize: 0.6 [2022-01-15 17:45:28] [config] normalize-gradient: false [2022-01-15 17:45:28] [config] num-devices: 0 [2022-01-15 17:45:28] [config] optimizer: adam [2022-01-15 17:45:28] [config] optimizer-delay: 1 [2022-01-15 17:45:28] [config] optimizer-params: [2022-01-15 17:45:28] [config] - 0.9 [2022-01-15 17:45:28] [config] - 0.98 [2022-01-15 17:45:28] [config] - 1e-09 [2022-01-15 17:45:28] [config] output-omit-bias: false [2022-01-15 17:45:28] [config] overwrite: true [2022-01-15 17:45:28] [config] precision: [2022-01-15 17:45:28] [config] - float32 [2022-01-15 17:45:28] [config] - float32 [2022-01-15 17:45:28] [config] - float32 [2022-01-15 17:45:28] [config] pretrained-model: "" [2022-01-15 17:45:28] [config] quantize-biases: false [2022-01-15 17:45:28] [config] quantize-bits: 0 [2022-01-15 17:45:28] [config] quantize-log-based: false [2022-01-15 17:45:28] [config] quantize-optimization-steps: 0 [2022-01-15 17:45:28] [config] quiet: false [2022-01-15 17:45:28] [config] quiet-translation: false [2022-01-15 17:45:28] [config] relative-paths: false [2022-01-15 17:45:28] [config] right-left: false [2022-01-15 17:45:28] [config] save-freq: 5000 [2022-01-15 17:45:28] [config] seed: 0 [2022-01-15 17:45:28] [config] sentencepiece-alphas: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] sentencepiece-max-lines: 2000000 [2022-01-15 17:45:28] [config] sentencepiece-options: "" [2022-01-15 17:45:28] [config] shuffle: data [2022-01-15 17:45:28] [config] shuffle-in-ram: false [2022-01-15 17:45:28] [config] sigterm: save-and-exit [2022-01-15 17:45:28] [config] skip: false [2022-01-15 17:45:28] [config] sqlite: "" [2022-01-15 17:45:28] [config] sqlite-drop: false [2022-01-15 17:45:28] [config] sync-sgd: false [2022-01-15 17:45:28] [config] tempdir: /tmp [2022-01-15 17:45:28] [config] tied-embeddings: true [2022-01-15 17:45:28] [config] tied-embeddings-all: false [2022-01-15 17:45:28] [config] tied-embeddings-src: false [2022-01-15 17:45:28] [config] train-embedder-rank: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] train-sets: [2022-01-15 17:45:28] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv [2022-01-15 17:45:28] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv [2022-01-15 17:45:28] [config] transformer-aan-activation: swish [2022-01-15 17:45:28] [config] transformer-aan-depth: 2 [2022-01-15 17:45:28] [config] transformer-aan-nogate: false [2022-01-15 17:45:28] [config] transformer-decoder-autoreg: self-attention [2022-01-15 17:45:28] [config] transformer-depth-scaling: false [2022-01-15 17:45:28] [config] transformer-dim-aan: 2048 [2022-01-15 17:45:28] [config] transformer-dim-ffn: 2048 [2022-01-15 17:45:28] [config] transformer-dropout: 0.1 [2022-01-15 17:45:28] [config] transformer-dropout-attention: 0 [2022-01-15 17:45:28] [config] transformer-dropout-ffn: 0 [2022-01-15 17:45:28] [config] transformer-ffn-activation: swish [2022-01-15 17:45:28] [config] 
transformer-ffn-depth: 2 [2022-01-15 17:45:28] [config] transformer-guided-alignment-layer: last [2022-01-15 17:45:28] [config] transformer-heads: 8 [2022-01-15 17:45:28] [config] transformer-no-projection: false [2022-01-15 17:45:28] [config] transformer-pool: false [2022-01-15 17:45:28] [config] transformer-postprocess: dan [2022-01-15 17:45:28] [config] transformer-postprocess-emb: d [2022-01-15 17:45:28] [config] transformer-postprocess-top: "" [2022-01-15 17:45:28] [config] transformer-preprocess: "" [2022-01-15 17:45:28] [config] transformer-tied-layers: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] transformer-train-position-embeddings: false [2022-01-15 17:45:28] [config] tsv: false [2022-01-15 17:45:28] [config] tsv-fields: 0 [2022-01-15 17:45:28] [config] type: transformer [2022-01-15 17:45:28] [config] ulr: false [2022-01-15 17:45:28] [config] ulr-dim-emb: 0 [2022-01-15 17:45:28] [config] ulr-dropout: 0 [2022-01-15 17:45:28] [config] ulr-keys-vectors: "" [2022-01-15 17:45:28] [config] ulr-query-vectors: "" [2022-01-15 17:45:28] [config] ulr-softmax-temperature: 1 [2022-01-15 17:45:28] [config] ulr-trainable-transformation: false [2022-01-15 17:45:28] [config] unlikelihood-loss: false [2022-01-15 17:45:28] [config] valid-freq: 5000 [2022-01-15 17:45:28] [config] valid-log: "" [2022-01-15 17:45:28] [config] valid-max-length: 1000 [2022-01-15 17:45:28] [config] valid-metrics: [2022-01-15 17:45:28] [config] - cross-entropy [2022-01-15 17:45:28] [config] valid-mini-batch: 32 [2022-01-15 17:45:28] [config] valid-reset-stalled: false [2022-01-15 17:45:28] [config] valid-script-args: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] valid-script-path: "" [2022-01-15 17:45:28] [config] valid-sets: [2022-01-15 17:45:28] [config] [] [2022-01-15 17:45:28] [config] valid-translation-output: "" [2022-01-15 17:45:28] [config] vocabs: [2022-01-15 17:45:28] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:45:28] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge//train/expected.tsv.vocab.10000.yml [2022-01-15 17:45:28] [config] word-penalty: 0 [2022-01-15 17:45:28] [config] word-scores: false [2022-01-15 17:45:28] [config] workspace: 10000 [2022-01-15 17:45:28] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:45:28] [training] Using single-device training [2022-01-15 17:45:28] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge//train/in.tsv.vocab.10000.yml [2022-01-15 17:45:28] Error: Unhandled exception of type 'N4YAML15ParserExceptionE': yaml-cpp: error at line 14, column 1: end of map not found [2022-01-15 17:45:28] Error: Aborted from void unhandledException() in /home/wmi/Workspace/marian/src/common/logging.cpp:113 [CALL STACK] [0x5633dffce5e6] + 0x29c5e6 [0x7f5448b7938c] + 0xaa38c [0x7f5448b793f7] + 0xaa3f7 [0x7f5448b796a9] + 0xaa6a9 [0x5633e001b7c7] + 0x2e97c7 [0x5633e060b658] YAML::SingleDocParser:: HandleNode (YAML::EventHandler&) + 0x278 [0x5633e060bbcc] YAML::SingleDocParser:: HandleDocument (YAML::EventHandler&) + 0x5c [0x5633e05f0dcd] YAML::Parser:: HandleNextDocument (YAML::EventHandler&) + 0x7d [0x5633e05ed6d9] YAML:: Load (std::istream&) + 0x49 [0x5633e025a328] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x838 [0x5633e0249e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a 
[0x5633e024a728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x5633e0296189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x5633e02a9084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x5633e0107f8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c [0x5633e018f94b] marian::Train:: run () + 0x19cb [0x5633e0096389] mainTrainer (int, char**) + 0x5e9 [0x5633e00541bc] main + 0x3c [0x7f544879a0b3] __libc_start_main + 0xf3 [0x5633e0094b0e] _start + 0x2e [2022-01-15 17:51:29] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:51:29] [marian] Running on s470607-gpu as process 3840 with command line: [2022-01-15 17:51:29] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 --vocabs /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.vocab.10000.yml /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.vocab.10000.yml [2022-01-15 17:51:29] [config] after: 0e [2022-01-15 17:51:29] [config] after-batches: 0 [2022-01-15 17:51:29] [config] after-epochs: 1 [2022-01-15 17:51:29] [config] all-caps-every: 0 [2022-01-15 17:51:29] [config] allow-unk: false [2022-01-15 17:51:29] [config] authors: false [2022-01-15 17:51:29] [config] beam-size: 6 [2022-01-15 17:51:29] [config] bert-class-symbol: "[CLS]" [2022-01-15 17:51:29] [config] bert-mask-symbol: "[MASK]" [2022-01-15 17:51:29] [config] bert-masking-fraction: 0.15 [2022-01-15 17:51:29] [config] bert-sep-symbol: "[SEP]" [2022-01-15 17:51:29] [config] bert-train-type-embeddings: true [2022-01-15 17:51:29] [config] bert-type-vocab-size: 2 [2022-01-15 17:51:29] [config] build-info: "" [2022-01-15 17:51:29] [config] cite: false [2022-01-15 17:51:29] [config] clip-norm: 5 [2022-01-15 17:51:29] [config] cost-scaling: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] cost-type: ce-sum [2022-01-15 17:51:29] [config] cpu-threads: 0 [2022-01-15 17:51:29] [config] data-weighting: "" [2022-01-15 17:51:29] [config] data-weighting-type: sentence [2022-01-15 17:51:29] [config] dec-cell: gru [2022-01-15 17:51:29] [config] dec-cell-base-depth: 2 [2022-01-15 17:51:29] [config] dec-cell-high-depth: 1 [2022-01-15 17:51:29] [config] dec-depth: 6 [2022-01-15 17:51:29] [config] devices: [2022-01-15 17:51:29] [config] - 0 [2022-01-15 17:51:29] [config] dim-emb: 512 [2022-01-15 17:51:29] [config] dim-rnn: 1024 [2022-01-15 17:51:29] [config] dim-vocabs: [2022-01-15 17:51:29] [config] - 0 [2022-01-15 17:51:29] [config] - 0 [2022-01-15 17:51:29] [config] disp-first: 0 [2022-01-15 17:51:29] [config] disp-freq: 500 [2022-01-15 17:51:29] [config] disp-label-counts: true [2022-01-15 17:51:29] [config] dropout-rnn: 0 [2022-01-15 17:51:29] [config] dropout-src: 0 [2022-01-15 17:51:29] 
[config] dropout-trg: 0 [2022-01-15 17:51:29] [config] dump-config: "" [2022-01-15 17:51:29] [config] early-stopping: 10 [2022-01-15 17:51:29] [config] embedding-fix-src: false [2022-01-15 17:51:29] [config] embedding-fix-trg: false [2022-01-15 17:51:29] [config] embedding-normalization: false [2022-01-15 17:51:29] [config] embedding-vectors: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] enc-cell: gru [2022-01-15 17:51:29] [config] enc-cell-depth: 1 [2022-01-15 17:51:29] [config] enc-depth: 6 [2022-01-15 17:51:29] [config] enc-type: bidirectional [2022-01-15 17:51:29] [config] english-title-case-every: 0 [2022-01-15 17:51:29] [config] exponential-smoothing: 0.0001 [2022-01-15 17:51:29] [config] factor-weight: 1 [2022-01-15 17:51:29] [config] grad-dropping-momentum: 0 [2022-01-15 17:51:29] [config] grad-dropping-rate: 0 [2022-01-15 17:51:29] [config] grad-dropping-warmup: 100 [2022-01-15 17:51:29] [config] gradient-checkpointing: false [2022-01-15 17:51:29] [config] guided-alignment: none [2022-01-15 17:51:29] [config] guided-alignment-cost: mse [2022-01-15 17:51:29] [config] guided-alignment-weight: 0.1 [2022-01-15 17:51:29] [config] ignore-model-config: false [2022-01-15 17:51:29] [config] input-types: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] interpolate-env-vars: false [2022-01-15 17:51:29] [config] keep-best: false [2022-01-15 17:51:29] [config] label-smoothing: 0.1 [2022-01-15 17:51:29] [config] layer-normalization: false [2022-01-15 17:51:29] [config] learn-rate: 0.0003 [2022-01-15 17:51:29] [config] lemma-dim-emb: 0 [2022-01-15 17:51:29] [config] log: /home/wmi/train.log [2022-01-15 17:51:29] [config] log-level: info [2022-01-15 17:51:29] [config] log-time-zone: "" [2022-01-15 17:51:29] [config] logical-epoch: [2022-01-15 17:51:29] [config] - 1e [2022-01-15 17:51:29] [config] - 0 [2022-01-15 17:51:29] [config] lr-decay: 0 [2022-01-15 17:51:29] [config] lr-decay-freq: 50000 [2022-01-15 17:51:29] [config] lr-decay-inv-sqrt: [2022-01-15 17:51:29] [config] - 16000 [2022-01-15 17:51:29] [config] lr-decay-repeat-warmup: false [2022-01-15 17:51:29] [config] lr-decay-reset-optimizer: false [2022-01-15 17:51:29] [config] lr-decay-start: [2022-01-15 17:51:29] [config] - 10 [2022-01-15 17:51:29] [config] - 1 [2022-01-15 17:51:29] [config] lr-decay-strategy: epoch+stalled [2022-01-15 17:51:29] [config] lr-report: true [2022-01-15 17:51:29] [config] lr-warmup: 16000 [2022-01-15 17:51:29] [config] lr-warmup-at-reload: false [2022-01-15 17:51:29] [config] lr-warmup-cycle: false [2022-01-15 17:51:29] [config] lr-warmup-start-rate: 0 [2022-01-15 17:51:29] [config] max-length: 100 [2022-01-15 17:51:29] [config] max-length-crop: false [2022-01-15 17:51:29] [config] max-length-factor: 3 [2022-01-15 17:51:29] [config] maxi-batch: 1000 [2022-01-15 17:51:29] [config] maxi-batch-sort: trg [2022-01-15 17:51:29] [config] mini-batch: 64 [2022-01-15 17:51:29] [config] mini-batch-fit: true [2022-01-15 17:51:29] [config] mini-batch-fit-step: 10 [2022-01-15 17:51:29] [config] mini-batch-track-lr: false [2022-01-15 17:51:29] [config] mini-batch-warmup: 0 [2022-01-15 17:51:29] [config] mini-batch-words: 0 [2022-01-15 17:51:29] [config] mini-batch-words-ref: 0 [2022-01-15 17:51:29] [config] model: model.npz [2022-01-15 17:51:29] [config] multi-loss-type: sum [2022-01-15 17:51:29] [config] multi-node: false [2022-01-15 17:51:29] [config] multi-node-overlap: true [2022-01-15 17:51:29] [config] n-best: false [2022-01-15 17:51:29] [config] no-nccl: false [2022-01-15 17:51:29] 
[config] no-reload: false [2022-01-15 17:51:29] [config] no-restore-corpus: false [2022-01-15 17:51:29] [config] normalize: 0.6 [2022-01-15 17:51:29] [config] normalize-gradient: false [2022-01-15 17:51:29] [config] num-devices: 0 [2022-01-15 17:51:29] [config] optimizer: adam [2022-01-15 17:51:29] [config] optimizer-delay: 1 [2022-01-15 17:51:29] [config] optimizer-params: [2022-01-15 17:51:29] [config] - 0.9 [2022-01-15 17:51:29] [config] - 0.98 [2022-01-15 17:51:29] [config] - 1e-09 [2022-01-15 17:51:29] [config] output-omit-bias: false [2022-01-15 17:51:29] [config] overwrite: true [2022-01-15 17:51:29] [config] precision: [2022-01-15 17:51:29] [config] - float32 [2022-01-15 17:51:29] [config] - float32 [2022-01-15 17:51:29] [config] - float32 [2022-01-15 17:51:29] [config] pretrained-model: "" [2022-01-15 17:51:29] [config] quantize-biases: false [2022-01-15 17:51:29] [config] quantize-bits: 0 [2022-01-15 17:51:29] [config] quantize-log-based: false [2022-01-15 17:51:29] [config] quantize-optimization-steps: 0 [2022-01-15 17:51:29] [config] quiet: false [2022-01-15 17:51:29] [config] quiet-translation: false [2022-01-15 17:51:29] [config] relative-paths: false [2022-01-15 17:51:29] [config] right-left: false [2022-01-15 17:51:29] [config] save-freq: 5000 [2022-01-15 17:51:29] [config] seed: 0 [2022-01-15 17:51:29] [config] sentencepiece-alphas: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] sentencepiece-max-lines: 2000000 [2022-01-15 17:51:29] [config] sentencepiece-options: "" [2022-01-15 17:51:29] [config] shuffle: data [2022-01-15 17:51:29] [config] shuffle-in-ram: false [2022-01-15 17:51:29] [config] sigterm: save-and-exit [2022-01-15 17:51:29] [config] skip: false [2022-01-15 17:51:29] [config] sqlite: "" [2022-01-15 17:51:29] [config] sqlite-drop: false [2022-01-15 17:51:29] [config] sync-sgd: false [2022-01-15 17:51:29] [config] tempdir: /tmp [2022-01-15 17:51:29] [config] tied-embeddings: true [2022-01-15 17:51:29] [config] tied-embeddings-all: false [2022-01-15 17:51:29] [config] tied-embeddings-src: false [2022-01-15 17:51:29] [config] train-embedder-rank: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] train-sets: [2022-01-15 17:51:29] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 17:51:29] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 17:51:29] [config] transformer-aan-activation: swish [2022-01-15 17:51:29] [config] transformer-aan-depth: 2 [2022-01-15 17:51:29] [config] transformer-aan-nogate: false [2022-01-15 17:51:29] [config] transformer-decoder-autoreg: self-attention [2022-01-15 17:51:29] [config] transformer-depth-scaling: false [2022-01-15 17:51:29] [config] transformer-dim-aan: 2048 [2022-01-15 17:51:29] [config] transformer-dim-ffn: 2048 [2022-01-15 17:51:29] [config] transformer-dropout: 0.1 [2022-01-15 17:51:29] [config] transformer-dropout-attention: 0 [2022-01-15 17:51:29] [config] transformer-dropout-ffn: 0 [2022-01-15 17:51:29] [config] transformer-ffn-activation: swish [2022-01-15 17:51:29] [config] transformer-ffn-depth: 2 [2022-01-15 17:51:29] [config] transformer-guided-alignment-layer: last [2022-01-15 17:51:29] [config] transformer-heads: 8 [2022-01-15 17:51:29] [config] transformer-no-projection: false [2022-01-15 17:51:29] [config] transformer-pool: false [2022-01-15 17:51:29] [config] transformer-postprocess: dan [2022-01-15 17:51:29] [config] transformer-postprocess-emb: d [2022-01-15 17:51:29] [config] 
transformer-postprocess-top: "" [2022-01-15 17:51:29] [config] transformer-preprocess: "" [2022-01-15 17:51:29] [config] transformer-tied-layers: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] transformer-train-position-embeddings: false [2022-01-15 17:51:29] [config] tsv: false [2022-01-15 17:51:29] [config] tsv-fields: 0 [2022-01-15 17:51:29] [config] type: transformer [2022-01-15 17:51:29] [config] ulr: false [2022-01-15 17:51:29] [config] ulr-dim-emb: 0 [2022-01-15 17:51:29] [config] ulr-dropout: 0 [2022-01-15 17:51:29] [config] ulr-keys-vectors: "" [2022-01-15 17:51:29] [config] ulr-query-vectors: "" [2022-01-15 17:51:29] [config] ulr-softmax-temperature: 1 [2022-01-15 17:51:29] [config] ulr-trainable-transformation: false [2022-01-15 17:51:29] [config] unlikelihood-loss: false [2022-01-15 17:51:29] [config] valid-freq: 5000 [2022-01-15 17:51:29] [config] valid-log: "" [2022-01-15 17:51:29] [config] valid-max-length: 1000 [2022-01-15 17:51:29] [config] valid-metrics: [2022-01-15 17:51:29] [config] - cross-entropy [2022-01-15 17:51:29] [config] valid-mini-batch: 32 [2022-01-15 17:51:29] [config] valid-reset-stalled: false [2022-01-15 17:51:29] [config] valid-script-args: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] valid-script-path: "" [2022-01-15 17:51:29] [config] valid-sets: [2022-01-15 17:51:29] [config] [] [2022-01-15 17:51:29] [config] valid-translation-output: "" [2022-01-15 17:51:29] [config] vocabs: [2022-01-15 17:51:29] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.vocab.10000.yml [2022-01-15 17:51:29] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.vocab.10000.yml [2022-01-15 17:51:29] [config] word-penalty: 0 [2022-01-15 17:51:29] [config] word-scores: false [2022-01-15 17:51:29] [config] workspace: 10000 [2022-01-15 17:51:29] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:51:29] [training] Using single-device training [2022-01-15 17:51:29] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.vocab.10000.yml [2022-01-15 17:51:29] Error: Unhandled exception of type 'N4YAML15ParserExceptionE': yaml-cpp: error at line 14, column 1: end of map not found [2022-01-15 17:51:29] Error: Aborted from void unhandledException() in /home/wmi/Workspace/marian/src/common/logging.cpp:113 [CALL STACK] [0x564a5ea8c5e6] + 0x29c5e6 [0x7f08ed7f338c] + 0xaa38c [0x7f08ed7f33f7] + 0xaa3f7 [0x7f08ed7f36a9] + 0xaa6a9 [0x564a5ead97c7] + 0x2e97c7 [0x564a5f0c9658] YAML::SingleDocParser:: HandleNode (YAML::EventHandler&) + 0x278 [0x564a5f0c9bcc] YAML::SingleDocParser:: HandleDocument (YAML::EventHandler&) + 0x5c [0x564a5f0aedcd] YAML::Parser:: HandleNextDocument (YAML::EventHandler&) + 0x7d [0x564a5f0ab6d9] YAML:: Load (std::istream&) + 0x49 [0x564a5ed18328] marian::DefaultVocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x838 [0x564a5ed07e2a] marian::Vocab:: load (std::__cxx11::basic_string,std::allocator> const&, unsigned long) + 0x3a [0x564a5ed08728] marian::Vocab:: loadOrCreate (std::__cxx11::basic_string,std::allocator> const&, std::vector,std::allocator>,std::allocator,std::allocator>>> const&, unsigned long) + 0x528 [0x564a5ed54189] marian::data::CorpusBase:: CorpusBase (std::shared_ptr, bool) + 0x1e09 [0x564a5ed67084] marian::data::Corpus:: Corpus (std::shared_ptr, bool) + 0x64 [0x564a5ebc5f8c] std::shared_ptr marian:: New &>(std::shared_ptr&) + 0x5c 
[0x564a5ec4d94b] marian::Train:: run () + 0x19cb [0x564a5eb54389] mainTrainer (int, char**) + 0x5e9 [0x564a5eb121bc] main + 0x3c [0x7f08ed4140b3] __libc_start_main + 0xf3 [0x564a5eb52b0e] _start + 0x2e [2022-01-15 17:53:26] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:53:26] [marian] Running on s470607-gpu as process 3870 with command line: [2022-01-15 17:53:26] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 17:53:26] [config] after: 0e [2022-01-15 17:53:26] [config] after-batches: 0 [2022-01-15 17:53:26] [config] after-epochs: 1 [2022-01-15 17:53:26] [config] all-caps-every: 0 [2022-01-15 17:53:26] [config] allow-unk: false [2022-01-15 17:53:26] [config] authors: false [2022-01-15 17:53:26] [config] beam-size: 6 [2022-01-15 17:53:26] [config] bert-class-symbol: "[CLS]" [2022-01-15 17:53:26] [config] bert-mask-symbol: "[MASK]" [2022-01-15 17:53:26] [config] bert-masking-fraction: 0.15 [2022-01-15 17:53:26] [config] bert-sep-symbol: "[SEP]" [2022-01-15 17:53:26] [config] bert-train-type-embeddings: true [2022-01-15 17:53:26] [config] bert-type-vocab-size: 2 [2022-01-15 17:53:26] [config] build-info: "" [2022-01-15 17:53:26] [config] cite: false [2022-01-15 17:53:26] [config] clip-norm: 5 [2022-01-15 17:53:26] [config] cost-scaling: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] cost-type: ce-sum [2022-01-15 17:53:26] [config] cpu-threads: 0 [2022-01-15 17:53:26] [config] data-weighting: "" [2022-01-15 17:53:26] [config] data-weighting-type: sentence [2022-01-15 17:53:26] [config] dec-cell: gru [2022-01-15 17:53:26] [config] dec-cell-base-depth: 2 [2022-01-15 17:53:26] [config] dec-cell-high-depth: 1 [2022-01-15 17:53:26] [config] dec-depth: 6 [2022-01-15 17:53:26] [config] devices: [2022-01-15 17:53:26] [config] - 0 [2022-01-15 17:53:26] [config] dim-emb: 512 [2022-01-15 17:53:26] [config] dim-rnn: 1024 [2022-01-15 17:53:26] [config] dim-vocabs: [2022-01-15 17:53:26] [config] - 0 [2022-01-15 17:53:26] [config] - 0 [2022-01-15 17:53:26] [config] disp-first: 0 [2022-01-15 17:53:26] [config] disp-freq: 500 [2022-01-15 17:53:26] [config] disp-label-counts: true [2022-01-15 17:53:26] [config] dropout-rnn: 0 [2022-01-15 17:53:26] [config] dropout-src: 0 [2022-01-15 17:53:26] [config] dropout-trg: 0 [2022-01-15 17:53:26] [config] dump-config: "" [2022-01-15 17:53:26] [config] early-stopping: 10 [2022-01-15 17:53:26] [config] embedding-fix-src: false [2022-01-15 17:53:26] [config] embedding-fix-trg: false [2022-01-15 17:53:26] [config] embedding-normalization: false [2022-01-15 17:53:26] [config] embedding-vectors: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] enc-cell: gru [2022-01-15 17:53:26] [config] enc-cell-depth: 1 [2022-01-15 17:53:26] [config] enc-depth: 6 [2022-01-15 17:53:26] [config] enc-type: bidirectional [2022-01-15 17:53:26] [config] 
english-title-case-every: 0 [2022-01-15 17:53:26] [config] exponential-smoothing: 0.0001 [2022-01-15 17:53:26] [config] factor-weight: 1 [2022-01-15 17:53:26] [config] grad-dropping-momentum: 0 [2022-01-15 17:53:26] [config] grad-dropping-rate: 0 [2022-01-15 17:53:26] [config] grad-dropping-warmup: 100 [2022-01-15 17:53:26] [config] gradient-checkpointing: false [2022-01-15 17:53:26] [config] guided-alignment: none [2022-01-15 17:53:26] [config] guided-alignment-cost: mse [2022-01-15 17:53:26] [config] guided-alignment-weight: 0.1 [2022-01-15 17:53:26] [config] ignore-model-config: false [2022-01-15 17:53:26] [config] input-types: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] interpolate-env-vars: false [2022-01-15 17:53:26] [config] keep-best: false [2022-01-15 17:53:26] [config] label-smoothing: 0.1 [2022-01-15 17:53:26] [config] layer-normalization: false [2022-01-15 17:53:26] [config] learn-rate: 0.0003 [2022-01-15 17:53:26] [config] lemma-dim-emb: 0 [2022-01-15 17:53:26] [config] log: /home/wmi/train.log [2022-01-15 17:53:26] [config] log-level: info [2022-01-15 17:53:26] [config] log-time-zone: "" [2022-01-15 17:53:26] [config] logical-epoch: [2022-01-15 17:53:26] [config] - 1e [2022-01-15 17:53:26] [config] - 0 [2022-01-15 17:53:26] [config] lr-decay: 0 [2022-01-15 17:53:26] [config] lr-decay-freq: 50000 [2022-01-15 17:53:26] [config] lr-decay-inv-sqrt: [2022-01-15 17:53:26] [config] - 16000 [2022-01-15 17:53:26] [config] lr-decay-repeat-warmup: false [2022-01-15 17:53:26] [config] lr-decay-reset-optimizer: false [2022-01-15 17:53:26] [config] lr-decay-start: [2022-01-15 17:53:26] [config] - 10 [2022-01-15 17:53:26] [config] - 1 [2022-01-15 17:53:26] [config] lr-decay-strategy: epoch+stalled [2022-01-15 17:53:26] [config] lr-report: true [2022-01-15 17:53:26] [config] lr-warmup: 16000 [2022-01-15 17:53:26] [config] lr-warmup-at-reload: false [2022-01-15 17:53:26] [config] lr-warmup-cycle: false [2022-01-15 17:53:26] [config] lr-warmup-start-rate: 0 [2022-01-15 17:53:26] [config] max-length: 100 [2022-01-15 17:53:26] [config] max-length-crop: false [2022-01-15 17:53:26] [config] max-length-factor: 3 [2022-01-15 17:53:26] [config] maxi-batch: 1000 [2022-01-15 17:53:26] [config] maxi-batch-sort: trg [2022-01-15 17:53:26] [config] mini-batch: 64 [2022-01-15 17:53:26] [config] mini-batch-fit: true [2022-01-15 17:53:26] [config] mini-batch-fit-step: 10 [2022-01-15 17:53:26] [config] mini-batch-track-lr: false [2022-01-15 17:53:26] [config] mini-batch-warmup: 0 [2022-01-15 17:53:26] [config] mini-batch-words: 0 [2022-01-15 17:53:26] [config] mini-batch-words-ref: 0 [2022-01-15 17:53:26] [config] model: model.npz [2022-01-15 17:53:26] [config] multi-loss-type: sum [2022-01-15 17:53:26] [config] multi-node: false [2022-01-15 17:53:26] [config] multi-node-overlap: true [2022-01-15 17:53:26] [config] n-best: false [2022-01-15 17:53:26] [config] no-nccl: false [2022-01-15 17:53:26] [config] no-reload: false [2022-01-15 17:53:26] [config] no-restore-corpus: false [2022-01-15 17:53:26] [config] normalize: 0.6 [2022-01-15 17:53:26] [config] normalize-gradient: false [2022-01-15 17:53:26] [config] num-devices: 0 [2022-01-15 17:53:26] [config] optimizer: adam [2022-01-15 17:53:26] [config] optimizer-delay: 1 [2022-01-15 17:53:26] [config] optimizer-params: [2022-01-15 17:53:26] [config] - 0.9 [2022-01-15 17:53:26] [config] - 0.98 [2022-01-15 17:53:26] [config] - 1e-09 [2022-01-15 17:53:26] [config] output-omit-bias: false [2022-01-15 17:53:26] [config] overwrite: true [2022-01-15 
17:53:26] [config] precision: [2022-01-15 17:53:26] [config] - float32 [2022-01-15 17:53:26] [config] - float32 [2022-01-15 17:53:26] [config] - float32 [2022-01-15 17:53:26] [config] pretrained-model: "" [2022-01-15 17:53:26] [config] quantize-biases: false [2022-01-15 17:53:26] [config] quantize-bits: 0 [2022-01-15 17:53:26] [config] quantize-log-based: false [2022-01-15 17:53:26] [config] quantize-optimization-steps: 0 [2022-01-15 17:53:26] [config] quiet: false [2022-01-15 17:53:26] [config] quiet-translation: false [2022-01-15 17:53:26] [config] relative-paths: false [2022-01-15 17:53:26] [config] right-left: false [2022-01-15 17:53:26] [config] save-freq: 5000 [2022-01-15 17:53:26] [config] seed: 0 [2022-01-15 17:53:26] [config] sentencepiece-alphas: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] sentencepiece-max-lines: 2000000 [2022-01-15 17:53:26] [config] sentencepiece-options: "" [2022-01-15 17:53:26] [config] shuffle: data [2022-01-15 17:53:26] [config] shuffle-in-ram: false [2022-01-15 17:53:26] [config] sigterm: save-and-exit [2022-01-15 17:53:26] [config] skip: false [2022-01-15 17:53:26] [config] sqlite: "" [2022-01-15 17:53:26] [config] sqlite-drop: false [2022-01-15 17:53:26] [config] sync-sgd: false [2022-01-15 17:53:26] [config] tempdir: /tmp [2022-01-15 17:53:26] [config] tied-embeddings: true [2022-01-15 17:53:26] [config] tied-embeddings-all: false [2022-01-15 17:53:26] [config] tied-embeddings-src: false [2022-01-15 17:53:26] [config] train-embedder-rank: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] train-sets: [2022-01-15 17:53:26] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 17:53:26] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 17:53:26] [config] transformer-aan-activation: swish [2022-01-15 17:53:26] [config] transformer-aan-depth: 2 [2022-01-15 17:53:26] [config] transformer-aan-nogate: false [2022-01-15 17:53:26] [config] transformer-decoder-autoreg: self-attention [2022-01-15 17:53:26] [config] transformer-depth-scaling: false [2022-01-15 17:53:26] [config] transformer-dim-aan: 2048 [2022-01-15 17:53:26] [config] transformer-dim-ffn: 2048 [2022-01-15 17:53:26] [config] transformer-dropout: 0.1 [2022-01-15 17:53:26] [config] transformer-dropout-attention: 0 [2022-01-15 17:53:26] [config] transformer-dropout-ffn: 0 [2022-01-15 17:53:26] [config] transformer-ffn-activation: swish [2022-01-15 17:53:26] [config] transformer-ffn-depth: 2 [2022-01-15 17:53:26] [config] transformer-guided-alignment-layer: last [2022-01-15 17:53:26] [config] transformer-heads: 8 [2022-01-15 17:53:26] [config] transformer-no-projection: false [2022-01-15 17:53:26] [config] transformer-pool: false [2022-01-15 17:53:26] [config] transformer-postprocess: dan [2022-01-15 17:53:26] [config] transformer-postprocess-emb: d [2022-01-15 17:53:26] [config] transformer-postprocess-top: "" [2022-01-15 17:53:26] [config] transformer-preprocess: "" [2022-01-15 17:53:26] [config] transformer-tied-layers: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] transformer-train-position-embeddings: false [2022-01-15 17:53:26] [config] tsv: false [2022-01-15 17:53:26] [config] tsv-fields: 0 [2022-01-15 17:53:26] [config] type: transformer [2022-01-15 17:53:26] [config] ulr: false [2022-01-15 17:53:26] [config] ulr-dim-emb: 0 [2022-01-15 17:53:26] [config] ulr-dropout: 0 [2022-01-15 17:53:26] [config] ulr-keys-vectors: "" [2022-01-15 17:53:26] [config] 
ulr-query-vectors: "" [2022-01-15 17:53:26] [config] ulr-softmax-temperature: 1 [2022-01-15 17:53:26] [config] ulr-trainable-transformation: false [2022-01-15 17:53:26] [config] unlikelihood-loss: false [2022-01-15 17:53:26] [config] valid-freq: 5000 [2022-01-15 17:53:26] [config] valid-log: "" [2022-01-15 17:53:26] [config] valid-max-length: 1000 [2022-01-15 17:53:26] [config] valid-metrics: [2022-01-15 17:53:26] [config] - cross-entropy [2022-01-15 17:53:26] [config] valid-mini-batch: 32 [2022-01-15 17:53:26] [config] valid-reset-stalled: false [2022-01-15 17:53:26] [config] valid-script-args: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] valid-script-path: "" [2022-01-15 17:53:26] [config] valid-sets: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] valid-translation-output: "" [2022-01-15 17:53:26] [config] vocabs: [2022-01-15 17:53:26] [config] [] [2022-01-15 17:53:26] [config] word-penalty: 0 [2022-01-15 17:53:26] [config] word-scores: false [2022-01-15 17:53:26] [config] workspace: 10000 [2022-01-15 17:53:26] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 17:53:26] [training] Using single-device training [2022-01-15 17:53:26] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 17:53:26] [data] Vocabularies will be built separately for each file. [2022-01-15 17:53:26] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 17:53:26] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 17:53:26] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 17:53:55] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.yml [2022-01-15 17:54:06] [data] Setting vocabulary size for input 0 to 2,393,556 [2022-01-15 17:54:06] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 17:54:06] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 17:54:06] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 17:54:31] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.yml [2022-01-15 17:54:41] [data] Setting vocabulary size for input 1 to 2,113,516 [2022-01-15 17:54:41] [comm] Compiled without MPI support. Running as a single process on s470607-gpu [2022-01-15 17:54:41] [batching] Collecting statistics for batch fitting with step size 10 [2022-01-15 17:54:41] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 17:54:41] [logits] Applying loss function for 1 factor(s) [2022-01-15 17:54:41] Error: Labels not matching logits shape (10821201920 != -2063699968, shape=1x10x512x2113516 size=-2063699968)?? 
[2022-01-15 17:54:41] Error: Aborted from marian::Expr marian::Logits::applyLossFunction(const Words&, const std::function > >(IntrusivePtr > >, IntrusivePtr > >)>&) const in /home/wmi/Workspace/marian/src/layers/generic.cpp:26 [CALL STACK] [0x5620559b48a5] marian::Logits:: applyLossFunction (std::vector> const&, std::function>> (IntrusivePtr>>,IntrusivePtr>>)> const&) const + 0xc35 [0x5620559d0f32] marian::CrossEntropyLoss:: compute (marian::Logits, std::vector> const&, IntrusivePtr>>, IntrusivePtr>>) + 0x82 [0x5620559cfde9] marian::LabelwiseLoss:: apply (marian::Logits, std::vector> const&, IntrusivePtr>>, IntrusivePtr>>) + 0x339 [0x5620555ba0db] marian::models::EncoderDecoderCECost:: apply (std::shared_ptr, std::shared_ptr, std::shared_ptr, bool) + 0x58b [0x56205520cc82] marian::models::Trainer:: build (std::shared_ptr, std::shared_ptr, bool) + 0xb2 [0x5620556a15f4] marian::GraphGroup:: collectStats (std::shared_ptr, std::shared_ptr, std::vector,std::allocator>> const&, double) + 0xb84 [0x5620552e3269] marian::Train:: run () + 0x2e9 [0x5620551eb389] mainTrainer (int, char**) + 0x5e9 [0x5620551a91bc] main + 0x3c [0x7f9abc80f0b3] __libc_start_main + 0xf3 [0x5620551e9b0e] _start + 0x2e [2022-01-15 18:02:17] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 18:02:17] [marian] Running on s470607-gpu as process 3955 with command line: [2022-01-15 18:02:17] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 18:02:17] [config] after: 0e [2022-01-15 18:02:17] [config] after-batches: 0 [2022-01-15 18:02:17] [config] after-epochs: 1 [2022-01-15 18:02:17] [config] all-caps-every: 0 [2022-01-15 18:02:17] [config] allow-unk: false [2022-01-15 18:02:17] [config] authors: false [2022-01-15 18:02:17] [config] beam-size: 6 [2022-01-15 18:02:17] [config] bert-class-symbol: "[CLS]" [2022-01-15 18:02:17] [config] bert-mask-symbol: "[MASK]" [2022-01-15 18:02:17] [config] bert-masking-fraction: 0.15 [2022-01-15 18:02:17] [config] bert-sep-symbol: "[SEP]" [2022-01-15 18:02:17] [config] bert-train-type-embeddings: true [2022-01-15 18:02:17] [config] bert-type-vocab-size: 2 [2022-01-15 18:02:17] [config] build-info: "" [2022-01-15 18:02:17] [config] cite: false [2022-01-15 18:02:17] [config] clip-norm: 5 [2022-01-15 18:02:17] [config] cost-scaling: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] cost-type: ce-sum [2022-01-15 18:02:17] [config] cpu-threads: 0 [2022-01-15 18:02:17] [config] data-weighting: "" [2022-01-15 18:02:17] [config] data-weighting-type: sentence [2022-01-15 18:02:17] [config] dec-cell: gru [2022-01-15 18:02:17] [config] dec-cell-base-depth: 2 [2022-01-15 18:02:17] [config] dec-cell-high-depth: 1 [2022-01-15 18:02:17] [config] dec-depth: 6 [2022-01-15 18:02:17] [config] devices: [2022-01-15 18:02:17] [config] - 0 [2022-01-15 18:02:17] [config] dim-emb: 512 [2022-01-15 18:02:17] 
[config] dim-rnn: 1024 [2022-01-15 18:02:17] [config] dim-vocabs: [2022-01-15 18:02:17] [config] - 0 [2022-01-15 18:02:17] [config] - 0 [2022-01-15 18:02:17] [config] disp-first: 0 [2022-01-15 18:02:17] [config] disp-freq: 500 [2022-01-15 18:02:17] [config] disp-label-counts: true [2022-01-15 18:02:17] [config] dropout-rnn: 0 [2022-01-15 18:02:17] [config] dropout-src: 0 [2022-01-15 18:02:17] [config] dropout-trg: 0 [2022-01-15 18:02:17] [config] dump-config: "" [2022-01-15 18:02:17] [config] early-stopping: 10 [2022-01-15 18:02:17] [config] embedding-fix-src: false [2022-01-15 18:02:17] [config] embedding-fix-trg: false [2022-01-15 18:02:17] [config] embedding-normalization: false [2022-01-15 18:02:17] [config] embedding-vectors: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] enc-cell: gru [2022-01-15 18:02:17] [config] enc-cell-depth: 1 [2022-01-15 18:02:17] [config] enc-depth: 6 [2022-01-15 18:02:17] [config] enc-type: bidirectional [2022-01-15 18:02:17] [config] english-title-case-every: 0 [2022-01-15 18:02:17] [config] exponential-smoothing: 0.0001 [2022-01-15 18:02:17] [config] factor-weight: 1 [2022-01-15 18:02:17] [config] grad-dropping-momentum: 0 [2022-01-15 18:02:17] [config] grad-dropping-rate: 0 [2022-01-15 18:02:17] [config] grad-dropping-warmup: 100 [2022-01-15 18:02:17] [config] gradient-checkpointing: false [2022-01-15 18:02:17] [config] guided-alignment: none [2022-01-15 18:02:17] [config] guided-alignment-cost: mse [2022-01-15 18:02:17] [config] guided-alignment-weight: 0.1 [2022-01-15 18:02:17] [config] ignore-model-config: false [2022-01-15 18:02:17] [config] input-types: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] interpolate-env-vars: false [2022-01-15 18:02:17] [config] keep-best: false [2022-01-15 18:02:17] [config] label-smoothing: 0.1 [2022-01-15 18:02:17] [config] layer-normalization: false [2022-01-15 18:02:17] [config] learn-rate: 0.0003 [2022-01-15 18:02:17] [config] lemma-dim-emb: 0 [2022-01-15 18:02:17] [config] log: /home/wmi/train.log [2022-01-15 18:02:17] [config] log-level: info [2022-01-15 18:02:17] [config] log-time-zone: "" [2022-01-15 18:02:17] [config] logical-epoch: [2022-01-15 18:02:17] [config] - 1e [2022-01-15 18:02:17] [config] - 0 [2022-01-15 18:02:17] [config] lr-decay: 0 [2022-01-15 18:02:17] [config] lr-decay-freq: 50000 [2022-01-15 18:02:17] [config] lr-decay-inv-sqrt: [2022-01-15 18:02:17] [config] - 16000 [2022-01-15 18:02:17] [config] lr-decay-repeat-warmup: false [2022-01-15 18:02:17] [config] lr-decay-reset-optimizer: false [2022-01-15 18:02:17] [config] lr-decay-start: [2022-01-15 18:02:17] [config] - 10 [2022-01-15 18:02:17] [config] - 1 [2022-01-15 18:02:17] [config] lr-decay-strategy: epoch+stalled [2022-01-15 18:02:17] [config] lr-report: true [2022-01-15 18:02:17] [config] lr-warmup: 16000 [2022-01-15 18:02:17] [config] lr-warmup-at-reload: false [2022-01-15 18:02:17] [config] lr-warmup-cycle: false [2022-01-15 18:02:17] [config] lr-warmup-start-rate: 0 [2022-01-15 18:02:17] [config] max-length: 100 [2022-01-15 18:02:17] [config] max-length-crop: false [2022-01-15 18:02:17] [config] max-length-factor: 3 [2022-01-15 18:02:17] [config] maxi-batch: 1000 [2022-01-15 18:02:17] [config] maxi-batch-sort: trg [2022-01-15 18:02:17] [config] mini-batch: 64 [2022-01-15 18:02:17] [config] mini-batch-fit: true [2022-01-15 18:02:17] [config] mini-batch-fit-step: 10 [2022-01-15 18:02:17] [config] mini-batch-track-lr: false [2022-01-15 18:02:17] [config] mini-batch-warmup: 0 [2022-01-15 18:02:17] [config] 
mini-batch-words: 0 [2022-01-15 18:02:17] [config] mini-batch-words-ref: 0 [2022-01-15 18:02:17] [config] model: model.npz [2022-01-15 18:02:17] [config] multi-loss-type: sum [2022-01-15 18:02:17] [config] multi-node: false [2022-01-15 18:02:17] [config] multi-node-overlap: true [2022-01-15 18:02:17] [config] n-best: false [2022-01-15 18:02:17] [config] no-nccl: false [2022-01-15 18:02:17] [config] no-reload: false [2022-01-15 18:02:17] [config] no-restore-corpus: false [2022-01-15 18:02:17] [config] normalize: 0.6 [2022-01-15 18:02:17] [config] normalize-gradient: false [2022-01-15 18:02:17] [config] num-devices: 0 [2022-01-15 18:02:17] [config] optimizer: adam [2022-01-15 18:02:17] [config] optimizer-delay: 1 [2022-01-15 18:02:17] [config] optimizer-params: [2022-01-15 18:02:17] [config] - 0.9 [2022-01-15 18:02:17] [config] - 0.98 [2022-01-15 18:02:17] [config] - 1e-09 [2022-01-15 18:02:17] [config] output-omit-bias: false [2022-01-15 18:02:17] [config] overwrite: true [2022-01-15 18:02:17] [config] precision: [2022-01-15 18:02:17] [config] - float32 [2022-01-15 18:02:17] [config] - float32 [2022-01-15 18:02:17] [config] - float32 [2022-01-15 18:02:17] [config] pretrained-model: "" [2022-01-15 18:02:17] [config] quantize-biases: false [2022-01-15 18:02:17] [config] quantize-bits: 0 [2022-01-15 18:02:17] [config] quantize-log-based: false [2022-01-15 18:02:17] [config] quantize-optimization-steps: 0 [2022-01-15 18:02:17] [config] quiet: false [2022-01-15 18:02:17] [config] quiet-translation: false [2022-01-15 18:02:17] [config] relative-paths: false [2022-01-15 18:02:17] [config] right-left: false [2022-01-15 18:02:17] [config] save-freq: 5000 [2022-01-15 18:02:17] [config] seed: 0 [2022-01-15 18:02:17] [config] sentencepiece-alphas: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] sentencepiece-max-lines: 2000000 [2022-01-15 18:02:17] [config] sentencepiece-options: "" [2022-01-15 18:02:17] [config] shuffle: data [2022-01-15 18:02:17] [config] shuffle-in-ram: false [2022-01-15 18:02:17] [config] sigterm: save-and-exit [2022-01-15 18:02:17] [config] skip: false [2022-01-15 18:02:17] [config] sqlite: "" [2022-01-15 18:02:17] [config] sqlite-drop: false [2022-01-15 18:02:17] [config] sync-sgd: false [2022-01-15 18:02:17] [config] tempdir: /tmp [2022-01-15 18:02:17] [config] tied-embeddings: true [2022-01-15 18:02:17] [config] tied-embeddings-all: false [2022-01-15 18:02:17] [config] tied-embeddings-src: false [2022-01-15 18:02:17] [config] train-embedder-rank: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] train-sets: [2022-01-15 18:02:17] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 18:02:17] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 18:02:17] [config] transformer-aan-activation: swish [2022-01-15 18:02:17] [config] transformer-aan-depth: 2 [2022-01-15 18:02:17] [config] transformer-aan-nogate: false [2022-01-15 18:02:17] [config] transformer-decoder-autoreg: self-attention [2022-01-15 18:02:17] [config] transformer-depth-scaling: false [2022-01-15 18:02:17] [config] transformer-dim-aan: 2048 [2022-01-15 18:02:17] [config] transformer-dim-ffn: 2048 [2022-01-15 18:02:17] [config] transformer-dropout: 0.1 [2022-01-15 18:02:17] [config] transformer-dropout-attention: 0 [2022-01-15 18:02:17] [config] transformer-dropout-ffn: 0 [2022-01-15 18:02:17] [config] transformer-ffn-activation: swish [2022-01-15 18:02:17] [config] transformer-ffn-depth: 2 [2022-01-15 18:02:17] 
[config] transformer-guided-alignment-layer: last [2022-01-15 18:02:17] [config] transformer-heads: 8 [2022-01-15 18:02:17] [config] transformer-no-projection: false [2022-01-15 18:02:17] [config] transformer-pool: false [2022-01-15 18:02:17] [config] transformer-postprocess: dan [2022-01-15 18:02:17] [config] transformer-postprocess-emb: d [2022-01-15 18:02:17] [config] transformer-postprocess-top: "" [2022-01-15 18:02:17] [config] transformer-preprocess: "" [2022-01-15 18:02:17] [config] transformer-tied-layers: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] transformer-train-position-embeddings: false [2022-01-15 18:02:17] [config] tsv: false [2022-01-15 18:02:17] [config] tsv-fields: 0 [2022-01-15 18:02:17] [config] type: transformer [2022-01-15 18:02:17] [config] ulr: false [2022-01-15 18:02:17] [config] ulr-dim-emb: 0 [2022-01-15 18:02:17] [config] ulr-dropout: 0 [2022-01-15 18:02:17] [config] ulr-keys-vectors: "" [2022-01-15 18:02:17] [config] ulr-query-vectors: "" [2022-01-15 18:02:17] [config] ulr-softmax-temperature: 1 [2022-01-15 18:02:17] [config] ulr-trainable-transformation: false [2022-01-15 18:02:17] [config] unlikelihood-loss: false [2022-01-15 18:02:17] [config] valid-freq: 5000 [2022-01-15 18:02:17] [config] valid-log: "" [2022-01-15 18:02:17] [config] valid-max-length: 1000 [2022-01-15 18:02:17] [config] valid-metrics: [2022-01-15 18:02:17] [config] - cross-entropy [2022-01-15 18:02:17] [config] valid-mini-batch: 32 [2022-01-15 18:02:17] [config] valid-reset-stalled: false [2022-01-15 18:02:17] [config] valid-script-args: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] valid-script-path: "" [2022-01-15 18:02:17] [config] valid-sets: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] valid-translation-output: "" [2022-01-15 18:02:17] [config] vocabs: [2022-01-15 18:02:17] [config] [] [2022-01-15 18:02:17] [config] word-penalty: 0 [2022-01-15 18:02:17] [config] word-scores: false [2022-01-15 18:02:17] [config] workspace: 10000 [2022-01-15 18:02:17] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 18:02:17] [training] Using single-device training [2022-01-15 18:02:17] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 18:02:17] [data] Vocabularies will be built separately for each file. 
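Two notes before the vocabulary-building entries that follow. First, the earlier attempts that pointed --vocabs at the hand-supplied *.vocab.10000.yml files died inside yaml-cpp ("end of map not found" at line 14), i.e. those files are not parseable YAML vocabularies; this run, like the 17:53 one, therefore drops --vocabs and lets Marian find or build vocabularies itself. Second, without --vocabs Marian builds one vocabulary per training file, and the default .yml vocabulary appears to be word-level (one entry per whitespace-separated surface form), so on raw unsegmented text the type count explodes, which is what the "Setting vocabulary size ... to 2,393,556 / 2,113,516" lines below show. A small pre-flight check along these lines would have caught both problems before training; PyYAML and the whitespace-token heuristic are assumptions for illustration, not anything Marian ships:

```python
import sys
import yaml  # PyYAML, assumed to be installed

def check_vocab_yaml(path):
    """Try to parse a token->id YAML vocabulary and report where parsing fails."""
    try:
        with open(path, encoding="utf-8") as f:
            vocab = yaml.safe_load(f)
        print(f"{path}: parsed OK, {len(vocab):,} entries")
    except yaml.YAMLError as err:
        print(f"{path}: YAML error: {err}")

def count_word_types(path):
    """Rough preview of the size of a whitespace-token vocabulary built from `path`."""
    types = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            types.update(line.split())
    return len(types)

if __name__ == "__main__":
    for p in sys.argv[1:]:
        if p.endswith((".yml", ".yaml")):
            check_vocab_yaml(p)
        else:
            print(f"{p}: ~{count_word_types(p):,} whitespace-separated types")
```

A type count in the millions is a strong hint that subword segmentation (SentencePiece/BPE) or a frequency cut-off is needed before the output layer can be built at a sane size.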
[2022-01-15 18:02:17] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 18:02:17] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 18:02:17] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv [2022-01-15 18:02:47] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.yml [2022-01-15 18:02:58] [data] Setting vocabulary size for input 0 to 2,393,556 [2022-01-15 18:02:58] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 18:02:58] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 18:02:58] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv [2022-01-15 18:03:23] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.yml [2022-01-15 18:03:33] [data] Setting vocabulary size for input 1 to 2,113,516 [2022-01-15 18:03:33] [comm] Compiled without MPI support. Running as a single process on s470607-gpu [2022-01-15 18:03:33] [batching] Collecting statistics for batch fitting with step size 10 [2022-01-15 18:03:33] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 18:03:34] [logits] Applying loss function for 1 factor(s) [2022-01-15 18:03:34] Error: Labels not matching logits shape (10821201920 != -2063699968, shape=1x10x512x2113516 size=-2063699968)?? 
[2022-01-15 18:03:34] Error: Aborted from marian::Expr marian::Logits::applyLossFunction(const Words&, const std::function > >(IntrusivePtr > >, IntrusivePtr > >)>&) const in /home/wmi/Workspace/marian/src/layers/generic.cpp:26 [CALL STACK] [0x55a61109a8a5] marian::Logits:: applyLossFunction (std::vector> const&, std::function>> (IntrusivePtr>>,IntrusivePtr>>)> const&) const + 0xc35 [0x55a6110b6f32] marian::CrossEntropyLoss:: compute (marian::Logits, std::vector> const&, IntrusivePtr>>, IntrusivePtr>>) + 0x82 [0x55a6110b5de9] marian::LabelwiseLoss:: apply (marian::Logits, std::vector> const&, IntrusivePtr>>, IntrusivePtr>>) + 0x339 [0x55a610ca00db] marian::models::EncoderDecoderCECost:: apply (std::shared_ptr, std::shared_ptr, std::shared_ptr, bool) + 0x58b [0x55a6108f2c82] marian::models::Trainer:: build (std::shared_ptr, std::shared_ptr, bool) + 0xb2 [0x55a610d875f4] marian::GraphGroup:: collectStats (std::shared_ptr, std::shared_ptr, std::vector,std::allocator>> const&, double) + 0xb84 [0x55a6109c9269] marian::Train:: run () + 0x2e9 [0x55a6108d1389] mainTrainer (int, char**) + 0x5e9 [0x55a61088f1bc] main + 0x3c [0x7fab12a310b3] __libc_start_main + 0xf3 [0x55a6108cfb0e] _start + 0x2e [2022-01-15 18:10:00] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 18:10:00] [marian] Running on s470607-gpu as process 4042 with command line: [2022-01-15 18:10:00] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 18:10:00] [config] after: 0e [2022-01-15 18:10:00] [config] after-batches: 0 [2022-01-15 18:10:00] [config] after-epochs: 1 [2022-01-15 18:10:00] [config] all-caps-every: 0 [2022-01-15 18:10:00] [config] allow-unk: false [2022-01-15 18:10:00] [config] authors: false [2022-01-15 18:10:00] [config] beam-size: 6 [2022-01-15 18:10:00] [config] bert-class-symbol: "[CLS]" [2022-01-15 18:10:00] [config] bert-mask-symbol: "[MASK]" [2022-01-15 18:10:00] [config] bert-masking-fraction: 0.15 [2022-01-15 18:10:00] [config] bert-sep-symbol: "[SEP]" [2022-01-15 18:10:00] [config] bert-train-type-embeddings: true [2022-01-15 18:10:00] [config] bert-type-vocab-size: 2 [2022-01-15 18:10:00] [config] build-info: "" [2022-01-15 18:10:00] [config] cite: false [2022-01-15 18:10:00] [config] clip-norm: 5 [2022-01-15 18:10:00] [config] cost-scaling: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] cost-type: ce-sum [2022-01-15 18:10:00] [config] cpu-threads: 0 [2022-01-15 18:10:00] [config] data-weighting: "" [2022-01-15 18:10:00] [config] data-weighting-type: sentence [2022-01-15 18:10:00] [config] dec-cell: gru [2022-01-15 18:10:00] [config] dec-cell-base-depth: 2 [2022-01-15 18:10:00] [config] dec-cell-high-depth: 1 [2022-01-15 18:10:00] [config] dec-depth: 6 [2022-01-15 18:10:00] [config] devices: [2022-01-15 18:10:00] [config] - 0 [2022-01-15 18:10:00] [config] dim-emb: 512 
[2022-01-15 18:10:00] [config] dim-rnn: 1024 [2022-01-15 18:10:00] [config] dim-vocabs: [2022-01-15 18:10:00] [config] - 0 [2022-01-15 18:10:00] [config] - 0 [2022-01-15 18:10:00] [config] disp-first: 0 [2022-01-15 18:10:00] [config] disp-freq: 500 [2022-01-15 18:10:00] [config] disp-label-counts: true [2022-01-15 18:10:00] [config] dropout-rnn: 0 [2022-01-15 18:10:00] [config] dropout-src: 0 [2022-01-15 18:10:00] [config] dropout-trg: 0 [2022-01-15 18:10:00] [config] dump-config: "" [2022-01-15 18:10:00] [config] early-stopping: 10 [2022-01-15 18:10:00] [config] embedding-fix-src: false [2022-01-15 18:10:00] [config] embedding-fix-trg: false [2022-01-15 18:10:00] [config] embedding-normalization: false [2022-01-15 18:10:00] [config] embedding-vectors: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] enc-cell: gru [2022-01-15 18:10:00] [config] enc-cell-depth: 1 [2022-01-15 18:10:00] [config] enc-depth: 6 [2022-01-15 18:10:00] [config] enc-type: bidirectional [2022-01-15 18:10:00] [config] english-title-case-every: 0 [2022-01-15 18:10:00] [config] exponential-smoothing: 0.0001 [2022-01-15 18:10:00] [config] factor-weight: 1 [2022-01-15 18:10:00] [config] grad-dropping-momentum: 0 [2022-01-15 18:10:00] [config] grad-dropping-rate: 0 [2022-01-15 18:10:00] [config] grad-dropping-warmup: 100 [2022-01-15 18:10:00] [config] gradient-checkpointing: false [2022-01-15 18:10:00] [config] guided-alignment: none [2022-01-15 18:10:00] [config] guided-alignment-cost: mse [2022-01-15 18:10:00] [config] guided-alignment-weight: 0.1 [2022-01-15 18:10:00] [config] ignore-model-config: false [2022-01-15 18:10:00] [config] input-types: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] interpolate-env-vars: false [2022-01-15 18:10:00] [config] keep-best: false [2022-01-15 18:10:00] [config] label-smoothing: 0.1 [2022-01-15 18:10:00] [config] layer-normalization: false [2022-01-15 18:10:00] [config] learn-rate: 0.0003 [2022-01-15 18:10:00] [config] lemma-dim-emb: 0 [2022-01-15 18:10:00] [config] log: /home/wmi/train.log [2022-01-15 18:10:00] [config] log-level: info [2022-01-15 18:10:00] [config] log-time-zone: "" [2022-01-15 18:10:00] [config] logical-epoch: [2022-01-15 18:10:00] [config] - 1e [2022-01-15 18:10:00] [config] - 0 [2022-01-15 18:10:00] [config] lr-decay: 0 [2022-01-15 18:10:00] [config] lr-decay-freq: 50000 [2022-01-15 18:10:00] [config] lr-decay-inv-sqrt: [2022-01-15 18:10:00] [config] - 16000 [2022-01-15 18:10:00] [config] lr-decay-repeat-warmup: false [2022-01-15 18:10:00] [config] lr-decay-reset-optimizer: false [2022-01-15 18:10:00] [config] lr-decay-start: [2022-01-15 18:10:00] [config] - 10 [2022-01-15 18:10:00] [config] - 1 [2022-01-15 18:10:00] [config] lr-decay-strategy: epoch+stalled [2022-01-15 18:10:00] [config] lr-report: true [2022-01-15 18:10:00] [config] lr-warmup: 16000 [2022-01-15 18:10:00] [config] lr-warmup-at-reload: false [2022-01-15 18:10:00] [config] lr-warmup-cycle: false [2022-01-15 18:10:00] [config] lr-warmup-start-rate: 0 [2022-01-15 18:10:00] [config] max-length: 100 [2022-01-15 18:10:00] [config] max-length-crop: false [2022-01-15 18:10:00] [config] max-length-factor: 3 [2022-01-15 18:10:00] [config] maxi-batch: 1000 [2022-01-15 18:10:00] [config] maxi-batch-sort: trg [2022-01-15 18:10:00] [config] mini-batch: 64 [2022-01-15 18:10:00] [config] mini-batch-fit: true [2022-01-15 18:10:00] [config] mini-batch-fit-step: 10 [2022-01-15 18:10:00] [config] mini-batch-track-lr: false [2022-01-15 18:10:00] [config] mini-batch-warmup: 0 
[2022-01-15 18:10:00] [config] mini-batch-words: 0 [2022-01-15 18:10:00] [config] mini-batch-words-ref: 0 [2022-01-15 18:10:00] [config] model: model.npz [2022-01-15 18:10:00] [config] multi-loss-type: sum [2022-01-15 18:10:00] [config] multi-node: false [2022-01-15 18:10:00] [config] multi-node-overlap: true [2022-01-15 18:10:00] [config] n-best: false [2022-01-15 18:10:00] [config] no-nccl: false [2022-01-15 18:10:00] [config] no-reload: false [2022-01-15 18:10:00] [config] no-restore-corpus: false [2022-01-15 18:10:00] [config] normalize: 0.6 [2022-01-15 18:10:00] [config] normalize-gradient: false [2022-01-15 18:10:00] [config] num-devices: 0 [2022-01-15 18:10:00] [config] optimizer: adam [2022-01-15 18:10:00] [config] optimizer-delay: 1 [2022-01-15 18:10:00] [config] optimizer-params: [2022-01-15 18:10:00] [config] - 0.9 [2022-01-15 18:10:00] [config] - 0.98 [2022-01-15 18:10:00] [config] - 1e-09 [2022-01-15 18:10:00] [config] output-omit-bias: false [2022-01-15 18:10:00] [config] overwrite: true [2022-01-15 18:10:00] [config] precision: [2022-01-15 18:10:00] [config] - float32 [2022-01-15 18:10:00] [config] - float32 [2022-01-15 18:10:00] [config] - float32 [2022-01-15 18:10:00] [config] pretrained-model: "" [2022-01-15 18:10:00] [config] quantize-biases: false [2022-01-15 18:10:00] [config] quantize-bits: 0 [2022-01-15 18:10:00] [config] quantize-log-based: false [2022-01-15 18:10:00] [config] quantize-optimization-steps: 0 [2022-01-15 18:10:00] [config] quiet: false [2022-01-15 18:10:00] [config] quiet-translation: false [2022-01-15 18:10:00] [config] relative-paths: false [2022-01-15 18:10:00] [config] right-left: false [2022-01-15 18:10:00] [config] save-freq: 5000 [2022-01-15 18:10:00] [config] seed: 0 [2022-01-15 18:10:00] [config] sentencepiece-alphas: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] sentencepiece-max-lines: 2000000 [2022-01-15 18:10:00] [config] sentencepiece-options: "" [2022-01-15 18:10:00] [config] shuffle: data [2022-01-15 18:10:00] [config] shuffle-in-ram: false [2022-01-15 18:10:00] [config] sigterm: save-and-exit [2022-01-15 18:10:00] [config] skip: false [2022-01-15 18:10:00] [config] sqlite: "" [2022-01-15 18:10:00] [config] sqlite-drop: false [2022-01-15 18:10:00] [config] sync-sgd: false [2022-01-15 18:10:00] [config] tempdir: /tmp [2022-01-15 18:10:00] [config] tied-embeddings: true [2022-01-15 18:10:00] [config] tied-embeddings-all: false [2022-01-15 18:10:00] [config] tied-embeddings-src: false [2022-01-15 18:10:00] [config] train-embedder-rank: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] train-sets: [2022-01-15 18:10:00] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 18:10:00] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 18:10:00] [config] transformer-aan-activation: swish [2022-01-15 18:10:00] [config] transformer-aan-depth: 2 [2022-01-15 18:10:00] [config] transformer-aan-nogate: false [2022-01-15 18:10:00] [config] transformer-decoder-autoreg: self-attention [2022-01-15 18:10:00] [config] transformer-depth-scaling: false [2022-01-15 18:10:00] [config] transformer-dim-aan: 2048 [2022-01-15 18:10:00] [config] transformer-dim-ffn: 2048 [2022-01-15 18:10:00] [config] transformer-dropout: 0.1 [2022-01-15 18:10:00] [config] transformer-dropout-attention: 0 [2022-01-15 18:10:00] [config] transformer-dropout-ffn: 0 [2022-01-15 18:10:00] [config] transformer-ffn-activation: swish [2022-01-15 18:10:00] [config] 
transformer-ffn-depth: 2 [2022-01-15 18:10:00] [config] transformer-guided-alignment-layer: last [2022-01-15 18:10:00] [config] transformer-heads: 8 [2022-01-15 18:10:00] [config] transformer-no-projection: false [2022-01-15 18:10:00] [config] transformer-pool: false [2022-01-15 18:10:00] [config] transformer-postprocess: dan [2022-01-15 18:10:00] [config] transformer-postprocess-emb: d [2022-01-15 18:10:00] [config] transformer-postprocess-top: "" [2022-01-15 18:10:00] [config] transformer-preprocess: "" [2022-01-15 18:10:00] [config] transformer-tied-layers: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] transformer-train-position-embeddings: false [2022-01-15 18:10:00] [config] tsv: false [2022-01-15 18:10:00] [config] tsv-fields: 0 [2022-01-15 18:10:00] [config] type: transformer [2022-01-15 18:10:00] [config] ulr: false [2022-01-15 18:10:00] [config] ulr-dim-emb: 0 [2022-01-15 18:10:00] [config] ulr-dropout: 0 [2022-01-15 18:10:00] [config] ulr-keys-vectors: "" [2022-01-15 18:10:00] [config] ulr-query-vectors: "" [2022-01-15 18:10:00] [config] ulr-softmax-temperature: 1 [2022-01-15 18:10:00] [config] ulr-trainable-transformation: false [2022-01-15 18:10:00] [config] unlikelihood-loss: false [2022-01-15 18:10:00] [config] valid-freq: 5000 [2022-01-15 18:10:00] [config] valid-log: "" [2022-01-15 18:10:00] [config] valid-max-length: 1000 [2022-01-15 18:10:00] [config] valid-metrics: [2022-01-15 18:10:00] [config] - cross-entropy [2022-01-15 18:10:00] [config] valid-mini-batch: 32 [2022-01-15 18:10:00] [config] valid-reset-stalled: false [2022-01-15 18:10:00] [config] valid-script-args: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] valid-script-path: "" [2022-01-15 18:10:00] [config] valid-sets: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] valid-translation-output: "" [2022-01-15 18:10:00] [config] vocabs: [2022-01-15 18:10:00] [config] [] [2022-01-15 18:10:00] [config] word-penalty: 0 [2022-01-15 18:10:00] [config] word-scores: false [2022-01-15 18:10:00] [config] workspace: 10000 [2022-01-15 18:10:00] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 18:10:00] [training] Using single-device training [2022-01-15 18:10:00] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 18:10:00] [data] Vocabularies will be built separately for each file. 
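The command line and config dump above set --learn-rate 0.0003 with --lr-warmup 16000 and --lr-decay-inv-sqrt 16000, and every update reported in the progress lines further down is still inside the warmup window, so the logged L.r. values grow linearly with the update count. A minimal sketch of that schedule, assuming linear warmup followed by inverse-square-root decay after update 16,000 (only the warmup phase is actually observable in this log), reproduces the reported values:

    def lr_schedule(step, base_lr=3e-4, warmup=16000, decay_ref=16000):
        # Linear warmup up to `warmup` updates, then inverse-sqrt decay.
        # The post-warmup form is an assumption; this log never reaches it.
        if step <= warmup:
            return base_lr * step / warmup
        return base_lr * (decay_ref / step) ** 0.5

    # Logged values: Up. 500 -> 9.3750e-06, Up. 1000 -> 1.8750e-05, ...
    for step in (500, 1000, 1500, 2000, 2500):
        print(step, format(lr_schedule(step), ".4e"))

Running this prints 9.3750e-06, 1.8750e-05, 2.8125e-05, 3.7500e-05 and 4.6875e-05, matching the L.r. column of the first five progress reports of this run.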
[2022-01-15 18:10:00] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 18:10:00] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 18:10:00] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 18:10:04] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000.yml [2022-01-15 18:10:04] [data] Setting vocabulary size for input 0 to 14,881 [2022-01-15 18:10:04] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 18:10:04] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 18:10:04] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 18:10:08] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000.yml [2022-01-15 18:10:08] [data] Setting vocabulary size for input 1 to 14,891 [2022-01-15 18:10:08] [comm] Compiled without MPI support. Running as a single process on s470607-gpu [2022-01-15 18:10:08] [batching] Collecting statistics for batch fitting with step size 10 [2022-01-15 18:10:08] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 18:10:08] [logits] Applying loss function for 1 factor(s) [2022-01-15 18:10:08] [memory] Reserving 226 MB, device gpu0 [2022-01-15 18:10:10] [gpu] 16-bit TensorCores enabled for float32 matrix operations [2022-01-15 18:10:10] [memory] Reserving 226 MB, device gpu0 [2022-01-15 18:10:20] [batching] Done. Typical MB size is 10,112 target words [2022-01-15 18:10:21] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 18:10:21] Training started [2022-01-15 18:10:21] [data] Shuffling data [2022-01-15 18:10:21] [data] Done reading 1,514,371 sentences [2022-01-15 18:10:28] [data] Done shuffling 1,514,371 sentences to temp files [2022-01-15 18:10:29] [memory] Reserving 226 MB, device gpu0 [2022-01-15 18:10:30] [memory] Reserving 226 MB, device gpu0 [2022-01-15 18:10:30] [memory] Reserving 453 MB, device gpu0 [2022-01-15 18:10:30] [memory] Reserving 226 MB, device gpu0 [2022-01-15 18:11:53] Ep. 1 : Up. 500 : Sen. 123,531 : Cost 9.24570084 * 4,202,148 @ 7,371 after 4,202,148 : Time 92.22s : 45564.85 words/s : L.r. 9.3750e-06 [2022-01-15 18:13:17] Ep. 1 : Up. 1000 : Sen. 247,289 : Cost 8.30709076 * 4,212,393 @ 9,174 after 8,414,541 : Time 84.57s : 49812.25 words/s : L.r. 1.8750e-05 [2022-01-15 18:14:42] Ep. 1 : Up. 1500 : Sen. 371,249 : Cost 7.93976021 * 4,214,955 @ 8,424 after 12,629,496 : Time 84.95s : 49616.76 words/s : L.r. 2.8125e-05 [2022-01-15 18:16:07] Ep. 1 : Up. 2000 : Sen. 494,238 : Cost 7.66895914 * 4,181,637 @ 6,976 after 16,811,133 : Time 84.54s : 49463.52 words/s : L.r. 3.7500e-05 [2022-01-15 18:17:32] Ep. 1 : Up. 2500 : Sen. 620,120 : Cost 7.33166742 * 4,211,519 @ 8,932 after 21,022,652 : Time 85.05s : 49517.60 words/s : L.r. 4.6875e-05 [2022-01-15 18:18:57] Ep. 1 : Up. 
3000 : Sen. 742,304 : Cost 6.73455620 * 4,224,299 @ 7,037 after 25,246,951 : Time 85.22s : 49569.79 words/s : L.r. 5.6250e-05 [2022-01-15 18:20:22] Ep. 1 : Up. 3500 : Sen. 867,022 : Cost 5.82840490 * 4,186,159 @ 8,820 after 29,433,110 : Time 84.70s : 49423.21 words/s : L.r. 6.5625e-05 [2022-01-15 19:15:06] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 19:15:06] [marian] Running on s470607-gpu as process 4149 with command line: [2022-01-15 19:15:06] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 19:15:06] [config] after: 0e [2022-01-15 19:15:06] [config] after-batches: 0 [2022-01-15 19:15:06] [config] after-epochs: 1 [2022-01-15 19:15:06] [config] all-caps-every: 0 [2022-01-15 19:15:06] [config] allow-unk: false [2022-01-15 19:15:06] [config] authors: false [2022-01-15 19:15:06] [config] beam-size: 6 [2022-01-15 19:15:06] [config] bert-class-symbol: "[CLS]" [2022-01-15 19:15:06] [config] bert-mask-symbol: "[MASK]" [2022-01-15 19:15:06] [config] bert-masking-fraction: 0.15 [2022-01-15 19:15:06] [config] bert-sep-symbol: "[SEP]" [2022-01-15 19:15:06] [config] bert-train-type-embeddings: true [2022-01-15 19:15:06] [config] bert-type-vocab-size: 2 [2022-01-15 19:15:06] [config] build-info: "" [2022-01-15 19:15:06] [config] cite: false [2022-01-15 19:15:06] [config] clip-norm: 5 [2022-01-15 19:15:06] [config] cost-scaling: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] cost-type: ce-sum [2022-01-15 19:15:06] [config] cpu-threads: 0 [2022-01-15 19:15:06] [config] data-weighting: "" [2022-01-15 19:15:06] [config] data-weighting-type: sentence [2022-01-15 19:15:06] [config] dec-cell: gru [2022-01-15 19:15:06] [config] dec-cell-base-depth: 2 [2022-01-15 19:15:06] [config] dec-cell-high-depth: 1 [2022-01-15 19:15:06] [config] dec-depth: 6 [2022-01-15 19:15:06] [config] devices: [2022-01-15 19:15:06] [config] - 0 [2022-01-15 19:15:06] [config] dim-emb: 512 [2022-01-15 19:15:06] [config] dim-rnn: 1024 [2022-01-15 19:15:06] [config] dim-vocabs: [2022-01-15 19:15:06] [config] - 0 [2022-01-15 19:15:06] [config] - 0 [2022-01-15 19:15:06] [config] disp-first: 0 [2022-01-15 19:15:06] [config] disp-freq: 500 [2022-01-15 19:15:06] [config] disp-label-counts: true [2022-01-15 19:15:06] [config] dropout-rnn: 0 [2022-01-15 19:15:06] [config] dropout-src: 0 [2022-01-15 19:15:06] [config] dropout-trg: 0 [2022-01-15 19:15:06] [config] dump-config: "" [2022-01-15 19:15:06] [config] early-stopping: 10 [2022-01-15 19:15:06] [config] embedding-fix-src: false [2022-01-15 19:15:06] [config] embedding-fix-trg: false [2022-01-15 19:15:06] [config] embedding-normalization: false [2022-01-15 19:15:06] [config] embedding-vectors: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] enc-cell: gru [2022-01-15 19:15:06] [config] enc-cell-depth: 1 [2022-01-15 19:15:06] [config] 
enc-depth: 6 [2022-01-15 19:15:06] [config] enc-type: bidirectional [2022-01-15 19:15:06] [config] english-title-case-every: 0 [2022-01-15 19:15:06] [config] exponential-smoothing: 0.0001 [2022-01-15 19:15:06] [config] factor-weight: 1 [2022-01-15 19:15:06] [config] grad-dropping-momentum: 0 [2022-01-15 19:15:06] [config] grad-dropping-rate: 0 [2022-01-15 19:15:06] [config] grad-dropping-warmup: 100 [2022-01-15 19:15:06] [config] gradient-checkpointing: false [2022-01-15 19:15:06] [config] guided-alignment: none [2022-01-15 19:15:06] [config] guided-alignment-cost: mse [2022-01-15 19:15:06] [config] guided-alignment-weight: 0.1 [2022-01-15 19:15:06] [config] ignore-model-config: false [2022-01-15 19:15:06] [config] input-types: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] interpolate-env-vars: false [2022-01-15 19:15:06] [config] keep-best: false [2022-01-15 19:15:06] [config] label-smoothing: 0.1 [2022-01-15 19:15:06] [config] layer-normalization: false [2022-01-15 19:15:06] [config] learn-rate: 0.0003 [2022-01-15 19:15:06] [config] lemma-dim-emb: 0 [2022-01-15 19:15:06] [config] log: /home/wmi/train.log [2022-01-15 19:15:06] [config] log-level: info [2022-01-15 19:15:06] [config] log-time-zone: "" [2022-01-15 19:15:06] [config] logical-epoch: [2022-01-15 19:15:06] [config] - 1e [2022-01-15 19:15:06] [config] - 0 [2022-01-15 19:15:06] [config] lr-decay: 0 [2022-01-15 19:15:06] [config] lr-decay-freq: 50000 [2022-01-15 19:15:06] [config] lr-decay-inv-sqrt: [2022-01-15 19:15:06] [config] - 16000 [2022-01-15 19:15:06] [config] lr-decay-repeat-warmup: false [2022-01-15 19:15:06] [config] lr-decay-reset-optimizer: false [2022-01-15 19:15:06] [config] lr-decay-start: [2022-01-15 19:15:06] [config] - 10 [2022-01-15 19:15:06] [config] - 1 [2022-01-15 19:15:06] [config] lr-decay-strategy: epoch+stalled [2022-01-15 19:15:06] [config] lr-report: true [2022-01-15 19:15:06] [config] lr-warmup: 16000 [2022-01-15 19:15:06] [config] lr-warmup-at-reload: false [2022-01-15 19:15:06] [config] lr-warmup-cycle: false [2022-01-15 19:15:06] [config] lr-warmup-start-rate: 0 [2022-01-15 19:15:06] [config] max-length: 100 [2022-01-15 19:15:06] [config] max-length-crop: false [2022-01-15 19:15:06] [config] max-length-factor: 3 [2022-01-15 19:15:06] [config] maxi-batch: 1000 [2022-01-15 19:15:06] [config] maxi-batch-sort: trg [2022-01-15 19:15:06] [config] mini-batch: 64 [2022-01-15 19:15:06] [config] mini-batch-fit: true [2022-01-15 19:15:06] [config] mini-batch-fit-step: 10 [2022-01-15 19:15:06] [config] mini-batch-track-lr: false [2022-01-15 19:15:06] [config] mini-batch-warmup: 0 [2022-01-15 19:15:06] [config] mini-batch-words: 0 [2022-01-15 19:15:06] [config] mini-batch-words-ref: 0 [2022-01-15 19:15:06] [config] model: model.npz [2022-01-15 19:15:06] [config] multi-loss-type: sum [2022-01-15 19:15:06] [config] multi-node: false [2022-01-15 19:15:06] [config] multi-node-overlap: true [2022-01-15 19:15:06] [config] n-best: false [2022-01-15 19:15:06] [config] no-nccl: false [2022-01-15 19:15:06] [config] no-reload: false [2022-01-15 19:15:06] [config] no-restore-corpus: false [2022-01-15 19:15:06] [config] normalize: 0.6 [2022-01-15 19:15:06] [config] normalize-gradient: false [2022-01-15 19:15:06] [config] num-devices: 0 [2022-01-15 19:15:06] [config] optimizer: adam [2022-01-15 19:15:06] [config] optimizer-delay: 1 [2022-01-15 19:15:06] [config] optimizer-params: [2022-01-15 19:15:06] [config] - 0.9 [2022-01-15 19:15:06] [config] - 0.98 [2022-01-15 19:15:06] [config] - 1e-09 [2022-01-15 
19:15:06] [config] output-omit-bias: false [2022-01-15 19:15:06] [config] overwrite: true [2022-01-15 19:15:06] [config] precision: [2022-01-15 19:15:06] [config] - float32 [2022-01-15 19:15:06] [config] - float32 [2022-01-15 19:15:06] [config] - float32 [2022-01-15 19:15:06] [config] pretrained-model: "" [2022-01-15 19:15:06] [config] quantize-biases: false [2022-01-15 19:15:06] [config] quantize-bits: 0 [2022-01-15 19:15:06] [config] quantize-log-based: false [2022-01-15 19:15:06] [config] quantize-optimization-steps: 0 [2022-01-15 19:15:06] [config] quiet: false [2022-01-15 19:15:06] [config] quiet-translation: false [2022-01-15 19:15:06] [config] relative-paths: false [2022-01-15 19:15:06] [config] right-left: false [2022-01-15 19:15:06] [config] save-freq: 5000 [2022-01-15 19:15:06] [config] seed: 0 [2022-01-15 19:15:06] [config] sentencepiece-alphas: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] sentencepiece-max-lines: 2000000 [2022-01-15 19:15:06] [config] sentencepiece-options: "" [2022-01-15 19:15:06] [config] shuffle: data [2022-01-15 19:15:06] [config] shuffle-in-ram: false [2022-01-15 19:15:06] [config] sigterm: save-and-exit [2022-01-15 19:15:06] [config] skip: false [2022-01-15 19:15:06] [config] sqlite: "" [2022-01-15 19:15:06] [config] sqlite-drop: false [2022-01-15 19:15:06] [config] sync-sgd: false [2022-01-15 19:15:06] [config] tempdir: /tmp [2022-01-15 19:15:06] [config] tied-embeddings: true [2022-01-15 19:15:06] [config] tied-embeddings-all: false [2022-01-15 19:15:06] [config] tied-embeddings-src: false [2022-01-15 19:15:06] [config] train-embedder-rank: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] train-sets: [2022-01-15 19:15:06] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 19:15:06] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 19:15:06] [config] transformer-aan-activation: swish [2022-01-15 19:15:06] [config] transformer-aan-depth: 2 [2022-01-15 19:15:06] [config] transformer-aan-nogate: false [2022-01-15 19:15:06] [config] transformer-decoder-autoreg: self-attention [2022-01-15 19:15:06] [config] transformer-depth-scaling: false [2022-01-15 19:15:06] [config] transformer-dim-aan: 2048 [2022-01-15 19:15:06] [config] transformer-dim-ffn: 2048 [2022-01-15 19:15:06] [config] transformer-dropout: 0.1 [2022-01-15 19:15:06] [config] transformer-dropout-attention: 0 [2022-01-15 19:15:06] [config] transformer-dropout-ffn: 0 [2022-01-15 19:15:06] [config] transformer-ffn-activation: swish [2022-01-15 19:15:06] [config] transformer-ffn-depth: 2 [2022-01-15 19:15:06] [config] transformer-guided-alignment-layer: last [2022-01-15 19:15:06] [config] transformer-heads: 8 [2022-01-15 19:15:06] [config] transformer-no-projection: false [2022-01-15 19:15:06] [config] transformer-pool: false [2022-01-15 19:15:06] [config] transformer-postprocess: dan [2022-01-15 19:15:06] [config] transformer-postprocess-emb: d [2022-01-15 19:15:06] [config] transformer-postprocess-top: "" [2022-01-15 19:15:06] [config] transformer-preprocess: "" [2022-01-15 19:15:06] [config] transformer-tied-layers: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] transformer-train-position-embeddings: false [2022-01-15 19:15:06] [config] tsv: false [2022-01-15 19:15:06] [config] tsv-fields: 0 [2022-01-15 19:15:06] [config] type: transformer [2022-01-15 19:15:06] [config] ulr: false [2022-01-15 19:15:06] [config] ulr-dim-emb: 0 [2022-01-15 19:15:06] [config] 
ulr-dropout: 0 [2022-01-15 19:15:06] [config] ulr-keys-vectors: "" [2022-01-15 19:15:06] [config] ulr-query-vectors: "" [2022-01-15 19:15:06] [config] ulr-softmax-temperature: 1 [2022-01-15 19:15:06] [config] ulr-trainable-transformation: false [2022-01-15 19:15:06] [config] unlikelihood-loss: false [2022-01-15 19:15:06] [config] valid-freq: 5000 [2022-01-15 19:15:06] [config] valid-log: "" [2022-01-15 19:15:06] [config] valid-max-length: 1000 [2022-01-15 19:15:06] [config] valid-metrics: [2022-01-15 19:15:06] [config] - cross-entropy [2022-01-15 19:15:06] [config] valid-mini-batch: 32 [2022-01-15 19:15:06] [config] valid-reset-stalled: false [2022-01-15 19:15:06] [config] valid-script-args: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] valid-script-path: "" [2022-01-15 19:15:06] [config] valid-sets: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] valid-translation-output: "" [2022-01-15 19:15:06] [config] vocabs: [2022-01-15 19:15:06] [config] [] [2022-01-15 19:15:06] [config] word-penalty: 0 [2022-01-15 19:15:06] [config] word-scores: false [2022-01-15 19:15:06] [config] workspace: 10000 [2022-01-15 19:15:06] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 19:15:06] [training] Using single-device training [2022-01-15 19:15:06] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 19:15:06] [data] Vocabularies will be built separately for each file. [2022-01-15 19:15:06] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 19:15:06] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 19:15:06] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 19:15:10] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000.yml [2022-01-15 19:15:10] [data] Setting vocabulary size for input 0 to 14,891 [2022-01-15 19:15:10] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 19:15:10] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 19:15:10] [data] Creating vocabulary /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000.yml from /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 19:15:14] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000.yml [2022-01-15 19:15:14] [data] Setting vocabulary size for input 1 to 14,901 [2022-01-15 19:15:14] [comm] Compiled without MPI support. 
Running as a single process on s470607-gpu [2022-01-15 19:15:14] [batching] Collecting statistics for batch fitting with step size 10 [2022-01-15 19:15:15] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 19:15:15] [logits] Applying loss function for 1 factor(s) [2022-01-15 19:15:15] [memory] Reserving 226 MB, device gpu0 [2022-01-15 19:15:16] [gpu] 16-bit TensorCores enabled for float32 matrix operations [2022-01-15 19:15:16] [memory] Reserving 226 MB, device gpu0 [2022-01-15 19:15:27] [batching] Done. Typical MB size is 10,112 target words [2022-01-15 19:15:27] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 19:15:27] Training started [2022-01-15 19:15:27] [data] Shuffling data [2022-01-15 19:15:28] [data] Done reading 1,514,371 sentences [2022-01-15 19:15:35] [data] Done shuffling 1,514,371 sentences to temp files [2022-01-15 19:15:36] [memory] Reserving 226 MB, device gpu0 [2022-01-15 19:15:36] [memory] Reserving 226 MB, device gpu0 [2022-01-15 19:15:36] [memory] Reserving 453 MB, device gpu0 [2022-01-15 19:15:36] [memory] Reserving 226 MB, device gpu0 [2022-01-15 19:17:00] Ep. 1 : Up. 500 : Sen. 124,245 : Cost 9.25954151 * 4,225,979 @ 4,563 after 4,225,979 : Time 92.46s : 45708.10 words/s : L.r. 9.3750e-06 [2022-01-15 19:18:09] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 19:18:09] [marian] Running on s470607-gpu as process 4189 with command line: [2022-01-15 19:18:09] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-15 19:18:09] [config] after: 0e [2022-01-15 19:18:09] [config] after-batches: 0 [2022-01-15 19:18:09] [config] after-epochs: 1 [2022-01-15 19:18:09] [config] all-caps-every: 0 [2022-01-15 19:18:09] [config] allow-unk: false [2022-01-15 19:18:09] [config] authors: false [2022-01-15 19:18:09] [config] beam-size: 6 [2022-01-15 19:18:09] [config] bert-class-symbol: "[CLS]" [2022-01-15 19:18:09] [config] bert-mask-symbol: "[MASK]" [2022-01-15 19:18:09] [config] bert-masking-fraction: 0.15 [2022-01-15 19:18:09] [config] bert-sep-symbol: "[SEP]" [2022-01-15 19:18:09] [config] bert-train-type-embeddings: true [2022-01-15 19:18:09] [config] bert-type-vocab-size: 2 [2022-01-15 19:18:09] [config] build-info: "" [2022-01-15 19:18:09] [config] cite: false [2022-01-15 19:18:09] [config] clip-norm: 5 [2022-01-15 19:18:09] [config] cost-scaling: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] cost-type: ce-sum [2022-01-15 19:18:09] [config] cpu-threads: 0 [2022-01-15 19:18:09] [config] data-weighting: "" [2022-01-15 19:18:09] [config] data-weighting-type: sentence [2022-01-15 19:18:09] [config] dec-cell: gru [2022-01-15 19:18:09] [config] dec-cell-base-depth: 2 [2022-01-15 19:18:09] [config] dec-cell-high-depth: 1 [2022-01-15 19:18:09] [config] dec-depth: 6 [2022-01-15 19:18:09] [config] devices: [2022-01-15 
19:18:09] [config] - 0 [2022-01-15 19:18:09] [config] dim-emb: 512 [2022-01-15 19:18:09] [config] dim-rnn: 1024 [2022-01-15 19:18:09] [config] dim-vocabs: [2022-01-15 19:18:09] [config] - 0 [2022-01-15 19:18:09] [config] - 0 [2022-01-15 19:18:09] [config] disp-first: 0 [2022-01-15 19:18:09] [config] disp-freq: 500 [2022-01-15 19:18:09] [config] disp-label-counts: true [2022-01-15 19:18:09] [config] dropout-rnn: 0 [2022-01-15 19:18:09] [config] dropout-src: 0 [2022-01-15 19:18:09] [config] dropout-trg: 0 [2022-01-15 19:18:09] [config] dump-config: "" [2022-01-15 19:18:09] [config] early-stopping: 10 [2022-01-15 19:18:09] [config] embedding-fix-src: false [2022-01-15 19:18:09] [config] embedding-fix-trg: false [2022-01-15 19:18:09] [config] embedding-normalization: false [2022-01-15 19:18:09] [config] embedding-vectors: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] enc-cell: gru [2022-01-15 19:18:09] [config] enc-cell-depth: 1 [2022-01-15 19:18:09] [config] enc-depth: 6 [2022-01-15 19:18:09] [config] enc-type: bidirectional [2022-01-15 19:18:09] [config] english-title-case-every: 0 [2022-01-15 19:18:09] [config] exponential-smoothing: 0.0001 [2022-01-15 19:18:09] [config] factor-weight: 1 [2022-01-15 19:18:09] [config] grad-dropping-momentum: 0 [2022-01-15 19:18:09] [config] grad-dropping-rate: 0 [2022-01-15 19:18:09] [config] grad-dropping-warmup: 100 [2022-01-15 19:18:09] [config] gradient-checkpointing: false [2022-01-15 19:18:09] [config] guided-alignment: none [2022-01-15 19:18:09] [config] guided-alignment-cost: mse [2022-01-15 19:18:09] [config] guided-alignment-weight: 0.1 [2022-01-15 19:18:09] [config] ignore-model-config: false [2022-01-15 19:18:09] [config] input-types: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] interpolate-env-vars: false [2022-01-15 19:18:09] [config] keep-best: false [2022-01-15 19:18:09] [config] label-smoothing: 0.1 [2022-01-15 19:18:09] [config] layer-normalization: false [2022-01-15 19:18:09] [config] learn-rate: 0.0003 [2022-01-15 19:18:09] [config] lemma-dim-emb: 0 [2022-01-15 19:18:09] [config] log: /home/wmi/train.log [2022-01-15 19:18:09] [config] log-level: info [2022-01-15 19:18:09] [config] log-time-zone: "" [2022-01-15 19:18:09] [config] logical-epoch: [2022-01-15 19:18:09] [config] - 1e [2022-01-15 19:18:09] [config] - 0 [2022-01-15 19:18:09] [config] lr-decay: 0 [2022-01-15 19:18:09] [config] lr-decay-freq: 50000 [2022-01-15 19:18:09] [config] lr-decay-inv-sqrt: [2022-01-15 19:18:09] [config] - 16000 [2022-01-15 19:18:09] [config] lr-decay-repeat-warmup: false [2022-01-15 19:18:09] [config] lr-decay-reset-optimizer: false [2022-01-15 19:18:09] [config] lr-decay-start: [2022-01-15 19:18:09] [config] - 10 [2022-01-15 19:18:09] [config] - 1 [2022-01-15 19:18:09] [config] lr-decay-strategy: epoch+stalled [2022-01-15 19:18:09] [config] lr-report: true [2022-01-15 19:18:09] [config] lr-warmup: 16000 [2022-01-15 19:18:09] [config] lr-warmup-at-reload: false [2022-01-15 19:18:09] [config] lr-warmup-cycle: false [2022-01-15 19:18:09] [config] lr-warmup-start-rate: 0 [2022-01-15 19:18:09] [config] max-length: 100 [2022-01-15 19:18:09] [config] max-length-crop: false [2022-01-15 19:18:09] [config] max-length-factor: 3 [2022-01-15 19:18:09] [config] maxi-batch: 1000 [2022-01-15 19:18:09] [config] maxi-batch-sort: trg [2022-01-15 19:18:09] [config] mini-batch: 64 [2022-01-15 19:18:09] [config] mini-batch-fit: true [2022-01-15 19:18:09] [config] mini-batch-fit-step: 10 [2022-01-15 19:18:09] [config] mini-batch-track-lr: 
false [2022-01-15 19:18:09] [config] mini-batch-warmup: 0 [2022-01-15 19:18:09] [config] mini-batch-words: 0 [2022-01-15 19:18:09] [config] mini-batch-words-ref: 0 [2022-01-15 19:18:09] [config] model: model.npz [2022-01-15 19:18:09] [config] multi-loss-type: sum [2022-01-15 19:18:09] [config] multi-node: false [2022-01-15 19:18:09] [config] multi-node-overlap: true [2022-01-15 19:18:09] [config] n-best: false [2022-01-15 19:18:09] [config] no-nccl: false [2022-01-15 19:18:09] [config] no-reload: false [2022-01-15 19:18:09] [config] no-restore-corpus: false [2022-01-15 19:18:09] [config] normalize: 0.6 [2022-01-15 19:18:09] [config] normalize-gradient: false [2022-01-15 19:18:09] [config] num-devices: 0 [2022-01-15 19:18:09] [config] optimizer: adam [2022-01-15 19:18:09] [config] optimizer-delay: 1 [2022-01-15 19:18:09] [config] optimizer-params: [2022-01-15 19:18:09] [config] - 0.9 [2022-01-15 19:18:09] [config] - 0.98 [2022-01-15 19:18:09] [config] - 1e-09 [2022-01-15 19:18:09] [config] output-omit-bias: false [2022-01-15 19:18:09] [config] overwrite: true [2022-01-15 19:18:09] [config] precision: [2022-01-15 19:18:09] [config] - float32 [2022-01-15 19:18:09] [config] - float32 [2022-01-15 19:18:09] [config] - float32 [2022-01-15 19:18:09] [config] pretrained-model: "" [2022-01-15 19:18:09] [config] quantize-biases: false [2022-01-15 19:18:09] [config] quantize-bits: 0 [2022-01-15 19:18:09] [config] quantize-log-based: false [2022-01-15 19:18:09] [config] quantize-optimization-steps: 0 [2022-01-15 19:18:09] [config] quiet: false [2022-01-15 19:18:09] [config] quiet-translation: false [2022-01-15 19:18:09] [config] relative-paths: false [2022-01-15 19:18:09] [config] right-left: false [2022-01-15 19:18:09] [config] save-freq: 5000 [2022-01-15 19:18:09] [config] seed: 0 [2022-01-15 19:18:09] [config] sentencepiece-alphas: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] sentencepiece-max-lines: 2000000 [2022-01-15 19:18:09] [config] sentencepiece-options: "" [2022-01-15 19:18:09] [config] shuffle: data [2022-01-15 19:18:09] [config] shuffle-in-ram: false [2022-01-15 19:18:09] [config] sigterm: save-and-exit [2022-01-15 19:18:09] [config] skip: false [2022-01-15 19:18:09] [config] sqlite: "" [2022-01-15 19:18:09] [config] sqlite-drop: false [2022-01-15 19:18:09] [config] sync-sgd: false [2022-01-15 19:18:09] [config] tempdir: /tmp [2022-01-15 19:18:09] [config] tied-embeddings: true [2022-01-15 19:18:09] [config] tied-embeddings-all: false [2022-01-15 19:18:09] [config] tied-embeddings-src: false [2022-01-15 19:18:09] [config] train-embedder-rank: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] train-sets: [2022-01-15 19:18:09] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 19:18:09] [config] - /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 19:18:09] [config] transformer-aan-activation: swish [2022-01-15 19:18:09] [config] transformer-aan-depth: 2 [2022-01-15 19:18:09] [config] transformer-aan-nogate: false [2022-01-15 19:18:09] [config] transformer-decoder-autoreg: self-attention [2022-01-15 19:18:09] [config] transformer-depth-scaling: false [2022-01-15 19:18:09] [config] transformer-dim-aan: 2048 [2022-01-15 19:18:09] [config] transformer-dim-ffn: 2048 [2022-01-15 19:18:09] [config] transformer-dropout: 0.1 [2022-01-15 19:18:09] [config] transformer-dropout-attention: 0 [2022-01-15 19:18:09] [config] transformer-dropout-ffn: 0 [2022-01-15 19:18:09] [config] 
transformer-ffn-activation: swish [2022-01-15 19:18:09] [config] transformer-ffn-depth: 2 [2022-01-15 19:18:09] [config] transformer-guided-alignment-layer: last [2022-01-15 19:18:09] [config] transformer-heads: 8 [2022-01-15 19:18:09] [config] transformer-no-projection: false [2022-01-15 19:18:09] [config] transformer-pool: false [2022-01-15 19:18:09] [config] transformer-postprocess: dan [2022-01-15 19:18:09] [config] transformer-postprocess-emb: d [2022-01-15 19:18:09] [config] transformer-postprocess-top: "" [2022-01-15 19:18:09] [config] transformer-preprocess: "" [2022-01-15 19:18:09] [config] transformer-tied-layers: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] transformer-train-position-embeddings: false [2022-01-15 19:18:09] [config] tsv: false [2022-01-15 19:18:09] [config] tsv-fields: 0 [2022-01-15 19:18:09] [config] type: transformer [2022-01-15 19:18:09] [config] ulr: false [2022-01-15 19:18:09] [config] ulr-dim-emb: 0 [2022-01-15 19:18:09] [config] ulr-dropout: 0 [2022-01-15 19:18:09] [config] ulr-keys-vectors: "" [2022-01-15 19:18:09] [config] ulr-query-vectors: "" [2022-01-15 19:18:09] [config] ulr-softmax-temperature: 1 [2022-01-15 19:18:09] [config] ulr-trainable-transformation: false [2022-01-15 19:18:09] [config] unlikelihood-loss: false [2022-01-15 19:18:09] [config] valid-freq: 5000 [2022-01-15 19:18:09] [config] valid-log: "" [2022-01-15 19:18:09] [config] valid-max-length: 1000 [2022-01-15 19:18:09] [config] valid-metrics: [2022-01-15 19:18:09] [config] - cross-entropy [2022-01-15 19:18:09] [config] valid-mini-batch: 32 [2022-01-15 19:18:09] [config] valid-reset-stalled: false [2022-01-15 19:18:09] [config] valid-script-args: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] valid-script-path: "" [2022-01-15 19:18:09] [config] valid-sets: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] valid-translation-output: "" [2022-01-15 19:18:09] [config] vocabs: [2022-01-15 19:18:09] [config] [] [2022-01-15 19:18:09] [config] word-penalty: 0 [2022-01-15 19:18:09] [config] word-scores: false [2022-01-15 19:18:09] [config] workspace: 10000 [2022-01-15 19:18:09] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-15 19:18:09] [training] Using single-device training [2022-01-15 19:18:09] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-15 19:18:09] [data] Vocabularies will be built separately for each file. [2022-01-15 19:18:09] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000 [2022-01-15 19:18:09] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/in.tsv.10000.yml [2022-01-15 19:18:09] [data] Setting vocabulary size for input 0 to 14,891 [2022-01-15 19:18:09] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000 [2022-01-15 19:18:09] [data] Loading vocabulary from JSON/Yaml file /home/wmi/PLEWI-polish-errors-correction-challenge/train/expected.tsv.10000.yml [2022-01-15 19:18:09] [data] Setting vocabulary size for input 1 to 14,901 [2022-01-15 19:18:09] [comm] Compiled without MPI support. 
Running as a single process on s470607-gpu [2022-01-15 19:18:09] [batching] Collecting statistics for batch fitting with step size 10 [2022-01-15 19:18:09] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 19:18:09] [logits] Applying loss function for 1 factor(s) [2022-01-15 19:18:09] [memory] Reserving 226 MB, device gpu0 [2022-01-15 19:18:10] [gpu] 16-bit TensorCores enabled for float32 matrix operations [2022-01-15 19:18:10] [memory] Reserving 226 MB, device gpu0 [2022-01-15 19:18:21] [batching] Done. Typical MB size is 10,112 target words [2022-01-15 19:18:21] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-15 19:18:21] Training started [2022-01-15 19:18:21] [data] Shuffling data [2022-01-15 19:18:22] [data] Done reading 1,514,371 sentences [2022-01-17 14:28:25] [marian] Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-17 14:28:25] [marian] Running on s470607-gpu as process 22804 with command line: [2022-01-17 14:28:25] [marian] /home/wmi/marian/build/marian --type transformer --overwrite --train-sets /home/wmi/mt-summit-corpora/train/in.tsv.32000 /home/wmi/mt-summit-corpora/train/expected.tsv.32000 --max-length 100 --mini-batch-fit -w 10000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --beam-size 6 --normalize 0.6 --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --tied-embeddings --exponential-smoothing --log /home/wmi/train.log --after-epochs=1 [2022-01-17 14:28:25] [config] after: 0e [2022-01-17 14:28:25] [config] after-batches: 0 [2022-01-17 14:28:25] [config] after-epochs: 1 [2022-01-17 14:28:25] [config] all-caps-every: 0 [2022-01-17 14:28:25] [config] allow-unk: false [2022-01-17 14:28:25] [config] authors: false [2022-01-17 14:28:25] [config] beam-size: 6 [2022-01-17 14:28:25] [config] bert-class-symbol: "[CLS]" [2022-01-17 14:28:25] [config] bert-mask-symbol: "[MASK]" [2022-01-17 14:28:25] [config] bert-masking-fraction: 0.15 [2022-01-17 14:28:25] [config] bert-sep-symbol: "[SEP]" [2022-01-17 14:28:25] [config] bert-train-type-embeddings: true [2022-01-17 14:28:25] [config] bert-type-vocab-size: 2 [2022-01-17 14:28:25] [config] build-info: "" [2022-01-17 14:28:25] [config] cite: false [2022-01-17 14:28:25] [config] clip-norm: 5 [2022-01-17 14:28:25] [config] cost-scaling: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] cost-type: ce-sum [2022-01-17 14:28:25] [config] cpu-threads: 0 [2022-01-17 14:28:25] [config] data-weighting: "" [2022-01-17 14:28:25] [config] data-weighting-type: sentence [2022-01-17 14:28:25] [config] dec-cell: gru [2022-01-17 14:28:25] [config] dec-cell-base-depth: 2 [2022-01-17 14:28:25] [config] dec-cell-high-depth: 1 [2022-01-17 14:28:25] [config] dec-depth: 6 [2022-01-17 14:28:25] [config] devices: [2022-01-17 14:28:25] [config] - 0 [2022-01-17 14:28:25] [config] dim-emb: 512 [2022-01-17 14:28:25] [config] dim-rnn: 1024 [2022-01-17 14:28:25] [config] dim-vocabs: [2022-01-17 14:28:25] [config] - 0 [2022-01-17 14:28:25] [config] - 0 [2022-01-17 14:28:25] [config] disp-first: 0 [2022-01-17 14:28:25] [config] disp-freq: 500 [2022-01-17 14:28:25] [config] disp-label-counts: true [2022-01-17 14:28:25] [config] dropout-rnn: 0 [2022-01-17 14:28:25] [config] dropout-src: 0 [2022-01-17 14:28:25] [config] dropout-trg: 0 [2022-01-17 
14:28:25] [config] dump-config: "" [2022-01-17 14:28:25] [config] early-stopping: 10 [2022-01-17 14:28:25] [config] embedding-fix-src: false [2022-01-17 14:28:25] [config] embedding-fix-trg: false [2022-01-17 14:28:25] [config] embedding-normalization: false [2022-01-17 14:28:25] [config] embedding-vectors: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] enc-cell: gru [2022-01-17 14:28:25] [config] enc-cell-depth: 1 [2022-01-17 14:28:25] [config] enc-depth: 6 [2022-01-17 14:28:25] [config] enc-type: bidirectional [2022-01-17 14:28:25] [config] english-title-case-every: 0 [2022-01-17 14:28:25] [config] exponential-smoothing: 0.0001 [2022-01-17 14:28:25] [config] factor-weight: 1 [2022-01-17 14:28:25] [config] grad-dropping-momentum: 0 [2022-01-17 14:28:25] [config] grad-dropping-rate: 0 [2022-01-17 14:28:25] [config] grad-dropping-warmup: 100 [2022-01-17 14:28:25] [config] gradient-checkpointing: false [2022-01-17 14:28:25] [config] guided-alignment: none [2022-01-17 14:28:25] [config] guided-alignment-cost: mse [2022-01-17 14:28:25] [config] guided-alignment-weight: 0.1 [2022-01-17 14:28:25] [config] ignore-model-config: false [2022-01-17 14:28:25] [config] input-types: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] interpolate-env-vars: false [2022-01-17 14:28:25] [config] keep-best: false [2022-01-17 14:28:25] [config] label-smoothing: 0.1 [2022-01-17 14:28:25] [config] layer-normalization: false [2022-01-17 14:28:25] [config] learn-rate: 0.0003 [2022-01-17 14:28:25] [config] lemma-dim-emb: 0 [2022-01-17 14:28:25] [config] log: /home/wmi/train.log [2022-01-17 14:28:25] [config] log-level: info [2022-01-17 14:28:25] [config] log-time-zone: "" [2022-01-17 14:28:25] [config] logical-epoch: [2022-01-17 14:28:25] [config] - 1e [2022-01-17 14:28:25] [config] - 0 [2022-01-17 14:28:25] [config] lr-decay: 0 [2022-01-17 14:28:25] [config] lr-decay-freq: 50000 [2022-01-17 14:28:25] [config] lr-decay-inv-sqrt: [2022-01-17 14:28:25] [config] - 16000 [2022-01-17 14:28:25] [config] lr-decay-repeat-warmup: false [2022-01-17 14:28:25] [config] lr-decay-reset-optimizer: false [2022-01-17 14:28:25] [config] lr-decay-start: [2022-01-17 14:28:25] [config] - 10 [2022-01-17 14:28:25] [config] - 1 [2022-01-17 14:28:25] [config] lr-decay-strategy: epoch+stalled [2022-01-17 14:28:25] [config] lr-report: true [2022-01-17 14:28:25] [config] lr-warmup: 16000 [2022-01-17 14:28:25] [config] lr-warmup-at-reload: false [2022-01-17 14:28:25] [config] lr-warmup-cycle: false [2022-01-17 14:28:25] [config] lr-warmup-start-rate: 0 [2022-01-17 14:28:25] [config] max-length: 100 [2022-01-17 14:28:25] [config] max-length-crop: false [2022-01-17 14:28:25] [config] max-length-factor: 3 [2022-01-17 14:28:25] [config] maxi-batch: 1000 [2022-01-17 14:28:25] [config] maxi-batch-sort: trg [2022-01-17 14:28:25] [config] mini-batch: 64 [2022-01-17 14:28:25] [config] mini-batch-fit: true [2022-01-17 14:28:25] [config] mini-batch-fit-step: 10 [2022-01-17 14:28:25] [config] mini-batch-track-lr: false [2022-01-17 14:28:25] [config] mini-batch-warmup: 0 [2022-01-17 14:28:25] [config] mini-batch-words: 0 [2022-01-17 14:28:25] [config] mini-batch-words-ref: 0 [2022-01-17 14:28:25] [config] model: model.npz [2022-01-17 14:28:25] [config] multi-loss-type: sum [2022-01-17 14:28:25] [config] multi-node: false [2022-01-17 14:28:25] [config] multi-node-overlap: true [2022-01-17 14:28:25] [config] n-best: false [2022-01-17 14:28:25] [config] no-nccl: false [2022-01-17 14:28:25] [config] no-reload: false [2022-01-17 
14:28:25] [config] no-restore-corpus: false [2022-01-17 14:28:25] [config] normalize: 0.6 [2022-01-17 14:28:25] [config] normalize-gradient: false [2022-01-17 14:28:25] [config] num-devices: 0 [2022-01-17 14:28:25] [config] optimizer: adam [2022-01-17 14:28:25] [config] optimizer-delay: 1 [2022-01-17 14:28:25] [config] optimizer-params: [2022-01-17 14:28:25] [config] - 0.9 [2022-01-17 14:28:25] [config] - 0.98 [2022-01-17 14:28:25] [config] - 1e-09 [2022-01-17 14:28:25] [config] output-omit-bias: false [2022-01-17 14:28:25] [config] overwrite: true [2022-01-17 14:28:25] [config] precision: [2022-01-17 14:28:25] [config] - float32 [2022-01-17 14:28:25] [config] - float32 [2022-01-17 14:28:25] [config] - float32 [2022-01-17 14:28:25] [config] pretrained-model: "" [2022-01-17 14:28:25] [config] quantize-biases: false [2022-01-17 14:28:25] [config] quantize-bits: 0 [2022-01-17 14:28:25] [config] quantize-log-based: false [2022-01-17 14:28:25] [config] quantize-optimization-steps: 0 [2022-01-17 14:28:25] [config] quiet: false [2022-01-17 14:28:25] [config] quiet-translation: false [2022-01-17 14:28:25] [config] relative-paths: false [2022-01-17 14:28:25] [config] right-left: false [2022-01-17 14:28:25] [config] save-freq: 5000 [2022-01-17 14:28:25] [config] seed: 0 [2022-01-17 14:28:25] [config] sentencepiece-alphas: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] sentencepiece-max-lines: 2000000 [2022-01-17 14:28:25] [config] sentencepiece-options: "" [2022-01-17 14:28:25] [config] shuffle: data [2022-01-17 14:28:25] [config] shuffle-in-ram: false [2022-01-17 14:28:25] [config] sigterm: save-and-exit [2022-01-17 14:28:25] [config] skip: false [2022-01-17 14:28:25] [config] sqlite: "" [2022-01-17 14:28:25] [config] sqlite-drop: false [2022-01-17 14:28:25] [config] sync-sgd: false [2022-01-17 14:28:25] [config] tempdir: /tmp [2022-01-17 14:28:25] [config] tied-embeddings: true [2022-01-17 14:28:25] [config] tied-embeddings-all: false [2022-01-17 14:28:25] [config] tied-embeddings-src: false [2022-01-17 14:28:25] [config] train-embedder-rank: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] train-sets: [2022-01-17 14:28:25] [config] - /home/wmi/mt-summit-corpora/train/in.tsv.32000 [2022-01-17 14:28:25] [config] - /home/wmi/mt-summit-corpora/train/expected.tsv.32000 [2022-01-17 14:28:25] [config] transformer-aan-activation: swish [2022-01-17 14:28:25] [config] transformer-aan-depth: 2 [2022-01-17 14:28:25] [config] transformer-aan-nogate: false [2022-01-17 14:28:25] [config] transformer-decoder-autoreg: self-attention [2022-01-17 14:28:25] [config] transformer-depth-scaling: false [2022-01-17 14:28:25] [config] transformer-dim-aan: 2048 [2022-01-17 14:28:25] [config] transformer-dim-ffn: 2048 [2022-01-17 14:28:25] [config] transformer-dropout: 0.1 [2022-01-17 14:28:25] [config] transformer-dropout-attention: 0 [2022-01-17 14:28:25] [config] transformer-dropout-ffn: 0 [2022-01-17 14:28:25] [config] transformer-ffn-activation: swish [2022-01-17 14:28:25] [config] transformer-ffn-depth: 2 [2022-01-17 14:28:25] [config] transformer-guided-alignment-layer: last [2022-01-17 14:28:25] [config] transformer-heads: 8 [2022-01-17 14:28:25] [config] transformer-no-projection: false [2022-01-17 14:28:25] [config] transformer-pool: false [2022-01-17 14:28:25] [config] transformer-postprocess: dan [2022-01-17 14:28:25] [config] transformer-postprocess-emb: d [2022-01-17 14:28:25] [config] transformer-postprocess-top: "" [2022-01-17 14:28:25] [config] transformer-preprocess: "" 
[2022-01-17 14:28:25] [config] transformer-tied-layers: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] transformer-train-position-embeddings: false [2022-01-17 14:28:25] [config] tsv: false [2022-01-17 14:28:25] [config] tsv-fields: 0 [2022-01-17 14:28:25] [config] type: transformer [2022-01-17 14:28:25] [config] ulr: false [2022-01-17 14:28:25] [config] ulr-dim-emb: 0 [2022-01-17 14:28:25] [config] ulr-dropout: 0 [2022-01-17 14:28:25] [config] ulr-keys-vectors: "" [2022-01-17 14:28:25] [config] ulr-query-vectors: "" [2022-01-17 14:28:25] [config] ulr-softmax-temperature: 1 [2022-01-17 14:28:25] [config] ulr-trainable-transformation: false [2022-01-17 14:28:25] [config] unlikelihood-loss: false [2022-01-17 14:28:25] [config] valid-freq: 5000 [2022-01-17 14:28:25] [config] valid-log: "" [2022-01-17 14:28:25] [config] valid-max-length: 1000 [2022-01-17 14:28:25] [config] valid-metrics: [2022-01-17 14:28:25] [config] - cross-entropy [2022-01-17 14:28:25] [config] valid-mini-batch: 32 [2022-01-17 14:28:25] [config] valid-reset-stalled: false [2022-01-17 14:28:25] [config] valid-script-args: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] valid-script-path: "" [2022-01-17 14:28:25] [config] valid-sets: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] valid-translation-output: "" [2022-01-17 14:28:25] [config] vocabs: [2022-01-17 14:28:25] [config] [] [2022-01-17 14:28:25] [config] word-penalty: 0 [2022-01-17 14:28:25] [config] word-scores: false [2022-01-17 14:28:25] [config] workspace: 10000 [2022-01-17 14:28:25] [config] Model is being created with Marian v1.10.0 6f6d4846 2021-02-06 15:35:16 -0800 [2022-01-17 14:28:25] [training] Using single-device training [2022-01-17 14:28:25] [data] No vocabulary files given, trying to find or build based on training data. [2022-01-17 14:28:25] [data] Vocabularies will be built separately for each file. [2022-01-17 14:28:25] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/train/in.tsv.32000 [2022-01-17 14:28:25] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/mt-summit-corpora/train/in.tsv.32000 [2022-01-17 14:28:25] [data] Creating vocabulary /home/wmi/mt-summit-corpora/train/in.tsv.32000.yml from /home/wmi/mt-summit-corpora/train/in.tsv.32000 [2022-01-17 14:28:33] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/train/in.tsv.32000.yml [2022-01-17 14:28:33] [data] Setting vocabulary size for input 0 to 18,703 [2022-01-17 14:28:33] No vocabulary path given; trying to find default vocabulary based on data path /home/wmi/mt-summit-corpora/train/expected.tsv.32000 [2022-01-17 14:28:33] No vocabulary path given; trying to create vocabulary based on data paths /home/wmi/mt-summit-corpora/train/expected.tsv.32000 [2022-01-17 14:28:33] [data] Creating vocabulary /home/wmi/mt-summit-corpora/train/expected.tsv.32000.yml from /home/wmi/mt-summit-corpora/train/expected.tsv.32000 [2022-01-17 14:28:41] [data] Loading vocabulary from JSON/Yaml file /home/wmi/mt-summit-corpora/train/expected.tsv.32000.yml [2022-01-17 14:28:41] [data] Setting vocabulary size for input 1 to 27,729 [2022-01-17 14:28:41] [comm] Compiled without MPI support. 
Running as a single process on s470607-gpu [2022-01-17 14:28:41] [batching] Collecting statistics for batch fitting with step size 10 [2022-01-17 14:28:41] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-17 14:28:41] [logits] Applying loss function for 1 factor(s) [2022-01-17 14:28:41] [memory] Reserving 259 MB, device gpu0 [2022-01-17 14:28:43] [gpu] 16-bit TensorCores enabled for float32 matrix operations [2022-01-17 14:28:43] [memory] Reserving 259 MB, device gpu0 [2022-01-17 14:28:53] [batching] Done. Typical MB size is 9,199 target words [2022-01-17 14:28:53] [memory] Extending reserved space to 10112 MB (device gpu0) [2022-01-17 14:28:53] Training started [2022-01-17 14:28:53] [data] Shuffling data [2022-01-17 14:28:55] [data] Done reading 3,103,819 sentences [2022-01-17 14:29:10] [data] Done shuffling 3,103,819 sentences to temp files [2022-01-17 14:29:11] [memory] Reserving 259 MB, device gpu0 [2022-01-17 14:29:11] [memory] Reserving 259 MB, device gpu0 [2022-01-17 14:29:11] [memory] Reserving 518 MB, device gpu0 [2022-01-17 14:29:11] [memory] Reserving 259 MB, device gpu0 [2022-01-17 14:30:28] Ep. 1 : Up. 500 : Sen. 112,183 : Cost 9.75393009 * 3,315,817 @ 6,804 after 3,315,817 : Time 94.27s : 35172.46 words/s : L.r. 9.3750e-06 [2022-01-17 14:31:45] Ep. 1 : Up. 1000 : Sen. 226,207 : Cost 8.78037071 * 3,301,332 @ 7,600 after 6,617,149 : Time 77.09s : 42824.10 words/s : L.r. 1.8750e-05 [2022-01-17 14:33:01] Ep. 1 : Up. 1500 : Sen. 333,345 : Cost 8.45282364 * 3,230,947 @ 6,045 after 9,848,096 : Time 76.47s : 42251.10 words/s : L.r. 2.8125e-05 [2022-01-17 14:34:19] Ep. 1 : Up. 2000 : Sen. 447,334 : Cost 8.13397980 * 3,314,800 @ 9,540 after 13,162,896 : Time 77.69s : 42668.69 words/s : L.r. 3.7500e-05 [2022-01-17 14:35:37] Ep. 1 : Up. 2500 : Sen. 559,333 : Cost 7.77041626 * 3,308,454 @ 6,374 after 16,471,350 : Time 78.02s : 42405.44 words/s : L.r. 4.6875e-05 [2022-01-17 14:36:54] Ep. 1 : Up. 3000 : Sen. 671,385 : Cost 7.36499834 * 3,279,893 @ 8,930 after 19,751,243 : Time 77.38s : 42386.00 words/s : L.r. 5.6250e-05 [2022-01-17 14:38:13] Ep. 1 : Up. 3500 : Sen. 782,003 : Cost 7.04253387 * 3,350,646 @ 8,610 after 23,101,889 : Time 78.33s : 42776.03 words/s : L.r. 6.5625e-05 [2022-01-17 14:39:30] Ep. 1 : Up. 4000 : Sen. 895,728 : Cost 6.76005077 * 3,280,081 @ 6,688 after 26,381,970 : Time 77.36s : 42399.72 words/s : L.r. 7.5000e-05 [2022-01-17 14:40:47] Ep. 1 : Up. 4500 : Sen. 1,006,825 : Cost 6.52075815 * 3,274,292 @ 8,904 after 29,656,262 : Time 77.27s : 42375.09 words/s : L.r. 8.4375e-05 [2022-01-17 14:42:04] Ep. 1 : Up. 5000 : Sen. 1,117,568 : Cost 6.29208374 * 3,270,540 @ 7,950 after 32,926,802 : Time 77.07s : 42436.39 words/s : L.r. 9.3750e-05 [2022-01-17 14:42:04] Saving model weights and runtime parameters to model.npz.orig.npz [2022-01-17 14:42:05] Saving model weights and runtime parameters to model.npz [2022-01-17 14:42:06] Saving Adam parameters to model.npz.optimizer.npz [2022-01-17 14:43:26] Ep. 1 : Up. 5500 : Sen. 1,229,053 : Cost 6.08049774 * 3,283,922 @ 6,844 after 36,210,724 : Time 81.29s : 40395.28 words/s : L.r. 1.0313e-04 [2022-01-17 14:44:44] Ep. 1 : Up. 6000 : Sen. 1,341,287 : Cost 5.86857510 * 3,331,301 @ 6,475 after 39,542,025 : Time 78.21s : 42593.68 words/s : L.r. 1.1250e-04 [2022-01-17 14:46:02] Ep. 1 : Up. 6500 : Sen. 1,453,894 : Cost 5.67484188 * 3,316,106 @ 6,912 after 42,858,131 : Time 78.19s : 42412.79 words/s : L.r. 1.2188e-04 [2022-01-17 14:47:20] Ep. 1 : Up. 7000 : Sen. 
1,566,550 : Cost 5.44415712 * 3,295,317 @ 5,587 after 46,153,448 : Time 77.57s : 42484.00 words/s : L.r. 1.3125e-04 [2022-01-17 14:48:37] Ep. 1 : Up. 7500 : Sen. 1,679,713 : Cost 5.21833611 * 3,305,387 @ 7,314 after 49,458,835 : Time 77.69s : 42545.45 words/s : L.r. 1.4063e-04 [2022-01-17 14:49:55] Ep. 1 : Up. 8000 : Sen. 1,790,607 : Cost 5.01008797 * 3,303,487 @ 5,952 after 52,762,322 : Time 77.89s : 42412.55 words/s : L.r. 1.5000e-04 [2022-01-17 14:51:13] Ep. 1 : Up. 8500 : Sen. 1,903,180 : Cost 4.78658533 * 3,305,121 @ 6,475 after 56,067,443 : Time 77.74s : 42514.82 words/s : L.r. 1.5938e-04 [2022-01-17 14:52:30] Ep. 1 : Up. 9000 : Sen. 2,012,146 : Cost 4.59634590 * 3,285,628 @ 2,944 after 59,353,071 : Time 77.55s : 42366.37 words/s : L.r. 1.6875e-04 [2022-01-17 14:53:48] Ep. 1 : Up. 9500 : Sen. 2,126,354 : Cost 4.37838125 * 3,299,705 @ 7,770 after 62,652,776 : Time 77.47s : 42590.84 words/s : L.r. 1.7813e-04 [2022-01-17 14:55:06] Ep. 1 : Up. 10000 : Sen. 2,237,464 : Cost 4.18511820 * 3,293,490 @ 6,282 after 65,946,266 : Time 77.79s : 42337.42 words/s : L.r. 1.8750e-04 [2022-01-17 14:55:06] Saving model weights and runtime parameters to model.npz.orig.npz [2022-01-17 14:55:06] Saving model weights and runtime parameters to model.npz [2022-01-17 14:55:07] Saving Adam parameters to model.npz.optimizer.npz [2022-01-17 14:56:27] Ep. 1 : Up. 10500 : Sen. 2,348,017 : Cost 4.03138113 * 3,310,483 @ 5,556 after 69,256,749 : Time 80.94s : 40899.10 words/s : L.r. 1.9688e-04 [2022-01-17 14:57:44] Ep. 1 : Up. 11000 : Sen. 2,459,529 : Cost 3.87746334 * 3,264,732 @ 8,325 after 72,521,481 : Time 77.20s : 42287.32 words/s : L.r. 2.0625e-04 [2022-01-17 14:59:02] Ep. 1 : Up. 11500 : Sen. 2,573,487 : Cost 3.74331832 * 3,306,391 @ 7,400 after 75,827,872 : Time 77.86s : 42466.56 words/s : L.r. 2.1563e-04 [2022-01-17 15:00:20] Ep. 1 : Up. 12000 : Sen. 2,685,390 : Cost 3.64547110 * 3,317,964 @ 6,688 after 79,145,836 : Time 78.15s : 42457.47 words/s : L.r. 2.2500e-04 [2022-01-17 15:01:37] Ep. 1 : Up. 12500 : Sen. 2,797,494 : Cost 3.54736304 * 3,298,276 @ 7,808 after 82,444,112 : Time 77.68s : 42457.41 words/s : L.r. 2.3438e-04 [2022-01-17 15:02:55] Ep. 1 : Up. 13000 : Sen. 2,908,256 : Cost 3.47763109 * 3,295,994 @ 5,773 after 85,740,106 : Time 77.68s : 42433.03 words/s : L.r. 2.4375e-04 [2022-01-17 15:04:12] Ep. 1 : Up. 13500 : Sen. 3,019,771 : Cost 3.40706968 * 3,254,919 @ 4,995 after 88,995,025 : Time 76.75s : 42409.32 words/s : L.r. 2.5313e-04 [2022-01-17 15:04:53] Seen 3078439 samples [2022-01-17 15:04:53] Starting data epoch 2 in logical epoch 2 [2022-01-17 15:04:53] Training finished [2022-01-17 15:04:53] Saving model weights and runtime parameters to model.npz.orig.npz [2022-01-17 15:04:53] Saving model weights and runtime parameters to model.npz [2022-01-17 15:04:54] Saving Adam parameters to model.npz.optimizer.npz
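Each progress line in the log follows the pattern "Ep. E : Up. U : Sen. S : Cost C * L @ B after T : Time ts : W words/s : L.r. R", where L is the number of target labels processed since the previous report and T the running total; the reported throughput is L divided by the window time, and the Cost column, which starts near the natural log of the target vocabulary size and falls as training progresses, is consistent with an average cross-entropy per target label under ce-sum with disp-label-counts. A small sketch, assuming only that field layout, recomputes the throughput from the last full progress line of the 2022-01-17 run:

    import re

    # Field layout assumed from the progress lines printed above.
    line = ("Ep. 1 : Up. 13500 : Sen. 3,019,771 : Cost 3.40706968 * 3,254,919 "
            "@ 4,995 after 88,995,025 : Time 76.75s : 42409.32 words/s : L.r. 2.5313e-04")

    m = re.search(r"Cost ([\d.]+) \* ([\d,]+) @ ([\d,]+) after ([\d,]+) : "
                  r"Time ([\d.]+)s : ([\d.]+) words/s", line)
    cost, window_labels, batch_labels, total_labels, seconds, reported = m.groups()
    window_labels = int(window_labels.replace(",", ""))

    print("labels in window  :", window_labels)                             # 3,254,919
    print("recomputed words/s:", round(window_labels / float(seconds), 2))  # ~42409.4
    print("reported words/s  :", reported)                                  # 42409.32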
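At updates 5000 and 10000, and again when the run stops after one epoch, the trainer writes model.npz together with model.npz.orig.npz and model.npz.optimizer.npz; the extra .orig.npz copy appears because --exponential-smoothing is enabled, so one file carries the smoothed parameters and the other the raw ones (the log itself does not say which is which). Assuming only that these are ordinary NumPy .npz archives, a quick way to see what was saved is:

    import numpy as np

    # Assumes the checkpoint is a standard NumPy .npz archive.
    # Lists every stored array (model weights plus the embedded
    # "runtime parameters" entry mentioned in the save messages).
    ckpt = np.load("model.npz")
    for name in sorted(ckpt.files):
        arr = ckpt[name]
        print(f"{name:50s} {str(arr.dtype):10s} {arr.shape}")

The same inspection works for the .orig.npz and .optimizer.npz files; for actually decoding with the trained model, the two .yml vocabularies created earlier in this log would be needed alongside model.npz.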