Add files
README.md
@@ -0,0 +1,25 @@
HANOI challenge
This challenge is based on the HANOI corpus, described in detail below. It is a binary classification task: the aim is to classify interpreter notes as written by either a trainee or a professional. The training split consists of 988 examples in the form of scans of interpreter notes, 786 of them made by professionals and 202 by trainees (university students taking an interpreting course).
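Because the two classes are unevenly represented, it helps to know how well a trivial classifier would score before judging any model. The sketch below works through that arithmetic from the counts stated above. It assumes that the test split has a similar class balance, which the README does not state, and the variable names are illustrative only.

```python
# Class counts for the training split, as stated in the README.
n_pro = 786
n_trainee = 202
n_total = n_pro + n_trainee  # 988 examples in total

# Accuracy of a trivial classifier that always answers "pro".
# Assumption: the evaluation data has roughly the same class balance.
majority_baseline = n_pro / n_total

print(f"total examples: {n_total}")
print(f"share of trainee notes: {n_trainee / n_total:.3f}")
print(f"majority-class baseline accuracy: {majority_baseline:.3f}")
```

Under that assumption, a model is only informative if its accuracy clearly exceeds this baseline of roughly 0.80.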
HANOI, or Handwritten Notation of Interpreters, is a corpus of handwritten notes for consecutive interpreting, collected from professional interpreters and interpreting students. It is the only resource of its kind in the world.
Interpreting is the act of translating spoken language. Professional interpreters are needed, for example, to translate discussions between international guests speaking their native languages at a conference. There are several types of interpreting, one of which is consecutive interpreting. In this mode, the interpreter waits for the speaker to finish the whole speech before starting to interpret. Because such speeches can last up to 20 minutes, interpreters rely on handwritten notes to convey the content of the original accurately. The interpreter listens to the source language and, at the same time, notes down selected content in order to remember it and later recreate it in the target language.
There are rules for note-taking. The writing should be sparse and diagonal, using abbreviations, acronyms, and symbols. Interpreters often take notes in two or more languages at the same time. The resulting specialized multilingual text, the so-called semi-product of interpreting, serves a unique function: supporting short-term memory during interpretation. Developing note-taking skills for interpreting is a process that starts at university with a course in notation and continues throughout an interpreter's entire career. Every interpreter's notation style is different, and it is virtually impossible to read someone else's notes.
The notes of consecutive interpreters constitute a unique type of handwritten text, quite unlike the notes people use for everyday tasks, school, and work. Interestingly, the notes of professional interpreters and of those who are new to the skill are also different. The notation of interpreting trainees is more reminiscent of 'traditional' notes: there are grammatically correct sentences and multi-syllable words, pages are densely written, and there are no symbols, abbreviations, or distinctive lines that would divide the speech into separate ideas.
(Description adapted from https://hanoi.amu.edu.pl/)
Given these distinctive characteristics of interpreter notes, and the differences between notes made by trainees and notes made by professionals, an interesting question arises: could a machine learning model reliably identify the interpreting experience of a note's author across several hundred examples? Take part in the challenge and prove that the answer can be 'yes'!
Metric: accuracy
Labels: trainee, pro
Dataset authors: https://csi.amu.edu.pl/zespoly/zespol-lingwistyki-diachronicznej
License: CC BY-NC 4.0
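For reference, the accuracy metric named above can be sketched for the two labels `trainee` and `pro` as follows. This is a minimal illustration, not the challenge's official scorer: the function name, the input format (one label per example, in matching order), and the toy labels are all assumptions.

```python
from typing import Sequence

LABELS = ("trainee", "pro")  # the label set listed in the README


def accuracy(expected: Sequence[str], predicted: Sequence[str]) -> float:
    """Return the fraction of predictions that match the expected labels."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty set")
    unknown = set(predicted) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    correct = sum(e == p for e, p in zip(expected, predicted))
    return correct / len(expected)


# Toy example with made-up labels (not taken from the dataset).
gold = ["pro", "pro", "trainee", "pro"]
pred = ["pro", "trainee", "trainee", "pro"]
print(accuracy(gold, pred))  # 0.75
```

Rejecting unknown labels up front is a design choice of this sketch; it surfaces typos such as `professional` instead of silently counting them as errors.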
HANOI is part of the Digital Research Infrastructure for the Humanities and Arts DARIAH-PL, funded from the Intelligent Development Operational Programme, Polish National Centre for Research and Development, ID: POIR.04.02.00-00-D006/20.
BIN test/images/014fd08927785b2056411ddf4e54c965.png (440 KiB)
BIN test/images/051b3f8ec0508da2b46984147123b827.png (359 KiB)
BIN test/images/06a1ce76aa4b62747b04e7b50183762c.png (569 KiB)
BIN test/images/094a80c8d2c906e81312741d267fd707.png (1.8 MiB)
BIN test/images/0a5ed6de4998edf794e8be1085784de4.png (1.9 MiB)
BIN test/images/0bc1a1a414cc7d698db0fbeb840a1577.png (1007 KiB)
BIN test/images/0d4e0b45e6f5872d98b8d539eeb8b491.png (1.2 MiB)
BIN test/images/0f60bf67955c1d24b922b18318031229.png (426 KiB)
BIN test/images/11eaece58eb87c302761db1933fa66ef.png (125 KiB)
BIN test/images/13038cd679c390f8329cb9a84e6201e1.png (494 KiB)
BIN test/images/130bbb2c38d6e39103ca8524a7481728.png (527 KiB)
BIN test/images/13ac946de4b886bd7cac88ec5a269495.png (1.9 MiB)
BIN test/images/1643205598fc6d2d5d9ab98f7c31a296.png (369 KiB)
BIN test/images/17e9d206de543677e9b50acbe5f074f6.png (2.0 MiB)
BIN test/images/1a04c7901921a4a27af41f07d25b57d3.png (1.3 MiB)
BIN test/images/1a9112ff3b933ec861c23abf628cbfc8.png (453 KiB)
BIN test/images/1c82c9e95ebb6e90768ef1534a28a3bd.png (413 KiB)
BIN test/images/1f4dbafbeefac7eb31ba976612899105.png (330 KiB)
BIN test/images/218330e412cf8e9f484337d9f76feac6.png (1.1 MiB)
BIN test/images/21c24fcfa244a5c767c211eeb5d7c8de.png (1.7 MiB)
BIN test/images/22ae0572f746709b72635a7b428c4c72.png (367 KiB)
BIN test/images/23e121dd7bf58f1d995eac8ec7b4e13b.png (601 KiB)
BIN test/images/241439383fb65ba2166377992aa03bab.png (1.1 MiB)
BIN test/images/24cbbae26bed47353f84f6264f6c9e17.png (761 KiB)
BIN test/images/25a80b6d45881ae086353bd460e82fe3.png (1.8 MiB)
BIN test/images/25f7163e087c46a76995e42e81e5020f.png (327 KiB)
BIN test/images/26aa0be7b5df887aad5a27256d6681cc.png (934 KiB)
BIN test/images/27476085334fdb0f5c29c6eabc656361.png (1.7 MiB)
BIN test/images/27911ec971b8ff785316738fb296c331.png (472 KiB)
BIN test/images/295904919a86e929f4520a985a27a1ea.png (1.6 MiB)
BIN test/images/29bc37ac4404f77d5246e288c8a48f6e.png (1.7 MiB)
BIN test/images/2a56f7dab9dd04922613d9a1f5490359.png (1.7 MiB)
BIN test/images/2c23cc79c862f02d1d26e8f35013c97f.png (1.0 MiB)
BIN test/images/2cab17553c4f407eb7ddc3585bd485a9.png (2.0 MiB)
BIN test/images/2e3b3f6d93ed78b6cef7843d8bad8a40.png (609 KiB)
BIN test/images/2f06045f645498492732b7cb0d66c09a.png (651 KiB)
BIN test/images/31c8eedaa5d60d6ec47dd118dc5ac2ed.png (1.1 MiB)
BIN test/images/3953b3b36c6a98c4943895bf9e740ad0.png (1.8 MiB)
BIN test/images/3bda278501e85ca6c7f76bd756bef382.png (1.9 MiB)
BIN test/images/3c69b19e112c83f0c20a63a19d6b4624.png (1.4 MiB)
BIN test/images/3d91d8d07f7a302f0b1a798d27d85a37.png (96 KiB)
BIN test/images/4041373cb8b2a380f7046d002329cb58.png (576 KiB)
BIN test/images/4069973024822155c65e4649b096d8ac.png (1.0 MiB)
BIN test/images/41b9511fa6a181f9cc20c4cfb9eb61fc.png (2.8 MiB)
BIN test/images/43659f75793f624c25fd6fa0273f1292.png (332 KiB)
BIN test/images/44f19eaacc091f539036d17e9f89ea1f.png (498 KiB)
BIN test/images/47c140f36ae0ff068484ff4da83be2b2.png (1.3 MiB)
BIN test/images/487f25fa31e3caeb02f96995977f5767.png (2.1 MiB)
BIN test/images/49bae58d5b70141dd22da13b5023c738.png (693 KiB)
BIN test/images/4a0f5296dbedfe8cbe106045f85407a0.png (811 KiB)
BIN test/images/4b055c586ad5eca1f36f2bb419b1d9d1.png (410 KiB)
BIN test/images/535357dad2402c50a2f6cfcf80d2f46e.png (2.0 MiB)
BIN test/images/5353fa0016c1f81daba4d83d35604812.png (342 KiB)
BIN test/images/54f5f9193225113f3f1079255f66b362.png (350 KiB)
BIN test/images/560b08a062d8007ace58d4cf2b6e44e5.png (2.7 MiB)
BIN test/images/579e8783a62cdebf32cbeac9bca2bd2d.png (1.3 MiB)
BIN test/images/58f145bdcf0222fe479c7093cb7fdb96.png (444 KiB)
BIN test/images/59cd1bcea094c15cd1b7fdf582e01778.png (401 KiB)
BIN test/images/5a057cc97e801d4e943068ca933c16a3.png (836 KiB)
BIN test/images/5a4674a7eea45e501e1feedd3f394e5d.png (348 KiB)
BIN test/images/5a5226a2e1f55c26ca132ffe8ed726cd.png (1.7 MiB)
BIN test/images/5a83433cc315eb27e82837cf49ba8a1a.png (828 KiB)
BIN test/images/5b3394003042008c784b2355b8238b30.png (2.3 MiB)
BIN test/images/5b85ecc407a35a5d5f28c5be00cf0236.png (1.6 MiB)
BIN test/images/5c86ce8683ae5121de7be6f3d1fa519e.png (1.1 MiB)
BIN test/images/5cbf132ec9365b601a53fdcbdb93ae2b.png (512 KiB)
BIN test/images/5cd469e8d2c02282d2596484fc80ebd7.png (373 KiB)
BIN test/images/5f9a3e649f6d669632fa80a3abadcb96.png (553 KiB)
BIN test/images/61ca03ac388a78809cf31a2eeedccfda.png (1.2 MiB)
BIN test/images/6257ce56eb1eb610ac92a6da04c78d45.png (1.5 MiB)
BIN test/images/63bd6752b8bae43e2b6fa5de3a404aee.png (2.0 MiB)
BIN test/images/64fe7a86a9faf88a1b0e1fb72aaec9f9.png (482 KiB)
BIN test/images/65035ae776d1c76fbd0401f72d801356.png (1.9 MiB)
BIN test/images/65178336b7b85f2be9cb145b0abf2c8c.png (1.5 MiB)
BIN test/images/653764a4f8938ac40f35b7b433a94b85.png (1.7 MiB)
BIN test/images/6864c3be1f8fe81634a61460a22263fc.png (634 KiB)
BIN test/images/69115ccfe023c05e45c51877663df0b0.png (362 KiB)
BIN test/images/69c67d59c305c5366f95403dbb614d74.png (1.7 MiB)
BIN test/images/6be7ce1ebfba0db2eec3796b3ccd11eb.png (193 KiB)
BIN test/images/6c545ba60aa8cb81b09d9b9e547c55e1.png (797 KiB)
BIN test/images/6d79a316417751256c97ddef290bcc17.png (1.7 MiB)
BIN test/images/718cba757475c5cb36e151a6a44754fb.png (580 KiB)
BIN test/images/719ece743298b89647004b3aaf9cd020.png (587 KiB)
BIN test/images/726049ff3fb23d35c65d72c736770bbe.png (715 KiB)
BIN test/images/72a2a4aa8fb59c012a3893058229f47b.png (1.5 MiB)
BIN test/images/73721c6592232ad47cce540140bef537.png (919 KiB)
BIN test/images/73d480046bebdd2d57e8f6eb7fa5a650.png (334 KiB)
BIN test/images/75164ca455a533b12e6a6408d424f381.png (1015 KiB)
BIN test/images/7669fe655c325346980d7c84f9f24638.png (1.9 MiB)
BIN test/images/76d70e9b6037848af7e898953e7565b1.png (799 KiB)
BIN test/images/773d45da7ed1562e0afa6740f6c83d67.png (481 KiB)
BIN test/images/7790f82c149092ec3bd04b9ae83ce003.png (557 KiB)
BIN test/images/77a731f54482e209800ed63f42dddbba.png (2.9 MiB)
BIN test/images/77fa516e93655ea6381fb114b01320a9.png (1.3 MiB)
BIN test/images/783ae486e17b6a7202ec1e0b80174bf5.png (598 KiB)
BIN test/images/7844e280fd5ca7423f68abcd648cfb0b.png (564 KiB)
BIN test/images/78988200024c57bf17f36126db8267a7.png (418 KiB)
BIN test/images/7bf1286bcb75feaf6df76d6bdf594187.png (661 KiB)
BIN test/images/7d277423e4731fdb5a82c32895e46f59.png (514 KiB)