Metadata-Version: 2.1
Name: python-mnist
Version: 0.7
Summary: Simple MNIST and EMNIST data parser written in pure Python
Home-page: https://github.com/sorki/python-mnist
Author: Richard Marko
Author-email: srk@48.io
License: BSD
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python

python-mnist
============

Simple MNIST and EMNIST data parser written in pure Python.

MNIST is a database of handwritten digits available at
http://yann.lecun.com/exdb/mnist/. EMNIST is an extended MNIST database,
described at https://www.nist.gov/itl/iad/image-group/emnist-dataset.

Requirements
------------

- Python 2 or Python 3

Usage
-----

- ``git clone https://github.com/sorki/python-mnist``

- ``cd python-mnist``

- Get MNIST data:

::

    ./bin/mnist_get_data.sh

- Check preview with:

::

    PYTHONPATH=. ./bin/mnist_preview

Installation
------------

Get the package from PyPI:

::

    pip install python-mnist

or install with ``setup.py``:

::

    python setup.py install

Code sample:

::

    from mnist import MNIST

    # Point the loader at the directory that holds the MNIST data files.
    mndata = MNIST('./dir_with_mnist_data_files')
    images, labels = mndata.load_training()

To enable loading of gzipped files, set:

::

    mndata.gz = True

The library tries to load files named ``t10k-images-idx3-ubyte``,
``train-labels-idx1-ubyte``, ``train-images-idx3-ubyte`` and
``t10k-labels-idx1-ubyte``. If loading throws an exception, check that
your file names match these.

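A quick way to check the file names before loading is sketched below.
The helper ``missing_mnist_files`` is hypothetical and not part of
python-mnist. It assumes that gzipped files carry the same names with a
``.gz`` suffix, which you should verify against your own data directory:

::

    import os

    # The four file names the loader expects, as listed in this README.
    MNIST_FILES = (
        'train-images-idx3-ubyte',
        'train-labels-idx1-ubyte',
        't10k-images-idx3-ubyte',
        't10k-labels-idx1-ubyte',
    )


    def missing_mnist_files(path, gz=False):
        """Return the expected file names that are not present in ``path``."""
        suffix = '.gz' if gz else ''  # assumption: gzipped files end in .gz
        expected = [name + suffix for name in MNIST_FILES]
        return [name for name in expected
                if not os.path.isfile(os.path.join(path, name))]

An empty list means all four files were found. Any names returned are
the ones to rename or download again.
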
EMNIST
------

- Get EMNIST data:

::

    ./bin/emnist_get_data.sh

- Check preview with:

::

    PYTHONPATH=. ./bin/emnist_preview

To use EMNIST datasets you need to call:

::

    mndata.select_emnist('digits')

Here ``digits`` is one of the available EMNIST datasets. You can choose
from:

- balanced
- byclass
- bymerge
- digits
- letters
- mnist

The EMNIST loader uses gzipped files by default. This can be disabled by
setting:

::

    mndata.gz = False

If you disable it, you also need to unpack the EMNIST files yourself, as
the ``bin/emnist_get_data.sh`` script won't do it for you. The EMNIST
loader also needs to mirror and rotate images, so it is a bit slower. If
this is an issue for you, you should repack the data to avoid mirroring
and rotation on each load.

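The mirror-and-rotate step can be sketched as follows. This is an
illustration, not the loader's actual code. It assumes each image is a
flat, row-major list of pixels and that the stored orientation is the
transpose of the upright image (a horizontal mirror followed by a
90-degree counter-clockwise rotation is equivalent to a transpose).
``transpose_flat`` is a hypothetical helper name:

::

    def transpose_flat(img, rows=28, cols=28):
        """Transpose a flat, row-major image of rows * cols pixels."""
        # Output pixel (c, r) takes the value of input pixel (r, c).
        return [img[r * cols + c] for c in range(cols) for r in range(rows)]

Applying such a function once and saving the result is one possible way
to repack the data so that the work is not repeated on every load.
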
Notes
-----

This package doesn't use numpy by design: when I tried to find a working
implementation, all of them were based on some archaic version of numpy
and none of them worked. This package loads the data files with
``struct.unpack`` instead.

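To illustrate the ``struct.unpack`` approach, the sketch below parses a
label file that has already been read into memory. It is not the
package's own code. It assumes the IDX layout used by the MNIST label
files: a big-endian 32-bit magic number (2049), a big-endian 32-bit item
count, then one unsigned byte per label. ``parse_labels`` is a
hypothetical name:

::

    import struct


    def parse_labels(data):
        """Parse the bytes of an IDX label file into a list of ints."""
        # Header: two big-endian unsigned 32-bit integers.
        magic, size = struct.unpack('>II', data[:8])
        if magic != 2049:
            raise ValueError('Magic number mismatch: expected 2049, got %d'
                             % magic)
        # Body: ``size`` unsigned bytes, one per label.
        return list(struct.unpack('>%dB' % size, data[8:8 + size]))

Image files follow the same idea with a longer header (magic number
2051, then the item count, row count and column count).
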
Example
-------

::

    $ PYTHONPATH=. ./bin/mnist_preview
    Showing num: 3

    ............................
    ............................
    ............................
    ............................
    ............................
    ............................
    .............@@@@@..........
    ..........@@@@@@@@@@........
    .......@@@@@@......@@.......
    .......@@@........@@@.......
    .................@@.........
    ................@@@.........
    ...............@@@@@........
    .............@@@............
    .............@.......@......
    .....................@......
    .....................@@.....
    ....................@@......
    ...................@@@......
    .................@@@@.......
    ................@@@@........
    ....@........@@@@@..........
    ....@@@@@@@@@@@@............
    ......@@@@@@................
    ............................
    ............................
    ............................
    ............................

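A preview like the one above can be produced from a flat list of pixel
values with a few lines of pure Python. The sketch below is an
illustration rather than the preview script itself: ``render_ascii`` is
a hypothetical name, and the width of 28 and the threshold of 200 are
assumed defaults:

::

    def render_ascii(img, width=28, threshold=200):
        """Render a flat list of 0-255 pixel values as rows of '@' and '.'."""
        rows = []
        for start in range(0, len(img), width):
            row = img[start:start + width]
            # Pixels brighter than the threshold are drawn as '@'.
            rows.append(''.join('@' if px > threshold else '.' for px in row))
        return '\n'.join(rows)

With images loaded as in the code sample, ``print(render_ascii(images[0]))``
would print one digit, assuming each image is a flat list of 784 values.
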