projekt_widzenie

Run the application

  1. pip install -r requirements.txt
  2. streamlit run main.py
  3. The app should be available at http://localhost:8501/

Dataset

We have a total of 197,784 images.

  • our own self-made test set of 148 images

Links to the datasets:

  1. https://www.kaggle.com/datasets/mrgeislinger/asl-rgb-depth-fingerspelling-spelling-it-out
  2. https://www.kaggle.com/datasets/grassknoted/asl-alphabet
  3. https://www.kaggle.com/datasets/lexset/synthetic-asl-alphabet
  4. https://www.kaggle.com/datasets/kuzivakwashe/significant-asl-sign-language-alphabet-dataset

Model training

The Keras library was used for training.

First approach: a model trained from scratch

from tensorflow import keras
from tensorflow.keras import layers

img_height=256
img_width=256
batch_size=128
epochs=30

model = keras.Sequential([
  layers.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(29, activation='softmax')  # 29 output classes
])
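
The snippet above only defines the architecture. Below is a minimal sketch of how such a model could be compiled and trained; the data directory, optimizer, validation split, and seed are illustrative assumptions, not taken from this repository.

import tensorflow as tf

# Hypothetical directory with one sub-folder per class (placeholder path).
train_ds = tf.keras.utils.image_dataset_from_directory(
  'path/to/training_data',
  validation_split=0.2,
  subset='training',
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)

val_ds = tf.keras.utils.image_dataset_from_directory(
  'path/to/training_data',
  validation_split=0.2,
  subset='validation',
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)

# image_dataset_from_directory yields integer labels, hence sparse categorical cross-entropy.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_ds, validation_data=val_ds, epochs=epochs)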

Own test set: 22% accuracy

Mixed Kaggle test set: 80% accuracy


Second approach: VGG16 model

Early stopping on val_loss was used.

img_height=224
img_width=224
batch_size=128
epochs=50

The top 3 layers were removed and the following layers were added:

# vgg_model is the pretrained VGG16 base; class_names holds the sign-class labels
x = layers.Flatten()(vgg_model.output)
x = layers.Dense(len(class_names), activation='softmax')(x)
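
A minimal sketch of how the VGG16 base, the new head, and the val_loss early stopping could be wired together is shown below; freezing the convolutional base, the optimizer, the patience value, and reusing train_ds/val_ds (rebuilt at 224x224) are illustrative assumptions, not documented here.

from tensorflow.keras import layers, models, callbacks
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

# Pretrained convolutional base; include_top=False drops the fully connected classifier head.
vgg_model = VGG16(weights='imagenet', include_top=False,
                  input_shape=(img_height, img_width, 3))
vgg_model.trainable = False  # assumption: the VGG16 base is kept frozen

# New classification head on top of the VGG16 features.
class_names = train_ds.class_names  # train_ds/val_ds as above, rebuilt with image_size=(224, 224)
x = layers.Flatten()(vgg_model.output)
x = layers.Dense(len(class_names), activation='softmax')(x)
model = models.Model(inputs=vgg_model.input, outputs=x)

# VGG16 expects its own preprocessing rather than a simple 1/255 rescale.
train_vgg = train_ds.map(lambda img, label: (preprocess_input(img), label))
val_vgg = val_ds.map(lambda img, label: (preprocess_input(img), label))

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Stop when validation loss stops improving, keeping the best weights.
early_stop = callbacks.EarlyStopping(monitor='val_loss', patience=3,
                                     restore_best_weights=True)

model.fit(train_vgg, validation_data=val_vgg, epochs=epochs,
          callbacks=[early_stop])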

Own test set: 52% accuracy

Mixed Kaggle test set: 79% accuracy