Add YOLOv3 and OCR to detect license plate numbers
BIN  Images/New/IMG_2518.jpeg  (new file, 4.3 MiB)
BIN  Images/New/IMG_2519.jpeg  (new file, 4.2 MiB)
BIN  Images/New/IMG_2521.jpeg  (new file, 4.2 MiB)
BIN  Images/New/IMG_2524.jpeg  (new file, 5.0 MiB)
BIN  Images/New/IMG_2525.jpeg  (new file, 4.0 MiB)
BIN  Images/New/IMG_2526.jpeg  (new file, 3.5 MiB)
BIN  Images/New/IMG_2527.jpeg  (new file, 3.9 MiB)
BIN  Images/New/IMG_2528.jpeg  (new file, 4.9 MiB)
BIN  Images/New/IMG_2529.jpeg  (new file, 4.9 MiB)
BIN  Images/New/IMG_2530.jpeg  (new file, 4.0 MiB)
BIN  Images/New/IMG_2531.jpeg  (new file, 5.0 MiB)
BIN  Images/New/IMG_2532.jpeg  (new file, 4.7 MiB)
BIN  Images/New/IMG_2533.jpeg  (new file, 3.1 MiB)
BIN  Images/New/IMG_2534.jpeg  (new file, 5.3 MiB)
BIN  Images/New/IMG_2537.jpeg  (new file, 4.5 MiB)
BIN  Images/New/IMG_2538.jpeg  (new file, 4.7 MiB)
BIN  Images/New/IMG_2539.jpeg  (new file, 3.0 MiB)
BIN  Images/New/IMG_2540.jpeg  (new file, 2.0 MiB)
BIN  Images/New/IMG_2579.jpeg  (new file, 4.4 MiB)
BIN  Images/New/IMG_2580.jpeg  (new file, 5.0 MiB)
BIN  Images/New/IMG_2581.jpeg  (new file, 4.1 MiB)
BIN  Images/New/IMG_2582.jpeg  (new file, 4.7 MiB)
BIN  Images/New/IMG_2583.jpeg  (new file, 3.9 MiB)
BIN  Images/New/IMG_2585.jpeg  (new file, 4.6 MiB)
BIN  Images/New/IMG_2586.jpeg  (new file, 3.3 MiB)
BIN  Images/New/IMG_2587.jpeg  (new file, 2.9 MiB)
BIN  Images/New/IMG_2588.jpeg  (new file, 5.0 MiB)
BIN  Images/New/IMG_2589.jpeg  (new file, 3.2 MiB)
BIN  Images/New/IMG_2590.jpeg  (new file, 4.0 MiB)
BIN  Images/New/IMG_2591.jpeg  (new file, 4.8 MiB)
BIN  Images/New/IMG_2592.jpeg  (new file, 4.1 MiB)
BIN  Images/New/IMG_2593.jpeg  (new file, 4.3 MiB)
BIN  Images/New/IMG_2594.jpeg  (new file, 4.7 MiB)
BIN  Images/New/IMG_2595.jpeg  (new file, 4.3 MiB)
BIN  Images/New/IMG_2596.jpeg  (new file, 4.6 MiB)
BIN  Images/New/IMG_2598.jpeg  (new file, 5.2 MiB)
BIN  Images/New/IMG_2599.jpeg  (new file, 5.0 MiB)
BIN  Images/Old/IMG_0810.jpg  (new file, 1.8 MiB)
BIN  Images/Old/IMG_1310.JPG  (new file, 2.4 MiB)
BIN  Images/Old/IMG_1511.jpg  (new file, 1.3 MiB)
BIN  Images/Old/IMG_1660.jpeg  (new file, 4.1 MiB)
BIN  Images/Old/IMG_2488.jpeg  (new file, 5.0 MiB)
BIN  Images/Old/IMG_2490.jpeg  (new file, 4.0 MiB)
BIN  Images/Old/IMG_2491.jpeg  (new file, 4.8 MiB)
BIN  Images/Old/IMG_2493.jpeg  (new file, 4.3 MiB)
BIN  Images/Old/IMG_2494.jpeg  (new file, 4.7 MiB)
BIN  Images/Old/IMG_2495.jpeg  (new file, 4.3 MiB)
BIN  Images/Old/IMG_2499.jpeg  (new file, 5.0 MiB)
BIN  Images/Old/IMG_3262.JPG  (new file, 2.1 MiB)
BIN  Images/Old/IMG_3909.JPG  (new file, 3.3 MiB)
BIN  Images/Old/IMG_3968.jpeg  (new file, 4.2 MiB)
BIN  Images/Old/IMG_7088.jpeg  (new file, 3.4 MiB)
BIN  Images/Old/IMG_7101.jpeg  (new file, 3.6 MiB)
BIN  Images/Old/IMG_7823.jpeg  (new file, 3.6 MiB)
BIN  Images/Old/IMG_7991.jpeg  (new file, 1.6 MiB)
BIN  Images/Old/IMG_9387.jpg  (new file, 2.3 MiB)
BIN  Images/Old/IMG_9390.jpg  (new file, 3.4 MiB)
BIN  Images/Old/P8080275.JPG  (new file, 7.8 MiB)
LICENSE  (new file, 21 lines)
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2018 qqwweee

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
README.md  (new file, 25 lines)
@@ -0,0 +1,25 @@
## keras-yolo3 with Roboflow

[](LICENSE)

A Keras implementation of YOLOv3 (Tensorflow backend) inspired by [allanzelener/YAD2K](https://github.com/allanzelener/YAD2K).

## What You Will Learn
* How to load your custom image detection data from Roboflow
* How to set up the YOLOv3 model in Keras
* How to train the YOLOv3 model
* How to use the model for inference
* How to save the Keras model weights for future use

## Resources

* [This blog post](https://blog.roboflow.ai/training-a-yolov3-object-detection-model-with-a-custom-dataset/) provides a deep dive into the tutorial
* [This Colab notebook](https://colab.research.google.com/drive/1ByRi9d6_Yzu0nrEKArmLMLuMaZjYfygO#scrollTo=WgHANbxqWJPa) provides the code necessary to run the tutorial
* For reading purposes, the notebook is also saved in Tutorial.ipynb

## About Roboflow for Data Management

[Roboflow](https://roboflow.ai) makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.
With Roboflow's workflow, developers write roughly 50% less code, automate annotation quality assurance, save training time, and increase model reproducibility.
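To orient the reader before the individual files, a minimal inference sketch is shown below. It is not code from this diff: it assumes the YOLO class from yolo.py and the detect_license_plate helper from yolo_video.py that api.py imports later in this commit, plus the model_data/yolo.h5 weights added here; the image path is just an example from the Images folder.

# Minimal sketch (not part of this diff): detect a plate in one image and print the OCR result.
from yolo import YOLO      # YOLO wrapper from this repo; assumed to load model_data/yolo.h5
import yolo_video          # provides detect_license_plate(), as used by api.py

model = YOLO()
plates = yolo_video.detect_license_plate(model=model,
                                         img_path='./Images/Old/P8080275.JPG')
print(plates)              # recognized plate text, e.g. a short alphanumeric string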
Tutorial.ipynb  (new file, 1804 lines)
api.py  (new file, 40 lines)
@@ -0,0 +1,40 @@
# python -m pip install flask
# export FLASK_APP=api.py
# flask run --without-threads

from flask import Flask, request
from yolo import YOLO
import yolo_video
import base64

app = Flask(__name__)

"""Automatic call while Flask init: load the YOLO model once at startup."""
yolo_model = YOLO()

# def deinit_yolo(yolo):
#     yolo.close_session()

"""API_address/detectLicense?img="""
@app.get("/detectLicense")
def detectLicensePlate():
    # build path
    image_path = request.args['img']
    # decoded_data = base64.b64decode(image_base64)
    # img_file = open('img_to_detect.png', 'wb')
    # img_file.write(decoded_data)
    # img_file.close()
    # P8080275.JPG
    # image_path = 'img_to_detect.png'
    image_path = './Images/' + image_path
    result = yolo_video.detect_license_plate(model=yolo_model, img_path=image_path)

    # Return a placeholder when nothing was detected (the original checked the
    # always-truthy base64 module here and shadowed the built-in str).
    if not result:
        return {
            'str': ["None"],
        }, 200
    else:
        return {
            'str': result,
        }, 200
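Once the Flask app is running, the endpoint can be exercised as sketched below. Host and port are assumptions (Flask's development server defaults to 127.0.0.1:5000), and the img value is resolved against ./Images/ by the handler above.

import requests  # sketch only: assumes `flask run` is serving api.py locally

resp = requests.get('http://127.0.0.1:5000/detectLicense',
                    params={'img': 'Old/P8080275.JPG'})   # resolved to ./Images/Old/P8080275.JPG
print(resp.json())   # e.g. {'str': ['<recognized plate text>']} or {'str': ['None']}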
BIN  base64.png  (new file, 18 MiB)
base64.txt  (new file, 1 line)
coco_annotation.py  (new file, 52 lines)
@@ -0,0 +1,52 @@
import json
from collections import defaultdict

name_box_id = defaultdict(list)
id_name = dict()
f = open(
    "mscoco2017/annotations/instances_train2017.json",
    encoding='utf-8')
data = json.load(f)

annotations = data['annotations']
for ant in annotations:
    id = ant['image_id']
    name = 'mscoco2017/train2017/%012d.jpg' % id
    cat = ant['category_id']

    # Remap sparse COCO category ids (1-90 with gaps) to contiguous 0-79 class indices.
    if cat >= 1 and cat <= 11:
        cat = cat - 1
    elif cat >= 13 and cat <= 25:
        cat = cat - 2
    elif cat >= 27 and cat <= 28:
        cat = cat - 3
    elif cat >= 31 and cat <= 44:
        cat = cat - 5
    elif cat >= 46 and cat <= 65:
        cat = cat - 6
    elif cat == 67:
        cat = cat - 7
    elif cat == 70:
        cat = cat - 9
    elif cat >= 72 and cat <= 82:
        cat = cat - 10
    elif cat >= 84 and cat <= 90:
        cat = cat - 11

    name_box_id[name].append([ant['bbox'], cat])

f = open('train.txt', 'w')
for key in name_box_id.keys():
    f.write(key)
    box_infos = name_box_id[key]
    for info in box_infos:
        x_min = int(info[0][0])
        y_min = int(info[0][1])
        x_max = x_min + int(info[0][2])
        y_max = y_min + int(info[0][3])

        box_info = " %d,%d,%d,%d,%d" % (
            x_min, y_min, x_max, y_max, int(info[1]))
        f.write(box_info)
    f.write('\n')
f.close()
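For reference, each image becomes one line of train.txt: the image path followed by one space-separated box per annotation in x_min,y_min,x_max,y_max,class_id form. The concrete numbers below are purely illustrative, not taken from the dataset.

mscoco2017/train2017/000000000009.jpg 1,187,612,473,45 311,4,631,232,50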
convert.py  (new file, 264 lines)
@@ -0,0 +1,264 @@
#! /usr/bin/env python
"""
Reads Darknet config and weights and creates Keras model with TF backend.

"""

import argparse
import configparser
import io
import os
from collections import defaultdict

import numpy as np
from keras import backend as K
from keras.layers import (Conv2D, Input, ZeroPadding2D, Add,
                          UpSampling2D, MaxPooling2D, Concatenate)
#from keras.layers.advanced_activations import LeakyReLU
from keras.layers import ELU, PReLU, LeakyReLU
#from keras.layers.normalization import BatchNormalization
from keras.layers import BatchNormalization
from keras.models import Model
from keras.regularizers import l2
from keras.utils.vis_utils import plot_model as plot


parser = argparse.ArgumentParser(description='Darknet To Keras Converter.')
parser.add_argument('config_path', help='Path to Darknet cfg file.')
parser.add_argument('weights_path', help='Path to Darknet weights file.')
parser.add_argument('output_path', help='Path to output Keras model file.')
parser.add_argument(
    '-p',
    '--plot_model',
    help='Plot generated Keras model and save as image.',
    action='store_true')
parser.add_argument(
    '-w',
    '--weights_only',
    help='Save as Keras weights file instead of model file.',
    action='store_true')


def unique_config_sections(config_file):
    """Convert all config sections to have unique names.

    Adds unique suffixes to config sections for compatibility with configparser.
    """
    section_counters = defaultdict(int)
    output_stream = io.StringIO()
    with open(config_file) as fin:
        for line in fin:
            if line.startswith('['):
                section = line.strip().strip('[]')
                _section = section + '_' + str(section_counters[section])
                section_counters[section] += 1
                line = line.replace(section, _section)
            output_stream.write(line)
    output_stream.seek(0)
    return output_stream

# %%
def _main(args):
    config_path = os.path.expanduser(args.config_path)
    weights_path = os.path.expanduser(args.weights_path)
    assert config_path.endswith('.cfg'), '{} is not a .cfg file'.format(
        config_path)
    assert weights_path.endswith(
        '.weights'), '{} is not a .weights file'.format(weights_path)

    output_path = os.path.expanduser(args.output_path)
    assert output_path.endswith(
        '.h5'), 'output path {} is not a .h5 file'.format(output_path)
    output_root = os.path.splitext(output_path)[0]

    # Load weights and config.
    print('Loading weights.')
    weights_file = open(weights_path, 'rb')
    major, minor, revision = np.ndarray(
        shape=(3, ), dtype='int32', buffer=weights_file.read(12))
    if (major*10+minor)>=2 and major<1000 and minor<1000:
        seen = np.ndarray(shape=(1,), dtype='int64', buffer=weights_file.read(8))
    else:
        seen = np.ndarray(shape=(1,), dtype='int32', buffer=weights_file.read(4))
    print('Weights Header: ', major, minor, revision, seen)

    print('Parsing Darknet config.')
    unique_config_file = unique_config_sections(config_path)
    cfg_parser = configparser.ConfigParser()
    cfg_parser.read_file(unique_config_file)

    print('Creating Keras model.')
    input_layer = Input(shape=(None, None, 3))
    prev_layer = input_layer
    all_layers = []

    weight_decay = float(cfg_parser['net_0']['decay']
                         ) if 'net_0' in cfg_parser.sections() else 5e-4
    count = 0
    out_index = []
    for section in cfg_parser.sections():
        print('Parsing section {}'.format(section))
        if section.startswith('convolutional'):
            filters = int(cfg_parser[section]['filters'])
            size = int(cfg_parser[section]['size'])
            stride = int(cfg_parser[section]['stride'])
            pad = int(cfg_parser[section]['pad'])
            activation = cfg_parser[section]['activation']
            batch_normalize = 'batch_normalize' in cfg_parser[section]

            padding = 'same' if pad == 1 and stride == 1 else 'valid'

            # Setting weights.
            # Darknet serializes convolutional weights as:
            # [bias/beta, [gamma, mean, variance], conv_weights]
            prev_layer_shape = K.int_shape(prev_layer)

            weights_shape = (size, size, prev_layer_shape[-1], filters)
            darknet_w_shape = (filters, weights_shape[2], size, size)
            weights_size = np.product(weights_shape)

            print('conv2d', 'bn'
                  if batch_normalize else ' ', activation, weights_shape)

            conv_bias = np.ndarray(
                shape=(filters, ),
                dtype='float32',
                buffer=weights_file.read(filters * 4))
            count += filters

            if batch_normalize:
                bn_weights = np.ndarray(
                    shape=(3, filters),
                    dtype='float32',
                    buffer=weights_file.read(filters * 12))
                count += 3 * filters

                bn_weight_list = [
                    bn_weights[0],  # scale gamma
                    conv_bias,  # shift beta
                    bn_weights[1],  # running mean
                    bn_weights[2]  # running var
                ]

            conv_weights = np.ndarray(
                shape=darknet_w_shape,
                dtype='float32',
                buffer=weights_file.read(weights_size * 4))
            count += weights_size

            # DarkNet conv_weights are serialized Caffe-style:
            # (out_dim, in_dim, height, width)
            # We would like to set these to Tensorflow order:
            # (height, width, in_dim, out_dim)
            conv_weights = np.transpose(conv_weights, [2, 3, 1, 0])
            conv_weights = [conv_weights] if batch_normalize else [
                conv_weights, conv_bias
            ]

            # Handle activation.
            act_fn = None
            if activation == 'leaky':
                pass  # Add advanced activation later.
            elif activation != 'linear':
                raise ValueError(
                    'Unknown activation function `{}` in section {}'.format(
                        activation, section))

            # Create Conv2D layer
            if stride>1:
                # Darknet uses left and top padding instead of 'same' mode
                prev_layer = ZeroPadding2D(((1,0),(1,0)))(prev_layer)
            conv_layer = (Conv2D(
                filters, (size, size),
                strides=(stride, stride),
                kernel_regularizer=l2(weight_decay),
                use_bias=not batch_normalize,
                weights=conv_weights,
                activation=act_fn,
                padding=padding))(prev_layer)

            if batch_normalize:
                conv_layer = (BatchNormalization(
                    weights=bn_weight_list))(conv_layer)
            prev_layer = conv_layer

            if activation == 'linear':
                all_layers.append(prev_layer)
            elif activation == 'leaky':
                act_layer = LeakyReLU(alpha=0.1)(prev_layer)
                prev_layer = act_layer
                all_layers.append(act_layer)

        elif section.startswith('route'):
            ids = [int(i) for i in cfg_parser[section]['layers'].split(',')]
            layers = [all_layers[i] for i in ids]
            if len(layers) > 1:
                print('Concatenating route layers:', layers)
                concatenate_layer = Concatenate()(layers)
                all_layers.append(concatenate_layer)
                prev_layer = concatenate_layer
            else:
                skip_layer = layers[0]  # only one layer to route
                all_layers.append(skip_layer)
                prev_layer = skip_layer

        elif section.startswith('maxpool'):
            size = int(cfg_parser[section]['size'])
            stride = int(cfg_parser[section]['stride'])
            all_layers.append(
                MaxPooling2D(
                    pool_size=(size, size),
                    strides=(stride, stride),
                    padding='same')(prev_layer))
            prev_layer = all_layers[-1]

        elif section.startswith('shortcut'):
            index = int(cfg_parser[section]['from'])
            activation = cfg_parser[section]['activation']
            assert activation == 'linear', 'Only linear activation supported.'
            all_layers.append(Add()([all_layers[index], prev_layer]))
            prev_layer = all_layers[-1]

        elif section.startswith('upsample'):
            stride = int(cfg_parser[section]['stride'])
            assert stride == 2, 'Only stride=2 supported.'
            all_layers.append(UpSampling2D(stride)(prev_layer))
            prev_layer = all_layers[-1]

        elif section.startswith('yolo'):
            out_index.append(len(all_layers)-1)
            all_layers.append(None)
            prev_layer = all_layers[-1]

        elif section.startswith('net'):
            pass

        else:
            raise ValueError(
                'Unsupported section header type: {}'.format(section))

    # Create and save model.
    if len(out_index)==0: out_index.append(len(all_layers)-1)
    model = Model(inputs=input_layer, outputs=[all_layers[i] for i in out_index])
    print(model.summary())
    if args.weights_only:
        model.save_weights('{}'.format(output_path))
        print('Saved Keras weights to {}'.format(output_path))
    else:
        model.save('{}'.format(output_path))
        print('Saved Keras model to {}'.format(output_path))

    # Check to see if all weights have been read.
    remaining_weights = len(weights_file.read()) / 4
    weights_file.close()
    print('Read {} of {} from Darknet weights.'.format(count, count +
                                                       remaining_weights))
    if remaining_weights > 0:
        print('Warning: {} unused weights'.format(remaining_weights))

    if args.plot_model:
        plot(model, to_file='{}.png'.format(output_root), show_shapes=True)
        print('Saved model plot to {}.png'.format(output_root))


if __name__ == '__main__':
    _main(parser.parse_args())
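For reference, the converter is typically invoked as sketched below. The .cfg and .weights file names are placeholders (they are not part of this commit), while model_data/yolo.h5 matches the converted weights file added here.

python convert.py yolov3.cfg yolov3.weights model_data/yolo.h5
python convert.py -w yolov3.cfg yolov3.weights model_data/yolo_weights.h5   # save weights only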
darknet53.cfg  (new file, 548 lines)
@@ -0,0 +1,548 @@
|
||||
[net]
|
||||
# Testing
|
||||
batch=1
|
||||
subdivisions=1
|
||||
# Training
|
||||
# batch=64
|
||||
# subdivisions=16
|
||||
width=416
|
||||
height=416
|
||||
channels=3
|
||||
momentum=0.9
|
||||
decay=0.0005
|
||||
angle=0
|
||||
saturation = 1.5
|
||||
exposure = 1.5
|
||||
hue=.1
|
||||
|
||||
learning_rate=0.001
|
||||
burn_in=1000
|
||||
max_batches = 500200
|
||||
policy=steps
|
||||
steps=400000,450000
|
||||
scales=.1,.1
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=32
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
# Downsample
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=64
|
||||
size=3
|
||||
stride=2
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=32
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=64
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
# Downsample
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=128
|
||||
size=3
|
||||
stride=2
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=64
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=128
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=64
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=128
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
# Downsample
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=3
|
||||
stride=2
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=128
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=128
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=128
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=128
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=128
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=128
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=128
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=128
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
# Downsample
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=512
|
||||
size=3
|
||||
stride=2
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=512
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=512
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=512
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=512
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=512
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=512
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=512
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=256
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=512
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
# Downsample
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=1024
|
||||
size=3
|
||||
stride=2
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=512
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=1024
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=512
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=1024
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=512
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=1024
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=512
|
||||
size=1
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[convolutional]
|
||||
batch_normalize=1
|
||||
filters=1024
|
||||
size=3
|
||||
stride=1
|
||||
pad=1
|
||||
activation=leaky
|
||||
|
||||
[shortcut]
|
||||
from=-3
|
||||
activation=linear
|
||||
|
BIN  detected.png  (new file, 13 MiB)
BIN  final/final00.png  (new file, 19 MiB)
BIN  font/FiraMono-Medium.otf  (new file)
font/SIL Open Font License.txt  (new file, 45 lines)
@@ -0,0 +1,45 @@
|
||||
Copyright (c) 2014, Mozilla Foundation https://mozilla.org/ with Reserved Font Name Fira Mono.
|
||||
|
||||
Copyright (c) 2014, Telefonica S.A.
|
||||
|
||||
This Font Software is licensed under the SIL Open Font License, Version 1.1.
|
||||
This license is copied below, and is also available with a FAQ at: http://scripts.sil.org/OFL
|
||||
|
||||
-----------------------------------------------------------
|
||||
SIL OPEN FONT LICENSE Version 1.1 - 26 February 2007
|
||||
-----------------------------------------------------------
|
||||
|
||||
PREAMBLE
|
||||
The goals of the Open Font License (OFL) are to stimulate worldwide development of collaborative font projects, to support the font creation efforts of academic and linguistic communities, and to provide a free and open framework in which fonts may be shared and improved in partnership with others.
|
||||
|
||||
The OFL allows the licensed fonts to be used, studied, modified and redistributed freely as long as they are not sold by themselves. The fonts, including any derivative works, can be bundled, embedded, redistributed and/or sold with any software provided that any reserved names are not used by derivative works. The fonts and derivatives, however, cannot be released under any other type of license. The requirement for fonts to remain under this license does not apply to any document created using the fonts or their derivatives.
|
||||
|
||||
DEFINITIONS
|
||||
"Font Software" refers to the set of files released by the Copyright Holder(s) under this license and clearly marked as such. This may include source files, build scripts and documentation.
|
||||
|
||||
"Reserved Font Name" refers to any names specified as such after the copyright statement(s).
|
||||
|
||||
"Original Version" refers to the collection of Font Software components as distributed by the Copyright Holder(s).
|
||||
|
||||
"Modified Version" refers to any derivative made by adding to, deleting, or substituting -- in part or in whole -- any of the components of the Original Version, by changing formats or by porting the Font Software to a new environment.
|
||||
|
||||
"Author" refers to any designer, engineer, programmer, technical writer or other person who contributed to the Font Software.
|
||||
|
||||
PERMISSION & CONDITIONS
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy of the Font Software, to use, study, copy, merge, embed, modify, redistribute, and sell modified and unmodified copies of the Font Software, subject to the following conditions:
|
||||
|
||||
1) Neither the Font Software nor any of its individual components, in Original or Modified Versions, may be sold by itself.
|
||||
|
||||
2) Original or Modified Versions of the Font Software may be bundled, redistributed and/or sold with any software, provided that each copy contains the above copyright notice and this license. These can be included either as stand-alone text files, human-readable headers or in the appropriate machine-readable metadata fields within text or binary files as long as those fields can be easily viewed by the user.
|
||||
|
||||
3) No Modified Version of the Font Software may use the Reserved Font Name(s) unless explicit written permission is granted by the corresponding Copyright Holder. This restriction only applies to the primary font name as presented to the users.
|
||||
|
||||
4) The name(s) of the Copyright Holder(s) or the Author(s) of the Font Software shall not be used to promote, endorse or advertise any Modified Version, except to acknowledge the contribution(s) of the Copyright Holder(s) and the Author(s) or with their explicit written permission.
|
||||
|
||||
5) The Font Software, modified or unmodified, in part or in whole, must be distributed entirely under this license, and must not be distributed under any other license. The requirement for fonts to remain under this license does not apply to any document created using the Font Software.
|
||||
|
||||
TERMINATION
|
||||
This license becomes null and void if any of the above conditions are not met.
|
||||
|
||||
DISCLAIMER
|
||||
THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF COPYRIGHT, PATENT, TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, INCLUDING ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM OTHER DEALINGS IN THE FONT SOFTWARE.
|
BIN  img/img00.png  (new file, 3.8 KiB)
img_to_detect.png  (new file, empty)
kmeans.py  (new file, 101 lines)
@@ -0,0 +1,101 @@
import numpy as np


class YOLO_Kmeans:

    def __init__(self, cluster_number, filename):
        self.cluster_number = cluster_number
        self.filename = filename  # originally hard-coded to "2012_train.txt", ignoring the argument

    def iou(self, boxes, clusters):  # 1 box -> k clusters
        n = boxes.shape[0]
        k = self.cluster_number

        box_area = boxes[:, 0] * boxes[:, 1]
        box_area = box_area.repeat(k)
        box_area = np.reshape(box_area, (n, k))

        cluster_area = clusters[:, 0] * clusters[:, 1]
        cluster_area = np.tile(cluster_area, [1, n])
        cluster_area = np.reshape(cluster_area, (n, k))

        box_w_matrix = np.reshape(boxes[:, 0].repeat(k), (n, k))
        cluster_w_matrix = np.reshape(np.tile(clusters[:, 0], (1, n)), (n, k))
        min_w_matrix = np.minimum(cluster_w_matrix, box_w_matrix)

        box_h_matrix = np.reshape(boxes[:, 1].repeat(k), (n, k))
        cluster_h_matrix = np.reshape(np.tile(clusters[:, 1], (1, n)), (n, k))
        min_h_matrix = np.minimum(cluster_h_matrix, box_h_matrix)
        inter_area = np.multiply(min_w_matrix, min_h_matrix)

        result = inter_area / (box_area + cluster_area - inter_area)
        return result

    def avg_iou(self, boxes, clusters):
        accuracy = np.mean([np.max(self.iou(boxes, clusters), axis=1)])
        return accuracy

    def kmeans(self, boxes, k, dist=np.median):
        box_number = boxes.shape[0]
        distances = np.empty((box_number, k))
        last_nearest = np.zeros((box_number,))
        np.random.seed()
        clusters = boxes[np.random.choice(
            box_number, k, replace=False)]  # init k clusters
        while True:

            distances = 1 - self.iou(boxes, clusters)

            current_nearest = np.argmin(distances, axis=1)
            if (last_nearest == current_nearest).all():
                break  # clusters won't change
            for cluster in range(k):
                clusters[cluster] = dist(  # update clusters
                    boxes[current_nearest == cluster], axis=0)

            last_nearest = current_nearest

        return clusters

    def result2txt(self, data):
        f = open("yolo_anchors.txt", 'w')
        row = np.shape(data)[0]
        for i in range(row):
            if i == 0:
                x_y = "%d,%d" % (data[i][0], data[i][1])
            else:
                x_y = ", %d,%d" % (data[i][0], data[i][1])
            f.write(x_y)
        f.close()

    def txt2boxes(self):
        f = open(self.filename, 'r')
        dataSet = []
        for line in f:
            infos = line.split(" ")
            length = len(infos)
            for i in range(1, length):
                width = int(infos[i].split(",")[2]) - \
                    int(infos[i].split(",")[0])
                height = int(infos[i].split(",")[3]) - \
                    int(infos[i].split(",")[1])
                dataSet.append([width, height])
        result = np.array(dataSet)
        f.close()
        return result

    def txt2clusters(self):
        all_boxes = self.txt2boxes()
        result = self.kmeans(all_boxes, k=self.cluster_number)
        result = result[np.lexsort(result.T[0, None])]
        self.result2txt(result)
        print("K anchors:\n {}".format(result))
        print("Accuracy: {:.2f}%".format(
            self.avg_iou(all_boxes, result) * 100))


if __name__ == "__main__":
    cluster_number = 9
    filename = "2012_train.txt"
    kmeans = YOLO_Kmeans(cluster_number, filename)
    kmeans.txt2clusters()
model_data/coco_classes.txt  (new file, 80 lines)
@@ -0,0 +1,80 @@
person
bicycle
car
motorbike
aeroplane
bus
train
truck
boat
traffic light
fire hydrant
stop sign
parking meter
bench
bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe
backpack
umbrella
handbag
tie
suitcase
frisbee
skis
snowboard
sports ball
kite
baseball bat
baseball glove
skateboard
surfboard
tennis racket
bottle
wine glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
chair
sofa
pottedplant
bed
diningtable
toilet
tvmonitor
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
book
clock
vase
scissors
teddy bear
hair drier
toothbrush
model_data/tiny_yolo_anchors.txt  (new file, 1 line)
@@ -0,0 +1 @@
10,14, 23,27, 37,58, 81,82, 135,169, 344,319
model_data/voc_classes.txt  (new file, 20 lines)
@@ -0,0 +1,20 @@
aeroplane
bicycle
bird
boat
bottle
bus
car
cat
chair
cow
diningtable
dog
horse
motorbike
person
pottedplant
sheep
sofa
train
tvmonitor
BIN  model_data/yolo.h5  (new file)
model_data/yolo_anchors.txt  (new file, 1 line)
@@ -0,0 +1 @@
10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
ocr.py  (new file, 42 lines)
@@ -0,0 +1,42 @@
import easyocr
import cv2 as cv
import keras_ocr
import pytesseract


def keras_ocr_func():
    pipeline = keras_ocr.pipeline.Pipeline()
    images = [
        keras_ocr.tools.read(img) for img in ['img0.png', ]
    ]
    prediction_groups = pipeline.recognize(images)
    car_numbers = ''

    try:
        for i in prediction_groups[0]:
            car_numbers += i[0]
    except IndexError:  # nothing recognized
        print('no detection')

    return car_numbers


def tesseract_ocr():
    img = cv.imread('img0.png')
    res = pytesseract.image_to_string(img,
                                      lang='eng',
                                      config='--oem 3 --psm 6 -c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789')
    return res


def get_text_from_image(img_path, cut=7):
    text = ''
    image = cv.imread(img_path)

    reader = easyocr.Reader(['en'])
    ocr_result = reader.readtext(image, paragraph=True, min_size=120,  # 180 for rgb
                                 allowlist='0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ')
    try:
        text = ocr_result[0][1].replace(' ', '')[:cut]  # cut to 7 symbols
    except IndexError:  # empty OCR result
        print('too few symbols')

    return text
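A minimal usage sketch for the easyocr path (not from the diff): it assumes a cropped plate image named img0.png, the same file name the other helpers in ocr.py reference; any cropped plate image path would work.

# Sketch: OCR a cropped plate image with the easyocr-based helper above.
from ocr import get_text_from_image

plate_text = get_text_from_image('img0.png', cut=7)  # keeps at most 7 characters
print(plate_text or 'no text recognized')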
BIN  temp/boxes.jpg  (new file, 60 KiB)
BIN  temp/bw_image.jpg  (new file, 46 KiB)
BIN  temp/dilated_image.jpg  (new file, 39 KiB)
BIN  temp/eroded_image.jpg  (new file, 39 KiB)
BIN  temp/gray.jpg  (new file, 60 KiB)
BIN  temp/image_with_border.jpg  (new file, 16 KiB)
BIN  temp/inverted.jpg  (new file, 66 KiB)
BIN  temp/no_borders.jpg  (new file, 31 KiB)
BIN  temp/no_noise.jpg  (new file, 39 KiB)
BIN  temp/rotated_fixed.jpg  (new file, 19 KiB)
BIN  test/Cars102_png.rf.fc64602e11b511c8180d609d3e6cf5c0.jpg  (new file, 38 KiB)
BIN  test/Cars104_png.rf.fc9e9b6314ae71cb84c7b09110790410.jpg  (new file, 74 KiB)
BIN  test/Cars116_png.rf.3c38d65e8790c40a21ef2253048eba25.jpg  (new file, 52 KiB)
BIN  test/Cars124_png.rf.1b9e1ce259db991ece5363ed977a9fa2.jpg  (new file, 74 KiB)
BIN  test/Cars132_png.rf.88ef642a659593fa186905914fe931c2.jpg  (new file, 72 KiB)
BIN  test/Cars156_png.rf.6b85deb9ddc25509114a6fb7838805dd.jpg  (new file, 46 KiB)
BIN  test/Cars160_png.rf.7f2eeefb9439579d3378ce8503054130.jpg  (new file, 50 KiB)
BIN  test/Cars164_png.rf.23d460e3a32949ccf6eeb737c33c8c81.jpg  (new file, 50 KiB)
BIN  test/Cars185_png.rf.40422e05ce82446efa90595fd8cbaa2e 2.jpg  (new file, 58 KiB)
BIN  test/Cars185_png.rf.40422e05ce82446efa90595fd8cbaa2e.jpg  (new file, 58 KiB)