# Amazon Products (Japanese)

This challenge requires extracting the product category from a product description.
The data is taken from Amazon Japan and consists of over 8,000 product offers.
It was scraped using a simple Python bot. Most of the product descriptions contain
the category as a substring somewhere in the text (or, alternatively, some synonym of the category).
There is also no predefined set of all possible categories. Hence this task is NOT about
sequence classification.

Scripts used for generating this dataset can be found here:
[https://github.com/aleksander-mendoza/MachineLearningMiniprojects/blob/master/amazon_products/scraper.py](https://github.com/aleksander-mendoza/MachineLearningMiniprojects/blob/master/amazon_products/scraper.py)
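
Since the category usually appears verbatim inside the description, one possible way to frame the task is as span extraction rather than classification. The sketch below is only an illustration and is not taken from the scraper: the function name `find_category_span` and the sample strings are hypothetical, and it assumes the category occurs as an exact substring. It returns `None` when the description contains only a synonym.

```python
from typing import Optional, Tuple


def find_category_span(description: str, category: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) character offsets of `category` inside `description`.

    Returns None when the category is empty or does not occur verbatim,
    which is the case when the description uses a synonym instead.
    """
    if not category:
        return None
    start = description.find(category)
    if start == -1:
        return None
    return start, start + len(category)


# Made-up example strings (assumption: not taken from the dataset).
description = "パナソニック 電気ケトル 0.8L ホワイト"
category = "電気ケトル"

span = find_category_span(description, category)
if span is not None:
    start, end = span
    # The slice recovers the category text from the description.
    print(description[start:end])
```

Character offsets like these could serve as training targets for a model that points at a span in the text, which does not require a fixed list of categories.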