
Web scraper 🔍

Description

This project is a web scraper designed to extract data from websites.

Features

☑️ Extracts data from web pages

Usage

With Docker

  1. Clone the repository:
git clone https://git.wmi.amu.edu.pl/s500042/webscraper
  2. Navigate to the project directory:
cd webscraper
  3. Build the Docker image and run it using the start.py script:
python scripts/start.py

On macOS, use python3 instead:

python3 scripts/start.py
  4. Check the /app/dist/data.json file to see the extracted data.
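
To inspect the result of the last step programmatically, a minimal sketch like the one below may help. It assumes the output lands at app/dist/data.json relative to the repository root and is UTF-8 encoded JSON; the structure of the data is not documented here, so the script only reports the top-level shape. The function names load_scraped_data and summarize are illustrative and not part of the project.

```python
import json
from pathlib import Path


def load_scraped_data(path):
    """Read the scraper output as UTF-8 JSON and return the parsed object."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def summarize(data):
    """Describe the top-level shape of the parsed JSON without assuming a schema."""
    if isinstance(data, list):
        return f"list with {len(data)} item(s)"
    if isinstance(data, dict):
        return f"object with {len(data)} key(s)"
    return f"{type(data).__name__} value"


if __name__ == "__main__":
    # Assumed location; adjust if your output is written elsewhere.
    output = Path("app/dist/data.json")
    if output.exists():
        print(summarize(load_scraped_data(output)))
    else:
        print(f"{output} not found; run the scraper first")
```

Run it from the repository root after the scraper has finished. If the file is missing, the script says so instead of raising an error.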

Without Docker

  1. Clone the repository:
git clone https://git.wmi.amu.edu.pl/s500042/webscraper
  2. Navigate to the project directory:
cd webscraper
  3. Install the required dependencies:
pip install -r app/requirements.txt

If you're on Arch Linux, you'll need to create a virtual environment first. Here is a step-by-step guide that will help you create it.

  4. Run the run_with_no_docker.py script:
python scripts/run_with_no_docker.py

On macOS, use python3 instead:

python3 scripts/run_with_no_docker.py
  5. Check the /app/dist/data.json file to see the extracted data.
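
For the virtual environment mentioned above for Arch Linux, the following is one possible sketch using Python's standard venv module rather than the linked guide's exact steps. The directory name .venv and the helper names create_venv and venv_python are assumptions made for illustration; only app/requirements.txt comes from this README.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(env_dir):
    """Return the path to the interpreter inside a virtual environment."""
    env_dir = Path(env_dir)
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def create_venv(env_dir, with_pip=True):
    """Create a virtual environment and return its interpreter path."""
    venv.create(env_dir, with_pip=with_pip)
    return venv_python(env_dir)


if __name__ == "__main__":
    requirements = Path("app/requirements.txt")
    if requirements.exists():
        # ".venv" is an assumed directory name, not one the project prescribes.
        python = create_venv(".venv")
        # Install the dependencies into the new environment only.
        subprocess.run(
            [str(python), "-m", "pip", "install", "-r", str(requirements)],
            check=True,
        )
        print(f"Dependencies installed; run scripts with {python}")
    else:
        print("app/requirements.txt not found; run this from the repository root")
```

After it completes, launch the scraper with the interpreter path it prints, so the script runs against the packages installed in the environment rather than the system Python.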

License

This project is licensed under the MIT License. See the LICENSE file for details.