webscraper/README.md
paprykdev ba9e42b9c6
docs: update README to correct script paths and improve instructions
2024-11-15 22:40:07 +01:00


Web scraper 🔍

Description

This project is a web scraper designed to extract data from websites.

Features

☑️ Extracts data from web pages

Usage

With Docker

  1. Clone the repository:
git clone https://git.wmi.amu.edu.pl/s500042/webscraper
  2. Navigate to the project directory:
cd webscraper
  3. Build the Docker image and run it using the start.py script:
python scripts/start.py

On macOS, use python3 instead:

python3 scripts/start.py
  4. Check the /app/dist/data.json file to see the extracted data.
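
To check the output file from the last step without opening it by hand, a small helper like the one below can report its top-level shape. This is only a sketch: the function name `summarize_scraped_data` is hypothetical, and nothing is assumed about the structure of data.json beyond it being valid JSON.

```python
import json
from pathlib import Path


def summarize_scraped_data(path):
    """Load a JSON file and describe its top-level shape in one line.

    The real layout of data.json is not documented here, so this only
    distinguishes lists, objects, and single values.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, list):
        return f"list with {len(data)} item(s)"
    if isinstance(data, dict):
        return f"object with {len(data)} key(s)"
    return f"single value of type {type(data).__name__}"
```

For example, `print(summarize_scraped_data("/app/dist/data.json"))` prints a one-line summary, using the path given in the step above. Adjust the path if your output is written somewhere else.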

Without Docker

  1. Clone the repository:
git clone https://git.wmi.amu.edu.pl/s500042/webscraper
  2. Navigate to the project directory:
cd webscraper
  3. Install the required dependencies:
pip install -r app/requirements.txt

If you're on Arch Linux, you'll need to create a virtual environment first. Here is a step-by-step guide that will help you create it.

  4. Run the run_with_no_docker.py script:
python scripts/run_with_no_docker.py

On macOS, use python3 instead:

python3 scripts/run_with_no_docker.py
  5. Check the /app/dist/data.json file to see the extracted data.
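
The virtual environment mentioned in the dependency step can also be created from Python itself with the standard-library `venv` module. The sketch below is one possible way to do it. The helper name `create_virtualenv` and the `.venv` directory name are assumptions, not something this project prescribes.

```python
import venv
from pathlib import Path


def create_virtualenv(target_dir=".venv", with_pip=True):
    """Create a virtual environment in target_dir and return its path.

    ".venv" is only a common convention (an assumption here); any
    directory name works.
    """
    target = Path(target_dir)
    # venv.create builds the environment; with_pip=True also installs pip into it.
    venv.create(target, with_pip=with_pip)
    return target
```

After calling `create_virtualenv()`, activate the environment (on Linux and macOS: `source .venv/bin/activate`) and then run the `pip install -r app/requirements.txt` step inside it.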

License

This project is licensed under the MIT License. See the LICENSE file for details.