Web scraper 🔍

Description

This project is a web scraper designed to extract data from websites. It can be customized to scrape various types of data and save it in different formats.

Features

Extracts data from web pages

Installation

Using Docker

Clone the repository:

git clone https://git.wmi.amu.edu.pl/s500042/webscraper

Navigate to the project directory:

cd webscraper

Build the Docker image and run it using the start script:
- On Linux or Mac

./start.sh

- On Windows 🤡

python start.py

This command also works on Linux. On Mac, use python3 instead:

python3 start.py

Without Docker

Clone the repository:

git clone https://git.wmi.amu.edu.pl/s500042/webscraper

Navigate to the project directory:

cd webscraper/app

Install the required dependencies:

pip install -r requirements.txt

If you're on Arch Linux, you'll need to create a virtual environment before installing the dependencies. Here is a step-by-step guide that will help you create it.
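Arch marks its system Python as externally managed, so pip refuses to install packages outside a virtual environment. As a rough sketch of what creating one involves, the standard-library venv module can do it from Python on any platform. The .venv directory name and the helper names create_env and install_requirements are assumptions for illustration, not part of this project.

```python
import subprocess
import sys
import venv
from pathlib import Path


def create_env(env_dir: str = ".venv", with_pip: bool = True) -> Path:
    """Create a virtual environment and return the path to its Python interpreter."""
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    # Windows puts executables in Scripts/, other platforms use bin/.
    on_windows = sys.platform == "win32"
    bin_dir = "Scripts" if on_windows else "bin"
    exe = "python.exe" if on_windows else "python"
    return Path(env_dir) / bin_dir / exe


def install_requirements(python: Path, requirements: str = "requirements.txt") -> None:
    """Install the listed dependencies using the environment's own pip."""
    subprocess.run(
        [str(python), "-m", "pip", "install", "-r", requirements],
        check=True,  # raise if pip exits with an error
    )


# Hypothetical usage, run from the directory that contains requirements.txt:
#   install_requirements(create_env())
```

Calling pip through the environment's own interpreter avoids having to activate the environment first. To run the scraper inside it afterwards, use that same interpreter path instead of the system python.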

Usage

Configure the scraper by editing the config.json file.
Run the scraper:

python scraper.py
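A stray comma or missing quote in config.json is an easy mistake to make while editing, so it can help to check that the file still parses before running the scraper. The sketch below only validates JSON syntax. It makes no assumption about which keys the scraper expects, and the load_config helper and the requirement of a top-level JSON object are assumptions for illustration.

```python
import json
from pathlib import Path


def load_config(path: str = "config.json") -> dict:
    """Parse the config file and fail early with a readable message."""
    try:
        config = json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        raise SystemExit(f"{path} not found; run this from the directory that contains it")
    except json.JSONDecodeError as err:
        # lineno and colno point at the place where parsing stopped.
        raise SystemExit(
            f"{path} is not valid JSON (line {err.lineno}, column {err.colno}): {err.msg}"
        )
    # Assumption: the config is a JSON object rather than a list or a bare value.
    if not isinstance(config, dict):
        raise SystemExit(f"{path} should contain a JSON object at the top level")
    return config
```

Calling load_config() from the directory that holds config.json either returns the parsed settings or exits with the location of the first syntax error.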

License

This project is licensed under the MIT License. See the LICENSE file for details.
