Web scraper 🔍

Description

This project is a web scraper designed to extract data from websites. It can be customized to scrape various types of data and save it in different formats.

Features

Extracts data from web pages

Installation

Using Docker

Clone the repository:

git clone https://git.wmi.amu.edu.pl/s500042/webscraper

Navigate to the project directory:

cd webscraper

Build the Docker image and run it using the start script:
- On Linux or Mac

./start.sh

- On Windows 🤡

python start.py

The same command also works on Linux. On Mac, the interpreter is called python3, so use

python3 start.py
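Since the interpreter name differs between platforms (`python` on Windows and most Linux setups, `python3` on Mac), a small helper can pick whichever one is available on `PATH`. This is only a minimal sketch: `find_python` and `start_command` are hypothetical names used for illustration and are not part of this repository.

```python
import shutil
import sys


def find_python(candidates=("python3", "python")):
    """Return the first interpreter name found on PATH.

    Falls back to the interpreter running this script if none is found.
    """
    for name in candidates:
        if shutil.which(name):
            return name
    return sys.executable


def start_command(script="start.py"):
    """Build the argument list that launches the start script."""
    return [find_python(), script]


# Show the command that would be used on this machine (nothing is executed).
print(" ".join(start_command()))
```

To launch the script for real, the resulting list could be passed to `subprocess.run(start_command(), check=True)`.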

Without Docker

Clone the repository:

git clone https://git.wmi.amu.edu.pl/s500042/webscraper

Navigate to the project directory:

cd webscraper/app

Install the required dependencies:

pip install -r requirements.txt

If you're on Arch Linux, you'll need to create a virtual environment first. Here is a step-by-step guide that will help you create it.
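As a rough illustration of the virtual environment step, the sketch below uses Python's standard `venv` module. The directory name `.venv` and the helper names `create_env`, `pip_path`, and `install_requirements` are assumptions made for this example; they do not come from the project.

```python
import os
import subprocess
import venv
from pathlib import Path


def create_env(env_dir=".venv", with_pip=True):
    """Create a virtual environment in env_dir and return its path."""
    env_path = Path(env_dir)
    venv.create(env_path, with_pip=with_pip)
    return env_path


def pip_path(env_dir=".venv"):
    """Return where pip lives inside the environment on the current OS."""
    if os.name == "nt":
        return Path(env_dir) / "Scripts" / "pip.exe"
    return Path(env_dir) / "bin" / "pip"


def install_requirements(env_dir=".venv", requirements="requirements.txt"):
    """Install the dependencies with the environment's own pip."""
    subprocess.run(
        [str(pip_path(env_dir)), "install", "-r", requirements],
        check=True,
    )
```

Calling `create_env()` followed by `install_requirements()` from the `app` directory would roughly correspond to creating the environment by hand and then running `pip install -r requirements.txt` inside it.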

Usage

Configure the scraper by editing the config.json file.
Run the scraper:

python scraper.py
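Before running the scraper, it can help to check that `config.json` still parses after editing. The sketch below only verifies that the file is valid JSON with an object at the top level, which is an assumption; the actual keys expected by `scraper.py` are not documented here, so none are checked. `parse_config` and `load_config` are hypothetical helper names.

```python
import json
from pathlib import Path


def parse_config(text):
    """Parse JSON text and return it as a dict.

    Raises ValueError if the text is not valid JSON or if the top-level
    value is not an object (assumed shape of the config).
    """
    data = json.loads(text)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("config must be a JSON object at the top level")
    return data


def load_config(path="config.json"):
    """Read the config file from disk and parse it."""
    return parse_config(Path(path).read_text(encoding="utf-8"))
```

A call such as `load_config("config.json")` raises an error that points at the offending line if a comma or quote was lost while editing.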

License

This project is licensed under the MIT License. See the LICENSE file for details.