Web scraper 🔍

Description

This project is a web scraper designed to extract data from websites. It can be customized to scrape various types of data and save it in different formats.

Features

  • Extracts data from web pages

Installation

Using Docker

  1. Clone the repository:
git clone https://git.wmi.amu.edu.pl/s500042/webscraper
  2. Navigate to the project directory:
cd webscraper
  3. Build the Docker image and run it using the start script:
  • On Linux and macOS:
./start.sh
  • On Windows 🤡:
python start.py

The Python script will work just fine on Linux as well, but on Mac you'll have to use

python3 start.py
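
If you'd rather not use the start scripts, the same thing can be done with Docker Compose directly. A minimal sketch, assuming the docker-compose.yaml in the repository root is the file the scripts use:

# build the image and start the scraper container
docker compose up --build

# stop and remove the container when you're done
docker compose down

On older Docker installations the command is docker-compose (with a hyphen) instead of docker compose.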

Without Docker

  1. Clone the repository:
git clone https://git.wmi.amu.edu.pl/s500042/webscraper
  2. Navigate to the project directory:
cd webscraper/app
  3. Install the required dependencies:
pip install -r requirements.txt

If you're on Arch Linux, you'll need to create a virtual environment first, because the system Python there blocks global pip installs. Here's a step-by-step guide that will help you create it.
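
A minimal sketch using Python's built-in venv module (the .venv directory name is just a convention; run this from the webscraper/app directory):

# create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate

# install the dependencies inside it
pip install -r requirements.txt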

Usage

  1. Configure the scraper by editing the config.json file.
  2. Run the scraper:
python scraper.py
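
The keys that config.json actually expects aren't documented in this README, so the snippet below is only an illustrative sketch; the url and output fields are hypothetical placeholders, not the scraper's real schema:

# write a hypothetical config.json (field names are made up for illustration)
cat > config.json <<'EOF'
{
  "url": "https://example.com",
  "output": "results.json"
}
EOF

# then run the scraper from the app directory
python scraper.py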

License

This project is licensed under the MIT License. See the LICENSE file for details.