paprykdev d303dd4204 feat: enhance Docker command execution with environment variable support		2024-11-14 03:44:27 +01:00
.github/workflows	fix: update Docker image CI workflow to use correct script path	2024-11-14 03:40:57 +01:00
app	refactor: remove unused import of time module in main.py	2024-11-14 03:38:53 +01:00
scripts	feat: enhance Docker command execution with environment variable support	2024-11-14 03:44:27 +01:00
.gitignore	feat: initial commit	2024-11-12 05:16:21 +01:00
LICENSE	docs: add MIT LICENSE	2024-11-12 05:17:24 +01:00
README.md	docs: update README.md	2024-11-12 05:17:45 +01:00

README.md

Web scraper 🔍

Description

This project is a web scraper designed to extract data from websites. It can be customized to scrape various types of data and save it in different formats.

Features

Extracts data from web pages

Installation

Using Docker

Clone the repository:

git clone https://git.wmi.amu.edu.pl/s500042/webscraper

Navigate to the project directory:

cd webscraper

Build the Docker image and run it with the start script for your platform.

On Linux (and probably macOS):

./start.sh

On Windows 🤡:

python start.py

The Python script also works on Linux. On macOS, call it with python3 instead:

python3 start.py
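For readers curious what a launcher like start.py has to do, here is a minimal sketch of building an image and running it with environment variables passed through. It is not the repository's actual script: the image tag `webscraper`, the helpers `docker_commands` and `start`, and any variable names you pass in are assumptions made for illustration. Only the standard `docker build -t` and `docker run --rm -e` options are used.

```python
import subprocess

IMAGE_TAG = "webscraper"  # hypothetical tag, not taken from the repository


def docker_commands(image=IMAGE_TAG, context=".", env=None):
    """Return the (build, run) argument lists for the Docker CLI."""
    build = ["docker", "build", "-t", image, context]
    run = ["docker", "run", "--rm"]
    # Pass each variable into the container with -e KEY=VALUE.
    for key, value in sorted((env or {}).items()):
        run += ["-e", f"{key}={value}"]
    run.append(image)
    return build, run


def start(env=None, runner=subprocess.run):
    """Build the image, then run it; check=True raises if a step fails."""
    for command in docker_commands(env=env):
        runner(command, check=True)
```

Calling `start({"EXAMPLE_VAR": "1"})` from the directory that holds the Dockerfile would build the image and then run a container from it. Docker must be installed and running, and `EXAMPLE_VAR` is only a placeholder name.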

Without Docker

Clone the repository:

git clone https://git.wmi.amu.edu.pl/s500042/webscraper

Navigate to the project directory:

cd webscraper/app

Install the required dependencies:

pip install -r requirements.txt

If you're on Arch Linux, you'll need to create a virtual environment first. Here is a step-by-step guide that will help you create it.
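The virtual environment can also be created from Python itself with the standard library `venv` module. The following is a minimal sketch, assuming the directory name `.venv` (an arbitrary choice) and that it is run from the directory containing requirements.txt. The helper names `venv_python` and `setup` are made up for this example. From a shell, the equivalent is `python -m venv .venv` followed by activating the environment.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(env_dir):
    """Return the path of the interpreter inside a virtual environment."""
    env_dir = Path(env_dir)
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def setup(env_dir=".venv", requirements="requirements.txt"):
    """Create the environment and install the dependencies into it."""
    venv.create(env_dir, with_pip=True)
    # Use the environment's own interpreter so packages land inside it.
    subprocess.run(
        [str(venv_python(env_dir)), "-m", "pip", "install", "-r", requirements],
        check=True,
    )
```

After `setup()` has finished, running the scraper with the interpreter returned by `venv_python(".venv")` uses the packages installed there rather than the system ones.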

Usage

Configure the scraper by editing the config.json file.
Run the scraper:

python scraper.py
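A hand-edited config.json is easy to break with a stray comma or a missing quote, so it can help to confirm that the file still parses before starting the scraper. The sketch below makes no assumption about which keys the scraper expects. It only checks that the file is valid JSON, and `load_config` is a hypothetical helper, not part of the project.

```python
import json
from pathlib import Path


def load_config(path="config.json"):
    """Parse the config file and return its contents, or raise ValueError."""
    text = Path(path).read_text(encoding="utf-8")
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno point at the first character that failed to parse.
        raise ValueError(
            f"{path}: invalid JSON at line {err.lineno}, "
            f"column {err.colno}: {err.msg}"
        ) from err
```

Calling `load_config()` in the directory that contains config.json either returns the parsed settings or reports where the file stops being valid JSON.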

License

This project is licensed under the MIT License. See the LICENSE file for details.