Metadata-Version: 2.3
Name: zimscraperlib
Version: 3.4.0
Summary: Collection of python tools to re-use common code across scrapers
Project-URL: Donate, https://www.kiwix.org/en/support-us/
Project-URL: Homepage, https://www.kiwix.org
Author-email: openZIM <dev@openzim.org>
License: GPL-3.0-or-later
License-File: LICENSE
Keywords: offline,openzim,zim
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: <3.13,>=3.8
Requires-Dist: babel<3.0,>=2.9
Requires-Dist: beautifulsoup4<5.0,>=4.9.3
Requires-Dist: colorthief==0.2.1
Requires-Dist: iso639-lang<3.0,>=2.2.3
Requires-Dist: libzim<4.0,>=3.4.0
Requires-Dist: lxml<6.0,>=4.6.3
Requires-Dist: optimize-images<2.0,>=1.3.6
Requires-Dist: python-magic<0.5,>=0.4.3
Requires-Dist: python-resize-image<1.2,>=1.1.19
Requires-Dist: requests<3.0,>=2.25.1
Requires-Dist: yt-dlp
Provides-Extra: check
Requires-Dist: pyright==1.1.368; extra == 'check'
Requires-Dist: pytest==8.2.2; extra == 'check'
Provides-Extra: dev
Requires-Dist: black==24.4.2; extra == 'dev'
Requires-Dist: coverage==7.5.3; extra == 'dev'
Requires-Dist: invoke==2.2.0; extra == 'dev'
Requires-Dist: ipython==8.25.0; extra == 'dev'
Requires-Dist: pre-commit==3.7.1; extra == 'dev'
Requires-Dist: pyright==1.1.368; extra == 'dev'
Requires-Dist: pytest-mock==3.14.0; extra == 'dev'
Requires-Dist: pytest==8.2.2; extra == 'dev'
Requires-Dist: ruff==0.4.9; extra == 'dev'
Provides-Extra: lint
Requires-Dist: black==24.4.2; extra == 'lint'
Requires-Dist: ruff==0.4.9; extra == 'lint'
Provides-Extra: scripts
Requires-Dist: invoke==2.2.0; extra == 'scripts'
Provides-Extra: test
Requires-Dist: coverage==7.5.3; extra == 'test'
Requires-Dist: pytest-mock==3.14.0; extra == 'test'
Requires-Dist: pytest==8.2.2; extra == 'test'
Description-Content-Type: text/markdown

zimscraperlib
=============

[![Build Status](https://github.com/openzim/python-scraperlib/workflows/CI/badge.svg?query=branch%3Amain)](https://github.com/openzim/python-scraperlib/actions?query=branch%3Amain)
[![CodeFactor](https://www.codefactor.io/repository/github/openzim/python-scraperlib/badge)](https://www.codefactor.io/repository/github/openzim/python-scraperlib)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![PyPI version shields.io](https://img.shields.io/pypi/v/zimscraperlib.svg)](https://pypi.org/project/zimscraperlib/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/zimscraperlib.svg)](https://pypi.org/project/zimscraperlib)
[![codecov](https://codecov.io/gh/openzim/python-scraperlib/branch/master/graph/badge.svg)](https://codecov.io/gh/openzim/python-scraperlib)

Collection of Python code to reuse across Python-based scrapers.

# Usage

* This library is meant to be installed from PyPI ([`zimscraperlib`](https://pypi.org/project/zimscraperlib/)).
* Always pin it to a version range, as the API is subject to frequent changes.
* The API is only expected to remain stable within a given *minor* version.

Example usage:

``` pip
zimscraperlib>=3.4,<3.5
```
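As a rough illustration of the pinning rule above, the sketch below derives a requirement string that stays within one minor series. The helper name `minor_pin` is hypothetical and not part of zimscraperlib. It assumes a plain `major.minor[.patch]` version string.

```python
def minor_pin(package: str, version: str) -> str:
    """Build a requirement limited to the minor series of `version`.

    Hypothetical helper for illustration only; expects at least
    "major.minor" and ignores anything after the minor number.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    # Lower bound is the given minor series, upper bound excludes the next one.
    return f"{package}>={major}.{minor},<{major}.{minor + 1}"


print(minor_pin("zimscraperlib", "3.4.0"))  # zimscraperlib>=3.4,<3.5
```

The resulting line can be pasted into a `requirements.txt` or a dependency list.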

# Dependencies

* libmagic
* wget
* libzim (auto-installed, not available on Windows)
* Pillow
* FFmpeg
* gifsicle (>=1.92)
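One way to see which command-line tools from this list are missing before running a scraper is sketched below. It is not part of zimscraperlib. The executable names (`wget`, `ffmpeg`, `gifsicle`) are assumptions about how these tools appear on `PATH`, and the check covers neither shared libraries such as libmagic nor the gifsicle minimum version.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed executable names for the command-line dependencies listed above.
REQUIRED_TOOLS = ("wget", "ffmpeg", "gifsicle")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All command-line tools found")
```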

## macOS

```sh
brew install libmagic wget libtiff libjpeg webp little-cms2 ffmpeg gifsicle
```

## Linux

```sh
sudo apt install libmagic1 wget ffmpeg \
    libtiff5-dev libjpeg8-dev libopenjp2-7-dev zlib1g-dev \
    libfreetype6-dev liblcms2-dev libwebp-dev tcl8.6-dev tk8.6-dev python3-tk \
    libharfbuzz-dev libfribidi-dev libxcb1-dev gifsicle
```

## Alpine

```sh
apk add ffmpeg gifsicle libmagic wget libjpeg
```

**Note:** i18n features do not work on Alpine (see https://github.com/openzim/python-scraperlib/issues/134), and the one corresponding test fails there.

# Contribution

This project adheres to openZIM's [Contribution Guidelines](https://github.com/openzim/overview/wiki/Contributing).

This project has implemented openZIM's [Python bootstrap, conventions and policies](https://github.com/openzim/_python-bootstrap/docs/Policy.md) **v1.0.2**.

```shell
pip install hatch
pip install ".[dev]"
pre-commit install
# For tests
invoke coverage
```

# Users

Non-exhaustive list of scrapers using it (check status when updating API):

* [openzim/freecodecamp](https://github.com/openzim/freecodecamp)
* [openzim/gutenberg](https://github.com/openzim/gutenberg)
* [openzim/ifixit](https://github.com/openzim/ifixit)
* [openzim/kolibri](https://github.com/openzim/kolibri)
* [openzim/nautilus](https://github.com/openzim/nautilus)
* [openzim/openedx](https://github.com/openzim/openedx)
* [openzim/sotoki](https://github.com/openzim/sotoki)
* [openzim/ted](https://github.com/openzim/ted)
* [openzim/warc2zim](https://github.com/openzim/warc2zim)
* [openzim/wikihow](https://github.com/openzim/wikihow)
* [openzim/youtube](https://github.com/openzim/youtube)
