Compare commits
15 Commits
b2bc16dfab...main

| Author | SHA1 | Date | |
|---|---|---|---|
| | c53ba05874 | | |
| | 54e3218779 | | |
| | 1aeca0c420 | | |
| | 2a3d7ed230 | | |
| | dde5b572fb | | |
| | 7e88e91077 | | |
| | 569b8ee4f8 | | |
| | 5de0d57612 | | |
| | e3b7879eb2 | | |
| | d5211572a5 | | |
| | ebdf4f1572 | | |
| | 60bbc09ccc | | |
| | 8edd016e39 | | |
| | fd3a8f4955 | | |
| | 346318177d | | |
2
.gitignore
vendored
@@ -1,5 +1,5 @@
 .idea/
-data/
+beaky-backend/data/
 report.xml

 # Byte-compiled / optimized / DLL files
@@ -1,40 +0,0 @@
image: python:3.12-slim

cache:
  paths:
    - .cache/pip
    - venv/

variables:
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"

before_script:
  - python -V
  - python -m venv venv
  - source venv/bin/activate
  - pip install --upgrade pip
  - pip install ruff mypy pytest
  - pip install .

stages:
  - lint
  - test

run_ruff:
  stage: lint
  script:
    - ruff check .

run_mypy:
  stage: lint
  script:
    - mypy src

run_pytest:
  stage: test
  script:
    - pytest --junit-xml=report.xml
  artifacts:
    when: always
    reports:
      junit: report.xml
98
CLAUDE.md
Normal file
@@ -0,0 +1,98 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## What This Project Does

Beaky is a CLI tool for verifying the truthfulness of sports betting tickets. It reads ticket URLs from an Excel file, classifies the bets on each ticket (via web scraping or OCR), then resolves each bet against a football statistics API to determine if the ticket is genuine.

## Commands

```bash
# Install (with dev dependencies)
pip install -e ".[dev]"

# Install Playwright browser (required for link classifier and screenshotter)
playwright install chromium

# Run the CLI
beaky <mode> [--config config/application.yml] [--id <ticket_id>] [--classifier {link,img,both}] [--dump]

# Modes:
#   screen  - screenshot all ticket URLs to data/screenshots/<id>.png
#   parse   - print all links loaded from Excel
#   compare - classify tickets and print bet comparison table
#   resolve - classify via link classifier, then resolve bets against football API

# Run the REST API (default: http://0.0.0.0:8000)
beaky-api

# Run tests
pytest

# Lint
ruff check .

# Format
ruff format .
```

## Architecture

Data flows through five stages:

1. **Scanner** (`scanner/scanner.py`): Reads `data/odkazy.xlsx` and produces `Link` objects (id, url, date).

2. **Classifiers**: Two independent classifiers both produce a `Ticket` (list of typed `Bet` objects):
   - **Link classifier** (`link_classifier/classifier.py`): Launches a headless Chromium browser via Playwright, navigates to the ticket URL (a Czech Fortuna betting site), and parses the DOM using CSS selectors to extract bet details.
   - **Image classifier** (`image_classifier/classifier.py`): Runs pytesseract OCR on screenshots in `data/screenshots/`, then uses regex to parse the raw text into bets. Block segmentation is driven by date-start and sport-prefix end triggers.

3. **Resolver** (`resolvers/resolver.py`): Takes a classified `Ticket` and resolves each bet's outcome (WIN/LOSE/VOID/UNKNOWN) by querying the `api-sports.io` football API. Matches fixtures using team name similarity (SequenceMatcher) and date proximity. Results are disk-cached in `data/fixture_cache/` to avoid redundant API calls.

4. **CLI** (`cli.py`): Ties everything together. Handles `--classifier` and `--dump` flags; renders ANSI-colored comparison tables for side-by-side link-vs-image output.

5. **REST API** (`api/`): FastAPI app exposing a single endpoint. Runs the full pipeline (screenshot → both classifiers → resolve) for a given URL and returns the verdict. Classifiers and resolver are instantiated once at startup (`app.state`) and reused across requests.

### Core Domain Models (`datamodels/ticket.py`)

`Bet` is an abstract Pydantic dataclass with a `resolve(MatchInfo) -> BetOutcome` method. Concrete subtypes include: `WinDrawLose`, `WinDrawLoseDouble`, `WinLose`, `BothTeamScored`, `GoalAmount`, `GoalHandicap`, `HalfTimeResult`, `HalfTimeDouble`, `HalfTimeFullTime`, `CornerAmount`, `TeamCornerAmount`, `MoreOffsides`, `Advance`, `UnknownBet`. Adding a new bet type requires: a new subclass here, detection regex in both classifiers, and a `resolve()` implementation.
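
As a rough illustration of the "new subclass plus `resolve()`" step, the sketch below uses standard-library dataclasses instead of Pydantic so that it runs on its own. The `MatchInfo` fields, the `BetOutcome` string values and the `OddGoalTotal` bet type are all hypothetical stand-ins, not definitions from `datamodels/ticket.py`.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum


class BetOutcome(Enum):
    # Member names follow the outcomes listed above; the string values are assumed.
    WIN = "win"
    LOSE = "lose"
    VOID = "void"
    UNKNOWN = "unknown"


@dataclass
class MatchInfo:
    # Hypothetical minimal shape; the real MatchInfo carries more data.
    goals_home: int
    goals_away: int


@dataclass
class Bet(ABC):
    team1Name: str
    team2Name: str

    @abstractmethod
    def resolve(self, match: MatchInfo) -> BetOutcome:
        """Decide the outcome of this bet for a finished match."""


@dataclass
class OddGoalTotal(Bet):
    """Hypothetical new bet type: wins when the total number of goals is odd."""

    def resolve(self, match: MatchInfo) -> BetOutcome:
        total = match.goals_home + match.goals_away
        return BetOutcome.WIN if total % 2 == 1 else BetOutcome.LOSE
```

With this shape the resolver only needs to call `bet.resolve(match_info)`; each classifier would still need its own detection pattern for the new type.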

### REST API

**Endpoint:** `POST /api/v1/resolve`

```json
{ "url": "<fortuna ticket url>", "debug": false }
```

The response includes `verdict` and per-bet `outcome`/`fixture_id`/`confidence`. With `debug: true` it also returns the raw `link_ticket`, `img_ticket`, and per-bet `match_info`.

The ticket ID is derived as `md5(url) % 10^9`, so it is stable across restarts. Screenshots are saved to `data/screenshots/{ticket_id}.png`.
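
The ID derivation can be reproduced in isolation. The expression below is the one used in `api/service.py` in this diff; the wrapper name `ticket_id_for` is only an illustrative helper.

```python
import hashlib


def ticket_id_for(url: str) -> int:
    """Derive a stable numeric ticket ID (0 <= id < 10**9) from the ticket URL."""
    digest = hashlib.md5(url.encode()).hexdigest()
    return int(digest, 16) % (10**9)
```

Unlike Python's built-in `hash()` on strings, which is randomized per process by default, the MD5 digest is the same in every run, so the same URL always maps to the same screenshot file name. Two different URLs can still collide after the modulo, which this scheme accepts.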

**Environment variables** (all optional):

| Var | Default |
|---|---|
| `BEAKY_CONFIG` | `config/application.yml` |
| `BEAKY_HOST` | `0.0.0.0` |
| `BEAKY_PORT` | `8000` |
| `LOG_LEVEL` | value from `config/application.yml` → `log_level` |

OpenAPI docs are available at `/docs` when the server is running.

### Logging

All modules except the CLI use `logging` rather than `print()`; the CLI's user-facing output (`cli.py`) still uses `print`. Resolver debug output (fixture matching, API calls) goes through `_ansi.log()`, which emits at `DEBUG` level with ANSI colors preserved. Set `log_level: DEBUG` in `config/application.yml` (or the `LOG_LEVEL=DEBUG` env var) to see it.
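
The level handling can be sketched as follows. The `getattr(logging, ..., logging.INFO)` lookup mirrors what `api/app.py` and `cli.py` do in this diff; the `resolve_level` helper and its integer guard are additions made for this example.

```python
import logging


def resolve_level(name: str, default: int = logging.INFO) -> int:
    """Map a config string such as "debug" to a logging level constant."""
    level = getattr(logging, name.upper(), default)
    # Guard (an addition in this sketch): ignore non-integer attributes of the logging module.
    return level if isinstance(level, int) else default


beaky_logger = logging.getLogger("beaky")  # same logger name that _ansi.log() uses

beaky_logger.setLevel(resolve_level("info"))
debug_visible_at_info = beaky_logger.isEnabledFor(logging.DEBUG)

beaky_logger.setLevel(resolve_level("DEBUG"))
debug_visible_at_debug = beaky_logger.isEnabledFor(logging.DEBUG)
```

At `INFO` the resolver's ANSI-colored debug lines are dropped before any handler sees them; at `DEBUG` they pass through unchanged.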

### Configuration

Config is loaded from `config/application.yml` into Pydantic dataclasses (`Config`, `ScreenshotterConfig`, `ResolverConfig`, `ImgClassifierConfig`, `ApiConfig`). Key fields:
- `path`: path to the input Excel file
- `resolver.api_key`: api-sports.io API key
- `resolver.league_map`: maps Czech league name patterns to API league IDs (longest match wins)
- `resolver.cache_path`: disk cache directory (default: `data/fixture_cache`)
- `log_level`: top-level logging level used by the CLI and the API server (default: `INFO`)

### Bet text language

All bet type strings are in Czech (from the Fortuna betting platform). Regex patterns in both classifiers match Czech text (e.g. `"Výsledek zápasu"`, `"Počet gólů"`).
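
A small sketch of what matching Czech bet text looks like in practice. The two patterns are copied from `image_classifier/classifier.py` in this diff; the `describe` helper and the sample input lines are assumptions about how a ticket line might read, not captured Fortuna output.

```python
import re

# Patterns as they appear in the image classifier; they run against lowercased text.
RESULT_RE = re.compile(r"výsledek zápasu\s*:?\s*(1|0|x|2)$")
GOALS_RE = re.compile(r"počet gólů v zápasu.*?:\s*([+-])\s*([\d.]+)")


def describe(line: str) -> str:
    """Summarize a single bet line; returns "unknown" when no pattern matches."""
    lower = line.lower()
    m = RESULT_RE.search(lower)
    if m:
        pick = m.group(1).upper()
        # The classifier normalizes a draw pick written as "X" to "0".
        return f"1X2 pick={'0' if pick == 'X' else pick}"
    m = GOALS_RE.search(lower)
    if m:
        side = "over" if m.group(1) == "+" else "under"
        return f"goals {side} {float(m.group(2))}"
    return "unknown"
```

Lowercasing first keeps the patterns short while still matching capitalized headings such as `Výsledek zápasu`.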
@@ -6,6 +6,7 @@ screenshotter:
resolver:
  api_key: 733f6882605be2de8980bbd074091ee4
  league_map:
    Kvalifikace MS - Evropa: 32
    # European cups
    liga mistrů: 2
    champions league: 2
@@ -62,3 +63,9 @@ resolver:

img_classifier:
  target_path: data/screenshots/

log_level: DEBUG # set to DEBUG to see raw classifier and resolver output

api:
  host: 0.0.0.0
  port: 8000
@@ -16,6 +16,9 @@ dependencies = [
    "playwright==1.58.0",
    "requests>=2.32.0",
    "diskcache>=5.6",
    "pytesseract==0.3.13",
    "fastapi>=0.115",
    "uvicorn[standard]>=0.34",
]

[project.optional-dependencies]
@@ -30,6 +33,7 @@ dev = [

[project.scripts]
beaky = "beaky.cli:main"
beaky-api = "beaky.api.main:main"


[tool.ruff]
@@ -1,5 +1,14 @@
from __future__ import annotations

import logging

_logger = logging.getLogger("beaky")


def log(text: str) -> None:
    """Emit a (possibly ANSI-colored) message at DEBUG level."""
    _logger.debug("%s", text)


def bold(text: str) -> str:
    return f"\033[1m{text}\033[0m"
1
beaky-backend/src/beaky/api/__init__.py
Normal file
@@ -0,0 +1 @@
47
beaky-backend/src/beaky/api/app.py
Normal file
@@ -0,0 +1,47 @@
import logging

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from beaky.api.routes import router
from beaky.config import Config, load_config
from beaky.link_classifier.classifier import LinkClassifier
from beaky.resolvers.resolver import TicketResolver
from beaky.screenshotter.screenshotter import Screenshotter

logger = logging.getLogger(__name__)


def create_app(config_path: str = "config/application.yml") -> FastAPI:
    app = FastAPI(title="Beaky API", version="0.1.0")
    app.add_middleware(
        CORSMiddleware,
        allow_origins=["http://localhost:5173"],
        allow_methods=["POST"],
        allow_headers=["Content-Type"],
    )
    app.include_router(router)

    @app.on_event("startup")
    def startup() -> None:
        config: Config = load_config(config_path)

        log_level_str = config.log_level.upper()
        log_level: int = getattr(logging, log_level_str, logging.INFO)
        logging.basicConfig(
            level=log_level,
            format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        )

        logger.info("Config loaded from %s (log_level=%s)", config_path, log_level_str)

        app.state.config = config
        app.state.screenshotter = Screenshotter(config)
        app.state.link_classifier = LinkClassifier()
        app.state.resolver = TicketResolver(config.resolver)
        logger.info("Beaky API ready")

    return app


app = create_app()  # default path; overridden by main() via create_app(config_path)
20
beaky-backend/src/beaky/api/main.py
Normal file
@@ -0,0 +1,20 @@
import argparse

import uvicorn

from beaky.api.app import create_app
from beaky.config import load_config, Config


def main() -> None:
    parser = argparse.ArgumentParser(prog="beaky-api")
    parser.add_argument("--config", default="config/application.yml", help="Path to config file.")
    args = parser.parse_args()

    config: Config = load_config(args.config)
    app = create_app(config_path=args.config)
    uvicorn.run(app, host=config.api.host, port=config.api.port)


if __name__ == "__main__":
    main()
42
beaky-backend/src/beaky/api/routes.py
Normal file
@@ -0,0 +1,42 @@
import logging

from fastapi import APIRouter, HTTPException, Request

from beaky.datamodels.api import ResolveRequest, ResolveResponse
from beaky.api.service import run_pipeline

logger = logging.getLogger(__name__)

router = APIRouter(prefix="/api/v1")


@router.post("/resolve", response_model=ResolveResponse)
def resolve(body: ResolveRequest, request: Request) -> ResolveResponse:
    logger.info("POST /api/v1/resolve url=%s debug=%s", body.url, body.debug)
    config = request.app.state.config
    screenshotter = request.app.state.screenshotter
    link_classifier = request.app.state.link_classifier
    resolver = request.app.state.resolver

    try:
        resolved, link_ticket, img_ticket = run_pipeline(
            url=body.url,
            config=config,
            screenshotter=screenshotter,
            link_classifier=link_classifier,
            resolver=resolver,
        )
    except Exception as exc:
        logger.exception("Pipeline failed for url=%s", body.url)
        raise HTTPException(status_code=500, detail=str(exc)) from exc

    if not body.debug:
        for rb in resolved.bets:
            rb.match_info = None

    return ResolveResponse(
        resolved_ticket=resolved,
        verdict=resolved.verdict.value,
        link_ticket=link_ticket if body.debug else None,
        img_ticket=img_ticket if body.debug else None,
    )
45
beaky-backend/src/beaky/api/service.py
Normal file
@@ -0,0 +1,45 @@
import hashlib
import logging
from pathlib import Path

from beaky.config import Config
from beaky.datamodels.ticket import Ticket
from beaky.image_classifier.classifier import img_classify
from beaky.link_classifier.classifier import LinkClassifier
from beaky.resolvers.resolver import ResolvedTicket, TicketResolver
from beaky.scanner.scanner import Link
from beaky.screenshotter.screenshotter import Screenshotter

logger = logging.getLogger(__name__)


def run_pipeline(
    url: str,
    config: Config,
    screenshotter: Screenshotter,
    link_classifier: LinkClassifier,
    resolver: TicketResolver,
) -> tuple[ResolvedTicket, Ticket, Ticket]:
    ticket_id = int(hashlib.md5(url.encode()).hexdigest(), 16) % (10**9)
    link = Link(id=ticket_id, url=url)
    logger.info("Pipeline started for ticket_id=%d url=%s", ticket_id, url)

    logger.info("Screenshotting ticket_id=%d", ticket_id)
    screenshotter.capture_tickets([link])
    logger.info("Screenshot done for ticket_id=%d", ticket_id)

    logger.info("Link classifying ticket_id=%d", ticket_id)
    link_ticket = link_classifier.classify(link)
    #link_ticket = img_classify([f"./data/screenshots/{ticket_id}.png"], ticket_id=ticket_id)
    logger.info("Link classification done: %d bets for ticket_id=%d", len(link_ticket.bets), ticket_id)

    screenshot_path = Path(config.screenshotter.target_path) / f"{ticket_id}.png"
    logger.info("Image classifying ticket_id=%d from %s", ticket_id, screenshot_path)
    img_ticket = img_classify([str(screenshot_path)], ticket_id=ticket_id)
    logger.info("Image classification done: %d bets for ticket_id=%d", len(img_ticket.bets), ticket_id)

    logger.info("Resolving ticket_id=%d", ticket_id)
    resolved = resolver.resolve(link_ticket)
    logger.info("Resolve done for ticket_id=%d verdict=%s", ticket_id, resolved.verdict.value)

    return resolved, link_ticket, img_ticket
@@ -1,13 +1,11 @@
 import argparse
+import logging
 import re as _re
 import shutil
 from datetime import datetime

-import yaml
-from pydantic import ValidationError
-
 from beaky import _ansi
-from beaky.config import Config
+from beaky.config import load_config
 from beaky.datamodels.ticket import Bet, Ticket
 from beaky.image_classifier.classifier import img_classify
 from beaky.link_classifier.classifier import LinkClassifier
@@ -55,7 +53,8 @@ def _bet_fields(bet: Bet) -> dict[str, str]:
     for k, v in vars(bet).items():
         if k in _SKIP_FIELDS:
             continue
-        fields[k] = v.strftime("%Y-%m-%d %H:%M") if k == "date" and isinstance(v, datetime) else str(v)
+        val = v.strftime("%Y-%m-%d %H:%M") if k == "date" and isinstance(v, datetime) else str(v)
+        fields[k] = val.replace("\n", " ").replace("\r", "")
     return fields
@@ -190,16 +189,6 @@ def _print_dump(ticket: Ticket, label: str) -> None:
         print(f" {k}: {val}")


-def load_config(path: str) -> Config | None:
-    with open(path) as f:
-        config_dict = yaml.safe_load(f)
-    try:
-        return Config(**config_dict)
-    except ValidationError as e:
-        print("Bad config")
-        print(e)
-        return None
-
 def main() -> None:
     parser = argparse.ArgumentParser(prog="beaky")
     parser.add_argument("--config", help="Path to config file.", default="config/application.yml")
@@ -211,10 +200,15 @@ def main() -> None:
         help="Dump all bet fields untruncated (compare mode only).")

     args = parser.parse_args()
+    try:
         config = load_config(args.config)
-    if config is None:
+    except RuntimeError as e:
+        print(e)
         return

+    log_level = getattr(logging, config.log_level.upper(), logging.INFO)
+    logging.basicConfig(level=log_level, format="%(message)s")
+
     # always load testing data, we will modify that later
     data = Links(config)
     data.ret_links()
34
beaky-backend/src/beaky/config.py
Normal file
@@ -0,0 +1,34 @@
from dataclasses import field as _field

import yaml
from pydantic import ValidationError
from pydantic.dataclasses import dataclass

from beaky.image_classifier.config import ImgClassifierConfig
from beaky.resolvers.config import ResolverConfig
from beaky.screenshotter.config import ScreenshotterConfig


def load_config(path: str) -> "Config":
    with open(path) as f:
        data = yaml.safe_load(f)
    try:
        return Config(**data)
    except ValidationError as exc:
        raise RuntimeError(f"Invalid config at {path}: {exc}") from exc


@dataclass
class ApiConfig:
    host: str = "0.0.0.0"
    port: int = 8000


@dataclass
class Config:
    path: str
    screenshotter: ScreenshotterConfig
    resolver: ResolverConfig
    img_classifier: ImgClassifierConfig
    log_level: str = "INFO"
    api: ApiConfig = _field(default_factory=ApiConfig)
19
beaky-backend/src/beaky/datamodels/api.py
Normal file
@@ -0,0 +1,19 @@
from pydantic import BaseModel

from beaky.datamodels.ticket import Ticket
from beaky.resolvers.resolver import ResolvedTicket


class ResolveRequest(BaseModel):
    url: str
    debug: bool = False


class ResolveResponse(BaseModel):
    model_config = {"arbitrary_types_allowed": True}

    resolved_ticket: ResolvedTicket
    verdict: str
    # populated only when debug=True
    link_ticket: Ticket | None = None
    img_ticket: Ticket | None = None
238
beaky-backend/src/beaky/image_classifier/classifier.py
Normal file
@@ -0,0 +1,238 @@
import datetime
import logging
import re
from pathlib import Path

import pytesseract

logger = logging.getLogger(__name__)

from beaky.datamodels.ticket import (
    Advance,
    Bet,
    BetType,
    BothTeamScored,
    GoalAmount,
    GoalHandicap,
    Ticket,
    UnknownBet,
    WinDrawLose,
    WinDrawLoseDouble,
    WinLose,
)


def img_to_text(path: str) -> str:
    """Read text from image using tesseract; returns empty string on error."""
    try:
        return pytesseract.image_to_string(path, lang="ces").strip()
    except Exception as e:
        logger.error("Error processing %s: %s", path, e)
        return ""


def _parse_block(lines: list[str]) -> Bet:
    """Parses a single block of text representing exactly one bet."""
    team1, team2 = "Unknown", "Unknown"
    league = "Unknown"
    date_obj = datetime.datetime.now()
    raw_text = "\n".join(lines)

    # 1. Date extraction
    if lines:
        # Regex is forgiving of letters attached to numbers due to OCR (e.g., s07.3.2026)
        date_m = re.search(r"(\d{1,2})\.\s*(\d{1,2})\.\s*(\d{4})", lines[0])
        if date_m:
            try:
                date_obj = datetime.datetime(int(date_m.group(3)), int(date_m.group(2)), int(date_m.group(1)))
            except ValueError:
                pass

    # 2. Teams extraction (usually the line after the date)
    if len(lines) > 1:
        ln_norm = re.sub(r"[–—−]", "-", lines[1])
        m = re.match(r"^(.+?)\s*-\s*(.+)$", ln_norm)
        if m:
            team1, team2 = m.group(1).strip(), m.group(2).strip()

    # 3. League extraction (typically contains a slash and sport name)
    for ln in lines:
        if "/" in ln and any(sport in ln for sport in ["Fotbal", "Hokej", "Tenis", "Basketbal"]):
            league = ln.strip()
            break

    base_args = {"team1Name": team1, "team2Name": team2, "date": date_obj, "league": league}

    # 4. Bet Type Classification
    for ln in lines:
        lower_line = ln.lower()

        # Výsledek zápasu (1X2)
        m_vysl = re.search(r"výsledek zápasu\s*:?\s*(1|0|x|2)$", lower_line)
        if m_vysl and "dvojtip" not in lower_line and "remízy" not in lower_line:
            pick = m_vysl.group(1).upper()
            if pick == "X":
                pick = "0"
            return WinDrawLose(ticketType=BetType.WIN_DRAW_LOSE, betType=pick, **base_args)

        # Výsledek zápasu - dvojtip (01, 02, 12, etc.)
        m_dvoj = re.search(r"výsledek zápasu - dvojtip\s*:?\s*(10|01|02|20|12|1x|x1|x2|2x)$", lower_line)
        if m_dvoj:
            pick = m_dvoj.group(1).replace("x", "0").replace("X", "0")
            if pick in ["10", "01"]:
                pick = "01"
            elif pick in ["20", "02"]:
                pick = "02"
            elif pick in ["12", "21"]:
                pick = "12"

            if pick in ["01", "12", "02"]:
                return WinDrawLoseDouble(ticketType=BetType.WIN_DRAW_LOSE_DOUBLE, betType=pick, **base_args)

        # Výsledek zápasu bez remízy
        m_bez = re.search(r"bez remízy\s*:?\s*(1|2)$", lower_line)
        if m_bez:
            return WinLose(ticketType=BetType.WIN_LOSE, betType=m_bez.group(1), **base_args)

        # Každý z týmů dá gól v zápasu
        m_btts = re.search(r"každý z týmů dá gól.*?:\s*(ano|ne)$", lower_line)
        if m_btts:
            if m_btts.group(1) == "ano":
                return BothTeamScored(ticketType=BetType.BOTH_TEAM_SCORED, **base_args)
            else:
                break

        # Počet gólů v zápasu
        m_goals = re.search(r"počet gólů v zápasu.*?:\s*([+-])\s*([\d.]+)", lower_line)
        if m_goals and "tým" not in lower_line:
            sign = m_goals.group(1)
            val = float(m_goals.group(2))
            is_over = sign == "+"
            return GoalAmount(ticketType=BetType.GOAL_AMOUNT, line=val, over=is_over, **base_args)

        # Kdo postoupí
        if "postoupí" in lower_line or "postup" in lower_line:
            return Advance(ticketType=BetType.ADVANCED, **base_args)

        # Handicap v zápasu
        m_hcp = re.search(r"handicap\s*(1|2)\s*:?\s*([+-]?[\d.]+)$", lower_line)
        if m_hcp:
            team_bet = m_hcp.group(1)
            val = float(m_hcp.group(2))
            return GoalHandicap(ticketType=BetType.GOAL_HANDICAP, team_bet=team_bet, handicap_amount=val, **base_args)

    # Fallback
    return UnknownBet(ticketType=BetType.UNKNOWN, raw_text=raw_text, **base_args)


def classify(text: str) -> list[Bet]:
    """Return a list of Bet objects parsed from OCR `text`."""
    text = (text or "").strip()
    if not text:
        return [
            UnknownBet(
                ticketType=BetType.UNKNOWN,
                team1Name="N/A",
                team2Name="N/A",
                date=datetime.datetime.now(),
                league="N/A",
                raw_text="No text extracted",
            )
        ]

    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    bets: list[Bet] = []

    blocks = []
    current_block = []
    in_block = False

    # START trigger: Looks for 'dnes', 'zítra', or 'DD.MM.'
    # date_start_pattern = re.compile(r"(\d{1,2}\.\s*\d{1,2}\.|\b(dnes|zítra|zitra|včera|vcera)\b)", re.IGNORECASE)
    date_start_pattern = re.compile(r"(\d{1,2}\.\s*\d{1,2}\.|\b(dnes|zítra|zitra|včera|vcera))", re.IGNORECASE)
    # END trigger: Looks for standard Fortuna sport prefixes
    # sport_end_pattern = re.compile(r"^(Fotbal|Hokej|Tenis|Basketbal|Florbal|Volejbal|E-sport|Šipky)\s*/", re.IGNORECASE)
    sport_end_pattern = re.compile(r"(Fotbal|Hokej|Tenis|Basketbal|Florbal|Volejbal|E-sport|Šipky)\s*/", re.IGNORECASE)
    for ln in lines:
        logger.debug("Processing line: '%s'", ln)
        is_start = date_start_pattern.search(ln)
        is_end = sport_end_pattern.match(ln)

        if is_start:
            # If we somehow hit a start while already in a block (missing end marker fallback),
            # save the current block before starting a new one.
            if current_block:
                logger.warning("Block not properly ended, new block start detected: '%s'", ln)
                blocks.append(current_block)
            current_block = [ln]
            in_block = True

        elif is_end:
            # We hit the league/sport line. Add it, save the block, and close the window.
            current_block.append(ln)
            blocks.append(current_block)
            current_block = []
            in_block = False

        elif in_block:
            # We are inside a block, gathering standard match info (teams, bet types).
            current_block.append(ln)

        else:
            # We are outside a block. This is noise (e.g. "© osmifinále / 2.zápas 0:1" or "170").
            # We simply ignore it and do nothing.
            logger.debug("Ignoring line outside of any block: '%s'", ln)
            pass

    # Catch any dangling block at the very end of the document
    if current_block:
        blocks.append(current_block)

    # Parse each block into a separate Bet object
    for block in blocks:
        if len(block) > 1:  # Ensure the block has enough lines to be valid
            bets.append(_parse_block(block))

    return bets


def img_classify(paths: list[str], ticket_id: int) -> Ticket:
    """Given a list of file paths to images, classify each and collect bets into a Ticket."""
    ticket = Ticket(id=ticket_id, bets=[])
    valid_extensions = {".png", ".jpg", ".jpeg", ".bmp", ".tiff", ".webp"}

    for file in paths:
        file_path = Path(file)
        if not (file_path.is_file() and file_path.suffix.lower() in valid_extensions):
            logger.warning("Skipping invalid file: %s", file)
            continue

        extracted_text = img_to_text(str(file_path))
        logger.debug("Extracted text from %s", file_path.name)

        try:
            result = classify(extracted_text)
        except Exception as exc:
            logger.error("classify() error for %s: %s", file_path, exc)
            result = [
                UnknownBet(
                    ticketType=BetType.UNKNOWN,
                    team1Name="N/A",
                    team2Name="N/A",
                    date=datetime.datetime.now(),
                    league="N/A",
                    raw_text=extracted_text,
                )
            ]

        # for bet in result:
        #     print(f"-> Parsed: {bet.ticketType.value} | {bet.team1Name} vs {bet.team2Name} | {bet.league}")

        ticket.bets.extend(result)

    return ticket


if __name__ == "__main__":
    # Test script runner
    img_classify(["./data/screenshots/26.png", "./data/screenshots/27.png"], ticket_id=2)
@@ -1,9 +1,12 @@
import logging
import re
from datetime import datetime
from typing import Any

from playwright.sync_api import Page, sync_playwright

logger = logging.getLogger(__name__)

from beaky.datamodels.ticket import (
    Bet,
    BetType,
@@ -130,7 +133,7 @@ def _extract_legs(page: Page, fallback_date: datetime | None) -> list[Bet]:
         title = leg.locator("h3").first.get_attribute("title") or ""
         date_text = leg.locator(".betslip-leg-date span").first.inner_text()
         bet_text = leg.locator("[data-selection-id]").first.inner_text()
-        league = leg.locator(".f-mt-1.f-leading-tight.f-line-clamp-2").first.inner_text()
+        league = leg.locator(".f-mt-1.f-leading-tight.f-line-clamp-2").first.inner_text().replace("Fotbal /", "")

         team1, team2 = _parse_teams(title)
         date = _parse_czech_date(date_text) or fallback_date or datetime.now()
@@ -151,7 +154,7 @@ class LinkClassifier:
                 page.wait_for_timeout(500)
                 result = Ticket(id=link.id, bets=_extract_legs(page, link.date))
             except Exception as e:
-                print(f"Error classifying link {link.id}: {e}")
+                logger.error("Error classifying link %d: %s", link.id, e)
             finally:
                 page.close()
                 browser.close()
@@ -1,10 +1,14 @@
+import logging
 import time
-from dataclasses import dataclass, field
+from dataclasses import field
 from datetime import date, datetime, timedelta
 from difflib import SequenceMatcher
 from enum import Enum
 from typing import Any

+from pydantic import ConfigDict, SerializeAsAny
+from pydantic.dataclasses import dataclass
+
 import diskcache
 import requests
@@ -18,9 +22,12 @@ from beaky.datamodels.ticket import (
)
from beaky.resolvers.config import ResolverConfig

logger = logging.getLogger(__name__)

_API_BASE = "https://v3.football.api-sports.io"

_DATE_WINDOW = 3 # days either side of the bet date to search
_NAME_THRESHOLD = 0.5


class TicketVerdict(str, Enum):
@@ -30,9 +37,9 @@ class TicketVerdict(str, Enum):
     UNKNOWN = "unknown — could not resolve enough bets to decide"


-@dataclass
+@dataclass(config=ConfigDict(arbitrary_types_allowed=True))
 class ResolvedBet:
-    bet: Bet
+    bet: SerializeAsAny[Bet]
     outcome: BetOutcome
     fixture_id: int | None = None
     # Confidence breakdown (each component 0.0–1.0):
@@ -73,11 +80,11 @@ def _get(url: str, headers: dict[str, str], params: dict[str, str | int], retrie
         resp = requests.get(url, headers=headers, params=params)
         if resp.status_code == 429:
             wait = backoff * (attempt + 1)
-            print(f" !! rate limited — waiting {wait:.0f}s before retry ({attempt + 1}/{retries})")
+            logger.warning("rate limited — waiting %.0fs before retry (%d/%d)", wait, attempt + 1, retries)
             time.sleep(wait)
             continue
         return resp
-    print(f" !! still rate limited after {retries} retries, giving up")
+    logger.warning("still rate limited after %d retries, giving up", retries)
     return resp
@@ -99,17 +106,17 @@ class TicketResolver:

     def _resolve_bet(self, bet: Bet) -> ResolvedBet:
         bet_type = type(bet).__name__
-        print(f"\n {_ansi.bold(_ansi.cyan(f'┌─ [{bet_type}]'))} {_ansi.bold(f'{bet.team1Name} vs {bet.team2Name}')}"
-              f" {_ansi.dim(f'{bet.date.strftime('%Y-%m-%d')} | {bet.league}')}")
+        _ansi.log(f"\n {_ansi.bold(_ansi.cyan(f'┌─ [{bet_type}]'))} {_ansi.bold(f'{bet.team1Name} vs {bet.team2Name}')}"
+                  f" {_ansi.dim(f'{bet.date.strftime("%Y-%m-%d")} | {bet.league}')}")

         if isinstance(bet, UnknownBet):
-            print(_ansi.gray(f" │ skipping — not implemented: {bet.raw_text!r}"))
-            print(_ansi.gray(" └─ UNKNOWN"))
+            _ansi.log(_ansi.gray(f" │ skipping — not implemented: {bet.raw_text!r}"))
+            _ansi.log(_ansi.gray(" └─ UNKNOWN"))
             return ResolvedBet(bet=bet, outcome=BetOutcome.UNKNOWN)

         fixture, name_match, date_prox, league_conf = self._find_fixture(bet)
         if fixture is None:
-            print(_ansi.gray(" └─ UNKNOWN — no fixture found"))
+            _ansi.log(_ansi.gray(" └─ UNKNOWN — no fixture found"))
             return ResolvedBet(bet=bet, outcome=BetOutcome.UNKNOWN, league_found=league_conf)

         home_name = fixture["teams"]["home"]["name"]
@@ -126,12 +133,12 @@ class TicketResolver:
outcome = BetOutcome.UNKNOWN

goals = fixture["goals"]
print(_ansi.dim(
_ansi.log(_ansi.dim(
f" │ matched #{fixture['fixture']['id']}: {home_name} vs {away_name}"
f" | {goals['home']}:{goals['away']} | {fixture['fixture']['status']['short']}"
f" | confidence {confidence} (name={name_match:.2f} date={date_prox:.2f} league={league_conf} finished={finished})"
))
print(_ansi.bold(_ansi.green(f" └─ {outcome.value.upper()}") if outcome == BetOutcome.WIN
_ansi.log(_ansi.bold(_ansi.green(f" └─ {outcome.value.upper()}") if outcome == BetOutcome.WIN
else _ansi.red(f" └─ {outcome.value.upper()}") if outcome == BetOutcome.LOSE
else _ansi.yellow(f" └─ {outcome.value.upper()}") if outcome == BetOutcome.VOID
else _ansi.gray(f" └─ {outcome.value.upper()}")))
@@ -151,9 +158,9 @@ class TicketResolver:
def _get_statistics(self, fixture_id: int) -> list[dict[str, Any]]:
cache_key = ("stats", fixture_id)
if cache_key in self._disk_cache:
print(_ansi.gray(f" │ /fixtures/statistics served from disk cache (fixture={fixture_id})"))
_ansi.log(_ansi.gray(f" │ /fixtures/statistics served from disk cache (fixture={fixture_id})"))
return self._disk_cache[cache_key] # type: ignore[no-any-return]
print(_ansi.gray(f" │ GET /fixtures/statistics fixture={fixture_id}"))
_ansi.log(_ansi.gray(f" │ GET /fixtures/statistics fixture={fixture_id}"))
resp = _get(f"{_API_BASE}/fixtures/statistics", headers=self._headers, params={"fixture": fixture_id})
resp.raise_for_status()
stats = resp.json().get("response", [])
@@ -173,7 +180,8 @@ class TicketResolver:
if cache_key not in self._fixture_cache:
if cache_key in self._disk_cache and not cache_may_be_stale:
self._fixture_cache[cache_key] = self._disk_cache[cache_key]
print(_ansi.gray(f" │ /fixtures served from disk cache ({len(self._fixture_cache[cache_key])} fixtures)"))
_ansi.log(
_ansi.gray(f" │ /fixtures served from disk cache ({len(self._fixture_cache[cache_key])} fixtures)"))
else:
date_from = (center - timedelta(days=_DATE_WINDOW)).strftime("%Y-%m-%d")
date_to = (center + timedelta(days=_DATE_WINDOW)).strftime("%Y-%m-%d")
@@ -181,17 +189,18 @@ class TicketResolver:
if league_id is not None:
params["league"] = league_id
params["season"] = center.year if center.month >= 7 else center.year - 1
print(_ansi.gray(f" │ GET /fixtures {params}"))
_ansi.log(_ansi.gray(f" │ GET /fixtures {params}"))
resp = _get(f"{_API_BASE}/fixtures", headers=self._headers, params=params)
resp.raise_for_status()
self._fixture_cache[cache_key] = resp.json().get("response", [])
print(_ansi.gray(f" │ {len(self._fixture_cache[cache_key])} fixtures returned"))
_ansi.log(_ansi.gray(f" │ {len(self._fixture_cache[cache_key])} fixtures returned"))
cacheable = [f for f in self._fixture_cache[cache_key] if f.get("fixture", {}).get("status", {}).get("short") != "NS"]
if cacheable:
self._disk_cache[cache_key] = cacheable
print(_ansi.gray(f" │ {len(cacheable)} non-NS fixture(s) written to disk cache"))
_ansi.log(_ansi.gray(f" │ {len(cacheable)} non-NS fixture(s) written to disk cache"))
else:
print(_ansi.gray(f" │ /fixtures (±{_DATE_WINDOW}d of {date_str}, league={league_id}) served from memory"))
_ansi.log(
_ansi.gray(f" │ /fixtures (±{_DATE_WINDOW}d of {date_str}, league={league_id}) served from memory"))

fixture, name_match, date_prox = _best_fixture_match(
self._fixture_cache[cache_key], bet.team1Name, bet.team2Name, center
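Two small rules in the hunk above are easy to miss: the season is derived from the bet date (July or later counts as that year's season, earlier months belong to the previous year's), and only fixtures whose short status is not `"NS"` are written to the disk cache. The sketch below restates both rules in isolation. The helper names `season_for` and `cacheable_fixtures` are hypothetical, and reading `"NS"` as "not started" is an assumption about the statistics API.

```python
from datetime import date
from typing import Any


def season_for(center: date) -> int:
    """Season year for a date: July onwards is the current year, else the previous one."""
    return center.year if center.month >= 7 else center.year - 1


def cacheable_fixtures(fixtures: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only fixtures whose short status is not "NS" (assumed: not started yet)."""
    return [
        f for f in fixtures
        if f.get("fixture", {}).get("status", {}).get("short") != "NS"
    ]
```

Skipping `"NS"` fixtures plausibly keeps results that can still change out of the persistent cache, so a later run re-fetches them.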
@@ -203,29 +212,30 @@ class TicketResolver:
if key in self._league_cache:
return self._league_cache[key]

# Use longest-match so "1. itálie - ženy" beats "1. itálie"
best_pattern, best_id = max(
((p, lid) for p, lid in self._league_map.items() if p in key),
key=lambda t: len(t[0]),
default=(None, None),
)
if best_id is not None:
print(_ansi.gray(f" │ league {league_name!r} -> id={best_id} (static map, pattern={best_pattern!r})"))
self._league_cache[key] = (best_id, 1.0)
return best_id, 1.0
# Static map — fuzzy match
patterns = [x.lower().strip() for x in self._league_map.keys()]
idx, score = _best_match(key, patterns)
if idx is not None:
best_id = self._league_map[patterns[idx]]
_ansi.log(_ansi.gray(f" │ league {league_name!r} -> id={best_id} (static map, pattern={patterns[idx]!r}, score={score:.2f})"))
self._league_cache[key] = (best_id, score)
return best_id, score

# Fall back to API search — lower confidence since first result is taken unverified
print(_ansi.gray(f" │ GET /leagues search={league_name!r}"))
# API fallback — fuzzy match all results
_ansi.log(_ansi.gray(f" │ GET /leagues search={league_name!r}"))
resp = _get(f"{_API_BASE}/leagues", headers=self._headers, params={"search": league_name[:20]})
results = resp.json().get("response", [])
if results:
league_id = results[0]["league"]["id"]
league_found_name = results[0]["league"]["name"]
print(_ansi.gray(f" │ matched {league_found_name!r} id={league_id} (API fallback, confidence=0.7)"))
names = [r["league"]["name"].lower() for r in results]
idx, score = _best_match(key, names)
if idx is not None:
league_id = results[idx]["league"]["id"]
league_found_name = results[idx]["league"]["name"]
_ansi.log(_ansi.gray(f" │ matched {league_found_name!r} id={league_id} (API fallback, score={score:.2f}, confidence=0.7)"))
self._league_cache[key] = (league_id, 0.7)
return league_id, 0.7

print(_ansi.gray(" │ no league found, searching fixtures by date only (confidence=0.3)"))
_ansi.log(_ansi.gray(" │ no league found, searching fixtures by date only (confidence=0.3)"))
self._league_cache[key] = (None, 0.3)
return None, 0.3
@@ -273,6 +283,16 @@ def _similarity(a: str, b: str) -> float:
return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def _best_match(query: str, candidates: list[str], threshold: float = _NAME_THRESHOLD) -> tuple[int | None, float]:
"""Return (index, score) of the best fuzzy match, or (None, score) if below threshold."""
if not candidates:
return None, 0.0
scores = [_similarity(query, c) for c in candidates]
best_idx = max(range(len(scores)), key=lambda i: scores[i])
score = scores[best_idx]
return (best_idx, score) if score >= threshold else (None, score)


def _date_proximity(fixture: dict[str, Any], center: date) -> float:
"""1.0 on exact date, linear decay to 0.0 at _DATE_WINDOW days away."""
fixture_date = datetime.fromisoformat(fixture["fixture"]["date"].replace("Z", "+00:00")).date()
@@ -282,21 +302,21 @@ def _date_proximity(fixture: dict[str, Any], center: date) -> float:

def _best_fixture_match(fixtures: list[dict[str, Any]], team1: str, team2: str, center: date) -> tuple[dict[str, Any] | None, float, float]:
"""Returns (best_fixture, name_score, date_proximity) or (None, 0, 0) if no good match."""
best, best_combined, best_name, best_date = None, 0.0, 0.0, 0.0
for f in fixtures:
home = f["teams"]["home"]["name"]
away = f["teams"]["away"]["name"]
name_score = (_similarity(team1, home) + _similarity(team2, away)) / 2
date_prox = _date_proximity(f, center)
if not fixtures:
return None, 0.0, 0.0
# Name similarity is the primary signal; date proximity is a tiebreaker
combined = name_score * 0.8 + date_prox * 0.2
if combined > best_combined:
best_combined = combined
best_name = name_score
best_date = date_prox
best = f
home_names = [f["teams"]["home"]["name"] for f in fixtures]
away_names = [f["teams"]["away"]["name"] for f in fixtures]
print(home_names)
print(away_names)
name_scores = [(_similarity(team1, h) + _similarity(team2, a)) / 2 for h, a in zip(home_names, away_names)]
print(name_scores)
date_proxies = [_date_proximity(f, center) for f in fixtures]
combined = [n * 0.8 + d * 0.2 for n, d in zip(name_scores, date_proxies)]
best_idx = max(range(len(combined)), key=lambda i: combined[i])
name, date = name_scores[best_idx], date_proxies[best_idx]
# Require minimum name similarity — date alone cannot rescue a bad name match
return (best, best_name, best_date) if best_name > 0.5 else (None, best_name, best_date)
return (fixtures[best_idx], name, date) if name >= _NAME_THRESHOLD else (None, name, date)


def _is_finished(fixture: dict[str, Any]) -> float:
@@ -1,3 +1,4 @@
import logging
from datetime import datetime
from typing import Any, Iterator, List, Optional
@@ -6,6 +7,8 @@ from pydantic.dataclasses import dataclass

from beaky.config import Config

logger = logging.getLogger(__name__)


@dataclass
class Link:
@@ -37,7 +40,7 @@ class Links:
at least: 'id', 'link' (or 'url'), and optionally 'date' (case-insensitive).
Returns the list of Link objects (also stored in self.links).
"""
print("started ret_links()")
logger.debug("started ret_links()")
wb = load_workbook(filename=self._path, read_only=True, data_only=True)
ws = wb.active
@@ -84,7 +87,7 @@ class Links:

if id_idx is None or url_idx is None:
# Required columns missing
print(f"Required 'id' or 'url' column missing in header. Found headers: {list(header_map.keys())}")
logger.warning("Required 'id' or 'url' column missing in header. Found headers: %s", list(header_map.keys()))
return []

for row in rows:
@@ -1,3 +1,4 @@
import logging
from pathlib import Path
from typing import Any
@@ -6,6 +7,8 @@ from playwright.sync_api import sync_playwright
from beaky.config import Config
from beaky.scanner.scanner import Link

logger = logging.getLogger(__name__)


class Screenshotter:
def __init__(self, config: Config):
@@ -18,7 +21,7 @@ class Screenshotter:
context = browser.new_context()

for link in links:
print("capturing link:", link)
logger.debug("capturing link: %s", link)
page = context.new_page()
target_path = Path(self.config.screenshotter.target_path) / f"{link.id}.png"
self.capture_ticket(page, link.url, target_path)
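The scanner and screenshotter hunks above replace `print` calls with a module-level `logging` logger and pass values as arguments instead of formatting them into the string. A minimal sketch of that pattern follows. The logger name `beaky.scanner.example` and the helpers `report_missing_columns` and `configure_logging` are hypothetical, and how the application actually configures logging is not shown in this diff.

```python
import logging

# Hypothetical logger name; the diff uses logging.getLogger(__name__) per module.
logger = logging.getLogger("beaky.scanner.example")


def report_missing_columns(headers: list[str]) -> None:
    """Log a warning with %-style arguments, formatted only if the record is emitted."""
    logger.warning(
        "Required 'id' or 'url' column missing in header. Found headers: %s",
        headers,
    )


def configure_logging(level: int = logging.INFO) -> None:
    """One possible application-level setup (an assumption, not the project's own)."""
    logging.basicConfig(level=level, format="%(levelname)s %(name)s: %(message)s")
```

Compared with an f-string inside `print`, the arguments are only interpolated when a handler accepts the record, and verbosity can be changed per module without touching call sites.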
25
beaky-frontend/.gitignore
vendored
Normal file
@@ -0,0 +1,25 @@
# Logs
.vscode
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
lerna-debug.log*

node_modules
dist
dist-ssr
*.local

# Editor directories and files
.vscode/*
!.vscode/extensions.json
.idea
.DS_Store
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?
43
beaky-frontend/README.md
Normal file
@@ -0,0 +1,43 @@
# Svelte + Vite

This template should help get you started developing with Svelte in Vite.

## Recommended IDE Setup

[VS Code](https://code.visualstudio.com/) + [Svelte](https://marketplace.visualstudio.com/items?itemName=svelte.svelte-vscode).

## Need an official Svelte framework?

Check out [SvelteKit](https://github.com/sveltejs/kit#readme), which is also powered by Vite. Deploy anywhere with its serverless-first approach and adapt to various platforms, with out of the box support for TypeScript, SCSS, and Less, and easily-added support for mdsvex, GraphQL, PostCSS, Tailwind CSS, and more.

## Technical considerations

**Why use this over SvelteKit?**

- It brings its own routing solution which might not be preferable for some users.
- It is first and foremost a framework that just happens to use Vite under the hood, not a Vite app.

This template contains as little as possible to get started with Vite + Svelte, while taking into account the developer experience with regards to HMR and intellisense. It demonstrates capabilities on par with the other `create-vite` templates and is a good starting point for beginners dipping their toes into a Vite + Svelte project.

Should you later need the extended capabilities and extensibility provided by SvelteKit, the template has been structured similarly to SvelteKit so that it is easy to migrate.

**Why include `.vscode/extensions.json`?**

Other templates indirectly recommend extensions via the README, but this file allows VS Code to prompt the user to install the recommended extension upon opening the project.

**Why enable `checkJs` in the JS template?**

It is likely that most cases of changing variable types in runtime are likely to be accidental, rather than deliberate. This provides advanced typechecking out of the box. Should you like to take advantage of the dynamically-typed nature of JavaScript, it is trivial to change the configuration.

**Why is HMR not preserving my local component state?**

HMR state preservation comes with a number of gotchas! It has been disabled by default in both `svelte-hmr` and `@sveltejs/vite-plugin-svelte` due to its often surprising behavior. You can read the details [here](https://github.com/sveltejs/svelte-hmr/tree/master/packages/svelte-hmr#preservation-of-local-state).

If you have state that's important to retain within a component, consider creating an external store which would not be replaced by HMR.

```js
// store.js
// An extremely simple external store
import { writable } from 'svelte/store'
export default writable(0)
```
13
beaky-frontend/index.html
Normal file
@@ -0,0 +1,13 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/favicon.svg" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>beaky-frontend</title>
</head>
<body>
<div id="app"></div>
<script type="module" src="/src/main.js"></script>
</body>
</html>
33
beaky-frontend/jsconfig.json
Normal file
@@ -0,0 +1,33 @@
{
"compilerOptions": {
"moduleResolution": "bundler",
"target": "ESNext",
"module": "ESNext",
/**
* svelte-preprocess cannot figure out whether you have
* a value or a type, so tell TypeScript to enforce using
* `import type` instead of `import` for Types.
*/
"verbatimModuleSyntax": true,
"isolatedModules": true,
"resolveJsonModule": true,
/**
* To have warnings / errors of the Svelte compiler at the
* correct position, enable source maps by default.
*/
"sourceMap": true,
"esModuleInterop": true,
"types": ["vite/client"],
"skipLibCheck": true,
/**
* Typecheck JS in `.svelte` and `.js` files by default.
* Disable this if you'd like to use dynamic types.
*/
"checkJs": true
},
/**
* Use global.d.ts instead of compilerOptions.types
* to avoid limiting type declarations.
*/
"include": ["src/**/*.d.ts", "src/**/*.js", "src/**/*.svelte"]
}
1517
beaky-frontend/package-lock.json
generated
Normal file
File diff suppressed because it is too large
18
beaky-frontend/package.json
Normal file
@@ -0,0 +1,18 @@
{
"name": "beaky-frontend",
"private": true,
"version": "0.0.0",
"type": "module",
"scripts": {
"dev": "vite",
"build": "vite build",
"preview": "vite preview"
},
"devDependencies": {
"@sveltejs/vite-plugin-svelte": "^7.0.0",
"@tailwindcss/vite": "^4.2.2",
"svelte": "^5.53.12",
"tailwindcss": "^4.2.2",
"vite": "^8.0.1"
}
}
1
beaky-frontend/public/favicon.svg
Normal file
File diff suppressed because one or more lines are too long
Size: 9.3 KiB
24
beaky-frontend/public/icons.svg
Normal file
@@ -0,0 +1,24 @@
<svg xmlns="http://www.w3.org/2000/svg">
<symbol id="bluesky-icon" viewBox="0 0 16 17">
<g clip-path="url(#bluesky-clip)"><path fill="#08060d" d="M7.75 7.735c-.693-1.348-2.58-3.86-4.334-5.097-1.68-1.187-2.32-.981-2.74-.79C.188 2.065.1 2.812.1 3.251s.241 3.602.398 4.13c.52 1.744 2.367 2.333 4.07 2.145-2.495.37-4.71 1.278-1.805 4.512 3.196 3.309 4.38-.71 4.987-2.746.608 2.036 1.307 5.91 4.93 2.746 2.72-2.746.747-4.143-1.747-4.512 1.702.189 3.55-.4 4.07-2.145.156-.528.397-3.691.397-4.13s-.088-1.186-.575-1.406c-.42-.19-1.06-.395-2.741.79-1.755 1.24-3.64 3.752-4.334 5.099"/></g>
<defs><clipPath id="bluesky-clip"><path fill="#fff" d="M.1.85h15.3v15.3H.1z"/></clipPath></defs>
</symbol>
<symbol id="discord-icon" viewBox="0 0 20 19">
<path fill="#08060d" d="M16.224 3.768a14.5 14.5 0 0 0-3.67-1.153c-.158.286-.343.67-.47.976a13.5 13.5 0 0 0-4.067 0c-.128-.306-.317-.69-.476-.976A14.4 14.4 0 0 0 3.868 3.77C1.546 7.28.916 10.703 1.231 14.077a14.7 14.7 0 0 0 4.5 2.306q.545-.748.965-1.587a9.5 9.5 0 0 1-1.518-.74q.191-.14.372-.293c2.927 1.369 6.107 1.369 8.999 0q.183.152.372.294-.723.437-1.52.74.418.838.963 1.588a14.6 14.6 0 0 0 4.504-2.308c.37-3.911-.63-7.302-2.644-10.309m-9.13 8.234c-.878 0-1.599-.82-1.599-1.82 0-.998.705-1.82 1.6-1.82.894 0 1.614.82 1.599 1.82.001 1-.705 1.82-1.6 1.82m5.91 0c-.878 0-1.599-.82-1.599-1.82 0-.998.705-1.82 1.6-1.82.893 0 1.614.82 1.599 1.82 0 1-.706 1.82-1.6 1.82"/>
</symbol>
<symbol id="documentation-icon" viewBox="0 0 21 20">
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="m15.5 13.333 1.533 1.322c.645.555.967.833.967 1.178s-.322.623-.967 1.179L15.5 18.333m-3.333-5-1.534 1.322c-.644.555-.966.833-.966 1.178s.322.623.966 1.179l1.534 1.321"/>
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M17.167 10.836v-4.32c0-1.41 0-2.117-.224-2.68-.359-.906-1.118-1.621-2.08-1.96-.599-.21-1.349-.21-2.848-.21-2.623 0-3.935 0-4.983.369-1.684.591-3.013 1.842-3.641 3.428C3 6.449 3 7.684 3 10.154v2.122c0 2.558 0 3.838.706 4.726q.306.383.713.671c.76.536 1.79.64 3.581.66"/>
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M3 10a2.78 2.78 0 0 1 2.778-2.778c.555 0 1.209.097 1.748-.047.48-.129.854-.503.982-.982.145-.54.048-1.194.048-1.749a2.78 2.78 0 0 1 2.777-2.777"/>
</symbol>
<symbol id="github-icon" viewBox="0 0 19 19">
<path fill="#08060d" fill-rule="evenodd" d="M9.356 1.85C5.05 1.85 1.57 5.356 1.57 9.694a7.84 7.84 0 0 0 5.324 7.44c.387.079.528-.168.528-.376 0-.182-.013-.805-.013-1.454-2.165.467-2.616-.935-2.616-.935-.349-.91-.864-1.143-.864-1.143-.71-.48.051-.48.051-.48.787.051 1.2.805 1.2.805.695 1.194 1.817.857 2.268.649.064-.507.27-.857.49-1.052-1.728-.182-3.545-.857-3.545-3.87 0-.857.31-1.558.8-2.104-.078-.195-.349-1 .077-2.078 0 0 .657-.208 2.14.805a7.5 7.5 0 0 1 1.946-.26c.657 0 1.328.092 1.946.26 1.483-1.013 2.14-.805 2.14-.805.426 1.078.155 1.883.078 2.078.502.546.799 1.247.799 2.104 0 3.013-1.818 3.675-3.558 3.87.284.247.528.714.528 1.454 0 1.052-.012 1.896-.012 2.156 0 .208.142.455.528.377a7.84 7.84 0 0 0 5.324-7.441c.013-4.338-3.48-7.844-7.773-7.844" clip-rule="evenodd"/>
</symbol>
<symbol id="social-icon" viewBox="0 0 20 20">
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M12.5 6.667a4.167 4.167 0 1 0-8.334 0 4.167 4.167 0 0 0 8.334 0"/>
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M2.5 16.667a5.833 5.833 0 0 1 8.75-5.053m3.837.474.513 1.035c.07.144.257.282.414.309l.93.155c.596.1.736.536.307.965l-.723.73a.64.64 0 0 0-.152.531l.207.903c.164.715-.213.991-.84.618l-.872-.52a.63.63 0 0 0-.577 0l-.872.52c-.624.373-1.003.094-.84-.618l.207-.903a.64.64 0 0 0-.152-.532l-.723-.729c-.426-.43-.289-.864.306-.964l.93-.156a.64.64 0 0 0 .412-.31l.513-1.034c.28-.562.735-.562 1.012 0"/>
</symbol>
<symbol id="x-icon" viewBox="0 0 19 19">
<path fill="#08060d" fill-rule="evenodd" d="M1.893 1.98c.052.072 1.245 1.769 2.653 3.77l2.892 4.114c.183.261.333.48.333.486s-.068.089-.152.183l-.522.593-.765.867-3.597 4.087c-.375.426-.734.834-.798.905a1 1 0 0 0-.118.148c0 .01.236.017.664.017h.663l.729-.83c.4-.457.796-.906.879-.999a692 692 0 0 0 1.794-2.038c.034-.037.301-.34.594-.675l.551-.624.345-.392a7 7 0 0 1 .34-.374c.006 0 .93 1.306 2.052 2.903l2.084 2.965.045.063h2.275c1.87 0 2.273-.003 2.266-.021-.008-.02-1.098-1.572-3.894-5.547-2.013-2.862-2.28-3.246-2.273-3.266.008-.019.282-.332 2.085-2.38l2-2.274 1.567-1.782c.022-.028-.016-.03-.65-.03h-.674l-.3.342a871 871 0 0 1-1.782 2.025c-.067.075-.405.458-.75.852a100 100 0 0 1-.803.91c-.148.172-.299.344-.99 1.127-.304.343-.32.358-.345.327-.015-.019-.904-1.282-1.976-2.808L6.365 1.85H1.8zm1.782.91 8.078 11.294c.772 1.08 1.413 1.973 1.425 1.984.016.017.241.02 1.05.017l1.03-.004-2.694-3.766L7.796 5.75 5.722 2.852l-1.039-.004-1.039-.004z" clip-rule="evenodd"/>
</symbol>
</svg>
Size: 4.9 KiB
238
beaky-frontend/src/App.svelte
Normal file
@@ -0,0 +1,238 @@
<script>
import BetsTable from './lib/BetsTable.svelte'

const API_URL = 'http://localhost:8000/api/v1/resolve'

let url = $state('')
let loading = $state(false)
let error = $state(null)
let result = $state(null)
let dark = $state(true)

$effect(() => {
document.documentElement.classList.toggle('dark', dark)
})

async function verify(e) {
e.preventDefault()
loading = true
error = null
result = null
try {
const res = await fetch(API_URL, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ url, debug: false }),
})
if (!res.ok) {
let detail = res.statusText
try { const body = await res.json(); detail = body.detail ?? detail } catch {}
throw new Error('Server error: ' + detail)
}
result = await res.json()
} catch (err) {
error = err.name === 'TypeError'
? 'Could not reach the Beaky API at localhost:8000. Is the server running?'
: err.message
} finally {
loading = false
}
}

function verdictStyle(verdict) {
if (verdict === 'truthful') return { label: 'TRUTHFUL', bg: '#22c55e', text: '#fff' }
if (verdict === 'not truthful') return { label: 'NOT TRUTHFUL', bg: '#ef4444', text: '#fff' }
if (verdict?.startsWith('possibly')) return { label: 'POSSIBLY TRUTHFUL — check manually', bg: '#eab308', text: '#111827' }
return { label: 'UNKNOWN — insufficient data', bg: '#6b7280', text: '#fff' }
}
</script>
<div class="page">
<div class="container">

<!-- Header -->
<div class="header-row">
<div>
<h1>Beaky</h1>
<p class="subtitle">Paste a Fortuna ticket URL to verify its truthfulness.</p>
</div>
<button class="theme-toggle" onclick={() => (dark = !dark)} aria-label="Toggle dark mode" title={dark ? 'Switch to light mode' : 'Switch to dark mode'}>
{#if dark}
<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
<path stroke-linecap="round" stroke-linejoin="round" d="M12 3v1m0 16v1m9-9h-1M4 12H3m15.364-6.364l-.707.707M6.343 17.657l-.707.707M17.657 17.657l-.707-.707M6.343 6.343l-.707-.707M12 5a7 7 0 100 14A7 7 0 0012 5z" />
</svg>
{:else}
<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
<path stroke-linecap="round" stroke-linejoin="round" d="M21 12.79A9 9 0 1111.21 3a7 7 0 109.79 9.79z" />
</svg>
{/if}
</button>
</div>

<!-- Input form -->
<form onsubmit={verify} class="form">
<input
type="url"
bind:value={url}
placeholder="https://applink.ifortuna.cz/ticketdetail?id=..."
required
class="url-input"
/>
<button type="submit" disabled={loading} class="submit-btn">
{loading ? 'Verifying…' : 'Verify'}
</button>
</form>

<!-- Spinner -->
{#if loading}
<div class="spinner-wrap">
<div class="spinner"></div>
<p class="spinner-label">Running classifiers and resolving bets — this takes ~20 seconds…</p>
</div>
{/if}

<!-- Error -->
{#if error}
<div class="error-box">{error}</div>
{/if}

<!-- Result -->
{#if result}
{@const vs = verdictStyle(result.verdict)}
<div class="verdict-banner" style="background:{vs.bg}; color:{vs.text}">
{vs.label}
</div>

{#if result.resolved_ticket?.ticket_id}
<p class="ticket-id">Ticket ID: {result.resolved_ticket.ticket_id}</p>
{/if}

<BetsTable bets={result.resolved_ticket?.bets ?? []} />
{/if}

</div>
</div>
<style>
.page {
min-height: 100vh;
background: var(--bg-page);
color: var(--text-primary);
transition: background 0.2s, color 0.2s;
}

.container {
max-width: 960px;
margin: 0 auto;
padding: 3rem 1rem;
}

.header-row {
display: flex;
justify-content: space-between;
align-items: flex-start;
margin-bottom: 2.5rem;
}

h1 {
font-size: 2.25rem;
font-weight: 700;
margin: 0 0 0.25rem;
color: var(--text-primary);
}

.subtitle {
color: var(--text-secondary);
margin: 0;
}

.theme-toggle {
background: none;
border: none;
cursor: pointer;
padding: 0.5rem;
border-radius: 0.5rem;
color: var(--text-secondary);
transition: background 0.15s;
}
.theme-toggle:hover { background: var(--bg-surface-alt); }

.form {
display: flex;
gap: 0.5rem;
margin-bottom: 2rem;
}

.url-input {
flex: 1;
background: var(--bg-input);
border: 1px solid var(--border-input);
border-radius: 0.5rem;
padding: 0.625rem 1rem;
color: var(--text-primary);
font-size: 0.875rem;
outline: none;
transition: border-color 0.15s, background 0.2s;
}
.url-input::placeholder { color: var(--text-muted); }
.url-input:focus { border-color: #3b82f6; box-shadow: 0 0 0 3px rgba(59,130,246,0.2); }

.submit-btn {
background: #2563eb;
color: #fff;
border: none;
border-radius: 0.5rem;
padding: 0.625rem 1.5rem;
font-size: 0.875rem;
font-weight: 500;
cursor: pointer;
transition: background 0.15s;
white-space: nowrap;
}
.submit-btn:hover:not(:disabled) { background: #1d4ed8; }
.submit-btn:disabled { background: #1e3a5f; cursor: not-allowed; }

.spinner-wrap {
display: flex;
flex-direction: column;
align-items: center;
gap: 0.75rem;
padding: 3rem 0;
}

.spinner {
width: 2.5rem;
height: 2.5rem;
border: 4px solid var(--border);
border-top-color: #3b82f6;
border-radius: 50%;
animation: spin 0.8s linear infinite;
}
@keyframes spin { to { transform: rotate(360deg); } }

.spinner-label { color: var(--text-secondary); font-size: 0.875rem; margin: 0; }

.error-box {
background: var(--bg-error);
border: 1px solid var(--border-error);
color: var(--text-error);
border-radius: 0.5rem;
padding: 0.75rem 1rem;
}

.verdict-banner {
border-radius: 0.75rem;
text-align: center;
font-size: 1.75rem;
font-weight: 700;
letter-spacing: 0.05em;
padding: 1.5rem 1rem;
margin-bottom: 0.75rem;
}

.ticket-id {
color: var(--text-muted);
font-size: 0.75rem;
margin: 0 0 1.5rem;
}
</style>
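The `verify` function in `App.svelte` above posts a JSON body of the form `{"url": ..., "debug": false}` to `http://localhost:8000/api/v1/resolve` and maps the returned `verdict` string to a banner label. For scripting against the same endpoint, a rough Python equivalent could look like the sketch below. It only builds the request and maps the verdict; `build_resolve_request` and `verdict_label` are hypothetical helpers, the labels are shortened versions of the ones in the component, and the response shape beyond the `verdict` field is not assumed.

```python
import json
from typing import Optional
from urllib.request import Request

API_URL = "http://localhost:8000/api/v1/resolve"  # same URL as in App.svelte


def build_resolve_request(ticket_url: str, debug: bool = False) -> Request:
    """Build (but do not send) the POST request that the frontend issues."""
    body = json.dumps({"url": ticket_url, "debug": debug}).encode("utf-8")
    return Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def verdict_label(verdict: Optional[str]) -> str:
    """Mirror the branch order of verdictStyle() with shortened labels."""
    if verdict == "truthful":
        return "TRUTHFUL"
    if verdict == "not truthful":
        return "NOT TRUTHFUL"
    if verdict is not None and verdict.startswith("possibly"):
        return "POSSIBLY TRUTHFUL"
    return "UNKNOWN"
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` while the backend is running; that step is left out so the sketch has no network dependency.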
67
beaky-frontend/src/app.css
Normal file
@@ -0,0 +1,67 @@
@import "tailwindcss";

/* ── Light theme (default) ─────────────────────────────────────── */
:root {
--bg-page: #f9fafb;
--bg-surface: #ffffff;
--bg-surface-alt:#f3f4f6;
--bg-header: #f9fafb;
--bg-hover: #eff6ff;
--bg-input: #ffffff;
--bg-error: #fef2f2;

--text-primary: #111827;
--text-secondary:#6b7280;
--text-muted: #9ca3af;

--border: #e5e7eb;
--border-subtle: #f3f4f6;
--border-input: #d1d5db;
--border-error: #fca5a5;

--text-error: #b91c1c;

--outcome-win-bg: #dcfce7;
--outcome-win-text: #15803d;
--outcome-lose-bg: #fee2e2;
--outcome-lose-text: #b91c1c;
--outcome-void-bg: #fef9c3;
--outcome-void-text: #a16207;
--outcome-unknown-bg: #f3f4f6;
--outcome-unknown-text:#6b7280;

--conf-track: #e5e7eb;
}

/* ── Dark theme ─────────────────────────────────────────────────── */
.dark {
--bg-page: #030712;
--bg-surface: #111827;
--bg-surface-alt:#030712;
--bg-header: #111827;
--bg-hover: #1f2937;
--bg-input: #1f2937;
--bg-error: #450a0a;

--text-primary: #f9fafb;
--text-secondary:#9ca3af;
--text-muted: #6b7280;

--border: #1f2937;
--border-subtle: #1f2937;
--border-input: #374151;
--border-error: #991b1b;

--text-error: #fca5a5;

--outcome-win-bg: #14532d;
--outcome-win-text: #86efac;
--outcome-lose-bg: #7f1d1d;
--outcome-lose-text: #fca5a5;
--outcome-void-bg: #422006;
--outcome-void-text: #fde68a;
--outcome-unknown-bg: #1f2937;
--outcome-unknown-text:#9ca3af;

--conf-track: #374151;
}
BIN
beaky-frontend/src/assets/hero.png
Normal file
Binary file not shown.
Size: 44 KiB
1
beaky-frontend/src/assets/svelte.svg
Normal file
@@ -0,0 +1 @@
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" class="iconify iconify--logos" width="26.6" height="32" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 308"><path fill="#FF3E00" d="M239.682 40.707C211.113-.182 154.69-12.301 113.895 13.69L42.247 59.356a82.198 82.198 0 0 0-37.135 55.056a86.566 86.566 0 0 0 8.536 55.576a82.425 82.425 0 0 0-12.296 30.719a87.596 87.596 0 0 0 14.964 66.244c28.574 40.893 84.997 53.007 125.787 27.016l71.648-45.664a82.182 82.182 0 0 0 37.135-55.057a86.601 86.601 0 0 0-8.53-55.577a82.409 82.409 0 0 0 12.29-30.718a87.573 87.573 0 0 0-14.963-66.244"></path><path fill="#FFF" d="M106.889 270.841c-23.102 6.007-47.497-3.036-61.103-22.648a52.685 52.685 0 0 1-9.003-39.85a49.978 49.978 0 0 1 1.713-6.693l1.35-4.115l3.671 2.697a92.447 92.447 0 0 0 28.036 14.007l2.663.808l-.245 2.659a16.067 16.067 0 0 0 2.89 10.656a17.143 17.143 0 0 0 18.397 6.828a15.786 15.786 0 0 0 4.403-1.935l71.67-45.672a14.922 14.922 0 0 0 6.734-9.977a15.923 15.923 0 0 0-2.713-12.011a17.156 17.156 0 0 0-18.404-6.832a15.78 15.78 0 0 0-4.396 1.933l-27.35 17.434a52.298 52.298 0 0 1-14.553 6.391c-23.101 6.007-47.497-3.036-61.101-22.649a52.681 52.681 0 0 1-9.004-39.849a49.428 49.428 0 0 1 22.34-33.114l71.664-45.677a52.218 52.218 0 0 1 14.563-6.398c23.101-6.007 47.497 3.036 61.101 22.648a52.685 52.685 0 0 1 9.004 39.85a50.559 50.559 0 0 1-1.713 6.692l-1.35 4.116l-3.67-2.693a92.373 92.373 0 0 0-28.037-14.013l-2.664-.809l.246-2.658a16.099 16.099 0 0 0-2.89-10.656a17.143 17.143 0 0 0-18.398-6.828a15.786 15.786 0 0 0-4.402 1.935l-71.67 45.674a14.898 14.898 0 0 0-6.73 9.975a15.9 15.9 0 0 0 2.709 12.012a17.156 17.156 0 0 0 18.404 6.832a15.841 15.841 0 0 0 4.402-1.935l27.345-17.427a52.147 52.147 0 0 1 14.552-6.397c23.101-6.006 47.497 3.037 61.102 22.65a52.681 52.681 0 0 1 9.003 39.848a49.453 49.453 0 0 1-22.34 33.12l-71.664 45.673a52.218 52.218 0 0 1-14.563 6.398"></path></svg>
Size: 1.9 KiB
1
beaky-frontend/src/assets/vite.svg
Normal file
File diff suppressed because one or more lines are too long
Size: 8.5 KiB
171
beaky-frontend/src/lib/BetsTable.svelte
Normal file
@@ -0,0 +1,171 @@
<script>
  let { bets } = $props()

  function formatBetType(t) {
    return {
      win_draw_lose: '1X2 Result',
      win_draw_lose_double: '1X2 Double',
      win_lose: 'Win/Lose',
      both_team_scored: 'Both Teams Score',
      goal_amount: 'Over/Under Goals',
      goal_handicap: 'Goal Handicap',
      half_time_result: 'HT Result',
      half_time_double: 'HT Double',
      half_time_full_time: 'HT/FT',
      corner_amount: 'Over/Under Corners',
      team_corner_amount: 'Team Corners',
      more_offsides: 'More Offsides',
      advance: 'Advance',
      unknown: 'Unknown',
    }[t] ?? t
  }

  function formatBetDetail(bet) {
    const result = (code) => ({ '1': 'Home win', '2': 'Away win', '0': 'Draw', 'X': 'Draw' }[code] ?? code)
    const double = (code) => ({ '01': 'Home or Draw', '02': 'Draw or Away', '12': 'Home or Away' }[code] ?? code)
    const team = (code) => code === '1' ? bet.team1Name : bet.team2Name
    const sign = (n) => (n >= 0 ? '+' : '') + n

    switch (bet.ticketType) {
      case 'win_draw_lose': return result(bet.betType)
      case 'win_draw_lose_double': return double(bet.betType)
      case 'win_lose': return result(bet.betType)
      case 'both_team_scored': return 'Both teams to score'
      case 'goal_amount': return (bet.over ? 'Over ' : 'Under ') + bet.line + ' goals'
      case 'goal_handicap': return team(bet.team_bet) + ' ' + sign(bet.handicap_amount)
      case 'half_time_result': return 'HT: ' + result(bet.betType)
      case 'half_time_double': return 'HT: ' + double(bet.betType)
      case 'half_time_full_time': return 'HT: ' + result(bet.ht_bet) + ' / FT: ' + result(bet.ft_bet)
      case 'corner_amount': return (bet.over ? 'Over ' : 'Under ') + bet.line + ' corners'
      case 'team_corner_amount': return team(bet.team_bet) + ' ' + (bet.over ? 'over ' : 'under ') + bet.line
      case 'more_offsides': return team(bet.team_bet) + ' more offsides'
      case 'advance': return 'Advance'
      default: return bet.raw_text || 'Unknown'
    }
  }
</script>

<div class="table-wrap">
  <table>
    <thead>
      <tr>
        <th>Match</th>
        <th>Date</th>
        <th>League</th>
        <th>Bet Type</th>
        <th>Pick</th>
        <th>Outcome</th>
        <th>Confidence</th>
      </tr>
    </thead>
    <tbody>
      {#each bets as resolved, i}
        {@const bet = resolved.bet}
        {@const outcome = (resolved.outcome ?? 'unknown').toLowerCase()}
        {@const conf = Math.round((resolved.confidence ?? 0) * 100)}
        {@const date = bet.date ? new Date(bet.date) : null}
        <tr class:alt={i % 2 !== 0}>
          <td class="match">
            {bet.team1Name} <span class="vs">vs</span> {bet.team2Name}
          </td>
          <td class="secondary">
            {date ? date.toLocaleDateString('cs-CZ', { day: 'numeric', month: 'short', year: 'numeric' }) : '—'}
          </td>
          <td class="secondary">{bet.league ?? '—'}</td>
          <td>{formatBetType(bet.ticketType)}</td>
          <td>{formatBetDetail(bet)}</td>
          <td>
            <span class="badge outcome-{outcome}">{outcome}</span>
          </td>
          <td>
            <div class="conf">
              <div class="conf-track">
                <div class="conf-fill" style="width:{conf}%"></div>
              </div>
              <span class="conf-label">{conf}%</span>
            </div>
          </td>
        </tr>
      {/each}
    </tbody>
  </table>
</div>

<style>
  .table-wrap {
    overflow-x: auto;
    border-radius: 0.75rem;
    border: 1px solid var(--border);
  }

  table {
    width: 100%;
    border-collapse: collapse;
    font-size: 0.875rem;
  }

  thead tr {
    border-bottom: 1px solid var(--border);
    background: var(--bg-header);
  }

  th {
    padding: 0.75rem 1rem;
    text-align: left;
    font-weight: 500;
    color: var(--text-secondary);
  }

  tbody tr {
    border-bottom: 1px solid var(--border-subtle);
    background: var(--bg-surface);
    transition: background 0.1s;
  }
  tbody tr.alt { background: var(--bg-surface-alt); }
  tbody tr:last-child { border-bottom: none; }
  tbody tr:hover { background: var(--bg-hover); }

  td {
    padding: 0.75rem 1rem;
    color: var(--text-primary);
  }

  .match { font-weight: 500; }
  .vs { color: var(--text-muted); }
  .secondary { color: var(--text-secondary); white-space: nowrap; }

  .badge {
    display: inline-block;
    padding: 0.125rem 0.5rem;
    border-radius: 0.25rem;
    font-size: 0.75rem;
    font-weight: 600;
    text-transform: uppercase;
  }
  .outcome-win { background: var(--outcome-win-bg); color: var(--outcome-win-text); }
  .outcome-lose { background: var(--outcome-lose-bg); color: var(--outcome-lose-text); }
  .outcome-void { background: var(--outcome-void-bg); color: var(--outcome-void-text); }
  .outcome-unknown { background: var(--outcome-unknown-bg); color: var(--outcome-unknown-text); }

  .conf {
    display: flex;
    align-items: center;
    gap: 0.5rem;
  }
  .conf-track {
    width: 4rem;
    height: 6px;
    background: var(--conf-track);
    border-radius: 9999px;
    overflow: hidden;
  }
  .conf-fill {
    height: 100%;
    background: #3b82f6;
    border-radius: 9999px;
  }
  .conf-label {
    font-size: 0.75rem;
    color: var(--text-muted);
  }
</style>
5
beaky-frontend/src/lib/Counter.svelte
Normal file
@@ -0,0 +1,5 @@
<script>
  let count = $state(0)
</script>

<button class="counter" onclick={() => count++}>Count is {count}</button>
9
beaky-frontend/src/main.js
Normal file
@@ -0,0 +1,9 @@
import { mount } from 'svelte'
import './app.css'
import App from './App.svelte'

const app = mount(App, {
  target: document.getElementById('app'),
})

export default app
2
beaky-frontend/svelte.config.js
Normal file
@@ -0,0 +1,2 @@
/** @type {import("@sveltejs/vite-plugin-svelte").SvelteConfig} */
export default {}
7
beaky-frontend/vite.config.js
Normal file
@@ -0,0 +1,7 @@
import { defineConfig } from 'vite'
import { svelte } from '@sveltejs/vite-plugin-svelte'
import tailwindcss from '@tailwindcss/vite'

export default defineConfig({
  plugins: [tailwindcss(), svelte()],
})
@@ -1,88 +0,0 @@
import os
import re
import sys
import argparse
from datetime import datetime
import pytz
from openpyxl import Workbook


def process_files(starting_id, output_filename="output.xlsx"):
    # Find all txt files in the current directory
    txt_files = [f for f in os.listdir('.') if f.endswith('.txt')]

    if not txt_files:
        print("No .txt files found in the current directory.")
        return

    # Regex patterns for input data
    date_pattern = re.compile(r'\[.*?(\d{1,2})\s+(\d{1,2}),\s+(\d{4})\s+at\s+(\d{1,2}:\d{2})\]')
    url_pattern = re.compile(r'(https?://[^\s]+)')

    # Timezone setup (CET to UTC)
    local_tz = pytz.timezone("Europe/Prague")

    # Set up the Excel Workbook
    wb = Workbook()
    ws = wb.active
    ws.title = "Fortuna Data"
    ws.append(["ID", "URL", "Date_UTC"])  # Add headers

    current_id = starting_id
    success_files = []

    for filename in txt_files:
        try:
            with open(filename, 'r', encoding='utf-8') as f:
                content = f.read()

            dates = date_pattern.findall(content)
            urls = url_pattern.findall(content)

            # Extract and format the data
            for i in range(min(len(dates), len(urls))):
                month, day, year, time_str = dates[i]

                # Parse the datetime from the text file
                dt_str = f"{year}-{month}-{day} {time_str}"
                local_dt = datetime.strptime(dt_str, "%Y-%m-%d %H:%M")

                # Convert CET to UTC
                localized_dt = local_tz.localize(local_dt)
                utc_dt = localized_dt.astimezone(pytz.utc)

                # NEW: Format to ISO 8601 with T and Z
                formatted_date = utc_dt.strftime("%Y-%m-%dT%H:%M:%SZ")

                # Add a new row to the Excel sheet
                ws.append([current_id, urls[i], formatted_date])
                current_id += 1

            # Queue file for deletion
            success_files.append(filename)

        except Exception as e:
            print(f"Error processing {filename}: {e}", file=sys.stderr)

    # Save the Excel file
    try:
        wb.save(output_filename)
        print(f"Successfully saved data to {output_filename}")

        # Clean up only if save was successful
        for filename in success_files:
            os.remove(filename)
            print(f"Deleted: {filename}")

    except Exception as e:
        print(f"Failed to save {output_filename}. No text files were deleted. Error: {e}", file=sys.stderr)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Extract URLs to an Excel file with ISO UTC dates.")
    parser.add_argument("start_id", type=int, help="Starting ID for the output")
    parser.add_argument("--output", type=str, default="extracted_data.xlsx",
                        help="Output Excel filename (default: extracted_data.xlsx)")
    args = parser.parse_args()

    process_files(args.start_id, args.output)
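Note on the removed script above: it pairs `dates[i]` with `urls[i]` by index, so a single timestamped message without a URL shifts every later pairing. A minimal sketch of a position-based pairing follows. It reuses the two regexes from the script; the helper name `pair_dates_with_urls` and the sample text are hypothetical, and the exact input format of the `.txt` files is an assumption.

```python
import re

# Same patterns as in the removed script.
DATE_RE = re.compile(r'\[.*?(\d{1,2})\s+(\d{1,2}),\s+(\d{4})\s+at\s+(\d{1,2}:\d{2})\]')
URL_RE = re.compile(r'(https?://[^\s]+)')


def pair_dates_with_urls(content):
    """Pair each timestamp with the first URL found before the next timestamp."""
    pairs = []
    stamps = list(DATE_RE.finditer(content))
    for idx, stamp in enumerate(stamps):
        # The message body runs from the end of this timestamp to the next one.
        end = stamps[idx + 1].start() if idx + 1 < len(stamps) else len(content)
        url = URL_RE.search(content, stamp.end(), end)
        if url:
            # Timestamps without a URL are skipped instead of shifting later pairs.
            pairs.append((stamp.groups(), url.group(1)))
    return pairs


# Hypothetical input: the first message has no link.
sample = (
    "[Mon 3 15, 2024 at 14:30] no link here\n"
    "[Mon 3 16, 2024 at 9:05] https://example.com/t/1\n"
)
print(pair_dates_with_urls(sample))
```

With index-based pairing the URL in the second message would be attached to the first timestamp; here it stays with its own message.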
@@ -62,4 +62,4 @@ meaning?
- Bet AS Roma +0.5: loss (virtual score 2 : 1.5)
- Bet AS Roma +1: void (virtual score 2 : 2, stake is returned)

- Vin wants to implement: corners, cards, offsides, goalscorers
@@ -1,13 +0,0 @@
from pydantic.dataclasses import dataclass

from beaky.image_classifier.config import ImgClassifierConfig
from beaky.resolvers.config import ResolverConfig
from beaky.screenshotter.config import ScreenshotterConfig


@dataclass
class Config:
    path: str
    screenshotter: ScreenshotterConfig
    resolver: ResolverConfig
    img_classifier: ImgClassifierConfig
@@ -1,194 +0,0 @@
import datetime
import logging
import re
from pathlib import Path

from pytesseract import pytesseract

from beaky.datamodels.ticket import (
    Advance,
    Bet,
    BetType,
    BothTeamScored,
    GoalAmount,
    GoalHandicap,
    Ticket,
    UnknownBet,
    WinDrawLose,
    WinDrawLoseDouble,
    WinLose,
)


def img_to_text(path: str) -> str:
    """Given a path to an image, return the text contained in that image.
    Bypasses PIL and lets Tesseract read the file directly.
    """
    try:
        text = pytesseract.image_to_string(path, lang="ces")
        return text.strip()
    except pytesseract.TesseractNotFoundError:
        print("Error: Tesseract executable not found on your system.")
        return ""
    except Exception as e:
        print(f"Error processing {path}: {e}")
        return ""


def classify(text: str) -> Bet:
    """Given text extracted from an image, return the Bet object that
    best matches that text."""

    logger = logging.getLogger(__name__)
    if not text:
        return UnknownBet(
            ticketType=BetType.UNKNOWN,
            team1Name="N/A",
            team2Name="N/A",
            date=datetime.datetime.now(),
            league="N/A",
            raw_text="No text extracted",
        )

    # 1. Defaults & Normalization
    text_lower = text.lower()
    date_obj = datetime.datetime.now()
    team1, team2 = "Unknown", "Unknown"
    league = "Unknown"

    # 2. Heuristic extraction of Teams (Looking for "Team A - Team B" patterns)
    lines = [line.strip() for line in text.split("\n") if line.strip()]
    for line in lines:
        if " - " in line or " vs " in line or " v " in line:
            # Avoid splitting on hyphens in dates or numbers
            if not re.search(r"\d\s*-\s*\d", line):
                parts = re.split(r" - | vs | v ", line)
                if len(parts) >= 2:
                    team1, team2 = parts[0].strip(), parts[1].strip()
                    break

    # 3. Heuristic extraction of Date (Looking for DD.MM. YYYY HH:MM)
    date_match = re.search(r"(\d{1,2}\.\s*\d{1,2}\.?\s*(?:\d{2,4})?)\s*(\d{1,2}:\d{2})?", text)
    if date_match:
        try:
            # Fallback to current year if missing, basic parse attempt
            date_str = f"{date_match.group(1).replace(' ', '')} {date_match.group(2) or '00:00'}"
            if len(date_str.split(".")[2]) <= 5:  # Missing year
                date_str = date_str.replace(" ", f"{datetime.datetime.now().year} ")
            date_obj = datetime.datetime.strptime(date_str, "%d.%m.%Y %H:%M")
        except Exception:
            pass  # Keep default if parsing fails

    # 4. Classification Logic based on keywords
    base_args = {"team1Name": team1, "team2Name": team2, "date": date_obj, "league": league}

    # Advance / Postup
    if any(kw in text_lower for kw in ["postup", "postoupí", "advance"]):
        return Advance(ticketType=BetType.ADVANCED, **base_args)

    # Both Teams to Score / Oba dají gól
    if any(kw in text_lower for kw in ["oba dají gól", "btts", "oba týmy dají gól"]):
        return BothTeamScored(ticketType=BetType.BOTH_TEAM_SCORED, **base_args)

    # Goal Amount (Over/Under)
    if any(kw in text_lower for kw in ["počet gólů", "více než", "méně než", "over", "under"]):
        # Attempt to find the goal line (e.g., 2.5, 3.5)
        line_match = re.search(r"(\d+\.\d+)", text)
        line_val = float(line_match.group(1)) if line_match else 2.5
        is_over = any(kw in text_lower for kw in ["více", "over", "+"])

        return GoalAmount(ticketType=BetType.GOAL_AMOUNT, line=line_val, over=is_over, **base_args)

    # Goal Handicap
    if any(kw in text_lower for kw in ["handicap", "hcp"]):
        hcp_match = re.search(r"([+-]?\d+\.\d+)", text)
        hcp_val = float(hcp_match.group(1)) if hcp_match else 0.0
        # Simplistic logic: guess team 1 if not explicitly stated
        team_bet = "2" if " 2 " in text else "1"

        return GoalHandicap(ticketType=BetType.GOAL_HANDICAP, team_bet=team_bet, handicap_amount=hcp_val, **base_args)

    # Win Draw Lose Double (1X, X2, 12)
    if any(kw in text_lower for kw in ["1x", "x2", "12", "dvojitá šance", "neprohra"]):
        bet_type = "01" if "1x" in text_lower else "02" if "x2" in text_lower else "12"
        return WinDrawLoseDouble(ticketType=BetType.WIN_DRAW_LOSE_DOUBLE, betType=bet_type, **base_args)

    # Win Lose (Draw no bet / Vítěz do rozhodnutí)
    if any(kw in text_lower for kw in ["bez remízy", "vítěz do rozhodnutí", "konečný vítěz"]):
        bet_type = "2" if re.search(r"\b2\b", text) else "1"
        return WinLose(ticketType=BetType.WIN_LOSE, betType=bet_type, **base_args)

    # Win Draw Lose (Standard Match Odds)
    if any(kw in text_lower for kw in ["zápas", "výsledek zápasu", "1x2"]):
        # Look for isolated 1, X (or 0), or 2
        match_pick = re.search(r"\b(1|x|0|2)\b", text_lower)
        bet_type = match_pick.group(1).upper() if match_pick else "1"
        if bet_type == "X":
            bet_type = "0"

        return WinDrawLose(ticketType=BetType.WIN_DRAW_LOSE, betType=bet_type, **base_args)

    # Fallback Unknown
    return UnknownBet(ticketType=BetType.UNKNOWN, raw_text=text, **base_args)
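Note on step 3 of `classify()`: for text such as `15.3. 18:30` the optional year group can swallow the hour, and the `<= 5` length check only detects a missing year when the hour has one digit, so most year-less dates fall back to `datetime.now()`. A minimal sketch of a stricter parse is below. The helper `parse_ticket_date` is hypothetical, and it assumes the month is always followed by a dot and that a year, when present, has four digits.

```python
import datetime
import re
from typing import Optional

# Day, month, optional 4-digit year, optional HH:MM. The year must be followed
# by whitespace or end of text so an hour such as "18" is never read as a year.
DATE_RE = re.compile(
    r"(\d{1,2})\.\s*(\d{1,2})\.(?:\s*(\d{4})(?=\s|$))?(?:\s*(\d{1,2}):(\d{2}))?"
)


def parse_ticket_date(text: str, default_year: int) -> Optional[datetime.datetime]:
    """Return the first DD.MM.[YYYY] [HH:MM] timestamp in text, or None."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    day, month, year, hour, minute = match.groups()
    try:
        return datetime.datetime(
            int(year) if year else default_year,
            int(month),
            int(day),
            int(hour) if hour else 0,
            int(minute) if minute else 0,
        )
    except ValueError:  # impossible values such as 31.2. or 25:99
        return None
```

Returning `None` instead of the current time lets the caller decide how to handle a ticket without a readable date; passing `default_year` explicitly keeps the function deterministic in tests.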


def img_classify(paths: list[str], ticket_id: int) -> Ticket:
    """Given the paths to the images of one ticket, return a Ticket holding
    the bets classified from those images."""
    ticket = Ticket(id=ticket_id, bets=[])
    # Define valid image extensions to ignore system files or text documents
    valid_extensions = {".png", ".jpg", ".jpeg", ".bmp", ".tiff", ".webp"}

    # Iterate through all given paths
    for file in paths:
        file_path = Path(file)
        if file_path.is_file() and file_path.suffix.lower() in valid_extensions:
            # 1. Extract the text (called separately)
            extracted_text = img_to_text(str(file_path))
            print(extracted_text)

            # 2. Classify based on the extracted text (called separately)
            try:
                result = classify(extracted_text)
            except Exception as exc:  # pragma: no cover - defensive fallback
                # Ensure result is always defined so downstream code cannot reference an unbound name
                print(f"classify() raised an exception: {exc}")
                result = UnknownBet(
                    ticketType=BetType.UNKNOWN,
                    team1Name="N/A",
                    team2Name="N/A",
                    league="N/A",
                    raw_text=extracted_text,
                    date=datetime.datetime.now(),
                )

            # 3. Add the resulting bets to the ticket
            # Support classifier returning either a single Bet or a list of Bet
            if result is None:
                continue

            if isinstance(result, list):
                for r in result:
                    print(
                        r.date,
                        getattr(r, "ticketType", None),
                        r.team1Name,
                        r.team2Name,
                        r.league,
                    )
                ticket.bets.extend(result)
            else:
                print(
                    result.date,
                    getattr(result, "ticketType", None),
                    result.team1Name,
                    result.team2Name,
                    result.league,
                )
                ticket.bets.append(result)

    return ticket


if __name__ == "__main__":
    img_classify(["./data/screenshots/2.png"], ticket_id=1)
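Note on the double-chance branch of the removed `classify()`: the substring test for `"12"` also fires on dates (`12.3.`) and odds (`1.12`), and `"1x"` fires on the market name `1X2`. A minimal sketch of a token-based check follows. The helper `double_chance_code` is hypothetical; the returned codes (`01`, `02`, `12`) are the ones the removed classifier already used.

```python
import re
from typing import Optional

# A pick counts only as a standalone token: not glued to letters, digits or a
# colon, and not part of a decimal number or a dotted date.
_DOUBLE_PICK = re.compile(
    r"(?<![\w:])(?<!\d[.,])(1x|x2|12)(?![\w:]|[.,]\d)",
    re.IGNORECASE,
)
_CODES = {"1x": "01", "x2": "02", "12": "12"}


def double_chance_code(text: str) -> Optional[str]:
    """Return the double-chance code for the first standalone pick, or None."""
    match = _DOUBLE_PICK.search(text)
    return _CODES[match.group(1).lower()] if match else None


print(double_chance_code("Dvojitá šance 1X"))
print(double_chance_code("Výsledek zápasu 1X2, 12.3. 18:00"))
```

This only tightens the pick detection; the keyword checks for "dvojitá šance" and "neprohra" would still be needed to recognise the market when no pick token is present.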