TODO - Developer Task Tracking

This file is used temporarily by the assistant to track progress on implementation tasks. It will be removed when all work is complete.

Milestone 0 Bootstrap

  • M0-01: Create project skeleton directories and files
  • M0-02: Initialize Go module and pin go version
  • M0-03: Add .gitignore, .env.example, README
  • M0-04: Create Makefile (pending)

Milestone 1 Config and Logging

  • M1-01: Implement logging setup using log/slog
  • M1-02: Implement configuration loader: env + flags + .env
  • M1-03: Define config schema and defaults
  • M1-04: Add config validation
  • M1-05: Unit tests for config parsing precedence
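
Sketch for M1-02 and M1-05: one way to make precedence testable is to model each config source as a layer and resolve a key from the first layer that defines it. The ordering used here (flags, then env, then .env, then defaults) is an assumption, since this file does not fix the order; `layer` and `resolve` are hypothetical names.

```go
package main

import "fmt"

// layer is one configuration source. Layers are passed to resolve
// from highest to lowest precedence.
type layer struct {
	name   string
	values map[string]string
}

// resolve returns the value for key from the first layer that defines it,
// plus the name of that layer. A key present with an empty value still wins.
func resolve(key string, layers []layer) (string, string, bool) {
	for _, l := range layers {
		if v, found := l.values[key]; found {
			return v, l.name, true
		}
	}
	return "", "", false
}

func main() {
	// Assumed order: flags > env > .env > defaults.
	layers := []layer{
		{"flags", map[string]string{"query": "golang"}},
		{"env", map[string]string{"query": "rust", "out_dir": "/tmp/out"}},
		{"dotenv", map[string]string{"out_dir": "./data", "rate": "2"}},
		{"defaults", map[string]string{"out_dir": "./out", "rate": "1"}},
	}
	for _, k := range []string{"query", "out_dir", "rate"} {
		v, from, _ := resolve(k, layers)
		fmt.Printf("%s=%q (from %s)\n", k, v, from)
	}
}
```

Keeping the order in a slice means the precedence tests only need to build small maps and assert which layer a value came from.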

Milestone 2 Types and Parser

  • M2-01: Define normalized data types
  • M2-02: Define minimal Reddit API response structs
  • M2-03: Implement parser
  • M2-04: Implement deleted/removed filters
  • M2-05: Parser unit tests
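
Sketch for M2-02 to M2-04: a minimal decode of a Reddit listing followed by a deleted/removed filter. The struct covers only a handful of fields chosen for illustration, and the filter rule (author `[deleted]`, or selftext `[deleted]` / `[removed]`) is an assumption about what this project wants to drop; `listing`, `post`, `parsePosts`, and `isDeletedOrRemoved` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// post holds the few fields this sketch keeps from each child.
type post struct {
	ID       string `json:"id"`
	Title    string `json:"title"`
	Author   string `json:"author"`
	Selftext string `json:"selftext"`
}

// listing mirrors the outer shape of a listing response.
type listing struct {
	Data struct {
		After    string `json:"after"`
		Children []struct {
			Kind string `json:"kind"`
			Data post   `json:"data"`
		} `json:"children"`
	} `json:"data"`
}

// isDeletedOrRemoved reports whether a post carries one of the placeholder markers.
func isDeletedOrRemoved(p post) bool {
	return p.Author == "[deleted]" || p.Selftext == "[deleted]" || p.Selftext == "[removed]"
}

// parsePosts decodes raw, keeps link posts (kind "t3") that pass the filter,
// and returns them with the "after" cursor for the next page.
func parsePosts(raw []byte) ([]post, string, error) {
	var l listing
	if err := json.Unmarshal(raw, &l); err != nil {
		return nil, "", fmt.Errorf("decode listing: %w", err)
	}
	var posts []post
	for _, c := range l.Data.Children {
		if c.Kind != "t3" || isDeletedOrRemoved(c.Data) {
			continue
		}
		posts = append(posts, c.Data)
	}
	return posts, l.Data.After, nil
}

func main() {
	raw := []byte(`{"kind":"Listing","data":{"after":"t3_c","children":[
		{"kind":"t3","data":{"id":"a","title":"kept","author":"alice","selftext":"hello"}},
		{"kind":"t3","data":{"id":"b","title":"gone","author":"[deleted]","selftext":"[deleted]"}},
		{"kind":"t3","data":{"id":"c","title":"modded","author":"bob","selftext":"[removed]"}}]}}`)
	posts, after, err := parsePosts(raw)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Printf("kept %d post(s), next cursor %q\n", len(posts), after)
}
```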

Milestone 3 Fetcher and Networking

  • M3-01: Build HTTP client
  • M3-02: Implement rate limiter
  • M3-03: Implement backoff with jitter
  • M3-04: URL builder for search.json
  • M3-05: Implement fetchPage
  • M3-06: Fetcher tests
  • M3-07: Implement metrics capture

Milestone 4 Storage and Dedup

  • M4-01: Implement JSONL writer
  • M4-02: File naming and rotation
  • M4-03: Ensure output dir creation
  • M4-04: Implement dedup index
  • M4-05: Dedup persistence
  • M4-06: Storage unit tests

Milestone 5 Controller and Orchestration

  • M5-01: Implement controller orchestrator
  • M5-02: Pagination loop
  • M5-03: Integrate fetcher→parser→storage
  • M5-04: Progress reporting
  • M5-05: Graceful shutdown
  • M5-06: Summary report
  • M5-07: Wire CLI entrypoint
  • M5-08: Error code taxonomy
  • M5-09: Controller integration test

Milestone 6 Nice-to-haves

  • M6-01: Date-based subdir option
  • M6-02: Optional compression on rollover

Milestone 7 Performance

  • M7-01: Performance runbook
  • M7-02: Benchmark tuning

Milestone 8 Docs and Release

  • M8-01: README expansion
  • M8-02: Cron examples
  • M8-03: Sample data
  • M8-04: CI steps
  • M8-05: Tag and build release

Milestone 9 Verification

  • M9-01: Post-implementation checklist

Progress notes:

  • Created project skeleton and minimal main.go.
  • Next: implement logging + config and update main.go to use them.