# TODO - Developer Task Tracking
This file is used temporarily by the assistant to track progress on implementation tasks. It will be removed when all work is complete.
## Milestone 0: Bootstrap
- [x] M0-01: Create project skeleton directories and files
- [x] M0-02: Initialize Go module and pin go version
- [x] M0-03: Add .gitignore, .env.example, README
- [ ] M0-04: Create Makefile (pending)
## Milestone 1: Config and Logging
- [ ] M1-01: Implement logging setup using log/slog
- [ ] M1-02: Implement configuration loader: env + flags + .env
- [ ] M1-03: Define config schema and defaults
- [ ] M1-04: Add config validation
- [ ] M1-05: Unit tests for config parsing precedence
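
A minimal sketch of how M1-01 through M1-05 could fit together, assuming the precedence is flag > environment variable > default (the task list does not state the order). The `Config` fields, the `SCRAPER_*` variable names, and the flag names are hypothetical, and `.env` loading is left out. Injecting `getenv` keeps the precedence logic testable without touching the real environment.

```go
package main

import (
	"flag"
	"fmt"
	"log/slog"
	"os"
)

// Config holds a hypothetical subset of settings; field names are illustrative.
type Config struct {
	Query    string
	LogLevel string
}

// loadConfig resolves each setting as: flag > environment variable > default.
// getenv is injected so tests can supply a fake environment.
func loadConfig(args []string, getenv func(string) string) (Config, error) {
	cfg := Config{Query: "golang", LogLevel: "info"} // assumed defaults

	// Environment overrides defaults.
	if v := getenv("SCRAPER_QUERY"); v != "" {
		cfg.Query = v
	}
	if v := getenv("SCRAPER_LOG_LEVEL"); v != "" {
		cfg.LogLevel = v
	}

	// Flags override environment: the env-resolved value becomes the flag default.
	fs := flag.NewFlagSet("scraper", flag.ContinueOnError)
	fs.StringVar(&cfg.Query, "query", cfg.Query, "search query")
	fs.StringVar(&cfg.LogLevel, "log-level", cfg.LogLevel, "debug|info|warn|error")
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}
	return cfg, nil
}

// newLogger builds a text slog.Logger at the configured level and
// rejects level names that slog does not recognize.
func newLogger(level string) (*slog.Logger, error) {
	var lvl slog.Level
	if err := lvl.UnmarshalText([]byte(level)); err != nil {
		return nil, fmt.Errorf("invalid log level %q: %w", level, err)
	}
	h := slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: lvl})
	return slog.New(h), nil
}

func main() {
	cfg, err := loadConfig(os.Args[1:], os.Getenv)
	if err != nil {
		os.Exit(2)
	}
	logger, err := newLogger(cfg.LogLevel)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	logger.Info("config loaded", "query", cfg.Query, "log_level", cfg.LogLevel)
}
```
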
## Milestone 2: Types and Parser
- [ ] M2-01: Define normalized data types
- [ ] M2-02: Define minimal Reddit API response structs
- [ ] M2-03: Implement parser
- [ ] M2-04: Implement deleted/removed filters
- [ ] M2-05: Parser unit tests
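
One possible shape for M2-01 through M2-04, as a sketch only. The response structs cover just the listing fields used here, the normalized `Post` type is hypothetical, and the filter assumes deleted or removed posts are marked with `[deleted]` / `[removed]` placeholders. The sample JSON is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// listing mirrors only the parts of a listing response this sketch reads.
type listing struct {
	Data struct {
		After    string `json:"after"`
		Children []struct {
			Data rawPost `json:"data"`
		} `json:"children"`
	} `json:"data"`
}

type rawPost struct {
	ID         string  `json:"id"`
	Title      string  `json:"title"`
	Author     string  `json:"author"`
	Selftext   string  `json:"selftext"`
	CreatedUTC float64 `json:"created_utc"`
}

// Post is a hypothetical normalized record.
type Post struct {
	ID        string
	Title     string
	Author    string
	Body      string
	CreatedAt int64 // Unix seconds
}

// isDeletedOrRemoved assumes placeholder strings mark deleted/removed posts.
func isDeletedOrRemoved(p rawPost) bool {
	return p.Author == "[deleted]" || p.Selftext == "[deleted]" || p.Selftext == "[removed]"
}

// parseListing decodes one page, drops deleted/removed posts, and returns
// the normalized posts plus the "after" cursor for the next page.
func parseListing(body []byte) ([]Post, string, error) {
	var l listing
	if err := json.Unmarshal(body, &l); err != nil {
		return nil, "", fmt.Errorf("decode listing: %w", err)
	}
	posts := make([]Post, 0, len(l.Data.Children))
	for _, c := range l.Data.Children {
		if isDeletedOrRemoved(c.Data) {
			continue
		}
		posts = append(posts, Post{
			ID:        c.Data.ID,
			Title:     c.Data.Title,
			Author:    c.Data.Author,
			Body:      c.Data.Selftext,
			CreatedAt: int64(c.Data.CreatedUTC),
		})
	}
	return posts, l.Data.After, nil
}

// sampleListing is fabricated test input, not real API output.
const sampleListing = `{"kind":"Listing","data":{"after":"t3_next","children":[
 {"kind":"t3","data":{"id":"a1","title":"Hello","author":"alice","selftext":"body","created_utc":1700000000.0}},
 {"kind":"t3","data":{"id":"b2","title":"Gone","author":"[deleted]","selftext":"[deleted]","created_utc":1700000100.0}},
 {"kind":"t3","data":{"id":"c3","title":"Modded","author":"bob","selftext":"[removed]","created_utc":1700000200.0}}
]}}`

func main() {
	posts, after, err := parseListing([]byte(sampleListing))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("kept %d post(s), next cursor %q\n", len(posts), after)
}
```
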
## Milestone 3: Fetcher and Networking
- [ ] M3-01: Build HTTP client
- [ ] M3-02: Implement rate limiter
- [ ] M3-03: Implement backoff with jitter
- [ ] M3-04: URL builder for search.json
- [ ] M3-05: Implement fetchPage
- [ ] M3-06: Fetcher tests
- [ ] M3-07: Implement metrics capture
## Milestone 4: Storage and Dedup
- [ ] M4-01: Implement JSONL writer
- [ ] M4-02: File naming and rotation
- [ ] M4-03: Ensure output dir creation
- [ ] M4-04: Implement dedup index
- [ ] M4-05: Dedup persistence
- [ ] M4-06: Storage unit tests
## Milestone 5: Controller and Orchestration
- [ ] M5-01: Implement controller orchestrator
- [ ] M5-02: Pagination loop
- [ ] M5-03: Integrate fetcher→parser→storage
- [ ] M5-04: Progress reporting
- [ ] M5-05: Graceful shutdown
- [ ] M5-06: Summary report
- [ ] M5-07: Wire CLI entrypoint
- [ ] M5-08: Error code taxonomy
- [ ] M5-09: Controller integration test
## Milestone 6: Nice-to-haves
- [ ] M6-01: Date-based subdir option
- [ ] M6-02: Optional compression on rollover
## Milestone 7: Performance
- [ ] M7-01: Performance runbook
- [ ] M7-02: Benchmark tuning
## Milestone 8: Docs and Release
- [ ] M8-01: README expansion
- [ ] M8-02: Cron examples
- [ ] M8-03: Sample data
- [ ] M8-04: CI steps
- [ ] M8-05: Tag and build release
## Milestone 9: Verification
- [ ] M9-01: Post-implementation checklist
## Progress notes
- Created project skeleton and minimal main.go.
- Next: implement logging + config and update main.go to use them.