One thing we’d love feedback on from HN folks: how are you currently orchestrating document ingestion in your AI pipelines? Did you build custom extractors, use open-source OCR, or go fully LLM-based? We’ve tried a bunch of approaches, but we’re curious what others are doing. Happy to share more if useful.