Friday. Had a crazy idea yesterday to embark upon a largish-scale conversion of PDFs (about 1,800 of them) into searchable text docs. Things you will need:
- Scripting abilities (check)
- Material (check)
- Database schema (working on)
- OCR software (looking at)
- 1 CVS repository
- Lots of webspace (Hmm, could be tricky)
- Time (not good, none until Sunday)
- A few crazy people who think it's a good idea (1 so far)
- Lots of cool people who don't mind checking/transcribing documents
I don't even know how to start on this. Yeek.