Most books and lots of other things are on Usenet. Why scan it when it's probably already available?

What's the book? I'll have a quick look.
paulrainer: Most books and lots of other things are on Usenet. Why scan it when it's probably already available?
I wouldn't try to scan books that are already available digitally somewhere else; that would be counterproductive. As for availability on Usenet, I've never actually tried that. I'll take a look, as there's a free trial.
Nirth: There's an application on Android called CamScanner that has an OCR plugin that actually works fairly well. The problem is that I can't set it up so that everything is streamlined into one process: take a picture, process, crop, enhance (e.g. convert to black and white), OCR, and perhaps tag the page number. I also have no idea how to easily build a page turner mechanism.
The usual OCR software on a computer can do this quite well too, at least with normal text. But when you have formulas, strange symbols, some dirt, or thin paper where the reverse side shows through, standard solutions won't help a lot and you will have to do a lot on your own.
If you don't want OCR, then scanning pictures and making one PDF out of them is quite easy. Some libraries offer book copiers or book scanners, so you get the book either on normal paper or already as images. The former you just put into the document feeder of a normal multifunction printer; the latter is already in the format you want. At least Adobe Acrobat, and I guess some of the freeware solutions, can easily make one PDF out of this. (Adobe even offers good size reduction, so the file will be significantly smaller without losing quality.)
hohiro: Still, the problem there is the OCR part. These scans are PDFs made from images, with no text. You could check if there is an e-book or PDF edition of your book. Cheaper and better than self-made scans, especially when formulas are in there.
Any good dedicated OCR app (i.e. not most of the stuff that comes integrated or bundled) will accept images as its source instead of requiring a scanner. You can point it at your PDF and it will scan that.

It is a challenge though, with formulas like you say (though the more modern ones figure that out too), or even things like notes in the margin. Usually it will retain stuff it doesn't recognize as images and convert the text otherwise. They can also often recognize multiple languages on the same page! It is their raison d'être after all!

If I was seriously going to do more than one or two books (and if building one of these scanners would be worthwhile) I would definitely invest in a recent version of something like ABBYY or OmniPage. They're surprisingly good.
Nirth: I also have no idea how to easily build a page turner mechanism.
Since digitizing printed books is part of my job, I would like to offer some advice.

The best routine so far is this: tear the book by hand into chunks of 50-70 pages. With a guillotine paper cutter, remove the spine of the book (the part with the glue). You will then have the whole book as single pages (no need to worry about turning pages).
After that, insert the pages into a scanner with an auto feeder and scan them automatically. You might have to insert them in packs of 100 or fewer, depending on the scanner's feeder capacity. You might also want to use the scanner software to enhance the images (most often by increasing the contrast).
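The batching step above can be sketched in a few lines of Python. This is only an illustration: the helper name `feeder_batches` is made up, the 100-sheet capacity is an assumed default, and the sketch assumes each loose sheet carries two book pages (front and back), so check your own scanner's feeder limit.

```python
import math


def feeder_batches(book_pages: int, feeder_capacity: int = 100) -> list[int]:
    """Return how many sheets to load into the feeder for each batch.

    Assumption: every loose sheet carries two book pages (front and back).
    feeder_capacity is an assumed sheet limit, not a real scanner spec.
    """
    if book_pages < 0 or feeder_capacity <= 0:
        raise ValueError("book_pages must be >= 0 and feeder_capacity > 0")
    sheets = math.ceil(book_pages / 2)
    full_batches, remainder = divmod(sheets, feeder_capacity)
    return [feeder_capacity] * full_batches + ([remainder] if remainder else [])


if __name__ == "__main__":
    # A 450-page book is 225 sheets: two full batches and one of 25.
    print(feeder_batches(450))  # [100, 100, 25]
```

If your feeder only scans one side at a time, each batch has to go through twice, once per side.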

You now have digital images of every page of the book. The next step is to use an OCR program. You can select all the pages at once and let it do the recognition. However, even though OCR programs are much better today, they still only reach 99% to 99.9% recognition accuracy. Bear in mind that 99% means one character in a hundred is wrong, so every short paragraph will have a couple of misrecognised letters that you will have to correct manually. Some OCR programs have "learning" functions, but those will only benefit you in the long run. You can improve the recognition rate by scanning at higher quality; I think 600 dpi is pretty good. You will also need enough disk space, let's say 10-20 GB per book.
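The disk-space figure can be sanity-checked with some quick arithmetic. The sketch below assumes A4 pages (8.27 x 11.69 inches), uncompressed images, and a 300-page book. All three are illustrative assumptions rather than figures from this thread, and compressed formats will need far less space.

```python
def uncompressed_scan_bytes(width_in: float, height_in: float,
                            dpi: int, bytes_per_pixel: int) -> int:
    """Size in bytes of one uncompressed page image."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


def book_size_gb(pages: int, dpi: int = 600, bytes_per_pixel: int = 1,
                 width_in: float = 8.27, height_in: float = 11.69) -> float:
    """Rough uncompressed size of a whole scanned book in GB (10^9 bytes).

    Defaults assume A4 pages and 8-bit grayscale (1 byte per pixel);
    pass bytes_per_pixel=3 for 24-bit colour.
    """
    page_bytes = uncompressed_scan_bytes(width_in, height_in, dpi, bytes_per_pixel)
    return pages * page_bytes / 1e9


if __name__ == "__main__":
    print(f"grayscale: {book_size_gb(300):.1f} GB")
    print(f"colour:    {book_size_gb(300, bytes_per_pixel=3):.1f} GB")
```

Under these assumptions a 300-page grayscale book at 600 dpi comes to a little over 10 GB, which is in line with the estimate above. Colour triples that.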

If you don't really mind the wrong characters, the book is ready. Make it a PDF or whatever you like. You now have a digital copy of your book with 0.1% to 1% mistakes. And all that is accomplished with minimal effort on your part.
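To put those percentages in perspective, here is a small estimate of how many wrong characters to expect. The paragraph length (200 characters) and book length (500,000 characters) are assumptions chosen for illustration; only the accuracy figures come from the post above.

```python
def expected_errors(char_count: int, accuracy: float) -> float:
    """Expected number of misrecognised characters at a given per-character accuracy."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return char_count * (1.0 - accuracy)


if __name__ == "__main__":
    PARAGRAPH_CHARS = 200    # assumed length of a short paragraph
    BOOK_CHARS = 500_000     # assumed length of a whole book
    for accuracy in (0.99, 0.999):
        per_paragraph = expected_errors(PARAGRAPH_CHARS, accuracy)
        per_book = expected_errors(BOOK_CHARS, accuracy)
        print(f"{accuracy:.1%}: ~{per_paragraph:.1f} per paragraph, ~{per_book:.0f} per book")
```

With these assumed lengths, 99% accuracy works out to about 2 errors per paragraph and about 5,000 per book, while 99.9% gives about 500 per book.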
(In my job, where we want 100%, they employ philologists to read and correct the text after the OCR. I do not recommend this for you. If you have time to read and correct the OCR text, then you might as well read the book and learn it by heart!!!)

Still, this is a process that takes some time and requires buying specific tools. So, if the books you want are available digitally, buy them and save yourself all this trouble. Since you are at university, you might ask your professors where to get them digitally; they generally know better than the students where to find them.

Lastly, my personal opinion is that a digital book is no match for a hard copy. It's more convenient to carry a book wherever you want, you don't need to charge batteries to read it, and compared to a screen it has many times better contrast and is much easier to read. But that is just my opinion....
phandom: Lastly, my personal opinion is that a digital book is no match for a hard copy. It's more convenient to carry a book wherever you want, you don't need to charge batteries to read it, and compared to a screen it has many times better contrast and is much easier to read. But that is just my opinion....
A small device can hold many books, and I can use the search function, bookmark or highlight anything MUCH more easily than taking notes from a hard copy, change the backlight, zoom in and out, etc. The only downside I see is the battery, but I can fix that, more or less, by buying one of those better Kindle e-ink readers (I've been looking at the Paperwhite 2).

Are there any particular scanners and/or OCR programs you would recommend?
Nirth: A small device can hold many books, and I can use the search function, bookmark or highlight anything MUCH more easily than taking notes from a hard copy, change the backlight, zoom in and out, etc. The only downside I see is the battery, but I can fix that, more or less, by buying one of those better Kindle e-ink readers (I've been looking at the Paperwhite 2).

Are there any particular scanners and/or OCR programs you would recommend?
I can't really say much about scanners, since we only use a professional Zeutschel scanner (it costs as much as my annual income). However, I've seen the HP Scanjet 5590 in a store and I liked it.
The best OCR program is FineReader, but OmniPage Standard, Readiris Pro and Presto! OCR are also good.