very unique document search problem in windows

turbodog

Flashaholic
Joined
Jun 23, 2003
Messages
6,425
Location
central time
I've got a document in a subfolder of 'my documents' that I need to find. The only criterion I remember is that it's got roughly 15 pages of 'padding' in the file. By that I mean 15 pages of linefeeds/carriage returns. There is other stuff in there also, but I know these blank pages are in there as well.

I've tried searching for the ascii code for linefeed as well as the hex value, but have come to nothing in windows search?

Ideas?
 

turbodog

Flashaholic
Joined
Jun 23, 2003
Messages
6,425
Location
central time
Barring being able to search for linefeeds, I could search by # of pages per document. With all those linefeeds, it would be one of the longer documents...
 

Lite_me

Flashlight Enthusiast
Joined
Jun 29, 2006
Messages
1,992
Location
Northern OH
Do you know 'anything' that's in the document? A specific word, something that's not too common would help.

I use Agent Ransack for searching Windows. There's a free version that does all I need. It searches in sub folders, and will look inside documents for specific words or phrases. It has shell integration. It's fast, and always delivers.
 

turbodog

Flashaholic
Joined
Jun 23, 2003
Messages
6,425
Location
central time
No. And there are so many docs that searching for any word would yield countless hits.
 

RetroTechie

Flashlight Enthusiast
Joined
Oct 11, 2013
Messages
1,007
Location
Hengelo, NL
Sorry but to me the problem doesn't make sense. It's important, but you can't remember one bit of info about it? In that case, what makes it important? If there is some circumstance / external 'pressure' that makes it important, wouldn't that circumstance provide a clue to what you're looking for? :thinking:
 

turbodog

Flashaholic
Joined
Jun 23, 2003
Messages
6,425
Location
central time
No other clues are available. I do remember editing the doc, but the contents were generic enough that you can't search for the content itself.
 

turbodog

Flashaholic
Joined
Jun 23, 2003
Messages
6,425
Location
central time
It was a legal document, hidden in a 'blank' document. Searches for content have yielded nothing. If I can't search based on # of linefeeds/carriage returns or # of pages then I'm sunk.

The # of pages may be the most promising???
 

inetdog

Enlightened
Joined
Mar 4, 2013
Messages
442
1. Word does not necessarily store characters, especially non-printing and whitespace characters, as their ASCII codes. It may also compress strings of repeated characters.
2. Is it saved in .doc or .docx format?
3. Can you narrow the search down by the last-modified date of the file?
4. Create a similar new Word doc in a separate folder and look at it with a hex editor to see what you could look for.
5. Number of pages would be metadata that may not be accessible without opening and parsing the file.
 

turbodog

Flashaholic
Joined
Jun 23, 2003
Messages
6,425
Location
central time
2. doc format
3. too many files, and 'no' anyway
4. promising... will see what I can do.
 

turbodog

Flashaholic
Joined
Jun 23, 2003
Messages
6,425
Location
central time
Ok.

Made 2 doc files. One is empty. Other is empty, but with 10 pages of linefeeds. I opened both in a hex editor, but can't tell the difference between them. There are a lot of 'header' info areas. There are also areas where the same character repeats many times, but both files have similar sections.
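
When two test files look alike in a hex editor, one thing that might help is counting how often each byte value occurs in each file and seeing which values grew the most. Below is a rough sketch of that idea, not a tested tool for Word files. The file names `empty.doc` and `padded.doc` are hypothetical placeholders for the two test documents, and filler bytes such as 00 or FF may also show up near the top because the .doc container pads its internal sections.

```python
from collections import Counter
from pathlib import Path


def byte_count_delta(empty_bytes: bytes, padded_bytes: bytes, top: int = 5):
    """Return the byte values whose counts grew the most between two files."""
    before = Counter(empty_bytes)   # iterating bytes yields ints 0..255
    after = Counter(padded_bytes)
    delta = {value: after[value] - before[value] for value in range(256)}
    ranked = sorted(delta.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


if __name__ == "__main__":
    # Hypothetical file names for the two test documents.
    empty_path, padded_path = Path("empty.doc"), Path("padded.doc")
    if empty_path.exists() and padded_path.exists():
        ranked = byte_count_delta(empty_path.read_bytes(), padded_path.read_bytes())
        for value, growth in ranked:
            print(f"0x{value:02X} appears {growth} more times")
```

A byte value that grows roughly in step with the number of blank lines added is a good candidate for the line-break marker.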
 

turbodog

Flashaholic
Joined
Jun 23, 2003
Messages
6,425
Location
central time
Even if I find a string to search for... I'd have to find something that will search ALL doc files for some ascii-type string.
 

turbodog

Flashaholic
Joined
Jun 23, 2003
Messages
6,425
Location
central time
Went back and made the test doc ~600 pages long with linefeeds.

Open in hex editor. It's longer than a blank doc, but I can't find a single section that's significantly longer than any other. Yes, there are long sections of repeated data, but there are 2-3 of them, and they are all about the same length. Weird.
 

turbodog

Flashaholic
Joined
Jun 23, 2003
Messages
6,425
Location
central time
Ok.

Found the string. Edited the file again, put 'start' on one line, added 5 linefeeds/CR, put 'end'. Opened the file in the hex editor. Each line break appears to be stored as '0D' (the hex value for carriage return).

Now, how to search for that?
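
One possible way to search for it, as a sketch under assumptions rather than a proven method: walk the folder, read each .doc file as raw bytes, and report the files containing the longest unbroken run of 0D. The pattern also allows a 00 after each 0D, on the assumption that some files store text as 16-bit characters. The function names, the `min_run` threshold of 100, and the `Documents` path are all illustrative choices, not anything taken from Word itself. Files where Word has split the text into pieces might not show one long run.

```python
import re
from pathlib import Path

# A run of 0x0D bytes, each optionally followed by 0x00 (16-bit text storage).
CR_RUN = re.compile(rb"(?:\x0d\x00?)+")


def longest_cr_run(data: bytes) -> int:
    """Return the number of 0x0D bytes in the longest consecutive run."""
    return max((m.group().count(b"\x0d") for m in CR_RUN.finditer(data)), default=0)


def rank_docs(root: Path, min_run: int = 100):
    """Return (run_length, path) pairs for .doc files under root, longest first."""
    results = []
    for path in root.rglob("*.doc"):
        try:
            run = longest_cr_run(path.read_bytes())
        except OSError:
            continue  # skip unreadable or locked files
        if run >= min_run:
            results.append((run, path))
    return sorted(results, key=lambda item: item[0], reverse=True)


if __name__ == "__main__":
    # Assumed location; adjust to wherever 'my documents' actually lives.
    root = Path.home() / "Documents"
    if root.is_dir():
        for run, path in rank_docs(root):
            print(f"{run:6d}  {path}")
```

With 15 pages of blank lines, the target file should sit at or near the top of the list, well above ordinary documents that only have a few blank lines in a row.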
 