search

smegsmeg Member Posts: 4
Can anyone tell me if it is possible to search a number of Word documents for keywords without opening the documents? I need to search approx. 600 CVs by job type.

Comments

  • leeosleeos Member Posts: 1,212
    This will look through a single document. You will need to modify it to use either FileSystemObject or Dir$ to make it look through a directory of files. Note that it still opens each document, but through Word automation rather than by hand.

    [code]
    ' Add a reference to the Microsoft Word Object Library first,
    ' and place this code inside a Sub or Function.
    Dim objword As Word.Application
    Dim FileName As String ' path and file name
    Dim j As Integer

    Dim myKeywords(0 To 2) As String ' raise the upper bound as you add keywords
    myKeywords(0) = UCase("england")
    myKeywords(1) = UCase("china")
    myKeywords(2) = UCase("brazil")

    Set objword = CreateObject("Word.Application")

    FileName = "c:\temp\lee.doc" ' modify to your doc name

    objword.Documents.Open FileName

    Do Until j > UBound(myKeywords)

        ' InStr returns 0 when the keyword is absent and 1 when it starts the text
        If InStr(1, UCase(objword.ActiveDocument.Content.Text), myKeywords(j)) > 0 Then
            Debug.Print myKeywords(j) & " " & "Exists In " & FileName
            ' better to write this to a file for your 'real' version
        End If

        j = j + 1

    Loop

    objword.ActiveDocument.Close False ' don't save changes, e.g. if it is from an old version of Word

    objword.Quit

    Set objword = Nothing
    [/code]
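
    As a rough sketch of the Dir$ approach mentioned above, the loop below walks one folder and hands each .doc file to a search routine. SearchFolder, SearchOneDocument and the folder path are made-up names for illustration; the body of SearchOneDocument is only a placeholder for the keyword check.

    [code]
    ' Sketch only: visit every .doc file in a folder with Dir$.
    ' SearchFolder and SearchOneDocument are hypothetical names.
    Public Sub SearchFolder(ByVal FolderPath As String)
        Dim DocName As String

        ' Make sure the folder path ends with a backslash
        If Right$(FolderPath, 1) <> "\" Then FolderPath = FolderPath & "\"

        ' The first call takes the pattern; later calls without
        ' arguments return the next match, or "" when there are no more
        DocName = Dir$(FolderPath & "*.doc")

        Do While Len(DocName) > 0
            ' Dir$ returns the file name only, so prepend the folder
            SearchOneDocument FolderPath & DocName
            DocName = Dir$
        Loop
    End Sub

    ' Placeholder: put the keyword check for a single document here
    Private Sub SearchOneDocument(ByVal FileName As String)
        Debug.Print "Would search: " & FileName
    End Sub
    [/code]

    Call it with something like SearchFolder "c:\cvs" (an assumed path). Two things to watch: do not call Dir$ with a new pattern inside the loop, because it keeps only one search going at a time, and for 600 files it is probably worth creating the Word.Application object once before the loop rather than once per document.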
  • ColdShineColdShine Member Posts: 597
    : [blue]Dim[/blue] j [blue]As Integer[/blue]

    Unless you're targeting a 16-bit Windows system, declare your counters as [blue]Long[/blue]. You will get the best performance at a minimal expense of memory.
    _____________________________
    [size=1][b][grey]Cold[/grey][blue]Shine[/blue][/b]
    http://www20.brinkster.com/coldshine[/size]

  • GenjuroGenjuro Member Posts: 913
    Well, actually, for once, I got you. *LOL*
    I know that it *may* seem absurd, but tests show that Integers are slightly faster than Longs. I don't know why, but there are tests that demonstrate it (although I agree that it is far from logical - I didn't expect it either).

  • KDivad LeahcimKDivad Leahcim Member Posts: 3,948
    You're kidding, right??? Wonder if VB handles Longs in an odd way. Wouldn't surprise me...
  • GenjuroGenjuro Member Posts: 913
    I am not kidding - and, if you look at Hardcore VB, by McKinney, there are a lot of tests about it.
    He was as surprised as the rest of us.
    Try timing various operations, and you'll see.
    My opinion is that VB allocates both Longs and Integers wherever they happen to land, with no regard to memory alignment (4 bytes on Intel processors).
    So a Long has only one chance in four of being aligned; otherwise the processor is forced to read *two* four-byte blocks. For example, if a Long is stored at address 8000002, the access is misaligned: the address is not evenly divisible by four, and the processor fetches memory in blocks that start at addresses that are (don't ask me why). On Intel this is not an actual fault, just a slower access. The processor reads memory from 8000000 to 8000003 for the first two bytes, and memory from 8000004 to 8000007 for the last two bytes.
    If an Integer were stored at the same position (8000002), it would not pay the same penalty, as it occupies addresses 8000002 and 8000003, and the processor needs to read only the bytes from 8000000 to 8000003.

    I'm far from being sure of this explanation, though it would surely explain this strange behavior.

    If someone knows more about memory alignment, I'd love to hear opinions.
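
    To make the arithmetic concrete, here is a small sketch that counts how many four-byte blocks an access touches. It assumes blocks start at addresses divisible by four; BlocksTouched and ShowBlocks are made-up names, and this is only a model of the idea, not a measurement of what VB really does.

    [code]
    ' Hypothetical helper: how many 4-byte blocks does an access of
    ' Size bytes starting at Address touch? Assumes Address >= 0.
    Public Function BlocksTouched(ByVal Address As Long, ByVal Size As Long) As Long
        Const BLOCK_SIZE As Long = 4
        ' Index of the last block minus index of the first block, plus one
        BlocksTouched = (Address + Size - 1) \ BLOCK_SIZE - Address \ BLOCK_SIZE + 1
    End Function

    Public Sub ShowBlocks()
        Dim i As Long
        ' Try each of the four possible offsets inside a block
        For i = 0 To 3
            Debug.Print 8000000 + i, _
                "Long: " & BlocksTouched(8000000 + i, 4), _
                "Integer: " & BlocksTouched(8000000 + i, 2)
        Next i
    End Sub
    [/code]

    Under that assumption, a Long touches two blocks at three of the four offsets, while an Integer does so only at the last one.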
  • KDivad LeahcimKDivad Leahcim Member Posts: 3,948
    Ah, yes, I had forgotten about memory alignment. That probably is it, though I would have assumed that even VB would align the addresses. An easy test would be to do something like this:
    [code]
    Public Type lngAlign
        L1 As Long
        B1 As Byte
        L2 As Long
        B2 As Byte
        L3 As Long
        B3 As Byte
        L4 As Long
    End Type
    Public LongTest As lngAlign
    [/code]
    If VB does not align its variables, then those Longs (LongTest.Lx) would each sit at a different offset within a four-byte block, and the one that happens to be aligned would be faster than an Integer. If someone wants to plug those variables into McKinney's tests...

    Can't say I know a lot about memory alignment, but I do know a bit from delving into assembly.
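
    For anyone who wants to look at the layout directly before timing anything, here is a minimal sketch. It assumes the lngAlign type and the LongTest variable above are declared in the same standard module; ShowLongAlignment is a made-up name. VarPtr is a hidden, undocumented VB function that returns the address of a variable.

    [code]
    ' Sketch only: print where each Long in LongTest starts.
    ' A remainder of 0 means the Long begins on a four-byte boundary.
    Public Sub ShowLongAlignment()
        Dim FirstAddr As Long
        FirstAddr = VarPtr(LongTest.L1)

        Debug.Print "L1: offset " & (VarPtr(LongTest.L1) - FirstAddr) & _
            ", address Mod 4 = " & (VarPtr(LongTest.L1) Mod 4)
        Debug.Print "L2: offset " & (VarPtr(LongTest.L2) - FirstAddr) & _
            ", address Mod 4 = " & (VarPtr(LongTest.L2) Mod 4)
        Debug.Print "L3: offset " & (VarPtr(LongTest.L3) - FirstAddr) & _
            ", address Mod 4 = " & (VarPtr(LongTest.L3) Mod 4)
        Debug.Print "L4: offset " & (VarPtr(LongTest.L4) - FirstAddr) & _
            ", address Mod 4 = " & (VarPtr(LongTest.L4) Mod 4)
    End Sub
    [/code]

    If every remainder comes out as 0, VB is padding the structure and the Byte members do not knock the Longs out of alignment, at least inside a user-defined type.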
  • ColdShineColdShine Member Posts: 597
    : My opinion is that VB allocates both Longs and Integers wherever they happen to land, with no regard to memory alignment (4 bytes on Intel processors).

    I've looked at McKinney's book, and I suppose you're right. If you look at the timings, you'll find the Long timings are almost double the Integer timings, which must be due to a double memory access caused by data misalignment.

    But remember, McKinney performed the tests under VB5; this absurd behavior could have been removed with the improvements of the VB6 compiler. Does anyone care to re-test?

  • GenjuroGenjuro Member Posts: 913
    Yes, that could also be true. To be honest, I haven't timed them in VB6.
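
    If anyone does want to re-test, a rough sketch of a timing loop follows. This is not McKinney's test, just a simple comparison of an Integer counter against a Long counter; TimeCounters and the constants are made-up names and values, and Timer is only coarse, so treat small differences as noise.

    [code]
    ' Sketch only: compare an Integer loop counter with a Long one.
    Public Sub TimeCounters()
        Const PASSES As Long = 2000
        Const LIMIT As Integer = 30000 ' stays inside the Integer range
        Dim pass As Long
        Dim i As Integer
        Dim n As Long
        Dim total As Long
        Dim started As Single

        ' Inner loop driven by an Integer counter
        total = 0
        started = Timer
        For pass = 1 To PASSES
            For i = 1 To LIMIT
                total = total + 1
            Next i
        Next pass
        Debug.Print "Integer counter: " & Format$(Timer - started, "0.000") & " s"

        ' Same work, inner loop driven by a Long counter
        total = 0
        started = Timer
        For pass = 1 To PASSES
            For n = 1 To LIMIT
                total = total + 1
            Next n
        Next pass
        Debug.Print "Long counter: " & Format$(Timer - started, "0.000") & " s"
    End Sub
    [/code]

    Results in the IDE and in a compiled EXE may differ, so it is probably worth running it both ways, and several times, before drawing any conclusion.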

