browser automation

Archived Forums V > Visual Basic Express Edition

Question

Hi,

I'd like to have a program that navigates to http://www.handelsblatt.com/News/def...ymbol=FLUK.NWX, selects "Times and Sales" from the "Darstellung" menu, clicks "aktualisieren", and copies the new table to a file.

I'm still hoping I can deal with most of the steps, but I have no clue how to select from the dropdown menu.
I'm using VB.NET 2005 Express.
I'd really appreciate any kind of help.
Thank you!

Wednesday, November 28, 2007 11:07 AM

d.j.t 20 Points

Answers

 d.j.t wrote:

> And I'm not sure what you want to tell me with:
>
> e.g. Dim WithEvents Button1 As Button
>
> Then at the top of the code view (e.g. Form1.vb), Button1 will display in the Object Browser comboBox, and all events corresponding to Button1 will display in the Event Browser comboBox.
>
> Do I need to insert this code even though I added a button?

I mentioned it because you said an error occurred: "Handles clause requires a WithEvents variable defined in the containing type or one of its base types". The error has something to do with WithEvents, so that was only an extra reference. You can ignore it.

Back to the topic: please drag and drop a Button control named Button1 onto your Form.

In that case you have to click the button to perform the tasks. That is indeed a restriction.

OK, please adopt this idea instead: still use the WebBrowser1_DocumentCompleted event, but add a Boolean variable as a switch, which ensures the tasks are performed only once.
Code Block
Public Class Form1
    Dim march As Boolean ' Switch: True until the tasks have been performed once

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) _
            Handles MyBase.Load
        march = True ' Initialize the switch as True

        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        ' Part 1: Use the WebBrowser control to load the web page
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
            ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
            Handles WebBrowser1.DocumentCompleted
        ' Determine the switch state
        If march = True Then
            ' Part 2: Automatically select the specified option from the ComboBox
            Dim theElementCollection As HtmlElementCollection = _
                WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In theElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                    curElement.SetAttribute("Value", "0")
                End If
            Next

            Dim theWElementCollection As HtmlElementCollection = _
                WebBrowser1.Document.GetElementsByTagName("input")
            For Each curElement As HtmlElement In theWElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                ' Part 3: Automatically check the CheckBox
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                    curElement.SetAttribute("Checked", "True")
                ' Part 4: Automatically click the button
                ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                    curElement.InvokeMember("click")
                End If
            Next

            ' Save the page body. This runs right after the click, so the
            ' refreshed page has not necessarily finished loading yet.
            Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
            w.Write(WebBrowser1.Document.Body.InnerHtml)
            w.Close()
            march = False ' Task accomplished: change the switch to False
        End If
    End Sub

End Class

Wednesday, December 5, 2007 11:34 AM

Martin Xie - MSFT 24,335 Points

Dominik: "what happens there is (while working fine most of the times), that SOMETIMES the first table is copied, the one that was displayed when first browsing to the page, before doing the selections and refreshing. so to me it seems as if the skript doesnt wait for the documentcompleted-event any more. but only sometimes! sometimes the correct table is also copied, sometimes not. i dont understand this! (actually i never fully understood of the documentcompleted-event-thing). the only way i can explain is that the old computer is to slow... im frustrated!"

Hi Dominik,

In Part 6 you are extracting the javascript immediately after automatically clicking the More button, without waiting for the next webpage to load with new data:

Code Snippet
' Part 6: Automatically click Continue link
Dim hrefElementCollection As HtmlElementCollection = _
    WebBrowser1.Document.GetElementsByTagName("a")
For Each curElement As HtmlElement In hrefElementCollection
    Dim controlName As String = curElement.GetAttribute("id").ToString
    If controlName.Contains("LBtn_More") Then
        curElement.InvokeMember("Click")
    End If
Next
extract()

The code in my first post on this thread fixes that problem. The DocumentCompleted event fires when a new webpage loads. After clicking the button in Part 4, we have to wait for the next DocumentCompleted, which tells us that the next webpage has loaded with new data. The same applies to clicking the More button in Part 6 (see: http://msdn2.microsoft.com/en-us/library/system.windows.forms.webbrowser.documentcompleted.aspx):

Code Snippet
Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
        ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
        Handles WebBrowser1.DocumentCompleted
    document_completed = document_completed + 1
    If document_completed = 1 Then ' First table
        Part2() ' Automatically select specified option from ComboBox
        Part3() ' Automatically check the CheckBox
        Part4() ' Automatically click the Button
    ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables
        Part5() ' Extract javascript and update last_datetime
        If last_datetime > earliest_datetime Then
            Part6() ' Click Continue Button
        End If
    End If
End Sub

But the If statements need to be refined a bit because DocumentCompleted fires twice per page (once for the page banner and once for the default page containing the javascript data that we want):

Code Snippet
If (document_completed < 3) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
    ' ...
ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then

The second problem is that you are using a 12-hour clock without specifying a.m. or p.m. when generating the filename, so there is potential for overwriting old files or appending new data to an old file:

Code Snippet
Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")

Use a 24-hour clock instead, with capital HH:

Code Snippet
Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddHHmmss")
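
The difference between the two format strings can be seen in a minimal console sketch. The two timestamps are made-up values chosen twelve hours apart; nothing here comes from Dominik's script.

```vbnet
Imports System
Imports System.Globalization

Module ClockFormatDemo
    Sub Main()
        ' Two hypothetical timestamps, twelve hours apart on the same day.
        Dim morning As New DateTime(2008, 1, 29, 3, 15, 0)
        Dim afternoon As New DateTime(2008, 1, 29, 15, 15, 0)

        ' "hh" is the 12-hour clock: both values produce the same text.
        Console.WriteLine(morning.ToString("yyyyMMddhhmmss", CultureInfo.InvariantCulture))   ' 20080129031500
        Console.WriteLine(afternoon.ToString("yyyyMMddhhmmss", CultureInfo.InvariantCulture)) ' 20080129031500

        ' "HH" is the 24-hour clock: the two values stay distinct.
        Console.WriteLine(morning.ToString("yyyyMMddHHmmss", CultureInfo.InvariantCulture))   ' 20080129031500
        Console.WriteLine(afternoon.ToString("yyyyMMddHHmmss", CultureInfo.InvariantCulture)) ' 20080129151500
    End Sub
End Module
```

With "hh", a file written at 03:15 and one written at 15:15 would get the same name, which is the overwrite risk described above.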

The other bugs I pointed out were "features" that I had introduced myself when converting from
VB to C++ (I was a bit unfamiliar with the Using statement) so you can ignore these.

Edited by Tim Mathias Wednesday, October 14, 2009 6:03 PM Reformatted code snippets.

Tuesday, January 29, 2008 10:24 AM

Tim Mathias 345 Points

> Is it exactly necessary to mention e.Url.AbsoluteUri = ... because the url stays the same
througout the whole procedure?

 
It's essential because the URL DOESN'T stay the same throughout the whole procedure: the webpage contains a link to a banner page that also calls the procedure after it loads. I've added a MessageBox to show these two URLs. It's this double message that causes the first table to be extracted in your script (i.e. the table we want to ignore).

I've also added an If statement that returns when the banner URL completes (it's a bit neater than the former If tests I wrote).

And I've added the Me.Close().

Code Snippet
Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
        ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
        Handles WebBrowser1.DocumentCompleted
    MessageBox.Show("DocumentCompleted: " & e.Url.AbsoluteUri)
    If Not (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
        Return
    End If
    document_completed = document_completed + 1
    If document_completed = 1 Then ' First table
        Part2() ' Automatically select specified option from ComboBox
        Part3() ' Automatically check the CheckBox
        Part4() ' Automatically click the Button
    ElseIf document_completed > 1 Then
        Part5() ' Extract javascript and update last_datetime
        If last_datetime > earliest_datetime Then
            Part6() ' Automatically click Continue Button
        Else
            Me.Close() ' Part 7: Close programme
        End If
    End If
End Sub
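
The effect of the URL check can be tried without a browser. The following is only a console simulation under assumptions: the banner URL is a placeholder (the thread never states the real one), and the order in which the banner and the page complete is assumed.

```vbnet
Imports System

Module CompletedFilterDemo
    Const TargetUrl As String = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX"
    Dim documentCompleted As Integer = 0

    ' Mirrors the filter at the top of the handler: every URL except the
    ' target page is ignored and does not advance the counter.
    Sub OnDocumentCompleted(ByVal url As String)
        If url <> TargetUrl Then
            Console.WriteLine("ignored: " & url)
            Return
        End If
        documentCompleted += 1
        Console.WriteLine("page load #" & documentCompleted.ToString())
    End Sub

    Sub Main()
        ' Placeholder banner URL (an assumption, not taken from the site).
        Dim bannerUrl As String = "http://example.com/banner"

        ' Two page loads, each raising the event twice (assumed order).
        OnDocumentCompleted(bannerUrl)
        OnDocumentCompleted(TargetUrl)
        OnDocumentCompleted(bannerUrl)
        OnDocumentCompleted(TargetUrl)
    End Sub
End Module
```

Without the early Return, the counter would reach 2 during the first page load, and the handler would treat the initial table as if it were the refreshed one.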

Edited by Tim Mathias Wednesday, October 14, 2009 5:38 PM Reformatted code snippet.

Wednesday, January 30, 2008 2:42 PM

Tim Mathias 345 Points

I did originally limit the document_completed count to 10 tables to avoid an infinite repeat in case there was a problem parsing the DateTime from the webpage (bold red). Without that limit, you'll have the cybercops after you for a suspected DoS attack.

Here's the ultimate bug free code (until you find the next one):
Code Snippet
Dim previous_last_datetime As DateTime

Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
        ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
        Handles WebBrowser1.DocumentCompleted
    MessageBox.Show("DocumentCompleted: " & e.Url.AbsoluteUri)
    If Not (e.Url.AbsoluteUri = seite) Then
        Return
    End If
    document_completed = document_completed + 1
    If document_completed = 1 Then ' First table
        Part2() ' Automatically select specified option from ComboBox
        Part3() ' Automatically check the CheckBox
        Part4() ' Automatically click the Button
    ElseIf document_completed > 1 And document_completed < 11 Then
        previous_last_datetime = last_datetime
        Part5() ' Extract javascript and update last_datetime
        If previous_last_datetime > last_datetime Then
            Part6() ' Automatically click Continue Button
        Else
            Me.Close() ' Part 7: Close programme
        End If
    End If
End Sub

Edited by Tim Mathias Wednesday, October 14, 2009 5:30 PM Reformatted code snippet.

Friday, February 1, 2008 7:04 PM

Tim Mathias 345 Points

All replies

Hi d.j.t,

Your question is related to automated testing technology.

The website you mentioned is a German website.

Here is an introduction to WatiN (Web Application Testing in .NET).

It lets you emulate real users interacting with your web site by automating Internet Explorer, and gives you an easy way to automate tests with IE.

http://blogs.charteris.com/blogs/edwardw/archive/2007/07/16/watin-web-application-testing-in-net-introduction.aspx

http://watin.sourceforge.net/

Check the documents above for the main idea of web automation testing.

Basic features:

- Automates all major HTML elements
- Find elements by multiple attributes

How to locate elements

Creating test scripts in most cases involves finding an HTML element and either causing it to fire an event, setting its value, or asserting its expected value.

In order to perform an action against an element you must first obtain a reference to it. This can be done in 3 different ways:

- By the element's id (if it has one)
- By a regular expression that matches the element's id
- By the Attribute class

Regards,

Martin

Edited by Pan Zhang Friday, July 19, 2013 3:25 AM

Friday, November 30, 2007 8:53 AM

Martin Xie - MSFT 24,335 Points

Hi d.j.t,

I think I have worked it out.

We can locate and access elements of a webpage loaded in the WebBrowser control. In your case, you want to select an option from a ComboBox, check a CheckBox and click a Button.

1. The Darstellung ComboBox element and the Times & Sales option:

<SELECT class=wp1-input id=ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_DD_Step name=ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step> <OPTION value=0 selected>Times &amp; Sales</OPTION>

2. The Kapitalmaßnahmen einbeziehen CheckBox element:

<INPUT id=ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_CBx_CapitalMeasures type=checkbox name=ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures>

3. The Aktualisieren Button element:

<INPUT id=ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_IBtn_Refresh1 title=Aktualisieren type=image alt=Aktualisieren src="http://bc2.handelsblatt.com/hbi/images/wp1/wp1_refresh.gif" align=right name=ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1>
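
In each of the three elements quoted above, the id is the name with every "$" replaced by "_". As a small sketch under that observation (NameToId is a hypothetical helper, not part of any library), the id can be derived from the name, which would allow a direct lookup instead of a loop over all elements:

```vbnet
Imports System

Module ControlIdDemo
    ' Hypothetical helper: derives the HTML id from the control name by
    ' swapping "$" for "_", matching the id/name pairs quoted above.
    Function NameToId(ByVal controlName As String) As String
        Return controlName.Replace("$"c, "_"c)
    End Function

    Sub Main()
        Dim controlName As String = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step"
        Console.WriteLine(NameToId(controlName))
        ' Prints: ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_DD_Step

        ' With a loaded page, the element could then be fetched directly.
        ' This needs a WebBrowser control, so it is left as a comment:
        ' Dim dropDown As HtmlElement = _
        '     WebBrowser1.Document.GetElementById(NameToId(controlName))
    End Sub
End Module
```

Whether the id stays stable across page versions is not something the thread confirms, so matching on the name attribute remains the safer default.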

This code performs the above steps automatically:

Code Block
Public Class Form1

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) _
            Handles MyBase.Load
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        ' Part 1: Use the WebBrowser control to load the web page
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
            ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
            Handles WebBrowser1.DocumentCompleted
        ' Part 2: Automatically select the specified option from the ComboBox
        Dim theElementCollection As HtmlElementCollection = _
            WebBrowser1.Document.GetElementsByTagName("select")
        For Each curElement As HtmlElement In theElementCollection
            Dim controlName As String = curElement.GetAttribute("name").ToString
            If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                curElement.SetAttribute("Value", "0")
            End If
        Next

        Dim theWElementCollection As HtmlElementCollection = _
            WebBrowser1.Document.GetElementsByTagName("input")
        For Each curElement As HtmlElement In theWElementCollection
            Dim controlName As String = curElement.GetAttribute("name").ToString
            ' Part 3: Automatically check the CheckBox
            If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                curElement.SetAttribute("Checked", "True")
            ' Part 4: Automatically click the button
            ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                ' The element exposes a click method, which we invoke on the current button element.
                curElement.InvokeMember("click")
            End If
        Next
    End Sub

End Class

Similar issue: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=2456794&SiteID=1

Best regards,

Martin

Monday, December 3, 2007 4:00 AM

Martin Xie - MSFT 24,335 Points

 d.j.t wrote:
> ... and copies the new table to a file.

To achieve the task, here are two suggestions:

1. Save the page body as an .htm file:

Code Block
Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
        ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
        Handles WebBrowser1.DocumentCompleted
    ' After automatically clicking the button,
    ' append the following code to save the webpage as an .htm file
    Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
    w.Write(WebBrowser1.Document.Body.InnerHtml)
    w.Close()
End Sub

2. Save the page as an .mht file. Check this thread for details: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=2468541&SiteID=1

You need to Add Reference... -> COM tab -> find Microsoft CDO For Windows 2000 Library and Microsoft ActiveX Data Objects 2.5 Library and add them to your project.

Code Block
' The Imports go at the top of the file; the two Subs go inside Form1.
Imports ADODB
Imports CDO

Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
        ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
        Handles WebBrowser1.DocumentCompleted
    ' After automatically clicking the button,
    ' append the following code to save the webpage as an .mht file
    SavePage(WebBrowser1.Url.ToString, "c:\table.mht")
End Sub

Private Sub SavePage(ByVal Url As String, ByVal FilePath As String)
    Dim iMessage As CDO.Message = New CDO.Message
    iMessage.CreateMHTMLBody(Url, CDO.CdoMHTMLFlags.cdoSuppressObjects, "", "")
    Dim adodbstream As ADODB.Stream = New ADODB.Stream
    adodbstream.Type = ADODB.StreamTypeEnum.adTypeText
    adodbstream.Charset = "US-ASCII"
    adodbstream.Open()
    iMessage.DataSource.SaveToObject(adodbstream, "_Stream")
    adodbstream.SaveToFile(FilePath, ADODB.SaveOptionsEnum.adSaveCreateOverWrite)
End Sub
 

Monday, December 3, 2007 4:34 AM

Martin Xie - MSFT 24,335 Points

Hi Martin,
your first reply is great! Thanks a lot!

1. I just have one problem with the first task: when executing, the selection of the combo box and checkboxes works perfectly fine, but the "aktualisieren" button is clicked endlessly. I'd like to stop that. (I used a WebBrowser element from the toolbox in Form1.)

2. With the extraction I unfortunately had problems too:

"'Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs)' has multiple definitions with identical signatures"

Naming the Private Sub "WebBrowser1_DocumentCompleted2" worked; I hope I can just do that...
But anyway, this only helped with the first solution, which only creates an HTML file of the complete website (or at least parts of it). But I need something that I can easily import into a database, such as .txt (the cells separated by tabs and lines) or .xls.
So I tried the second solution (not really knowing what the output will be in that case, maybe more or less the same), but after renaming the sub there was still the error: "Value of type 'System.Uri' cannot be converted to 'string'"
But if the exported file will be more than the pure table data (as I expect), the problem doesn't really matter.

If you have an idea how to deal with one of the problems, especially the first, I'd appreciate it if you could post it.

My project has made enormous progress thanks to you!

Monday, December 3, 2007 2:59 PM

d.j.t 20 Points

Hi d.j.t,

> 1. "'Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs)' has multiple definitions with identical signatures"
> Naming the Private Sub "WebBrowser1_DocumentCompleted2" worked; I hope I can just do that...

-> You should place both parts of the code (the automation part and the save-page part) in the single WebBrowser1_DocumentCompleted event handler. Don't rename it to WebBrowser1_DocumentCompleted2.

Code Block
Public Class Form1

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) _
            Handles MyBase.Load
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        ' Part 1: Use the WebBrowser control to load the web page
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
            ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
            Handles WebBrowser1.DocumentCompleted
        ' Part 2: Automatically select the specified option from the ComboBox
        Dim theElementCollection As HtmlElementCollection = _
            WebBrowser1.Document.GetElementsByTagName("select")
        For Each curElement As HtmlElement In theElementCollection
            Dim controlName As String = curElement.GetAttribute("name").ToString
            If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                curElement.SetAttribute("Value", "0")
            End If
        Next

        Dim theWElementCollection As HtmlElementCollection = _
            WebBrowser1.Document.GetElementsByTagName("input")
        For Each curElement As HtmlElement In theWElementCollection
            Dim controlName As String = curElement.GetAttribute("name").ToString
            ' Part 3: Automatically check the CheckBox
            If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                curElement.SetAttribute("Checked", "True")
            ' Part 4: Automatically click the button
            ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                ' The element exposes a click method, which we invoke on the current button element.
                curElement.InvokeMember("click")
            End If
        Next

        ' After automatically clicking the button,
        ' save the webpage as an .htm file
        Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
        w.Write(WebBrowser1.Document.Body.InnerHtml)
        w.Close()
    End Sub

End Class

> 2. So I tried the second solution (not really knowing what the output will be in that case, maybe more or less the same), but after renaming the sub there was still the error: "Value of type 'System.Uri' cannot be converted to 'string'"

-> Please change it to WebBrowser1.Url.ToString. I have modified my third post.

This solution saves the entire web page as an .mht file containing all text and images. That does not seem to be what you expect.
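
The conversion error can be reproduced outside the form. This is a minimal console sketch: ShowUrl is a hypothetical stand-in for SavePage (it only needs a String parameter), and the shortened URL literal stands in for WebBrowser1.Url.

```vbnet
Imports System

Module UriToStringDemo
    ' Hypothetical stand-in for SavePage: it accepts the URL as a String.
    Sub ShowUrl(ByVal url As String)
        Console.WriteLine(url)
    End Sub

    Sub Main()
        ' WebBrowser1.Url is a System.Uri; this literal Uri stands in for it.
        Dim pageUri As New Uri("http://www.handelsblatt.com/News/default.aspx?_p=200023")

        ' ShowUrl(pageUri) does not compile, because System.Uri defines
        ' no conversion to String. Either of these calls works instead:
        ShowUrl(pageUri.ToString())   ' explicit conversion
        ShowUrl(pageUri.AbsoluteUri)  ' AbsoluteUri is already a String
    End Sub
End Module
```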

Tuesday, December 4, 2007 2:41 AM

Martin Xie - MSFT 24,335 Points

> 3. I just have one problem with the first task: when executing, the selection of the combo box and checkboxes works perfectly fine, but the "aktualisieren" button is clicked endlessly. I'd like to stop that. (I used a WebBrowser element from the toolbox in Form1.)

-> Cause: clicking the button to retrieve data refreshes and reloads the current page, so the WebBrowser1_DocumentCompleted event fires every time, and the handler clicks the button again.

Solution: you can place that code in the Button1_Click event instead.
Code Block
Public Class Form1

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) _
            Handles MyBase.Load
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        ' Part 1: Use the WebBrowser control to load the web page
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
            ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
            Handles WebBrowser1.DocumentCompleted
        MessageBox.Show("Complete loading webpage") ' Optional code
    End Sub

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) _
            Handles Button1.Click
        ' Part 2: Automatically select the specified option from the ComboBox
        Dim theElementCollection As HtmlElementCollection = _
            WebBrowser1.Document.GetElementsByTagName("select")
        For Each curElement As HtmlElement In theElementCollection
            Dim controlName As String = curElement.GetAttribute("name").ToString
            If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                curElement.SetAttribute("Value", "0")
            End If
        Next

        Dim theWElementCollection As HtmlElementCollection = _
            WebBrowser1.Document.GetElementsByTagName("input")
        For Each curElement As HtmlElement In theWElementCollection
            Dim controlName As String = curElement.GetAttribute("name").ToString
            ' Part 3: Automatically check the CheckBox
            If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                curElement.SetAttribute("Checked", "True")
            ' Part 4: Automatically click the button
            ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                ' The element exposes a click method, which we invoke on the current button element.
                curElement.InvokeMember("click")
            End If
        Next

        Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
        w.Write(WebBrowser1.Document.Body.InnerHtml)
        w.Close()
    End Sub

End Class

 

> 4. But I need something that I can easily import into a database, such as .txt (the cells separated by tabs and lines) or .xls.
> But if the exported file will be more than the pure table data (as I expect), the problem doesn't really matter.

-> You need to retrieve the part of the HTML code (<Table>...</Table>) that contains the table data. Here are some references:

1) Using the HTML Parser to parse HTML code

   http://www.developer.com/net/csharp/article.php/10918_2230091_2

2) See the similar issue; you can use regular expressions to extract part of the HTML code.

.NET Development » Regular Expressions Forum
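
As a rough illustration of the regular-expression route, the sketch below turns each table row of an HTML fragment into one tab-separated line, which is the .txt layout asked for. It is not taken from either reference: TableToTsv is a hypothetical helper, the sample markup is a shortened imitation of the page's table, and regular expressions only cope with simple, non-nested tables like this one.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module TableToTsvDemo
    ' Hypothetical helper: converts every table row in an HTML fragment
    ' into one line of tab-separated cell texts.
    Function TableToTsv(ByVal html As String) As String
        Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        Dim result As New StringBuilder()

        ' A row runs from <tr> up to the next <tr>, </tr> or </table>,
        ' so rows that lack a closing </tr> are still picked up.
        For Each row As Match In Regex.Matches(html, "<tr\b[^>]*>(.*?)(?=<tr\b|</tr>|</table>|$)", options)
            Dim cells As New List(Of String)()
            For Each cell As Match In Regex.Matches(row.Groups(1).Value, "<t[dh]\b[^>]*>(.*?)</t[dh]>", options)
                cells.Add(cell.Groups(1).Value.Trim())
            Next
            If cells.Count > 0 Then
                result.AppendLine(String.Join(vbTab, cells.ToArray()))
            End If
        Next
        Return result.ToString()
    End Function

    Sub Main()
        ' Shortened sample in the style of the page's table (made up here).
        Dim sample As String = "<TABLE><TBODY>" & _
            "<TR><TH class=wp1-header>Datum</TH><TH class=wp1-header>Schluss</TH></TR>" & _
            "<TR><TD class=wp1-line1>05.12.07 15:23</TD><TD class=wp1-line1>59,90</TD></TR>" & _
            "</TBODY></TABLE>"
        Console.Write(TableToTsv(sample))
    End Sub
End Module
```

For the sample this writes two lines, one for the header row and one for the data row, with a tab between the cells. For anything more irregular than this table, an HTML parser such as the one in the first reference is the sturdier choice.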

I'm glad to hear that you have made enormous progress. Cheers!

Best regards,

Martin

Tuesday, December 4, 2007 3:47 AM

Martin Xie - MSFT 24,335 Points

Hi Martin,
I tried to use the Button1_Click event, but an error occurred: "Handles clause requires a WithEvents variable defined in the containing type or one of its base types"
Nevertheless, when executing it, the same endless clicking of the refresh button happened...
Thanks for your efforts!
Dominik

Wednesday, December 5, 2007 9:56 AM

d.j.t 20 Points

I'm just working on the extraction.

- The first link is related to C#... can I just change the language?
- The similar issue seems to be exactly what I want, but there is no complete code provided.
- The regular expressions thing (I apologize for this noob question): what is that?

Dominik

Wednesday, December 5, 2007 10:32 AM

d.j.t 20 Points

 d.j.t wrote:
> Hi Martin,
> I tried to use the Button1_Click event, but an error occurred: "Handles clause requires a WithEvents variable defined in the containing type or one of its base types"

Please drag and drop a Button control named Button1 directly onto your Form.

Reference: WithEvents keyword

http://msdn2.microsoft.com/en-us/library/aty3352y(VS.80).aspx

Specifies that one or more declared member variables refer to an instance of a class that can raise events.

e.g. Dim WithEvents Button1 As Button

Then at the top of the code view (e.g. Form1.vb), Button1 will display in the Object Browser comboBox, and all events corresponding to Button1 will display in the Event Browser comboBox.
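
The link between WithEvents and the Handles clause can be shown without a form. In this console sketch, Clicker is a made-up class standing in for Button, and Listener plays the role of Form1; neither name comes from the thread.

```vbnet
Imports System

' Made-up event source standing in for a Button.
Class Clicker
    Public Event Clicked(ByVal sender As Object, ByVal e As EventArgs)

    Public Sub PerformClick()
        RaiseEvent Clicked(Me, EventArgs.Empty)
    End Sub
End Class

' Plays the role of Form1.
Class Listener
    ' The Handles clause below only compiles because this field is
    ' declared WithEvents. Without the keyword, the compiler reports
    ' the "Handles clause requires a WithEvents variable" error.
    Private WithEvents Button1 As New Clicker()

    Private Sub Button1_Clicked(ByVal sender As Object, ByVal e As EventArgs) _
            Handles Button1.Clicked
        Console.WriteLine("Button1 was clicked")
    End Sub

    Public Sub Simulate()
        Button1.PerformClick()
    End Sub
End Class

Module WithEventsDemo
    Sub Main()
        Dim listener As New Listener()
        listener.Simulate()
    End Sub
End Module
```

When a Button is dropped onto a form in the designer, the generated designer file contains the equivalent WithEvents declaration, which is why adding the control through the toolbox makes the error go away.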

Wednesday, December 5, 2007 10:38 AM

Martin Xie - MSFT 24,335 Points

Well, I could have known it had something to do with a button on the form... sorry :-/
But now I'm really confused, because now I have to click the button to perform the tasks.
And I'm not sure what you want to tell me with:

> e.g. Dim WithEvents Button1 As Button
>
> Then at the top of the code view (e.g. Form1.vb), Button1 will display in the Object Browser comboBox, and all events corresponding to Button1 will display in the Event Browser comboBox.

Do I need to insert this code even though I added a button?

Is there a possibility to solve the repetition problem by adding something like the following (in plain English) to the code you first recommended?
"and if value of the combobox is not equal to 0?"

Wednesday, December 5, 2007 11:01 AM

d.j.t 20 Points

Thank you! That's exactly what I was trying to do (but lack of experience prevented me from doing so)! First task accomplished!

So there remains the second task of extracting the table... Even though I'm a bit embarrassed to ask after you helped me so much: did you see my questions concerning your links (regarding extraction) (Tuesday, 10:32 PM)?


Wednesday, December 5, 2007 3:47 PM

d.j.t 20 Points

d.j.t wrote:
> I'm just working on the extraction.
> - The first link is related to C#... can I just change the language?
> - The similar issue seems to be exactly what I want, but there is no complete code provided.
> - The regular expressions thing (I apologize for this noob question): what is that?
> Dominik

Yes, I see the second task of extracting the table.

Regular expressions can be used to extract part of the HTML code.

You need to import the System.Text.RegularExpressions namespace (Imports System.Text.RegularExpressions).
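
As a minimal sketch of what such an extraction could look like, the following picks the one table that contains a known header text out of a longer HTML string. FindTable and the sample markup are assumptions made for illustration, and the pattern does not handle nested tables.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module FindTableDemo
    ' Hypothetical helper: returns the first <table>...</table> block that
    ' contains the marker text, or an empty string if none does.
    Function FindTable(ByVal html As String, ByVal marker As String) As String
        Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        For Each m As Match In Regex.Matches(html, "<table\b[^>]*>.*?</table>", options)
            If m.Value.IndexOf(marker, StringComparison.OrdinalIgnoreCase) >= 0 Then
                Return m.Value
            End If
        Next
        Return ""
    End Function

    Sub Main()
        ' Made-up page content with two tables; only the second one
        ' carries the header text we are looking for.
        Dim page As String = "<TABLE><TR><TD>menu</TD></TR></TABLE>" & _
            "<TABLE><TR><TH>Historische Daten</TH></TR></TABLE>"
        Console.WriteLine(FindTable(page, "Historische Daten"))
        ' Prints: <TABLE><TR><TH>Historische Daten</TH></TR></TABLE>
    End Sub
End Module
```

In the real program, the input string would be WebBrowser1.Document.Body.InnerHtml rather than a literal.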

I suggest posting this task to the Regular Expressions forum for quicker and better responses.

.NET Development » Regular Expressions Forum

Please remember to point out the HTML page:

http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX

Also point out the table from which you want to extract data, as below:

Code Block
<TABLE cellSpacing=0 cellPadding=0 width="100%" border=0>
<TBODY>
<TR>
<TH class=wp1-header colSpan=6>Historische Daten </TH></TR>
<TR>
<TH class=wp1-header>Datum</TH>
<TH class=wp1-header>Eröffnung</TH>
<TH class=wp1-header>Hoch</TH>
<TH class=wp1-header>Tief</TH>
<TH class=wp1-header>Schluss</TH>
<TH class=wp1-header>Volumen</TH></TR>
<TR>
<TD class=wp1-line1 align=middle>05.12.07 15:23</TD>
<TD class=wp1-line1 align=right>57,60</TD>
<TD class=wp1-line1 align=right>59,90</TD>
<TD class=wp1-line1 align=right>57,60</TD>
<TD class=wp1-line1 align=right>59,90</TD>
<TD class=wp1-line1 align=right>3.753</TD></TR>
<TR>
<TD class=wp1-line2 align=middle>04.12.07 18:29</TD>
<TD class=wp1-line2 align=right>57,90</TD>
<TD class=wp1-line2 align=right>58,10</TD>
<TD class=wp1-line2 align=right>57,27</TD>
<TD class=wp1-line2 align=right>57,50</TD>
<TD class=wp1-line2 align=right>4.730</TD></TR>
<TR>
<TD class=wp1-line1 align=middle>03.12.07 18:57</TD>
<TD class=wp1-line1 align=right>58,50</TD>
<TD class=wp1-line1 align=right>58,75</TD>
<TD class=wp1-line1 align=right>57,39</TD>
<TD class=wp1-line1 align=right>57,85</TD>
<TD class=wp1-line1 align=right>10.219</TD></TR>
<TR>
<TD class=wp1-line2 align=middle>30.11.07 14:43</TD>
<TD class=wp1-line2 align=right>57,95</TD>
<TD class=wp1-line2 align=right>58,75</TD>
<TD class=wp1-line2 align=right>57,95</TD>
<TD class=wp1-line2 align=right>58,46</TD>
<TD class=wp1-line2 align=right>12.249</TD></TR>
<TR>
<TD class=wp1-line1 align=middle>29.11.07 14:52</TD>
<TD class=wp1-line1 align=right>58,45</TD>
<TD class=wp1-line1 align=right>58,75</TD>
<TD class=wp1-line1 align=right>58,00</TD>
<TD class=wp1-line1 align=right>58,00</TD>
<TD class=wp1-line1 align=right>1.532</TD></TR>
<TR>
<TD class=wp1-line2 align=middle>28.11.07 14:17</TD>
<TD class=wp1-line2 align=right>57,70</TD>
<TD class=wp1-line2 align=right>58,23</TD>
<TD class=wp1-line2 align=right>57,58</TD>
<TD class=wp1-line2 align=right>58,23</TD>
<TD class=wp1-line2 align=right>1.540</TD></TR>
<TR>
<TD class=wp1-line1 align=middle>27.11.07 16:08</TD>
<TD class=wp1-line1 align=right>58,60</TD>
<TD class=wp1-line1 align=right>58,92</TD>
<TD class=wp1-line1 align=right>57,30</TD>
<TD class=wp1-line1 align=right>57,60</TD>
<TD class=wp1-line1 align=right>7.683</TD></TR>
<TR>
<TD class=wp1-line2 align=middle>26.11.07 14:09</TD>
<TD class=wp1-line2 align=right>58,30</TD>
<TD class=wp1-line2 align=right>59,00</TD>
<TD class=wp1-line2 align=right>58,30</TD>
<TD class=wp1-line2 align=right>58,90</TD>
<TD class=wp1-line2 align=right>5.321</TD></TR>
<TR>
<TD class=wp1-line1 align=middle>23.11.07 19:10</TD>
<TD class=wp1-line1 align=right>57,15</TD>
<TD class=wp1-line1 align=right>57,74</TD>
<TD class=wp1-line1 align=right>57,15</TD>
<TD class=wp1-line1 align=right>57,50</TD>
<TD class=wp1-line1 align=right>8.880</TD></TR>
<TR>
<TD class=wp1-line2 align=middle>22.11.07 19:48</TD>
<TD class=wp1-line2 align=right>57,60</TD>
<TD class=wp1-line2 align=right>57,60</TD>
<TD class=wp1-line2 align=right>56,51</TD>
<TD class=wp1-line2 align=right>56,51</TD>
<TD class=wp1-line2 align=right>9.393</TD></TR>
<TR>
<TD class=wp1-line1 align=middle>21.11.07 19:23</TD>
<TD class=wp1-line1 align=right>58,30</TD>
<TD class=wp1-line1 align=right>58,80</TD>
<TD class=wp1-line1 align=right>56,90</TD>

<TD class=wp1-line1 align=right>57,00</TD>

<TD class=wp1-line1 align=right>7.971</TD></TR>

<TR>

<TD class=wp1-line2 align=middle>20.11.07 15:12</TD>

<TD class=wp1-line2 align=right>58,05</TD>

<TD class=wp1-line2 align=right>58,80</TD>

<TD class=wp1-line2 align=right>57,07</TD>

<TD class=wp1-line2 align=right>58,80</TD>

<TD class=wp1-line2 align=right>5.601</TD>

<TR>

<TD class=wp1-line1 align=middle>19.11.07 15:23</TD>

<TD class=wp1-line1 align=right>58,70</TD>

<TD class=wp1-line1 align=right>59,35</TD>

<TD class=wp1-line1 align=right>57,60</TD>

<TD class=wp1-line1 align=right>57,95</TD>

<TD class=wp1-line1 align=right>6.562</TD>

</TR>

</TBODY>

</TABLE>
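As a rough illustration of the regular-expression approach mentioned above, the following sketch pulls the cell text out of rows shaped like the table fragment. It rests on assumptions: the names TableRegexSketch and ExtractCells and the pattern itself are made up for this example and are not the solution from the Regular Expressions forum. Cells containing nested tags would need a more careful pattern.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module TableRegexSketch
    ' Hypothetical helper: returns the text of every <TD> cell in an HTML fragment.
    Function ExtractCells(ByVal html As String) As List(Of String)
        Dim cells As New List(Of String)
        ' [^>]* skips the cell attributes, ([^<]*) captures the plain cell text.
        Dim pattern As String = "<TD[^>]*>([^<]*)</TD>"
        For Each m As Match In Regex.Matches(html, pattern, RegexOptions.IgnoreCase)
            cells.Add(m.Groups(1).Value.Trim())
        Next
        Return cells
    End Function

    Sub Main()
        ' Sample row copied from the table fragment above (shortened to three cells).
        Dim sample As String = _
            "<TR><TD class=wp1-line1 align=middle>05.12.07 15:23</TD>" & _
            "<TD class=wp1-line1 align=right>57,60</TD>" & _
            "<TD class=wp1-line1 align=right>3.753</TD></TR>"
        For Each cell As String In ExtractCells(sample)
            Console.WriteLine(cell)
        Next
    End Sub
End Module
```

Because the pattern does not depend on the closing </TR>, it also copes with the rows in the fragment that are missing that tag.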

By the way, you can convert C# code to VB.NET code by means of this Code Translator tool.

Thursday, December 6, 2007 3:16 AM

Martin Xie - MSFT 24,335 Points

Hi Martin!
Well, there is one last question (even though others might follow :-) that fits in this topic: How do I click the
"weiter" button at the bottom of the table? I tried to do it the same way as clicking "refresh":

Dim theWElementCollection As HtmlElementCollection = _
    WebBrowser1.Document.GetElementsByTagName("input")
For Each curElement As HtmlElement In theWElementCollection
    Dim controlName As String = curElement.GetAttribute("name").ToString

    'Part 4: Automatically click the button
    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
        curElement.InvokeMember("click")
    End If
Next

I tried to find the tag name and the attribute for the "weiter" link, but it didn't work with what I found: "a"
instead of "input" and "id" instead of "name":

</td>
<td align="right"><a id="ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" class="wp1-more" hre
</tr>

Once more I hope you can provide help.
Thanks, Dominik

Thursday, December 6, 2007 12:13 PM

d.j.t 20 Points

The following is the complete code.

Please check Part 5: Automatically click the Continue link ("weiter" translates to "Continue").

Code Block
Public Class Form1
    Dim march As Boolean ' Set a switch

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) _
            Handles MyBase.Load
        march = True ' Initialize the switch as True
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        ' Part 1: Use WebBrowser control to load web page
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
            ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
            Handles WebBrowser1.DocumentCompleted
        'Determine the switch state
        If march = True Then
            'Part 2: Automatically select specified option from ComboBox
            Dim theElementCollection As HtmlElementCollection = _
                WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In theElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                    curElement.SetAttribute("Value", 0)
                End If
            Next

            Dim theWElementCollection As HtmlElementCollection = _
                WebBrowser1.Document.GetElementsByTagName("input")
            For Each curElement As HtmlElement In theWElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                'Part 3: Automatically check the CheckBox
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                    curElement.SetAttribute("Checked", True)
                    'Part 4: Automatically click the button
                ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                    curElement.InvokeMember("click")
                End If
            Next

            'Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
            'w.Write(WebBrowser1.Document.Body.InnerHtml)
            'w.Close()

            march = False ' Once the task is accomplished, change the switch to False.

        Else ' If march = False, the tasks above are not needed; directly click the "Continue" link.
            'Part 5: Automatically click Continue link
            Dim hrefElementCollection As HtmlElementCollection = _
                WebBrowser1.Document.GetElementsByTagName("a")
            For Each curElement As HtmlElement In hrefElementCollection
                Dim controlName As String = curElement.GetAttribute("id").ToString
                If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then
                    curElement.InvokeMember("Click")
                End If
            Next
        End If
    End Sub
End Class

Friday, December 7, 2007 3:22 AM

Martin Xie - MSFT 24,335 Points

Hi Martin,

thanks for the reference to the other forum, it was quite useful: somebody there could provide assistance!

I have to extend the question above:

This program is meant to be launched each day to copy the data. But due to holidays that won't always be
possible. And sometimes all the data doesn't fit onto one page (as the tables on the site concerned are limited to
100 rows). That's why I am thinking about a loop in the final part: after selecting, refreshing and copying,
I'd like to have the "weiter" (next page) link clicked and the copying done again and again until a certain
past date appears in the table.

Like this:

1. Do the selections and refresh.
2. Extract.
3. Click "weiter" (next page) (so far my question above) IF THE LAST DATE IN THE TABLE IS NOT MORE
THAN x DAYS AGO (click link if: last_date_in_table > todays_date - x).
4. Then go back to step 2.

I'd be fine if x could be a variable, selected in a form when starting the program. But that should be
rather easy then.
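The stopping rule in step 3 can be sketched as a small, separately testable function. This is a minimal sketch, not code from the thread: the names PagingRuleSketch, ShouldClickNext and daysBack (the "x" above) are hypothetical.

```vbnet
Imports System

Module PagingRuleSketch
    ' Hypothetical helper: True while the last (oldest) date on the current page
    ' is still newer than today minus daysBack, i.e. another page is worth loading.
    Function ShouldClickNext(ByVal lastDateInTable As DateTime, _
                             ByVal today As DateTime, _
                             ByVal daysBack As Integer) As Boolean
        Return lastDateInTable > today.AddDays(-daysBack)
    End Function

    Sub Main()
        Dim today As New DateTime(2007, 12, 7)
        ' Cut-off is 02.12.2007 when daysBack = 5.
        Console.WriteLine(ShouldClickNext(New DateTime(2007, 12, 6), today, 5))  ' True
        Console.WriteLine(ShouldClickNext(New DateTime(2007, 11, 30), today, 5)) ' False
    End Sub
End Module
```

Keeping the comparison in one place makes it easy to feed daysBack from a form field later.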

Thanks for your commitment

Dominik

edit: i just noticed your answer to my last question. many thanks!

Friday, December 7, 2007 1:33 PM

d.j.t 20 Points

Hi, with that code - thanks for it - the repetition at the end is happening again. I introduced a second
switch and changed the final part to avoid this:

        Else
            If marchb = True Then
                'Part 5: Automatically click Continue link
                Dim hrefElementCollection As HtmlElementCollection = _
                    WebBrowser1.Document.GetElementsByTagName("a")
                For Each curElement As HtmlElement In hrefElementCollection
                    Dim controlName As String = curElement.GetAttribute("id").ToString
                    If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then
                        curElement.InvokeMember("Click")
                    End If
                    'insert extraction once again
                    marchb = False ' missing: if date as specified
                Next
            End If
        End If
    End Sub
End Class

The task with the date remains.

I really appreciate your advice!

Friday, December 7, 2007 2:08 PM

d.j.t 20 Points

Hi martin,

- At the regex forum I was given a lot of help, but one problem remains: I inserted the extraction where I
had planned it, but it seems it happens too fast: the extracted table is the one displayed before refreshing. I
hoped a few seconds of pausing, or another switch after the new table is completely loaded, would do the
trick, but my attempts have not been successful yet.

- And another little thing: up to now the extracted table is saved to a "fix-named" file. As this program will
run often, I'd like to have a changing date component and (for several pages a day) a counter in the
filename.

This is the complete code:

Hi, OK, now I am puzzled once more: I finally tried the exporting, but it exported the first table, the table
that is displayed before the selection from the comboboxes is done (but I need the table that is
displayed after the combobox selection). What's wrong? Please have a look at my complete code. Thank
you:

Imports System.IO
Imports System.Text.RegularExpressions

Public Class Form1
    Dim lastDate As DateTime
    Dim marchb As Boolean
    Dim march As Boolean ' Set a switch

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) _
            Handles MyBase.Load
        march = True ' Initialize the switch as True
        marchb = True
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        ' Part 1: Use WebBrowser control to load web page
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
            ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
            Handles WebBrowser1.DocumentCompleted
        'Determine the switch state
        If march = True Then
            'Part 2: Automatically select specified option from ComboBox
            Dim theElementCollection As HtmlElementCollection = _
                WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In theElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                    curElement.SetAttribute("Value", 0)
                End If
            Next

            'Part 2,5: Automatically select specified option from ComboBox
            Dim the2ElementCollection As HtmlElementCollection = _
                WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In the2ElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Lines" Then
                    curElement.SetAttribute("Value", 100)
                End If
            Next

            Dim theWElementCollection As HtmlElementCollection = _
                WebBrowser1.Document.GetElementsByTagName("input")
            For Each curElement As HtmlElement In theWElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                'Part 3: Automatically check the CheckBox
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                    curElement.SetAttribute("Checked", True)
                    'Part 4: Automatically click the button
                ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                    curElement.InvokeMember("click")
                End If
            Next

            'part 5 export
            'JavaScript
            Dim rows As New System.Collections.ObjectModel.Collection(Of String())()
            Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"
            For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)
                rows.Add(m.Value.Split(New String() {"\t"}, StringSplitOptions.None))
            Next

            ' export to txt
            march = False ' If accomplish the task, change the switch to False.
            lastDate = Nothing
            Dim lastDateStr As String = Nothing
            Dim separator As String = vbTab
            Using sw As StreamWriter = _
                    File.CreateText("M:\Dominik\Handelsblattskript\Testfergebnisse\export.txt")
                For Each row As String() In rows
                    sw.WriteLine(String.Join(separator, row))
                    lastDateStr = row(0)
                Next
            End Using
            If lastDateStr IsNot Nothing Then
                lastDate = DateTime.Parse(lastDateStr)
            End If

        Else ' If march = False, don't need to perform above tasks, directly click Continue link.
            If marchb = True And lastDate = Today.AddDays(1) Then ' something like that - don't think that already works
                'Part 6 Automatically click Continue link
                Dim hrefElementCollection As HtmlElementCollection = _
                    WebBrowser1.Document.GetElementsByTagName("a")
                For Each curElement As HtmlElement In hrefElementCollection
                    Dim controlName As String = curElement.GetAttribute("id").ToString
                    If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then
                        curElement.InvokeMember("Click")
                        'extract again... yet to be inserted
                    End If
                    marchb = False
                Next
            End If
        End If
    End Sub
End Class

Wednesday, December 12, 2007 9:17 AM

d.j.t 20 Points

Hi d.j.t, 

Welcome back!

I'm glad to hear that you got much help from the Regular Expressions forum.

"but it seems it happens too fast: the extracted table is the one displayed before refreshing."

-> 'Delay 2 seconds
System.Threading.Thread.Sleep(2000)
'Call sub to extract
ExportTableData()

"And another little thing: up to now the extracted table is saved to a "fix-named" file. As this program will
run often, I'd like to have a changing date component and (for several pages a day) a counter in the
filename."

-> 'Add the current date and time to the file name to identify it
'(MM = month, HH = 24-hour clock; lower-case mm means minutes)
Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddHHmmss")
Using sw As StreamWriter = _
    File.CreateText("M:\Dominik\Handelsblattskript\Testfergebnisse\export" & currentDataTime & ".txt")
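The question also asked for a counter in the file name when several pages are saved on one day. One possible way to combine both, as a sketch only: BuildExportFileName, its parameters and the "C:\Exports" folder are hypothetical names chosen for this example, not part of the code in this thread.

```vbnet
Imports System
Imports System.Globalization
Imports System.IO

Module ExportFileNameSketch
    ' Hypothetical helper: builds "export_<date>_<time>_<page>.txt" inside the given folder.
    Function BuildExportFileName(ByVal folder As String, _
                                 ByVal stamp As DateTime, _
                                 ByVal pageNumber As Integer) As String
        ' MM = month, HH = hour on the 24-hour clock, mm = minutes.
        Dim datePart As String = stamp.ToString("yyyyMMdd_HHmmss", CultureInfo.InvariantCulture)
        ' {1:000} pads the page counter to three digits so the files sort correctly.
        Dim name As String = String.Format("export_{0}_{1:000}.txt", datePart, pageNumber)
        Return Path.Combine(folder, name)
    End Function

    Sub Main()
        Dim stamp As New DateTime(2007, 12, 13, 15, 4, 9)
        Console.WriteLine(BuildExportFileName("C:\Exports", stamp, 2))
    End Sub
End Module
```

The page counter would be incremented by the caller each time a further page is exported.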

Thursday, December 13, 2007 2:58 AM

Martin Xie - MSFT 24,335 Points

This is the complete code. The modified parts (originally marked in bold) are the delay and the calls to ExportTableData().
Code Block
Imports System.IO
Imports System.Text.RegularExpressions

Public Class Form1
    Dim lastDate As DateTime
    Dim marchb As Boolean
    Dim march As Boolean ' Set a switch

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) _
            Handles MyBase.Load
        march = True ' Initialize the switch as True
        marchb = True
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        ' Part 1: Use WebBrowser control to load web page
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
            ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
            Handles WebBrowser1.DocumentCompleted
        'Determine the switch state
        If march = True Then
            'Part 2: Automatically select specified option from ComboBox
            Dim theElementCollection As HtmlElementCollection = _
                WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In theElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                    curElement.SetAttribute("Value", 0)
                End If
            Next

            'Part 2,5: Automatically select specified option from ComboBox
            Dim the2ElementCollection As HtmlElementCollection = _
                WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In the2ElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Lines" Then
                    curElement.SetAttribute("Value", 100)
                End If
            Next

            Dim theWElementCollection As HtmlElementCollection = _
                WebBrowser1.Document.GetElementsByTagName("input")
            For Each curElement As HtmlElement In theWElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                'Part 3: Automatically check the CheckBox
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                    curElement.SetAttribute("Checked", True)
                    'Part 4: Automatically click the button
                ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                    curElement.InvokeMember("click")
                End If
            Next
            march = False ' Once the task is accomplished, change the switch to False.

            'Delay 2 seconds
            System.Threading.Thread.Sleep(2000)
            'Call sub to extract
            ExportTableData()

        Else ' If march = False, the tasks above are not needed; directly click the Continue link.
            If marchb = True And lastDate = Today.AddDays(1) Then ' something like that - don't think that already works
                'Part 6 Automatically click Continue link
                Dim hrefElementCollection As HtmlElementCollection = _
                    WebBrowser1.Document.GetElementsByTagName("a")
                For Each curElement As HtmlElement In hrefElementCollection
                    Dim controlName As String = curElement.GetAttribute("id").ToString
                    If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then
                        curElement.InvokeMember("Click")

                        'Delay 2 seconds
                        System.Threading.Thread.Sleep(2000)
                        'Call sub to extract again
                        ExportTableData()
                    End If
                    marchb = False
                Next
            End If
        End If
    End Sub
    ' To be continued...

Thursday, December 13, 2007 3:10 AM

Martin Xie - MSFT 24,335 Points

Code Block
' Continued

    ' I put the extraction code in a custom method so that it can be called conveniently.
    Public Sub ExportTableData()
        'part 5 export
        'JavaScript
        Dim rows As New System.Collections.ObjectModel.Collection(Of String())()
        Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"
        For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)
            rows.Add(m.Value.Split(New String() {"\t"}, StringSplitOptions.None))
        Next

        ' export to txt
        lastDate = Nothing
        Dim lastDateStr As String = Nothing
        Dim separator As String = vbTab

        'Add the current date and time to the file name to identify it
        '(MM = month, HH = 24-hour clock; lower-case mm means minutes)
        Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddHHmmss")
        Using sw As StreamWriter = _
                File.CreateText("M:\Dominik\Handelsblattskript\Testfergebnisse\export" & currentDataTime & ".txt")
            For Each row As String() In rows
                sw.WriteLine(String.Join(separator, row))
                lastDateStr = row(0)
            Next
        End Using

        If lastDateStr IsNot Nothing Then
            lastDate = DateTime.Parse(lastDateStr)
        End If
    End Sub

End Class
 

Thursday, December 13, 2007 3:13 AM

Martin Xie - MSFT 24,335 Points

Thanks for all those answers!!!! Just Great! i hope that with this i can finally finish my task! Loads of thanks!

Thursday, December 13, 2007 10:58 AM
d.j.t 20 Points

Hi Martin,

finally I have complete working code doing exactly what I want. Big thanks to you! I still have some questions,
but they are mere "cosmetics".

- With that code the first table is copied twice. I don't really understand why...

- Can it easily be done that the user doesn't notice anything of the execution of the script once it is
started? I mean no window, no sounds...

- I'd like the program to be used not only for one stock, but for several (up to 100). So I could just
change the address in the first sub and create an executable program for each stock, then write a few lines
that make all those programs run. I think this should even be possible at the same time?
Of course it would be more elegant if I didn't need to create so many single programs. Is there a
conveniently easy way to do this in the script?

Thanks! Dominik

PS: Script in next post... I can't post it in color (don't ask me why, the forum always refuses to accept it
(unknown error)).
 

Friday, December 14, 2007 1:23 PM

d.j.t 20 Points

Imports System.IO

Imports System.Text.RegularExpressions

Public Class Form1
    Dim lastDate As DateTime

    Dim marchb As Boolean

    Dim marchc As Boolean

    Dim march As Boolean  ' Set a switch

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) _
            Handles MyBase.Load

        march = True  ' Initialize the switch as True

        marchc = True

        WebBrowser1.Dock = DockStyle.Fill

        Me.WindowState = FormWindowState.Maximized

        ' Part 1: Use WebBrowser control to load web page    

        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=EAD.ETR")

    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
            ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
            Handles WebBrowser1.DocumentCompleted

        'Determine the switch state

        If march = True Then

            'Part 2: Automatically select specified option from ComboBox

            Dim theElementCollection As HtmlElementCollection = _
                WebBrowser1.Document.GetElementsByTagName("select")

            For Each curElement As HtmlElement In theElementCollection

                Dim controlName As String = curElement.GetAttribute("name").ToString

                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then

                    curElement.SetAttribute("Value", 0)

                End If

            Next

            'Part 2,5: Automatically select specified option from ComboBox

            Dim the2ElementCollection As HtmlElementCollection = _
                WebBrowser1.Document.GetElementsByTagName("select")

            For Each curElement As HtmlElement In the2ElementCollection

                Dim controlName As String = curElement.GetAttribute("name").ToString

                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Lines" Then

                    curElement.SetAttribute("Value", 100)

                End If

            Next

            Dim theWElementCollection As HtmlElementCollection = _
                WebBrowser1.Document.GetElementsByTagName("input")

            For Each curElement As HtmlElement In theWElementCollection

                Dim controlName As String = curElement.GetAttribute("name").ToString

                'Part 3: Automatically check the CheckBox

                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then

                    curElement.SetAttribute("Checked", True)

                    'Part 4: Automatically click the button

                ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then

                    curElement.InvokeMember("click")

                    march = False  ' If accomplish the task, change the switch to False.

                End If

            Next

        Else

            If marchc = True And march = False Then   ' If march = False, don't need to perform above tasks, directly click Continue link.

                'part 5 export


                extract()

                marchc = False

            End If

        End If

        If marchc = False And lastDate > Today.AddDays(-2) Then ' im not sure if that works

            'Part 6 Automatically click Continue link

            Dim hrefElementCollection As HtmlElementCollection = _
                WebBrowser1.Document.GetElementsByTagName("a")

            For Each curElement As HtmlElement In hrefElementCollection

                Dim controlName As String = curElement.GetAttribute("id").ToString

                If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then

                    curElement.InvokeMember("Click")

                End If

            Next
            extract()

            'ElseIf lastDate > "01.01.0001" And lastDate < Today.AddDays(-2) Then : Close() 'just good to know...
        End If

    End Sub


    Public Sub extract()
        Dim rows As New System.Collections.ObjectModel.Collection(Of String())()

        Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"

        For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)

            rows.Add(m.Value.Split(New String() {"\t"}, StringSplitOptions.None))

        Next

        ' export to txt

        lastDate = Nothing

        Dim lastDateStr As String = "0"

        Dim separator As String = vbTab

        Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")

        Using sw As StreamWriter = _
                File.CreateText("M:\Dominik\Handelsblattskript\Testfergebnisse\export" & currentDataTime & ".txt")

            For Each row As String() In rows

                sw.WriteLine(String.Join(separator, row))

                lastDateStr = row(0)

            Next

        End Using


        If lastDateStr IsNot "0" Then

            lastDate = DateTime.ParseExact(lastDateStr, "dd.MM. HH:mm:ss", _
                System.Globalization.CultureInfo.CreateSpecificCulture("de-de"))
            System.Threading.Thread.Sleep(1000)
        End If
    End Sub

End Class

Friday, December 14, 2007 1:30 PM

d.j.t 20 Points

"im not sure if that works"

Try this:

Code Snippet
Public Class Form1
    Dim document_completed As Integer
    Dim last_datetime As DateTime
    Dim earliest_datetime As DateTime

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) _
            Handles MyBase.Load
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        Part1() ' Use WebBrowser control to load web page
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, _
            ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
            Handles WebBrowser1.DocumentCompleted
        document_completed = document_completed + 1
        If document_completed = 1 Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables
            Part5() ' Extract javascript and update last_datetime
            If last_datetime > earliest_datetime Then
                Part6() ' Click Continue Button
            End If
        End If
    End Sub

    Private Sub Part1()
        ' Part 1: Use WebBrowser control to load web page
        document_completed = 0
        last_datetime = DateTime.Now
        earliest_datetime = last_datetime.AddDays(-2)
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
    End Sub

    Private Sub Part2()
        ' Part 2: Automatically select specified option from ComboBox
    End Sub

    Private Sub Part3()
        ' Part 3: Automatically check the CheckBox
    End Sub

    Private Sub Part4()
        ' Part 4: Automatically click the Button
    End Sub

    Private Sub Part5()
        ' Part 5: Extract javascript and update last_datetime
    End Sub

    Private Sub Part6()
        ' Part 6: Click Continue Button
    End Sub
End Class

Edited by Tim Mathias Wednesday, October 14, 2009 6:25 PM Reformatted code snippet.

Friday, January 25, 2008 6:06 AM

Tim Mathias 345 Points

Not forgetting Part 7 from this thread:
http://forums.microsoft.com/msdn/showpost.aspx?postid=2514450&siteid=1&sb=0&d=1&at=7&ft=11&tf=0&pageid=2

Code Snippet
If last_datetime > earliest_datetime Then
    Part6() ' Click Continue Button
Else
    Me.Close() ' Part 7: Close programme
End If
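The branching in DocumentCompleted, including Part 7, can also be written as one pure function that is easy to check without a browser. This is only a sketch under assumptions: the PageAction enum and the DecideAction function are hypothetical names, and the limit of ten page loads is taken from the skeleton above.

```vbnet
Imports System

Module PageFlowSketch
    ' Hypothetical set of actions the DocumentCompleted handler can take.
    Enum PageAction
        SelectAndRefresh     ' Parts 2 to 4 on the first page load
        ContinueToNextPage   ' Part 6, after Part 5 has extracted the page
        CloseProgram         ' Part 7, the last date is old enough
        Ignore               ' more than ten page loads: do nothing
    End Enum

    ' lastDateTime is the value Part 5 has just read from the current page.
    Function DecideAction(ByVal documentCompleted As Integer, _
                          ByVal lastDateTime As DateTime, _
                          ByVal earliestDateTime As DateTime) As PageAction
        If documentCompleted = 1 Then
            Return PageAction.SelectAndRefresh
        End If
        If documentCompleted > 1 AndAlso documentCompleted < 11 Then
            If lastDateTime > earliestDateTime Then
                Return PageAction.ContinueToNextPage
            End If
            Return PageAction.CloseProgram
        End If
        Return PageAction.Ignore
    End Function

    Sub Main()
        Dim earliest As New DateTime(2008, 1, 23)
        Console.WriteLine(DecideAction(1, DateTime.MinValue, earliest))
        Console.WriteLine(DecideAction(2, New DateTime(2008, 1, 24), earliest))
        Console.WriteLine(DecideAction(3, New DateTime(2008, 1, 22), earliest))
    End Sub
End Module
```

The handler would then only map each returned action to the matching PartN() call.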

Edited by Tim Mathias Wednesday, October 14, 2009 6:10 PM Reformatted code snippet.

Friday, January 25, 2008 6:22 AM

Tim Mathias 345 Points

Hi Dominik,

I found a couple of bugs in Part 5 when I tried it out in C++ (I'm a C++ man, not a VB one). I've
highlighted the important changes in bold (namely: 24-hour clock, closing the output file
immediately after writing to it, and parsing a 15-character substring for the last datetime). (I've
also used GetElementById to get straight to the point.)

With the original version, ParseExact threw an exception every time, leaving the output file open
and empty. Maybe this is what is causing your stability issues with VB.

Code Snippet
void Part1 ()
{
    Trace::WriteLine ("Part 1");

    // Part 1: Use WebBrowser control to load web page
    document_completed = 0;
    last_datetime = DateTime::Now;
    earliest_datetime = last_datetime.AddDays (-2.0);
    webBrowser1->DocumentCompleted += gcnew WebBrowserDocumentCompletedEventHandler (this, &Form1::DocumentCompleted);
    webBrowser1->Navigate ("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX");
}

void Part2 ()
{
    Trace::WriteLine ("Part 2");

    // Part 2: Automatically select specified option from ComboBox
    HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_DD_Step");
    el->SetAttribute ("value", "0");
}

void Part3 ()
{
    Trace::WriteLine ("Part 3");

    // Part 3: Automatically check the CheckBox
    HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_CBx_CapitalMeasures");
    el->SetAttribute ("checked", "true");
}

void Part4 ()
{
    Trace::WriteLine ("Part 4");

    // Part 4: Automatically click the button
    HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_IBtn_Refresh1");
    el->InvokeMember ("click");
}

void Part5 ()
{
    Trace::WriteLine ("Part 5");

    // Part 5: Extract javascript and update last_datetime
    try
    {
        ArrayList ^rows = gcnew ArrayList ();
        Regex ^pattern = gcnew Regex ("(?<=myl\\+=\\')([^\\\\]+(?:\\\\t))+([^\\\\]+(?=\\\\r\\\\n'))");
        Trace::WriteLine ("Part 5: pattern = " + pattern);
        MatchCollection ^matches = pattern->Matches (webBrowser1->DocumentText);
        Trace::WriteLine ("Part 5: matches->Count = " + matches->Count);
        array <String^> ^tab = { gcnew String ("\\t") };
        for (int i = 0; i < matches->Count; i++)
        {
            Trace::WriteLine (matches [i]->Value);
            rows->Add (String::Join ("\t", matches [i]->Value->Split (tab, StringSplitOptions::None)));
            Trace::WriteLine (rows [i]);
        }
        String ^current_datetime = DateTime::Now.ToString ("yyyyMMddHHmmss"); // 24 hour clock
        StreamWriter ^file = gcnew StreamWriter ("BrowserAutomation" + current_datetime + ".txt");
        for (int i = 0; i < rows->Count; i++)
        {
            file->WriteLine (rows [i]);
        }
        file->Close ();

        String ^str_last_datetime = (String ^) rows [rows->Count - 1];
        Trace::WriteLine ("str_last_datetime = " + str_last_datetime);
        last_datetime = DateTime::ParseExact (str_last_datetime->Substring (0, 15), "dd.MM. HH:mm:ss",
            System::Globalization::CultureInfo::CreateSpecificCulture ("de-de"));
        Trace::WriteLine ("last_datetime = " + last_datetime);
    }
    catch (Exception ^e)
    {
        Trace::WriteLine ("Part 5: " + e->Message);
    }
}

void Part6 ()
{
    Trace::WriteLine ("Part 6");

    // Part 6: Click Continue Button
    HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_LBtn_More");
    el->InvokeMember ("click");
}
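For readers following the thread in VB.NET, the date-parsing step of Part 5 might look roughly like the sketch below. It is not a full translation. The names LastDateSketch and ParseRowTimestamp are hypothetical, and the sample row is an assumed layout (a 15-character "dd.MM. HH:mm:ss" timestamp followed by tab-separated values), inferred only from the Substring (0, 15) call above. Since the format carries no year, the parsed value should fall in the current year.

```vbnet
Imports System
Imports System.Globalization
Imports Microsoft.VisualBasic

Module LastDateSketch
    ' Hypothetical helper: parses the first 15 characters of an exported row
    ' ("dd.MM. HH:mm:ss") using the German culture, as the C++ version does.
    Function ParseRowTimestamp(ByVal row As String) As DateTime
        Dim german As CultureInfo = CultureInfo.CreateSpecificCulture("de-DE")
        Return DateTime.ParseExact(row.Substring(0, 15), "dd.MM. HH:mm:ss", german)
    End Function

    Sub Main()
        ' Assumed row layout, for illustration only.
        Dim sampleRow As String = "05.12. 15:23:45" & vbTab & "57,60" & vbTab & "3.753"
        Dim parsed As DateTime = ParseRowTimestamp(sampleRow)
        Console.WriteLine(parsed.ToString("dd.MM. HH:mm:ss", CultureInfo.InvariantCulture))
    End Sub
End Module
```

In a real form, the row passed in would be the last line written to the export file, and a Try/Catch around the call would keep a malformed row from leaving the file open.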

Edited by Tim Mathias Wednesday, October 14, 2009 6:20 PM Reformatted code snippet.

Friday, January 25, 2008 10:37 PM

Tim Mathias 345 Points

Hi

thanks for your posts, but as this is my first script and my programming experience is therefore near zero, I
don't know how I would have to translate your script to VB.NET. Or do you propose to change to C++? Well,
I've only used VB.NET up to now.

Nevertheless I made some changes within my code (namely, I put AddDays(-1) everywhere where I had
different numbers before) and now it seems to work.

Well, this program is supposed to run on an old Win2000 SP4 computer that is not used for anything else,
so nobody can interfere. But after everything was working fine on the (more or less new) Win XP computer on which
I wrote the whole thing, it is not working that well on the old Win2000 SP4 computer.
What happens there (while working fine most of the time) is that SOMETIMES the first table is copied, the
one that was displayed when first browsing to the page, before doing the selections and refreshing. So to
me it seems as if the script doesn't wait for the DocumentCompleted event any more. But only sometimes!
Sometimes the correct table is copied, sometimes not. I don't understand this! (Actually I never fully
understood the DocumentCompleted event thing.) The only way I can explain it is that the old computer is
too slow... I'm frustrated!

Is there anyone who has an idea why this could be?

I'll post the whole code once again...

Thanks, Dominik

Monday, January 28, 2008 2:22 PM

d.j.t 20 Points

Imports System.IO
Imports System.Text.RegularExpressions

Public Class Form1

    Dim lastDate As DateTime
    Dim marchc As Boolean
    Dim march As Boolean ' set 2 switches

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Me.Visible = False
        march = True ' Initialize the switches as True
        marchc = True
        WebBrowser1.Dock = DockStyle.Fill
        Me.WindowState = FormWindowState.Maximized
        ' Part 1: Use WebBrowser control to load web page
        WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=SAP.ETR")
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        ' Determine the switch state
        'Me.Visible = False ' doesn't matter
        If march = True Then
            ' Part 2: Automatically select specified option from ComboBox
            Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In theElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName.Contains("DD_Step") Then
                    curElement.SetAttribute("Value", 0)
                End If
            Next
            ' Part 2.5: Automatically select specified option from ComboBox
            Dim the2ElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In the2ElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName.Contains("DD_Lines") Then
                    curElement.SetAttribute("Value", 100)
                End If
            Next
            Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
            For Each curElement As HtmlElement In theWElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                ' Part 3: Automatically check the CheckBox
                If controlName.Contains("CBx_CapitalMeasures") Then
                    curElement.SetAttribute("Checked", True)
                ' Part 4: Automatically click the button
                ElseIf controlName.Contains("IBtn_Refresh1") Then
                    curElement.InvokeMember("click")
                    march = False ' If the task is accomplished, change switch 1 to False.
                End If
            Next
        Else
            ' If march = False, the above tasks are not needed; go on to the export.
            If marchc = True And march = False Then
                ' Part 5: export
                extract()
                marchc = False
            End If
        End If

        If marchc = False And lastDate > Today.AddDays(-1) Then ' I'm not sure if that works
            ' Part 6: Automatically click Continue link
            Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")
            For Each curElement As HtmlElement In hrefElementCollection
                Dim controlName As String = curElement.GetAttribute("id").ToString
                If controlName.Contains("LBtn_More") Then
                    curElement.InvokeMember("Click")
                End If
            Next
            extract()
        ' Part 7: close program
        ElseIf lastDate > "01.01.0001" And lastDate < Today.AddDays(-1) Then
            Me.Close()
        End If
    End Sub

    ' Sub to extract
    Public Sub extract()
        Dim rows As New System.Collections.ObjectModel.Collection(Of String())()
        Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"
        For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)
            rows.Add(m.Value.Split(New String() {"\t"}, StringSplitOptions.None))
        Next

        ' Export to txt
        lastDate = Nothing
        Dim lastDateStr As String = "0"
        Dim separator As String = vbTab
        Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")
        Using sw As StreamWriter = File.CreateText("C:\abc\def\etr" & currentDataTime & ".txt")
            For Each row As String() In rows
                sw.WriteLine(String.Join(separator, row))
                lastDateStr = row(0)
            Next
        End Using

        If lastDateStr IsNot "0" Then
            lastDate = DateTime.ParseExact(lastDateStr, "dd.MM. HH:mm:ss", System.Globalization.CultureInfo.CreateSpecificCulture("de-de"))
            System.Threading.Thread.Sleep(2000)
        End If
    End Sub

End Class

Monday, January 28, 2008 2:23 PM

d.j.t 20 Points

Dominik: "What happens there is that, while it works most of the time, SOMETIMES the first table
is copied: the one that was displayed when first browsing to the page, before making the selections
and refreshing. To me it seems as if the script no longer waits for the DocumentCompleted event.
But only sometimes! Sometimes the correct table is also copied, sometimes not. I don't understand
this (actually, I never fully understood the DocumentCompleted event). The only explanation I have
is that the old computer is too slow... I'm frustrated!"

Hi Dominik,

In Part 6 you are extracting the javascript immediately after automatically clicking the More
button without waiting for the next webpage to load with new data:

Code Snippet
' Part 6: Automatically click Continue link
Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")
For Each curElement As HtmlElement In hrefElementCollection
    Dim controlName As String = curElement.GetAttribute("id").ToString
    If controlName.Contains("LBtn_More") Then
        curElement.InvokeMember("Click")
    End If
Next
extract()

The code in my first post on this thread fixes that problem. The DocumentCompleted event fires
when a new webpage loads. After clicking the button in Part 4 we have to wait for the next
DocumentCompleted, which tells us that the next webpage has loaded with new data. The same applies
to clicking the More button in Part 6 (see:
http://msdn2.microsoft.com/en-us/library/system.windows.forms.webbrowser.documentcompleted.aspx):

Code Snippet
Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
    document_completed = document_completed + 1
    If document_completed = 1 Then ' First table
        Part2() ' Automatically select specified option from ComboBox
        Part3() ' Automatically check the CheckBox
        Part4() ' Automatically click the Button
    ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables
        Part5() ' Extract javascript and update last_datetime
        If last_datetime > earliest_datetime Then
            Part6() ' Click Continue Button
        End If
    End If
End Sub

But the If statements need to be refined a bit because DocumentCompleted fires twice per page
(once for the page banner and once for the default page containing the javascript data that we
want):

Code Snippet
If (document_completed < 3) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
    .
    .
    .
ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then

The second problem is that you are using a 12 hour clock without specifying a.m. or p.m. when
generating the filename so there is potential for overwriting old files or appending new data to an
old file:

Code Snippet
Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")

Use a 24 hour clock instead using capital HH:

Code Snippet
Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddHHmmss")
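
To make the file name collision concrete, here is a minimal console sketch. The module name
FileStampDemo, the helper Stamp and the two sample times are made up for illustration; only the two
format strings come from the code above.

```vbnet
Imports System
Imports System.Globalization

Module FileStampDemo
    ' Builds the timestamp part of a file name from a given moment and format string.
    Function Stamp(ByVal moment As DateTime, ByVal format As String) As String
        Return moment.ToString(format, CultureInfo.InvariantCulture)
    End Function

    Sub Main()
        Dim morning As New DateTime(2008, 1, 29, 9, 15, 0)   ' 09:15:00
        Dim evening As New DateTime(2008, 1, 29, 21, 15, 0)  ' 21:15:00

        ' 12 hour clock (hh): both moments give the same stamp, so the file names collide.
        Console.WriteLine(Stamp(morning, "yyyyMMddhhmmss") = Stamp(evening, "yyyyMMddhhmmss")) ' True

        ' 24 hour clock (HH): the stamps differ.
        Console.WriteLine(Stamp(morning, "yyyyMMddHHmmss") = Stamp(evening, "yyyyMMddHHmmss")) ' False
    End Sub
End Module
```

With hh, a run at 21:15 gets the same name as a run at 09:15 on the same day, which is how an older
file could end up overwritten.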

The other bugs I pointed out were "features" that I had introduced myself when converting from
VB to C++ (I was a bit unfamiliar with the Using statement) so you can ignore these.

Edited by Tim Mathias Wednesday, October 14, 2009 6:03 PM Reformatted code snippets.

Tuesday, January 29, 2008 10:24 AM

Tim Mathias 345 Points

Hi Tim,
Thanks for your comprehensive explanations! I think the structure you are advising should work a
lot better than what I had before.
One thing I still don't understand is why my script extracts not only the "old table" but also the new
one... Well, that doesn't matter.

At first I wondered whether this would allow no more than 10 tables:
ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables
But I see this part needs to be changed to what you wrote, so this restriction drops out:
ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then

Is it really necessary to test e.Url.AbsoluteUri = ..., given that the URL stays the same throughout the
whole procedure?

Well, as I am doing this alongside my studies, I can't implement all your advice right now, but I'll do so
soon and report my progress!

Thanks a lot! Dominik

Wednesday, January 30, 2008 10:23 AM

d.j.t 20 Points

Hi, I just tried it and it works fine! Only the Me.Close part is missing, but I have no time left now; I will
continue next Friday. Thanks a lot!!!!!! Dominik

Wednesday, January 30, 2008 11:10 AM


d.j.t 20 Points

> Is it exactly necessary to mention e.Url.AbsoluteUri = ... because the url stays the same
throughout the whole procedure?

It's essential because the URL DOESN'T stay the same throughout the whole procedure: the
webpage contains a link to a banner page that also calls the procedure after it loads. I've added a
MessageBox to show these two URLs. It's this double message that causes the first table to be
extracted in your script (i.e. the table we want to ignore).

I've also added an If statement that returns when the banner URL completes (it's a bit neater than
the former If tests I wrote).

And I've added the Me.Close().

Code Snippet
Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
    MessageBox.Show("DocumentCompleted: " & e.Url.AbsoluteUri)
    If Not (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
        Return
    End If
    document_completed = document_completed + 1
    If document_completed = 1 Then ' First table
        Part2() ' Automatically select specified option from ComboBox
        Part3() ' Automatically check the CheckBox
        Part4() ' Automatically click the Button
    ElseIf document_completed > 1 Then
        Part5() ' Extract javascript and update last_datetime
        If last_datetime > earliest_datetime Then
            Part6() ' Automatically click Continue Button
        Else
            Me.Close() ' Part 7: Close programme
        End If
    End If
End Sub
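
The URL test above compares strings. As a rough sketch, the same filter can be written against
System.Uri values, which as far as I know overload = and <> and ignore the case of the host name. The
module name UrlFilterDemo, the helper IsTargetPage and the banner address are hypothetical; the real
banner URL is whatever the MessageBox shows.

```vbnet
Imports System

Module UrlFilterDemo
    ' Returns True when the URL that just completed is the page being automated.
    Function IsTargetPage(ByVal completed As Uri, ByVal target As Uri) As Boolean
        Return completed = target
    End Function

    Sub Main()
        Dim seite As New Uri("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
        ' Hypothetical banner URL, standing in for the second DocumentCompleted event.
        Dim banner As New Uri("http://www.example.com/banner.html")

        Console.WriteLine(IsTargetPage(seite, seite))  ' the data page: handle it
        Console.WriteLine(IsTargetPage(banner, seite)) ' the banner page: ignore it
    End Sub
End Module
```

Inside the event handler the call would be IsTargetPage(e.Url, seite), with seite created once before
navigating.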

Edited by Tim Mathias Wednesday, October 14, 2009 5:38 PM Reformatted code snippet.

Wednesday, January 30, 2008 2:42 PM

Tim Mathias 345 Points

Thanks a lot! I think I now understand the DocumentCompleted structure better!

I'll test this script, but I think there is still one problem:

If the last date in the table is yesterday, the script will click "more/next table" ("weiter") to get the
next table.
Now sometimes there is no further information [because the intraday data I need is saved for only 5 days
or so]. Then, when clicking on "more/next table", the same table is loaded again, as there is no next table.
In that case the program will endlessly repeat the reloading and extraction of that table.
[With my data this is extremely unlikely to happen, but yesterday it happened for the first time in 2 weeks,
so I got the same file a thousand times and the script (the former one) ran for about 12 hours until it
crashed.]

My idea for solving this problem was to save the last date for one turn, so that the next time we can
compare whether the last date has changed. So we need the last date of the previous and the pre-previous
table. It can probably be done more easily, so don't continue reading if you have an easy solution.

EDIT: I found an easier way, so don't read the second snippet.

EDIT 2: I tried it on
http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX
and it didn't work totally correctly: it produced the same file 2 times with this link (but still better than
infinite times!).

Dim previouslastdate As DateTime

Private Sub Form1_Load
    ...
    previouslastdate = DateTime.Now.AddDays(-1000)
    WebBrowser1.Dock = DockStyle.Fill
    Me.WindowState = FormWindowState.Minimized
    Part1() ' Use WebBrowser control to load web page
End Sub

Private Sub WebBrowser1_DocumentCompleted...
    document_completed = document_completed + 1
    If (document_completed < 3) And (e.Url.AbsoluteUri = Seite) Then ' First table
        Part2() ' Automatically select specified option from ComboBox
        Part3() ' Automatically check the CheckBox
        Part4() ' Automatically click the Button
    ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = Seite) Then ' Second to xth tables
        previouslastdate = lastdate
        Part5() ' Extract javascript and update last_datetime
        If lastdate > earliest_datetime And lastdate <> previouslastdate Then
            Part6() ' Click Continue Button
        Else
            Me.Close() ' Part 7: Close programme
        End If
    End If
End Sub

But anyway, my original idea was to save the last date every second time into a new variable. I wanted to
determine whether it is the second time by counting the DocumentCompleted events: as I understand it, we
get this event 4 times within 2 turns.
So here is the code... I just didn't know how to determine whether a variable is an integer...

Insert in the Part 5 sub:

...
Dim checkdate1 As DateTime
Dim checkdate2 As DateTime
lastDate = Nothing
Dim lastDateStr As String = "0"
Dim separator As String = vbTab
Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")
Using sw As StreamWriter = File.CreateText(Pfad & currentDataTime & ".txt")
    For Each row As String() In rows
        sw.WriteLine(String.Join(separator, row))
        lastDateStr = row(0)
    Next
End Using

If lastDateStr IsNot "0" Then
    lastdate = DateTime.ParseExact(lastDateStr, "dd.MM. HH:mm:ss", System.Globalization.CultureInfo.CreateSpecificCulture("de-de"))
    If document_completed / 4 gives an integer Then
        checkdate1 = lastdate
        checkdate2 = Nothing
    Else
        checkdate2 = lastdate
        checkdate1 = Nothing
    End If
    System.Threading.Thread.Sleep(2000)
End If
...

And insert in the DocumentCompleted sub:

...
ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then ' Second to xth tables
    Part5() ' Extract javascript and update last_datetime
    If lastdate > earliest_datetime And document_completed / 4 gives an integer And checkdate2 <> lastdate Then
        Part6() ' Click Continue Button
    ElseIf lastdate > earliest_datetime And document_completed / 4 does not give an integer And checkdate1 <> lastdate Then
        Part6() ' Click Continue Button
    Else
        Me.Close() ' Part 7: Close programme
    End If
End If
...
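
A minimal sketch of the "document_completed / 4 gives an integer" test: in VB.NET the Mod operator
returns the remainder of an integer division, so a remainder of 0 means the count is a multiple of 4.
The module name DivisibleDemo and the helper IsMultipleOf are made up for illustration.

```vbnet
Imports System

Module DivisibleDemo
    ' True when value divides evenly by divisor, i.e. value / divisor "gives an integer".
    Function IsMultipleOf(ByVal value As Integer, ByVal divisor As Integer) As Boolean
        Return value Mod divisor = 0
    End Function

    Sub Main()
        ' Print the result for the first eight DocumentCompleted counts.
        For document_completed As Integer = 1 To 8
            Console.WriteLine(document_completed & ": " & IsMultipleOf(document_completed, 4))
        Next
    End Sub
End Module
```

In the pseudocode above, "document_completed / 4 gives an integer" would then read
document_completed Mod 4 = 0, and the "does not give an integer" branch document_completed Mod 4 <> 0.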

Friday, February 1, 2008 2:08 PM

d.j.t 20 Points

I did originally limit the document_completed count to 10 tables to avoid an infinite repeat in case
there was a problem parsing the DateTime from the webpage (the document_completed < 11 test in the
code below). Without that limit you'll have the cybercops after you for a suspected DoS attack.

Here's the ultimate bug-free code (until you find the next one):

Code Snippet
Dim previous_last_datetime As DateTime

Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
    MessageBox.Show("DocumentCompleted: " & e.Url.AbsoluteUri)
    If Not (e.Url.AbsoluteUri = seite) Then
        Return
    End If
    document_completed = document_completed + 1
    If document_completed = 1 Then ' First table
        Part2() ' Automatically select specified option from ComboBox
        Part3() ' Automatically check the CheckBox
        Part4() ' Automatically click the Button
    ElseIf document_completed > 1 And document_completed < 11 Then
        previous_last_datetime = last_datetime
        Part5() ' Extract javascript and update last_datetime
        If previous_last_datetime > last_datetime Then
            Part6() ' Automatically click Continue Button
        Else
            Me.Close() ' Part 7: Close programme
        End If
    End If
End Sub

Edited by Tim Mathias Wednesday, October 14, 2009 5:30 PM Reformatted code snippet.

Friday, February 1, 2008 7:04 PM

Tim Mathias 345 Points

I've had a deeper look at the website's pagination problem. I've separated the reading of the table
rows from the writing of the table rows -- Part5A and Part5B. I've also added a new variable --
more_data -- to test whether the next table is really more data or just a repeat of the last table. If
you want you can also add a time limit to this test -- earliest_datetime -- as we had before.

Currently (at time of writing this post) there's still a mysterious problem with that particular website
with a double entry:

30.01. 17:15:08 47,80 Handel 1.000
30.01. 17:15:08 47,70 Handel 1.000

If you select 20 lines per page the latter of these entries disappears.

Here's the code:

Code Snippet
Imports System.IO
Imports System.Text.RegularExpressions

Public Class Form1

    Dim seite As Uri
    Dim document_completed As Integer
    Dim last_datetime As DateTime
    Dim rows As ArrayList
    Dim more_data As Boolean

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Trace.WriteLine(vbCrLf & vbCrLf & "Form1_Load")
        Me.WindowState = FormWindowState.Maximized
        Part1() ' Use WebBrowser control to load web page
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        Trace.WriteLine(vbCrLf & "WebBrowser1_DocumentCompleted url = " & e.Url.ToString)
        If (e.Url <> seite) Then
            Return ' Ignore banner page load
        End If
        document_completed = document_completed + 1
        Trace.WriteLine(vbCrLf & "document_completed = " & document_completed & vbCrLf)
        If document_completed = 1 Then ' First table
            Trace.WriteLine(vbCrLf & "Section A" & vbCrLf)
            Part2() ' Automatically select specified options from ComboBoxes
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf more_data And document_completed < 11 Then
            Trace.WriteLine(vbCrLf & "Section B" & vbCrLf)
            Part5A() ' Read javascript table rows and update more_data
            If more_data Then
                Part6() ' Automatically click More Button
            Else
                Part5B() ' Write combined table rows to file
                Close() ' Part 7: Close programme
            End If
        Else
            Trace.WriteLine("Too many tables.")
            Part5B() ' Write combined table rows to file
            Close() ' Part 7: Close programme
        End If
    End Sub

    Private Sub Part1()
        ' Part 1: Use WebBrowser control to load web page
        Trace.WriteLine("Part1: Use WebBrowser control to load web page")
        seite = New Uri("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
        document_completed = 0
        last_datetime = DateTime.Now
        rows = New ArrayList
        more_data = True
        WebBrowser1.Dock = DockStyle.Fill
        WebBrowser1.Navigate(seite)
    End Sub

    Private Sub Part2()
        ' Part 2: Automatically select specified options from ComboBoxes
        Trace.WriteLine("Part2: Automatically select specified options from ComboBoxes")
        Try
            ' Part 2A: Times & Sales
            Dim el1 As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_DD_Step")
            el1.SetAttribute("value", "0")

            ' Part 2B: 100 lines
            Dim el2 As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_DD_Lines")
            el2.SetAttribute("value", "100")
        Catch e As Exception
            Trace.WriteLine("ERROR: Part2: " & e.Message)
            Close()
        End Try
    End Sub

    Private Sub Part3()
        ' Part 3: Automatically check the CheckBox
        Trace.WriteLine("Part3: Automatically check the CheckBox")
        Try
            Dim el As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_CBx_CapitalMeasures")
            el.SetAttribute("checked", "true")
        Catch e As Exception
            Trace.WriteLine("ERROR: Part3: " & e.Message)
            Close()
        End Try
    End Sub

    Private Sub Part4()
        ' Part 4: Automatically click the Button
        Trace.WriteLine("Part4: Automatically click the Button")
        Try
            Dim el As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_IBtn_Refresh1")
            el.InvokeMember("click")
        Catch e As Exception
            Trace.WriteLine("ERROR: Part4: " & e.Message)
            Close()
        End Try
    End Sub

    Private Sub Part5A()
        ' Part 5A: Read javascript table rows and update more_data
        Trace.WriteLine("Part5A: Read javascript table rows and update more_data")
        Try
            Dim new_rows As New ArrayList
            ' Matches the text after myl+=' : fields separated by a literal \t, ending before a literal \r\n'
            Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"
            Dim separator As String = vbTab
            For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)
                new_rows.Add(String.Join(separator, m.Value.Split(New String() {"\t"}, StringSplitOptions.None)))
                Trace.WriteLine(new_rows(new_rows.Count - 1))
            Next
            ' An empty new_rows throws here and is handled by the Catch block below.
            Dim str_new_last_datetime As String = CStr(new_rows(new_rows.Count - 1))
            Dim new_last_datetime As DateTime
            new_last_datetime = DateTime.ParseExact(str_new_last_datetime.Substring(0, 15), "dd.MM. HH:mm:ss", System.Globalization.CultureInfo.CreateSpecificCulture("de-de"))
            If (new_last_datetime < last_datetime) Then
                Trace.WriteLine("Adding " & new_rows.Count & " new row(s) to combined rows.")
                rows.AddRange(new_rows)
                last_datetime = new_last_datetime
            Else
                Trace.WriteLine("Skipping new row(s).")
                more_data = False
            End If
        Catch e As Exception
            Trace.WriteLine("ERROR: Part5A: " & e.Message)
            Part5B() ' Save any accrued data
            Close()
        End Try
    End Sub

    Private Sub Part5B()
        ' Part 5B: Write combined table rows to file
        Trace.WriteLine("Part5B: Write combined table rows to file")
        If rows.Count > 0 Then
            Try
                Dim current_datetime As String = DateTime.Now.ToString("yyyyMMddHHmmss") ' 24 hour clock
                Trace.WriteLine("Writing " & rows.Count & " row(s) to file...")
                Using sw As StreamWriter = File.CreateText("BrowserAutomation" & current_datetime & ".txt")
                    For Each row As String In rows
                        sw.WriteLine(row)
                    Next
                End Using
                Trace.WriteLine("Done.")
            Catch e As Exception
                Trace.WriteLine("ERROR: Part5B: " & e.Message)
                Close()
            End Try
        Else
            Trace.WriteLine("No data to write.")
        End If
    End Sub

    Private Sub Part6()
        ' Part 6: Automatically click More Button
        Trace.WriteLine("Part 6: Automatically click More Button")
        System.Threading.Thread.Sleep(2000)
        Try
            Dim el As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_LBtn_More")
            el.InvokeMember("click")
        Catch e As Exception
            Trace.WriteLine("ERROR: Part6: " & e.Message)
            Part5B() ' Save any accrued data
            Close()
        End Try
    End Sub

End Class
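
The row extraction and date parsing in Part5A can be tried outside the form with a small console sketch.
The sample line is an assumption about what the page's javascript looks like, inferred from the pattern
and the table rows quoted above; the module name RowParsingDemo and the helpers FirstRow and ParseStamp
are made up for illustration.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module RowParsingDemo
    ' Same pattern as Part5A: the text after myl+=' made of fields separated
    ' by a literal \t and ending just before a literal \r\n'
    Private ReadOnly Pattern As String = "(?<=" & Regex.Escape("myl+='") & ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"

    ' Returns the fields of the first row found in the source, or an empty array.
    Function FirstRow(ByVal source As String) As String()
        Dim m As Match = Regex.Match(source, Pattern)
        If Not m.Success Then Return New String() {}
        ' "\t" is a backslash followed by t here, because VB strings have no escape sequences.
        Return m.Value.Split(New String() {"\t"}, StringSplitOptions.None)
    End Function

    ' Parses a "dd.MM. HH:mm:ss" timestamp; the text carries no year.
    Function ParseStamp(ByVal text As String) As DateTime
        Return DateTime.ParseExact(text, "dd.MM. HH:mm:ss", CultureInfo.CreateSpecificCulture("de-DE"))
    End Function

    Sub Main()
        ' Assumed shape of one javascript line; the backslashes are literal characters.
        Dim sample As String = "myl+='30.01. 17:15:08\t47,80\tHandel\t1.000\r\n';"

        Dim fields As String() = FirstRow(sample)
        Console.WriteLine(fields.Length) ' 4
        Console.WriteLine(ParseStamp(fields(0)).ToString("dd.MM. HH:mm:ss", CultureInfo.InvariantCulture)) ' 30.01. 17:15:08
    End Sub
End Module
```

Because the timestamp has no year, ParseExact should fill in the current year, as far as I know. Rows
fetched across a New Year boundary would then compare in the wrong order, so the
new_last_datetime < last_datetime test may need extra care around 1 January.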

Edited by Tim Mathias Wednesday, October 14, 2009 5:22 PM Reformatted code snippet.

Monday, February 4, 2008 10:48 AM

Tim Mathias 345 Points

Hi Tim,

Thanks for both your posts! I implemented the first post and it worked.

The missing lines on the website: we probably can't do anything about that, but I hope it shouldn't
matter.

Now your second post looks really scary. The commands you use are totally different! I'd like to
understand all of it, but at the moment I just have no time, as I am studying, exams are next week,
and then I'll be away for a while.
But thanks anyway! Should what I have so far not work, I'll check it out!

Thanks for all your help, I appreciate it a lot!

Dominik

Monday, February 11, 2008 3:15 PM

d.j.t 20 Points

Hello d.j.t,

Considering that many developers in this forum ask how to automate a web page via WebBrowser, rotate or flip images, my
team has created a code sample for this frequently asked programming task in Microsoft All-In-One Code Framework. You
can download the code samples at:

VBWebBrowserAutomation
http://bit.ly/VBWebBrowserAutomation

CSWebBrowserAutomation
http://bit.ly/CSWebBrowserAutomation

With these code samples, we hope to reduce developers' efforts in solving the frequently asked
programming tasks. If you have any feedback or suggestions for the code samples, please email us: onecode@microsoft.com.
------------
The Microsoft All-In-One Code Framework (http://1code.codeplex.com) is a free, centralized code sample library driven by
developers' needs. Our goal is to provide typical code samples for all Microsoft development technologies, and reduce
developers' efforts in solving typical programming tasks.
Our team listens to developers' pains in MSDN forums, social media and various developer communities. We write code
samples based on developers' frequently asked programming tasks, and allow developers to download them with a short
code sample publishing cycle. Additionally, our team offers a free code sample request service. This service is a proactive way
for our developer community to obtain code samples for certain programming tasks directly from Microsoft.
Thanks

Microsoft All-In-One Code Framework

Thursday, March 24, 2011 10:22 AM

All-In-One Code Framework by Microsoft Microsoft All-In-One Cod... 65 Points
