Thursday, June 22, 2006

Requesting Content from a Web Page

In this example, we will demonstrate how to get the HTML source code of a web page into SIMPL Windows. This can be used for a wide variety of purposes, such as displaying information from the Internet on your touchpanel. I won't go into details about parsing the data from any particular site. Rather, I will give you the information needed to get the HTML data into your program, and then you can do whatever you want with whatever web page you want.

This example is for an Ethernet-equipped 2-series processor.

Before we begin, you must make sure you have DNS servers declared on your Crestron processor. DNS is how devices on the Internet resolve host names (like www.google.com) into IP addresses (like 64.233.167.99). Your Crestron processor will need to do this in order to connect with a web server.

At the text prompt in Viewport or Toolbox, type TESTDNS www.google.com. The processor should reply with an IP address. If it replies with 000.000.000.000, your DNS servers are not configured properly. You can add DNS servers by typing ADDDNS xxx.yyy.zzz.www (where xxx.yyy.zzz.www is the IP address of your DNS server). Typically you want to have two DNS servers if they are available.
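For comparison, the same kind of lookup can be sketched in Python on a PC that shares the processor's network. This is only an illustration of what the TESTDNS check does, not Crestron code, and the helper name `resolve` is made up for this example.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address for hostname, or None if the lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when the name
        # cannot be resolved. That is roughly the situation in which the
        # processor would answer 000.000.000.000.
        return None
```

Calling `resolve("www.google.com")` on a machine with working DNS should return a dotted IP address. A result of `None` suggests the PC has the same DNS problem the processor would have.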

If you don't know what your DNS servers are, open up a command prompt on a Windows computer using the same Internet connection as your Crestron processor. Type ipconfig /all. The DNS servers will be listed in the output for your network adapter.

Once you have your Crestron processor set up to resolve host names using DNS, you are ready to get to work in SIMPL Windows. We will begin in the configuration view by adding a TCP/IP client to an unused IP ID of your Ethernet card. Open up the properties for this TCP/IP client and select the IP Net Address tab. Underneath "Default Address," check "Use Host Name" and enter the hostname of the website you want to download. In our example, we will download the content of the page www.google.com. If you are trying to reach a page other than the index (such as http://www.google.com/intl/en/about.html) you would still enter only www.google.com here. This field is used only to resolve the IP address of the server on which the content is hosted.
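The split between the host name (which goes in the IP Net Address tab) and the rest of the address (which is used later in the request itself) can be illustrated with a few lines of Python. This is a sketch using the standard library's `urlsplit`; the helper name `split_url` is hypothetical.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into (host, path) the way this tutorial uses them."""
    parts = urlsplit(url)
    host = parts.hostname          # what you enter under "Use Host Name"
    path = parts.path or "/"       # the index page is requested as "/"
    if parts.query:
        path += "?" + parts.query  # keep any query string with the path
    return host, path


print(split_url("http://www.google.com/intl/en/about.html"))
```

For the example address this yields the host `www.google.com` and the path `/intl/en/about.html`. Only the host belongs in the TCP/IP client properties.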



Switch back to the programming view of SIMPL Windows. Open the TCP/IP client. Enter the port number (usually 80) in the port field. The "connect" digital input will tell the TCP/IP client to open a session. We'll name this signal "connect_http." We'll also need to know the status of the connection from the analog output, which we'll name "http_status." The traffic being sent by the Crestron processor to the web server will go out the tx$ (http_tx$). The traffic coming back from the web server will come in on the rx$ (http_rx$). This is what you will parse to get any information you want from the web server.



We'll now set up logic to time out the connection if it is open too long. The web server will drop your connection automatically at the end of a successful download, but if things don't go well, we want the Crestron processor to drop the connection itself after 5 seconds. Insert a Stepper into your program. The trigger should be "http_go." This will be the signal you pulse when you want the program to retrieve data. The busy output is not needed and can be commented out. The only step needed (delay 5s, length 0.1s) is "http_connection_timeout."



Next, insert an Analog Equate (EQU) into the program. This will take the analog signal "http_status" and decode it into digital signals, one for each state of the TCP/IP connection. Most of these can be commented out. The only ones we need are connected, connection_failed and connection_broken_remotely.
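What the Analog Equate does can be mimicked in Python: one analog value in, one digital signal per possible value out, with exactly one of them high. The numeric status codes below are an assumption recalled from the TCP/IP Client symbol help and should be checked against your own documentation. The `http_status_` prefix follows the signal naming used in this post.

```python
# ASSUMED status values for the TCP/IP client's analog status output.
# Verify these against the symbol help before relying on them.
STATUS_NAMES = {
    0: "not_connected",
    1: "waiting_for_connection",
    2: "connected",
    3: "connection_failed",
    4: "connection_broken_remotely",
    5: "connection_broken_locally",
    6: "performing_dns_lookup",
    7: "dns_lookup_failed",
    8: "dns_name_resolved",
}


def decode_status(value):
    """Mimic an Analog Equate: one boolean per status, at most one True."""
    return {
        "http_status_" + name: (value == code)
        for code, name in STATUS_NAMES.items()
    }


signals = decode_status(2)
print(signals["http_status_connected"])
```

The tutorial only uses a few of these outputs. The rest correspond to the lines you comment out on the symbol.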



We now have three different conditions that require the Crestron processor to drop the TCP/IP connection: the connection fails, it is broken remotely, or it times out. Insert an OR and run these three signals to its inputs. The output will be a new signal, "http_disconnect."



We need a signal to be held high for the duration of the connection process. We'll use a Set/Reset Latch (SR). The "set" signal will be "http_go" (the same signal that starts our stepper). The "reset" signal will be "http_disconnect," which is generated by our disconnect logic. The output of the latch is "connect_http," the signal that drives the connect input of the TCP/IP client.
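The Stepper, OR and latch described above can be simulated in a few lines of Python to see how the connect signal behaves. This is a rough model for illustration only, not generated from SIMPL Windows. The class name and the explicit `now` timestamps are assumptions made so the example can run without real timers.

```python
class ConnectLogic:
    """Rough model of the latch that holds the connect signal high."""

    TIMEOUT_S = 5.0  # matches the 5 second Stepper delay in this post

    def __init__(self):
        self.connect_http = False  # latch output
        self._started_at = None    # time of the last http_go pulse

    def pulse_go(self, now):
        """http_go: sets the latch and starts the timeout."""
        self.connect_http = True
        self._started_at = now

    def update(self, now, failed=False, broken_remotely=False):
        """Evaluate the OR of the three disconnect conditions."""
        timed_out = (
            self._started_at is not None
            and now - self._started_at >= self.TIMEOUT_S
        )
        http_disconnect = failed or broken_remotely or timed_out
        if http_disconnect:
            self.connect_http = False  # reset the latch
            self._started_at = None
        return self.connect_http


logic = ConnectLogic()
logic.pulse_go(now=0.0)
print(logic.update(now=1.0))  # still trying to connect
print(logic.update(now=5.0))  # timeout reached, latch is reset
```

Any one of the three conditions is enough to drop the connect signal, which is why a plain OR feeding the reset input is sufficient.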



Finally, we need to send a properly formatted HTTP GET request. The syntax of this is very important. Insert a Serial Send (SEND) into your program. The trigger will be "http_status_connected" (which will trigger the string to send once the TCP/IP client reports a successful connection) and the tx$ will be http_tx$. The string must be exactly as follows:

GET / HTTP/1.0\nHost: www.google.com\n\n

Replace www.google.com with the name of the website you are trying to connect to. If you are connecting to a page other than the index, replace the first / with the path to the page. For example, to reach http://www.google.com/intl/en/about.html, your string would be as follows:

GET /intl/en/about.html HTTP/1.0\nHost: www.google.com\n\n
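To make the pattern in these two strings explicit, here is a small Python sketch that assembles the same request from a host name and a path. The function name `build_get_request` is hypothetical. In Python, as in the Serial Send string, `\n` stands for a single line feed character.

```python
def build_get_request(host, path="/"):
    """Build the GET request string in the format used in this post."""
    # Request line, Host header, then an empty line to end the request.
    return "GET {} HTTP/1.0\nHost: {}\n\n".format(path, host)


print(repr(build_get_request("www.google.com")))
print(repr(build_get_request("www.google.com", "/intl/en/about.html")))
```

The HTTP specification calls for `\r\n` line endings. The strings in this post use `\n` alone, which many servers accept, but a stricter server may require the full `\r\n`.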



Upload to your processor (be sure to send the IP table when you do!) and pulse http_go. You will see the content of the web site coming in on http_rx$.
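What arrives on http_rx$ is the raw HTTP response: a status line, some headers, an empty line, and then the page itself. The Python sketch below shows one way to separate those parts. It is an illustration rather than SIMPL or SIMPL+ code, the function name `split_response` is hypothetical, and the sample response is invented for the example.

```python
def split_response(raw):
    """Split a raw HTTP response into (status_line, headers, body)."""
    # Headers end at the first empty line. Try CRLF first, then bare LF.
    head, sep, body = raw.partition("\r\n\r\n")
    if not sep:
        head, sep, body = raw.partition("\n\n")
    lines = head.splitlines()
    status_line = lines[0] if lines else ""
    headers = {}
    for line in lines[1:]:
        name, colon, value = line.partition(":")
        if colon:
            headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# Invented sample data, not captured from a real server.
sample = "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
status, headers, body = split_response(sample)
print(status)
print(body)
```

On the processor, the response may arrive in several pieces, so any real parser would need to buffer http_rx$ until the connection closes.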

(Thanks to Chip Moody for suggestions cleaning up the logic that triggers the "GET" request)

20 Comments:

Anonymous Anonymous said...

Fan bloody tastic

7:48 AM  
Blogger IP Wrangler said...

I'm glad you enjoyed the post... If there are any other topics you'd like to see discussed, feel free to post a comment and/or a message on the Crestron Yahoo! list.

9:16 PM  
Anonymous Anonymous said...

Great tutorial.

Could you cover how to e-mail pre stored e-mails from the processor?

I've attempted past examples without success.

2:14 AM  
Blogger Unknown said...

I tried your program and it worked well, but it didn't work when I tried to get something other than the index page, like www.somewhere.com/something.html.
How do I format the string to get specific pages? Thanks.

7:44 PM  
Blogger Unknown said...

I read your article more carefully and got it. Thanks.

2:00 PM  
Anonymous Anonymous said...

great tutorial on setting up connection.. how about actually parsing the page?

12:35 PM  
Blogger Unknown said...

Great Tutorial,

Two questions, Is there a good way for dynamically changing the GET request? And what is a good way to parse the http headers out of the return.

4:26 PM  
Blogger vajsko said...

hi this is great but what if i need recieve data from web page ?

4:11 AM  
Anonymous Dan said...

Nice work. Helpful information!

8:13 AM  
Blogger FuzzyTheBear said...

this is the key to controlling XBMC
i just used your tip this morning to
get the XBMC player using their http api to work with my controller.
Just broke the case.Thanks

8:06 AM  
Anonymous Anonymous said...

Lol, people are still using Crestron products? Funny

1:45 AM  
Anonymous Anonymous said...

Good tutorial. I have a question, if i have a bunch of http commands. What should i use instead. Now have I one serial i/o, but haven't figure out how to check http status - if i don't want to use your example for each commands :(

Tips? =)

9:28 AM  
Anonymous Anonymous said...

Was very helpfull for controlling an AXIS IP Camera over TCP/IP

I use this kind of string:

GET /axis-cgi/com/ptz.cgi?move=left HTTP/1.1\nHost: 192.168.0.84\n\n

3:57 AM  
Blogger Unknown said...

hi. I am using this same concept to control a media server API.
Please guide as to how to use this logic to give http request to a local device which works on url based commands.

3:04 AM  
Blogger Unknown said...

Hi, thank you so much for this post. I was wondering if it is possible to add header parameters to the string, as I would need it for controlling my Nest thermostat.

Kind regards!

12:21 PM  
Anonymous Anonymous said...

Are people still using Crestron products...LoL, yeah I'd say so...to the tune of almost $2B/year.

6:35 PM  
Anonymous Anonymous said...

He's a Control 4 wannabe. Only people using Control 4 are the poors that wish they could afford crestron

1:57 PM  
Anonymous Anonymous said...

I was able to send the http request, but could not make the https working.

4:02 AM  
Blogger machine said...

WooW !!!! That's great! I need to do the opposite (read temperature from a sensor and send it via PUT), how can I change this behaviour ?THANKS !

7:40 PM  
Blogger Unknown said...

The typo with your http_connect and then again with connect_http totally threw me off for 24 hours, but thanks anyway.

4:25 PM  
