BlackBerry Forums Support Community               

Old 03-02-2009, 01:07 AM   #1 (permalink)
New Member
 
Join Date: Feb 2009
Model: na
PIN: N/A
Carrier: na
Posts: 11
Post Thanks: 0
Thanked 0 Times in 0 Posts
Default How to parse an HTML page using JDE

Hi,

I'm developing an application that accesses a website and retrieves the HTML content as a response. I now need to parse that HTML response to extract the data I want. Are there any JAR files available for this, or any other approach within the JDE? Please help me with this.

Thanks.
Old 03-02-2009, 03:16 AM   #2 (permalink)
New Member
 
Join Date: Feb 2009
Model: na
PIN: N/A
Carrier: na
Posts: 11
Post Thanks: 0
Thanked 0 Times in 0 Posts
Default

I tried using htmlparser.jar from SourceForge and added it as a library project,
but I got this compilation error:
Attribute.java: Error!: Duplicate definition for 'org.htmlparser.Attribute' found in: org.htmlparser.Attribute

Please help...

thanks.
Old 03-02-2009, 03:39 AM   #3 (permalink)
Thumbs Must Hurt
 
Join Date: Feb 2009
Model: 9000
PIN: N/A
Carrier: T-Mobile
Posts: 67
Post Thanks: 0
Thanked 0 Times in 0 Posts
Default

Have you tried using the native BlackBerry SAX parser?
Old 03-02-2009, 04:09 AM   #4 (permalink)
New Member
 
Join Date: Feb 2009
Model: na
PIN: N/A
Carrier: na
Posts: 11
Post Thanks: 0
Thanked 0 Times in 0 Posts
Default

Hi Hippo,

Thanks for your reply. Does SAX take HTML input for parsing?
If you have any working SAX parser samples, please share them with me.
Thanks in advance.
Old 03-02-2009, 04:15 AM   #5 (permalink)
New Member
 
Join Date: Mar 2009
Model: 9350
PIN: N/A
Carrier: Sprint
Posts: 1
Post Thanks: 0
Thanked 0 Times in 0 Posts
Default HTML Parsing

Is there any native API for HTML parsing?
Old 03-02-2009, 04:40 AM   #6 (permalink)
Thumbs Must Hurt
 
Join Date: Jan 2007
Location: Ernakulam, Kerala, India
Model: 8320
Carrier: Airtel
Posts: 65
Post Thanks: 0
Thanked 0 Times in 0 Posts
Default

Hi

You can use the SAX parser to parse any XML. It can parse HTML as well, as long as the markup is well-formed. But what do you actually want? Do you need to render the HTML response, or do you need to extract some data from it, as you would with XML?

If it is rendering, RIM provides built-in APIs for that; please go through them.
Otherwise, if it is parsing that you need, just extend DefaultHandler (org.xml.sax.helpers.DefaultHandler) and parse the stream data through it.
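
As a rough illustration of that suggestion, here is a minimal sketch of a DefaultHandler subclass that records every element name the parser reports. It is written against desktop Java SE (javax.xml.parsers) so that it runs anywhere; on the JDE the factory classes are, as far as I know, found under net.rim.device.api.xml.parsers, and the generics, StringBuilder-era collections, and annotations used here would need adapting to the device's older Java level. The class name TagLister and the method listTags are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: lists element names in a well-formed (X)HTML string.
public class TagLister {

    /** Records the name of every element the parser reports. */
    static class TagHandler extends DefaultHandler {
        final List<String> tags = new ArrayList<String>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            // The default factory is not namespace-aware, so qName carries the
            // tag name and localName may be empty.
            tags.add(qName);
        }
    }

    /** Parses the markup and returns the element names in document order. */
    static List<String> listTags(String xhtml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            TagHandler handler = new TagHandler();
            InputStream in = new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8));
            // The parser calls the handler's methods; we never call them ourselves.
            parser.parse(in, handler);
            return handler.tags;
        } catch (Exception e) {
            // Typical cause: markup that is not well-formed, e.g. an unclosed <br>.
            throw new IllegalArgumentException("Could not parse input", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(listTags("<html><body><p>Hi</p></body></html>"));
    }
}
```

Note that a SAX parser rejects markup that is not well-formed, so real-world HTML with unclosed tags will make the parse call fail rather than return partial results.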
__________________
Regards
Anand.
Old 03-03-2009, 01:01 AM   #7 (permalink)
New Member
 
Join Date: Feb 2009
Model: na
PIN: N/A
Carrier: na
Posts: 11
Post Thanks: 0
Thanked 0 Times in 0 Posts
Default

I tried implementing DefaultHandler, but I'm not able to understand it fully.
I'm also getting errors while implementing it. I am trying to parse an HTML page, and I need to find a "tr" tag whose attribute is "class=somecssclass" and read the content inside that "tr" tag. How do I do this with DefaultHandler? Below is my code:

InputStream in = this.getClass().getResourceAsStream("/stockQuote.html");
Reader rdr = new InputStreamReader(in);
MySAXHandler handler = new MySAXHandler();
handler.startElement("<what shud be given here?>","tbody","tr","class");
parser.parse(in,handler);

Here is my DefaultHandler code:
class MySAXHandler extends DefaultHandler
{
    public void startDocument() {
        System.out.println("Start document: ");
    }

    public void endDocument() {
        System.out.println("End document: ");
    }

    public void startElement(String uri, String localName, String qname,
            Attributes attr)
    {
        System.out.println("Start element: local name: " + localName + " qname: "
                + qname + " uri: " + uri);
        int attrCount = attr.getLength();
        if (attrCount > 0)
        {
            System.out.println("Attributes:");
            for (int i = 0; i < attrCount; i++)
            {
                System.out.println(" Name : " + attr.getQName(i));
                System.out.println(" Type : " + attr.getType(i));
                System.out.println(" Value: " + attr.getValue(i));
            }
        }
    }
}

Please help me...
Old 03-03-2009, 04:01 AM   #8 (permalink)
Thumbs Must Hurt
 
Join Date: Jan 2007
Location: Ernakulam, Kerala, India
Model: 8320
Carrier: Airtel
Posts: 65
Post Thanks: 0
Thanked 0 Times in 0 Posts
Default

Hi Samuel

You have gone wrong; that is not the way to implement it. Please look at the sample code below.

Code:
SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setAllowUndefinedNamespaces(true);
SAXParser parser = spf.newSAXParser();
parser.setAllowUndefinedNamespaces(true);
CustomHandler ch = new CustomHandler();
// Pass your InputStream and the handler object (ch here).
parser.parse(<inputstream>, ch);

class CustomHandler extends DefaultHandler {

    public void startDocument() {
        System.out.println("start doc");
    }

    public void endDocument() {
        System.out.println("end doc");
    }

    public void startElement(String uri, String localName, String qname, Attributes attr) {
        System.out.println("local name is :" + localName);
    }

    public void clear() {
    }

    public void endElement(String uri, String localName, String qname) {
        System.out.println("local name is :" + localName);
    }

    public void characters(char[] ch, int start, int length) {
        String _result = new String(ch, start, length);
        System.out.println(_result);
    }

    public void ignorableWhitespace(char[] ch, int start, int length) {
    }

    public void startPrefixMapping(String prefix, String uri) {
    }

    public void endPrefixMapping(String prefix) {
    }

    public void warning(SAXParseException spe) {
    }

    public void fatalError(SAXParseException spe) throws SAXException {
        System.out.println("SAXexception2:" + spe);
    }
}
The handler is notified as the parser works through the input stream.
When the parser reaches the tr tag, startElement receives "tr" in its localName parameter;
if you print it, you will see the tag name. The same applies to every method that takes a
localName String.
I hope this helps.
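
To connect this with the original question (reading the content of a "tr" whose class is "somecssclass"), the handler can keep a little state: start collecting text when a matching row opens and stop when it closes. The following is only a sketch under some assumptions: it uses desktop Java SE (javax.xml.parsers) rather than the JDE classes, it assumes the page is well-formed, it matches the class attribute by exact string, and the names RowTextExtractor, RowHandler, and extract are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects the text of every <tr> with a given class.
public class RowTextExtractor {

    static class RowHandler extends DefaultHandler {
        private final String wantedClass;
        final List<String> rows = new ArrayList<String>();
        private StringBuilder current; // non-null only while inside a matching <tr>
        private int depth;             // counts nested <tr> elements inside the match

        RowHandler(String wantedClass) {
            this.wantedClass = wantedClass;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if (current != null) {
                if ("tr".equals(qName)) depth++;
            } else if ("tr".equals(qName) && wantedClass.equals(attrs.getValue("class"))) {
                current = new StringBuilder();
                depth = 1;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one text node, so append.
            if (current != null) current.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (current == null) return;
            if ("td".equals(qName)) {
                current.append(' '); // keep adjacent cells apart
            } else if ("tr".equals(qName) && --depth == 0) {
                rows.add(current.toString().trim());
                current = null;
            }
        }
    }

    /** Returns the trimmed text of each row whose class attribute equals cssClass. */
    static List<String> extract(String xhtml, String cssClass) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            RowHandler handler = new RowHandler(cssClass);
            parser.parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.rows;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse input", e);
        }
    }
}
```

Called as RowTextExtractor.extract(page, "somecssclass"), this returns one string per matching row. A row whose class attribute lists several classes would not match the exact comparison used here.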
__________________
Regards
Anand.

Last edited by Ananthasivan V K : 03-03-2009 at 04:04 AM. Reason: clarity
Old 03-15-2009, 11:19 PM   #9 (permalink)
New Member
 
Join Date: Mar 2009
Model: 8700
PIN: N/A
Carrier: GSM
Posts: 12
Post Thanks: 0
Thanked 0 Times in 0 Posts
Default

Thank you, this is what I was looking for.
Old 08-10-2009, 09:56 AM   #10 (permalink)
New Member
 
Join Date: Aug 2009
Model: Model
PIN: N/A
Carrier: Carrier
Posts: 1
Post Thanks: 0
Thanked 0 Times in 0 Posts
Default Parsing HTML

I am not a developer, but I thought I should venture a suggestion that may work. Have you considered biterscripting for parsing HTML? I have seen several samples posted on the web for parsing HTML with it. I am a system admin and use biterscripting daily for parsing log files, etc. It is very efficient at parsing; I just don't know yet how to make it work on this platform.

Patrick

Copyright 2004-2014 BlackBerryForums.com.
The names RIM and BlackBerry are registered Trademarks of BlackBerry Inc.