Feature Request: Not printing recursively for stringValue #39

Open
hashier opened this issue Jun 3, 2015 · 2 comments

hashier commented Jun 3, 2015

I would like a method similar to stringValue that doesn't recursively print everything under the elements matched by an XPath query. Below are the full code and HTML, the output Ono produces, and the output I'd like to get.

My XPath query: //div[@class='thread']

Ono code:

document = [ONOXMLDocument HTMLDocumentWithData:file error:&error];

[document enumerateElementsWithXPath:xPath usingBlock:^(ONOXMLElement *element, NSUInteger idx, BOOL *stop) {
    NSLog(@"%@", [element stringValue]);
}];

Which prints:

FirstName LastName, SecondNameFirst SecondNameLast


                FirstName LastName
                Wednesday, December 24, 2014 at 6:57pm UTC+01 


        This is a dummy text


                SecondNameFirst SecondNameLast
                Wednesday, December 24, 2014 at 6:56pm UTC+01


        And a 2nd one just to show off


Another, User


                Another
                Monday, April 27, 2015 at 10:54pm UTC+02


        Text: 2.1


                User
                Thursday, February 26, 2015 at 5:41pm UTC+01


        Text: 2.2


                Another
                Thursday, February 26, 2015 at 4:25pm UTC+01


        Text: 2.3

I would prefer output similar to what hpple produces:

FirstName LastName, SecondNameFirst SecondNameLast
Another, User

hpple code:

tutorialsParser = [TFHpple hppleWithHTMLData:file];
tutorialsNodes = [tutorialsParser searchWithXPathQuery:xPath];

for (TFHppleElement *element in tutorialsNodes) {
    NSLog(@"%@", [[element firstChild] content].trim);
}

I don't want to use hpple, though, since it is too slow.

Here is my input HTML file:

<!DOCTYPE html>
<html>
<head><title/></head>
<body>
    <div class="thread">FirstName LastName, SecondNameFirst SecondNameLast
        <div class="message">
            <div class="message_header">
                <span class="user">FirstName LastName</span>
                <span class="meta">Wednesday, December 24, 2014 at 6:57pm UTC+01 </span>
            </div>
        </div>
        <p>This is a dummy text</p>
        <div class="message">
            <div class="message_header">
                <span class="user">SecondNameFirst SecondNameLast</span>
                <span class="meta">Wednesday, December 24, 2014 at 6:56pm UTC+01</span>
            </div>
        </div>
        <p>And a 2nd one just to show off</p>
    </div>
    <div class="thread">Another, User
        <div class="message">
            <div class="message_header">
                <span class="user">Another</span>
                <span class="meta">Monday, April 27, 2015 at 10:54pm UTC+02</span>
            </div>
        </div>
        <p>Text: 2.1</p>
        <div class="message">
            <div class="message_header">
                <span class="user">User</span>
                <span class="meta">Thursday, February 26, 2015 at 5:41pm UTC+01</span>
            </div>
        </div>
        <p>Text: 2.2</p>
        <div class="message">
            <div class="message_header">
                <span class="user">Another</span>
                <span class="meta">Thursday, February 26, 2015 at 4:25pm UTC+01</span>
            </div>
        </div>
        <p>Text: 2.3</p>
    </div>
</body>
</html>

tosbaha commented Jun 22, 2015

Sorry, I don't speak Objective-C, but you could use something like this:

extension String {
    func trim() -> String {
        return self.stringByTrimmingCharactersInSet(.whitespaceAndNewlineCharacterSet())
    }

    func clean() -> String {
        return self.stringByReplacingOccurrencesOfString(
            "\\s+",
            withString: " ",
            options: .RegularExpressionSearch)
    }
}

Then use it in your code like this:

// Remove leading and trailing whitespace
let trimmedValue = (element.childrenWithTag("td")[3] as! ONOXMLElement).stringValue().trim()
// Collapse each run of whitespace into a single space
let cleanedValue = (element.childrenWithTag("td")[3] as! ONOXMLElement).stringValue().clean()
// Or chain them together
let extraCleanValue = (element.childrenWithTag("td")[3] as! ONOXMLElement).stringValue().clean().trim()


hashier commented Jun 22, 2015

That wouldn't help, since it would still recursively print everything.
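
For readers who land here with the same problem, one possible workaround is to narrow the XPath query itself so that it matches only the text node sitting directly under each thread div, instead of the div element. This is a sketch under assumptions, not a confirmed Ono feature: it assumes Ono passes XPath results that are text nodes to the block the same way it passes element nodes, and that stringValue returns their content. LogDirectThreadText is a hypothetical helper name, not part of Ono.

```objective-c
#import <Foundation/Foundation.h>
#import "Ono.h"

// Hypothetical helper (not part of Ono): logs only the text that sits
// directly under each <div class="thread">, skipping nested elements.
static void LogDirectThreadText(NSData *file) {
    NSError *error = nil;
    ONOXMLDocument *document = [ONOXMLDocument HTMLDocumentWithData:file error:&error];
    if (!document) {
        NSLog(@"Failed to parse HTML: %@", error);
        return;
    }

    // text()[1] selects the first text node that is a direct child of each
    // matching div, so nested <div> and <p> content is never visited.
    NSString *xPath = @"//div[@class='thread']/text()[1]";
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];

    // Assumption: Ono wraps text-node results like element results, so
    // stringValue yields the text node's own content.
    [document enumerateElementsWithXPath:xPath usingBlock:^(ONOXMLElement *element, NSUInteger idx, BOOL *stop) {
        NSLog(@"%@", [[element stringValue] stringByTrimmingCharactersInSet:whitespace]);
    }];
}
```

The trimming step is there because, in the HTML above, the first text node of each thread div ends with the newline and indentation that precede the first nested div. If the assumption about text nodes does not hold for your Ono version, the query will simply yield nothing useful, and the feature request still stands.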
