Is there a way to scrape the stories themselves? #2

depthfirst · 2015-11-22T02:30:29Z

I'm working on topic modeling with LDA, and I'd like to try it on a sample of the stories, not just the summaries. Do you have code to do that, or can you write a function to return the text of a story given ?

MrTyton · 2015-11-22T04:39:00Z

I don't have code to do that. I'll see if I can write something up for it
by tomorrow; otherwise you could try using
http://www.mobileread.com/forums/showthread.php?t=259221

MrTyton · 2015-11-25T11:54:55Z

pip install FanFicFare

depthfirst · 2015-11-25T12:29:00Z

Can you give me a little more, like the function I'm asking for? I just thought this would be a lot easier for you since you scraped everything else.

MrTyton · 2015-11-25T12:38:21Z

Sorry, the meds that I'm taking are fucking me up some. That plugin does all
the nice epub formatting and stuff and scrapes it all. Give me half an
hour and I'll have a basic thing, but it'll still have the formatting tags
within the story.
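For topic modeling, the formatting tags mentioned above would usually need to come out first. The following is only a minimal sketch using the Python standard library, assuming the story text comes back as HTML-style markup; `strip_tags` and `BLOCK_TAGS` are hypothetical names and are not part of this repository or of FanFicFare.

```python
from html.parser import HTMLParser

# Assumption: tags that should act as word boundaries when removed.
BLOCK_TAGS = {"p", "br", "div", "hr"}


class _TextExtractor(HTMLParser):
    """Collects text nodes and drops the tags around them."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Keep paragraphs from running together once the tags are gone.
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(markup):
    """Return the text content of `markup` with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return " ".join("".join(parser.parts).split())


print(strip_tags("<p>Hello <i>big</i> world.</p><p>Next&amp;last</p>"))
```

Inline tags such as `<i>` are dropped without inserting a space, so a word that is only partly italicized stays in one piece.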

MrTyton · 2015-11-25T13:09:32Z

Done. You have to do it from the story class. Do you want me to rewrite the
init function so that you can build a story from the sql row instead of only
from the initial scrape?

depthfirst · 2015-11-25T13:14:16Z

Does the init take a bit of time? If so, then yes, that'd be great. Thanks.

MrTyton · 2015-11-25T13:18:28Z

Init doesn't take time, but right now it literally only works when you give
it an XML document. Hold on...

MrTyton · 2015-11-25T13:38:49Z

Done. It works with the sql row from stories; just make sure that you expand
it, i.e. Story(*row).
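To illustrate what expanding the row means, here is a rough sketch of unpacking a database row into a constructor with `*`. The `Story` class below is a hypothetical stand-in with made-up fields, and the table layout and column names are assumptions as well; the real class and schema in this repository may differ.

```python
import sqlite3


class Story:
    """Hypothetical stand-in; the real Story class has its own fields."""

    def __init__(self, story_id, title, summary):
        self.story_id = story_id
        self.title = title
        self.summary = summary


def load_stories(conn):
    """Build one Story per row by unpacking each row tuple into __init__."""
    rows = conn.execute("SELECT story_id, title, summary FROM stories")
    # Story(*row) passes each column as a separate positional argument,
    # so the column order must match the constructor's parameter order.
    return [Story(*row) for row in rows]


# Assumed schema, created in memory purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stories (story_id INTEGER, title TEXT, summary TEXT)")
conn.execute("INSERT INTO stories VALUES (1, 'Example', 'A short summary')")

stories = load_stories(conn)
print(stories[0].title)
```

Passing the row without the `*` would hand the whole tuple to the first parameter, which is why the expansion matters.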

depthfirst assigned MrTyton Nov 25, 2015
