RegExpr and Twitter troubles

Hello peoples,

A busy night for me. Midterm exams next Thursday :(

My question: I would really like to be able to get individual “tweets” (damn, I hate that word) off their website by analyzing the page code. I have found that tweets are located between “” and “”. My idea was to somehow filter out the tweet. After taking “the scenic route” I finally found that regexpr would do the job. I did manage to get the first tweet out, but I’ve stumbled upon some problems with the remainder. The problem starts with this line, when I want to narrow down the results further. This is the expression used:

(="entry-content">)(.*)(<a href=)

Is this expression wrong?
I have a feeling it goes wrong at the “(<a href=)” part, but something also tells me it doesn’t have to be that, since regexpr doesn’t take “<a” as input, right?

Thanks for any possible help!

FRid

(I’m no Beta-thinker and always flunked math, so please be gentle :)

twitter-ding-4-9.v4p (53.5 kB)

Hmmm. It seems that because I entered HTML code in my question, the post became a bit funky. Oops…

The patch is quite self-explanatory, I guess.

FRid

I’ve managed to get it to work (at least on one extracted post) :) !!!
Now let’s see if it can be hooked up to the others.

FRid

twitter-ding-4-91-testing.v4p (26.5 kB)

Update:

Not yet as smooth as I’d want it to be, but hey, it works (if the tweets contain just plain text)! I’m happy with that for now and I hope my teachers will be as well. Take a look and let me know what you think, because there is probably a lot of room for improvement.

FRid

twitter-ding-1.v4p (14.8 kB)

<span\sclass="entry-content">(.*?)</span>
should do the trick all in one.
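
For anyone wanting to try the pattern outside of vvvv, here is a minimal sketch in Python. The sample markup is made up for illustration and only imitates the span mentioned above. It shows why the lazy `(.*?)` returns one match per tweet, while a greedy `(.*)` like the one in the original question swallows everything up to the last closing tag.

```python
import re

# Hypothetical sample markup, modeled on the span described in this thread.
html = (
    '<span class="entry-content">first tweet</span> <a href="#">link</a>\n'
    '<span class="entry-content">second tweet</span>'
)

# Lazy (.*?) stops at the nearest </span>, so every tweet is its own match.
lazy = re.compile(r'<span\sclass="entry-content">(.*?)</span>', re.DOTALL)
tweets = lazy.findall(html)
print(tweets)  # ['first tweet', 'second tweet']

# Greedy (.*) runs on to the LAST </span>, merging both tweets into one match.
greedy = re.compile(r'<span\sclass="entry-content">(.*)</span>', re.DOTALL)
merged = greedy.findall(html)
print(len(merged))  # 1
```

The `re.DOTALL` flag is an assumption on my side so that a tweet spanning several lines would still be caught; drop it if the markup keeps each tweet on one line.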

You could also try converting the HTML to XHTML with the Tidy node and parsing out the things you need via XPath.
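
As a rough illustration of the XPath idea (not the vvvv nodes themselves), here is a sketch using Python's standard `xml.etree.ElementTree`. It assumes the page has already been tidied into well-formed XHTML; the sample document and the `meta` class are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, already well-formed XHTML (what a tidy step would hand over).
xhtml = (
    '<div>'
    '<span class="entry-content">first tweet</span>'
    '<span class="meta">2 minutes ago</span>'
    '<span class="entry-content">see <a href="#">this</a> link</span>'
    '</div>'
)

root = ET.fromstring(xhtml)

# XPath-style query: every <span> whose class attribute is "entry-content".
spans = root.findall(".//span[@class='entry-content']")

# itertext() also collects text nested inside child tags such as <a>.
tweets = ["".join(span.itertext()) for span in spans]
print(tweets)  # ['first tweet', 'see this link']
```

Compared with the regex, this route keeps working when a tweet contains links, because the parser deals with the nested tags instead of the pattern.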

And it did. Thanks! That really cleaned up the patch. I did look at that solution before, but I overlooked the “\s”, so I didn’t get any output. I’ve added some filters for when people start linking and stuff.
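
The filters themselves live in the attached patch. Purely as an illustration of what such a filter could look like, here is a Python sketch that strips leftover tags from a captured tweet. The function name `strip_markup` and the sample string are hypothetical and not taken from the patch.

```python
import re
from html import unescape

def strip_markup(fragment: str) -> str:
    """Remove tags and decode entities from a captured tweet fragment."""
    text = re.sub(r'<[^>]+>', '', fragment)    # drop anything that looks like a tag
    text = unescape(text)                      # turn &amp; back into &
    return re.sub(r'\s+', ' ', text).strip()   # collapse leftover whitespace

# Hypothetical tweet body containing a link and an HTML entity.
raw = 'reading <a href="http://example.com">this post</a> &amp; liking it'
print(strip_markup(raw))  # reading this post & liking it
```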

The Tidy/XPath combo will take some more time, though, but thanks for pointing it out.

twitter-ding-1-8.v4p (30.7 kB)