Regexpr. and twitter troubles

FRid · February 3, 2010, 1:22am

Hello peoples,

A busy night for me. Midterm exams next Thursday :(.

My question: I would really like to be able to get individual “tweets” (damn, I hate that word) off their website by analyzing the code. I have found that tweets are located between “” and “”. My idea was to somehow filter out the tweet. After taking “the scenic route” I finally found that regexpr. would do the job. I did manage to get out the first tweet, but I’ve stumbled upon some problems with the remainder. The problem starts with this line, when I want to narrow down the results further. This is the expression used:

“(="entry-content">)(.*)(<a href=)”

Is this statement wrong?
I have a feeling it goes wrong with the “<a href=)-part”, but something also tells me it doesn’t have to be, since regexpr doesn’t take “<a” as input, right?

Thanks for any possible help!

FRid

(I’m no Beta-thinker and always flunked math, so please be gentle :)

twitter-ding-4-9.v4p (53.5 kB)

FRid · February 3, 2010, 1:26am

Hmmm. It seems that because I’ve entered HTML code in the question, the post became a bit funky. Oops…

The patch is quite self-explanatory, I guess.

FRid

FRid · February 3, 2010, 1:48pm

I’ve managed to get it to work (at least on one extracted post) :) !!!
Now let’s see if it can be hooked up to the others.

FRid

twitter-ding-4-91-testing.v4p (26.5 kB)

FRid · February 3, 2010, 3:40pm

Update:

Not yet as smooth as I’d want it to be, but hey, it works (if the tweets contain just plain text)! I’m happy with that for now and I hope my teachers will be as well. Take a look and let me know what you think, because there is probably a lot of room for improvement.

FRid

twitter-ding-1.v4p (14.8 kB)

woei · February 3, 2010, 8:07pm

<span\sclass="entry-content">(.*?)</span>
should do the trick all in one.
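
For anyone who wants to try the pattern outside vvvv, here is a minimal Python sketch. The sample markup is made up for illustration (it only mimics the entry-content spans discussed in this thread, not the real page), and Python's `re` module stands in for the RegExpr node. It also shows why a greedy `(.*)` grabs too much as soon as there is more than one tweet on the page.

```python
import re

# Hypothetical sample markup; only the entry-content spans matter here.
html = (
    '<li><span class="entry-content">first tweet</span></li>'
    '<li><span class="entry-content">second tweet</span></li>'
)

# Lazy quantifier: (.*?) stops at the nearest closing </span>.
lazy = re.compile(r'<span\sclass="entry-content">(.*?)</span>', re.DOTALL)

# Greedy quantifier: (.*) runs on to the last </span> in the text.
greedy = re.compile(r'<span\sclass="entry-content">(.*)</span>', re.DOTALL)

tweets = lazy.findall(html)          # one entry per tweet
greedy_match = greedy.findall(html)  # a single oversized match

print(tweets)
print(greedy_match)
```

Note that the pattern only matches straight quotes (`"`); if the quotes get converted to curly ones when copying from a web page, it will find nothing.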

You could also try converting the XHTML with the Tidy node and parsing the things you need via XPath.
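
A rough idea of what the XPath route looks like, sketched in Python rather than as a vvvv patch. It assumes the markup is already well-formed XHTML (the cleanup that Tidy would do is not shown), the sample document is invented, and `xml.etree.ElementTree` with its limited XPath subset stands in for a full XPath engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical, already well-formed XHTML (Tidy's cleanup step is assumed).
xhtml = (
    '<ol>'
    '<li><span class="entry-content">plain tweet</span></li>'
    '<li><span class="entry-content">tweet with '
    '<a href="http://example.com">a link</a> inside</span></li>'
    '<li><span class="meta">not a tweet</span></li>'
    '</ol>'
)

root = ET.fromstring(xhtml)

# ElementTree supports a small XPath subset, including attribute predicates.
spans = root.findall(".//span[@class='entry-content']")

# itertext() also walks nested elements, so link text is kept and tags are dropped.
tweets = ["".join(span.itertext()) for span in spans]

print(tweets)
```

Compared with the regular expression, this keeps working when a tweet contains nested tags such as links, since the parser tracks where each span really ends.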

FRid · February 3, 2010, 11:53pm

And it did. Thanks! That really cleaned up the patch. I did look at that solution before, but I overlooked the “\s”, so I didn’t get any output. I’ve added some filters for when people start linking and stuff.
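
The filters themselves live in the attached patch and are not described in the thread, so the following is only a guess at the idea, written as a small Python sketch: strip any tags left inside an extracted tweet and decode HTML entities. The helper name `plain_text` and the sample string are hypothetical.

```python
import html
import re

# Matches any single tag such as <a href="..."> or </a>.
TAG = re.compile(r'<[^>]+>')

def plain_text(tweet_markup):
    """Drop tags from an extracted tweet and decode HTML entities."""
    return html.unescape(TAG.sub('', tweet_markup)).strip()

sample = 'reading <a href="http://example.com">this</a> &amp; patching'
print(plain_text(sample))
```

A tag-stripping regex like this is fine for short, simple snippets; for anything messier, the Tidy/XPath route is likely the more robust choice.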

The Tidy/XPath combo will take some more time though, but thanks for pointing it out.

twitter-ding-1-8.v4p (30.7 kB)
